6750 Commits

Author SHA1 Message Date
Anastassios Nanos
1e6cea24c8 Merge pull request #10890 from zvonkok/arm64-fix-release
release: Remove artifacts for release
2025-02-17 22:29:23 +02:00
Zvonko Kaiser
1d9915147d release: Remove artifacts for release
We need to make sure the release does not have any residual binaries
left for the release payload

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-17 20:16:48 +00:00
Anastassios Nanos
ae1be28ddd Merge pull request #10880 from nubificus/3.14.0-release
release: Bump version to 3.14.0
2025-02-17 20:25:30 +02:00
Zvonko Kaiser
72833cb00b Merge pull request #10878 from zvonkok/agent_cdi_timeout
gpu: agent cdi timeout
2025-02-17 12:49:51 -05:00
Zvonko Kaiser
fda095a4c9 Merge pull request #10786 from zvonkok/gpu-config-update
gpu: Update config files
2025-02-17 12:45:54 -05:00
Anastassios Nanos
c7347cb76d release: Bump version to 3.14.0
Bump VERSION and helm-chart versions

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2025-02-17 16:47:24 +00:00
Fabiano Fidêncio
639bc84329 Merge pull request #10787 from fidencio/topic/bump-kernel-to-6.12.11
version: Bump kernel to 6.12.13
2025-02-17 17:39:14 +01:00
Fabiano Fidêncio
7ae5fa463e versions: Bump coco-guest-components
So attestation-agent and others have a version including the ttrpc bump
to v0.8.4, allowing us to use the latest LTS kernel.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-17 15:16:54 +01:00
Fabiano Fidêncio
1381cab6f0 build: Fix rootfs cache logic
We've been appending to the wrong variable for quite some time, it
seems, leading to not actually regenerating the rootfs when needed.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-17 13:55:36 +01:00
Fabiano Fidêncio
7fc7328bbc versions: Bump kernel to 6.12.13
Let's try to keep up with the LTS patch releases.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-17 13:47:35 +01:00
Simon Kaegi
f5edbfd696 kernel: support loop device in v6.8+ kernels
Set CONFIG_BLK_DEV_WRITE_MOUNTED=y to restore previous kernel behaviour.

Kernel v6.8+ will by default block buffer writes to block devices mounted by filesystems.
Unfortunately, we need such writes in order to use mounted loop devices, which some teams need
to build OSIs and as an overlay backing store.

More info on this config item [here](https://cateee.net/lkddb/web-lkddb/BLK_DEV_WRITE_MOUNTED.html)

Fixes: #10808

Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
2025-02-17 13:47:35 +01:00
Fabiano Fidêncio
d96e8375c4 Merge pull request #10885 from stevenhorsman/bump-agent-crates-to-resolve-CVEs
agent: Bump agent crates to resolve CVEs
2025-02-17 12:11:43 +01:00
stevenhorsman
e5a284474d deps: Update cookie-store & publicsuffix
Run:
```
cargo update -p cookie-store
cargo update -p publicsuffix
```
to update the version of idna and resolve CVE-2024-12224

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-14 17:30:03 +00:00
stevenhorsman
5656fc6139 deps: Bump reqwest
Bump reqwest to 0.12.12 to pick up fixes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-14 17:30:03 +00:00
stevenhorsman
3a3849efff deps: Update quinn-proto
Update quinn-proto to fix CVE-2024-45311

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-14 17:30:03 +00:00
Fabiano Fidêncio
64ceb0832a Merge pull request #10851 from fidencio/topic/bump-image-rs-to-bring-in-ttrpc-0.8.4
agent: Bump image-rs to 514c561d93
2025-02-14 18:21:56 +01:00
Fabiano Fidêncio
d5878437a4 Merge pull request #10845 from DataDog/dind-subcgroup-fix
Add process to init subcgroup when we're using dind with cgroups v2
2025-02-14 18:12:24 +01:00
Steve Horsman
469c651fc0 Merge pull request #10879 from nubificus/fix_version
packaging(release): Properly handle version tag for the release bundle
2025-02-14 14:40:37 +00:00
Zvonko Kaiser
908aacfa78 gpu: Update the logging around CDI
Removed a rogue printf and updated the logging to say
that we're waiting for CDI spec(s) to be generated rather
than saying there is an error. It only becomes an error
once the timeout has expired.
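
A minimal sketch of the behaviour described above, not the agent's actual code: `wait_for_cdi_specs`, its parameters, and the log wording are hypothetical names chosen for illustration. Waiting is reported as progress, and an error is produced only once the timeout has elapsed.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not taken from the agent source): polls
/// `specs_ready` until it returns true or `timeout` elapses.
/// Returns the number of polls that were needed.
fn wait_for_cdi_specs<F: FnMut() -> bool>(
    mut specs_ready: F,
    timeout: Duration,
    poll_interval: Duration,
) -> Result<u32, String> {
    let start = Instant::now();
    let mut attempts = 0;
    loop {
        attempts += 1;
        if specs_ready() {
            return Ok(attempts);
        }
        if start.elapsed() >= timeout {
            // Only at this point is the situation an error.
            return Err(format!(
                "timed out after {:?} waiting for CDI spec(s)",
                timeout
            ));
        }
        // Still inside the timeout window: informational, not an error.
        eprintln!(
            "waiting for CDI spec(s) to be generated (attempt {})",
            attempts
        );
        std::thread::sleep(poll_interval);
    }
}
```

A caller would pass a closure that checks whether the spec files exist yet, together with the configured timeout.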

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-14 14:32:00 +00:00
Zvonko Kaiser
4bda16565b gpu: Update timeouts
With the create_container_timeout the dial_timeout is less important.
Add the custom timeout for GPUs in create_container_timeout

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-14 14:29:18 +00:00
Zvonko Kaiser
66ccc25724 tdx: Update GPU config for the latest TDX stack
We need extra kernel_params for TDX

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-14 14:29:18 +00:00
Zvonko Kaiser
d4dd87a974 gpu: Update config files
With the recent changes to cgroupsv1 and AGENT_INIT=no we
need to update the config files.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-14 14:29:18 +00:00
Anastassios Nanos
b13db29aaa packaging(release): Properly handle version tag for the release bundle
The tags created automatically for published Github releases
are probably not annotated, so by simply running `git describe` we are
not getting the correct tag. Use a `git describe --tags` to allow git
to look at all tags, not just annotated ones.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2025-02-14 12:41:08 +00:00
Zvonko Kaiser
2499d013bd gpu: Update handle_cdi_devices
AgentConfig now has the cdi_timeout from the kernel
cmdline, update the proper function signature and use
it in the for loop.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-13 20:11:48 +00:00
Zvonko Kaiser
d28410ed75 Merge pull request #10877 from AdithyaKrishnan/main
CI: Deprecate SEV
2025-02-13 14:55:11 -05:00
Zvonko Kaiser
95aa21f018 gpu: Add CDI timeout via kernel config
Some systems like a DGX where we have 8 H100 or 8 H800 GPUs
need some extended time to be initialized. We need to make
sure we can configure CDI timeout, to enable even systems with 16 GPUs.
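
As a rough illustration of reading such a timeout from the kernel command line, the sketch below parses a `key=value` parameter and falls back to a default. The parameter name `agent.cdi_timeout` and the default of 100 seconds are assumptions made for this example; the commit messages here only mention a `cdi_timeout` taken from the kernel cmdline.

```rust
use std::time::Duration;

/// Assumed parameter name; the commit text only mentions `cdi_timeout`.
const CDI_TIMEOUT_PARAM: &str = "agent.cdi_timeout";
/// Illustrative default, not taken from the source.
const DEFAULT_CDI_TIMEOUT: Duration = Duration::from_secs(100);

/// Returns the CDI timeout found on the kernel command line, or the
/// default when the parameter is missing or not a whole number of seconds.
fn cdi_timeout_from_cmdline(cmdline: &str) -> Duration {
    cmdline
        .split_whitespace()
        .filter_map(|param| param.split_once('='))
        .find(|(key, _)| *key == CDI_TIMEOUT_PARAM)
        .and_then(|(_, value)| value.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_CDI_TIMEOUT)
}
```

On a system with many GPUs one would then boot the guest with a larger value for this parameter, rather than changing the agent.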

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-13 19:23:19 +00:00
Adithya Krishnan Kannan
6cc5b79507 CI: Deprecate SEV
Phase 1 of Issue #10840
AMD has deprecated SEV support on
Kata Containers, and going forward,
SNP will be the only AMD feature
supported. As a first step in this
deprecation process, we are removing
the SEV CI workflow from the test suite
to unblock the CI.

Will be adding future commits to
remove redundant SEV code paths.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2025-02-13 12:20:21 -06:00
Steve Horsman
0a39f59a9b Merge pull request #10874 from stevenhorsman/skip-consistently-failing-block-volume-test
tests: Skip block volume test on fc, stratovirt
2025-02-13 15:39:45 +00:00
Zvonko Kaiser
a0766986e7 Merge pull request #10832 from RuoqingHe/update-yq
ci: Update yq to v4.44.5 to support riscv64
2025-02-13 08:33:02 -05:00
stevenhorsman
56fb2a9482 tests: Skip block volume test on fc, stratovirt
The block volume test has failed on 10/10 nightlies
and all the PRs I've seen, so skip it until it can be assessed.

See #10873

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-13 11:50:35 +00:00
stevenhorsman
2d266df846 test: Update expected error in signed image tests
We are seeing a different error in the new version of image-rs,
so update our tests to match.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-13 11:44:51 +00:00
stevenhorsman
d28a512d29 agent: Wait for network before init_image_service
Based on the guidance from @Xynnn007 in #10851
> The new version of image-rs will do attestation once
ClientBuilder.build().await() is called, while the old version
will do so lazily the first image pull request comes.
Looks like it's called in  rpc::start() in kata-agent, when
I'm afraid the network hasn't been initialized yet.

> I am not sure if the guest network is prepared after
the DNS is configured (in create_sandbox),
if so we can move (the init_image_service) right after that.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-13 11:44:51 +00:00
Tobin Feldman-Fitzthum
a13d5a3f04 agent: Bump image-rs to 514c561d93
As this brings in the commit bumping ttrpc to 0.8.4, which fixes
connection issues with kernel 6.12.9+.

As image-rs has a new builder pattern and several of the values in the
image client config have been renamed, let's change the agent to account
for this.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@linux.ibm.com>
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-13 11:44:51 +00:00
Steve Horsman
8614e5efc4 Merge pull request #10869 from stevenhorsman/bump-kcli-ubuntu-version
ci: k8s: Bump kcli image version
2025-02-13 09:59:20 +00:00
Antoine Gaillard
4b5b788918 agent: Use init subcgroup for process attachment in DinD
cgroups v2 enforces stricter delegation rules, preventing operations on
cgroups outside our ownership boundary. When running Docker-in-Docker (DinD),
processes must be attached to an "init" subcgroup within the systemd unit.
This fix detects and uses the init subcgroup when proxying process attachment.
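
The detection step could look roughly like the sketch below. `attach_target` is a hypothetical helper, and treating a directory literally named `init` as the subcgroup is an assumption based on the description above, not a copy of the agent's logic.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: if the cgroup directory contains an `init`
/// subcgroup, processes are attached there; otherwise the cgroup
/// itself is used.
fn attach_target(cgroup: &Path) -> PathBuf {
    let init = cgroup.join("init");
    if init.is_dir() {
        init
    } else {
        cgroup.to_path_buf()
    }
}
```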

Fixes #10733

Signed-off-by: Antoine Gaillard <antoine.gaillard@datadoghq.com>
2025-02-13 10:44:51 +01:00
Dan Mihai
958cd8dd9f Merge pull request #10613 from 3u13r/feat/policy/refactor-out-policy-crate-and-network-namespace
policy: add policy crate and add network namespace check to policy
2025-02-12 18:28:09 -08:00
Alex Lyn
e1b780492f Merge pull request #10839 from RuoqingHe/appease-clippy
dragonball: Appease clippy
2025-02-13 09:12:15 +08:00
Zvonko Kaiser
acd2a933da Merge pull request #10864 from fidencio/topic/packaging-move-to-ubuntu-22-04
packaging: Move builds to Ubuntu 22.04
2025-02-12 14:29:41 -05:00
Wainer Moschetta
62e239ceaa Merge pull request #10810 from arvindskumar99/nydus_perm_install
Skipping SNP and SEV from deploying and deleting Snapshotter
2025-02-12 14:38:56 -03:00
stevenhorsman
fd7bcd88d0 ci: k8s: Bump kcli image version
When trying to deploy nydus on kcli locally we get the
following failure:
```
root@sh-kata-ci1:~# kubectl get pods -n nydus-system
NAMESPACE                   NAME                                          READY   STATUS              RESTARTS      AGE
nydus-system                nydus-snapshotter-5kdqs                       0/1     CrashLoopBackOff    4 (84s ago)   7m29s
```
Digging into this I found that the nydus-snapshotter service
is failing with:
```
ubuntu@kata-k8s-worker-0:~$ journalctl -u nydus-snapshotter.service
-- Logs begin at Wed 2025-02-12 15:06:08 UTC, end at Wed 2025-02-12 15:20:27 UTC. --
Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: Started nydus snapshotter.
Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc:
/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required b>
Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc:
/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required b>
Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: nydus-snapshotter.service: Main process exited, code=exited, status=1/FAILURE
```
I think this is because 20.04 has version:
```
ubuntu@kata-k8s-worker-0:~$ ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.16) 2.31
```
so it's too old for the nydus snapshotter.
Also 20.04 is EoL soon, so bumping is better.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-12 15:38:18 +00:00
Zvonko Kaiser
fbc8454d3d Merge pull request #10866 from zvonkok/enable-cc-gpu-build
gpu: enable confidential initrd build
2025-02-12 09:26:08 -05:00
Ruoqing He
897e2e2b6e dragonball: Appease clippy
Some problems hidden in the `dbs` crates were revealed after making these
crates workspace components; fix them as `cargo clippy` suggests.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-12 19:44:34 +08:00
Leonard Cohnen
ec0af6fbda policy: check the linux network namespace
Peer pods have a linux namespace of type network. We want to make sure that all
containers in the same pod use the same namespace. Therefore, we add the first
namespace path to the state and check all other requests against that.
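
The stateful check described above can be sketched in plain Rust as follows. The real check lives in the Rego policy and the policy crate; `PolicyState` and `check_network_namespace` are hypothetical names used only to show the "remember the first, compare the rest" idea.

```rust
/// Hypothetical policy state: remembers the first network namespace
/// path seen for the pod.
#[derive(Default)]
struct PolicyState {
    network_namespace: Option<String>,
}

impl PolicyState {
    /// The first request records its namespace path and is accepted.
    /// Later requests are accepted only if they use the same path.
    fn check_network_namespace(&mut self, path: &str) -> bool {
        if let Some(first) = &self.network_namespace {
            return first.as_str() == path;
        }
        self.network_namespace = Some(path.to_string());
        true
    }
}
```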

This commit also adds the corresponding integration test in the policy crate
showcasing the benefit of having rust integration tests for the policy.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2025-02-12 10:41:15 +01:00
Leonard Cohnen
7aca7a6671 policy: use agent policy crate in genpolicy test
The generated rego policies for `CreateContainerRequest` are stateful and that
state is handled in the policy crate. We use this policy crate in the
genpolicy integration test to be able to test if those state changes are
handled correctly without spinning up an agent or even a cluster.

This also makes it easy to test at, e.g., the CreateContainerRequest level
instead of relying on changing the yaml that is applied to a cluster.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2025-02-12 10:41:15 +01:00
Leonard Cohnen
d03738a757 genpolicy: expose create as library
This commit makes it possible to invoke genpolicy programmatically, so that other
Rust tools that don't want to consume genpolicy as a binary can generate policies.
One such use case is the policy integration test implemented in the following
commits.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2025-02-12 10:41:15 +01:00
Leonard Cohnen
cf54a1b0e1 agent: move policy module into separate crate
The policy module augments the policy generated with genpolicy by keeping and
providing state to each invocation.
Therefore, it is not sufficient anymore to test the passing of requests in
the genpolicy crate.

Since in Rust, integration tests cannot call functions that are not exposed
publicly, this commit factors out the policy module of the agent into its
own crate and exposes the necessary functions to be consumed by the agent
and an integration test. The integration test itself is implemented in the
following commits.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2025-02-12 10:41:15 +01:00
Fupan Li
ec7b2aa441 Merge pull request #10850 from teawater/direct
Clean the config block_device_cache_direct of runtime-rs
2025-02-12 09:45:37 +08:00
Zvonko Kaiser
5431841a80 Merge pull request #10814 from kata-containers/shellcheck-gha
gha: Add shellcheck
2025-02-11 18:30:41 -05:00
Zvonko Kaiser
b231a795d7 gha: Add shellcheck
We need to start to fix our scripts. Let's run shellcheck
and see what needs to be reworked.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-11 16:00:34 +00:00
Zvonko Kaiser
befb2a7c33 gpu: Confidential Initrd
Start building the confidential initrd

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-11 15:41:36 +00:00
Fupan Li
5b809ca440 CI: a workaround for containerd v2.x e2e test
The latest containerd has an issue in its e2e test, so we apply
the following fix to work around it. For more info about this issue,
please see:

https://github.com/containerd/containerd/pull/11240

Once this PR is merged and a new version is released, we can remove
this workaround.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
a3fd3d90bc ci: Add the sandbox api testcases
A test case is added based on the integrated cri-containerd case.
The difference between cri containerd integrated testcase and sandbox
api testcase is the "sandboxer" setting in the sandbox runtime handler.

If the "sandboxer" is set to "" or "podsandbox", then containerd will
use the legacy shimv2 api, and if the "sandboxer" is set to "shim", then
it will use the sandbox api to launch the pod.
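
The mapping described above, written out as a small sketch. `LaunchApi` and `launch_api_for` are hypothetical names; only the three `sandboxer` values come from the text, and any other value is treated as unknown here.

```rust
/// Which containerd API is used to launch the pod (illustrative type).
#[derive(Debug, PartialEq)]
enum LaunchApi {
    LegacyShimV2,
    SandboxApi,
}

/// Maps the `sandboxer` setting of a runtime handler to the API that
/// containerd uses, following the description in the commit message.
fn launch_api_for(sandboxer: &str) -> Option<LaunchApi> {
    match sandboxer {
        "" | "podsandbox" => Some(LaunchApi::LegacyShimV2),
        "shim" => Some(LaunchApi::SandboxApi),
        // Other values are not described in the text.
        _ => None,
    }
}
```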

In addition, add a containerd v2.0.0 version, because containerd officially
supports the sandbox api from version 2.0.0.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
36bf080c1e runtime-rs: register the sandbox api service
Add and register the sandbox api service, so that runtime-rs
can handle the sandbox api rpc calls from containerd.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
8332f427d2 runtime-rs: add the wait and status method for sandbox api
Add the sandbox wait and sandbox status method for sandbox
api.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
2d6b1e6b13 runtime-rs: add the sandbox api support
For Kata-Containers, we add SandboxService for these new calls alongside the
existing TaskService, including processing requests and replies, and properly
calling VirtSandbox's interfaces. By splitting the start logic of the sandbox,
virt_container is compatible with calls from the SandboxService and TaskService.
In addition, we modify the processing of resource configuration to solve the
problem that SandboxService does not have a spec file when creating a pod.

The sandbox api can be supported from containerd 1.7, but enabling it
differs from containerd 2.0.
To enable it from 2.0, you can support the sandbox api for a specific
runtime by adding sandboxer = "shim". Taking the kata runtime as an example:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
          runtime_type = "io.containerd.kata.v2"
          sandboxer = "shim"
          privileged_without_host_devices = true
          pod_annotations = ["io.katacontainers.*"]

For containerd version 1.7, you can enable it by:

1: add env ENABLE_CRI_SANDBOXES=true
2: add sandbox_mode = "shim" to runtime config.

Acknowledgement

This work was based on @wllenyj's POC code:
(f5b62a2d7c)

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2025-02-11 15:21:53 +01:00
Fupan Li
65e908a584 runtime-rs: add the sandbox init for sandbox api
For the processing of init sandbox, the init of task
api has some more special processing procedures than
the init of sandbox api, so these two types of init
are separated here.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
be40646d04 runtime-rs: move the sandbox start from sandbox init function
Split the sandbox start from the sandbox init process, and call
them separately.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
438f81b108 runtime-rs: only get the containerd id when start container
When starting the sandbox, the sandbox id is passed from the
shim command line, so it only needs to get the containerd id from the
oci spec when starting the pod container instead of the pod sandbox.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
9492c45d06 runtime-rs: load the cgroup path correctly
When the sandbox api is enabled, the pause container is
removed and the sandbox start api only passes an empty bundle
directory, which means there's no oci spec file under it, so
the cgroup config can't get the cgroup path from the pause container's
oci spec. So we should set a default cgroup path for the sandbox api
case.
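
A minimal sketch of the fallback, under stated assumptions: `sandbox_cgroup_path` is a hypothetical helper and `/kata-sandbox-default` is an illustrative default, since the actual default path is not given in this message.

```rust
/// Illustrative default parent path; not taken from the source.
const DEFAULT_SANDBOX_CGROUP: &str = "/kata-sandbox-default";

/// Uses the cgroup path from the OCI spec when one is available, and a
/// default path derived from the sandbox id when it is not (the
/// sandbox api case, where the bundle directory has no spec file).
fn sandbox_cgroup_path(spec_cgroups_path: Option<&str>, sandbox_id: &str) -> String {
    match spec_cgroups_path {
        Some(path) if !path.is_empty() => path.to_string(),
        _ => format!("{}/{}", DEFAULT_SANDBOX_CGROUP, sandbox_id),
    }
}
```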

In the future, we can promote containerd to pass the cgroup path during
the sandbox start phase.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
78b96a6e2e runtime-rs: fix the issue of missing create sandbox dir
We need to make sure the sandbox storage path
exists before returning it.
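
The idea can be sketched with the standard library alone. `sandbox_storage_path` and its arguments are hypothetical names, not the runtime-rs function.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds the storage path for a sandbox and makes
/// sure the directory exists before handing the path back.
fn sandbox_storage_path(base: &Path, sandbox_id: &str) -> io::Result<PathBuf> {
    let path = base.join(sandbox_id);
    // `create_dir_all` succeeds if the directory already exists.
    fs::create_dir_all(&path)?;
    Ok(path)
}
```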

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
97785b1f3f runtime-rs: rustfmt against lib.rs
It seems some files were never run through rustfmt.
This commit runs rustfmt on them.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fupan Li
33555037c0 protocols: Add the cri api protos
Add the cri api protos to support the sandbox api.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Hui Zhu
27cff15015 runtime-rs: Remove block_device_cache_direct from config of fc
Remove block_device_cache_direct from config of fc in runtime-rs because
fc doesn't support this config.

Fixes: #10849

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-02-11 14:04:11 +08:00
Hui Zhu
70d9afbd1f runtime-rs: Add block_device_cache_direct to config of ch and dragonball
Add block_device_cache_direct to config of ch and dragonball in
runtime-rs because they support this config.

Fixes: #10849

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-02-11 14:04:11 +08:00
Hui Zhu
db04c7ec93 runtime-rs: Add block_device_cache_direct config to ch and qemu
Add block_device_cache_direct config to ch and qemu in runtime-rs.

Fixes: #10849

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-02-11 14:04:11 +08:00
Hui Zhu
e4cbc6abce runtime-rs: CloudHypervisorInner: Change config type
This commit changes config in CloudHypervisorInner to the normal
HypervisorConfig to reduce changes of its type.

Fixes: #10849

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-02-11 14:04:11 +08:00
Fabiano Fidêncio
75ac09baba packaging: Move builds to Ubuntu 22.04
As Ubuntu 20.04 will reach its EOL in April.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-10 21:25:43 +01:00
Fabiano Fidêncio
c9f5966f56 Merge pull request #10860 from kata-containers/topic/debug-ci
workflows: build: Do not store unnecessary content on the tarball
2025-02-10 20:01:37 +01:00
Fabiano Fidêncio
ec290853e9 workflows: build: Do not store unnecessary content on the tarball
Otherwise we may end up simply unpacking kata-containers specific
binaries into the same location that system ones are needed, leading to
a broken system (most likely what happened with the metrics CI, and also
what's happening with the GHA runners).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-10 18:57:29 +01:00
Steve Horsman
fb341f8ebb Merge pull request #10857 from fidencio/topic/ci-tdx-only-use-one-machine-for-testing
ci: Only use the Ubuntu TDX machine in the CI
2025-02-10 15:25:06 +00:00
Fabiano Fidêncio
23cb5bb6c2 ci: Only use the Ubuntu TDX machine in the CI
We've been hitting issues with the CentOS 9 Stream machine, which Intel
doesn't have cycles to debug.

After raising this up in the Confidential Containers community meeting
we got the green light from Red Hat (Ariel Adam) to just disable the CI
based on CentOS 9 Stream for now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-10 12:50:16 +01:00
Zvonko Kaiser
eb1cf792de Merge pull request #10791 from kata-containers/gpu_ci_cd
gpu: Add first target and fix extratarballs
2025-02-06 15:47:27 -05:00
Zvonko Kaiser
62a975603e Merge pull request #10806 from stevenhorsman/rust-1.80.0-bump
Rust 1.80.0 bump
2025-02-06 14:49:23 -05:00
Dan Mihai
fdf3088be0 Merge pull request #10842 from microsoft/danmihai1/disable-job-policy-test
tests: disable k8s-policy-job.bats on coco-dev
2025-02-06 09:09:49 -08:00
Hyounggyu Choi
48c5b1fb55 Merge pull request #10841 from BbolroC/make-measured-rootfs-configurable
local-build: Do not build measured rootfs on s390x
2025-02-06 16:07:15 +01:00
Hyounggyu Choi
1bdb34e880 tests: Skip trusted storage tests for IBM SE
Let's skip all tests for trusted storage until #10838 is resolved.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-02-06 12:09:14 +01:00
Hyounggyu Choi
27ce3eef12 local-build: Do not use measured rootfs on s390x
IBM SE ensures that the initrd is measured by genprotimg and verified by the ultravisor.
Let's not build the measured rootfs on s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-02-06 10:12:55 +01:00
stevenhorsman
fce49d4206 dragonball: Skip unsafe tests
Skip tests that make unsafe use of file descriptors,
which causes
```
fatal runtime error: IO Safety violation: owned file descriptor already closed
```

See #10821

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-06 08:54:17 +00:00
Fabiano Fidêncio
2ceb7a35fc versions: Bump rust to 1.80.0 (matching coco-guest-components)
This is needed in order to avoid agent build issues, such as:
```
error[E0658]: use of unstable library feature 'lazy_cell'
  --> /home/ansible/.cargo/git/checkouts/guest-components-1e54b222ad8d9630/514c561/ocicrypt-rs/src/lib.rs:10:5
   |
10 | use std::sync::LazyLock;
   |     ^^^^^^^^^^^^^^^^^^^
   |
   = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information
```

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-06 08:53:51 +00:00
Fabiano Fidêncio
76df852f33 packaging: agent: Add rust version to the builder image name
As we want to make sure a new builder image is generated if the rust
version is bumped.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-06 08:53:51 +00:00
stevenhorsman
d3e0ecc394 kata-ctl: Allow empty const
Due to the way that multi-arch support is done, on various platforms
we will get a clippy error:
```
error: this expression always evaluates to false
```
which might not be true on those other platforms, so
allow this code pattern to suppress the clippy error

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-06 08:53:51 +00:00
Fabiano Fidêncio
6de8e59109 Merge pull request #10824 from stevenhorsman/updates-in-prep-of-rust-1.80-bump
Updates in prep of rust 1.80 bump
2025-02-06 09:05:23 +01:00
Dan Mihai
47ce5dad9d tests: disable k8s-policy-job.bats on coco-dev
k8s-policy-job is modeled after the older k8s-job, and it appears
that both of them fail occasionally on coco-dev.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-02-05 23:06:16 +00:00
Arvind Kumar
47534c1c3e nydus: Skipping SNP and SEV from deploying and deleting Snapshotter
Preparing to install nydus permanently on the AMD node,
so disabling deploy and delete command for SNP and SEV.

Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-02-05 12:26:53 -06:00
Zvonko Kaiser
45bd451fa0 ci: add arm64 attestation
Do the very same thing that we do on amd64 and add attestation

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Zvonko Kaiser
9a7dff9c40 gpu: Add arm64 targets
We want to make sure we deliver arm64 GPU targets as well

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Zvonko Kaiser
968318180d ci: Add extratarballs steps
We introduced extratarballs with a make target. The CI
currently only uploads tarballs that are listed in the matrix.
The NV kernel builds a headers package which needs to be uploaded
as well.

The get-artifacts has a glob to download all artifacts hence we
should be good.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Zvonko Kaiser
b04bdf54a5 gpu: Add rootfs target amd64/arm64
Adding the initrd build first to get the rootfs on amd64.
With that we can start to add tests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
stevenhorsman
7831caf1e7 libs/safe-path: Fix doc formatting
Clippy fails with
```
error: doc list item missing indentation
```
so indent further to avoid this.
2025-02-05 15:16:47 +00:00
stevenhorsman
17b1e94f1a cargo: Update time crate
So it avoids us hitting
```
error[E0282]: type annotations needed for `Box<_>`
  --> /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/time-0.3.31/src/format_description/parse/mod.rs:83:9
   |
83 |     let items = format_items
   |         ^^^^^
...
86 |     Ok(items.into())
   |              ---- type must be known at this point
   |
help: consider giving `items` an explicit type, where the placeholders `_` are specified
   |
83 |     let items: Box<_> = format_items
   |              ++++++++
```

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 15:16:47 +00:00
stevenhorsman
e9393827e8 agent: Workaround ppc formatting
On powerpc64le platform the ip neigh command has
a trailing space after the state, so the test is failing e.g.
```
 assertion `left == right` failed
  left: "169.254.1.1 lladdr 6a:92:3a:59:70:aa PERMANENT \n"
 right: "169.254.1.1 lladdr 6a:92:3a:59:70:aa PERMANENT\n"
```
Trim the whitespace to make the test pass on all platforms
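
A small sketch of the trimming idea, assuming a hypothetical `normalize_neigh_line` helper in the test (the actual test may trim differently): trailing whitespace is removed before comparing, so both outputs shown above compare equal.

```rust
/// Hypothetical test helper: drops trailing whitespace (including the
/// extra space seen on powerpc64le) and restores a single newline.
fn normalize_neigh_line(line: &str) -> String {
    format!("{}\n", line.trim_end())
}
```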

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 15:16:47 +00:00
stevenhorsman
1ac0e67245 kata-ctl: Add stub of missing method for ppc
`host_is_vmcontainer_capable` is required, but wasn't
implemented for powerpc64, so copy the aarch64 approach
@Amulyam24

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 15:16:47 +00:00
stevenhorsman
bd3c93713f kata-sys-util: Complete code move
In #7236 the guest protection code was moved to kata-sys-util,
but some of it was left behind, and the adjustment to the new
location wasn't completed, so the powerpc64 code doesn't
build now we've fixed the cfg to test it.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 15:16:47 +00:00
stevenhorsman
9f865f5bad kata-ctl: Allow dead_code
Some of the Kernel structs have `#[allow(dead_code)]`
but not all and this results in the clippy error:
```
error: fields `name` and `value` are never read
```
so complete the job started before to remove the error.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
61a252094e dragonball: Fix feature typo
Replace `legacy_irq` with `legacy-irq`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
add785f677 dragonball: Remove unused fields
`metrics` is never used, so remove this code

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
dde34bb7b8 runtime-rs: Remove un-used code
The `r#type` method is never used, so neither
are the log type constants

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
71fffb8736 runtime-rs: Allow dead code
Clippy errors with:
```
error: field `driver` is never read
  --> crates/resource/src/network/utils/link/driver_info.rs:77:9
   |
76 | pub struct DriverInfo {
   |            ---------- field in this struct
77 |     pub driver: String,
   |         ^^^^^^
```
We set this, but never read it, so clippy is correct,
but I'm not sure if it's useful for logging, or other purposes,
so I'll allow it for now.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
d75a0ccbd1 dragonball: Allow test-mock feature
Clippy fails with:
```
warning: unexpected `cfg` condition value: `test-mock`
    --> /root/go/src/github.com/kata-containers/kata-containers/src/dragonball/src/dbs_pci/src/vfio.rs:1929:17
     |
1929 | #[cfg(all(test, feature = "test-mock"))]
     |                 ^^^^^^^^^^^^^^^^^^^^^ help: remove the condition
     |
     = note: no expected values for `feature`
     = help: consider adding `test-mock` as a feature in `Cargo.toml`
```
So add it as an expected cfg in the linter to skip this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
bddaea6df1 runtime-rs: Allow enable-vendor feature
Clippy fails with:
```
error: unexpected `cfg` condition value: `enable-vendor`
   --> crates/hypervisor/src/device/driver/vfio.rs:180:11
    |
180 |     #[cfg(feature = "enable-vendor")]
    |           ^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: expected values for `feature` are: `ch-config`, `cloud-hypervisor`, `default`, and `dragonball`
    = help: consider adding `enable-vendor` as a feature in `Cargo.toml`
```

So add it as an expected cfg in the linter to skip this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
bed128164a runtime-rs: Allow unexpected config
Clippy fails with:
```
error: unexpected `cfg` condition value: `enable-vendor`
   --> crates/hypervisor/src/device/driver/vfio.rs:180:11
    |
180 |     #[cfg(feature = "enable-vendor")]
    |           ^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: expected values for `feature` are: `ch-config`, `cloud-hypervisor`, `default`, and `dragonball`
    = help: consider adding `enable-vendor` as a feature in `Cargo.toml`
```
allow this until we can check this behaviour with @Apokleos

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
53bcb0b108 runtime-rs: Fix for-loops-over-fallibles
Clippy complains about:
```
error: for loop over a `&Result`. This is more readably written as an `if let` statement
  --> crates/hypervisor/src/firecracker/fc_api.rs:99:22
   |
99 |         for param in &kernel_params.to_string() {
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
```
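
A minimal sketch of the `if let` form clippy asks for. The `render_params` helper is hypothetical and only stands in for the fallible `to_string()` call in the diagnostic; the real signature in `fc_api.rs` is not shown here.

```rust
// Hypothetical stand-in for a fallible `to_string()`-style call.
fn render_params(params: &[&str]) -> Result<String, String> {
    if params.is_empty() {
        return Err("no kernel parameters".to_string());
    }
    Ok(params.join(" "))
}

// `for p in &result` runs zero or one times and hides the error path;
// `if let Ok(..)` states the same intent directly.
fn collect_params(params: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    if let Ok(rendered) = render_params(params) {
        out.push(rendered);
    }
    out
}
```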

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
c332a91ef8 runtime-rs: Fix doc list item missing indentation
Add the extra space to format the list correctly

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
fe98d49a29 runtime-rs: Remove direct implementation of ToString
Fix clippy error:
```
direct implementation of `ToString`
```
by switching to implement Display instead
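
As an illustration of the switch, here is a sketch with a hypothetical `KernelParam` type (not the type touched by this commit). Implementing `Display` still provides `to_string()` through the standard library's blanket `ToString` impl.

```rust
use std::fmt;

// Hypothetical type, used only for illustration.
struct KernelParam {
    key: String,
    value: String,
}

// Implementing `Display` gives `to_string()` for free via the blanket
// `impl<T: Display> ToString for T` in the standard library.
impl fmt::Display for KernelParam {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}={}", self.key, self.value)
    }
}
```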

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:02 +00:00
stevenhorsman
730c56af2a runtime-rs: Fix clippy::unnecessary-get-then-check
Clippy errors with:
```
error: unnecessary use of `get(&id).is_none()`
   --> crates/hypervisor/src/device/device_manager.rs:494:29
    |
494 |             if self.devices.get(&id).is_none() {
    |                -------------^^^^^^^^^^^^^^^^^^
    |                |
    |                help: replace it with: `!self.devices.contains_key(&id)`
```
so fix this as suggested
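
A small sketch of the suggested replacement, assuming a simplified, hypothetical `DeviceTable` rather than the real device manager:

```rust
use std::collections::HashMap;

// Hypothetical, simplified device table keyed by device id.
struct DeviceTable {
    devices: HashMap<String, u32>,
}

impl DeviceTable {
    // Returns true when no device with `id` is registered.
    fn is_missing(&self, id: &str) -> bool {
        // Before: self.devices.get(id).is_none()
        !self.devices.contains_key(id)
    }
}
```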

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
a9358b59b7 runtime-rs: Allow unused enum field
Clippy errors with:
```
error: field `0` is never read
   --> crates/hypervisor/src/qemu/cmdline_generator.rs:375:25
    |
375 |     DeviceAlreadyExists(String), // Error when trying to add an existing device
    |     ------------------- ^^^^^^
```
but this is used when creating the error later, so add an allow
to ignore this warning
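
A sketch of how such an allow can look, assuming a hypothetical `CmdlineError` enum and `add_device` helper. Where exactly the attribute was placed in `cmdline_generator.rs` is not shown in this message.

```rust
#[derive(Debug)]
enum CmdlineError {
    // The payload is only surfaced through `Debug`, which dead-code
    // analysis ignores, so the lint is silenced explicitly.
    #[allow(dead_code)]
    DeviceAlreadyExists(String),
}

// Hypothetical helper that builds the error when `id` is already present.
fn add_device(existing: &[&str], id: &str) -> Result<(), CmdlineError> {
    if existing.contains(&id) {
        return Err(CmdlineError::DeviceAlreadyExists(id.to_string()));
    }
    Ok(())
}
```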

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
1d9efeb92b runtime-rs: Remove use of legacy constants
Fix clippy error
```
error: usage of a legacy numeric constant
```
by swapping `std::u8::MAX` for `u8::MAX`
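
For illustration, a hypothetical helper that uses the associated constant form:

```rust
// Clamp a wider integer into the `u8` range using the associated constant
// `u8::MAX` rather than the legacy `std::u8::MAX` module constant.
fn saturate_to_u8(value: u32) -> u8 {
    if value > u32::from(u8::MAX) {
        u8::MAX
    } else {
        value as u8
    }
}
```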

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
225c7fc026 kata-ctl: Allow unused enum field
Clippy errors with:
```
error: field `0` is never read
```
but the field is required for the `map_err`, so ignore this
error for now to avoid too much disruption

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
f1d3450d1f runtime-rs: Remove unused config
`gdb` is only activated by a feature `guest_debug` that doesn't
exist, so remove this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
1e90fc38de dragonball: Fix incorrect reference
There were references to `config_manager::DeviceInfoGroup`
which doesn't exist, so I guess it means `DeviceConfigInfo`
instead, so update them

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
f389b05f20 dragonball: Fix doc formatting issue
Clippy errors with:
```
error: doc list item missing indentation
```
which I think is because the Return is between two list
items, so add a blank line to separate this into a separate
paragraph

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
8bea57326a dragonball: Fix thread_local initializer error
clippy errors with:
```
error: initializer for `thread_local` value can be made `const`
```
so update as suggested
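
A sketch of the `const` initializer form, using a hypothetical per-thread counter rather than the dragonball code:

```rust
use std::cell::Cell;

thread_local! {
    // `const { .. }` lets the compiler skip the lazy-initialization check
    // on every access; the initializer must be usable in a const context.
    static CALL_COUNT: Cell<u32> = const { Cell::new(0) };
}

// Increments the per-thread counter and returns the new value.
fn bump() -> u32 {
    CALL_COUNT.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}
```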

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
7257ee0397 agent: Remove implementation of ToString
Fix clippy error:
```
direct implementation of `ToString`
```
by switching to implement Display instead

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
ca87aca1a6 agent: Remove use of legacy constants
Fix clippy error
```
error: usage of a legacy numeric constant
```
by swapping `std::i32::<MIN/MAX>` for `i32::<MIN/MAX>`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
6008fd56a1 agent: Fix clippy error
```
error: file opened with `create`, but `truncate` behavior not defined
```
`truncate(true)` ensures the file is entirely overwritten with new data
which I believe is the behaviour we want
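
A minimal sketch of the difference `truncate(true)` makes, assuming a hypothetical `overwrite` helper (not the agent's actual function):

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

// Overwrites `path` with `data`, discarding any previous contents.
fn overwrite(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true) // without this, old bytes past `data.len()` survive
        .open(path)?;
    file.write_all(data)
}
```

Writing a long payload and then a short one leaves only the short one in the file, which is the "entirely overwritten" behaviour described above.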

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
a640bb86ec agent: cdh: Remove unnecessary borrows
Fix clippy error:
```
error: the borrowed expression implements the required traits
```

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
a131eec5c1 agent: config: Remove supports_seccomp
supports_seccomp is never used, so throws a clippy error

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
0bd36a63d9 agent: Fix clippy error
```
error: bound is defined in more than one place
```

Move Sized into the later definition of `R` & `W`
rather than defining them in two places
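
A sketch of the single-location form, assuming the bound in question is `?Sized` and using a hypothetical `pipe` function rather than the agent's code:

```rust
use std::io::{self, Read, Write};

// Before (flagged): fn pipe<R: ?Sized, W: ?Sized>(..) where R: Read, W: Write
// After: every bound on `R` and `W` lives in the `where` clause.
fn pipe<R, W>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: Read + ?Sized,
    W: Write + ?Sized,
{
    io::copy(reader, writer)
}
```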

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
7709198c3b rustjail: Fix clippy error
```
error: file opened with `create`, but `truncate` behavior not defined
```
`truncate(true)` ensures the file is entirely overwritten with new data
which I believe is the behaviour we want

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
Fabiano Fidêncio
b4de302cb2 genpolicy: Adjust to build with rust 1.80.0
```
error: field `image` is never read
  --> src/registry.rs:35:9
   |
34 | pub struct Container {
   |            --------- field in this struct
35 |     pub image: String,
   |         ^^^^^
   |
   = note: `Container` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis
   = note: `-D dead-code` implied by `-D warnings`
   = help: to override `-D warnings` add `#[allow(dead_code)]`

error: field `use_cache` is never read
   --> src/utils.rs:106:9
    |
105 | pub struct Config {
    |            ------ field in this struct
106 |     pub use_cache: bool,
    |         ^^^^^^^^^
    |
    = note: `Config` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis

error: could not compile `genpolicy` (bin "genpolicy") due to 2 previous errors
```

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
099b241702 powerpc64: Add target_endian = "little"
Based on comments from @Amulyam24 we need to use
the `target_endian = "little"` as well as target_arch = "powerpc64"
to ensure we are working on powerpc64le.
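
A sketch of the combined check. The `is_powerpc64le` function is hypothetical and only illustrates the `cfg` predicate:

```rust
// True only when compiled for little-endian 64-bit PowerPC (powerpc64le).
// `target_arch` has no "powerpc64le" value; endianness is a separate cfg key.
#[cfg(all(target_arch = "powerpc64", target_endian = "little"))]
fn is_powerpc64le() -> bool {
    true
}

#[cfg(not(all(target_arch = "powerpc64", target_endian = "little")))]
fn is_powerpc64le() -> bool {
    false
}
```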

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:45:01 +00:00
stevenhorsman
4c006c707a build: Fix powerpc64le target_arch
Starting with version 1.80, the Rust linter does not accept an invalid
value for `target_arch` in configuration checks:

```
   Compiling kata-sys-util v0.1.0 (/home/ddd/Work/kata/kata-containers/src/libs/kata-sys-util)
error: unexpected `cfg` condition value: `powerpc64le`

  --> /home/ddd/Work/kata/kata-containers/src/libs/kata-sys-util/src/protection.rs:17:34
   |
17 | #[cfg(any(target_arch = "s390x", target_arch = "powerpc64le"))]
   |                                  ^^^^^^^^^^^^^^-------------
   |                                                |
   |                                                help: there is a expected value with a similar name: `"powerpc64"`
   |
   = note: expected values for `target_arch` are: `aarch64`, `arm`, `arm64ec`, `avr`, `bpf`, `csky`, `hexagon`, `loongarch64`, `m68k`, `mips`, `mips32r6`, `mips64`, `mips64r6`, `msp430`, `nvptx64`, `powerpc`, `powerpc64`, `riscv32`, `riscv64`, `s390x`, `sparc`, `sparc64`, `wasm32`, `wasm64`, `x86`, and `x86_64`
   = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration
   = note: `-D unexpected-cfgs` implied by `-D warnings`
   = help: to override `-D warnings` add `#[allow(unexpected_cfgs)]`
```

According to [GitHub user @Urgau][explain], this is a new warning
introduced in Rust 1.80, but the problem existed before. The correct
architecture name is `powerpc64`, and the differentiation
between `powerpc64le` and `powerpc64` should use the `target_endian =
"little"` check.

[explain]: #10072 (comment)

Fixes: #10067

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
[emlima: fix some more occurrences and typos]
Signed-off-by: Emanuel Lima <emlima@redhat.com>
[stevenhorsman: fix some more occurrences and typos]
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-05 14:20:47 +00:00
Zvonko Kaiser
429b2654f4 Merge pull request #10812 from zvonkok/fix-arch-build-gpu
gpu: Fix arm64 build
2025-02-04 17:03:37 -05:00
Dan Mihai
3fc170788d Merge pull request #10811 from microsoft/cameronbaird/hyp-loglevel-upstream
CLH: config: add hypervisor_loglevel
2025-02-04 11:59:21 -08:00
Zvonko Kaiser
eeacd8fd74 gpu: Adapt rootfs build for multi-arch
Add aarch64 and x86_64 handling. Especially build the Rust
dependency with the correct rust musl target.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-04 16:44:21 +00:00
Steve Horsman
9060904c4f Merge pull request #10826 from kata-containers/topic/crio-test-timeouts
workflows: Add delete kata-deploy timeouts for crio tests
2025-02-04 13:09:49 +00:00
Ruoqing He
8e073a6715 ci: Update yq to v4.44.5 to support riscv64
In v4.44.5 of `yq`, artifacts for riscv64 are released. Update the
version used for `yq` and enable `install_yq.sh` to work on riscv64.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-04 19:36:34 +08:00
Zvonko Kaiser
95c63f4982 Merge pull request #10827 from stevenhorsman/bump-golang-1.22.11
versions: Bump golang version
2025-02-03 16:06:56 -05:00
Zvonko Kaiser
7dc8060051 Merge pull request #10828 from stevenhorsman/fix-versions-comments
versions: Fix formatting
2025-02-03 16:06:37 -05:00
stevenhorsman
546e3ae9ea versions: Fix formatting
The static_checks_versions test uses yamllint which fails with:
```
[comments] too few spaces before comment
```
many times, which makes code reviews more annoying with
all these extra messages. Although it's probably not the worst issue,
I checked the
[yaml spec](https://yaml.org/spec/1.2.2/#66-comments)
and it does say
> Comments must be separated from other tokens by white space character*s*

so it's easiest to fix it and move on.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-03 17:08:25 +00:00
Zvonko Kaiser
122ad95da6 Merge pull request #10751 from ryansavino/snp-upstream-host-kernel-support
snp: update kata to use latest upstream packages for snp
2025-02-03 11:20:59 -05:00
stevenhorsman
d9eb1b0e06 versions: Bump golang version
Bump golang versions so we are more up-to-date and
have the extra security fixes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-03 15:28:53 +00:00
stevenhorsman
5203158195 workflows: Add delete kata-deploy timeouts for crio tests
I've also seen cases (the qemu, crio, k0s tests) where Delete kata-deploy is still
running for this test after 2 hours, and had to be manually
cancelled, so let's try adding a 5m timeout to the kata-deploy delete to stop CI jobs hanging.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-03 11:45:43 +00:00
Greg Kurz
a806d74ce3 Merge pull request #10807 from kata-containers/dependabot/go_modules/src/tools/csi-kata-directvolume/go_modules-8d4d0c168c
build(deps): bump github.com/golang/glog from 1.2.0 to 1.2.4 in /src/tools/csi-kata-directvolume in the go_modules group across 1 directory
2025-02-01 08:29:44 +01:00
Cameron Baird
b6b0addd5e config: add hypervisor_loglevel
Implement HypervisorLoglevel config option for clh.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-01-31 18:37:03 +00:00
Steve Horsman
41f23f1d2a Merge pull request #10823 from stevenhorsman/fix-virtiofsd-build-error
packaging: virtiofsd: Allow building a specific commit
2025-01-31 16:18:02 +00:00
stevenhorsman
1cf1a332a5 packaging: virtiofsd: Allow building a specific commit
#10714 added support for building a specific commit,
but due to the clone only having `--depth=1`, we can only
reset to a commit if it's the latest on the `main` branch,
otherwise we will get:
```
+ git clone --depth 1 --branch main https://gitlab.com/virtio-fs/virtiofsd virtiofsd
Cloning into 'virtiofsd'...
warning: redirecting to https://gitlab.com/virtio-fs/virtiofsd.git/
+ pushd virtiofsd
+ git reset --hard cecc61bca981ab42aae6ec490dfd59965e79025e
...
fatal: Could not parse object 'cecc61bca981ab42aae6ec490dfd59965e79025e'.
```

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-31 11:24:23 +00:00
Greg Kurz
0215d958da Merge pull request #10805 from balintTobik/egrep_removal
egrep/fgrep removal
2025-01-30 18:26:59 +01:00
Hyounggyu Choi
530fedd188 Merge pull request #10767 from BbolroC/enable-coldplug-vfio-ap-s390x
Enable VFIO-AP coldplug for s390x
2025-01-30 12:11:00 +01:00
Balint Tobik
1943a1c96d tests: replace egrep with grep -E to avoid deprecation warning
https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Balint Tobik <btobik@redhat.com>
2025-01-29 11:26:27 +01:00
Balint Tobik
47140357c4 docs: replace egrep/fgrep with grep -E/-F to avoid deprecation warning
https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Balint Tobik <btobik@redhat.com>
2025-01-29 11:25:54 +01:00
Ryan Savino
90e2b7d1bc docs: updated build and host setup instructions for SNP
Referenced AMD developer page for latest SEV firmware.
Instructions to point to upstream 6.11 kernel or later.
Referenced sev-utils and AMDESE fork for kernel setup.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2025-01-28 18:09:40 -06:00
Ryan Savino
c1ca49a66c snp: set snp to use upstream qemu in config
use upstream qemu in snp and nvidia snp configs.
load ovmf with bios flag on qemu cmdline instead of file.

Fixes: #10750

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2025-01-28 18:09:40 -06:00
Ryan Savino
af235fc576 Revert "builds: ovmf: Workaround Zeex repo becoming private"
This reverts commit aff3d98ddd.
2025-01-28 18:09:40 -06:00
Ryan Savino
bb7ca954c7 ovmf: upgrade standard and sev ovmf
ovmf upgraded to latest tag for standard and sev.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2025-01-28 18:09:40 -06:00
Ryan Savino
e87231edc7 snp: remove snp certs on qemu cmdline
Standard snp attestation with the upstream kernel and qemu does not support extended attestation with certs.

Fixes: #10750

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2025-01-28 18:09:40 -06:00
Zvonko Kaiser
f9bbe4e439 Merge pull request #10785 from zvonkok/agent-cgv2-activate
agent: Add proper activation param handling to activate cgroupV2
2025-01-28 14:21:15 -05:00
dependabot[bot]
df5eafd2a1 build(deps): bump github.com/golang/glog
Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [github.com/golang/glog](https://github.com/golang/glog).


Updates `github.com/golang/glog` from 1.2.0 to 1.2.4
- [Release notes](https://github.com/golang/glog/releases)
- [Commits](https://github.com/golang/glog/compare/v1.2.0...v1.2.4)

---
updated-dependencies:
- dependency-name: github.com/golang/glog
  dependency-type: direct:production
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-01-28 17:38:14 +00:00
Fabiano Fidêncio
5e00a24145 Merge pull request #10749 from zvonkok/pass-through-stack
gpu: Add driver version selection
2025-01-28 16:24:16 +01:00
Hyounggyu Choi
dde627cef4 test: Run full set of zcrypttest for VFIO-AP coldplug
Previously, the test for VFIO-AP coldplug only checked whether a
passthrough device was attached to the VM guest. This commit expands
the test to include a full set of zcrypttest to verify that the device
functions properly within a container.

Additionally, since containerd has been upgraded to v1.7.25 on the
test machine, it is no longer necessary to run the test via crictl.
The commit removes all related code and files.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-28 10:53:00 +01:00
Hyounggyu Choi
47db9b3773 agent: Run check_ap_device() for VFIO-AP coldplug
This commit updates the device handler to call check_ap_device()
instead of wait_for_ap_device() for VFIO-AP coldplug.
The handler now returns a SpecUpdate for passthrough devices if
the device is online (e.g., `/sys/devices/ap/card05/05.001f/online`
is set to 1).
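
A rough sketch of an "is it online" check of this kind, assuming a hypothetical `ap_device_online` helper. It is not the agent's `check_ap_device()` implementation, whose signature is not shown here.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: reports whether the sysfs `online` attribute under
// `device_dir` reads as "1". It does not wait for uevents, which matches the
// coldplug case where the device exists before the check.
fn ap_device_online(device_dir: &Path) -> io::Result<bool> {
    let raw = fs::read_to_string(device_dir.join("online"))?;
    Ok(raw.trim() == "1")
}
```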

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-28 10:53:00 +01:00
Hyounggyu Choi
200cbfd0b0 kata-types: Introduce new type vfio-ap-cold for VFIO-AP coldplug
This newly introduced type will be used by the VFIO-AP device handler
on the agent.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-28 10:53:00 +01:00
Hyounggyu Choi
4a6ba534f1 runtime: Introduce new gRPC device type for VFIO-AP coldplug
This commit introduces a new gRPC device type, `vfio-ap-cold`, to support
VFIO-AP coldplug. This enables the VM guest to handle passthrough devices
differently from VFIO-AP hotplug.
With this new type, the guest no longer needs to wait for events (e.g., device
addition) because the device already exists at the time the device type is checked.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-28 10:53:00 +01:00
Hyounggyu Choi
419b5ed715 runtime: Add DeviceInfo to Container for VFIO coldplug configuration
Even though ociSpec.Linux.Devices is preserved when vfio_mode is VFIO,
it has not been updated correctly for coldplug scenarios. This happens
because the device info passed to the agent via CreateContainerRequest
is dropped by the Kata runtime.
This commit ensures that the device info is added to the sandbox's
device manager when vfio_mode is VFIO and coldPlugVFIO is true
(e.g., vfio-ap-cold), allowing ociSpec.Linux.Devices to be properly
updated with the device information before the container is created on
the guest.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-28 10:53:00 +01:00
Balint Tobik
233d15452b runtime: replace egrep with grep -E to avoid deprecation warning
https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Balint Tobik <btobik@redhat.com>
2025-01-28 10:46:44 +01:00
Balint Tobik
e657f58cf9 ci: replace egrep with grep -E to avoid deprecation warning
https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Balint Tobik <btobik@redhat.com>
2025-01-28 10:46:44 +01:00
Zvonko Kaiser
9f2799ba4f Merge pull request #10790 from JakubLedworowski/add-xattr-to-confidential-kernel
kernel: Add CONFIG_TMPFS_XATTR to tdx.conf
2025-01-27 13:47:08 -05:00
Zvonko Kaiser
d2528ef84f gpu: Initialize unbound variables rootfs.sh
Since we're importing some build scripts for nvidia and we're
setting `set -u`, we have some unbound variables in rootfs.sh.
Add initialization for those.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 18:37:21 +00:00
Zvonko Kaiser
9162103f85 agent: Update macro for e.g. String type
Stack-only types are handled properly by the
parse_cmdline_param macro, but advanced types like
String couldn't be guarded by a guard function, since
the macro passed the variable by value rather than by reference.

Now we can have guard functions for the String type:

parse_cmdline_param!(
    param,
    CGROUP_NO_V1,
    config.cgroup_no_v1,
    get_string_value,
    | no_v1 | no_v1 == "all"
);

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:43 +00:00
Zvonko Kaiser
aab9d36e47 agent: Add tests for cgroup_no_v1
The only valid value is "all"; ignore all others

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:43 +00:00
Zvonko Kaiser
e1596f7abf agent: Add option to parse cgroup_no_v1
For AGENT_INIT=yes we do not run systemd and hence
systemd.unified_... does not mean anything to other init
systems. Providing cgroup_no_v1=all is enough to signal
other init systems to use cgroupV2.
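
A simplified sketch of the parsing rule, assuming a hypothetical standalone `parse_cgroup_no_v1` function. The agent itself goes through its `parse_cmdline_param!` macro, which is not reproduced here.

```rust
// Hypothetical parser: returns Some("all") only when the kernel command line
// carries `cgroup_no_v1=all`; any other value is ignored.
fn parse_cgroup_no_v1(cmdline: &str) -> Option<String> {
    cmdline
        .split_whitespace()
        .filter_map(|param| param.strip_prefix("cgroup_no_v1="))
        .filter(|value| *value == "all") // guard inspects the value by reference
        .map(str::to_string)
        .last()
}
```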

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:43 +00:00
Zvonko Kaiser
cd7001612a gpu: rootfs adjust for AGENT_INIT=no
Since we're defaulting to AGENT_INIT=no for all the initrd/images
adapt the NV build to properly get kata-agent installed.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:21 +00:00
Zvonko Kaiser
10974b7bec gpu: AGENT_INIT=no
We're setting AGENT_INIT=no globally for each initrd and image

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:21 +00:00
Zvonko Kaiser
98e0dc1676 gpu: Add set -u to scripts
Make the scripts more robust by failing on unset variables

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:21 +00:00
Zvonko Kaiser
f153229865 gpu: Add driver version selection
Besides latest and lts options add an option to specify
the exact driver version.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-27 17:56:21 +00:00
Steve Horsman
311c3638c6 Merge pull request #10794 from fidencio/topic/bump-ubuntu-version-for-the-confidential-rootfs-and-initrd
versions: Bump Ubuntu base image & initrd
2025-01-27 15:55:16 +00:00
Fabiano Fidêncio
84b0ca1b18 versions: Bump Ubuntu rootfs / initrd versions
While I wish we could be bumping to the very same version everywhere,
it's not possible and it's been quite a ride to get a combination of
things that work.

Let me try to describe my approach here:
* Do *NOT* stay on 20.04
  * This version will be EOL'ed by April
  * This version has a very old version of systemd that causes a bug
    when trying to online the cpusets for guests using systemd as
    init, causing then a breakage on the qemu-coco-non-tee and TDX
    non-attestation set of tests

* Bump to 22.04 when possible
  * This was possible for the majority of the cases, but not for the
    confidential initrd & confidential images for x86_64, the reasons
    being failures on AMD SEV CI (which I didn't debug), and a kernel
    panic on the CentOS 9 Stream TDX machine
  * 22.04 is being used instead of 24.04 as multistrap is simply broken
    on Ubuntu 24.04, and I'd prefer to stay on an LTS release whenever
    it's possible

* Bump to 24.10 for x86_64 image confidential
  * This was done as we got everything working with 24.10 in the CI.
  * This requires using libtdx-attest from noble (Ubuntu 24.04), as
    Intel only releases their sgx stuff for LTS releases.

* Stick to 20.04 for x86_64 initrd confidential
  * 24.10 caused a panic on their CI
  * This is only being used by AMD so far, so they can decide when to
    bump, after doing the proper testing & debug that the bump will work
    as expected for them

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 15:08:20 +01:00
Carlos Segarra
b6e0effc06 tdx: bump version of libtdx-attest in rootfs-builder
Bump libtdx-attest to its 1.22 release.

Signed-off-by: Carlos Segarra <carlos@carlossegarra.com>
2025-01-27 15:08:20 +01:00
Fabiano Fidêncio
2b5dbfacb8 osbuilder: ubuntu: Try to install pyinstaller using --break-system-packages
We first try without passing the `--break-system-packages` argument, as
that's not supported on Ubuntu 22.04 or older, but that's required on
Ubuntu 24.04 or newer.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 15:08:20 +01:00
Fabiano Fidêncio
c54f78bc6b local-build: cache: Consider os name & version for image/initrd
Otherwise a bump in the os name and / or os version would lead to the CI
using a cached artefact.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 15:08:20 +01:00
Fabiano Fidêncio
4a66acc6f5 osbuilder: ubuntu: Abort if multistrap fails (but not on 20.04)
We have gotten Ubuntu 20.04 working pretty much "by luck", as multistrap
fails the deployment, and then a hacky function was introduced to add
the proper dbus links.  However, this does not scale at all, and we
should:
* Fail if multistrap fails
  * I won't do this for Ubuntu 20.04 as it's working for now and soon
    enough it'll be EOL
* Add better logging to ensure someone can know when multistrap fails

Below you can find the failure that we're hitting on Ubuntu 20.04:
```sh
Errors were encountered while processing:
 dbus
ERR: dpkg configure reported an error.
Native mode configuration reported an error!
I: Tidying up apt cache and list data.

Multistrap system reported 1 error in /rootfs/.
I: Tidying up apt cache and list data.
```

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 15:08:16 +01:00
Fabiano Fidêncio
585f82f730 osbuilder: ubuntu: Ensure OS_VERSION is passed & used
Right now we're hitting an interesting situation with osbuilder, where
regardless of what's being passed Ubuntu 20.04 (focal) is being used
when building the rootfs-image, as shown in the snippets of the logs
below:
```
ffidenci@tatu:~/src/upstream/kata-containers/kata-containers$ make rootfs-image-confidential-tarball
/home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-copy-libseccomp-installer.sh "agent"
make agent-tarball-build
...
make pause-image-tarball-build
...
make coco-guest-components-tarball-build
...
make kernel-confidential-tarball-build
...
make rootfs-image-confidential-tarball-build
make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers'
/home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh  --build=rootfs-image-confidential
sha256:f16c57890b0e85f6e1bbe1957926822495063bc6082a83e6ab7f7f13cabeeb93
Build kata version 3.13.0: rootfs-image-confidential
INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/destdir
INFO: Create image
build image
~/src/upstream/kata-containers/kata-containers/tools/osbuilder ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/builddir
INFO: Build image
INFO: image os: ubuntu
INFO: image os version: latest
Creating rootfs for ubuntu
/home/ffidenci/src/upstream/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs.sh -o 3.13.0-13f0807e9f5687d8e5e9a0f4a0a8bb57ca50d00c-dirty -r /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/builddir/rootfs-image/ubuntu_rootfs ubuntu
INFO: rootfs_lib.sh file found. Loading content
~/src/upstream/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/ubuntu ~/src/upstream/kata-containers/kata-containers/tools/osbuilder
~/src/upstream/kata-containers/kata-containers/tools/osbuilder

INFO: rootfs_lib.sh file found. Loading content
INFO: build directly

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [128 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease [128 kB]
Get:4 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [4276 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal-backports InRelease [128 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Get:7 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [1297 kB]
Get:8 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [30.9 kB]
Get:9 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [4187 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [4663 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1589 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [34.6 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [4463 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [55.2 kB]
Get:18 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [28.6 kB]
Fetched 34.1 MB in 5s (6284 kB/s)
...
```

The reason this is happening is due to a few issues in different places:
1. IMG_OS_VERSION, passed to osbuilder, is not used anywhere and
   OS_VERSION should be used instead. And we should break if OS_VERSION
   is not properly passed down
2. Using UBUNTU_CODENAME is simply wrong, as it'll use whatever comes as
   the base container from kata-deploy's local-build scripts, and it has
   just been working by luck

Note that while this commit fixes the wrong behaviour, it
would also break the rootfs builds as they are, thus we need to set
versions.yaml to use 20.04 where it was already using 20.04 even without
us knowing.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 14:19:42 +01:00
Fabiano Fidêncio
02a18c1359 versions: Clarify which release matches a codename
It'll make the life of the developers not so familiar with Ubuntu
easier.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 14:19:42 +01:00
Fabiano Fidêncio
ca96a6ac76 versions: Use Ubuntu codename instead of versions
As this is required as part of the osbuilder tool to be able to properly
set the repositories used when building the rootfs.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 14:19:39 +01:00
Fabiano Fidêncio
353ceb948e versions: Don't use the yaml variable definitions
While having variables is nice, they are more verbose to write down,
and actually confusing for tired developer eyes to read, plus we're
mixing the use of the yaml variables here and there together with not
using them for some architectures.

With the best "all or nothing" spirit, let's just make it easier for our
developers to read the versions.yaml and easily understand what's being
used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-27 14:19:08 +01:00
Jakub Ledworowski
42531cf6c4 kernel: Add CONFIG_TMPFS_XATTR to confidential kernel
During pull inside the guest, overlayfs expects xattrs.

Fixes: [guest-components#876](https://github.com/confidential-containers/guest-components/issues/876)

Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>
2025-01-27 07:07:54 +01:00
Zvonko Kaiser
b4c710576e Merge pull request #10782 from stevenhorsman/clh-metrics-write-update
metrics: Increase minval range for blogbench test
2025-01-24 10:21:20 -05:00
Steve Horsman
54e7e1fdc3 Merge pull request #10768 from kata-containers/dependabot/go_modules/src/runtime/go_modules-28d0d344dd
build(deps): bump the go_modules group across 3 directories with 1 update
2025-01-24 12:04:56 +00:00
Greg Kurz
17f3eb0579 Merge pull request #10766 from balintTobik/remove_shebang
Remove shebang in non-executable completion script
2025-01-24 12:29:03 +01:00
Alex Lyn
ee635293c6 Merge pull request #10740 from RuoqingHe/virtiofsd-riscv64
virtiofsd: Enable build for RISC-V
2025-01-24 15:43:56 +08:00
Zvonko Kaiser
f5c509d58e Merge pull request #10779 from kata-containers/topic/arm64-static-build-runner
workflows: Move arm static checks runner
2025-01-23 22:29:16 -05:00
Fabiano Fidêncio
4bc978416c Merge pull request #10720 from fidencio/topic/test-cgroupsv2-on-guest
kernel: Ensure no cgroupsv1 is used
2025-01-23 21:26:49 +01:00
Aurélien Bombo
66d292bdb4 Merge pull request #10732 from microsoft/danmihai/minor-systemd-cleanup
rootfs: minor systemd file deletion cleanup
2025-01-23 11:29:25 -06:00
Fabiano Fidêncio
b47cc6fffe cri-containerd: Skip TestDeviceCgroup till it's adapted to cgroupsv2
As the devices controller works in a different way in cgroupsv2, the
"/sys/fs/cgroup/devices/devices.list" file simply doesn't exist.

For now, let's skip the test till the test maintainer decides to
re-enable it for cgroupsv2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 17:25:56 +01:00
Fabiano Fidêncio
0626d7182a tests: k8s-cpu-ns: Adapt to cgroupsv2
The changes done are:
* cpu/cpu.shares was replaced by cpu.weight
  * The weight, according to our reference[0], is calculated by:
    weight = (1 + ((request - 2) * 9999) / 262142)

* cpu/cpu.cfs_quota_us & cpu/cpu.cfs_period_us were replaced by cpu.max,
  where quota and period are written together (in this order)

[0]: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2
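
The conversion above can be sketched as follows. This is an illustration
only, not the test code: the function names are hypothetical, and integer
division is an assumption about how the referenced formula is evaluated.

```python
def shares_to_weight(shares: int) -> int:
    """Map a cgroups v1 cpu.shares value (2..262144) to a v2 cpu.weight (1..10000)."""
    # Formula quoted in the commit message; integer division is assumed.
    return 1 + ((shares - 2) * 9999) // 262142


def format_cpu_max(quota_us: int, period_us: int) -> str:
    """Build the single cpu.max value: quota first, then period."""
    return f"{quota_us} {period_us}"


if __name__ == "__main__":
    print(shares_to_weight(1024))
    print(format_cpu_max(50000, 100000))
```

The two ends of the v1 range map to the two ends of the v2 range, which is
a quick sanity check when adapting test expectations.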

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 17:25:56 +01:00
Fabiano Fidêncio
4307f0c998 Revert "ci: mariner: Ensure kernel_params can be set"
This reverts commit 091ad2a1b2, in order
to ensure tests would be running with cgroupsv2 on the guest.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 17:25:56 +01:00
Fabiano Fidêncio
c653719270 kernel: Ensure no cgroupsv1 is used
Let's ensure that we're fully running the guest on cgroupsv2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 17:25:56 +01:00
stevenhorsman
d031e479ab metrics: Increase minval range for blogbench test
In the last couple of days I've seen the blogbench
metrics write latency test on clh fail a few times because
the latency was too low, so adjust the minimum range
to tolerate quicker finishes.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-23 15:58:31 +00:00
Fabiano Fidêncio
66d881a5da Merge pull request #10755 from fidencio/topic/ensure-systemd-is-used-as-init-for-coco-cases
rootfs-confidential: Ensure systemd is used as init
2025-01-23 15:25:24 +01:00
stevenhorsman
3acce82c91 ci: Update gatekeeper tests for static workflow
The static-checks workflows are triggered by `pull_request`, so
they can run the workflow version from the PR. Update
required-tests.yaml so that changes to the static-check
workflows trigger the static checks and are tested properly.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-23 14:23:09 +00:00
stevenhorsman
d625f20d18 workflows: Move arm static checks runner
Now that we have the build-assets running on the gh-hosted
runners, try the same approach for the static-checks.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-23 14:23:09 +00:00
Zvonko Kaiser
a23d6a1241 Merge pull request #10777 from zvonkok/arm64-nvidia-gpu-kernel
gpu: Fix arm64 kernel build
2025-01-23 07:14:30 -05:00
Christophe de Dinechin
9a92a4bacf cli: Remove shebang in non-executable completion script
Raised during package review [1] by rpmlint

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1590425#c8

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Balint Tobik <btobik@redhat.com>
2025-01-23 13:11:25 +01:00
Fabiano Fidêncio
734ef71cf7 tests: k8s: confidential: Cleanup $HOME/.ssh/known_hosts
I've noticed the following error when running the tests with SEV:
```
2025-01-21T17:10:28.7999896Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2025-01-21T17:10:28.8000614Z # @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
2025-01-21T17:10:28.8001217Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2025-01-21T17:10:28.8001857Z # IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
2025-01-21T17:10:28.8003009Z # Someone could be eavesdropping on you right now (man-in-the-middle attack)!
2025-01-21T17:10:28.8003348Z # It is also possible that a host key has just been changed.
2025-01-21T17:10:28.8004422Z # The fingerprint for the ED25519 key sent by the remote host is
2025-01-21T17:10:28.8005019Z # SHA256:x7wF8zI+LLyiwphzmUhqY12lrGY4gs5qNCD81f1Cn1E.
2025-01-21T17:10:28.8005459Z # Please contact your system administrator.
2025-01-21T17:10:28.8006734Z # Add correct host key in /home/kata/.ssh/known_hosts to get rid of this message.
2025-01-21T17:10:28.8007031Z # Offending ED25519 key in /home/kata/.ssh/known_hosts:178
2025-01-21T17:10:28.8007254Z #   remove with:
2025-01-21T17:10:28.8008172Z #   ssh-keygen -f "/home/kata/.ssh/known_hosts" -R "10.244.0.71"
```

And this was causing a failure to ssh into the confidential pod.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 12:04:13 +01:00
Fabiano Fidêncio
18137b1583 tests: k8s: confidential: Increase log_buf_len to 4M
Relying on dmesg is really not ideal, as we may lose important info,
mainly those which happen very early in the boot, depending on the size
of kernel ring buffer.

So, for this specific test, let's increase the kernel ring buffer, by
default, to 4M.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 12:04:13 +01:00
Fabiano Fidêncio
d5f907dcf1 rootfs-confidential: Ensure systemd is used as init
Let's make sure that we don't use Kata Containers' agent as init for the
Confidential related rootfses*, as we don't want to increase the agent's
complexity for no reason ... mainly when we can rely on a proper init
system.

*:
- images already used systemd as init
- initrds are now using systemd as init

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-23 12:04:13 +01:00
dependabot[bot]
d2cb14cdbc build(deps): bump the go_modules group across 3 directories with 1 update
Bumps the go_modules group with 1 update in the /src/runtime directory: [golang.org/x/net](https://github.com/golang/net).
Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [golang.org/x/net](https://github.com/golang/net).
Bumps the go_modules group with 1 update in the /tools/testing/kata-webhook directory: [golang.org/x/net](https://github.com/golang/net).


Updates `golang.org/x/net` from 0.25.0 to 0.33.0
- [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0)

Updates `golang.org/x/net` from 0.23.0 to 0.33.0
- [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0)

Updates `golang.org/x/net` from 0.23.0 to 0.33.0
- [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: golang.org/x/net
  dependency-type: indirect
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-01-23 10:18:22 +00:00
Fupan Li
944eb2cf3f Merge pull request #10762 from teawater/remove_enable_swap
libs/kata-types: Remove config enable_swap
2025-01-23 14:03:42 +08:00
Fupan Li
ebd8ec227b Merge pull request #10778 from zvonkok/kata-agent-cgroupsV2
agent: Ensure proper cgroupsV2 handling with init_mode=true
2025-01-23 14:00:13 +08:00
Zvonko Kaiser
afd286f6d6 agent: Ensure proper cgroupsV2 with init_mode=yes
When the agent is run as the init process, it sets up cgroupfs itself.
With cgroupsV1 we needed to enable the memory hierarchy explicitly; with
cgroupsV2 it is enabled by default. Additionally, the file
/sys/fs/cgroup/memory/memory.use_hierarchy isn't even available with V2.
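
A minimal sketch of the kind of check described above, written in Python
for illustration rather than in the agent's Rust. The function names are
hypothetical; the only fact relied on is that a cgroups v2 unified
hierarchy exposes a cgroup.controllers file at its root.

```python
from pathlib import Path


def is_cgroup_v2(root: str = "/sys/fs/cgroup") -> bool:
    """Return True when `root` looks like a cgroups v2 unified hierarchy."""
    # The unified hierarchy exposes cgroup.controllers at its root.
    return (Path(root) / "cgroup.controllers").is_file()


def needs_use_hierarchy(root: str = "/sys/fs/cgroup") -> bool:
    """Return True only when the v1 memory.use_hierarchy knob exists."""
    knob = Path(root) / "memory" / "memory.use_hierarchy"
    return not is_cgroup_v2(root) and knob.exists()
```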

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-23 03:54:51 +00:00
Fabiano Fidêncio
3f8abb4da7 Merge pull request #10776 from kata-containers/topic/arm64-runners
workflows: Switch to github-hosted arm runners
2025-01-22 23:14:28 +01:00
Zvonko Kaiser
91c6d524f8 gpu: Fix arm64 kernel build
CONFIG_IOASID is not configurable in newer kernels, so remove it.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-22 18:15:57 +00:00
Fabiano Fidêncio
6baa60d77d Merge pull request #10775 from fidencio/topic/update-ttrpc-crate
agent: Update ttrpc to include the fix for connectivity issues
2025-01-22 17:45:38 +01:00
stevenhorsman
ab27e11d31 workflows: Switch to github-hosted arm runner
Now that GitHub has hosted arm runners
https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/
we should try and switch our arm64 builder jobs to
run on these.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-22 16:27:17 +00:00
Greg Kurz
90b6d5725b Merge pull request #10773 from RuoqingHe/retry-on-aks-throttle
ci: Retry on failure of Create AKS cluster
2025-01-22 15:30:57 +01:00
Ruoqing He
373a388844 ci: Retry on failure of Create AKS cluster
The `Create AKS cluster` step in `run-k8s-tests-on-aks.yaml` is likely
to fail since we are issuing `PUT` requests to `aks` at a relatively
high frequency, while the `aks` end has its limits on `bucket-size` and
`refill-rate`, documented here [1].

Use `nick-fields/retry@v3` to retry 10 seconds after a request fails,
based on observations that AKS requested delays of 7 or 8 seconds
before retrying as part of its 429 responses.

[1] https://learn.microsoft.com/en-us/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apis

Fixes: #10772

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-22 13:24:51 +00:00
Fabiano Fidêncio
a8678a7794 deps: Update ttrpc to v0.8.4
Update the ttrpc crate to include the fix from Moritz Sanft, which
solves the connectivity issues with 6.12.x kernels*

*: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.12.9&id=3257813a3ae7462ac5cde04e120806f0c0776850

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-22 13:05:43 +01:00
Fabiano Fidêncio
e71bc1f068 Merge pull request #10770 from zvonkok/gpu_kernel_dep
gpu: Add kernel dep for the non coco use-case
2025-01-22 12:53:39 +01:00
Greg Kurz
17d053f4bb Merge pull request #10711 from teawater/balloon
Add reclaim_guest_freed_memory config to qemu and cloud-hypervisor
2025-01-22 10:57:13 +01:00
Hui Zhu
c148b70da7 libs/kata-types: Remove config enable_swap
Remove config enable_swap because no code uses it.

Fixes: #10761

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-01-22 11:08:45 +08:00
Aurélien Bombo
4e9d1363b3 Merge pull request #10754 from sprt/sprt/ci-gh-pr-number-coco
ci: Unify on `$GH_PR_NUMBER` environment variable
2025-01-21 15:07:24 -06:00
Zvonko Kaiser
4621f53e4a gpu: Add kernel dep for the non coco use-case
Add the kernel dependency to the non coco use-case
so that a rootfs build can be executed via GHA.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-21 16:18:14 +00:00
Zvonko Kaiser
61c282c725 Merge pull request #10769 from kata-containers/revert-10764-gpu_ci_cd
Revert "gpu: Add rootfs target amd64/arm64"
2025-01-21 11:09:52 -05:00
Zvonko Kaiser
9fd430e46b Revert "gpu: Add rootfs target amd64/arm64"
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-21 16:08:30 +00:00
Zvonko Kaiser
ef1639b6bf Merge pull request #10764 from zvonkok/gpu_ci_cd
gpu: Add rootfs target amd64/arm64
2025-01-21 09:51:20 -05:00
Ruoqing He
7e76ef587a virtiofsd: Enable build for RISC-V
With this change, `virtiofsd` (gnu target) can be built and then used
with other components.

Depends: #10741
Fixes: #10739

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-01-21 18:05:37 +08:00
Hui Zhu
185b94b7fa runtime-rs: Add reclaim_guest_freed_memory cloud-hypervisor support
Add reclaim_guest_freed_memory config to cloud-hypervisor in runtime-rs.

Fixes: #10710

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-01-21 10:34:21 +08:00
Hui Zhu
487171d992 runtime-rs: Add reclaim_guest_freed_memory qemu support
Add reclaim_guest_freed_memory config to qemu in runtime-rs.

Fixes: #10710

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-01-21 10:34:18 +08:00
Hui Zhu
8f550de88a runtime-rs: db: Change config enable_balloon_f_reporting
Change config enable_balloon_f_reporting of db to
reclaim_guest_freed_memory.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-01-21 10:34:08 +08:00
Hui Zhu
42f5ef9ff1 kernel: config: Add CONFIG_VIRTIO_BALLOON to virtio.conf
Add CONFIG_VIRTIO_BALLOON to virtio.conf to enable virtio-balloon.

Fixes: #10710

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2025-01-21 10:34:04 +08:00
Zvonko Kaiser
8b097244e7 gpu: Add rootfs initrd build for arm64
We need the arm64 builds as well for GH and GB systems.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-20 19:03:52 +00:00
Zvonko Kaiser
f525631522 gpu: Add rootfs target amd64
Adding the initrd build first to get the rootfs on amd64.
With that we can start to add tests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-20 19:01:42 +00:00
Zvonko Kaiser
d7059e9024 Merge pull request #10736 from zvonkok/gpu-rootfs-fix
gpu: Fix rootfs build
2025-01-17 14:44:41 -05:00
Aurélien Bombo
0d70dc31c1 ci: Unify on $GH_PR_NUMBER environment variable
While working on #10559, I realized that some parts of the codebase use
$GH_PR_NUMBER, while other parts use $PR_NUMBER.

Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests
without realizing that TEE tests use $PR_NUMBER, the tests on that PR
fail on TEEs:

https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45

  ...
  44      error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context
  ...
  135               image: ghcr.io/kata-containers/csi-kata-directvolume:
  ...

So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the
future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER.

Note that since some test scripts also refer to that variable, the CI
for this PR will fail (would have also happened with the converse
substitution), hence I'm not adding the ok-to-test label and we should
force-merge this after review.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-01-17 10:53:08 -06:00
Fabiano Fidêncio
c018a1cc61 Merge pull request #10741 from RuoqingHe/update-virtiofsd-build-image
virtiofsd: Update ubuntu to 22.04 for gnu target
2025-01-16 20:51:10 +01:00
Zvonko Kaiser
2777b13db7 Merge pull request #10742 from zvonkok/3.13.0-release
release: Bump version to 3.13.0
2025-01-16 10:05:48 -05:00
Ruoqing He
c70195d629 virtiofsd: Update ubuntu to 22.04 for gnu target
With ubuntu 20.04 image, virtiofsd gnu target couldn't be built due to
"unsupported ISA subset z" reported by "cc".

Updating to ubuntu 22.04 image addresses this problem.

Relates: #10739

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-01-16 17:27:38 +08:00
Zvonko Kaiser
f0bd83b073 gpu: Fix rootfs build
pyinstaller is located under /usr/local/bin by default; some prior
versions installed it to ${HOME}.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-15 20:37:51 +00:00
Aurélien Bombo
0d93f59f5b Merge pull request #10738 from microsoft/danmihai1/empty-pty-lines
runtime: skip empty Guest console output lines
2025-01-15 10:33:24 -06:00
Zvonko Kaiser
0b04f43ac6 release: Bump version to 3.13.0
Bump VERSION and helm-chart versions

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-15 16:13:22 +00:00
Zvonko Kaiser
365def9b4a Merge pull request #10735 from BbolroC/kubectl-create-retry-trusted-storage
tests: Introduce retry_kubectl_apply() for trusted storage
2025-01-14 21:59:45 -05:00
Dan Mihai
2e21f51375 runtime: skip empty Guest console output lines
Skip logging empty lines of text from the Guest console output, if
there are any such lines.

Without this change, the Guest console log from CLH + /dev/pts/0 has
twice as many lines of text. Half of these lines are empty.
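
The filtering described above can be sketched as follows. This is a Python
illustration, not the runtime's Go code; the function name is hypothetical,
and treating whitespace-only lines as empty is an assumption.

```python
from typing import Iterable, Iterator


def non_empty_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield console lines without line endings, skipping empty ones."""
    for line in lines:
        text = line.rstrip("\r\n")
        # Assumption: a whitespace-only line counts as empty.
        if text.strip():
            yield text


if __name__ == "__main__":
    console = ["[    0.1] booting\r\n", "\r\n", "[    0.2] ready\r\n", "\r\n"]
    for entry in non_empty_lines(console):
        print(entry)
```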

Fixes: #10737

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-15 00:28:26 +00:00
Hyounggyu Choi
f7816e9206 tests: Introduce retry_kubectl_apply() for trusted storage
On s390x, some tests for trusted storage occasionally failed due to:

```bash
etcdserver: request timed out
```

or

```bash
Internal error occurred: resource quota evaluation timed out
```

These timeouts were not observed previously on k3s but occur
sporadically on kubeadm. Importantly, they appear to be temporary
and transient, which means they can be ignored in most cases.

To address this, we introduced a new wrapper function, `retry_kubectl_apply()`,
for `kubectl create`. This function retries applying a given manifest up to 5
times if it fails due to a timeout. However, it will still catch and handle
any other errors during pod creation.
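
The retry behaviour can be sketched as follows. This is a Python
illustration, not the actual bash helper: `retry_on_timeout`, its
parameters, and the way "up to 5 times" is counted (5 attempts in total)
are assumptions. The two marker strings are taken from the errors quoted
above.

```python
import time
from typing import Callable, Tuple

# Substrings of the transient errors quoted in this commit message.
TIMEOUT_MARKERS = ("request timed out", "resource quota evaluation timed out")


def retry_on_timeout(
    apply: Callable[[], Tuple[int, str]],
    max_tries: int = 5,
    delay_s: float = 0.0,
) -> Tuple[int, str]:
    """Run apply() until success, a non-timeout error, or max_tries attempts."""
    rc, output = apply()
    for _ in range(max_tries - 1):
        # Stop on success, or on any error that is not a known timeout.
        if rc == 0 or not any(marker in output for marker in TIMEOUT_MARKERS):
            break
        time.sleep(delay_s)
        rc, output = apply()
    return rc, output
```

Here `apply` stands in for whatever runs `kubectl create` and returns its
exit code and output; errors other than the listed timeouts are returned
to the caller on the first attempt.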

Fixes: #10651

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-14 21:15:44 +01:00
Fabiano Fidêncio
121ac0c5c0 Merge pull request #10727 from microsoft/danmihai1/mariner3-guest
image: bump mariner guest version to 3.0
2025-01-14 19:06:28 +01:00
Fabiano Fidêncio
3658ea2320 Merge pull request #10731 from microsoft/danmihai1/quiet-rootfs-build
rootfs: reduced console output by default
2025-01-14 19:02:42 +01:00
Chengyu Zhu
7d34ca4420 Merge pull request #10674 from bpradipt/fix-10398
agent: alternative implementation for sealed_secret as volume
2025-01-14 18:55:45 +08:00
Fabiano Fidêncio
4578969c5d Merge pull request #10730 from BbolroC/bump-coco-trustee
versions: Bump trustee to latest
2025-01-14 08:56:11 +01:00
Dan Mihai
c4da296326 rootfs: delete links to deleted files
Delete symbolic links to files being deleted.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-13 21:28:44 +00:00
Dan Mihai
5b8471ffce rootfs: print the path to files being deleted
Show the list of files being deleted.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-13 21:28:34 +00:00
Dan Mihai
a49d0fb343 rootfs: delete systemd units/files from rootfs.sh
Move the deletion of unnecessary systemd units and files from
image_builder.sh into rootfs.sh.

The files being deleted can be applicable to other image file formats
too, not just to the rootfs-image format created by image_builder.sh.

Also, image_builder.sh was deleting these files *after* it calculated
the size of the rootfs files, thus missing out on the opportunity to
possibly create a smaller image file.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-13 21:28:23 +00:00
Dan Mihai
0f522c09d9 rootfs: reduced console output by default
Use "set -x" only when the user specified DEBUG=1.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-13 19:34:05 +00:00
Pradipta Banerjee
36580bb642 tests: Update sealed secret CI value to base64url
The existing encoding was base64 and it fails due to
874948638a
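
For reference, the two encodings differ only in the characters used for
values 62 and 63, as this small Python sketch shows (the payload bytes are
arbitrary and chosen to hit exactly those values):

```python
import base64

# Arbitrary payload whose encoding uses the alphabet positions 62 and 63.
payload = b"\xfb\xff"

standard = base64.b64encode(payload).decode()
url_safe = base64.urlsafe_b64encode(payload).decode()

if __name__ == "__main__":
    print(standard)  # uses '+' and '/'
    print(url_safe)  # uses '-' and '_'
```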

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2025-01-13 09:37:05 -05:00
Hyounggyu Choi
2cdb549a75 versions: Bump trustee to latest
This update addresses an issue with token verification for SE and SNP
introduced in the last update by #10541.
Bumping the project to the latest commit resolves the issue.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-13 15:07:33 +01:00
Pradipta Banerjee
5218345e34 agent: alternative implementation for sealed_secret as volume
The earlier implementation relied on using a specific mount-path prefix - `/sealed`
to determine that the referenced secret is a sealed secret.
However that was restrictive for certain use cases as it forced
the user to always use a specific mountpath naming convention.

This commit introduces an alternative implementation to relax the
restriction. A sealed secret can be mounted in any mount-path.
However it comes with a potential performance penalty. The
implementation loops through all volume mounts and reads the file
to determine if it's a sealed secret or not.
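
The scan can be sketched roughly as follows. This is a Python
illustration, not the agent's Rust code: the function name is
hypothetical, and the `sealed.` content prefix used to recognise a sealed
secret is an assumption, not something stated in this commit.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed marker at the start of a sealed secret's content.
SEALED_PREFIX = b"sealed."


def find_sealed_secrets(mount_paths: Iterable[str]) -> List[Path]:
    """Return every file under the given mounts whose content looks sealed."""
    found = []
    for mount in mount_paths:
        for path in sorted(Path(mount).rglob("*")):
            if not path.is_file():
                continue
            # Only the first few bytes are read, but every file is opened,
            # which is where the performance cost comes from.
            with path.open("rb") as handle:
                if handle.read(len(SEALED_PREFIX)) == SEALED_PREFIX:
                    found.append(path)
    return found
```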

Fixes: #10398

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2025-01-11 12:36:44 -05:00
Dan Mihai
4707883b40 image: bump mariner guest version to 3.0
Use Mariner 3.0 (a.k.a., Azure Linux 3.0) as the Guest CI image.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-11 17:36:19 +00:00
Fabiano Fidêncio
2d9baf899a Merge pull request #10719 from msanft/msanft/runtime/fix-boolean-opts
runtime: use actual booleans for QMP `device_add` boolean options
2025-01-11 16:38:06 +01:00
Zvonko Kaiser
f08a9eac11 Merge pull request #10721 from stevenhorsman/more-metrics-latency-minimum-range-fixes
metrics: Increase latency test range
2025-01-10 21:59:39 -05:00
Moritz Sanft
e5735b221c runtime: use actual booleans for QMP device_add boolean options
Since
be93fd5372,
which is included in QEMU since version 9.2.0, the options for the
`device_add` QMP command need to be typed correctly.

This makes it so that instead of `"on"`, the value is set to `true`,
matching QEMU's expectations.
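
As an illustration of the difference, the following Python sketch builds a
`device_add` QMP message with real JSON booleans. It is not the runtime's
Go code: the helper name is hypothetical and the argument values are
examples only.

```python
import json


def typed_device_add(args: dict) -> str:
    """Serialize a device_add command, turning "on"/"off" into JSON booleans."""
    fixed = {
        key: (value == "on") if value in ("on", "off") else value
        for key, value in args.items()
    }
    return json.dumps({"execute": "device_add", "arguments": fixed}, sort_keys=True)


if __name__ == "__main__":
    # Example arguments only; real devices take their own option sets.
    print(typed_device_add({"driver": "virtio-blk-pci", "id": "disk0", "share-rw": "on"}))
```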

This has been tested on QEMU 9.2.0 and QEMU 9.1.2, so before and after
the change.

The compatibility with incorrectly typed options for the `device_add`
command is deprecated since version 6.2.0 [^1].

[^1]:  https://qemu-project.gitlab.io/qemu/about/deprecated.html#incorrectly-typed-device-add-arguments-since-6-2

Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
2025-01-10 11:53:56 +01:00
Wainer Moschetta
5fae2a9f91 Merge pull request #9871 from wainersm/fix-print_cluster_name
tests/gha-run-k8s-common: shorten AKS cluster name
2025-01-09 14:35:02 -03:00
stevenhorsman
aaae5b6d0f metrics: clh: Increase network-iperf3 range
We hit a failure with:
```
time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]"
```
The range is very big, but in the last 3 test runs I reviewed we have got a minimum value of 0.015s
and a max value of 0.052, so there is a ~350% difference possible
so I think we need to have a wide range to make this stable.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-09 11:25:57 +00:00
stevenhorsman
e946d9d5d3 metrics: qemu: Increase latency test range
After the kernel version bump, in the latest nightly run
https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400
The sequential read throughput result was 79.7% of the expected (so failed)
and the sequential write was 84% of the expected, so was fairly close,
so increase their minimum ranges to make them more robust.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-09 11:25:50 +00:00
Wainer dos Santos Moschetta
badc208e9a tests/gha-run-k8s-common: shorten AKS cluster name
The az client restricts the name to fewer than 64 characters. In
some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name
exceeds that limit. This changes the function to shorten the name:

* a SHA1 is computed from the metadata and used to compose the cluster's name
* the metadata is passed as plain text via --tags
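
The shortening idea can be sketched as follows. This is a Python
illustration, not the bash function itself: `short_cluster_name`, the
`kata` prefix, and the metadata string are hypothetical.

```python
import hashlib


def short_cluster_name(metadata: str, prefix: str = "kata") -> str:
    """Derive a fixed-length cluster name from arbitrary-length metadata."""
    # A SHA1 hex digest is always 40 characters, whatever the input length.
    digest = hashlib.sha1(metadata.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest}"


if __name__ == "__main__":
    # Hypothetical metadata string; the real one is built by the CI scripts.
    print(short_cluster_name("pr-1234-abcdef-qemu-runtime-rs-ubuntu-amd64-small"))
```

Because the name no longer carries the metadata in readable form, passing
the plain-text values as tags keeps the cluster identifiable.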

Fixes: #9850
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-01-08 16:39:07 -03:00
Fabiano Fidêncio
8f8988fcd1 Merge pull request #10714 from fidencio/topic/update-virtiofsd
virtiofsd: Update to its v1.13.0 ( + one patch) release :-)
2025-01-08 17:59:29 +01:00
Fabiano Fidêncio
7e5e109255 Merge pull request #10541 from fitzthum/bump-trustee-010
Update Trustee and Guest Components
2025-01-08 17:44:13 +01:00
Fabiano Fidêncio
eb3fe0d27c Merge pull request #10717 from fidencio/topic/re-enable-oom-test-for-mariner
tests: Re-enable oom tests for mariner
2025-01-08 17:43:56 +01:00
Fabiano Fidêncio
65e267294b Merge pull request #10718 from stevenhorsman/metrics-blogbench-latency-minimal-range-increase
metrics: Increase latency minimum range
2025-01-08 17:09:36 +01:00
stevenhorsman
dc069d83b5 metrics: Increase latency test range
The bump to kernel 6.12 seems to have reduced the latency in
the metrics test, so increase the ranges for the minimal value,
to account for this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-08 15:11:49 +00:00
Fabiano Fidêncio
967d5afb42 Revert "tests: k8s: Skip one of the empty-dir tests"
This reverts commit 9aea7456fb.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Fabiano Fidêncio
7ae2ca4c31 virtiofsd: Update to its v1.13.0 + one patch release
Together with the bump, let's also bump the rust version needed to build
the package, with the caveat that virtiofsd doesn't actually use a
pinned version as part of their CI, so we're bumping to whatever is the
version on `alpine:rust` (which is used in their CI).

It's important to note that we're using a version which brings in one
extra patch apart from the release, as the next virtiofsd release will
happen at the end of February, 2025.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Fabiano Fidêncio
0af3536328 packaging: virtiofsd: Allow building a specific commit
Right now we've been only building releases from virtiofsd, but we'll
need to pin a specific commit till v1.14.0 is out, thus let's add the
needed machinery to do so.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Tobin Feldman-Fitzthum
41c7f076fa packaging: updating guest components build script
The guest-components directory has been re-arranged slightly. Adjust the
installation path of the LUKS helper script to account for this.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2025-01-07 16:59:10 -06:00
Tobin Feldman-Fitzthum
cafc7d6819 versions: update trustee and guest components
Trustee has some new features including a plugin backend, support for
PKCS11 resources, improvements to token verification, and adjustments to
logging, and more.

Also update guest-components to pickup improvements and keep the KBS
protocol in sync.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2025-01-07 16:59:10 -06:00
Fabiano Fidêncio
53ac0f00c5 tests: Re-enable oom tests for mariner
Since we bumped to the 6.12.x LTS kernel, we've also adjusted the
aggressiveness of the OOM test, which may be enough to allow us to
re-enable it for mariner.

Fixes: #8821

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-07 18:33:17 +01:00
Fabiano Fidêncio
f4a39e8c40 Merge pull request #10468 from fidencio/topic/early-tests-on-next-lts-kernel
versions: Move kernel to the latest 6.12 release (the current LTS)
2025-01-07 18:02:04 +01:00
Fupan Li
bd56891f84 Merge pull request #10702 from lifupan/fix_containerdname
CI: change the containerd tarball name from cri-containerd-cni to containerd
2025-01-07 18:56:15 +08:00
Fupan Li
b19db40343 CI: change the containerd tarball name to containerd
Since https://github.com/containerd/containerd/pull/9096, containerd
no longer ships the cri-containerd-*.tar.gz release bundles,
thus we'd better change the tarball name to "containerd".

BTW, the containerd tarball contains the following files:

bin/
bin/containerd-shim
bin/ctr
bin/containerd-shim-runc-v1
bin/containerd-stress
bin/containerd
bin/containerd-shim-runc-v2

thus we should untar containerd into the /usr/local directory instead of
"/" to stay aligned with cri-containerd.

In addition, no containerd.service file, runc binary, or cni-plugin is
included, thus we should add a specific containerd.service file and
install the runc binary and cni-plugin specifically.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-01-07 17:39:05 +08:00
Fabiano Fidêncio
9aea7456fb tests: k8s: Skip one of the empty-dir tests
An issue has been created for this, and we should fix the issue before
the next release.  However, for now, let's unblock the kernel bump and
have the test skipped.

Reference: https://github.com/kata-containers/kata-containers/issues/10706

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-06 21:48:20 +01:00
Fabiano Fidêncio
44ff602c64 tests: k8s: Be more aggressive to get OOM
Let's increase the amount of bytes allocated per VM worker, so we can
hit the OOM sooner.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-06 21:48:20 +01:00
Fabiano Fidêncio
f563f0d3fc versions: Update kernel to v6.12.8
Lots of configs have been removed from the latest kernels. Update them
here to ease the next kernel upgrade.

Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1]
Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2]
Remove CONFIG_NET_SCH_CBQ [3]
Remove CONFIG_AUTOFS4_FS [4]
Remove CONFIG_EMBEDDED [5]
Remove CONFIG_ARCH_RANDOM & CONFIG_RANDOM_TRUST_CPU [6]

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a
[6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2&id=b9b01a5625b5a9e9d96d14d4a813a54e8a124f4b

Apart from the removals, CONFIG_CPU_MITIGATIONS is now a dependency for
CONFIG_RETPOLINE (which has been renamed to CONFIG_MITIGATION_RETPOLINE)
and CONFIG_PAGE_TABLE_ISOLATION (which has been renamed to
CONFIG_MITIGATION_PAGE_TABLE_ISOLATION).  I've added that to the
whitelist because we still build older versions of the kernel that
do not have that dependency.

Fixes: #8408
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-06 21:48:20 +01:00
Xuewei Niu
71b14d40f2 Merge pull request #10696 from teawater/kt
kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH
2025-01-02 14:04:37 +08:00
Hui Zhu
d15a7baedd kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH
Got the following issue:
kata-ctl direct-volume add /kubelet/kata-direct-vol-002/directvol002
"{\"device\": \"/home/t4/teawater/coco/t.img\", \"volume-type\":
\"directvol\", \"fstype\": \"\", \"metadata\":"{}", \"options\": []}"
subsystem: kata-ctl_main
 Dec 30 09:43:41.150 ERRO Os {
    code: 2,
    kind: NotFound,
    message: "No such file or directory",
}
The reason is that KATA_DIRECT_VOLUME_ROOT_PATH does not exist.

This commit calls create_dir_all on KATA_DIRECT_VOLUME_ROOT_PATH before
join_path to handle this issue.
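
The ordering of the fix can be sketched as follows. This is a Python
stand-in for the Rust code, with a hypothetical function name and
arguments; `Path.mkdir(parents=True, exist_ok=True)` plays the role of
`create_dir_all`.

```python
from pathlib import Path


def volume_path(root: str, encoded_name: str) -> Path:
    """Make sure the root directory exists, then build the volume path."""
    root_path = Path(root)
    # Create the root (and any missing parents) first; a no-op if it exists.
    root_path.mkdir(parents=True, exist_ok=True)
    return root_path / encoded_name
```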

Fixes: #10695

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-30 17:55:49 +08:00
Xuewei Niu
6400295940 Merge pull request #10683 from justxuewei/nxw/remove-mut
2024-12-29 00:49:38 +08:00
Fupan Li
2068801b80 Merge pull request #10626 from teawater/ma
Add mem-agent to kata
2024-12-24 14:11:36 +08:00
Steve Horsman
2322f6df94 Merge pull request #10686 from stevenhorsman/ppc64le-all-prepare-steps-timeout
workflows: Add more ppc64le timeouts
2024-12-20 19:08:48 +00:00
stevenhorsman
9b6fce9e96 workflows: Add more ppc64le timeouts
Unsurprisingly, now that we've got past the containerd test
hangs on ppc64le, we are hitting others in the "Prepare the
self-hosted runner" stage, so add timeouts to all of them
to avoid CI blockages.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 17:31:24 +00:00
Steve Horsman
162e2af4f5 Merge pull request #10685 from stevenhorsman/ppc64le-containerd-test-timeout
workflows: Add timeout to some ppc64le steps
2024-12-20 16:55:40 +00:00
stevenhorsman
d9d8d53bea workflows: Add timeout to some ppc64le steps
In some runs e.g. https://github.com/kata-containers/kata-containers/actions/runs/12426384186/job/34697095588
and https://github.com/kata-containers/kata-containers/actions/runs/12422958889/job/34697016842
we've seen the Prepare the self-hosted runner
and Install dependencies steps get stuck for 5hours+.
If they are working then it should take a few minutes,
so let's add timeouts and not hold up the whole CI if they are stuck.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 16:37:36 +00:00
Steve Horsman
99f239bc44 Merge pull request #10380 from stevenhorsman/required-tests-guidance
doc: Add required jobs info
2024-12-20 16:24:42 +00:00
stevenhorsman
d1d4bc43a4 static-checks: Add words to dictionary
devmapper and snapshotters are being marked as spelling
errors, so add them to the kata dictionary

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 14:16:52 +00:00
stevenhorsman
7612839640 doc: Add required jobs info
Add information about what required jobs are and
our initial guidelines for how jobs are eligible for being
made required, or non-required

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 14:12:13 +00:00
Xuewei Niu
ecf98e4db8 runtime-rs: Remove unneeded mut from new_hypervisor()
`set_hypervisor_config()` and `set_passfd_listener_port()` acquire inner
lock, so that `mut` for `hypervisor` is unneeded.

Fixes: #10682

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-20 17:08:10 +08:00
Steve Horsman
2c6126d3ab Merge pull request #10676 from stevenhorsman/fix-qemu-coco-dev-skip
tests: Fix qemu-coco-dev skip
2024-12-20 08:56:54 +00:00
Xuewei Niu
ea60613be9 Merge pull request #9387 from deagon/fix-broken-usage
packaging: fix the broken usage help
2024-12-20 15:20:37 +08:00
Guoqiang Ding
75baf75726 packaging: fix the broken usage help
Using the plain usage text instead of the bad variable reference.

Fixes: #9386
Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2024-12-20 13:58:40 +08:00
stevenhorsman
dd02b6699e tests: Fix qemu-coco-dev skip
Fix the logic to make the test skipped on qemu-coco-dev,
rather than the opposite, and update the syntax to make it
clearer, as it was incorrectly written and reviewed by three
different people in its prior form.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-19 19:50:46 +00:00
Steve Horsman
79495379e2 Merge pull request #10668 from stevenhorsman/update-release-process-post-3.12
doc: Update the release process
2024-12-19 14:16:30 +00:00
Steve Horsman
99b9ef4e5a Merge pull request #10675 from stevenhorsman/release-repeat-abort
release: Abort if release version exists
2024-12-19 11:55:44 +00:00
stevenhorsman
c3f13265e4 doc: Update the release process
Add a step to wait for the payload publish to complete
before running the release action.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-19 09:52:39 +00:00
Zvonko Kaiser
f2d72874a1 Merge pull request #10620 from kata-containers/topic/fix-remove-artifact-ordering
workflows: Remove potential timing issues with artifacts
2024-12-18 13:22:12 -05:00
Zvonko Kaiser
fc2c77f3b6 Merge pull request #10669 from zvonkok/qemu-aarch64-fix
qemu: Fix aarch64 build
2024-12-18 08:26:55 -05:00
stevenhorsman
e2669d4acc release: Abort if release version exists
In order to check that we don't accidentally overwrite
release artifacts, we should add a check if the release
name already exists and bail if it does.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-18 11:04:19 +00:00
Zvonko Kaiser
07d2b00863 qemu: Fix aarch64 build
Building static binaries for aarch64 requires disabling PIE.
We get a GOT overflow, as the OS libraries are only built with fpic
and not with fPIC, which enables unlimited-size GOT tables.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-18 03:26:14 +00:00
Zvonko Kaiser
39bf10875b Merge pull request #10663 from zvonkok/3.12.0-relase
release: Bump version to 3.12.0
2024-12-17 10:00:42 -05:00
Zvonko Kaiser
28b57627bd release: Bump version to 3.12.0
Bump VERSION and helm-chart versions

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-16 18:41:51 +00:00
Xuewei Niu
02b5fa15ac Merge pull request #10655 from liubogithub/patch-1
kata-ctl: fix outdated comments
2024-12-16 13:11:25 +08:00
Hyounggyu Choi
cfbc425041 Merge pull request #10660 from BbolroC/fix-leading-zero-issue-for-vfio-ap
vfio-ap: Assign default string "0" for empty APID and APQI
2024-12-13 17:40:29 +01:00
Hyounggyu Choi
341e5ca58e vfio-ap: Assign default string "0" for empty APID and APQI
The current script logic assigns an empty string to APID and APQI
when APQN consists entirely of zeros (e.g., "00.0000").
However, this behavior is incorrect, as "00" and "0000" are valid
values and should be represented as "0".
This commit ensures that the script assigns the default string “0”
to APID and APQI if their computed values are empty.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-12-13 14:39:03 +01:00
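The default-to-"0" rule described in 341e5ca58e can be sketched as follows. This is a minimal Python illustration, not the project's shell script; the function name `split_apqn` and the use of `lstrip` to drop leading zeros are assumptions made for the example.

```python
def split_apqn(apqn):
    """Split an APQN such as "00.0000" into (APID, APQI) strings."""
    apid, apqi = apqn.split(".", 1)
    # Stripping leading zeros turns an all-zero field into "",
    # so fall back to the default string "0" in that case.
    return apid.lstrip("0") or "0", apqi.lstrip("0") or "0"
```

With this fallback, an all-zero APQN still yields usable values instead of empty strings.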
Liu Bo
95fc585103 kata-ctl: fix outdated comments
MgmnClient can also tolerate short sandbox id.

Signed-off-by: Liu Bo <liub.liubo@gmail.com>
2024-12-12 21:59:54 -08:00
stevenhorsman
cf8b82794a workflows: Only remove artifacts in release builds
The agent-api tests require the agent to be deployed in the
CI by the tarball, so in the short term let's only do this on the release
stage, so that both kata-manager works with the release and the
agent-api tests work with the other CI builds.

In the longer term we need to re-evaluate what is in our tarballs
(issue #10619), but want to unblock the tests in the short-term.

Fixes: #10630
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-12 17:38:27 +00:00
stevenhorsman
e1f6aca9de workflows: Remove potential timing issues with artifacts
With my original code, I think there is potentially
a case where we can get a failure due to the timing of steps.
Before this change the `build-asset-shim-v2`
job could start the `get-artifacts` step and concurrently
`remove-rootfs-binary-artifacts` could run and delete the artifact
during the download and result in the error. In this commit, I
try to resolve this by making sure that the shim build waits
for the artifact deletes to complete before starting.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-12 16:52:54 +00:00
Fabiano Fidêncio
7b0c1d0a8c Merge pull request #10492 from zvonkok/upgrade-qemu-9.1.0
qemu: Upgrade qemu 9.1.2
2024-12-12 08:15:39 +01:00
Fupan Li
07fe7325c2 Merge pull request #10643 from justxuewei/fix-bind-vol
runtime-rs & agent: Fix the issues with bind volumes
2024-12-12 11:34:52 +08:00
Fupan Li
372346baed Merge pull request #10641 from justxuewei/fix-build-type
runtime-rs: Ignore BUILD_TYPE if it is not release
2024-12-12 11:32:49 +08:00
Xuewei Niu
5f1b1d8932 Merge pull request #10638 from justxuewei/fix-stderr-fifo
runtime-rs: Fix the issues with stderr fifo
2024-12-12 10:03:46 +08:00
Fabiano Fidêncio
a5c863a907 Merge pull request #10581 from ryansavino/snp-enable-skipped
Revert "ci: Skip the failing tests in SNP"
2024-12-11 18:22:17 +01:00
Zvonko Kaiser
cc9ecedaea qemu: Bump version, new options, add no_patches
We want to have the latest QEMU version available
which is as of this writing v9.1.2

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

qemu: Add new options for 9.1.2

We need to fence specific options depending on the version
and disable ones that are not needed anymore

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

qemu: Add no_patches.txt

Since we do not have any patches for this version
let's create the appropriate files.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:32:39 +00:00
Zvonko Kaiser
69ed4bc3b7 qemu: Add dependency
The new QEMU build needs python-tomli. Now that we have bumped Ubuntu,
we can include the needed tomli package.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:32:20 +00:00
Zvonko Kaiser
c82db45eaa qemu: Disable pmem
We're disabling pmem support, as it is heavily broken with
Ubuntu's static build of QEMU and not needed.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:32:19 +00:00
Zvonko Kaiser
a88174e977 qemu: Replace from source build with package
In jammy we have the liburing package available, hence
remove the source build and include the package.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
c15f77737a qemu: Bump Ubuntu version in Dockerfile
We need jammy for a new package that is not available in focal

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
eef2795226 qemu: Use proper QEMU builder
Do not use hardcoded abs path. Use the deduced rel path.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
e604e51b3d qemu: Build as user
We moved all other artifacts to be built as a user;
QEMU should not be the exception.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
1d56fd0308 qemu: Remove abs path
We want to stick with the other build scripts and
only use relative paths.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Ryan Savino
7d45382f54 Revert "ci: Skip the failing tests in SNP"
This reverts commit 2242aee099.
2024-12-10 16:20:31 -06:00
Xuewei Niu
3fb91dd631 agent: Fix the issues with bind volumes
The mount type should be considered as empty if the value is
`Some("none")`.

Fixes: #10642

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-11 00:51:32 +08:00
Xuewei Niu
59ed19e8b2 runtime-rs: Fix the issues with bind volumes
This patch fixes the logic of getting the type of volume: when the type of
OCI mount is Some("none") and the options have "bind" or "rbind", the
type will be considered as "bind".

Fixes: #10642

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-11 00:50:36 +08:00
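The type resolution described in 59ed19e8b2 and 3fb91dd631 can be sketched as follows. This is a hedged Python illustration of the rule only; the real code is Rust, and the function name `resolve_mount_type` and the choice to return an empty string when no type can be derived are assumptions.

```python
def resolve_mount_type(mount_type, options):
    """Return the effective mount type for an OCI mount."""
    # A missing type, an empty type and the literal "none" all mean
    # that no explicit type was given.
    if mount_type not in (None, "", "none"):
        return mount_type
    # Without an explicit type, bind/rbind options mark a bind mount.
    if "bind" in options or "rbind" in options:
        return "bind"
    return ""
```

An explicit type such as `tmpfs` is returned unchanged, so only untyped mounts are reclassified.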
Xuewei Niu
2424c1a562 runtime-rs: Ignore BUILD_TYPE if it is not release
This patch fixes that by adding `--release` only if `BUILD_TYPE=release`.

Fixes: #10640

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-11 00:27:28 +08:00
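The rule from 2424c1a562 (add `--release` only if `BUILD_TYPE=release`) can be sketched as follows. This is an illustrative Python sketch rather than the project's Makefile logic; `cargo_build_args` is a hypothetical helper name.

```python
def cargo_build_args(build_type):
    """Build a cargo command line for the given BUILD_TYPE value."""
    args = ["cargo", "build"]
    # Any value other than "release" is ignored, which leaves
    # cargo on its default (debug) profile.
    if build_type == "release":
        args.append("--release")
    return args
```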
Xuewei Niu
b4695f6303 runtime-rs: Fix the issues with stderr fifo
When tty is enabled, stderr fifo should never be opened.

Fixes: #10637

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-10 21:48:52 +08:00
Aurélien Bombo
037281d699 Merge pull request #10593 from microsoft/saulparedes/improve_namespace_validation
policy: improve pod namespace validation
2024-12-09 11:55:09 -06:00
Steve Horsman
9b7fb31ce6 Merge pull request #10631 from stevenhorsman/action-lint-workflow
Action lint workflow
2024-12-09 09:33:07 +00:00
Fabiano Fidêncio
bec1de7bd7 Merge pull request #10548 from Sumynwa/sumsharma/clh_tweak_vm_configs
runtime: Set memory config shared=false when shared_fs=None in CLH.
2024-12-06 23:15:29 +01:00
Sumedh Alok Sharma
ac4f986e3e runtime: Set memory config shared=false when shared_fs=None in CLH.
This commit sets memory config `shared` to false in cloud hypervisor
when creating vm with shared_fs=None && hugePages = false.

Currently in runtime/virtcontainers/clh.go, the memory config shared is set to true by default.
As per the CLH memory document,
(a) shared=true is needed in cases such as virtio_fs, since the virtiofs daemon runs as a separate process from clh.
(b) for shared_fs=none + hugepages=false, shared=false can be set to use private anonymous memory for the guest (with no file backing).
(c) Another memory config, thp (use transparent huge pages), is always enabled by default.
As per the documentation, (b) + (c) can be used in combination.
However, with the current CLH implementation, the above combination cannot be used since shared=true is always set.

Fixes #10547

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-12-06 21:22:51 +05:30
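The decision described in ac4f986e3e can be sketched as a small predicate. This is a Python illustration under stated assumptions, not the Go code in clh.go; the function name `clh_memory_shared` and the case-insensitive comparison against "none" are hypothetical.

```python
def clh_memory_shared(shared_fs, huge_pages):
    """Return the value to use for the CLH memory `shared` setting."""
    no_shared_fs = shared_fs.lower() == "none"
    # Private anonymous memory is only used when there is no shared
    # filesystem and no hugepages; every other case keeps shared=true.
    return not (no_shared_fs and not huge_pages)
```

Only the (b) combination from the commit message flips the setting; virtio-fs and hugepages configurations keep the previous default.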
stevenhorsman
b4b3471bcb workflows: linting: Fix shellcheck SC1001
> This \/ will be a regular '/' in this context

Remove ignored escape

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
491210ed22 workflows: linting: Fix shellcheck SC2006
> Use $(...) notation instead of legacy backticks `...`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
5d7c5bdfa4 workflows: linting: Fix shellcheck SC2015
> A && B || C is not if-then-else. C may run when A is true

Refactor the echo so that we can't get into a situation where
the retry of workspace delete happens if the original one was
successful

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
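The SC2015 pitfall fixed in 5d7c5bdfa4 has a direct analogue in Python's `and`/`or` chaining, which makes it easy to demonstrate. The sketch below is illustrative only; `delete`, `notify` and `retry` are hypothetical callables standing in for the workspace delete, the echo, and the retried delete.

```python
def chained(delete, notify, retry):
    # Mirrors `delete && notify || retry`: retry also runs when
    # delete succeeded but notify reported failure.
    return (delete() and notify()) or retry()


def explicit(delete, notify, retry):
    # Mirrors `if delete; then notify; else retry; fi`: retry only
    # runs when delete itself failed.
    if delete():
        notify()
        return True
    return retry()
```

The explicit form is the shape the commit moves towards: a successful first delete can no longer trigger the retry.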
stevenhorsman
c2ba15c111 workflows: linting: Fix shellcheck SC2206
>  Quote to prevent word splitting/globbing

Double quote variables expanded in an array

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
007514154c workflows: linting: Fix shellcheck SC2068
> Double quote array expansions to avoid re-splitting elements

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
4ef05c6176 workflows: linting: Fix shellcheck SC2116
> Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo'

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
f02d540799 workflows: Bump outdated action versions
Bump some actions that are significantly out-of-date
and out of sync with the versions used in other workflows

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
935327b5aa workflows: linting: Fix shellcheck SC2046
> Quote this to prevent word splitting.

Quote around subshell

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
e93ed6c20e workflows: linting: Add tdx labels
The tdx runners got split into two different
runners, so we need to update the known self-hosted
runner labels

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
d4bd314d52 workflows: linting: Fix incorrect properties
These properties are currently invalid, so either
fix, or remove them

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
9113606d45 workflows: linting: Fix shellcheck SC2086
> Double quote to prevent globbing and word splitting.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
42cd2ce6e4 workflows: Add actionlint workflows
On PRs that update anything in the workflows directory,
add an actionlint run to validate our workflow files for errors
and hopefully catch issues earlier.

Fixes: #9646

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 11:36:08 +00:00
Fabiano Fidêncio
a93ff57c7d Merge pull request #10627 from kata-containers/topic/release-helm-charm-tarball
release: helm: Add the chart as part of the release
2024-12-06 11:22:43 +01:00
Fabiano Fidêncio
300a827d03 release: helm: Add the chart as part of the release
So users can simply download the chart and use it accordingly without
the need to download the full repo.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-06 11:19:34 +01:00
Fabiano Fidêncio
652662ae09 Merge pull request #10551 from fidencio/topic/kata-deploy-allow-multi-deployment
kata-deploy: Add support to multi-installation
2024-12-06 11:16:20 +01:00
Hui Zhu
d3a6bcdaa5 runtime-rs: configuration-dragonball.toml.in: Add config for mem-agent
Add config for mem-agent to configuration-dragonball.toml.in.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:28 +08:00
Hui Zhu
2b6caf26e0 agent-ctl: Add mem-agent API support
Add sub command MemAgentMemcgSet and MemAgentCompactSet to agent-ctl to
configure the mem-agent inside the running kata-containers.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:24 +08:00
Hui Zhu
cb86d700a6 config: Add config of mem-agent
Add config of mem-agent to configure the mem-agent.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:20 +08:00
Hui Zhu
692ded8f96 agent: add support for MemAgentMemcgSet and MemAgentCompactSet
Add MemAgentMemcgSet and MemAgentCompactSet to agent API to set the config of
mem-agent memcg and compact.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:16 +08:00
Hui Zhu
f84ad54d97 agent: Start mem-agent in start_sandbox
mem-agent will run with kata-agent.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:13 +08:00
Hui Zhu
74a17f96f4 protocols/protos/agent.proto: Add mem-agent support
Add MemAgentMemcgConfig and MemAgentCompactConfig to AgentService.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:09 +08:00
Hui Zhu
ffc8390a60 agent: Add mem-agent to Cargo.toml
Add mem-agent to Cargo.toml of agent.
mem-agent will be integrated into kata-agent.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:05 +08:00
Hui Zhu
4407f6e098 mem-agent: Add to src
mem-agent is a component designed for managing memory in Linux
environments.
Sub-feature memcg: Utilizes the MgLRU feature to monitor each cgroup's
memory usage and periodically reclaim cold memory.
Sub-feature compact: Periodically compacts memory to facilitate the
kernel's free page reporting feature, enabling the release of more idle
memory from guests.
During memory reclamation and compaction, mem-agent monitors system
pressure using Pressure Stall Information (PSI). If the system pressure
becomes too high, memory reclamation or compaction will automatically
stop.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:02 +08:00
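The PSI-based back-off described in 4407f6e098 can be sketched as follows. This is a minimal Python illustration, not mem-agent's Rust implementation; it assumes the Linux PSI text format of files such as /proc/pressure/memory, and the helper names and the `max_avg10` threshold are hypothetical.

```python
def psi_avg10(psi_text, kind="some"):
    """Return the avg10 value of the `some` or `full` line of a PSI file."""
    for line in psi_text.splitlines():
        fields = line.split()
        if fields and fields[0] == kind:
            for field in fields[1:]:
                key, _, value = field.partition("=")
                if key == "avg10":
                    return float(value)
    raise ValueError("no %s avg10 value found" % kind)


def should_stop_reclaim(psi_text, max_avg10=10.0):
    # max_avg10 is a hypothetical threshold chosen for this example.
    return psi_avg10(psi_text) > max_avg10
```

A reclaim or compaction loop would call `should_stop_reclaim` between steps and stop once it returns true.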
Hui Zhu
f9c63d20a4 kernel/configs: Add mglru, debugfs and psi to dragonball-experimental
Add mglru, debugfs and psi to dragonball-experimental/mem_agent.conf to
support mem_agent function.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 09:59:59 +08:00
Fabiano Fidêncio
111082db07 kata-deploy: Add support to multi-installation
This is super useful for development / debugging scenarios, mainly when
dealing with limited hardware availability, as this change allows
multiple people to develop on one single machine, while still using
kata-deploy.

Fixes: #10546

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-05 17:42:53 +01:00
Fabiano Fidêncio
0033a0c23a kata-deploy: Adjust paths for qemu-coco-dev as well
I missed that when working on the INSTALL_PREFIX feature, so adding it
now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-05 17:42:53 +01:00
Fabiano Fidêncio
62b3a07e2f kata-deploy: helm: Add overlooked INSTALLATION_PREFIX env var
At the same time that INSTALLATION_PREFIX was added, I was working on
the helm changes to properly do the cleanup / deletion when it's
removed.  However, I missed adding the INSTALLATION_PREFIX env var
there, which I'm doing now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-05 17:42:53 +01:00
Steve Horsman
5d96734831 Merge pull request #10572 from ldoktor/gk-stalled-results
ci.gatekeeper: Update existing results
2024-12-04 19:02:14 +00:00
Wainer Moschetta
a94982d8b8 Merge pull request #10617 from stevenhorsman/skip-k8s-job-test-on-non-tee
tests: Skip k8s job test on qemu-coco-dev
2024-12-04 15:47:33 -03:00
Saul Paredes
84a411dac4 policy: improve pod namespace validation
- Remove default_namespace from settings
- Ensure container namespaces in a pod match each other in case no namespace is specified in the YAML

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-04 10:17:54 -08:00
Steve Horsman
c86f76d324 Merge pull request #10588 from stevenhorsman/metrics-clh-min-range-relaxation
metrics: Increase minval range for failing tests
2024-12-04 16:10:26 +00:00
stevenhorsman
a8ccd9a2ac tests: Skip k8s job test on qemu-coco-dev
The test is unstable on this platform, so skip it for now to prevent
the regular known failures covering up other issues. See #10616

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-04 16:00:05 +00:00
Steve Horsman
9e609dd34f Merge pull request #10615 from kata-containers/topic/update-remove-artifact-filter
workflows: Fix remove artifact name filter
2024-12-04 15:02:35 +00:00
Fabiano Fidêncio
531a29137e Merge pull request #10607 from microsoft/danmihai1/less-logging
runtime: skip logging some of the dial errors
2024-12-04 15:01:45 +01:00
stevenhorsman
14a3adf4d6 workflows: Fix remove artifact name filter
- Fix copy-paste errors in artifact filters for arm64 and ppc64le
- Remove the trailing wildcard filter that falsely ends up removing agent-ctl
and replace with the tarball-suffix, which should exactly match the artifacts

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-04 13:34:42 +00:00
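The filter problem fixed in 14a3adf4d6 can be sketched as follows. This is a Python illustration only; the artifact names, the `-test` suffix and both helper functions are hypothetical and are not taken from the workflow files.

```python
from fnmatch import fnmatchcase


def select_with_wildcard(artifacts, prefix):
    # A trailing wildcard also matches longer names sharing the prefix.
    return [name for name in artifacts if fnmatchcase(name, prefix + "*")]


def select_exact(artifacts, prefix, tarball_suffix):
    # Appending the known suffix matches exactly one artifact name.
    return [name for name in artifacts if name == prefix + tarball_suffix]
```

With a prefix ending in `agent`, the wildcard variant would also select an `agent-ctl` artifact, which is the false match the commit removes.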
Alex Lyn
5f9cc86b5a Merge pull request #10604 from 3u13r/euler/fix/genpolicy-rego-state-getter
genpolicy: align state path getter and setter
2024-12-04 13:57:34 +08:00
Alex Lyn
c7064027f4 Merge pull request #10574 from BbolroC/add-ccw-subchannel-qemu-runtime-rs
Add subchannel support to qemu-runtime-rs for s390x
2024-12-04 09:17:45 +08:00
Aurélien Bombo
57d893b5dc Merge pull request #10563 from sprt/csi-deploy
coco: ci: Fully implement compilation of CSI driver and require it for CoCo tests [2/x]
2024-12-03 18:58:14 -06:00
Aurélien Bombo
4aa7d4e358 ci: Require CSI driver for CoCo tests
With the building/publishing step for the CSI driver validated, we can
set that as a requirement for the CoCo tests.

Depends on: #10561

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
fe55b29ef0 csi-kata-directvolume: Remove go version check
The driver build recipe has a script to check the current Go version against
the go.mod version.  However, the script is broken ($expected is unbound) and I
don't believe we do this for other components. On top of this, Go should be
backward-compatible. Let's keep things simple for now and we can evaluate
restoring this script in the future if need be.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
fb87bf221f ci: Implement build step for CSI driver
This fully implements the compilation step for csi-kata-directvolume.
This component can now be built by the CI running:

 $ cd tools/packaging/kata-deploy/local-build
 $ make csi-kata-directvolume-tarball

A couple notes:

 * When installing the binary, we rename it from directvolplugin to
   csi-kata-directvolume on the fly to make it more readable.
 * We add go to the tools builder Dockerfile to support building this
   tool.
 * I've noticed the file install_libseccomp.sh gets created by the build
   process so I've added it to a .gitignore.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
0f6113a743 Merge pull request #10612 from kata-containers/sprt/fix-csi-publish2
ci: Fix Docker publishing for CSI driver, 2nd try
2024-12-03 14:43:28 -06:00
Aurélien Bombo
a23ceac913 ci: Fix Docker publishing for CSI driver, 2nd try
Follow-up to #10609 as it seems GHA doesn't allow hard links:

https://github.com/kata-containers/kata-containers/actions/runs/12144941404/job/33868901896?pr=10563#step:6:8

Note that I also updated the `needs` directive as we don't need the Kata
payload container, just the tarball artifact.

Part of: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 13:04:46 -06:00
Dan Mihai
2a67038836 Merge pull request #10608 from microsoft/saulparedes/policy_metadatata_uid
policy: ignore optional metadata uid field
2024-12-03 10:19:12 -08:00
Dan Mihai
25e6f4b2a5 Merge pull request #10592 from microsoft/saulparedes/add_constants_to_rules
policy: add constants to rules.rego
2024-12-03 10:17:10 -08:00
Aurélien Bombo
5e1fc5a63f Merge pull request #10609 from kata-containers/sprt/fix-publish-csi
ci: Fix Docker publishing for CSI driver
2024-12-03 11:21:55 -06:00
Hyounggyu Choi
8b998e5f0c runtime-rs: Introduce get_devno_ccw() for deduplication
The devno assignment logic is repeated in 5 different places
during device addition.
To improve code maintainability and readability, this commit
introduces a standalone function, `get_devno_ccw()`,
to handle the deduplication.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-12-03 15:35:03 +01:00
Leonard Cohnen
9b614a4615 genpolicy: align state path getter and setter
Before this patch there was a mismatch between the JSON path under which
the state of the rule evaluation is set in comparison to under which
it is retrieved.

This resulted in the behavior that each time the policy was evaluated,
it thought it was the _first_ time the policy was evaluated.
This also means that the consistency check for the `sandbox_name`
was ineffective.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-12-03 13:25:24 +01:00
Aurélien Bombo
85d3bcd713 ci: Fix Docker publishing for CSI driver
The compilation succeeds, however Docker can't find the binary because
we specify an absolute path. In Docker world, an absolute path is
absolute to the Docker build context (here:
src/tools/csi-kata-directvolume).

To fix this, we link the binary into the build context, where the
Dockerfile expects it.

Failure mode:
https://github.com/kata-containers/kata-containers/actions/runs/12068202642/job/33693101962?pr=10563#step:8:213

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-02 15:50:01 -06:00
Saul Paredes
711d12e5db policy: support optional metadata uid field
This prevents a deserialization error when uid is specified

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-02 11:24:58 -08:00
Dan Mihai
efd492d562 runtime: skip logging some of the dial errors
With full debug logging enabled there might be around 1,500 redials
so log just ~15 of these redials to avoid flooding the log.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-12-02 19:11:32 +00:00
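The log thinning described in efd492d562 (about 15 log entries for about 1,500 redials) can be sketched as follows. This is a Python illustration under the assumption that every 100th attempt is logged; the actual Go change may select attempts differently, and `should_log_redial` is a hypothetical name.

```python
def should_log_redial(attempt, every=100):
    """Return True for the redial attempts that should be logged."""
    # With every=100, 1,500 attempts produce 15 log entries.
    return attempt % every == 0
```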
Hyounggyu Choi
9c19d7674a Merge pull request #10590 from zvonkok/fix-ci
ci: Fix variant for confidential targets
2024-12-02 18:39:52 +01:00
Saul Paredes
9105c1fa0c policy: add constants to rules.rego
Reuse constants where applicable

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-02 08:28:58 -08:00
Hyounggyu Choi
6f4f94a9f0 Merge pull request #10595 from BbolroC/add-zvsi-devmapper-to-gatekeeper-required-jobs
gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs
2024-12-02 15:28:14 +01:00
Zvonko Kaiser
20442c0eae ci: Fix variant for confidential targets
The default initrd confidential target will have a
variant=confidential; we need to accommodate this
and make sure we also accommodate aaa-xxx-confidential targets.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-02 14:21:03 +00:00
stevenhorsman
b87b4b6756 metrics: Increase ranges range for qemu failing tests
We've also seen the qemu metrics tests failing due to the results
being slightly outside the max range for network-iperf3 parallel and the minimum for network-iperf3 jitter tests on PRs that have no code changes,
so we've increased the bounds to not see false negatives.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-29 10:52:16 +00:00
stevenhorsman
4011071526 metrics: Increase minval range for failing tests
We've seen a couple of instances recently where the metrics
tests are failing due to the results being below the minimum
value by ~2%.
For tests like latency I'm not sure why values being too low would
be an issue, but I've updated the minpercent range of the failing tests
to try and get them passing.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-29 10:50:02 +00:00
Hyounggyu Choi
de3452f8e1 gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs
As the following CI job has been marked as required:

- kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (devmapper, qemu, kubeadm)

we need to add it to the gatekeeper's required job list.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-28 12:46:47 +01:00
Fabiano Fidêncio
bdf10e651a Merge pull request #10597 from kata-containers/topic/unbreak-ci-3rd-time-s-a-charm
Unbreak the CI, 3rd attempt
2024-11-28 12:36:09 +01:00
Fabiano Fidêncio
92b8091f62 Revert "ci: unbreak: Reallow no-op builds"
This reverts commit 559018554b.

As we've noticed that this is causing issues with initrd builds in the
CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-28 12:02:40 +01:00
Fabiano Fidêncio
ca2098f828 build: Allow dummy builds (for when adding a new target)
This will help us to simply allow a new dummy build whenever a new
component is added.

As long as the format `$(call DUMMY,$@)` is followed, we should be good
to go without taking the risk of breaking the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-28 11:13:24 +01:00
Fabiano Fidêncio
f9930971a2 Merge pull request #10594 from sprt/sprt/unbreak-ci-noop-build
ci: unbreak: Reallow no-op builds
2024-11-28 07:38:25 +01:00
Aurélien Bombo
559018554b ci: unbreak: Reallow no-op builds
#9838 previously modified the static build so as not to repeatedly
copy the same assets on each matrix iteration:

https://github.com/kata-containers/kata-containers/pull/9838#issuecomment-2169299202

However, that implementation breaks specifying no-op/WIP build targets
such as done in e43c59a. Such no-op builds have been a historical
requirement of the project because of a GHA limitation. The breakage is due to
no-op builds not generating a tar file corresponding to the asset:

https://github.com/kata-containers/kata-containers/actions/runs/12059743390/job/33628926474?pr=10592

To address this breakage, we revert to the `cp -r` implementation and
add the `--no-clobber` flag to still preserve the current behavior. Note
that `-r` will also create the destination directory if it doesn't
exist.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-27 18:40:29 -06:00
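The no-clobber copy behaviour that 559018554b relies on can be sketched as follows. This is a Python illustration of the idea, not the workflow's `cp -r --no-clobber` call; `copy_no_clobber` is a hypothetical helper, and unlike `cp -r` it always copies the contents of `src` into `dst` rather than nesting `src` inside an existing `dst`.

```python
import shutil
from pathlib import Path


def copy_no_clobber(src, dst):
    """Recursively copy src into dst without overwriting existing files."""
    src, dst = Path(src), Path(dst)
    # Create the destination if it does not exist yet.
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(src.rglob("*")):
        target = dst / path.relative_to(src)
        if path.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        elif not target.exists():
            # Only files missing from the destination are copied.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
```

Repeated matrix iterations then leave already-copied assets untouched, and an empty source (a no-op build) is not an error.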
Fabiano Fidêncio
9699c7ed06 Merge pull request #10589 from kata-containers/sprt/fix-csi-publish
gha: Unbreak CI and work around workflow limit
2024-11-27 23:52:55 +01:00
Aurélien Bombo
eac197d3b7 Merge pull request #10564 from microsoft/danmihai1/clh-endpoint-type
runtime: clh: addNet() logging clean-up
2024-11-27 14:44:14 -06:00
Aurélien Bombo
7f659f3d63 gha: Unbreak CI and work around workflow limit
#10561 inadvertently broke the CI by going over the limit of
20 reusable workflows:

https://github.com/kata-containers/kata-containers/actions/runs/12054648658/workflow

This commit fixes that by inlining the job.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-27 12:23:15 -06:00
Aurélien Bombo
16a91fccbe Merge pull request #10561 from sprt/csi-driver-ci
coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]
2024-11-27 10:26:45 -06:00
Fabiano Fidêncio
175fe8bc66 Merge pull request #10585 from fidencio/topic/kata-deploy-use-drop-in-containerd-config-whenever-it-is-possible
kata-deploy: Use drop-in files whenever it's possible
2024-11-27 16:36:18 +01:00
Steve Horsman
6bb00d9a1d Merge pull request #10583 from squarti/agent-startup-cdh-client
agent: fix startup when guest_components_procs is set to none
2024-11-27 11:43:07 +00:00
Fabiano Fidêncio
500508a592 kata-deploy: Use drop-in files whenever it's possible
This will make our lives considerably easier when it comes to cleaning
up content added, while it's also a groundwork needed for having
multiple installations running in parallel.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-27 12:27:08 +01:00
Steve Horsman
3240f8a4b8 Merge pull request #10586 from stevenhorsman/delete-rootfs-binary-assets-after-rootfs-build
workflows: Remove rootfs binary artifacts
2024-11-27 10:03:20 +00:00
Fabiano Fidêncio
c472fe1924 Merge pull request #10584 from fidencio/topic/kata-deploy-prepare-for-containerd-config-version-3
kata-deploy: Support containerd configuration version 3
2024-11-26 18:44:56 +01:00
stevenhorsman
3e5d360185 workflows: Remove rootfs binary artifacts
We need to publish certain artefacts for the rootfs,
like the agent, guest-components, pause bundle etc
as they are consumed in the `build-asset-rootfs` step.
However after this point they aren't needed and probably
shouldn't be included in the overall kata tarball, so delete
them once they aren't needed any more to avoid them
being included.

Fixes: #10575
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-26 15:24:20 +00:00
Fabiano Fidêncio
6f70ab9169 kata-deploy: Adapt how the containerd version is checked for k0s
Let's actually mount the whole /etc/k0s as /etc/containerd, so we can
easily access the containerd configuration file which has the version in
it, allowing us to parse it instead of just making a guess based on
kubernetes distro being used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-26 16:15:11 +01:00
Silenio Quarti
1230bc77f2 agent: fix startup when guest_components_procs is set to none
This PR ensures that OCICRYPT_CONFIG_PATH file is initialized only
when CDH socket exists. This prevents startup error if attestation
binaries are not installed in PodVM.

Fixes: https://github.com/kata-containers/kata-containers/issues/10568

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-11-26 09:57:04 -05:00
Fabiano Fidêncio
f5a9aaa100 kata-deploy: Support containerd config version 3
On Ubuntu 24.04, with the distro default containerd, we're already
getting:
```
$ containerd config default | grep "version = "
version = 3
```

With that in mind, let's make sure that we're ready to support this from
the next release.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-26 14:01:50 +01:00
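The version check that f5a9aaa100 and 6f70ab9169 describe (read the containerd configuration and parse its `version` field instead of guessing from the Kubernetes distro) could be sketched as follows. This is only an illustration in Python, not the kata-deploy script; the helper name `containerd_config_version` and the fallback value are assumptions.

```python
import re


def containerd_config_version(config_text: str, default: int = 1) -> int:
    """Return the `version = N` value found in a containerd config.

    Hypothetical helper: `default` is an assumed fallback used when the
    configuration has no `version` line at all.
    """
    match = re.search(r"^\s*version\s*=\s*(\d+)\s*$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else default


# Example input shaped like the output of `containerd config default`.
sample = 'version = 3\nroot = "/var/lib/containerd"\n'
print(containerd_config_version(sample))
```

A caller would then branch on the returned number to pick the matching configuration layout.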
Fupan Li
28166c8a32 Merge pull request #10577 from Apokleos/fix-vfiodev-name
runtime-rs: fix vfio device name combination issue
2024-11-26 09:35:45 +08:00
Dan Mihai
d93900c128 Merge pull request #10543 from microsoft/danmihai1/regorus-warning
genpolicy: avoid regorus warning
2024-11-25 16:47:33 -08:00
Zvonko Kaiser
1b10e82559 Merge pull request #10516 from zvonkok/kata-agent-cdi
ci: Fix error on self-hosted machines
2024-11-25 18:49:37 -05:00
Ryan Savino
e46d24184a Merge pull request #10386 from kimullaa/fix-build-error-when-using-sev-snp
docs: Fix several build failures when I tried the procedures in "Kata Containers with AMD SEV-SNP VMs"
2024-11-25 16:58:52 -06:00
Dan Mihai
f340b31c41 genpolicy: avoid regorus warning
Avoid adding to the Guest console warnings about "agent_policy:10:8".

"import input" is unnecessary.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-11-25 21:19:01 +00:00
Zvonko Kaiser
c3d1b3c5e3 Merge pull request #10464 from zvonkok/nvidia-gpu-rootfs
gpu: NVIDIA GPU initrd/image build
2024-11-25 16:16:42 -05:00
Fabiano Fidêncio
8763a9bc90 Merge pull request #10520 from fidencio/topic/drop-clear-linux-rootfs
osbuilder: Drop Clear Linux
2024-11-25 21:16:03 +01:00
Dan Mihai
78cbf33f1d runtime: clh: addNet() logging clean-up
Avoid logging the same endpoint fields twice from addNet().

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-11-25 19:58:54 +00:00
alex.lyn
5dba680afb runtime-rs: fix vfio device name combination issue
Fixes #10576

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-11-25 14:01:43 +08:00
Hyounggyu Choi
48e2df53f7 runtime-rs: Add devno to DeviceVirtioScsi
A new attribute named `devno` is added to DeviceVirtioScsi.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
2cc48f7822 runtime-rs: Add devno to DeviceVhostUserFs
A new attribute named `devno` is added to DeviceVhostUserFs.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
920484918c runtime-rs: Add devno to VhostVsock
A new attribute named `devno` is added to VhostVsock.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
9486790089 runtime-rs: Add devno to DeviceVirtioSerial
A new attribute named `devno` is added to DeviceVirtioSerial.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
516daecc50 runtime-rs: Add devno to DeviceVirtioBlk
A new attribute named `devno` is added to DeviceVirtioBlk.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
30a64092a7 runtime-rs: Add CcwSubChannel to provide devno for CCW devices
To explicitly specify a device number on the QEMU command line
for the following devices using the CCW transport on s390x:

- SerialDevice
- BlockDevice
- VhostUserDevice
- SCSIController
- VSOCKDevice

this commit introduces a new structure CcwSubChannel and implements
the following methods:

- add_device()
- remove_device()
- address_format_ccw()
- set_addr()

You can see the detailed explanation for each method in the comment.

This resolves the 1st part of #10573.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
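As a rough picture of what a structure with the four methods named in 30a64092a7 might do, here is a minimal sketch in Python. The real implementation lives in runtime-rs and its behaviour is documented in the commit's own comments, which are not shown here, so every field, the slot allocation policy, and the `fe.<addr>.<devno>` address layout are assumptions made for illustration.

```python
class CcwSubChannel:
    """Hypothetical allocator of device numbers for CCW devices."""

    def __init__(self, addr=0, max_devices=0x10000):
        self.addr = addr                # assumed subchannel-set id
        self.max_devices = max_devices  # assumed upper bound on slots
        self.devices = {}               # device id -> allocated slot
        self.next_slot = 0              # removed slots are not reused here

    def add_device(self, dev_id):
        """Reserve a slot for dev_id and return it (idempotent)."""
        if dev_id in self.devices:
            return self.devices[dev_id]
        if self.next_slot >= self.max_devices:
            raise RuntimeError("no free CCW device numbers")
        slot = self.next_slot
        self.devices[dev_id] = slot
        self.next_slot += 1
        return slot

    def remove_device(self, dev_id):
        """Forget dev_id; raises KeyError if it was never added."""
        del self.devices[dev_id]

    def address_format_ccw(self, slot):
        """Render a slot as a device number string (assumed layout)."""
        return f"fe.{self.addr:x}.{slot:04x}"

    def set_addr(self, addr):
        """Change the subchannel-set id used when formatting addresses."""
        self.addr = addr


channel = CcwSubChannel()
slot = channel.add_device("virtio-blk-0")
print(channel.address_format_ccw(slot))
```

The follow-up commits that add `devno` to each device type would then carry a string like the one printed above through to the QEMU command line.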
Steve Horsman
322073bea1 Merge pull request #10447 from ldoktor/required-jobs
ci: Required jobs
2024-11-22 09:15:11 +00:00
Lukáš Doktor
e69635b376 ci.gatekeeper: Remove unused variable
this is a left-over from the previous way of iterating over jobs.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-22 09:27:11 +01:00
Lukáš Doktor
fa7bca4179 ci.gatekeeper: Print the older job id
let's also print the existing result's id when printing the
information about ignoring an older result id, to simplify debugging.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-22 09:27:11 +01:00
Lukáš Doktor
6c19a067a0 ci.gatekeeper: Update existing results
the matching run_id means we're dealing with the same job but with
updated results and not with an older job. Update the results in such
case.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-22 09:27:09 +01:00
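The result handling described in 6c19a067a0 and fa7bca4179 can be pictured with the sketch below. It is not the gatekeeper code; the function name, the dictionary keys (`name`, `run_id`, `conclusion`), and the assumption that a larger `run_id` means a newer run are all hypothetical.

```python
def record_result(results, job):
    """Store `job` in `results` (keyed by job name), keeping the newest run.

    Hypothetical field names: "name", "run_id", "conclusion".
    """
    existing = results.get(job["name"])
    if existing is None or job["run_id"] > existing["run_id"]:
        # First result for this job, or a result from a newer run.
        results[job["name"]] = job
    elif job["run_id"] == existing["run_id"]:
        # Same run: these are updated results, not an older job.
        results[job["name"]] = job
    else:
        # Older run: ignore it, but report both ids to ease debugging.
        print(
            f"ignoring {job['name']}: run {job['run_id']} "
            f"is older than {existing['run_id']}"
        )


results = {}
record_result(results, {"name": "build", "run_id": 2, "conclusion": "failure"})
record_result(results, {"name": "build", "run_id": 2, "conclusion": "success"})
record_result(results, {"name": "build", "run_id": 1, "conclusion": "failure"})
```

After these three calls the stored entry is the updated result of run 2, and the run 1 entry has been reported and dropped.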
Aurélien Bombo
5e4990bcf5 coco: ci: Add no-op steps to deploy CSI driver
This adds no-op steps that'll be used to deploy and clean up the CSI driver
used for testing.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:08:06 -06:00
Aurélien Bombo
893f6a4ca0 ci: Introduce job to publish CSI driver image
This adds a new job to build and publish the CSI driver Docker image.

Of course this job will fail after we merge this PR because the CSI driver
compilation job hasn't been implemented yet. However that will be implemented
directly after in #10561.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:07:59 -06:00
Aurélien Bombo
e43c59a2c6 ci: Add no-op step to compile CSI driver
This adds a no-op build step to compile the CSI driver. The actual compilation
will be implemented in a later PR, so as to ensure we don't break the CI.

Addresses: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:06:55 -06:00
Zvonko Kaiser
0debf77770 gpu: NVIDIA gpu initrd/image build
With each release make sure we ship a GPU enabled rootfs/initrd

Fixes: #6554

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-21 18:57:23 +00:00
Steve Horsman
b4da4b5e3b Merge pull request #10377 from coolljt0725/fix_build
osbuilder: Fix build dependency of ubuntu rootfs with Docker
2024-11-21 08:45:59 +00:00
Jitang Lei
ed4c727c12 osbuilder: Fix build dependency of ubuntu rootfs with Docker
Building the Ubuntu rootfs with Docker failed with the error:
`Unable to find libclang`

Fix this error by adding libclang-dev to the dependency.

Signed-off-by: Jitang Lei <leijitang@outlook.com>
2024-11-21 10:49:27 +08:00
Zvonko Kaiser
e9f36f8187 ci: Fixing simple typo
change evn to env

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-20 18:40:14 +00:00
Zvonko Kaiser
a5733877a4 ci: Fix error on self-hosted machines
We need to clean-up any created files/dirs otherwise
we cause problems on self-hosted runners. Using tempdir which
will be removed automatically.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-20 18:40:13 +00:00
Lukáš Doktor
62e8815a5a ci: Add documentation to cover mapping format
to help people with adding new entries.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-20 17:25:59 +01:00
Lukáš Doktor
64306dc888 ci: Set required-tests according to GH required tests
this should record the current list of required tests from GH.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-20 17:25:57 +01:00
Steve Horsman
358ebf5134 Merge pull request #10558 from AdithyaKrishnan/main
ci: Re-enable SNP CI
2024-11-20 10:27:41 +00:00
Steve Horsman
30bad4ee43 Merge pull request #10562 from stevenhorsman/remove-release-artifactor-skips
workflows: Remove skipping of artifact uploads
2024-11-20 08:45:37 +00:00
Adithya Krishnan Kannan
2242aee099 ci: Skip the failing tests in SNP
Per [Issue#10549](https://github.com/kata-containers/kata-containers/issues/10549),
the following tests are failing on SNP.
1. k8s-guest-pull-image-encrypted.bats
2. k8s-guest-pull-image-authenticated.bats
3. k8s-guest-pull-image-signature.bats
4. k8s-confidential-attestation.bats

Per @fidencio 's comment on
[PR#10558](https://github.com/kata-containers/kata-containers/pull/10558),
I am skipping the same.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-19 10:41:43 -06:00
stevenhorsman
da5f6b77c7 workflows: Remove skipping of artifact uploads
Now we are downloading artifacts to create the rootfs
we need to ensure they are uploaded always,
even on releases

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-19 13:28:02 +00:00
Steve Horsman
817438d1f6 Merge pull request #10552 from stevenhorsman/3.11.0-release
release: Bump version to 3.11.0
2024-11-19 09:44:35 +00:00
Saul Paredes
eab48c9884 Merge pull request #10545 from microsoft/cameronbaird/sync-clh-logging
runtime: fix comment to accurately reflect clh behavior
2024-11-18 11:25:58 -08:00
Adithya Krishnan Kannan
ef367d81f2 ci: Re-enable SNP CI
We've debugged the SNP Node and we
wish to test the fixes on GHA.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-18 11:11:27 -06:00
stevenhorsman
7a8ba14959 release: Bump version to 3.11.0
Bump `VERSION` and helm-chart versions

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-18 11:13:15 +00:00
Steve Horsman
0ce3f5fc6f Merge pull request #10514 from squarti/pause_command
agent: overwrite OCI process spec when overwriting pause image
2024-11-15 18:03:58 +00:00
Fabiano Fidêncio
92f7526550 Merge pull request #10542 from Crypt0s/topic/enable-CONFIG_KEYS
kernel: add CONFIG_KEYS=y to enable kernel keyring
2024-11-15 12:15:25 +01:00
Crypt0s
563a6887e2 kernel: add CONFIG_KEYS=y to enable kernel keyring
KinD checks for the presence of this (and other) kernel configuration
via scripts like
https://blog.hypriot.com/post/verify-kernel-container-compatibility/ or
attempts to directly use /proc/sys/kernel/keys/ without checking to see
if it exists, causing an exit when it does not see it.

Docker/its consumers apparently expect to be able to use the kernel
keyring and its associated syscalls from/for containers.

There aren't any known downsides to enabling this except that it would
by definition enable additional syscalls defined in
https://man7.org/linux/man-pages/man7/keyrings.7.html which are
reachable from userspace. This minimally increases the attack surface of
the Kata Kernel, but this attack surface is minimal (especially since
the kernel is most likely being executed by some kind of hypervisor) and
highly restricted compared to the utility of enabling this feature to
get further containerization compatibility.

Signed-off-by: Crypt0s <BryanHalf@gmail.com>
2024-11-15 09:30:06 +01:00
Shunsuke Kimura
706e8bce89 docs: change from OVMF.fd to AmdSev.fd
change the build method to generate OVMF for AmdSev.
This commit adds `ovmf_build=sev` env parameter.
<638c2c4164>

Fixes #10378

Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>
2024-11-15 11:24:45 +09:00
Shunsuke Kimura
d7f6fabe65 docs: fix build-kernel.sh option
`build-kernel.sh` no longer takes an argument for the -x option.
<6c3338271b>

Fixes #10378

Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>
2024-11-15 11:24:45 +09:00
Cameron Baird
65881ceb8a runtime: fix comment to accurately reflect clh behavior
Fix the CLH log levels description

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2024-11-14 23:16:11 +00:00
Silenio Quarti
42b6203493 agent: overwrite OCI process spec when overwriting pause image
The PR replaces the OCI process spec of the pause container with the spec of
the guest provided pause bundle.

Fixes: https://github.com/kata-containers/kata-containers/issues/10537

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-11-14 13:05:16 -05:00
Fabiano Fidêncio
6a9266124b Merge pull request #10501 from kata-containers/topic/ci-split-tests
ci: tdx: Split jobs to run in 2 different machines
2024-11-14 17:24:50 +01:00
Fabiano Fidêncio
9b3fe0c747 ci: tdx: Adjust workflows to use different machines
This will be helpful in order to increase the OS coverage (we'll be
using both Ubuntu 24.04 and CentOS 9 Stream), while also reducing the
amount spent on the tests (as one machine will only run attestation
related tests, and the other the tests that do *not* require
attestation).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-14 15:52:00 +01:00
Fabiano Fidêncio
9b1a5f2ac2 tests: Add a way to run only tests which rely on attestation
We're doing this as, at Intel, we have two different kinds of machines we
can plug into our CI.  Without going much into details, only one of
those two kinds of machines will work for the attestation tests we
perform with ITA, thus in order to speed up the CI and improve test
coverage (OS wise), we're going to run different tests in different
machines.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-14 15:51:57 +01:00
Steve Horsman
915695f5ef Merge pull request #9407 from mrIncompetent/root-fs-clang
rootfs: Install missing clang in Ubuntu docker image
2024-11-14 10:35:06 +00:00
Henrik Schmidt
57a4dbedeb rootfs: Install missing libclang-dev in Ubuntu docker image
Fixes #9444

Signed-off-by: Henrik Schmidt <mrIncompetent@users.noreply.github.com>
2024-11-14 08:48:24 +00:00
Hyounggyu Choi
5869046d04 Merge pull request #9195 from UiPath/fix/vcpus-for-static-mgmt
runtime: Set maxvcpus equal to vcpus for the static resources case
2024-11-14 09:38:20 +01:00
Dan Mihai
d9977b3e75 Merge pull request #10431 from microsoft/saulparedes/add-policy-state
genpolicy: add state to policy
2024-11-13 11:48:46 -08:00
Aurélien Bombo
7bc2fe90f9 Merge pull request #10521 from ncppd/osbuilder-cleanup
osbuilder: remove redundant env variable
2024-11-13 12:17:09 -06:00
Steve Horsman
a947d2bc40 Merge pull request #10539 from AdithyaKrishnan/main
ci: Temporarily skip SNP CI
2024-11-13 17:58:32 +00:00
Adithya Krishnan Kannan
439a1336b5 ci: Temporarily skip SNP CI
As discussed in the CI working group,
we are temporarily skipping the SNP CI
to unblock the remaining workflow.
Will revert after fixing the SNP runner.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-13 11:44:16 -06:00
Fabiano Fidêncio
02d4c3efbf Merge pull request #10519 from fidencio/topic/relax-restriction-for-qemu-tdx
Reapply "runtime: confidential: Do not set the max_vcpu to cpu"
2024-11-13 16:09:06 +01:00
Saul Paredes
c207312260 genpolicy: validate container sandbox names
Make sure all container sandbox names match the sandbox name of the first container.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-12 15:17:01 -08:00
Saul Paredes
52d1aea1f7 genpolicy: Add state
Use the regorus engine's add_data method to add state to the policy.
This data can later be accessed inside rego context through the data namespace.

Support state modifications (json-patches) that may be returned as a result from policy evaluation.

Also initialize a policy engine data slice "pstate" dedicated for storing state.

Fixes #10087

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-12 15:16:53 -08:00
Alexandru Matei
e83f8f8a04 runtime: Set maxvcpus equal to vcpus for the static resources case
Fixes: #9194

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-11-12 16:36:42 +02:00
GabyCT
06fe459e52 Merge pull request #10508 from GabyCT/topic/installartsta
gha: Get artifacts when installing kata tools in stability workflow
2024-11-11 15:59:06 -06:00
Nikos Ch. Papadopoulos
ab80cf8f48 osbuilder: remove redundant env variable
Remove the second declaration of GO_HOME in the rootfs-build ubuntu script.

Signed-off-by: Nikos Ch. Papadopoulos <ncpapad@cslab.ece.ntua.gr>
2024-11-11 19:49:28 +02:00
Fabiano Fidêncio
780b36f477 osbuilder: Drop Clear Linux
The Clear Linux rootfs is not being tested anywhere, and it seems Intel
doesn't have the capacity to review the PRs related to this (combined
with the lack of interest from the rest of the community on reviewing
PRs that are specific to this untested rootfs).

With this in mind, I'm suggesting we drop Clear Linux support and focus
on what we can actually maintain.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-11 15:22:55 +01:00
Fabiano Fidêncio
5618180e63 Merge pull request #10515 from kata-containers/sprt/ubuntu-latest-fix
gha: Hardcode ubuntu-22.04 instead of latest
2024-11-10 09:54:39 +01:00
Fabiano Fidêncio
2281342fb8 Merge pull request #10513 from fidencio/topic/ci-adjust-proxy-nightmare-for-tdx
ci: tdx: kbs: Ensure https_proxy is taken in consideration
2024-11-10 00:17:10 +01:00
Fabiano Fidêncio
0d8c4ce251 Merge pull request #10517 from microsoft/saulparedes/remove_manifest_v1_test
tests: remove manifest v1 test
2024-11-09 23:40:51 +01:00
Fabiano Fidêncio
56812c852f Reapply "runtime: confidential: Do not set the max_vcpu to cpu"
This reverts commit f15e16b692, as we
don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-09 23:20:17 +01:00
Saul Paredes
461efc0dd5 tests: remove manifest v1 test
This test was meant to show support for pulling images with v1 manifest schema versions.

The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it:

$ docker pull ymqytw/nginxhttps:1.5
Error response from daemon: missing signature key

We may remove this test since schema version 1 manifests are deprecated per
https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 :
"These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more
current images". This schema version was used by old docker versions. Further OCI spec
https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-08 13:38:51 -08:00
Aurélien Bombo
19e972151f gha: Hardcode ubuntu-22.04 instead of latest
GHA is migrating ubuntu-latest to Ubuntu 24 so
let's hardcode the current 22.04 LTS.

https://github.blog/changelog/2024-11-05-notice-of-breaking-changes-for-github-actions/

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-08 11:00:15 -06:00
Greg Kurz
2bd8fde44a Merge pull request #10511 from ldoktor/fedora-python
ci.ocp: Use the official python:3 container for sanity
2024-11-08 16:31:40 +01:00
Fabiano Fidêncio
baf88bb72d ci: tdx: kbs: Ensure https_proxy is taken in consideration
Trustee's deployment must set the correct https_proxy as env var on the
container that will talk to the ITA / ITTS server, otherwise the kbs
service won't be able to start, causing then issues in our CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Krzysztof Sandowicz <krzysztof.sandowicz@intel.com>
2024-11-08 16:06:16 +01:00
Steve Horsman
1f728eb906 Merge pull request #10498 from stevenhorsman/update-create-container-timeout-log
tests: k8s: Update image pull timeout error
2024-11-08 10:47:39 +00:00
Steve Horsman
6112bf85c3 Merge pull request #10506 from stevenhorsman/skip-runk-ci
workflow: Remove/skip runk CI
2024-11-08 09:54:06 +00:00
Steve Horsman
a5acbc9e80 Merge pull request #10505 from stevenhorsman/remove-stratovirt-metrics-tests
metrics: Skip metrics on stratovirt
2024-11-08 08:53:05 +00:00
Lukáš Doktor
2f7d34417a ci.ocp: Use the official python:3 container for sanity
Fedora F40 removed python3 from the base container, to avoid such issues
let's rely on the latest and greatest official python container.

Fixes: #10497

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-08 07:16:30 +01:00
Zvonko Kaiser
183bd2aeed Merge pull request #9584 from zvonkok/kata-agent-cdi
kata-agent: Add CDI support
2024-11-07 14:18:32 -05:00
Zvonko Kaiser
aa2e1a57bd agent: Added test-case for handle_cdi_devices
We are generating a simple CDI spec with device and
global containerEdits to test the CDI crate.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-07 17:03:18 +00:00
Gabriela Cervantes
4274198664 gha: Get artifacts when installing kata tools in stability workflow
This PR adds the get artifacts which are needed when installing kata
tools in stability workflow to avoid failures saying that artifacts
are missing.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-07 16:20:41 +00:00
stevenhorsman
a5f1a5a0ee workflow: Remove/skip runk CI
As discussed in the AC meeting, we don't have a maintainer,
(or users?) of runk, and the CI is unstable, so given we can't
support it, we shouldn't waste CI cycles on it.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-07 14:16:30 +00:00
stevenhorsman
0efe9f4e76 metrics: Skip metrics on stratovirt
As discussed on the AC call, we are lacking maintainers for the
metrics tests. As a starting point for potentially phasing them
out, we discussed starting with removing the test for stratovirt
as a non-core hypervisor and a job that is problematic in leaving
behind resources that need cleaning up.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-07 14:06:57 +00:00
Fabiano Fidêncio
c332e953f9 Merge pull request #10500 from squarti/fix-10499
runtime: Files are not synced between host and guest VMs
2024-11-07 08:28:53 +01:00
Silenio Quarti
be3ea2675c runtime: Files are not synced between host and guest VMs
This PR makes the root dir absolute after resolving the
default root dir symlink. 

Fixes: https://github.com/kata-containers/kata-containers/issues/10499

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-11-06 17:31:12 -05:00
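The idea in be3ea2675c (resolve the default root dir symlink, then make the result absolute) can be illustrated in a few lines. This is a Python sketch of the general technique, not the runtime code; `resolve_root_dir` is a hypothetical name, and it only follows a single level of symlink.

```python
import os


def resolve_root_dir(path):
    """Follow one level of symlink and return an absolute path."""
    if os.path.islink(path):
        target = os.readlink(path)
        if not os.path.isabs(target):
            # A relative link target is relative to the link's own directory,
            # not to the current working directory.
            target = os.path.join(os.path.dirname(path), target)
        path = target
    return os.path.abspath(path)
```

Without the join step, a relative link target would be interpreted against the process's working directory, which is the kind of mismatch that can leave two components looking at different directories.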
GabyCT
47cea6f3c6 Merge pull request #10493 from GabyCT/topic/katatoolsta
gha: Add install kata tools as part of the stability workflow
2024-11-06 14:16:48 -06:00
Gabriela Cervantes
13e27331ef gha: Add install kata tools as part of the stability workflow
This PR adds the install kata tools step as part of the k8s stability workflow,
to avoid failures saying that certain kata components are not installed.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-06 20:07:06 +00:00
Fabiano Fidêncio
71c4c2a514 Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev
workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
2024-11-06 21:04:45 +01:00
Zvonko Kaiser
3995fe71f9 kata-agent: Add CDI support
For proper device handling add CDI support

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-06 17:50:20 +00:00
stevenhorsman
85554257f8 tests: k8s: Update image pull timeout error
Currently the error we are checking for is
`CreateContainerRequest timed out`, but this message
doesn't always seem to be printed to our pod log.
Try using a more general message that should be present
more reliably.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-06 17:00:26 +00:00
Fabiano Fidêncio
a3c72e59b1 Merge pull request #10495 from littlejawa/ci/skip_nginx_connectivity_for_crio
ci: skip nginx connectivity test with qemu/crio
2024-11-06 13:43:19 +01:00
Julien Ropé
da5e0c3f53 ci: skip nginx connectivity test with crio
We have an error with service name resolution with this test when using crio.
This error could not be reproduced outside of the CI for now.
Skipping it to keep the CI job running until we find a solution.

See: #10414

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 12:07:02 +01:00
Greg Kurz
5af614b1a4 Merge pull request #10496 from littlejawa/ci/expose_container_runtime
ci: export CONTAINER_RUNTIME to the test scripts
2024-11-06 12:05:36 +01:00
Julien Ropé
6d0cb1e9a8 ci: export CONTAINER_RUNTIME to the test scripts
This variable will allow tests to adapt their behaviour to the runtime (containerd/crio).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 11:29:11 +01:00
Fabiano Fidêncio
72979d7f30 workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
Once we're also testing it with qemu-coco-dev, it becomes
easier for a developer without access to TEE to also test it locally.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-06 10:47:08 +01:00
Fabiano Fidêncio
7d3f2f7200 runtime: Match TEEs for the static_sandbox_resource_mgmt option
The qemu-coco-dev runtime class should be as close as possible to what
the TEEs runtime classes are doing, and this was one of the options that
ended up overlooked till now.

Shout out to Dan Mihai for noticing that!

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-06 10:47:08 +01:00
Fabiano Fidêncio
ea8114833c Merge pull request #10491 from fidencio/topic/fix-typo-in-the-ephemeral-handler
agent: fix typo on getting EphemeralHandler size option
2024-11-06 10:31:48 +01:00
Fabiano Fidêncio
7e6779f3ad Merge pull request #10488 from fidencio/topic/teach-our-machinery-to-deal-with-rc-kernels
build: kernel: Teach our machinery to deal with -rc kernels
2024-11-05 16:19:57 +01:00
Zvonko Kaiser
a4725034b2 Merge pull request #9480 from zvonkok/build-image-suffix
image: Add suffix to image or initrd depending on the NVIDIA driver version
2024-11-05 09:43:56 -05:00
Fabiano Fidêncio
77c87a0990 agent: fix typo on getting EphemeralHandler size option
Most likely this was overlooked during the development / review, but
we're actually interested in the size rather than in the pagesize of the
hugepages.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 15:15:17 +01:00
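To make the distinction in 77c87a0990 concrete, the sketch below looks up a named option in a list of mount options. It is written in Python for illustration and is not the agent's Rust code; the helper name and the sample option values are assumptions.

```python
def mount_option(options, key):
    """Return the value of `key=value` in `options`, or None if absent."""
    prefix = key + "="
    for opt in options:
        # Matching on the full "key=" prefix keeps "size" from picking up
        # "pagesize", which merely ends with the same letters.
        if opt.startswith(prefix):
            return opt[len(prefix):]
    return None


# Hypothetical hugepages mount options.
options = ["pagesize=2M", "size=1G"]
print(mount_option(options, "size"))
print(mount_option(options, "pagesize"))
```

Asking for `pagesize` when `size` was intended still returns a value, which is why a mix-up like this can go unnoticed in review.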
Fabiano Fidêncio
2b16160ff1 versions: kernel-dragonball: Fix URL
SSIA

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:55:34 +01:00
Fabiano Fidêncio
f7b31ccd6c kernel: bump kata_config_version
Due to the changes done in the previous commits.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:57 +01:00
Fabiano Fidêncio
a52ea32b05 build: kernel: Learn how to deal with release candidates
So far we were not prepared to deal with release candidates as those:
* Do not have a sha256sum in the sha256sums provided by the kernel cdn
* Come from a different URL (directly from Linus)
* Have a different suffix (.tar.gz, instead of .tar.xz)

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
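The three differences listed in a52ea32b05 can be summarised in a small lookup. This Python sketch is not the build script; the function name is hypothetical and the URL patterns are assumptions used only to show how the two cases diverge.

```python
def kernel_tarball(version):
    """Describe where a kernel tarball would come from (illustrative only)."""
    if "-rc" in version:
        # Release candidates: different host, .tar.gz suffix, and no entry
        # in the published sha256sums file.
        return {
            "url": f"https://git.kernel.org/torvalds/t/linux-{version}.tar.gz",
            "suffix": ".tar.gz",
            "has_sha256": False,
        }
    major = version.split(".")[0]
    return {
        "url": (
            "https://cdn.kernel.org/pub/linux/kernel/"
            f"v{major}.x/linux-{version}.tar.xz"
        ),
        "suffix": ".tar.xz",
        "has_sha256": True,
    }


for v in ("6.12.13", "6.12-rc3"):
    info = kernel_tarball(v)
    print(v, info["suffix"], info["has_sha256"])
```

A builder that always receives the URL, as 9f2d4b2956 arranges, only needs to decide whether to verify a checksum and which decompressor to use.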
Fabiano Fidêncio
9f2d4b2956 build: kernel: Always pass the url to the builder
This doesn't change much on how we're doing things today, but it
simplifies a lot of cases that may be added later on (and will be) like
building -rc kernels.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
ee1a17cffc build: kernel: Take kernel_url into consideration
Let's make sure the kernel_url is actually used whenever it's passed to
the function.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
9a0b501042 build: kernel: Remove tee specific function
As, thankfully, we're relying on upstream kernels for TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
cc4006297a build: kernel: Pass the yaml base path instead of the version path
By doing this we can ensure this can be re-used, if needed (and it'll be
needed), for also getting the URL.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
7057ff1cd5 build: kernel: Always pass -f to the kernel builder
-f forces the (re)generation of the config when doing the setup, which
helps a lot on local development whilst not causing any harm in the CI
builds.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
910defc4cf Merge pull request #10490 from fidencio/topic/fix-ovmf-build
builds: ovmf: Workaround Zeex repo becoming private
2024-11-05 12:25:00 +01:00
Fabiano Fidêncio
aff3d98ddd builds: ovmf: Workaround Zeex repo becoming private
Let's just do a simple `sed` and **not** use the repo that became
private.

This is not a backport of https://github.com/tianocore/edk2/pull/6402,
but it's a similar approach that allows us to proceed without the need
to pick up a newer version of edk2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 11:25:54 +01:00
Dan Mihai
03bf4433d7 Merge pull request #10459 from stevenhorsman/update-bats
tests: k8s: Update bats
2024-11-04 12:26:58 -08:00
Aurélien Bombo
f639d3e87c Merge pull request #10395 from Sumynwa/sumsharma/create_container
agent-ctl: Add support to test kata-agent's container creation APIs.
2024-11-04 14:09:12 -06:00
GabyCT
7f066be04e Merge pull request #10485 from GabyCT/topic/fixghast
gha: Fix source for gha stability run script
2024-11-04 12:09:28 -06:00
Steve Horsman
a2b9527be3 Merge pull request #10481 from mkulke/mkulke/init-cdh-client-on-gcprocs-none
agent: perform attestation init w/o process launch
2024-11-04 17:27:45 +00:00
Gabriela Cervantes
fd4d0dd1ce gha: Fix source for gha stability run script
This PR fixes the source to avoid duplication, especially in the common.sh
script, and to avoid failures saying that a certain script is not in the directory.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-04 16:16:13 +00:00
Magnus Kulke
bf769851f8 agent: perform attestation init w/o process launch
This change is motivated by a problem in peerpod's podvms. In this setup
the lifecycle of guest components is managed by systemd. The current code
skips over init steps like setting the ocicrypt-rs env and initialization
of a CDH client in this case.

To address this the launch of the processes has been isolated into its
own fn.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-11-04 13:31:07 +01:00
Steve Horsman
4fd9df84e4 Merge pull request #10482 from GabyCT/topic/fixvirtdoc
docs: Update virtualization document
2024-11-04 11:51:09 +00:00
stevenhorsman
175ebfec7c Revert "k8s:kbs: Add trap statement to clean up tmp files"
This reverts commit 973b8a1d8f.

As @danmihai1 points out https://github.com/bats-core/bats-core/issues/364
states that using traps in bats is error prone, so this could be the cause
of the confidential test instability we've been seeing, like it was
in the static checks, so let's try and revert this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:37 +00:00
stevenhorsman
75cb1f46b8 tests/k8s: Add skip if setup_common fails
At @danmihai1's suggestion add a die message in case
the call to setup_common fails, so we can see it in the test
output.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
stevenhorsman
3f5bf9828b tests: k8s: Update bats
We've seen some issues with tests not being run in
some of the Coco CI jobs (Issue #10451) and in the
environments that are more stable we noticed that
they had a newer version of bats installed.

Try updating the version to 1.10+ and print out
the version for debug purposes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
Steve Horsman
06d2cc7239 Merge pull request #10453 from bpradipt/remote-annotation
runtime: Add GPU annotations for remote hypervisor
2024-11-04 09:10:06 +00:00
Zvonko Kaiser
3781526c94 gpu: Add VARIANT to the initrd and image build
We need to know if we're building a nvidia initrd or image
Additionally if we build a regular or confidential VARIANT

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-01 18:34:13 +00:00
Zvonko Kaiser
95b69c5732 build: initrd make it coherent to the image build
Add -f for moving the initrd to the correct file path

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-01 18:34:13 +00:00
Zvonko Kaiser
3c29c1707d image: Add suffix to image or initrd depending on the NVIDIA driver version
Fixes: #9478

We want to keep track of the driver versions built during initrd/image build so update the artifact_name after the fact.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-01 18:34:13 +00:00
Sumedh Alok Sharma
4b7aba5c57 agent-ctl: Add support to test kata-agent's container creation APIs.
This commit introduces changes to enable testing kata-agent's container
APIs of CreateContainer/StartContainer/RemoveContainer. The changeset
includes:
- using confidential-containers image-rs crate to pull/unpack/mount a
container image. Currently supports only un-authenicated registry pull
- re-factor api handlers to reduce cmdline complexity and handle
request generation logic in tool
- introduce an OCI config template for container creation
- add test case

Fixes #9707

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-11-01 22:18:54 +05:30
Fabiano Fidêncio
2efcb442f4 Merge pull request #10442 from Sumynwa/sumsharma/tools_use_ubuntu_static_build
ci: Use ubuntu for static building of kata tools.
2024-11-01 16:04:31 +01:00
Gabriela Cervantes
1ca83f9d41 docs: Update virtualization document
This PR updates the virtualization document by removing a URL link
which is no longer valid.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-31 17:28:02 +00:00
GabyCT
a3d594d526 Merge pull request #10480 from GabyCT/topic/fixstabilityrun
gha: Add missing steps in Kata stability workflow
2024-10-31 09:57:33 -06:00
Fabiano Fidêncio
e058b92350 Merge pull request #10425 from burgerdev/darwin
genpolicy: support darwin target
2024-10-31 12:16:44 +01:00
Markus Rudy
df5e6e65b5 protocols: only build RLimit impls on Linux
The current version of the oci-spec crate compiles RLimit structs only
for Linux and Solaris. Until this is fixed upstream, add compilation
conditions to the type converters for the affected structs.

Fixes: #10071

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-10-31 09:50:36 +01:00
Markus Rudy
091a410b96 kata-sys-util: move json parsing to protocols crate
The parse_json_string function is specific to parsing capability strings
out of ttRPC proto definitions and does not benefit from being available
to other crates. Moving it into the protocols crate allows removing
kata-sys-util as a dependency, which in turn enables compiling the
library on darwin.

Fixes: #10071

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-10-31 09:41:07 +01:00
Markus Rudy
8ab4bd2bfc kata-sys-util: remove obsolete cgroups dependency
The cgroups.rs source file was removed in
234d7bca04. With cgroups support handled
in runtime-rs, the cgroups dependency on kata-sys-util can be removed.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-10-31 09:41:07 +01:00
Sumedh Alok Sharma
0adf7a66c3 ci: Use ubuntu for static building of kata tools.
This commit introduces changes to use ubuntu for statically
building kata tools. In the existing CI setup, the tools
currently build only for x86_64 architecture.

It also fixes the build error seen for agent-ctl PR#10395.

Fixes #10441

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-10-31 13:19:18 +05:30
Gabriela Cervantes
c4089df9d2 gha: Add missing steps in Kata stability workflow
This PR adds missing steps in the gha run script for the kata stability
workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-30 19:13:15 +00:00
Xuewei Niu
1a216fecdf Merge pull request #10225 from Chasing1020/main
runtime-rs: Add basic boilerplate for remote hypervisor
2024-10-30 17:02:50 +08:00
Hyounggyu Choi
dca69296ae Merge pull request #10476 from BbolroC/switch-to-kubeadm-s390x
gha: Switch KUBERNETES from k3s to kubeadm on s390x
2024-10-30 09:52:06 +01:00
GabyCT
9293931414 Merge pull request #10474 from GabyCT/topic/removeunvarb
packaging: Remove kernel config repo variable as it is unused
2024-10-29 12:52:07 -06:00
Gabriela Cervantes
69ee287e50 packaging: Remove kernel config repo variable as it is unused
This PR removes the kernel config repo variable at the build kernel
script as it is not used.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-29 17:09:52 +00:00
GabyCT
8539cd361a Merge pull request #10462 from GabyCT/topic/increstress
tests: Increase time to run stressng k8s tests
2024-10-29 11:08:47 -06:00
Chasing1020
425f6ad4e6 runtime-rs: add oci spec for prepare_vm method
The cloud-api-adaptor needs to support different types of pod VM
instance.
We need to pass some annotations like machine_type, default_vcpus and
default_memory to prepare the VMs.

Signed-off-by: Chasing1020 <643601464@qq.com>
2024-10-30 01:01:28 +08:00
Chasing1020
f1167645f3 runtime-rs: support for remote hypervisors type
This patch adds the support of the remote hypervisor type for runtime-rs.
The cloud-api-adaptor needs the annotations and network namespace path
to create the VMs.
The remote hypervisor opens a UNIX domain socket specified in the config
file, and sends ttrpc requests to an external process to control sandbox
VMs.

Fixes: #10350

Signed-off-by: Chasing1020 <643601464@qq.com>
2024-10-30 00:54:17 +08:00
Pradipta Banerjee
6f1ba007ed runtime: Add GPU annotations for remote hypervisor
Add GPU annotations for remote hypervisor to help
with the right instance selection based on number of GPUs
and model

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2024-10-29 10:28:21 -04:00
Steve Horsman
68225b53ca Merge pull request #10475 from stevenhorsman/revert-10452
Revert "tests: Add trap statement in kata doc script"
2024-10-29 13:58:00 +00:00
Hyounggyu Choi
aeef28eec2 gha: Switch to kubeadm for run-k8s-tests-on-zvsi
Last November, SUSE discontinued support for s390x, leaving k3s
on this platform stuck at k8s version 1.28, while upstream k8s
has since reached 1.31. Fortunately, kubeadm allows us to create
a 1.30 Kubernetes cluster on s390x.
This commit switches the KUBERNETES option from k3s to kubeadm
for s390x and removes a dedicated cluster creation step.
Now, cluster setup and teardown occur in ACTIONS_RUNNER_HOOK_JOB_{STARTED,COMPLETED}.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-29 14:27:32 +01:00
Hyounggyu Choi
238f67005f tests: Add kubeadm option for KUBERNETES in gha-run.sh
When creating a k8s cluster via kubeadm, the devmapper setup
for containerd requires a different configuration.
This commit introduces a new `kubeadm` option for the KUBERNETES
variable and adjusts the path to the containerd config file for
devmapper setup.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-29 14:19:42 +01:00
stevenhorsman
b1cffb4b09 Revert "tests: Add trap statement in kata doc script"
This reverts commit 093a6fd542,
as it is breaking the static checks.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-29 09:57:18 +00:00
Aurélien Bombo
eb04caaf8f Merge pull request #10074 from koct9i/log-vm-start-error
runtime: log vm start error before cleanup
2024-10-28 14:39:00 -05:00
Fabiano Fidêncio
e675e233be Merge pull request #10473 from fidencio/topic/build-cache-fix-shim-v2-root_hash.txt-location
build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}"
2024-10-28 16:53:06 +01:00
Fabiano Fidêncio
f19c8cbd02 build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}"
All the oras push logic happens from inside `${workdir}`, while the
root_hash.txt extraction and renaming was not taking this into
consideration.

This was not caught during the manually triggered runs as those do not
perform the oras push.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 15:17:16 +01:00
Steve Horsman
51bc71b8d9 Merge pull request #10466 from kata-containers/topic/ensure-shim-v2-sets-the-measured-rootfs-parameters-to-the-config
re-enable measured rootfs build & tests
2024-10-28 13:11:50 +00:00
Fabiano Fidêncio
b70d7c1aac tests: Enable measured rootfs tests for qemu-coco-dev
Then it's on par with what's being tested with TEEs using a rootfs
image.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:54 +01:00
Fabiano Fidêncio
d23d057ac7 runtime: Enable measured rootfs for qemu-coco-dev
Let's make sure we are prepared to test this with non-TEE environments
as well.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
7d202fc173 tests: Re-enable measured_rootfs test for TDX
As we're now building everything needed to test TDX with measured rootfs
support, let's bring this test back in (for TDX only, at least for now).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
d537932e66 build: shim-v2: Ensure MEASURED_ROOTFS is exported
The approach taken for now is to export MEASURED_ROOTFS=yes on the
workflow files for the architectures using confidential stuff, and leave
the "normal" build without having it set (to avoid any change of
expectation on the current behaviour).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
9c8b20b2bf build: shim-v2: Rebuild if root_hashes do not match
Let's make sure we take the root_hashes into consideration to decide
whether the shim-v2 should or should not be used from the cached
artefacts.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
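The cache decision described in commit 9c8b20b2bf above can be sketched roughly as follows. This is only an illustration in Python, not the project's build script: the function name `can_use_cached_shim` and the two file arguments are hypothetical, and treating a missing file as a mismatch is an assumption made here to stay on the conservative side.

```python
from pathlib import Path


def can_use_cached_shim(cached_root_hash: Path, current_root_hash: Path) -> bool:
    """Return True only if both root hash files exist and hold the same content.

    Assumption: a missing file on either side counts as a mismatch, which
    forces a rebuild instead of silently reusing a stale cached shim-v2.
    """
    if not (cached_root_hash.is_file() and current_root_hash.is_file()):
        return False
    # Compare ignoring surrounding whitespace so a trailing newline is harmless.
    return cached_root_hash.read_text().strip() == current_root_hash.read_text().strip()
```

A caller would rebuild shim-v2 whenever this returns False, and could then also discard the unusable cached tarball, as commit d2d9792720 suggests.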
Fabiano Fidêncio
9c84998de9 build: cache: Cache root_hash.txt used by the shim-v2
Let's cache the root_hash.txt from the confidential image so we can use
them later on to decide whether there was a rootfs change that would
require shim-v2 to be rebuilt.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
d2d9792720 build: Don't leave cached component behind if it can't be used
Let's ensure we remove the component and any extra tarball provided by
ORAS in case the cached component cannot be used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
ef29824db9 runtime: Don't do measured rootfs for "vanilla" kernel
We may decide to add this later on, but for now this is only targeting
TEEs and the confidential image / initrd.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
a65946bcb0 workflows: build: Ensure rootfs is present for shim-v2 build
Let's ensure that we get the already built rootfs tarball from previous
steps of the action at the time we're building the shim-v2.

The reason we do that is because the rootfs binary tarball has a
root_hash.txt file that contains the information needed by the shim-v2
build scripts to add the measured rootfs arguments to the shim-v2
configuration files.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
6ea0369878 workflows: build: Ensure rootfs is built before shim-v2
As the rootfs will have what we need to add as part of the shim-v2
configuration files for measured rootfs, we **must** ensure this is
built **before** shim-v2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
13ea082531 workflows: Build rootfs after its deps are built
By doing this we can just re-use the dependencies already built, saving
us a reasonable amount of time.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
eb07a809ce tests: Add a helper script to use prebuild components
This is a helper script that does basically what's already being done by
the s390x CI, which is:
* Move a folder with the components that were stored / downloaded
  during the GHA execution to the expected `build` location
* Get rid of the dependencies for a specific asset, as the dependencies
  are already pulled in from previous GHA steps

For now this script is only being added but not yet executed anywhere,
and that will come as the next step in this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:52 +01:00
Fabiano Fidêncio
c2b18f9660 workflows: Store rootfs dependencies
So far we haven't been storing the rootfs dependencies as part of our
workflows, but we better do it to re-use them as part of the rootfs
build.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:52 +01:00
Steve Horsman
b5f503b0b5 Merge pull request #10471 from fidencio/topic/possibly-fix-release-workflow
workflows: Possibly fix the release workflow
2024-10-28 11:38:33 +00:00
Konstantin Khlebnikov
ee50582848 runtime: log vm start error before cleanup
Returning the proper error to the initiator is not guaranteed:
the StopVM method could kill the shim process together with the VM pieces.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
2024-10-28 11:21:21 +01:00
Fabiano Fidêncio
a8fad6893a workflows: Possibly fix the release workflow
The only reason we had this one passing for amd64 is because the check
was done using the wrong variable (`matrix.stage`, while in the other
workflows the variable used is `inputs.stage`).

The commit that broke the release process is 67a8665f51, which
blindly copy & pasted the logic from the matrix assets.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 11:15:53 +01:00
Steve Horsman
ad5749fd6b Merge pull request #10467 from stevenhorsman/release-3.10.1
release: Bump version to 3.10.1
2024-10-25 20:19:23 +01:00
stevenhorsman
b22d4429fb release: Bump version to 3.10.1
Fix release to pick up #10463

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-25 17:16:09 +01:00
Steve Horsman
19ac0b24f1 Merge pull request #10463 from skaegi/rustjail_filemode_perm_fix
agent: Correct rustjail device filemode permission typo
2024-10-25 14:27:50 +01:00
Fabiano Fidêncio
cc815957c0 Merge pull request #10461 from kata-containers/topic/workflows-follow-up-on-manually-triggered-job
workflows: devel: Follow-up on the manually triggered jobs
2024-10-25 08:31:14 +02:00
Simon Kaegi
322846b36f agent: Correct rustjail device filemode permission typo
Corrects device filemode permissions typo/regression in rustjail to `666` instead of `066`.
`666` is the standard and expected value for these devices in containers.

Fixes: #10454

Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
2024-10-24 16:46:40 -04:00
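To see why the typo fixed in commit 322846b36f above matters, the two octal values can be rendered as permission strings. This is a small Python illustration, not rustjail code; `describe_mode` is a hypothetical helper, and the character-device file type bit is added only so the output resembles what `ls -l` shows for a device node.

```python
import stat


def describe_mode(mode: int) -> str:
    """Render permission bits as an ls-style string for a character device."""
    return stat.filemode(stat.S_IFCHR | mode)


print(describe_mode(0o666))  # crw-rw-rw-  (owner, group and others can read/write)
print(describe_mode(0o066))  # c---rw-rw-  (the owner is locked out)
```

With `066` the owning user loses all access to the device while group and others keep it, which is the opposite of what a container workload would normally expect.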
GabyCT
a9af46ccd2 Merge pull request #10452 from GabyCT/topic/katadoctemp
tests: Add trap statement in kata doc script
2024-10-24 13:21:11 -06:00
Gabriela Cervantes
a3ef8c0a16 tests: Increase time to run stressng k8s tests
This PR increases the time to run the stressng k8s tests for the
CoCo stability CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-24 16:34:17 +00:00
Fabiano Fidêncio
475ad3e06b workflows: devel: Allow running more than one at once
More than one developer can and should be able to run this workflow at
the same time, without cancelling the job started by another developer.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-24 15:38:35 +02:00
Fabiano Fidêncio
8f634ceb6b workflows: devel: Adjust the pr-number
Let's use "dev" instead of "manually-triggered" as it avoids the name
being too long, which results in failures to create AKS clusters.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-24 15:38:31 +02:00
GabyCT
41d1178e4a Merge pull request #10438 from GabyCT/topic/fixspellreadme
docs: Fix misspelling in CI documentation
2024-10-23 13:34:52 -06:00
Steve Horsman
c5c389f473 Merge pull request #10449 from kata-containers/topic/add-workflows-specifically-for-testing
Add a specific workflow for testing the CI, without messing up with the "nightly" weather
2024-10-23 19:03:49 +01:00
Gabriela Cervantes
093a6fd542 tests: Add trap statement in kata doc script
This PR adds the trap statement into the kata doc
script to properly clean up the temporary files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-23 15:56:58 +00:00
Gabriela Cervantes
701891312e docs: Fix misspelling in CI documentation
This PR fixes a misspelling in CI documentation readme.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-23 15:42:08 +00:00
Fabiano Fidêncio
829415dfda workflows: Remove the possibility to manually trigger the nightly CI
As a new workflow was added for the cases where developers want to test
their changes in the workflow itself, let's make sure we stop allowing
manual triggers on this workflow, which can lead to a polluted /
misleading weather of the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-23 13:19:45 +02:00
Fabiano Fidêncio
cc093cdfdb workflows: Add a manually trigger "devel" workflow for the CI
This workflow is intended to replace the `workflow_dispatch` trigger
currently present as part of the `ci-nightly.yaml`.

The reasoning behind having this done in this way is because of our good
old GHA behaviour for `pull_request_target`, which requires a PR to
be merged in order to check the changes in the workflow itself, which
leads to:
* when a change in a workflow is done, developers (should) do:
  * push their branch to the kata-containers repo
  * manually trigger the "nightly" CI in order to ensure the changes
    don't break anything
    * this can result in the "nightly" CI weather being polluted
      * we don't have the guarantee / assurance about the last n nightly
        runs anymore

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-23 13:14:50 +02:00
Greg Kurz
378f454fb9 Merge pull request #10208 from wtootw/main
runtime: Failed to clean up resources when QEMU is terminated
2024-10-23 12:11:57 +02:00
Fabiano Fidêncio
ca416d8837 Merge pull request #10446 from kata-containers/topic/re-work-shim-v2-build-as-part-of-the-ci-and-release
workflows: Ensure shim-v2 is built as the last asset
2024-10-23 09:27:29 +02:00
Fabiano Fidêncio
c082b99652 Merge pull request #10439 from microsoft/mahuber/azl-cfg-var
tools: Change PACKAGES var for cbl-mariner
2024-10-23 08:39:49 +02:00
Manuel Huber
a730cef9cf tools: Change PACKAGES var for cbl-mariner
Change the PACKAGES variable for the cbl-mariner rootfs-builder
to use the kata-packages-uvm meta package from
packages.microsoft.com to define the set of packages to be
contained in the UVM.
This aligns the UVM build for the Azure Linux distribution
with the UVM build done for the Kata Containers offering on
Azure Kubernetes Services (AKS).

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-10-22 23:11:42 +00:00
Fabiano Fidêncio
67a8665f51 workflows: Ensure shim-v2 is built as the last asset
By doing this we can ensure that whenever the rootfs changes, we'll be
able to get the new root_hash.txt and use it.

This is the very first step to bring the measured rootfs tests back.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-22 14:56:37 +02:00
Greg Kurz
3de6d09a86 Merge pull request #10443 from gkurz/release-3.10.0
release: Bump VERSION to 3.10.0
2024-10-22 14:46:30 +02:00
Greg Kurz
3037303e09 release: Bump VERSION to 3.10.0
Let's start the 3.10.0 release.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-10-22 11:28:15 +02:00
wangyaqi54
cf4b81344d runtime: Failed to clean up resources when QEMU is terminated by signal 15
When QEMU is terminated by signal 15, it deletes the PidFile.
Upon detecting that QEMU has exited, the shim executes the stopVM function.
If the PidFile is not found, the PID is set to 0.
Subsequently, the shim executes `kill -9 0`, which terminates the current process group.
This prevents any further logic from being executed, resulting in resources not being cleaned up.

Signed-off-by: wangyaqi54 <wangyaqi54@jd.com>
2024-10-22 17:04:46 +08:00
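The failure mode described in commit cf4b81344d above comes from the semantics of kill(2): PID 0 addresses the caller's whole process group, not a single process. A minimal sketch of a guard follows, written in Python for illustration only. It is not the Go code from the commit, and `read_pid` and `stop_vm` are hypothetical names.

```python
import os
import signal
from pathlib import Path
from typing import Optional


def read_pid(pid_file: Path) -> Optional[int]:
    """Return the PID stored in pid_file, or None if it is missing or invalid."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
    # PID 0 and negative values address process groups in kill(2),
    # never one specific process, so treat them as "unknown".
    return pid if pid > 0 else None


def stop_vm(pid_file: Path) -> bool:
    """Kill the hypervisor if its PID is known; return True if a signal was sent."""
    pid = read_pid(pid_file)
    if pid is None:
        # Nothing to kill: skip the signal so the remaining cleanup can still run.
        return False
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        # The process already exited; cleanup can proceed.
        return False
    return True
```

The point of the design is that a missing PidFile yields "no PID" instead of the sentinel value 0, so the caller can never signal its own process group by accident.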
Fabiano Fidêncio
4c34cfb0ab Merge pull request #10420 from pmores/add-support-for-virtio-scsi
runtime-rs: support virtio-scsi device in qemu-rs
2024-10-22 11:00:33 +02:00
Pavel Mores
8cdd968092 runtime-rs: support virtio-scsi device in qemu-rs
Semantics are lifted straight out of the go runtime for compatibility.
We introduce DeviceVirtioScsi to represent a virtio-scsi device and
instantiate it if block device driver in the configuration file is set
to virtio-scsi.  We also introduce ObjectIoThread which is instantiated
if the configuration file additionally enables iothreads.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-22 08:55:54 +02:00
Greg Kurz
91b874f18c Merge pull request #10421 from Apokleos/hostname-bugfix
kata-agent: fixing bug of unable setting hostname correctly.
2024-10-22 00:26:51 +02:00
alex.lyn
b25538f670 ci: Introduce CI to validate pod hostname
Fixes #10422

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-10-21 16:32:56 +01:00
alex.lyn
3dabe0f5f0 kata-agent: fixing bug of unable setting hostname correctly.
When update_container_namespaces updated the namespaces, it set
all UTS (and IPC) namespace paths to None, which made hostnames
set prior to the update ineffective. This was primarily
due to an error made while aligning with the OCI spec: in an attempt
to match empty strings with None values in oci-spec-rs, all paths
were incorrectly set to None.

Fixes #10325

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-10-21 16:32:56 +01:00
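The distinction behind commit 3dabe0f5f0 above, mapping only empty paths to "no value" while keeping real paths, can be sketched like this. It is a Python illustration rather than the agent's Rust code, and `normalize_ns_path` is a hypothetical helper name.

```python
from typing import Optional


def normalize_ns_path(path: Optional[str]) -> Optional[str]:
    """Map an empty or absent namespace path to None and keep any real path."""
    return path if path else None
```

Applying such a helper per namespace entry keeps a non-empty UTS or IPC path intact, whereas setting every path to None unconditionally discards it.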
Steve Horsman
98886a7571 Merge pull request #10437 from mkulke/mkulke/dont-parse-oci-image-for-cached-artifacts
ci: don't parse oci image for cached artifacts
2024-10-21 16:31:23 +01:00
Magnus Kulke
e27d70d47e ci: don't parse oci image for cached artifacts
Moved the parsing of the oci image marker into its own step, since we
only need to perform that for attestation purposes and some cached
images might not have that file in the tarball.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-10-21 14:50:00 +02:00
Magnus Kulke
9a33a3413b Merge pull request #10433 from mkulke/mkulke/add-provenance-attestation-for-agent-builds
ci: add provenance attestation for agent artifact
2024-10-18 15:00:18 +02:00
Anastassios Nanos
68d539f5c5 Merge pull request #10435 from nubificus/fix_fc_machineconfig
runtime-rs: Use vCPU and memory values from config
2024-10-18 13:41:20 +01:00
Magnus Kulke
b93f5390ce ci: add provenance attestation for agent artifact
This adds provenance attestation logic for agent binaries that are
published to an oci registry via ORAS.

As a downstream consumer of the kata-agent binary the Peerpod project
needs to verify that the artifact has been built on kata's CI.

To create an attestation we need to know the exact digest of the oci
artifact, at the point when the artifact was pushed.

Therefore we record the full oci image as returned by oras push.

The pushing and tagging logic has been slightly reworked to make this
task less repetitive.

The oras cli accepts multiple tags separated by comma on pushes, so a
push can be performed atomically instead of iterating through tags and
pushing each individually. This removes the risk of partially successful
push operations (think: rate limits on the oci registry).

So far the provenance creation has been only enabled for agent builds on
amd64 and s390x.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-10-18 10:24:00 +02:00
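The single multi-tag push mentioned in commit b93f5390ce above can be sketched by assembling the `oras push` argument list once, with all tags joined by commas. This Python snippet only builds the command and does not run it; `build_push_command` is a hypothetical helper, and the repository and file names in the usage note are placeholders.

```python
from typing import List


def build_push_command(repository: str, tags: List[str], files: List[str]) -> List[str]:
    """Build one `oras push` invocation that applies every tag in a single push."""
    if not tags:
        raise ValueError("at least one tag is required")
    # The commit message states that oras accepts comma-separated tags on push.
    reference = f"{repository}:{','.join(tags)}"
    return ["oras", "push", reference, *files]
```

For example, `build_push_command("registry.example/agent", ["abc", "abc-x86_64"], ["kata-agent.tar"])` yields a single command, so there is no loop in which a later per-tag push could fail after an earlier one succeeded.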
Anastassios Nanos
23f5786cca runtime-rs: Use vCPU and memory values from config
Use values from the config for the setup of the microVM.

Fixes: #10434

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-10-17 23:17:02 +01:00
GabyCT
4ae9317675 Merge pull request #10430 from GabyCT/topic/ciaz
docs: Update CI documentation
2024-10-17 15:09:24 -06:00
GabyCT
b00203ba9b Merge pull request #10428 from GabyCT/topic/archk8sc
gha: Use a arch_to_golang variable to have uniformity
2024-10-17 11:00:59 -06:00
Chengyu Zhu
cca77f0911 Merge pull request #10412 from stevenhorsman/agent-config-rstest
agent: config: Use rstest for unit tests
2024-10-17 23:01:21 +08:00
Gabriela Cervantes
e3efad8ed2 docs: Update CI documentation
This PR updates the CI documentation, describing the various tests and
the kind of instances they run on.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-16 19:23:19 +00:00
stevenhorsman
4adb454ed0 agent: config: Use rstest for unit tests
Use rstest for unit tests rather than TestData arrays where
possible to make the code more compact, easier to read
and open the possibility to enhance test cases with a
description more easily.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-16 16:55:44 +01:00
Gabriela Cervantes
f0e0c74fd4 gha: Use a arch_to_golang variable to have uniformity
This PR replaces the `uname -m` architecture lookup with the arch_to_golang
variable for better uniformity across the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-15 20:03:09 +00:00
Dan Mihai
69509eff33 Merge pull request #10417 from microsoft/danmihai1/k8s-inotify.bats
tests: k8s-inotify.bats improvements
2024-10-15 11:22:53 -07:00
Dan Mihai
ece0f9690e tests: k8s-inotify: longer pod termination timeout
inotify-configmap-pod.yaml is using: "inotifywait --timeout 120",
so wait for up to 180 seconds for the pod termination to be
reported.

Hopefully, some of the sporadic errors from #10413 will be avoided
this way:

not ok 1 configmap update works, and preserves symlinks
waitForProcess "${wait_time}" "$sleep_time" "${command}" failed

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-15 16:01:25 +00:00
Dan Mihai
ccfb7faa1b tests: k8s-inotify.bats: don't leak configmap
Delete the configmap if the test failed, not just on the successful
path.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-15 16:01:25 +00:00
Aurélien Bombo
f13d13c8fa Merge pull request #10416 from microsoft/danmihai1/mariner_static_sandbox_resource_mgmt
ci: static_sandbox_resource_mgmt for cbl-mariner
2024-10-15 10:40:17 -05:00
Aurélien Bombo
c371b4e1ce Merge pull request #10426 from 3u13r/fix/genpolicy/handle-config-map-binary-data
genpolicy: read binaryData value as String
2024-10-14 21:31:23 -05:00
Leonard Cohnen
c06bf2e3bb genpolicy: read binaryData value as String
While Kubernetes defines `binaryData` as `[]byte`,
when defined in a YAML file the raw bytes are
base64 encoded. Therefore, we need to read the YAML
value as `String` and not as `Vec<u8>`.

Fixes: #10410

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-10-14 20:03:11 +02:00
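The point made in commit c06bf2e3bb above is that a `binaryData` value in a YAML manifest arrives as base64 text, so it has to be read as a string first. A small Python sketch of that idea follows. It is not genpolicy's Rust code; `decode_binary_data` is a hypothetical helper, and the decoding step is shown only to illustrate where the raw bytes come from.

```python
import base64
from typing import Dict


def decode_binary_data(binary_data: Dict[str, str]) -> Dict[str, bytes]:
    """Decode base64 strings, as found under `binaryData` in YAML, into bytes."""
    return {
        key: base64.b64decode(value, validate=True)
        for key, value in binary_data.items()
    }


# The YAML parser hands over text, not a byte array.
print(decode_binary_data({"blob": "AAEC/w=="}))  # {'blob': b'\x00\x01\x02\xff'}
```

Trying to deserialize the same field directly as a byte sequence fails because the parser sees a scalar string, which matches the problem the commit describes.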
Aurélien Bombo
f9b7a8a23c Merge pull request #10402 from Sumynwa/sumsharma/agent-ctl-dependencies
ci: Install build dependencies for building agent-ctl with image pull.
2024-10-14 10:28:32 -05:00
Sumedh Alok Sharma
bc195d758a ci: Install build dependencies for building agent-ctl with image pull.
Adds dependencies of 'clang' & 'protobuf' to be installed in runners
when building agent-ctl sources having image pull support.

Fixes #10400

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-10-14 10:36:04 +05:30
Aurélien Bombo
614e21ccfb Merge pull request #10415 from GabyCT/topic/egreptim
tools/osbuilder/tests: Remove egrep in test images script
2024-10-11 13:47:30 -05:00
Gabriela Cervantes
aae654be80 tools/osbuilder/tests: Remove egrep in test images script
This PR removes the egrep command, as it has been deprecated, and replaces
it with grep in the test images script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-11 17:23:35 +00:00
Dan Mihai
3622b5e8b4 ci: static_sandbox_resource_mgmt for cbl-mariner
Use the configuration used by AKS (static_sandbox_resource_mgmt=true)
for CI testing on Mariner hosts.

Hopefully pod startup will become more predictable on these hosts -
e.g., by avoiding the occasional hotplug timeouts described by #10413.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-10 22:17:39 +00:00
Fabiano Fidêncio
02f5fd94bd Merge pull request #10409 from fidencio/topic/ci-add-ita_image-and-ita_image_tag
kbs: ita: Ensure the proper image / image_tag is used for ITA
2024-10-10 11:46:26 +02:00
Fabiano Fidêncio
cf5d3ed0d4 kbs: ita: Ensure the proper image / image_tag is used for ITA
When dealing with a specific release, it was easier to just do some
adjustments on the image that has to be used for ITA without actually
adding a new entry in the versions.yaml.

However, it's been proven to be more complicated than that when it comes
to dealing with staged images, and we better explicitly add (and
update) those versions altogether to avoid CI issues.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-10 10:01:33 +02:00
Steve Horsman
0c4a7c8771 Merge pull request #10406 from ChengyuZhu6/fix-unit
agent:cdh: fix unit tests about sealed secret
2024-10-10 08:57:28 +01:00
Fabiano Fidêncio
3f7ce1d620 Merge pull request #10401 from stevenhorsman/kbs-deploy-overlays-update
Kbs deploy overlays update
2024-10-10 09:50:19 +02:00
Fabiano Fidêncio
036b04094e Merge pull request #10397 from fidencio/topic/build-remove-initrd-mariner-target
build: mariner: Remove the ability to build the mariner initrd
2024-10-10 09:44:36 +02:00
ChengyuZhu6
65ecac5777 agent:cdh: fix unit tests about sealed secret
The root cause is that the CDH client is a global variable, and unit tests `test_unseal_env` and `test_unseal_file`
share this lock-free global variable, leading to resource contention and destruction.
Merging the two unit tests into one test_sealed_secret will resolve this issue.

Fixes: #10403

Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>
2024-10-10 08:38:06 +08:00
ChengyuZhu6
a992feb7f3 Revert "Revert "agent:cdh: unittest for sealed secret as file""
This reverts commit b5142c94b9.

Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>
2024-10-10 08:37:06 +08:00
GabyCT
0cda92c6d8 Merge pull request #10407 from GabyCT/topic/fixbuildk
packaging: Remove unused variable in build kernel script
2024-10-09 16:53:45 -06:00
Gabriela Cervantes
616eb8b19b packaging: Remove unused variable in build kernel script
This PR removes an unused variable in the build kernel script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-09 20:02:56 +00:00
Fabiano Fidêncio
652ba30d4a build: mariner: Remove the ability to build the mariner initrd
As mariner has switched to using an image instead of an initrd, let's
just drop the ability to build the initrd and avoid keeping something in
the tree that won't be used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 21:42:55 +02:00
Fabiano Fidêncio
59e3ab07e4 Merge pull request #10396 from fidencio/topic/ci-mariner-test-using-mariner-image-instead-of-initrd
ci: mariner: Use the image instead of the initrd
2024-10-09 21:39:44 +02:00
stevenhorsman
b2fb19f8f8 versions: Bump KBS version
Bump to the commit that had the overlays changes we want
to adapt to.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-09 17:49:21 +01:00
Fabiano Fidêncio
01a957f7e1 ci: mariner: Stop building mariner initrd
As the mariner image is already in place, and the tests were modified to
use it (as part of this series), let's just stop building the initrd as
part of the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:35 +02:00
Fabiano Fidêncio
091ad2a1b2 ci: mariner: Ensure kernel_params can be set
The reason we're doing this is because mariner image uses, by default,
cgroups default-hierarchy as `unified` (aka, cgroupsv2).

In order to keep the same initrd behaviour for mariner, let's enforce
that `SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1
systemd.legacy_systemd_cgroup_controller=yes
systemd.unified_cgroup_hierarchy=0` is passed to the kernel cmdline, at
least for now.

Other tests that are setting `kernel_params` are not running on mariner,
so we're safe taking this path as it's done as part of this PR.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:35 +02:00
Fabiano Fidêncio
3bbf3c81c2 ci: mariner: Use the image instead of the initrd
As an image has been added for mariner as part of the commit 63c1f81c2,
let's start using it in the CI, instead of using the initrd.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:32 +02:00
Fabiano Fidêncio
9c0c159b25 Merge pull request #10404 from fidencio/topic/rever-sealed-secrets-tests
Revert "agent:cdh: unittest for sealed secret as file"
2024-10-09 18:09:09 +02:00
GabyCT
2035d638df Merge pull request #10388 from GabyCT/topic/testimtemp
tools/osbuilder/tests: Add trap statement in test images script
2024-10-09 09:49:45 -06:00
Fabiano Fidêncio
b5142c94b9 Revert "agent:cdh: unittest for sealed secret as file"
This reverts commit 31e09058af, as it's
breaking the agent unit tests CI.

This is a stopgap until Chengyu Zhu finds the time to properly address
the issue, so that the CI is not blocked for now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 16:06:09 +02:00
stevenhorsman
8763880e93 tests/k8s: kbs: Update overlays logic
In https://github.com/confidential-containers/trustee/pull/521
the overlays logic was modified to add non-SE
s390x support and simplify non-ibm-se platforms.
We need to update the logic in `kbs_k8s_deploy`
to match and can remove the dummying of `IBM_SE_CREDS_DIR`
for non-SE now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-09 09:39:41 +01:00
Gabriela Cervantes
e08749ce58 tools/osbuilder/tests: Add trap statement in test images script
This PR adds the trap statement in the test images script to clean up
tmp files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-08 19:54:23 +00:00
Fabiano Fidêncio
80196c06ad Merge pull request #10390 from microsoft/danmihai1/new-rootfs-image-mariner
local-build: add ability to build rootfs-image-mariner
2024-10-08 21:40:43 +02:00
Fabiano Fidêncio
083b2f24d8 Merge pull request #10363 from ChengyuZhu6/secret-as-volume
Support Confidential Sealed Secrets (as volume)
2024-10-08 19:23:40 +02:00
Dan Mihai
63c1f81c23 local-build: add rootfs-image-mariner
Kata CI will start testing the new rootfs-image-mariner instead of the
older rootfs-initrd-mariner image.

The "official" AKS images are moving from a rootfs-initrd-mariner
format to the rootfs-image-mariner format. Making the same change in
Kata CI is useful to keep this testing in sync with the AKS settings.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-08 17:15:56 +00:00
GabyCT
7a38cce73c Merge pull request #10383 from kata-containers/topic/imagevar
image-builder: Remove unused variable
2024-10-08 10:27:03 -06:00
Aurélien Bombo
e56af7a370 Merge pull request #10389 from emanuellima1/fix-agent-policy
build: Fix RPM build fail due to AGENT_POLICY
2024-10-08 09:59:21 -05:00
ChengyuZhu6
a94024aedc tests: add test for sealed file secrets
add a test for sealed file secrets.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-10-08 16:01:48 +08:00
ChengyuZhu6
fe307303c8 agent:rpc: Refactor CDH-related operations
Refactor CDH-related operations into the cdh_handler function to make the `create_container` code clearer.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-10-08 16:01:48 +08:00
ChengyuZhu6
31e09058af agent:cdh: unittest for sealed secret as file
add unittest for sealed secret as file.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-10-08 16:01:48 +08:00
ChengyuZhu6
974d6b0736 agent:cdh: initialize cdhclient with the input cdh socket uri
Refactor cdh code to initialize cdhclient with the input cdh socket uri.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-10-08 14:58:07 +08:00
ChengyuZhu6
1f33fd4cd4 agent:rpc: handle the sealed secret in createcontainer
Users must set the mount path to `/sealed/<path>` for kata agent to detect the sealed secret mount
and handle it in createcontainer stage.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-10-08 14:58:07 +08:00
ChengyuZhu6
da281b4444 agent:cdh: support to unseal secret as file
Introduced `unseal_file` function to unseal secret as files:
- Implemented logic to handle symlinks and regular files within the sealed secret directory.
- For each entry, CDH is called to unseal the secret, the unsealed contents are written to a new file, and a symlink to that file replaces the sealed symlink.

Fixes: #8123

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-10-08 14:58:07 +08:00
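The per-entry flow described in the commit above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the agent's Rust code: `unseal_files`, the `unsealed_dir` argument, and the `unseal` callable (standing in for the CDH request) are hypothetical names, and replacing regular files with symlinks as well is an assumption of the sketch.

```python
import os
from pathlib import Path


def unseal_files(sealed_dir, unsealed_dir, unseal):
    """Replace every sealed entry in `sealed_dir` with a symlink to its
    unsealed copy in `unsealed_dir`.

    `unseal` is a hypothetical callable standing in for the CDH request:
    it takes the sealed bytes and returns the unsealed bytes.
    """
    sealed_dir, unsealed_dir = Path(sealed_dir), Path(unsealed_dir)
    unsealed_dir.mkdir(parents=True, exist_ok=True)

    for entry in sorted(sealed_dir.iterdir()):
        # is_file() follows symlinks, so this accepts regular files and
        # symlinks that point at regular files, and skips directories.
        if not entry.is_file():
            continue
        target = unsealed_dir / entry.name
        target.write_bytes(unseal(entry.read_bytes()))
        entry.unlink()             # removes the entry itself, not its target
        os.symlink(target, entry)  # the entry now points at the unsealed file
```

The original sealed data is left untouched; only the entry inside the sealed directory is redirected.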
Fabiano Fidêncio
71d0c46e0a Merge pull request #10384 from microsoft/danmihai1/virtio-fs-policy
tests: k8s: AUTO_GENERATE_POLICY=yes for local testing
2024-10-07 21:25:52 +02:00
Emanuel Lima
e989e7ee4e build: Fix RPM build fail due to AGENT_POLICY
By checking for AGENT_POLICY we ensure we only try to read
allow-all.rego if AGENT_POLICY is set to "yes"

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-10-07 15:43:23 -03:00
Dan Mihai
6d5fc898b8 tests: k8s: AUTO_GENERATE_POLICY=yes for local testing
The behavior of Kata CI doesn't change.

For local testing using kubernetes/gha-run.sh and AUTO_GENERATE_POLICY=yes:

1. Before these changes users were forced to use:
- SEV, SNP, or TDX guests, or
- KATA_HOST_OS=cbl-mariner

2. After these changes users can also use other platforms that are
configured with "shared_fs = virtio-fs" - e.g.,
- KATA_HOST_OS=ubuntu + KATA_HYPERVISOR=qemu

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-04 18:26:00 +00:00
Dan Mihai
5aaef8e6eb Merge pull request #10376 from microsoft/danmihai1/auto-generate-just-for-ci
gha: enable AUTO_GENERATE_POLICY where needed
2024-10-04 10:52:31 -07:00
Gabriela Cervantes
4cd737d9fd image-builder: Remove unused variable
This PR removes an unused variable in the image builder script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-04 15:56:28 +00:00
Greg Kurz
77c5db6267 Merge pull request #9637 from ldoktor/selective-ci
CI: Select jobs by touched code
2024-10-04 11:29:05 +02:00
GabyCT
2d089d9695 Merge pull request #10381 from GabyCT/topic/archrootfs
osbuilder: Remove duplicated arch variable definition
2024-10-03 14:48:08 -06:00
Wainer Moschetta
b9025462fb Merge pull request #10134 from ldoktor/ci-sort-range
ci.ocp: Sort images according to git
2024-10-03 15:08:41 -03:00
Chelsea Mafrica
9138f55757 Merge pull request #10375 from GabyCT/topic/mktempkbs
k8s:kbs: Add trap statement to clean up tmp files
2024-10-03 12:32:30 -04:00
Gabriela Cervantes
d7c2b7d13c osbuilder: Remove duplicated arch variable definition
This PR removes duplicated arch variable definition in the rootfs script
as this variable and its value are already defined at the top of the
script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-03 16:22:27 +00:00
Greg Kurz
96336d141b Merge pull request #10165 from pmores/add-network-device-hotplugging
runtime-rs: add network device hotplugging to qemu-rs
2024-10-03 17:44:50 +02:00
Pavel Mores
23927d8a94 runtime-rs: plug in netdev hotplugging functionality and actually call it
add_device() now checks if QEMU is running already by checking if we have
a QMP connection.  If we do, a new function hotplug_device() is called
which hotplugs the device if it's a network one.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:23:10 +02:00
Pavel Mores
ac393f6316 runtime-rs: implement netdev hotplugging for qemu-rs
With the helpers from the previous commit, the actual hotplugging
implementation, though lengthy, is mostly just assembling a QMP command
to hotplug the network device backend and then doing the same for the
corresponding frontend.

Note that hotplug_network_device() takes cmdline_generator types Netdev
and DeviceVirtioNet.  This is intentional and aims to take advantage of
the similarity between parameter sets needed to coldplug and hotplug
devices to reuse and simplify our code.  To enable using the types from qmp,
accessors were added as needed.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:20:02 +02:00
Pavel Mores
4eb7e2966c runtime-rs: add netdev hotplugging helpers to qemu-rs
Before adding network device hotplugging functionality itself we add
a couple of helpers in a separate commit since their functionality is
non-trivial.

To hotplug a device we need a free PCI slot.  We add find_free_slot()
which can be called to obtain one.  It looks for PCI bridges connected
to the root bridge and looks for an unoccupied slot on each of them.  The
first found is returned to the caller.  The algorithm explicitly doesn't
support any more complex bridge hierarchies since those are never produced
when coldplugging PCI bridges.

Sending netdev queue and vhost file descriptors to QEMU is slightly
involved and implemented in pass_fd().  The actual socket has to be passed
in an SCM_RIGHTS socket control message (also called ancillary data, see
man 3 cmsg) so we have to use the msghdr structure and sendmsg() call
(see man 2 sendmsg) to send the message.  Since qapi-rs doesn't support
sending messages with ancillary data we have to do the sending sort of
"under it", manually, by retrieving qapi-rs's socket and using it directly.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:15:31 +02:00
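The slot search described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Rust code from the commit: the `Bridge` type, the 32-slot bus size, and the `(bridge_id, slot)` return shape are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: a conventional PCI bus exposes 32 device slots (0..31).
SLOTS_PER_BRIDGE = 32


@dataclass
class Bridge:
    """Hypothetical model of a PCI bridge attached to the root bridge."""
    bridge_id: str
    occupied: set = field(default_factory=set)


def find_free_slot(bridges) -> Optional[tuple]:
    """Return (bridge_id, slot) for the first unoccupied slot, or None.

    Only bridges directly on the root bridge are considered; nested
    bridge hierarchies are deliberately not walked.
    """
    for bridge in bridges:
        for slot in range(SLOTS_PER_BRIDGE):
            if slot not in bridge.occupied:
                return bridge.bridge_id, slot
    return None
```

Returning the first match keeps the search simple, which fits the flat bridge layout the commit message says coldplugging produces.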
Pavel Mores
3f46dfcf2f runtime-rs: don't treat NetworkConfig::index as unique in qemu-rs
NetworkConfig::index has been used to generate an id for a network device
backend.  However, it turns out that it's not unique (it's always zero
as confirmed by a comment at its definition) so it's not suitable to
generate an id that needs to be unique.

Use the host device name instead.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:12:37 +02:00
Pavel Mores
cda04fa539 runtime-rs: factor setup of network device out of QemuCmdLine
Network device hotplugging will use the same infrastructure (Netdev,
DeviceVirtioNet) as coldplugging, i.e. QemuCmdLine.  To make the code
of network device setup visible outside of QemuCmdLine we factor it out
to a non-member function `get_network_device()` and make QemuCmdLine just
delegate to it.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:03:32 +02:00
Pavel Mores
efc8e93bfe runtime-rs: factor bus_type() out of QemuCmdLine
The function takes a whole QemuCmdLine but only actually uses
HypervisorConfig.  We increase callability of the function by limiting
its interface to what it needs.  This will come in handy shortly.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:03:32 +02:00
Pavel Mores
720265c2d8 runtime-rs: support adding PCI bridges to qemu VM
At least one PCI bridge is necessary to hotplug PCI devices.  We only
support PCI (at this point at least) since that's what the go runtime
does (note that looking at the code in virtcontainers it might seem that
other bus types are supported, however when the bridge objects are passed
to govmm, all but PCI bridges are actually ignored).  The entire logic of
bridge setup is lifted from runtime-go for compatibility's sake.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:03:32 +02:00
Lukáš Doktor
63b6e8a215 ci: Ensure we check the latest workflow run in gatekeeper
with multiple iterations/reruns we need to use the latest run of each
workflow. For that we can use the "run_id" and only update results of
the same or newer run_ids.

To do that we need to store the "run_id". To avoid adding individual
attributes this commit stores the full job object that contains the
status, conclusion as well as other attributes of the individual jobs,
which might come in handy in the future in exchange for a slightly bigger
memory overhead (we still store only the latest run of the required
jobs).

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:10:45 +02:00
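The "same or newer run_id" rule from the commit above might look like this in isolation. It is a sketch only: `update_results` and the dict layout (a job object carrying at least `name` and `run_id`) are assumptions, not the gatekeeper's actual code.

```python
def update_results(results, job):
    """Keep only the newest run of each job, keyed by job name.

    `job` is assumed to be a dict with at least "name" and "run_id";
    the whole object is stored so that status, conclusion and any
    other attribute stay available later.
    """
    known = results.get(job["name"])
    if known is None or job["run_id"] >= known["run_id"]:
        results[job["name"]] = job
    return results
```

Using `>=` rather than `>` lets a later report from the same run (for example a job moving from in progress to completed) overwrite the earlier one.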
Lukáš Doktor
2ae090b44b ci: Add extra gatekeeper debug output to stderr
which might be useful to assess the number of queries.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Lukáš Doktor
2440a39c50 ci: Check required lables before checking tests in gatekeeper
some tests require certain labels before they are executed. When our PR
is not labeled appropriately the gatekeeper detects skipped required
tests and reports a failure. With this change we add "required-labeles"
to the tests mapping and check the expected labels first informing the
user about the missing labels before even checking the test statuses.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
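The ordering described in the commit above (labels first, test statuses second) can be illustrated with a short sketch. The function names and the string results are hypothetical and only show the control flow.

```python
def missing_labels(pr_labels, required_labels):
    """Return the required labels that the PR does not carry, sorted."""
    return sorted(set(required_labels) - set(pr_labels))


def gate(pr_labels, required_labels, check_tests):
    """Report missing labels before looking at any test status.

    `check_tests` is a hypothetical callable that evaluates the
    required tests; it is only invoked when no label is missing.
    """
    missing = missing_labels(pr_labels, required_labels)
    if missing:
        return "missing required labels: " + ", ".join(missing)
    return check_tests()
```

Checking labels first means a skipped test is reported as a labeling problem rather than as a test failure.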
Lukáš Doktor
dd2878a9c8 ci: Unify character for separating items
the test names are using `;` and regexps were designed to use `,` but
during development we simply joined the expressions by `|`. This should
work but might be confusing so let's go with the semi-colon separator
everywhere.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta
fdcfac0641 workflows/gatekeeper: export COMMIT_HASH variable
The Github SHA of triggering PR should be exported in the environment
so that gatekeeper can fetch the right workflows/jobs.

Note: by default github will export GITHUB_SHA in the job's environment
but that value cannot be used if the gatekeeper was triggered from a
pull_request_target event, because the SHA corresponds to the push
branch.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta
4abfc11b4f workflows/gatekeeper: configure concurrency properly
This allows in-progress gatekeeper jobs to be cancelled.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Lukáš Doktor
5c1cea1601 ci: Select jobs by touched code
to allow selective testing as well as selective list of required tests
let's add a mapping of required jobs/tests in "skips.py" and a
"gatekeeper" workflow that will ensure the expected required jobs were
successful. Then we can only mark the "gatekeeper" as the required job
and modify the logic to suit our needs.

Fixes: #9237

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:33 +02:00
Dan Mihai
1a4928e710 gha: enable AUTO_GENERATE_POLICY where needed
The behavior of Kata CI doesn't change.

For local testing using kubernetes/gha-run.sh:

1. Before these changes:
- AUTO_GENERATE_POLICY=yes was always used by the users of SEV, SNP,
  TDX, or KATA_HOST_OS=cbl-mariner.

2. After these changes:
- Users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner must specify
  AUTO_GENERATE_POLICY=yes if they want to auto-generate policy.
- These users have the option to test just using hard-coded policies
  (e.g., using the default policy built into the Guest rootfs) by
  using AUTO_GENERATE_POLICY=no. AUTO_GENERATE_POLICY=no is the default
  value of this env variable.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-02 23:20:33 +00:00
Gabriela Cervantes
973b8a1d8f k8s:kbs: Add trap statement to clean up tmp files
This PR adds the trap statement in the confidential kbs script
to clean up temporary files and ensure we are not leaving them behind.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-02 19:59:08 +00:00
Steve Horsman
8412c09143 Merge pull request #10371 from fidencio/topic/k8s-tdx-re-enable-empty-dir-tests
k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev
2024-10-02 18:41:19 +01:00
Dan Mihai
9a8341f431 Merge pull request #10370 from microsoft/danmihai1/k8s-policy-rc
tests: k8s-policy-rc: remove default UID from YAML
2024-10-02 09:32:17 -07:00
GabyCT
a1d380305c Merge pull request #10369 from GabyCT/topic/egrepfastf
metrics: Update fast footprint script to use grep
2024-10-02 10:10:12 -06:00
Fabiano Fidêncio
b3ed7830e4 k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev
The test is disabled for qemu-coco-dev / qemu-tdx, but it doesn't seem
to actually be failing on those.  Plus, it's passing on SEV / SNP, which
means that we most likely missed re-enabling this one in the past.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-01 20:51:01 +02:00
Hyounggyu Choi
b179598fed Merge pull request #10374 from BbolroC/skip-block-volume-qemu-runtime-rs
tests: Skip k8s-block-volume.bats for qemu-runtime-rs
2024-10-01 19:45:10 +02:00
Lukáš Doktor
820e000f1c ci.ocp: Sort images according to git
The quay.io registry returns the tags sorted alphabetically and doesn't
seem to provide a way to sort it by age. Let's use "git log" to get all
changes between the commits and print all tags that were actually
pushed.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-01 16:08:00 +02:00
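The idea in the commit above, ordering registry tags by git history instead of alphabetically, can be sketched like this. It assumes, for illustration only, that each image tag equals a commit hash and that `commits` is the list that `git log --format=%H` would print for the range (newest first); `sort_tags_by_history` is a hypothetical name.

```python
def sort_tags_by_history(commits, pushed_tags):
    """Return the pushed tags in the order their commits appear in `commits`.

    Commits without a pushed image are skipped, and tags that do not
    match any commit in the range are ignored.
    """
    pushed = set(pushed_tags)
    return [commit for commit in commits if commit in pushed]
```

Because the order comes from the commit list, the result does not depend on how the registry sorts its tags.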
Hyounggyu Choi
4ccf1f29f9 tests: Skip k8s-block-volume.bats for qemu-runtime-rs
Currently, `qemu-runtime-rs` does not support `virtio-scsi`,
which causes the `k8s-block-volume.bats` test to fail.
We should skip this test until `virtio-scsi` is supported by the runtime.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-01 09:09:47 +02:00
Dan Mihai
3b24219310 tests: k8s-policy-rc: remove default UID from YAML
The nginx container seems to error out when using UID=123.

Depending on the timing between container initialization and "kubectl
wait", the test might have gotten lucky and found the pod briefly in
Ready state before nginx errored out. But on some of the nodes, the pod
never got reported as Ready.

Also, don't block in "kubectl wait --for=condition=Ready" when wrapping
that command in a waitForProcess call, because waitForProcess is
designed for short-lived commands.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-01 00:10:30 +00:00
Saul Paredes
94bc54f4d2 Merge pull request #10340 from microsoft/saulparedes/validate_create_sandbox_storages
genpolicy: validate create sandbox storages
2024-09-30 14:24:56 -07:00
Aurélien Bombo
b49800633d Merge pull request #7165 from sprt/k8s-block-volume-test
tests: Add `k8s-block-volume` test to GHA CI
2024-09-30 13:26:18 -07:00
Dan Mihai
7fe44d3a3d genpolicy: validate create sandbox storages
Reject any unexpected values from the CreateSandboxRequest storages
field.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-30 11:31:12 -07:00
Gabriela Cervantes
52ef092489 metrics: Update fast footprint script to use grep
This PR updates the fast footprint script to use grep instead of
egrep, as the egrep command has been deprecated.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-30 17:43:08 +00:00
Aurélien Bombo
c037ac0e82 tests: Add k8s-block-volume test
This imports the k8s-block-volume test from the tests repo and modifies
it slightly to set up the host volume on the AKS host.

This is a follow-up to #7132.

Fixes: #7164

Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-09-30 10:58:30 -05:00
Alex Lyn
dfd0ca9bfe Merge pull request #10312 from sidneychang/configurable-build-dragonball
runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs
2024-09-29 22:33:54 +08:00
GabyCT
6a9e3ccddf Merge pull request #10305 from GabyCT/topic/ita
ci:tdx: Use an ITA key for TDX
2024-09-27 16:44:53 -06:00
Fabiano Fidêncio
66bcfe7369 k8s: kbs: Properly delete ita kustomization
The ita kustomization for Trustee, as well as the previously used one
(DCAP), doesn't have a $(uname -m) directory after the deployment
directory name.

Let's follow the same logic used for the deploy-kbs script and clean
those up accordingly.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-27 21:47:29 +02:00
Gabriela Cervantes
bafa527be0 ci: tdx: Test attestation with ITTS
Intel Tiber Trust Services (formerly known as Intel Trust Authority) is
Intel's own attestation service, and we want to take advantage of the
TDX CI in order to ensure ITTS works as expected.

In order to do so, let's replace the former method used (DCAP) to use
ITTS instead.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-27 21:47:25 +02:00
GabyCT
36750b56f1 Merge pull request #10342 from GabyCT/topic/updevguide
docs: Remove qemu information not longer valid
2024-09-27 11:15:11 -06:00
Fabiano Fidêncio
86b8c53d27 Merge pull request #10357 from fidencio/topic/add-ita-secret
gha: Add ita_key as a github secret
2024-09-27 17:40:41 +02:00
Gabriela Cervantes
d91979d7fa gha: Add ita_key as a github secret
This PR adds ita_key as a github secret at the kata coco tests yaml workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-27 17:15:22 +02:00
Xuewei Niu
ad0f2b2a55 Merge pull request #10219 from sidneychang/decouple-runtime-rs-from-dragonball
runtime-rs: Port TAP implementation from dragonball
2024-09-27 11:17:55 +08:00
Xuewei Niu
11b1a72442 Merge pull request #10349 from lifupan/main_nsandboxapi
sandbox: refactor the sandbox init process
2024-09-27 11:10:45 +08:00
Xuewei Niu
3911bd3108 Merge pull request #10351 from lifupan/main_agent
agent: fix the issue of setup sandbox pidns
2024-09-27 10:49:47 +08:00
Fupan Li
f7bc627a86 sandbox: refactor the sandbox init process
In order to support the sandbox api, introduce the sandbox_config
struct and split the sandbox start stage from the init process.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-26 23:50:24 +08:00
Hyounggyu Choi
b1275bed1b Merge pull request #10346 from BbolroC/minor-improvement-k8s-tests
tests: Minor improvement k8s tests
2024-09-26 17:01:32 +02:00
Hyounggyu Choi
01d460ac63 tests: Add teardown_common() to tests_common.sh
There are many similar or duplicated code patterns in `teardown()`.
This commit consolidates them into a new function, `teardown_common()`,
which is now called within `teardown()`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-26 13:56:36 +02:00
Hyounggyu Choi
e8d1feb25f tests: Validate node name for exec_host()
The current `exec_host()` accepts a given node name and
creates a node debugger pod, even if the name is invalid.
This could result in the creation of an unnecessary pending
pod (since we are using nodeAffinity; if the given name
does not match any actual node names, the pod won’t be scheduled),
which wastes resources.

This commit introduces validation for the node name to
prevent this situation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-26 13:20:50 +02:00
Xuewei Niu
3a7f9595b6 Merge pull request #10318 from lsc2001/ci-add-docker
ci: Enable basic docker tests for runtime-rs
2024-09-26 17:41:09 +08:00
Xuewei Niu
cb5a2b30e9 Merge pull request #10293 from lsc2001/solve-docker-compatibility
runtime-rs: Notify containerd when process exits
2024-09-26 14:51:20 +08:00
Sicheng Liu
e4733748aa ci: Enable basic docker tests for runtime-rs
This commit enables basic amd64 tests of docker for runtime-rs by adding
vmm types "dragonball" and "cloud-hypervisor".

Signed-off-by: Sicheng Liu <lsc2001@outlook.com>
2024-09-26 06:27:05 +00:00
Sicheng Liu
08eb5fc7ff runtime-rs: Notify containerd when process exits
Docker cannot exit normally after the container process exits when
used with runtime-rs since it doesn't receive the exit event. This
commit enables runtime-rs to send TaskExit to containerd after the process
exits.

Also, it moves "system_time_into" and "option_system_time_into" from
crates/runtimes/common/src/types/trans_into_shim.rs to a new utility
mod.

Signed-off-by: Sicheng Liu <lsc2001@outlook.com>
2024-09-26 02:52:50 +00:00
Fupan Li
71afeccdf1 agent: fix the issue of setup sandbox pidns
When the sandbox api is enabled, the pause container
is not created, so the shared sandbox pidns
should fall back to the first container's init process
instead of returning an error here.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-26 10:21:25 +08:00
Xuewei Niu
857222af02 Merge pull request #10330 from lifupan/main_sandboxapi
Some prepared work for sandbox api support
2024-09-26 09:47:47 +08:00
Hyounggyu Choi
caf3b19505 Merge pull request #10348 from BbolroC/delete-node-debugger-by-trap
tests: Delete custom node debugger pod on EXIT
2024-09-25 23:39:43 +02:00
Hyounggyu Choi
57e8cbff6f tests: Delete custom node debugger pod on EXIT
It was observed that the custom node debugger pod is not
cleaned up when a test times out.
This commit ensures the pod is cleaned up by triggering
the cleanup on EXIT, preventing any debugger pods from
being left behind.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-25 20:36:05 +02:00
Fabiano Fidêncio
edf4ca4738 Merge pull request #10345 from ldoktor/kata-webhook
ci: Reorder webhook deployment
2024-09-25 18:16:46 +02:00
Fabiano Fidêncio
09ed9c5c50 Merge pull request #10328 from BbolroC/improve-negative-tests
tests: Improve k8s negative tests
2024-09-25 18:16:28 +02:00
Xuewei Niu
e1825c2ef3 Merge pull request #9977 from l8huang/dan-2-vfio
runtime: add DAN support for VFIO network device in Go kata-runtime
2024-09-25 10:11:38 +08:00
Lei Huang
39b0e9aa8f runtime: add DAN support for VFIO network device in Go kata-runtime
When using network adapters that support SR-IOV, a VFIO device can be
plugged into a guest VM and claimed as a network interface. This can
significantly enhance network performance.

Fixes: #9758

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-09-24 09:53:28 -07:00
Hyounggyu Choi
c70588fafe tests: Use custom-node-debugger pod
With #10232 merged, we now have a persistent node debugger pod throughout the test.
As a result, there’s no need to spawn another debugger pod using `kubectl debug`,
which could lead to false negatives due to premature pod termination, as reported
in #10081.

This commit removes the `print_node_journal()` call that uses `kubectl debug` and
instead uses `exec_host()` to capture the host journal. The `exec_host()` function
is relocated to `tests/integration/kubernetes/lib.sh` to prevent cyclical dependencies
between `tests_common.sh` and `lib.sh`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-24 17:25:24 +02:00
Lukáš Doktor
8355eee9f5 ci: Reorder webhook deployment
in b9d88f74ed the `runtime_class` CM was
added which overrides the one we previously set. Let's reorder our logic
to first deploy webhook and then override the default CM in order to use
the one we really want.

Since we need to change dirs we also have to use realpath to ensure the
files are resolved correctly.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-09-24 17:01:28 +02:00
Hyounggyu Choi
2c2941122c tests: Fail fast in assert_pod_fail()
`assert_pod_fail()` currently calls `k8s_create_pod()` to ensure that a pod
does not become ready within the default 120s. However, this delays the test's
completion even if an error message is detected earlier in the journal.

This commit removes the use of `k8s_create_pod()` and modifies `assert_pod_fail()`
to fail as soon as the pod enters a failed state.

All failing pods end up in one of the following states:

- CrashLoopBackOff
- ImagePullBackOff

The function now polls the pod's state every 5 seconds to check for these conditions.
If the pod enters a failed state, the function immediately returns 0. If the pod
does not reach a failed state within 120 seconds, it returns 1.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-24 16:09:20 +02:00
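The polling behaviour described in the commit above can be sketched as follows. This is a Python stand-in for the shell helper, with the state lookup and the sleep function injected as hypothetical parameters so the loop can be exercised without a cluster.

```python
import time

# The two states the commit message lists for failing pods.
FAILED_STATES = {"CrashLoopBackOff", "ImagePullBackOff"}


def assert_pod_fail(get_state, timeout=120, interval=5, sleep=time.sleep):
    """Return 0 as soon as the pod is in a failed state, 1 on timeout.

    `get_state` is a hypothetical callable returning the pod's current
    state string; `sleep` is injectable for testing.
    """
    elapsed = 0
    while elapsed < timeout:
        if get_state() in FAILED_STATES:
            return 0
        sleep(interval)
        elapsed += interval
    return 1
```

The 0/1 return values mirror shell exit codes, matching the convention in the commit message.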
Gabriela Cervantes
6a8b137965 docs: Remove qemu information not longer valid
This PR removes some qemu information which is no longer valid as
this is referring to the tests repository and to kata 1.x.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-23 16:58:24 +00:00
Aurélien Bombo
e738054ddb Merge pull request #10311 from pawelpros/pproskur/fixyq
ci: don't require sudo for yq if already installed
2024-09-23 08:57:11 -07:00
Alex Lyn
6b94cc47a8 Merge pull request #10146 from Apokleos/intro-cdi
Introduce cdi in runtime-rs
2024-09-23 21:45:42 +08:00
Alex Lyn
b8ba346e98 runtime-rs: Add test for container devices with CDI.
Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-23 17:20:22 +08:00
Steve Horsman
0e0cb24387 Merge pull request #10329 from Bickor/webhook-check
tools.kata-webhook: Specify runtime class using configMap
2024-09-23 09:59:12 +01:00
Steve Horsman
6f0b3eb2f9 Merge pull request #10337 from stevenhorsman/update-release-process-post-3.9.0
doc: Update the release process
2024-09-23 09:55:57 +01:00
Hyounggyu Choi
8a893cd4ee Merge pull request #10232 from BbolroC/fix-loop-device-for-exec_host
tests: Fix loop device handling for exec_host()
2024-09-23 08:15:03 +02:00
Fupan Li
f1f5bef9ef Merge pull request #10339 from lifupan/main_fix
runtime-rs: fix the issue of using block_on
2024-09-23 09:28:40 +08:00
Fupan Li
52397ca2c1 sandbox: rename the task_service to service
rename the task_service to service, in order to
accommodate the sandbox services added in the
following commits.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:44:19 +08:00
Fupan Li
20b4be0225 runtime-rs: rename the Request/Response to TaskRequest/TaskResponse
To distinguish them from the sandbox request/response, this commit
renames the task request/response to TaskRequest/TaskResponse.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:44:11 +08:00
Fupan Li
ba94eed891 sandbox: fix the issue of hypervisor's wait_vm
wait_vm is called before stop_vm and takes the reader
lock, which blocks stop_vm from getting the writer lock
and triggers a deadlock.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:44:03 +08:00
Fupan Li
fb27de3561 runtime-rs: fix the issue of using block_on
block_on blocks the current thread, which prevents
other async tasks from running on this worker thread,
so run this work as an async task instead.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:40:44 +08:00
Aurélien Bombo
79a3b4e2e5 Merge pull request #10335 from kata-containers/sprt/fix-kata-deploy-docs
kata-deploy: clean up and fix docs for k0s
2024-09-20 13:33:14 -07:00
stevenhorsman
4f745f77cb doc: Update the release process
- Reflect the need to update the versions in the Helm Chart
- Add the lock branch instruction
- Add clarity about the permissions needed to complete tasks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-20 19:04:33 +01:00
Aurélien Bombo
78c63c7951 kata-deploy: clean up and fix docs for k0s
* Clarifies instructions for k0s.
* Adds kata-deploy step for each cluster type.
* Removes the old kata-deploy-stable step for vanilla k8s.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-09-20 11:59:40 -05:00
sidney chang
456e13db98 runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs
rename DEFAULT_HYPERVISOR to HYPERVISOR in Makefile
Fixes #10310

Signed-off-by: sidney chang <2190206983@qq.com>
2024-09-20 05:41:34 -07:00
sidneychang
b85a886694 runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs
This PR introduces support for selectively compiling Dragonball in
runtime-rs. By default, Dragonball will continue to be compiled into
the containerd-shim-kata-v2 executable, but users now have the option
to disable Dragonball compilation.

Fixes #10310

Signed-off-by: sidney chang <2190206983@qq.com>
2024-09-20 05:38:59 -07:00
Hyounggyu Choi
2d6ac3d85d tests: Re-enable guest-pull-image tests for qemu-coco-dev
Now that the issue with handling loop devices has been resolved,
this commit re-enables the guest-pull-image tests for `qemu-coco-dev`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
c6b86e88e4 tests: Increase timeouts for qemu-coco-dev in trusted image storage tests
Timeouts occur (e.g. `create_container_timeout` and `wait_time`)
when using qemu-coco-dev.
This commit increases these timeouts for the trusted image storage
test cases.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
9cff9271bc tests: Run all commands in *_loop_device() using exec_host()
If the host running the tests is different from the host where the cluster is running,
the *_loop_device() functions do not work as expected because the device is created
on the test host, while the cluster expects the device to be local.

This commit ensures that all commands for the relevant functions are executed via exec_host()
so that the device is handled on a cluster node.

Additionally, it modifies exec_host() to return the exit code of the last executed command
because the existing logic with `kubectl debug` sometimes includes unexpected characters
that are difficult to handle. `kubectl exec` appears to properly return the exit code of
the command given to it.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
374b8d2534 tests: Create and delete node debugger pod only once
Creating and deleting a node debugger pod for every `exec_host()`
call is inefficient.
This commit changes the test suite to create and delete the pod
only once, globally.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
aedf14b244 tests: Mimic node debugger with full privileges
This commit addresses an issue with handling loop devices
via a node debugger due to restricted privileges.
It runs a pod with full privileges, allowing it to mount
the host root to `/host`, similar to the node debugger.
This change enables us to run tests for trusted image storage
using the `qemu-coco-dev` runtime class.

Fixes: #10133

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Alex Lyn
63b25e8cb0 runtime-rs: Introduce cdi devices in container creation
Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-20 09:28:51 +08:00
Alex Lyn
03735d78ec runtime-rs: add cdi devices definition and related methods
Add cdi devices including ContainerDevice definition and
annotation_container_device method to annotate vfio device
in OCI Spec annotations which is inserted into Guest with
its mapping of vendor-class and guest pci path.

Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-20 09:28:51 +08:00
Alex Lyn
020e3da9b9 runtime-rs: extend DeviceVendor with device class
We need the vfio device's device, vendor and class properties,
but we can currently only get device and vendor.
So extend DeviceVendor with class.

Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-20 09:28:51 +08:00
Fabiano Fidêncio
77c844da12 Merge pull request #10239 from fidencio/topic/remove-acrn
acrn: Drop support
2024-09-19 23:10:29 +02:00
GabyCT
6eef58dc3e Merge pull request #10336 from GabyCT/topic/extendtimeout
gha: Increase timeout to run k8s tests on TDX
2024-09-19 13:12:55 -06:00
Martin
b9d88f74ed tools.kata-webhook: Specify runtime class using configMap
The kata webhook requires a configmap to define what runtime class it
should set for the newly created pods. Additionally, the configmap
allows others to modify the default runtime class name we wish to set
(in case the handler is kata but the name of the runtimeclass is
different).

Finally, this PR changes the webhook-check to compare the runtime of the
newly created pod against the specific runtime class in the configmap.
If said configmap doesn't exist, it will default to "kata".

Signed-off-by: Martin <mheberling@microsoft.com>
2024-09-19 11:51:38 -07:00
Fabiano Fidêncio
51dade3382 docs: Fix spell checker
tokio is not a valid word, it seems, so let's use `tokio`.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 20:25:21 +02:00
Gabriela Cervantes
49b3a0faa3 gha: Increase timeout to run k8s tests on TDX
This PR increases the timeout to run k8s tests for Kata CoCo TDX
to avoid random timeout failures.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-19 17:15:47 +00:00
Fabiano Fidêncio
31438dba79 docs: Fix qemu link
Otherwise static checks will fail, as we woke up the dogs with changes
on the same file.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 16:05:43 +02:00
Fabiano Fidêncio
fefcf7cfa4 acrn: Drop support
As we don't have any CI, nor maintainer to keep ACRN code around, we
better have it removed than give users the expectation that it should or
would work at some point.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 16:05:43 +02:00
Fabiano Fidêncio
cdaaf708a1 Merge pull request #10334 from emanuellima1/bump-version
release: Bump version to 3.9.0
2024-09-19 15:27:50 +02:00
Emanuel Lima
a6ee15c5c7 release: Bump VERSION to 3.9.0
Starting the v3.9.0 release

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-09-19 10:14:55 -03:00
Fabiano Fidêncio
e9593b53a4 Merge pull request #10234 from pmores/add-support-for-disabled-guest-selinux
runtime-rs: add support for disabled guest selinux
2024-09-19 15:03:24 +02:00
Fabiano Fidêncio
4d11fecc2d Merge pull request #10274 from ajaypvictor/remote_image-os_types
runtime: Enable Image annotation for remote hypervisor
2024-09-19 13:39:20 +02:00
Fabiano Fidêncio
3d5f48e02e Merge pull request #10283 from alexman-stripe/alexman-stripe/fix-kata-shim-not-reporting-inactive-file-cgroup-v2
shim: Fix memory usage reporting for cgroup v2
2024-09-19 12:50:36 +02:00
Pavel Mores
5e5eb9759f runtime-rs: handle disabled guest selinux in virtiofsd
This is just a port of functionality existing in the golang runtime.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-09-19 12:47:10 +02:00
Pavel Mores
8c92f3bfec runtime-rs: enable/disable selinux in guest based on disable_guest_selinux
This change technically affects the path for enabled guest selinux as well,
however since this is not implemented in runtime-rs anyway nothing should
break.  When guest selinux support is added this change will come in handy.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-09-19 12:47:10 +02:00
Pavel Mores
204ee21bc8 runtime-rs: handle disabled guest selinux in OCI spec
If guest selinux is off the runtime has to ensure that container OCI spec
contains no selinux labels for the container rootfs and process.  Failure
to do so causes kata agent to try and apply the labels which fails since
selinux is not enabled in guest, which in turn causes container launch
to fail.

This is largely inspired by golang runtime(*) with a slight deviation
in ordering of checks.  This change simply checks the disable_guest_selinux
config setting and if it's true it clears both rootfs and process label if
necessary.  Golang runtime, on the other hand, seems to first check if
process label is non-empty and only then it checks the config setting,
meaning that if process label is empty the rootfs label is not reset
even if it's non-empty.  Frankly, this looks like a potential bug though
probably unlikely to manifest since it can be assumed that the labels are
either both empty, or both non-empty.

(*) 4fd4b02f2e/src/runtime/virtcontainers/kata_agent.go (L1005)
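
The difference in ordering described above can be sketched as follows. This is only an illustration in Python, not the runtime-rs or golang code; the `Labels` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Labels:
    # Hypothetical stand-in for the two SELinux labels carried by the OCI spec.
    rootfs_label: str = ""
    process_label: str = ""


def clear_labels_config_first(labels, disable_guest_selinux):
    # Ordering described for this change: check the config setting first,
    # then clear both labels regardless of which of them is set.
    if disable_guest_selinux:
        labels.rootfs_label = ""
        labels.process_label = ""
    return labels


def clear_labels_process_first(labels, disable_guest_selinux):
    # Ordering attributed to the golang runtime in the text above: the
    # config setting is only consulted when the process label is non-empty.
    if labels.process_label != "" and disable_guest_selinux:
        labels.rootfs_label = ""
        labels.process_label = ""
    return labels
```

With an empty process label and a non-empty rootfs label, only the first variant resets the rootfs label, which is the corner case mentioned above.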

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-09-19 12:47:10 +02:00
Pavel Mores
eb1227f47d runtime-rs: parse the disable_guest_selinux config key
In order to handle the setting we have to first parse it and make its
value available to the rest of the program.

The yes() function is added to comply with serde which seems to insist
on default values being returned from functions.  Long term, this is
surely not the best place for this function to live, however given that
this is currently the first and only place where it's used it seems
appropriate to put it near its use.  If it ends up being reused elsewhere
a better place will surely emerge.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-09-19 12:47:10 +02:00
Steve Horsman
8789551fe6 Merge pull request #10333 from fidencio/topic/ci-bump-ubuntu-20.04-runners-to-22.04
ci: Bump ubuntu 20.04 runners to 22.04
2024-09-19 11:44:33 +01:00
Fabiano Fidêncio
35c7f8d1ba ci: Bump ubuntu 20.04 runners to 22.04
Azure internal mirrors for Ubuntu 20.04 have gone awry, leading to a
situation where dependencies cannot be installed (such as
libdevmapper-dev), blocking then our CI.

Let's bump the runners to 22.04 regardless, even knowing it'll cause an
issue with the runk tests, as the agent check tests are considered more
crucial to the project at this point.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 12:29:20 +02:00
Fabiano Fidêncio
eccdffebf7 Merge pull request #10243 from katexochen/nydus-overlayfs-path
virtcontainers: allow specifying nydus-overlayfs binary by path
2024-09-19 11:35:45 +02:00
Ajay Victor
a19f2eacec runtime: Enable ImageName annotation for remote hypervisor
Enables ImageName to support multiple VM images in remote hypervisor scenario

Fixes https://github.com/kata-containers/kata-containers/issues/10240

Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
2024-09-19 14:48:46 +05:30
Alex Man
27f8f69195 shim: Fix memory usage reporting for cgroup v2
kata-shim was not reporting `inactive_file` in memory stat.

This memory is deducted by containerd when calculating the size of container working set, as it can be paged out by the operating
system under memory pressure. Without reporting `inactive_file`, containerd will over-report container memory usage.
[Here](https://github.com/containerd/containerd/blob/v1.7.22/pkg/cri/server/container_stats_list_linux.go#L117) is where containerd
deducts `inactive_file` from memory usage.

Note that kata-shim correctly reports `total_inactive_file` for cgroup v1, but this was not implemented for cgroup v2.

This commit:
- Adds code in kata-shim to report "inactive_file" memory for cgroup v2
- Implements reporting of all available cgroup v2 memory stats to containerd
- Uses defensive coding to avoid assuming existence of any memory.stat fields

The list of available cgroup v2 memory stats defined by containerd can be found
[here](https://pkg.go.dev/github.com/containerd/cgroups/v2/stats#MemoryStat).
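
As a rough illustration of why `inactive_file` matters, here is a minimal Python sketch of the working set calculation. It is not the kata-shim or containerd code; the function names are hypothetical, and clamping the result at zero is an assumption of this sketch.

```python
def parse_memory_stat(text):
    """Parse cgroup v2 memory.stat content into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # defensive: skip malformed lines
        key, value = parts
        try:
            stats[key] = int(value)
        except ValueError:
            continue  # defensive: skip non-numeric values
    return stats


def working_set_bytes(usage_bytes, stats):
    """Usage minus inactive_file, never negative."""
    # A missing field is treated as 0 rather than assumed to exist.
    inactive = stats.get("inactive_file", 0)
    if usage_bytes > inactive:
        return usage_bytes - inactive
    return 0
```

If `inactive_file` is never reported, it is effectively 0 here and the working set equals the full usage, which is the over-reporting described above.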

Fixes #10280

Signed-off-by: Alex Man <alexman@stripe.com>
2024-09-18 14:04:24 -07:00
Fabiano Fidêncio
1597f8ba00 Merge pull request #10279 from alexman-stripe/alexman-stripe/fix-cgroup-v2-wrong-cpu-usage-unit
agent: Fix CPU usage reporting for cgroup v2 in kata-agent
2024-09-18 21:36:52 +02:00
Fabiano Fidêncio
593cbb8710 Merge pull request #10306 from microsoft/danmihai1/more-security-contexts
genpolicy: get UID from PodSecurityContext
2024-09-18 21:33:39 +02:00
Aurélien Bombo
5402f2c637 Merge pull request #10308 from Sumynwa/sumsharma/add_setpolicy_agent_ctl
agent-ctl: Add SetPolicy support
2024-09-18 10:09:07 -07:00
Pawel Proskurnicki
b63d49b34a ci: don't require sudo for yq if already installed
The yq installation shouldn't force the use of sudo if yq is already installed in the correct version.

Signed-off-by: Pawel Proskurnicki <pawel.proskurnicki@intel.com>
2024-09-18 11:01:07 +02:00
Sumedh Alok Sharma
18c887f055 agent-ctl: Add SetPolicy support
This patch adds support for calling the kata agent's SetPolicy
API. It also adds tests for the SetPolicy API using agent-ctl.

Fixes #9711

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-18 10:53:49 +05:30
GabyCT
28d430ec42 Merge pull request #10324 from GabyCT/topic/fixinlib
ci: Fix indentation of install libseccomp script
2024-09-17 14:21:24 -06:00
Fabiano Fidêncio
da2377346d Merge pull request #10323 from stevenhorsman/update-kubectl-release-url
kata-deploy: Switch Kubernetes URL
2024-09-17 20:47:17 +02:00
Gabriela Cervantes
096f32cc52 ci: Fix indentation of install libseccomp script
This PR fixes the indentation of the install libseccomp script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-17 16:38:53 +00:00
Aurélien Bombo
9d29ce460d Merge pull request #10303 from Sumynwa/sumsharma/agent_policy_set_env
agent: add support to provide default agent policy via env
2024-09-17 09:04:11 -07:00
stevenhorsman
c0d35a66aa ci: kata-deploy: Update kubectl install URL
The `deploy_k0s` and `deploy_k3s` kubectl installs aren't failing
yet, but let's get ahead of this and bump them as well

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-17 15:35:42 +01:00
stevenhorsman
1abeffdac6 kata-deploy: Switch Kubernetes URL
The payload build is failing with:
```
ERROR: failed to solve: process "/bin/sh -c apk --no-cache add bash curl &&
ARCH=$(uname -m) &&
if [ \"${ARCH}\" = \"x86_64\" ]; then ARCH=amd64; fi &&
if [ \"${ARCH}\" = \"aarch64\" ]; then ARCH=arm64; fi &&
DEBIAN_ARCH=${ARCH} &&
if [ \"${DEBIAN_ARCH}\" = \"ppc64le\" ]; then DEBIAN_ARCH=ppc64el; fi &&
curl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/ \
$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/${ARCH}/kubectl &&
chmod +x /usr/bin/kubectl &&
curl -fL --progress-bar -o /usr/bin/jq https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-${DEBIAN_ARCH} &&
chmod +x /usr/bin/jq &&
mkdir -p ${DESTINATION} &&
tar xvf ${WORKDIR}/${KATA_ARTIFACTS} -C ${DESTINATION} &&
rm -f ${WORKDIR}/${KATA_ARTIFACTS} &&
apk del curl &&
apk --no-cache add py3-pip &&
pip install --no-cache-dir yq==3.2.3" did not complete successfully: exit code: 22
```

Looking into this, the problem is that
https://storage.googleapis.com/kubernetes-release/release/v1.31.1/bin/linux/amd64/kubectl
doesn't exist. The [kubectl install doc](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-kubectl-on-linux)
recommends the `dl.k8s.io` site, so let's switch to this.
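
For illustration only, the download URL on `dl.k8s.io` can be assembled as in the Python sketch below. The helper name and the architecture table are hypothetical; the mapping mirrors the `x86_64`/`aarch64` handling in the failing command above.

```python
# uname -m value -> architecture name used in the kubectl download path
ARCH_MAP = {"x86_64": "amd64", "aarch64": "arm64"}


def kubectl_url(version, machine):
    """Build the dl.k8s.io URL for a given release and machine type."""
    arch = ARCH_MAP.get(machine, machine)  # other values pass through unchanged
    return f"https://dl.k8s.io/release/{version}/bin/linux/{arch}/kubectl"
```

The version string would typically be read from https://dl.k8s.io/release/stable.txt, as the original command already does.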

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-17 15:35:42 +01:00
Steve Horsman
5448f7fbbf Merge pull request #10321 from BbolroC/fix-build-boot-image-se
local-build: Fix unbound variable issue for lib_se.sh
2024-09-17 15:35:04 +01:00
Hyounggyu Choi
72471d1a18 local-build: Fix unbound variable for lib_se.sh
As #10315 introduced an `unbound variable` error, this is a
hot-fix for it.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-17 10:01:14 +02:00
Hyounggyu Choi
72df3004e8 gha: Rebase build-secure-image-se atop of latest target branch
This commit adds a step called `Rebase atop of the latest target branch`
to the job named `build-asset-boot-image-se` which can test the PR properly.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-17 09:54:51 +02:00
Hyounggyu Choi
03cd02a006 Merge pull request #10315 from BbolroC/update-ibm-se-doc
doc: Update how-to-run-kata-containers-with-SE-VMs.md
2024-09-16 15:12:18 +02:00
Sumedh Alok Sharma
cefba08903 agent: add support to provide default agent policy via env
agent built with policy feature initializes the policy engine using a
policy document from a default path, which is installed & linked during
UVM rootfs build. This commit adds support to provide a default agent
policy as environment variable.

This targets development/testing scenarios where kata-agent
needs to be started as a local process.

Fixes #10301

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-16 18:05:21 +05:30
Hyounggyu Choi
8d609e47fb doc: Update how-to-run-kata-containers-with-SE-VMs.md
The following changes have been made:

- Remove unnecessary `sudo`
- Add an error message where an incorrect host key document is used
- Add a missing artifact `kernel-confidential-modules`
- Introduce a variable `kernel_version` and use it in the relevant places

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-16 12:53:30 +02:00
Fabiano Fidêncio
fc5a631791 Merge pull request #10009 from Xynnn007/feat-cosign
Merge to main: supporting pull cosign signed images
2024-09-16 12:08:26 +02:00
stevenhorsman
aa9f21bd19 test: Add support for s390x in cosign testing
We've added s390x test container image, so add support
to use them based on the arch the test is running on

Fixes: #10302

Signed-off-by: stevenhorsman <steven@uk.ibm.com>

fixup
2024-09-16 09:20:57 +01:00
stevenhorsman
3087ce17a6 tests: combined pod yaml creation for CoCo tests
This commit factors out the common parts of the image pulling test series, such as
encrypted image pulling, pulling images from an authenticated registry and
image verification. This helps reduce the cost of maintenance.

Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-16 09:20:57 +01:00
Xynnn007
c80c8d84c3 test: add cosign signature verification tests
Close #8120

**Case 1**
Create a pod from an unsigned image, on an insecureAcceptAnything
registry works.

Image: quay.io/prometheus/busybox:latest
Policy rule:
```
"default": [
    {
        "type": "insecureAcceptAnything"
    }
]
```

**Case 2**
Create a pod from an unsigned image, on a 'restricted registry' is
rejected.

Image: ghcr.io/confidential-containers/test-container-image-rs:unsigned
Policy rule:
```
"quay.io/confidential-containers/test-container-image-rs": [
    {
        "type": "sigstoreSigned",
        "keyPath": "kbs:///default/cosign-public-key/test"
    }
]
```

**Case 3**
Create a pod from a signed image, on a 'restricted registry' is
successful.

Image: ghcr.io/confidential-containers/test-container-image-rs:cosign-signed
Policy rule:
```
"ghcr.io/confidential-containers/test-container-image-rs": [
    {
        "type": "sigstoreSigned",
        "keyPath": "kbs:///default/cosign-public-key/test"
    }
]
```

**Case 4**
Create a pod from a signed image, on a 'restricted registry', but with
the wrong key is rejected

Image:
ghcr.io/confidential-containers/test-container-image-rs:cosign-signed-key2

Policy:
```
"ghcr.io/confidential-containers/test-container-image-rs": [
    {
        "type": "sigstoreSigned",
        "keyPath": "kbs:///default/cosign-public-key/test"
    }
]
```

**Case 5**
Create a pod from an unsigned image, on a 'restricted registry' works
if enable_signature_verification is false

Image: ghcr.io/kata-containers/confidential-containers:unsigned

image security enable: false

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-16 09:20:57 +01:00
Xynnn007
9606e7ac8b agent: Set image-rs image security policy
Add two parameters for enabling cosign signature image verification.
- `enable_signature_verification`: to activate signature verification
- `image_policy`: URI of the image policy config

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-09-16 09:20:57 +01:00
Xynnn007
653bc3973f agent: fix make test for kata-agent of dependency anyhow
A new version of the anyhow crate has changed how backtraces are captured, so
unit tests of kata-agent that compare a raised error with an expected
one would fail. To fix this, we want only panics to have backtraces, so
set `RUST_BACKTRACE=1` and `RUST_LIB_BACKTRACE=0` for tests, as described
in the documentation:

https://docs.rs/anyhow/latest/anyhow/

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-09-16 09:20:57 +01:00
Fabiano Fidêncio
dfcb41b5cc Merge pull request #10313 from stevenhorsman/coco-components-0.10-bump
CoCo: Bump Coco components to 0.10 releases
2024-09-14 21:43:28 +02:00
stevenhorsman
705e469696 rootfs: Change initrd alpine mirror
The rootfs-initrd build is failing with:
```
fetch https://mirror.math.princeton.edu/pub/alpinelinux//v3.18/main/aarch64/APKINDEX.tar.gz
6684368:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914:
ERROR: https://mirror.math.princeton.edu/pub/alpinelinux//v3.18/main: Permission denied
```
so try bumping to a newer version of alpine to see
if that helps the issue

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-14 18:47:45 +02:00
Dan Mihai
5777869cf4 tests: k8s-policy-rc: add unexpected UID test
Change pod runAsUser value of a Replication Controller after generating
the RC's policy, and verify that the RC pods get rejected due to this
change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
6773f14667 tests: k8s-policy-job: add unexpected UID test
Change pod runAsUser value of a Job after generating the Job's policy,
and verify that the Job gets rejected due to this change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
124f01beb3 tests: k8s-policy-deployment: add bad UID test
Change pod runAsUser value of a Deployment after generating the
Deployment's policy, and verify that the Deployment fails due to
this change.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
16f5ebf5f9 genpolicy: get UID from PodSecurityContext
Get UID from PodSecurityContext for other k8s resource types too,
not just for Pods.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 22:05:31 +00:00
Dan Mihai
5badc30a69 Merge pull request #10316 from microsoft/danmihai1/k8s-inotify
tests: k8s-inotify: pod termination polling
2024-09-13 15:02:38 -07:00
GabyCT
6f363bba18 Merge pull request #10304 from GabyCT/topic/fixcricont
tests: Fix indentation in the cri containerd tests
2024-09-13 14:49:12 -06:00
Dan Mihai
d3127af9c5 tests: k8s-inotify: pod termination polling
Poll/wait for pod termination instead of sleeping 2 minutes. This
change typically saves ~90 seconds in my test cluster.
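
The idea of polling instead of a fixed sleep can be sketched in Python as below. This is not the bats test code; `wait_for`, its defaults and the injectable `clock`/`sleep` parameters are hypothetical and only there to keep the example self-contained.

```python
import time


def wait_for(condition, timeout_s=120.0, interval_s=2.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is true or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True   # done as soon as the condition holds
        if clock() >= deadline:
            return False  # gave up after the timeout
        sleep(interval_s)
```

A fixed sleep always pays the full two minutes, while the polling loop returns as soon as the pod is gone and only falls back to the timeout in the worst case.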

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-13 17:12:55 +00:00
sidney chang
5a7d0ed3ad runtime-rs: introduce tap in hypervisor by extracting it from dragonball
It's a prerequisite PR to make built-in vmm dragonball compilation
options configurable.

Extract TAP device-related code from dragonball's dbs_utils into a
separate library within the runtime-rs hypervisor module.
To enhance functionality and reduce dependencies, the extracted code
has been reimplemented using the libc crate and the ifreq structure.

Fixes #10182

Signed-off-by: sidney chang <2190206983@qq.com>
2024-09-13 07:32:14 -07:00
Fabiano Fidêncio
b09eba8c46 Merge pull request #10309 from BbolroC/helm-install-with-retry
tests: Introduce retry mechanism for helm install
2024-09-13 15:08:46 +02:00
stevenhorsman
00e657cdb7 agent: image-rs: Update to v0.10.0 release
Update image-rs to use the latest release of guest-components

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-13 13:29:54 +01:00
stevenhorsman
5e03890562 versions: Bump trustee and guest-components
Bump to the v0.10.1 release of trustee and v0.10.0
release of guest-components

Signed-off-by: stevenhorsman <steven@uk.ibm.com>

fixup
2024-09-13 13:28:54 +01:00
Hyounggyu Choi
0aae847ae5 tests: Update secure boot image verification for IBM SE
In the latest `s390-tools`, there has been an update on how to
verify a secure boot image. A host key revocation list (CRL),
which was optional, now becomes mandatory for verification.
This commit updates the relevant scripts and documentation accordingly.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-13 14:14:02 +02:00
Hyounggyu Choi
4c933a5611 tests: Introduce retry mechanism for helm install
Kata-deploy often fails due to a transiently unreachable k8s cluster
for the qemu-coco-dev test on s390x.
(e.g. https://github.com/kata-containers/kata-containers/actions/runs/10831142906/job/30058527098?pr=10009)
This commit introduces a retry mechanism to mitigate these failures by
retrying the command two more times with a 10-second interval as a workaround.
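
A minimal sketch of such a retry loop, written in Python rather than the shell used by the tests. The function name and the injectable `sleep` parameter are hypothetical; the two extra attempts and the 10-second interval follow the description above.

```python
import time


def run_with_retry(action, retries=2, interval_s=10, sleep=time.sleep):
    """Run action() once, then up to `retries` more times on failure."""
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        if action():
            return True
        if attempt < attempts:
            sleep(interval_s)  # wait before the next attempt, not after the last
    return False
```

Here `action` would wrap the `helm install` invocation and return whether it succeeded.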

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-13 14:03:44 +02:00
Dan Mihai
e937cb1ded Merge pull request #10291 from microsoft/danmihai1/user-name-to-uid
genpolicy: fix and re-enable create container UID verification
2024-09-12 15:47:59 -07:00
Dan Mihai
0c5ac042e7 tests: k8s-policy-pod: add workaround for #10297
If the CI platform being tested doesn't yet support the prometheus
container image:
- Use busybox instead of prometheus.
- Skip the test cases that depend on the prometheus image.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-12 18:26:38 +00:00
Gabriela Cervantes
0346b32a90 tests: Fix indentation in the cri containerd tests
This PR fixes the indentation in the cri containerd tests as we
have in several places a misalignment in the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-12 16:18:34 +00:00
Dan Mihai
94d95fc055 tests: k8s-policy-pod: test container UID changes
Add test cases for changing container UID after generating the policy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Dan Mihai
db1ca4b665 tests: k8s-policy-pod: remove UID workaround
Remove the workaround for #9928, now that genpolicy is able to
convert user names from container images into the corresponding
UIDs from these images.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Dan Mihai
d2d8d2e519 genpolicy: remove default UID/GID values
Remove the recently added default UID/GID values, because the genpolicy
design is to initialize those fields before this new code path gets
executed.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Hernan Gatta
871476c3cb genpolicy: pull UID:GID values from /etc/passwd
Some container images are configured such that the user (and group)
under which their entrypoint should run is not a number (or pair of
numbers), but a user name.

For example, in a Dockerfile, one might write:

> USER 185

indicating that the entrypoint should run under UID=185.

Some images, however, might have:

> RUN groupadd --system --gid=185 spark
> RUN useradd --system --uid=185 --gid=spark spark
> ...
> USER spark

indicating that the UID:GID pair should be resolved at runtime via
/etc/passwd.

To handle such images correctly, read through all /etc/passwd files in
all layers, find the latest version of it (i.e., the top-most layer with
such a file), and, in so doing, ensure that whiteouts of this file are
respected (i.e., if one layer adds the file and some subsequent layer
removes it, don't use it).
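
The layer walk described above can be sketched as follows. This is a simplified Python illustration, not the genpolicy implementation: layers are modeled as plain dicts ordered bottom to top, only the `etc/.wh.passwd` whiteout entry is handled, and both function names are hypothetical.

```python
def latest_passwd(layers):
    """Return the /etc/passwd content visible after applying all layers.

    layers: list of {path: content} dicts, ordered from bottom to top.
    """
    passwd = None
    for layer in layers:
        if "etc/.wh.passwd" in layer:
            passwd = None  # a whiteout removes the file from lower layers
        if "etc/passwd" in layer:
            passwd = layer["etc/passwd"]  # a higher layer replaces the file
    return passwd


def resolve_user(passwd, name):
    """Map a user name to (uid, gid) using passwd content, or None."""
    if passwd is None:
        return None
    for line in passwd.splitlines():
        fields = line.split(":")
        if len(fields) >= 4 and fields[0] == name:
            return int(fields[2]), int(fields[3])
    return None
```

For the `spark` example above, resolving the name against the top-most passwd file yields the pair (185, 185).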

Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>
2024-09-11 22:38:20 +00:00
Hernan Gatta
f9249b4476 genpolicy: add tar dependency
Used to read /etc/passwd from tar files.

Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>
2024-09-11 22:38:20 +00:00
Dan Mihai
eb7f747df1 genpolicy: enable create container UID verification
Disabling the UID Policy rule was a workaround for #9928. Re-enable
that rule here and add a new test/CI temporary workaround for this
issue. This new test workaround will be removed after fixing #9928.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
Dan Mihai
71ede4ea3f tests: k8s-policy-pod: use prometheus container
Change quay.io/prometheus/busybox to quay.io/prometheus/prometheus in
this test. The prometheus image will be helpful for testing the future
fix for #9928 because it specifies user = "nobody".

Also, change:

sh -c "ls -l /"

to:

echo -n "readinessProbe with space characters"

as the test readinessProbe command line. Both include a command line
argument containing space characters, but "sh -c" behaves differently
when using the prometheus container image (causes the readinessProbe
to time out, etc.).

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-11 22:38:20 +00:00
GabyCT
614328f342 Merge pull request #10295 from GabyCT/topic/removeimgvar
metrics: Remove unused remove img var in common script
2024-09-11 15:02:39 -07:00
GabyCT
095c5ed961 Merge pull request #10289 from GabyCT/topic/enablestresst
tests: Enable stressng k8s stability test for Kata CoCo CI
2024-09-11 10:47:33 -07:00
Fabiano Fidêncio
97ecdabde9 Merge pull request #10294 from fidencio/topic/bring-ita-support
Bump guest-components / trustee to a version that supports ITA
2024-09-11 19:45:48 +02:00
Gabriela Cervantes
fdaf12d16c metrics: Remove unused remove img var in common script
This PR removes the remove_img variable in the metrics common script
as it is not being used.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-11 17:45:18 +00:00
Gabriela Cervantes
04d1122a46 tests: Decrease iterations in soak test
This PR decreases the number of iterations in the kubernetes soak test
as this is already taking more than 2 hours for the kata coco ci
stability.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-11 17:39:06 +00:00
Gabriela Cervantes
c48c6f974e tests: Enable stressng k8s stability test for Kata CoCo CI
This PR enables the stressng k8s stability test for Kata CoCo CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-11 17:38:13 +00:00
Alex Man
7e400f7bb2 agent: Fix CPU usage reporting for cgroup v2 in kata-agent
kata-agent incorrectly reports CPU time for cgroup v2, causing 1000x underreporting.

For cgroup v2, kata-agent reads the cpu.stat file, which reports the time consumed by the processes in the cgroup in µs.
However, there was a bug in kata-agent where it returned this value in µs without converting it to ns.

This commit adds the necessary µs to ns conversion for cgroup v2, aligning it with v1 behavior and kata-shim's expectations.
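
The conversion amounts to the following, shown here as a Python sketch rather than the agent's Rust code. The function name is hypothetical; `usage_usec` is the cgroup v2 `cpu.stat` field holding the total CPU time in µs.

```python
NANOS_PER_MICRO = 1000


def cpu_usage_ns(cpu_stat_text):
    """Return total CPU time in ns from cgroup v2 cpu.stat content."""
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "usage_usec":
            # cpu.stat reports microseconds; the consumer expects nanoseconds.
            return int(parts[1]) * NANOS_PER_MICRO
    return 0  # field not present
```

Returning the raw `usage_usec` value without the multiplication is what produced the 1000x underreporting.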

This fixes #10278

Signed-off-by: Alex Man <alexman@stripe.com>
2024-09-11 10:29:03 -07:00
Fabiano Fidêncio
1178fe20e9 tests: Adapt error parser for failed image decryption
With an older version of image-rs, we were getting the following error:
```
       Message:   failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key no suitable key found for decrypting layer key:
```

However, with the version of image-rs we are bumping to, the error comes
as:
```
       Message:   failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key

 Caused by:
     no suitable key found for decrypting layer key:
      keyprovider: failed to unwrap key by ttrpc
```

Due to this change, I'm splitting the check in two different ones.
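
Splitting the check means matching the two fragments independently instead of as one contiguous string. A minimal Python sketch of that idea, with a hypothetical function name (the actual tests are shell-based):

```python
def is_decrypt_key_failure(message):
    """True if both error fragments appear anywhere in the message."""
    return ("failed to get decrypt key" in message
            and "no suitable key found for decrypting layer key" in message)
```

Both the older single-line message and the newer message with its `Caused by:` section satisfy this check.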

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 17:07:56 +02:00
Dan Mihai
66dda37877 Merge pull request #10271 from Sumynwa/sumsharma/agent_ctl_issue_9689_local
agent-ctl: Refactor CopyFile Handler
2024-09-11 07:35:09 -07:00
Fabiano Fidêncio
f6cfc33314 Merge pull request #10292 from fidencio/topic/ci-tdx-adapt-how-we-get-the-host-ip
ci: tdx: Adapt how we get the host IP
2024-09-11 14:42:22 +02:00
Fabiano Fidêncio
e2200f0690 versions: trustee: Update to a version that supports ITA
ITA stands for Intel Trust Authority, which is in the process of being
renamed to ITTS (Intel Tiber Trust Services).

Proper ITA / ITTS support on Trustee was finished as part of:
* 6f767fa15f

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 13:39:35 +02:00
Fabiano Fidêncio
d3e3ee7755 versions: guest-components: Update to a version that supports ITA
ITA stands for Intel Trust Authority, which is in the process of being
renamed to ITTS (Intel Tiber Trust Services).

As we've bumped guest-components on trustee, let's make sure we also
bump image-rs to the commit that brings ITA support in:
* https://github.com/confidential-containers/guest-components/commit/1db6c3a87665dde58d0efa56f4e4af5fc

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 13:36:56 +02:00
Fabiano Fidêncio
f94d80783d agent: image-rs: Update to a version that supports ITA
ITA stands for Intel Trust Authority, which is in the process of being
renamed to ITTS (Intel Tiber Trust Services).

As we've bumped guest-components on trustee, let's make sure we also
bump image-rs to the commit that brings ITA support in:
* 1db6c3a876

The reason we need to bump the dependency here is to avoid kbs_protocol
mismatch between the version used by the agent and the trustee one.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 13:36:46 +02:00
Fabiano Fidêncio
3946aa7283 ci: tdx: Adapt how we get the host IP
In the process of switching the TDX CI machine we've noticed that
`hostname -i` on one of the machines returns a single IP address,
while on another machine it returns a full list of IPs.

As we're only interested in the first one, let's adapt the code to
always return the first one.
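
Taking the first address works for both output shapes. A small Python sketch of the idea, with a hypothetical function name (the CI scripts themselves are shell):

```python
def first_ip(hostname_i_output):
    """Return the first whitespace-separated address, or None if empty."""
    fields = hostname_i_output.split()
    return fields[0] if fields else None
```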

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 09:31:43 +02:00
Sumedh Alok Sharma
b4bbbf65c6 ci: Do not start CDH/attestation procs with kata-agent as local process.
Since CDH/attestation related processes and its dependencies are not fully
available, the setup fails to start kata-agent as local process. This
fix removes these procs to prevent kata-agent from trying to start them.

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 11:53:59 +05:30
Sumedh Alok Sharma
8045a7a2ba ci: Install policy document on host to run kata-agent as local process.
The test setup starts kata-agent as a local process without the
UVM. The agent policy initialization fails due to missing policy
document at `/etc/kata-opa/default-policy.rego`. The fix
- installs a relaxed `allow-all.rego` policy document
- cleans up the install during exit

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 11:25:08 +05:30
Sumedh Alok Sharma
822f898433 ci: Install bats as dependencies
Install bats as part of dependencies for running the tests.

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 10:57:15 +05:30
Sumedh Alok Sharma
2c774fb207 ci: Add tests for CopyFile api.
This commit introduces test cases for testing
CopyFile API using kata-agent-ctl with improved command
semantics and handling.
- copy a file to /run/kata-containers
- copy symlink to /run/kata-containers
- copy directory to /run/kata-containers
- copy file to /tmp
- copy large file to /run/kata-containers

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 10:54:01 +05:30
Sumedh Alok Sharma
2af1113426 agent-ctl: Refactor CopyFile handler
In the existing implementation for the CopyFile subcommand,
- cmd line argument list is too long, including various metadata information.
- in case of a regular file, passing the actual data as bytes stream adds to the size and complexity of the input.
- the copy request will fail when the file size exceeds the allowed ttrpc max data length limit of 4 MB.

This change refactors the CopyFile handler and modifies the input to a known 'source' 'destination' syntax.

Fixes #9708

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 10:54:01 +05:30
Alex Lyn
d0968032f7 Merge pull request #10276 from Apokleos/fix-runtime-cdi
runtime: Fix runtime/cdi panic with assignment to entry in nil map
2024-09-11 09:00:11 +08:00
Alex Lyn
3f541aff4a Merge pull request #10282 from teawater/dup
runtime-rs: configuration-dragonball.toml.in: Remove duplication
2024-09-10 11:46:40 +08:00
Hui Zhu
dfea12bc53 runtime-rs: configuration-dragonball.toml.in: Remove duplication
Remove duplicated description of enable_balloon_f_reporting from
configuration-dragonball.toml.in.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-10 07:34:29 +08:00
David Esparza
6f8897249b Merge pull request #10277 from GabyCT/topic/fixsk
tests: Increase timeout to wait for soak stability test deployment
2024-09-09 14:07:10 -06:00
Gabriela Cervantes
5a52fe1a75 tests: Increase timeout to wait for soak stability test deployment
This PR increases the timeout used when waiting for the soak stability
test deployment to become ready, in order to avoid random failures
reporting that the deployment is not ready yet.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-09 16:13:40 +00:00
Alex Lyn
1684c1962c runtime: Fix runtime/cdi panic with assignment to entry in nil map
The runtime panics when users do GPU VFIO passthrough with CDI.
The root cause is that CustomSpec.Annotations is nil when a new element
is added.
To address this issue, the map is initialized when it is nil.

Fixes #10266

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-09 20:15:10 +08:00
Alex Lyn
f31839af63 Merge pull request #10253 from teawater/enable_balloon_f_reporting
Add support of dragonball virtio-balloon free page reporting
2024-09-09 17:37:52 +08:00
Fabiano Fidêncio
026a4d92a9 Merge pull request #10272 from fidencio/topic/add-tdx-mrconfigid-mrowner-mrownerconfig-support
runtime: qemu: tdx: Add support for setting mrconfigid / mrowner / mrownerconfig
2024-09-08 14:11:30 +02:00
Fabiano Fidêncio
51ee4c381a Merge pull request #10257 from fidencio/topic/kata-deploy-remove-unused-vars-for-cleanup
kata-deploy: Remove kata-cleanup unneeded vars
2024-09-07 11:27:14 +02:00
Chengyu Zhu
3a37652d01 Merge pull request #10213 from ChengyuZhu6/device
Refine device management for kata-agent
2024-09-07 12:02:32 +08:00
ChengyuZhu6
75816d17f1 agent: switch to new device subsystem
Switch to new device subsystem to handle various devices in kata-agent.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
df55f37dfe agent: Move unit tests about vfio device to vfio_device_handler
Move unit tests about vfio device to vfio_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
41c2d81fd3 agent: Move unit tests about scsi device to scsi_device_handler
Move unit tests about scsi device to scsi_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
f45129cb44 agent: Move unit tests about network device to network_device_handler
Move unit tests about network device to network_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
52203db760 agent: Move unit tests about block device to block_device_handler
Move unit tests about block device to block_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
e1afb92a28 agent: Move common unit tests about device
Move common unit tests about device to mod.rs

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
25bd04c02a agent: Use DeviceHandlerManager to handle various devices
Use DeviceHandlerManager to handle various devices.

Fixes: #10218

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:42 +08:00
ChengyuZhu6
5fc645c869 agent: Move network device code to network_device_handler
Move network device code to network_device_handler to simplify the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
07f104085a agent: Move vfio device code to vfio_device_handler
Move vfio device code to vfio_device_handler to simplify the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
0cb87767ae agent: Move device code with virtio scsi driver to scsi_device_handler
Move scsi device code to scsi_device_handler to simplify the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
0738d75a92 agent: Move device code with nvdimm driver to nvdimm_device_handler
Move device code with nvdimm driver to nvdimm_device_handler, including
nvdimm device and pmem device.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
bbf934161b agent: Move virtio-block device handlers to block_device_handler
Move virtio-block device handlers to block_device_handler to simplify
the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
4e33665be8 kata-types: Move device driver constants to kata-types
Move device driver constants and add DeviceHandlerManager type alias.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
0b3ad2f830 kata-types: Replace StorageHandlerManager with type alias
Removed the `StorageHandlerManager` struct and its associated implementations and
introduced a type alias `StorageHandlerManager` for `HandlerManager` to simplify the code.
The new type alias maintains the same functionality while reducing redundancy.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 07:53:31 +08:00
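The refactor in 0b3ad2f830 and 281f0d7f29 can be pictured roughly as below. This is a minimal sketch, not the kata-types code: the field layout, the method names `add_handler` and `handler`, and the `StorageHandler` / `DeviceHandler` payload types are all assumptions made for illustration. Only the idea of one generic `HandlerManager` reused through type aliases comes from the commit messages.

```rust
use std::collections::HashMap;

/// Generic registry keyed by a string ID (illustrative layout only).
pub struct HandlerManager<H> {
    handlers: HashMap<String, H>,
}

impl<H: Clone> HandlerManager<H> {
    pub fn new() -> Self {
        Self { handlers: HashMap::new() }
    }

    /// Register `handler` under `id`, replacing any previous entry.
    pub fn add_handler(&mut self, id: &str, handler: H) {
        self.handlers.insert(id.to_string(), handler);
    }

    /// Look up the handler registered for `id`, if any.
    pub fn handler(&self, id: &str) -> Option<H> {
        self.handlers.get(id).cloned()
    }
}

// Hypothetical handler payloads; the real handlers are trait objects.
#[derive(Clone, Debug, PartialEq)]
pub struct StorageHandler(pub &'static str);

#[derive(Clone, Debug, PartialEq)]
pub struct DeviceHandler(pub &'static str);

// The aliases add no code of their own: both names share one implementation.
pub type StorageHandlerManager = HandlerManager<StorageHandler>;
pub type DeviceHandlerManager = HandlerManager<DeviceHandler>;
```

With aliases like these, existing call sites can keep using the `StorageHandlerManager` name while the device subsystem reuses the same generic type.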
ChengyuZhu6
281f0d7f29 kata-types: Add HandlerManager to manage registered handlers
Introduced `HandlerManager` struct to manage registered handlers, which will be used for storage and device management in kata-agent.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 07:51:48 +08:00
GabyCT
b05811587e Merge pull request #10245 from ChengyuZhu6/handler-manager
agent: Refactor storage handler registration
2024-09-06 09:45:39 -06:00
GabyCT
37ddb837c4 Merge pull request #10267 from GabyCT/topic/updatemlcomments
metrics: Update openVINO and oneDNN tests references
2024-09-06 09:42:21 -06:00
Fabiano Fidêncio
65a4562050 runtime: qemu: tdx: Add omitempty to QuoteGenerationSocket
I know right now we're always passing a value for that, but this doesn't
really have to be set unless attestation is used.  Thus, let's also omit
it in case it's empty.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 15:05:55 +02:00
Fabiano Fidêncio
7818484120 runtime: qemu: tdx: Support mrconfigid / mrowner/ mrownerconfig
This is a quick and simple pre-req for supporting initData, which will
take advantage of the mrconfigid in the TDX case.

While already adding mrconfigid, which is hardcoded empty right now,
let's do the same for mrowner and mrownerconfig, and leave it prepared
for future expansions.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 15:05:54 +02:00
Fabiano Fidêncio
8285957678 runtime: qemu: Rename prepareObjectWithTDXQgs to prepareTDXObject
The reason we're relying on yet another function to do so is because the
TDX object will be used in its qom / qapi json format.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 14:36:09 +02:00
Fabiano Fidêncio
29ce2205a1 Merge pull request #10268 from microsoft/saulparedes/pdb-support
genpolicy: add support for PodDisruptionBudget yaml
2024-09-06 09:53:36 +02:00
Dan Mihai
1885478e2e Merge pull request #10270 from Sumynwa/sumsharma/enable_agent_tests_in_ci
ci: Enable kata agent API tests
2024-09-05 14:24:49 -07:00
Archana Choudhary
f2625b0014 genpolicy: add support for PodDisruptionBudget
yaml

Prevent panic for PDB specs

Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-09-05 11:33:47 -07:00
Sumedh Alok Sharma
e1ac2f4416 ci: Enable kata agent api tests
This commit enables running tests for kata agent apis.
The 'api-tests' directory will contain bats test files for
individual APIs.

Fixes #10269

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-06 00:02:55 +05:30
GabyCT
4b257bcbb6 Merge pull request #10255 from Sumynwa/sumsharma/metrics_ci_kill_kata_components
ci: send SIGKILL to kill kata components
2024-09-05 12:04:57 -06:00
Aurélien Bombo
cc9aeee81a Merge pull request #10263 from Sumynwa/sumsharma/add_ci_workflow
ci: Add workflow to run kata-agent api tests using kata-agent-ctl
2024-09-05 09:32:34 -07:00
Dan Mihai
7ab95b56f1 Merge pull request #10251 from microsoft/saulparedes/support_readonly_hostpath
genpolicy: support readonly hostpath
2024-09-05 09:27:15 -07:00
GabyCT
deb6d12ff6 Merge pull request #10237 from GabyCT/topic/k8soakcoco
tests: Enable k8s soak stability test for Kata CoCo CI
2024-09-05 09:56:48 -06:00
Gabriela Cervantes
fcc35dd3a7 metrics: Update openVINO and oneDNN tests references
This PR updates the machine learning test references (URLs) for the
openVINO and oneDNN scripts, as they currently refer to a different
performance benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-05 15:39:21 +00:00
GabyCT
bb5d8bbcb5 Merge pull request #10229 from GabyCT/topic/ufcv
versions: Update firecracker version to 1.8.0
2024-09-05 09:19:36 -06:00
Fabiano Fidêncio
70491ff29f Merge pull request #10244 from BbolroC/turn-on-kbs-qemu-coco-dev-s390x
gha: Turn on KBS for qemu-coco-dev on s390x
2024-09-05 13:02:42 +02:00
Sumedh Alok Sharma
ad66f4dfc9 ci: Add workflow to run kata-agent api tests using kata-agent-ctl
Enable CI to add test cases for testing kata-agent APIs. This commit
introduces:
- a workflow to run tests
- setup scripts to prepare the test environment

Fixes #10262

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-05 14:38:29 +05:30
Saul Paredes
24c2d13fd3 genpolicy: support readonly emptyDir mount
Set emptyDir access based on volume mount readOnly value

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-09-04 15:05:44 -07:00
Saul Paredes
36a4104753 genpolicy: support readonly hostpath
Set hostpath access based on volume mount readOnly value

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-09-04 14:55:22 -07:00
Fabiano Fidêncio
7d048f5963 Merge pull request #10254 from fidencio/topic/remove-amd-specific-warning-from-non-amd-systems
runtime: Don't error out about SNP cert path on non SNP platforms
2024-09-04 23:42:32 +02:00
Fabiano Fidêncio
d44d66ddf6 kata-deploy: Remove kata-cleanup unneeded vars
As kata-cleanup will only call `reset_runtime()`, there's absolutely no
need to export the other set of environment variables in its yaml file.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-04 19:09:02 +02:00
Steve Horsman
f66e8c41a1 Merge pull request #10250 from squarti/remote-machine-type-default
runtime: fix bad default machine_type for remote hypervisor
2024-09-04 17:34:04 +01:00
Sumedh Alok Sharma
4025468e27 ci: send SIGKILL to kill kata components
Metrics tests sometimes fail with kata components still running.
Send SIGKILL and wait for the processes to be reaped.

Fixes #8651

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-04 18:58:17 +05:30
Fabiano Fidêncio
b10256a7ca runtime: Don't error out about SNP cert path on non SNP platforms
This error is specific to SNP platforms, so let's make sure we only
error this out when an SNP platform is used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-04 11:54:52 +02:00
Hui Zhu
447a7feccf runtime-rs: configuration-dragonball.toml.in: Add config for balloon
Add enable_balloon_f_reporting config to
configuration-dragonball.toml.in.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-04 17:25:38 +08:00
Hui Zhu
9c1b5238b3 kernel/configs: Add balloon and f_reporting to dragonball-experimental
Add CONFIG_PAGE_REPORTING, CONFIG_BALLOON_COMPACTION and
CONFIG_VIRTIO_BALLOON to dragonball-experimental configs to enable the
dragonball balloon function and the free page reporting function.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-04 17:25:30 +08:00
Hui Zhu
ad9968ce2d runtime-rs: Add enable_balloon_f_reporting for dragonball
Under normal circumstances, the virtual machine only requests memory
from the host and does not actively release it back to host when it is
no longer needed, leading to a waste of memory resources.

Free page reporting is a sub-feature of virtio-balloon. When this
feature is enabled, the Linux guest kernel will send information about
released pages to dragonball via virtio-balloon, and dragonball will
then release these pages.

This commit adds an option enable_balloon_f_reporting to runtime-rs.
When this option is enabled, runtime-rs will insert a virtio-balloon
device with the f_reporting option enabled during the Dragonball virtual
machine startup.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-04 16:38:13 +08:00
Fabiano Fidêncio
13517cf9c1 Merge pull request #10192 from fidencio/topic/helm-add-post-delete-job
helm: Several fixes, including some reasonable re-work on kata-deploy.sh script
2024-09-04 09:34:57 +02:00
Paul Meyer
3be719c805 virtcontainers: allow specifying nydus-overlayfs binary by path
...or by using a binary with additional suffix.
This allows having multiple versions of nydus-overlayfs installed on the
host, telling nydus-snapshotter which one to use while still detecting
Nydus is used.

Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
2024-09-04 08:29:40 +02:00
Chengyu Zhu
f0066568eb Merge pull request #10233 from ChengyuZhu6/cdh-instance
agent:cdh: Refactor CDHClient usage and initialization
2024-09-04 13:34:36 +08:00
Silenio Quarti
9e1388728e runtime: fix bad default machine_type for remote hypervisor
Fixes: #10249

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-09-03 20:53:19 -04:00
GabyCT
c2774b09dd Merge pull request #10247 from GabyCT/topic/removereportm
metrics: Remove metrics report for Kata Containers
2024-09-03 15:10:04 -06:00
Fabiano Fidêncio
bb9bcd886a kata-deploy: Add reset_cri_runtime()
This will help to avoid code duplication on what's needed on the helm
and non-helm cases.

The reason it's not been added as part of the commit which adds the
post-delete hook is simply for helping the reviewer (as the diff would
be less readable with this change).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
a773797594 ci: Pass --debug to helm
Just to make our lives a little bit easier.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
64ccb1645d helm: Add a post-delete hook
Instead of using a lifecycle.preStop hook, as done when we're using
the helm chart, let's add a post-delete hook to take care of
properly cleaning up the node when uninstalling kata-deploy.

The reason why the lifecycle.preStop hook would never work in our case is
simply because each helm chart operation follows the Kubernetes
"declarative" approach, meaning that an operation won't wait for its
previous operation to successfully finish before being called, leading
to us trying to access content that's defined by our RBAC, in an
operation that was started before our RBAC was deleted, but having the
RBAC being deleted before the operation actually started.

Unfortunately this hook brings in some code duplication, mainly related
to the RBAC parts, but that's not new as the same happens with our
daemonset.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-09-03 23:08:22 +02:00
Wainer dos Santos Moschetta
3b23d62635 tests/k8s: fix wait for pods on deploy-kata action
On commit 51690bc157 we switched the installation from kubectl to helm
and used its `--wait` expecting the execution would continue when all
kata-deploy Pods were Ready. It turns out that there is a limitation on
helm install that won't wait properly when the daemonset is made of a
single replica and maxUnavailable=1. In order to fix that issue, let's
revert the changes partially to keep using kubectl and waitForProcess
to hold the execution while Pods aren't Running.

Fixes #10168
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
40f8aae6db Reapply "ci: make cleanup_kata_deploy really simple"
This reverts commit 21f9f01e1d, as the
patches for helm are coming as part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
cfe6e4ae71 Reapply "ci: Use helm to deploy kata-deploy" (partially)
This reverts commit 36f4038a89, as the
patches for helm are coming as part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
Fabiano Fidêncio
424347bf0e Reapply "kata-deploy: Add Helm Chart" (partially)
This reverts commit b18c3dfce3, as the
patches for helm are coming as part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 23:08:22 +02:00
ChengyuZhu6
77521cc8d2 agent:cdh: introduce a function to check initialization of cdh client
Introduce a function to check initialization of the cdh client.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:52:50 +08:00
ChengyuZhu6
07e0e843e8 agent:cdh: switch to the new method for initializing cdh client
Decouple the cdh client from AgentService and refactor cdh client usage and initialization.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:51:55 +08:00
ChengyuZhu6
bc8156c3ae agent:cdh: Refactor cdh client methods for better integration
Move the `unseal_env` and `secure_mount` functions onto the global `CDH_CLIENT` instance to access the CDH client.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:51:54 +08:00
ChengyuZhu6
0ad35dc91b agent:cdh: Initialize CDH client as a global asynchronous instance
Introduced a global `CDH_CLIENT` instance to hold the cdh client and
implemented `init_cdh_client` function to initialize the cdh client if not already set.

Fixes: #10231

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:49:54 +08:00
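The "initialize if not already set" behaviour described in 0ad35dc91b can be sketched with a once-initialized static. The commit describes an asynchronous instance; the version below is a synchronous stand-in built on `std::sync::OnceLock` (Rust 1.70 or later). The `CdhClient` fields, the `init_cdh_client` signature, and the `is_cdh_client_initialized` helper are assumptions for illustration, not the agent's actual API.

```rust
use std::sync::OnceLock;

/// Hypothetical client type; the real one wraps a connection to the
/// confidential data hub.
#[derive(Debug)]
pub struct CdhClient {
    pub socket_uri: String,
}

/// Process-wide client, written at most once.
static CDH_CLIENT: OnceLock<CdhClient> = OnceLock::new();

/// Initialize the global client if it has not been set yet.
/// Later calls leave the existing client untouched and return it.
pub fn init_cdh_client(socket_uri: &str) -> &'static CdhClient {
    CDH_CLIENT.get_or_init(|| CdhClient {
        socket_uri: socket_uri.to_string(),
    })
}

/// Report whether the global client has been initialized
/// (a hypothetical name for the check added later in 77521cc8d2).
pub fn is_cdh_client_initialized() -> bool {
    CDH_CLIENT.get().is_some()
}
```

Keeping the client in a static removes the need to thread it through the service struct, at the cost of making the first caller's configuration win.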
Gabriela Cervantes
5b0ab7f17c metrics: Remove metrics report for Kata Containers
This PR removes the metrics report which is no longer being used
in Kata Containers.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-03 16:11:07 +00:00
Hyounggyu Choi
1cefa48047 gha: Add necessary steps for KBS enablement
The following steps are required for enabling KBS:

- Set environment variables `KBS` and `KBS_INGRESS`
- Uninstall and install `kbs-client`
- Deploy KBS

This commit adds the above steps to the existing workflow
for `qemu-coco-dev`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-03 16:26:12 +02:00
Hyounggyu Choi
b0a912b8b4 tests: Enable KBS deployment for qemu-coco-dev on s390x
To deploy KBS on s390x, the environment variable `IBM_SE_CREDS_DIR`
must be exported, and the corresponding directory must be created.

This commit enables KBS deployment for `qemu-coco-dev`, in addition
to the existing `qemu-se` support on the platform.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-03 15:51:18 +02:00
Fabiano Fidêncio
057612f18f Merge pull request #10238 from fidencio/topic/remove-stdio-test
ci: Remove stdio tests
2024-09-03 14:50:46 +02:00
ChengyuZhu6
0d519162b5 agent:storage: Refactor storage handler registration
- Added `driver_types` method to `StorageHandler` trait to return driver
  types managed by each handler.
- Implemented driver_types method for all storage handlers.
- Updated `STORAGE_HANDLERS` initialization to use `driver_types` for
  handler registration.

Fixes: #10242

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-03 18:38:52 +08:00
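A rough picture of the registration flow from 0d519162b5: each handler reports the driver types it manages, and the registry is built by iterating over them. The handler structs, the driver-type strings, and the `build_registry` function are illustrative assumptions. Only the `driver_types` method on a `StorageHandler` trait is taken from the commit message, and its exact signature here is a guess.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub trait StorageHandler: Send + Sync {
    /// Driver types this handler is responsible for.
    fn driver_types(&self) -> &[&str];
}

// Two hypothetical handlers; the strings are example driver types.
pub struct VirtioBlkHandler;
impl StorageHandler for VirtioBlkHandler {
    fn driver_types(&self) -> &[&str] {
        &["blk", "mmioblk"]
    }
}

pub struct ScsiHandler;
impl StorageHandler for ScsiHandler {
    fn driver_types(&self) -> &[&str] {
        &["scsi"]
    }
}

/// Build a driver-type -> handler map from what each handler reports,
/// so the caller never has to repeat the driver names.
pub fn build_registry(
    handlers: Vec<Arc<dyn StorageHandler>>,
) -> HashMap<String, Arc<dyn StorageHandler>> {
    let mut registry = HashMap::new();
    for handler in handlers {
        for driver in handler.driver_types() {
            registry.insert(driver.to_string(), Arc::clone(&handler));
        }
    }
    registry
}
```

The point of the design is that the list of driver types lives next to the handler that implements them, so adding a driver type is a change in one place.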
ChengyuZhu6
e47eb0d7d4 kata-types:mount: support registering multiple IDs to a single handler
- Updated the `add_handler` function in `StorageHandlerManager` to accept a slice of IDs (`&[&str]`) instead of a single ID (`&str`).
  This change allows a single handler to be registered for multiple storage device types.
- Refactored calls to `add_handler` in `Storage` of kata-agent to use the new function, passing arrays of storage drivers instead of single driver.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-03 18:38:36 +08:00
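The signature change in e47eb0d7d4 (a slice of IDs instead of a single ID) might look like the sketch below. The `Handler` trait, the `BlockHandler` type, the error type, and the duplicate-ID policy are assumptions; the commit message only states that `add_handler` now accepts `&[&str]` so one handler can serve several storage device types.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the agent's storage handler trait.
pub trait Handler: Send + Sync {
    fn name(&self) -> &'static str;
}

pub struct StorageHandlerManager {
    handlers: HashMap<String, Arc<dyn Handler>>,
}

impl StorageHandlerManager {
    pub fn new() -> Self {
        Self { handlers: HashMap::new() }
    }

    /// Register one handler under every ID in `ids`.
    /// If any ID is already taken, nothing is registered.
    pub fn add_handler(
        &mut self,
        ids: &[&str],
        handler: Arc<dyn Handler>,
    ) -> Result<(), String> {
        for id in ids {
            if self.handlers.contains_key(*id) {
                return Err(format!("handler for {} already exists", id));
            }
        }
        for id in ids {
            self.handlers.insert(id.to_string(), Arc::clone(&handler));
        }
        Ok(())
    }

    /// Look up the handler registered for `id`, if any.
    pub fn handler(&self, id: &str) -> Option<Arc<dyn Handler>> {
        self.handlers.get(id).cloned()
    }
}

/// Example handler that serves more than one driver type.
pub struct BlockHandler;
impl Handler for BlockHandler {
    fn name(&self) -> &'static str {
        "block"
    }
}
```

A call such as `add_handler(&["blk", "mmioblk"], Arc::new(BlockHandler))` then registers the same shared handler under both IDs.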
Fabiano Fidêncio
e8657c502d Revert "CI: Add tests for stdio"
This reverts commit 704da86e9b, as the
tests never became stable to run.

This was discussed and agreed with the maintainer.

 Conflicts:
	.github/workflows/basic-ci-amd64.yaml
	tests/integration/stdio/gha-run.sh

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 11:52:30 +02:00
Greg Kurz
4698235e59 Merge pull request #10204 from fidencio/topic/kata-deploy-add-installation-prefix
kata-deploy: helm: Add INSTALLATION_PREFIX
2024-09-03 09:26:51 +02:00
Fabiano Fidêncio
e1d3fb8c00 Merge pull request #10236 from fidencio/topic/bump-image-rs-to-properly-handle-gzip-whiteouts
agent: Update image-rs to 02af65abc
2024-09-02 21:43:19 +02:00
Fabiano Fidêncio
0cb93ed1bb kata-deploy: helm: Add INSTALLATION_PREFIX option
This will allow users to properly set the INSTALLATION_PREFIX when
deploying Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-02 20:25:22 +02:00
Gabriela Cervantes
c2aa288498 gha: Increase time to run Kata CoCo stability tests
This PR increases the time to run the Kata CoCo stability tests as
these tests are designed to run for more than 2 hours.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-02 16:40:47 +00:00
Gabriela Cervantes
825cb2d22e tests: Enable k8s soak stability test for Kata CoCo CI
This PR enables the k8s soak stability test to run on the weekly
Kata CoCo stability CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-02 16:30:44 +00:00
Fabiano Fidêncio
1309c49c09 agent: Update image-rs to 02af65abc
As this brings in proper support to handle gzip whiteouts.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-02 14:15:04 +02:00
Fabiano Fidêncio
7be77ebee5 kata-deploy: helm: Stop mounting /opt/kata
It's simply easier if we just use /host/opt/kata instead in our scripts,
which will greatly simplify the logic of adding an INSTALLATION_PREFIX
later on.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-02 09:38:51 +02:00
Fabiano Fidêncio
6ce5e62c48 kata-deploy: Add a $dest_dir var
As we build our binaries with the `/opt/kata` prefix, that's the value
of $dest_dir.

Later in this series it'll come in handy, as we'll introduce a way to
install the Kata Containers artefacts in a different location.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-02 09:36:33 +02:00
Fabiano Fidêncio
ef5a5ea26e Merge pull request #10038 from sprt/move-free-runner-iii
ci: Transition GARM tests to free runners, pt. III
2024-08-31 01:29:08 +02:00
Gabriela Cervantes
19d8f11345 versions: Update firecracker version to 1.8.0
This PR updates the firecracker version to 1.8.0 which includes the
following changes:
- Added ACPI support to Firecracker for x86_64 microVMs. Currently, we pass ACPI tables with information about the available vCPUs, interrupt controllers, VirtIO and legacy x86 devices to the guest. This allows booting kernels without MPTable support. Please see our kernel policy documentation for more information regarding relevant kernel configurations.
- Added support for the Virtual Machine Generation Identifier (VMGenID) device on x86_64 platforms. VMGenID is a virtual device that allows VMMs to notify guests when they are resumed from a snapshot. Linux includes VMGenID support since version 5.18. It uses notifications from the device to reseed its internal CSPRNG. Please refer to snapshot support and random for clones documentation for more info on VMGenID. VMGenID state is part of the snapshot format of Firecracker. As a result, Firecracker snapshot version is now 2.0.0.
- Changed T2CL template to pass through bit 27 and 28 of MSR_IA32_ARCH_CAPABILITIES (RFDS_NO and RFDS_CLEAR) since KVM considers that they can be passed through and T2CL isn't designed for secure snapshot migration between different processors.
- Avoid setting kvm_immediate_exit to 1 if we are already handling an exit, or if the vCPU is stopped. This avoids a spurious KVM exit upon restoring snapshots.
- Changed T2S template to set bit 27 of MSR_IA32_ARCH_CAPABILITIES (RFDS_NO) to 1 since it assumes that the fleet only consists of processors that are not affected by RFDS.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-30 20:49:29 +00:00
Aurélien Bombo
886b3047ac Merge pull request #10222 from microsoft/danmihai1/log-level-false-positives
agent: avoid policy.txt log without debug enabled
2024-08-30 10:09:04 -07:00
Alex Lyn
4fd4b02f2e Merge pull request #10228 from GabyCT/topic/removeionednn
metrics: Remove unused variable in oneDNN benchmark
2024-08-30 09:31:14 +08:00
Gabriela Cervantes
aa8635727d metrics: Remove unused variable in oneDNN benchmark
This PR removes an unused variable in oneDNN metrics benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-29 15:52:47 +00:00
Alex Lyn
8241423ba5 Merge pull request #10224 from amshinde/update-image-rs-xattr
agent: image-rs: check xattrs for image unpacking
2024-08-29 09:33:22 +08:00
GabyCT
dd9f41547c Merge pull request #10160 from microsoft/saulparedes/support_priority_class
genpolicy: add priorityClassName as a field in PodSpec interface
2024-08-28 14:36:20 -06:00
GabyCT
394480e7ff Merge pull request #10221 from GabyCT/topic/addopendmmread
docs: Add oneDNN benchmark information to metrics README
2024-08-28 14:22:22 -06:00
GabyCT
83b031ca7a Merge pull request #10214 from GabyCT/topic/ciweekly
gha: Add GHA workflow to run Kata CoCo stability tests
2024-08-28 11:46:29 -06:00
Archana Shinde
c747852bce agent: image-rs: check xattrs for image unpacking
This commit includes a fix for pulling an image on platforms that do not
support xattr.

Some platforms/file-systems do not support xattrs, this would make the
image pull fail because of failing to set xattr. This commit will check
whether the target path supports xattr. If yes, the unpacking will
maintain xattrs; if not, it will not set xattrs.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-08-28 00:02:46 -07:00
Archana Choudhary
ae2cdedba8 genpolicy: add priorityClassName as a field in PodSpec interface
This allows generation of policy for pods specifying priority classes.

Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-08-27 19:54:02 -07:00
Dan Mihai
aa8bdbde5a agent: avoid policy.txt log without debug enabled
slog's is_enabled() is documented as:
- "best effort", and
- Sometimes resulting in false positives.

Use AGENT_CONFIG.log_level.as_usize() instead, to avoid those false
positives.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-28 02:33:56 +00:00
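The change in aa8bdbde5a replaces a best-effort "is this level enabled?" query with a direct comparison against the configured level. A minimal sketch of that comparison follows. The local `Level` enum stands in for the logger's level type, its numeric values are illustrative, and `should_log_policy` is a hypothetical name.

```rust
/// Stand-in for a logger level type; more verbose levels get larger values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    Critical = 1,
    Error,
    Warning,
    Info,
    Debug,
    Trace,
}

impl Level {
    /// Numeric form of the level, used for ordering comparisons.
    pub fn as_usize(self) -> usize {
        self as usize
    }
}

/// Decide from the configured level alone whether debug-only output
/// (such as a policy dump) should be written.
pub fn should_log_policy(configured: Level) -> bool {
    configured.as_usize() >= Level::Debug.as_usize()
}
```

Because the decision depends only on the configured value, it cannot report "enabled" for a level the user never asked for.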
Aurélien Bombo
de98e467b4 ci: Use ubuntu-22.04 instead of ubuntu-latest
22.04 is the default today:
23da668261/README.md

Being more specific will avoid unexpected errors when Github updates the
default.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:44:39 +00:00
Aurélien Bombo
ceab66b1ce ci: Run build-checks-depending-on-kvm for free
Also keeps the Rust installation step even though it's preinstalled, so that we
use the version specified in versions.yaml.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:43:59 +00:00
Aurélien Bombo
b4ce84b9d2 ci: Move run-runk to free runner
No change other than switching the runner - no dependency issue
expected.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:43:33 +00:00
Aurélien Bombo
645aaa6f7f ci: Move run-monitor to free runner
No change other than switching the runner - no
dependency issue expected.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:43:33 +00:00
Gabriela Cervantes
3affde5b28 docs: Add oneDNN benchmark information to metrics README
This PR adds the oneDNN benchmark information to the machine
learning metrics README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-27 16:32:50 +00:00
Dan Mihai
9f6f5dac4b Merge pull request #10037 from sprt/reinstate-mariner-host
ci: reinstate Mariner host and guest kernel
2024-08-27 08:24:51 -07:00
Alex Lyn
f24983b3cf Merge pull request #10210 from l8huang/cold-vf
runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint
2024-08-27 15:23:55 +08:00
Alex Lyn
3a749cfb44 Merge pull request #10212 from squarti/remote-machine-type
runtime: Allow machine_type in kata config for remote hypervisors
2024-08-27 14:05:36 +08:00
Aurélien Bombo
a3dba3e82b ci: reinstate Mariner host
GH-9592 addressed a bug in a previous version of the AKS Mariner host
kernel that blocked the CH v39 upgrade. This bug has now been fixed so
we undo that PR.

Note we also specify a different OCI version for Mariner as it differs
from Ubuntu's.

Fixes: #9594

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-26 21:07:25 +00:00
Gabriela Cervantes
3a14b04621 gha: Fix entry for ci coco stability yaml
This PR fixes the entry for the weekly CI GHA workflow so that
the weekly k8s tests run properly.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-26 17:14:35 +00:00
Gabriela Cervantes
95f6246858 gha: Add GHA workflow to run Kata CoCo stability tests
This PR adds a GHA workflow to run Kata CoCo weekly stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-26 17:05:21 +00:00
Silenio Quarti
11ba8f05ca runtime: Allow machine_type in kata config for remote hypervisors
Fixes: #10211

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-08-26 10:17:40 -04:00
Lei Huang
70168a467d runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint
PhysicalEndpoint unbinds its VF interface and rebinds it as a VFIO device,
then cold-plugs the VFIO device into the guest kernel.

When `cold_plug_vfio` is set to "no-port", cold-plugging the VFIO device
will fail.

This change checks if `cold_plug_vfio` is enabled before creating PhysicalEndpoint
to avoid unnecessary VFIO rebind operations.

Fixes: #10162

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-08-23 15:42:17 -07:00
GabyCT
6b0272d6bf Merge pull request #10193 from GabyCT/topic/k8ssoak
stability: Add kubernetes parallel test
2024-08-23 15:51:01 -06:00
GabyCT
83177efb9b Merge pull request #10201 from GabyCT/topic/readmeopenvino
metrics: Add OpenVINO general information into README
2024-08-23 14:11:26 -06:00
Bo Chen
a0bd78b358 Merge pull request #10205 from likebreath/0819/upgrade_clh_v41.0
Upgrade to Cloud Hypervisor v41.0
2024-08-23 10:01:41 -07:00
Hyounggyu Choi
169b4490d2 Merge pull request #10209 from fidencio/topic/kata-manager-avoid-rate-pull-limit
kata-manager: Avoid docker rate-limit
2024-08-23 12:52:14 +02:00
Fabiano Fidêncio
7f0289de60 kata-manager: Avoid docker rate-limit
To do so, use a test image from quay.io instead of docker.io.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-23 11:56:09 +02:00
Fabiano Fidêncio
45f69373a6 Merge pull request #10199 from BbolroC/make-cdh-api-timeout-configurable
agent/config: Make CDH_API_TIMEOUT configurable
2024-08-23 11:04:10 +02:00
Hyounggyu Choi
4cd83d2b98 Merge pull request #10202 from BbolroC/fix-k8s-tests-s390x
tests: Fix k8s test issues on s390x
2024-08-23 09:51:11 +02:00
Fabiano Fidêncio
11bb9231c2 Merge pull request #10207 from amshinde/remove-image-check-cc
Revert "tests: add image check before running coco tests"
2024-08-23 09:33:39 +02:00
Alex Lyn
44bf7ccb46 Merge pull request #10141 from soulfy/fix-delete-failed
agent: kill child process when console socket closed
2024-08-23 14:00:53 +08:00
Archana Shinde
b0be03a93f Revert "tests: add image check before running coco tests"
This reverts commit 41b7577f08.

We were seeing a lot of issues in the TDX CI of the nature:

"Error: failed to create containerd container: create instance
470: object with key "470" already exists: unknown"

With the TDX CI, we moved to having the nydus snapshotter pre-installed.
Essentially the `deploy-snapshotter` step was performed once before any
actual CI runs.
We were seeing failures related to the error message above.

On reverting this change, we are no longer seeing errors related to
"key exists" with the TDX CI passing now.

The change reverted here is related to downloading incomplete images, but this
seems to be messing up TDX CI.
It is possible to pass --snapshotter to `ctr image check` but that does
not seem to have any effect on the data set returned.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-08-22 18:05:42 -07:00
Bo Chen
254f8bca74 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v41.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #10203

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-08-22 11:05:54 -07:00
Bo Chen
e69535326d versions: Upgrade to Cloud Hypervisor v41.0
Details of this release can be found in our roadmap project as iteration
v41.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #10203

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-08-22 11:02:26 -07:00
Gabriela Cervantes
2fa8e85439 metrics: Add OpenVINO general information into README
This PR adds the OpenVINO benchmark general information into the
machine learning README metrics information.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-22 16:08:06 +00:00
Hyounggyu Choi
274de8c6af tests: Introduce wait_time to k8s_create_pod()
In certain environments (e.g., those with lower performance), `k8s_create_pod()`
may require additional wait time, especially when dealing with large images.
Since `k8s_wait_pod_be_ready()` — which is called by `k8s_create_pod()` — already
accepts `wait_time` as a second argument, it makes sense to introduce `wait_time`
to `k8s_create_pod()` and propagate it to the callee.

This commit adds `wait_time` to `k8s_create_pod()` as the 2nd (optional) argument.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 17:46:53 +02:00
Hyounggyu Choi
5d7397cc69 tests: Load confidential_kbs.sh in k8s-guest-pull-image.bats
Some of the tests call set_metadata_annotation() for updating the kernel
parameters. For `kata-qemu-se`, repack_secure_image() is called which is
defined in `lib_se.sh` and sourced by `confidential_kbs.sh`.

This commit ensures that the function call chain for the relevant
`KATA_HYPERVISOR` is properly handled.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 17:33:38 +02:00
Fabiano Fidêncio
890fa26767 Merge pull request #10196 from fidencio/topic/ci-commit-message-take-reapply-into-consideration
ci: commit-message-check: Take re-revert into consideration
2024-08-22 17:31:27 +02:00
Fabiano Fidêncio
2f6edc4b9b Merge pull request #10194 from fidencio/topic/kata-deploy-re-work-logic
kata-deploy: Rework the logic a little bit
2024-08-22 16:46:36 +02:00
Hyounggyu Choi
baa8af3f8e doc: Update how-to-set-sandbox-config-kata.md
This commit adds a row for `cdh_api_timeout` to the agent options in
how-to-set-sandbox-config-kata.md.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 14:50:51 +02:00
Hyounggyu Choi
7d0aba1a24 runtime: Enable to get cdh_api_timeout from configuration file
This commit allows `cdh_api_timeout` to be configured from the configuration file.
The entry is commented out and shows the default value (50s), because
the default value itself is configured in the agent.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 14:47:37 +02:00
Hyounggyu Choi
8615516823 agent: Add agent.cdh_api_timeout to README
This commit adds an explanation for `cdh_api_timeout` to the README file.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 14:47:37 +02:00
Fabiano Fidêncio
a9a1345a31 kata-deploy: Print the action the script was invoked with
This increases debuggability.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-22 14:32:33 +02:00
Fabiano Fidêncio
ab493b6028 kata-deploy: Move general logic to the correct actions
Otherwise we may end up running into unexpected issues when calling the
cleanup option, as the same checks would be done, and files could end up
being copied again, overwriting the original content which was backed
up by the install option.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-22 14:32:29 +02:00
Fabiano Fidêncio
6596012956 kata-deploy: Simplify check for runtime
Let's write the runtime check in a shorter and simpler to read form.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-22 14:32:02 +02:00
Hyounggyu Choi
2512ddeab2 agent/cdh: Use AGENT_CONFIG.cdh_api_timeout for CDH_API_TIMEOUT
This commit updates CDH_API_TIMEOUT to use AGENT_CONFIG.cdh_api_timeout
and changes it from a `const` to `lazy_static` to accommodate runtime-determined values.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 10:09:16 +02:00
Hyounggyu Choi
6139e253a0 agent/config: Add cdh_api_timeout to AgentConfig
To make the `cdh_api_timeout` variable configurable, it has been added to
the `AgentConfig` structure.
This change includes storing the variable as a `time::Duration` type and
generalizing the existing `hotplug_timeout` code to handle both timeouts.
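
A minimal Python sketch of the generalization described above, not the
agent's Rust code: one helper parses both timeouts from kernel-style
parameters into duration values. The `agent.hotplug_timeout` spelling, the
3-second hotplug default, and the function name are assumptions made for
illustration; only the `agent.cdh_api_timeout` name and its 50s default
appear in this log.

```python
from datetime import timedelta

# Assumed defaults: only the 50 s CDH value is stated in this log.
DEFAULT_TIMEOUTS = {
    "agent.hotplug_timeout": timedelta(seconds=3),
    "agent.cdh_api_timeout": timedelta(seconds=50),
}


def parse_timeouts(cmdline: str) -> dict[str, timedelta]:
    """Parse every known `<name>=<seconds>` token through one code path."""
    timeouts = dict(DEFAULT_TIMEOUTS)
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if not sep or name not in timeouts:
            continue  # not one of the timeout parameters
        seconds = int(value)  # raises ValueError for non-numeric input
        if seconds <= 0:
            raise ValueError(f"{name} must be a positive number of seconds")
        timeouts[name] = timedelta(seconds=seconds)
    return timeouts
```

Keeping the names in one table means a further timeout only needs a new
entry rather than a second parsing branch.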

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 10:09:16 +02:00
GabyCT
3fd108b09a Merge pull request #10198 from GabyCT/topic/remvaropenvino
metrics: Remove unused variable in openvino script
2024-08-21 15:48:56 -06:00
Dan Mihai
8ccc8a8d0b Merge pull request #9911 from microsoft/saulparedes/mounts
genpolicy: deny UpdateEphemeralMountsRequest
2024-08-21 10:12:28 -07:00
Gabriela Cervantes
59e31baaee metrics: Remove unused variable in openvino script
This PR removes an unused variable in the openvino script for kata
metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-21 16:05:55 +00:00
Greg Kurz
09a13da8ec Merge pull request #10197 from beraldoleal/release-3.8
release: Bump VERSION to 3.8.0
2024-08-21 17:50:10 +02:00
Beraldo Leal
55bdb380fb release: Bump VERSION to 3.8.0
Let's start the 3.8.0 release.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-08-21 10:24:07 -04:00
Gabriela Cervantes
27d5539954 stability: Add pod deployment yaml for soak test
This PR adds the pod deployment yaml for soak test which is part
of the stability k8s tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-21 14:23:22 +00:00
Fabiano Fidêncio
3fd021a9b3 ci: commit-message-check: Take re-revert into consideration
`Reapply "` should be taken into consideration as well.
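
A minimal Python sketch of the idea, assuming the checker skips its usual
subject-format rules for reverted and re-applied commits. The function name
and the exact exemption logic are hypothetical; the real check is not shown
in this log.

```python
# Subject prefixes treated as exempt in this sketch.
EXEMPT_PREFIXES = ('Revert "', 'Reapply "')


def is_exempt_subject(subject: str) -> bool:
    """Return True when the subject is a revert or a re-applied revert."""
    # str.startswith accepts a tuple and matches any of its entries.
    return subject.startswith(EXEMPT_PREFIXES)
```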

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 14:19:16 +02:00
Fabiano Fidêncio
f071c8cada Merge pull request #10191 from fidencio/topic/ci-temporarily-revert-helm-usage
ci: Let's temporarily revert the helm charts usage in our CI
2024-08-21 10:52:23 +02:00
Dan Mihai
6654491cc3 genpolicy: deny UpdateEphemeralMountsRequest
Deny UpdateEphemeralMountsRequest by default, because paths to
critical Guest components can be redirected using such request.

Signed-off-by: Dan Mihai <Daniel.Mihai@microsoft.com>
2024-08-20 18:28:17 -07:00
Gabriela Cervantes
c04a805215 stability: Add kubernetes parallel test
This PR adds a kubernetes parallel test that will launch multiple replicas
from a kubernetes deployment and we will iterate this multiple times to
verify that we are able to do this using CoCo Kata. This test will be
part of the CoCo Kata stability CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-20 23:24:22 +00:00
Fabiano Fidêncio
b18c3dfce3 Revert "kata-deploy: Add Helm Chart" (partially)
This partially reverts commit 94b3348d3c,
as there's more work needed in order to have this one done in a robust
way, and we are taking the safer path of reverting for now, and adding
it back as soon as the release is cut out.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 00:09:11 +02:00
Fabiano Fidêncio
36f4038a89 Revert "ci: Use helm to deploy kata-deploy" (partially)
This partially reverts commit 51690bc157,
as there's more work needed in order to have this one done in a robust
way, and we are taking the safer path of reverting for now, and adding
it back as soon as the release is cut out.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 00:09:11 +02:00
Fabiano Fidêncio
21f9f01e1d Revert "ci: make cleanup_kata_deploy really simple"
This reverts commit 1221ab73f9, as there's
more work needed in order to have this one done in a robust way, and we
are taking the safer path of reverting for now, and adding it back as
soon as the release is cut out.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 00:09:11 +02:00
GabyCT
e0bff7ed14 Merge pull request #10177 from GabyCT/topic/cocoghas
gha: Add k8s stability Kata CoCo GHA workflow
2024-08-20 15:12:29 -06:00
Gabriela Cervantes
ca3d778479 gha: Add Kata CoCo Stability workflow
This PR adds the Kata CoCo Stability workflow that will setup the
environment to run the k8s tests on a non-tee environment.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-20 16:34:33 +00:00
Gabriela Cervantes
3ebaa5d215 gha: Add Kata CoCo stability weekly yaml
This PR adds the Kata CoCo stability weekly yaml that will trigger
weekly the k8s stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-20 16:32:03 +00:00
Fabiano Fidêncio
aeb6f54979 Merge pull request #10180 from fidencio/topic/ci-ensure-the-key-was-created-on-kbs
ci: Ensure the KBS resources are created
2024-08-20 09:07:56 +02:00
Fabiano Fidêncio
40d385d401 Merge pull request #10188 from wainersm/kbs_key
tests/k8s: check and save kbs.key
2024-08-19 23:29:10 +02:00
Fabiano Fidêncio
c0d7222194 ci: Ensure the KBS resources are created
Otherwise we may have tests failing due to the resource not being
created yet.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-19 23:27:06 +02:00
Wainer dos Santos Moschetta
e014eee4e8 tests/k8s: check and save kbs.key
The deploy-kbs.sh script generates the kbs.key that's used to install
KBS. This same file is used later by kbs-client to authenticate. This change
checks that the file was created and fails otherwise.

Another problem solved here is that on bare-metal machines the key doesn't survive
a reboot as it is created in a temporary directory (/tmp/trustee). So let's save
the file to a non-temporary location.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-19 16:03:03 -03:00
Wainer Moschetta
6a982930e2 Merge pull request #10183 from fidencio/topic/kata-deploy-use-runtime_path
kata-deploy: Stop symlinking into /usr/local/bin
2024-08-19 13:17:21 -03:00
Fabiano Fidêncio
42d48efcc2 Merge pull request #10181 from fidencio/topic/ci-fix-stdio-typo
ci: stdio: Fix typo on getting the containerd version
2024-08-18 16:05:42 +02:00
Fabiano Fidêncio
e0ae398a2e Merge pull request #10151 from squarti/rootdir2
runtime: Files are not synced between host and guest VMs
2024-08-18 12:32:52 +02:00
Fabiano Fidêncio
d03b72f19b kata-deploy: Stop linking binaries to /usr/local/bin
Neither CRI-O nor containerd requires that, and removing such symlinks
makes everything less intrusive from our side.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-18 01:25:12 +02:00
Fabiano Fidêncio
c2393dc467 kata-deploy: Use shim's absolute path for crio's runtime_path
This will allow us, in the future, to avoid creating symlinks here and
there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-18 01:25:12 +02:00
Fabiano Fidêncio
58623723b1 kata-deploy: Use runtime_path for containerd
It's already being used with CRI-O, let's simplify what we do and also
use this for containerd, which will allow us to do further cleanups in
the coming patches.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-18 01:25:12 +02:00
Fabiano Fidêncio
e75c149dec ci: stdio: Properly start running the test
"gha-run.sh" requires a `run` argument in order to run the tests, which
seems to be forgotten when the test was added.

This PR needs to get merged before the test can successfully run.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-17 14:41:44 +02:00
Fabiano Fidêncio
dd2d9e5524 ci: stdio: Fix typo on getting the containerd version
I assume the PR that introduced this was based on an older version of
yq, and as the test couldn't run before it got merged we never noticed
the error.

However, this test has been failing for a reasonable amount of time,
which makes me think that we either need a maintainer for it, or just
remove it completely, but that's a discussion for another day.

For now, let's make it, at least, run.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-17 14:06:24 +02:00
Fabiano Fidêncio
7113490cb1 Merge pull request #10179 from fidencio/topic/switch-nginx-image
ci: k8s: Replace nginx alpine images
2024-08-17 13:07:31 +02:00
Fabiano Fidêncio
0831081399 ci: k8s: Replace nginx alpine images
The previous ones are gone, so let's switch to our own multi-arch image
for the tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-17 12:19:33 +02:00
Fabiano Fidêncio
a78d82f4f1 Merge pull request #10159 from squarti/main
agent: Handle EINVAL error when umounting container rootfs
2024-08-16 22:07:50 +02:00
Dan Mihai
79c1d0a806 Merge pull request #10136 from microsoft/danmihai1/docker-image-volume2
genpolicy: add bind mounts for image volumes
2024-08-16 13:07:01 -07:00
Fabiano Fidêncio
28aa4314ba Merge pull request #10175 from ChengyuZhu6/error_message
runtime: Add specific error message for gRPC request timeouts
2024-08-16 22:06:49 +02:00
Fabiano Fidêncio
720edbe3fc Merge pull request #10174 from ChengyuZhu6/install_script
tools: install luks-encrypt-storage script by guest-components
2024-08-16 22:04:56 +02:00
Fabiano Fidêncio
7b5da45059 Merge pull request #10178 from fidencio/topic/revert-trustee-bump
Revert "version: bump trustee version"
2024-08-16 21:48:30 +02:00
Gabriela Cervantes
6ea34f13e1 gha: Add k8s stability Kata CoCo GHA workflow
This PR adds the k8s stability Kata CoCo GHA workflow to run weekly
the k8s stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-16 16:14:15 +00:00
Fabiano Fidêncio
45f43e2a6a Revert "version: bump trustee version"
This reverts commit d35320472c.

Although the commit in question does solve an issue related to the usage
of busybox from docker.io, as it's reasonably easy to hit the rate
limit, the commit also brings in functionalities that are causing issues
in, at least, the TDX CI, such as:
```sh
[2024-08-16T16:03:52Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 401 259 "-" "attestation-agent-kbs-client/0.1.0" 0.065266
[2024-08-16T16:03:53Z INFO  kbs::http::attest] Auth API called.
[2024-08-16T16:03:53Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000169
[2024-08-16T16:03:54Z INFO  kbs::http::attest] Attest API called.
[2024-08-16T16:03:54Z INFO  verifier::tdx] Quote DCAP check succeeded.
[2024-08-16T16:03:54Z INFO  verifier::tdx] MRCONFIGID check succeeded.
[2024-08-16T16:03:54Z INFO  verifier::tdx] CCEL integrity check succeeded.
[2024-08-16T16:03:54Z ERROR kbs::http::error] Attestation failed: Verifier evaluate failed: TDX Verifier: failed to parse AA Eventlog from evidence

    Caused by:
        at least one line should be included in AAEL
```

Let's revert this for now, and then once we get this one fixed on
trustee side we'll update again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-16 18:10:38 +02:00
Dan Mihai
c22ac4f72c genpolicy: add bind mounts for image volumes
Add bind mounts for volumes defined by docker container images, unless
those mounts have been defined in the input K8s YAML file too.

For example, quay.io/opstree/redis defines two mounts:
/data
/node-conf
Before these changes, if these mounts were not defined in the YAML file
too, the auto-generated policy did not allow this container image to
start.
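
The rule above can be sketched in a few lines of Python. This only
illustrates the selection logic; the function name and the list-of-paths
representation are assumptions, not genpolicy's actual data model.

```python
def image_volume_mounts(image_volumes: list[str], yaml_mount_paths: list[str]) -> list[str]:
    """Return image-defined volume paths that still need a bind mount.

    Paths already declared in the K8s YAML are skipped, so each path is
    covered exactly once.
    """
    declared = set(yaml_mount_paths)
    return [path for path in image_volumes if path not in declared]
```

With the two volumes from the example above and a YAML file that declares
only `/data`, the sketch returns just `/node-conf`.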

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-16 15:11:05 +00:00
Fabiano Fidêncio
b203f715e5 Merge pull request #10170 from beraldoleal/deploy-reset-fix
kata-deploy: fix kata-deploy reset
2024-08-16 16:51:14 +02:00
Fabiano Fidêncio
8d63723910 Merge pull request #10161 from microsoft/saulparedes/ignore_role_resource
genpolicy: ignore Role resource
2024-08-16 16:50:16 +02:00
Fabiano Fidêncio
6c58ae5b95 Merge pull request #10171 from fidencio/topic/ci-treat-nydus-snapshotter-as-a-dep
ci: nydus: Treat the snapshotter as a dependency
2024-08-16 16:39:48 +02:00
ChengyuZhu6
1eda6b7237 tests: update error message with guest pulling image timeout
update error message with guest pulling image timeout.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-16 20:26:33 +08:00
ChengyuZhu6
ca05aca548 runtime: Add specific error message for gRPC request timeouts
Improved error handling to provide clearer feedback on request failures.

For example:
Improve createcontainer request timeout error message from
"Error: failed to create containerd task: failed to create shim task:context deadline exceed"
to "Error: failed to create containerd task: failed to create shim task: CreateContainerRequest timed out: context deadline exceed".
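
A small Python sketch of the same idea, not the runtime's Go code: when a
request fails because of a timeout, the request name is prepended so the
message says which call timed out. The function name and the use of
`TimeoutError` as the timeout signal are assumptions for illustration.

```python
def describe_failure(request_name: str, err: Exception) -> str:
    """Build an error message that names the request on timeouts."""
    if isinstance(err, TimeoutError):
        return f"{request_name} timed out: {err}"
    # Other failures are passed through unchanged.
    return str(err)
```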

Fixes: #10173 -- part II

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-16 20:24:48 +08:00
Beraldo Leal
b3a4cd1a06 Merge pull request #10172 from deagon/fix-typo
osbuilder: fix typo in ubuntu rootfs depends
2024-08-16 08:01:59 -04:00
Beraldo Leal
b843b236e4 kata-deploy: improve kata-deploy script
For the rare cases where containerd_conf_file does not exist, cp could fail
and leave the pod in Error state. Let's make it a little bit more robust.
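
The guard can be sketched as follows in Python (the script itself is shell).
The function name is hypothetical, and returning `False` for a missing file
is an assumption about what the more robust behaviour looks like.

```python
import os
import shutil


def backup_if_present(conf_file: str, backup_file: str) -> bool:
    """Copy conf_file to backup_file only when conf_file exists."""
    if not os.path.isfile(conf_file):
        # Nothing to back up: skip the copy instead of failing.
        return False
    shutil.copy2(conf_file, backup_file)
    return True
```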

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-08-16 07:52:38 -04:00
ChengyuZhu6
aa31a9d3c4 tools: install luks-encrypt-storage script by guest-components
Install the luks-encrypt-storage script via guest-components, so that we can maintain a single source and prevent synchronization issues.

Fixes: #10173 -- part I

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-16 16:28:20 +08:00
Chengyu Zhu
ba3c484d12 Merge pull request #9999 from ChengyuZhu6/trusted-storage
Trusted image storage
2024-08-16 15:39:50 +08:00
Fabiano Fidêncio
0f3eb2451e Merge pull request #10169 from fidencio/topic/revert-reset_runtime-to-cleanup
Revert "ci: add reset_runtime to cleanup"
2024-08-16 07:29:58 +02:00
Aurélien Bombo
e1775e4719 Merge pull request #10164 from BbolroC/make-exec_host-stable
tests: Ensure exec_host() consistently captures command output
2024-08-15 21:43:32 -07:00
Guoqiang Ding
1d21ff9864 osbuilder: fix typo in ubuntu rootfs depends
Remove the duplicate package "xz-utils".

Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2024-08-16 11:33:55 +08:00
Silenio Quarti
5d815ffde1 runtime: Files are not synced between host and guest VMs
This PR resolves the default kubelet root dir symbolic link and
uses it as the absolute path for the fs watcher regexs
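
A minimal Python sketch of the approach, not the runtime's Go code: the
configured root is resolved through any symlinks first, and the watcher
pattern is then anchored on the resolved path. The `/pods/<uid>/volumes/`
suffix and the function name are assumptions for illustration.

```python
import os
import re


def watch_pattern(kubelet_root: str) -> re.Pattern:
    """Build a watcher regex anchored on the symlink-resolved root dir."""
    resolved = os.path.realpath(kubelet_root)  # follows symbolic links
    # re.escape keeps characters such as '.' in the path literal.
    return re.compile("^" + re.escape(resolved) + r"/pods/[^/]+/volumes/.+$")
```

If the pattern were built from the symlink itself, paths reported under the
real directory would never match.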

Fixes: https://github.com/kata-containers/kata-containers/issues/9986

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-08-15 23:19:08 -04:00
Silenio Quarti
0dd16e6b25 agent: Handle EINVAL error when umounting container rootfs
Container/Sandbox clean up should not fail if root FS is not mounted.
This PR handles EINVAL errors when umount2 is called.
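
A minimal Python sketch of that error handling, not the agent's Rust code.
The unmount call is passed in as a callable (hypothetical here) so the logic
can be shown without root privileges; `umount2(2)` reports `EINVAL` when,
among other cases, the target is not a mount point.

```python
import errno
from typing import Callable


def cleanup_rootfs(umount: Callable[[str], None], rootfs: str) -> None:
    """Unmount rootfs, tolerating the case where it is not mounted."""
    try:
        umount(rootfs)
    except OSError as err:
        if err.errno != errno.EINVAL:
            raise  # real failures (EBUSY, EPERM, ...) still propagate
        # EINVAL: nothing was mounted, so cleanup has nothing left to do.
```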

Fixes: #10166

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-08-15 19:41:46 -04:00
Fabiano Fidêncio
3733266a60 ci: nydus: Treat the snapshotter as a dependency
Instead of deploying and removing the snapshotter on every single run,
let's make sure the snapshotter is always deployed in the TDX case.

We're doing this as an experiment, in order to see if we'll be able to
reduce the failures we've been facing with the nydus snapshotter.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-15 22:44:30 +02:00
Hyounggyu Choi
ba3e5f6b4a Revert "tests: Disable k8s file volume test"
This reverts commit e580e29246.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-15 21:10:39 +02:00
Hyounggyu Choi
758e650a28 tests: Ensure exec_host() consistently captures command output
The `exec_host()` function often fails to capture the output of a given command
because the node debugger pod is prematurely terminated. To address this issue,
the function has been refactored to ensure consistent output capture by adjusting
the `kubectl debug` process as follows:

- Keep the node debugger pod running
- Wait until the pod is fully ready
- Execute the command using `kubectl exec`
- Capture the output and terminate the pod

This commit refactors `exec_host()` to implement the above steps, improving its reliability.

Fixes: #10081

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-15 21:10:39 +02:00
Beraldo Leal
74662a0721 Merge pull request #10137 from hex2dec/fix-image-warning
tools: Fix container image build warning
2024-08-15 14:45:41 -04:00
Dan Mihai
905c76bd47 Merge pull request #10153 from microsoft/saulparedes/support_cron_job
genpolicy: Add support for cron jobs
2024-08-15 11:11:00 -07:00
Aurélien Bombo
0223eedda5 Merge pull request #10050 from burgerdev/request-hardening
genpolicy: hardening some agent requests
2024-08-15 08:31:21 -07:00
Fabiano Fidêncio
1f6a8baaf1 Revert "ci: add reset_runtime to cleanup"
This reverts commit 8d9bec2e01, as it
causes issues in the operator and kata-deploy itself, leading to the
node to be NotReady.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-15 16:09:34 +02:00
ChengyuZhu6
5f4209e008 agent:README: add secure_image_storage_integrity to agent's README
add secure_image_storage_integrity to agent's README.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 20:32:44 +08:00
ChengyuZhu6
6ecb2b8870 tests: skip test trusted storage in qemu-coco-dev
I can't set up a loop device with `exec_host`, and that command is
necessary for qemu-coco-dev. See issue #10133.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 20:32:44 +08:00
ChengyuZhu6
51b9d20d55 tests: update error message in pulling image encrypted tests
Update error message in pulling image encrypted to "failed to get decrypt key no suitable key found for decrypting layer key".

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 20:32:44 +08:00
ChengyuZhu6
b4d10e7655 version: update the version of coco-guest-components
update the version of coco-guest-components.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 20:32:43 +08:00
Fupan Li
365df81d5e Merge pull request #10148 from lifupan/main_sandboxapi
runtime-rs: Add the wait_vm support for hypervisors
2024-08-15 17:08:38 +08:00
ChengyuZhu6
a9b436f788 agent:cdh: Introduces secure_mount API in cdh
Introduces `secure_mount` API in the cdh. It includes:

- Adding the `SecureMountServiceClient`.
- Implementing the `secure_mount` function to handle secure mounting requests.
- Updating the confidential_data_hub.proto file to define SecureMountRequest and SecureMountResponse messages
  and adding the SecureMountService service.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:23 +08:00
ChengyuZhu6
1528d543b2 agent:cdh: Rename sealed_secret API namespace to confidential_data_hub
renames the sealed_secret.proto file to confidential_data_hub.proto and
updates the corresponding API namespace from sealed_secret to confidential_data_hub.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:23 +08:00
ChengyuZhu6
37bd2406e0 docs: add content about how to pull large image
Add content about how to pull large images in the guest with trusted
storage.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:22 +08:00
ChengyuZhu6
c5a973e68c tests:k8s: add tests for guest pull with configured timeout
add tests for guest pull with configured timeout:
1) failed case: Test we cannot pull a large image whose pull time exceeds a short createcontainer timeout (10s) inside the guest
2) successful case: Test we can pull a large image inside the guest with an increased createcontainer timeout (120s)

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:22 +08:00
ChengyuZhu6
6c506cde86 tests:k8s: add tests for pull images in the guest using trusted storage
add tests for pull images in the guest using trusted storage:
1) failed case: Test we cannot pull an image that exceeds the memory limit inside the guest
2) successful case: Test we can pull an image inside the guest using
   trusted ephemeral storage.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:22 +08:00
GabyCT
ecfbc9515a Merge pull request #10158 from GabyCT/topic/k8sstabil
tests: Add kubernetes stability test
2024-08-14 14:44:49 -06:00
Saul Paredes
5ad47b8372 genpolicy: ignore Role resource
Ignore Role resources because they don't need a Policy.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-14 12:57:06 -07:00
Gabriela Cervantes
d48ad94825 tests: Add kubernetes stability test
This PR adds a k8s stability test that will be part of the CoCo Kata
stability tests that will run weekly.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-14 15:30:49 +00:00
Fupan Li
cadcf5f92d runtime-rs: Add the wait_vm support for hypervisors
Add the wait_vm method for hypervisors. This is a
prerequisite for sandbox api support.

Fixes: #7043

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-08-14 12:01:34 +08:00
Fupan Li
506977b102 Merge pull request #10156 from GabyCT/topic/disablevolume
tests: Disable k8s file volume test
2024-08-14 12:00:47 +08:00
GabyCT
b0b6a1baea Merge pull request #10154 from GabyCT/topic/stressk8s
tests: Add kubernetes stress-ng tests
2024-08-13 15:09:59 -06:00
Gabriela Cervantes
e580e29246 tests: Disable k8s file volume test
This PR disables the k8s file volume test as we are having random failures
in multiple GHA CIs mainly because the exec_host function sometimes
does not work properly.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-13 20:50:18 +00:00
Saul Paredes
af598a232b tests: add test for cron job support
Add simple test for cron job support

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-13 10:47:42 -07:00
Saul Paredes
88451d26d0 genpolicy: add support for cron jobs
Add support for cron jobs

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-13 10:47:42 -07:00
Gabriela Cervantes
bdca5ca145 tests: Add kubernetes stress-ng tests
This PR adds kubernetes stress-ng tests as part of the stability testing
for kata.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-13 16:23:52 +00:00
Fabiano Fidêncio
99730256a2 Merge pull request #10149 from fidencio/topic/kata-manager-relax-opt-check
kata-manager: Only check files when tarball is not passed
2024-08-13 16:26:16 +02:00
Markus Rudy
bce5cb2ce5 genpolicy: harden CreateSandboxRequest checks
Hooks are executed on the host, so we don't expect to run hooks and thus
require that no hook paths are set.

Additional Kernel modules expand the attack surface, so require that
none are set. If a use case arises, modules should be allowlisted via
settings.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-13 09:01:58 +02:00
Markus Rudy
aee23409da genpolicy: harden CopyFileRequest checks
CopyFile is invoked by the host's FileSystemShare.ShareFile function,
which puts all files into directories with a common pattern. Copying
files anywhere else is dangerous and must be prevented. Thus, we check
that the target path prefix matches the expected directory pattern of
ShareFile, and that this directory is not escaped by .. traversal.
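
The two checks can be sketched in Python as below. The real policy is not
written in Python, and the directory pattern here is a made-up placeholder;
the actual ShareFile pattern is not given in this log.

```python
import re

# Hypothetical shared-directory pattern, for illustration only.
SHARE_DIR = re.compile(r"^/run/example/shared/[a-z0-9-]+/")


def copy_target_allowed(path: str) -> bool:
    """Allow a CopyFile target only inside the expected directory."""
    # Reject any '..' component so the prefix cannot be escaped.
    if ".." in path.split("/"):
        return False
    return SHARE_DIR.match(path) is not None
```

Checking path components rather than searching for the substring `..`
avoids rejecting legitimate names such as `notes..txt`.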

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-13 09:01:58 +02:00
soulfy
722b576eb3 agent: kill child process when console socket closed
When using the debug console, the shell running in the child process may
not exit in some scenarios, e.g. when pressing Ctrl-C on the host to
terminate the kata-runtime process. That blocks the task handling the
console connection while it waits for the child to exit.

Signed-off-by: soulfy <liukai254@jd.com>
2024-08-13 10:18:03 +08:00
Steve Horsman
91084058ae Merge pull request #10007 from wainersm/run_k8s_on_free_runners
ci: Transition GARM tests to free runners, pt. II
2024-08-12 18:12:18 +01:00
Fabiano Fidêncio
5fe65e9fc2 kata-manager: Only check files when tarball is not passed
Only do the checking in case the tarball was not explicitly passed by
the user.  We have no control of what's passed and we cannot expect that
all the files are going to be under /opt.

Fixes: #10147

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-12 13:54:24 +02:00
ChengyuZhu6
c3a0ab4b93 tests:k8s: Re-enable and refactor the tests with guest pull
Currently, setting `io.containerd.cri.runtime-handler` annotation in
the yaml is not necessary for pulling images in the guest. All TEE
hypervisors are already running tests with guest-pulling enabled.
Therefore, we can remove some duplicate tests and re-enable the
guest-pull test for running different runtime pods at the same time.
To support different containerd versions, though, I recommend
keeping the "io.containerd.cri.runtime-handler" setting.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-12 16:36:54 +08:00
ChengyuZhu6
47be9c7c01 osbuilder:rootfs: install init_trusted_storage script
Install the init_trusted_storage script if MEASURED_ROOTFS is enabled.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: Anand Krishnamoorthi <anakrish@microsoft.com>
2024-08-12 16:36:54 +08:00
ChengyuZhu6
df993b0f88 agent:rpc: initialize trusted storage device
Initialize the trusted storage when the device is defined
as "/dev/trusted_store", using a shell script as a first step.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
2024-08-12 16:36:54 +08:00
ChengyuZhu6
94347e2537 agent:config: Support secure_storage_integrity option for trusted storage
With secure storage integrity enabled for trusted storage, initialization
takes longer, so it is NOT enabled by default. This config option
allows users to enable it if they want stricter security.

Fixes: #8142

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
2024-08-12 16:36:54 +08:00
GabyCT
775f6bdc5c Merge pull request #10142 from GabyCT/topic/updatestress
tests: Update ubuntu image for stress Dockerfile
2024-08-09 16:11:35 -06:00
Gabriela Cervantes
5e5fc145cd tests: Update ubuntu image for stress Dockerfile
This PR updates the ubuntu image for stress Dockerfile. The main purpose
is to have a more updated image compared with the one that is in libpod
which has not been updated in a while.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-09 15:29:10 +00:00
Steve Horsman
e4c023a9fa Merge pull request #10140 from stevenhorsman/kata-version-in-artefact-version
ci: cache: Include kata version in artefact versions
2024-08-09 11:37:09 +01:00
Fabiano Fidêncio
44b08b84b0 Merge pull request #10113 from Freax13/fix/no-scsi-off
qemu: don't emit scsi parameter
2024-08-08 16:23:36 +02:00
stevenhorsman
b6a3a3f8fe ci: cache: Include kata version in artefact versions
- At the moment we aren't factoring the kata version into our caches,
which means that when we bump it just before a release, we don't
rebuild components that pull in the VERSION content, so the release build
ends up with incorrect versions in its binaries

Fixes: #10092
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-08-08 14:58:58 +01:00
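The caching idea in b6a3a3f8fe can be sketched as follows. This is a minimal illustration, assuming the cache key is derived by hashing version strings. `cache_key`, its parameters and the version values are hypothetical names chosen for the example, not the CI scripts' actual code.

```python
import hashlib


def cache_key(component_version: str, kata_version: str) -> str:
    """Hypothetical cache key: changes when either input changes."""
    payload = f"{component_version}\n{kata_version}".encode()
    return hashlib.sha256(payload).hexdigest()[:16]


# Illustrative values only. Bumping just the kata version yields a
# different key, so a stale cached artefact is not reused for the release.
before = cache_key("component-1.0", "3.7.0")
after = cache_key("component-1.0", "3.8.0")
```

With the kata version left out of the key, `before` and `after` would be identical and the component would not be rebuilt after a VERSION bump.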
GabyCT
584d7a265e Merge pull request #10127 from GabyCT/topic/execimage
tests:k8s: Update image in kubectl debug for the exec host function
2024-08-07 17:00:52 -06:00
Archana Shinde
1012449141 Merge pull request #10129 from hex2dec/qemu-aio-native
tools: Support for building qemu with linux aio
2024-08-07 14:32:52 -07:00
Archana Shinde
a6a736eeaf Merge pull request #10089 from amshinde/enable-nerdctl-clh
ci: Enable nerdctl tests for clh
2024-08-07 12:13:00 -07:00
Wainer dos Santos Moschetta
374405aed1 workflows/run-k8s-tests-on-amd64: remove 'instance' from matrix
The jobs are all executed on ubuntu-22.04 so it's invariant and
can be removed from the matrix (this will shrink the jobs names).

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 16:00:39 -03:00
Wainer dos Santos Moschetta
d11ce129ac workflows: merge run-k8s-tests-on-garm and run-k8s-tests-with-crio-on-garm
Created the run-k8s-tests-on-amd64.yaml which is a merge of
run-k8s-tests-on-garm.yaml and run-k8s-tests-with-crio-on-garm.yaml

ps: renamed the job from 'run-k8s-tests' to 'run-k8s-tests-on-amd64' so
it is easier to find on Github UI and to distinguish from s390x,
ppc64le, etc...

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:50:43 -03:00
Wainer dos Santos Moschetta
ed0732c75d workflows: migrate run-k8s-tests-with-crio-on-garm to free runners
Switch to Github managed runners just like the run-k8s-tests-on-garm
workflow.

See: #9940
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta
3d053a70ab workflows: migrate run-k8s-tests-on-garm to free runners
Switched to Github managed runners. The instance_type parameter was
removed and K8S_TEST_HOST_TYPE is set to "all", which combines the
tests of "small" and "normal". This way it will reduce the number of
jobs by half.

See: #9940
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta
dfb92e403e tests/k8s: add "deploy-kata"/"cleanup" actions to gh-run.sh
These new "kata-deploy" and "cleanup" actions are equivalent to
"kata-deploy-garm" "cleanup-garm", respectively, and should be
used on the workflows being migrated from GARM to
Github's managed runners.

Eventually "kata-deploy-garm" and "cleanup-garm" won't be used anymore
then we will be able to remove them.

See: #9940
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:20:23 -03:00
Zhiwei Huang
7270a7ba48 tools: Fix container image build warning
All commands within the Dockerfile should use the same casing
(either upper or lower).[1]

[1]: https://docs.docker.com/reference/build-checks/consistent-instruction-casing/

Signed-off-by: Zhiwei Huang <ai.william@outlook.com>
2024-08-07 15:49:01 +08:00
Dan Mihai
2da77c6979 Merge pull request #10068 from burgerdev/genpolicy-test
genpolicy: add crate-scoped integration test
2024-08-06 16:10:46 -07:00
GabyCT
fb166956ab Merge pull request #10132 from fidencio/topic/support-image-pull-with-nerdctl
runtime: image-pull: Make it work with nerdctl
2024-08-06 15:33:40 -06:00
Gabriela Cervantes
d0ca43162d tests:k8s: Update image in kubectl debug for the exec host function
This PR updates the image that we are using in the kubectl debug command
as part of the exec host function, as the current alpine image does not
allow creating a temporary file, for example, and causes random kubernetes
failures.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-06 21:13:46 +00:00
Fabiano Fidêncio
63802ecdd9 Merge pull request #9880 from zvonkok/helm-chart
kata-deploy: Add Helm Chart
2024-08-06 22:55:31 +02:00
Archana Shinde
ba884aac13 ci: Enable nerdctl tests for clh
A recent fix should resolve some of the issues seen earlier with clh
with the go runtime. Enabling this test to check if the issue is still
seen.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-08-06 10:41:42 -07:00
Fabiano Fidêncio
f33f2d09f7 runtime: image-pull: Make it work with nerdctl
Our code for handling images being pulled inside the guest relies on a
containerType ("sandbox" or "container") being set as part of the
container annotations, which is done by the CRI Engine being used, and
depending on the used CRI Engine we check for a specific annotation
related to the image-name, which is then passed to the agent.

However, when running kata-containers without kubernetes, specifically
when using `nerdctl`, none of those annotations are set at all.

One thing that we can do to allow folks to use `nerdctl`, however, is to
take advantage of the `--label` flag, and document on our side that
users must pass `io.kubernetes.cri.image-name=$image_name` as part of
the label.

By doing this, and changing our "fallback" so we can always look for
such annotation, we ensure that nerdctl will work when using the nydus
snapshotter, with kata-containers, to perform image pulling inside the
pod sandbox / guest.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-06 17:07:45 +02:00
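The fallback lookup described in f33f2d09f7 can be sketched like this. It is a simplified illustration, not the runtime's Go code: `image_name`, `engine_key` and the example image reference are hypothetical, and only the `io.kubernetes.cri.image-name` key comes from the commit message.

```python
from typing import Optional

# Key named in the commit message above.
FALLBACK_KEY = "io.kubernetes.cri.image-name"


def image_name(annotations: dict, engine_key: Optional[str]) -> Optional[str]:
    """Return the image name from the engine-specific key, else the fallback."""
    if engine_key and engine_key in annotations:
        return annotations[engine_key]
    # Always look for the fallback key, even when no CRI engine set anything.
    return annotations.get(FALLBACK_KEY)


# nerdctl case: no CRI engine annotations, only the user-supplied label.
nerdctl_annotations = {FALLBACK_KEY: "example.registry/app:latest"}
found = image_name(nerdctl_annotations, engine_key=None)
```

The point of the sketch is that the fallback is checked unconditionally, so a label passed by the user is enough for the image name to be found.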
Zvonko Kaiser
8d9bec2e01 ci: add reset_runtime to cleanup
Adding reset_cleanup to cleanup action so that it is done automatically
without the need to run yet another DS just to reset the runtime.

This is now part of the lifecycle hook when issuing kata-deploy.sh
cleanup

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-08-06 11:57:04 +02:00
Zvonko Kaiser
1221ab73f9 ci: make cleanup_kata_deploy really simple
Remove the unneeded logic for cleanup; the values are
encapsulated in the deployed helm release

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-08-06 11:57:04 +02:00
Zvonko Kaiser
51690bc157 ci: Use helm to deploy kata-deploy
Rather than modifying the kata-deploy scripts let's use Helm and
create a values.yaml that can be used to render the final templates

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-08-06 11:57:04 +02:00
Zvonko Kaiser
94b3348d3c kata-deploy: Add Helm Chart
For easier handling of kata-deploy we can leverage a Helm chart to get
rid of all the base and overlays for the various components

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-08-06 11:57:04 +02:00
Zhiwei Huang
d455883b46 tools: Support for building qemu with linux aio
The kata containers hypervisor qemu configuration supports setting
block_device_aio="native", but the kata static build of qemu does
not add the linux aio feature.

The libaio-dev library is a necessary dependency for building qemu
with linux aio.

Fixes: #10130

Signed-off-by: Zhiwei Huang <ai.william@outlook.com>
2024-08-06 14:30:45 +08:00
Markus Rudy
69535e5458 genpolicy: add crate-scoped integration test
Provides a test runner that generates a policy and validates it
with canned requests. The initial set of test cases is mostly for
illustration and will be expanded incrementally.

In order to enable both cross-compilation on Ubuntu test runners as well
as native compilation on the Alpine tools builder, it is easiest to
switch to the vendored openssl-src variant. This builds OpenSSL from
source, which depends on Perl at build time.

Adding the test to the Makefile makes it execute in CI, on a variety of
architectures. Building on ppc64le requires a newer version of the
libz-ng-sys crate.

Fixes: #10061

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-05 11:52:01 +02:00
Markus Rudy
4d1416529d genpolicy: fix clippy v1.78.0 warnings
cargo clippy has two new warnings that need addressing:
- assigning_clones
  These were fixed by clippy itself.
- suspicious_open_options
  I added truncate(false) because we're opening the file for reading.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-05 11:48:30 +02:00
Fabiano Fidêncio
43dca8deb4 Merge pull request #10121 from microsoft/saulparedes/add_version_flag
genpolicy: add --version flag
2024-08-03 21:22:10 +02:00
Fabiano Fidêncio
3b2173c87a Merge pull request #10124 from fidencio/topic/ci-enable-encrypted-image-tests-for-tees
ci: Enable encrypted image tests for TEEs
2024-08-03 11:39:51 +02:00
Fabiano Fidêncio
89f1581e54 ci: Enable encrypted image tests for TEEs
After experimenting a little bit with those tests, they seem to be
passing on all the available TEE machines.

With this in mind, let's just enable them for those machines.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-03 09:27:32 +02:00
Fabiano Fidêncio
3b896cf3ef Merge pull request #10125 from fidencio/topic/un-break-ci
ci: Remove jobs that are not running
2024-08-03 09:27:04 +02:00
Fabiano Fidêncio
62a086937e ci: Remove jobs that are not running
When re-enabling those we'll need a smart way to do so, as this limit of
20 workflows referenced is just ... weird.

However, for now, it's more important to add the jobs related to the new
platforms than keep the ones that are actively disabled.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-03 09:24:05 +02:00
GabyCT
76af5a444b Merge pull request #10075 from microsoft/saulparedes/hooks
genpolicy: reject create custom hook settings
2024-08-02 15:36:34 -06:00
GabyCT
aadde2c25b Merge pull request #10120 from kata-containers/fix_metrics_json_results_file
Fix metrics json results file
2024-08-02 11:29:02 -06:00
Fabiano Fidêncio
b93a0642e0 Merge pull request #10123 from fidencio/topic/re-enable-arm-ci
ci: re-enable arm CI
2024-08-02 17:48:35 +02:00
Dan Mihai
2628b34435 Merge pull request #10098 from microsoft/danmihai1/allow-failing
agent: fix the AllowRequestsFailingPolicy functionality
2024-08-02 08:42:47 -07:00
GabyCT
8da5f7a72f Merge pull request #10102 from ChengyuZhu6/fix-debug
tests: Fix error with `kubectl debug`
2024-08-02 09:25:13 -06:00
Fabiano Fidêncio
551e0a6287 Merge pull request #10116 from GabyCT/topic/kbsdependencies
tests: kbs: Add missing dependencies to install kbs cli
2024-08-02 14:22:28 +02:00
Fabiano Fidêncio
ed57ef0297 ci: aarch64: Enable builders as part of the CI
As we have new runners added, let's enable the builders so we can
prevent build failures happening after something gets merged.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-02 14:13:53 +02:00
Fabiano Fidêncio
388b5b0e58 Revert "ci: Temporarily remove arm64 builds"
This reverts commit e9710332e7, as there
are now 2 arm64-builders (to be expanded to 4 really soon).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-02 13:53:50 +02:00
Fabiano Fidêncio
08be9c3601 Revert "ci: Temporarily remove arm64 builds -- part II"
This reverts commit c5dad991ce, as there
are now 2 arm64-builders (to be expanded to 4 really soon).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-02 13:52:53 +02:00
Tom Dohrmann
322c80e7c8 qemu: don't emit scsi parameter
This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it.

Fixes: kata-containers#10112
Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
2024-08-02 07:30:39 +02:00
Tom Dohrmann
b7999ac765 runtime-rs: don't emit scsi parameter for block devices
This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it.

Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
2024-08-02 07:30:23 +02:00
Fabiano Fidêncio
4183680bc3 Merge pull request #10107 from fidencio/topic/rotate-journal-logs-every-run
tests: k8s: Rotate & cleanup journal for every run
2024-08-02 07:27:10 +02:00
Fabiano Fidêncio
302e02aed8 Merge pull request #10114 from fidencio/topic/kata-manager-configure-qemu-and-ovmf-for-tdx
kata-manager: Ensure distro specific TDX config is set
2024-08-02 07:24:57 +02:00
Saul Paredes
194cc7ca81 genpolicy: add --version flag
- Add --version flag to the genpolicy tool that prints the current
version
- Add version.rs.in template to store the version information
- Update makefile to autogenerate version.rs from version.rs.in
- Add license to Cargo.toml

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-01 17:18:17 -07:00
David Esparza
dcd0c0b269 metrics: Remove duplicated headers from results file.
This PR removes duplicated entries (vcpus count, and available memory),
from onednn and openvino results files.

Fixes: #10119

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-08-01 18:11:06 -06:00
Dan Mihai
9e99329bef genpolicy: reject create sandbox hooks
Reject CreateSandboxRequest hooks, because these hooks may be used by an
attacker.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-01 16:58:35 -07:00
ChengyuZhu6
2eac8fa452 tests: Fix error with kubectl debug
The issue is similar to #10011.

The root cause is that tty and stderr are set to true at same time in
containerd: #10031.

Fixes: #10081

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-02 07:32:30 +08:00
David Esparza
1e640ec3a6 metrics: fix parsing json results file.
This PR encloses the search string for 'default_vcpus ='
and 'default_memory =' with double quotes in order to
parse the precise values, which are included in the kata
configuration file.

Fixes: #10118

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-08-01 17:05:03 -06:00
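The effect of quoting the search string in 1e640ec3a6 can be illustrated with Python's `shlex`, which splits a command line the way a POSIX shell does. This is a sketch under the assumption that the string was previously passed to grep unquoted; the `grep` invocation and the `configuration.toml` file name are placeholders, not the metrics script's actual command.

```python
import shlex

# Unquoted: the shell splits the search string into separate words, so
# grep would receive "default_vcpus" as the pattern and "=" as a file name.
unquoted = shlex.split("grep default_vcpus = configuration.toml")

# Double-quoted: the whole search string stays a single argument.
quoted = shlex.split('grep "default_vcpus =" configuration.toml')
```

Here `unquoted` has four words and `quoted` has three, with `default_vcpus =` kept intact as the pattern.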
Dan Mihai
c2a55552b2 agent: fix the AllowRequestsFailingPolicy functionality
1. Use the new value of AllowRequestsFailingPolicy after setting up a
   new Policy. Before this change, the only way to enable
   AllowRequestsFailingPolicy was to change the default Policy file,
   built into the Guest rootfs image.

2. Ignore errors returned by regorus while evaluating Policy rules, if
   AllowRequestsFailingPolicy was enabled. For example, trying to
   evaluate the UpdateInterfaceRequest rules using a policy that didn't
   define any UpdateInterfaceRequest rules results in a "not found"
   error from regorus. Allow AllowRequestsFailingPolicy := true to
   bypass that error.

3. Add simple CI test for AllowRequestsFailingPolicy.

These changes are restoring functionality that was broken recently by
commit df23eb09a6.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-01 22:37:18 +00:00
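Point 2 of c2a55552b2 can be sketched as a small decision helper. This is only an illustration of the described behaviour: `is_allowed` and `evaluate` are hypothetical stand-ins, a Python `KeyError` plays the role of the "not found" error from regorus, and the rule table is made up for the example.

```python
from typing import Callable


def is_allowed(evaluate: Callable[[str], bool], request: str,
               allow_failing: bool) -> bool:
    """Hypothetical helper; `evaluate` stands in for the policy engine."""
    try:
        allowed = evaluate(request)
    except KeyError:
        # Stand-in for a "not found" error: no rules defined for this request.
        return allow_failing
    return allowed or allow_failing


def evaluate(request: str) -> bool:
    # Made-up rules; UpdateInterfaceRequest is deliberately undefined.
    rules = {"CreateContainerRequest": True, "ExecProcessRequest": False}
    return rules[request]  # raises KeyError for undefined rules
```

With `allow_failing=True`, both a rejected request and a request with no matching rules are let through. With `allow_failing=False`, both are refused.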
Fabiano Fidêncio
66b0305eed Merge pull request #10117 from fidencio/topic/temporarily-remove-arm-nightly-jobs-part-2
ci: Temporarily remove arm64 builds -- part II
2024-08-01 23:06:46 +02:00
GabyCT
20a88b6470 Merge pull request #10099 from GabyCT/topic/fixmemo
metrics: Update memory tests to use grep -F
2024-08-01 13:48:36 -06:00
Fabiano Fidêncio
aef7da7bc9 tests: k8s: Rotate & cleanup journal for every run
This will help to avoid huge logs, and allow us to debug issues in a
better way.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-01 21:36:57 +02:00
Fabiano Fidêncio
c5dad991ce ci: Temporarily remove arm64 builds -- part II
Let's remove what we commented out, as publish manifest complains:
```
Created manifest list quay.io/kata-containers/kata-deploy-ci:kata-containers-latest
./tools/packaging/release/release.sh: line 146: --amend: command not found
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-01 20:43:28 +02:00
Fabiano Fidêncio
5ec11afc21 Merge pull request #10111 from fidencio/topic/temporarily-remove-arm-nightly-jobs
ci: Temporarily remove arm64 builds
2024-08-01 19:50:07 +02:00
Gabriela Cervantes
7454908690 metrics: Update memory tests to use grep -F
This PR updates the memory tests like fast footprint to use grep -F
instead of fgrep as this command has been deprecated.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-01 17:20:57 +00:00
Gabriela Cervantes
d72cb8ccfc tests: kbs: Add missing dependencies to install kbs cli
This PR adds missing package dependencies to install kbs cli in a fresh
new baremetal environment. This avoids a failure when trying
to run install-kbs-client.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-01 17:09:50 +00:00
Fabiano Fidêncio
bfd014871a kata-manager: Ensure distro specific TDX config is set
We've done something quite similar for kata-deploy, but I've noticed we
forgot about the kata-manager counterpart.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-01 17:27:01 +02:00
Fabiano Fidêncio
e9710332e7 ci: Temporarily remove arm64 builds
We have not been able to even build arm64 artefacts for quite some
time.

For now I am removing the builds as it doesn't make sense to keep
running failing builds, and those can be re-enabled once we have arm64
machines plugged in that can be used for building the stuff, and
maintainers for those machines.

The `arm-jetson-xavier-nx-01` is also being removed from the runners.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-01 13:30:47 +02:00
Fabiano Fidêncio
c784fb6508 Merge pull request #10110 from ChengyuZhu6/bump-trustee
version: bump trustee version
2024-08-01 07:34:38 +02:00
ChengyuZhu6
d35320472c version: bump trustee version
Bump trustee to the latest version to fix error
with pulling busybox from dockerhub.

Fixes: #10109

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-01 08:59:58 +08:00
Fupan Li
230aefc0da Merge pull request #10070 from BbolroC/qemu-runtime-rs-k8s-s390x
GHA: Run k8s e2e tests for qemu-runtime-rs on s390x
2024-07-31 18:41:11 +08:00
Chengyu Zhu
8e9f140ee0 Merge pull request #10080 from ChengyuZhu6/fix-coco-ci
tests: add image check before running coco tests
2024-07-31 17:08:00 +08:00
Peng Tao
11e10647f9 Merge pull request #10104 from BbolroC/fix-zvsi-cleanup-s390x
gha: Restore cleanup-zvsi for s390x
2024-07-31 16:23:26 +08:00
Chengyu Zhu
fc0f635098 Merge pull request #10101 from AdithyaKrishnan/main
ci: Fix rate limit error by migrating busybox_image
2024-07-31 14:48:12 +08:00
ChengyuZhu6
2cfb32ac4d version: bump nydus snapshotter to v0.13.14
bump nydus snapshotter to v0.13.14 to stabilize CIs.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-31 14:47:33 +08:00
ChengyuZhu6
41b7577f08 tests: add image check before running coco tests
Currently, there are some issues with pulling images in CI, such as :
https://github.com/kata-containers/kata-containers/actions/runs/10109747602/job/27959198585

This issue is caused by switching between different snapshotters for the same image in some scenarios.
To resolve it, we can check existing images to ensure all content is available locally before running tests.

Fixes: #10029

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-31 14:47:33 +08:00
Hyounggyu Choi
e135d536c5 gha: Restore cleanup-zvsi for s390x
In #10096, a cleanup step for kata-deploy is removed by mistake.
This leads to a cleanup error in the following `Complete job` step.

This commit restores the removed step to resolve the current CI failure on s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-31 06:42:16 +02:00
Adithya Krishnan Kannan
fdf7036d5e ci: Fix rate limit error by migrating busybox_image
Changing the busybox_image from
docker to quay to fix rate limit errors.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-07-30 22:32:22 -05:00
Hyounggyu Choi
c8a160d14a Merge pull request #10096 from BbolroC/remove-pre-post-action-s390x
gha: Eradicate {pre,post}-action steps for s390x runners
2024-07-30 22:30:05 +02:00
Hyounggyu Choi
8d529b960a gha: Eradicate {pre,post}-action steps for s390x runners
As suggested in #9934, the following hooks have been introduced for s390x runners:

- ACTIONS_RUNNER_HOOK_JOB_STARTED
- ACTIONS_RUNNER_HOOK_JOB_COMPLETED

These hooks will perfectly replace the existing {pre,post}-action scripts.
This commit wipes out all GHA steps for s390x where the actions are triggered.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-30 17:10:19 +02:00
Wainer Moschetta
528745fc88 Merge pull request #10052 from nubificus/feat_fix_qemu_after_8070
runtime-rs: Fix QEMU backend for runtime-rs
2024-07-30 11:00:14 -03:00
Fupan Li
de22b3c4bf Merge pull request #10024 from lifupan/main
runtime-rs: enable dragonball hypervisor support initrd
2024-07-30 16:00:42 +08:00
Fupan Li
e3f0d2a751 runtime-rs: enable dragonball hypervisor support initrd
Enable initrd support for the dragonball hypervisor.

Fixes: #10023

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-30 14:50:24 +08:00
Fupan Li
4fbf9d67a5 Merge pull request #10043 from lifupan/fix_sandbox
runtime-rs : fix the issue of stop sandbox
2024-07-29 09:22:26 +08:00
Fabiano Fidêncio
949ffd146a Merge pull request #10083 from microsoft/danmihai1/policy-tests
tests: k8s: minor policy tests clean-up
2024-07-28 11:04:24 +02:00
Dan Mihai
3e348e9768 tests: k8s: rename hard-coded policy test script
Rename k8s-exec-rejected.bats to k8s-policy-hard-coded.bats, getting
ready to test additional hard-coded policies using the same script.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-26 20:14:05 +00:00
Dan Mihai
7b691455c2 tests: k8s: hard-coded policy for any platform
Users of AUTO_GENERATE_POLICY=yes:

- Already tested *auto-generated* policy on any platform.
- Will be able to test *hard-coded* policy too on any platform, after
  this change.

CI continues to test hard-coded policies just on the platforms listed
here, but testing those policies locally (outside of CI) on other
platforms can be useful too.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-26 19:30:03 +00:00
Dan Mihai
83056457d6 tests: k8s-policy-pod: avoid word splitting
Avoid potential word splitting when using an array of command args.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-26 18:55:52 +00:00
Dan Mihai
5546ce4031 Merge pull request #10069 from microsoft/danmihai1/exec-args
genpolicy: validate each exec command line arg
2024-07-26 11:39:44 -07:00
Fabiano Fidêncio
b0b04bd2f3 Merge pull request #10078 from fidencio/topic/increase-rootfs-confidential-slash-run-to-50-percent
tee: osbuilder: Set /run to use 50% of the image with systemd
2024-07-26 18:37:41 +02:00
Anastassios Nanos
d11657a581 runtime-rs: Remove unused env vars from build
Since we can't find a homogeneous value for the resource/cgroup
management of multiple hypervisors, and we have decoupled the
env vars in the Makefile, we don't need the generic ones.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-07-26 14:03:50 +00:00
Anastassios Nanos
3f58ea9258 runtime-rs: Decouple Makefile env VARS
To avoid overriding env vars when multiple hypervisors are
available, we add per-hypervisor vars for static resource
management and cgroups handling. We reflect that in the
relevant config files as well.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-07-26 14:02:35 +00:00
Fabiano Fidêncio
5f146e10a1 osbuilder: Add logs for setting up systemd based stuff
This helps us to debug any kind of changes.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-07-26 14:22:45 +02:00
Alex Carter
4a8fb475be tee: osbuilder: Set /run to use 50% of the image with systemd
Let's ensure at least 50% of the memory is used for /run, as systemd by
default forces it to be 10%, which is way too small even for very small
workloads.

This is only done for the rootfs-confidential image.

Fixes: kata-containers#6775
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.co
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-07-26 14:22:38 +02:00
Chengyu Zhu
2a9ed19512 Merge pull request #9988 from huoqifeng/annotation
initdata: add initdata annotation in hypervisor config
2024-07-26 19:59:45 +08:00
Fupan Li
c51ba73199 container: fix the issue of send signal to process
It's better to check the container's status before
trying to send a signal to it, since there's no need
to send a signal when the container is stopped.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-26 19:23:43 +08:00
Fupan Li
e156516bde sandbox: fix the issue of stop sandbox
Since stop sandbox can be called from multiple paths,
it's better to set and check the sandbox's state.

Fixes: #10042

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-26 19:23:34 +08:00
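The set-and-check pattern from e156516bde can be sketched as below. This is a minimal illustration rather than the runtime-rs code: the `Sandbox` class, the `State` enum and the `teardown_calls` counter are hypothetical, and the real implementation would also need to guard the state against concurrent callers.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


class Sandbox:
    """Hypothetical sandbox with a guarded stop()."""

    def __init__(self) -> None:
        self.state = State.RUNNING
        self.teardown_calls = 0

    def stop(self) -> None:
        if self.state is State.STOPPED:
            return  # already stopped via another path; nothing to do
        self.teardown_calls += 1  # stand-in for the real teardown work
        self.state = State.STOPPED


sandbox = Sandbox()
sandbox.stop()
sandbox.stop()  # second call is a no-op
```

Recording the state makes `stop()` safe to call from several code paths, because the teardown work runs only once.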
Qi Feng Huo
a113fc93c8 initdata: fix unit test code for initdata annotation
Added ut code for initdata annotation

Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
2024-07-26 18:24:05 +08:00
Qi Feng Huo
8d61029676 initdata: add unit test code for initdata annotation
Added ut code for initdata annotation

Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
2024-07-26 14:20:57 +08:00
Qi Feng Huo
b80057dfb5 initdata: Merge branch 'main' into annotation
- Merge branch 'main' into feature branch annotation
2024-07-26 14:01:04 +08:00
Archana Shinde
d7637f93f9 Merge pull request #9899 from amshinde/multiple-networks-fix
Fix issue while adding multiple networks with nerdctl
2024-07-25 11:56:27 -07:00
Dan Mihai
a37f10fc87 genpolicy: validate each exec command line arg
Generate policy that validates each exec command line argument, instead
of joining those args and validating the resulting string. Joining the
args ignored the fact that some of the args might include space
characters.

The older format from genpolicy-settings.json was similar to:

    "ExecProcessRequest": {
      "commands": [
                "sh -c cat /proc/self/status"
        ],
      "regex": []
    },

That format will not be supported anymore. genpolicy will detect if its
users are trying to use the older "commands" field and will exit with
a relevant error message in that case.

The new settings format is:

    "ExecProcessRequest": {
      "allowed_commands": [
        [
          "sh",
          "-c",
          "cat /proc/self/status"
        ]
      ],
      "regex": []
    },

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-25 16:57:17 +00:00
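The ambiguity that a37f10fc87 removes can be shown with a short sketch. It is written in Python for illustration only (the generated policy itself is not Python), and `allowed_joined` and `allowed_per_arg` are hypothetical helper names.

```python
# Two different argument vectors that collapse to the same joined string.
three_args = ["sh", "-c", "cat /proc/self/status"]
four_args = ["sh", "-c", "cat", "/proc/self/status"]

allowed_commands = [three_args]


def allowed_joined(args: list) -> bool:
    """Older approach: compare the space-joined command line."""
    return " ".join(args) in {" ".join(cmd) for cmd in allowed_commands}


def allowed_per_arg(args: list) -> bool:
    """Newer approach: compare argument by argument."""
    return args in allowed_commands
```

`allowed_joined(four_args)` accepts a command that was never listed, because both vectors join to `sh -c cat /proc/self/status`. `allowed_per_arg(four_args)` rejects it, since the arguments differ in number and content.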
Dan Mihai
0f11384ede tests: k8s-policy-pod: exec_command clean-up
Use "${exec_command[@]}" for calling both:
- add_exec_to_policy_settings
- kubectl exec

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-25 16:55:03 +00:00
Dan Mihai
95b78ecaa9 tests: k8s-exec: reuse sh_command variable
Reuse sh_command variable instead of repeating "sh".

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-25 16:50:34 +00:00
Alex Lyn
abb0a2659a Merge pull request #9944 from Apokleos/align-ocispec-rs
Align kata oci spec with oci-spec-rs
2024-07-25 19:36:52 +08:00
Alex Lyn
bb2b60dcfc oci: Delete the kata oci spec
It's time to delete the kata oci spec implemented just
for kata, as we have already aligned the OCI Spec with
oci-spec-rs.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
b56313472b agent: Align agent OCI spec with oci-spec-rs
Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
882385858d runtime-rs: Align oci spec in runtime-rs with oci-spec-rs
This commit aligns the OCI Spec implementation in runtime-rs
with the OCI Spec definitions and related operations provided
by oci-spec-rs. Key changes as below:
(1) Leveraged oci-spec-rs to align Kata Runtime OCI Spec with
the official OCI Spec.
(2) Introduced runtime-spec to separate OCI Spec definitions
from Kata-specific State data structures.
(3) Preserved the original code logic and implementation as
much as possible.
(4) Made minor code adjustments to adhere to Rust programming
conventions;

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
bf813f85f2 runk: Align oci spec with oci-spec-rs
Utilized oci-spec-rs to align OCI Spec structures
and data representations in runk with the OCI Spec.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
b3eab5ffea genpolicy: Align agent-ctl OCI Spec with oci-spec-rs
Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
c500fd5761 agent-ctl: Align agent-ctl OCI Spec with oci-spec-rs
This commit aligns the OCI Spec used within agent-ctl
with the oci-spec-rs definition and operations. This
enhancement ensures that agent-ctl adheres to the latest
OCI standards and provides a more consistent and reliable
experience for managing container images and configurations.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
faffee8909 libs: update Cargo config and lock file
update Cargo.toml and Cargo.lock for adding runtime-spec

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
8b5499204d protocols: Reimplement OCI Spec to TTRPC Data Translation
This commit transitions the data implementation for OCI Spec
from kata-oci-spec to oci-spec-rs. While both libraries adhere
to the OCI Spec standard, significant implementation details
differ. To ensure data exchange through TTRPC services, this
commit reimplements necessary data conversion logic.
This conversion bridges the gap between oci-spec-rs data and
TTRPC data formats, guaranteeing consistent and reliable data
transfer across the system.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:46:07 +08:00
Anastassios Nanos
cda00ed176 runtime-rs: Add FC specific KERNELPARAMS
To avoid overriding KERNELPARAMS for other hypervisors, add
FC-specific KERNELPARAMS.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-07-25 08:53:57 +00:00
Hyounggyu Choi
d8cac9f60b GHA: Run k8s e2e tests for qemu-runtime-rs on s390x
This commit adds a new CI job for qemu-runtime-rs to the existing
zvsi Kubernetes test matrix.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-25 08:11:49 +02:00
Alex Lyn
4e003a2125 Merge pull request #10058 from Apokleos/enhance-vsock-connect
runtime-rs: enhance debug info for agent connect.
2024-07-25 11:29:04 +08:00
Alex Lyn
36385a114d runtime-rs: enhance debug info for agent connect.
We need friendlier logs for debugging agent connection
cases when kata pods fail.

Fixes #10057

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 08:51:57 +08:00
Dan Mihai
c3adeda3cc Merge pull request #10051 from microsoft/danmihai1/exec-variable-reuse
tests: k8s: reuse policy exec variable
2024-07-24 14:58:40 -07:00
Aurélien Bombo
f08b594733 Merge pull request #9576 from microsoft/saulparedes/support_env_from
genpolicy: Add support for envFrom
2024-07-24 13:39:54 -07:00
GabyCT
79edf2ca7d Merge pull request #10054 from GabyCT/topic/docnydus
docs: Update url links in kata nydus document
2024-07-24 14:08:44 -06:00
Archana Shinde
64d6293bb0 tests: Add nerdctl test for testing with multiple networks
Add integration test that creates two bridge networks with nerdctl and
verifies that Kata container is brought up while passing the networks
created.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-24 10:45:56 -07:00
Archana Shinde
49fbae4fb1 agent: Wait for interface in update_interface
For nerdctl and docker runtimes, network is hot-plugged instead of
cold-plugged. While this change was made in the runtime,
we did not have the agent waiting for the device to be ready.
On some systems, the device hotplug could take some time causing
the update_interface rpc call to fail as the interface is not available.

Add a watcher for the network interface based on the pci-path of the
network interface. Note, waiting on the device based on name is really
not reliable especially in case multiple networks are hotplugged.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-24 10:45:56 -07:00
Dan Mihai
fecb70b85e tests: k8s: reuse policy exec variable
Share a single test script variable for both:
- Allowing a command to be executed using Policy settings.
- Executing that command using "kubectl exec".

Fixes: #10014

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-24 17:42:04 +00:00
Fabiano Fidêncio
162a6b44f6 Merge pull request #10063 from ChengyuZhu6/fix-ci-timeout
gha: Increase timeout to run CoCo tests
2024-07-24 15:14:35 +02:00
Pavel Mores
dd1e09bd9d runtime-rs: add experimental support for memory hotunplugging to qemu-rs
Hotunplugging memory is not guaranteed or even likely to work.
Nevertheless I'd really like to have this code in for tests and
observation.  It shouldn't hurt, from experience so far.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
Pavel Mores
3095b65ac3 runtime-rs: support hotplugging memory in QemuInner
The bulk of this implementation are simple though tedious sanity checks,
alignment computations and logging.

Note that before any hotplugging, we query qemu directly for the current
size of hotplugged memory.  This ensures that any request to resize memory
will be properly compared to the amount actually available already, and only
the necessary amount will be added.

Note also that we borrow checked_next_multiple_of() from the CH implementation.
While this might look unclean, it's just a rather temporary solution since
an equivalent function will apparently be part of std soon, likely the
upcoming 1.75.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
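The alignment step described in the commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the runtime-rs code: the helper name mirrors the Rust function mentioned in the message, while `memory_to_hotplug`, its parameters, and the choice to align the delta (rather than the total) are assumptions made for the example.

```python
from typing import Optional


def checked_next_multiple_of(value: int, multiple: int) -> Optional[int]:
    """Round value up to the nearest multiple; return None if multiple is zero."""
    if multiple == 0:
        return None
    remainder = value % multiple
    return value if remainder == 0 else value + (multiple - remainder)


def memory_to_hotplug(requested_mb: int, current_mb: int, memblock_mb: int) -> int:
    """Hypothetical helper: MiB to add so that the request is covered.

    current_mb stands for the amount the hypervisor reports as already
    available; only the missing part is added, rounded up to a whole
    number of guest memory blocks.
    """
    if requested_mb <= current_mb:
        return 0  # nothing to add
    aligned = checked_next_multiple_of(requested_mb - current_mb, memblock_mb)
    if aligned is None:
        raise ValueError("memory block size must be non-zero")
    return aligned
```

The Rust version additionally guards against integer overflow, which Python integers do not need.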
Pavel Mores
4a1c828bf8 runtime-rs: support hotplugging memory in Qmp
The algorithm is rather simple - we query qemu for existing memory devices
to figure out the index of the one we're about to add.  Then we add a
backend object and a corresponding frontend device.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
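The sequence described above (query existing memory devices, then add a backend object and a frontend device) might look like this at the QMP message level. This is a sketch under assumptions: the id naming scheme, the use of the device count as the next index, and the `pc-dimm` / `memory-backend-ram` pairing are illustrative choices, and the flat `object-add` argument layout matches recent QEMU versions only (older versions nest properties under `props`).

```python
import json


def next_memdev_index(query_memory_devices_result: list) -> int:
    """Assumed rule: the next index is the number of devices already present."""
    return len(query_memory_devices_result)


def build_hotplug_commands(index: int, size_bytes: int) -> list:
    """Build the two QMP commands: backend object first, then frontend device."""
    backend_id = f"hotplugged-mem{index}"  # hypothetical naming scheme
    return [
        {
            "execute": "object-add",
            "arguments": {
                "qom-type": "memory-backend-ram",
                "id": backend_id,
                "size": size_bytes,
            },
        },
        {
            "execute": "device_add",
            "arguments": {
                "driver": "pc-dimm",
                "id": f"dimm{index}",
                "memdev": backend_id,
            },
        },
    ]


if __name__ == "__main__":
    # Pretend the reply to {"execute": "query-memory-devices"} listed one device.
    existing = [{"type": "dimm", "data": {"id": "dimm0"}}]
    for cmd in build_hotplug_commands(next_memdev_index(existing), 128 * 1024 * 1024):
        print(json.dumps(cmd))
```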
Pavel Mores
0e0b146b87 runtime-rs: support storage & retrieval of guest memblock size in qemu-rs
This will be used for ensuring that hotplugged memory block sizes are
properly aligned.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
Alex Lyn
efb7390357 kata-sys-utils: align OCI Spec with oci-spec-rs
Do align oci spec and fix warnings to make clippy
happy.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-24 14:38:48 +08:00
Alex Lyn
012029063c runtime-spec: Introduce runtime-spec for Container State
As part of aligning the Kata OCI Spec with oci-spec-rs,
the concept of "State" falls outside the scope of the OCI
Spec itself. While we'll retain the existing code for State
management for now, to improve code organization and clarity,
we propose moving the State-related code from the oci/ dir
to a dedicated directory named runtime-spec/.
This separation will be completed in subsequent commits with
the removal of the oci/ directory.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-24 14:38:30 +08:00
Zvonko Kaiser
a388d2b8d4 Merge pull request #9919 from zvonkok/ubuntu-dockerfile
gpu: rootfs ubuntu build expansion
2024-07-24 08:05:54 +02:00
ChengyuZhu6
2b44e9427c gha: Increase timeout to run CoCo tests
This PR increases the timeout for running the CoCo tests to avoid random failures.
These failures occur when the action `Run tests` times out after 30 minutes, causing the CI to fail.

Fixes: #10062

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-24 12:31:38 +08:00
GabyCT
b408cc1694 Merge pull request #10060 from GabyCT/topic/fgreptest
metrics: Update launch times to use grep -F
2024-07-23 17:23:14 -06:00
Gabriela Cervantes
0e5489797d docs: Update url links in kata nydus document
This PR updates the url links in the kata nydus document.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-23 17:49:12 +00:00
Gabriela Cervantes
3d17a7038a metrics: Update launch times to use grep -F
This PR updates the metrics launch times to use grep -F instead of
fgrep as this command has been deprecated.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-23 17:13:52 +00:00
Zvonko Kaiser
941577ab3b gpu: rootfs ubuntu build expansion
For the GPU build we need go/rust and some other helpers
to build the rootfs.

Always use versions.yaml for the correct and working Rust and golang
version

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-23 14:31:35 +00:00
Steve Horsman
d69950e5c6 Merge pull request #10053 from stevenhorsman/release-env-var
ci: cache: Pass through RELEASE env
2024-07-22 21:53:20 +01:00
Dan Mihai
f26d595e5d Merge pull request #9910 from microsoft/saulparedes/set_policy_rego_via_env
tools: Allow setting policy rego file via
2024-07-22 11:00:30 -07:00
stevenhorsman
66f6ec2919 ci: cache: Pass through RELEASE env
In kata-deploy-binaries.sh we want to understand if we are running
as part of a release, so we need to pass through the RELEASE env
from the workflow, which I missed in
https://github.com/kata-containers/kata-containers/pull/9550

Fixes: #9921
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-22 16:39:35 +01:00
Zvonko Kaiser
5765b6e062 Merge pull request #9920 from zvonkok/initrd-builer
gpu: rootfs/initrd build init
2024-07-22 15:06:49 +02:00
Zvonko Kaiser
73bcb09232 Merge pull request #9968 from zvonkok/kernel-gpu-dragonball-6.1.x
dragonball: kernel gpu dragonball 6.1.x
2024-07-22 13:03:14 +02:00
Zvonko Kaiser
3029e6e849 gpu: rootfs/initrd build init
Initramfs expects /init, create symlink only if ${ROOTFS}/init does not exist
Init may be provided by other packages, e.g. systemd or GPU initrd/rootfs

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-22 10:19:05 +00:00
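The "create the symlink only if it does not exist" rule from the commit above can be illustrated as follows. The real change lives in the rootfs build shell scripts; this Python sketch only shows the check, and the default symlink target `/sbin/init` is an assumption made for the example.

```python
import os


def ensure_init(rootfs: str, target: str = "/sbin/init") -> bool:
    """Create <rootfs>/init as a symlink unless something already provides it.

    Returns True if the symlink was created, False if /init was left alone.
    lexists() is used so that an existing (even dangling) symlink also counts.
    """
    init_path = os.path.join(rootfs, "init")
    if os.path.lexists(init_path):
        return False  # e.g. provided by systemd or a GPU initrd/rootfs
    os.symlink(target, init_path)
    return True
```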
Saul Paredes
b7a184a0d8 rootfs: Allow AGENT_POLICY_FILE to be an absolute
path

Don't set AGENT_POLICY_FILE as $script_dir may change

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-21 14:57:41 -07:00
Alex Lyn
67466aa27f kata-types: do alignment of oci-spec for kata-types
Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-21 22:54:43 +08:00
Hyounggyu Choi
c774cd6bb0 Merge pull request #10031 from ChengyuZhu6/fix-log-contain-tdx
tests: Fix missing log on TDX
2024-07-20 07:26:08 +02:00
ChengyuZhu6
6ea6e85f77 tests: Re-enable authenticated image tests on tdx
Try to re-enable authenticated image tests on tdx.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-20 12:10:02 +08:00
ChengyuZhu6
3476fb481e tests: Fix missing log on TDX
Currently, we have found that `assert_logs_contain` does not work on TDX.
We manually located the specific log, but it fails to get the log using `kubectl debug`. The error found in CI is:
```
warning: couldn't attach to pod/node-debugger-984fee00bd70.jf.intel.com-pdgsj,
falling back to streaming logs: error stream protocol error: unknown error
```

Upon debugging the TDX CI machine, we found an error in containerd:
```
Attach container from runtime service failed" err="rpc error: code = InvalidArgument desc = tty and stderr cannot both be true"
containerID="abc8c7a546c5fede4aae53a6ff2f4382ff35da331bfc5fd3843b0c8b231728bf"
```

We believe this is the root cause of the test failures in TDX CI.
Therefore, we need to ensure that tty and stderr are not set to true at the same time.

Fixes: #10011

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Wang, Arron <arron.wang@intel.com>
2024-07-20 12:10:01 +08:00
Steve Horsman
7dd560f07f Merge pull request #9620 from l8huang/kernel
Add kernel config for NVIDIA DPU/ConnectX adapter
2024-07-19 23:16:51 +01:00
Dan Mihai
3127dbb3df Merge pull request #10035 from microsoft/danmihai1/k8s-credentials-secrets
tests: k8s-credentials-secrets: policy for second pod
2024-07-19 12:44:21 -07:00
Saul Paredes
2681fc7eb0 genpolicy: Add support for envFrom
This change adds support for the `envFrom` field in the `Pod` resource

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-19 09:53:58 -07:00
GabyCT
be2d4719c2 Merge pull request #10040 from kata-containers/fix_blogbench_midvalues
metrics: update avg reference values for blogbench.
2024-07-19 09:51:29 -06:00
Zvonko Kaiser
8eaa2f0dc8 dragonball: Add GPU support
Build a GPU flavoured dragonball kernel

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-19 14:48:05 +00:00
Dan Mihai
44e443678d Merge pull request #9835 from microsoft/saulparedes/test_policy_on_sev
gha: enable autogenerated policy testing on SEV and SEV-SNP
2024-07-19 07:46:01 -07:00
Greg Kurz
dc97f3f540 Merge pull request #10045 from lifupan/cleanup_container
runtime-rs: container: fix the issue of missing cleanup container
2024-07-19 16:36:04 +02:00
Alex Lyn
d0dc67bb96 Merge pull request #8597 from amshinde/vfio-hotplug-support
Implement hotplug support for physical endpoints
2024-07-19 13:41:11 +08:00
Lei Huang
20f6979d8f build: add kernel config for Nvidia DPU/ConnectX adapter
With Nvidia DPU or ConnectX network adapter, VF can do VFIO passthrough
to guest VM in `guest-kernel` mode. In the guest kernel, the adapter's
driver is required to claim the VFIO device and create network interface.

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-07-18 22:29:16 -07:00
Fupan Li
8a2f7b7a8c container: fix the issue of missing cleanup container
When creating a container fails, the container should be cleaned up
so that no device/resource is left behind.

Fixes: #10044

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-19 11:02:55 +08:00
ms-mahuber
ddff762782 tools: Allow setting policy rego file via
environment variable

* Set policy file via env var

* Add restrictive policy file to kata-opa folder

* Change restrictive policy file name

* Change relative default path location

* Add license headers

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-18 15:05:45 -07:00
David Esparza
60f52a4b93 metrics: update avg reference values for blogbench.
This PR updates the Blogbench reference values for
read and write operations used in the CI check metrics
job.

This is due to the update to version 1.2 of blogbench.

Fixes: #10039

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-18 15:47:14 -06:00
Greg Kurz
fc4357f642 Merge pull request #10034 from BbolroC/hide-repack_secure_image-from-test
tests: Call repack_secure_image() in set_metadata_annotation()
2024-07-18 23:03:41 +02:00
Aurélien Bombo
ab6f37aa52 Merge pull request #10022 from microsoft/danmihai1/probes-and-lifecycle
genpolicy: container.exec_commands args validation
2024-07-18 12:21:31 -07:00
Steve Horsman
256ab50f1a Merge pull request #9959 from sprt/fix-ci-cleanup
ci: cleanup: Ignore nonexisting resources
2024-07-18 19:23:48 +01:00
David Esparza
1fdc5c1183 Merge pull request #10028 from amshinde/upgrade-blogbench-1.2
metric: Upgrade blogbench to 1.2
2024-07-18 11:30:17 -06:00
Hyounggyu Choi
a7e4d3b738 tests: Call repack_secure_image() in set_metadata_annotation()
It is not good practice to call repack_secure_image() from a bats file
because the test code might not consider cases where `qemu-se` is used
as `KATA_HYPERVISOR`.

This commit moves the function call to set_metadata_annotation() if a key
includes `kernel_params` and `KATA_HYPERVISOR` is set to `qemu-se`, allowing
developers to focus on the test scenario itself.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-18 18:09:45 +02:00
Dan Mihai
035a42baa4 tests: k8s-credentials-secrets: policy for second pod
Add policy to pod-secret-env.yaml from k8s-credentials-secrets.bats.

Policy was already auto-generated for the other pod used by the same
test (pod-secret.yaml). pod-secret-env.yaml was inconsistent,
because it was taking advantage of the "allow all" policy built into
the Guest image. Sooner or later, CI Guests for CoCo will not get the
"allow all" policy built in anymore and pod-secret-env.yaml would
have stopped working then.

Note that pod-secret-env.yaml continues to use an "allow all" policy
after these changes. #10033 must be solved before a more restrictive
policy will be generated for pod-secret-env.yaml.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-18 15:03:57 +00:00
Hyounggyu Choi
d2ac01c862 Merge pull request #10032 from BbolroC/fix-image-authenticated-for-s390x
tests: Rebuild secure boot image for guest-pull-image-authenticated for IBM SE
2024-07-18 17:00:18 +02:00
Hyounggyu Choi
6e7ee4bdab tests: Rebuild secure image for guest-pull-image-authenticated on SE
Since #9904 was merged, newly introduced tests for `k8s-guest-pull-image-authenticated.bats`
have been failing on IBM SE (s390x). The agent fails to start because a kernel parameter
cannot pass to the guest VM via annotation. To fix this, the boot image must be rebuilt with
updated parameters.

This commit adds the rebuilding step in create_pod_yaml_with_private_image() for `qemu-se`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-18 14:56:12 +02:00
Archana Shinde
1636c201f4 network: Implement network hotunplug for physical endpoints
Similar to HotAttach, the HotDetach method signature for network
endpoints needs to be changed as well to allow for the method to make
use of device manager to manage the hot unplug of physical network
devices.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:41 -07:00
Archana Shinde
c6390f2a2a vfio: Introduce function to get vfio dev path
This function will be later used to get the vfio dev path.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:41 -07:00
Archana Shinde
1e304e6307 network: Implement hotplug for physical endpoints
Enable physical network interfaces to be hotplugged.
For this, we need to change the signature of the HotAttach method
to make use of Sandbox instead of Hypervisor. A similar approach was
followed for the Attach method, but this change was overlooked for
HotAttach.
The signature change is required in order to make use of
the device manager and receiver for physical network
endpoints.

Fixes: #8405

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:40 -07:00
Archana Shinde
2fef4bc844 vfio: use driver_override field for device binding.
The current implementation for device binding using driver bind/unbind
and new_id fails in the scenario when the physical device is not bound
to a driver before assigning it to vfio.
There exists an updated mechanism to accomplish the same that does not
have the same issue as above.
The driver_override field for a device allows us to specify the driver for a device
rather than relying on the bound driver to provide a positive match of the
device. It also has other advantages referenced here:
https://patchwork.kernel.org/project/linux-pci/patch/1396372540.476.160.camel@ul30vt.home/

So use the updated driver_override mechanism for binding/unbinding a
physical device/virtual function to vfio-pci.

Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:40 -07:00
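For readers unfamiliar with driver_override, the sysfs sequence the commit above refers to can be sketched like this. It is an illustration of the kernel interface, not the runtime's Go code: the function name and the `sysfs_root` parameter (added so the sketch can be exercised against a scratch directory) are assumptions.

```python
import os


def bind_to_driver(bdf: str, driver: str = "vfio-pci", sysfs_root: str = "/sys") -> None:
    """Bind a PCI device (e.g. "0000:3b:00.1") to a driver via driver_override."""
    dev_dir = os.path.join(sysfs_root, "bus/pci/devices", bdf)

    # 1. Name the driver that is allowed to claim this device.
    with open(os.path.join(dev_dir, "driver_override"), "w") as f:
        f.write(driver)

    # 2. Unbind from the current driver, if the device is bound at all.
    #    This is the case the older bind/unbind + new_id flow did not handle.
    unbind = os.path.join(dev_dir, "driver", "unbind")
    if os.path.exists(unbind):
        with open(unbind, "w") as f:
            f.write(bdf)

    # 3. Ask the PCI core to probe the device again; the override decides the match.
    with open(os.path.join(sysfs_root, "bus/pci/drivers_probe"), "w") as f:
        f.write(bdf)
```

On a real host this needs root privileges and the vfio-pci module loaded.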
GabyCT
6aff5f300a Merge pull request #10021 from GabyCT/topic/fixarchdoc
docs: Update devmapper docs
2024-07-17 14:56:40 -06:00
Saul Paredes
57d2ded3e2 gha: enable autogenerated policy testing on
SEV-SNP

Enable autogenerated policy testing on SEV-SNP

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-17 13:32:06 -07:00
Archana Shinde
30e5e88ff1 metric: Upgrade blogbench to 1.2
Move to blogbench 1.2 version from 1.1.
This version includes an important fix for the read_score test
which was reported to be broken in the previous version.
It essentially fixes this issue here:
https://github.com/jedisct1/Blogbench/issues/4

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 11:32:09 -07:00
Steve Horsman
e5d5284761 Merge pull request #10026 from wainersm/release_370
release: Bump VERSION to 3.7.0
2024-07-17 18:43:51 +01:00
Wainer dos Santos Moschetta
6f7ab31860 release: Bump VERSION to 3.7.0
In preparation for the 3.7.0 release, bumped the version in VERSION file.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-07-17 14:19:44 -03:00
Saul Paredes
b3cc8b200f gha: enable autogenerated policy testing on SEV
Enable autogenerated policy testing on SEV

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-17 09:55:13 -07:00
Dan Mihai
f31c1b121e Merge pull request #9812 from microsoft/saulparedes/test_policy_on_tdx
gha: enable policy testing on TDX
2024-07-17 08:47:44 -07:00
Dan Mihai
449103c7bf Merge pull request #10020 from microsoft/danmihai1/pod-security-context
tests: fix ps command in k8s-security-context
2024-07-17 08:12:57 -07:00
Fabiano Fidêncio
b7051890af Merge pull request #9722 from zvonkok/busybox-build
deploy: Add busybox target
2024-07-17 13:47:15 +02:00
Steve Horsman
5ce2c1010a Merge pull request #9904 from stevenhorsman/registry-authentication
Support for registry authentication in guest pull
2024-07-17 10:48:38 +01:00
Fupan Li
65f2bfb8c4 Merge pull request #9967 from zvonkok/kernel-dragonball-6.1.x
dragonball: kernel dragonball 6.1.x
2024-07-17 14:38:06 +08:00
Dan Mihai
0e86a96157 tests: fix ps command in k8s-security-context
1. Use a container image that supports "ps --user 1000 -f".
2. Execute that command using:

sh -c "ps --user 1000 -f"

instead of passing additional arguments to sh:

sh -c ps --user 1000 -f

Fixes: #10019

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-17 01:33:31 +00:00
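The difference between the two invocations in the commit above is that `sh -c` takes only the next argument as the command string; anything after it becomes `$0`, `$1`, and so on. A small sketch that reproduces the effect with `echo` in place of `ps` (the helper name `run` is just for this example):

```python
import subprocess


def run(argv):
    """Run a command and return its stdout as text."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


# Equivalent to: sh -c echo hello world
# Only "echo" is the command string; "hello" and "world" become $0 and $1,
# so echo runs without arguments and prints just an empty line.
broken = run(["sh", "-c", "echo", "hello", "world"])

# Equivalent to: sh -c "echo hello world"
# The whole string is the command, so echo receives both words.
fixed = run(["sh", "-c", "echo hello world"])

print(repr(broken), repr(fixed))
```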
Dan Mihai
9f4d1ffd43 genpolicy: container.exec_commands args validation
Keep track of individual exec args instead of joining them in the
policy text. Verifying each arg results in a more precise policy,
because some of the args might include space characters.

This improved validation applies to commands specified in K8s YAML
files using:

- livenessProbe
- readinessProbe
- startupProbe
- lifecycle.postStart
- lifecycle.preStop

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-17 01:19:23 +00:00
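Why per-argument checks are more precise than comparing a joined string can be shown with a short sketch. genpolicy itself is written in Rust and the policy in Rego; the two functions below are hypothetical stand-ins for the old and new comparison styles.

```python
def joined_match(allowed, requested):
    """Old style (illustrative): compare the args after joining with spaces."""
    return " ".join(allowed) == " ".join(requested)


def per_arg_match(allowed, requested):
    """New style (illustrative): every argument must match individually."""
    return list(allowed) == list(requested)


# A probe command whose last argument contains a space.
allowed = ["sh", "-c", "echo ready"]
# A different argv that happens to join to the same text.
lookalike = ["sh", "-c", "echo", "ready"]

print(joined_match(allowed, lookalike))   # the joined form cannot tell them apart
print(per_arg_match(allowed, lookalike))  # the per-argument form can
```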
Dan Mihai
b23ea508d5 tests: k8s: container.exec_commands policy tests
Add tests for genpolicy's handling of container.exec_commands. These
are commands allowed by the policy and originating from these input
K8s YAML fields:

- livenessProbe
- readinessProbe
- startupProbe
- lifecycle.postStart
- lifecycle.preStop

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-17 01:19:00 +00:00
stevenhorsman
567b4d5788 test/k8s: Fix up node logging typo
We had a typo in the attestation tests that we've copied around a
lot and Wainer spotted it in the authenticated registry tests, so let's fix it up now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
0015c8ef51 tests: Add guest-pull auth registry tests
Add three new test cases for guest pull from an authenticated registry for
the following scenarios:

_**Scenario**: Creating a container from an authenticated image, with correct credentials via KBC works_
**Given** An authenticated container registry *quay.io/kata-containers/confidential-containers-auth*
  **And** a version of kata deployed with a guest image that has an agent with `guest_pull`
feature enabled and nydus-snapshotter installed and configured for
[guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml)
  **And** a KBS set up to have the correct auth.json for
registry *quay.io/kata-containers/confidential-containers-auth* embedded in the `"Credential"` section of `its resources file`
**When** I create a pod from the container image *quay.io/kata-containers/confidential-containers-auth:test*
**Then** The pull image works and the pod can start

_**Scenario**: Creating a container from an authenticated image, with incorrect credentials via KBC fails_
**Given**  An authenticated container registry *quay.io/kata-containers/confidential-containers-auth*
  **And** a version of kata deployed with a guest image that has an agent with `guest_pull`
feature enabled and nydus-snapshotter installed and configured for
[guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml)
  **And** An installed kata CC with the sample_kbs set up to have the auth.json for registry
*quay.io/kata-containers/confidential-containers-auth* embedded in the `"Credential"` resource, but with a dummy user name and password
**When** I create a pod from the container image *quay.io/kata-containers/confidential-containers-auth:test*
**Then** The pull image fails with a message that reflects that the authorisation failed

_**Scenario**: Creating a container from an authenticated image, with no credentials fails_
**Given**  An authenticated container registry *quay.io/kata-containers/confidential-containers-auth*
  **And** a version of kata deployed with a guest image that has an agent with `guest_pull`
feature enabled and nydus-snapshotter installed and configured for
[guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml)
  **And** An installed kata CC with no credentials section
**When** I create a pod from the container image *quay.io/kata-containers/confidential-containers-auth:test*
**Then** The pull image fails with a message that reflects that the authorisation failed

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
eb07f5ef5e agent: doc: Fix ordering of options
- Fix the config options to be back in alphabetical order to be
easier to find

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
7cc81ce867 agent: image: Set image-rs auth config
If the agent-config has a value for `image_registry_auth`,
then pass this to the image-rs client and enable auth mode too

Fixes: #8122

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
265322990a agent: config: Add config option to provide auth for guest-pull
Add optional config for agent.image_registry_auth, to specify
the uri of credentials to be used when pulling images in the guest
from an authenticated registry

Fixes: #8122

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
Steve Horsman
064b45a2fa Merge pull request #10016 from wainersm/ibm-se-auth-reg
workflows: setup environment to run auth registry tests on s390x
2024-07-16 22:24:39 +01:00
Gabriela Cervantes
d2866081d2 docs: Update devmapper docs
This PR updates the devmapper docs by updating the url link
for the current containerd devmapper information.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-16 21:07:51 +00:00
GabyCT
2206e2dd5c Merge pull request #10013 from GabyCT/topic/updatecontdoc
docs: Update cri installation guide url in containerd documentation
2024-07-16 14:32:59 -06:00
Wainer dos Santos Moschetta
66c600f8d8 gha: delint the s390x workflow
Made run-k8s-tests-on-zvsi.yaml free of warnings by removing:

SC2086:info:1:1: Double quote to prevent globbing and word splitting ...
SC2086:info:2:1: Double quote to prevent globbing and word splitting ...

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-07-16 15:20:46 -03:00
Wainer dos Santos Moschetta
a98985fab8 gha: export user/password for auth registry tests on s390x
Counterpart of commit d8961cbd4a for run-k8s-tests-on-zvsi workflow

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-07-16 15:18:40 -03:00
Saul Paredes
af49252c69 gha: enable policy testing on TDX
Enable policy testing on TDX

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-15 14:09:49 -07:00
Saul Paredes
0b3d193730 genpolicy: Support cpath for mount sources
Add setting to allow specifying the cpath for a mount source.

cpath is the root path for most files used by a container. For example,
the container rootfs and various files copied from the Host to the
Guest when shared_fs=none are hosted under cpath.

mount_source_cpath is the root of the paths used as storage mount
sources. Depending on Kata settings, mount_source_cpath might have the
same value as cpath - but on TDX for example these two paths are
different: TDX uses "/run/kata-containers" as cpath,
but "/run/kata-containers/shared/containers" as mount_source_cpath.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-15 14:09:49 -07:00
Gabriela Cervantes
e4045ff29a docs: Update runtime v2 containerd url information
This PR updates the runtime v2 containerd url information at containerd
documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-15 20:36:17 +00:00
Dan Mihai
bcaf7fc3b4 Merge pull request #10008 from microsoft/danmihai1/runAsUser
genpolicy: add support for runAsUser fields
2024-07-15 12:08:50 -07:00
Gabriela Cervantes
9f738f0d05 docs: Update cri installation guide url in containerd documentation
This PR updates the cri installation guide url link in the containerd
documentation guide as the previous url link does not exists.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-15 16:58:18 +00:00
Dan Mihai
648265d80e Merge pull request #9998 from microsoft/danmihai1/GENPOLICY_PULL_METHOD
tests: k8s: GENPOLICY_PULL_METHOD clean-up
2024-07-15 09:32:29 -07:00
Steve Horsman
02b9fd6e95 Merge pull request #9382 from Xynnn007/feat-encrypt-image
Merge to main: supporting pull encrypted images
2024-07-15 15:58:42 +01:00
stevenhorsman
b060fb5b31 tests/k8s: Skip measured rootfs test
The only kernel built for measured rootfs was the kernel-tdx-experimental,
so this test only ran in the qemu-tdx job.
In commit 6cbdba7 we switched all TEE configurations to use the same kernel-confidential,
so rootfs measured is disabled for qemu-tdx too now.
The VM still fails to boot (because of a different reason...) but the bug
in assert_logs_contain, fixed in this PR, was masking the checks on the logs.
We still have a few open issues related to measured rootfs and generating
the root hash, so let's skip this test that doesn't work until they are looked at

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
stevenhorsman
2cf94ae717 tests: Add guest-pull encrypted image tests
Add three new test cases for guest-pull of an encrypted image
for the following scenarios:

_**Scenario: Pull encrypted image on guest with correct key works**_
**Given** I have a version of kata deployed with a guest image that has
an agent with `guest_pull` feature enabled and nydus-snapshotter installed
and configured for guest-pulling
  **And** A public encrypted container image *i* with a decryption key *k*
that is configured as a resource in the KBS, so that image-rs on the guest can
connect to it
**When** I try and create a pod from *i*
**Then** The pod is successfully created and runs

_**Scenario: Cannot pull encrypted image with no decryption key**_
**Given** I have a version of kata deployed with a guest image that has
an agent with `guest_pull` feature enabled and nydus-snapshotter installed
and configured for guest-pulling
  **And** A public encrypted container image *i* with a decryption key *k*,
that is **not** configured in a KBS that image-rs on the guest can connect to
**When** I try and create a pod from *i*
**Then** The pod is not created with an error message that reflects why

_**Scenario: Cannot pull encrypted image with wrong decryption key**_
**Given** I have a version of kata deployed with a guest image that has
an agent with `guest_pull` feature enabled and nydus-snapshotter installed
and configured for guest-pulling
  **And** A public encrypted container image *i* with a decryption key *k*
and a different key *k'* that is set as a resource in a KBS, that image-rs
on the guest can connect to
**When** I try and create a pod from *i*
**Then** The pod is not created with an error message that reflects why

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
Xynnn007
a56b15112a agent: add ocicrypt config
ocicrypt config is for kata-agent to connect to CDH to request the image
decryption key. This value is specified by an env. We use the same
workaround as the CCv0 branch.

In future, we will consider better ways instead of writing files and
setting envs inside inner logic of kata-agent.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-07-15 12:00:50 +01:00
Xynnn007
1072658219 agent: Enable kata-cc-rustls-tls in image-rs
- Enable the kata-cc-rustls-tls feature in image-rs, so that it
can get resources from the KBS in order to retrieve the registry
credentials.
- Also bump to the latest image-rs to pick up protobuf fixes
- Add libprotobuf-dev dependency to the agent packaging
as it is needed by the new image-rs feature
- Add an extra env in the agent make test, as the
new version of the anyhow crate has changed backtrace capture, so unit
tests of kata-agent that compare a raised error with an expected one
would fail. To fix this, we need only panics to have backtraces, so
set RUST_BACKTRACE=0 for tests, as documented at
https://docs.rs/anyhow/latest/anyhow/

Fixes #9538

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
stevenhorsman
3b72e9ffab tests/k8s: Fix assert_logs_contain
The pipe needs adding to the grep, otherwise the grep
gets consumed as an argument to `print_node_journal` and
run in the debug pod.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
Hyounggyu Choi
83b3a681f4 Merge pull request #10010 from BbolroC/osbuilder-bump-fedora-to-40
osbuilder: Bump Fedora to 40
2024-07-15 13:00:28 +02:00
Greg Kurz
203d9e7803 Merge pull request #10000 from littlejawa/kata_deploy_add_storage_config_for_crio
kata-deploy: add storage configuration for cri-o
2024-07-15 12:29:21 +02:00
Hyounggyu Choi
08d2f6bfe4 osbuilder: Bump Fedora to 40
As Fedora 38 has reached EOL, we are encountering 404 errors for s390x, such as:

```
Status code: 404 for https://dl.fedoraproject.org/pub/fedora-secondary/updates/38/Everything/s390x/repodata/repomd.xml
```

Let's bump the OS to the latest version.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-15 09:58:54 +02:00
Fupan Li
a7179be31d Merge pull request #9534 from Tim-Zhang/fix-stdin-stuck
Fix ctr exec stuck problem
2024-07-15 13:19:19 +08:00
Dan Mihai
dded329d26 tests: k8s: SecurityContext.runAsUser policy test
Add test for auto-generating policy for a pod spec that includes the
SecurityContext.runAsUser field.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:37:58 +00:00
Dan Mihai
7040fb8c50 tests: k8s-security-context auto-generated policy
Auto-generate the policy in k8s-security-context.bats - previously
blocked by lacking support for PodSecurityContext.runAsUser.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:23:54 +00:00
Dan Mihai
f087044ecb genpolicy: add support for runAsUser
Add ability to auto-generate policy for SecurityContext.runAsUser and
PodSecurityContext.runAsUser.

Fixes: #8879

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:10:43 +00:00
Dan Mihai
5282701b5b genpolicy: add link to allow_user() active issue
Improve the comment on the workaround in rules.rego, to better explain the
reason for that workaround.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:05:58 +00:00
GabyCT
3c0171df3d Merge pull request #10005 from GabyCT/topic/katadragonball
common: Add share fs information for dragonball
2024-07-12 16:10:29 -06:00
Wainer Moschetta
646d7ea4fb Merge pull request #9951 from BbolroC/enable-attestation-for-ibm-se
tests: Enable attestation e2e tests for IBM SE
2024-07-11 16:02:59 -03:00
Hyounggyu Choi
ca80301b4b Merge pull request #10003 from BbolroC/skip-pod-shared-volume-for-ibm-se
k8s: Skip shared-volume relevant tests for IBM SE
2024-07-11 19:29:13 +02:00
Gabriela Cervantes
4477b4c9dc common: Add share fs information for dragonball
This PR adds the share fs information for dragonball using kata-ctl
to avoid the failures in runk tests saying that shared_fs is an
unbound variable.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-11 17:09:35 +00:00
Dan Mihai
09c5ca8032 tests: k8s: clarify the need to use containerd.sock
Modify the permissions of containerd.sock just when genpolicy needs
access to this socket, when testing GENPOLICY_PULL_METHOD=containerd.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:49:58 +00:00
Dan Mihai
c1247cc254 tests: k8s: explain the default containerd settings
Explain why the containerd settings on the local machine get set to
containerd's defaults when testing GENPOLICY_PULL_METHOD=containerd.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:49:39 +00:00
Dan Mihai
3b62eb4695 tests: k8s: add comment for GENPOLICY_PULL_METHOD
Explain why there are two different methods for pulling container
images in genpolicy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:40:01 +00:00
Dan Mihai
eaedd21277 tests: k8s: use oci-distribution as default value
oci-distribution is the value used by run-k8s-tests-on-aks.yaml, so
use the same value as default for GENPOLICY_PULL_METHOD in gha-run.sh.

The value of GENPOLICY_PULL_METHOD is currently compared just with
"containerd", but avoid possible future problems due to using a
different default value in gha-run.sh.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-11 16:40:01 +00:00
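The defaulting behaviour described in eaedd21277 can be sketched as follows. This is a minimal Python illustration, not the actual gha-run.sh logic (which is a shell script); the helper names `genpolicy_pull_method` and `needs_containerd_socket` are hypothetical.

```python
import os

# Default taken from the commit message: the same value that
# run-k8s-tests-on-aks.yaml uses.
DEFAULT_PULL_METHOD = "oci-distribution"


def genpolicy_pull_method(env=None):
    """Return GENPOLICY_PULL_METHOD, falling back to the default when unset."""
    env = os.environ if env is None else env
    return env.get("GENPOLICY_PULL_METHOD", DEFAULT_PULL_METHOD)


def needs_containerd_socket(env=None):
    """Only the "containerd" method is compared against, per the commit message."""
    return genpolicy_pull_method(env) == "containerd"
```

Because the only comparison is against "containerd", any other default behaves the same today; using the same default as the workflow simply avoids the two drifting apart later.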
GabyCT
2056eda5f0 Merge pull request #9922 from GabyCT/topic/updateblogname
metrics: Update container name in blogbench test
2024-07-11 10:05:35 -06:00
Hyounggyu Choi
32c3e55cde k8s: Skip shared-volume relevant tests for IBM SE
Currently, it is not viable to share a writable volume (e.g., emptyDir)
between containers in a single pod for IBM SE.
The following tests are relevant:
  - pod-shared-volume.bats
  - k8s-empty-dirs.bats
(See: https://github.com/kata-containers/kata-containers/issues/10002)

This commit skips the tests until the issue is resolved.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-11 14:09:19 +02:00
Julien Ropé
b83d4e1528 kata-deploy: add storage configuration for cri-o
Make sure that the "skip_mount_home" flag is set in cri-o config.

Fixes: #9878

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-07-11 10:11:30 +02:00
Qi Feng Huo
4d66ee1935 initdata: add initdata annotation in hypervisor config
- Add Initdata annotation for hypervisor config, so that it can be passed to CreateVM

Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
2024-07-11 10:56:18 +08:00
GabyCT
dac07239f5 Merge pull request #9974 from squarti/sharedfs
runtime: Initialize SharedFS for remote hypervisor
2024-07-10 17:03:00 -06:00
GabyCT
3827b5f9f2 Merge pull request #9982 from ChengyuZhu6/fix-ci
tests: Delete test scripts forcely
2024-07-10 17:00:41 -06:00
Wainer Moschetta
deb4627558 Merge pull request #9975 from niteeshkd/nd_snp_attestation
gha: enable SNP attestation
2024-07-10 18:59:05 -03:00
GabyCT
c40b3b4ce7 Merge pull request #9992 from sprt/fix-nydus
ci: fix run-nydus tests
2024-07-10 13:56:16 -06:00
David Esparza
be9385342e Merge pull request #9990 from GabyCT/topic/tdxtimeout
gha: Increase timeout to run CoCo TDX tests
2024-07-10 13:21:23 -06:00
Silenio Quarti
8260ce8d15 runtime: Initialize SharedFS for remote hypervisor
Sets SharedFS config to NoSharedFS for remote hypervisor in order to start the file watcher which syncs files from the host to the guest VMs. 

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-07-10 14:31:25 -03:00
Aurélien Bombo
25e0e2fb35 ci: fix run-nydus tests
GH-9973 introduced:

 * New function get_kata_memory_and_vcpus() in
   tests/metrics/lib/common.bash.
 * A call to get_kata_memory_and_vcpus() from extract_kata_env(), which
   is defined in tests/common.bash.

Because the nydus test only sources tests/common.bash, it can't find
get_kata_memory_and_vcpus() and errors out.

We fix this by moving the get_kata_memory_and_vcpus() call from
tests/common.bash to tests/metrics/lib/json.bash so that it doesn't
impact the nydus test.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-10 17:19:08 +00:00
Gabriela Cervantes
b6b8524ab7 gha: Increase timeout to run CoCo TDX tests
This PR increases the timeout for running the CoCo TDX tests in order
to avoid random failures on TDX, where the message
"The action 'Run tests' has timed out after 30 minutes" appears and
makes the GHA job fail.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-10 16:06:07 +00:00
Niteesh Dubey
e8a3f8571e docs: update for SNP attestation
This updates the how-to document for SNP attestation.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-10 15:06:55 +00:00
Niteesh Dubey
ff04154fdb gha: enable SNP attestation
This removes the code to skip the SNP attestation.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-10 15:06:55 +00:00
Hyounggyu Choi
d94b285189 tests: Enable k8s-confidential-attestation.bats for s390x
For running a KBS with `se-verifier` in service,
specific credentials need to be configured.
(See https://github.com/confidential-containers/trustee/tree/main/attestation-service/verifier/src/se for details.)

This commit introduces three procedures to support IBM SE attestation:

- Prepare required files and directory structure
- Set necessary environment variables for KBS deployment
- Repackage a secure image once the KBS service address is determined

These changes enable `k8s-confidential-attestation.bats` for s390x.

Fixes: #9933

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
5d0f74cd70 local-build: Extract build_secure_image() as a separate library
Currently, all functions in `build_se_image.sh` are dedicated to
publishing a payload image. However, `build_secure_image()` is now
also used for repackaging a secure image when a kernel parameter
is reconfigured. This reconfiguration is necessary because the KBS
service address is determined after the initial secure image build.

This commit extracts `build_secure_image()` from `build_se_image.sh`
and creates a separate library, which can be loaded by bats-core.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
bf2f0ea2ca tests: Change a location for creating key.bin
The current KBS deployment creates a file `key.bin` assuming that
`kustomization.yaml` is located in `overlays/`.

However, this does not hold true when the kustomize config is enabled
for multiple architectures. In such cases, the configuration file
should be located in `overlays/$(uname -m)`.
This commit changes the location for file creation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
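The path selection described in bf2f0ea2ca can be sketched as below. This is a hedged Python illustration only: the real change lives in the test shell scripts, and the function name `key_bin_path` and its `multi_arch` flag are hypothetical. `platform.machine()` stands in for `$(uname -m)`.

```python
import platform
from pathlib import Path


def key_bin_path(overlays_dir, multi_arch, machine=None):
    """Return where key.bin should be created.

    With a single-arch kustomize config, kustomization.yaml sits directly in
    overlays/. With multi-arch enabled it sits in overlays/<arch>, so the
    key file has to be created there instead.
    """
    base = Path(overlays_dir)
    if multi_arch:
        # Stand-in for $(uname -m); callers may pass an explicit value.
        base = base / (machine or platform.machine())
    return base / "key.bin"
```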
Hyounggyu Choi
4025ef7193 versions: Bump trustee to multi-arch deployment for KBS
As part of the enablement for s390x, KBS should support multi-arch deployment.
This commit updates the version of coco-trustee to a commit where the support
is implemented.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Hyounggyu Choi
856a1f72c6 packaging: Set ATTESTER to se-attester for guest components on s390x
This commit allows the guest-components builder to only build se-attester on s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-10 16:18:37 +02:00
Xuewei Niu
7f71eac6de Merge pull request #9868 from l8huang/dan
runtime: implement DAN in Go kata-runtime
2024-07-10 19:09:46 +08:00
Alex Lyn
dafff26f01 Merge pull request #9814 from Apokleos/bugfix-pcipath
runtime-rs: bugfix for root bus slot allocation
2024-07-10 16:19:06 +08:00
Steve Horsman
aa487307e8 Merge pull request #9962 from GabyCT/topic/removecif
scripts: Eliminate CI variable as it is not longer used
2024-07-10 09:02:33 +01:00
Steve Horsman
78bbc51ff0 Merge pull request #9806 from niteeshkd/nd_snp_certs
runtime: pass certificates to get extended attestation report for SNP coco
2024-07-10 08:57:45 +01:00
Steve Horsman
29413021e5 Merge pull request #9981 from stevenhorsman/run-k8s-tests-on-zvsi-inherit-secrets
gha: make run-k8s-tests-on-zvsi inherit secrets
2024-07-10 08:49:11 +01:00
Lei Huang
171d298dea runtime: implement DAN in Go kata-runtime
The DAN feature has already been implemented in kata-runtime-rs, and
this commit brings the same capability to the Go kata-runtime.

Fixes: #9758

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-07-10 00:22:30 -07:00
ChengyuZhu6
489afffd8c tests:gha: delete namespace before resetting namespace
Delete the kata-containers-k8s-tests namespace before resetting the namespace
to ensure that no deployments or services are restarting and creating pods in the default namespace.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Wang, Arron <arron.wang@intel.com>
2024-07-10 12:08:28 +08:00
ChengyuZhu6
e874c8fa2e tests: Delete test scripts forcely
Forcibly delete test scripts in the `Delete kata-deploy` step before
deleting all kata pods.

Fixes: #9980

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-10 12:08:28 +08:00
Alex Lyn
806e959b01 runtime-rs: bugfix for device slot allocation failed in dragonball
In dragonball VFIO device passthrough scenarios, the first passthrough
device is allocated slot 0, which is already occupied by the root device.
This causes an error like the following:
```
...
6: failed to add VFIO passthrough device: NoResource\n
7: no resource available for VFIO device"): unknown
...
```
To address this problem, we no longer pre-allocate the guest device id.
Instead, dragonball auto-allocates the guest device id and returns it
to the runtime. Accordingly, add_device now returns
Result<DeviceType>, and the related code is updated to match.

Fixes #9813

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-10 10:59:57 +08:00
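The allocation scheme described in 806e959b01 (let the VMM pick a free slot and hand it back, rather than having the runtime pre-assign one) can be modelled roughly as follows. This is a toy Python sketch, not the runtime-rs or dragonball code; the `RootBus` class, the 32-slot bus size, and the error text are assumptions for illustration.

```python
class RootBus:
    """Toy model of a root bus whose slot 0 is taken by the root device."""

    ROOT_DEVICE_SLOT = 0

    def __init__(self, num_slots=32):  # bus size is an assumption
        self.num_slots = num_slots
        self.slots = {self.ROOT_DEVICE_SLOT: "root"}

    def add_device(self, name):
        """Auto-allocate the lowest free slot and return it to the caller."""
        for slot in range(self.num_slots):
            if slot not in self.slots:
                self.slots[slot] = name
                return slot
        raise RuntimeError("no resource available for device")
```

Returning the allocated slot from `add_device` is what lets the caller learn the guest device id afterwards, so it never has to guess a slot that may already be occupied.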
Alex Lyn
27947cbb0b dragonball: make add vfio device return guest device id
Fixes #9813

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-10 10:59:51 +08:00
Alex Lyn
fa4af09658 Merge pull request #9985 from GabyCT/topic/fixcrites
cri-containerd: Remove use_devmapper variable for cri-containerd tests
2024-07-10 10:13:27 +08:00
Alex Lyn
e4997760f1 Merge pull request #9987 from kata-containers/remove_double_process_check_from_memory_usage_test
metrics: Remove duplicate check of processes from memory test.
2024-07-10 10:12:18 +08:00
David Esparza
09f523c815 Merge pull request #9973 from kata-containers/add_memory_and_vcpus_info_to_results
Add memory and vcpus info to metrics results
2024-07-09 18:05:07 -06:00
David Esparza
e77d44614b metrics: Remove duplicate check of processes from memory test.
This PR removes the common_init function call from the memory
usage script to eliminate duplicate checking that is also done
from the init_env function.

It also eliminates duplication of nested conditionals.

Fixes: #9984

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-09 12:34:51 -06:00
Gabriela Cervantes
7061272b4e kernel: bump kata config version
This PR bumps the kata config version as the kernel scripts were
modified.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
de848c1458 packaging: Remove CI variable from build kernel script
This PR removes the CI variable from the build kernel script. The
variable is no longer supported, as it was part of the Jenkins
environment.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
28601b51d2 tools: Remove CI variable in kata deploy in docker script
This PR removes the CI variable from the kata deploy in docker script.
The variable was supported in the Jenkins environment, which is no
longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
f2b8c6619d makefile: Remove CI variable from local build makefile
This PR removes the CI variable from the local build makefile, as
it was part of the Jenkins environment, which is no longer
supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Gabriela Cervantes
4161fa3792 tools: Remove CI variable in test images script for osbuilder
This PR removes the CI variable from the test images script for osbuilder,
as it was part of the Jenkins environment, which is no longer
supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 20:04:24 +02:00
Greg Kurz
7506d1ec29 tools: Remove CI variable in test config osbuilder script
This PR removes the CI variable from the test config osbuilder script.
The variable was supported in the Jenkins environment, which is no
longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
[greg: squash all fixes into a single patch]
Signed-off-by: Greg Kurz <groug@kaod.org>
2024-07-09 20:03:08 +02:00
Niteesh Dubey
647dad2a00 gha: skip SNP attestation test
Skip the SNP attestation test for now.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 17:16:07 +00:00
Niteesh Dubey
e7b4e5e386 gha: add SNP attestation test
This tests the attestation of SNP guest.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 17:14:26 +00:00
Gabriela Cervantes
1a1e62b968 cri-containerd: Remove use_devmapper variable for cri-containerd tests
This PR removes the use_devmapper variable, which was one of the Jenkins
environment flags and is no longer supported or available for the
cri-containerd tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-09 17:09:55 +00:00
GabyCT
eb0bc5007c Merge pull request #9976 from sprt/fix-cri-containerd
tests: cri-containerd: Ensure Docker isn't present
2024-07-09 11:02:20 -06:00
David Esparza
04df85a44f metrics: Add num_vcpus and free_mem to metrics results template.
This PR retrieves the free memory and the vcpus count from
a kata container and includes them in the JSON results file of
any metric.

Additionally, this PR parses the requested number of vcpus and the
requested amount of memory from the kata configuration file and includes
this pair of values in the JSON results file of any metric.

Finally, the file system defined in the kata configuration file
is included in the results template.

Fixes: #9972

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-09 10:29:29 -06:00
David Esparza
a554541495 metrics: Improvement to the description of certain functions.
This PR rephrases the description and usage of certain functions,
such as:
- set_kata_configuration_performance
- set_kata_config_file
- get_current_kata_config_file
- check_if_root
- check_ctr_images

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-07-09 10:29:29 -06:00
stevenhorsman
c7cf26fa32 gha: make run-k8s-tests-on-zvsi inherit secrets
run-k8s-tests-on-zvsi runs the coco tests and we've added new
secrets to provide credentials for the authenticated image testing,
so we need to let the zvsi job inherit these from the caller workflow
like the rest of the coco tests

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-09 15:29:48 +01:00
Hyounggyu Choi
37b907dfbc Merge pull request #9859 from BbolroC/set-ocispec-for-vfio-ap
tests: Extend vfio-ap hotplug test to use a zcrypttest tool
2024-07-09 14:03:45 +02:00
Steve Horsman
ff498c55d1 Merge pull request #9719 from fitzthum/sealed-secret
Support Confidential Sealed Secrets (as env vars)
2024-07-09 09:43:51 +01:00
Niteesh Dubey
529660fafb runtime: pass certificates for SNP coco
This will be used to get extended attestation report.

Fixes: #9805

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 03:46:00 +00:00
Tim Zhang
704da86e9b CI: Add tests for stdio
Add tests for stdio

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-07-09 11:44:40 +08:00
Tim Zhang
8801554889 runtime-rs: Fix ctr exec stuck problem
Fixes: #9532

Instead of calling agent.close_stdin in close_io, we call agent.write_stdin
with zero-length data when the stdin pipe ends.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-07-09 11:44:36 +08:00
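The end-of-stdin signalling described in 8801554889 can be illustrated with a small sketch. This is a generic Python model under stated assumptions, not the runtime-rs implementation: `forward_stdin` and its `write_stdin` callback are hypothetical stand-ins for the code that relays stdin to the agent.

```python
import io


def forward_stdin(stdin, write_stdin, chunk_size=4096):
    """Relay stdin chunks; signal end-of-input with a zero-length write.

    `write_stdin` is a hypothetical callback standing in for the agent call.
    An empty read means the stdin pipe has ended, and that empty buffer is
    forwarded too, so the receiver sees the end of input in order with the
    data that preceded it.
    """
    while True:
        data = stdin.read(chunk_size)
        write_stdin(data)
        if not data:
            break
```

For example, forwarding `io.BytesIO(b"hello")` with `chunk_size=4` produces three writes: two data chunks followed by one empty write.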
Tobin Feldman-Fitzthum
1c2d69ded7 tests: add test for sealed env secrets
The sealed secret test depends on the KBS to provide
the unsealed value of a vault secret.

This secret is provisioned to an environment variable.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-07-08 17:41:20 -05:00
Linda Yu
b4d61f887b agent: unittest for sealed secret as env in kata
To test unsealing secrets stored in environment variables,
we create a simple test server that takes the place of
the CDH. We start this server and then use it to
unseal a test secret.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-07-08 17:32:45 -05:00
Linda Yu
6003608fe6 agent: support sealed secret as env in kata
When sealed-secret is enabled, the Kata Agent
intercepts environment variables containing
sealed secrets and uses the CDH to unseal
the value.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-07-08 17:31:33 -05:00
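The interception step described in 6003608fe6 can be sketched as follows. This is a hedged Python illustration, not the agent's Rust code: the `sealed.` prefix used to recognise a sealed value and the `unseal` callback (standing in for the ttRPC call to the CDH) are assumptions made for the example.

```python
# Assumption for illustration: sealed values are recognised by this prefix.
SEALED_PREFIX = "sealed."


def unseal_env(env, unseal):
    """Return a copy of `env` ("NAME=value" strings) with sealed values unsealed.

    `unseal` is a hypothetical callable standing in for the CDH request.
    Entries without a sealed value are passed through unchanged.
    """
    result = []
    for entry in env:
        name, sep, value = entry.partition("=")
        if sep and value.startswith(SEALED_PREFIX):
            value = unseal(value)
        result.append(f"{name}{sep}{value}")
    return result
```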
Gabriela Cervantes
cf2d5ff4c1 scrips: Fix indentation in QAT run script
This PR fixes the indentation of the QAT run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:23:50 +00:00
Gabriela Cervantes
d53eb61856 QAT: Remove CI variable from QAT run script
This PR removes the CI variable from the QAT run script. The variable
was used in the Jenkins environment and is no longer used.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:16:00 +00:00
Gabriela Cervantes
8a79b1449e tests: Remove CI variable in tracing test
This PR removes the CI variable as well as the instructions related
to it, as this was part of the Jenkins environment, which is no
longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:12:41 +00:00
Gabriela Cervantes
9d44abb406 tests: Remove CI variable in test agent shutdown
This PR removes the CI variable as well as the instructions related
to this variable, which was used in the Jenkins environment and is no
longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:10:24 +00:00
Gabriela Cervantes
f2ed8dc568 docs: Remove CI variable from Intel QAT documentation
This PR updates the Intel QAT documentation by removing the CI variable,
which is no longer supported, as it was part of the Jenkins
CI environment.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:05:47 +00:00
Gabriela Cervantes
ff06ef0bbc scripts: Eliminate CI variable as it is not longer used
This PR removes the CI variable, which is no longer used or valid
in the kata containers repository. The CI variable was used when we
were using Jenkins and script setups that are no longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-08 20:00:30 +00:00
GabyCT
cb0fb91bdd Merge pull request #9966 from GabyCT/topic/fixstability
tests: Use variable already defined in metrics common script for stability tests
2024-07-08 13:55:55 -06:00
Aurélien Bombo
e9d6179b28 tests: cri-containerd: Ensure Docker isn't present
Following #9960 that transitioned this test to a free runner, we need to
ensure Docker isn't installed on the system as that will conflict with
the installation of Podman.

Example error:
https://github.com/kata-containers/kata-containers/actions/runs/9818218975/job/27177785716

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-08 18:50:57 +00:00
Steve Horsman
e8836fafaa Merge pull request #9828 from stevenhorsman/image-rs-bump-bad84c7
Image rs bump to latest main
2024-07-08 17:07:59 +01:00
Fabiano Fidêncio
67ba0ad0ad Merge pull request #9971 from GabyCT/topic/fixnerdctldep
gha: Fix pip installation for nerdctl GHA
2024-07-06 21:37:55 +02:00
Gabriela Cervantes
724b2c612c gha: Fix pip installation for nerdctl GHA
This PR fixes the pip installation for nerdctl by removing a flag
which is no longer supported, avoiding the failure
`no such option: --break-system-packages`.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-05 17:31:52 +00:00
stevenhorsman
1d6c1d1621 test: Add journal logging for debug
- Due to the error we hit with pulling the agnhost
image used in the liveness-probe tests, we want to leave
the console printing to help with debug when we next try
to bump the image-rs version

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 10:25:28 +01:00
stevenhorsman
d511820974 agent: Bump image-rs
- Bump the commit of image-rs we are pulling in to 413295415
Note: This is the last commit before a change to whiteout handling
was introduced that led to the error `failed to unpack: convert whiteout`
when pulling the agnhost:2.21 image

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 10:25:28 +01:00
Fabiano Fidêncio
543c90f145 Merge pull request #9695 from ChengyuZhu6/fix-init
Fix issues on CI about guest-pull
2024-07-05 11:21:08 +02:00
ChengyuZhu6
65dc12d791 tests: Re-enable k8s-kill-all-process-in-container.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
2ea521db5e tests:tdx: Re-enable k8s-liveness-probes.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
93453c37d6 tests: Re-enable k8s-sysctls.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
6c5e053dd5 tests: Re-enable k8s-shared-volume.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
85979021b3 tests: Re-enable k8s-file-volume.bats
This test was fixed by previous patches in this PR: kata-containers#9695

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
e71c7ab932 agent/image: Remove functions about merging container spec for guest pull
Let me explain why:

In our previous approach, we implemented guest pull by passing PullImageRequest to the guest.
However, this method resulted in the loss of specifications essential for running the container,
such as commands specified in YAML, during the CreateContainer stage. To address this,
it was necessary to integrate the OCI specifications and process information
from the image's configuration with the container in guest pull.

The snapshotter method does not have this issue. Nevertheless, a problem arises
when two containers in the same pod attempt to pull the same image, as with an InitContainer.
This is because the image service searches for the existing configuration,
which resides in the guest. The configuration, associated with <image name, cid>,
is stored in the directory /run/kata-containers/<cid>. Consequently, when the InitContainer finishes
its task and terminates, the directory ceases to exist. As a result, during the creation
of the application container, the OCI spec and process information cannot
be merged due to the absence of the expected configuration file.

Fixes: kata-containers#9665
Fixes: kata-containers#9666
Fixes: kata-containers#9667
Fixes: kata-containers#9668

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
c9d1a758cd agent/image: Reuse the mountpoint in image-rs
Currently, the image is pulled by image-rs in the guest and mounted at
`/run/kata-containers/image/cid/rootfs`. Finally, the agent rebinds
`/run/kata-containers/image/cid/rootfs` to `/run/kata-containers/cid/rootfs` in CreateContainer.
However, this process requires specific cleanup steps for these mount points.

To simplify, we reuse the mount point `/run/kata-containers/cid/rootfs`
and allow image-rs to directly mount the image there, eliminating the need for rebinding.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
stevenhorsman
05cd1cc7a0 agent: Add CreateContainer support for pre-pulled bundle
- Add a check in setup_bundle to see if the bundle already exists
and if it does then skip the setup.

This commit is cherry-picked from 44ed3ab80e.

k8s-kill-all-process-in-container.bats failed because deletion of the
directory `/root/kata-containers/cid/rootfs` failed when removing the container:
it was mounted twice (once in image-rs and once in setup_bundle) but only unmounted once when removing the container.

Fixes: #9664

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Dave Hay <david_hay@uk.ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 08:10:00 +08:00
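The failure mode and fix described in 05cd1cc7a0 can be modelled with a small sketch. This is a toy Python model under stated assumptions, not the agent's Rust code: `MountTable` and this `setup_bundle` are hypothetical, and "bundle already exists" is approximated here as "the rootfs is already mounted".

```python
from collections import Counter


class MountTable:
    """Toy model: a directory can only be removed once its mount count is 0."""

    def __init__(self):
        self.mounts = Counter()

    def mount(self, path):
        self.mounts[path] += 1

    def umount(self, path):
        if self.mounts[path] == 0:
            raise RuntimeError(f"{path} is not mounted")
        self.mounts[path] -= 1

    def can_remove(self, path):
        return self.mounts[path] == 0


def setup_bundle(table, rootfs):
    """Skip setup when the bundle was already prepared (pre-pulled image)."""
    if not table.can_remove(rootfs):
        return False  # already mounted by the image pull; do not mount again
    table.mount(rootfs)
    return True
```

With the skip in place, a rootfs mounted once by the image pull is unmounted once on container removal, so the directory can then be deleted.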
Zvonko Kaiser
7990d3a154 dragonball: Update kata config version
Mandatory update

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:24:16 +00:00
Zvonko Kaiser
cfbca4fe0d dragonball: Update versions
Use the latest guest kernel that we use for all other VMMs

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:24:16 +00:00
Zvonko Kaiser
26446d1edb dragonball: Update patches
After v5.14 there is no cpu_hotplug_begin function; it is now
cpus_write_lock. Likewise, cpu_hotplug_done is now cpus_write_unlock.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:23:24 +00:00
Zvonko Kaiser
ad574b7e10 dragonball: Add patches for 6.1.x
Ported the 5.10 patches to 6.1.x

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-07-04 17:06:39 +00:00
Gabriela Cervantes
757f37d956 stability: General improvements for soak parallel test
This PR improves the variable definitions and uses a variable
which is already defined in the metrics common script for the soak
parallel test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-04 16:32:46 +00:00
Gabriela Cervantes
6d56abbdad stability: General improvements to agent stability test
This PR improves the variable definitions and uses the
CTR_EXE variable, which is already defined in the metrics common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-04 16:24:27 +00:00
Gabriela Cervantes
3e6c32c3c8 tests: Use variable already defined in stability tests
This PR uses the CTR_EXE which is already defined in the metrics common
script to have uniformity across the multiple stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-04 16:21:24 +00:00
Steve Horsman
ddb8a94677 Merge pull request #9960 from sprt/fix-garm
ci: Transition GARM tests to free runners, pt. I
2024-07-04 09:04:58 +01:00
Biao Lu
6c1a2f01f8 protocols: add support for sealed_secret service
To unseal a secret, the Kata agent will contact the CDH
using ttRPC. Add the proto that describes the sealed
secret service and messages that will be used.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Biao Lu <biao.lu@intel.com>
2024-07-04 01:03:41 -05:00
Fabiano Fidêncio
49696bbdf2 Merge pull request #9943 from AdithyaKrishnan/nydus-cleanup-timeout
tests: Fixes TEE timeout issue
2024-07-03 22:57:17 +02:00
Anastassios Nanos
db75b5f3c4 Merge pull request #8070 from nubificus/feat_add-fc-runtime-rs
runtime-rs: firecracker hypervisor backend
2024-07-03 22:29:30 +03:00
Adithya Krishnan Kannan
9250858c3e tests: Stop trying to patch finalize
We have not seen instances of the nydus snapshotter hanging on its
deletion that would require us to patch its finalize.

Let's just drop this line for now.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-07-03 12:19:26 -05:00
Dan Mihai
ada53744ea Merge pull request #9907 from microsoft/saulparedes/allow_empty_env_vars
genpolicy: allow some empty env vars
2024-07-03 08:07:23 -07:00
Aurélien Bombo
f18e35014f ci: Move run-nerdctl-tests to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:58:11 +00:00
Aurélien Bombo
c0919d6f45 ci: Move run-docker-tests to free runner
Removed the Docker installation step as that's preinstalled in free
runners.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:59 +00:00
Aurélien Bombo
743a765525 ci: Move run-runk to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:48 +00:00
Aurélien Bombo
09cce86cc7 ci: Move run-nydus to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:42 +00:00
Aurélien Bombo
9e1b6064dc ci: Move run-containerd-stability to free runner
Removes the Docker installation step as that's preinstalled on the free
runner:

https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md#tools

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:37 +00:00
Aurélien Bombo
6a0e403acf ci: Move run-cri-containerd to free runner
See #9940.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-03 14:57:29 +00:00
George Pyrros
2d19f3fbd7 runtime-rs: firecracker hypervisor backend
Add a basic runtime-rs `Hypervisor` trait implementation for
AWS Firecracker

- Add basic hypervisor operations (setup / start / stop / add_device)
- Implement AWS Firecracker API on a separate file `fc_api.rs`
- Add support for running jailed (include all sandbox-related content)
- Add initial device support (limited as hotplug is not supported)
- Add separate config for runtime-rs (FC)

Notes:
- devmapper is the only snapshotter supported
- to account for no sharefs support, we copy files in the sandbox (as
  in the GO runtime)
- nerdctl spawn is broken (TODO: #7703)

Fixes: #5268

Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: Charalampos Mainas <cmainas@nubificus.co.uk>
Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>
2024-07-03 08:30:30 +00:00
GabyCT
e3e3873857 Merge pull request #9954 from GabyCT/topic/sysbenchci
metrics: Remove variable in sysbench that is not being used
2024-07-02 16:58:46 -06:00
Aurélien Bombo
eda5d2c623 ci: cleanup: Run every 24 hours instead of 6 hours
Resources don't fail to get deleted often enough to warrant running
every 6 hours.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-02 22:27:58 +00:00
Aurélien Bombo
f20924db24 ci: cleanup: Ignore nonexisting resources
Some resource names seem to be lingering in Azure limbo but do not map
to any actual resources, so we ignore those.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-07-02 22:23:54 +00:00
GabyCT
0590aab3e6 Merge pull request #9952 from GabyCT/topic/unitjenkins
docs: Remove jenkins reference from unit testing presentation
2024-07-02 15:34:25 -06:00
Aurélien Bombo
33d08a8417 Merge pull request #9825 from microsoft/mahuber/main
osbuilder: allow rootfs builds w/o git or version file deps
2024-07-02 09:38:13 -07:00
Steve Horsman
078a1147a6 Merge pull request #9909 from kata-containers/sprt/gha-cleanup-pt2
ci: Add scheduled job to cleanup resources, pt. II
2024-07-02 17:12:03 +01:00
Gabriela Cervantes
b7da1291ea metrics: Remove variable in sysbench that is not being used
This PR removes the CI_JOB variable, which was previously used by the
metrics sysbench test but is no longer supported.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-02 15:29:50 +00:00
Wainer Moschetta
ec695f67e1 Merge pull request #9577 from microsoft/saulparedes/topology
genpolicy: add topologySpreadConstraints support
2024-07-02 11:24:26 -03:00
Fabiano Fidêncio
ef3f6515cf Merge pull request #9941 from sprt/temp-disable-test
ci: Temporarily disable kata-deploy and GARM tests
2024-07-02 14:13:46 +02:00
Amulya Meka
dd12089e0d Merge pull request #9914 from Amulyam24/qemu-fix
kata-deploy: fix qemu static build on ppc64le
2024-07-02 10:45:03 +05:30
Saul Paredes
f3f3caa80a genpolicy: update sample
Update pod-one-container.yaml sample

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-01 13:49:08 -07:00
Dan Mihai
75aee526a9 genpolicy: add topologySpreadConstraints support
Allow genpolicy to process Pod YAML files including
topologySpreadConstraints.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-01 13:32:49 -07:00
Gabriela Cervantes
c270df7a9c docs: Remove jenkins reference from unit testing presentation
This PR removes the Jenkins reference from the unit testing presentation,
as it is no longer supported in the Kata Containers project.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-01 20:26:35 +00:00
GabyCT
e94490232e Merge pull request #9949 from cmaf/tests-fix-openvino-help
tests: Update help section in openvino test
2024-07-01 13:31:51 -06:00
Gabriela Cervantes
e3318a04f7 metrics: Update container name in blogbench test
This PR updates the container name to use a random name instead
of a hard-coded one. This is a general improvement
to avoid random failures, especially when running on
baremetal environments.
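
One way to generate such a name is sketched below. This is only an illustration, not the code from the blogbench test: the helper name `random_container_name` and the `blogbench` prefix are assumptions.

```shell
# Build a container name with a random 8-character hex suffix so that a
# leftover container from an earlier run cannot collide with it.
# (Hypothetical helper; not taken from the blogbench test.)
random_container_name() {
    local prefix="$1"
    local suffix
    # Read 4 random bytes and print them as hex without separators.
    suffix=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
    echo "${prefix}-${suffix}"
}

container_name=$(random_container_name "blogbench")
echo "$container_name"
```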

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-07-01 19:28:16 +00:00
Fabiano Fidêncio
05848d0c34 Merge pull request #9930 from likebreath/0627/clh_v40.0
Upgrade to Cloud Hypervisor v40.0
2024-07-01 20:04:47 +02:00
Steve Horsman
4fd820abd2 Merge pull request #9947 from stevenhorsman/fix-cleanups-workflow-secret
gha: ci: Remove incorrect secrets line
2024-07-01 16:30:37 +01:00
Chelsea Mafrica
0b83c8549a tests: Update help section in openvino test
Test reports that it is a onednn test when it is openvino; update
description.

Fixes: #9948

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-07-01 14:24:50 +00:00
Hyounggyu Choi
795c5dc0ff tests: Extend vfio-ap hotplug test to use zcrypttest
This commit extends the vfio-ap hotplug test to include the use of `zcrypttest`.
A newly introduced test by the tool consists of several test rounds as follows:

- ioctl_test
- simple_test
- simple_one_thread_test
- simple_multi_threads_test
- multi_thread_stress_test
- hang_after_offline_online_test

A writable root filesystem is required for testing because the reference count
needs to be reset after each test round. The current containerd kata containers
support does not include `--privileged_without_host_devices`, which is necessary
to configure a writable filesystem along with `--privileged`. (Please check out
https://github.com/kata-containers/kata-containers/issues/9791 for details)

So `crictl` is chosen to extend the test.

The commit also includes the removal of old commands previously used for the
tests repository but no longer in use.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:41:59 +02:00
Hyounggyu Choi
5bda197e9d tests: Add zcrypttest tool to test image Dockerfile
This commit copies an internal testing tool `zcrypttest` to the
test image. A base image is changed to `ubuntu:22.04` due to a
library dependency issue.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:40:49 +02:00
Hyounggyu Choi
99690ab202 runtime: Instantiate/pass vfio-ap device to ociSpec
This commit adds the missing step of passing an attached vfio-ap device
to a container via ociSpec. It instantiates and passes a vfio-ap device
(e.g. a Z crypto device).
A device at `/dev/z90crypt` covers all use cases at the time of writing.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:40:49 +02:00
Amulyam24
259ec408b5 kata-deploy: fix qemu static build for v8.2.1 on ppc64le
Do not install the packages librados-dev and librbd-dev as they are not needed for building static qemu.

Add machine option cap-ail-mode-3=off while creating the VM to qemu cmdline.
Fixes: #9893

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-07-01 14:56:43 +05:30
stevenhorsman
16130e473c gha: ci: Remove incorrect secrets line
The CI is failing with:
```
Invalid workflow file: .github/workflows/cleanup-resources.yaml#L10
The workflow is not valid. .github/workflows/cleanup-resources.yaml (Line: 10, Col: 5): Unexpected value 'secrets'
```
I think this is because `secrets: inherit` is only applicable
when re-using a workflow, not for a standalone job like
we have here.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-01 09:32:58 +01:00
Hyounggyu Choi
f0187ff969 Merge pull request #9932 from BbolroC/drop-ci-install-go
CI: Eliminate dependency on tests repo
2024-07-01 08:24:28 +02:00
Hyounggyu Choi
f2bfc306a2 Merge pull request #9936 from BbolroC/use-quay-lpine-bash-curl
CI: Use multi-arch image for alpine-bash-curl
2024-07-01 08:02:01 +02:00
Manuel Huber
4b2e725d03 rootfs: Install Rust only when necessary
For docker-based builds only install Rust when necessary.
Further, execute the detect Rust version check only when
intending to install Rust.
As of today, this is the case when we intend to build the
agent during rootfs build.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-06-28 22:19:46 +00:00
Aurélien Bombo
c605fff4c1 ci: Temporarily disable kata-deploy and GARM tests
Per the decision taken in the 6/27 AC meeting, this PR temporarily
disables kata-deploy and GARM tests until we secure further Azure CI
funding.

In the meantime, I'll transition the GARM tests to free runners and
reenable them to regain that coverage without affecting spending (see
#9940). If it turns out the free runners are too slow, we'll switch back
to GARM.

After funding is secured, we'll reenable the kata-deploy tests (see
#9939).

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-28 20:23:07 +00:00
Hyounggyu Choi
dd23beeb05 CI: Eliminating dependency on clone_tests_repo()
As part of archiving the tests repo, we are eliminating the dependency on
`clone_tests_repo()`. The scripts using the function are as follows:

- `ci/install_rust.sh`
- `ci/setup.sh`
- `ci/lib.sh`

This commit removes or replaces the files, and makes an adjustment accordingly.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-28 14:52:02 +02:00
Hyounggyu Choi
f2c5f18952 CI: Use multi-arch image for alpine-bash-curl
A multi-arch image for `alpine-bash-curl` has been pushed to and is available
at `quay.io/kata-containers`.

This commit switches the test image to `quay.io/kata-containers/alpine-bash-curl`.

Fixes: #9935

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-28 12:01:53 +02:00
Hyounggyu Choi
0e20f60534 CI: Drop unused scripts
The following scripts are not used by the repository any more:

- ci/install_go.sh
- ci/run.sh
- ci/install_vc.sh

Additionally, they rely on the tests repo, which is soon to be archived.

This commit drops the unused scripts.

Fixes: #8507

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-28 07:55:21 +02:00
Archana Shinde
82a1892d34 agent: Add additional info while returning errors for update_interface
This should provide additional context for errors while updating network
interface.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-06-27 12:56:53 -07:00
Archana Shinde
2127288437 agent: Bring interface down before renaming it.
In case we are dealing with multiple interfaces and there exists a
network interface with a conflicting name, we temporarily rename it to
avoid name conflicts.
Before doing this, we need to bring the interface down.
Failure to do so results in netlink returning Resource busy errors.

The resource needs to be down for subsequent operation when the name is
swapped back as well.

This solves the issue of passing multiple networks in case of nerdctl
as:
nerdctl run --rm  --net foo --net bar docker.io/library/busybox:latest ip a
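
The ordering described above (bring the link down, then rename it) can be sketched with iproute2 commands. This is only a rough command-line equivalent: the agent performs these steps over netlink, and the helper name `rename_link` is an assumption. Running it for real requires root.

```shell
# rename_link OLD NEW: take the link down first, then rename it.
# Renaming a link that is still up is what triggers the
# "Resource busy" errors mentioned in the commit message.
# (Hypothetical helper; the agent does this over netlink, not via `ip`.)
rename_link() {
    local old="$1"
    local new="$2"
    ip link set dev "$old" down
    ip link set dev "$old" name "$new"
}

# Example (requires root and an existing interface):
#   rename_link eth1 tmp-eth1
```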

Fixes: #9900

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-06-27 12:56:53 -07:00
Zvonko Kaiser
a32b21bd32 Merge pull request #9918 from zvonkok/build-error
rootfs: Fix spurious error
2024-06-27 19:46:51 +02:00
Bo Chen
25e3cab028 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v40.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #9929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-27 09:59:00 -07:00
Bo Chen
ad92d73e43 versions: Upgrade to Cloud Hypervisor v40.0
Details of this release can be found in our roadmap project as iteration
v40.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #9929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-27 09:40:13 -07:00
Alex Lyn
d66c214ae7 Merge pull request #9849 from markyangcc/main
runtime: fix missing of VhostUserDeviceReconnect parameter assignment
2024-06-27 21:48:37 +08:00
Wainer Moschetta
afc1c1a782 Merge pull request #9896 from fitzthum/bump-gc-090
versions: bump coco guest components and trustee
2024-06-27 09:46:06 -03:00
Zvonko Kaiser
29bb9de864 Merge pull request #9923 from BbolroC/increase-interval-max-tries-kubectl
tests: Increase interval and max_tries for kubectl_retry
2024-06-27 09:49:24 +02:00
Hyounggyu Choi
4ec355fb78 tests: Increase interval and max_tries for kubectl_retry
Observed instability in the API server after deploying kata-deploy caused test failures.
(see: https://github.com/kata-containers/kata-containers/actions/runs/9681494440/job/26743286861)
Specifically, `kubectl_retry logs` failed before the API server could respond properly.

This commit increases the interval and max_tries for kubectl_retry(), allowing sufficient
time to handle this situation.
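
The role of the two parameters can be sketched with a generic retry helper. This is a minimal illustration, not the actual `kubectl_retry()` implementation: `retry_cmd`, `flaky` and the values used are assumptions.

```shell
# retry_cmd MAX_TRIES INTERVAL CMD [ARGS...]
# Runs CMD until it succeeds or MAX_TRIES attempts have been made,
# sleeping INTERVAL seconds between attempts.
# (Hypothetical helper; not the project's kubectl_retry().)
retry_cmd() {
    local max_tries="$1"
    local interval="$2"
    shift 2
    local i=1
    while true; do
        "$@" && return 0
        if [ "$i" -ge "$max_tries" ]; then
            echo "'$*' failed after ${max_tries} tries" >&2
            return 1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Hypothetical flaky command: fails twice, then succeeds.
attempts=0
flaky() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]
}

retry_status=0
retry_cmd 5 0 flaky || retry_status=$?
echo "status=${retry_status} attempts=${attempts}"
```

A larger `max_tries` and a longer interval give a slow API server more time to recover before the helper gives up.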

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-27 08:39:22 +02:00
Aurélien Bombo
2c89828749 ci: Add scheduled job to cleanup resources, pt. II
Follow-up to #9898 and final PR of this set. This implements the actual
deletion logic.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-26 17:36:47 +00:00
Zvonko Kaiser
893fd2b59c Merge pull request #9916 from zvonkok/config-fix
gpu: Missing separator
2024-06-26 14:46:47 +02:00
Greg Kurz
fe7ef878d2 Merge pull request #9913 from gkurz/update-kata-ctl-deps
kata-ctl: Update Cargo.lock
2024-06-26 14:31:03 +02:00
Zvonko Kaiser
30ec78b19a rootfs: Fix spurious error
In some DMZ'ed or CI systems the repos are not up to date
and multistrap fails to find the ubuntu-keyring package.
Update the repos to fix this.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-26 11:10:58 +00:00
Zvonko Kaiser
e0aa54301f gpu: Missing separator
Add the correct separator for replacement

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-26 10:40:35 +00:00
Greg Kurz
ac33a389c0 Merge pull request #9879 from pmores/remove-dependency-on-containerd-bundle-dir-tree
runtime-rs: remove attempt to access sandbox bundle from container bu…
2024-06-26 10:57:50 +02:00
Greg Kurz
db7b2f7aaa kata-ctl: Update Cargo.lock
A previous change failed to refresh Cargo.lock.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-26 08:27:52 +02:00
Tobin Feldman-Fitzthum
dd8605917b versions: bump coco guest components and trustee
Pick up the changes from the newest version of guest-components
and trustee.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-06-25 23:56:18 +00:00
GabyCT
81d23a1865 Merge pull request #9897 from GabyCT/topic/montime
tests: Increase timeout to crictl calls on kata monitor tests
2024-06-25 17:27:15 -06:00
Gabriela Cervantes
a8432880f8 tests: Increase timeout to crictl calls on kata monitor tests
This PR increases the timeout for crictl calls in the kata monitor
tests to avoid occasional random failures.
This PR is very similar to PR #7640.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-25 22:32:47 +00:00
Wainer Moschetta
c4fb6fbda2 Merge pull request #9887 from ldoktor/ci-kata-runtime
ci.ocp: Ensure we smoke-test with the right runtime class
2024-06-25 15:27:27 -03:00
Fabiano Fidêncio
fb44edc22f Merge pull request #9906 from stevenhorsman/TEE-sample-kbs-policy-guards
tests: attestation: Restrict sample policy use
2024-06-25 20:27:13 +02:00
Steve Horsman
c9df743dab Merge pull request #9898 from sprt/gha-cleanup-job
ci: Add scheduled job to cleanup resources, pt. I
2024-06-25 19:11:30 +01:00
Saul Paredes
ce19419d72 genpolicy: allow some empty env vars
Updated genpolicy settings to allow 2 empty environment variables that
users may forget to specify (AZURE_CLIENT_ID and AZURE_TENANT_ID)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-25 10:53:05 -07:00
Aurélien Bombo
0582a9c75b Merge pull request #9864 from 3u13r/feat/genpolicy/layers-cache-file-path
genpolicy: allow specifying layer cache file
2024-06-25 10:42:22 -07:00
Aurélien Bombo
d60b548d61 ci: Add scheduled job to cleanup resources
This is the first part of adding a job to clean up potentially dangling
Azure resources. This will be based on Jeremi's tool from
https://github.com/jepio/kata-azure-automation.

At first, we'll only clean up AKS clusters, as this is what has been
causing us problems lately, but this could very well be extended to
cleaning up entire resource groups, which is why I left the different
names pretty generic (i.e. "resources" instead of "clusters").

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-25 16:33:03 +00:00
stevenhorsman
7610b34426 tests: attestation: Restrict sample policy use
- We only want to enable the sample verifier in the KBS for non-TEE
tests, so prevent an edge case where the TEE platform isn't set up
correctly and we fall back to the sample verifier and get false positives.
To prevent this, we add guards around the sample policy enablement and
only run it for non-confidential hardware

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-25 16:59:40 +01:00
Steve Horsman
d574d37c4b Merge pull request #9903 from stevenhorsman/authenticated-regsitry-workflow-secrets
workflow: coco: Add auth registry secret
2024-06-25 16:40:46 +01:00
stevenhorsman
d8961cbd4a workflow: coco: Add auth registry secret
- Add the `AUTHENTICATED_IMAGE_USER` and
`AUTHENTICATED_IMAGE_PASSWORD` repository secrets as env vars
to the coco tests, so we can use them to pull images from
an authenticated registry for testing

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-25 11:11:02 +01:00
Alex Lyn
2c5b3a5c20 Merge pull request #9830 from gaohuatao-1/ght/count-rs
runtime-rs: fix the bug of func count_files
2024-06-25 15:00:46 +08:00
GabyCT
27d75f93e2 Merge pull request #9872 from GabyCT/topic/varmemin
metrics: Improve variable definition in memory inside containers script
2024-06-24 15:30:05 -06:00
Aurélien Bombo
b0cdf4eb0d Merge pull request #9579 from microsoft/saulparedes/add_seccomp_support
genpolicy: ignore SeccompProfile in PodSpec
2024-06-24 08:58:01 -07:00
Wainer Moschetta
bcdc4fde10 Merge pull request #9857 from wainersm/disable_failing_jobs-part2
CI: disable jobs that failed >= 50% on nightly CI recently - part 2
2024-06-24 10:11:05 -03:00
Leonard Cohnen
6a3ed38140 genpolicy: allow specifying layer cache file
Add --layers-cache-file-path flag to allow the user to
specify where the cache file for the container layers
is saved. This allows e.g. to have one cache file
independent of the user's working directory.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-06-24 14:53:27 +02:00
Fabiano Fidêncio
3adf9e250f Merge pull request #9875 from zvonkok/gha-no-sudo-arm64
ci: gha no sudo arm64
2024-06-21 15:28:54 +02:00
Wainer Moschetta
f7e0d6313b Merge pull request #9865 from wainersm/qemu-coco-dev_updates
runtime: updates to qemu-coco-dev configuration
2024-06-21 10:14:30 -03:00
Fabiano Fidêncio
2d552800f2 Merge pull request #9876 from zvonkok/gha-no-sudo-s390x
ci: remove sudo from s390x build
2024-06-21 15:00:31 +02:00
Saul Paredes
44afb4aa5f genpolicy: ignore SeccompProfile in PodSpec
Ignore SeccompProfile in PodSpec

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-20 09:42:17 -07:00
Dan Mihai
7aeaf2502a Merge pull request #9856 from microsoft/danmihai1/new-policy-rules
genpolicy: reject untested CreateContainer field values
2024-06-20 09:34:53 -07:00
GabyCT
9320c2e484 Merge pull request #9845 from GabyCT/topic/fixartifacts
gha: Do not fail when collecting artifacts
2024-06-20 10:15:53 -06:00
Hyounggyu Choi
959a277dc5 Merge pull request #9886 from BbolroC/kernel-config-uv-uapi-s390x
kernel: Add CONFIG_S390_UV_UAPI for s390x
2024-06-20 16:05:15 +02:00
Steve Horsman
d5b4da7331 Merge pull request #9881 from stevenhorsman/remote-hypervisor-policy
runtime: Support policy in remote hypervisor
2024-06-20 14:01:29 +01:00
Hyounggyu Choi
9cb12dfa88 kernel: Add CONFIG_S390_UV_UAPI for s390x
While enabling the attestation for IBM SE, it was observed that
a kernel config `CONFIG_S390_UV_UAPI` is missing.
This config is required to present an ultravisor in the guest VM.
This commit adds the missing config.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-20 13:15:33 +02:00
Lukáš Doktor
b08c019003 ci.ocp: Ensure we smoke-test with the right runtime class
We do encourage people to set KATA_RUNTIME, but it is only used by
the webhook. Let's define it in the main `test.sh` and use it in the
smoke test to ensure the user-defined runtime is smoke-tested rather
than the hard-coded kata-qemu one.
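
The idea can be sketched as follows. This is not the content of `test.sh`: the `kata-qemu` fallback, the `render_smoke_pod` helper and the pod spec are all assumptions made for illustration.

```shell
# Fall back to an assumed default when the user did not set KATA_RUNTIME.
KATA_RUNTIME="${KATA_RUNTIME:-kata-qemu}"

# Hypothetical helper that renders a smoke-test pod spec using the
# user-defined runtime class instead of a hard-coded one.
render_smoke_pod() {
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: smoke-test
spec:
  runtimeClassName: ${KATA_RUNTIME}
  containers:
    - name: smoke
      image: docker.io/library/busybox:latest
      command: ["sleep", "infinity"]
EOF
}

render_smoke_pod
```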

Related to: #9804

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-06-20 11:15:02 +02:00
Fabiano Fidêncio
0f2a4d202e Merge pull request #9884 from fidencio/topic/re-enable-tdx-ci
ci: tdx: Re-enable TDX CI
2024-06-20 06:39:06 +02:00
GabyCT
02075f73e9 Merge pull request #9874 from GabyCT/topic/fixvarnerdctl
tests: nerdctl: Fix variables names and remove network
2024-06-19 13:43:25 -06:00
Fabiano Fidêncio
2bab0f31d7 ci: tdx: Re-enable TDX CI
Now, using vanilla kubernetes, let's re-enable the TDX CI and hope it
becomes more stable than it used to be.

The cleanup-snapshotter is now taking ~4 minutes, and that matches with
the other platforms, mainly considering there's a sum of 210 seconds
sleep in the process.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-19 20:08:28 +02:00
Greg Kurz
81972f6ffc Merge pull request #9149 from ryansavino/upgrade-to-qemu-8.2.1
qemu: upgrade to 8.2.4
2024-06-19 19:10:02 +02:00
stevenhorsman
779754dcf6 runtime: Support policy in remote hypervisor
Move the `sandbox.agent.setPolicy` call out of the remoteHypervisor
if block, so we can use the policy implementation on peer pods

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-19 16:43:53 +01:00
Fabiano Fidêncio
f9862e054c Merge pull request #9882 from fidencio/topic/ci-tdx-use-vanilla-k8s
ci: tdx: Use vanilla k8s instead of k3s
2024-06-19 17:33:00 +02:00
Pavel Mores
6a4919eeb9 runtime-rs: fix misleading log message
get_vmm_master_tid() currently returns an error with the message "cannot
get qemu pid (though it seems running)" when it finds a valid
QemuInner::qemu_process instance but fails to extract the PID out of it.

This condition however in fact means that a qemu child process was running
(otherwise QemuInner::qemu_process would be None) but isn't anymore (id()
returns None).

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:15:24 +02:00
Pavel Mores
af5492e773 runtime-rs: made Qemu::stop_vm() idempotent
Since Hypervisor::stop_vm() is called from the WaitProcess request handling
which appears to be per-container, it can be called multiple times during
kata pod shutdown.  Currently the function errors out on any subsequent
call after the initial one since there's no VM to stop anymore.  This
commit makes the function tolerate that condition.

While it seems conceivable that sandbox shouldn't be stopped by WaitProcess
handling, and the right fix would then have to happen elsewhere, this
commit at least makes qemu driver's behaviour consistent with other
hypervisor drivers in runtime-rs.

We also slightly improve the error message in case there's no
QemuInner::qemu_process instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:15:24 +02:00
Pavel Mores
5fbbff9e5e runtime-rs: remove attempt to access sandbox bundle from container bundle
Since no objections were raised in the linked issue (#9847) this commit
removes the attempt to derive sandbox bundle path from container bundle
path.  As described in more detail in the linked issue, this is container
runtime specific and doesn't seem to serve any purpose.

As for implementation, we hoist the only part of
get_shim_info_from_sandbox() that's still useful (getting the socket
address) directly into the caller and remove the function altogether.

Fixes #9847

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:09:15 +02:00
Fabiano Fidêncio
7127178acc ci: tdx: Use vanilla k8s instead of k3s
We've noticed a bunch of issues related to deploying and deleting the
nydus-snapshotter.  As we don't see the same issues on other machines
using vanilla kubernetes, let's avoid using k3s for now and follow the flow.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-19 16:56:15 +02:00
Zvonko Kaiser
beab17f765 Merge pull request #9877 from zvonkok/gha-no-sudo-ppc64
ci: gha no sudo ppc64
2024-06-19 14:02:05 +02:00
Zvonko Kaiser
d783ddaf03 ci: Remove not needed chown for ppc64
Now that all artifacts are owned by $USER, no extra step is needed
to adjust ownership

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:56:45 +00:00
Zvonko Kaiser
5bc37e39d5 ci: remove sudo from ppc64 build
We can now do the same for ppc64 that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:55:45 +00:00
Zvonko Kaiser
c341234c0b ci: remove sudo from s390x build
We can now do the same for s390x that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:53:33 +00:00
Zvonko Kaiser
3beb460a97 ci: Remove not needed chown for arm64
Now that all artifacts are owned by $USER, no extra step is needed
to adjust ownership

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:48:00 +00:00
Zvonko Kaiser
445b389b16 ci: remove sudo from arm64 build
We can now do the same for arm64 that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:46:51 +00:00
Gabriela Cervantes
6ec7971f7a tests: nerdctl: Fix variables names and remove network
This PR fixes the variable names for the network that was created. It also
removes the networks that were created for the tests, to ensure a clean environment
when running all the tests and to avoid failures, especially on baremetal environments
where the network already exists.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 23:00:49 +00:00
Dan Mihai
4df66568cf genpolicy: reject untested CreateContainer field values
Reject CreateContainerRequest field values that are not tested by
Kata CI and that might impact the confidentiality of CoCo Guests.

This change uses a "better safe than sorry" approach to untested
fields. It is very possible that in the future we'll encounter
reasonable use cases that will either:

- Show that some of these fields are benign and don't have to be
  verified by Policy, or
- Show that Policy should verify legitimate values of these fields

These are the new CreateContainerRequest Policy rules:

    count(input.shared_mounts) == 0
    is_null(input.string_user)

    i_oci := input.OCI
    is_null(i_oci.Hooks)
    is_null(i_oci.Linux.Seccomp)
    is_null(i_oci.Solaris)
    is_null(i_oci.Windows)

    i_linux := i_oci.Linux
    count(i_linux.GIDMappings) == 0
    count(i_linux.MountLabel) == 0
    count(i_linux.Resources.Devices) == 0
    count(i_linux.RootfsPropagation) == 0
    count(i_linux.UIDMappings) == 0
    is_null(i_linux.IntelRdt)
    is_null(i_linux.Resources.BlockIO)
    is_null(i_linux.Resources.Network)
    is_null(i_linux.Resources.Pids)
    is_null(i_linux.Seccomp)
    i_linux.Sysctl == {}

    i_process := i_oci.Process
    count(i_process.SelinuxLabel) == 0
    count(i_process.User.Username) == 0

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-18 18:09:31 +00:00
Wainer Moschetta
cf372f41bf Merge pull request #9869 from fidencio/topic/disable-tdx-ci
ci: tdx: Disable TDX CI
2024-06-18 14:47:38 -03:00
Gabriela Cervantes
671d9af456 metrics: Improve variable definition in memory inside containers script
This PR improves the variable definition in memory inside
the container script for metrics. This change declares and assigns
the variables separately to avoid masking return values.
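
The masking effect can be reproduced with a small sketch. This is not the code from the metrics script; `failing_cmd`, `combined` and `separate` are hypothetical names.

```shell
# Hypothetical stand-in for a command whose exit status matters.
failing_cmd() {
    echo "value"
    return 3
}

# Combined form: `local` itself succeeds, so the failure of
# failing_cmd is masked and status stays 0.
combined() {
    local status=0
    local out=$(failing_cmd) || status=$?
    echo "$status"
}

# Separate form: the plain assignment carries the exit status of
# failing_cmd, so status becomes 3.
separate() {
    local status=0
    local out
    out=$(failing_cmd) || status=$?
    echo "$status"
}

combined_status=$(combined)
separate_status=$(separate)
echo "combined=${combined_status} separate=${separate_status}"
```

This prints `combined=0 separate=3`: only the separate form lets the caller see that the command failed.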

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 16:56:12 +00:00
Gabriela Cervantes
eeb467bdc2 gha: Do not fail when collecting artifacts
This PR will avoid the failures when collecting artifacts for the gha.
This will ensure that we collect and archive system's data for the
purpose of debugging.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 16:05:23 +00:00
Zvonko Kaiser
b1909e940e deploy: Add busybox target
For a minimal initrd/image build we may want to leverage busybox.
This is part number two of the NVIDIA initrd/image build

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-18 15:31:00 +00:00
Wainer Moschetta
36093e86e0 Merge pull request #9863 from wainersm/kata-deploy_yq
kata-deploy: always copy ci/install_yq.sh
2024-06-18 10:05:41 -03:00
Fabiano Fidêncio
587f4d45de ci: tdx: Disable TDX CI
TDX CI has been having some issues with the Nydus snapshotter cleanup,
which has been getting stuck for hours every now and then.

With this in mind, let's disable the TDX CI, so we avoid it blocking the
progress of Kata Containers project, and we re-enable it as soon as we
have it solved on Intel's side.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-18 10:30:40 +02:00
markyangcc
a28bf266f9 runtime: fix missing of VhostUserDeviceReconnect parameter assignment
Commit 'ca02c9f5124e' implements the vhost-user-blk reconnection functionality.
However, it missed assigning VhostUserDeviceReconnect when creating the QEMU
HypervisorConfig, resulting in VhostUserDeviceReconnect always being set to the default value 0.

The real change is the line below; most of the other changes are caused by go format:

return vc.HypervisorConfig{
	// ...
	VhostUserDeviceReconnect: h.VhostUserDeviceReconnect,
}, nil

Fixes: #9848
Signed-off-by: markyangcc <mmdou3@163.com>
2024-06-18 12:15:10 +08:00
Alex Lyn
388cd7dde4 Merge pull request #9772 from pmores/add-base-qmp-framework
runtime-rs: add base qmp framework
2024-06-18 09:53:28 +08:00
Alex Lyn
275c498dc9 Merge pull request #9834 from lifupan/main
sandbox: fix the issue of failed to get the vmm master tid
2024-06-18 08:57:21 +08:00
Alex Lyn
d3fb6bfd35 Merge pull request #9860 from stevenhorsman/tokio-vulnerability-bump
Tokio vulnerability bump
2024-06-18 08:35:34 +08:00
Wainer dos Santos Moschetta
bdbee78517 runtime: allow default_{vcpus,memory} annotations to qemu-coco-dev
This is a counterpart of commit abf52420a4 for the qemu-coco-dev
configuration. By allowing default_vcpu and default_memory annotations
users can fine-tune the VM based on the size of the container
image to avoid issues related with pulling large images in the guest.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 18:59:52 -03:00
Wainer dos Santos Moschetta
baa8d9d99c runtime: set shared_fs=none to qemu-coco-dev configuration
Just like the TEE configurations (sev, snp, tdx) we want to have the
qemu-coco-dev using shared_fs=none.

Fixes: #9676
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 18:42:46 -03:00
Wainer Moschetta
b8d7a8c546 Merge pull request #9862 from BbolroC/improve-kubectl-retry
tests: Use selector rather than pod name for kubectl logs/describe
2024-06-17 18:33:24 -03:00
Hyounggyu Choi
6b065f5609 tests: Use selector rather than pod name for kubectl logs/describe
The following error was observed during the deployment of nydus snapshotter:

```
Error from server (NotFound):
the server could not find the requested resource ( pods/log nydus-snapshotter-5v82v)
  'kubectl logs nydus-snapshotter-5v82v -n nydus-system' failed after 3 tries
    Error: Process completed with exit code 1.
```

This error can occur when a pod is re-created by a daemonset during the retry interval.
This commit addresses the issue by using `--selector` rather than the pod name
for `kubectl logs/describe`.
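
A minimal sketch of the selector-based approach follows. The helper names and the `app=nydus-snapshotter` label in the usage comment are assumptions, not the code or labels used by the tests.

```shell
# Fetch logs for whichever pods currently match the label selector, so a
# pod re-created by the daemonset under a new name is still found.
# (Hypothetical helpers; not the project's test code.)
logs_by_selector() {
    local namespace="$1"
    local selector="$2"
    kubectl logs --selector "$selector" -n "$namespace"
}

describe_by_selector() {
    local namespace="$1"
    local selector="$2"
    kubectl describe pods --selector "$selector" -n "$namespace"
}

# Example (assumed label):
#   logs_by_selector nydus-system app=nydus-snapshotter
```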

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-17 22:27:50 +02:00
Wainer Moschetta
7df221a8f9 Merge pull request #9833 from wainersm/qemu-rs_tests
tests/k8s: run for qemu-runtime-rs on AKS
2024-06-17 16:59:46 -03:00
Zvonko Kaiser
5f11c0f144 Merge pull request #9861 from zvonkok/release-3.6.0
release: Bump VERSIONS file to 3.6.0
2024-06-17 20:35:29 +02:00
Wainer Moschetta
b6a28bd932 Merge pull request #9786 from microsoft/saulparedes/add_back_insecure_registry_pull
genpolicy: add back support for insecure
2024-06-17 15:21:25 -03:00
Wainer Moschetta
68415dabcd Merge pull request #9815 from msanft/fix/genpolicy/flag-name
genpolicy: fix settings path flag name
2024-06-17 15:13:25 -03:00
Wainer dos Santos Moschetta
08eaa60b59 CI: disable all run-kata-deploy-tests-on-garm jobs
The following jobs have failed more than 50% on nightly CI.

run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, k0s)
run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, rke2)
run-kata-deploy-tests-on-garm / run-kata-deploy-tests (qemu, k0s)

Instead of removing only those jobs, let's skip the kata-deploy-tests
on GARM completely so we can try to fix all the issues (or maybe
drop the jobs altogether).

Issue: #9854
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 14:39:38 -03:00
Steve Horsman
4a41cee534 Merge pull request #9838 from zvonkok/gha-no-sudo
CI: remove sudo from GHA
2024-06-17 16:23:39 +01:00
Wainer dos Santos Moschetta
e517167825 kata-deploy: always copy ci/install_yq.sh
To build the build-kata-deploy image, ci/install_yq.sh must be copied to
tools/packaging/kata-deploy/local-build/dockerbuild as this script will install
yq within the image. Currently, if
tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh exists then
make won't copy it again. This can raise problems as, for example, the current
update of yq version (commit c99ba42d) in ci/install_yq.sh won't force the
rebuild of the build-kata-deploy image.

Note: this isn't a problem on a fresh dev or CI environment.
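
The stale-copy problem can be illustrated with a small sketch. It models the behaviour with two hypothetical helpers in a temporary directory; it is not the Makefile rule changed by this commit.

```shell
workdir=$(mktemp -d)
echo "yq v1" > "${workdir}/src.sh"

# Copy only when the destination is missing (the problematic behaviour).
copy_if_missing() {
    [ -e "$2" ] || cp "$1" "$2"
}

# Copy unconditionally so the destination always tracks the source.
copy_always() {
    cp -f "$1" "$2"
}

copy_if_missing "${workdir}/src.sh" "${workdir}/stale.sh"
copy_always "${workdir}/src.sh" "${workdir}/fresh.sh"

# Simulate a version bump in the source script.
echo "yq v2" > "${workdir}/src.sh"

copy_if_missing "${workdir}/src.sh" "${workdir}/stale.sh"
copy_always "${workdir}/src.sh" "${workdir}/fresh.sh"

stale=$(cat "${workdir}/stale.sh")
fresh=$(cat "${workdir}/fresh.sh")
rm -rf "$workdir"
echo "stale='${stale}' fresh='${fresh}'"
```

After the simulated bump, the conditionally copied file still holds the old content, while the unconditionally copied one follows the source.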

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 12:18:22 -03:00
Zvonko Kaiser
618121a654 release: Bump VERSIONS file to 3.6.0
Let's bump the VERSIONS file and start preparing for a new release of
the project.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-17 12:06:46 +00:00
stevenhorsman
53659f1ede libs: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
35f6be97df runtime-rs: Update tokio dependency
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

If possible, it would be good to add the many runtime-rs crates into the
runtime-rs workspace and provide a centralised version to avoid the updates
in many places.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
3bb1a67d80 agent-ctl: Update rustjail dependencies
- Run `cargo update -p rustjail` to pick up rustjail's bump of
tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
d2d35d2dcc runk: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
adda401a8c genpolicy: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
b7928f465e agent: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:02:47 +01:00
Zvonko Kaiser
5c2f3f34a8 CI: remove sudo from GHA
Now that all artifacts are owned by $USER we can start
to remove sudo from our GHA

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-17 11:06:56 +00:00
Steve Horsman
cce735a09e Merge pull request #9840 from stevenhorsman/bump-agent-rust-1.75.0
versions: Bump rust toolchain
2024-06-17 11:28:07 +01:00
Fupan Li
b218c4bc10 Merge pull request #9836 from lifupan/main_fix
sandbox: fix the issue of double initial_size_manager config
2024-06-17 09:15:51 +08:00
Fabiano Fidêncio
9b5dd854db Merge pull request #9726 from GabyCT/topic/unodeport
tests: kbs: Use nodeport deployment from upstream trustee
2024-06-16 22:31:27 +02:00
Wainer dos Santos Moschetta
d4f664b73b CI: disable run-kata-monitor-tests / run-monitor (containerd, lts) job
The job has failed in more than 50% of nightly CI runs. Remove it from the
execution list until we have a fix.

Issue: #9853
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:27:04 -03:00
Wainer dos Santos Moschetta
cbf0b7ca7b CI: disable run-basic-amd64-tests / run-nerdctl-tests (clh) job
The job has failed in more than 50% of nightly CI runs. Remove it from the
execution list until we have a fix.

Issue: #9852
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:17:26 -03:00
Wainer dos Santos Moschetta
562820449e CI: disable run-basic-amd64-tests / run-vfio (qemu) job
The job has failed in more than 50% of nightly CI runs. Remove it from the
execution list until we have a fix.

The clh variation was disabled in commit 5f5274e699, so this change will
actually result in all the VFIO jobs being disabled. Instead of deleting the
entire entry from this workflow yaml (or commenting it out), I preferred to use
`if: false`, which makes the jobs appear in the UI as skipped.

Issue: #9851
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:09:59 -03:00
GabyCT
4800e242a4 Merge pull request #9832 from GabyCT/topic/fixsets
tests: setup: Improve setup script for kubernetes tests
2024-06-14 11:14:05 -06:00
Bo Chen
a68aeca356 Merge pull request #9575 from likebreath/0430/clh_v39.0
versions: Upgrade to Cloud Hypervisor v39.0
2024-06-14 09:10:19 -07:00
stevenhorsman
e23b929ba0 versions: Bump rust toolchain
- Bump the rust version used to build the agent to 1.75.0 as
agreed on in the AC meeting

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
stevenhorsman
3fb176970f dragonball: Fix device manager warning
- Fix the lint error:
```
error: you seem to use `.enumerate()` and immediately discard the index
   --> src/device_manager/mod.rs:427:33
    |
427 |         for (_index, device) in self.virtio_devices.iter().enumerate() {
    |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
 by removing the unnecessary enumerate
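
A minimal sketch of this kind of fix, assuming a hypothetical `Device` type and `device_names` helper (neither is taken from dragonball): when the index is never used, iterating the collection directly is enough.

```rust
// Hypothetical stand-in for an entry in `virtio_devices`; not the dragonball type.
struct Device {
    name: String,
}

// Collects device names. The loop iterates the slice directly; the flagged
// form would have been `for (_index, device) in devices.iter().enumerate()`.
fn device_names(devices: &[Device]) -> Vec<&str> {
    let mut names = Vec::new();
    for device in devices.iter() {
        names.push(device.name.as_str());
    }
    names
}
```

Dropping `.enumerate()` does not change which elements are visited or their order; it only removes the unused index binding.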

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
stevenhorsman
1ea2671f2f dragonball: Fix lint with rust 1.75.0
The ci failed with:
```
error: use of `or_insert_with` to construct default value
   --> src/address_space_manager.rs:650:14
    |
650 |             .or_insert_with(NumaNode::new);
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `or_default()`
    |
```
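
A sketch of the change the lint suggests, assuming a hypothetical `NumaNode` with a `Default` impl and an illustrative `add_region` helper (not the dragonball code): `or_default()` inserts the default value on a missing key, just as `or_insert_with(NumaNode::new)` would when `new()` only builds the default.

```rust
use std::collections::HashMap;

// Hypothetical NUMA node record; the real dragonball type differs.
#[derive(Default, Debug, PartialEq)]
struct NumaNode {
    regions: Vec<u64>,
}

// Records a region under `node_id`, creating the node entry on first use.
fn add_region(nodes: &mut HashMap<u32, NumaNode>, node_id: u32, region: u64) {
    // Before: nodes.entry(node_id).or_insert_with(NumaNode::new)
    nodes.entry(node_id).or_default().regions.push(region);
}
```

`or_default()` requires the value type to implement `Default`, which is why the sketch derives it.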

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
Steve Horsman
ab8a9882c1 Merge pull request #9818 from EmmEff/fix-spelling
runtime: fix minor spelling issues
2024-06-14 13:12:56 +01:00
Steve Horsman
99bf95f773 Merge pull request #9827 from littlejawa/fix_panic_on_metrics_gathering
runtime: avoid panic on metrics gathering
2024-06-14 11:12:43 +01:00
Steve Horsman
3eba4211f3 Merge pull request #9843 from microsoft/danmihai1/install_yq
ci: fix the expected yq version string
2024-06-14 10:26:21 +01:00
Pavel Mores
380f8ad03f runtime-rs: add base vCPU hotplugging support
We take advantage of the Inner pattern to enable QemuInner::resize_vcpu()
to take `&mut self`, which we need in order to call non-const functions on Qmp.

This runs on Intel architecture but will need to be verified and ported
(if necessary) to other architectures in the future.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Pavel Mores
8231c6c4a3 runtime-rs: instantiate Qmp as (optional) member of QemuInner
The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make
it accessible from QemuInner.  This is fine for now; however, if the constant
needs to be accessed from additional places in the future, we could consider
moving it somewhere more visible.

The Debug impl for Qmp is empty since first, we don't actually want it,
it's only forced by Hypervisor trait bounds, and second, it doesn't have
anything to display anyway.  If Qmp gets any members in the future that
can be meaningfully displayed they should be handled by Qmp's Debug::fmt().

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Pavel Mores
6fdb262dca runtime-rs: add Qmp object to encapsulate QMP functionality
The constructor handles QMP connection initialisation, too, so there can
be no non-functional Qmp instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Manuel Huber
62fd84dfd8 build: allow rootfs builds w/o git or VERSION file deps
We set the VERSION variable consistently across Makefiles to
'unknown'  if the file is empty or not present.
We also use git commands consistently for calculating the COMMIT,
COMMIT_NO variables, not erroring out when building outside of
a git repository.
In create_summary_file we also account for a missing/empty VERSION
file.
This makes e.g. the UVM build process in an environment where we
build outside of git with a minimal/reduced set of files smoother.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-06-13 22:46:52 +00:00
Dan Mihai
824287d64a Merge pull request #9844 from microsoft/danmihai1/k8s-policy-pvc
tests: fix yq command line in k8s-policy-pvc
2024-06-13 15:07:15 -07:00
Wainer dos Santos Moschetta
73ab5942fb tests/k8s: run for qemu-runtime-rs on AKS
The following tests are disabled because they fail (as with dragonball):

- k8s-cpu-ns.bats
- k8s-number-cpus.bats
- k8s-sandbox-vcpus-allocation.bats

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-13 16:20:59 -03:00
Mike Frisch
c2f61b0fe3 runtime: spelling fixes
Minor spelling fixes in runtime log messages.

Signed-off-by: Mike Frisch <mikef17@gmail.com>
2024-06-13 12:11:34 -04:00
Dan Mihai
56f9e23710 tests: fix yq command line in k8s-policy-pvc
Fix the collision between:
- https://github.com/kata-containers/kata-containers/pull/9377
- https://github.com/kata-containers/kata-containers/pull/9706

One enabled a newer yq command line format and the other used the
older format. Both passed CI because they were not tested together.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-13 16:06:15 +00:00
Dan Mihai
23e99e264c ci: fix the expected yq version string
I get:

~/gopath/bin/yq --version
yq (https://github.com/mikefarah/yq/) version v4.40.7

Also add support for set -o xtrace to install_yq.sh.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-13 15:52:26 +00:00
Ryan Savino
0430794952 qemu: upgrade to 8.2.4
There is a known issue in qemu 7.2.0 that causes kernel-hashes to fail the verification of the launch binaries for the SEV legacy use case.

Upgraded to qemu 8.2.4.
Newly available features are disabled.

Fixes: #9148

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-13 10:19:42 -05:00
Greg Kurz
b85b1c1058 Merge pull request #9790 from gkurz/kill-some-dead-runtime-code
Kill some dead runtime code
2024-06-13 15:45:51 +02:00
gaohuatao
4cb4e44234 runtime-rs: fix the bug of func count_files
When the total number of files observed is greater than limit, return -1 directly.
The runtime has already fixed this bug; the fix should be ported to runtime-rs.
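
The early-return behaviour described above could look roughly like the following. This is a sketch under assumptions, not the runtime-rs implementation: the function name, signature, and recursive directory walk are illustrative only.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Counts regular entries under `path`, recursing into subdirectories.
// Returns Ok(-1) as soon as the running total exceeds `limit`.
fn count_files(path: &Path, limit: i32) -> io::Result<i32> {
    let mut total = 0;
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            // Pass down only the remaining budget so nested overflow is detected.
            let inner = count_files(&entry.path(), limit - total)?;
            if inner == -1 {
                // Propagate the "over limit" marker instead of adding it to the total.
                return Ok(-1);
            }
            total += inner;
        } else {
            total += 1;
        }
        if total > limit {
            return Ok(-1);
        }
    }
    Ok(total)
}
```

The important detail is that a -1 from a nested call is propagated immediately rather than being summed into `total`, which would otherwise hide the overflow.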

Fixes: #9829

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2024-06-13 16:02:33 +08:00
Fupan Li
cd68ef372f sandbox: fix the issue of double initial_size_manager config
It shouldn't call the initial_size_manager's setup_config
in the load_config since it had been called in the sandbox's
try_init function.

Fixes: #9778

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-06-13 15:44:51 +08:00
Fupan Li
61687992f4 sandbox: fix the issue of failed to get the vmm master tid
For Kata Containers, the container's pid is meaningless to
containerd/crio since the pid belongs to the VM, and
containerd/crio can't use it. Thus we just return a
tid of the kata shim or the hypervisor. But since the hypervisor has
been stopped before the container is deleted, its tid can't be
obtained for some supported hypervisors, so
we had better return the kata shim's pid instead of the hypervisor's
tid.

Fixes: #9777

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-06-13 10:27:04 +08:00
Fabiano Fidêncio
56423cbbfe Merge pull request #9706 from burgerdev/burgerdev/genpolicy-devices
genpolicy: add support for devices
2024-06-12 23:03:41 +02:00
Wainer Moschetta
d971e5ae68 Merge pull request #9537 from wainersm/kata-deploy-crio
kata-deploy: configuring CRI-O for guest-pull image pulling
2024-06-12 17:27:00 -03:00
Gabriela Cervantes
c36c300fd6 tests: kbs: Use nodeport deployment from upstream trustee
This PR uses the nodeport deployment from upstream trustee.
To keep our deployment as close as possible to upstream trustee, replace
the custom nodeport handling with the nodeport
kustomized flavour from the trustee project.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-12 20:01:59 +00:00
Gabriela Cervantes
0066aebd84 tests: setup: Improve setup script for kubernetes tests
This PR makes general improvements like definition of variables and
the use of them to improve the general setup script for kubernetes
tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-12 19:39:54 +00:00
GabyCT
461b6e7c93 Merge pull request #9821 from GabyCT/topic/fixts
metrics: Use function definition to have uniformity
2024-06-12 10:04:28 -06:00
Fabiano Fidêncio
3a0247ed43 Merge pull request #9819 from stevenhorsman/config-envvar-precedence
agent: config: Ensure envs take precedence
2024-06-12 11:26:02 +02:00
Julien Ropé
9c86eb1d35 runtime: avoid panic on metrics gathering
While running with a remote hypervisor, whenever kata-monitor tries to access
metrics from the shim, the shim panics and no metrics can be gathered.

The function GetVirtioFsPid() is called on metrics gathering, and had a call
to "panic()". Since there is no virtiofs process for remote hypervisor, the
right implementation is to return nil. The caller expects that, and will skip
metrics gathering for virtiofs.
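
The actual change is in the Go runtime; as a Rust analogue of the same idea, here is a sketch in which "no virtiofs process" is an ordinary value the caller checks rather than a panic. The trait, the two hypervisor types, and `metric_sources` are all hypothetical names used for illustration.

```rust
// Hypothetical hypervisor interface: None means "no virtiofs daemon exists".
trait Hypervisor {
    fn virtiofs_pid(&self) -> Option<u32>;
}

struct LocalHypervisor {
    virtiofsd_pid: u32,
}

struct RemoteHypervisor;

impl Hypervisor for LocalHypervisor {
    fn virtiofs_pid(&self) -> Option<u32> {
        Some(self.virtiofsd_pid)
    }
}

impl Hypervisor for RemoteHypervisor {
    // A remote hypervisor has no virtiofs process, so report None instead of panicking.
    fn virtiofs_pid(&self) -> Option<u32> {
        None
    }
}

// Lists the processes metrics would be gathered from, skipping virtiofs when absent.
fn metric_sources(hypervisor: &dyn Hypervisor) -> Vec<String> {
    let mut sources = vec![String::from("shim")];
    if let Some(pid) = hypervisor.virtiofs_pid() {
        sources.push(format!("virtiofsd:{}", pid));
    }
    sources
}
```

With this shape, the remote case simply yields the shim-only list and the metrics request still succeeds.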

Fixes: #9826

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-06-12 10:02:44 +02:00
Xuewei Niu
92cc5e0adb Merge pull request #9781 from gaohuatao-1/ght/shm
2024-06-12 12:39:28 +08:00
Moritz Sanft
84903c898c genpolicy: fix settings path flag name
This corrects the warning to point to the `-j` flag,
which is the correct flag for the JSON settings file.
Previously, the warning was confusing, as it pointed to
the `-p` flag, which specifies the path to the Rego ruleset.

Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
2024-06-11 21:17:18 +02:00
Greg Kurz
1acf8d0c35 govmm: Drop QEMU's NoShutdown knob
Code is not used.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Greg Kurz
cb5b548ad7 govmm: Drop QEMU's Daemonize knob
Code isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Greg Kurz
33eaf69d5f virtcontainers: Drop QEMU's Daemonize knob
QEMU isn't started as daemon anymore and this won't change (see #5736
for details). Drop the related code.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Wainer Moschetta
f66a5b6287 Merge pull request #9807 from wainersm/qemu-rs_kata-deploy
kata-deploy: add qemu-runtime-rs runtimeClass
2024-06-11 14:50:01 -03:00
Dan Mihai
d47f40210a Merge pull request #9808 from microsoft/saulparedes/oci_from_settings
genpolicy: load OCI version from settings
2024-06-11 10:42:04 -07:00
Gabriela Cervantes
a96ff49060 metrics: Use function definition to have uniformity
This PR uses the function definition to have uniformity across
all the launch times scripts.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-11 17:36:08 +00:00
Saul Paredes
3e9d6c11a1 genpolicy: add back support for insecure registries
Adding back changes from 77540503f9.

Fixes: #9008

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-11 09:42:23 -07:00
Bo Chen
2398442c58 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v39.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8694, #9574

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-11 09:42:17 -07:00
Bo Chen
7a82894502 versions: Upgrade to Cloud Hypervisor v39.0
This patch upgrades Cloud Hypervisor to v39.0 from v36.0, which contains
fixes for several security advisories from dependencies. Details can be
found in #9574.

Fixes: #8694, #9574

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-11 09:42:16 -07:00
Wainer dos Santos Moschetta
be9990144a workflow: run kata-deploy tests to qemu-runtime-rs on AKS
Start testing the ability of kata-deploy to install and configure
the qemu-runtime-rs runtimeClass.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-11 12:58:47 -03:00
Wainer dos Santos Moschetta
4f398cc969 kata-deploy: add qemu-runtime-rs runtimeClass
Allow kata-deploy to install and configure the qemu-runtime-rs runtimeClass
which ties to qemu hypervisor implementation in rust for the runtime-rs.

Fixes: #9804
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-11 12:58:47 -03:00
stevenhorsman
40e02b34cb agent: config: Ensure envs take precedence
- Update the config parsing logic so that when reading from the
agent-config.toml file any envs are still processed
- Add unit tests to formalise that the envs take precedence over values
from the command line and the config file

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-11 16:31:10 +01:00
Steve Horsman
59ff40f054 Merge pull request #9811 from mkulke/mkulke/use-kebabcase-for-enum-values-in-config-file-parsing
agent: convert enum vals to kebab-case in cfg file
2024-06-11 14:49:30 +01:00
gaohuatao
638e9acf89 runtime: fix the bug of func countFiles
When the total number of files observed is greater than limit, return (-1, err).
When the returned err is not nil, the func countFiles should return -1.

Fixes: #9780

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2024-06-11 18:17:18 +08:00
Alex Lyn
1c8db85d54 Merge pull request #9784 from Apokleos/bufix-testcases
kata-types: fix bug in kata-types several test cases
2024-06-11 10:01:45 +08:00
Saul Paredes
6a84562c16 genpolicy: load OCI version from settings
Load OCI version from genpolicy-settings.json and validate it in
rules.rego

Fixes: #9593

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-10 15:30:39 -07:00
GabyCT
0c5849b68b Merge pull request #9809 from microsoft/danmihai1/yq-breaking-change
tests: k8s: use newer yq command line format
2024-06-10 16:29:59 -06:00
Wainer Moschetta
ade69e44f9 Merge pull request #9785 from BbolroC/kubectl-retry
CI: Introduce retry mechanism for kubectl in gha-run.sh
2024-06-10 18:33:34 -03:00
Magnus Kulke
abc704a720 agent: convert enum vals to kebab-case in cfg file
fixes #9810

Add an annotation to the enum values in the agent config that will
deserialize them using a kebab-case conversion, aligning the behaviour
to parsing of params specified via kernel cmdline.

drive-by fix: add config override for guest_component_procs variable

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-06-10 21:55:05 +02:00
Dan Mihai
32198620a9 tests: k8s: use newer yq command line format
Fix the recent collision between:
- https://github.com/kata-containers/kata-containers/pull/9377
- https://github.com/kata-containers/kata-containers/pull/9725

One enabled a newer yq command line format and the other used the older
format. Both passed CI because they were not tested together.

Fixes: #9789

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-10 18:48:25 +00:00
Dan Mihai
079a0a017c Merge pull request #9557 from portersrc/ci-debug-output-nydus-pod
CI: describe pod on k8s-create-pod wait failure
2024-06-10 08:17:54 -07:00
Ryan Savino
84280115f6 Merge pull request #9151 from niteeshkd/nd_snp_kernel_hashes
runtime: enable kernel-hashes for SNP confidential container
2024-06-07 18:19:51 -05:00
GabyCT
03bcc167a4 Merge pull request #9779 from GabyCT/topic/fixcoscript
tests: Fix indentation in common script
2024-06-07 15:37:10 -06:00
Wainer Moschetta
7a28535277 Merge pull request #9800 from fidencio/topic/ci-tdx-re-enable-some-of-the-tests
ci: tdx: Re-enable a bunch of volume related tests
2024-06-07 16:17:19 -03:00
Hyounggyu Choi
8ff128dda8 CI: Introduce retry mechanism for kubectl in gha-run.sh
Frequent errors have been observed during k8s e2e tests:

- The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
- Error from server (ServiceUnavailable): the server is currently unable to handle the request
- Error from server (NotFound): the server could not find the requested resource

These errors can be resolved by retrying the kubectl command.

This commit introduces a wrapper function in common.sh that runs kubectl up to 3 times
with a 5-second interval. Initially, this change only covers gha-run.sh for Kubernetes.
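
The wrapper itself lives in the shell script common.sh; purely to illustrate the retry policy (a bounded number of attempts with a fixed pause between them), here is a generic Rust sketch. The `retry` helper and its parameters are hypothetical and are not part of the Kata CI scripts.

```rust
use std::thread::sleep;
use std::time::Duration;

// Runs `op` until it succeeds or `max_attempts` tries have been made,
// sleeping `interval` between failed attempts. Returns the last error on failure.
// Note: `op` always runs at least once, even if `max_attempts` is 0.
fn retry<T, E>(
    max_attempts: u32,
    interval: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                attempt += 1;
                sleep(interval);
            }
        }
    }
}
```

For the policy described in the commit, a caller would pass 3 attempts and `Duration::from_secs(5)`, with `op` running the kubectl command and mapping a non-zero exit status to `Err`.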

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-07 18:24:19 +02:00
Fabiano Fidêncio
81c221c1b4 ci: k8s: tdx: Re-enable volume tests
It seems I was very loose in disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:13:36 +02:00
Fabiano Fidêncio
9db9d35198 ci: k8s: tdx: Re-enable projected-volume tests
It seems I was very loose in disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:12:36 +02:00
Fabiano Fidêncio
f6a6cba8ca ci: k8s: tdx: Re-enable nested-configmap-secret tests
It seems I was very loose in disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:12:06 +02:00
Fabiano Fidêncio
957d0cccf6 ci: k8s: tdx: Re-enable inotify tests
It seems I was very loose in disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:10:39 +02:00
Fabiano Fidêncio
fc6f662ae0 ci: k8s: tdx: Re-enable credentials-secrets tests
It seems I was very loose in disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.

Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 18:08:29 +02:00
Fabiano Fidêncio
5741c6d3e6 Merge pull request #9768 from fidencio/topic/ci-tdx-enable-cdh-test
ci: kbs: Enable CDH tests for TDX
2024-06-07 17:59:12 +02:00
Greg Kurz
afeb98d73f Merge pull request #9782 from ldoktor/ci-centos-9
ci.ocp: Switch base to centos-9
2024-06-07 13:15:02 +02:00
Fabiano Fidêncio
fde457589e ci: kbs: tdx: Enable basic attestation tests
Let's stop skipping the CDH tests for TDX, as now we should have an
environment where they can run and should pass. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 12:18:50 +02:00
Fabiano Fidêncio
cac525059e ci: kbs: tdx: Use the hostname ip instead of localhost for the PCCS
We must ensure we use the host ip to connect to the PCCS running on the
host side, instead of using localhost (which has a different meaning
from inside the KBS pod).

The reason we're using `hostname -i` instead of the helper functions is
that the helper functions need the coco-kbs deployed for them to
work, and what we do happens before the deployment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-07 12:18:07 +02:00
Alex Lyn
27685c91e5 kata-types: fix bug in kata-types several test cases
(1) A misuse of cap.set caused the previously set Caps to be lost, which
made assert! fail. Fix it by replacing cap.set with cap.add.

(2) update_config_by_annotation returns an error if no runtime name
is set:
update_config_by_annotation {
    ...
    if config.runtime.name.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "Runtime name is missing in the configuration",
        ));
    }
    ...
}
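
To illustrate the difference behind item (1), here is a sketch with a hypothetical bit-flag `Caps` type; the flag names and methods are made up for the example and are not the kata-types API. Overwriting the flags drops earlier ones, while OR-ing them in preserves them.

```rust
// Hypothetical capability set stored as bit flags.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Caps(u8);

impl Caps {
    const BLOCK_DEVICE: u8 = 0b01;
    const FS_SHARING: u8 = 0b10;

    // Overwrites every flag: anything set earlier is lost.
    fn set(&mut self, flags: u8) {
        self.0 = flags;
    }

    // Adds flags on top of the existing ones.
    fn add(&mut self, flags: u8) {
        self.0 |= flags;
    }

    // True only if all of the requested flags are present.
    fn has(&self, flags: u8) -> bool {
        self.0 & flags == flags
    }
}
```

A test that calls `set` twice and then asserts on the first flag fails for exactly this reason; switching the calls to `add` keeps both flags.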

Fixes #9783

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-06-07 09:16:23 +08:00
David Esparza
822c641b58 Merge pull request #9760 from amshinde/kata-manager-link-runc
kata-manager: Add symlinks for runc and slirp4netns
2024-06-06 12:55:57 -06:00
Lukáš Doktor
699376c535 ci.ocp: Switch base to centos-9
Centos8 is EOL and repos are not available anymore. Centos9 contains the
same packages and should do well as a base for testing.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-06-06 09:03:17 +02:00
Chris Porter
4172ccb3a0 CI: describe pod on k8s-create-pod wait failure
This is generally useful debug output on test failures,
and specifically this has been useful for nydus-related
issues recently.

Signed-off-by: Chris Porter <porter@ibm.com>
2024-06-05 12:37:53 -04:00
Gabriela Cervantes
264c7e9473 tests: Fix indentation in common script
This PR fixes the indentation in common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-05 15:52:40 +00:00
Niteesh Dubey
1dbf5208ac versions: Upgrade ovmf
This is required to support SEV-SNP confidential containers with kernel-hashes.
Since this ovmf is the latest stable version, it is good to upgrade for the tdx
and vanilla builds too.

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-06-05 15:02:02 +00:00
Niteesh Dubey
62d3d7c58f runtime: enable kernel-hashes for SNP confidential container
This is required to provide the hashes of kernel, initrd and cmdline
needed during the attestation of the coco.

Fixes: #9150

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-06-05 15:02:02 +00:00
Steve Horsman
b30d085271 Merge pull request #9702 from ildikov/blog-submission-guide
docs: Adding blog submission guidelines
2024-06-05 09:03:19 +01:00
Amulya Meka
b323afeda9 Merge pull request #9214 from Amulyam24/oras
kata-deploy: install oras using release artefacts on ppc64le
2024-06-05 11:40:55 +05:30
Fabiano Fidêncio
138ef2c55f Merge pull request #9678 from AdithyaKrishnan/main
TEEs: Skip a few CI tests for SEV/SNP
2024-06-04 23:42:51 +02:00
GabyCT
ba30f0804a Merge pull request #9770 from GabyCT/topic/fixvad
tests: Use variable definition for better uniformity
2024-06-04 15:23:34 -06:00
Wainer dos Santos Moschetta
af4f9afb71 kata-deploy: add PULL_TYPE handler for CRI-O
A new PULL_TYPE environment variable is recognized by kata-deploy's
install script to allow it to configure CRI-O for the guest-pull image pulling
type.

The tests/integration/kubernetes/gha-run.sh change allows for testing it:
```
export PULL_TYPE=guest-pull
cd tests/integration/kubernetes
./gha-run.sh deploy-k8s
```

Fixes #9474
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-04 14:02:01 -03:00
GabyCT
6c2e8bed77 Merge pull request #9725 from 3u13r/feat/genpolicy/filter-by-runtime
genpolicy: add ability to filter for runtimeClassName
2024-06-04 10:06:14 -06:00
Hyounggyu Choi
869f89c338 Merge pull request #9773 from BbolroC/use-qemu-coco-dev-s390x
GHA: Use qemu-coco-dev for k8s nydus test on s390x
2024-06-04 17:49:38 +02:00
Gabriela Cervantes
cafba23f3e tests: Use variable definition for better uniformity
This PR replaces the name to use a variable that is already defined
to have a better uniformity across the general script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-04 15:49:27 +00:00
Wainer Moschetta
2b8cdd9ff2 Merge pull request #9765 from wainersm/disable_failing_jobs
CI: disable jobs that failed > 50% on nightly CI recently - part 1
2024-06-04 12:05:36 -03:00
Hyounggyu Choi
246ee83768 GHA: Use qemu-coco-dev for k8s nydus test on s390x
In line with the changes for x86_64, the k8s nydus test for s390x should
also use `qemu-coco-dev` for `KATA_HYPERVISOR`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-04 15:49:23 +02:00
Hyounggyu Choi
3aff6c5bd8 CI: Retry fetching node_start_time when it is empty
It was observed that the `node_start_time` value is sometimes empty,
leading to a test failure.

This commit retries fetching the value up to 3 times.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-04 15:41:15 +02:00
Zvonko Kaiser
647560539f Merge pull request #9769 from zvonkok/initrd-image-no-sudo
ci: remove sudo and make sure artifacts is owned by user
2024-06-04 07:16:51 +02:00
Wainer Moschetta
b5561074c3 Merge pull request #9377 from beraldoleal/yqbump
deps: bumping yq to v4.40.7
2024-06-03 14:34:58 -03:00
Ildiko Vancsa
5e03bec26b docs: Adding blog submission guidelines
The Kata blog was recently moved to the project's website. The content
of the blog is stored together with the rest of the website source on
GitHub.

This patch adds a short guide that describes how to submit a new
blog post as a PR, to appear on the project's website.

Signed-off-by: Ildiko Vancsa <ildiko.vancsa@gmail.com>
2024-06-03 08:58:05 -07:00
GabyCT
6c7affbd85 Merge pull request #9741 from GabyCT/topic/staticcheck
tests: Fix indentation in static checks script
2024-06-03 09:43:23 -06:00
Zvonko Kaiser
a48c084e13 ci: remove sudo and make sure image is owned by user
The image build needs special handling since we're doing a lot of
privileged operations.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-03 15:29:06 +00:00
Fabiano Fidêncio
34d45f0868 Merge pull request #9749 from mkulke/mkulke/configure-guest-components-spawning
CoCo: introduce config for guest-components procs
2024-06-03 15:50:36 +02:00
Ryan Savino
72dc823059 tests: k8s: sev: snp: skip "setting sysctl" test
This test fails when using `shared_fs=none` with the nydus snapshotter.
Issue tracked here: #9666
Skipping for now.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:17 -05:00
Ryan Savino
3f3be54893 tests: k8s: sev: snp: skip initContainers shared vol test
This test is failing due to the initContainers not being properly
handled with the guest image pulling.
Issue tracked here: #9668
Skipping for now.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:17 -05:00
Ryan Savino
35dfb730ce tests: k8s: sev: snp: skip "kill all processes in container" test
This test fails when using `shared_fs=none` with the nydus snapshotter.
Issue tracked here: #9664
Skipping for now.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
62cc1dec4c tests: replace docker debug alpine image with ghcr
The docker alpine latest image is rate limited.
Use the ghcr.io image instead.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
ChengyuZhu6
1820b02993 tests: replace busybox from docker with quay in guest pull
To prevent download failures caused by high traffic to the Docker image,
opt for quay.io/prometheus/busybox:latest over docker.io/library/busybox:latest .

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
6c646dc96d tests: k8s: sev: snp: add runtime annotation for sev and snp
sev and snp cases added to the KATA_HYPERVISOR switch.

Signed-off-by: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
6db08ed620 runtime: sev: snp: Use shared_fs=none
Disabling 9p for SEV and SNP TEEs.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Ryan Savino
668959408d tests: ensure kata_deploy cleanup even if namespace deletion fails
A failing test cluster namespace deletion causes kata_deploy to not get cleaned up.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:15 -05:00
Wainer dos Santos Moschetta
c9f93fc507 github: add actionlint configuration file
Added configuration file with rules to exclude some self-hosted
runners from the linter warnings.

Related-with: #9646
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:46:09 -03:00
Wainer dos Santos Moschetta
5f5274e699 CI: disable run-basic-amd64-tests / run-vfio (clh) job
The job has failed in more than 50% of nightly CI runs. Remove it from the
execution list until we have a fix.

Issue: #9764
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:34:45 -03:00
Wainer dos Santos Moschetta
9154ce9051 CI: disable run-basic-amd64-tests / run-tracing jobs
These jobs have failed in more than 50% of nightly CI runs. Remove them from the
execution list until we have a fix.

Issue: #9763
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:26:58 -03:00
Wainer dos Santos Moschetta
ac4d48ad17 CI: disable run-kata-monitor-tests / run-monitor (qemu, containerd) job
This job has failed in more than 50% of nightly CI runs. Remove it from the
execution list until we have a fix.

Issue: #9761
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:21:21 -03:00
Archana Shinde
7a3e13fae8 kata-manager: Add symlinks for runc and slirp4netns
For nerdctl install, add symlinks for runc and slirp4netns in the
binary install path.
The runc link comes in handy for running runc containers with nerdctl for
quick tests.
slirp4netns allows running containers with user-mode networking,
which is useful for rootless containers.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-31 13:53:42 -07:00
Markus Rudy
13310587ed genpolicy: check requested devices
CreateContainerRequest objects can specify devices to be created inside
the guest VM. This change ensures that requested devices have a
corresponding entry in the PodSpec.

Devices that are added to the pod dynamically, for example via the
Device Plugin architecture, can be allowlisted globally by adding their
definition to the settings file.

Fixes: #9651
Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-05-31 22:05:49 +02:00
Wainer Moschetta
f093c4c190 Merge pull request #9754 from wainersm/qemu_coco_dev-enable_policy_tests
tests/k8s: enable policy tests for qemu-coco-dev
2024-05-31 15:09:25 -03:00
Markus Rudy
ea578f0a80 genpolicy: add support for VolumeDevices
This adds structs and fields required to parse PodSpecs with
VolumeDevices and PVCs with non-default VolumeModes.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-05-31 19:34:14 +02:00
Beraldo Leal
d3a5eb299a tools: bumping kernel config version
Let's make CI happy.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
53b8158a81 tests: adding debug and skip to kata-deploy
If a test is failing during setup, it makes little sense to run the suite.
Let's skip and add some debug messages.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
9171821d57 tests: add debug message to check return code
Let's add this message to make sure sh is starting properly.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
f91fbef184 tests: increase time after sh execution
Increased sleep duration to ensure the shell process starts.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
ba5d2e54c2 tests: remove object separation mark from eof
A file should not end with a --- mark. This will confuse tools like
yq and kubectl, which might think another object follows.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
3e8b4806b8 tests: increase debug messages for kata-deploy
When the timeout happens we can't tell much information about the nodes.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
c99ba42d62 deps: bumping yq to v4.40.7
Since yq frequently updates, let's upgrade to a version from February to
bypass potential issues with versions 4.41-4.43 for now. We can always
upgrade to the newest version if necessary.

Fixes #9354
Depends-on:github.com/kata-containers/tests#5818

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
4f6732595d ci: skip go version check
golang.mk is not ready to deal with non GOPATH installs. This is
breaking test on s390x.

Since previous steps here are installing go and yq our way, we could
skip this additional check. A full refactor of golang.mk would be needed
to work with different paths.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Greg Kurz
7886ed6670 Merge pull request #9751 from wainersm/k8s_print_logs_on_fail
tests/k8s: print logs on fail only (k8s-confidential-attestation.bats)
2024-05-31 14:47:27 +02:00
Fabiano Fidêncio
44df674232 Merge pull request #9757 from fidencio/topic/ci-tdx-skip-empty-dir-tests
ci: k8s: Skip empty dir tests also for TDX
2024-05-31 13:18:35 +02:00
Magnus Kulke
9f04dc4c8b agent: introduce config for coco attestation procs
fixes #9748

A configuration option `guest_component_procs` has been introduced that
indicates which guest component processes are supposed to be spawned by
the agent. The default behaviour remains that all of those processes are
actively spawned by the agent. At the moment this is based on presence
of binaries in the rootfs and the guest_component_api_rest option.

The new option is incremental:

none -> attestation-agent -> confidential-data-hub -> api-server-rest

e.g. api-server-rest implies attestation-agent and confidential-data-hub

The `none` option has been removed from guest_component_api_rest, since
this is addressed by the newly introduced option.

To not change expected behaviour for non-coco guests, we will still
only attempt to spawn the processes if the requested attestation binaries
are present on the rootfs, and issue a warning in those cases.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-05-31 12:15:41 +02:00
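The incremental semantics of `guest_component_procs` described in 9f04dc4c8b can be sketched roughly as follows. This is an illustrative Python sketch only, not the agent's actual (Rust) implementation; the names `LEVELS` and `procs_to_spawn` are hypothetical, and only the ordering of the values is taken from the commit message.

```python
# Hypothetical sketch of the incremental guest_component_procs option.
# The ordering comes from the commit message; the names used here are
# illustrative and do not exist in kata-agent.
LEVELS = ["none", "attestation-agent", "confidential-data-hub", "api-server-rest"]


def procs_to_spawn(option: str) -> list[str]:
    """Return every process implied by the given option value."""
    if option not in LEVELS:
        raise ValueError(f"unknown guest_component_procs value: {option!r}")
    # Each level implies all earlier levels; "none" itself is not a process.
    return LEVELS[1 : LEVELS.index(option) + 1]
```

Per the commit message, the agent would additionally skip (with a warning) any process whose binary is missing from the rootfs; that check is left out of this sketch.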
Amulyam24
eadcb868f4 kata-deploy: install oras using release artefacts on ppc64le
We are currently building Oras from source on ppc64le. Now that they officially release the artefacts
for power, consume them to install Oras.

Fixes: #9213

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-05-31 14:16:14 +05:30
Zvonko Kaiser
0321a3adcc Merge pull request #8944 from zvonkok/update-threat-model
threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions
2024-05-31 10:38:27 +02:00
Fabiano Fidêncio
03a7cf4b02 ci: k8s: Skip empty dir tests also for TDX
Wainer noticed this is failing for the coco-qemu-dev case, and decided
to skip it, notifying me that he didn't fully understand why it was not
failing on TDX.

Turns out, though, this is also failing on TDX, and we need to skip it
there as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-31 09:59:46 +02:00
Fabiano Fidêncio
72a71ff2bf Merge pull request #9737 from zvonkok/kata-deploy-no-sudo
ci: kata-deploy no sudo
2024-05-31 09:55:24 +02:00
Zvonko Kaiser
dd89d35b75 Merge pull request #9747 from zvonkok/remove-git-config
ci: Remove all git config safe.directory
2024-05-31 07:25:28 +02:00
Leonard Cohnen
1d1690e2a4 genpolicy: add ability to filter for runtimeClassName
Add the CLI flag --runtime-class-names, which is used during
policy generation. For resources that can define a
runtimeClassName (e.g., Pods, Deployments, ReplicaSets,...)
the value must have any of the --runtime-class-names as
prefix, otherwise the resource is ignored.

This allows running genpolicy on larger YAML
files that define many different resources while only generating
a policy for resources which will be deployed in a
confidential context.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-05-31 03:17:02 +02:00
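The prefix filter described in 1d1690e2a4 might look something like the sketch below. It is written in Python for illustration and is not genpolicy's code; the function name `should_generate_policy` is hypothetical, and the handling of an empty filter list and of a missing runtimeClassName are assumptions that the commit message does not spell out.

```python
from typing import Optional


def should_generate_policy(runtime_class_name: Optional[str],
                           allowed_prefixes: list[str]) -> bool:
    """Hypothetical filter: keep a resource only if its runtimeClassName
    starts with one of the values passed via --runtime-class-names."""
    if not allowed_prefixes:
        # Assumption: with no filter given, every resource is processed.
        return True
    if runtime_class_name is None:
        # Assumption: a resource without runtimeClassName does not match.
        return False
    return any(runtime_class_name.startswith(p) for p in allowed_prefixes)
```

With a filter of `["kata-qemu"]`, a Pod using `kata-qemu-tdx` would be kept while one using `runc` would be ignored.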
Wainer dos Santos Moschetta
3333f8ddfd tests/k8s: enable policy tests for qemu-coco-dev
So qemu-coco-dev is on par with the TEE configurations.

Fixes: #9753
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 21:51:15 -03:00
Wainer Moschetta
83fa813700 Merge pull request #9694 from wainersm/qemu_coco_dev-k8s-guest-pull
tests: enable guest-pull on all k8s tests for the qemu-coco-dev configuration
2024-05-30 21:48:11 -03:00
Wainer dos Santos Moschetta
55ae98eb28 tests/k8s: print logs on fail only (k8s-confidential-attestation.bats)
Use the variable BATS_TEST_COMPLETED, which is defined by the bats framework
when the test finishes. `BATS_TEST_COMPLETED=` (empty) means the test failed,
so the node syslogs will be printed only in that case.

Fixes: #9750
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 17:19:33 -03:00
Wainer Moschetta
66e3b88694 Merge pull request #9746 from wainersm/nydus_snapshotter_pin
ci: pin the nydus-snapshotter image version
2024-05-30 16:49:10 -03:00
Wainer dos Santos Moschetta
3e18fe7805 tests/k8s: skip file volume tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9667
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 14:50:59 -03:00
Zvonko Kaiser
063db516f2 ci: Remove all git config safe.directory
Now with the sudo-less build we should be good
to remove those hacks.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-30 15:12:28 +00:00
Zvonko Kaiser
d8889684f0 ci: kata-deploy no sudo
Build/push/manage artifacts without sudo

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-30 15:07:27 +00:00
Wainer dos Santos Moschetta
5faf9ca344 ci: pin the nydus-snapshotter image version
It's cloning the nydus-snapshotter repo from the version specified in
versions.yaml, however, the deployment files are set to pull in the
latest version of the snapshotter image. With this version we are
pinning the image version too.

This is a temporary fix as it should be better worked out at nydus-snapshotter
project side.

Fixes: #9742
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-30 11:21:16 -03:00
Greg Kurz
b3cb19b6a7 Merge pull request #9639 from emanuellima1/rng-impl
runtime-rs: Add RNG to QEMU cmdline
2024-05-30 12:00:11 +02:00
Zvonko Kaiser
7cc0ebe75e Merge pull request #9743 from zvonkok/tools-fix
ci: Fix tools builder images
2024-05-30 11:53:34 +02:00
Zvonko Kaiser
02a7f8c852 ci: Fix tools builder images
We weren't considering changes to the tools script dir;
add a fourth hash to accommodate this.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-30 08:10:42 +00:00
Fabiano Fidêncio
97806dbdaa Merge pull request #9732 from zvonkok/shim-v2-no-sudo
ci: shim-v2 no sudo
2024-05-30 07:01:04 +02:00
Wainer dos Santos Moschetta
37894923c1 tests/k8s: skip empty dir volumes tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
79a8b31ec5 tests/k8s: skip shared volume tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9668
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
aa1a37081e tests/k8s: skip sysctls tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9666
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
0e81ced9f1 tests/k8s: skip kill-all-process tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: #9664
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
18896efa3c tests/k8s: skip seccomp tests for qemu-coco-dev
This test fails with qemu-coco-dev configuration and guest-pull image pull.
Unlike other tests that I've seen failing on this scenario, k8s-seccomp.bats
fails after a couple of consecutive executions, so it's that kind of failure
that happens once in a while.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
b62ad71c43 tests/k8s: add runtime handler annotation for qemu-coco-dev
This will enable the k8s tests to leverage guest pulling when
PULL_TYPE=guest-pull for qemu-coco-dev runtimeclass.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta
089c7ad84a tests/k8s: add runtime handler annotation only for guest-pull
The runtime handler annotation is required for Kubernetes <= 1.28 and
guest-pull pull type. So leverage $PULL_TYPE (which is exported by CI jobs)
to conditionally apply the annotation.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-29 18:37:24 -03:00
GabyCT
0eddfdc74f Merge pull request #9731 from zvonkok/pause-no-sudo
ci: pause-image no sudo
2024-05-29 11:48:41 -06:00
Zvonko Kaiser
7354c427f9 Merge pull request #9734 from zvonkok/virtiofsd-no-sudo
ci: virtiofsd no sudo
2024-05-29 19:31:25 +02:00
GabyCT
3c91aa0475 Merge pull request #9739 from zvonkok/initramfs-no-sudo
ci: initramfs no sudo
2024-05-29 11:28:59 -06:00
Hyounggyu Choi
40d2306f95 Merge pull request #9729 from zvonkok/agent-no-sudo-build
ci: build agent without sudo
2024-05-29 19:27:56 +02:00
GabyCT
03be220482 Merge pull request #9730 from zvonkok/kernel-no-sudo
ci: kernel no sudo
2024-05-29 10:23:31 -06:00
GabyCT
a32058913a Merge pull request #9679 from amshinde/kata-manager-install-cni
kata-manager: Copy cni files under /opt/cni
2024-05-29 10:20:34 -06:00
GabyCT
a5808a556d Merge pull request #9733 from zvonkok/tools-no-sudo
ci: tools no sudo
2024-05-29 10:19:17 -06:00
GabyCT
e94b09839d Merge pull request #9736 from zvonkok/qemu-no-sudo
ci: qemu no sudo
2024-05-29 10:18:34 -06:00
GabyCT
6d58fce4a9 Merge pull request #9677 from GabyCT/topic/memoryusags
metrics: Improve variable definition in memory usage script
2024-05-29 10:16:56 -06:00
Emanuel Lima
138d985c64 runtime-rs: Add RNG to QEMU cmdline
It creates this line, as the Golang runtime does:
-object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-05-29 13:11:00 -03:00
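A rough Python sketch of how the two arguments quoted in 138d985c64 fit together is shown below. It is not the runtime-rs implementation; the function name `rng_args` and its parameters are hypothetical. The point it illustrates is that the `-device` entry refers to the `-object` entry through the shared id.

```python
def rng_args(rng_id: str = "rng0", filename: str = "/dev/urandom") -> list[str]:
    """Hypothetical helper returning the QEMU arguments for a virtio RNG."""
    return [
        "-object", f"rng-random,id={rng_id},filename={filename}",
        # The device is bound to the backend object through the same id.
        "-device", f"virtio-rng-pci,rng={rng_id}",
    ]
```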
Hyounggyu Choi
6ba2461404 Merge pull request #9728 from zvonkok/coco-guest-comp-no-sudo
ci: guest-components without sudo
2024-05-29 17:55:43 +02:00
Gabriela Cervantes
09c3e08f6a tests: Fix indentation in static checks script
This PR fixes the indentation in the static checks script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-29 15:43:44 +00:00
Xuewei Niu
c297a7891c Merge pull request #9723 from zvonkok/hotunplug-fix
vfio: Fix hot-unplug
2024-05-29 22:02:05 +08:00
Zvonko Kaiser
25c784c568 ci: shim-v2 no sudo
Build shim-v2 without sudo docker; this is not needed. This is part 6 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-29 09:24:54 +00:00
Zvonko Kaiser
84a9773cec ci: initramfs no sudo
Build initramfs without sudo docker; this is not needed. This is part 10 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-29 09:20:39 +00:00
Zvonko Kaiser
7dc47c8150 ci: qemu no sudo
Build qemu without sudo docker; this is not needed. This is part 9 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 16:12:06 +00:00
Zvonko Kaiser
4a455bf24a ci: virtiofsd no sudo
Build virtiofsd without sudo docker; this is not needed. This is part 8 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 14:19:58 +00:00
Wainer Moschetta
9896f69827 Merge pull request #9414 from ldoktor/ci-bisection
ci.ocp: Document openshift pipeline and manual bisection
2024-05-28 11:17:09 -03:00
Zvonko Kaiser
dd04d26cb0 ci: tools no sudo
Build tools without sudo docker; this is not needed. This is part 7 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 13:57:20 +00:00
Zvonko Kaiser
6c9c0306ac ci: pause-image no sudo
Build pause-image without sudo docker; this is not needed. This is part 5 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 11:31:59 +00:00
Hyounggyu Choi
e8c06301d7 Merge pull request #9727 from zvonkok/ovmf-no-sudo
ci: ovmf without sudo
2024-05-28 13:29:00 +02:00
Zvonko Kaiser
c95ae5a502 ci: kernel no sudo
Build kernel without sudo docker; this is not needed. This is part 4 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 11:19:08 +00:00
Zvonko Kaiser
8fab5dd584 ci: build agent without sudo
Build agent without sudo docker; this is not needed. This is part 3 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 09:55:32 +00:00
Zvonko Kaiser
1e4cbc4fcd ci: guest-components without sudo
Build guest-components without sudo docker; this is not needed. This is part 2 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 09:03:14 +00:00
Zvonko Kaiser
b76938b922 ci: ovmf without sudo
Build ovmf without sudo docker; this is not needed. This is part 1 of N.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 08:25:27 +00:00
Zvonko Kaiser
c6c20ac253 docs: Format the threat-model to 80 chars
Truncate long lines to reasonable 80 characters

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 07:39:26 +00:00
Zvonko Kaiser
d4832b3b74 vfio: Fix hot-unplug
We need to remove the device from the tracking map; otherwise a container
restart will increment the bus index and we will run out of root-ports
and crash the machine.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 07:37:30 +00:00
Zvonko Kaiser
a7931115a0 Merge pull request #8861 from zvonkok/config-pcie-root-switch-port
gpu: reintroduce pcie_root_port and add pcie_switch_port
2024-05-27 13:17:57 +02:00
Fabiano Fidêncio
3276bb52b6 Merge pull request #9721 from fidencio/topic/ci-kata-deploy-improvements-and-fixes
kata-deploy / kata-cleanup / ci: Fixes and improvements to kata-deploy / kata-cleanup and its usage in the CI
2024-05-27 12:29:40 +02:00
Zvonko Kaiser
4c93bb2d61 qemu: Add CDI device handling for any container type
We need special handling for pod_sandbox, pod_container and
single_container to decide how and when to inject CDI devices.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-27 10:13:01 +00:00
Zvonko Kaiser
c7b41361b2 gpu: reintroduce pcie_root_port and add pcie_switch_port
In Kubernetes we still do not have proper VM sizing
at sandbox creation level. This KEP tries to mitigate
that: kubernetes/enhancements#4113 but it can take
some time until Kube and containerd or other runtimes
have those changes rolled out.

Before, we used a static config of VFIO ports, and we
introduced CDI support, which needs a patched containerd.
We want to eliminate the patched containerd in the GPU case
as well.

Fixes: #8860

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-27 10:13:01 +00:00
Fupan Li
6f6a164451 Merge pull request #9268 from zvonkok/kata-agent-createcontainer
kata-agent: CreateContainer Hook
2024-05-27 16:36:22 +08:00
Fabiano Fidêncio
e81e8a4527 tests: kata-deploy: Adjust timeout
10 minutes is way too long. Let's give it 4 minutes only.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 06:23:00 +02:00
Fabiano Fidêncio
fba5793c0d tests: kata-deploy: Run the tests from "${repo_root_dir}"
Let's see if it helps with issues like:
```
error: must build at directory: not a valid directory: evalsymlink
failure on
'"/home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/../../..//tools/packaging/kata-deploy/kata-cleanup/overlays/k0s"'
: lstat
/home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/":
no such file or directory
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 06:23:00 +02:00
Fabiano Fidêncio
8a8a7ea0e5 tests: kata-deploy: Show more logs in the setup()
This will also help us to better understand possible failures with the
CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
47d9589e9b tests: kata-deploy: Show output of passing tests
This will help us to debug failures and compare the outputs of passing
and failing runs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
dbd0d4a090 gha: Only do preventive cleanups for baremetal
This takes a few minutes that could be saved, so let's avoid doing this
on all the platforms, but simply do this when it's needed (the baremetal
use case).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
ee2ef0641c tests: k8s: Allow passing "all" to run all the tests
Currently only "baremetal" runs all the tests, but we could easily run
"all" locally or using the github provided runners, even when not using
a "baremetal" system.

The reason I'd like to have a differentiation between "all" and
"baremetal" is because "baremetal" may require some cleanup, which "all"
can simply skip if testing against a fresh created VM.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
556227cb51 tests: Add the possibility to deploy k0s / rke2
For now we've only exposed the option to deploy kata-deploy for k3s and
vanilla kubernetes when using containerd.

However, I do need to also deploy k0s and rke2 for an internal CI, and
having those exposed here does not hurt, and allows us to easily expand the
CI at any time in the future.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
e3c2f0b0f1 kata-cleanup: Add k0s kustomization
k0s was added to kata-deploy, but its kata-cleanup counterpart was
never added.  Let's fix it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Fabiano Fidêncio
f15d40f8fb kata-deploy: Fix k0s deployment
k0s deployment has been broken since we moved to using `tomlq` in our
scripts.  The reason is that before using `tomlq` our script would,
involuntarily, end up creating the file.

Now, in order to fix the situation, we need to explicitly create the
file and let `tomlq` add the needed content.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-27 05:05:06 +02:00
Alex Lyn
713c929a64 Merge pull request #9656 from pmores/document-qemu-rs-conventions
runtime-rs: document architecture & implementation conventions in qem…
2024-05-27 10:38:58 +08:00
Xuewei Niu
bb7a1c56e9 Merge pull request #9693 from sidneychang/9690/Adjust-indentation
2024-05-27 00:20:34 +08:00
Alex Lyn
55dbf6121a Merge pull request #9604 from Apokleos/qmp-cmdline01
runtime-rs: add QMP support for Qemu(part I)
2024-05-26 20:22:59 +08:00
Alex Lyn
028b10ce7a Merge pull request #9687 from l8huang/vfio-pci-gk
agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device
2024-05-26 17:48:25 +08:00
Steve Horsman
b89c3e35dd Merge pull request #9583 from cncal/update_check_error_message
runtime: make kata-runtime check error more understandable when /dev/kvm doesn't exist
2024-05-24 17:49:43 +01:00
Alex Lyn
41fb7aeb89 runtime-rs: add QMP params support in cmdline
Fixes: #9603

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-05-24 22:16:24 +08:00
Alex Lyn
7ed6c6896b runtime-rs: add an option dbg_monitor_socket for HMP support
This option allows adding a debug monitor socket when
`enable_debug = true` to control QEMU while debugging.

Fixes: #9603

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-05-24 22:16:17 +08:00
Lei Huang
3624573b12 agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device
The `update_env_pci()` function needs the PCI address mapping to
translate the host PCI address to the guest PCI address in the
environment variables below:
- PCIDEVICE_<prefix>_<resource-name>_INFO
- PCIDEVICE_<prefix>_<resource-name>

So collect PCI address mapping for both vfio-pci-gk and
vfio-pci devices.

Fixes #9614

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-05-23 21:20:01 -07:00
Fupan Li
d73876252e Merge pull request #9690 from justxuewei/agent-timeout
runtime-rs: Remove obsoleted dial_timeout config
2024-05-24 10:31:12 +08:00
Zvonko Kaiser
3affd83e14 Merge pull request #9605 from l8huang/skip-env
kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO
2024-05-23 18:45:00 +02:00
Fabiano Fidêncio
44d6cb7791 Merge pull request #9698 from wainersm/k8s_tests_disable_fail_fast
tests/k8s: disable "fail-fast" behavior by default
2024-05-23 18:28:00 +02:00
Fabiano Fidêncio
d83cf39ba1 Merge pull request #9680 from kata-containers/dependabot/go_modules/src/runtime/go_modules-5e29427af7
build(deps): bump golang.org/x/net from 0.24.0 to 0.25.0 in /src/runtime in the go_modules group across 1 directory
2024-05-23 12:55:29 +02:00
Fabiano Fidêncio
d9ee950d8f Merge pull request #9696 from wainersm/skip_custom_dns_test
tests/k8s: skip custom DNS tests on confidential jobs
2024-05-22 23:57:21 +02:00
GabyCT
e08ad8d1b7 Merge pull request #9686 from GabyCT/topic/fixbootclh
metrics: Fix minvalue for boot time
2024-05-22 15:46:50 -06:00
Wainer dos Santos Moschetta
76735df427 tests/k8s: disable "fail-fast" behavior by default
The k8s test suite halts on the first failure, i.e., failing-fast. This
isn't the behavior that we used to see when running tests on Jenkins and it
seems that running the entire test suite is still the most productive way. So
this disables fail-fast by default.

However, if you still wish to run in fail-fast mode, just export
K8S_TEST_FAIL_FAST=yes in your environment.

Fixes: #9697
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-22 18:27:44 -03:00
Fabiano Fidêncio
8eb061cd5b Merge pull request #9681 from GabyCT/topic/etdx
gha: Enable install kbs and coco components for TDX, but still skip the CDH test
2024-05-22 23:18:42 +02:00
Wainer dos Santos Moschetta
43766cdb96 tests/k8s: skip custom DNS tests on confidential jobs
This test has failed in confidential runtime jobs. Skip it
until we have a fix.

Fixes: #9663
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-22 17:08:22 -03:00
Fabiano Fidêncio
904370ecd6 tests: attestation: tdx: Skip test for now
Skipping the test will allow us to have the TDX CI running while we
debug the test.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:04:13 +02:00
Fabiano Fidêncio
414d716eef tests: kbs: Enable cli installation also on CentOS
One of our machines is running CentOS 9 Stream, and we could easily
verify that we can build and install the kbs client there, thus we're
expanding the installation script to also support CentOS 9 Stream.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
27d7f4c5b8 tests: kbs: Fix rust installation
`externals.coco-kbs.toolchain` is not defined, get the rust_version from
`externals.coco-trustee.toolchain` instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
fa8b5c76b8 tests: kbs: Add more info for the TDX deployment
Ditto in the commit shortlog.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
6ffd7b8425 versions: trustee: Bump version to 6adb8383309cbb7
We're bumping the version in order to bring in the customisation needed
for setting up a custom pccs, which is needed for the KBS integration
tests with Kata Containers + TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Fabiano Fidêncio
dbd1fa51cd tests: kbs: Don't assume /tmp/trustee exists in the machine
Instead, check if the directory exists before pushd'ing into it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-22 20:01:57 +02:00
Gabriela Cervantes
f698caccc0 gha: Enable install kbs and coco components for TDX
This PR enables the installation and uninstallation of the kbs client
as well as general coco components needed for the TDX GHA CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-22 20:01:57 +02:00
GabyCT
eaaab19763 Merge pull request #9685 from GabyCT/topic/fixic
tests: Fix indentation in confidential common script
2024-05-22 11:53:33 -06:00
Gabriela Cervantes
29a10f1373 metrics: Fix minvalue for boot time
This PR fixes the minvalue for boot time to avoid the random failures
of the GHA CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-22 17:52:51 +00:00
GabyCT
0b32360ab4 Merge pull request #9684 from stevenhorsman/add-arch-to-component-cache-tags
ci: cache: Add arch suffix to all cache tags
2024-05-22 09:24:28 -06:00
Fabiano Fidêncio
0e33ecf7fc Merge pull request #9653 from JakubLedworowski/fixes-9497-ensure-quote-generation-service-is-added-to-qemu-cmd-2
runtime: Enable connection to Quote Generation Service (QGS)
2024-05-22 15:49:23 +02:00
sidneychang
8938f35627 runtime-rs: Adjust indentation in ifneq statements within Makefile.
Replace tab indentation with spaces for the three lines within the ifneq statements, aligning them with the surrounding code.

Fixes:#9692

Signed-off-by: sidneychang <2190206983@qq.com>
2024-05-22 20:24:35 +08:00
Fabiano Fidêncio
94f7bbf253 Merge pull request #9682 from fidencio/topic/allow-increasing-cpus-and-memory-via-annotation-for-tdx
runtime: tdx: Allow default_{cpu,memory} annotations
2024-05-22 12:07:28 +02:00
Xuewei Niu
d31616cec3 runtime-rs: Remove obsoleted dial_timeout config
The `dial_timeout` works fine for Runtime-go, but is obsoleted in
Runtime-rs.

When the pod cannot connect to the Agent upon starting, we need to adjust
the `reconnect_timeout_ms` to increase the number of connection attempts to
the Agent.

Fixes: #9688

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-05-22 17:57:05 +08:00
Jakub Ledworowski
fc680139e5 runtime: Enable connection to Quote Generation Service (QGS)
For TD attestation to work, a connection to QGS on the host is needed.
By default QGS runs on vsock port 4050, but this can be modified by the host owner.
The format of the qemu object follows the SocketAddress structure, so it needs to be provided in JSON format, as in the example below:
-object '{"qom-type":"tdx-guest","id":"tdx","quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}}'

Fixes: #9497
Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>
2024-05-22 11:16:24 +02:00
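As an illustration of the JSON form described in fc680139e5, the value passed to `-object` could be assembled as in the Python sketch below. This is not the runtime's Go code; the function name `tdx_guest_object` and its parameters are hypothetical, and the defaults simply mirror the example in the commit message.

```python
import json


def tdx_guest_object(cid: str = "2", port: str = "4050") -> str:
    """Hypothetical helper building the JSON value for QEMU's -object flag."""
    obj = {
        "qom-type": "tdx-guest",
        "id": "tdx",
        # Follows the SocketAddress layout mentioned in the commit message.
        "quote-generation-socket": {"type": "vsock", "cid": cid, "port": port},
    }
    # Compact separators reproduce the spacing used in the commit's example.
    return json.dumps(obj, separators=(",", ":"))
```

Passing a different `port` covers the case where the host owner runs QGS on a non-default vsock port.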
Alex Lyn
0331859740 Merge pull request #9642 from gkurz/drop-unused-knobs-qemu-rs
runtime-rs: Drop some useless QEMU arguments
2024-05-22 16:13:14 +08:00
Alex Lyn
ce030d1804 Merge pull request #9641 from cmaf/runtime-resize-mem-1
runtime: Add missing check in ResizeMemory for CH
2024-05-22 14:05:30 +08:00
Alex Lyn
b7af00be2a Merge pull request #9624 from cncal/bugfix_duplicated_devices
runtime: fix duplicated devices requested to the agent
2024-05-22 12:45:46 +08:00
Steve Horsman
f41f642b90 Merge pull request #9635 from kata-containers/dependabot/go_modules/src/runtime/go_modules-f0df977846
build(deps): bump github.com/containerd/containerd from 1.7.11 to 1.7.16 in /src/runtime in the go_modules group across 1 directory
2024-05-21 21:19:32 +01:00
Steve Horsman
9b0ed3dfa7 Merge pull request #9657 from ajaypvictor/remote-hyp-annotations
runtime: Disable number of cpu comparison on remote hypervisor scenario
2024-05-21 21:19:12 +01:00
Hyounggyu Choi
92101fc61f Merge pull request #9658 from BbolroC/migrate-vfio-ap-test
CI: Migrate vfio-ap test files from tests repo
2024-05-21 20:21:09 +02:00
Lei Huang
b0a91b0d13 kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO
The new version of sriov-network-device-plugin adds an env
`PCIDEVICE_<prefix>_<resource-name>_INFO`, which has a json
value; kata-agent can't parse it as env
`PCIDEVICE_<prefix>_<resource-name>` which has value in format
"DDDD:BB:SS.F".

This change updates env `PCIDEVICE_<prefix>_<resource-name>_INFO`.

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-05-21 10:46:41 -07:00
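The distinction drawn in b0a91b0d13 between a plain "DDDD:BB:SS.F" value and the JSON value of the `_INFO` variable can be illustrated with the Python sketch below. It is not the agent's parsing code; `PCI_ADDRESS` and `is_pci_address` are hypothetical names, the sketch assumes hexadecimal domain, bus and slot fields with a function number from 0 to 7, and the JSON string in the usage note is made up purely for illustration.

```python
import re

# Matches the "DDDD:BB:SS.F" form: domain, bus and slot in hex, then function.
PCI_ADDRESS = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def is_pci_address(value: str) -> bool:
    """Return True only if the whole string is a single PCI address."""
    return PCI_ADDRESS.fullmatch(value) is not None
```

A value such as `0000:3b:00.1` passes this check, while a JSON document that merely contains an address does not, which is why the `_INFO` variable needs separate handling.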
stevenhorsman
db4818fe1d ci: cache: Enforce tag length limit
Container tags can be a maximum of 128 characters long,
so calculate the length of the arch suffix and then restrict
the tag to 128 minus that length.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 18:03:45 +01:00
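The length calculation described in db4818fe1d can be sketched as follows. This is a Python illustration rather than the CI's shell code; `build_cache_tag` is a hypothetical name and the `-<arch>` suffix format is an assumption. Only the 128-character limit comes from the commit message.

```python
MAX_TAG_LENGTH = 128  # limit stated in the commit message


def build_cache_tag(base_tag: str, arch: str) -> str:
    """Hypothetical helper: append an architecture suffix to a cache tag,
    truncating the base so the result never exceeds MAX_TAG_LENGTH."""
    suffix = f"-{arch}"  # assumed suffix format
    return base_tag[: MAX_TAG_LENGTH - len(suffix)] + suffix
```

Truncating the base rather than the final string keeps the architecture suffix intact, so tags for different architectures stay distinguishable.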
Gabriela Cervantes
c9e91db16f tests: Fix indentation in confidential common script
This PR fixes the indentation in the confidential common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-21 16:33:46 +00:00
stevenhorsman
d6afd77eae ci: cache: Update agent cache to use the full commit hash
- Previously I copied the logic that abbreviated the commit hash
from the versioning, but looking at our versions.yaml the clear pattern
is that when pointing at commits of dependencies we use the full
commit hash, not the abbreviated one, so for consistency I think we should
do the same with the components that we make available

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 16:51:16 +01:00
stevenhorsman
d46b6a3879 ci: cache: Add arch suffix to all cache tags
As we have multi-arch builds for nearly all components, we want to ensure
that all the cache tags we set have the architecture suffix, not just the
`TARGET_BRANCH` one.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 11:25:07 +01:00
stevenhorsman
865fa9da15 runtime: Resolve go static-checks failure
Remove `rand.Seed` call to resolve the following failure:
```
rand.Seed is deprecated: As of Go 1.20 there is no reason to call Seed with a random value.
```

The go rand.Seed docs: https://pkg.go.dev/math/rand@go1.20#Seed
back this up and state:
> If Seed is not called, the generator is seeded randomly at program startup.
so I believe we can just delete the call.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 11:08:59 +01:00
Fabiano Fidêncio
abf52420a4 runtime: tdx: Allow default_{cpu,memory} annotations
For now, let's allow the users to set the default_cpu and default_memory
when using TDX, as they may hit issues related to the size of the
container image that must be pulled and unpacked inside the guest.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-21 10:26:39 +02:00
stevenhorsman
75a201389d runtime: update go version in go.mod
- Needed because we bumped the golang version used in our CI:
`make vendor` fails unless the go version in the runtime go.mod
is increased as well, so update this and run go mod tidy

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 09:11:46 +01:00
dependabot[bot]
735185b15c build(deps): bump github.com/containerd/containerd
Bumps the go_modules group with 1 update in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd).


Updates `github.com/containerd/containerd` from 1.7.11 to 1.7.16
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.7.11...v1.7.16)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-21 09:11:46 +01:00
Ajay Victor
abe607b0c7 runtime: Disable number of cpu comparison on remote hypervisor scenario
Fixes https://github.com/kata-containers/kata-containers/issues/9238

Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
2024-05-21 13:34:21 +05:30
dependabot[bot]
01868b2849 ---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-20 22:06:41 +00:00
Fabiano Fidêncio
8879e3bc45 Merge pull request #9452 from GabyCT/topic/tdxcoco
gha: Add support to install KBS to k8s TDX GHA workflow
2024-05-20 23:28:52 +02:00
Fabiano Fidêncio
072b929b6f Merge pull request #9660 from malt3/fix/genpolicy/namespace_empty_string
genpolicy: detect empty string in ns as default
2024-05-20 21:34:13 +02:00
Gabriela Cervantes
cfdef7ed5f tests/k8s: Use custom intel DCAP configuration
This PR adds the use of custom Intel DCAP configuration when
deploying the KBS.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-20 18:44:57 +00:00
Gabriela Cervantes
cace2fd340 metrics: Improve variable definition in memory usage script
This PR improves general format like variable definition to have
uniformity across the memory usage script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-20 16:14:59 +00:00
Fabiano Fidêncio
97056b017d Merge pull request #9675 from stevenhorsman/release-build-tarballs-inherit-secrets
gha: release: Set inherit secrets on tarball builds
2024-05-20 18:06:38 +02:00
Fabiano Fidêncio
b8b3bcc492 Merge pull request #9671 from bikesheddev/fix/kata-deploy-unbound-variable
fix: kata-deploy.sh VERSION_ID unbound-variable
2024-05-20 17:22:55 +02:00
Fabiano Fidêncio
94cff3f74e Merge pull request #9315 from fidencio/topic/adapt-TEEs-for-shared_fs-none
TEEs: Use `shared_fs=none` for TDX
2024-05-20 17:17:36 +02:00
Fabiano Fidêncio
cffeb0ffb8 Merge pull request #9673 from fidencio/topic/revert-aks-workaround
Revert "ci: azure: Workaround azure cli installation script"
2024-05-20 16:16:55 +02:00
stevenhorsman
f271983aeb gha: release: Set inherit secrets on tarball builds
Now that we have updated the release builds to push artefacts to
our registry for the release, so we can cache the images, we need to
set `secrets: inherit` for all architectures' tarball builds
so that we can log into quay.io and ghcr in those steps

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-20 14:19:17 +01:00
Fabiano Fidêncio
25c9cf32ff Revert "ci: azure: Workaround azure cli installation script"
This reverts commit 5ff53e4d1c, as the
script was fixed by MSFT, at least according to:
https://github.com/Azure/azure-cli/issues/28984

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-20 14:38:46 +02:00
vac (Brendan)
d812007b99 kata-deploy: Fix unbound VERSION_ID
VERSION_ID is not guaranteed to be specified in os-release, which
makes kata-deploy break on rolling distros like Arch Linux and Void
Linux.

Note that operating system vendors may choose not to provide
version information, for example to accommodate for rolling releases.
In this case, VERSION and VERSION_ID may be unset.
Applications should not rely on these fields to be set.

Signed-off-by: vac <dot.fun@protonmail.com>
2024-05-20 19:48:31 +08:00
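A minimal sketch of reading `VERSION_ID` without tripping over `nounset`, related to commit d812007b99. The helper name `get_version_id` and the path parameter are assumptions made for illustration; the actual change in kata-deploy.sh may look different.

```bash
# Sketch only: get_version_id is a hypothetical helper, and the os-release path
# is a parameter so the function can be pointed at a test file.
get_version_id() {
    local os_release="${1:-/etc/os-release}"
    (
        # Work in a subshell so sourced variables do not leak into the caller.
        set -o nounset
        unset VERSION_ID
        . "${os_release}"
        # ":-" expands to an empty string instead of failing when unset.
        printf '%s\n' "${VERSION_ID:-}"
    )
}
```

On a rolling distro the function prints an empty line, and the caller can decide how to treat a missing version.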
Tim Zhang
857d2bbc8e agent: Fix ctr exec stuck problem
Fixes: #9532

Close stdin when write_stdin receives data of length 0.

Stop calling notify_term_close() in close_stdin, because it could
discard stdout unexpectedly.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-05-20 14:52:14 +08:00
Fabiano Fidêncio
e8ebe18868 tests: k8s: tdx: Skip liveness probe test
This test doesn't fail with the guest image pulling, but it for sure
should. :-)

We can see in the bats logs, something like:
```
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  31s               default-scheduler  Successfully assigned kata-containers-k8s-tests/liveness-exec to 984fee00bd70.jf.intel.com
  Normal   Pulled     23s               kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 345ms (345ms including waiting)
  Normal   Started    21s               kubelet            Started container liveness
  Warning  Unhealthy  7s (x3 over 13s)  kubelet            Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
  Normal   Killing    7s                kubelet            Container liveness failed liveness probe, will be restarted
  Normal   Pulled     7s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 389ms (389ms including waiting)
  Warning  Failed     5s                kubelet            Error: failed to create containerd task: failed to create shim task: the file /bin/sh was not found: unknown
  Normal   Pulling    5s (x3 over 23s)  kubelet            Pulling image "quay.io/prometheus/busybox:latest"
  Normal   Pulled     4s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 342ms (342ms including waiting)
  Normal   Created    4s (x3 over 23s)  kubelet            Created container liveness
  Warning  Failed     3s                kubelet            Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/f0ec86fb156a578964007f7773a3ccbdaf60023106634fe030f039e2e154cd11/rootfs to /run/kata-containers/liveness/rootfs, with error: ENOENT: No such file or directory: unknown
  Warning  BackOff    1s (x3 over 3s)   kubelet            Back-off restarting failed container liveness in pod liveness-exec_kata-containers-k8s-tests(b1a980bf-a5b3-479d-97c2-ebdb45773eff)
```

Let's skip it for now as we have an issue opened to track it down:
https://github.com/kata-containers/kata-containers/issues/9665

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 21:59:29 +02:00
Fabiano Fidêncio
a2c70222a8 tests: k8s: tdx: Skip initContainerd shared vol test
This is another one that is related to initContainers not being properly
handled with the guest image pulling.

Let's skip it for now as we have
https://github.com/kata-containers/kata-containers/issues/9668 to track
it down.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 20:58:45 +02:00
Fabiano Fidêncio
9d56145499 tests: k8s: tdx: Skip volume related tests
Similarly to firecracker, which doesn't have support for virtio-fs /
virtio-9p, TDX used with `shared_fs=none` will face the very same
limitations.

The tests affected are:
* k8s-credentials-secrets.bats
* k8s-file-volume.bats
* k8s-inotify.bats
* k8s-nested-configmap-secret.bats
* k8s-projected-volume.bats
* k8s-volume.bats

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 19:38:49 +02:00
Fabiano Fidêncio
606a62a0a7 tests: k8s: tdx: Skip "Setting sysctl" test
This test fails when using `shared_fs=none` with the nydus-snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9666

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 19:38:38 +02:00
Fabiano Fidêncio
937b2d5806 tests: k8s: tdx: Skip "Kill all processes in container" test
This test fails when using `shared_fs=none` with the nydus snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9664

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
03ce41b743 tests: k8s: tdx: Skip "Check custom dns" test
The test has been failing on TDX for a while, and an issue has been
created to track it down, see:
https://github.com/kata-containers/kata-containers/issues/9663

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
1a8a4d046d tests: k8s: setup: Improve / Fix logs
Let's make sure the logs will print the correct annotation and its
value, instead of always mentioning "kernel" and "initrd".

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
3f38309c39 tests: k8s: tdx: Stop running k8s-guest-pull-image.bats
We're doing that as all tests are going to be running with
`shared_fs=none`, meaning that we don't need any specific test for this
case anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:00 +02:00
Fabiano Fidêncio
e84619d54b tests: k8s: tdx: Add add_runtime_handler_annotations function
This function will set the needed annotation for enforcing that the
image pull will be handled by the snapshotter set for the runtime
handler, instead of using the default one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:49:07 +02:00
Fabiano Fidêncio
f2de259387 runtime: tdx: Use shared_fs=none
We shouldn't be using 9p, at all, with TEEs, as right now we have no
way to ensure the channels are encrypted.  The way to work around this
for now is using guest pull, either with containerd + nydus snapshotter
or with CRI-O; or even tardev snapshotter for pulling on the host (which
is the approach used by MSFT).

This is only done for TDX for now, leaving the generic, AMD, and IBM
related stuff for the folks working on those to switch and debug
possible issues on their environment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:47:09 +02:00
Fabiano Fidêncio
5b257685d9 Merge pull request #9662 from dborquez/fix_launchtimes_timestamp_generation
Fix launch times timestamp generation.
2024-05-18 21:11:09 +02:00
Fabiano Fidêncio
94786dc939 Merge pull request #9659 from stevenhorsman/remove-non-printable-tag-characters
ci: cache: Filter out non-printable characters from tag
2024-05-18 14:47:07 +02:00
Fabiano Fidêncio
874cda0e51 Merge pull request #9655 from BbolroC/add-arch-to-initramfs
CI: Append arch type to initramfs-cryptsetup image
2024-05-18 14:31:57 +02:00
Malte Poll
babdab9078 genpolicy: detect empty string in ns as default
In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace:

- ` ` (namespace field missing)
- `namespace: ""` (namespace field is the empty string)
- `namespace: "default"` (namespace field has the explicit value `default`)

Genpolicy currently does not handle the empty string case correctly.

Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>
2024-05-18 12:44:59 +02:00
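The equivalence listed in commit babdab9078 can be illustrated with a small shell sketch. genpolicy itself is written in Rust, so this is not its implementation; `normalize_namespace` is a hypothetical helper that only shows the three cases collapsing to `default`.

```bash
# Sketch only: normalize_namespace is a hypothetical helper, not genpolicy code.
normalize_namespace() {
    # "${1:-default}" covers both a missing argument and an empty string.
    printf '%s\n' "${1:-default}"
}
```

Any other value, such as `kube-system`, is passed through unchanged.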
Fabiano Fidêncio
cbfdc70a55 Merge pull request #9613 from fidencio/topic/skip-pull-image-tests-on-tees-part-II
tests: pull-image: Only skip tests for TEEs
2024-05-18 03:31:38 +02:00
Archana Shinde
0e28e904e0 kata-manager: Install cni for containerd
When just containerd is installed without installing nerdctl,
cni plugins are missing from the installation.
containerd tarball does not include cni plugin files.
Hence install cni plugins separately for containerd.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-18 00:19:57 +00:00
Archana Shinde
d23d58a484 kata-manager: Copy cni files under /opt/cni
nerdctl requires cni plugins to be installed in /opt/cni/bin
Without bridge plugin installed, it is not possible to run a
container with nerdctl.
The downloaded nerdctl tarball contains cni plugin files, but are
extracted under /usr/local/libexec.
Copy extracted tarball cni files under /usr/local/libexec
to /opt/cni/bin

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-18 00:16:48 +00:00
David Esparza
938d3dc430 metrics: fix timestamps generation from launch times test.
Use `eval` to process the `date` command along with its parameters,
thus avoiding misinterpreting the parameters as commands.

Fixes: #9661

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-05-17 14:44:41 -06:00
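A minimal sketch of why `eval` helps when a `date` invocation is stored in a string, as described in commit 938d3dc430. The variable `date_cmd`, the format string and the function `get_timestamp` are assumptions for illustration; the launch times script may use different names and formats.

```bash
# Sketch only: date_cmd and get_timestamp are hypothetical names.
date_cmd="date -u '+%Y-%m-%d %H:%M:%S'"

get_timestamp() {
    # Plain ${date_cmd} would be word-split without honoring the quotes, so
    # date would receive "'+%Y-%m-%d" and "%H:%M:%S'" as two arguments.
    # eval re-parses the string, so the quoted format stays one argument.
    eval "${date_cmd}"
}
```

Storing the command as a bash array is a common alternative that avoids `eval`; it is mentioned only as a design option, not as what the commit does.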
David Esparza
bae377b42a metrics: determine the realpath of kata-shim component.
Determine the realpath of kata-shim so that the check does not fail
when kata-shim is not a symlink, as was happening prior
to this commit.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-05-17 14:40:02 -06:00
Fabiano Fidêncio
5ff53e4d1c ci: azure: Workaround azure cli installation script
This is done in order to work around
https://github.com/Azure/azure-cli/issues/28984, following a suggestion
on the very same issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 20:28:24 +02:00
stevenhorsman
42fddb5530 ci: cache: Filter out non-printable characters from tag
- The tags have a trailing non-printable character, which results
in our cache tags having a trailing underscore e.g. `ghcr.io/kata-containers/cached-artefacts/agent:ce24e9835_`
For ease of use of these cached components, we should strip off the trailing underscore.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 14:16:40 +01:00
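One way to drop non-printable characters before a tag is built, as in commit 42fddb5530, is sketched below. The function name `strip_non_printable` is hypothetical, and the commit does not say which character was involved; a carriage return is used in the usage line purely as an example.

```bash
# Sketch only: strip_non_printable is a hypothetical helper.
strip_non_printable() {
    # -d deletes, -c complements the set: every non-printable byte is removed.
    printf '%s' "$1" | LC_ALL=C tr -dc '[:print:]'
}

# Example: a value that picked up a trailing carriage return.
strip_non_printable "$(printf 'ce24e9835\r')"
```

Filtering first means a later character-replacement step has nothing left to turn into a trailing underscore.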
Hyounggyu Choi
961735a181 CI: Migrate vfio-ap test files from tests repo
An e2e test for `vfio-ap` has been conducted internally in IBM
due to the lack of publicly available test machines equipped
with a required crypto device.
The test is performed by the `tests` repository:
(i.e. 772105b560/Makefile (L144))

The community is working to integrate all tests into the `kata-containers`
repository, so the `vfio-ap` test should be part of that effort.

This commit moves a test script and Dockerfile for a test image from
the `tests` repository. We do not rename the script to `gha-run.sh`
because it is not executed by Github Actions' workflow.

You can check the test results from the s390x nightly test with the migrated files here:
https://github.com/kata-containers/kata-containers/actions/runs/9123170010/job/25100026025

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-17 14:59:16 +02:00
stevenhorsman
a92defdffe tests: pull-image: Remove skips
Given that we think the containerd -> snapshotter image cache
problems have been resolved by bumping to nydus-snapshotter v0.13.13
we can try removing the skips to test this out

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 12:39:57 +02:00
stevenhorsman
7ac302e2d8 tests: Slacken guest pull rootfs count assert
- We previously had an expectation for the pause rootfs
to be pulled on the host when we did a guest pull. We weren't
really clear why, but it is plausibly related to the issues we had
with containerd and nydus caching. Now that is fixed we can begin
to address this by setting shared_fs=none, but let's start with
updating the rootfs host check to be not higher than expected

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
67ff58251d tests: confidential_common: Remove unneeded ensure_yq call
This test is called from `tests/integration/run_kubernetes_tests.sh`,
which already ensures that yq is installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
cc874ad5e1 tests: confidential: Ensure those only run on TEEs
Running those with the non-TEE runtime classes will simply fail.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
2bc5b1bba2 tests: pull-image: Only skip tests for TEEs
On 1423420, I've mistakenly disabled the tests entirely, for both
non-TEEs and TEEs.

This happened as I didn't realise that `confidential_setup` would take
non-TEEs into consideration. :-/

Now, let me follow-up on that and make sure that the tests will be
running on non-TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
d875f89fa2 tests: Add is_confidential_hardware()
This function is a helper to check whether the KATA_HYPERVISOR being
used is confidential hardware (TEE) or not, and we can use it to
skip or only run tests on those platforms when needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
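A helper of the kind described in commit d875f89fa2 could look roughly like this. The body is a sketch, not the repository's implementation: the list of hypervisor names treated as TEEs is an assumption and would need to be checked against the test suite.

```bash
# Sketch only: the set of KATA_HYPERVISOR values below is an assumption.
is_confidential_hardware() {
    case "${KATA_HYPERVISOR:-}" in
        qemu-tdx | qemu-snp | qemu-sev | qemu-se)
            return 0
            ;;
    esac
    return 1
}
```

A test can then branch on the return status, for example `if is_confidential_hardware; then ...; fi`, to skip or to run only on those platforms.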
Fabiano Fidêncio
4a04a1f2ae tests: Re-work confidential_setup()
Let's rename it to `is_confidential_runtime_class`, and adapt all the
places where it's called.

The new name provides a better description, leading to a better
understanding of what the function really does.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Pavel Mores
b9febc4458 runtime-rs: document architecture & implementation conventions in qemu-rs
Implementation of QemuCmdLine has a fairly uniform and repetitive structure
that's guided by a set of conventions.  These conventions have however been
mostly implicit so far, leading to a superfluous and annoying
request/force-push churn during qemu-rs PR reviews.

This commit aims to make things explicit so that contributors can take them
into account before an initial PR submission.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-05-17 12:21:44 +02:00
Hyounggyu Choi
3917930a76 CI: Append arch type to initramfs-cryptsetup image
This commit is to append an arch type to the initramfs-cryptsetup image
to prevent a wrong arch image from being pulled on a different arch host.

Fixes: #9654

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-17 11:42:49 +02:00
Steve Horsman
9a6d8d8330 Merge pull request #9650 from stevenhorsman/caching-tagging-update-partIII
Caching tagging update part iii
2024-05-17 09:09:15 +01:00
stevenhorsman
ce24e98358 ci: cache: Add tag character filtering
- Container image tags can only contain alphanumeric, period,
hyphen and underscore characters, so convert characters outside
of these to be underscores, to avoid having invalid tag failures

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 21:38:07 +01:00
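The character filtering described in commit ce24e98358 can be sketched with `tr`. The function name `sanitize_tag` is hypothetical; the allowed character set (alphanumerics, period, hyphen, underscore) is taken from the commit message.

```bash
# Sketch only: sanitize_tag is a hypothetical helper.
sanitize_tag() {
    # -c complements the set, so every character outside it becomes "_".
    # printf '%s' avoids feeding tr a trailing newline that would also be replaced.
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}

sanitize_tag 'feature/my branch:v1'
```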
stevenhorsman
a98b1e3afb ci: cache: Integrate tagging updates with recent changes
Recently the extra gpu caching was added, unfortunately when I
rebased I ended up with both the new tagging logic and old logic.
Let's try and integrate them properly to avoid doing the push twice.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 21:38:07 +01:00
Lukáš Doktor
f994f79078 ci.ocp: Add steps to reproduce/bisect CI runs
in case the upstream CI fails it's useful to pin-point the PR that
caused the regression. Currently openshift-ci does not allow doing that
from their setup but we can mimic the setup on our infrastructure and
use the available kata-deploy-ci images to find the first failing one.
To help with that add a few helper scripts and a howto.

Fixes: #9228

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:20:05 +02:00
Lukáš Doktor
a556ad7e01 ci.ocp: Document how to run openshift-tests with kata
document the ocp pipeline.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:15:32 +02:00
Lukáš Doktor
ea081bd882 ci.ocp: Add webhook cleanup
cleanup the webhook resources as well.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:15:31 +02:00
David Esparza
029a6de52b Merge pull request #9615 from GabyCT/topic/fixlaunchtime
metrics: Update launch times script
2024-05-16 11:28:44 -06:00
Steve Horsman
33e6b241ba Merge pull request #9647 from stevenhorsman/fix-artefact-tags-unbound-variable
ci: cache: Fix unbound variable
2024-05-16 16:22:47 +01:00
stevenhorsman
9d9487b17f ci: cache: Fix unbound variable
Now we have the workflow updated and can test the changes in caching
we've hit an error:
```
line 1180: artefact_tag: unbound variable
```
so we need to fix that up. Sorry for missing this before.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 14:30:32 +01:00
Steve Horsman
03c08583c3 Merge pull request #9644 from stevenhorsman/fix-broken-workflow
workflow: Remove if from env conditional
2024-05-16 14:13:25 +01:00
stevenhorsman
f7fd2f9a5d workflow: Fix problems with build-asset workflows
- It appears like the `if` isn't required when setting env as a
conditional
- `inputs.stage` over input.stage
- Swap matrix.component to matrix.asset

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 11:51:46 +01:00
Steve Horsman
d8468cb178 Merge pull request #9550 from stevenhorsman/tag-component-caches
Tag component caches
2024-05-16 11:05:18 +01:00
Steve Horsman
b31ff09b8d Merge pull request #9617 from zvonkok/artefact-repository
deploy: Add artefact repository
2024-05-16 10:41:23 +01:00
Fabiano Fidêncio
4d073c837d Merge pull request #9636 from ChengyuZhu6/snapshotter
version: Bump nydus snapshotter to v0.13.13
2024-05-16 02:54:53 +02:00
GabyCT
05cc8fae5e Merge pull request #9610 from GabyCT/topic/fixrwfio
metrics: Fix random write value for FIO
2024-05-15 17:44:41 -06:00
Gabriela Cervantes
793a02600a metrics: Fix random write value for clh for FIO
This PR decreases the random write value for clh for FIO.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-15 22:13:10 +00:00
Chelsea Mafrica
5d2af555da runtime: Add missing check in ResizeMemory for CH
ResizeMemory for Cloud Hypervisor is missing a check for the new
requested memory being greater than the max hotplug size after
alignment. Add the check, and since an earlier check for this
sets requested memory to the max hotplug size, do the same in the
post-alignment check.

Fixes #9640

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-05-15 11:29:18 -07:00
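The two checks described in commit 5d2af555da can be sketched as arithmetic. The real logic lives in the Go runtime; here the function name `clamp_hotplug_request`, the MiB units and the round-up alignment are all assumptions made for illustration.

```bash
# Sketch only: names, units (MiB) and round-up alignment are assumptions.
clamp_hotplug_request() {
    local requested="$1" max_hotplug="$2" block_size="$3"

    # Earlier check: cap the request at the max hotplug size.
    if [ "${requested}" -gt "${max_hotplug}" ]; then
        requested="${max_hotplug}"
    fi

    # Align the request up to a multiple of the block size.
    local aligned=$(( (requested + block_size - 1) / block_size * block_size ))

    # Post-alignment check: alignment may have pushed the value past the cap.
    if [ "${aligned}" -gt "${max_hotplug}" ]; then
        aligned="${max_hotplug}"
    fi

    printf '%s\n' "${aligned}"
}
```

With a request of 1000, a cap of 1010 and a block size of 128, the first check passes, alignment yields 1024, and only the second check brings the value back to 1010.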
GabyCT
d752f0aa4f Merge pull request #9627 from GabyCT/topic/ghacomk8s
gha: Fix indentation in gha run k8s common
2024-05-15 11:55:14 -06:00
Greg Kurz
bd6420e0cc runtime-rs: Drop some useless QEMU arguments
All these settings are hardcoded as `false` and result in
no extra options on the QEMU command line, like the go
runtime does. They're actually not needed:
- we're never going to ask QEMU to survive a guest shutdown
- we're never going to run QEMU daemonized since it prevents
  log collection
- we're never going to ask QEMU to start with the guest stopped

No need to keep this code around then.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-05-15 18:33:43 +02:00
stevenhorsman
7f41329010 ci: cache: Optional tag components with tags
- CoCo wants to use the agent and coco-guest-components cached artifacts,
so tag them with a helpful version to make these easier to get

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:56:40 +01:00
stevenhorsman
9999971656 release: Move component's don't ship logic
- We don't want to ship certain components (agent, coco-guest-components)
as part of the release, but for other consumers it's useful to be able to pull in the components
from oras, so rather than not building them, just don't upload it as part of the release.
- Also make the archs all consistent on not shipping the agent

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
stevenhorsman
040e6cdf12 gha: release: Set RELEASE env
- Set RELEASE env to 'yes', or 'no', based on if the stage
passed in was 'release', so we can use it in the build scripts

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
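The mapping described in commit 040e6cdf12 is set in the GitHub Actions workflow; an equivalent shell sketch is shown below. The function name `release_flag` is hypothetical.

```bash
# Sketch only: release_flag is a hypothetical helper mirroring the workflow logic.
release_flag() {
    if [ "${1:-}" = "release" ]; then
        printf 'yes\n'
    else
        printf 'no\n'
    fi
}
```

A build script could then use it as `RELEASE="$(release_flag "${stage}")"`, where `stage` is assumed to hold the stage passed in.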
stevenhorsman
d93156d84d gha: release: Push artifacts to registry on release
For other projects (e.g. CoCo projects) being able to
access the released versions of components is helpful,
so push these during the release process

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
Steve Horsman
19ca1a6656 Merge pull request #9638 from BbolroC/use-fixed-len-git-hash-explicitly
CI: Use `--abbrev=9` explicitly for abbreviated commit hash
2024-05-15 16:55:07 +01:00
GabyCT
64b915b86e Merge pull request #9438 from GabyCT/topic/addnegativetest
tests: Add k8s negative policy test
2024-05-15 08:52:57 -06:00
Hyounggyu Choi
e075150fbe CI: Use --abbrev=9 explicitly for abbreviated commit hash
The length of the output of `git log -1 --pretty=format:%h` could vary
over different CI systems, highly likely messing up their caching
mechanisms.

This commit is to use an option `--abbrev=9` to standardize the length
to 9 characters for CI.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-15 14:22:07 +02:00
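The command from commit e075150fbe can be wrapped in a small helper, sketched below. The function name `short_commit_hash` and the optional repository path argument are assumptions for illustration.

```bash
# Sketch only: short_commit_hash is a hypothetical wrapper.
short_commit_hash() {
    # --abbrev=9 asks git for at least 9 hex digits; git may print more if
    # 9 digits are not enough to identify the commit uniquely.
    git -C "${1:-.}" log -1 --pretty=format:%h --abbrev=9
}
```

Because git can lengthen the abbreviation to keep it unique, callers should treat 9 as a minimum rather than an exact width.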
Zvonko Kaiser
117e2f2ecc Merge pull request #9618 from zvonkok/nvidia-rootfs-#1
gpu: Add build targets for GPU rootfs initrd/image
2024-05-15 13:30:42 +02:00
Hyounggyu Choi
6a4ff08156 Merge pull request #9632 from BbolroC/do-not-build-agent-policy-for-s390x
local-build: Ensure the default rootfs is built with AGENT_POLICY=yes
2024-05-15 06:56:22 +02:00
ChengyuZhu6
d48c7ec979 version: Bump nydus snapshotter to v0.13.13
Bump nydus snapshotter to v0.13.13 to fix the gap when switching
different snapshotters in guest pull.

Fixes: #8407

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-05-15 12:21:01 +08:00
Fabiano Fidêncio
92bb235723 osbuilder: Log when the default policy is installed
This will help us to debug issues in the future (and would have helped
in the past as well). :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-14 20:45:49 +02:00
Fabiano Fidêncio
75bd97e8df build: Ensure the default rootfs is built with AGENT_POLICY=yes
This is needed, as b1710ee2c0 made the
default agent shipped the one with policy support.  However, we simply
didn't update the rootfs to reflect that, causing then an issue to start
the agent as shown by the strace below:
```
open("/etc/kata-opa/default-policy.rego", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
futex(0x7f401eba0c28, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0
tkill(553681, SIGABRT)                  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=553681, si_uid=1000} ---
+++ killed by SIGABRT (core dumped) +++
```

This happens as the default policy **must** be set when the agent is
built with policy support, but the code path that copies that into the
rootfs is only triggered if the rootfs itself is built with
AGENT_POLICY=yes, which we're now doing for both confidential and
non-confidential cases.

Sadly this was not caught by CI till the cache was not used for
rootfs, which should be solved by the previous commit.

Fixes: #9630, #9631

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-14 20:39:15 +02:00
Hyounggyu Choi
37060a7d2e local-build: Stop using cached artifacts when local-build/* is updated
This is to add info about the files at `tools/packaging/kata-deploy/local-build/*`
to the version of the components and ensure that the cached artefacts are not used
when the files of interest are updated.

Fixes: #9630

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-14 19:47:33 +02:00
Fabiano Fidêncio
9a3392993d Merge pull request #9629 from ldoktor/tdx_not_supported_warning
kata-deploy: Fix tdx_not_supported call
2024-05-14 17:27:56 +02:00
Greg Kurz
f14a1330d4 Merge pull request #9585 from littlejawa/debugging_the_runtime
debugging: adding a script and instructions for debugging the GO shim
2024-05-14 15:31:07 +02:00
Lukáš Doktor
d9ae130031 kata-deploy: Fix tdx_not_supported call
the `tdx_not_supported_warning` function does not exist, the
`tdx_not_supported` should be called instead.

Fixes: #9628

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-14 13:26:07 +02:00
Julien Ropé
e7cfc0865a debugging: adding a script and instructions for debugging the GO shim
Using a debugger with the kata runtime is complicated, but it can be done
and can be very useful.

This commit provides a helper script that simplifies it, and updates
the developer's documentation to explain how to use it.

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-05-14 11:12:31 +02:00
Greg Kurz
e2117d3b71 Merge pull request #9571 from emanuellima1/fix-impl-rtc
runtime-rs: Fix constructing the RTC struct
2024-05-14 09:17:27 +02:00
Gabriela Cervantes
f20a44bba3 gha: Fix indentation in gha run k8s common
This PR fixes the indentation in gha run k8s common script
to have uniformity across the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-13 20:07:47 +00:00
Fabiano Fidêncio
4d5e90038c Merge pull request #9626 from fidencio/topic/prepare-for-3.5.0-release
release: Bump VERSIONS file to 3.5.0
2024-05-13 12:52:12 +02:00
Fabiano Fidêncio
0e385452e5 release: Bump VERSIONS file to 3.5.0
Let's bump the VERSIONS file and start preparing for a new release of
the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-13 10:49:09 +02:00
Fabiano Fidêncio
c64b07f981 Merge pull request #9622 from fidencio/topic/unbreak-nvidia-gpu-build
build: nvidia-gpu: Fix cache usage of the headers tarball
2024-05-12 14:40:22 +02:00
cncal
232db2d906 runtime: fix duplicated devices requested to the agent
By default, when a container is created with the `--privileged` flag,
all devices in `/dev` from the host are mounted into the guest. If
there is a block device (e.g. `/dev/dm`) followed by a generic
device (e.g. `/dev/null`), two identical block devices (`/dev/dm`)
would be requested to the kata agent, causing the agent to exit with an error:

> Conflicting device updates for /dev/dm-2

As the generic device type does not hit any cases defined in `switch`,
the variable `kataDevice` which is defined outside of the loop is still
the value of the previous block device rather than `nil`. Defining `kataDevice`
in the loop fixes this bug.

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-12 16:38:37 +08:00
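The stale-variable bug fixed in commit 232db2d906 can be mimicked in shell. The real code is Go and uses a `switch` over device types; the `type:path` encoding, the function name `devices_to_request` and the variable `kata_device` are hypothetical stand-ins.

```bash
# Sketch only: the "type:path" encoding and function name are hypothetical.
devices_to_request() {
    local dev kata_device
    for dev in "$@"; do
        # Reset per iteration, mirroring "define kataDevice inside the loop".
        kata_device=""
        case "${dev}" in
            block:*) kata_device="${dev#block:}" ;;
            # A generic device matches no case and leaves kata_device empty.
        esac
        if [ -n "${kata_device}" ]; then
            printf '%s\n' "${kata_device}"
        fi
    done
}
```

Without the reset at the top of the loop, a generic device following a block device would reuse the previous value and the block device would be emitted twice.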
Fabiano Fidêncio
9713558477 k0s: Use a different port for kube-router's metrics
kube-router decided to use :8080 for its metrics, and this seems to be a
change that affected k0s 1.30.0+, leading to the kube-router pod crashing
all the time and nothing can actually be started after that.

Due to this issue, let's simply use a different port (:9999) and move on
with our tests.

Fixes: #9623

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-11 23:18:20 +02:00
Fabiano Fidêncio
4cd048444d build: nvidia-gpu: Fix cache usage of the headers tarball
Whenever we count on having the headers tarball, we must unpack the
cached content into the expected directory, otherwise we'd simply fail,
as we've been failing in our CI, at the end of the process where we
generate the tarball from the cached components.

It's weird to me, sincerely, that the headers tarball ends up in such a
weird place (build/kernel-nvidia-gpu/builddir/), but I'll leave that to
Zvonko to figure out whether something better can be done, as the intent
of this PR is simply to unblock Kata Containers CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-11 17:59:53 +02:00
Zvonko Kaiser
693e307f72 deploy: Add artefact repository
New env var so everyone can test the PUSH_TO_REGISTRY feature

export PUSH_TO_REGISTRY=yes
export ARTEFACT_REGISTRY=quay.io
export ARTEFACT_REPOSITORY=my-fancy-kata-containers
export ARTEFACT_REGISTRY_USERNAME=zvonkok
export ARTEFACT_REGISTRY_PASSWORD=<super-secret>

make ...-tarball

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 16:41:52 +00:00
Zvonko Kaiser
4dea73b433 Merge pull request #9616 from zvonkok/nv-kernel-hotfix
deploy: Fix wrong pushing of artifacts
2024-05-10 18:38:09 +02:00
Zvonko Kaiser
4d0f42a145 deploy: Fix wrong pushing of artifacts
Added explicit case statements for nvidia-gpu and
nvidia-gpu-confidential

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 14:08:32 +00:00
Zvonko Kaiser
85374f55d2 gpu: Add build targets for GPU rootfs initrd/image
Preparation for complete GPU rootfs build step #1/#N

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 09:47:21 +00:00
Zvonko Kaiser
8ec2cc9c0d threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions
We're missing several topics in the current threat model, so let's update it.

Fixes: #8943

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-10 07:18:44 +00:00
Fabiano Fidêncio
20515fed70 Merge pull request #9484 from zvonkok/nvidia-runtimeclasses
deploy: Add runtimeClasses relating to the NVIDIA GPU
2024-05-10 03:52:12 +02:00
Gabriela Cervantes
80e551ea74 metrics: Update launch times script
This PR updates the launch times script by improving the variable
definitions as well as trying to use the same format across the whole script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-09 21:29:32 +00:00
Emanuel Lima
59c1567f80 runtime-rs: Fix constructing the RTC struct
RTC was being built in a wrong fashion on commit #2bc5e3c6e2ab0145fa9e8be95df0d5086c07a517

RTC was being constructed inside the QemuCmdLine struct,
but it should've been built inside the devices vector.

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-05-09 15:00:47 -03:00
Fabiano Fidêncio
2f686b1179 Merge pull request #9608 from fidencio/topic/tdx-depend-on-distro-host-stack-part-II
tdx: Adapt kata-deploy to use QEMU / OVMF from the distros
2024-05-09 10:25:19 +02:00
Zvonko Kaiser
da7e6a0f07 deploy: Add runtimeClasses relating to the NVIDIA GPU
Fixes: #9483

For the added configurations we need to provide runtimeClasses.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 10:00:59 +02:00
Fabiano Fidêncio
96a100f910 Merge pull request #9482 from zvonkok/kernel-headers-tarball
kernel: Add caching of kernel-headers
2024-05-09 09:58:30 +02:00
Fabiano Fidêncio
aba56a8adb tests: measured-rootfs: Skip policy addition
Let's skip the policy addition for now, in order to get the TDX CI back
up and running, and then we can re-enable it as soon as we get
https://github.com/kata-containers/kata-containers/issues/9612 fixed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
77f457c0e1 runtime: tdx: Drop sept-ve-disable=on
This was needed when we were using an old (and not maintained anymore)
host stack.  Considering what we have as part of the distros today,
this can simply be dropped, as I cannot find any reference to this one
being needed in any up-to-date documentation.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
416d00228c Revert "qemu: tdx: Adapt command line" (partially)
This reverts commit b7cccfa019.

The `private=on` bit has never made its way upstream, and was removed
from the latest iteration that we're using.  With that in mind, let's
revert its usage in the code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
1c3037fd25 Revert "govmm: tdx: Expose the private=on|off knob"
This reverts commit 582b5b6b19.

The `private=on` bit has never made its way upstream, and was removed
from the latest iteration that we're using.  With that in mind, let's
revert its addition, and later on its usage in the code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
a9720495de kata-deploy: Ensure the distro QEMU and OVMF are used for TDX
Here we're checking the distro's `/etc/os-release` or
`/usr/lib/os-release` in order to get which distro we're deploying the
Kata Containers artefacts to, and then to properly adjust the QEMU and
OVMF with TDX support that's been shipped with the distros.

Together with that, we're also printing the instructions provided by the
distro on how to enable and use TDX.
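
A rough sketch of the os-release lookup described above, assuming the
host root is mounted at `/host`. The helper names `distroID` and
`hostDistro` are hypothetical; this is not the kata-deploy script
itself.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// distroID returns the value of the ID field in os-release content,
// or "" if the field is absent.
func distroID(content string) string {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "ID=") {
			return strings.Trim(strings.TrimPrefix(line, "ID="), `"'`)
		}
	}
	return ""
}

// hostDistro tries each candidate path in order and returns the first
// ID it finds.
func hostDistro(paths ...string) string {
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // try the next location
		}
		if id := distroID(string(data)); id != "" {
			return id
		}
	}
	return ""
}

func main() {
	// The /host prefix assumes the host root filesystem is mounted there.
	id := hostDistro("/host/etc/os-release", "/host/usr/lib/os-release")
	fmt.Printf("detected distro: %q\n", id)
}
```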

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
f48450b360 runtime: config: tdx: Add QEMU / OVMF placeholder var
Let's add the PLACEHOLDER_FOR_DISTRO_{QEMU,OVMF}_WITH_TDX_SUPPORT
variables instead of actually setting a path, so we can easily replace
those as part of our deployment scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
84b94dc2b1 kata-deploy: Expose /host to the daemon-set
We'll need to have access to the host os-release file (either under
`/etc/os-release` or under `/usr/lib/os-release`), and the simplest
approach that comes to my mind is doing what a debug pod would do:
mounting `/` as `/host`, which gives us access to those files and
lets us correctly set the TDX specific QEMU and OVMF (TDVF) paths
for the tdx available configurations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
f2d40da8e4 versions: build: Remove unused td-shim entry
We haven't been using nor testing with td-shim, as Cloud Hypervisor does
not officially support TDX yet, and TDVF is supposed to be used with
QEMU, instead of td-shim.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
ea82740b19 versions: build: Remove TDX specific QEMU
Let's remove everything related to the TDX specific QEMU building /
shipping from our repo, as we'll be relying on the one coming from the
distros.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
4292c4c3b1 versions: build: Remove TDX specific OVMF (TDVF)
Let's remove everything related to the TDVF building / shipping from our
repo, as we'll be relying on the one coming from the distro.

Later on, we may need to re-add TDVF logic, as we're already using
upstream edk2 repo / content, but when that's needed we'll simply revert
this commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Alex Lyn
946f0bdfff Merge pull request #9609 from fidencio/topic/skip-pull-image-tests-on-tees
tests: pull-image: Don't run on TEEs
2024-05-09 08:22:55 +08:00
GabyCT
3b8a910393 Merge pull request #9596 from lifupan/main
db: fix the issue of failed to init pci root bus
2024-05-08 13:14:20 -06:00
Gabriela Cervantes
2fb406ed3a metrics: Fix random write value for FIO
This PR fixes the random write value for FIO for qemu by decreasing it
to avoid the random failures of the GHA CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-08 18:54:41 +00:00
Fabiano Fidêncio
142342012c tests: pull-image: Don't run on TEEs
Let's skip those tests on TEEs as we've been facing a reasonable amount
of issues, most likely on the containerd side, related to pulling the
image on the guest.

Once we're able to fix the issues on containerd, we can get back and
re-enable those by reverting this commit.

The decision of disabling the tests for TEEs is because the machines may
end up in a state where human intervention is necessary to get them back
to a functional state, and that's really not optimal for our CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-08 18:40:22 +02:00
Fabiano Fidêncio
c0bf9e9bc6 Merge pull request #9607 from fidencio/topic/tdx-depend-on-distro-host-stack-part-I
ci: Stop building TDX specific QEMU and OVMF
2024-05-08 15:53:15 +02:00
Zvonko Kaiser
fb0b821771 kernel: Add caching of kernel-headers
Fixes: #9481

We need to cache the kernel-headers for the NVIDIA GPU initrd/image build.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-08 11:30:39 +00:00
Fabiano Fidêncio
12dc9f83df ci: Stop building TDX specific QEMU and OVMF
This is the first step of the work to start relying on the artefacts
coming from the distros (CentOS 9 Stream, and Ubuntu) themselves.

Let's have this first one merged, as this will not run the CI due to the
changes being on the yaml itself, and then follow-up with the changes
needed on other parts of the project (kata-deploy, runtime, etc).

Fixes: #9590 -- part I

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-08 11:39:32 +02:00
Alex Lyn
875e6e3815 Merge pull request #9601 from cncal/fix_redundant_log
qemu: the error is logged only when it occurs
2024-05-08 08:59:01 +08:00
GabyCT
22087f9db9 Merge pull request #9598 from lifupan/main_shim
runtime-rs: fix the issue of the leak of dead shim
2024-05-07 10:14:11 -06:00
GabyCT
a564422b7b Merge pull request #9582 from cncal/main
build: fix the confusing build message if yq doesn't exist in GOPATH/bin
2024-05-07 09:34:27 -06:00
Fabiano Fidêncio
cd84414c63 Merge pull request #9600 from GabyCT/topic/deleteoci
versions: Remove oci information from versions file
2024-05-07 13:15:35 +02:00
Fabiano Fidêncio
ddf6b367c7 Merge pull request #9568 from kata-containers/dependabot/go_modules/src/runtime/go_modules-22ef55fa20
build(deps): bump the go_modules group across 5 directories with 8 updates
2024-05-07 13:14:48 +02:00
Steve Horsman
e967db60ab Merge pull request #9592 from sprt/mariner-before-ch39
tests: adapt Mariner CI to unblock CH v39 upgrade
2024-05-07 11:52:55 +01:00
cncal
15d511af97 qemu: the error is logged only when it occurs
Every time I create a container on an arm64 machine, containerd/kata logs a redundant warning
as follows:
``` shell
time="2024-05-07" level=warning msg="<nil>" arch=arm64 name=containerd-shim-v2
pid=xxx sandbox=fdd1f05 source=virtcontainers/hypervisor
```
I added an error check so that the error is logged only when it actually occurs.

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-07 14:28:04 +08:00
Gabriela Cervantes
aecede11fc versions: Remove oci information from versions file
This PR removes oci information from the versions file, as this is no
longer being used in the kata containers repository.

Fixes #9599

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 20:14:00 +00:00
Gabriela Cervantes
b54dc26073 gha: Enable uninstall kbs client function for coco gha workflow
This PR enables the uninstall kbs client function for coco gha tdx
workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:55:24 +00:00
Gabriela Cervantes
aaf9b54d97 gha: Add support to install KBS to k8s TDX GHA workflow
This PR adds support to install KBS to k8s TDX GHA workflow in
order to run confidential attestation tests.

Fixes #9451

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:42:17 +00:00
Gabriela Cervantes
506e17a60d tests: Add k8s negative policy test
This PR adds a k8s negative policy test to the confidential attestation
bats test.

Fixes #9437

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:28:54 +00:00
Fupan Li
3694f3d9fe runtime-rs: fix the issue of the leak of dead shim
We should init and assign the runtime instance to the runtime
handler. Otherwise, if the pause container fails to start,
which means the runtime instance failed to start, the
following delete & shutdown requests won't be run, and
the dead shim will be left behind.

Fixes: #9597

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-05-06 17:31:31 +08:00
Fupan Li
26bee78e8d db: fix the issue of failed to init pci root bus
dragonball reserves 2048G of mmio space for the pci root bus by default
on physical addresses greater than 4G. However, on some machines with
smaller physical address widths, such as 39-bit wide physical addresses,
the space available when dragonball reserves the mmio space during memory
initialization is less than 2048G, so this commit dynamically calculates
and allocates the mmio size of each pci root bus.

Fixes: #9509

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-05-06 11:34:18 +08:00
Aurélien Bombo
0cc2b07a8c tests: adapt Mariner CI to unblock CH v39 upgrade
The CH v39 upgrade in #9575 is currently blocked because of a bug in the
Mariner host kernel. To address this, we temporarily tweak the Mariner
CI to use an Ubuntu host and the Kata guest kernel, while retaining the
Mariner initrd. This is tracked in #9594.

Importantly, this allows us to preserve CI for genpolicy. We had to
tweak the default rules.rego however, as the OCI version is now
different in the Ubuntu host. This is tracked in #9593.

This change has been tested together with CH v39 in #9588.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-05-03 16:29:12 +00:00
cncal
48d873b52b build: fix the confusing build message if yq doesn't exist in GOPATH/bin
The build message shows that yq was not found when I tried to build
runtime binaries, but I've actually installed yq by yum install.

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-03 08:34:45 +08:00
cncal
9caa7beb1f runtime: make kata-runtime check error more understandable
If device /dev/kvm does not exist, kata-runtime check would fail with
an ambiguous error message 'no such file or directory'. I added a little
more detail to make it understandable, and it will look like:

```
ERRO[0000] cannot open kvm device: no such file or directory  arch=arm64 check-type=full device=/dev/kvm name=kata-runtime pid=2849085 source=runtime
ERRO[0000] no such file or directory                          arch=arm64 name=kata-runtime pid=2849085 source=runtime
no such file or directory
```

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-03 08:29:08 +08:00
Zvonko Kaiser
e5e0983b56 Merge pull request #9476 from zvonkok/nvidia-config-tomls
config: Add NVIDIA GPU SNP, TDX configuration files
2024-05-02 10:27:10 +02:00
Fabiano Fidêncio
f04a7a55ed Merge pull request #9563 from fidencio/topic/agent-use-policy-by-default
build: Build the shipped agent with policy enabled
2024-05-01 12:22:05 +02:00
Fabiano Fidêncio
33a8701904 Merge pull request #9573 from littlejawa/kata_deploy_crio_conf
kata-deploy: configure debugging for crio
2024-05-01 12:19:10 +02:00
Julien Ropé
c2aed995b7 kata-deploy: configure debugging for crio
Fix the configuration for crio's log_level

Fixes: #9556

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-04-30 17:48:43 +02:00
stevenhorsman
3c2232d898 runtime: fix testVersionString logic
- The testVersionString logic uses a regex to check that the ociVersion is
displayed correctly, but with the new go module that version has a
`+` in it, so we need to quote it to escape special characters

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-30 10:54:49 +01:00
dependabot[bot]
391bc35805 build(deps): bump the go_modules group across 5 directories with 8 updates
Bumps the go_modules group with 2 updates in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd) and [github.com/containers/podman/v4](https://github.com/containers/podman).
Bumps the go_modules group with 4 updates in the /src/tools/csi-kata-directvolume directory: [golang.org/x/sys](https://github.com/golang/sys), google.golang.org/protobuf, [golang.org/x/net](https://github.com/golang/net) and [google.golang.org/grpc](https://github.com/grpc/grpc-go).
Bumps the go_modules group with 2 updates in the /src/tools/log-parser directory: [golang.org/x/sys](https://github.com/golang/sys) and gopkg.in/yaml.v3.
Bumps the go_modules group with 2 updates in the /tests directory: [golang.org/x/sys](https://github.com/golang/sys) and gopkg.in/yaml.v3.
Bumps the go_modules group with 2 updates in the /tools/testing/kata-webhook directory: [golang.org/x/sys](https://github.com/golang/sys) and [golang.org/x/net](https://github.com/golang/net).


Updates `github.com/containerd/containerd` from 1.7.2 to 1.7.11
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.7.2...v1.7.11)

Updates `github.com/containers/podman/v4` from 4.2.0 to 4.9.4
- [Release notes](https://github.com/containers/podman/releases)
- [Changelog](https://github.com/containers/podman/blob/v4.9.4/RELEASE_NOTES.md)
- [Commits](https://github.com/containers/podman/compare/v4.2.0...v4.9.4)

Updates `google.golang.org/protobuf` from 1.29.1 to 1.33.0

Updates `github.com/cyphar/filepath-securejoin` from 0.2.3 to 0.2.4
- [Release notes](https://github.com/cyphar/filepath-securejoin/releases)
- [Commits](https://github.com/cyphar/filepath-securejoin/compare/v0.2.3...v0.2.4)

Updates `golang.org/x/sys` from 0.15.0 to 0.19.0
- [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0)

Updates `google.golang.org/protobuf` from 1.31.0 to 1.33.0

Updates `golang.org/x/net` from 0.19.0 to 0.23.0
- [Commits](https://github.com/golang/net/compare/v0.19.0...v0.23.0)

Updates `google.golang.org/grpc` from 1.59.0 to 1.63.2
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.59.0...v1.63.2)

Updates `golang.org/x/sys` from 0.0.0-20191026070338-33540a1f6037 to 0.1.0
- [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0)

Updates `gopkg.in/yaml.v3` from 3.0.0-20200313102051-9f266ea9e77c to 3.0.0

Updates `golang.org/x/sys` from 0.0.0-20220429233432-b5fbb4746d32 to 0.19.0
- [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0)

Updates `gopkg.in/yaml.v3` from 3.0.0-20210107192922-496545a6307b to 3.0.0

Updates `golang.org/x/sys` from 0.15.0 to 0.19.0
- [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0)

Updates `golang.org/x/net` from 0.19.0 to 0.23.0
- [Commits](https://github.com/golang/net/compare/v0.19.0...v0.23.0)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: github.com/containers/podman/v4
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: google.golang.org/protobuf
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: github.com/cyphar/filepath-securejoin
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: golang.org/x/sys
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: golang.org/x/sys
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: gopkg.in/yaml.v3
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: golang.org/x/sys
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: gopkg.in/yaml.v3
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: golang.org/x/sys
  dependency-type: indirect
  dependency-group: go_modules
- dependency-name: golang.org/x/net
  dependency-type: indirect
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-30 09:46:13 +01:00
Wainer Moschetta
eae429a39b Merge pull request #9552 from wainersm/kata_cc_dev
runtime: new qemu-coco-dev configuration
2024-04-30 05:21:49 -03:00
Zvonko Kaiser
28078ded84 Merge pull request #9570 from stevenhorsman/dependabot-commit-check-skip
workflow: static-checks: Skip commit checks for dependabot
2024-04-29 23:00:35 +02:00
Pavel Mores
1dd06cf40d Merge pull request #9551 from pmores/support-iommu
runtime-rs: support IOMMU in qemu VMs
2024-04-29 15:26:11 +02:00
stevenhorsman
0bec8721cc workflow: Skip commit checks for dependabot
Dependabot doesn't follow all our commit format guidelines,
so add a check and skip these if the author is `dependabot[bot]`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-29 13:45:51 +01:00
Wainer dos Santos Moschetta
631f6f6ed6 gha: switch CoCo tests on non-TEE to use qemu-coco-dev
With the addition of the 'qemu-coco-dev' runtimeClass we no longer need
to run CoCo tests on non-TEE environments with 'qemu'. As a result the
tests also no longer need to set the "io.katacontainers.config.hypervisor.image"
annotation to pods.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-29 05:45:11 -03:00
Wainer dos Santos Moschetta
c6708726ff kata-deploy: install the new kata-qemu-coco-dev runtimeclass
Created the runtimeclasses/kata-qemu-coco-dev.yaml file and updated the list
of SHIMS.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-29 05:45:11 -03:00
Wainer dos Santos Moschetta
42fb5d7760 runtime: new qemu-coco-dev configuration
Created a new configuration to configure Kata for CoCo without requiring TEE
hardware, so as to allow developers to implement/test/debug platform agnostic code
on their workstations. It will also ease testing of CoCo features on CI with
non-TEE supported VMs.

This is based off the qemu configuration. The following differences apply:
 - switched to confidential guest image/initrd
 - switched to confidential kernel
 - switched to 9p shared_fs

Fixes #9487
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-29 05:45:10 -03:00
Fabiano Fidêncio
d3b300ff95 build: tests: Remove agent-opa
Now that the `kata-agent` is being built with policy support, let's stop
building the `kata-opa-agent`, reducing the amount of things we need to
test and maintain.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-28 12:52:54 +02:00
Fabiano Fidêncio
b1710ee2c0 build: Build the shipped agent with policy enabled
Now that the OPA binary is not required anymore, let's start shipping
the agent with the policy enabled by default.

The agent *without* policy enabled has 30MB, while it's 34MB *with* the
policy enabled.

This 4MB (~10%) increase is, IMHO, worth it in order to reduce the
amount of components we have to maintain and test, including the
possibility to also reduce the amount of possible rootfs / initrd
images.

Whoever wants to use the agent without policy enabled can simply do that
by building their own agent. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-28 12:52:54 +02:00
Fabiano Fidêncio
7b039eb1b9 Merge pull request #9559 from fidencio/topic/remove-opa-stuff
rootfs: Stop building and shipping OPA
2024-04-28 12:52:07 +02:00
Fabiano Fidêncio
fe21d7a58b rootfs: Stop building and shipping OPA
Since OPA binary was replaced by the regorus crate, we can finally stop
building and shipping the binary.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-26 18:51:28 +02:00
Fabiano Fidêncio
7dd2fde22d Revert "rootfs: Make OPA build working in docker for s390x and ppc64le"
This reverts commit d523e865c0, as we will
not depend on the OPA binary anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-26 18:51:27 +02:00
Hyounggyu Choi
62bad976e0 Merge pull request #9562 from BbolroC/bump-golang
build: Update golang version to 1.22.2
2024-04-26 17:58:04 +02:00
Steve Horsman
34a1cdc5c7 Merge pull request #9528 from cncal/patch-1
doc: fix missing document link
2024-04-26 15:22:15 +01:00
Hyounggyu Choi
80cb4a6c18 build: Update golang version to 1.22.2
As we have an issue with a golang version for `run-cri-containerd`,
it is required to bump the language.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-26 15:50:29 +02:00
Pavel Mores
908ec31d9b runtime-rs: fix iommu_platform support for qemu vhost-user-fs device
iommu_platform support was already added on initial DeviceVhostUserFs
introduction, however it incorrectly enabled iommu_platform also on
non-CCW (e.g. PCI) systems.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:48:00 +02:00
Pavel Mores
174fc8f44b runtime-rs: support iommu_platform for qemu virtio-net device
Note that it's only supported on CCW systems.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:48:00 +02:00
Pavel Mores
0d038f20cc runtime-rs: support iommu_platform for qemu virtio-serial device
iommu_platform is only turned on for CCW systems.

PartialEq is added to VirtioBusType to enable the '==' operator.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:48:00 +02:00
Pavel Mores
66a2dc48ae runtime-rs: support iommu_platform for qemu vhost-vsock device
iommu_platform addition is controlled solely by the configuration file.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:48:00 +02:00
Pavel Mores
d1e6f9cc4e runtime-rs: add IOMMU to qemu VM if configured
The adding itself is done by a new function add_iommu() that conforms with
the add_*() convention.  Note though that this function is called
internally, by the QemuCmdLine constructor, simply because there's nothing
to trigger its invocation from QemuInner (unlike the other add_*()
functions so far).

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:48:00 +02:00
Pavel Mores
0859f47a17 runtime-rs: add representation of '-device intel-iommu' to qemu-rs
Following the golang shim example, the values are hardcoded.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:47:51 +02:00
Pavel Mores
702bf0d35e runtime-rs: support qemu machine's 'kernel_irqchip' param
We will want to set kernel_irqchip when enabling IOMMU and this commit
adds the requisite support.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-04-26 14:42:54 +02:00
Alex Lyn
f72c6ba814 Merge pull request #9519 from emanuellima1/impl-rtc
runtime-rs: Add RTC to QEMU cmdline
2024-04-26 17:44:47 +08:00
Dan Mihai
b42ddaf15f Merge pull request #9530 from microsoft/saulparedes/improve_caching
genpolicy: changing caching so the tool can run concurrently with itself
2024-04-25 13:06:23 -07:00
David Esparza
ae317a319f Merge pull request #9549 from JakubLedworowski/fix-tarball-dockerfile
build: Fix tarball not building correctly in docker
2024-04-25 09:40:20 -06:00
James O. D. Hunt
5bd614530f Merge pull request #9525 from jodh-intel/gha-k8s-ch-dm
gha: Enable k8s tests for cloud hypervisor with devicemapper
2024-04-25 09:28:09 +01:00
Fabiano Fidêncio
b4360e7e37 Merge pull request #9510 from microsoft/danmihai1/regorus-policy2
agent: use regorus instead of opa
2024-04-24 21:40:29 +02:00
James O. D. Hunt
ff7349b6f0 gha: Enable k8s tests for cloud hypervisor with devicemapper
Enable the k8s tests for cloud hypervisor with devicemapper.

Fixes: #9221.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-24 16:32:51 +01:00
Dan Mihai
2400a4d249 Merge pull request #9428 from arc9693/archana1/genplicyfixes
genpolicy: implement default methods for K8sResource trait
2024-04-24 08:04:19 -07:00
Dan Mihai
ff385eac41 agent: remove unnecessary comment
Remove reminder to initialize Policy earlier, because currently there
are no plans to initialize earlier.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-24 14:53:51 +00:00
Jakub Ledworowski
73366da9f9 build: Fix tarball not building correctly in docker
When docker is installed on the host system using the script from https://get.docker.com/, it automatically creates a docker group with gid=999.
Then, during the docker build of a tarball (e.g. make qemu-tdx-experimental-tarball), docker is also installed inside the image with the same
script, which also automatically adds a docker group with gid=999.
The build then tries to add a new group docker_on_host with gid=999, which already exists, and this breaks the build.

Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>
2024-04-24 15:35:36 +02:00
Calvin Liu
56a73ee704 doc: fix missing document link
The hardware-requirements document section is located in /README.md for now.

Signed-off-by: Calvin Liu <flycalvin@qq.com>
2024-04-24 17:34:30 +08:00
Fabiano Fidêncio
4e35f11a3d Merge pull request #9535 from fidencio/topic/fix-crio-debug-drop-in
kata-deploy: Stop append `log_level = "debug"` for CRI-O
2024-04-24 10:03:36 +02:00
Dan Mihai
89c85dfe84 Merge pull request #9432 from UiPath/fix-clh-wait
clh: isClhRunning waits for full timeout when clh exits
2024-04-23 13:02:45 -07:00
Hyounggyu Choi
608df9b7df Merge pull request #9494 from BbolroC/guest-pull-gha-s390x
CC: Enable guest-pull tests on non-TEE for s390x
2024-04-23 21:22:37 +02:00
Dan Mihai
e5c3f5fa9b tests: no generated policy for untested platforms
Avoid auto-generating Policy on platforms that haven't been tested
yet with auto-generated Policy.

Support for auto-generated Policy on these additional platforms is
coming up in future PRs, so the tests being fixed here were
prematurely enabled.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-23 16:07:03 +00:00
Emanuel Lima
2bc5e3c6e2 runtime-rs: Add RTC to QEMU cmdline
Add RTC by hardcoding the options base=utc,driftfix=slew,clock=host

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-04-23 10:46:30 -03:00
Fabiano Fidêncio
d190c9d4d9 kata-deploy: Stop append log_level = "debug" for CRI-O
This should only be done once, and if CRI-O restarts, there's a big
chance kata-deploy will also restart and the user would end up with a
file that looks like:
```
[crio]
log_level = "debug"
[crio]
log_level = "debug"
[crio]
log_level = "debug"
...
```

And that would simply cause CRI-O to not start.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-23 14:51:35 +02:00
Greg Kurz
42a79801f3 Merge pull request #9524 from littlejawa/fix_createruntime_hook_not_called
runtime: Call CreateRuntime hooks at container creation time
2024-04-23 13:43:36 +02:00
Fupan Li
469c4e4f44 Merge pull request #9335 from Tim-Zhang/fix-passfd-fifo-open
passfd-io: fix FIFO opening and vsock handling
2024-04-23 09:04:45 +08:00
Alex Lyn
bc2cf95e7a Merge pull request #9517 from amshinde/update-storage-source-pciblock
runtime-rs: Update storage source for pci block devices
2024-04-23 07:32:36 +08:00
Dan Mihai
5d31eb4847 agent: use regorus 0.1.4
Use regorus 0.1.4 from crates.io, instead of its source code
repository.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 23:21:17 +00:00
Dan Mihai
ed6412b63c tests: k8s: reduce the policy tests output noise
Hide some of the kubectl output, to reduce the size and redundancy of
this output.

Fixes: #9388

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 19:59:33 +00:00
Dan Mihai
df23eb09a6 agent: use regorus instead of opa
Implement Agent Policy using the regorus crate instead of the OPA
daemon.

The OPA daemon will be removed from the Guest rootfs in a future PR.

Fixes: #9388

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 19:58:30 +00:00
Dan Mihai
58e608d61a tests: remove k8s-policy-set-keys.bats
Remove k8s-policy-set-keys.bats in preparation for using the regorus
crate instead of the OPA daemon for evaluating the Agent Policy. This
test depended on sending HTTP requests to OPA.

Fixes: #9388

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 19:49:38 +00:00
Dan Mihai
b509c1beee agent: lock anyhow version to 1.0.58
Lock anyhow version to 1.0.58 because:

- Versions between 1.0.59 - 1.0.76 have not been tested yet using
  Kata CI. However, those versions pass "make test" for the
  Kata Agent.

- Versions 1.0.77 or newer fail during "make test" - see
  https://github.com/kata-containers/kata-containers/issues/9538.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 19:49:15 +00:00
Archana Shinde
cc6b671101 runtime-rs: Update storage source for pci block devices
In case of block devices using virtio-block, we need to pass the
pci-path as the storage source field to the agent.
Currently the virt-path is being passed, which works only for mmio block
devices.
In the future when support is added for scsi, block-ccw and pmem
devices, the storage source would need to be handled accordingly.

Fixes: #9034

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-04-22 11:36:58 -07:00
Hyounggyu Choi
f10744df99 CC: Enable guest-pull tests on non-TEE for s390x
This commit is to add a new CI job to run-k8s-tests-on-zvsi.yaml.
Why the job is not configured in run-kata-coco-tests.yaml by having it
integrated with `run-k8s-tests-coco-nontee` is:

- It uses k3s instead of AKS
- It runs on a self-hosted runner

These differences make the integrated job not easy to read and maintain
when it comes to incorporating other platforms in the near future.

Fixes: #9467

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-22 17:15:20 +02:00
Greg Kurz
6ca0f09710 Merge pull request #9518 from microsoft/danmihai1/agent-cargo-lock
agent: update cargo.lock
2024-04-22 13:36:06 +02:00
Tim Zhang
aeba483ec8 agent: avoid fd leakage of passfd-io
In do_create_container and do_exec_process, we should create the proc_io first,
so that if an error occurs later on, we can make sure
the io stream is closed.

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-04-22 17:39:33 +08:00
Tim Zhang
8441187d5e runtime-rs: fix FIFO handling
Fixes: #9334

In Linux, when a FIFO is opened and there are no writers, the reader
will continuously receive the HUP event. This can be problematic.
To avoid this problem, we open stdin in write mode and keep the stdin-writer.

We need to open stdout/stderr in read mode and keep the opened endpoint
until the process is deleted. Otherwise, the process could exit before
containerd opens and reads the stdout fifo: runD would write all of the
stdout contents into the stdout fifo and then close the write endpoint.
When containerd then opened the stdout fifo and tried to read, the write
side would already be closed, so containerd would block on the read forever.
Here we keep the stdout/stderr read endpoint File in the common_process,
which is destroyed when containerd sends the delete rpc call. By that time
containerd has already waited for the stdout read to return, which ensures
that the contents of the stdout/stderr fifo are not lost.

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-04-22 17:39:33 +08:00
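The stdin half of the FIFO fix above can be illustrated with a minimal, Linux-only sketch. It is not the runtime-rs code: `KeptFifo` and `open_fifo_keeping_writer` are hypothetical names, and opening the keep-alive end with read+write (rather than write-only) is an assumption made here so that the open does not block when only the standard library is available.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Both ends of a FIFO, opened so the reader never sees "no writers".
/// (Hypothetical type, for illustration only.)
pub struct KeptFifo {
    /// Read end: the one the consumer actually reads or polls.
    pub reader: File,
    /// Extra write-capable end, held only so the FIFO always has a writer.
    pub keep_writer: File,
}

/// Open an existing FIFO at `path` and keep a writer alive next to the reader.
///
/// Assumption: on Linux, opening a FIFO with read+write does not block even
/// if no peer has opened it yet (POSIX leaves this case unspecified).
pub fn open_fifo_keeping_writer(path: &Path) -> io::Result<KeptFifo> {
    let keep_writer = OpenOptions::new().read(true).write(true).open(path)?;
    // A writer now exists, so this read-only open returns immediately.
    let reader = File::open(path)?;
    Ok(KeptFifo { reader, keep_writer })
}
```

As long as `keep_writer` stays alive, the reader does not observe a hang-up when other writers come and go. Dropping the `KeptFifo` value releases both ends.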
Tim Zhang
d68eb7f0ad agent: Fix close_stdin for passfd-io
In scenario passfd-io, we should wait for stdin to close itself
instead of manually intervening in it.

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-04-22 17:39:32 +08:00
Steve Horsman
ff9985fc50 Merge pull request #9490 from wainersm/port_attestation_nontee_job
gha: move attestation tests to run-k8s-tests-coco-nontee
2024-04-22 10:23:11 +01:00
Archana Choudhary
4a010cf71b genpolicy: add default implementations for K8sResource trait
This commit adds default implementations for following methods of
K8sResource trait:
- generate_policy
- serialize

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:59:02 +00:00
Archana Choudhary
6edc3b6b0a genpolicy: add default implementation for use_sandbox_pidns
This patch adds a default implementation for the use_sandbox_pidns
and updates the structs that implement the K8sResource trait to use
the default.

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:59:02 +00:00
Archana Choudhary
d5d3f9cda7 genpolicy: add default implementation for use_host_network
- Provide default implementation for use_host_network
- Remove default implementation from structs implementing the trait K8sResource

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:59:02 +00:00
Archana Choudhary
9a3eac5306 genpolicy: add default impl for get_containers
- Provide default impl for get_containers
- Remove default impl from structs implementing the trait K8sResource

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:59:02 +00:00
Archana Choudhary
2db3470602 genpolicy: add default impl for get_container_mounts_and_storages
- Provide default impl for get_container_mounts_and_storages
- Remove default impl from structs implementing the trait K8sResource

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:59:02 +00:00
Archana Choudhary
09b0b4c11d genpolicy: add default implementation for get_sandbox_name
- Provide default implementation for get_sandbox_name in K8sResource trait
- Remove default implementation from structs implementing the trait K8sResource

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:55:32 +00:00
Archana Choudhary
43e9de8125 genpolicy: add default implementation for get_annotations
- Provide default implementation for get_annotations.
- Remove default implementation from structs implementing the trait K8sResource

Fixes: #8960
Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-04-21 12:55:32 +00:00
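The pattern used by the genpolicy commits above (defaults on the trait, overrides only where a resource differs) can be sketched as follows. This is a simplified stand-in, not genpolicy's actual trait: the method names come from the commit titles, while the signatures, return types, the `kind` method and both structs are assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for genpolicy's `K8sResource` trait.
pub trait K8sResource {
    /// Required method (hypothetical): no sensible default exists.
    fn kind(&self) -> &'static str;

    /// Default: the resource does not define a sandbox name.
    fn get_sandbox_name(&self) -> Option<String> {
        None
    }

    /// Default: the resource does not use the host network.
    fn use_host_network(&self) -> bool {
        false
    }

    /// Default: the resource does not share the sandbox PID namespace.
    fn use_sandbox_pidns(&self) -> bool {
        false
    }
}

/// A resource that is happy with every default.
pub struct ConfigMap;

/// A resource that overrides only the methods where it differs.
pub struct Pod {
    pub name: String,
    pub host_network: bool,
}

impl K8sResource for ConfigMap {
    fn kind(&self) -> &'static str {
        "ConfigMap"
    }
}

impl K8sResource for Pod {
    fn kind(&self) -> &'static str {
        "Pod"
    }

    fn get_sandbox_name(&self) -> Option<String> {
        Some(self.name.clone())
    }

    fn use_host_network(&self) -> bool {
        self.host_network
    }
}
```

With this layout, `ConfigMap` needs no boilerplate for methods it does not care about, which is the duplication the commits remove from the implementing structs.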
Saul Paredes
2149cb6502 genpolicy: changing caching so the tool can run
concurrently with itself

Based on 3a1461b0a5186a92afedaaea33ff2bd120d1cea0

Previously the tool would use the layers_cache folder for all instances
and hence delete the cache when it was done, interfering with other
instances. This change makes it so that each instance of the tool will
have its own temp folder to use.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-19 15:46:30 -07:00
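One way to give each tool instance its own working folder, as the commit above describes, is sketched below using only the standard library. This is an illustration, not the genpolicy change itself: `create_instance_dir` and its naming scheme (prefix, process id, attempt counter) are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Create a fresh directory under `base` that no other instance is using.
///
/// `fs::create_dir` fails with `AlreadyExists` if the directory is already
/// there, so two instances can never end up sharing the same folder.
pub fn create_instance_dir(base: &Path, prefix: &str) -> io::Result<PathBuf> {
    let pid = std::process::id();
    for attempt in 0u32..1000 {
        let candidate = base.join(format!("{}-{}-{}", prefix, pid, attempt));
        match fs::create_dir(&candidate) {
            Ok(()) => return Ok(candidate),
            // Name taken (for example by an earlier call): try the next one.
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => continue,
            Err(e) => return Err(e),
        }
    }
    Err(io::Error::new(
        io::ErrorKind::AlreadyExists,
        "no free instance directory name",
    ))
}
```

Each instance then deletes only the directory it created, so cleaning up no longer removes a cache another instance is still reading.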
Wainer dos Santos Moschetta
1e35291fd5 gha: move attestation tests to run-k8s-tests-coco-nontee
The new run-k8s-tests-coco-nontee job should be the home of attestation
tests.

Changed run-k8s-tests-coco-nontee to install KBS. Once the KBS variable is
exported in the environment, the attestation tests kick in (likewise, they
are skipped in run-k8s-tests-on-aks).

Fixes #9455
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-19 14:51:30 -03:00
Steve Horsman
7e12d588c0 Merge pull request #9485 from sparky005/update_golang.org/x/net
update golang.org/x/net
2024-04-19 11:26:13 +01:00
Amulya Meka
12964256a4 Merge pull request #9521 from Amulyam24/gha
gha: tag k8s tests on ppc64le to ppc64le-runner-01
2024-04-19 15:08:08 +05:30
Julien Ropé
70e798ed35 runtime: Call CreateRuntime hooks at container creation time
CreateRuntime hooks are called at the CreateSandbox time,
but not after CreateContainer.

Fixes: #9523

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-04-19 10:25:02 +02:00
Alex Lyn
3456483df9 Merge pull request #9513 from stevenhorsman/bump-stale-version
gha: stale: Bump stalebot version
2024-04-19 15:15:10 +08:00
Alex Lyn
c147f0f4ed Merge pull request #9516 from sprt/rlz-340
release: bump version for 3.4.0 release
2024-04-19 15:12:26 +08:00
Amulyam24
8255ed248a gha: tag k8s tests on ppc64le to ppc64le-runner-01
This PR aims at running the k8s tests on one runner on ppc64le.

Fixes: #9520

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-04-19 12:04:25 +05:30
Hyounggyu Choi
304dc1e4da doc: Update how-to-run-kata-containers-with-SE-VMs.md
This is to update a document `how-to-run-kata-containers-with-SE-VMs`
on using confidential artifacts to build a secure image.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-19 08:31:12 +02:00
Hyounggyu Choi
8fbed9f6a4 local-build: Use confidential kernel and initrd for boot-image-se
This is to make `boot-image-se-tarball` use confidential kernel and
initrd instead of vanilla version of artifacts.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-19 07:09:04 +02:00
Dan Mihai
4242801b1c agent: update cargo.lock
Update Kata Agent's Cargo.lock after the recent changes to Cargo.toml.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-18 17:12:48 +00:00
Aurélien Bombo
95971e4a42 release: bump version for 3.4.0 release
Release v3.4.0.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-04-18 17:08:06 +00:00
Steve Horsman
6dd038fd58 Merge pull request #9501 from zvonkok/check-fixes
kata: Remove check for "Fixes" in PR
2024-04-18 17:48:50 +01:00
Hyounggyu Choi
2b9c439fcf Merge pull request #9508 from BbolroC/gha-s390x-k8s-label
gha: Make integration tests for s390x run on s390x-large runners
2024-04-18 18:05:01 +02:00
Adil Sadik
1c5ca0c915 runtime: update golang.org/x/net
Update golang.org/x/net to a newer version that closes some reported
vulnerabilities and security issues.

Fixes #9486

Signed-off-by: Adil Sadik <sparky.005@gmail.com>
2024-04-18 10:55:02 -04:00
Tim Zhang
221c5b51fe dragonball: fix EPOLLHUP/EPOLLERR events handling in vsock
1. EPOLLHUP events also need to be read; the read returns a length of 0.
2. We should kill the connection when EPOLLERR events are received.

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-04-18 20:47:02 +08:00
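The two rules in the commit above can be written down as a small decision function. This is a sketch, not dragonball's vsock code: `Action`, `action_for` and `peer_closed` are hypothetical names, and the constants are the Linux values of the corresponding epoll flags, redefined here so the example needs no external crate.

```rust
/// Linux epoll event bits (values as defined in <sys/epoll.h>).
pub const EPOLLIN: u32 = 0x001;
pub const EPOLLERR: u32 = 0x008;
pub const EPOLLHUP: u32 = 0x010;

/// What to do with a connection after an epoll wake-up (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Tear the connection down.
    Kill,
    /// Read from the stream; a hang-up shows up as a zero-length read.
    Read,
    /// Nothing relevant for the read path.
    Ignore,
}

/// Map an epoll event mask to an action, checking errors first.
pub fn action_for(events: u32) -> Action {
    if (events & EPOLLERR) != 0 {
        Action::Kill
    } else if (events & (EPOLLIN | EPOLLHUP)) != 0 {
        Action::Read
    } else {
        Action::Ignore
    }
}

/// Interpret the byte count of the read that follows `Action::Read`.
pub fn peer_closed(bytes_read: usize) -> bool {
    bytes_read == 0
}
```

Treating EPOLLHUP as "read anyway" lets any data still buffered be drained before the zero-length read signals that the peer is gone.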
Hyounggyu Choi
49a0d57f66 gha: Make integration tests for s390x run on s390x-large runners
This is to make a workflow `run-k8s-tests` and `run-cri-containerd`
(s390x and zvsi) run only on the runners labeled by `s390x-large`.

Fixes: #9507

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-18 14:35:24 +02:00
stevenhorsman
cf5c3dc155 gha: stale: Bump stalebot version
- Bump the stalebot action version to v9 as that fixes the
```
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/stale@v8.
```
warning.

Fixes: #9512
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-18 11:41:09 +01:00
Steve Horsman
bf16b18180 Merge pull request #9503 from stevenhorsman/stale-pr-remove-date
gha: stale: Remove the start-date
2024-04-18 09:36:27 +01:00
Hyounggyu Choi
566a6de594 Merge pull request #9505 from BbolroC/remove-crio-nightly-test-s390x
gha: Remove k8s-cri-containerd-rhel9-e2e-tests for s390x
2024-04-18 09:31:07 +02:00
Hyounggyu Choi
cc22dc33f2 Merge pull request #9489 from BbolroC/install-opa-in-docker
rootfs: Make OPA build working in docker for s390x and pp…
2024-04-18 00:26:11 +02:00
Dan Mihai
5ceed689eb Merge pull request #9492 from microsoft/danmihai1/pod-tests
tests: k8s: inject agent policy failures (part 3)
2024-04-17 14:01:11 -07:00
Hyounggyu Choi
e046f5e652 gha: Remove k8s-cri-containerd-rhel9-e2e-tests for s390x
This commit is simply to remove a CI workflow `k8s-cri-containerd-rhel9-e2e-tests`.

Fixes: #9504

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-17 15:36:42 +02:00
Zvonko Kaiser
eda3bfe2ef config: Add NVIDIA GPU SNP, TDX configuration files
Fixes: #9475

For TDX and SNP add NVIDIA specific configuration files

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-04-17 12:49:13 +00:00
Wainer Moschetta
2d8e7933c5 Merge pull request #9461 from GabyCT/topic/uninstallkbs
tests/k8s: Add uninstall kbs client command function
2024-04-17 09:36:37 -03:00
Zvonko Kaiser
d7b24c04e5 Merge pull request #9473 from zvonkok/gpu-image-initrd-versions
version: add initrd, image NVIDIA sections
2024-04-17 13:22:05 +02:00
stevenhorsman
7235988605 gha: stale: Remove the start-date
As documented in https://github.com/actions/stale?tab=readme-ov-file#start-date
> The start date is used to ignore the issues and pull requests created before the start date.
> Particularly useful when you wish to add this stale workflow on an existing repository
> and only wish to stale the new issues and pull requests.

As we don't need to treat PRs older than May 2023 as a special case, remove this option.

Fixes: #9502
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-17 11:19:56 +01:00
Zvonko Kaiser
395e93acd5 kata: Remove Issue - PR dependency
We've discussed this over and over. Let's try to get to an agreement here.
I will use this issue to remove the mandatory Issue - PR dependency.

Fixes: #9500

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-04-17 09:53:08 +00:00
Archana Shinde
af3b19ed18 Merge pull request #9084 from amshinde/document-intel-gpu-vfio
docs: Document Intel Discrete GPUs usage with Kata
2024-04-16 16:17:03 -07:00
Archana Shinde
973a15332a spell-check: Add missing words to spell-check
Add missing words to spell-check dictionaries

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-04-16 11:50:02 -07:00
Archana Shinde
6f97dc1f60 static-checks: Rename file in doc to make static checks happy
Configuration file for qemu with runtime-rs was recently renamed.
Doc contains name for old file. This was somehow not caught in the CI
earlier.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-04-16 11:50:02 -07:00
Archana Shinde
87f0097b18 docs: Document Intel Discrete GPUs usage with Kata
This document describes the steps needed to pass an entire Intel Discrete GPU
as well as a GPU SR-IOV interface to a Kata Container.

Fixes: #9083

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-04-16 11:50:02 -07:00
Dan Mihai
2c4d1ef76b tests: k8s: inject agent policy failures (part 3)
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.

These test cases are using K8s Pods. Additional policy failures
are injected during CI using other types of K8s resources - e.g.,
using Jobs and Replication Controllers - from separate PRs.

Fixes: #9491

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-16 18:15:12 +00:00
Dan Mihai
c26dad8fe5 Merge pull request #9294 from burgerdev/burgerdev/genpolicy-configurable-pause
genpolicy: support insecure registries and custom pause containers
2024-04-16 09:39:33 -07:00
GabyCT
9238daf729 Merge pull request #9464 from microsoft/danmihai1/rc-tests
tests: k8s: inject agent policy failures (part2)
2024-04-16 10:01:39 -06:00
Hyounggyu Choi
d523e865c0 rootfs: Make OPA build working in docker for s390x and ppc64le
The commit is to make the OPA build from source working in `ubuntu-rootfs-osbuilder`.
To achieve the goal, the configuration is changed as follows:

- Switch the make target to `ci-build-linux-static` not triggering docker-in-docker build
- Install go in the builder image for s390x and ppc64le

Fixes: #9466

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-16 16:49:12 +02:00
Greg Kurz
aca6a1bcb5 Merge pull request #9353 from pmores/pr-8866-follow-up
runtime-rs: refactor qemu driver
2024-04-16 16:07:36 +02:00
Fabiano Fidêncio
7bb5490676 Merge pull request #9479 from wainersm/fix_coco_nontee_jobs
gha: make run-kata-coco-tests inherit secrets
2024-04-16 13:46:52 +02:00
Hyounggyu Choi
7b11fd2546 Merge pull request #9471 from BbolroC/coco-kernel-version-s390x
version: Add coco name and version for {image,initrd} for s390x
2024-04-15 16:03:20 +02:00
Wainer dos Santos Moschetta
77541008fc gha: make run-kata-coco-tests inherit secrets
The new CoCo non-tee job introduced on commit 0d5399ba92 needs to read secrets
like AZ_TENANT_ID, so run-kata-coco-tests workflow should inherit the secrets from
the caller workflow.

Fixes #9477
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-15 10:53:44 -03:00
Zvonko Kaiser
78e3ebb011 version: add initrd, image NVIDIA sections
Fixes: #9472

For initrd and image, the NVIDIA-related builds will not use the default targets; we will pin them to a specific release.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-04-15 13:31:35 +00:00
Wainer Moschetta
c85e1ca674 Merge pull request #9404 from ldoktor/ci-mcp-timeout
ci.ocp: Increase the MCP update time
2024-04-15 09:42:14 -03:00
Hyounggyu Choi
3ec209dcf1 Merge pull request #9469 from BbolroC/coco-kernel-config-s390x
kernel: Adjust s390x config for confidential containers
2024-04-15 13:55:28 +02:00
Hyounggyu Choi
8fce600493 version: Add coco name and version for {image,initrd} for s390x
In order to build a coco {image,initrd}, it is required to
specify its name and version in versions.yaml. This commit
is to add the configuration for them, respectively.

Fixes: #9470

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-15 12:53:00 +02:00
Hyounggyu Choi
a792dc3e2b kernel: Adjust s390x config for confidential containers
`CONFIG_TN3270_TTY` and `CONFIG_S390_AP_IOMMU` are dropped for s390x
in 6.7.x which is used for a confidential kernel.
But they are still used for a vanilla kernel. So we need to add them
to the whitelist.

Fixes: #9465

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-15 10:28:59 +02:00
Hyounggyu Choi
32f58abfde Merge pull request #9403 from BbolroC/runtime-rs-ci-qemu
CI: Enable GHA cri-containerd workflow for runtime-rs with QEMU
2024-04-15 09:31:25 +02:00
Xuewei Niu
402d8a968e Merge pull request #9430 from UiPath/fix-agent-shutdown
agent: shutdown vm on exit when agent is used as init process
2024-04-15 10:47:07 +08:00
Wainer Moschetta
0a04f54a8e Merge pull request #9454 from GabyCT/topic/pulltype
gha: Define unbound PULL TYPE variable
2024-04-12 14:48:56 -03:00
Wainer Moschetta
a0b21d0e14 Merge pull request #9424 from wainersm/cc_guest_pull-encrypted
CC: run guest-pull tests on non-TEE jobs
2024-04-12 09:34:35 -03:00
Hyounggyu Choi
cf20a6a4ae gha: Add qemu-runtime-rs to VMM matrix for run-cri-containerd
This commit expands the VMM matrix for run-cri-containerd,
adding a new item `qemu-runtime-rs` for a test scenario where
the VMM is QEMU and runtime-rs is employed.
This expansion affects the workflows for both x86_64 and s390x platforms.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-12 12:25:53 +02:00
Hyounggyu Choi
606f8e1ab2 runtime-rs: Adjust configuration for qemu-runtime-rs
To make `qemu-runtime-rs` working for CI, we have to rename a configuration
template file and `CONFIG_FILE_QEMU` in Makefile.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-12 12:25:53 +02:00
Hyounggyu Choi
3c217c6c15 ci|cri-containerd: Introduce qemu-runtime-rs for KATA_HYPERVISOR
`qemu-runtime-rs` will be utilized to handle a test scenario where
the VMM is QEMU and runtime-rs is employed.

Note: Some of the tests are skipped. They are going to be reintegrated in
the follow-up PR (Check out #9375).

Fixes: #9371

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-12 12:25:53 +02:00
Alexandru Matei
9e01732f7a agent: shutdown vm on exit when agent is used as init process
Linux kernel generates a panic when the init process exits.
The kernel is booted with panic=1, hence this leads to a
vm reboot.
When used as a service the kata-agent service has an ExecStop
option which does a full sync and shuts down the vm.
This patch mimics this behavior when kata-agent is used as
the init process.

Fixes: #9429

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-04-12 11:32:31 +03:00
Alexandru Matei
54923164b5 clh: isClhRunning waits for full timeout when clh exits
isClhRunning uses signal 0 to test whether the process is
still alive or not. This doesn't work because the process is a
direct child of the shim. Once it is dead the process becomes
a zombie.
Since no one waits for it the process lingers until
its parent dies and init reaps it. Hence sending signal 0 in
isClhRunning will always return success whether the process is
dead or not.
This patch calls wait to reap the process, if it succeeds that
means it is our child process, if not we send the signal.

Fixes: #9431

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-04-12 11:31:53 +03:00
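The core idea of the commit above, reaping the child with a non-blocking wait instead of probing it with signal 0, can be sketched in Rust with the standard library. The actual fix lives in the Go runtime, so this is only an analogy: `is_child_running` is a hypothetical helper, and the fallback to signal 0 for processes that are not our children is left out.

```rust
use std::io;
use std::process::Child;

/// Report whether a direct child process is still running.
///
/// `try_wait` is a non-blocking wait: `Some(status)` means the child has
/// exited and has now been reaped, `None` means it is still running.
/// A zombie therefore counts as "not running", unlike with a signal 0 probe,
/// which keeps succeeding until somebody waits for the process.
pub fn is_child_running(child: &mut Child) -> io::Result<bool> {
    Ok(child.try_wait()?.is_none())
}
```

A caller that polls this helper in a loop stops as soon as the child exits, rather than waiting for the full timeout.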
Dan Mihai
e51cbdcff9 tests: k8s: inject agent policy failures (part2)
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.

These test cases are using K8s Replication Controllers. Additional
policy failures will be injected using other types of K8s resources
- e.g., using Pods and/or Jobs - in separate PRs.

Fixes: #9463

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-11 21:08:53 +00:00
Markus Rudy
77540503f9 genpolicy: add support for insecure registries
genpolicy is a handy tool to use in CI systems, to prepare workloads
before applying them to the Kubernetes API server. However, many modern
build systems like Bazel or Nix restrict network access, and rightfully
so, so any registry interaction must take place on localhost.
Configuring certificates for localhost is tricky at best, and since
there are no privacy concerns for localhost traffic, genpolicy should
allow to contact some registries insecurely. As this is a runtime
environment detail, not a target environment detail, configuring
insecure registries does not belong into the JSON settings, so it's
implemented as command line flags.

Fixes: #9008

Signed-off-by: Markus Rudy <webmaster@burgerdev.de>
2024-04-11 22:29:03 +02:00
Wainer dos Santos Moschetta
4f74617897 tests: pass --overwrite-existing to aks get-credentials
By passing --overwrite-existing to `aks get-credentials` it will stop
asking if I want to overwrite the existing credentials. This is handy
for running the scripts locally.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta
3508f3a43a tests/k8s: use CoCo image on guest-pull when non-TEE
When running on non-TEE environments (e.g. KATA_HYPERVISOR=qemu) the tests should
be stressing the CoCo image (/opt/kata/share/kata-containers/kata-containers-confidential.img)
although currently the default image/initrd is built to be able to do guest-pull as well.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta
c24f13431d tests/k8s: enable guest-pull tests on non-TEE
Enabled guest-pull tests on non-TEE environment. It now requires the SNAPSHOTTER environment
variable to avoid it running on jobs where nydus-snapshotter is not installed

Fixes: #9410
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta
0d5399ba92 gha: Create CoCo tests jobs on non-TEE
Created the new run-k8s-tests-coco-nontee jobs for running CoCo tests on
non-TEE. It currently generates the run-k8s-tests-coco-nontee(qemu, nydus, guest-pull)
job only to run the guest-pull tests.

Fixes: #9410
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Gabriela Cervantes
5420595d03 tests/k8s: Add uninstall kbs client command function
This PR adds a function to uninstall the kbs client command,
especially for when we are running on baremetal devices.

Fixes #9460

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-11 17:06:11 +00:00
Steve Horsman
6b2d655857 Merge pull request #9457 from justxuewei/fs_manager_tests
agent: Fix the issue with the "test_new_fs_manager" test
2024-04-11 17:02:58 +01:00
Fabiano Fidêncio
5611233ed8 Merge pull request #9439 from microsoft/danmihai1/job-tests
tests: k8s: inject agent policy failures
2024-04-11 17:21:54 +02:00
Markus Rudy
bc2292bc27 genpolicy: make pause container image configurable
CRIs don't always use a pause container, but even if they do the
concrete container choice is not specified. Even if the CRI config can
be tweaked, it's not guaranteed that registries in the public internet
can be reached. To be portable across CRI implementations and
configurations, the genpolicy user needs to be able to configure the
container the tool should append to the policy.

Signed-off-by: Markus Rudy <webmaster@burgerdev.de>
2024-04-11 16:26:35 +02:00
Markus Rudy
8b30fa103f genpolicy: parse json settings during config init
Decouple initialization of the Settings struct from creating the
AgentPolicy struct, so that the settings are available for evaluating,
extending or overriding command line arguments.

Signed-off-by: Markus Rudy <webmaster@burgerdev.de>
2024-04-11 16:17:33 +02:00
Xuewei Niu
50f78ec52c agent: Fix the issue with the "test_new_fs_manager" test
This patch introduces a one-time cpath to mitigate the cgroup residuals. It
might break the device cgroup merging rules when the cgroup has children.

Fixes: #9456

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-04-11 18:06:05 +08:00
GabyCT
08dcdc62de Merge pull request #9423 from GabyCT/topic/improvecleanup
tests: Improve the kbs_k8s_delete function
2024-04-10 14:28:21 -06:00
Gabriela Cervantes
4a2ee3670f gha: Define unbound PULL TYPE variable
This PR defines the PULL_TYPE variable to avoid unbound variable
failures when this is being tested locally.

Fixes #9453

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-10 17:16:19 +00:00
GabyCT
dab837d71d Merge pull request #9450 from GabyCT/topic/fixinnydus
gha: Fix indentation in gha run script
2024-04-10 11:07:56 -06:00
David Esparza
9e1368dbc5 Merge pull request #9391 from dborquez/add-onednn-openvino-ml-benchs
add onednn and openvino ml-benchmarks
2024-04-09 19:03:00 -06:00
Dan Mihai
ea31df8bff Merge pull request #9185 from microsoft/saulparedes/genpolicy_add_containerd_pull
genpolicy: Add optional toggle to pull images using containerd
2024-04-09 12:29:19 -07:00
Gabriela Cervantes
6ebdcf8974 gha: Fix indentation in gha run script
This PR fixes an indentation in gha run script.

Fixes #9449

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-09 16:37:17 +00:00
Greg Kurz
89353249fc Merge pull request #8988 from beraldoleal/ci-docs
docs: adding an initial CI documentation
2024-04-09 18:26:15 +02:00
Dan Mihai
2252490a96 tests: k8s: inject agent policy failures
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.

These test cases are using K8s Jobs. Additional policy failures
will be injected using other types of K8s resources - e.g., using
Pods and/or Replication Controllers - in future PRs.

Fixes: #9406

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-09 15:36:57 +00:00
David Esparza
facf3c9364 metrics: Add onednn benchmark.
This PR adds onednn test to exercise additional ML benchmarks.

Onednn is an Intel-optimized library for Deep Neural Networks.

Fixes: #9390

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
3bde511d0d metrics: Add openvino benchmark.
This PR adds openvino test in order to exercise additional ML
benchmarks.

OpenVino is used to optimize and deploy deep learning models.

Fixes: #9389

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
b37c5f8ba1 metrics:libs: Add HTTPS and HTTP vars to docker build.
Include HTTP and HTTPS env variables when building docker
images because they are required to download packages
such as Phoronix.

Added a check that verifies that building docker images
is performed as root.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
3355dd9e2b metrics:libs: Adds a function to set new kata configuration.
Adds a function that receives as a single parameter the name of
a valid Kata configuration file which will be established as
the default kata configuration to start kata containers.

Adds a second function that returns the path to the current
kata configuration file.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
cb4380d1c9 metrics: common: Add function to clean the cache.
The function clears the Page Cache only.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
3a419ba3b1 metrics: common: Add function to update kata config.
Add an extra function that updates kata config
to use the max num. of vcpus available and
to use the available memory in the system.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
Beraldo Leal
959e56525c docs: adding an initial CI documentation
This is actually a first attempt to document our CI, and all this
content was based on the document created by Fabiano Fidencio (kudos to
him). We are just moving the content and discussion from Google Docs to
here.

I used the "poetic license" to add some notes on what I believe our CI
will look like in the future.

Fixes #9006

Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-04-09 09:21:47 -04:00
Saul Paredes
51498ba99a genpolicy: toggle containerd pull in tests
- Add v1 image test case
- Install protobuf-compiler in build check
- Reset containerd config to default in kubernetes test if we are testing genpolicy
- Update docker_credential crate
- Add test that uses default pull method
- Use GENPOLICY_PULL_METHOD in test

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-08 19:28:29 -07:00
Dan Mihai
f60c9eaec3 Merge pull request #9398 from microsoft/danmihai1/policy-test-cleanup
tests: k8s: improve the Agent Policy tests
2024-04-08 15:37:07 -07:00
Gabriela Cervantes
fb4c359cc2 tests: Improve the kbs_k8s_delete function
This PR improves the kbs_k8s_delete function to verify that the
resources were properly deleted for baremetal environments.

Fixes #9379

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-08 18:03:07 +00:00
Saul Paredes
c96ebf237c genpolicy: add containerd pull method
Add optional toggle to use existing containerd installation to pull and manage container images.
This adds support for a wider set of images that are currently not supported by the standard pull method,
such as those that use a v1 manifest.

Fixes: #9144

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-08 09:56:59 -07:00
Greg Kurz
8b996b9307 Merge pull request #9331 from egernst/foobar
katautils: check number of cores on the system instead of go runtime
2024-04-08 18:38:49 +02:00
Greg Kurz
934beb5ae4 Merge pull request #9421 from gkurz/bump-node-js-20
gha: Bump various actions to use Node.js 20
2024-04-08 18:22:28 +02:00
Wainer Moschetta
fba1d394d7 Merge pull request #9369 from ChengyuZhu6/sandbox-image
agent:image: Support different pause image in the guest for guest pull
2024-04-08 11:06:21 -03:00
Steve Horsman
3242f55691 Merge pull request #8870 from LindaYu17/aa2main
port attestation agent from CCv0 branch to main branch
2024-04-08 15:01:07 +01:00
James O. D. Hunt
42936cb92c Merge pull request #9372 from jodh-intel/docs-kata-manager-update
docs: kata-manager: Update with latest details
2024-04-08 13:23:23 +01:00
stevenhorsman
864e9c22ba agent: doc: Add new config doc
Document the new guest_components_rest_api config parameter

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
stevenhorsman
29a5652e31 packaging: guest-components, set new environment variables
- Set KBC_PROVIDER and ATTESTER rather than TEE_PLATFORM
to avoid tss build issues for vTPM attester(s)
- There are future plans to make a matching TEE_PLATFORM, so this can be simplified once that is available

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
stevenhorsman
a284a20a14 tests: Filter CoCo tests on ppc64le/arm
- At the moment we aren't supporting ppc64le or aarch64 for
CoCo, so filter out these tests from running

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
stevenhorsman
a0c03966c2 versions: Bump guest-components
- Bump guest-components to try and test compatibility with the latest version

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
stevenhorsman
101a5bf273 packaging: Update guest-components Dockerfile
- Switch to Ubuntu 20.04 for building guest-components, as
the rootfs is based on 20.04 and we need matching GLIBC versions.
See #8955
- Add dependencies needed by TDX verifier as we want to build for all platforms

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
Gabriela Cervantes
6d85025e59 test/k8s: Add basic attestation test
- Add basic test case to check that a running
pod can use the api-server-rest (and attestation-agent
and confidential-data-hub indirectly) to get a resource
from a remote KBS

Fixes #9057

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Co-authored-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-08 11:38:53 +01:00
Biao Lu
f0edec84f6 agent: Launch api-server-rest
If 'rest_api' is configured, let's start the api-server-rest after
the attestation-agent and the confidential-data-hub have been started.

Fixes: #7555

Signed-off-by: Biao Lu <biao.lu@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-04-08 11:38:53 +01:00
Biao lu
4d752e6350 agent: Add config for api-server-rest
Add configuration for 'rest api server'.

Optional configurations are
  'agent.rest_api=attestation' will enable attestation api
  'agent.rest_api=resource' will enable resource api
  'agent.rest_api=all' will enable all (attestation and resource) api
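
As an illustration only, the three values above could be parsed as in
the sketch below. It assumes the option arrives on the kernel command
line; the `RestApiFeatures` enum and the `parse_rest_api` helper are
hypothetical names, not the agent's actual code.

```rust
use std::str::FromStr;

/// Which REST API features are enabled (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RestApiFeatures {
    Attestation,
    Resource,
    All,
}

impl FromStr for RestApiFeatures {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "attestation" => Ok(Self::Attestation),
            "resource" => Ok(Self::Resource),
            "all" => Ok(Self::All),
            other => Err(format!("invalid agent.rest_api value: {other}")),
        }
    }
}

/// Scan a whitespace-separated command line for `agent.rest_api=<value>`.
/// Returns `Ok(None)` when the option is absent.
fn parse_rest_api(cmdline: &str) -> Result<Option<RestApiFeatures>, String> {
    for param in cmdline.split_whitespace() {
        if let Some(value) = param.strip_prefix("agent.rest_api=") {
            return value.parse::<RestApiFeatures>().map(Some);
        }
    }
    Ok(None)
}

fn main() {
    // Example command line; the other parameters are placeholders.
    let cmdline = "console=hvc0 agent.rest_api=resource quiet";
    println!("{:?}", parse_rest_api(cmdline));
}
```

Treating an unknown value as an error rather than silently ignoring it
is a design choice of this sketch, not something the commit states.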

Fixes: #7555

Signed-off-by: Biao Lu <biao.lu@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-04-08 11:06:14 +01:00
Biao Lu
f476d671ed agent: Launch the confidential data hub
Let's introduce a new method to start the confidential data hub and the
attestation agent.  The former depends on the latter, and it needs to be
started before the RPC server.

Starting the attestation components is based on whether the confidential
containers guest components binaries are found in the rootfs.
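
A minimal sketch of that decision, under stated assumptions: the two
binary paths and the `launch_plan` helper are hypothetical, and the
rule "skip the confidential data hub when the attestation agent is
missing" is inferred from the dependency described above.

```rust
use std::path::Path;

// Assumed install locations inside the guest rootfs (not confirmed by the commit).
const AA_PATH: &str = "/usr/local/bin/attestation-agent";
const CDH_PATH: &str = "/usr/local/bin/confidential-data-hub";

/// Return the binaries to launch, in start order.
/// `exists` is injected so the logic can be exercised without a real rootfs.
fn launch_plan(exists: impl Fn(&Path) -> bool) -> Vec<&'static str> {
    let mut plan = Vec::new();
    if exists(Path::new(AA_PATH)) {
        // The attestation agent goes first: the data hub depends on it.
        plan.push(AA_PATH);
        if exists(Path::new(CDH_PATH)) {
            plan.push(CDH_PATH);
        }
    }
    plan
}

fn main() {
    // On a real system, check the filesystem directly.
    let plan = launch_plan(|p| p.exists());
    println!("components to start before the RPC server: {plan:?}");
}
```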

Fixes: #7544

Signed-off-by: Biao Lu <biao.lu@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-04-08 11:06:14 +01:00
Greg Kurz
be8f0cb520 Merge pull request #9402 from deagon/feat/debug-threads
qemu: show the thread name when enable the hypervisor.debug option
2024-04-08 11:04:36 +02:00
Hyounggyu Choi
e39be7a45e Merge pull request #9415 from BbolroC/fix-dir-removal-error
GHA: Implement secondary GITHUB_WORKSPACE cleanup on 1st failure
2024-04-08 10:44:44 +02:00
ChengyuZhu6
8c897f822c agent:image: Support different pause image in the guest for guest pull
Support different pause images in the guest for guest-pull, such as k8s
pause image (registry.k8s.io/pause) and openshift pause image (quay.io/bpradipt/okd-pause).

Fixes: #9225 -- part III

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-04-07 09:00:10 +08:00
GabyCT
9d2c5b180e Merge pull request #9419 from GabyCT/topic/fxlatency
metrics: Improve latency test cleanup
2024-04-05 16:31:00 -06:00
Wainer Moschetta
aae7048d4f Merge pull request #9273 from ldoktor/kcli-coco-kbs
tests: Support for kbs setup on kcli
2024-04-05 18:55:58 -03:00
Fabiano Fidêncio
f09bb98f51 Merge pull request #8840 from fidencio/topic/update-tdx-artefacts-to-the-new-host-os
tdx: Update TDX artefacts to be used with the Ubuntu 23.10 / CentOS 9 stream OSVs.
2024-04-05 22:36:03 +02:00
Fabiano Fidêncio
cdb8531302 hypervisor: Simplify TDX protection detection
Let's rely on the kvm module 'tdx' parameter to do so.
This aligns with both OSVs (Canonical, Red Hat, SUSE) and the TDX
adoption (https://github.com/intel/tdx-linux) stacks.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-05 19:51:27 +02:00
Fabiano Fidêncio
2ee03b5dc3 tdvf: Adapt the build command
This is done in order to match the example from:
https://github.com/intel/tdx-linux/wiki/Instruction-to-set-up-TDX-host-and-guest#build-tdvf-image

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-05 19:51:27 +02:00
Fabiano Fidêncio
b7cccfa019 qemu: tdx: Adapt command line
This commit is a mess, but I'm not exactly sure what's the best way to
make it less messy, as we're getting QEMU TDX to work while partially
reverting 1e34220c41.

With that said, let me cover the content of this commit.

Firstly, we're reverting all the changes related to
"memory-backend-memfd-private", as that's what was used with the
previous host stack, but it seems it didn't fly upstream.

Secondly, in order to get QEMU to properly work with TDX, we need to
enforce the 'private=on' knob and use the "memory-backend-ram", and
we're doing so, and also making sure to test the `private=on` newly
added knob.

I'm sorry for the confusion, I understand this is not optimal, I just
don't see an easy path to do changes without leaving the code broken
during those changes.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-05 19:51:27 +02:00
Greg Kurz
424a5e243f gha: Bump to actions/[down|up]load-artifact@v4 (all the rest)
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

This fixes all remaining sites.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:51 +02:00
Greg Kurz
dbc5dc7806 gha: Bump to actions/[down|up]load-artifact@v4 (k8s tests on garm)
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

As explained at [1] :

> The contents of an Artifact are uploaded together into an immutable
> archive. They cannot be altered by subsequent jobs. Both of these
> factors help reduce the possibility of accidentally corrupting
> Artifact files.

This means that artifacts cannot have the same name.

Adapt the `run-k8s-tests-on-garm` workflow accordingly by embedding all
the other `${{ vmm.* }}` fields and `${{ inputs.tag }}` in the artifact
names that would otherwise collide.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:51 +02:00
Greg Kurz
62a54ffa70 gha: Bump to actions/[down|up]load-artifact@v4 (kata static tarball)
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

As explained at [1] :

> The contents of an Artifact are uploaded together into an immutable
> archive. They cannot be altered by subsequent jobs. Both of these
> factors help reduce the possibility of accidentally corrupting
> Artifact files.

This means that artifacts cannot have the same name.

Adapt all `build-kata-static-tarball` workflows accordingly by
embedding `${{ matrix.asset }}` in the artifact names that would
otherwise collide.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:51 +02:00
Greg Kurz
7f2ce914a1 gha: Bump to actions/checkout@v4
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:50 +02:00
Greg Kurz
0a43d26c94 gha: Bump to docker/login-action@v3
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:50 +02:00
Greg Kurz
06c9c0d7db gha: Bump to docker/build-push-action@v5
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:50 +02:00
Greg Kurz
8c21844aef gha: Bump to docker/setup-buildx-action@v3
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:50 +02:00
Greg Kurz
03cbe6a011 gha: Bump to docker/setup-qemu-action@v3
`Node.js 16` is deprecated. Bump to a new version based on `Node.js 20`.

Fixes #9245

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-04-05 18:36:50 +02:00
Hyounggyu Choi
4493459937 GHA: Implement secondary GITHUB_WORKSPACE cleanup on 1st failure
Occasionally, the removal of GITHUB_WORKSPACE fails for self-hosted runners
because one of the subdirectories is not empty. This is likely due to another
process occupying the directory at the time.
Implementing a secondary cleanup resolves this issue.
This commit focuses on the implementation for the secondary cleanup.

Fixes: #9317

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-05 11:41:51 +02:00
Fabiano Fidêncio
6b4cc5ea6a Revert "qemu: tdx: Workaround SMP issue with TDX 1.5"
This reverts commit d1b54ede29.

 Conflicts:
	src/runtime/virtcontainers/qemu.go

This commit was a hack that was needed in order to get QEMU + TDX to
work atop of the stack our CI was running on.  As we're moving to "the
officially supported by distros" host OS, we need to get rid of this.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-05 10:23:52 +02:00
Fabiano Fidêncio
582b5b6b19 govmm: tdx: Expose the private=on|off knob
The private=on|off knob is required in order to properly launch a TDX
guest VM.

This is a brand new property that is part of the still in-flight patches
adding TDX support on QEMU.

Please, see:
3fdd8072da

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-05 10:23:52 +02:00
Fabiano Fidêncio
fe5adae5d9 qemu-tdx: Update to v8.1.0 + TDX patches
Let's update the QEMU to the one that's officially maintained by Intel
till all the TDX patches make their way upstream.

We've had to also update python to explicitly use python3 and add
python3-venv as part of the dependencies.

Fixes: #8810

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-05 10:23:51 +02:00
Alex Lyn
0e0a361f0e Merge pull request #8782 from Apokleos/device-increate-count
bugfix and refactor device increate count
2024-04-05 13:43:49 +08:00
Dan Mihai
6f9f8ae285 Merge pull request #9413 from microsoft/saulparedes/ensure_unique_rg_in_gha
gha: ensure unique resource group name
2024-04-04 17:13:09 -07:00
GabyCT
80d926c357 Merge pull request #9411 from microsoft/danmihai1/k8s-job
tests: k8s-job: wait for job successful create
2024-04-04 15:14:56 -06:00
Gabriela Cervantes
8e5d401be0 metrics: Improve latency test cleanup
This PR improves the latency test cleanup in order to avoid random
failures caused by pods being left behind.

Fixes #9418

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-04 20:43:53 +00:00
Saul Paredes
f20caac1c0 gha: ensure unique resource group name
There's an rg name duplication situation that got introduced by #9385
where 2 different test runs might have same rg name.

Add back uniqueness by including the first letter of GENPOLICY_PULL_METHOD to
cluster name.

Fixes: #9412

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-04 13:13:32 -07:00
GabyCT
aae2679f09 Merge pull request #9409 from GabyCT/topic/ghrunset
gha: Define GH_PR_NUMBER variable in gha run k8s common script
2024-04-04 09:46:48 -06:00
Eric Ernst
da01bccd36 katautils: check number of cores on the system instead of go runtime
We used to utilize go runtime's "NumCPU()", which will give the number
of cores available to the Go runtime, which may be a subset of physical
cores if the shim is started from within a cpuset. From the function's
description:
"NumCPU returns the number of logical CPUs usable by the current
process."

As an example, if containerd is run from within a smaller CPUset, the
maximum size of a pod will be dictated by this CPUset, instead of what
will be available on the rest of the system.

Since the shim will be moved into its own cgroup that may have a
different CPUset, let's stick with checking physical cores. This also
aligns with what we have documented for maxVCPU handling.

In the event we fail to read /proc/cpuinfo, let's use the goruntime.
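
The runtime itself is written in Go; the Rust sketch below only
illustrates the idea of counting `processor` entries in /proc/cpuinfo
and falling back to the process-visible CPU count. The helper names are
hypothetical, and `std::thread::available_parallelism` stands in for
Go's `runtime.NumCPU()` as the fallback.

```rust
use std::fs;
use std::thread;

/// Count `processor` entries in /proc/cpuinfo-formatted text.
fn count_processors(cpuinfo: &str) -> usize {
    cpuinfo
        .lines()
        .filter(|line| {
            line.split(':')
                .next()
                .map_or(false, |key| key.trim() == "processor")
        })
        .count()
}

/// Prefer the system-wide count; fall back to what this process may use.
fn host_cpu_count() -> usize {
    let from_proc = fs::read_to_string("/proc/cpuinfo")
        .map(|text| count_processors(&text))
        .unwrap_or(0);
    if from_proc > 0 {
        from_proc
    } else {
        // Unreadable or unrecognised /proc/cpuinfo: use the process-visible count.
        thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
    }
}

fn main() {
    println!("host CPUs: {}", host_cpu_count());
}
```

The `key: value` layout assumed here matches x86_64; on architectures
with a different /proc/cpuinfo layout the sketch simply takes the
fallback path.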

Fixes: #9327

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2024-04-03 16:09:16 -07:00
Dan Mihai
3e72b3f360 tests: k8s-job: wait for job successful create
Don't just verify SuccessfulCreate - wait for it if needed.

Fixes: #9138

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 22:11:15 +00:00
Gabriela Cervantes
73f27e28d1 gha: Define GH_PR_NUMBER variable in gha run k8s common script
This PR defines the GH_PR_NUMBER variable in the gha run k8s common
script to avoid failures such as an unbound variable error when
running the scripts locally, just as the GHA CI does.

Fixes #9408

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-03 18:25:00 +00:00
GabyCT
c5c229b330 Merge pull request #9397 from GabyCT/topic/removeconmon
versions: Remove conmon information from versions.yaml
2024-04-03 11:14:43 -06:00
GabyCT
12947b1ba6 Merge pull request #9344 from GabyCT/topic/kerneldoc
docs: Remove stale kernel information
2024-04-03 11:13:54 -06:00
Dan Mihai
07c23a05f2 Merge pull request #9385 from microsoft/saulparedes/add_genpolicy_yaml_params
gha: add GENPOLICY_PULL_METHOD
2024-04-03 09:20:16 -07:00
Lukáš Doktor
b8382cea88 ci.ocp: Increase the MCP update time
updating the machine config takes even longer than 1200s, use 60m to be
sure everything is updated.

Fixes: #9338

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-04-03 15:01:29 +02:00
Alex Lyn
935a1a3b40 runtime-rs: refactor decrease_attach_count with do_decrease_count
Try to reduce duplicated code in decrease_attach_count with public
new function do_decrease_count.

Fixes: #8738

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-03 17:19:19 +08:00
Alex Lyn
4f0fab938d runtime-rs: refactor increase_attach_count with do_increase_count
Try to reduce duplicated code in increase_attach_count with public
new function do_increase_count.

Fixes: #8738

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-03 17:19:19 +08:00
Alex Lyn
fff64f1c3e runtime-rs: introduce dedicated function do_decrease_count
Introduce a dedicated public function do_decrease_count to
reduce duplicated code in drivers' decrease_attach_count.

Fixes: #8738

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-03 17:19:08 +08:00
Alex Lyn
5750faaf31 runtime-rs: introduce dedicated function do_increase_count
Since there are many implementations of reference counting in the
drivers, all of which have the same implementation, we should try
to reduce such duplicated code as much as possible. Therefore, a
new function is introduced to solve the problem of duplicated code.
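
A minimal sketch of such shared reference-counting helpers follows. The
signatures, the `String` error type and the meaning of the returned
flag (`true` means the real attach or detach can be skipped) are
assumptions made for illustration, not the actual runtime-rs code.

```rust
/// Bump the attach count.
/// Returns Ok(false) when the caller must perform the real attach (first user),
/// Ok(true) when the device is already attached and the attach can be skipped.
fn do_increase_count(ref_count: &mut u64) -> Result<bool, String> {
    match *ref_count {
        0 => {
            *ref_count = 1;
            Ok(false)
        }
        u64::MAX => Err("attach count overflow".to_string()),
        _ => {
            *ref_count += 1;
            Ok(true)
        }
    }
}

/// Drop the attach count.
/// Returns Ok(false) when the caller must perform the real detach (last user),
/// Ok(true) when other users remain and the detach can be skipped.
fn do_decrease_count(ref_count: &mut u64) -> Result<bool, String> {
    match *ref_count {
        0 => Err("detach requested for a device that is not attached".to_string()),
        1 => {
            *ref_count = 0;
            Ok(false)
        }
        _ => {
            *ref_count -= 1;
            Ok(true)
        }
    }
}

fn main() {
    let mut count = 0u64;
    println!("first attach skipped? {:?}", do_increase_count(&mut count));
    println!("second attach skipped? {:?}", do_increase_count(&mut count));
    println!("first detach skipped? {:?}", do_decrease_count(&mut count));
    println!("last detach skipped? {:?}", do_decrease_count(&mut count));
}
```

Each driver would then keep only its device-specific attach and detach
logic and call these helpers for the bookkeeping.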

Fixes: #8738

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-03 17:09:17 +08:00
Dan Mihai
f800bd86f6 tests: k8s-sandbox-vcpus-allocation.bats policy
Use the "allow all" policy for k8s-sandbox-vcpus-allocation.bats,
instead of relying on the Kata Guest image to use the same policy
as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:33 +00:00
Dan Mihai
4211d93b87 tests: k8s-nginx-connectivity.bats policy
Use the "allow all" policy for k8s-nginx-connectivity.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:26 +00:00
Dan Mihai
5dcf64ef34 tests: k8s-volume.bats allow all policy
Use the "allow all" policy for k8s-volume.bats, instead of relying
on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:18 +00:00
Dan Mihai
04085d8442 tests: k8s-sysctls.bats allow all policy
Use the "allow all" policy for k8s-sysctls.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:10 +00:00
Dan Mihai
839993f245 tests: k8s-security-context.bats allow all policy
Use the "allow all" policy for k8s-security-context.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:03 +00:00
Dan Mihai
02a050b47e tests: k8s-seccomp.bats allow all policy
Use the "allow all" policy for k8s-seccomp.bats, instead of relying
on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:56 +00:00
Dan Mihai
543e40b80c tests: k8s-projected-volume.bats allow all policy
Use the "allow all" policy for k8s-projected-volume.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:47 +00:00
Dan Mihai
3f94e2ee1b tests: k8s-pod-quota.bats allow all policy
Use the "allow all" policy for k8s-pod-quota.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:37 +00:00
Dan Mihai
ba23758a42 tests: k8s-optional-empty-secret.bats policy
Use the "allow all" policy for k8s-optional-empty-secret.bats,
instead of relying on the Kata Guest image to use the same policy as
its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:30 +00:00
Dan Mihai
e4ff6b1d91 tests: k8s-measured-rootfs.bats allow all policy
Use the "allow all" policy for k8s-measured-rootfs.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:23 +00:00
Dan Mihai
2821326a7e tests: k8s-liveness-probes.bats allow all policy
Use the "allow all" policy for k8s-liveness-probes.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:15 +00:00
Dan Mihai
9af3e4cc4a tests: k8s-inotify.bats allow all policy
Use the "allow all" policy for k8s-inotify.bats, instead of relying
on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:08 +00:00
Dan Mihai
bd45e948cc tests: k8s-guest-pull-image.bats policy
Use the "allow all" policy for k8s-guest-pull-image.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:00 +00:00
Dan Mihai
be3797ef7c tests: k8s-footloose.bats allow all policy
Use the "allow all" policy for k8s-footloose.bats, instead of
relying on the Kata Guest image to use the same policy as its
default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 02:59:50 +00:00
Dan Mihai
18f5e55667 tests: k8s-empty-dirs.bats allow all policy
Use the "allow all" policy for k8s-empty-dirs.bats, instead of
relying on the Kata Guest image to use the same policy as its
default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 02:59:44 +00:00
Dan Mihai
ef22bd8a2b tests: k8s: replace run_policy_specific_tests
Check from:

- k8s-exec-rejected.bats
- k8s-policy-set-keys.bats

if policy testing is enabled or not, to reduce the complexity of
run_kubernetes_tests.sh. After these changes, there are no policy
specific commands left in run_kubernetes_tests.sh.

add_allow_all_policy_to_yaml() is moving out of run_kubernetes_tests.sh
too, but it is not used yet. It will be used in future commits.

Fixes: #9395

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 02:59:28 +00:00
Guoqiang Ding
cd0c31e185 qemu: show the thread name when enable the hypervisor.debug option
Add debug-threads=on in the name argument if debug enabled.
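
For illustration, the effect on the QEMU command line can be sketched
as below. The runtime builds this argument in Go; the `qemu_name_arg`
helper and the `sandbox-<id>` name format are assumptions, while
`debug-threads=on` is a documented sub-option of QEMU's `-name`.

```rust
/// Build the `-name` argument pair for QEMU.
/// When `debug` is set, ask QEMU to name its threads individually.
fn qemu_name_arg(sandbox_id: &str, debug: bool) -> Vec<String> {
    let mut value = format!("sandbox-{sandbox_id}");
    if debug {
        value.push_str(",debug-threads=on");
    }
    vec!["-name".to_string(), value]
}

fn main() {
    // "abc123" is a placeholder sandbox id.
    println!("{}", qemu_name_arg("abc123", true).join(" "));
    println!("{}", qemu_name_arg("abc123", false).join(" "));
}
```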

Fixes: #9400
Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2024-04-03 10:36:52 +08:00
Saul Paredes
8a92e81f98 gha: add GENPOLICY_PULL_METHOD
Add GENPOLICY_PULL_METHOD that will be used to test pulling
container images in genpolicy using the oci-distribution crate
and/or the containerd interface.

GENPOLICY_PULL_METHOD will start being used in a future PR.

Fixes: #9384

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-02 19:03:28 -07:00
Gabriela Cervantes
f3957352f0 versions: Remove conmon information from versions.yaml
This PR removes conmon information from versions.yaml as this is no
longer being used in kata containers repository.

Fixes #9396

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-02 16:25:45 +00:00
Dan Mihai
39805822fc tests: k8s: reduce policy testing complexity
Don't add the "allow all" policy to all the test YAML files anymore.

After this change, the k8s tests assume that all the Kata CI Guest
rootfs image files either:

- Don't support Agent Policy at all, or
- Include an "allow all" default policy.

This reliance/assumption will be addressed in a future commit.

Fixes: #9395

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-02 16:18:31 +00:00
Alex Lyn
7795f9c016 Merge pull request #9365 from GabyCT/topic/removerunc
versions: Remove runc version information
2024-04-02 09:21:56 +08:00
Alex Lyn
fa8049af6c Merge pull request #9383 from Apokleos/unified-cgrp-cmdline
kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy
2024-04-02 09:08:04 +08:00
Alex Lyn
07bfdf4a22 Merge pull request #9275 from Apokleos/swap-hooks-bindmnt
kata-agent: Change order of guest hook and bind mount processing
2024-04-02 07:40:10 +08:00
Alex Lyn
c88014834b kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy
To have systemd mount cgroups-v2 by default during system boot, we must
add the systemd.unified_cgroup_hierarchy=1 parameter to the kernel
cmdline, which is passed via kernel_params in configuration.toml.
To enable cgroup-v2, just add systemd.unified_cgroup_hierarchy=true[1]
to kernel_params.

Fixes: #9336

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-01 18:45:12 +08:00
alex.lyn
548f252bc4 runtime-rs: bugfix incorrect use of refcount before vfio attach
When a pod has multiple containers, there may be cases with more than
2 attach points. In that case the attach operation should not return
Err, but just return Ok.

Fixes: #8738

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-01 11:28:57 +08:00
Alex Lyn
aa9cd232cd Merge pull request #9358 from GabyCT/topic/nerdrandom
gha: Update journal log names for nerdctl artifacts
2024-04-01 09:50:16 +08:00
Alex Lyn
dfa8832406 Merge pull request #9345 from c3d/bug/9342-agent-test-errors
agent: Fix errors in `make check`
2024-04-01 09:48:44 +08:00
Dan Mihai
3a7dbcfc17 Merge pull request #9367 from microsoft/danmihai1/infinite-io-stream-copy-loop
runtime: remove stream copy infinite loop
2024-03-29 09:37:44 -07:00
Dan Mihai
600f9266f3 runtime: remove stream copy infinite loop
This reverts commit 1c5693be86.

Avoid apparent infinite loop when ReadStreamRequest is blocked by
policy - for some of the pods.

When running the k8s-limit-range.bats test with Policy enabled,
the Shim + VMM never get terminated on my cluster. Not sure why
the sandbox clean-up works better for other tests, but the
k8s-limit-range test pod gets stuck in an infinite loop:

stdout io stream copy error happens: error = %wrpc error: code =
PermissionDenied desc = \"ReadStreamRequest is blocked by policy

...

policy check: ReadStreamRequest

...

stdout io stream copy error happens: error = %wrpc error: code =
PermissionDenied desc = \"ReadStreamRequest is blocked by policy

...

policy check: ReadStreamRequest

...

Fixes: #9380

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-28 22:43:28 +00:00
James O. D. Hunt
13966f4d1d docs: kata-manager: Add help for permissions issue
The 3.3.0 release installs the `kata-manager` script with overly restrictive
permissions (see #9373), so add details to help users handle the situation.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-28 16:22:10 +00:00
James O. D. Hunt
5589e4e291 docs: kata-manager: Update with latest details
Now that v3.3.0 has been released, simplify
the `kata-manager` documentation.

Fixes: #9227.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-28 16:22:10 +00:00
James O. D. Hunt
52fe60c94b docs: kata-manager: Fix heading levels
Add an extra heading indent so that there is only a single
top-level heading.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-28 16:21:31 +00:00
Dan Mihai
ebb26edf42 Merge pull request #9347 from microsoft/danmihai1/reduce-exec-test-policy-prints
genpolicy: reduce policy debug prints
2024-03-27 15:12:10 -07:00
Gabriela Cervantes
a32418bf32 versions: Remove runc version information
This PR removes the runc version information as this is no longer being used
in the kata containers scripts.

Fixes #9364

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-27 20:32:38 +00:00
Steve Horsman
b3acbe0b7f Merge pull request #8046 from fitzthum/clean-config
runtime: remove unimplemented CoCo configurations
2024-03-27 19:39:48 +00:00
Tobin Feldman-Fitzthum
04d021bd12 packaging: remove SERVICEOFFLOAD option
Since we're removing the unused service_offload parameter,
don't set it in any of the packaging scripts.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum
9856fe5bea runtime: remove ServiceOffload parameter
Since we no longer use the service_offload configuration,
remove the ServiceOffload field from the image struct.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum
a18c7ca307 runtime: remove unimplemented CoCo configurations
These experimental options were added 2 years ago
in anticipation of features that would be added
in CoCo. These do not match the features that were
eventually added and will soon be ported to main.

Fixes: #8047

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-03-27 12:21:06 -05:00
Steve Horsman
53fa1fd82d Merge pull request #9349 from fidencio/topic/ci-k8s-update-cpuid
k8s: confidential: Update cpuid to its latest release
2024-03-27 16:57:36 +00:00
Chengyu Zhu
e66a5cb54d Merge pull request #9332 from ChengyuZhu6/guest-pull-timeout
Support to set timeout to pull large image in guest
2024-03-28 00:34:08 +08:00
Christophe de Dinechin
82c4079fd0 agent: Remove useless loop
This is the report from `make check`:

```
error: this loop never actually loops
   --> src/signal.rs:147:9
    |
147 | /         loop {
148 | |             select! {
149 | |                 _ = handle => {
150 | |                     println!("INFO: task completed");
...   |
156 | |             }
157 | |         }
    | |_________^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop
    = note: `#[deny(clippy::never_loop)]` on by default
```

There is only one option: you get something or a timeout. You never retry, so
the report is correct.

Fixes: #9342

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2024-03-27 17:03:44 +01:00
Christophe de Dinechin
df5c88cdf0 agent: Remove lint error about .flatten running forever
The lint report is the following:

```
error: `flatten()` will run forever if the iterator repeatedly produces an `Err`
    --> src/rpc.rs:1754:10
     |
1754 |         .flatten()
     |          ^^^^^^^^^ help: replace with: `map_while(Result::ok)`
     |
note: this expression returning a `std::io::Lines` may produce an infinite number of `Err` in case of a read error
    --> src/rpc.rs:1752:5
     |
1752 | /     reader
1753 | |         .lines()
     | |________________^
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#lines_filter_map_ok
     = note: `-D clippy::lines-filter-map-ok` implied by `-D warnings`
     = help: to override `-D warnings` add `#[allow(clippy::lines_filter_map_ok)]`
```

This commit simply applies the suggestion.
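
The difference matters because `.flatten()` skips every `Err` item and
keeps polling the iterator, while `.map_while(Result::ok)` stops at the
first error. A small self-contained sketch (the `read_lines_until_error`
helper is a hypothetical name, not the function in src/rpc.rs):

```rust
use std::io::{BufRead, BufReader, Read};

/// Collect lines until the first read error or the end of input.
fn read_lines_until_error<R: Read>(input: R) -> Vec<String> {
    BufReader::new(input)
        .lines()
        // Stop at the first `Err` instead of skipping it and polling again.
        .map_while(Result::ok)
        .collect()
}

fn main() {
    // The second line is not valid UTF-8, so `lines()` yields an `Err` for it.
    let data: &[u8] = b"first\n\xff\xfe\nthird\n";
    println!("{:?}", read_lines_until_error(data));
}
```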

Fixes: #9342

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2024-03-27 17:03:44 +01:00
Christophe de Dinechin
bfb55312be agent: Fix .enumerate errors during make check
Running `make check` in the `src/agent` directory gives:

```
error: you seem to use `.enumerate()` and immediately discard the index
   --> rustjail/src/mount.rs:572:27
    |
572 |     for (_index, line) in reader.lines().enumerate() {
    |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_enumerate_index
    = note: `-D clippy::unused-enumerate-index` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::unused_enumerate_index)]`
help: remove the `.enumerate()` call
    |
572 |     for line in reader.lines() {
    |         ~~~~    ~~~~~~~~~~~~~~

    Checking tokio-native-tls v0.3.1
    Checking hyper-tls v0.5.0
    Checking reqwest v0.11.18
error: could not compile `rustjail` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
make: *** [../../utils.mk:177: standard_rust_check] Error 101
```

Fixes: #9342

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2024-03-27 17:03:44 +01:00
Greg Kurz
e1068da1a0 Merge pull request #9326 from gkurz/draft-release
Only tag and publish the release when it is fully ready
2024-03-27 15:59:59 +01:00
ChengyuZhu6
c50d3ebacc tests:k8s: Add a test to pull large images in the guest
Add a test to pull large images in the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 21:58:44 +08:00
ChengyuZhu6
8551ee9533 how-to: add createcontainer timeout to sandbox config documentation
add createcontainer timeout annotation to sandbox config documentation.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 21:58:44 +08:00
ChengyuZhu6
c2dc13ebaa runtime: support to configure CreateContainer Timeout in configurations
support to configure CreateContainerRequestTimeout in the
configurations.

e.g.:
[runtime]
...
create_container_timeout = 300

Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
(https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
In essence, the timeout used for guest pull is min(runtime-request-timeout, create_container_timeout).
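
As a sketch of that rule (the function name is hypothetical and the
values in `main` are examples only; the kubelet applies its own timeout
independently of the Kata runtime):

```rust
use std::time::Duration;

/// The deadline that actually bounds a guest image pull: whichever of the
/// kubelet's runtime-request-timeout and Kata's create_container_timeout
/// expires first.
fn effective_create_container_timeout(
    runtime_request_timeout: Duration,
    create_container_timeout: Duration,
) -> Duration {
    runtime_request_timeout.min(create_container_timeout)
}

fn main() {
    let kubelet = Duration::from_secs(120); // example kubelet setting
    let kata = Duration::from_secs(300); // create_container_timeout = 300
    println!(
        "effective timeout: {:?}",
        effective_create_container_timeout(kubelet, kata)
    );
}
```

Raising create_container_timeout alone is therefore not enough for
large images when the kubelet value is the smaller of the two.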

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 21:58:41 +08:00
Chengyu Zhu
87fc17d4d2 Merge pull request #9341 from ChengyuZhu6/guest-pull-doc
docs: Add documents for kata guest image management
2024-03-27 21:20:22 +08:00
ChengyuZhu6
95b2f7f129 how-to: Add a document for kata guest image management usage
Add a document for kata guest image management usage.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 20:09:37 +08:00
Greg Kurz
693c9487d4 docs: Adjust release documentation
Most of the content of `docs/Stable-Branch-Strategy.md` got de-facto
deprecated by the re-design of the release process described in #9064.
Remove this file and all its references in the repo.

The `## Versioning` section has some useful information though. It is
moved to `docs/Release-Process.md`. The documentation of the `PATCH`
field is adapted according to new workflow.

Fixes #9064 - part VI

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-27 12:41:48 +01:00
Steve Horsman
45aba769c0 Merge pull request #9346 from cmaf/ci-remove-repo-docs
Remove additional links to tests directory
2024-03-27 11:13:32 +00:00
Steve Horsman
a1a615a7c8 Merge pull request #9356 from stevenhorsman/agent-opa-ppc64le-s390x
workflows: Build agent-opa for more archs
2024-03-27 08:53:28 +00:00
ChengyuZhu6
2224f6d63f runtime: support to configure CreateContainer timeout in annotation
Support configuring CreateContainerRequestTimeout via annotations.

e.g.:
annotations:
      "io.katacontainers.config.runtime.create_container_timeout": "300"

Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from the kubelet config
(https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
In essence, the timeout used for guest pull = min(runtime-request-timeout, create_container_timeout).
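
A sketch of how a per-pod annotation value could be resolved, assuming annotation values arrive as strings and take precedence over the configured value. The helper `resolve_create_container_timeout` and the validation it performs are hypothetical; only the annotation key comes from the commit message.

```python
ANNOTATION_KEY = "io.katacontainers.config.runtime.create_container_timeout"


def resolve_create_container_timeout(annotations: dict, configured: int) -> int:
    """Prefer the annotation value (assumed precedence); fall back to the configured one."""
    raw = annotations.get(ANNOTATION_KEY)
    if raw is None:
        return configured
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{ANNOTATION_KEY} must be positive, got {value}")
    return value
```

For example, `resolve_create_container_timeout({ANNOTATION_KEY: "300"}, 60)` yields 300, while an empty annotation map yields the configured 60.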

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 15:44:29 +08:00
ChengyuZhu6
39bd462431 runtime: support to set timeout for CreateContainerRequest
When pulling images in the guest (#8484), it’s important to account for large images.
Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout.
However, this duration may prove insufficient for pulling larger images, such as those containing AI models.
Consequently, we must devise a method to extend the timeout period for large image pulls.

Fixes: #8141

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 15:44:29 +08:00
Gabriela Cervantes
a997e282be gha: Update journal log names for nerdctl artifacts
This PR updates the journal log name for nerdctl artifacts to make
sure that we have different names in case we add a parallel GHA job.

Fixes #9357

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-26 20:03:54 +00:00
GabyCT
c163d9f114 Merge pull request #9329 from GabyCT/topic/seun
scripts: Fix unbound variables in k8s setup script
2024-03-26 11:19:33 -06:00
stevenhorsman
9aa675abb9 workflows: Build agent-opa for more archs
Since https://github.com/kata-containers/kata-containers/pull/7769, we support
building the OPA binary into the ppc64le and s390x arch versions of the rootfs,
so build the policy enabled agent to match for those architectures too.

Fixes: #9355
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-03-26 17:02:14 +00:00
Lukáš Doktor
a671b3fc6e tests: Use full svc address to check kbs service
The service might not listen on the default port; use the full service
address to ensure we are talking to the right resource.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-26 16:59:02 +01:00
Lukáš Doktor
6b0eaca4d4 tests: Add support for nodeport ingress for the kbs setup
This can be used on kcli or other systems where cluster nodes are
accessible from all places where the tests are running.

Fixes: #9272

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-26 16:59:00 +01:00
Greg Kurz
5009fabde4 release: Keep it draft until all artifacts have been published
The automated release workflow starts with the creation of the release in
GitHub. This is followed by the build and upload of the various artifacts,
which can be very long (like hours). During this period, the release appears
to be fully available in https://github.com/kata-containers/kata-containers/
even though it lacks all the artifacts. This might be confusing for users
or automation consuming the release.

Create the release as a draft and clear the draft flag when all jobs are
done. This ensures that the release will only be tagged and made public
when it is fully usable.

If some job fails because of network timeout or any other transient
error, the correct action is to restart the failed jobs until they
eventually all succeed. This is by far the quicker path to complete
the release process.

If the workflow is *canceled* for some reason, the draft release is left
behind. A new run of the workflow will create a brand new draft release
with the same name (not an issue with GitHub). The draft release from
the previous run should be manually deleted. This step won't be automated
as it looks safer to leave the decision to a human.

[1] https://github.com/kata-containers/kata-containers/releases

Fixes #9064 - part VI

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-26 14:48:05 +01:00
Pavel Mores
4c72b02e53 runtime-rs: remove the now-unused code of NetDevice
The remaining code in network.rs was mostly moved to utils.rs, which seems
a better home for these utility functions anyway (and a closely related
function open_named_tuntap() has already lived there).

ToString implementation for Address was removed after some consideration.
Address should probably ideally implement Display (as per RFC 565) which
would also supply a ToString implementation, however it implements Debug
instead, probably to enable automatic implementation of Debug for anything
that Address is a member of, if for no other reason.  Rather than having
two identical functions this commit simply switches to using the Debug
implementation for printing Address on qemu command line.

Fixes #9352

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:52:40 +01:00
Pavel Mores
c94e55d45a runtime-rs: make QemuCmdLine own vsock file descriptor
Make file descriptors to be passed to qemu owned by QemuCmdLine.  See
commit 52958f17cd for more explanation.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:41 +01:00
Pavel Mores
0cf0e923fc runtime-rs: refactor QemuCmdLine::add_network_device() signature
add_network_device() doesn't need to be passed NetworkInfo since it
already has access to the full HypervisorConfig.

Also, one of the goals of QemuCmdLine interface's design is to avoid
coupling between QemuCmdLine and the hypervisor crate's device module,
if at all possible.  That's why add_network_device() shouldn't take
device module's NetworkConfig but just parts that are useful in
add_network_device()'s implementation.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:41 +01:00
Pavel Mores
a4f033f864 runtime-rs: add should_disable_modern() utility function
is_running_in_vm() is enough to figure out whether to disable_modern but
it's clumsy and verbose to use.  should_disable_modern() streamlines the
usage by encapsulating the verbosity.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:41 +01:00
Pavel Mores
12e40ede97 runtime-rs: reimplement add_network_device() using Netdev & DeviceVirtioNet
This commit replaces the existing NetDevice-based implementation with one
using Netdev and DeviceVirtioNet.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:41 +01:00
Pavel Mores
0a57e2bb32 runtime-rs: refactor NetDevice in qemu driver
In keeping with the architecture of the QemuCmdLine implementation, we split the
functionality into two objects: Netdev to represent and generate the
-netdev part and DeviceVirtioNet for the -device virtio-net-<transport>
part.

This change is a pure refactor, existing functionality does not change.
However, we do remove some stub generalizations and govmm-isms, notably:
- we remove the NetDev enum since the only network interface types that
  kata seems to use with qemu are tuntap and macvtap, both of which are
  implemented by the same -netdev tap
- enum DeviceDriver is also left out since it doesn't seem reasonable to
  try to represent VFIO NICs (which are completely different from
  virtio-net ones) with the same struct as virtio-net
- we also remove VirtioTransport because there's no use for it so far, but
  with the expectation that it will be added soon.

We also make struct Netdev the owner of any vhost-net and queue file
descriptors so that their lifetime is tied ultimately to the lifetime of
QemuCmdLine automatically, instead of returning the fds to the caller and
forcing it to achieve the equivalent functionality but manually.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:41 +01:00
Pavel Mores
7f23734172 runtime-rs: reduce generate_netdev_fds() dependencies
generate_netdev_fds() takes a NetworkConfig, from which it only needs
the host-side network device name.  This commit makes it take the device name
directly, making the function useful to callers who don't have the whole
NetworkConfig but do have the requisite device name.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:41 +01:00
Pavel Mores
d4ac45d840 runtime-rs: refactor clear_fd_flags()
The idea of this function is to make sure O_CLOEXEC is not set on file
descriptors that should be inherited by a child (=hypervisor) process.
The approach so far is however rather heavy-handed - clearing *all* flags
is unjustifiably aggressive for a low-level function with no knowledge of
context whatsoever.

This commit refactors the function so that it only does what's expected
and renames it accordingly.  It also clarifies some of its call sites.
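
To illustrate the narrower behaviour described above, here is a sketch in Python rather than the Rust used by runtime-rs. It clears only the close-on-exec descriptor flag (`FD_CLOEXEC`, which is what opening with O_CLOEXEC sets) and leaves every other flag untouched. The function name `clear_cloexec` is hypothetical.

```python
import fcntl
import os


def clear_cloexec(fd: int) -> None:
    """Clear FD_CLOEXEC on fd so a child process inherits it across exec."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)


# Python creates pipes with close-on-exec set, so the effect is observable.
read_fd, write_fd = os.pipe()
clear_cloexec(read_fd)
inheritable = os.get_inheritable(read_fd)  # True once FD_CLOEXEC is cleared
os.close(read_fd)
os.close(write_fd)
```

Reading the current flags first and masking out a single bit is the point of the refactor: a blanket `F_SETFD` with 0 would also drop any other descriptor flags.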

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-26 12:50:14 +01:00
Fabiano Fidêncio
cfe75f9422 k8s: confidential: Update cpuid to its latest release
Since v2.2.6 it can detect TDX guests on Azure, so let's bump it even if
Azure peer-pods are not currently used as part of our CI.

Fixes: #9348

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-26 10:21:12 +01:00
Chengyu Zhu
d16971e37e Merge pull request #9325 from ChengyuZhu6/image_service
agent:image: Refactor code to improve memory efficiency of image service
2024-03-26 10:38:37 +08:00
Dan Mihai
6c72c29535 genpolicy: reduce policy debug prints
Kata CI has full debug output enabled for the cbl-mariner k8s tests,
and the test AKS node is relatively slow. So debug prints from policy
are expensive during CI.

Fixes: #9296

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-26 02:21:26 +00:00
Alex Lyn
cec943fc26 Merge pull request #9244 from Apokleos/dgb-gpu
runtime-rs/dragonball: add support building kernel with upcall and GPU hotplug
2024-03-26 08:53:54 +08:00
Chelsea Mafrica
4e3deb5a3b tools: Fix path for installing yq in packaging script
The lib.sh script uses the right directory but the wrong path for the
script that installs yq; fix it.

Fixes #9165

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-03-25 15:09:52 -07:00
Chelsea Mafrica
cfb977625e docs: Remove links to tests repo
Remove links to tests repo and update with corresponding location in the
current repo.

Fixes #9165

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-03-25 15:09:52 -07:00
Chelsea Mafrica
d69514766e src: Remove references to files in tests repo
Change scripts and source that use files in the tests repo to use the
corresponding files in the current repo.

Fixes #9165

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-03-25 15:09:52 -07:00
Gabriela Cervantes
ddef2be4f1 docs: Remove stale kernel information
This PR removes stale kernel information from the README document.

Fixes #9343

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-25 15:57:00 +00:00
Greg Kurz
e9e94d2dbd release: Give a pretty name to all steps
For a prettier rendering in the web UI.

Fixes #9064 - part VI

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-25 15:50:35 +01:00
Greg Kurz
dce6ea57b2 release: Simplify the create-new-release action of release.sh
Now that the version is an invariant for the entire workflow, it
isn't required to obtain it with an environment variable. Just
rely on the content of the `VERSION` file like other actions.

Fixes #9064 - part VI

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-25 15:50:35 +01:00
Alex Lyn
5c54315a87 dragonball: fix CI failure due to poor UT adaptation.
Fixes: #9144

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-25 20:25:27 +08:00
Alex Lyn
079d894496 kernel: bump version in kata config version
Fixes: #9140

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-25 20:25:27 +08:00
Alex Lyn
070c3fa657 docs: add doc about building kernel with upcall and GPU hotplug
We need some docs about how to build a guest kernel that supports
both Upcall and Nvidia GPU Passthrough (hotplug) at the same time.
This patch adds such a doc to help users build a guest
kernel with support for both Upcall and Nvidia GPU hotplug/unplug.

Fixes: #9140

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-25 20:25:17 +08:00
ChengyuZhu6
06b9935402 docs: Add a document for kata guest image management design
Add a document for kata guest image management design.

Related feature: #8484

Fixes: #9225 -- part I

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-25 18:17:23 +08:00
Chengyu Zhu
4029d154ba Merge pull request #9313 from ChengyuZhu6/rtest
agent: Refactor unit tests to leverage rstest for parameterization
2024-03-25 10:31:45 +08:00
Alex Lyn
bc309b9865 kernel: add CONFIG_CRYPTO_ECDSA into whitelist
CONFIG_CRYPTO_ECDSA is not supported in older kernels such as 5.10.x,
which may break the build if we build such a kernel with NVIDIA GPU
support.

So this patch adds CONFIG_CRYPTO_ECDSA to whitelist.conf to
avoid breaking the guest kernel build with NVIDIA GPU.

Fixes: #9140

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-25 08:05:31 +08:00
ChengyuZhu6
f47408fdf4 agent:image: Refactor code to improve memory efficiency of image service
Currently, `.lock().await.clone()` results in `Option<ImageService>` being duplicated in memory with each call to `singleton()`.
Consequently, if kata-agent receives numerous image pulling requests simultaneously,
it will lead to the allocation of multiple `Option<ImageService>` instances in memory, thereby consuming additional memory resources.

In image.rs, we introduce two public functions:
`merge_bundle_oci()` and `init_image_service()`. These functions will encapsulate
the operations on `IMAGE_SERVICE`, ensuring that its internal details remain
hidden from external modules such as `rpc.rs`.

Fixes: #9225 -- part II

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-25 07:46:50 +08:00
ChengyuZhu6
7a49ec1c80 agent:util: Refactor the unit tests to leverage rstest
Refactor the unit tests in util.rs to leverage rstest for parameterization.

Fixes: #9314

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-23 10:49:53 +08:00
ChengyuZhu6
2df2b4d30d agent:namespace: Refactor unit tests to leverage rstest
Refactor the unit tests in `namespace.rs` to leverage rstest for parameterization.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-23 10:49:48 +08:00
Hyounggyu Choi
d915a79e2d Merge pull request #9280 from BbolroC/enable-qemu-on-s390x
runtime-rs: Enable qemu on s390x
2024-03-22 23:58:42 +01:00
Fabiano Fidêncio
25cd28a32b Merge pull request #9337 from fidencio/topic/bump-nydus-snapshotter
versions: Update nydus-snapshotter to v0.13.11
2024-03-22 22:18:18 +01:00
Hyounggyu Choi
81aaa34bd6 runtime-rs: Add DeviceVirtioSerial and DeviceVirtconsole
It is observed that virtiofsd exits immediately on s390x
if there are no attached console devices.
This commit resolves the issue by migrating `appendConsole()`
from runtime and triggering it in `start_vm()`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
Hyounggyu Choi
2cfe745efb runtime-rs: Enable memory backend option for Machine for s390x
s390x requires an additional option `memory-backend` for `-machine`.
Otherwise, virtiofsd exits with HandleRequest(InvalidParam).

This commit is to add a field `memory_backend` to `struct Machine`
and turn it on for s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
Hyounggyu Choi
9bcfaad625 runtime-rs: Add ccw block device for rootfs
Like nvdimm for x86_64, a block device for s390x should be
treated differently with `virtio-blk-ccw`.
This is to generate a QEMU command line parameter for a block
device by using `-blockdev` and `-device` if the `vm_rootfs_driver`
is set to `virtio-blk-ccw`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
David Esparza
3e40051634 Merge pull request #9255 from dborquez/thread_pid_function
runtime-rs: ch: Implement full thread/tid/pid handling
2024-03-22 10:05:02 -06:00
Fabiano Fidêncio
d0949759ec versions: Update nydus-snapshotter to v0.13.11
This version brings in a fix for cleaning up k3s/rke2 environments,
which directly impacts the TDX machine that's part of our CI.

Fixes: #9318

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-22 14:56:18 +01:00
Greg Kurz
e4f6a778a8 Merge pull request #9321 from fidencio/topic/releases-follow-up-VI
Revert "release: Skip --generate-notes for this release"
2024-03-22 10:44:40 +01:00
GabyCT
a67382fd00 Merge pull request #9324 from GabyCT/topic/udevguide
docs: Update libseccomp instructions in Developers Guide
2024-03-21 14:25:41 -06:00
Gabriela Cervantes
d54cdd3f0c scripts: Fix unbound variables in k8s setup script
This PR fixes the unbound variable errors that occur when running
the setup script locally.

Fixes #9328

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-21 19:10:16 +00:00
Chengyu Zhu
9a4cb96262 Merge pull request #9312 from ChengyuZhu6/show-feature
agent: Add guest-pull to the list of agent features in announce()
2024-03-21 23:35:29 +08:00
David Esparza
b498e140a1 runtime-rs: ch: Implement full thread/tid/pid handling
Add in the full details once cloud-hypervisor/cloud-hypervisor#6103
has been implemented, and the feature is available in a Cloud Hypervisor
release.

Fixes: #8799

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-03-21 08:24:53 -06:00
James O. D. Hunt
1e684f5848 Merge pull request #9259 from jodh-intel/tests-add-static-checks-announce
tests: static checker: Add announce message
2024-03-21 13:59:36 +00:00
ChengyuZhu6
754399d909 agent: Add guest-pull to the list of agent features in announce()
Add guest-pull to the list of agent features in announce().

Fixes: #9225 -- part IV

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-21 20:01:52 +08:00
Xuewei Niu
9c4f9dcb35 Merge pull request #9311 from studychao/chao/fix_mtrr
Dragonball: introduce MTRR regs support
2024-03-21 17:24:27 +08:00
Hyounggyu Choi
9b2c08935b runtime-rs: Pass different device argument based on bus type
Currently, `*-pci` is used as an argument for the device config.
This does not hold when a different type of bus is used;
s390x, for example, uses `ccw`.
This commit makes the device argument flexible by generating it
based on the bus type. The structures `DeviceVhostUserFsPci` and
`VhostVsockPci` are renamed to `DeviceVhostUserFs` and `VhostVsock`
because their names are no longer bound to a particular bus type.
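
A small sketch of deriving the device argument from the bus type, written in Python for brevity rather than in the Rust of runtime-rs. The `BusType` enum and the `device_arg` helper are hypothetical; only the `pci` and `ccw` suffixes come from the commit message, and the base name `vhost-user-fs` is used for illustration.

```python
from enum import Enum


class BusType(Enum):
    PCI = "pci"
    CCW = "ccw"  # bus type used on s390x


def device_arg(base: str, bus: BusType) -> str:
    """Build a device name such as 'vhost-user-fs-ccw' from a base name and a bus."""
    return f"{base}-{bus.value}"
```

With this shape, callers pass the bus once and the suffix is never hard-coded, which is why a `Pci` suffix in the struct names would be misleading.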

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-21 09:25:37 +01:00
GabyCT
03f3d3491d Merge pull request #9265 from GabyCT/topic/fixnydusclean
gha: Fix nydus namespace clean up
2024-03-20 16:17:38 -06:00
GabyCT
702a8a440f Merge pull request #9309 from GabyCT/topic/fixlograndom
gha: Update journal log names for kubernetes artifacts
2024-03-20 16:17:17 -06:00
Gabriela Cervantes
05f4dc1902 docs: Update libseccomp instructions in Developers Guide
This PR updates the libseccomp instructions in the Developers Guide.

Fixes #9323

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-20 20:44:24 +00:00
GabyCT
163103d59e Merge pull request #9307 from GabyCT/topic/fixdocreq
docs: Update links in the Documentation Requirements document
2024-03-20 14:29:04 -06:00
Gabriela Cervantes
af18221ab7 docs: Update links in the Documentation Requirements document
This PR updates the url links in the Documentation Requirements
document.

Fixes #9306

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-20 15:45:49 +00:00
Gabriela Cervantes
a855ecf21b gha: Update journal log names for kubernetes artifacts
This PR updates the journal log names for kubernetes artifacts
in order to make sure that we have different names when we are
running parallel GHA jobs.

Fixes #9308

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-20 15:44:20 +00:00
Gabriela Cervantes
4fb8f8705f gha: Fix nydus namespace clean up
This PR terminates the nydus namespace to avoid the error
that the flag needs an argument.

Fixes #9264

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-20 15:41:39 +00:00
Fabiano Fidêncio
0278fc8a91 Revert "release: Skip --generate-notes for this release"
This reverts commit 0fa59ff94b, as now
we'll be able to use the `--generate-notes`, hopefully, without blowing
the allowed limit.

Fixes: #9064 - part VI

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-20 15:48:22 +01:00
James O. D. Hunt
577abd014b tests: static checker: Add announce message
Added an announcement message to the `static-checks.sh` script. It runs
platform / architecture specific code so it would be useful to display
details of the platform the checker is running on to help with
debugging.

Fixes: #9258.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-20 13:41:26 +00:00
James O. D. Hunt
4af4a8ad2b tests: static checker: Create setup function
Move some of the common code into a setup function.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-20 11:58:28 +00:00
Fabiano Fidêncio
1aec4f737a Merge pull request #9316 from fidencio/topic/releases-follow-up-V
release: Skip --generate-notes for this release
2024-03-20 10:50:14 +01:00
Fabiano Fidêncio
0fa59ff94b release: Skip --generate-notes for this release
This release is a special case, as we've slacked for 6 months and the
release content is way too long ... long enough to exceed the allowed
limit for the release notes.

With this in mind we'll just remove the `--generate-notes` for now, and
then revert this commit as soon as the release is out, as releases
should be happening every month and, ideally, we won't reach this
situation ever again.

Fixes: #9064 - part V

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-20 10:32:11 +01:00
Hyounggyu Choi
7b3d1adb8c libs: Bump sysinfo to v0.30.5
It has been observed that the runtime stops running around
`sysinfo::total_memory()` while adjusting a config on s390x.
This is to update the crate to the latest version which happened
to resolve the issue. (No explicit release note for this)

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-20 09:27:13 +01:00
Chao Wu
5a4b858ece Dragonball: introduce MTRR regs support
MTRRs (Memory-Type Range Registers) are a group of x86 MSRs that provide a way to control access to
and cacheability of physical memory regions.
During our tests with runtime-rs + Dragonball, we found that support for these registers is a must
for a passthrough GPU running CUDA applications: the GPU needs that information to properly use GPU memory.

fixes: #9310
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-03-20 14:18:16 +08:00
Fabiano Fidêncio
19eb45a27d Merge pull request #8484 from ChengyuZhu6/guest-pull
Merge basic guest pull image code to main
2024-03-19 23:15:39 +01:00
Hyounggyu Choi
6e782826c7 Merge pull request #9305 from BbolroC/handle-comment-for-skipped-tests
CI|k8s: Handle skipped tests with a comment for filter_out_per_arch
2024-03-19 22:54:03 +01:00
Fabiano Fidêncio
8911d3565f gha: tests: Filter out confidential tests for aarch64 / ppc64le
Those two architectures are not TEE capable, thus we can just skip
running those tests there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-19 18:06:01 +01:00
Fabiano Fidêncio
d14e9802b6 gha: k8s: Set {https,no}_proxy correctly for TDX
This is needed as the TDX machine is hosted inside Intel and relies on
proxies in order to connect to the external world.  Not having those set
causes issues when pulling the image inside the guest.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-19 18:06:00 +01:00
Fabiano Fidêncio
291b14bfb5 kata-deploy: Add the ability to set {https,no}_proxy if needed
Let's make sure those two proxy settings are respected, as those will be
widely used when pulling the image inside the guest on the Confidential
Containers case.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
5bad18f9c9 agent: set https_proxy/no_proxy before initializing agent policy
When the https_proxy/no_proxy settings are configured with agent-policy enabled, pulling an image in the guest hangs.
This issue could stem from the instantiation of `reqwest`’s HTTP client at agent-policy initialization time,
which may prevent the proxy settings from taking effect during guest image pulls.
Given that both functionalities use `reqwest`, https_proxy/no_proxy should be set before agent-policy is initialized.

Fixes: #9212

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
db9f18029c README: Add https_proxy and no_proxy to agent README
Add agent.https_proxy and agent.no_proxy to the table in the agent README.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
e23737a103 gha: refactor code with yq for better clarity
refactor code with yq for better clarity:

Before:
```bash
yq write -i "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml"
'spec.template.spec.containers[0].env[7].value' "${KATA_HYPERVISOR}:${SNAPSHOTTER}"
```

After:
```bash
yq write -i \
  "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" \
  'spec.template.spec.containers[0].env[7].value' \
  "${KATA_HYPERVISOR}:${SNAPSHOTTER}"
```

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
2c0bc8855b tests: Make sure to install yq before using it
Make sure to install yq before using it to modify YAML files.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
c52b356482 tests: add guest pull image test
Add a test case for pulling an image inside the guest for confidential
containers.

Signed-off-by: Da Li Liu <liudali@cn.ibm.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
e8c4effc07 tests: refactor the check for hypervisor to a function
Extract two reusable functions for confidential tests in confidential_common.sh

- check_hypervisor_for_confidential_tests: verifies if the input hypervisor supports confidential tests.
- confidential_setup: performs the common setup for confidential tests.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
6e5e4e55d0 rootfs: add ca file to guest rootfs
To access the URL, the component that pulls the image in the guest needs to send a request to the remote.
Therefore, we need to add a CA file to the rootfs.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
8724d7deeb packaging: Enable to build agent with PULL_TYPE feature
Enable building kata-agent with the PULL_TYPE feature.

We build kata-agent with guest-pull feature by default, with PULL_TYPE set to default.
This doesn't affect how kata shares images by virtio-fs. The snapshotter controls the image pulling in the guest.
Only the nydus snapshotter with proxy mode can activate this feature.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
cd6a84cfc5 kata-deploy: Setting up snapshotters per runtime handler
Set up snapshotters per runtime handler, as described in commit
6cc6ca5a7f.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
ba242b0198 runtime: support different cri container type check
Support handling the image-guest-pull block volume from different CRIs, including cri-o and containerd.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
874d83b510 agent/image: Use guest provided pause image
By default, the pause image and runtime config are provided
by the host side. This poses a potential security risk if the
host configures a malicious pause image, so we use the pause
image packaged in the rootfs instead.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Julien Ropé <jrope@redhat.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
c269b9e8c6 agent: Add guest-pull feature for kata-agent
Add a "guest-pull" feature option so that the related dependencies
are compiled if the feature is enabled.

By default, agent would be built with default-pull feature, which would
support all pull types, including sharing images by virtio-fs and
pulling images in the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
Aurélien
192250c52e Merge pull request #9299 from sprt/sprt/mariner-normal-tests
ci: aks: also run tests in normal instance for Mariner
2024-03-19 11:34:20 -05:00
ChengyuZhu6
965da9bc9b runtime: support to pass image information to guest by KataVirtualVolume
Support passing image information to the guest via KataVirtualVolumeImageGuestPullType
in KataVirtualVolume, which will be used to pull the image in the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
cfd14784a0 agent: Introduce ImagePullHandler to support IMAGE_GUEST_PULL volume
As we do not employ a forked containerd in confidential-containers, we utilize the KataVirtualVolume,
which stores the image information as an integral part of `CreateContainer`.
Within this process, we store the image information in rootfs.storage and pass the image URL through `CreateContainerRequest`.
This approach differs from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage.
To maintain clarity and avoid any need to modify the `OverlayfsHandler`, we introduce the `ImagePullHandler`.
This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment.
This logic includes calling image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`,
followed by a bind mount to `/run/kata-containers/{container_id}`.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
462051b067 agent/image: merge container spec for images pulled inside guest
When being passed an image name through a container annotation,
merge its corresponding bundle OCI specification and process into the passed container creation one.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
Co-authored-by: jordan9500 <jordan.jackson@ibm.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
cec1916196 agent: Support https_proxy/no_proxy config for image download in guest
Containerd supports setting a proxy for image downloads through an environment variable.
In the CC stack, image download is offloaded to the kata agent, so we need to support a similar feature.
Currently we add https_proxy and no_proxy; http_proxy is not added since it is insecure.
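
The idea can be sketched as below, in Python rather than the agent's Rust. The helper `apply_proxy_settings` and the exact environment variable names are assumptions made for illustration; the only facts taken from the commit message are that https_proxy and no_proxy are supported and http_proxy is deliberately left out.

```python
import os


def apply_proxy_settings(https_proxy: str = "", no_proxy: str = "", env=os.environ) -> None:
    """Export only the HTTPS proxy and the exclusion list; http_proxy is never set."""
    if https_proxy:
        env["https_proxy"] = https_proxy
    if no_proxy:
        env["no_proxy"] = no_proxy
```

Passing a plain dict as `env` keeps the helper easy to exercise without touching the real process environment.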

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
9cddd5813c agent/image: Enable image-rs crate to pull image inside guest
With the image-rs pull_image API, the downloaded container image layers
are stored in IMAGE_RS_WORK_DIR, and the generated bundle dir with rootfs
and config.json is saved under the CONTAINER_BASE/cid directory.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
2b3a00f848 agent: export the image service singleton instance
Export the image service singleton instance.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
1f1ca6187d agent: Introduce ImageService
Introduce structure ImageService, which will be used to pull images
inside the guest.

Fixes: #8103

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
co-authored-by: stevenhorsman <steven@uk.ibm.com>
2024-03-19 17:22:33 +01:00
Hyounggyu Choi
b381743dd5 CI|k8s: Handle skipped tests with a comment for filter_out_per_arch
This commit updates `filter_k8s_test.sh` to handle skipped tests that
include comments. In addition to the existing parameter expansion,
the following expansions have been added:

- Removal of a comment
- Stripping of trailing spaces
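
The two expansions listed above can be sketched as follows. This is a minimal illustration, not the code from `filter_k8s_test.sh`: the function name `normalize_skipped_test` and the sample entry are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the real script): normalise one skip-list entry.
normalize_skipped_test() {
    local entry="$1"
    # Removal of a comment: drop everything from the first '#' onwards.
    entry="${entry%%#*}"
    # Stripping of trailing spaces: compute the trailing whitespace run,
    # then remove exactly that run from the end of the entry.
    entry="${entry%"${entry##*[![:space:]]}"}"
    printf '%s\n' "${entry}"
}

# Illustrative entry with a trailing comment.
normalize_skipped_test "k8s-foo.bats   # skipped on this arch"
```

Both steps use only shell parameter expansion, so no external process such as `sed` is needed per entry.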

Fixes: #9304

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-19 17:21:25 +01:00
Chelsea Mafrica
42dfe0e8d1 Merge pull request #9286 from jodh-intel/agent-show-enabled-features
agent: Show features enabled at build time
2024-03-19 08:54:49 -07:00
Wainer Moschetta
e6501aa4ad Merge pull request #9229 from ldoktor/ocp-ci
ocp.ci: Various fixes and improvements to the OCP pipeline
2024-03-19 11:13:01 -03:00
James O. D. Hunt
46aec0f15a Merge pull request #9293 from jodh-intel/kata-manager-fix-containerd-for-docker
kata-manager: Fix Docker install
2024-03-19 10:06:44 +00:00
Fabiano Fidêncio
e0a6b6449f Merge pull request #9302 from BbolroC/fix-permission-issue-on-s390x-runners
gha: Place pre-action on s390x runner for kata-deploy during release
2024-03-19 10:42:23 +01:00
Hyounggyu Choi
f2bc819644 gha: Place pre-action on s390x runner for kata-deploy during release
This is to place a pre-action step for the kata-deploy job in order to
clean up the github workspace directory before checking out the repo.

Fixes: #9301

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-19 10:18:38 +01:00
Alex Lyn
7af2df408e Merge pull request #9295 from likebreath/0318/fix_clh_default_netconfig
runtime-rs: ch: Provide valid default value for NetConfig
2024-03-19 15:17:18 +08:00
Xuewei Niu
99d0e5fff8 Merge pull request #9270 from zvonkok/kata-agent-bind-mount
kata-agent: optional bind flag
2024-03-19 10:39:23 +08:00
Aurélien Bombo
71a1be9c57 ci: aks: also run tests in normal instance for Mariner
Currently we're only running the small instance tests. This adds the
normal instance tests as well.

Fixes: #9298

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-03-18 23:33:17 +00:00
Bo Chen
ad4262e86b runtime-rs: ch: Provide valid default value for NetConfig
The current default value of IP `0.0.0.0` with mask `0.0.0.0` causes an
ioctl error when used to create and configure a TAP device with
newer versions of Cloud Hypervisor [1]. This patch replaces them with
valid values that are the same as in the Go-lang runtime [2].

[1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/5924
[2] e3f7852738/src/runtime/virtcontainers/pkg/cloud-hypervisor/client/model_net_config.go (L40-L57)

Fixes: #9254

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-18 15:47:58 -07:00
Fabiano Fidêncio
e3f7852738 Merge pull request #9289 from fidencio/topic/releases-follow-up-IV
releases: Simplify the release in order to avoid pushing a commit updating the VERSION file
2024-03-18 17:38:58 +01:00
James O. D. Hunt
a6c3f75872 kata-manager: Fix Docker install
Fix the Docker install by removing the second (erroneous) call to
`containerd_installed()` in `handle_docker()`.

Without this fix, installing using Docker (`-D`) will work *iff* you
already have containerd installed. However, if you do not have
containerd installed, the `containerd_installed()` function returns 1,
which exits the script as we're running with `set -e`, leaving a broken
Docker installation.

> **Note:** containerd is installed via Docker's `get-docker.sh` script.
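
The failure mode described above can be reproduced in isolation. The sketch below is an assumption-laden illustration: `containerd_installed()` is stubbed to always return 1, and the guarded variant is shown only for contrast (the actual fix removed the erroneous call rather than guarding it).

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real check: pretend containerd is missing.
stub='containerd_installed() { return 1; }'

# Bare call under `set -e`: the non-zero return status ends the script early.
bare_status=0
bash -c "${stub}; set -e; containerd_installed; echo 'not reached'" || bare_status=$?
echo "bare call exit status: ${bare_status}"

# Call used as an `if` condition: `set -e` does not apply, so the script goes on.
guarded_status=0
bash -c "${stub}; set -e; if containerd_installed; then echo present; else echo missing; fi" || guarded_status=$?
echo "guarded call exit status: ${guarded_status}"
```

Each case runs in its own `bash -c` process because `set -e` is ignored inside a subshell that is itself part of a `||` list.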

Fixes: #9292.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-18 14:08:35 +00:00
stevenhorsman
0ab8e61a64 release: Remove release type from arch release
Now that we no longer have minor and major releases and
generate the new version
in the release workflow, we can
tidy up the arch-specific release workflows by removing
the extra required inputs

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-03-18 12:27:57 +00:00
Greg Kurz
3cfc1b6ba7 releases: Adjust documentation to the new workflow
This drops the documentation of the legacy release scripts and adds
a quick description of the scripts of the new workflow. It also
highlights the bump of the `VERSION` file.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-18 12:57:02 +01:00
Greg Kurz
76c640767e releases: Drop Makefile
It isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-18 12:54:00 +01:00
Greg Kurz
bfe19e68e8 kata-deploy: Adapt test-kata.sh to the new release workflow
All releases are now created in the `main` branch following
the very same workflow. No need to special case pre-releases.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-18 12:54:00 +01:00
Fabiano Fidêncio
12578f11bc releases: Assume VERSION has the correct version to be released
This is done in order to avoid having to push a commit to the main
branch, which is against the defined rules on GitHub.

By doing this, we need to educate ourselves to always bump the VERSION
file as soon as a release is cut out.

As a side effect of this change, we can drop the release-major and
release-minor workflows, as those are not needed anymore.

Fixes: #9064 - part IV

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-16 13:30:58 +01:00
Fabiano Fidêncio
8ce50269fe release: Bump the VERSION file to the next release number
3.3.0 it will be.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-16 13:21:27 +01:00
Xuewei Niu
9f512c016e Merge pull request #9282 from gkurz/runtime-rs-fds-for-qemu
runtime-rs: Consolidate the handling of fds passed to QEMU
2024-03-16 10:26:11 +08:00
Greg Kurz
1e526a4769 runtime-rs: Consolidate the handling of fds passed to QEMU
File descriptors that are passed to QEMU need some special care.
We want them to be closed when the QEMU process is started. But
at the same time, it is required that the associated Rust `File`
structures, either coming from `std::fs` or `tokio::fs`,
are still in scope when the QEMU process is forked. This
is currently achieved by keeping File structures in variables
at the outer scope of `start_vm()`. This scheme is currently
duplicated, with similar justifications in the corresponding
comments.

Consolidate all this handling in one place with a more generic
explanation.

Fixes #9281

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-15 16:14:59 +01:00
James O. D. Hunt
9ef59488d9 agent: Show features enabled at build time
The agent now has a number of optional build-time features that can be
enabled.

Add details of these features to the following areas:

- Version output (`kata-agent --version`)
- Announce message (so that the details are always added to the journal
  at agent startup).
- The response message returned by the ttRPC `GetGuestDetails()` API.

Fixes: #9285.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-15 13:29:21 +00:00
Chelsea Mafrica
2c50d3c393 Merge pull request #9278 from wainersm/github_env_fix
tests: fix nounset error with $GITHUB_ENV
2024-03-14 16:39:13 -07:00
Greg Kurz
6a112cc7a5 runtime-rs: Fix missing dependency
Some previous contribution did not run cargo clippy.
Fix the dependency now so that it doesn't cause noise
in future contributions.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-14 23:19:38 +01:00
Dan Mihai
b3b00e00a6 Merge pull request #9246 from microsoft/danmihai/default-env
genpolicy: default env if image doesn't have env
2024-03-14 11:01:43 -07:00
Dan Mihai
6094f1e31d Merge pull request #9250 from microsoft/danmihai1/k8s-pid-ns2
tests: k8s: k8s-pid-ns.bats auto-generated policy
2024-03-14 10:10:24 -07:00
Zvonko Kaiser
c15e19c806 kata-agent: optional bind flag
Fixes: #9269

From https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts
type (string, OPTIONAL) The type of the filesystem to be mounted.
`bind` may be specified only in the OCI spec options, so update the flags and `r#type` accordingly.
Without this, the agent ignores bind mounts that are specified only in the OCI spec options and not in the flags.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-03-14 14:42:01 +00:00
Hyounggyu Choi
1dac6b1357 runtime-rs: Configure s390x specific flags for Makefile
s390x supports a different machine type, `s390-ccw-virtio`, and does
not require cpu features to be configured by default.
The `dragonball` hypervisor is not supported on s390x, so `DBCMD`
is not necessary. `vm-rootfs_driver` should be set to `virtio-blk-ccw`.
This commit sets the architecture-specific flags in the Makefile.

Fixes: #9158

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-14 13:05:35 +01:00
Wainer dos Santos Moschetta
981f95df55 tests: fix nounset error with $GITHUB_ENV
Initialize $GITHUB_ENV to avoid a nounset error when running the scripts locally,
outside of GitHub Actions.
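
A minimal sketch of this kind of initialization, assuming an empty default is acceptable; the function name `init_github_env_demo` is invented for the example and does not appear in the test scripts.

```shell
#!/usr/bin/env bash
# Hypothetical demo, run in a subshell so `nounset` does not leak to the caller.
init_github_env_demo() (
    set -o nounset
    # Outside GitHub Actions the variable is typically unset; simulate that.
    unset GITHUB_ENV
    # Give it an empty default so later references are safe under `nounset`.
    GITHUB_ENV="${GITHUB_ENV:-}"
    if [ -z "${GITHUB_ENV}" ]; then
        echo "GITHUB_ENV is empty: assuming a local run"
    fi
)

init_github_env_demo
```

Without the `${GITHUB_ENV:-}` default, the `[ -z "${GITHUB_ENV}" ]` check itself would abort the script with an "unbound variable" error.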

Fixed commit 9ba5e3d2a8

Fixes #9217
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-13 14:57:38 -03:00
Dan Mihai
ac27caf1b4 Merge pull request #9248 from microsoft/danmihai1/k8s-exec.bats2
tests: k8s: k8s-exec.bats auto-generated policy
2024-03-13 09:21:12 -07:00
Alex Lyn
2aa3519520 kata-agent: Change order of guest hook and bind mount processing
The guest_hook_path item in configuration.toml allows OCI hook scripts
to be executed within Kata's guest environment. Traditionally, these
guest hook programs are pre-built and included in Kata's guest rootfs
image at a fixed location.

While setting guest_hook_path = "/usr/share/oci/hooks" in configuration.toml
works, it lacks flexibility. Not all guest hooks reside in the path
/usr/share/oci/hooks, and users might have custom locations.

To address this, we propose a more flexible and configurable approach
that allows users to specify their desired path. This could include using a
sandbox bind mount path for hooks specific to that particular container.

However, the current implementation of guest hooks and bind mounts in kata-agent
processes them in the reverse of the desired order.
To achieve the intended functionality, we simply need to swap the order in which
they are processed.

Fixes: #9274

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-13 20:30:32 +08:00
Steve Horsman
8f4cbd49d7 Merge pull request #9263 from Amulyam24/gha-fixes
gha: ensure that the self hosted runner is in desired state before running the workflow
2024-03-13 10:49:29 +00:00
Zvonko Kaiser
63dff9a9f2 kata-agent: CreateContainer Hook
Fixes: #9267

The doc states we have support for all lifecycle hooks, but some are still missing.
This is the first issue, regarding the CreateContainer hook, which is run before pivot_root but after prestart and createruntime.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-03-13 09:24:25 +00:00
Amulyam24
3f4b24be8b gha: ensure that self hosted runner is prepared before running the workflow
This PR ensures that the self hosted runner is prepared by taking
necessary actions before running the workflow. The script prepare_runner.sh
checks the following:
1. Ensure that containerd/docker is up and running
2. Make sure that the repository workspace is cleaned up and has no conflicts
3. Remove/cleanup any leftover files from the previous runs

Fixes: #9262

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-03-13 14:20:10 +05:30
Alex Lyn
410afcc913 Merge pull request #8866 from Apokleos/netdev-qemu-rs
runtime-rs: add netdev params to cmdline for qemu-rs.
2024-03-13 13:07:43 +08:00
Dan Mihai
e8c2a45ce0 tests: k8s: k8s-pid-ns.bats auto-generated policy
Auto-generate policy for k8s-pid-ns.bats.

Fixes: #9249

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-12 22:34:46 +00:00
Lukáš Doktor
46e62eecb1 ci.ocp: Log the full grepped line rather than the expected msg
we are grepping for an expected message but it might contain extra bits
of information useful for later debugging. Let's include it in the
output and the full log in case of an error.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 17:03:46 +01:00
Lukáš Doktor
7ff2eb508e ci.ocp: Increase the mcp update timeout
we're hitting this timeout quite often, looks like newer OCP takes
longer to reconfigure. Increase the timeout to 1200.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:38:04 +01:00
Lukáš Doktor
cc02329fd1 ci.ocp: Add a cleanup script
This script doesn't serve as a complete cleanup, but it can be used as a
best-effort cleaner between deploying different versions of
kata-containers on the same OCP cluster.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:38:04 +01:00
Lukáš Doktor
b811ee0650 ci.ocp: Allow to override the kata-deploy image
sometimes we want to test an image other than the latest one (eg. when
verifying a PR via ghcr images or when bisecting a failure over older
builds). Let's add a KATA_DEPLOY_IMAGE variable for that while keeping
the latest image by default.

Fixes: #9228

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:38:04 +01:00
Lukáš Doktor
2936503b24 ci.ocp: Always replace the kata-deploy image in OCP pipeline
previously we only replaced the image when the previously defined one
matched the "old_img". This is good to avoid modifying developers custom
changes, but it might lead to hard-to-debug issues when the image stays
different. Let's ensure we always replace the image with the one we
asked for.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:38:04 +01:00
Lukáš Doktor
6525c94065 ci.ocp: Add a workaround to optionally enable skip_mount_home
the latest upstream kata-containers requires the skip_mount_home to be
enabled, which is default on OCP 4.14+ but disabled on OCP 4.13-. Let's
use a "WORKAROUND_9206_CRIO" (named after the kata-containers GH issue)
variable to allow users to enable this treatment when needed.

Related to: #9206

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:38:04 +01:00
Lukáš Doktor
739d627b4e ci.ocp: Turn selinux relabel failures into warnings
Instead of failing the pipeline let's proceed with an error message that
selinux setup failed so, in case of a later failure, we know what might
have caused it while keeping the coverage in case of a false setup
issue.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:38:04 +01:00
Lukáš Doktor
76c452d4e0 ci.ocp: Wait for all pods to finish the work
previously we only waited for a random pod to finish the selinux
relabel, which could be error-prone. Let's wait for all of the pods to
contain the expected message.

Increase the timeout to 120s as some pods might take a little bit longer
to finish.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:34:56 +01:00
Lukáš Doktor
f7febd07a0 ci.ocp: Allow to re-apply the selinux workaround
in case we re-apply the selinux workaround or if the user already had
a similar rule, relabel_selinux was failing. Let's allow it to
modify the existing rules as well to avoid such issues.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:02:21 +01:00
Lukáš Doktor
fbbea68f1f ci.ocp: Ignore selinux setup on non-selinux cluster
improve our selinux workaround to work well on non-selinux clusters.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-12 16:02:20 +01:00
Alex Lyn
e2ae8ba79b runtime-rs: add network device into Qemu's cmdline
It will open tuntap device and vhost-net device
and store device files.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:28:54 +08:00
Alex Lyn
d3bca4597e runtime-rs: add open_named_tuntap to open a named tuntap device.
The open_named_tuntap function is designed as a public function to
open a tuntap device with the specified name. However, in order to
reference existing methods in dbs_utils, we still need to keep the
reference `path = "../../../dragonball/src/dbs_utils"` in dependencies
and cannot hide it.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:26:32 +08:00
Alex Lyn
005b333976 runtime-rs: add network helpers and impl ToQemuParams
Add network helpers and impl ToQemuParams trait to build
netdev params which are put into cmdline for Qemu VM running.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:25:39 +08:00
Alex Lyn
63786934f4 runtime-rs: set network namespace for qemu process and netdev.
We need to ensure that add_network_device happens in the netns and
move the qemu process into the netns, which keeps the qemu process
running in this net namespace.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:21:43 +08:00
Alex Lyn
69a5e5b955 runtime-rs: add network device handler in start_vm.
Add a network device handler in start_vm, specifically
for a Qemu VM running with net params added to the command line.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:18:01 +08:00
Alex Lyn
a116b252c8 Merge pull request #9236 from jodh-intel/docs-improve-install-details
docs: install: Simplify instructions
2024-03-12 14:29:38 +08:00
Alex Lyn
a31fb35e5d Merge pull request #9231 from UiPath/fix/clh-pid-init
clh: initialize clh pid before using it
2024-03-12 13:43:24 +08:00
Alex Lyn
9f6003adde runtime-rs: add a new netns field in struct QemuInner.
We need to add a new netns field to struct QemuInner and
initialize it with the argument passed down in prepare_vm().

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-11 16:02:39 +08:00
Alex Lyn
f571ec84d2 runtime-rs: add a public method to support process entering netns.
The enter_netns function is designed as a public method to help
VMMs running as an independent process enter a network namespace,
reducing duplicate code.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-11 15:55:52 +08:00
Alex Lyn
4176fcc3c6 runtime-rs: make the code for cleanup fd flags as public method.
This just moves the related code to a public file (utils.rs) and makes
it a common method for vsock, network, and possibly others.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-11 15:52:20 +08:00
Alex Lyn
b1038704e0 runtime-rs: make NetnsGuard common for hypervisor and resource.
In order to better support non-builtin vmm usage of NetnsGuard and
reduce code duplication, we need to move it to a common path that
can be referenced by both hypervisor and resource manager.

This patch just moves the code from network/utils/netns.rs
to kata-sys-utils/src/netns.rs

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-11 15:38:42 +08:00
Alexandru Matei
617b0114b3 clh: initialize clh pid before using it
The PID needs to be initialized before calling isClhRunning.
waitVMM() uses isClhRunning and is called by launchClh() just
before returning from function.

Fixes: #9230

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-03-09 13:53:51 +02:00
Dan Mihai
88b7a44271 tests: k8s: k8s-exec.bats auto-generated policy
Auto-generate policy for k8s-exec.bats.

Fixes: #9247

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-08 17:48:20 +00:00
Steve Horsman
54e5ce2464 Merge pull request #9154 from chungeun-choi/change-deprecated-package
fixed - Change the deprecated module from 'io/util' to util. 'io/util…
2024-03-08 15:05:43 +00:00
Steve Horsman
e9bbf2f67b Merge pull request #9203 from fidencio/topic/releases-follow-up-III
release: Ensure the release-type is passed to workflows
2024-03-08 14:09:36 +00:00
Alex Lyn
c73597c39d Merge pull request #9208 from studychao/chao/fix_virt_ci
Dragonball: fix unit test problems when switching to new virt github machine
2024-03-08 09:41:05 +08:00
Chengyu Zhu
d49391a555 Merge pull request #8798 from LindaYu17/setpolicy
add setpolicy function to kata-runtime tool
2024-03-08 06:31:57 +08:00
Dan Mihai
5398b6466c Merge pull request #9224 from 3u13r/sidecar-container
genpolicy: add restartPolicy to container struct
2024-03-07 12:59:55 -08:00
GabyCT
35d8f82232 Merge pull request #9242 from GabyCT/topic/enabldebugnerd
gha: Add collect artifacts step to nerdctl workflow
2024-03-07 13:34:40 -06:00
Wainer Moschetta
91998af173 Merge pull request #9114 from wainersm/ci_kbs_cli
CI: add KBS utilities for attestation tests
2024-03-07 16:34:03 -03:00
Dan Mihai
4c3d6fadc8 genpolicy: default env if image doesn't have env
Use containerd's default environment for container images that don't
specify the Env field.

Also, re-enable policy env variable verification, now that these
uncommon images are supported too.

Fixes: #9239

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 16:56:06 +00:00
Dan Mihai
b3a02d5e06 Merge pull request #9128 from microsoft/danmihai1/test-genpolicy
tests: k8s: auto-generated policy
2024-03-07 08:50:47 -08:00
Fabiano Fidêncio
8faab965a7 gh: Fix payload-after-push tags
We now expect the arch specific images to be tagged as
kata-containers-latest-${arch}.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-07 12:02:51 +00:00
Fabiano Fidêncio
eab78cf1ba release: Reword the extra notes added as part of the release
We're trying to keep just the bare minimum info, as we really would like
to not have the list of commits, and mainly the list of new
contributors, truncated from the release notes.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-07 12:02:51 +00:00
Fabiano Fidêncio
658fb6972b release: Ensure the release-type is passed to workflows
We need to ensure the release type is passed down to workflows,
otherwise we'll fail to get the correct release version for tagging the
daemonset images.

Fixes: #9064 - part III

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-07 12:02:51 +00:00
Alex Lyn
a0a50f5e52 Merge pull request #9191 from Apokleos/fix-kata-ctl-exec0
kata-ctl: Support using container short ID to enter guest.
2024-03-07 19:26:40 +08:00
Wainer dos Santos Moschetta
8ea9ac515e tests/k8s: update kbs repository
Recently confidential-containers/kbs repository was renamed to
confidential-containers/trustee. Github will automatically resolve the
old URL but we better adjust it in code.

The trustee repository will be cloned to $COCO_TRUSTEE_DIR. Adjusted
file paths and pushd/popd's to use $COCO_KBS_DIR
($COCO_TRUSTEE_DIR/kbs).

On versions.yaml changed from `coco-kbs` to `coco-trustee` as in the
future we might need other trustee components, so keeping it generic.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
c669567cd3 tests/k8s: add utils to set KBS policies
Added the kbs_set_resources_policy() function to set the KBS policy. Also added
kbs_set_allow_all_resources() and kbs_set_deny_all_resources() to set the
"allow all" and "deny all" policies, respectively.

Fixes #9056
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
6f0d38094d tests/k8s: add utils to set KBS resources
Added utility functions to manage resources in KBS:
- kbs_set_resource(), where the resource data is passed via argument
- kbs_set_resource_from_file(), where the resource data is found in a
  file

Fixes #9056
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
2a374422c5 tests/k8s: add function to install kbs-client
Added the kbs_install_cli function to build and install the kbs-client
executable if it is not present on the system.

Removed the stub from gha-run.sh; now the install kbs-client step in
.github/workflows/run-kata-deploy-tests-on-aks.yaml will actually
install the executable.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
4141875ffd ci/lib.sh: set GOPATH default value
Scripts sourcing ci/lib.sh need to set $GOPATH, otherwise they will
fail. This ensures that GOPATH is set to ${HOME}/go unless it is
already exported.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
e410aef4fa tests/k8s: add utils to get kbs service address
Added functions to return the service host, port or fully qualified
HTTP address, respectively, kbs_k8s_svc_host(), kbs_k8s_svc_port(),
and kbs_k8s_svc_http_addr().

Fixes #9056
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Leonard Cohnen
e30e8ab7dc genpolicy: add restartPolicy to container struct
This adds support for sidecar containers, introduced in Kubernetes 1.28.

Fixes: #9220

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-03-07 12:00:14 +01:00
Chungeun Choi
bad263f399 runtime: Replace deprecated module "io/ioutil" with "io"
This change updates the module import to use 'io' instead of the deprecated 'io/ioutil'

Fixes: #9166

Signed-off-by: Chungeun Choi <ce.choi@okestro.com>
2024-03-07 10:56:06 +00:00
Alex Lyn
ef9a38e551 shim-interface: add Copyright of AntGroup in file shim-interface.rs
Fixes: #9189

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-07 15:46:32 +08:00
Alex Lyn
2972a3a675 shim-interface: add UT for get_uds_with_sid
Fixes: #9189

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-07 15:45:44 +08:00
Alex Lyn
7145243bd3 kata-ctl: Support using container short ID to enter guest.
Fixes: #9189

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-07 15:44:47 +08:00
Linda Yu
bb77d2d7e6 docs: add docs on how to set policy by kata-runtime
Fixes: #8797

Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-03-07 15:00:23 +08:00
Linda Yu
1c5693be86 stream: repeat copybuffer if it is blocked by policy
copyBuffer returns and the streams are closed when an error occurs.
If the error contains "blocked by policy", it means the log output is
disabled by a policy with "ReadStreamRequest" and "WriteStreamRequest" set
to false. At that point we still want the real stream to keep working (without
being visible), because we might want to enable logging later for debugging,
so we repeat copyBuffer in this case to avoid the streams being closed.

Fixes: #8797

Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-03-07 15:00:23 +08:00
Linda Yu
eda419cb03 kata-runtime: add set policy function to kata-runtime
Logging/debugging information will probably be disabled in production
for security reasons, but we should still provide a way for
customers to get logging information at runtime. This PR implements a
setpolicy function in the kata-runtime tool; it can set the whole policy,
not only the logging parts.
setpolicy triggers remote attestation, which means that before setting a
policy at runtime, the user has to reconfigure the new policy hash in KBS/AS.

usage:  kata-runtime policy set policy.rego --sandbox-id XXXXXXXX

Fixes: #8797

Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-03-07 15:00:23 +08:00
Dan Mihai
c08b696d9e tests: k8s: k8s-shared-volume generated policy
Auto-generate policy for k8s-shared-volume.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
b24758fad8 tests: k8s: k8s-scale-nginx auto-generated policy
Auto-generate policy for k8s-scale-nginx.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
af9ac8d194 tests: k8s: k8s-replication auto-generated policy
Auto-generate policy for k8s-replication.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
56689c6800 tests: k8s: k8s-qos-pods auto-generated policy
Auto-generate policy for k8s-qos-pods.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
0179f53469 tests: k8s: k8s-parallel auto-generated policy
Auto-generate policy for k8s-parallel.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
73a8b61c2e Merge pull request #9243 from microsoft/danmihai1/genpolicy-unblock-ci
genpolicy: disable env variable verification
2024-03-06 21:44:18 -08:00
Dan Mihai
e61ef30a76 genpolicy: disable env variable verification
Disable env variable verification to unblock CI, until container
images that don't specify the Env variables are handled correctly
(see #9239).

Also, mark the image config Env field as optional, thus allowing
policy generation for these container images.

Fixes: #9240

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 01:59:18 +00:00
Gabriela Cervantes
94fdcda7f7 scripts: Add collect artifacts function in nerdctl gha run script
This PR adds the collect artifacts function in nerdctl gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-06 19:48:12 +00:00
Gabriela Cervantes
f902ee78d0 gha: Add collect artifacts step to nerdctl workflow
This PR adds the collect artifacts step to nerdctl workflow.

Fixes #9241

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-06 19:41:16 +00:00
GabyCT
640ed591bd Merge pull request #9219 from GabyCT/topic/fixkerneldoc
docs: Remove stale kernel information at README documentation
2024-03-06 10:24:31 -06:00
James O. D. Hunt
b1d4cbd9d1 utils: spell-checker: Fix grep warning
Fix the `grep(1)` warning caused by the unnecessary escaping of the
hash/sharp symbol.

Fixes: #9235.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-06 13:21:15 +00:00
James O. D. Hunt
5257bfa9a9 docs: install: Simplify instructions
Move the "build from source" and "manual installation" details to the
developer guide. This makes the installation landing page clearer for
users.

Fixes: #9234.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-06 13:14:03 +00:00
Ryan Savino
fdfc825bc4 Merge pull request #9174 from ryansavino/snp-qemu-stable-coco-tag
versions: SNP qemu updated to stable coco tagged version
2024-03-06 01:03:10 -06:00
GabyCT
83e39a206c Merge pull request #9223 from jodh-intel/tests-add-k3s-artifacts
tests: Add k3s artifacts
2024-03-05 13:37:21 -06:00
James O. D. Hunt
a67ed2f1c2 tests: Add k3s artifacts
The k3s distribution of k8s uses an embedded version of containerd and
configures it to log to a file, not the journal. Hence, although we
collect the journal as a test artifact, we also need to collect the
actual log files for containerd.

Also collect the k3s containerd config files to help with debugging.

Fixes: #9104.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-05 17:54:20 +00:00
GabyCT
9fab57acc8 Merge pull request #9217 from wainersm/revert_collect_artifacts
gha: export start_time to collect artifacts properly
2024-03-05 11:11:49 -06:00
Gabriela Cervantes
12be4cf828 docs: Remove stale kernel information at README documentation
This PR removes stale kernel information from the README documentation.

Fixes #9218

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-05 16:46:45 +00:00
Wainer dos Santos Moschetta
9ba5e3d2a8 gha: export start_time to collect artifacts properly
The jobs running on garm will collect journal information. The data gathered
is based on the time the tests started running. The $start_time is
exported in run_tests() and used in collect_artifacts(). However,
run_tests() and collect_artifacts() are called in different steps of the
workflow and environment variables aren't preserved between them,
i.e., $start_time exported in the first step is not available in the
subsequent ones.

To solve that issue, let's save $start_time in the file pointed to by
$GITHUB_ENV, which GitHub Actions uses to export variables. If $GITHUB_ENV is
empty then it is probably running locally outside of GitHub, so it won't
save the start time value.
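
A minimal sketch of that idea, assuming a hypothetical `save_start_time()` helper (the name and the date format are illustrative, not taken from the CI scripts):

```bash
# Sketch only: save_start_time is a hypothetical helper.
save_start_time() {
	start_time="$(date '+%Y-%m-%d %H:%M:%S')"
	export start_time

	# GITHUB_ENV is set by GitHub Actions; lines appended to that file
	# become environment variables in the following workflow steps.
	# When running locally the variable is empty and nothing is saved.
	if [ -n "${GITHUB_ENV:-}" ]; then
		echo "start_time=${start_time}" >> "$GITHUB_ENV"
	fi
}
```

A later step can then read `$start_time` from its environment as usual.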

Fixes #9217
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-05 12:15:20 -03:00
James O. D. Hunt
b761a80bd1 Merge pull request #9059 from jodh-intel/kata-manager-add-hypervisor-option
kata-manager: Allow hypervisor to be changed
2024-03-05 09:30:04 +00:00
Alex Lyn
bf5edc8e73 Merge pull request #9155 from Jimmy-Xu/fix-build-gpu-kernel
gpu: fix build guest kernel with gpu
2024-03-05 16:53:44 +08:00
Greg Kurz
0320198889 Merge pull request #9206 from lifupan/main
CI: fix the issue of ci failure on crio
2024-03-05 09:52:13 +01:00
Fupan Li
628f57aca0 Merge pull request #9193 from UiPath/fix/clh-dax
clh: Enable DAX for rootfs
2024-03-05 09:39:22 +08:00
Wainer Moschetta
38088a934b Merge pull request #9184 from wainersm/fix_kata_deploy_bats
tests/kata-deploy: fix checker for kata-deploy running
2024-03-04 20:50:37 -03:00
GabyCT
77d048da4d Merge pull request #9065 from wainersm/ci_install_kbs
CI: Install KBS on k8s for attestation tests
2024-03-04 16:59:01 -06:00
GabyCT
a4153f3b71 Merge pull request #9210 from GabyCT/topic/addtestreadme
docs: Add general README for tests section
2024-03-04 16:54:28 -06:00
Gabriela Cervantes
5d50262422 docs: Add general tests documentation in main README
This PR adds the general tests documentation in main README of the
kata containers repository.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-04 21:53:01 +00:00
Gabriela Cervantes
d5fa2bebd5 docs: Add general README for tests section
This PR adds general README documentation for the tests section
in the kata containers repository.

Fixes #9209

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-04 21:50:37 +00:00
GabyCT
4dea9019ab Merge pull request #9126 from GabyCT/topic/addartifactsk
gha: Storing artifacts for logs of k8s tests garm
2024-03-04 15:41:54 -06:00
Gabriela Cervantes
fc5e040d96 scripts: Apply general fixes to variables in gha-run script
This PR applies general fixes to variables in gha-run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-04 18:54:15 +00:00
James O. D. Hunt
7af892f8d8 docs: Update kata-manager docs for switching hypervisor
Add details to the README for `kata-manager` showing how to list
available hypervisor configs (packaged and local), and switch between
the configurations. Also, update the hypervisors page to show a lot more
detail about the hypervisor configurations, including the "short name"
used by `kata-manager` for switching hypervisor config.

> **Note:**
>
> These changes only apply to the current default golang runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 12:24:31 +00:00
James O. D. Hunt
4f6fef1f61 docs: Whitespace fix
Remove extraneous whitespace from hypervisors doc.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 12:18:05 +00:00
James O. D. Hunt
1ac3caf656 kata-manager: Allow hypervisor to be changed
Add new options to allow the configured hypervisor to be changed:

- `-L`: List available _packaged_ hypervisor config short names.
- `-e`: List available _local_ hypervisor config names.
- `-H <hypervisor>`: Install Kata then switch to the specified hypervisor.
- `-S <hypervisor>`: Switch to the specified hypervisor (by config short name [Errors if Kata not installed]).

For example, to install Kata and configure it to use Cloud Hypervisor
with the golang Kata runtime:

```bash
$ kata-manager.sh -H clh
```

To switch back to the default hypervisor:

```bash
$ kata-manager.sh -S default
```

To show details of the available packaged configs:

```bash
$ kata-manager.sh -L
```

To show details of the local configs:

```bash
$ kata-manager.sh -e
```

> **Notes:**
>
> - This change **only** applies to the current default (golang) Kata runtime.
>
> - Although this is mainly for users wishing to switch hypervisor (by
>   changing the Kata config file to another of the packaged config files
>   provided for specific hypervisors), strictly it allows users to change
>   to _any_ config file. For example, if the user has a config file called
>   `/etc/kata-containers/configuration-my-custom-config.toml`, they could
>   switch to this by running:
>
>   ```bash
>   $ kata-manager.sh -S my-custom-config
>   ```
>
> - The "config short names" are the hypervisor specific part of the configuration file name.
>   For example, the config short name for file `configuration-qemu.toml` is
>   `qemu` and the config short name for `configuration-clh.toml` is `clh`.
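
The naming rule in the last note can be sketched with plain parameter expansion. `config_short_name()` is a hypothetical helper for illustration, not a function from `kata-manager.sh`:

```bash
# Hypothetical helper: derive the "config short name" from a config file name.
config_short_name() {
	local file
	file="$(basename "$1")"
	file="${file#configuration-}"   # strip the common prefix
	echo "${file%.toml}"            # strip the extension
}

config_short_name /etc/kata-containers/configuration-my-custom-config.toml
```

For the path above this prints `my-custom-config`, the name passed to `-S` in the earlier example.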

Fixes: #8305.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 12:18:00 +00:00
James O. D. Hunt
0bb558c0b9 kata-manager: Fix symlink handling
The `configure_kata()` function modifies the configuration file to
enable debug. But it was doing this by calling `sed -i` which, by
default, creates a new _file_ from the `configuration.toml` symbolic
link. This defeated the point of the symbolic link which is supposed to
resolve to the local copy of the pristine config file, so we now use
the GNU sed(1) specific `--follow-symlinks` option to retain the
sym-link.
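
The difference can be reproduced in a throwaway directory. This is only a sketch: the file names and the `enable_debug` key are illustrative, and GNU sed is assumed:

```bash
# Demonstration in a temporary directory; names are illustrative only.
dir="$(mktemp -d)"
echo 'enable_debug = false' > "$dir/configuration-qemu.toml"
ln -s configuration-qemu.toml "$dir/configuration.toml"

# Without --follow-symlinks, GNU "sed -i" would replace the link with a
# regular file. With the option, the link target is edited instead.
sed -i --follow-symlinks 's/enable_debug = false/enable_debug = true/' \
	"$dir/configuration.toml"

# The link survives and the edit landed in the target file.
[ -L "$dir/configuration.toml" ] && echo "still a symlink"
cat "$dir/configuration-qemu.toml"
```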

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 11:15:39 +00:00
James O. D. Hunt
455637b30a kata-manager: Show message when checking file
Add an info message just before the archive file is checked. This keeps
the user informed about what is happening as it can take a few seconds
to perform the checks on slower systems.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 11:15:39 +00:00
James O. D. Hunt
ce350450e8 kata-manager: Sort options in usage
Ensure the usage statement lists all options in alphabetical order.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 11:15:39 +00:00
James O. D. Hunt
159d29665a kata-manager: Whitespace fixes
Remove extraneous whitespace.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-04 11:15:39 +00:00
Chao Wu
9f0eab904b Dragonball: fix test_signal_handler
a) Some unknown syscalls are triggered on the new GitHub virtual machine,
which breaks the make test process with SIGSYS after applying the
SeccompFilter. To fix this, we change the allowlist in this unit test
for SeccompFilter into a blocklist to avoid hitting the unknown syscalls.
b) The lazy static METRICS is not fully initialized in the unit test, which
may lead to unstable results for this UT.

fixes: #9207

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-03-04 16:27:27 +08:00
Chao Wu
253fe72435 Dragonball: fix test_handler_insert_region
The mmap region's start guest address was hard-coded, and a later check
verifies that this address is larger than or equal to mem_end (defaults
to host_phy_mem >> 1) in order to satisfy the requirement for DaxMemory.
Since the GitHub virtual machine's phy_mem is larger than that of the
previous CI machine, the hard-coded value no longer works. To fix this,
we change the address to mem_end in the unit test to avoid being
affected by host machine changes.

fixes: #9207
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-03-04 16:27:19 +08:00
Jimmy-Xu
5ada7329b8 gpu: fix build guest kernel with nvidia gpu
- enable CONFIG_MTRR,CONFIG_X86_PAT on x86_64 for nvidia gpu
- optimize -f of build-kernel.sh, clean old kernel path and config before setup
- add kernel 5.16.x

Fixes: #9143

Signed-off-by: Jimmy-Xu <xjimmyshcn@gmail.com>
2024-03-04 09:40:42 +08:00
Fupan Li
07e0cf1855 CI: fix the issue of ci failure on crio
PR #8760 tentatively tried to have the shim run in its own mount
namespace for the sake of improving isolation between the sandbox and
the host. Thus crio storage drivers shouldn't create a PRIVATE
bind mount on their home directory. Otherwise, the container's rootfs
mount wouldn't be propagated to kata runtime's mount namespace, and
kata runtime couldn't access the container's rootfs files.

So, when kata is used with crio, crio should set
skip_mount_home=true for its storage overlay.

Fixes: #9028

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-03-03 20:53:36 +08:00
Wainer dos Santos Moschetta
2c24977cb1 tests/k8s: allow to overwrite the cluster name
_print_cluster_name() creates a string based on information like the
pull request number and commit SHA. However, when you are developing the
scripts you might want to use an arbitrary name, so this introduces
the $AKS_NAME variable which, once exported, overrides the
generated name.
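
A sketch of the override, assuming the generated string is passed in as an argument. The real function builds it internally, and `print_cluster_name()` here is a simplified stand-in:

```bash
# Simplified stand-in: "$1" represents the PR/SHA-based generated name.
print_cluster_name() {
	local generated_name="$1"
	# Use $AKS_NAME when it is set and non-empty, else the generated name.
	echo "${AKS_NAME:-${generated_name}}"
}
```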

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta
5e4b7bbd04 tests/k8s: expose KBS service externally
Until this point the deployed KBS service is only reachable from within
the cluster. This introduces a generic mechanism to apply an Ingress
configuration to expose the service externally.

The first implemented ingress is for AKS. In case the HTTP application
routing isn't enabled in the cluster (this is required for ingress), an
add-on is applied.

Added the get_cluster_specific_dns_zone() and
enable_cluster_http_application_routing() helper functions
to gha-run-k8s-common.sh.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta
e1e0b94975 tests/k8s: introduce the CoCo kbs library
Introduce the tests/integration/kubernetes/confidential_kbs.sh library
that contains functions to manage the KBS on CI. Initially implemented
the kbs_k8s_deploy() and kbs_k8s_delete() functions to, respectively,
deploy and delete KBS on Kubernetes. Also hooked those functions in the
tests/integration/kubernetes/gha-run.sh script to follow the convention
of running commands from Github Workflows:

$ ./tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
$ ./tests/integration/kubernetes/gha-run.sh delete-coco-kbs

Fixes #9058
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:39:26 -03:00
Wainer dos Santos Moschetta
6a28c94d99 tests/k8s: add a kustomize installer
Kustomize has been used on some of our internal components (e.g.
kata-deploy) to manage k8s deployments. On CI the `sed` tool has been
used to edit kustomization.yaml files, but `kustomize` is
more suitable for that purpose. So in order to use that tool on CI
scripts in the future, this commit introduces the `install_kustomize()`
function that is going to download and install the binary in
/usr/local/bin in case it's not found on $PATH.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:39:26 -03:00
Xuewei Niu
daab76de36 Merge pull request #9201 from liubogithub/liubo/dev/panic_fix_3
katautils: fix panic on tracing.
2024-03-02 10:27:02 +08:00
GabyCT
4a0cfc4e3f Merge pull request #9199 from GabyCT/topic/enablecri
gha: Enable cri-containerd tests for cloud hypervisor runtime-rs
2024-03-01 12:23:16 -06:00
Steve Horsman
1ec33b8879 Merge pull request #9200 from wainersm/ci_install_kbs-timeout
gha: increase timeout of KBS steps
2024-03-01 16:00:21 +00:00
Gabriela Cervantes
7299dbdb43 gha: Store journalctl logs
This PR stores the journalctl logs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-01 15:17:20 +00:00
Gabriela Cervantes
342d3a320d gha: Add collect artifacts function in gha-run script
This PR adds the collect artifacts function in gha-run script for
the kubernetes tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-01 15:17:20 +00:00
Gabriela Cervantes
2070e3481e gha: Storing artifacts for logs of k8s tests garm
This PR helps to store the artifacts for different logs for k8s tests
on garm.

Fixes #9103

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-01 15:17:20 +00:00
Greg Kurz
df17bf95d5 Merge pull request #9169 from ldoktor/backport-ocp
ci.ocp: Backport service-up detection fixes
2024-03-01 16:09:55 +01:00
Greg Kurz
dc6bda19bf Merge pull request #9179 from gkurz/fix-k8s-sandbox-vcpus-allocation-check
tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29
2024-03-01 15:55:07 +01:00
Lukáš Doktor
6fffbaa190 ci.ocp: Backport service-up detection fixes
This backports the:

9060e930caf2d20f413df07778d3ab497493161c

    ci.ocp: Add debug output on HTTP service failure

    these logs are vital to analyze a setup failure.

a10a1e2c9cbc21afc1e80f22b0fb8634d27cbd8d

    ci.ocp: Improve the service-up detection

    waiting for the first response is not sufficient as OCP returns html
    page without error even when the route is not yet established describing
    the issue (why it doesn't reply with 500?). Waiting for the correct
    output should do better.

commits from the kata-containers/tests repo.

Fixes: #8653

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-01 12:04:20 +01:00
Alex Lyn
13a20957cb Merge pull request #9164 from Apokleos/directvol-csi-dockerfile
csi-kata-directvolume: add Dockerfile for building csi image
2024-03-01 18:12:19 +08:00
Alex Lyn
f69428a1e7 csi-kata-directvolume: add Dockerfile for building csi image
Fixes: #9163

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-01 10:41:51 +08:00
Liu Bo
b6f8355ea3 katautils: fix panic on tracing.
This fixes a panic on tracing on container exit.

The root cause is that global var needs to be set by "=" instead of
":=".

Fixes: #9102

Signed-off-by: Liu Bo <liub.liubo@gmail.com>
2024-02-29 18:40:23 -08:00
Wainer dos Santos Moschetta
24c163e6e1 tests/kata-deploy: fix checker for kata-deploy running
Currently, the check that kata-deploy is running assumes that the
daemonset scheduled at least one pod; however, it might not have, in which
case the kubectl wait command fails with "error: no matching resources found".

On CI I've observed that it fails intermittently. I suspect the service
account kata-deploy-sa takes a while to show up, so no kata-deploy pod is
scheduled in the meantime.

Changed the checker logic to use waitForProcess() to keep testing whether it
is already running, until it hits the timeout (still 10m).
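
The retry-until-timeout pattern looks roughly like this. `wait_until()` is a hypothetical stand-in written for illustration; it is not the `waitForProcess()` helper from the test suite and may differ from it in arguments and behaviour:

```bash
# Hypothetical polling helper.
# Usage: wait_until <timeout_secs> <sleep_secs> <command...>
wait_until() {
	local timeout="$1" interval="$2"
	shift 2
	# SECONDS is a bash builtin counting seconds since the shell started.
	local deadline=$(( SECONDS + timeout ))
	while (( SECONDS < deadline )); do
		# Succeed as soon as the command does.
		"$@" && return 0
		sleep "$interval"
	done
	return 1
}
```

The command is retried even when it fails outright, so a transient "no matching resources found" error no longer aborts the check.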

Fixes #9183
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-29 22:26:27 -03:00
Wainer dos Santos Moschetta
4410df7233 gha: increase timeout of KBS steps
The timeout of the step that deploys KBS in the run-k8s-tests-on-aks
workflow should be increased so that there is enough time for checking
that the service is healthy and exposed. Likewise for the step that builds
the kbs-client, which requires enough time to build the executable.

Fixes #9058
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-29 22:05:58 -03:00
Dan Mihai
11b603e5f1 Merge pull request #9139 from microsoft/saulparedes/genolicy_panic_subpath
genpolicy: panic when we see a volume mount subpath
2024-02-29 12:18:56 -08:00
Gabriela Cervantes
beb592b309 gha: Enable cri-containerd tests for cloud hypervisor runtime-rs
This PR enables the cri-containerd tests for cloud hypervisor runtime-rs.

Fixes #9198

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-29 20:18:16 +00:00
GabyCT
a4f5815f6b Merge pull request #9182 from GabyCT/topic/addclhcri
gha: Add cloud-hypervisor (runtime-rs) support to cri-containerd tests
2024-02-29 14:12:01 -06:00
Gabriela Cervantes
0f595cf15b gha: General variable fixes to gha-run script
This PR adds general variable fixes to gha-run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-29 18:15:27 +00:00
Alexandru Matei
6856e8f678 clh: Enable DAX for rootfs
Fixes: #9192

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-02-29 18:01:47 +02:00
Greg Kurz
f3442cdef9 tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29
Kubernetes v1.29 introduced a new `PodReadyToStartContainers` condition
that gets inserted at index 0 in the conditions array. This means that
the expected `PodCompleted` reason can now be either at index 0 with
kubernetes v1.28 and older or at index 1 starting with kubernetes v1.29.
This is fragile at best since `kubectl wait` doesn't allow combining
multiple checks. Also, checking the reason is dubious as it doesn't really
tell if the pods have actually completed or not.

Check the pod phase to be `Succeeded` instead, which guarantees that:

> All containers in the Pod have terminated in success, and will not
> be restarted.
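
A sketch of such a check using the JSONPath form of `kubectl wait --for`. The function name, the label selector argument and the default timeout are assumptions made for illustration:

```bash
# Sketch: wait for every pod matching a label selector to reach the
# Succeeded phase. Name and arguments are illustrative.
wait_pods_succeeded() {
	local selector="$1" timeout="${2:-120s}"
	kubectl wait --for=jsonpath='{.status.phase}'=Succeeded \
		--timeout="$timeout" pod --selector="$selector"
}
```

Because the phase is a single field, the check does not depend on the position of any entry in the conditions array.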

Fixes #9178

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-29 17:00:31 +01:00
Greg Kurz
f89120662d tests: k8s: Wait for all pods concurrently
A single invocation of `kubectl wait` can handle all pods.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-29 17:00:31 +01:00
Greg Kurz
58bc026656 Merge pull request #9180 from fidencio/topic/actually-add-the-pause-image-into-the-rootfs
rootfs: Fix PAUSE_IMAGE_TARBALL addition to the rootfs
2024-02-29 13:56:32 +01:00
Chengyu Zhu
c01ba58b3d Merge pull request #9176 from ChengyuZhu6/stale_doc
docs: renew stale link
2024-02-29 18:35:26 +08:00
Fabiano Fidêncio
1d2f7afd1f Merge pull request #9188 from fidencio/topic/releases-follow-up-II
releases: Second round of follow-up fixes
2024-02-29 10:59:36 +01:00
Fabiano Fidêncio
c9dfe49152 gha: payload: Fix env var declarations
This was introduced by a45988766c, but
didn't follow the correct format for the env declaration.

Fixes: #9064 - part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-29 10:52:49 +01:00
Fabiano Fidêncio
1c3a769822 gha: payload: Don't use concurrency for this job
We want all payloads to be built and published, regardless of whether
there's a new PR merged.

This will help people to easily trace / debug issues.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-29 10:52:45 +01:00
Fabiano Fidêncio
02af62b66c gha: payload: Stop generating payloads for the stable branches
We've decided to not maintain stable branches anymore, thus we can only
trigger this workflow for the `main` branch.

For more details, please, see:
https://github.com/kata-containers/kata-containers/issues/9064

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-29 10:42:25 +01:00
Fabiano Fidêncio
b4061a1c23 Merge pull request #9170 from fidencio/topic/releases-follow-up-I
release: Add the needed fixes for the release process
2024-02-29 10:36:20 +01:00
ChengyuZhu6
e5d3627794 docs: renew stale link
Renew the stale link "https://github.com/containerd/containerd/tree/main/runtime/v2" to
the latest "https://github.com/containerd/containerd/tree/main/core/runtime/v2".

Fixes: #9177

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-29 15:03:02 +08:00
Fabiano Fidêncio
0022474164 rootfs: Fix PAUSE_IMAGE_TARBALL addition to the rootfs
We were never passing the arguments to add the PAUSE_IMAGE to the
rootfs, leading to it never being present in the confidential image /
initrd.

Fixes: #9032 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 22:42:27 +01:00
GabyCT
aacbbde35d Merge pull request #9172 from GabyCT/topic/docpradvice
docs: Update Code PR advice document
2024-02-28 13:37:28 -06:00
Gabriela Cervantes
3cd319fcc2 scripts: General fixes to the gha-run script
This PR implements general fixes to the gha-run script for the
cri-containerd tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-28 19:32:51 +00:00
Gabriela Cervantes
5a498948c8 scripts: Skip cri-containerd in gha-run script
This PR skips the cri-containerd in gha-run script for cloud hypervisor
runtime-rs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-28 19:30:38 +00:00
Gabriela Cervantes
4bfb9c30e7 gha: Add cloud-hypervisor (runtime-rs) support to cri-containerd tests
This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs,
as part of the cri-containerd tests.

Fixes #9181

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-28 19:24:18 +00:00
Wainer Moschetta
c4b8270073 Merge pull request #9009 from wainersm/runk_bats
tests/runk: fix the "run ps command" flaky test
2024-02-28 15:58:36 -03:00
Wainer Moschetta
129ce84705 Merge pull request #9116 from wainersm/ci_install_kbs-workflow
gha: k8s: prepare AKS workflow to install the CoCo KBS
2024-02-28 14:43:41 -03:00
Gabriela Cervantes
ec1dde1d01 docs: Update Code PR advice document
This PR updates the Code PR advice document to use the proper
references now that we have moved the tests repository into the kata
containers repository.

Fixes #9171

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-28 16:14:22 +00:00
Ryan Savino
9e9dae8efb versions: SNP qemu updated to stable coco tagged version
New qemu fork of AMDESE created in confidential-containers project.
SNP qemu version now pointed to stable tag at:
https://github.com/confidential-containers/qemu/tree/amd-snp-202402240000

Fixes: #9173

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-02-28 09:28:14 -06:00
Fabiano Fidêncio
068d80a9cb docs: releases: Update link for the release actions
This allows users to go directly to the action page whenever a release
needs to be cut.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:56 +01:00
Fabiano Fidêncio
520cd90c43 release: Remove the "test-" from the release version
This is not needed anymore as we can run the tests from any branch, and
we can patch this locally before doing a test.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:56 +01:00
Fabiano Fidêncio
22b19d0637 release: Add a step to get the release tags
GitHub actions is fun and always willing to play tricks with us.  This
nice little kid decided that `echo "FOO=\"bar zaz\"" >> $GITHUB_ENV` is
not valid, and it simply breaks things in a way that is a pain to debug.

But hey, we take this path, and after doing so I realised that the
correct way to export that is `echo "FOO=bar zaz" >> $GITHUB_ENV`.

I know, this looks incorrect, but this fellow never stops surprising us.
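
A small sketch of the working form. The fallback to a temporary file is only there so it can be tried outside GitHub Actions, and the note about quotes reflects my understanding of the env file format rather than anything stated in this series:

```bash
# Outside GitHub Actions GITHUB_ENV is unset, so fall back to a temp file
# to keep the sketch runnable.
GITHUB_ENV="${GITHUB_ENV:-$(mktemp)}"

# One NAME=value pair per line. The file is not parsed by a shell, so the
# value needs no quoting even when it contains spaces (assumption: added
# quotes would end up as part of the value).
echo "FOO=bar zaz" >> "$GITHUB_ENV"
```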

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:56 +01:00
Fabiano Fidêncio
cdf1e4afde release: Fix typo in the arm arch
For some reason I'd changed arm64 to arm4 in a previous (already merged)
commit. :-/

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:56 +01:00
Fabiano Fidêncio
3db0630bc1 release: Add our own bits to the release notes
I'm getting here the most relevant parts of what we had as part of the
release-notes.sh script.  As the script will not be used anymore, it's
been removed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:56 +01:00
Fabiano Fidêncio
aaf38aca98 release: Fix typo in the _upload_libseccomp_tarball()
RELEASE_VERSIOB -> RELEASE_VERSION

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:56 +01:00
Fabiano Fidêncio
397167836b release: Fix yq installation
For some reason we need to force its installation in the GOPATH,
otherwise yq is not found.

Ideally we should switch to a packaged version of yq, but that's a topic
for another series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:55 +01:00
Fabiano Fidêncio
6915131adc release: Fix KATA_DEPLOY_{IMAGE_TAGS,REGISTRIES} declaration
Otherwise we may end up with an unbound variable.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:55 +01:00
Fabiano Fidêncio
757f958943 release: Adjust tags used to publish our daemonset
We need to adjust the tags as when this workflow ends up being called
from the release side, we'll receive "refs/tags/main"  as the
GITHUB_REF, and in that case we must use the release version.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:34:51 +01:00
Fabiano Fidêncio
d339366a16 release: Get the release version from our internal function
This is utterly counterintuitive, but if we change a file during the
GitHub Action, the checkout done for the next workflow won't have that
file updated, but rather the branch in its original state when the
workflow was created.

This makes us safe to always "calculate" the next release version from
the VERSION file at the time the workflow was triggered.

This requires us to have the release type exported for the whole
workflow.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:30:06 +01:00
Fabiano Fidêncio
8023d64b1a release: Adjust "needs" in the release workflow
Without those we'll end up running steps in parallel that should
actually wait for a previous step to be completed.

While here, let's also correct some of the "needs" that were waiting for
the wrong workflow to be finished.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:30:06 +01:00
Fabiano Fidêncio
d10b818de5 release: Add missing return to _check_required_env_var()
Otherwise none of the calls to this function will actually continue
after it's called.
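
A hypothetical reconstruction of the shape of such a check, to show why the early return matters. This is not the code from release.sh; in particular, it returns an error where the real helper may abort the script:

```bash
# Hypothetical sketch, not the actual release.sh implementation.
check_required_env_var() {
	local name="$1"
	# ${!name} is bash indirect expansion: the value of the variable
	# whose name is stored in $name.
	if [ -n "${!name:-}" ]; then
		# Without this return, execution would fall through to the
		# error path below even when the variable is set.
		return 0
	fi
	echo "\"${name}\" environment variable is required but was not set" >&2
	return 1
}
```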

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:30:06 +01:00
Fabiano Fidêncio
0aa82e7050 release: Add missing env vars to _check_required_env_var()
We missed doing this as part of
50011e89a0, but we also need to check for:
* RELEASE_VERSION
* GH_TOKEN
* ARCHITECTURE
* KATA_STATIC_TARBALL

While here, let's fix an ARCHITECURE -> ARCHITECTURE typo.

Fixes: #9064 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-28 12:30:05 +01:00
Chengyu Zhu
bb4c608b32 Merge pull request #9110 from ChengyuZhu6/agent_option
agent: Add all agent configuration options to README
2024-02-28 18:50:44 +08:00
Dan Mihai
352e2af5f0 Merge pull request #9153 from microsoft/danmihai1/clh-bootVM-timeout
runtime: clh: minimum 10s timeout for CreateVM + BootVM
2024-02-27 09:58:01 -08:00
Wainer dos Santos Moschetta
b44e0c4e7c gha: k8s: prepare AKS workflow to install the CoCo KBS
Changed the "run k8s tests on AKS" workflows to get the CoCo KBS
installed so that we can run attestation tests.

The plan is to run attestation tests only on a subset of non-TEE jobs
initially, so this commit restricts the KBS installation to the kata-qemu
configuration. At this point only stub commands are added
to tests/integration/kubernetes/gha-run.sh; they should be implemented
in a future commit.

Fixes #9058
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-27 13:51:15 -03:00
Wainer Moschetta
6186410e35 Merge pull request #8949 from wainersm/tests_nydus
tests/nydus: refactor the teardown()
2024-02-27 09:52:44 -03:00
ChengyuZhu6
731c490ded agent: Add all agent configuration options to README
Add all agent configuration options to README so that users can more easily understand
what these options do and how to configure them at runtime.

Fixes: #9109

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-27 17:35:19 +08:00
Fabiano Fidêncio
4aa40f1bbb Merge pull request #9146 from fidencio/topic/releases
release: Update everything in this repo related to the release and its process
2024-02-27 10:30:49 +01:00
Fabiano Fidêncio
111bb3ec66 release: Add "test-" into the release name
This commit should be merged as it is now, then we trigger a test
release, fix whatever has to be fixed, and drop it as soon as we know
our workflows are working as expected.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:03 +01:00
Fabiano Fidêncio
d69766c0b2 docs: Update the release process
Now that we've simplified it by quite a lot, let's update the
documentation accordingly.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:03 +01:00
Fabiano Fidêncio
a85481110a releases: Remove scripts that won't be used anymore
Those are not needed anymore as we're automating our release process
around GitHub actions.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:03 +01:00
Fabiano Fidêncio
e714c37521 gha: Remove workflows related to backporting stuff
We're not doing backports anymore, as we're getting rid of the stable
branches in favour of having a better release cadence from the main
branch.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
3229c777e7 kata-deploy: Remove "stable" yamls
As we're not maintaining a stable branch anymore, let's get rid of the
kata-deploy stable pieces.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
008293f015 gha: Add release-{major,minor} workflows
Those will allow us to cut a release just by a single click, instead of
the current process we have.

Fixes: #9064 -- part I

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
f9f04dca2b gha: release: Update the workflow
The release workflow is now updated to be a `workflow_call`, and it
includes the steps that had to be manually done in the past, such as
updating the needed files and creating the release itself.

While on this, the kata-deploy multiarch manifest tags have been updated
to match the new release scheme.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
f0675a163a release: Add _next_release_version()
This function returns the version of the next release (the one about to
be cut), and it'll be used as part of our new workflow that will take
care of the release.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
4675364d8d release: Add _update_version_file() function
Let's add a function that will be responsible for bumping the project's
version in the VERSION file, and push it to the branch as part of the
release process that will be introduced.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
a99f9026e1 release: Add _create_new_release()
This is a helper function that will be used to create a new release as
part of our release process workflow (which will still be modified).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
fd699625fe release: Add _upload_libseccomp_tarball()
As the name of the function says, it's responsible for uploading the
libseccomp source tarballs as part of our release process.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
d517fa54ac release: Add _upload_vendored_code_tarball()
As hinted by the name of the function, this is used to generate and
upload the vendored code we have as its own tarball.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
94b30fcb14 release: Add _upload_versions_yaml_file()
As the name says, this function will be used to upload the versions.yaml
file during a given release process of the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
50011e89a0 release: Add _upload_kata_static_tarball
This function, as its name says, will be used to upload the
kata-static.tar.xz tarballs generated during the release process.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:02 +01:00
Fabiano Fidêncio
a45988766c release: Add _publish_multiarch_manifest()
This function, as its name says, will be used to publish multiarch
manifests for the Kata Containers CI and Kata Containers releases.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:01 +01:00
Fabiano Fidêncio
fb2ef32c04 release: Introduce the release.sh helper
For now this script does nothing, but we're introducing it in order to
reduce the diffs for the next commits in this series.

My intention is to have as much as possible related to the release as
part of this helper script, and it'll be populated function by function
while replacing content that's "hard coded" (and duplicated) on
different GitHub actions.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-27 08:34:01 +01:00
GabyCT
1a6c378d26 Merge pull request #9161 from GabyCT/topic/testsreadme
docs: Update link for tests in README
2024-02-26 14:50:46 -06:00
Gabriela Cervantes
94615a4fd4 docs: Update link for tests in README
This PR updates the link for the tests in README for Kata Containers.

Fixes #9160

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-26 15:43:33 +00:00
Wainer dos Santos Moschetta
0f8c36d990 tests/nydus: refactor the teardown()
This refactors the teardown() of tests/integration/nydus/nydus_tests.sh:

* Moved boilerplate code that kills processes into a loop;
* Doesn't leave teardown() if a process failed to get killed, so that
  other clean up routines are run;
* Checks that the pid exists before attempting to kill the process, which
  avoids this misleading message:
```
Usage:
 kill [options] <pid> [...]

Options:
 <pid> [...]            send signal to every <pid> listed
 -<signal>, -s, --signal <signal>
                        specify the <signal> to be sent
 -q, --queue <value>    integer value to be sent with the signal
 -l, --list=[<signal>]  list all signal names, or convert one to a name
 -L, --table            list all signal names in a nice table

 -h, --help     display this help and exit
 -V, --version  output version information and exit

For more details see kill(1).
```
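
The three points above can be sketched roughly as follows. This is a
minimal illustration, not the code from nydus_tests.sh: the helper name
kill_if_running is hypothetical, and the use of SIGKILL is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: stop every pid passed in, without aborting on failure.
kill_if_running() {
    local pid rc=0
    for pid in "$@"; do
        # Skip empty values: calling `kill` without a pid prints the usage text.
        [ -n "${pid}" ] || continue
        # `kill -0` sends no signal; it only probes whether the pid exists.
        if kill -0 "${pid}" 2>/dev/null; then
            # Record the failure but keep going so later clean-up still runs.
            kill -9 "${pid}" 2>/dev/null || rc=1
        fi
    done
    return "${rc}"
}
```

A teardown() could call it with every pid it tracks, for example
`kill_if_running "${pid_a}" "${pid_b}"` (both variable names are made up),
and carry on with the remaining clean-up regardless of the return value.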

Fixes #8948
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:21:43 -03:00
Wainer dos Santos Moschetta
0f0ce9a81b tests/runk: replace the busybox image
It's recommended to avoid images from docker.io to avoid errors related
to hitting the pull limits, which happens mostly on bare-metal machines.

So this replaces docker.io's busybox with
quay.io/prometheus/busybox.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:11:05 -03:00
Wainer dos Santos Moschetta
bba8b5b2b4 tests/runk: fix flaky test
The "run ps command" test has failed once in a while because it doesn't
wait for the sh command to start within the container; consequently `ps`
won't report the number of lines expected.
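
One common way to remove this kind of race is to poll for the expected
state instead of checking it once. The sketch below assumes nothing about
the actual fix in the runk tests; wait_until is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 0; i < attempts; i++)); do
        # Stop polling as soon as the condition holds.
        "$@" && return 0
        sleep "${delay}"
    done
    # The condition never became true within the allowed attempts.
    return 1
}
```

A test could then wait for the process to appear before counting the `ps`
output lines, for instance `wait_until 10 1 some_check_for_sh`, where
`some_check_for_sh` stands for whatever probe the test suite provides.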

Fixes #8975
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta
28a63070f7 gha: fix step name in run-runk-tests
Likely copied from the tracing workflow by mistake.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta
8a606eb94d tests/runk: convert to bats
Migrated runk tests from pure shell script to bats to be consistent with
other test suites.

The install_dependencies() will install the bats tool locally.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:23 -03:00
Xuewei Niu
bb5e33b33a Merge pull request #9100 from littlejawa/fix_5738_metrics_memory
runtime: remove kata_shim_netdev metric
2024-02-26 19:01:21 +08:00
James O. D. Hunt
0ea30f44cf Merge pull request #9076 from jodh-intel/add-survey-link-to-release-notes
packaging: release notes: Don't show shortlist by default, and add survey link
2024-02-26 10:25:19 +00:00
Steve Horsman
483ecbadf0 Merge pull request #9142 from ChengyuZhu6/protoc
build-checks: Install protoc in the ci environments
2024-02-26 09:52:31 +00:00
Dan Mihai
f4509b806b runtime: clh: minimum 10s timeout for CreateVM + BootVM
Relax the timeout for calling CLH's CreateVM + BootVM APIs. When
hitting the older 1s timeout, killing a half-booted Guest and
retrying the same boot sequence could have been wasteful and could have
resulted in unstable CI testing on slower Hosts.

Fixes: #9152

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-24 19:15:57 +00:00
GabyCT
4f3c83cd12 Merge pull request #9115 from GabyCT/topic/adddief
scripts: Add an enhanced die function
2024-02-23 12:03:02 -06:00
Saul Paredes
9b7bd376eb genpolicy: panic when we see a volume mount subpath
Based on https://github.com/kata-containers/runtime/issues/2812

Fixes: #9145

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-02-23 09:56:51 -08:00
James O. D. Hunt
8c72abe38d packaging: Add link to survey in release notes
Add a link in the release notes to the Kata Container survey, to
advertise it, and hopefully encourage users to take the survey.

Fixes: #9074.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-23 09:57:52 +00:00
James O. D. Hunt
0391c0de82 packaging: Add twistie to release notes shortlog
Add a "twistie" / arrow (`▶`) that the user can click on to see the full
list of commits _if they want to_.

This way, the release notes become easier to read and we can display
information below the shortlog which would (probably) normally not be
seen due to the huge long list of commits.

Fixes: #9075.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-23 09:57:52 +00:00
ChengyuZhu6
3cc55ff8af build-checks: Install protoc in the ci environments
To test PR #8484 for pulling image in the guest with image-rs, the compilation process for the kata-agent relies on
protoc:
https://github.com/kata-containers/kata-containers/actions/runs/8016317290/job/21898040849?pr=8484
https://github.com/kata-containers/kata-containers/actions/runs/8016534530/job/21898654435?pr=8484

Fixes: #9141

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-23 17:38:13 +08:00
Xuewei Niu
89c76d7d8d Merge pull request #9125 from gkurz/fix-agent-cgroup-ns
agent: Run container workload in its own cgroup namespace (cgroup v2 guest only)
2024-02-23 10:40:17 +08:00
Steve Horsman
e342a9adc4 Merge pull request #9119 from ChengyuZhu6/pause-confidential
kata-deploy: Add pause image to confidential rootfs
2024-02-22 17:10:55 +00:00
Steve Horsman
531dcd2f25 Merge pull request #9132 from ChengyuZhu6/nydus-snapshotter-version
gha: bump nydus snapshotter version to v0.13.8
2024-02-22 17:10:42 +00:00
Steve Horsman
dfa6e932bb Merge pull request #9122 from ChengyuZhu6/snapshotter-clean
gha: try to cleanup nydus snapshotter before deploying it
2024-02-22 13:30:04 +00:00
Julien Ropé
1c306fe4a6 runtime-rs: stop reporting net dev metrics for the shim
For consistency with the go runtime.
As the shim itself is not using the network (all its communication with
other processes is done with local unix sockets), there is no reason to
keep gathering and reporting shim-specific network metrics.
Actual network usage of the kata containers can be found from the existing
agent network metrics (kata_guest_netdev_stat).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-02-22 14:00:00 +01:00
Julien Ropé
9de65707ca runtime: stop reporting net dev metrics for the shim
As part of the shim network metrics, the shim is reporting network interfaces
from the host with no namespace isolation - this gives insight into interfaces
not tied to the kata containers, and causes an increase in resource usage for
kata metrics.

As the shim itself is not using the network (all its communication with
other processes is done with local unix sockets), there is no reason to
keep gathering and reporting shim-specific network metrics.
Actual network usage of the kata containers can be found from the existing
hypervisor network metrics (kata_hypervisor_netdev) and from the agent
network metrics (kata_guest_netdev_stat).

Fixes: #5738

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-02-22 14:00:00 +01:00
ChengyuZhu6
8ab3894dc5 gha: try to cleanup nydus snapshotter before deploying it
CI failed to deploy nydus snapshotter because it was not cleaned up last time.
So we can try to cleanup nydus snapshotter before deploying it.

Fixes: #9121

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-22 18:51:14 +08:00
Alex Lyn
5d3ae360ed Merge pull request #9130 from Apokleos/bugfix-dragonball-invalidOperation
runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.
2024-02-22 17:47:09 +08:00
ChengyuZhu6
f16f709a5e kata-deploy: Add pause image to confidential rootfs
For confidential containers, the pause image needs to be installed in
the rootfs.

Fixes: #9118

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-22 15:41:16 +08:00
ChengyuZhu6
d8db3fb17f gha: bump nydus snapshotter version to v0.13.8
Bump nydus snapshotter version to v0.13.8 to fix the bug in v0.13.7: https://github.com/containerd/nydus-snapshotter/pull/582

Fixes: #9131

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-22 15:35:08 +08:00
Alex Lyn
014e0f4e46 runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.
We need to initialize pci_hotplug_enabled to true before we do GPU
passthrough with runtime-rs/dragonball. Otherwise it fails with error
`InvalidOperation`.

Fixes: #9129

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-02-22 10:22:32 +08:00
Dan Mihai
58fbb9f6ec Merge pull request #9073 from microsoft/danmihai1/test-genpolicy3
tests: k8s: generated policy for additional tests
2024-02-21 14:11:51 -08:00
Dan Mihai
b3c3f992ab tests: k8s: common clean-up on teardown
teardown() gets executed after each test case, so there is no need to
clean-up before teardown.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
9c164698d3 tests: k8s: k8s-optional-empty-configmap policy
Auto-generate policy for k8s-optional-empty-configmap.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
74a52c6d25 tests: k8s: k8s-oom.bats auto-generated policy
Auto-generate policy for k8s-oom.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
26a77d67f4 tests: k8s: k8s-number-cpus auto-generated policy
Auto-generate policy for k8s-number-cpus.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
9cbdce15fd tests: k8s: k8s-memory.bats auto-generated policy
Auto-generate policy for k8s-memory.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
40209cc0b7 tests: k8s: k8s-limit-range auto-generated policy
Auto-generate policy for k8s-limit-range.bats.

Also, fix teardown() namespace.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
df3c0318c6 tests: k8s: add set_namespace_to_policy_settings
Add set_namespace_to_policy_settings() for changing the pod namespace
in genpolicy settings.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
6e14ce93c9 tests: k8s-kill-all-process-in-container policy
Auto-generate policy for k8s-kill-all-process-in-container.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
fad7ba0aea tests: k8s: k8s-job.bats auto-generated policy
Auto-generate policy for k8s-job.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
41c2bcbdc5 tests: k8s: k8s-file-volume auto-generated policy
Auto-generate policy for k8s-file-volume.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
d84f50db5b genpolicy: fix typo in policy logging
Improve logging, for easier debugging.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
81e641814f tests: k8s: k8s-cpu-ns auto-generated policy
Auto-generate policy for k8s-cpu-ns.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
bc6d3fc238 tests: k8s: k8s-env.bats auto-generated policy
Auto-generate policy for k8s-env.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
0a4fc071ac tests: k8s: k8s-custom-dns auto-generated policy
Auto-generate policy for k8s-custom-dns.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
f693f49e92 tests: k8s: k8s-credentials-secrets policy
Auto-generate policy for k8s-credentials-secrets.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
d3d27bbb5b tests: k8s: k8s-configmap auto-generated policy
Auto-generate policy for k8s-configmap.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
b318535536 tests: k8s: auto-generate k8s-caps.bats policy
Auto-generated policy for k8s-caps.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Greg Kurz
600b951afd agent: Run container workload in its own cgroup namespace
When cgroup v2 is in use, a container should only see its part of the
unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created
at the OS level. Similarly, `/proc/self/cgroup` inside the container
should display `0::/`, rather than a full path such as:

0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope

What is needed here is isolation from the OS. Do that by running the
container in its own cgroup namespace. This matches what runc and
other non VM based runtimes do.
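
The expected view from inside the container can be checked with a few
lines of shell. This is only a sketch assuming a cgroup v2 guest; the
function name is hypothetical and the optional path argument exists purely
so the check can be exercised against a sample file.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeeds when the cgroup file shows only "0::/",
# i.e. the process sees itself at the root of its own cgroup namespace.
sees_cgroup_ns_root() {
    # Defaults to the calling process; a path can be passed for testing.
    local file="${1:-/proc/self/cgroup}"
    [ "$(cat "${file}")" = "0::/" ]
}
```

Run inside a container workload, `sees_cgroup_ns_root` would be expected to
succeed after this change and to fail when the full host-side path leaks
through.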

Fixes #9124

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-21 13:14:13 +01:00
Greg Kurz
14886c7b32 agent: lint code
Run cargo-clippy to reduce noise in actual functional changes.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-21 13:14:13 +01:00
ChengyuZhu6
cddaf2ce97 kata-deploy: Remove specific kernel/initrd/image leftovers in Makefile
Remove specific kernel/initrd/image leftovers in Makefile of
local-build, which is the part of #9026.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-21 18:24:10 +08:00
Chelsea Mafrica
241a56989a Merge pull request #9090 from GabyCT/topic/pulldockerimage
gha: docker: Pull docker image as part of the dependencies
2024-02-20 14:28:53 -08:00
GabyCT
ea78013c7e Merge pull request #9079 from GabyCT/topic/removecilink
docs: Update CI link into the README
2024-02-20 14:11:13 -06:00
GabyCT
64c09fe6c5 Merge pull request #9088 from GabyCT/topic/fixnydus
gha: nydus: Fix indentation in gha run script
2024-02-20 14:09:54 -06:00
Gabriela Cervantes
ff8a6fa9ef scripts: Add error script
This PR adds the error script to display the error message with
much more information to help debugging.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-20 18:30:03 +00:00
Gabriela Cervantes
43a46d5a6b scripts: Add an enhanced die function
This PR adds an enhanced die function in order to dump more information
in a yaml format that will help with the debugging.
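
As a rough idea of what such a function can look like, here is a minimal
sketch. The YAML field names and layout are assumptions made for
illustration and are not taken from the script added by this commit.

```bash
#!/usr/bin/env bash
# Hypothetical enhanced die(): prints a small YAML document to stderr,
# then exits with status 1. Field names are illustrative only.
die() {
    local msg="$*"
    local line func file
    # `caller 0` reports "<line> <function> <file>" for the code that
    # called die(), which is the location worth showing when debugging.
    read -r line func file <<< "$(caller 0)"
    {
        printf -- '---\n'
        printf 'error:\n'
        printf '  message: "%s"\n' "${msg}"
        printf '  file: "%s"\n' "${file}"
        printf '  function: "%s"\n' "${func}"
        printf '  line: %s\n' "${line}"
    } >&2
    exit 1
}
```

Note that the message is not escaped, so a message containing double
quotes would produce invalid YAML; a real implementation would need to
handle that.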

Fixes #9105

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-20 18:27:44 +00:00
Archana Shinde
6d84fe3a37 Merge pull request #8647 from amshinde/cleanup-network
Cleanup network to make sure physical interfaces are restores back to original host driver.
2024-02-20 08:59:53 -08:00
Archana Shinde
6d38fa1530 network: Try removing as many changes as possible during network cleanup
In case an error is encountered while removing a network endpoint during
network cleanup, we currently return immediately with the error.
With this change, in case of error we simply log the error and proceed
towards removing the next endpoint. With this, we can cleanup the
network changes made by the shim as much as possible.
This is especially important when multiple interfaces are passed to the
network namespace using a network plugin like multus.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-20 06:08:05 -08:00
Archana Shinde
b005cda689 network: Move up defer block tp cleanup network
Move the defer for cleaning up network before the call to add network.
This way any change made by add network is reverted in case of
failure. This is particularly important for physical network interfaces
as with this step we make sure that driver for the physical interface is
reverted back to the original host driver. Without this the physical
network interface will remain bound to vfio.

Fixes: #8646

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-20 06:06:42 -08:00
Ryan Savino
61ce7455c5 Merge pull request #9086 from niteeshkd/nd_snp_upm
packaging: qemu-snp-experimental: support host kernel with gmem
2024-02-19 10:50:13 -06:00
Fabiano Fidêncio
79dc6e95d1 Merge pull request #9108 from fidencio/topic/ci-k8s-fix-wrong-logic-on-confidential-tests
ci: k8s: Fix checks used to skip confidential tests
2024-02-19 12:49:57 +01:00
Xuewei Niu
f9307f6852 Merge pull request #9112 from ChengyuZhu6/vendor
runtime: fix checksum mismatch error in `make vendor`
2024-02-19 10:54:38 +08:00
ChengyuZhu6
96c297cb37 runtime: fix checksum mismatch error in make vendor
Fix checksum mismatch error in `make vendor`.

Fixes: #9111

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-18 22:22:38 +08:00
Fabiano Fidêncio
3468ac3b6e ci: k8s: Fix checks used to skip confidential tests
This has been introduced by 53bc4a432b,
where the condition was changed.

The correct condition is:
* If the list of supported tees does not contain the kata hypervisor
  and the list of supported non tees does not contain the kata
  hypervisor.

The error is that we were checking whether kata-hypervisor would contain
the list of supported tees, and that would almost always be false
(except in the case where the list had one and only one element).
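
The difference between the two directions of the check can be sketched as
below. The function names and the whitespace-separated list format are
assumptions for illustration; the real test helpers may store and compare
the lists differently.

```bash
#!/usr/bin/env bash
# Return 0 when the whitespace-separated list ($1) contains the item ($2).
# The padding with spaces makes "qemu" not match inside "qemu-tdx".
list_contains() {
    local list="$1" item="$2"
    [[ " ${list} " == *" ${item} "* ]]
}

# Hypothetical skip check: skip only when the hypervisor is in neither list.
# The buggy direction would test whether the hypervisor name contains the
# whole list, which can only hold for a single-element list.
should_skip() {
    local hypervisor="$1" tees="$2" non_tees="$3"
    ! list_contains "${tees}" "${hypervisor}" &&
        ! list_contains "${non_tees}" "${hypervisor}"
}
```

With a TEE list of "qemu-sev qemu-snp qemu-tdx" and a non-TEE list of
"qemu" (example values only), `should_skip clh ...` succeeds, while
`should_skip qemu-tdx ...` and `should_skip qemu ...` do not.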

Fixes: #9055 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-18 10:10:45 +01:00
Niteesh Dubey
0538bbfc49 packaging: qemu-snp-experimental: support host kernel with gmem
This is required to allow creation of SNP coco on host kernel
(e.g. https://github.com/AMDESE/linux, branch: snp-host-latest)
supporting guest private memory for SNP using gmem.

Note: This qemu does not work if the host kernel does not support
gmem/UPM.

Fixes: #9092

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-02-15 16:33:46 +00:00
Wainer Moschetta
db744aa8d2 Merge pull request #9023 from ldoktor/webhook-path
tools.kata-webhook: Fix lib path
2024-02-15 12:34:01 -03:00
Fabiano Fidêncio
28b4e5ce51 Merge pull request #9099 from BbolroC/skip-k8s-sandbox-vcpus-allocation-s390x
CI|k8s: Skip vcpu allocation test for s390x
2024-02-15 16:05:18 +01:00
James O. D. Hunt
d1513b2030 Merge pull request #9091 from jodh-intel/packaging-add-kata-manager-script
packaging: Add the kata manager script
2024-02-15 13:08:36 +00:00
Hyounggyu Choi
8b3f7f353d CI|k8s: Skip vcpu allocation test for s390x
A test `vcpu allocation k8s test` exhibits different behavior on s390x.
For more details, please refer to issue #9093.
This commit is to make the test skipped until the issue is resolved on
the platform.

Fixes: #9093

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-15 12:26:35 +01:00
Fabiano Fidêncio
9178541dfb Merge pull request #9098 from fidencio/topic/runtime-update-runc-to-v1.1.12
runtime: Update runc to v1.1.12
2024-02-15 09:29:10 +01:00
Fabiano Fidêncio
eea4277fbf runtime: Update runc to v1.1.12
Although we don't seem to be affected by
https://nvd.nist.gov/vuln/detail/CVE-2024-21626, we vendor and use the
runc package in a few different places of our code, and we better update
the package to its latest release.

Fixes: #9097

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-14 23:13:39 +01:00
James O. D. Hunt
8c51e02f55 packaging: Add the kata manager script
Add `kata-manager.sh` to the release packages.

Fixes: #9066.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-14 17:44:42 +00:00
James O. D. Hunt
e49aeec97f packaging: Use variable for default binary permissions
Create a variable for the default binary permissions.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-14 17:44:35 +00:00
James O. D. Hunt
cc2d96671f packaging: Remove extraneous whitespace
Remove some unnecessary whitespace from a couple of `kata-deploy` files.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-14 17:44:08 +00:00
Fabiano Fidêncio
c95c37d2ab Merge pull request #9026 from fidencio/topic/packaging-remove-tee-specific-leftovers
packaging: Remove leftovers from the transition from TEE specific kernel / initrd / image to the "confidential" ones
2024-02-13 22:14:26 +01:00
GabyCT
9cf343779f Merge pull request #9062 from GabyCT/topic/nonteet
tests: Add ability to run non-TEE environments
2024-02-13 14:28:07 -06:00
Fabiano Fidêncio
74c8d243ea versions: Remove TEE specific kernels
We've switched to using the confidential one, instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 19:07:33 +01:00
Fabiano Fidêncio
adbe24c283 versions: Remove non-used tdx / sev image and initrd entries
We've switched to using the confidential ones, instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 19:07:33 +01:00
Fabiano Fidêncio
6c3338271b packaging: kernel: Remove sev/snp/tdx specific stuff
Now we're using a "confidential" image that has support for all of
those.

Fixes: #9010 -- part II
       #8982 -- part II
       #8978 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 19:07:33 +01:00
Gabriela Cervantes
598c77409a gha: docker: Pull docker image as part of the dependencies
This PR pulls the docker image needed for the test as part of the dependencies
in order to avoid timeout failures, which happen mainly because the image was
not properly downloaded and the test is unable to find it.

Fixes #9089

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-13 17:48:31 +00:00
Gabriela Cervantes
53bc4a432b tests: Add ability to run non-TEE environments
This PR adds the ability to run k8s confidential tests in a
non-TEE environment.

Fixes #9055

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-13 17:27:55 +00:00
Fabiano Fidêncio
14f4480f12 packaging: Remove specific TEEs image / initrd leftovers
Let's remove the targets as those are not built anymore as part of our
CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 18:03:12 +01:00
Fabiano Fidêncio
0c761f14b3 packaging: Remove specific TEEs kernel leftovers
Let's remove the targets as those are not built anymore as part of our
CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 18:03:11 +01:00
Fabiano Fidêncio
28488f0790 Merge pull request #9082 from fidencio/topic/cleanup-kata-deploy-leftovers-before-start-a-test
tests: Remove kata-deploy-tdx test and ensure kata-deploy is always cleaned up before starting the tests
2024-02-13 18:01:16 +01:00
Gabriela Cervantes
54d1f34650 gha: nydus: Fix indentation in gha run script
This PR fixes the indentation in gha run script for nydus.

Fixes #9087

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-13 16:53:28 +00:00
Fabiano Fidêncio
a867e19da1 gha: tdx: Stop running kata-deploy tests on TDX
We only have one TDX machine, let's not make it busier than needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 14:14:57 +01:00
Fabiano Fidêncio
3877a9f49a ci: Clean up kata-deploy ds before starting the tests
This will ensure no leftovers are in the node, which has been causing the
TDX CI to fail every now and then.

Fixes: #9081

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 14:10:44 +01:00
Fabiano Fidêncio
8fe7349d3e Merge pull request #9080 from fidencio/topic/dont-add-the-pause-image-to-the-released-tarball
release: Don't ship the pause-image / coco-guest-components as part of the release artefacts
2024-02-13 12:34:29 +01:00
Fabiano Fidêncio
443a5b8327 release: Don't ship the coco-guest-components
In the same way that it doesn't make sense to ship the pause-image, it also
doesn't make sense to ship the coco-guest-components itself as part of a
release artefact.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 09:47:26 +01:00
Fabiano Fidêncio
0462b33a5b release: Don't ship the pause-image
It doesn't make sense to ship the pause-image itself as a release
artefact.

The reason we build it and cache it is in order to use it inside the
rootfs, and that's it; there's no need to ship it as part of the
release at all.

Fixes: #9032 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 09:45:50 +01:00
GabyCT
00be9ae872 Merge pull request #9070 from microsoft/danmihai1/debug-containers
tests: k8s: avoid deleting unrelated pods
2024-02-12 15:24:15 -06:00
Gabriela Cervantes
69b325a31c docs: Update CI link into the README
This PR updates the CI link in the README as currently we are
using GHA workflows and they are now part of the kata containers
repository.

Fixes #9078

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-12 20:53:25 +00:00
Greg Kurz
532567bfe9 Merge pull request #8936 from fidencio/topic/fix-cri-o-ci
tests: cri-o: Use packages from pkgs.k8s.io
2024-02-12 10:04:53 +01:00
Dan Mihai
42d13a0f33 Merge pull request #9068 from microsoft/danmihai1/dockerfile-linux-musl-gcc
tools: avoid rootfs-image build "ln -s" error
2024-02-11 18:02:53 -08:00
Greg Kurz
d7afd31fd4 Merge pull request #8455 from BbolroC/runtime-rs-qemu-config
runtime-rs: Add a new config option for QEMU
2024-02-10 08:48:23 +01:00
Dan Mihai
a21ca9b7c9 tests: k8s: avoid deleting unrelated pods
Delete the debugger pod created during the test, rather than already
existing debugger pods.

Also, send the output of "kubectl delete" to stderr, just in case it's
useful for debugging.

Fixes: #9069

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-09 22:48:41 +00:00
Dan Mihai
a054462eb7 Merge pull request #9051 from microsoft/danmihai1/k8s-copy-file
tests: k8s: k8s-copy-file auto-generated policy
2024-02-09 12:30:49 -08:00
Hyounggyu Choi
05c4c8055c runtime-rs: Configure argument replacement for QEMU in Makefile
Last but not least, all placeholders for argument replacement
should be configured to generate a configuration file when `QEMUCMD`
is defined. This enriches those variables.

Additionally, this involves creating a symbolic link to `configuration-qemu.toml`
if QEMU is defined as the default hypervisor.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-09 19:31:20 +01:00
Dan Mihai
fcd005774d tools: avoid rootfs-image build "ln -s" error
Avoid error when building for amd64 using:

USE_CACHE=no AGENT_POLICY=yes DEBUG=1 \
tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh \
--build=rootfs-image

Fixes: #9067

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-09 17:10:35 +00:00
GabyCT
b8f277676f Merge pull request #9047 from GabyCT/topic/ukd
docs: Remove jenkins reference in kernel documentation
2024-02-09 10:58:06 -06:00
Fabiano Fidêncio
e78a951e03 Merge pull request #8585 from ChengyuZhu6/dependencies-for-guest-pull
gha: Setup nydus snapshotter for CoCo tests
2024-02-09 16:45:42 +01:00
Hyounggyu Choi
27cb30d8ce runtime-rs: Adjust configuration template for runtime-rs
There are some variables newly introduced to runtime-rs, such as:

- runtime.name
- runtime.hypervisor_name
- runtime.agent_name
- vm_rootfs_driver

Additionally some of the placeholders for argument replacement are
made hypervisor-specific based on the changes made for dragonball.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-09 16:26:59 +01:00
ChengyuZhu6
97fbf360cc gha: Cleanup nydus snapshotter by the daemonset
Cleanup nydus snapshotter by the daemonset.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-09 14:47:13 +01:00
ChengyuZhu6
43b04fd0c0 gha: Deploy nydus snapshotter by the daemonset
We can use daemonset to deploy nydus snapshotter, which will decrease
one manual step both for Kata Containers and Confidential Containers CI.

Fixes: #8584

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-09 14:47:09 +01:00
Julien Ropé
236c2c7650 tests: cri-o: Update critools version to 1.29
This will also update the version of crio used in kata-monitor tests.

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-02-09 12:15:55 +01:00
Fabiano Fidêncio
344e0580ca tests: cri-o: Use packages from pkgs.k8s.io
CRI-O has moved, for a long time, towards pkgs.k8s.io, see:
https://kubernetes.io/blog/2023/10/10/cri-o-community-package-infrastructure/

With this the OBS repo won't be used anymore.

Fixes: #8935

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-09 12:15:55 +01:00
Fabiano Fidêncio
03f7cfd429 Merge pull request #9061 from GabyCT/topic/csk
tests:k8s: make add_kernel_initrd_anotations function generic
2024-02-09 10:05:58 +01:00
Fabiano Fidêncio
555784268d Merge pull request #9031 from ChengyuZhu6/guest-pull-rootfs
packaging/osbuilder: allow to pull and unpack pause image
2024-02-08 22:21:44 +01:00
Gabriela Cervantes
0b508f301b tests:k8s: make add_kernel_initrd_anotations function generic
This PR makes the add_kernel_initrd_annotations_to_yaml function
more generic so that it can later be used for other components.

Fixes #9054

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-08 19:30:43 +00:00
Dan Mihai
f139c7dc60 tests: k8s: k8s-copy-file auto-generated policy
Auto-generate policy for k8s-copy-file.bats.

Fixes: #9050

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 13:26:05 +00:00
Dan Mihai
1179306afa tests: k8s: additional policy testing utilities
1. add_requests_to_policy_settings allows one or more ttrpc requests
   from the Host to the Guest. Example:

add_requests_to_policy_settings "${policy_settings_dir}" \
   "ReadStreamRequest" "WriteStreamRequest"

2. add_copy_from_host_to_policy_settings allows executing on the Guest
   the commands initiated behind the scenes by "kubectl cp" from the
   Host to the Guest. Example:

add_copy_from_host_to_policy_settings "${policy_settings_dir}"

3. add_copy_from_guest_to_policy_settings allows executing on the Guest
   the commands initiated behind the scenes by "kubectl cp" from the
   Guest to the Host. Example:

add_copy_from_guest_to_policy_settings "${policy_settings_dir}" \
   "/tmp/file.txt"

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 13:25:41 +00:00
Steve Horsman
b99f574522 Merge pull request #9037 from niteeshkd/nd_SevSnpGuest
runtime: fix creation of SEV confidential container on SNP enabled host.
2024-02-08 09:29:20 +00:00
ChengyuZhu6
a43edd0c30 rootfs: Install pause image into rootfs
Install the pause image into the confidential rootfs
image and initrd.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-08 16:49:56 +08:00
Greg Kurz
6ead48ec06 Merge pull request #8986 from pmores/drop-shim-v2-address-value-validation
runtime-rs: fix interoperability issues between runtime-rs and cri-o
2024-02-08 09:44:12 +01:00
ChengyuZhu6
42ef6bdcae osbuilder:rootfs: support to unpack pause image to rootfs
This env var will serve to pass the pause image tarball to the rootfs builder, which will then just
unpack the content into the rootfs.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
2024-02-08 16:29:36 +08:00
ChengyuZhu6
53183cba31 workflow: Enable to build pause image in ci
Enable building the pause image static tarball for confidential containers
cases in the CI environment.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-08 11:23:23 +08:00
ChengyuZhu6
70a84eca9e packaging: allow to pull and unpack pause image
In the Confidential Containers stack, the pause image is managed by the host side,
which could therefore configure a malicious pause image. We need to package
a pause image inside the rootfs and not use the pause image from the host.

However, skopeo is not available in the 20.04 release, so we
cannot directly install skopeo in the rootfs and pull the pause image.

So I plan to make this a static build component, which will not be influenced
by the system version in the rootfs. The pause image will be part of the Kata Containers rootfs
that's used by the Confidential Containers use case. This commit enables the component to be built
both locally and in our CI environment with the command: make pause-image-tarball.

Fixes: #9032

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
2024-02-08 11:23:23 +08:00
Dan Mihai
9a780aa98f genpolicy: improve logging from ExecProcessRequest
Additional logging from the ExecProcessRequest rules, for easier
debugging.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 02:21:58 +00:00
Dan Mihai
dab567bdfa genpolicy: add easy way to allow CloseStdinRequest
For example, Kata CI's k8s-copy-file.bats transfers files between the
Host and the Guest using "kubectl exec", and that results in
CloseStdinRequest being called from the Host.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 02:21:58 +00:00
Dan Mihai
8401adb113 genpolicy: update default values
1. Remove PullImageRequest because that is not used in the main
   branch. It was used in the CCv0 branch.

2. Add default false values for the remaining Kata Agent ttrpc
   requests.

These changes don't alter the functionality of the auto-generated
Policy, but they make the Policy text and the logging from the Rego
rules easier to understand.

Fixes: #9049

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 02:21:58 +00:00
Dan Mihai
535db6b29c Merge pull request #9043 from ChengyuZhu6/assert
runtime-rs: fix assert error in `make check`
2024-02-07 18:19:18 -08:00
Dan Mihai
2bb91c9d8f Merge pull request #8922 from microsoft/danmihai1/k8s-attach-handlers
tests: k8s-attach-handlers auto-generated policy
2024-02-07 13:29:50 -08:00
Dan Mihai
01745689e1 Merge pull request #9029 from microsoft/danmihai1/k8s-empty-dirs
genpolicy: mount source for non-confidential guest
2024-02-07 11:26:16 -08:00
Dan Mihai
6b5e57f7c7 tests: k8s: address PR review feedback
1. Rename install_kata_common to install_kata_core.

2. Add TODO for better way to install the Kata tools.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 18:51:56 +00:00
Steve Horsman
934d8dca0f Merge pull request #9045 from ChengyuZhu6/nydus-version
nydus: Bump nydus snapshotter version to v0.13.7
2024-02-07 17:20:21 +00:00
Pavel Mores
6346e04cf7 runtime-rs: fix handling of TTRCP_ADDRESS
Since cri-o doesn't seem to use the address for event publishing, as mentioned
in the previous commit, it will not send it.  However, the exact way of
not sending it is unfortunately different from what is assumed by
runtime-rs.  Due to an implementation detail of cri-o which uses containerd
libraries for some low-level tasks, TTRPC_ADDRESS will not be missing from
environment as assumed, instead it will be present with an empty value.

This commit contains a small adjustment to account for that and use
LogForwarder even if TTRPC_ADDRESS is present, but with an empty value.
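
The actual fix lives in runtime-rs (Rust) and is not shown here. As a rough
illustration only, the distinction between a missing variable and one that is
present but empty can be sketched in shell. The function name and the sample
socket path below are assumptions made for this example:

```bash
# Returns 0 (true) when no usable address was provided, i.e. when the
# variable is either unset or set to an empty string.
# Hypothetical helper; runtime-rs implements the real check in Rust.
address_is_unusable() {
    [ -z "${TTRPC_ADDRESS:-}" ]
}

unset TTRPC_ADDRESS
if address_is_unusable; then echo "unset: use LogForwarder"; fi

TTRPC_ADDRESS=""
if address_is_unusable; then echo "empty: use LogForwarder"; fi

# Example value only.
TTRPC_ADDRESS="/run/containerd/containerd.sock.ttrpc"
if ! address_is_unusable; then echo "set: address provided"; fi
```

Treating "unset" and "empty" the same way is what lets a single check cover
both containerd-style and cri-o-style callers.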

Fixes #8985

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-02-07 17:01:04 +01:00
Gabriela Cervantes
ff1ace1c74 docs: Remove jenkins reference in kernel documentation
This PR removes the Jenkins reference, which is no longer used,
from the kernel documentation.

Fixes #9046

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-07 15:44:07 +00:00
ChengyuZhu6
d0b8e6d8f3 nydus: Bump nydus snapshotter version to v0.13.7
Bump nydus snapshotter version to v0.13.7.
The new release name of nydus snapshotter is `nydus-snapshotter-v0.13.7-linux-amd64.tar.gz`,
which differs from the version used by kata (`nydus-snapshotter-v0.12.0-x86_64.tgz`).
Therefore, we need to update the script to obtain the correct nydus snapshotter name.
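
A minimal sketch of how the new name could be assembled, assuming the release
tarballs follow the `linux-<goarch>` pattern seen above. The function name and
the aarch64 mapping are assumptions, not taken from the script:

```bash
# Build the release tarball name for a given version and machine arch.
# Hypothetical helper; only the amd64 name is confirmed by the release above.
nydus_snapshotter_tarball() {
    version="$1"
    machine="$2"
    case "${machine}" in
        x86_64) goarch="amd64" ;;
        aarch64) goarch="arm64" ;;   # assumed mapping
        *) goarch="${machine}" ;;
    esac
    echo "nydus-snapshotter-${version}-linux-${goarch}.tar.gz"
}

nydus_snapshotter_tarball "v0.13.7" "$(uname -m)"
```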

Fixes: #9044

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-07 22:17:05 +08:00
ChengyuZhu6
34c47e08b2 runtime-rs: fix assert error in test in make check
Fix assert error:
error: used `assert_eq!` with a literal bool
   --> crates/hypervisor/src/ch/inner.rs:218:9
    |
218 |         assert_eq!(state.jailed, false);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#bool_assert_comparison
    = note: `-D clippy::bool-assert-comparison` implied by `-D warnings`

Fixes: #9042

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-07 19:31:10 +08:00
Archana Shinde
d9ce88ada3 Merge pull request #8704 from amshinde/runtime-rs-clh-implement-persist
runtime-rs: implement persist api for cloud-hypervisor
2024-02-07 02:29:33 -08:00
Dan Mihai
dd16bc393f tests: k8s: k8s-attach-handlers generated policy
Automatically generate the test policy for k8s-attach-handlers.bats,
if AUTO_GENERATE_POLICY is enabled.

Steps:

- Create a temporary directory for the current test and copy the
  common genpolicy settings into this new directory.

- Change genpolicy settings in the temp directory to allow the
  "kubectl exec" command that this test needs. (For CoCo, exec is
  blocked by the default policy settings)

- Auto-generate the policy for the test YAML file.

- Test as usual, using the YAML file.

- Clean-up the temporary settings described above.
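
The first and last steps above can be sketched as follows. The helper names
are hypothetical and chosen for this example; the real helpers in the test
scripts may differ:

```bash
# Create a per-test copy of the common genpolicy settings and print its path.
# Hypothetical helper name.
copy_common_policy_settings() {
    common_settings_dir="$1"
    tmp_dir="$(mktemp -d)"
    cp -r "${common_settings_dir}/." "${tmp_dir}/"
    echo "${tmp_dir}"
}

# Remove the per-test copy once the test has finished.
# Hypothetical helper name.
remove_policy_settings_copy() {
    tmp_dir="$1"
    if [ -d "${tmp_dir}" ]; then
        rm -rf "${tmp_dir}"
    fi
}
```

Working on a copy keeps the changes needed by one test (for example allowing
"kubectl exec") from leaking into the settings shared by the other tests.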

Fixes: #8921

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:26:03 +00:00
Dan Mihai
0de407f8b7 tests: k8s: enable AUTO_GENERATE_POLICY
Enable AUTO_GENERATE_POLICY for one of the Kata CI K8s test platforms.
Additional platforms will be enabled after testing them.

When AUTO_GENERATE_POLICY is enabled, create genpolicy settings that
are common for all tests. Some of the tests will make temporary copies
of these common settings and customize them as needed.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:25:54 +00:00
Dan Mihai
05b2e4f606 tests: k8s: install genpolicy
Install the genpolicy app before starting test execution.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:25:42 +00:00
Dan Mihai
8aa8b70573 tests: k8s: add policy test utilities
Add script functions useful for auto-generating and testing policy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:24:06 +00:00
Dan Mihai
24a17a2e1b tests: k8s: output the names of test files
Output the names of test files, for easier search through logs.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:54 +00:00
Dan Mihai
bf533de31a tests: k8s: add DEBUG support for test scripts
Make these scripts easier to debug.
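
One common way to provide this kind of support, shown here only as a sketch
and not necessarily what the commit does, is to enable shell tracing when a
DEBUG variable is set. The function name is hypothetical:

```bash
# Turn on shell tracing when DEBUG is set to a non-empty value.
# Hypothetical helper name.
enable_debug_if_requested() {
    if [ -n "${DEBUG:-}" ]; then
        set -x
    fi
}

enable_debug_if_requested
echo "running tests"
```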

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:46 +00:00
Dan Mihai
1b4ef672ef tests: k8s: reduce namespace name duplication
1. Avoid repeating "kata-containers-k8s-tests".
2. Allow users to specify a different test namespace.
3. Introduce the TEST_CLUSTER_NAMESPACE variable, that will also be
   useful when auto-generating the Agent Policy for these tests.
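
A minimal sketch of such a variable, assuming "kata-containers-k8s-tests" is
kept as the default when the user does not specify a namespace. The function
name is hypothetical:

```bash
# Print the namespace to use: the caller's value when provided, otherwise
# the assumed default.
resolve_test_namespace() {
    echo "${TEST_CLUSTER_NAMESPACE:-kata-containers-k8s-tests}"
}

echo "Running tests in namespace: $(resolve_test_namespace)"
```

A user could then run the tests with, for example,
`TEST_CLUSTER_NAMESPACE=my-namespace` set in the environment.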

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:38 +00:00
Dan Mihai
8a5ba5fb34 tests: k8s: allow run_kubernetes_tests.sh exec
Allow everyone to directly execute run_kubernetes_tests.sh, for easier
local testing.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:30 +00:00
Fabiano Fidêncio
11ba90ebf2 Merge pull request #8958 from fidencio/topic/kata-manager-nerdctl-support
kata-manager: Add support for nerdctl installation
2024-02-06 21:33:48 +01:00
GabyCT
d74b6e143f Merge pull request #8951 from GabyCT/topic/udf
metrics: Update packages for TensorFlow ResNet Int8 Dockerfile
2024-02-06 14:29:41 -06:00
GabyCT
6337f300a8 Merge pull request #8628 from GabyCT/topic/enablek8stclh
tests: k8s: Enable tests for cloud hypervisor runtime-rs without devicemapper
2024-02-06 14:28:35 -06:00
Niteesh Dubey
3e383674f8 runtime: fix creation of SEV confidential container on SNP enabled host.
This fixes a bug that prevents creating an SEV container
on an SNP-enabled host. It is a regression that was introduced as
part of the following commit:
de39fb7d38

Fixes: #9036

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-02-06 19:01:30 +00:00
Hyounggyu Choi
462afcf829 runtime-rs: Copy configuration for QEMU from runtime
It makes sense to reuse the configuration template from runtime-golang
as a base. This commit simply copies it into the config directory.

Fixes: #8441

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-06 19:35:44 +01:00
Fabiano Fidêncio
058f068d67 Merge pull request #9020 from BbolroC/ok-to-test-static-checks-but-x86
gha: Run static-checks on self-hosted runners conditionally
2024-02-06 19:30:21 +01:00
Gabriela Cervantes
cf049fc718 k8s: Skip k8s tests that are not working
This PR skips the k8s tests that are not working with the cloud hypervisor
runtime-rs, referencing the corresponding issue for each.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-06 16:52:02 +00:00
Pavel Mores
f0256fded5 runtime-rs: remove validation of shim v2 -address value
It appears that under the shim v2 protocol, a shim has no use of its own
for the -address value, it just passes it back to container runtime's
(mostly containerd or cri-o) event-publishing binary.  Since the -address
value only flows through the shim, being passed to the shim by a container
runtime and then essentially passed back by shim to the container runtime,
it seems inappropriate for a shim to validate the value that is fully
owned and only used by the container runtime.

This commit removes such validation from runtime-rs.  Doing so, it solves
(part of) an interoperability problem between runtime-rs and cri-o.  cri-o
seems to intentionally choose not to implement the event-publishing part
of the shim v2 protocol and thus it has no value it could pass to
runtime-rs for -address.  As a result, it sends an empty string which has
been failing the excessive validation performed by runtime-rs so far.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-02-06 13:43:09 +01:00
Wainer Moschetta
f1ca5d1563 Merge pull request #8953 from ChengyuZhu6/ci-guest-pull
gha: Enable nydus snapshotter in CoCo ci tests
2024-02-06 09:36:59 -03:00
Fabiano Fidêncio
1ccb850ee7 Merge pull request #9027 from fidencio/topic/add-libattest-tdx-into-the-confidential-rootfs
rootfs: Add libattest-tdx into the confidential rootfs
2024-02-06 12:52:13 +01:00
Fabiano Fidêncio
ce82b5e3f5 rootfs: Add libtdx-attest into the confidential rootfs
This is required as the tdx-attest-rs crate, which is used as part of
the guest components, has a runtime dependency on libtdx-attest.

Fixes: #9021 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-06 09:13:49 +01:00
Xuewei Niu
67d9847fac Merge pull request #9025 from wainersm/cri-containerd_fix_loop
cri-containerd: fix loop in TestContainerMemoryUpdate()
2024-02-06 14:49:57 +08:00
Amulya Meka
354a3093fa Merge pull request #9019 from Amulyam24/k8s-fix
gha: add GOPATH env var to the ppc64le k8s workflow
2024-02-06 11:01:49 +05:30
Alex Lyn
1ab9a21492 Merge pull request #8552 from deagon/fix/missing-port-type
runtime: missing port type in the DeviceInfo
2024-02-06 10:56:46 +08:00
Dan Mihai
473efc2149 genpolicy: mount source for non-confidential guest
The emergent Kata CI tests for Policy use confidential_guest = false
in genpolicy-settings.json. That value is inconsistent with the
following mount settings:

        "emptyDir": {
            "mount_type": "local",
            "mount_source": "^$(cpath)/$(sandbox-id)/local/",
            "mount_point": "^$(cpath)/$(sandbox-id)/local/",
            "driver": "local",
            "source": "local",
            "fstype": "local",
            "options": [
                "mode=0777"
            ]
        },

We need to keep those settings for confidential_guest = true, and
change confidential_guest = false to use:

        "emptyDir": {
            "mount_type": "local",
            "mount_source": "^$(cpath)/$(sandbox-id)/rootfs/local/",
            "mount_point": "^$(cpath)/$(sandbox-id)/local/",
            "driver": "local",
            "source": "local",
            "fstype": "local",
            "options": [
                "mode=0777"
            ]
        },

The value of the mount_source field is different.

This change unblocks testing using Kata CI's pod-empty-dir.yaml:

genpolicy -u -y pod-empty-dir.yaml

kubectl apply -f pod-empty-dir.yaml

k get pod sharevol-kata
NAME            READY   STATUS    RESTARTS   AGE
sharevol-kata   1/1     Running   0          53s

Fixes: #8887

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-06 01:19:48 +00:00
Fabiano Fidêncio
ffa190831d Merge pull request #9022 from fidencio/topic/add-guest-components-to-the-confidential-image-and-initrd
rootfs: confidential: Install coco-guest-components
2024-02-05 18:56:48 +01:00
Hyounggyu Choi
40b2b2a43a gha: Run static-checks on self-hosted runners conditionally
Due to the restrictions on instance provisioning for self-hosted runners, performing
static checks (36 jobs at the time of writing) on them each time a PR is updated could
significantly burden them, consequently slowing down the entire CI system. To address
this, the decision is to trigger these checks only when an 'ok-to-test' label is added.
Meanwhile, the checks for x86_64, which are supported by GitHub-hosted runners, will
remain unchanged.

Fixes: #8998

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-05 15:24:21 +01:00
Wainer dos Santos Moschetta
106e1af497 cri-containerd: fix loop in TestContainerMemoryUpdate()
The loop that generates test cases for virtio-mem enabled/disabled
doesn't yield the integers '1' and '0' as expected. Instead it yields
the strings '{1,' and '0}'.
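
The original loop is not shown in this message. One way to obtain exactly
those strings, given here as an assumption rather than a quote of the test
code, is a brace list containing a space, which disables brace expansion:

```bash
# Buggy form: the space after the comma prevents brace expansion, so the
# loop iterates over the literal words "{1," and "0}".
buggy_values() {
    for v in {1, 0}; do
        echo "${v}"
    done
}

# Fixed form: list the values explicitly.
fixed_values() {
    for v in 1 0; do
        echo "${v}"
    done
}

buggy_values
fixed_values
```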

Fixes #9024
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-05 10:59:39 -03:00
Fabiano Fidêncio
27e7974048 rootfs: confidential: Install coco-guest-components
Let's install the coco-guest-components into the confidential rootfs
image and initrd.

Fixes: #9021

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-05 14:41:29 +01:00
Fabiano Fidêncio
f80dbcee0e rootfs: Add logging about the coco guest components
This will make our lives easier to figure out whether the components are
being installed or not.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-05 14:41:29 +01:00
Fabiano Fidêncio
68b8186ec4 osbuilder: Expose COCOGUEST_COMPONENTS_TARBALL
We need to pass this to the container where the rootfs is built, so it
can actually be unpacked inside the rootfs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-05 14:41:28 +01:00
Lukáš Doktor
3b0049b2a4 tools.kata-webhook: Fix lib path
When moving the webhook we skipped common.bash, as a (close-enough)
version is already in `/tests`, but we forgot to update the source path.
This commit fixes it.

Fixes: #8653

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-02-05 14:17:24 +01:00
Fabiano Fidêncio
64d09874c3 packaging: coco-guest-components: Pass DESTDIR to the build script
As DESTDIR was not being passed, we've been installing the final
binaries in a container path that was not exposed to the host, leading
to creating an empty tarball with the guest components.

Now, theoretically, guest-components should respect a PREFIX passed, but
that's not the case and we're manually adding "/usr/local/bin" to the
passed DESTDIR.
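
As a rough sketch of the workaround, assuming the binaries are already built
and only need to be copied, the install path can be derived from DESTDIR like
this. The function name and arguments are hypothetical:

```bash
# Install a binary under "${destdir}/usr/local/bin".
# Hypothetical helper; the real build script may differ.
install_guest_component() {
    binary="$1"
    destdir="$2"
    install_dir="${destdir}/usr/local/bin"
    mkdir -p "${install_dir}"
    install -m 0755 "${binary}" "${install_dir}/"
}
```

Because the destination sits under DESTDIR, it ends up in a path that is
visible to the host and can be packed into the tarball.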

Here's the result of the tarball:
```bash
⋊> kata-containers ≡ tar tf build/kata-static-coco-guest-components.tar.xz
./
./usr/
./usr/local/
./usr/local/bin/
./usr/local/bin/confidential-data-hub
./usr/local/bin/attestation-agent
./usr/local/bin/api-server-rest
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-05 14:07:10 +01:00
ChengyuZhu6
a214bd8d13 gha: Enable nydus snapshotter in CoCo ci tests
This PR is split from #8585.
It makes the changes to the GitHub workflows and adds skeletons for deploy_snapshotter()
and cleanup_snapshotter() in tests/integration/kubernetes/gha-run.sh.

Merging this patch first triggers the CI jobs for CoCo, which will begin executing
the dummy functions deploy_snapshotter() and cleanup_snapshotter(). The implementation details for these functions
remain in #8585. The next step is to transfer that logic to PR #8484, enabling that PR to undergo CI testing prior to its merge.

Fixes: #8997

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-05 18:51:59 +08:00
Fabiano Fidêncio
1362918ff0 Merge pull request #9011 from fidencio/topic/switch-to-using-the-confidential-rootfs
runtime: Replace TEE specific initrd / image for the confidential one
2024-02-05 10:43:12 +01:00
Guoqiang Ding
6068faf40b runtime: failed to run in the case of ColdPlugVFIO
Add the missing port type in the DeviceInfo.

Fixes: #9014
Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2024-02-05 17:30:11 +08:00
Fabiano Fidêncio
65013205ed Merge pull request #9005 from ChengyuZhu6/clang
static-checks: Install clang in the ci environments
2024-02-05 09:24:51 +01:00
Archana Shinde
b3c74411f6 runtime-rs: Add tests for persist api for clh
Add tests to check clh struct is saved/restored correctly.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-04 22:03:57 -08:00
Archana Shinde
0b78296dca runtime-rs: Store additional field for hypervisor state
Implementing Persist API for cloud-hypervisor was done partially with
initial support for cloud-hypervisor. Store and retrieve additional
fields to/from the hypervisor state.

Fixes: #6202

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-04 22:03:57 -08:00
Archana Shinde
a5f0b92bca runtime-rs: Add guest protection to hypervisor state
Store guest-protection used while storing the state of the hypervisor.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-04 22:03:54 -08:00
Alex Lyn
cf74166d75 Merge pull request #9015 from Apokleos/bugfix-exec-uds
runtime: display accurate error msg to avoid misleading users.
2024-02-05 13:50:43 +08:00
Amulyam24
e59d005568 gha: add GOPATH env var to the ppc64le k8s workflow
The filtering of test cases installs/uses yq and expects GOPATH to be present. Hence, add it to the workflow.

Fixes: #9018

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-02-05 10:30:10 +05:30
Alex Lyn
51a82bec3c Merge pull request #9012 from deagon/fix/monitor-agent-url
kata-monitor: fix agentUrl from containerd shim
2024-02-05 10:41:56 +08:00
ChengyuZhu6
f354beb253 static-checks: Install clang in the ci environments
To test PR #8484, the compilation of the kata-agent relies on clang.
Failures have been encountered on the ARM, s390x, and ppc64le architectures:
ppc64le: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689026?pr=8484
s390x: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689401?pr=8484
arm: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689026?pr=8484

Fixes: #9004

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-04 17:00:19 +08:00
Alex Lyn
c6830ceb89 runtime: display accurate error msg to avoid misleading users.
The original error handling does not meet user expectations.
When the ClientSocketAddress method stats the corresponding
path of runtime-rs and does not find it, we should return
an error message that includes the reason for the failure
(that is, an error indicating that neither the runtime-go
nor the runtime-rs path was found), instead of simply displaying the
corresponding path of runtime-rs as the final error message to
users.
It is also necessary to return the error promptly to the caller
for further error handling.

Fixes: #8999

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-02-04 16:45:59 +08:00
Xuewei Niu
fa01a86334 Merge pull request #9007 from wainersm/aks_delete_rg
gha: delete azure RG only if it exists
2024-02-04 16:34:17 +08:00
Guoqiang Ding
7bf1ebe16d kata-monitor: fix agentUrl from containerd shim
Fix the missing leading slash.

Fixes: #9013
Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2024-02-04 16:24:13 +08:00
Fabiano Fidêncio
d4a9856a84 gha: Remove SEV / SNP / TDX images / initrds
We can remove this now that we're relying on the confidential one.

Fixes: #9010

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-03 13:22:07 +01:00
Fabiano Fidêncio
e4258d8694 runtime: Use confidential image / initrd instead of TEE specific ones
Now that we have a confidential image / initrd being built, instead of a
specific one for each TEE, let's use it everywhere possible.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-03 13:20:14 +01:00
Fabiano Fidêncio
e0bb632053 Merge pull request #8983 from fidencio/topic/add-confidential-image
packaging: Add confidential image / initrd
2024-02-03 12:30:16 +01:00
Fabiano Fidêncio
a9f8888c15 packaging: Add confidential image / initrd
Let's use a single rootfs image / initrd for confidential workloads,
instead of having those split for different TEEs.

We can easily do this now as the soon-to-be-added guest-components can
be built in a generic way.

Fixes: #8982

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-03 00:58:52 +01:00
Fabiano Fidêncio
7ddb2e5999 Merge pull request #8978 from fidencio/topic/use-the-kernel-confidential-when-possible
runtime: packaging: Use confidential kernel instead of the TDX one
2024-02-03 00:29:43 +01:00
Fabiano Fidêncio
e9de0ef6b3 packaging: rootfs: Depend on kernel-confidential tarball
Now that we're using the kernel-confidential, let the rootfs depend
on it, instead of depending on the TEE specific ones.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:13:41 +01:00
Fabiano Fidêncio
b58cfc765c packaging: Ensure rootfs is rebuilt in case kernel changes
We need to do this in order to ensure that the measured boot will be
taking the latest kernel bits, as needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:13:06 +01:00
Fabiano Fidêncio
4394dacb88 packaging: Build the confidential kernel with MEASURED_ROOTFS support
This is already done for the TDX kernel, and should have been done also
for the confidential one.

This action requires us to bump the kernel version as the resulting
kernel will be different from the cached one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:13:06 +01:00
Fabiano Fidêncio
c7680839f9 packaging: Fix modules tarball for nvidia-gpu-confidential
The modules dir has an extra "-nvidia-gpu-confidential" string in its
name.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:13:06 +01:00
Fabiano Fidêncio
dc027e39d6 gha: Remove TEE specific kernel build targets
We're using the confidential kernel instead from now on.

Fixes: #8981 -- part I

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:12:41 +01:00
Fabiano Fidêncio
3755c69165 runtime: makefile: remove SNP specific kernel references
As this is not used anymore, we can go ahead and just remove it

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:12:21 +01:00
Fabiano Fidêncio
57b132f94c runtime: makefile: remove SEV specific kernel references
As this is not used anymore, we can go ahead and just remove it

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:12:21 +01:00
Fabiano Fidêncio
2562d23242 runtime: makefile: remove TDX specific kernel references
As this is not used anymore, we can go ahead and just remove it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:11:43 +01:00
Fabiano Fidêncio
f4e3c936d8 runtime: snp: config: Use the confidential kernel
As we're building a single confidential kernel, we should rely on it
rather than keep using the specific ones for TDX / SEV / SNP.

However, for debuggability's sake, let's do this change TEE by TEE.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:11:36 +01:00
Fabiano Fidêncio
8731366d7b runtime: sev: config: Use the confidential kernel
As we're building a single confidential kernel, we should rely on it
rather than keep using the specific ones for TDX / SEV / SNP.

However, for debuggability's sake, let's do this change TEE by TEE.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 21:11:36 +01:00
Wainer dos Santos Moschetta
a04b215bcc gha: delete azure RG only if it exists
delete_cluster() has tried to delete the az resource group regardless
of whether it exists. In some cases the result of that operation is ignored,
e.g., when it fails because the resource group is not found, but the log messages get a
little dirty. Let's delete the RG only if it exists.
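
A minimal sketch of the idea, assuming the Azure CLI's `az group exists`
(which prints "true" or "false") is used for the check. The function name and
the log message are hypothetical and not taken from delete_cluster():

```bash
# Delete an Azure resource group only when it exists.
# Hypothetical helper; requires the "az" CLI when actually called.
delete_rg_if_exists() {
    rg="$1"
    if [ "$(az group exists --name "${rg}")" = "true" ]; then
        az group delete --name "${rg}" --yes
    else
        echo "Resource group ${rg} not found, nothing to delete"
    fi
}
```

Checking first keeps the "resource group not found" error out of the logs
while leaving real deletion failures visible.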

Fixes #8989
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-02 16:57:20 -03:00
Gabriela Cervantes
eb5b7d3bf8 tests: k8s: Enable tests for cloud hypervisor runtime-rs
This PR enables the k8s tests for cloud hypervisor runtime-rs.

Fixes #8627

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-02 17:58:58 +00:00
Fabiano Fidêncio
6cbdba7268 runtime: tdx: config: Use the confidential kernel
As we're building a single confidential kernel, we should rely on it
rather than keep using the specific ones for TDX / SEV / SNP.

However, for debuggability's sake, let's do this change TEE by TEE.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 17:13:06 +01:00
Fabiano Fidêncio
a618461d3a runtime: Add confidential kernel to the makefile
With this we can properly generate and use the `-confidential` kernel,
which supports SEV / SNP / TDX, as part of our configuration files.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 17:13:05 +01:00
GabyCT
40d9a65601 Merge pull request #8996 from GabyCT/topic/addclhr
gha: k8s: Add cloud-hypervisor (runtime-rs) support
2024-02-02 09:48:35 -06:00
Fabiano Fidêncio
741ed1c8bd Merge pull request #9001 from fidencio/topic/fix-cache-for-confidential-kernel-part-III
packaging: Don't build the confidential / sev kernel twice -- part III
2024-02-02 15:19:41 +01:00
Wainer Moschetta
424fbfe58f Merge pull request #8654 from ldoktor/openshift-tests
ci/openshift-ci: Move openshift-ci from the tests repo here
2024-02-02 10:40:30 -03:00
Fabiano Fidêncio
2ff3f0afc6 packaging: Remove trailing whitespace from extra_tarballs arg
This was overlooked during the reviews.

Fixes: #6415 -- part III

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 12:42:02 +01:00
Fabiano Fidêncio
228bc48c73 packaging: Fix kernel confidential name
It should be "kernel-confidential" instead of "kernel".

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 12:42:02 +01:00
Fabiano Fidêncio
31b21093b0 packaging: Pass the kernel flavour to get_kernel_modules_dir
I made this a required argument during the series and ended up
forgetting to add that while calling the function.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 12:42:02 +01:00
Fabiano Fidêncio
51b1df2333 packaging: Fix typo to get the extra_tarballs path
It should've been  "${m#*:}" instead of "${m#&:}".
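
The difference between the two expansions can be seen in a small sketch. The
entry below is a made-up example, not a value from the build scripts.
`${m#*:}` strips the shortest prefix matching `*:`, i.e. everything up to and
including the first colon, whereas `${m#&:}` only matches a literal `&:`
prefix and therefore leaves the entry untouched:

```bash
# Hypothetical "name:path" entry, in the format used for extra tarballs.
m="modules.tar.xz:/tmp/build/modules.tar.xz"

# Intended expansion: drop everything up to and including the first ':'.
tarball_path="${m#*:}"

# Typo'd expansion: '&' is a literal character here, so nothing is stripped.
typo_path="${m#&:}"

echo "intended: ${tarball_path}"
echo "typo:     ${typo_path}"
```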

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 12:41:54 +01:00
Fabiano Fidêncio
53e8461db2 Merge pull request #9000 from fidencio/topic/fix-pushing-artefacts-to-registry
packaging: Fix pushing artefacts to the registry
2024-02-02 10:21:40 +01:00
Fabiano Fidêncio
0b221b5618 packaging: Fix pushing artefacts to the registry
This issue was introduced due to a typo not caught during reviews on
e5bca90274.

Fixes: #6415 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-02 10:13:11 +01:00
Wenyuan Liu
cb888516c1 Merge pull request #8760 from fadecoder/reduce_go_runtime_mounts
runtime: Reduce the mount points with namespace isolation
2024-02-02 16:54:44 +08:00
Greg Kurz
d1a26ead94 Merge pull request #8454 from BbolroC/compile-with-qemu-s390x
runtime-rs: make compilation for QEMU on s390x
2024-02-02 09:29:32 +01:00
Fabiano Fidêncio
0520b272a3 Merge pull request #8987 from fidencio/topic/fix-cache-for-confidential-kernel
packaging: cache: Fix caching kernels which rely on extra modules
2024-02-02 09:10:52 +01:00
Amulya Meka
e4252a3fe2 Merge pull request #8957 from Amulyam24/add-k8s-test-ppc64le
gha: add kubernetes tests workflow for ppc64le
2024-02-02 10:22:00 +05:30
Fabiano Fidêncio
b2f1235e3c Merge pull request #8994 from sprt/sprt/switch-aks-eastus
ci: aks: switch from eastus2 to eastus region
2024-02-02 00:09:40 +01:00
Hyounggyu Choi
bb6f5073aa runtime-rs: Allow compilation for s390x
Until now, runtime-rs couldn't be compiled on s390x.
We need to lift those restrictions in Makefile first.

Fixes: #8446

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-01 23:48:15 +01:00
Dan Mihai
6f1062b5d6 Merge pull request #8966 from microsoft/danmihai1/k8s-sandbox-vcpus-allocation
genpolicy: ignore empty YAML as input
2024-02-01 13:51:02 -08:00
Dan Mihai
8f9c92c0ee Merge pull request #8977 from microsoft/danmihai1/default-namespace
genpolicy: support non-default namespace name
2024-02-01 13:50:33 -08:00
Gabriela Cervantes
6771ca463b gha: k8s: Add cloud-hypervisor (runtime-rs) support
This PR adds the Cloud Hypervisor driver, integrated with runtime-rs,
to the Kubernetes tests that do not use devmapper.

Fixes #8995

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-01 21:22:56 +00:00
Aurélien Bombo
0ace31f041 ci: aks: switch from eastus2 to eastus region
This addresses an internal AKS issue that intermittently prevents
clusters from getting created. The fix has been rolled out to eastus but
not yet eastus2, so we unblock the CI by switching. No downsides in
general.

This supersedes #8990.

Fixes: #8989

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-02-01 19:22:42 +00:00
Hyounggyu Choi
8fcee6e6ec runtime-rs: Use Persist::restore() of QEMU for VirtSandbox
virt_container fails to compile because only Dragonball is
used in the implementation of the trait method Persist::restore().
As that hypervisor is not compiled on s390x and QEMU implements
the trait method, this commit lets the method use QEMU's.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-01 18:02:10 +01:00
Hyounggyu Choi
56aef3741d runtime-rs: Exclude hypervisors plugins except QEMU for s390x
Dragonball and cloud-hypervisor are not supported on s390x.
We need to exclude the plugins for these hypervisors from compilation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-01 18:02:10 +01:00
Fabiano Fidêncio
5d2906c36a packaging: Bump the kata config kernel version
Just to make sure we won't use cached components.

Fixes: #6415

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 16:57:15 +01:00
Fabiano Fidêncio
d2ea11dbff packaging: Use the cached kernel modules
Until now we didn't have any logic to consume the cached kernel modules
tarball.  Let's make sure it is consumed, as it'll save us a
reasonable amount of build time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 16:57:15 +01:00
Fabiano Fidêncio
e5bca90274 packaging: Cache the kernel modules
This will save us a lot of time, as right now the CI is rebuilding the
kernel for absolutely no reason.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 16:55:21 +01:00
Fabiano Fidêncio
f481f58659 packaging: Create the tarball for the kernel modules
Let's start doing this for the confidential kernels (and also for SEV,
till it gets removed).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 16:55:20 +01:00
Fabiano Fidêncio
a58caca723 packaging: Take extra tarballs in install_cached_tarball_component()
This allows us to add a map, in the format of:
`"tarball1_name:tarball1_path tarball2_name:tarball2_path ..."`

With this we have a base to start doing a better job when caching extra
artefacts, like kernel modules.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 16:55:20 +01:00
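The `"tarball1_name:tarball1_path tarball2_name:tarball2_path ..."` map format described in the commit above can be illustrated with a small parser. This is a minimal Python sketch under stated assumptions, not the packaging scripts' actual shell implementation; the function name `parse_tarball_map` and the error handling are hypothetical.

```python
from __future__ import annotations


def parse_tarball_map(spec: str) -> dict[str, str]:
    """Parse "name1:path1 name2:path2 ..." into {name: path}.

    Hypothetical helper: entries are separated by whitespace and each
    entry is split on its first ':' only, so a path may contain ':'.
    """
    result: dict[str, str] = {}
    for entry in spec.split():
        name, sep, path = entry.partition(":")
        if not sep or not name or not path:
            raise ValueError(f"malformed entry: {entry!r}")
        result[name] = path
    return result
```

An empty string yields an empty map, which matches the idea that extra tarballs are optional.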
Fabiano Fidêncio
33ac5468fe packaging: Add function to get the kernel modules directory
Right now this is just being added but not used yet.  The idea is to use
this to both cache and later on untar the kernel modules needed for some
of the kernel targets we have (specifically looking at the confidential
one).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 16:55:20 +01:00
Zhigang Wang
9317e23df1 mount: Reduce the mount points with namespace isolation
This patch can reduce load on systemd process, and
increase the k8s deployment density when using go runtime.

Fixes: #8758

Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com>
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2024-02-01 18:34:24 +08:00
Fabiano Fidêncio
ed6816e29f kata-manager: Add support for nerdctl installation
As already done for docker, let's also add support for installing
nerdctl + kata containers.

At least for now, we are explicitly not allowing the
combination of installing both docker and nerdctl in the same
installation in order to reduce the script complexity.

Also, nerdctl installation, for now, is limited to x86_64 and aarch64 as
those are the only architectures that nerdctl releases a "full" package
for.

Fixes: #8358

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-01 09:19:35 +01:00
Xuewei Niu
2332552c8f Merge pull request #7483 from frezcirno/passfd_io_feature
runtime-rs: improving io performance using dragonball's vsock fd passthrough
2024-02-01 14:53:53 +08:00
Amulyam24
f8585db8d9 gha: add kubernetes tests workflow for ppc64le
This PR adds workflow for running kubernetes test suite on ppc64le.

It uses scripts to create and delete the cluster using kubeadm as none of the current cluster creation tools are supported on Power.

Fixes: #7950

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-02-01 12:23:11 +05:30
Alex Lyn
cf26c16017 Merge pull request #8931 from yaoyinnan/8930/feat/merge-ValidCgroupPath
runtime: merged ValidCgroupPath method
2024-02-01 12:53:55 +08:00
Alex Lyn
a157fc3b74 Merge pull request #8974 from yaoyinnan/5240/fix/cgroup-parallel
runtime: add SingleContainer when obtaining OCI Spec
2024-02-01 11:43:02 +08:00
Alex Lyn
1b8f3ce28a Merge pull request #8929 from yaoyinnan/8838/fix/error-message
runtime-rs: report error on missing or empty fields in configuration
2024-02-01 11:02:30 +08:00
Dan Mihai
09ea0eed9d genpolicy: ignore empty YAML as input
Kata CI's pod-sandbox-vcpus-allocation.yaml ends with "---", so the
empty YAML document following that line should be ignored.

To test this fix:

genpolicy -u -y pod-sandbox-vcpus-allocation.yaml

Fixes: #8895

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-01 02:22:21 +00:00
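The behaviour described in the commit above (skipping the empty document that follows a trailing "---") can be sketched as follows. This is a simplified Python illustration, not genpolicy's Rust code: it treats only a bare `---` line as a separator and does not handle the rest of the YAML syntax. The function name is hypothetical.

```python
from __future__ import annotations


def split_yaml_documents(stream: str) -> list[str]:
    """Split a multi-document YAML stream and drop empty documents."""
    docs: list[str] = []
    current: list[str] = []
    for line in stream.splitlines():
        if line.strip() == "---":
            # A separator closes the document collected so far.
            docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    docs.append("\n".join(current))
    # Documents that are empty or whitespace-only are ignored.
    return [doc for doc in docs if doc.strip()]
```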
Dan Mihai
befef119ff Merge pull request #8941 from malt3/genpolicy-flags
genpolicy: allow separate paths for rules and settings files
2024-01-31 18:14:12 -08:00
GabyCT
6db1cd5f65 Merge pull request #8964 from GabyCT/topic/fixnerdcltt
tests: Re-arranged nerdctl tests
2024-01-31 15:02:54 -06:00
Dan Mihai
21125baec3 Merge pull request #8962 from microsoft/danmihai1/config-map-optional2
genpolicy: ignore volume configMap optional field
2024-01-31 12:29:30 -08:00
Fabiano Fidêncio
39a64d1447 Merge pull request #8269 from wainersm/kata-deploy_deprecated
kata-deploy: fix deprecations on kustomization files
2024-01-31 20:02:01 +01:00
Hyounggyu Choi
9c0312d466 Merge pull request #8956 from BbolroC/agent-build-fix-s390x-ppc64le
packaging: Use Ubuntu 20.04 for building an agent
2024-01-31 18:23:16 +01:00
Greg Kurz
8b1dc06971 Merge pull request #8938 from pmores/log-qemus-stderr-in-shim-log
runtime-rs: Log qemu's stderr in shim log
2024-01-31 18:04:28 +01:00
Dan Mihai
f0339a79a6 genpolicy: support non-default namespace name
Allow users to specify in genpolicy-settings.json a default cluster
namespace other than "default". For example, Kata CI uses as default
namespace: "kata-containers-k8s-tests".

Fixes: #8976

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-31 15:47:01 +00:00
Zixuan Tan
222de4f684 agent: Fix a race condition in passfd_io.rs
There is a race condition in agent HVSOCK_STREAMS hashmap, where a
stream may be taken before it is inserted into the hashmap. This patch
adds simple retry logic to the stream consumer to alleviate this issue.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
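The retry idea in the commit above, where a consumer may look up a stream before the producer has inserted it, can be sketched like this. It is a Python analogy under stated assumptions, not the agent's Rust code: the function name, the attempt count, and the delay are all hypothetical.

```python
import threading
import time


def take_with_retry(streams, lock, key, attempts=5, delay=0.01):
    """Remove and return streams[key], retrying while the key is absent.

    The producer is assumed to insert under the same lock. Retrying only
    alleviates the race: if the entry never arrives, KeyError is raised.
    """
    for attempt in range(attempts):
        with lock:
            stream = streams.pop(key, None)
        if stream is not None:
            return stream
        if attempt < attempts - 1:
            time.sleep(delay)
    raise KeyError(f"stream {key!r} not found after {attempts} attempts")
```

A bounded retry keeps the consumer simple, at the cost of a small worst-case delay.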
Zixuan Tan
6e4d4c329a agent,runtime-rs: Add license header to passfd_io.rs
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
1206de2c23 agent: Use pipes as stdout/stderr of container process
Linux forbids opening an existing socket through /proc/<pid>/fd/<fd>,
making some images relying on the special file /dev/stdout(stderr),
/proc/self/fd/1(2) fail to boot in passfd io mode, where the
stdout/stderr of a container process is a vsock socket.

For backward compatibility, a pipe is introduced between the process
and the socket, and its write end is set as stdout/stderr of the
container process instead of the socket. The agent will do the
forwarding between the pipe and the socket.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
f6710610d1 agent,runtime-rs,runk: fix fmt and clippy warnings
Fix rustfmt and clippy warnings detected by CI.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
89be42a177 runtime-rs: open stdout and stderr fifos NONBLOCK
This patch adds the O_NONBLOCK flag when opening stdout and stderr FIFOs
to avoid blocking.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
3eb4bed957 agent: use biased select to avoid data loss
This patch uses a biased select to avoid stdin data loss in case of
CloseStdinRequest.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
7874ef5fd2 agent: set stdout/err vsock stream as blocking before passing to child
In passfd io mode, when not using a terminal, the stdout/stderr vsock
streams are directly used as the stdout/stderr of the child process.
These streams are non-blocking by default.

The stdout/stderr of the process should be blocking, otherwise
the process may encounter EAGAIN error when writing to stdout/stderr.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
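The EAGAIN problem described in the commit above can be reproduced with an ordinary pipe. This is a Python sketch on a Unix system, not the agent's Rust code, and it uses a pipe instead of a vsock stream. Both helper names are hypothetical.

```python
import os


def write_until_eagain(fd: int, chunk: bytes = b"x" * 4096) -> int:
    """Write to a non-blocking fd until the kernel reports EAGAIN.

    Returns the number of bytes accepted before the buffer filled up.
    """
    written = 0
    while True:
        try:
            written += os.write(fd, chunk)
        except BlockingIOError:  # raised for EAGAIN / EWOULDBLOCK
            return written


def make_blocking(fd: int) -> None:
    """Clear O_NONBLOCK so a child inheriting fd never sees EAGAIN."""
    os.set_blocking(fd, True)
```

A process that does not expect non-blocking stdout will usually treat EAGAIN as a fatal write error, which is why the flag is cleared before the descriptor is handed over.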
Fupan Li
cfb262d02f container: keep the io connection when pass fd to hybrid vsock
We want the io connection to stay connected when containerd closes
the io pipe, so that the io stream can be attached to again later.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-01-31 21:07:48 +08:00
Fupan Li
4a762fcfdd dbs: hybrid stream support keep the connection when local closed
Support the hybrid fd passthrough mode when passing a pipe fd.
The caller can specify that the connection is kept even when the
pipe peer is closed, and the connection can be re-acquired by
re-opening the pipe.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
5536743361 agent,runtime-rs: fix container io detach and attach
Partially fix some issues related to container io detach and attach.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
657b17a86f runtime-rs: open stdin fifo with RDWR|NONBLOCK when pass vsock streams
In linux, when a FIFO is opened and there are no writers, the reader
will continuously receive the HUP event. This can be problematic
when creating containers in detached mode, as the stdin FIFO writer
is closed after the container is created, resulting in this situation.

In passfd io mode, open stdin fifo with O_RDWR|O_NONBLOCK to avoid the
HUP event.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
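The FIFO behaviour described in the commit above can be observed directly on Linux. The sketch below is a Python illustration, not the runtime-rs code: it creates a temporary FIFO, lets a writer open and close it, and checks whether the reader then sees a hang-up event. The function name is hypothetical. Opening a FIFO with O_RDWR is Linux behaviour that POSIX leaves undefined.

```python
import os
import select
import tempfile


def sees_hup_after_writer_closes(reader_flags: int) -> bool:
    """Return True if the reader polls POLLHUP once the writer has closed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stdin")
        os.mkfifo(path)
        reader = os.open(path, reader_flags | os.O_NONBLOCK)
        try:
            # Simulate the stdin writer that goes away after container creation.
            writer = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            os.close(writer)
            poller = select.poll()
            poller.register(reader, select.POLLIN)
            events = poller.poll(0)
            return any(event & select.POLLHUP for _, event in events)
        finally:
            os.close(reader)
```

With `os.O_RDWR` the reader itself counts as a writer, so the kernel never considers the FIFO to be without writers.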
Zixuan Tan
f1b33fd2e0 agent: clean up term master fd when container exits
When container exits, the agent should clean up the term master fd,
otherwise the fd will be leaked.

Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
b8632b4034 dragonball: vsock: properly handle EPOLLHUP/EPOLLERR events
When one end of the connection closes, the epoll event will be triggered
forever. We should close and kill the connection.

Fixes: #6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
442df71fe5 agent,runtime-rs: refactor process io using vsock fd passthrough feature
Currently in the kata container, every io read/write operation requires
an RPC request from the runtime to the agent. This process involves
data copying into/from an RPC request/response, which has high overhead.

To solve this issue, this commit utilizes the vsock fd passthrough, a
newly introduced feature in the Dragonball hypervisor. This feature
allows other host programs to pass a file descriptor to the Dragonball
process, directly as the backend of an ordinary hybrid vsock connection.

The runtime-rs now utilizes this feature for container process io. It
opens the stdin/stdout/stderr fifos from containerd, and passes them to
Dragonball, then doesn't bother with process io any more, eliminating
the need for an RPC for each io read/write operation.

In passfd io mode, the agent uses the vsock connections as the child
process's stdin/stdout/stderr, eliminating the need for a pipe
to pump data (in non-tty mode).

Fixes: #6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
eb6bb6fe0d config: add two options to control vsock passthrough io feature
Two toml options, `use_passfd_io` and `passfd_listener_port` are introduced
to enable and configure dragonball's vsock fd passthrough io feature.

This commit is a preparation for vsock fd passthrough io feature.

Fixes: #6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
973b5ad1f4 runtime-rs: make Container::new async
Fixes: #6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Xuewei Niu
5449173102 Merge pull request #8932 from kalil-pelissier/feature/issue-8586/fix-noop-method-call-warning
dragonball: fix noop-method-call warning
2024-01-31 19:24:27 +08:00
Malte Poll
531a11159f genpolicy: allow separate paths for rules and settings files
Using custom input paths with -i is counter-intuitive. Simplify path handling with explicit flags for rules.rego and genpolicy-settings.json.

Fixes: #8568

Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>
2024-01-31 11:00:19 +01:00
Hyounggyu Choi
2e1d770fcf packaging: Track files correctly when naming builder image for agent
The necessary files for the agent builder image can be found in
`tools/packaging/static-build/agent`,
`ci/install_libseccomp.sh` and
`tools/packaging/kata-deploy/local-build/kata-deploy-copy-libseccomp-installer.sh`.
Identifying the correct files addresses the previously misreferenced path
used to name the builder image.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-31 10:49:20 +01:00
yaoyinnan
9aa1ed805a runtime: add SingleContainer when obtaining OCI Spec
When creating a cgroup, add a SingleContainer when obtaining the OCI Spec to apply to ctr, podman, etc.

Fixes: #5240

Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>
2024-01-31 15:24:07 +08:00
yaoyinnan
b0b8523cea runtime: modify ValidCgroupPath unit test
Modify ValidCgroupPath unit test.

Fixes: #8930

Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>
2024-01-31 14:37:17 +08:00
yaoyinnan
feed5c8ff9 runtime: merged ValidCgroupPath method
Merged ValidCgroupPath method to handle cgroupv1 and cgroupv2.

Fixes: #8930

Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>
2024-01-31 14:37:13 +08:00
yaoyinnan
864389c524 runtime-rs: report error on missing or empty fields in configuration
Removed the setting of default values for runtime fields. Added explicit checks for missing or empty fields, reporting errors with clear messages.

Fixes: #8838

Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>
2024-01-31 12:46:17 +08:00
Wainer dos Santos Moschetta
abc2fcd88f kata-deploy: fix deprecations on kustomization files
Running `kustomize edit fix` on those files replaced the deprecated
instructions ('bases' and 'patchesStrategicMerge') and added
'apiVersion' and 'kind'.

Fixes #8268
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-01-30 18:41:03 -03:00
Lukáš Doktor
4876eadd2f tools: Add reference to the kata webhook's README
The newly added webhook is a new component and ought to be linked from
the main README file.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-01-30 19:05:56 +01:00
Lukáš Doktor
b0b7748f30 ci/openshift-ci: Correct the lib location
correct the lib file locations after the move from
tests->kata-containers repo and add a minimized version of the
".ci/lib.sh" library into the "ci/openshift-ci" as we don't really
utilize all of the features.

Fixes: #8653

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-01-30 19:05:56 +01:00
Lukáš Doktor
4c58478536 ci/openshift-ci: Move openshift-ci from the tests repo
Move the f15be37d9bef58a0128bcba006f8abb3ea13e8da version of scripts
required for openshift-ci from "kata-containers/tests/.ci/openshift-ci"
into "kata-containers/kata-containers/ci/openshift-ci" and required
webhook+libs into "kata-containers/kata-containers/tools/testing" as is
to simplify verification, the different location handling will be added
in following commit.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-01-30 19:05:55 +01:00
Kvlil
3fd5628771 dragonball: fix noop-method-call warning
The `noop-method-call` is a rustc lint that has existed since v1.52.0.
This lint has been moved to the warn by default lint level since v1.73.0.
Therefore the build fails with this version and above.
This commit removes the unnecessary call to `<&T as Deref>::deref` on `T: !Deref`.

Fixes: #8586

Signed-off-by: Kvlil <kalil.pelissier@gmail.com>
2024-01-30 17:16:49 +00:00
Wainer Moschetta
bf54a02e16 Merge pull request #8924 from microsoft/danmihai1/pod-nested-configmap-secret
genpolicy: fix ConfigMap volume mount paths
2024-01-30 14:09:41 -03:00
Gabriela Cervantes
78b517ccc8 tests: Re-arranged nerdctl tests
This PR re-arranges the nerdctl tests to avoid random failures.
The tests now run first with RunC and then with the kata hypervisor.
This tries to avoid the random failures that are happening with cloud-hypervisor
and clh.

Fixes #8963

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-30 16:07:12 +00:00
Dan Mihai
d12875ee66 genpolicy: ignore volume configMap optional field
The auto-generated Policy already allows these volumes to be mounted,
regardless if they are:
- Present, or
- Missing and optional

Fixes: #8893

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-30 15:32:37 +00:00
Fabiano Fidêncio
7a83e6dc14 Merge pull request #8959 from fidencio/topic/crio-bump-runners-to-2204
gha: cri-o: Bump runners to 22.04
2024-01-30 14:27:40 +01:00
Fabiano Fidêncio
34d51b05f8 gha: cri-o: Bump runners to 22.04
This will *not* solve the CRI-O CI breakage but will give us an
environment where we could get it to run locally.

Fixes: #8935 -- part I

Thanks to Julien Ropé for trying to reproduce the issues I faced on
https://github.com/kata-containers/kata-containers/issues/8935 in an
Ubuntu 22.04 system.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-30 14:17:06 +01:00
Xuewei Niu
7e10000b6f Merge pull request #8928 from yaoyinnan/8927/fix/unused-DriverInfo
runtime-rs: fix unused driverInfo error
2024-01-30 20:39:10 +08:00
Hyounggyu Choi
f3bc6e4155 packaging: Use Ubuntu 20.04 for building an agent
This involves using Ubuntu 20.04 as a build environment for an agent to match with a runtime environment.

Fixes: #8955

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-30 10:22:14 +01:00
Pavel Mores
d53edbd0a5 runtime-rs: collect qemu stderr and log it in shim log
Qemu stderr monitoring runs in its own asynchronous green thread.
For that, `stderr` is taken out of the Child representing the qemu child
process to avoid partial move and make it possible for the main thread
still to call functions on QemuInner::qemu_process (e.g. kill(), id()).

Fixes #8937

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-30 09:09:05 +01:00
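The pattern in the commit above, a separate asynchronous task that drains the child's stderr while the main flow keeps control of the process handle, has a close analogue in Python's asyncio. The sketch below is only that analogue, not the runtime-rs implementation, which uses tokio. The function name is hypothetical and the lines are collected in a list instead of being written to a shim log.

```python
import asyncio
import sys


async def run_and_collect_stderr(argv):
    """Run argv and gather its stderr lines in a background task."""
    proc = await asyncio.create_subprocess_exec(
        *argv, stderr=asyncio.subprocess.PIPE
    )
    lines = []

    async def pump():
        # Only the stderr stream is handed to this task; `proc` itself
        # stays usable by the caller (wait(), kill(), pid, ...).
        async for raw in proc.stderr:
            lines.append(raw.decode(errors="replace").rstrip("\n"))

    task = asyncio.create_task(pump())
    returncode = await proc.wait()
    await task
    return returncode, lines
```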
Pavel Mores
684d740122 runtime-rs: switch qemu child process management from std to tokio
We'll want to capture qemu's stderr in parallel with normal runtime-rs
execution.  Tokio's primitives make this much easier than std's.  This
also makes child process management more consistent across runtime-rs
(i.e. virtiofsd child process is already launched and managed using tokio).

Some changes were necessary due to tokio functions being slightly different
from their std counterparts.  Child::kill() is now async and Child::id()
now returns an Option.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-30 09:07:14 +01:00
Dan Mihai
6a8f46f3b8 Merge pull request #8918 from microsoft/danmihai1/metadata
genpolicy: optional PodTemplateSpec metadata field
2024-01-29 12:36:30 -08:00
Dan Mihai
60ac3048e9 genpolicy: fix ConfigMap volume mount paths
Allow Kata CI's pod-nested-configmap-secret.yaml to work with
genpolicy and current cbl-mariner images:

1. Ignore the optional type field of Secret input YAML files.

   It's possible that CoCo will need a more sophisticated Policy
   for Secrets, but this change at least unblocks CI testing for
   already-existing genpolicy features.

2. Adapt the value of the settings field below to fit current CI
   images for testing on cbl-mariner Hosts:

    "kata_config": {
        "confidential_guest": false
    },

    Switching this value from true to false instructs genpolicy to
    expect ConfigMap volume mounts similar to:

        "configMap": {
            "mount_type": "bind",
            "mount_source": "$(sfprefix)",
            "mount_point": "^$(cpath)/watchable/$(bundle-id)-[a-z0-9]{16}-",
            "driver": "watchable-bind",
            "fstype": "bind",
            "options": [
                "rbind",
                "rprivate",
                "ro"
            ]
        },

    instead of:

        "confidential_configMap": {
            "mount_type": "bind",
            "mount_source": "$(sfprefix)",
            "mount_point": "$(sfprefix)",
            "driver": "local",
            "fstype": "bind",
            "options": [
                "rbind",
                "rprivate",
                "ro"
            ]
        }
    },

    This settings change unblocks CI testing for ConfigMaps.

Simple sanity testing for these changes:

genpolicy -u -y pod-nested-configmap-secret.yaml

kubectl apply -f pod-nested-configmap-secret.yaml

kubectl get pods | grep config
nested-configmap-secret-pod 1/1     Running   0          26s

Fixes: #8892

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-29 16:13:47 +00:00
Gabriela Cervantes
31813cf8d8 metrics: Update packages for TensorFlow ResNet Int8 Dockerfile
This PR updates the required packages for the TensorFlow ResNet50
Int8 Dockerfile.

Fixes #8950

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-29 16:11:09 +00:00
Fabiano Fidêncio
087856f26c Merge pull request #8934 from microsoft/danmihai1/nodeName
genpolicy: ignore the nodeName field
2024-01-29 16:57:59 +01:00
Greg Kurz
d687b601f1 Merge pull request #8933 from fidencio/topic/package-coco-guest-components
packaging: Build coco-guest-components
2024-01-29 16:34:06 +01:00
Zvonko Kaiser
a9348fa35b Merge pull request #8375 from zvonkok/opa-binary-fix
arm64: agent_policy build always pulls amd64 opa binary
2024-01-29 15:10:10 +01:00
Fabiano Fidêncio
5ea6a29c37 Merge pull request #8947 from fidencio/topic/gha-pass-down-AZ_SUBSCRIPTION_ID
gha: azure: Set the correct subscription to the account
2024-01-29 15:07:06 +01:00
Fabiano Fidêncio
448c0aaecb gha: azure: Set the correct subscription to the account
Due to the changes done in the CI, we need to set the correct
subscription to be used with the account from now on, otherwise we'd end
up using CoCo subscription.

Fixes: #8946

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-29 15:00:38 +01:00
Pavel Mores
b52a398469 runtime-rs: move creation of VM path from start_vm() to prepare_vm()
This fixes a flaw pointed out in review of PR #8185.  Creation of the
directory semantically fits better into VM preparation than VM launch.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-27 13:46:35 +01:00
Fabiano Fidêncio
98dc2d4c52 rootfs: agent: Initialise AGENT_SOURCE_BIN & AGENT_TARBALL
Otherwise those would be unbound if not passed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-26 19:58:41 +01:00
Fabiano Fidêncio
5e57e0235e rootfs: agent: Fix build with AGENT_SOURCE_BIN
We need to actually check that the env var is not empty. :-)
This was introduced by 8307718842.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-26 19:58:20 +01:00
Fabiano Fidêncio
fbfc880eb6 rootfs: Add COCO_GUEST_COMPONENTS_TARBALL env var
This env var will serve us to pass the Confidential Containers
guest-components tarball to the rootfs builder, which will then just
unpack the content into the rootfs.

Fixes: #8848 -- part I

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-01-26 19:58:19 +01:00
Fabiano Fidêncio
644abde35c packaging: coco-guest-components: Allow building the project
The Confidential Containers guest-components will, in the very short
future, be part of the Kata Containers rootfs that's used by the
Confidential Containers usecase.

This commit introduces the ability to, standalone, build the component
locally and as part of our CI, and this can be done by calling:
`make coco-guest-components-tarball`

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-01-26 19:36:01 +01:00
Hyounggyu Choi
ee072e8a06 Merge pull request #8926 from fidencio/topic/cache-the-agent-for-non-x86_64
gha: Cache the agent for non-x86_64 arches
2024-01-26 18:04:33 +01:00
Dan Mihai
076869aa39 genpolicy: ignore the nodeName field
Validating the node name is currently outside the scope of the CoCo
policy.

This change unblocks testing using Kata CI's test-pod-file-volume.yaml
and pv-pod.yaml.

Fixes: #8888

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-26 16:30:55 +00:00
Dan Mihai
ef1ee81f81 Merge pull request #8909 from microsoft/danmihai1/main-shareProcessNamespace
genpolicy: add shareProcessNamespace support
2024-01-26 05:49:19 -08:00
yaoyinnan
9b7c5c69cf runtime-rs: fix unused driverInfo error
Remove the unused DriverInfo declaration or integrate it into the codebase where applicable.

Fixes: #8927
Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>
2024-01-26 19:59:52 +08:00
Greg Kurz
f41fa7557a Merge pull request #8914 from BbolroC/basic-e2e-ibm-se
tests: Add IBM SE to the basic confidential test
2024-01-26 12:32:32 +01:00
Fabiano Fidêncio
08a082ca47 gha: Cache the agent for non-x86_64 arches
Those are not yet being cached for no reason, and they better be as
it'll allow us to save a considerable amount of time building the
rootfs.

Fixes: #8917

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-26 12:02:26 +01:00
Fabiano Fidêncio
a7c68225aa Merge pull request #8916 from fidencio/topic/packaging-reuse-already-built-agent
packaging: Don't always build the kata-agent
2024-01-26 12:00:55 +01:00
Fabiano Fidêncio
95c569b0a6 packaging: Add safe.directory to the git config
Otherwise building as root will not work, as demonstrated by the arm64
CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-26 09:44:43 +01:00
Hyounggyu Choi
ab462a4b89 tests: Add IBM SE to the basic confidential test
The existing confidential basic test titled `Test unencrypted
confidential container launch success and verify that we are
running in a secure enclave` has been updated to incorporate
IBM Secure Execution (`qemu-se`).
Previously, a secure image was absent from kata-deploy, hindering
the inclusion of IBM SE in the test.
Thanks to the #6755 update, it is now possible to test the TEE.

This modification extends the existing test by introducing
`qemu-se`. The specific changes are outlined below:

- Add an additional test `cc-se-e2e-tests` to s390x nightly
- Expansion of `REMOTE_COMMAND_PER_HYPERVISOR` for `qemu-se`
- Temporary exclusion of two test cases currently incompatible with IBM SE
(`cpu-ns` is a common issue across all TEEs, while `inotify`
will be addressed in a subsequent pull request).

Fixes: #8913

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-26 06:04:39 +01:00
GabyCT
c13a63c8ba Merge pull request #8905 from zvonkok/enable-tpm
qemu: enable TPM
2024-01-25 14:52:00 -06:00
GabyCT
aa958adf90 Merge pull request #8904 from GabyCT/topic/buildbq
tools: Use defined variable in build base qemu script
2024-01-25 13:51:44 -06:00
GabyCT
36fc2fd83f Merge pull request #8876 from GabyCT/topic/dockerrestfp
metrics: Update packages needed for ResNet50 FP32 Dockerfile
2024-01-25 13:51:16 -06:00
Dan Mihai
8ad5459beb genpolicy: optional PodTemplateSpec metadata field
Add metadata containing the Policy annotation if the user didn't
provide any metadata in the input yaml file.

For a simple sanity test using a Kata CI YAML file:

genpolicy -u -y job.yaml

kubectl apply -f job.yaml

kubectl get pods | grep job
job-pi-test-64dxs 0/1     Completed   0          14s

Fixes: #8891

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-25 19:06:59 +00:00
Fabiano Fidêncio
dd49479829 packaging: Don't build the agent if not needed
Let's start relying on the already cached agent to be deployed inside
the rootfs.  By doing this we save a lot of time in our CI, and we have
a better way, for developers, to play with changes in the agent.

Fixes: #8915

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 19:41:33 +01:00
Fabiano Fidêncio
21fd7e6dfd packaging: Fail in case oras can't find an artefact
It just means the component is not cached, and that it must be built in
the usual way.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 19:41:32 +01:00
Fabiano Fidêncio
eb7a33ee71 rootfs: Always strip the agent binary
Let's always do this, regardless of where the agent is coming from.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 19:41:32 +01:00
Fabiano Fidêncio
f23451de01 rootfs: Add xz as a dep
As we'll be untarring the agent tarball (and any other component that
may be part of the rootfs) into the rootfs, we have to have xz
installed.

For debian and ubuntu the package is called xz-utils; for centos,
alpine and cbl-mariner the package is called xz.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 19:41:32 +01:00
Fabiano Fidêncio
8307718842 rootfs: Add AGENT_TARBALL env var
This env var will serve us to pass the agent tarball to the rootfs
builder, which will then just unpack the content into the rootfs instead
of building the agent again.

AGENT_TARBALL and AGENT_SOURCE_BIN should never be used together.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 19:41:32 +01:00
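The rule stated in the commit above (AGENT_TARBALL and AGENT_SOURCE_BIN should never be used together), combined with the need to treat an empty variable as unset, can be sketched as follows. This is a Python illustration of the decision only. The rootfs builder is a shell script, and the function name and return values here are hypothetical.

```python
def select_agent_source(env):
    """Decide where the agent comes from, given an environment mapping.

    Empty strings count as "not set", so both variables are read with a
    default of "" and tested for truthiness.
    """
    tarball = env.get("AGENT_TARBALL", "")
    source_bin = env.get("AGENT_SOURCE_BIN", "")
    if tarball and source_bin:
        raise ValueError(
            "AGENT_TARBALL and AGENT_SOURCE_BIN must not be used together"
        )
    if tarball:
        return ("tarball", tarball)
    if source_bin:
        return ("binary", source_bin)
    # Neither was provided: fall back to building the agent.
    return ("build", "")
```

In practice the mapping passed in would be `os.environ`.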
Fabiano Fidêncio
5b0d0687e5 packaging: agent: Allow building in all arches
We're moving away from alpine and using ubuntu in order to be able to
build the agent for all the architectures we need.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 19:41:32 +01:00
Dan Mihai
535cf04edb genpolicy: add shareProcessNamespace support
Validate the sandbox_pidns field value for CreateSandbox and
CreateContainer.

Fixes: #8868

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-25 16:48:57 +00:00
Dan Mihai
1e24581c07 Merge pull request #8908 from microsoft/danmihai1/genpolicy-permissions
tools: allow all users to execute genpolicy
2024-01-25 08:42:24 -08:00
Dan Mihai
295494c7dc Merge pull request #8898 from microsoft/danmihai1/show-output-of-passing-tests
tests: k8s: bats --show-output-of-passing-tests
2024-01-25 06:22:50 -08:00
Fabiano Fidêncio
1039641ab8 packaging: agent: Add the arch to the builder container
This has been missed during reviews and is already a problem as we're
trying to build the agent outside of the rootfs for other architectures
than x86_64.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 14:11:14 +01:00
Fabiano Fidêncio
58874f9c3e packaging: tools: Add the arch to the builder container
This has been missed during reviews and will become a problem when the
tools start to be built in different architectures.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-25 14:10:22 +01:00
Zvonko Kaiser
76efe25aed Merge pull request #8901 from zvonkok/remove-gha-action
gpu: remove GHA target first then remove the obsoleted Makefile targets
2024-01-25 13:40:03 +01:00
Chelsea Mafrica
24b33ae35b Merge pull request #8884 from GabyCT/topic/ulib
versions: Update libseccomp to version v2.5.5
2024-01-24 23:55:32 -08:00
Dan Mihai
723c76d945 tools: allow all users to execute genpolicy
This tool can be useful for any users.

Fixes: #8907

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-25 00:40:53 +00:00
Zvonko Kaiser
19ecdbca3b qemu: enable TPM
Several use-cases need a vTPM, so let's enable it for QEMU. A follow-up patch will introduce the runtime config.

Fixes: #8902

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-01-24 17:49:08 +00:00
Gabriela Cervantes
98b5a19b3a tools: Use defined variable in build base qemu script
This PR uses a variable that is already defined in the build base
qemu script to have uniformity across the script as this variable
is already used in the script.

Fixes #8903

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-24 17:05:17 +00:00
Zvonko Kaiser
4b8d79c1f6 gpu: remove GHA target first then remove the obsoleted Makefile targets
Let's remove the GHA target actions first so that the follow-up PR #8874 tests succeed.

Fixes: #8900

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-01-24 11:43:39 +00:00
Dan Mihai
66c012d052 tests: k8s: bats --show-output-of-passing-tests
Add --show-output-of-passing-tests to the k8s integration tests. The
output of a passing test can be helpful when investigating a failure
of the same test.

Fixes: #8885

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-24 03:04:28 +00:00
Hyounggyu Choi
f4290688bb Merge pull request #7146 from BbolroC/ibm-se-howto-doc
docs: provide a guide for how to use IBM Secure Execution
2024-01-23 22:48:05 +01:00
Hyounggyu Choi
25ecca91c6 docs: provide a guide for how to use IBM Secure Execution
This PR is to add a document for how to run kata containers under IBM
Secure Execution environment.

Fixes: #7025

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-23 18:58:27 +01:00
Greg Kurz
0f67a26751 Merge pull request #8812 from kalil-pelissier/feature/issue-7720/drop-dead-code
runtime: remove SharedVersions field dead code
2024-01-23 17:46:41 +01:00
Gabriela Cervantes
1b0d12ab78 versions: Update libseccomp to version v2.5.5
This PR updates the libseccomp version to v2.5.5 which includes
the following changes:
- Update the syscall table for Linux
- Fix minor issues with binary tree testing and with empty binary trees

Fixes #8883

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-23 16:31:25 +00:00
Zvonko Kaiser
ab597a4d5b opa: Improve the download logic
The versions.yaml has a default for the amd64 binary, but there is no
code to actually build the arm64 binary, which seems to be an oversight.

Let's simplify the OPA logic by removing the direct link to the binary,
and construct that link as part of the checks we do to decide whether we
need to build OPA or not.

Fixes: #8373

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-23 09:16:16 +00:00
Greg Kurz
4516f38165 Merge pull request #8872 from zvonkok/nvidia-gpu-confidential
gpu: Add NVIDIA GPU Confidential kernel target
2024-01-23 09:22:27 +01:00
Dan Mihai
3d2ec5c919 Merge pull request #8857 from microsoft/danmihai1/k8s-gha
gha: get ready to install genpolicy
2024-01-22 08:29:24 -08:00
Gabriela Cervantes
eb7e123de8 metrics: Update packages needed for ResNet50 FP32 Dockerfile
This PR updates the packages necessary to build the ResNet50 fp32
Dockerfile so that the benchmark runs properly.

Fixes #8875

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-22 16:15:36 +00:00
Zvonko Kaiser
4fc34323ae gpu: Add NVIDIA GPU Confidential kernel target
This is a follow-up to the work of minimizing targets, unifying the TDX and SNP builds for NVIDIA GPUs.

Fixes: #8828

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-01-22 14:58:57 +00:00
Kvlil
a4b208a712 runtime: remove SharedVersions field dead code
The SharedVersions field adds a versiontable property that isn't supported by upstream QEMU.
This is dead code since virtcontainers isn't setting SharedVersions to true.

Fixes: #7720

Signed-off-by: Kvlil <kalil.pelissier@gmail.com>
2024-01-22 12:18:42 +00:00
Dan Mihai
ea9c659d36 gha: get ready to install genpolicy
The changes to install and test genpolicy must come later, after CI
picks up these gha changes.

Fixes: #8856

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-19 23:37:49 +00:00
GabyCT
bb1ada1a8b Merge pull request #8855 from GabyCT/topic/updatefc
versions: Update firecracker version
2024-01-19 16:25:50 -06:00
Fabiano Fidêncio
1e30fde8fa Merge pull request #8862 from microsoft/danmihai1/genpolicy-dns
genpolicy: ignore pod DNS settings
2024-01-19 23:08:26 +01:00
Dan Mihai
ca03d47634 genpolicy: ignore pod DNS settings
Ignore pod DNS settings because policing the network traffic is
currently outside the scope of the Agent Policy.

Example from Kata CI: pod-custom-dns.yaml

Fixes: #8832

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-19 16:42:35 +00:00
Alex.Lyn
826c751bf3 Merge pull request #8185 from pmores/add-qemu-cmdline-generation-framework
Add qemu cmdline generation framework
2024-01-19 21:42:49 +08:00
Greg Kurz
b7d6b18768 Merge pull request #8485 from BbolroC/add-unit-test-s390x
GHA: Enable static check for s390x, aarch64 and ppc64le
2024-01-19 11:49:16 +01:00
Pavel Mores
25c8d5db5d runtime-rs: use qemu cmdline generation framework to launch VM
Deploy the framework added by the previous commit to generate qemu
command line and launch the VM.

We now properly store the child process object which allows us to
implement remaining Hypervisor functions necessary for a simple but
successful VM lifecycle, get_vmm_master_tid() and stop_vm().

Fixes #8184

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-19 11:42:23 +01:00
Gabriela Cervantes
0696807384 versions: Update firecracker version
This PR updates the firecracker version to v1.6.0 which includes
the following features
- Added support for per net device metrics. In addition to aggregate metrics net, each individual net device will emit metrics under the label "net_{iface_id}". E.g. the associated metrics for the endpoint "/network-interfaces/eth0" will be available under "net_eth0" in the metrics json object.
- Added support for per block device metrics. In addition to aggregate metrics block, each individual block device will emit metrics under the label "block_{drive_id}". E.g. the associated metrics for the endpoint "/drives/{drive_id}" will be available under "block_drive_id" in the metrics json object.
- Added a new vm-state subcommand to info-vmstate command in the snapshot-editor tool to print MicrovmState of vmstate snapshot file in a readable format. Also made the vcpu-states subcommand available on x86_64.
- Added source-level instrumentation based tracing. See tracing for more details.
- Added developer preview only (NOT for production use) support for vhost-user block devices. Firecracker implements a vhost-user frontend. Users are free to choose from existing open source backend solutions or their own implementation. Known limitation: snapshotting is not currently supported for microVMs containing vhost-user block devices. See the related doc page for details. The device emits metrics under the label "vhost_user_{device}_{drive_id}".

Fixes #8854

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-18 15:50:30 +00:00
Amulyam24
f6fea5f2ca agent: fix failing unit tests on ppc64le
- test_volume_capacity_stats: verify the file block size against the fetched size via statfs()
- test_reseed_rng: Correct the request codes for RNDADDTOENTCNT and RNDRESEEDCRNG when platform is ppc64le
- test list_routes: Add the route only if destination is not empty
- test_new_fs_manager: skip the test if cgroups v2 is used by default
- skip test cases rpc::tests::test_do_write_stream, sandbox::tests::test_find_process, sandbox::tests::test_find_container_process and sandbox::tests::add_and_get_container on ppc64le as they are flaky

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:32:16 +01:00
Hyounggyu Choi
610f878894 dragonball: Fix compile error for aarch64
This is to fix a compile error raised for aarch64.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-18 16:32:15 +01:00
Amulyam24
376941cf69 kata-ctl: skip building kata-ctl on ppc64le
kata-ctl currently fails to build on ppc64le. Skip it when running static checks; the build failures will be tracked and fixed in a separate issue.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:31:13 +01:00
Amulyam24
4ecd82a5df runk: skip the test_init_container_create_launcher if not root on ppc64le
This is to skip the test_init_container_create_launcher if not root on ppc64le.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:31:13 +01:00
Amulyam24
a4b5447924 tools: fix makefile spacing
This minor PR removes the extra space in the makefiles.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:31:13 +01:00
Amulyam24
394777291d runtime: fix failing unit tests on ppc64le
A few CPU related test cases were failing as the version was being verified against Power8 while the CI machine is Power9.

Fixes: #5531

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:31:13 +01:00
Amulyam24
486b8a0538 dragonball: skip running static-checks for ppc64le
Since dragonball is not currently supported on ppc64le, skip running the targets for static-checks.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:31:13 +01:00
Amulyam24
14934c7b0d github: run static checks on ppc64le
This PR adds ppc64le runner to the static-checks workflow.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-01-18 16:31:13 +01:00
Hyounggyu Choi
8061a49ca5 kata-ctl: Clean up a test leftover file explicitely
It was observed that a temporary file `/tmp/kata_hybrid_vsock02.hvsock`
for test_setup_hvsock_failed() is not removed from time to time.
This leads to a failure of the same test on the next run due to the
file permission on a self-hosted runner.
This commit explicitly deletes the file before the check starts.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-18 16:31:13 +01:00
Hyounggyu Choi
290ecf4c46 Static-check: Exclude s390x from dragonball and runtime-rs
At the moment, the `dragonball` and `runtime-rs` projects do not support
s390x. During the enablement, some errors due to the misconfiguration
of the Makefile for `make check` and `make vendor` were identified.

This is to skip the build for the affected target of the projects.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-18 16:31:13 +01:00
Hyounggyu Choi
c0f57c9e0a Lint: Fix cargo clippy errors for s390x
Some linting errors were identified during the enablement of `make check`.
These were not found by the Jenkins CI job because only `make test` was
triggered.

The errors for the `agent` occur in the s390x-specific tests, while
the ones for `kata-ctl` are in the architecture-specific code.

This commit is to fix those errors.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-18 16:31:13 +01:00
Hyounggyu Choi
a1f288e5d3 CI: Use sudo if yq_path is not writable by USER
If `yq_path` is set to `/usr/local/bin/yq`, there could be a situation
where the `yq` cannot be installed without `sudo`.
This commit handles the situation by putting `sudo` in front of `curl`
and `chmod`, respectively.
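
The check described above can be sketched in Python as follows. This is only an illustration, not the CI script itself: the function names and the download URL are hypothetical, and the writability test is assumed to be made on the directory that will hold `yq`.

```python
import os


def install_prefix(yq_path):
    """Return ["sudo"] when the directory that will hold yq is not writable."""
    target_dir = os.path.dirname(yq_path) or "."
    return [] if os.access(target_dir, os.W_OK) else ["sudo"]


def install_commands(yq_path, url):
    """Build the curl and chmod invocations, each with the same optional prefix."""
    prefix = install_prefix(yq_path)
    return [
        prefix + ["curl", "-L", "-o", yq_path, url],
        prefix + ["chmod", "+x", yq_path],
    ]


# Hypothetical URL, used only to show the shape of the resulting commands.
commands = install_commands("/usr/local/bin/yq", "https://example.com/yq")
```

Computing the prefix once keeps `curl` and `chmod` consistent, so both commands run either with or without `sudo`.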

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-18 16:31:13 +01:00
Hyounggyu Choi
354cbede9c GHA: Enable static check for s390x
As part of the CI migration from Jenkins to GitHub Actions, a CI job named
`kata-containers-2.0-ubuntu-s390x-unit-PR` is covered by the static check.
This commit is to enable the check for s390x by incorporating a runner
`s390x` with the corresponding workflow.

Fixes: #8482

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-18 16:31:13 +01:00
Jianyong Wu
ba74a624a8 runtime-rs: use pathBuf only for x86
PathBuf here is only used for x86.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2024-01-18 16:31:13 +01:00
Jianyong Wu
a10779bf0b GHA: enable static check on arm64
This is to add a runner for arm64 to the workflow.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2024-01-18 16:31:11 +01:00
Dan Mihai
eeba459a6b Merge pull request #8845 from microsoft/danmihai1/genpolicy-defaults
tools: install genpolicy settings files
2024-01-17 15:08:49 -08:00
Chelsea Mafrica
32ad465663 Merge pull request #8710 from jodh-intel/runtime-rs-ch-get-thread-ids
runtime-rs: ch: Implement minimal implementation for missing thread/pid APIs
2024-01-17 14:51:44 -08:00
Fabiano Fidêncio
147d5fd752 Merge pull request #8836 from microsoft/danmihai1/test-with-cbl-mariner
genpolicy: use root path from cbl-mariner Guest VM
2024-01-17 17:51:44 +01:00
Pavel Mores
f550d9a325 runtime-rs: add basic implementation of qemu command line generation
This current framework is enough to launch a VM with a simple container
in it (e.g. busybox).

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-17 12:55:00 +01:00
Pavel Mores
e8e13044da runtime-rs: add simple impls to some of Qemu's Hypervisor functions
The idea of most of these is just to prevent running into todo!()s where
we can at the moment, while implementing the fundamental functionality of
VM launch.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-17 12:55:00 +01:00
Dan Mihai
febabef08c tools: install genpolicy settings files
Install the default genpolicy OPA rules and settings JSON files, in
addition to the genpolicy binary.

Fixes: #8844

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-16 23:59:59 +00:00
David Esparza
e11c520ffa Merge pull request #8808 from kata-containers/memory_usage_test_skip_virtiofs_when_req
tests: Ignore virtiofs contribution to memory usage when it is disabled.
2024-01-16 16:50:06 -06:00
Dan Mihai
69557e5ad6 Merge pull request #8814 from microsoft/danmihai1/genpolicy-kata-deploy
tools: genpolicy static checks
2024-01-16 07:33:42 -08:00
Dan Mihai
13f2398fe8 Merge pull request #8837 from microsoft/danmihai1/allow_storages
genpolicy: temporarily disable allow_storages()
2024-01-16 07:10:49 -08:00
Alex.Lyn
17719f1ac5 Merge pull request #8708 from Apokleos/directvol-bugfix-blk-pci
runtime-rs: bugfix for DirectVolume/rawblock when driver is blk
2024-01-16 14:25:16 +08:00
alex.lyn
99717371c1 runtime-rs: bugfix for DirectVolume/rawblock when driver is blk
DirectVolume/Rawblock doesn't work well when device's block driver
is virtio-blk-pci and the storage handler is DRIVER_BLK_PCI_TYPE.

Fixes: #8707

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-16 10:35:08 +08:00
Dan Mihai
205dafd323 genpolicy: temporarily disable allow_storages()
Temporarily disable the allow_storages() rules, because they are based
on the tarfs snapshotter + container image integrity information that
are not available yet in the main branch - see #8833.

Fixes: #8834

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-15 23:55:27 +00:00
Dan Mihai
f4106a6107 genpolicy: use root path from cbl-mariner Guest VM
Adjust genpolicy-settings.json to match the container root path from
the main branch + cbl-mariner Guest VMs.

This configuration might have to be adjusted again when other types of
Guest VMs are tested during CI using genpolicy, in the future.

Also, improve logging from allow_root_path(), to make these
issues easier to debug in the future.

Fixes: #8835

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-15 23:33:28 +00:00
GabyCT
37a4049d0f Merge pull request #8830 from GabyCT/topic/removeprotocol
metrics: Remove iperf3 server protocol
2024-01-15 14:44:39 -06:00
Dan Mihai
201eec628a tools: genpolicy static checks
Package genpolicy and enable static checks for it.

Fixes: #8813

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-15 16:49:58 +00:00
David Esparza
4b772d2480 tests: Ignore virtiofs contribution to memory usage when it is disabled.
This PR removes the references to virtiofs from memory average
calculation when the container uses a shared file system other than
virtiofs.

Fixes: #8807

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-01-15 08:07:06 -08:00
Gabriela Cervantes
dff800a8ff metrics: Remove iperf3 server protocol
This PR removes the iperf3 server protocol as this server definition is
also used for the UDP iperf3 benchmarks to avoid duplication of the
same yaml files.

Fixes #8829

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-15 15:44:24 +00:00
Fabiano Fidêncio
0dc00ae373 Merge pull request #8822 from microsoft/danmihai1/cargo-clippy
genpolicy: cargo clippy fixes
2024-01-15 14:59:04 +01:00
Fabiano Fidêncio
73cf31bd9e Merge pull request #8827 from microsoft/danmihai1/disable-k8s-oom
tests: cbl-mariner: disable k8s-oom.bats
2024-01-15 14:40:16 +01:00
Xuewei Niu
923bd65dff Merge pull request #8819 from justxuewei/rm-protocol-backend
dragonball: Remove unused definition
2024-01-15 10:09:46 +08:00
Dan Mihai
b7c31e3b98 tests: cbl-mariner: disable k8s-oom.bats
Disable k8s-oom.bats on cbl-mariner until it passes more often.

Fixes: #8824

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-14 17:39:25 +00:00
Dan Mihai
681cb1626a genpolicy: cargo clippy fixes
Clean up cargo clippy errors.

Fixes: #8818

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-14 01:23:46 +00:00
Dan Mihai
3af713acd4 Merge pull request #8817 from microsoft/danmihai1/cargo-fmt
genpolicy: "cargo fmt -- --check" clean-up
2024-01-13 16:22:27 -08:00
Xuewei Niu
f1fda3d6b0 dragonball: Remove unused definition
`EndpointProtocolFlags::ProtocolBackend` is removed due to no reference.

Fixes: #8745

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-13 13:25:11 +08:00
Dan Mihai
dcaae54cf6 genpolicy: "cargo fmt -- --check" clean-up
Also, update Cargo.lock

Fixes: #8816

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-13 01:57:00 +00:00
GabyCT
a7114a35a8 Merge pull request #8792 from GabyCT/topic/updatenhwc
metrics: Use a specific python version to run tensorflow benchmark
2024-01-12 11:24:54 -06:00
Alex.Lyn
ffcd95b6b4 Merge pull request #8737 from Apokleos/test-ci-dgb-cri-containerd
ci: enable test dragonball stability and cri-containerd
2024-01-12 11:56:22 +08:00
Fabiano Fidêncio
a606401722 Merge pull request #8803 from jodh-intel/issues-8784-runtime-rs-ch-rm-todo-to-unbreak
runtime-rs: ch: Unbreak CH driver
2024-01-11 19:37:13 -03:00
Gabriela Cervantes
12a41f89b1 metrics: Use a specific python version to run tensorflow benchmark
This PR uses a specific python version to run tensorflow benchmark
as it needs python 3.8 to run correctly and avoid failures.

Fixes #8791

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-11 22:15:31 +00:00
GabyCT
2ffb161958 Merge pull request #8763 from stevenhorsman/fix-backport-check-hub
Fix backport check hub
2024-01-11 15:15:12 -06:00
Fabiano Fidêncio
86a6d133e4 Merge pull request #8248 from microsoft/danmihai1/genpolicy-main
tools: add policy generation tool
2024-01-11 17:02:54 -03:00
GabyCT
69be050ff9 Merge pull request #8657 from WenyuanLau/8656/Fix_StratoVirt_on_gha_metrics
gha: Fix the failure of gha metrics for StratoVirt
2024-01-11 11:41:25 -06:00
James O. D. Hunt
29e0de4e4a runtime-rs: ch: Implement minimal memory hotplug APIs
Replace the `todo!()` calls with a minimal NOP implementation to return
the CH driver to working order since the `todo!()`'s forcibly crash the
driver at runtime. Full implementations for these APIs will be added on
issues #8800, #8801, and #8802.

Fixes: #8784.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-01-11 14:11:31 +00:00
James O. D. Hunt
1c0df670af runtime-rs: ch: Add minimal implementation of hypervisor metrics method
Remove the `todo!()` macro which would cause a runtime crash and replace
it with an implementation that returns an error as a stop-gap until #8800 is
implemented.

Fixes: #8785.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-01-11 14:11:01 +00:00
alex.lyn
b97efc3139 CI: enable test container memory update for dragonball
Fixes: #8746

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-11 19:07:33 +08:00
alex.lyn
6c85e95c34 CI: bugfix for dragonball when CI running with cri-containerd
A wrong setting in the containerd runtime options caused the failure.
Correct it as below:
...
 [plugins.cri.containerd.runtimes.${runtime}.options]
   ConfigPath= "${KATA_CONFIG_PATH}"
...

Fixes: #8746

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-11 17:35:33 +08:00
alex.lyn
cd59d31a15 CI: make CI work for dragonball to test stability and cri-containerd
Remove the skip setting and make the tests work for dragonball.

Fixes: #8746

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-11 17:35:13 +08:00
Hyounggyu Choi
f62ec0a7f5 Merge pull request #8693 from BbolroC/ibm-se-config-validation-fix
runtime: Allow no initrd path for IBM Z Secure Execution
2024-01-11 09:53:51 +01:00
Xuewei Niu
70305fefc5 Merge pull request #8780 from justxuewei/containerd-events
runtime-rs: Forward events to containerd via ttrpc
2024-01-11 14:58:14 +08:00
Xuewei Niu
6fd49f7604 runtime-rs: Forward events to containerd via ttrpc
Forwarding events through the containerd CLI is rather heavy for
runtime-rs compared to using ttrpc. Plus, for runtimes that don't have
this mechanism, e.g. CRI-O, we can't get those events anywhere.

This patch introduces two types of forwarders:

- `ContainerdForwarder`: Acquire ttrpc address from environment variables
  and forward events via ttrpc connection.
- `LogForwarder`: Write event info into logs.
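
The two forwarders can be sketched in Python as follows. This is not the runtime-rs code: the `TTRPC_ADDRESS` variable name, the `new_forwarder` helper, and the fallback to logging when no address is set are assumptions made for illustration.

```python
import os


class LogForwarder:
    """Records event info as log lines."""

    def __init__(self):
        self.lines = []

    def forward(self, event):
        self.lines.append(f"event: {event}")


class ContainerdForwarder:
    """Stands in for a forwarder that publishes events over a ttrpc connection."""

    def __init__(self, address):
        self.address = address
        self.sent = []

    def forward(self, event):
        # A real implementation would send the event over ttrpc here.
        self.sent.append((self.address, event))


def new_forwarder(env=None):
    """Pick a forwarder based on the environment (assumed selection rule)."""
    env = os.environ if env is None else env
    address = env.get("TTRPC_ADDRESS")  # assumed variable name
    return ContainerdForwarder(address) if address else LogForwarder()
```

Passing the environment in as a mapping keeps the selection logic testable without touching the process environment.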

Fixes: #7881

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-11 10:32:50 +08:00
GabyCT
a8be3d0450 Merge pull request #8796 from GabyCT/topic/uruncv
versions: Update runc version
2024-01-10 14:16:20 -06:00
Gabriela Cervantes
e69f7c07a7 versions: Update runc version
This PR updates the runc version to 1.1.11 which includes the
following improvements

- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2. Add
swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
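
The cgroup v1 subtraction mentioned in the list above can be illustrated with a small Python sketch. The function name is hypothetical, and clamping the result at zero is a defensive assumption rather than something stated in the release notes.

```python
def swap_only_usage(memsw_usage, mem_usage):
    """cgroup v1: swap-only usage = (memory+swap usage) - memory usage."""
    # Clamp at zero in case the two counters were sampled at different times.
    return max(memsw_usage - mem_usage, 0)


# Example: 768 bytes of memory+swap with 512 bytes of memory leaves 256 of swap.
example = swap_only_usage(768, 512)
```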

Fixes #8795

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-10 16:46:08 +00:00
Greg Kurz
0c37aec7dc Merge pull request #8753 from fidencio/topic/add-confidential-artefacts
TEEs: Introduce kernel-confidential
2024-01-10 16:59:57 +01:00
Alex.Lyn
695440a431 Merge pull request #8749 from Apokleos/fixup-dragonball-vfio
runtime-rs: fixup vfio device in runtime-rs/dragonball
2024-01-10 15:20:34 +08:00
Dan Mihai
de61b4d4e2 Merge pull request #8772 from microsoft/danmihai1/wait-for-delete
tests: list the current k8s pods
2024-01-09 13:45:55 -08:00
Fabiano Fidêncio
c3f6eaa267 build-kernel: Fix typo 'terball' -> 'tarball'
SSIA. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-09 14:35:45 -03:00
Fabiano Fidêncio
8b2f43a2c2 build: Add "confidential" kernel
We're using a Kernel based on v6.7, which should include all the
patches needed for SEV / SNP / TDX.

By doing this, later on, we'll be able to stop building the specific
kernel for each one of the targets we have for the TEEs.

Let's note that we've introduced the "confidential" target for the
kernel builder script, while the TEE specific builds are being kept as
they are -- at least for now.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-09 14:35:45 -03:00
Jianyong Wu
379e2f3da2 kernel: update some configs based on kernel 6.5 and 6.6
There are lots of configs removed from latest kernel. Update them here
for convenience of next kernel upgrade.

Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1]
Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2]
Remove CONFIG_NET_SCH_CBQ [3]
Remove CONFIG_AUTOFS4_FS [4]
Remove CONFIG_EMBEDDED [5]

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a

Fixes: #8408
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2024-01-09 14:35:45 -03:00
Fabiano Fidêncio
cf4835e3ae packaging: qemu: Simplify "--disable-virtiofsd" logic
As all the supported architectures are disabling the virtiofsd build,
there's no need to keep the switch statement there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-09 14:35:45 -03:00
Fabiano Fidêncio
bfc6fc7a85 build: Get rid of QEMU experimental
We've not been building QEMU experimental for a very long time, and the
entry there has only been serving the purpose to clutter the
versions.yaml (in the best case scenario) or even confuse new
contributors to the project.

Mind that the machinery to build the QEMU experimental is not touched,
and that's used to build the TEE-capable artefacts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-09 14:35:45 -03:00
GabyCT
4ac5f13722 Merge pull request #8789 from GabyCT/topic/installimagestress
tests: Add check images as part of install dependencies
2024-01-09 09:28:13 -06:00
GabyCT
393edf380a Merge pull request #8778 from GabyCT/topic/fixin
packaging: Fix indentation of build static stratovirt
2024-01-09 09:27:52 -06:00
Greg Kurz
e3611cf27d Merge pull request #8326 from cheriL/8325/fix_method_param
agent: use method params instead of const params in functions
2024-01-09 07:35:19 +01:00
Gabriela Cervantes
24fab19f6f tests: Remove check images function from stressng test
This PR removes the check images function from stressng test as now
it will be part of the install dependencies function from gha-run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-08 17:40:39 +00:00
Gabriela Cervantes
aceba94d95 tests: Add check images as part of install dependencies
To avoid random failures while trying to build and install the stressng image,
this PR moves that step as part of the install dependencies in order to move
the stability tests and avoid timeouts.

Fixes #8787

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-08 17:38:14 +00:00
Pavel Mores
0cfb2d2570 runtime-rs: add simple Persist implementation for Qemu
This is not necessarily meant to work, just to stub out unimplemented
functionality while focusing on more fundamental things.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-08 13:12:39 +01:00
Pavel Mores
45862aeec0 runtime-rs: add default rootfs type for qemu
Make sure that rootfs type is known early on even if it's not set in
configuration.toml.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-01-08 13:12:39 +01:00
Gabriela Cervantes
7d41c97f60 packaging: Fix indentation of build static stratovirt
This PR fixes the indentation of the build static stratovirt script
for kata containers.

Fixes #8777

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-05 18:06:08 +00:00
Dan Mihai
90c782f928 tests: list the current k8s pods
Log the list of the current pods between tests because these pods
might be related to cluster nodes occasionally running out of memory.

Fixes: #8769

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-05 16:41:43 +00:00
Xuewei Niu
192c6ee9c3 Merge pull request #8773 from justxuewei/dbs-k8s-fragile
2024-01-05 12:54:32 +08:00
Xuewei Niu
0e9d73fe30 agent: Fix an issue reporting OOM events by mistake
The agent registers an event fd in `memory.oom_control`. An OOM event is
forwarded to containerd when the event is emitted, regardless of the
content in that file.

I observed content indicating that events should not be forwarded, as shown
below. When `oom_kill` is set to 0, it means no OOM has occurred. Therefore,
it is important to check the content to avoid mistakenly forwarding OOM
events.

```
oom_kill_disable 0
under_oom 0
oom_kill 0
```
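
A minimal Python sketch of the content check, assuming the file uses the key/value layout shown above. The function names are hypothetical and this is not the agent's actual code.

```python
def parse_oom_control(content):
    """Parse the 'key value' lines of a memory.oom_control file into a dict."""
    fields = {}
    for line in content.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            fields[parts[0]] = int(parts[1])
    return fields


def should_forward_oom_event(content):
    """Forward only when the oom_kill counter reports at least one kill."""
    return parse_oom_control(content).get("oom_kill", 0) > 0
```

With the content shown above, `should_forward_oom_event` returns `False`, so the event would not be forwarded.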

Fixes: #8715

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-05 11:06:37 +08:00
Dan Mihai
b18f269ccf Merge pull request #8735 from microsoft/danmihai1/set-policy
agent: hold lock while setting new policy
2024-01-04 13:28:21 -08:00
GabyCT
5ea07c2b3e Merge pull request #8776 from GabyCT/topic/addextraqemu
tests: Add hypervisor component to kill kata components function
2024-01-04 14:29:52 -06:00
Gabriela Cervantes
4ad1971a0a tests: Add hypervisor component to kill kata components function
This PR adds the qemu-experimental hypervisor in the function to
kill kata components.

Fixes #8775

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-04 17:07:12 +00:00
stevenhorsman
6bac3323be workflows: Update backport-label to use gh-utils.sh
- hub is deprecated, so use the new gh-utils.sh script that wraps the github cli instead

Fixes: #8125
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-01-04 16:48:34 +00:00
stevenhorsman
0d5d1c8c36 ci: Add gh-util.sh script
- The hub tool is now deprecated, so introduce a new alternative to `hub-util.sh`
https://github.com/kata-containers/.github/blob/main/scripts/hub-util.sh
that works with it.
Initially I've only started with the couple of commands that we use regularly, but we can extend it in future.
- Expects jq to be installed and `gh` to be installed and set up (see [1])
- Now we don't have lots of repos, I've moved it into `kata-containers` rather than `.github`,
so it is more visible.

Fixes: #8125

[1] https://docs.github.com/en/github-cli/github-cli/quickstart#prerequisites

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-01-04 16:48:34 +00:00
Dan Mihai
7d5336aca3 agent: hold lock while setting new policy
Don't release the lock between is_allowed and set_policy calls,
because the policy might change in between these calls.
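
The check-then-set pattern can be sketched in Python as follows. This illustrates holding one lock across both steps and is not the agent's code: the `PolicyStore` class and the `allow_set_policy` key are hypothetical.

```python
import threading


class PolicyStore:
    """Holds a policy and replaces it only if the current policy allows it."""

    def __init__(self, policy):
        self._lock = threading.Lock()
        self._policy = dict(policy)

    def try_set_policy(self, new_policy):
        # The permission check and the update share one lock acquisition,
        # so no other thread can change the policy between the two steps.
        with self._lock:
            if not self._policy.get("allow_set_policy", False):
                return False
            self._policy = dict(new_policy)
            return True

    def current(self):
        with self._lock:
            return dict(self._policy)
```

If the lock were released after the check and taken again for the update, a second caller could replace the policy in between, and the first update would be applied under a policy that no longer permits it.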

Also, move more policy code into policy.rs.

Fixes: #8734

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-04 16:45:30 +00:00
GabyCT
f056ffe5ef Merge pull request #8759 from fadecoder/update_docs_for_stratoVirt_VMM
docs: Update docs for new StratoVirt VMM introduction
2024-01-04 10:39:37 -06:00
GabyCT
4f9ee7b31c Merge pull request #8766 from GabyCT/topic/improvedeleteion
metrics: Improve iperf3 cleanup
2024-01-04 10:38:33 -06:00
Xuewei Niu
b5a6e74cdf Merge pull request #8744 from justxuewei/vhu-net-compile
dragonball: Fix compilation issue without all net features
2024-01-04 19:02:55 +08:00
Xuewei Niu
db948f685d Merge pull request #8757 from justxuewei/upgrade-containerd-shim-protos
runtime-rs|agent|protocols|agent-ctl: Bump ttrpc and containerd-shim-protos versions
2024-01-04 19:02:42 +08:00
soup
7c176a62fe agent: use method params instead of const params in functions
Fixes: #8325

Signed-off-by: soup <lqh348659137@outlook.com>
2024-01-04 09:29:29 +01:00
Xuewei Niu
f97f16a44a agent-ctl: Bump ttrpc version
- `ttrpc` from `0.7.1` to `0.8`.

Fixes: #8757

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-04 15:58:34 +08:00
Xuewei Niu
bf59c7b3d4 runtime-rs: Bump ttrpc and containerd-shim-protos versions
- `ttrpc` from `0.7.1` to `0.8`.
- `containerd-shim-protos` from `0.3.0` to `0.6.0`.

Fixes: #8756

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-04 15:58:34 +08:00
Xuewei Niu
cf9a0e21a1 protocols: Bump ttrpc version
- `ttrpc` from `0.7.1` to `0.8`.

Fixes: #8756

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-04 15:58:34 +08:00
Xuewei Niu
91360e7ddb agent: Bump ttrpc version
- `ttrpc` from `0.7.1` to `0.8`.

Fixes: #8756

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-01-04 15:58:34 +08:00
Chao Wu
0f532175fe Merge pull request #8771 from openanolis/chao/fix_ut
dbs-pci: introduce Cargo.lock to prevent the influence from upstream
2024-01-04 15:14:22 +08:00
Zhigang Wang
44b5b88f4c docs: Update docs for new StratoVirt VMM introduction
As the StratoVirt VMM has been added, we can update the docs
and add an introduction to StratoVirt, so that users can learn more
about the hypervisor choices.

Fixes: #8645

Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com>
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2024-01-04 14:26:48 +08:00
Chao Wu
f1235ddba3 dbs_virtio_devices: add Cargo.lock
To keep upstream rust-vmm changes from breaking Dragonball
compilation, we introduce Cargo.lock to the dbs crates.

fixes: #8770

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-01-04 11:23:30 +08:00
Chao Wu
02cd726bfc dbs-utils: add Cargo.lock
To keep upstream rust-vmm changes from breaking Dragonball
compilation, we introduce Cargo.lock to the dbs crates.

fixes: #8770

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-01-04 11:17:45 +08:00
Chao Wu
97bdc1529b dbs-pci: introduce Cargo.lock
As reported in #8767, we have found that the root cause is that rust-vmm's vmm-sys-utils
introduced a new release, 0.12.1, and dbs-pci relies on rust-vmm's vfio-ioctls, which uses >=
to declare vmm-sys-utils, so it automatically upgrades vmm-sys-utils to 0.12.1.
That's how two different versions of vmm-sys-utils are introduced, and this breaks the compilation.

In order to fix this and also avoid future problems, we introduce Cargo.lock file to dbs crates.
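
The effect of the `>=` requirement can be illustrated with a minimal Python sketch. It is not Cargo's actual resolver: `resolve` and the release list are hypothetical, and only the 0.12.1 release comes from the report above. Without a lock the newest matching release is picked; with a lock the recorded version is kept.

```python
def parse_version(text):
    """Turn "0.12.1" into (0, 12, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve(minimum, available, locked=None):
    """Pick a version for a `>= minimum` requirement (hypothetical helper).

    A locked version, as recorded in a lock file, always wins. Otherwise the
    newest available release that satisfies the requirement is selected.
    """
    if locked is not None:
        return locked
    candidates = [v for v in available if parse_version(v) >= parse_version(minimum)]
    return max(candidates, key=parse_version)


# Hypothetical release history; only 0.12.1 is named in the report.
releases = ["0.11.0", "0.12.0", "0.12.1"]

print(resolve("0.11.0", releases))                   # floats to the newest release
print(resolve("0.11.0", releases, locked="0.12.0"))  # stays on the locked release
```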

fixes: #8770

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-01-04 11:11:56 +08:00
Gabriela Cervantes
4bc67dba08 metrics: Improve iperf3 cleanup
This PR improves the iperf3 cleanup to ensure that all components are
deleted properly, avoiding the random failures caused by leftover
iperf3 clients on the kata metrics CI.

Fixes #8765

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-03 17:14:38 +00:00
alex.lyn
d2080fd221 runtime-rs: refactor getting the vfio device guest pci path
Fixes: #8748

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-02 14:28:34 +08:00
alex.lyn
d795fcfc2f runtime-rs: bridge the vfio device between runtime-rs and dragonball
Previously, Dragonball did not support PCI device hot-plugging or
VFIO device passthrough. Therefore, the runtime-rs support for
Dragonball was incomplete. It is time to complete it so that users
can use Dragonball's PCI hot-plugging and VFIO passthrough capabilities.

Fixes: #8748

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-02 14:28:10 +08:00
Chao Wu
67b91c1eb3 Merge pull request #8740 from openanolis/upstream/pci-6-final
Dragonball: add pci vfio passthrough, hot(un)plug support
2023-12-29 01:58:32 +08:00
Chao Wu
71c322c293 runtime-rs: fix ci complains
The vfio commits introduce quite a lot of changes in runtime-rs; this commit
collects all the CI-related changes, including fixes for compilation errors.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-28 23:34:41 +08:00
Chao Wu
f9e0a4bd7e upcall: introduce pci device add & del kernel patch
Add a guest kernel patch for pci device add and del as an extension
on the upcall device manager server side.

Also, bump the config version to 120, since we need to add config
for dragonball pci in upcall.

fixes: #8741

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-28 16:21:30 +08:00
Chao Wu
a3f7601f5a dragonball: add pci hotplug / hot-unplug support
Introduce two new vmm actions to implement pci hotplug
and pci hot-unplug: PrepareRemoveHostDevice and RemoveHostDevice.

PrepareRemoveHostDevice calls upcall to unregister the pci device
in the guest kernel.
RemoveHostDevice should be called after PrepareRemoveHostDevice; it
cleans up the PCI resources on the Dragonball side.
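
The required ordering of the two actions can be sketched as a small state machine. This is an illustrative Python model only: `HostDeviceLifecycle` and its state names are hypothetical, and rejecting an out-of-order call with an error is an assumption, not necessarily what Dragonball does.

```python
class HostDeviceLifecycle:
    """Tracks one passthrough device through the removal sequence (hypothetical)."""

    def __init__(self):
        self.state = "attached"

    def prepare_remove(self):
        # Step 1 (PrepareRemoveHostDevice): the guest kernel unregisters the device.
        if self.state != "attached":
            raise RuntimeError("device is not attached")
        self.state = "guest_unregistered"

    def remove(self):
        # Step 2 (RemoveHostDevice): only valid once the guest has let go.
        if self.state != "guest_unregistered":
            raise RuntimeError("PrepareRemoveHostDevice must run first")
        self.state = "removed"


device = HostDeviceLifecycle()
device.prepare_remove()
device.remove()
print(device.state)
```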

fixes: #8741

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-28 16:08:31 +08:00
Chao Wu
0f402a14f9 dragonball: add InsertHostDevice vmm action
Introduce a new vmm action InsertHostDevice to pass through
host pci devices such as NICs or GPUs into the guest so that
users can get high performance out of those devices.

fixes: #8741

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-28 16:04:22 +08:00
Xuewei Niu
4c023e341c dragonball: Fix compilation issue without all net features
Combinations of network features were tested:

- None
- virtio-net
- vhost-net
- vhost-user-net
- virtio-net,vhost-net
- vhost-net,vhost-user-net
- virtio-net,vhost-user-net
- virtio-net,vhost-net,vhost-user-net

Fixes: #8742

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-28 11:37:26 +08:00
Alex.Lyn
990a3adf39 Merge pull request #8618 from Apokleos/csi-for-directvol
runtime-rs: Add dedicated CSI driver for DirectVolume support in Kata
2023-12-27 21:27:29 +08:00
Chao Wu
cbd4481bc1 Merge pull request #7489 from Apokleos/pci_path
runtime-rs: add pci topology for pci devices
2023-12-27 18:52:06 +08:00
alex.lyn
ea69c17008 runtime-rs: initialize pcie topology in Device Manager
Add a pcie_topology field to DeviceManager and initialize
pcie_topology when ResourceManager calls DeviceManager's new()
with TopologyConfigInfo.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:57:23 +08:00
alex.lyn
b42548b8e1 runtime-rs: do unregister device in Trait Device/detach
Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:53:18 +08:00
alex.lyn
0f0b6d13c9 runtime-rs: do register/update device in Trait Device/attach
Before calling the device driver to attach a device, register
the device to PCIe topology and allocate a PciPath for it.

However, for some hypervisors such as CLH, that allocation is not valid:
when a device is plugged into the VM, they return a DeviceInfo
containing the actual PciPath. For them, the PciPath in the PCIe
topology is updated with the returned pci path, so that the
inferred pcipath cannot differ from the actual value returned.

The update is skipped if the pcipath value does not change.
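
The register-then-update flow can be sketched as follows. This is a simplified Python model, not the runtime-rs code: `PcieTopologySketch`, the slot numbering, and the device id are hypothetical.

```python
class PcieTopologySketch:
    """Maps device ids to pci paths on the root bus (hypothetical model)."""

    def __init__(self):
        self.paths = {}
        self.next_slot = 1

    def register(self, device_id):
        # Infer a path before the hypervisor attaches the device.
        path = f"{self.next_slot:02x}"
        self.next_slot += 1
        self.paths[device_id] = path
        return path

    def update(self, device_id, reported_path):
        # Adopt the hypervisor-reported path; report whether anything changed.
        if self.paths[device_id] == reported_path:
            return False
        self.paths[device_id] = reported_path
        return True


topology = PcieTopologySketch()
inferred = topology.register("vfio-0")
changed = topology.update("vfio-0", "05")    # hypervisor reports another slot
unchanged = topology.update("vfio-0", "05")  # same value again: nothing to do
print(inferred, topology.paths["vfio-0"], changed, unchanged)
```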

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:49:18 +08:00
alex.lyn
ce7d363695 runtime-rs: Introduce helper macros to simplify PCIe device ops
Introduce helper macros to simplify PCIe device register/unregister
and update, which provides a convenient way to handle devices in
topology.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:43:58 +08:00
alex.lyn
0d4992b24d runtime-rs: add one more argument in Device attach/detach
Add one more argument of type &mut Option<&mut PCIeTopology>
to attach and detach so that they can call methods of the PCIe Topology.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:40:01 +08:00
alex.lyn
b425de6105 runtime-rs: implement Trait PCIeDevice for pcie/pci device
Implement Trait PCIeDevice register/unregister for pcie/pci
devices, such as a vfio device, which needs to set/get the device's pci
path for the kata agent's device handler.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:33:08 +08:00
alex.lyn
87e39cd1f6 runtime-rs: introduce Trait PCIeDevice to do [un]register device
Introduce Trait PCIeDevice with register/unregister, which are
used to register or unregister pcie device within the PCIe topology.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:29:35 +08:00
alex.lyn
6ebc4884fa runtime-rs: introduce PCIe Topology framework for pcie/pci devices
Because different VMMs handle PCI devices in different ways,
we want to provide a general PCIe topology processing framework
that is as compatible as possible with VMMs such as dragonball,
qemu and clh (clh has its own management method, but there is no conflict).

Currently, it is mainly developed for the kinds of PCIe/PCI devices in
dragonball/clh that are attached directly to the pci/pcie root bus.

More will be added when Qemu is ready in runtime-rs.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:29:25 +08:00
alex.lyn
88839026b9 runtime-rs: introduce TopologyConfigInfo to initialize pcie topology
Add a TopologyConfigInfo to store device config info for PCIe/PCI
devices in the VM from Hypervisor DeviceInfo.

TopologyConfigInfo::new is the entry point for initializing the PCIe
Topology of each VM.

Fixes: #7218

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-27 15:21:53 +08:00
Fabiano Fidêncio
35f88dfc93 Merge pull request #8733 from fidencio/topic/fix-shim-check-for-snapshotter-configration
kata-deploy: Fix shim check for snapshotter configuration
2023-12-27 03:30:53 -03:00
Chao Wu
8895cb82df Merge pull request #8724 from openanolis/chao/add_vfio
dragonball: introduce vfio support
2023-12-27 11:40:53 +08:00
Xuewei Niu
43a627c96f Merge pull request #8632 from adamqqqplay/support-vhost-user-blk
dragonball: introduce vhost-user-blk device
2023-12-27 09:54:21 +08:00
Chao Wu
2f797a6eb7 pci: rename 2 parameters to follow rust naming convention
PciCapabilityID -> PciCapabilityId
PciBarRegionType::IORegion -> PciBarRegionType::IoRegion

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-26 23:28:47 +08:00
Chao Wu
9c13b2c990 dragonball: introduce vfio support
The vfio mod collects the information related to vfio operations, including the VfioMsi and VfioMsix capability & state,
vfio interrupt info, pci region info and vfio pci device info & state.

fixes: #8722

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com>
Signed-off-by: Yang Su <yang.su@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Xin Lin <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-26 23:28:43 +08:00
alex.lyn
8779fe7dd5 runtime-rs: create a reference that directs users to kata csi doc
Fixes: #8602

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-26 20:36:34 +08:00
alex.lyn
ba5437382a runtime-rs: add examples about Kata pod with directvol by CSI.
Fixes: #8602

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-26 20:36:34 +08:00
alex.lyn
c6d2a32146 runtime-rs: add support for directvol csi deploy scripts.
Fixes: #8602

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-26 20:36:34 +08:00
alex.lyn
25d8e83e43 runtime-rs: Add dedicated CSI driver for DirectVolume support in Kata
Bridge the gap between user requirements for direct block device access
and the DirectVolume capabilities provided by Kata runtimes
(kata-runtime/runtime-rs), and facilitate seamless integration with CSI
to improve user experience.

It aims to integrate DirectVolume CSI support into Kata, enabling users
to benefit from its performance and flexibility advantages.

Fixes: #8602

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-26 20:36:22 +08:00
Fabiano Fidêncio
6ee7fb5402 kata-deploy: Double quote the snapshotter name
Otherwise `jq` will complain about:
```sh
jq: error: nydus/0 is not defined at <top-level>, line 1:
.plugins."io.containerd.grpc.v1.cri".containerd.runtimes."kata-clh".snapshotter=nydus
jq: 1 compile error
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-26 09:14:36 -03:00
Qinqi Qu
81ab174c16 dragonball: support vhost-user-blk in device manager
This patch adds support for the vhost-user-blk device.

Fixes: #8631

Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>
2023-12-26 20:02:38 +08:00
Qinqi Qu
ef8dc3b0ce dragonball: support vhost-user-blk
This patch adds support for the vhost-user-blk device.

This device needs to be defined before the VM instance is started,
which can be done through the dbs-cli tool with --virblks option:
--virblks '{
	"drive_id": "8623",
	"device_type": "Spdk",
	"path_on_host": "spdk:///var/tmp/vhost.sock",
	"is_root_device": false,
	"is_read_only": false,
	"is_direct": false,
	"no_drop": false,
	"num_queues": 1,
	"queue_size": 256
}'

Fixes: #8631

Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>
Signed-off-by: fupan <fupan.lfp@antgroup.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>
2023-12-26 20:02:32 +08:00
Fabiano Fidêncio
8332f3c684 kata-deploy: Fix the snapshotter config placement
Without this patch, the script tries to set
```toml
[`$shim`]
snapshotter = $snapshotter
```

However, what we actually want to set is the full runtime table instead
of shim.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-26 08:26:38 -03:00
Fabiano Fidêncio
907f1ddb9e kata-deploy: Fix shim check for snapshotter configuration
We want to check whether the shim is part of the "plain text" shims
passed to the daemonset (meaning, checking against `$SHIMS`).  Before
this fix we were checking against `$shims`, which is an array of shims
instead of a string, resulting in a broken check.

Fixes: #8732

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-26 07:42:36 -03:00
Tim Zhang
a4ad12a3d1 Merge pull request #8729 from liubin/fix/package-kata-monitor
kata-monitor: fix Dockerfile to build image
2023-12-26 18:30:15 +08:00
alex.lyn
3b317e69e2 runtime-rs: add README and user guide to deploy directvol CSI Driver
Fixes: #8602

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-26 18:00:35 +08:00
Bin Liu
23eb3042c7 kata-monitor: fix Dockerfile to build image
Move `SKIP_GO_VERSION_CHECK` after the `make` command to skip
the golang version check.

Also upgrade golang to 1.19.

Fixes: #8728

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-12-26 15:11:13 +08:00
Xuewei Niu
1065ca6fa7 Merge pull request #8626 from justxuewei/vhost-user-endpoint
2023-12-26 12:52:21 +08:00
Xuewei Niu
36a4cbccf6 runtime-rs: Expand all DeviceType in match arms
The compiler will then flag the match if a developer forgets to add an
arm for a newly defined variant.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-26 10:18:59 +08:00
Xuewei Niu
f2d08bc00f runtime-rs: Remove unused index from Endpoints
The affected `Endpoint`s are `VhostUserEndpoint` and `TapEndpoint`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-26 10:18:59 +08:00
Xuewei Niu
60a42351e2 runtime-rs: DAN supports vhost-user-net device
DAN reads vhost-user-net device from JSON config. It only supports VMM
running as server right now.

Fixes: #8625

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-26 10:18:59 +08:00
Xuewei Niu
693a0cfbfd dragonball: Make vhost-user-net ready for VhostUserEndpoint
The changes involve:

- Expose VhostUserConfig struct to runtime-rs.
- Set a default value when num_queues or queue_size is 0.
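
The defaulting rule in the second item can be sketched in a few lines of Python. The default values and the `normalize_queue_config` helper are hypothetical; the commit does not state which defaults Dragonball uses.

```python
DEFAULT_NUM_QUEUES = 2    # hypothetical default, not taken from Dragonball
DEFAULT_QUEUE_SIZE = 256  # hypothetical default, not taken from Dragonball


def normalize_queue_config(num_queues, queue_size):
    """Replace zero values with defaults and keep explicit values as given."""
    if num_queues == 0:
        num_queues = DEFAULT_NUM_QUEUES
    if queue_size == 0:
        queue_size = DEFAULT_QUEUE_SIZE
    return num_queues, queue_size


print(normalize_queue_config(0, 0))
print(normalize_queue_config(4, 0))
print(normalize_queue_config(4, 1024))
```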

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-26 10:18:59 +08:00
Xuewei Niu
54df832407 runtime-rs: Support VhostUserEndpoint
This commit introduces VhostUserEndpoint and adds support for
vhost-user-net devices to the device manager. For now, Dragonball is able to
attach vhost-user-net devices.

Fixes: #8625

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-26 10:18:50 +08:00
Xuewei Niu
374c2f01aa runtime-rs: Simplify VhostUserType enum
Remove unused string parameter from each item.

Fixes: #8625

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-25 16:15:57 +08:00
Xuewei Niu
38eb4077a6 Merge pull request #8503 from justxuewei/vhost-user-net
dragonball: Support vhost-user-net device
2023-12-25 13:47:51 +08:00
Xuewei Niu
4c5de72863 dragonball: Wrap config space into set_config_space
The config space of network devices is shared and conforms to the virtio 1.1 spec,
so it makes sense to abstract the common part into one function.
`set_config_space()` implements this.

In addition, this patch removes `vq_pairs` from vhost-net devices, since it creates
a possibility of data inconsistency. For example, some places read the value
from `self.vq_pairs`, others read from `queue_sizes.len() / 2`.
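
The idea of deriving the pair count from a single source can be sketched in Python. `NetQueues` is a hypothetical stand-in for the device struct; only the `queue_sizes.len() / 2` relation comes from the commit.

```python
class NetQueues:
    """Holds per-queue sizes; the pair count is derived, never stored."""

    def __init__(self, queue_sizes):
        self.queue_sizes = list(queue_sizes)

    @property
    def vq_pairs(self):
        # Computed on every read, so it cannot drift from queue_sizes.
        return len(self.queue_sizes) // 2


queues = NetQueues([256, 256, 256, 256])
print(queues.vq_pairs)
queues.queue_sizes.extend([256, 256])
print(queues.vq_pairs)
```

Because there is no stored copy of the count, every reader sees a value consistent with the current queue list.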

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-25 10:47:34 +08:00
Alex.Lyn
3a3f39aa2d Merge pull request #8668 from Apokleos/pci-path-refactor
runtime-rs: Refactor the code related to PCI paths and VFIO device driver initialize in DM.
2023-12-23 21:44:07 +08:00
Steve Horsman
1afce09858 Merge pull request #8721 from stevenhorsman/kata-deploy-typos
kata-deploy: snapshotter typo fixes
2023-12-22 21:26:03 +00:00
stevenhorsman
4a95c0d07f kata-deploy: snapshotter typo fixes
- Add spaces so that the if statements are valid

Fixes: #8720
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-12-22 16:32:02 +00:00
Dan Mihai
080541a0f2 genpolicy: add SPDX license header
Add SPDX license header to rules.rego.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Saul Paredes
7f126be67e genpolicy: Update oci_distribution to 0.10.0
Also support alternative media type and update samples

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
9eb6fd4c24 docs: add agent policy and genpolicy docs
Add docs for the Agent Policy and for the genpolicy tool.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
57f93195ef genpolicy: add support for StatefulSet YAML input
Generate policy for K8s StatefulSet YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
35958ec9cc genpolicy: add support for ReplicationController
Generate policy for K8s ReplicationController YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
7da17099f2 genpolicy: add support for ReplicaSet YAML input
Generate policy for K8s ReplicaSet YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
d84300f1ee genpolicy: add support for List YAML input
Generate policy for K8s List YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
a03452637b genpolicy: add support for Job YAML input
Generate policy for K8s Job YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
2dbd01c80b genpolicy: add support for Deployment YAML input
Generate policy for K8s Deployment YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
a40a6003d0 genpolicy: add support for DaemonSet YAML input
Generate policy for K8s DaemonSet YAML.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Dan Mihai
48829120b6 policy: initial genpolicy commit
Add application that infers K8s user's intentions based on user's
K8s YAML file, and generates a Rego/OPA based policy for that YAML.

Just Pod YAML files are supported as input using this initial source
code. Support for other types of YAML files will come with upcoming
commits.

Fixes: #7673

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-22 15:35:05 +00:00
Chao Wu
555136c1a5 Merge pull request #8662 from openanolis/pci/4-upstream
dragonball: introduce pci msi/msix interrupt
2023-12-22 18:08:31 +08:00
Steve Horsman
c5f939cdc1 Merge pull request #8655 from fidencio/topic/kata-deploy-add-snapshotter-support
kata-deploy: Allow setting up snapshotters per runtime handler
2023-12-22 09:16:07 +00:00
Chao Wu
8cf3bcefd8 dragonball: introduce pci msi/msix interrupt
introduce msi/msix mod to maintain information for PCI Message Signalled
Interrupt Extended Capability. It will be initialized when parsing pci
configuration space and used when getting interrupt capabilities.

fixes: #8661

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com>
Signed-off-by: Yang Su <yang.su@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Xin Lin <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-22 16:28:22 +08:00
Xuewei Niu
beadce54c5 dragonball: Support vhost-user-net devices
This PR introduces vhost-user-net devices to Dragonball. The devices are
allowed to run as server on the VMM side.

Fixes: #8502

Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-22 14:53:18 +08:00
Xuewei Niu
1f21d3cb2c dragonball: Introduce address space for MmioV2DeviceState
Vhost-user-net has a dependency on address space from `MmioV2DeviceState`.
The address space is added in this patch. In addition, the patch
makes sure all unit tests pass the corresponding parameter as well.

Fixes: #8502

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-22 14:53:18 +08:00
Fupan Li
dc9a0ac8ce Merge pull request #8718 from justxuewei/enable-vhost
tests: Load vhost modules explicitly while Kata installing
2023-12-22 14:52:49 +08:00
Xuewei Niu
206ed6d77d tests: Load vhost modules explicitly while Kata installing
The default network backend of runtime-rs with Dragonball is vhost-net
since #8609 was merged. The tests might fail if the vhost modules are not
loaded.

Fixes: #8717

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-22 11:07:37 +08:00
alex.lyn
94c83cea84 runtime-rs: Refactor vfio driver implementation
It's important to ensure that the tasks which set up vfio
devices are completed before add_device.

So move the vfio device setup code to a dedicated method called at device
build time, which does not affect the behavior of other code.

This change also makes it easier to understand the difference
between create and attach, and makes the boundaries
clearer.

Fixes: #8665

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-22 10:37:40 +08:00
alex.lyn
82d3cfdeda runtime-rs: Make VhostUserConfig's field pci_path type more specific
Make VhostUserConfig pci_path's type more specific, change it
from Option<String> to Option<PciPath>.

Fixes: #8665

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-22 10:35:38 +08:00
alex.lyn
5cc2890a10 runtime-rs: refactor and re-implement pci path.
Refactor and re-implement the pci path to make it more "rusty".

Fixes: #8665

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-22 10:34:41 +08:00
Fabiano Fidêncio
32e1ba2525 Merge pull request #8714 from cmaf/libsh-update-loc
tests: Use function from Kata repo
2023-12-21 12:30:31 -03:00
Fabiano Fidêncio
6cc6ca5a7f kata-deploy: Allow setting up snapshotters per runtime handler
Since containerd 1.7.0 we can easily set a specific snapshotter to be
used with a runtime handler, and we should take advantage of this,
mostly as it'll help setting up any runtime using devmapper or nydus
snapshotters.

This implementation here has a few caveats:
* The format expected for the SNAPSHOTTER_HANDLER_MAPPING is:
  `shim:snapshotter,shim:snapshotter,...`
* It only works with containerd 1.7 or newer
* We **never** change the default containerd snapshotter
* We don't do any check on our side to verify whether the snapshotter
  required is properly deployed
* Users will have to add an annotation to their pods, in order to use
  the snapshotter set up per runtime handler
  * Example:
    ```
    metadata:
      ...
      annotations:
        io.containerd.cri.runtime-handler: kata-fc
    ```
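
The mapping format from the first caveat can be parsed as in the following Python sketch. kata-deploy itself is a shell script, so this is only an illustration: `parse_snapshotter_mapping` is a hypothetical helper, and the shim and snapshotter names in the example are sample values.

```python
def parse_snapshotter_mapping(mapping):
    """Parse "shim:snapshotter,shim:snapshotter,..." into a dict."""
    result = {}
    for entry in mapping.split(","):
        shim, separator, snapshotter = entry.strip().partition(":")
        if not separator or not shim or not snapshotter:
            raise ValueError(f"malformed mapping entry: {entry!r}")
        result[shim] = snapshotter
    return result


# Sample value for SNAPSHOTTER_HANDLER_MAPPING (illustrative only).
print(parse_snapshotter_mapping("clh:nydus,fc:devmapper"))
```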

Fixes: #8615

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-21 07:20:10 -03:00
alex.lyn
1b5758c1f2 runtime-rs: Move the PciPath-related code to a dedicated file
Move the pciPath code to a new file pci_path.rs and update the
references.

Fixes: #8665

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-21 11:35:18 +08:00
alex.lyn
275de453d5 runtime-rs: remove useless get_host_guest_map and its test case
Fixes: #8665

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-21 11:07:56 +08:00
Chelsea Mafrica
9f394f6e18 tests: Use function from Kata repo
Switch to use function from Kata repo in common.bash to reduce
dependency on the tests repo.

Fixes #8713

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-20 16:45:06 -08:00
Dan Mihai
d916da15dd Merge pull request #8688 from microsoft/danmihai1/k8s-confidential
tests: retry connection to pod SSH server
2023-12-20 15:01:26 -08:00
Fabiano Fidêncio
3482256340 Merge pull request #8709 from fidencio/topic/update-jq-for-kata-deploy
kata-deploy: Update `jq` as part of the kata-deploy daemonset
2023-12-20 16:48:07 -03:00
James O. D. Hunt
7da6d0a845 runtime-rs: ch: Implement missing thread/pid APIs
Add implementations for the following `Hypervisor` trait methods which
simply return the same details as the `get_vmm_master_tid()` method:

- `get_thread_ids()`
- `get_pids()`

Fixes: #6438.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-20 17:58:40 +00:00
Fabiano Fidêncio
c9e631dc0c kata-deploy: Reapply "kata-deploy: Use tomlq to configure containerd"
This reverts commit ee5fa08a27.

This is perfectly fine to do as we narrowed down the issue to the
version of `jq` provided by alpine, and we've already updated it in the
previous commit (in this very same series).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-20 12:52:41 -03:00
Fabiano Fidêncio
41320c586e kata-deploy: Install jq from GitHub
The `jq` shipped by alpine is version 1.6, which has a bug that
hits us quite hard: it changes a float to an int whenever the number
is in the `x.0` format.

One example is:
```bash
/ # jq --version
jq-1.6
/ # echo '{"foo": 1.0}' | jq .foo
1
```

With this in mind, let's switch, at least for now, to using the `jq`
released directly on github, as it does address the issue we've been
hitting.
```bash
⋊> Downloads ./jq-linux-amd64 --version
jq-1.7
⋊> Downloads echo '{"foo": 1.0}' | jq .foo
1.0
```

Fixes: #8678

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-20 12:52:41 -03:00
Greg Kurz
ce094ecdc2 Merge pull request #8679 from stevenhorsman/kata-deploy-containerd-config-fix
gha: kata-deploy: Revert containerd config break
2023-12-20 12:58:56 +01:00
stevenhorsman
ee5fa08a27 Revert "kata-deploy: Use tomlq to configure containerd"
This reverts commit dd9f5b07b9.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-12-20 09:10:43 +00:00
stevenhorsman
9e718b4e23 gha: kata-deploy: Add containerd status check
After kata-deploy has installed, check that the worker nodes
are still in Ready state and don't have a containerd://Unknown
container runtime version, which would indicate that containerd isn't working,
to ensure that we didn't corrupt the containerd config during kata-deploy's edits.

Fixes: #8678
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-12-20 09:10:43 +00:00
Archana Shinde
7e5868a55f Merge pull request #8588 from amshinde/runtime-rs-update-readme
runtime-rs: Update readme to indicate cloud-hypervisor support
2023-12-19 22:09:14 -08:00
Dan Mihai
8aa390279e tests: retry connection to pod SSH server
To become more resilient against these kinds of errors:

deployment.apps/confidential-unencrypted created
pod/confidential-unencrypted-c5fdd6964-rrb6q condition met
ssh: connect to host 10.42.0.109 port 22: Connection refused
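
A retry loop for this kind of transient refusal can be sketched as below. The actual test is a shell script; this Python version is only an illustration, and `retry_connection` with its parameters is hypothetical.

```python
import time


def retry_connection(connect, attempts, delay_seconds, sleep=time.sleep):
    """Call `connect` until it succeeds or `attempts` tries are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionRefusedError:
            if attempt == attempts:
                raise  # out of retries: surface the last refusal
            sleep(delay_seconds)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.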

Fixes: #8687

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-20 02:48:05 +00:00
GabyCT
5504176e9a Merge pull request #8699 from GabyCT/topic/fixconfidentialscript
tests: k8s: Fix indentation in confidential common script
2023-12-19 16:01:28 -06:00
Dan Mihai
6cea8a5f2a Merge pull request #8697 from microsoft/danmihai1/runk
tests: additional run-runk logging
2023-12-19 11:27:29 -08:00
Dan Mihai
551a50cd72 tests: additional run-runk logging
Add logging to run-runk, for debugging possible failures.

Fixes: #8696

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-19 14:08:01 +00:00
Hyounggyu Choi
540a2a7fb1 runtime: Allow no initrd path for IBM Z Secure Execution
This is to reintroduce a configuration rule for IBM Z Secure Execution,
where no initrd path should be configured. For the TEE of interest,
only a kernel image should be specified with `confidential_guest=true`.

Fixes: #8692

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-19 11:21:16 +01:00
Xuewei Niu
ec30d5a9a8 Merge pull request #8700 from justxuewei/dbs-ut
dragonball: Trigger unit tests of dbs_* subcrates by `make test`
2023-12-19 17:51:20 +08:00
Xuewei Niu
039fe7f391 dragonball: Trigger unit tests of dbs_* subcrates by make test
`make SUPPORT_VIRTUALIZATION=1 test` iterates through all subcrates and
runs their tests.

Plus, this patch fixes some issues with the unit tests:

- Too many parameters were passed to `I8042Device::new()`.
- Virtqueue checks have been introduced since `virtio-queue v0.7.0`.
- GHA might have no access to `/var/tmp` dir on runner.

Fixes: #8690

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-19 16:22:37 +08:00
Hyounggyu Choi
ceea8882db Merge pull request #8672 from BbolroC/introduce-vsock-device-init
runtime-rs: Separate init_config() from new() for struct VsockDevice
2023-12-18 22:04:37 +01:00
Gabriela Cervantes
1469a5efca tests: k8s: Fix indentation in confidential common script
This PR fixes the indentation of the confidential common
script for kubernetes tests.

Fixes #8698

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-18 20:25:06 +00:00
Chelsea Mafrica
312475508a Merge pull request #8682 from cmaf/static-checks-update-loc
ci: Use static checks from kata repo for lib functions
2023-12-18 09:53:01 -08:00
Hyounggyu Choi
3cd0cc1388 runtime-rs: Separate init_config() from new() for struct VsockDevice
As a follow-up for #8516, guest_cid and vhost_fd are not necessarily initialised
via new(). Instead, the fields should be initialised later when they are really
used to construct hypervisor's parameters.
This commit is to separate init_config() from new() to initialise guest_cid
and vhost_fd and leave only the assignment of id for the existing function.
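
The split between construction and later initialisation can be sketched in Python. `VsockDeviceSketch` and the `allocate` callable are hypothetical; only the field names `id`, `guest_cid` and `vhost_fd` and the `new()`/`init_config()` split come from the commit.

```python
class VsockDeviceSketch:
    """Simplified model of a device whose config is filled in lazily."""

    def __init__(self, device_id):
        # Construction only assigns the id.
        self.device_id = device_id
        self.guest_cid = None
        self.vhost_fd = None

    def init_config(self, allocate):
        # `allocate` is a hypothetical callable returning (guest_cid, vhost_fd);
        # it runs only when the hypervisor parameters are actually built.
        self.guest_cid, self.vhost_fd = allocate()


device = VsockDeviceSketch("vsock-0")
print(device.guest_cid, device.vhost_fd)
device.init_config(lambda: (3, 10))
print(device.guest_cid, device.vhost_fd)
```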

Fixes: #8671

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-18 16:36:09 +01:00
Greg Kurz
2987d3eeb5 Merge pull request #8341 from jongwu/fix_cpushares
agent: correct CPUShares and CPUWeight value
2023-12-18 15:40:04 +01:00
James O. D. Hunt
3c49120d2f Merge pull request #8641 from jodh-intel/kata-ctl-add-cfg-file-cli-option
kata-ctl: Add option to dump config files
2023-12-18 11:54:19 +00:00
Greg Kurz
1cfcc80018 Merge pull request #8664 from amshinde/remove-ignore-paths-ga
github-actions: Remove ignore paths for required CI checks
2023-12-18 12:49:21 +01:00
Chelsea Mafrica
b785ef96ec docs: Change location of static checks script
We now use the static checks script from the main kata containers repo
and not the tests repo; update documentation to reflect this.

Fixes #8681

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-15 17:13:02 -08:00
Chelsea Mafrica
bfb756199f ci: Use static checks from kata repo for lib functions
Change the two functions in lib.sh to use the static checks script from
the kata containers repo instead of tests. Remove cloning the repo from
these functions since we don't need it anymore. Leave these two
functions because the document checking one may be used locally and the
static checks one is called from the virtcontainers Makefile.

Fixes #8681

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-15 17:08:33 -08:00
Archana Shinde
510bc36a77 github-actions: Remove ignore paths for required CI checks
If a PR contains files from the ignore-paths, these actions do not run
as intended. However, the actions are marked as required, and there does
not seem to be a way to mark these as non-required in that case.
As a result a PR containing the files from the ignore-paths remains
stalled.
Hence remove the ignore-paths until github provides a way to mark
actions that are skipped due to ignore-paths as non-required/passed.

Fixes: #8663

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-12-15 15:12:20 -08:00
Liu Wenyuan
61fe20cf9a gha: Fix some of gha metrics failure for StratoVirt
Update the Speed & Density metric tests baseline for StratoVirt
and re-enable them, and skip other metric tests temporarily.

Fixes: #8656

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-12-15 17:45:01 +08:00
Zhongtao Hu
0f80dc636c Merge pull request #6876 from openanolis/memory_hotlug
runtime-rs: support Memory hotplug
2023-12-15 14:28:35 +08:00
Zhongtao Hu
9a37e77f2a runtime-rs: check the update memory size
Check whether the updated memory size is greater than the default max memory size.

Fixes:#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-12-15 11:25:34 +08:00
Zhongtao Hu
6039417104 runtime-rs: add default_maxmemory in config file
add default_maxmemory in config file

Fixes:#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-12-15 10:25:20 +08:00
Zhongtao Hu
8d9fd9c067 runtime-rs: support memory resize
Fixes:#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-12-15 10:25:13 +08:00
Zhongtao Hu
81e55c424a runtime-rs: add resize_memory trait for hypervisor
Fixes: #6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-12-15 10:25:03 +08:00
Zhongtao Hu
d428a3f9b9 runtime-rs: get guest memory details
get memory block size and guest mem hotplug probe

Fixes:#6356
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-12-15 10:22:37 +08:00
GabyCT
4a49dd73db Merge pull request #8676 from GabyCT/topic/fixins
tests: k8s: Fix indentation in setup script
2023-12-14 13:57:47 -06:00
GabyCT
7a606a19c4 Merge pull request #8659 from GabyCT/topic/improvecleanuplatency
metrics: Improve latency network cleanup
2023-12-14 13:57:28 -06:00
GabyCT
0831529279 Merge pull request #8644 from GabyCT/topic/updadockerresint
metrics: Update TensorFlow ResNet50 Int8 Dockerfile
2023-12-14 13:56:41 -06:00
Jianyong Wu
58e88d9469 agent: correct CPUShares and CPUWeight value
If cgroup driver is systemd, CPUShares, for cgroup v1, should be at
least 2 [1] and CPUWeight for cgroup v2, should be at least 1 [2].

Fixes: #8340
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>

[1] d19434fbf8/src/basic/cgroup-util.h (L122)
[2] d19434fbf8/src/basic/cgroup-util.h (L91)
2023-12-15 02:04:31 +08:00
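The lower bounds named in commit 58e88d9469 can be illustrated with a minimal sketch. It assumes the values are clamped up to the minimum; the log does not say whether the agent clamps or rejects out-of-range values, and the constant and function names here are hypothetical.

```rust
/// Lower bounds taken from the commit message; the names are hypothetical
/// and are not the agent's actual identifiers.
const SYSTEMD_CPU_SHARES_MIN: u64 = 2; // cgroup v1
const SYSTEMD_CPU_WEIGHT_MIN: u64 = 1; // cgroup v2

/// Raise a cgroup v1 CPUShares value to the systemd minimum if it is too low.
fn clamp_cpu_shares(shares: u64) -> u64 {
    shares.max(SYSTEMD_CPU_SHARES_MIN)
}

/// Raise a cgroup v2 CPUWeight value to the systemd minimum if it is too low.
fn clamp_cpu_weight(weight: u64) -> u64 {
    weight.max(SYSTEMD_CPU_WEIGHT_MIN)
}
```

Values already at or above the minimum pass through unchanged, so the helpers are safe to apply unconditionally.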
Steve Horsman
04de6eb4fd Merge pull request #8674 from ChengyuZhu6/fix_statis_check
static-checks: Add some dependencies to static checks for CoCo features
2023-12-14 16:47:01 +00:00
Greg Kurz
1bd9c1b4de Merge pull request #8589 from wvell/patch-1
Remove warning for cgroupsv2 only operating systems
2023-12-14 17:37:59 +01:00
Gabriela Cervantes
c92b14da97 tests: k8s: Fix indentation in setup script
This PR fixes the indentation of the kubernetes setup script.

Fixes #8675

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-14 16:26:22 +00:00
Amulya Meka
ac7b3d4735 Merge pull request #8667 from Amulyam24/workflow
gha: add a post cleanup script for cri-containerd ppc64le workflow
2023-12-14 21:52:54 +05:30
Alex.Lyn
c7c7632203 Merge pull request #8620 from Apokleos/enhance-directv-using-csi
runtime-rs: Enhancement of DirectVolume when using a dedicated CSI
2023-12-14 22:59:09 +08:00
ChengyuZhu6
dfad0e6622 .github: fix the failure without devicemapper for host sharing
fix error when running checks and tests:
error: failed to run custom build command for `devicemapper-sys v0.1.5`
fatal error: 'libdevmapper.h' file not found

thread 'main' panicked at 'Could not generate dm.h bindings:
ClangDiagnostic("dm.h:2:10: fatal error: 'libdevmapper.h' file not found\n")',
/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/devicemapper-sys-0.1.5/build.rs:24:10
  stack backtrace:
     0: rust_begin_unwind
               at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5
     1: core::panicking::panic_fmt
               at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14
     2: core::result::unwrap_failed
               at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/result.rs:1651:5
     3: core::result::Result<T,E>::expect
     4: build_script_build::main
     5: core::ops::function::FnOnce::call_once
  note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
warning: build failed, waiting for other jobs to finish...
make: *** [../../utils.mk:177: standard_rust_check] Error 101

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-12-14 20:47:47 +08:00
ChengyuZhu6
983479748f .github: fix error when making checks for CoCo guest pull
Fix error when making checks:
```
error: failed to run custom build command for `image-rs v0.1.0
(https://github.com/confidential-containers/guest-components?tag=v0.8.0#e849dc89)`

Caused by:
  process didn't exit successfully: `/home/runner/work/kata-containers/kata-containers/src/
  agent/target/release/build/image-rs-fd932206d09362b7/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=./protos/getresource.proto
  cargo:rerun-if-changed=./protos

  --- stderr
  thread 'main' panicked at 'Could not find `protoc` installation and this build crate cannot proceed without
  this knowledge. If `protoc` is installed and this crate had trouble finding
  it, you can set the `PROTOC` environment variable with the specific path to your
  installed `protoc` binary.If you're on debian, try `apt-get install protobuf-compiler`
  or download it from https://github.com/protocolbuffers/protobuf/releases
```

Fixes #8673

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-12-14 20:47:42 +08:00
alex.lyn
aa42f0a03f runtime-rs: Enhancement of DirectVolume when using CSI.
We use a matching direct-volume path to determine whether an OCI mount
is a DirectVolume. However, we should handle the case where no match is
found appropriately.
In that case the OCI mount is treated as a non-DirectVolume type
rather than causing a failure.

Fixes: #8619

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-14 18:19:03 +08:00
alex.lyn
80d631ee84 runtime-rs: Add attribute serde rename to each field of DirectVolume.
The DirectVolume structure in runtime-rs differs from the one in kata-runtime,
so the two have no unified way of handling DirectVolumeMountInfo
and MountInfo.

We should align the two by simply adding the attribute #[serde(rename="x")]
to each field in DirectVolumeMountInfo.

Fixes: #8619

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-14 18:18:40 +08:00
Xuewei Niu
7f611dfe84 Merge pull request #8609 from justxuewei/runtime-rs-vhost-net
dragonball: Use vhost-net device by default
2023-12-14 16:33:29 +08:00
Amulyam24
0db820fa01 gha: add a post cleanup script for cri-containerd ppc64le workflow
This PR identifies and adds an action to cleanup the ppc64le self hosted runner.

Fixes: #8666

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-12-14 13:46:47 +05:30
Hyounggyu Choi
fbc04460f6 Merge pull request #8649 from BbolroC/put-pre-action-gha-s390x
GHA: Put all the preliminary steps into pre-action for s390x
2023-12-14 07:16:17 +01:00
Xuewei Niu
82fde4431e dragonball: Set default queue config for vhost-net device
Dragonball sets a default queue config in the case of `None`. The
queue_size and num_queues of vhost-net are set to `Some(0)` by default.
Therefore, we might get an invalid queue config. This patch fixes this
issue.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-14 11:18:33 +08:00
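The problem described in commit 82fde4431e is that a default is applied only for `None`, while vhost-net passes `Some(0)`. A minimal sketch of one way to normalize both cases follows. The default values, the `QueueConfig` type, and the function name are assumptions made for illustration; the log does not state on which side the actual fix was made.

```rust
/// Hypothetical defaults, chosen only for illustration.
const DEFAULT_NUM_QUEUES: usize = 2;
const DEFAULT_QUEUE_SIZE: u16 = 256;

/// Hypothetical resolved queue configuration.
#[derive(Debug, PartialEq)]
struct QueueConfig {
    num_queues: usize,
    queue_size: u16,
}

/// Treat both `None` and `Some(0)` as "not configured" and fall back to
/// the defaults, so a zero value can never reach the device.
fn effective_queue_config(num_queues: Option<usize>, queue_size: Option<u16>) -> QueueConfig {
    QueueConfig {
        num_queues: num_queues.filter(|n| *n > 0).unwrap_or(DEFAULT_NUM_QUEUES),
        queue_size: queue_size.filter(|s| *s > 0).unwrap_or(DEFAULT_QUEUE_SIZE),
    }
}
```

With this shape, `effective_queue_config(Some(0), Some(0))` and `effective_queue_config(None, None)` resolve to the same defaults, while explicit non-zero values are kept.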
Xuewei Niu
c11b066728 runtime-rs: Use vhost-net device by default
This patch sets vhost-net as the default networking backend. It allows users
to set `disable_vhost_net` to `true` to re-enable the virtio-net backend.
Also, which backend to use is a matter for the hypervisor; runtime-rs no
longer needs to know that.

Fixes: #8608

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-14 11:18:26 +08:00
Chelsea Mafrica
6c2e2a9120 Merge pull request #8635 from cmaf/migrate-static-checks-gha
static-checks: Direct Makefile to use new static checks
2023-12-13 16:00:16 -08:00
Gabriela Cervantes
8151117f73 metrics: Improve latency network cleanup
This PR improves the latency network cleanup by removing the pods
even if the test fails.

Fixes #8658

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-13 17:56:01 +00:00
Fabiano Fidêncio
a998e89bcf Merge pull request #8639 from fidencio/topic/kata-deploy-use-tomlq-to-configure-containerd
kata-deploy: Use `tomlq` to configure containerd
2023-12-13 14:11:45 +01:00
Hyounggyu Choi
05e278de5b GHA: Put all the preliminary steps into pre-action for s390x
This is to introduce a pre-action to all the workflows for building artifacts.
The action could take care of tasks such as cleaning up files and reinstalling
packages, which prevents a workflow from getting affected by the environment.

This also includes the removal of the step `Adjust a permission for repo`,
because it could be incorporated into the action.

Fixes: #8648

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-13 13:24:40 +01:00
Chao Wu
dfaf006fcc Merge pull request #8564 from openanolis/chao/add_pci_root_bus_device
dragonball: add pci root bus and root device
2023-12-13 17:57:16 +08:00
Fabiano Fidêncio
7ad873cf29 kata-deploy: Simplify shim configuration
We never have to add a configuration for the "default" case, as we're
already creating the runtime class pointing to what should be the
"default" handler.

This helps to simplify the logic by quite a lot.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-13 10:52:54 +01:00
Fabiano Fidêncio
e618949937 kata-deploy: Remove useless comment from CRI-O drop-in
The comment adds absolutely nothing to the runtime handler added, and
it'd make our life slightly harder to properly say which VMM is being
used when setting the default `kata` handler.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-13 10:49:52 +01:00
Fabiano Fidêncio
dd9f5b07b9 kata-deploy: Use tomlq to configure containerd
This saves us a lot of trouble on properly sed'ing content that may or
may not be in the containerd configuration file.

Fixes: #8638

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-13 10:49:49 +01:00
Fabiano Fidêncio
4f01f294bb kata-deploy: Install tomlq to the base image
This will help us to have an easier time playing with the containerd
configuration, instead of having to sed the **** out of it, which is
super error prone.

`tomlq` is a tool that comes from https://github.com/kislyuk/yq, and
that depends on `jq` to do the toml parsing / editing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-13 10:49:07 +01:00
James O. D. Hunt
d7c6219dfe Merge pull request #8630 from jodh-intel/runtime-rs-ch-set-state-on-vm-stop
runtime-rs: ch: Change state when VM stopped
2023-12-13 09:26:30 +00:00
Xuewei Niu
855adbc63b Merge pull request #8634 from justxuewei/disable-packed-vq
dragonball: Disable packed virtqueue for vhost-user devices
2023-12-13 17:03:05 +08:00
wvell
af4622fcc1 docs: Remove warning for cgroupsv2 only operating systems
Removes warning for cgroupsv2 as it is not needed anymore according to #6259.

Fixes #8650

Signed-off-by: wvell <w.vellema@slash2.nl>
2023-12-13 09:18:39 +01:00
Chelsea Mafrica
b46cb22270 static-checks: Direct Makefile to use new static checks
Direct the Makefile to use the static checks script in the tests
directory of the main Kata Containers repo so it is run in GHA.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 16:43:35 -08:00
Chelsea Mafrica
63636b869c static-checks: Update copyright dates
Some copyright dates were not updated with the most recent changes to
code; update them.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 16:34:06 -08:00
Chelsea Mafrica
b11c772865 static-checks: Change dir for building tools
Change directory for running make due to local errors when building with
make -C.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 16:34:06 -08:00
James O. D. Hunt
2a518f0898 runtime-rs: ch: Change state when VM stopped
Make the CH (Cloud Hypervisor) `stop_vm()` method check the VM state before
attempting to stop the VM, and update the state once the VM has stopped.

This avoids the method failing if called multiple times which will
happen if the workload exits before the container manager requests that
the container stop.

This change ensures the CH driver finishes cleanly.

Fixes: #8629.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-12 18:25:20 +00:00
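The behaviour described in commit 2a518f0898 (check the state first, update it after stopping) can be sketched as follows. The `Vm` and `VmState` types and the `stop_requests` counter are hypothetical stand-ins, not the Cloud Hypervisor driver's real types.

```rust
/// Hypothetical VM lifecycle states.
#[derive(Debug, Clone, Copy, PartialEq)]
enum VmState {
    Running,
    Stopped,
}

/// Hypothetical VM handle; `stop_requests` stands in for the real
/// shutdown request sent to the hypervisor.
struct Vm {
    state: VmState,
    stop_requests: u32,
}

impl Vm {
    fn new() -> Self {
        Vm { state: VmState::Running, stop_requests: 0 }
    }

    /// Stop the VM; calling this on an already stopped VM is a no-op
    /// rather than an error.
    fn stop_vm(&mut self) -> Result<(), String> {
        if self.state == VmState::Stopped {
            return Ok(());
        }
        self.stop_requests += 1; // placeholder for the actual shutdown call
        self.state = VmState::Stopped;
        Ok(())
    }
}
```

Calling `stop_vm()` twice on the same `Vm` succeeds both times and issues only one shutdown request, which covers the case where the workload exits before the container manager asks for a stop.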
Fabiano Fidêncio
39f5cea3b1 kata-deploy: Fix k0s cri notation comment
We can safely assume we're using the *newer* notation, not the *older*
one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-12 18:20:18 +01:00
Gabriela Cervantes
23f76653e5 metrics: Update command to run the tensorflow int8 benchmark
This PR updates the command to run the tensorflow resnet50 int8 benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-12 16:24:09 +00:00
Gabriela Cervantes
8fd5ef7fb7 metrics: Update TensorFlow ResNet50 Int8 Dockerfile
This PR updates the TensorFlow ResNet50 Int8 Dockerfile to use the
proper python version for kata metrics.

Fixes #8643

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-12 16:20:56 +00:00
James O. D. Hunt
1195692d3c runtime-rs: ch: Move state handling to top-level APIs
Move the state setting to the `Hypervisor` trait calls. This makes the
code clearer.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-12 15:25:27 +00:00
James O. D. Hunt
5637f11a8c kata-ctl: Add option to dump config files
Add a `--show-default-config-paths` command line option for parity with
`kata-runtime`.

Note that this requires the `KataCtlCli.command` to be optional so that
the user can run simply:

```bash
$ kata-ctl --show-default-config-paths
```

... without also specifying a (sub-)command.

Fixes: #8640.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-12 14:20:04 +00:00
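Why the (sub-)command has to be optional in commit 5637f11a8c can be shown with a small sketch. It uses hand-written argument parsing from the standard library only; the `Cli` struct and `parse_args` function are hypothetical and do not reflect how `kata-ctl` actually parses its command line.

```rust
/// Hypothetical, simplified view of the parsed command line.
#[derive(Debug, Default, PartialEq)]
struct Cli {
    show_default_config_paths: bool,
    /// Optional so that a global flag can be used on its own.
    command: Option<String>,
}

/// Parse arguments (without the program name). The first non-flag
/// argument is taken as the sub-command; anything else is ignored
/// in this sketch.
fn parse_args<I, S>(args: I) -> Cli
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut cli = Cli::default();
    for arg in args {
        match arg.as_ref() {
            "--show-default-config-paths" => cli.show_default_config_paths = true,
            other if cli.command.is_none() && !other.starts_with('-') => {
                cli.command = Some(other.to_string());
            }
            _ => {}
        }
    }
    cli
}
```

If `command` were a plain `String`, parsing the flag-only invocation would have to fail; with `Option<String>` it simply stays `None`.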
Chelsea Mafrica
a9d360728e static-checks: Fix directory for github labels
Fix paths for yqdir (where the install_yq.sh script currently is) so
that static checks can run without error.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 02:16:35 -08:00
Xuewei Niu
86918e91b3 dragonball: Disable packed virtqueue for vhost-user devices
The layout of packed virtqueue isn't supported by `Endpoint::negotiate()`.
Communication between device and driver will fail because the virtqueue
cannot be parsed if we don't disable the packed feature. This patch
fixes this issue.

Fixes: #8633

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-12 17:24:20 +08:00
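Disabling the packed virtqueue feature, as commit 86918e91b3 describes, amounts to clearing one bit in the offered feature set. A minimal sketch follows; the bit number comes from the VIRTIO specification, while the function name is hypothetical and the way Dragonball applies the mask is not shown in this log.

```rust
/// Feature bit number of VIRTIO_F_RING_PACKED in the VIRTIO specification.
const VIRTIO_F_RING_PACKED: u32 = 34;

/// Return the feature set with the packed-ring bit cleared, leaving all
/// other feature bits untouched.
fn disable_packed_ring(features: u64) -> u64 {
    features & !(1u64 << VIRTIO_F_RING_PACKED)
}
```

Because the driver can only accept features the device offers, masking the bit before negotiation keeps both sides on the split virtqueue layout.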
Chao Wu
b079e1aabc dragonball: add pci root bus and root device
In order to follow up the PCI implementation in Dragonball, we need to
add PCI root device and root bus support.

The root device is a pseudo PCI root device that manages access to the PCI
configuration space.

The root bus mainly emulates the PCI root bridge and also creates the PCI
root bus with the given bus ID on that bridge.

fixes: #8563

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com>
Signed-off-by: Yang Su <yang.su@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Xin Lin <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-12 11:43:14 +08:00
GabyCT
ee74fca92c Merge pull request #8617 from GabyCT/topic/enabletestnerdctl
tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs
2023-12-11 14:09:58 -06:00
David Esparza
584a26dab0 Merge pull request #8542 from dborquez/metrics_fix_deployment_cleaning
metrics: cleans k8s iperf deployment when the test finishes.
2023-12-11 13:14:39 -06:00
Chao Wu
198e4adcb1 Merge pull request #8599 from openanolis/chao/fix_cargo_fmt
dragonball: add --all for fmt ci
2023-12-12 00:20:21 +08:00
GabyCT
43410e1918 Merge pull request #8560 from GabyCT/topic/enablek8srs
gha: k8s: Add cloud-hypervisor (runtime-rs) support
2023-12-11 09:42:49 -06:00
Hyounggyu Choi
ea2a0dc69d Merge pull request #7769 from BbolroC/opa-multiarch
rootfs: build OPA binary from source for ppc64le and s390x
2023-12-11 15:25:33 +01:00
Chao Wu
52f7a40e4e dragonball: add --all for fmt ci
Right now, the cargo fmt check in Dragonball only tests with the default
features, not all features. This leaves some code unchecked
by the fmt tool.

This PR adds the --all option to the Dragonball CI and also fixes some code
that had not been formatted with cargo fmt --all.

fixes: #8598

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-12-11 20:54:25 +08:00
Hyounggyu Choi
375c787e09 rootfs: build OPA binary from source for ppc64le and s390x
This PR is to build a binary for OPA from source code for ppc64le and s390x.

Fixes: #7616

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-11 12:59:48 +01:00
Hyounggyu Choi
16e2a50d17 Merge pull request #8624 from BbolroC/fix-runtime-class-check-qemu-se
GHA: Fix kata-deploy-runtime-classes-check for kata-qemu-se
2023-12-11 12:58:00 +01:00
James O. D. Hunt
2a35541af7 Merge pull request #8592 from jodh-intel/static-checks-try-multiple-user-agents
CI: static-checks: Try multiple user agents
2023-12-11 11:52:29 +00:00
Hyounggyu Choi
28c3e0e5f0 GHA: Fix kata-deploy-runtime-classes-check for kata-qemu-se
This is to fix an error on kata-deploy-runtime-classes-check for kata-qemu-se.

Fixes: #8623

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-11 10:30:00 +01:00
Hyounggyu Choi
b469dbf92f Merge pull request #8622 from BbolroC/hotfix-k3s-kubectl-version
GHA: Use --client=true for k3s kubectl version
2023-12-11 10:00:16 +01:00
Hyounggyu Choi
40f0c8fbb7 GHA: Use --client=true for k3s kubectl version
This is to fix a broken usage for `k3s kubectl version` by switching
an option `--short` to `--client=true`.

Fixes: #8621

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-11 08:26:39 +01:00
Chao Wu
df7f416cb8 Merge pull request #8566 from liubogithub/liubo/dev/panic_fix
runtime-rs: fix panic when hypervisor mismatches with configuration
2023-12-10 21:33:59 +08:00
Gabriela Cervantes
1662a3e859 common: Add cloud hypervisor in enabling hypervisor function
This PR adds the cloud hypervisor in the enabling hypervisor function.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-08 21:32:00 +00:00
Chelsea Mafrica
1c42d94550 Merge pull request #6826 from gabevenberg/log-parser-rs
kata-ctl: Moved log-parser-rs into kata-ctl
2023-12-08 11:33:09 -08:00
James O. D. Hunt
5d085a3042 CI: static-checks: Try multiple user agents
Make the URL checker cycle through a list of user agent values until we
hit one the remote server is happy with.

This is required since, unfortunately, we really, really want to check
these URLs, but some sites block clients based on their `User-Agent`
(UA) request header value. And of course, each site is different and can
change its behaviour at any time.

Our strategy therefore is to try various UA's until we find one the
server accepts:

- No explicit UA (use `curl`'s default)
- Explicitly no UA.
- A blank UA.
- Partial UA values for various CLI tools.
- Partial UA values for various console web browsers.
- Partial UA for Emacs's built-in browser.
- The existing UA, which is used as a "last ditch" attempt where the UA implies multiple platforms and browsers.

> **Notes:**
>
> - The "partial UA" values specify the UA "product" but not the
>   UA "product version" (we specify `foo` and not `foo/1.2.3`). We do
>   this since most sites tested appear to not care about the version.
>   This is as expected given that the version is strictly optional (see `[*]`).
>
> - We now log all errors and display an error summary if none of the UAs
>   worked, in addition to the simple list of the URLs we believe to be
>   invalid. This should make future debugging simpler.

`[*]` - https://www.rfc-editor.org/rfc/rfc9110#section-10.1.5

Fixes: #8553.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 18:02:41 +00:00
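The strategy in commit 5d085a3042 (try each user agent in turn, stop at the first one the server accepts, and keep every error for the summary) can be sketched as below. The actual checker is a shell script built around `curl`; this Rust version is only an illustration, and the `UserAgent` enum, the function name, and the `probe` callback are all hypothetical.

```rust
/// Hypothetical model of the kinds of user agent values to try.
#[derive(Debug, Clone, PartialEq)]
enum UserAgent {
    /// Pass no UA option; the client sends its own default.
    ClientDefault,
    /// Ask the client to send no User-Agent header at all.
    Omitted,
    /// Send this exact value (may be blank or a bare product name).
    Value(String),
}

/// Try each candidate in order. Return the first one `probe` accepts,
/// or every collected error if none of them worked.
fn first_accepted_user_agent<F>(
    candidates: &[UserAgent],
    mut probe: F,
) -> Result<UserAgent, Vec<String>>
where
    F: FnMut(&UserAgent) -> Result<(), String>,
{
    let mut errors = Vec::new();
    for ua in candidates {
        match probe(ua) {
            Ok(()) => return Ok(ua.clone()),
            Err(e) => errors.push(format!("{:?}: {}", ua, e)),
        }
    }
    Err(errors)
}
```

Passing the request logic in as a closure keeps the ordering policy separate from the HTTP client, and the returned error list is what an error summary would be built from.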
James O. D. Hunt
3174c18772 docs: Remove problematic URL
Removed the Azure Portal URL (https://portal.azure.com) since this
causes problems with our static checks script: that URL returns HTTP 403
("Forbidden") when queried using command-line tools like `curl(1)`,
which is used by the static check script.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
3779261a99 docs: Fix whitespace
Remove some extraneous whitespace.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
613def0328 CI: static-checks: Move curl to a separate function
Split the call to `curl` in the URL checker out into a new
`run_url_check_cmd()` function to make `check_url()` slightly clearer.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
6d859f97ee CI: static-checks: Lint fixes
Declare and then define a couple of variables separately.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
efa8e6547c CI: static-checks: Check params have a value
Check that the `check_url()` parameters have a value.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
563ea020b0 CI: static-checks: Fold long line
Break up a long line a little to make it easier to read.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
3ad43df946 CI: static-checks: Improve markdown checker test
Only attempt to build the markdown checker if it doesn't already exist.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
Liu Bo
bf97051f11 runtime-rs: fix panic when hypervisor mismatches with configuration
If a wrong configuration.toml file is used accidentally, the runtime-rs
binary could panic because of unwrap().

This fixes the panic by returning errors instead of unwrap().

fixes: #8565

Signed-off-by: Liu Bo <liub.liubo@gmail.com>
2023-12-08 08:56:23 -08:00
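The change in commit bf97051f11, returning an error where `unwrap()` used to panic, follows a common Rust pattern that can be sketched as below. The `HypervisorConfig` struct, the map layout, and the function name are hypothetical and only illustrate the pattern; they are not the runtime-rs configuration types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a per-hypervisor configuration section.
#[derive(Debug, PartialEq)]
struct HypervisorConfig {
    path: String,
}

/// Look up the configuration for `name`. A missing entry becomes an
/// error the caller can report, instead of a panic from `unwrap()`.
fn hypervisor_config<'a>(
    configs: &'a HashMap<String, HypervisorConfig>,
    name: &str,
) -> Result<&'a HypervisorConfig, String> {
    configs
        .get(name)
        .ok_or_else(|| format!("no configuration found for hypervisor '{}'", name))
}
```

A caller can then propagate the failure with `?` and print a clear message when the configuration file does not match the selected hypervisor.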
Zvonko Kaiser
9d38f01c2f Merge pull request #8612 from BbolroC/introduce-secret-inheritance-s390x
GHA: make secrets inherited for build-kata-static-tarball-s390x
2023-12-08 17:32:47 +01:00
Gabriela Cervantes
f3eeab10ab tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs
This PR enables the nerdctl tests for cloud hypervisor runtime-rs.

Fixes #8616

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-08 16:12:36 +00:00
Hyounggyu Choi
636eef8907 GHA: make secrets inherited for build-kata-static-tarball-s390x
This is to make GHA secrets inherited for the workflow titled
`build-kata-static-tarball-s390x` to configure an environment
variable `CI_HKD_PATH` for a `build-asset-boot-image-se` step.

Fixes: #8611

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-08 13:55:45 +01:00
Chao Wu
5054e59ccb Merge pull request #8429 from adamqqqplay/support-vhost-user-fs
dragonball: introduce vhost-user-fs device
2023-12-08 17:20:52 +08:00
Hyounggyu Choi
588f639a69 Merge pull request #6755 from BbolroC/add-se-artifacts-to-main
packaging: Add IBM Z SE artifacts to main
2023-12-08 05:17:38 +01:00
Gabe Venberg
69fdd05ce5 kata-ctl: Moved log-parser-rs into kata-ctl
Log-parser-rs was always intended to become a sub-functionality of
kata-ctl, but it was useful to develop it and initially merge it as a
standalone program, and migrate it to a subcommand later.

Fixes #6797

Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
2023-12-07 21:35:28 -06:00
David Esparza
b2577000e7 metrics: Expose iperf3 pods over a k8s networks.
A prerequisite for measuring kata network bandwidth is
running the iperf3 tool over the transport layer provided by a
k8s service, which exposes a network that clients
inside the cluster can use to contact Pods in the service.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-12-07 18:07:05 -06:00
David Esparza
a062ba166b metrics: cleans k8s iperf deployment when the test finishes.
This PR fixes small issues like:
1. Cleaning up the k8s environment by removing the iperf test
implementation even when the test fails.
2. Checks if the workload returned a result before generating
an empty results json file, as was being done.
3. Removes the redundancy of calls to functions that process
subtests and should compose the results json file only when
all results are ready and not before.
4. The tcp service manifest was added to the server deployment
which targets TCP port 5201.

Fixes: #8534

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-12-07 18:02:39 -06:00
Archana Shinde
a5105b4227 Merge pull request #8582 from amshinde/runtime-rs-tryfrom-blkconfig
Implement and use try_from for DiskConfig
2023-12-07 15:02:00 -08:00
Archana Shinde
458e91b289 runtime-rs: Update readme to indicate cloud-hypervisor support
Since cloud-hypervisor is no longer built as an optional feature,
let's mention cloud-hypervisor in the list of hypervisors supported by
runtime-rs.

Fixes: #8587

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-12-07 14:59:43 -08:00
GabyCT
0e0a7d9410 Merge pull request #8604 from GabyCT/topic/enablenerdctlrs
gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI
2023-12-07 14:35:26 -06:00
Hyounggyu Choi
3fab1690a4 local-build: make strip support for cross-compilation
This is to adjust a name of the binary `strip` to a target architecture for cross-compilation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 20:05:40 +01:00
Hyounggyu Choi
f38c7f14c5 gha: remove build redundancy of kernel and rootfs-initrd
It is to remove the build redundancy of `kernel` and `rootfs-initrd` by making `boot-image-se` built based on them at the second build stage.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 20:05:40 +01:00
Hyounggyu Choi
31db56207b local-build: add support for key verification for IBM Secure Execution
This is to make `build_se_image.sh` incorporate the key verification originally supported by `genprotimg`.
It can be achieved by specifying two environment variables called `SIGNING_KEY_CERT_PATH` and `INTERMEDIATE_CA_CERT_PATH`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 20:05:40 +01:00
Hyounggyu Choi
52bdc87fe9 local-build: make kernel parameters configurable
This is to make kernel parameters configurable during the secure image build by adding an environment variable SE_KERNEL_PARAMS.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 20:05:40 +01:00
Hyounggyu Choi
9ceb2c27e0 local-build: consider cross-compilation env
This is to make a base builder image build genprotimg without a package
manager under the cross-compilation environment.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 20:05:40 +01:00
David Esparza
298be4aa1c Merge pull request #8594 from GabyCT/topic/updatedockerfilet
metrics: Update TensorFlow ResNet FP32 dockerfile
2023-12-07 11:14:48 -06:00
Gabriela Cervantes
ce694b905b tests: Fix indentation of gha-run script
This PR fixes the indentation of gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:56:19 +00:00
Gabriela Cervantes
33b300431e tests: Enable but do not run k8s tests for cloud hypervisor
This PR enables, but does not run, the k8s tests for cloud hypervisor
for runtime-rs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:39:15 +00:00
Gabriela Cervantes
acee3d8438 gha: k8s: Add cloud-hypervisor (runtime-rs) support
This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs,
as part of the kubernetes tests.

Fixes #8559

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:33:59 +00:00
Gabriela Cervantes
50a5fa9a65 tests: Enable but do not run the nerdctl tests for cloud hypervisor
This PR enables, but does not run, the nerdctl tests for cloud hypervisor
runtime-rs until we find out how stable they are.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:29:51 +00:00
Gabriela Cervantes
e70b2ea95d gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI
This PR enables the cloud hypervisor runtime-rs for the nerdctl
gha CI.

Fixes #8603

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:24:36 +00:00
Hyounggyu Choi
ad6aab9918 Merge pull request #8601 from BbolroC/conflict-handling-for-self-hosted-runners
GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict
2023-12-07 12:17:31 +01:00
Hyounggyu Choi
0d5a970e54 GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict
It is to remove a GITHUB_WORKSPACE directory for self-hosted runners
when a workflow fails due to the merge conflict. This will prevent
the subsequent workflows from getting stuck in the same situation.

Fixes: #8600

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 10:25:57 +01:00
Greg Kurz
501910d743 Merge pull request #8509 from zvonkok/stable-overlay
deployment: Add stable overlay for kata-deploy.yaml
2023-12-07 09:43:41 +01:00
Huang Jianan
5629b7454f dragonball: support vhost-user-fs in device manager
This patch implements the virtio-fs device used for filesystem sharing,
which is heavily based on the vhost-user protocol.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>
Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>
2023-12-07 11:59:07 +08:00
Archana Shinde
a661ac3a0e runtime-rs: Implement and use try_from for DiskConfig
Implement try_from trait function to convert runtime-rs BlockConfig
to cloud-hypervisor DiskConfig. This can allow for code reuse in the
future.

Fixes: #8581

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-12-06 12:10:34 -08:00
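The conversion introduced in commit a661ac3a0e uses the standard `TryFrom` trait. A minimal sketch of that pattern follows; the two structs are trimmed-down, hypothetical stand-ins whose field names and validation rule are assumptions, not the real runtime-rs `BlockConfig` or cloud-hypervisor `DiskConfig` definitions.

```rust
use std::convert::TryFrom;
use std::path::PathBuf;

/// Hypothetical, simplified source type.
#[derive(Debug, Default)]
struct BlockConfig {
    path_on_host: String,
    is_readonly: bool,
}

/// Hypothetical, simplified target type.
#[derive(Debug, PartialEq)]
struct DiskConfig {
    path: PathBuf,
    readonly: bool,
}

impl TryFrom<BlockConfig> for DiskConfig {
    type Error = String;

    /// Convert a block device description into a disk description,
    /// rejecting input that cannot be represented (here: an empty path).
    fn try_from(blk: BlockConfig) -> Result<Self, Self::Error> {
        if blk.path_on_host.is_empty() {
            return Err("block device has no host path".to_string());
        }
        Ok(DiskConfig {
            path: PathBuf::from(blk.path_on_host),
            readonly: blk.is_readonly,
        })
    }
}
```

Putting the conversion behind `TryFrom` means any code path that needs a `DiskConfig` can call `DiskConfig::try_from(block_config)?` instead of repeating the field mapping, which is the reuse the commit message refers to.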
Fabiano Fidêncio
c14e3096c8 Merge pull request #8580 from amshinde/runtime-rs-clh-network-hotplug
runtime-rs: add network hotplug for clh
2023-12-06 20:50:04 +01:00
Gabriela Cervantes
56dddab04f metrics: Update command to run tensorflow resnet fp32 benchmark
This PR updates the command needed to run the tensorflow benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-06 17:02:10 +00:00
Gabriela Cervantes
62fdebeeb5 metrics: Update TensorFlow ResNet FP32 dockerfile
This PR updates the python version for the TensorFlow ResNet FP32
dockerfile so the benchmark can run without issues.

Fixes #8593

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-06 16:53:21 +00:00
GabyCT
3d149d3455 Merge pull request #8578 from GabyCT/topic/fixlinkconfig
docs: Update config containerd url link
2023-12-06 10:40:29 -06:00
Zvonko Kaiser
16380558e0 deployment: Create a stable overlay for kata-deploy
Fixes: #8508

Create a stable overlay for kata-deploy.yaml so we do not have to maintain two files, only one.
Single source for both. This is also preparation for the helm-overlay

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-12-06 14:23:22 +00:00
Huang Jianan
2a1fc29e84 dragonball: add unit test for vhost-user-fs
Add some test cases for vhost-user-fs function.

Signed-off-by: Beiyue <beiyue@linux.alibaba.com>
Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>
2023-12-06 10:43:24 +08:00
Huang Jianan
d6cfbe9436 dragonball: support vhost-user-fs
This patch implements the virtio-fs device used for filesystem sharing,
which is heavily based on the vhost-user protocol.

This vhost-user-fs device defines 5 parameters:
  - path: vhost-user socket path
  - tag: mount tag used from the guest to mount the filesystem
  - req_num_queues: number of request virtqueues
  - queue_size: depth of each virtqueue
  - cache_size: cache window size for dax

This device needs to be defined before the VM instance is started,
which can be done through the dbs-cli tool with --fs option:
--fs '{
    "sock_path":"/path/to/virtiofs.socket",
    "tag":"myfs",
    "num_queues":1,
    "queue_size":1024,
    "cache_size":0,
    "thread_pool_size":1,
    "cache_policy":"auto",
    "writeback_cache":true,
    "no_open":true,
    "xattr":true,
    "drop_sys_resource":false,
    "mode":"vhostuser",
    "fuse_killpriv_v2":true,
    "no_readdir":false
}'

Fixes: #8428

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>
2023-12-06 10:43:17 +08:00
Archana Shinde
955dec06da runtime-rs: add network hotplug for clh
This is required for clh to work with nerdctl and docker.
This fixes the issues seen with nerdctl while starting a container.
However, container exit with docker is still broken due to an unrelated
issue.

Fixes: #8579

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-12-05 15:29:53 -08:00
Fabiano Fidêncio
b056683b7a Merge pull request #8436 from Lu-Biao/main
image-builder: bugfix incorrect partition location
2023-12-06 00:10:06 +01:00
Fabiano Fidêncio
2cd003156e Merge pull request #8573 from fidencio/topic/gha-add-a-timeout-for-tests
gha: basic-ci: Add a timeout for the tests
2023-12-05 22:20:49 +01:00
Fabiano Fidêncio
d149b9f9ca Merge pull request #7231 from wainersm/measured_rootfs-improvements
Build for measured rootfs improvements
2023-12-05 22:20:33 +01:00
Fabiano Fidêncio
f75f17c4ff Merge pull request #8570 from fidencio/topic/gha-dragonball-enable-some-tests-but-do-not-run-them-yet
gha: dragonball: Enable, but do not run, cri-containerd, stability, and devmapper tests
2023-12-05 20:00:24 +01:00
Jeremi Piotrowski
e2c6b8ae6e Merge pull request #4743 from yuchen0cc/main
mount: support checking multiple kinds of block device driver
2023-12-05 18:04:51 +01:00
Gabriela Cervantes
61b868692b docs: Update config containerd url link
This PR updates the config containerd url link in the containerd
kata documentation.

Fixes #8577

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-05 16:35:21 +00:00
Fabiano Fidêncio
05ce52d746 devmapper: dragonball: Enable, but do not run, the tests
This will make the life easier for dragonball developers to properly
enable the tests once the tests are ready.

Fixes: #8569

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 15:29:23 +01:00
Fabiano Fidêncio
a8a156b1af stability: dragonball: Enable, but do not run, the tests
This will make the life easier for dragonball developers to properly
enable the tests once the tests are ready.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 15:29:23 +01:00
Fabiano Fidêncio
16ad721eda cri-containerd: dragonball: Enable, but do not run, the tests
This will make the life easier for dragonball developers to properly
enable the tests once the tests are ready.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 15:29:23 +01:00
James O. D. Hunt
d9daadf15c Merge pull request #8558 from jodh-intel/load-config-improvement
runtime-rs: Show config files attempted on config load failure
2023-12-05 11:48:42 +00:00
Greg Kurz
1650d02b91 Merge pull request #8516 from Apokleos/vsock-dev
move vsock device into device manager
2023-12-05 11:28:37 +01:00
James O. D. Hunt
93c0fc2ad3 Merge pull request #8551 from amshinde/runtime-rs-setns-clh
runtime-rs: Launch cloud-hypervisor in given netns
2023-12-05 10:18:34 +00:00
James O. D. Hunt
d627893975 runtime-rs: Show config files attempted on config load failure
PR #8483 changed the location of the rust runtime config files to
`/etc/kata-containers/runtime-rs/`. However, if you haven't updated your
system to create that directory, attempting to create a container using
the rust runtime was giving the following cryptic message
(formatted for easier reading):

```
failed to handler message try init runtime instance

Caused by:
    0: load config
    1: load toml config
    2: entity not found
```

Now, the message is as follows (again, reformatted for easier reading):

```
failed to handle message try init runtime instance

Caused by:
    0: load config
    1: load TOML config failed (tried [
        \"/etc/kata-containers/runtime-rs/configuration.toml\",
        \"/usr/share/defaults/kata-containers/runtime-rs/configuration.toml\",
        \"/opt/kata/share/defaults/kata-containers/runtime-rs/configuration.toml\"
    ])
```
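
The behaviour can be sketched as follows: try each candidate path in order and,
if none exists, report every path that was attempted. This is an illustration
in Python, not the runtime-rs code; `find_config` is a hypothetical name and
the search order is simply the one shown in the message above.

```python
from pathlib import Path

# Search order taken from the error message above.
CONFIG_PATHS = [
    "/etc/kata-containers/runtime-rs/configuration.toml",
    "/usr/share/defaults/kata-containers/runtime-rs/configuration.toml",
    "/opt/kata/share/defaults/kata-containers/runtime-rs/configuration.toml",
]

def find_config(paths=CONFIG_PATHS):
    """Return the first existing config file, or raise listing every path tried."""
    for candidate in paths:
        if Path(candidate).is_file():
            return candidate
    # Include the full list so the user can see where a file was expected.
    raise FileNotFoundError(f"load TOML config failed (tried {list(paths)})")
```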

Fixes: #8557.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-05 09:10:18 +00:00
James O. D. Hunt
45c0364d4c runtime-rs: Fix typo in task service
"failed to handler message" -> "failed to handle message".

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-05 09:10:18 +00:00
Fabiano Fidêncio
a14f2fc180 gha: runk: Fix typo in the test name
tracing -> runk

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 09:44:42 +01:00
Fabiano Fidêncio
1a74142a16 gha: basic-ci: Add a timeout for the tests
This will ensure no job will be stuck forever, as we've noticed with a
few jobs already.

Fixes: #8572

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 09:42:46 +01:00
GabyCT
e8b28fed2a Merge pull request #8540 from GabyCT/topic/fixctrdoc
docs: Update cri installation url link
2023-12-04 17:36:33 -06:00
Archana Shinde
2df8144cfe runtime-rs: Launch cloud-hypervisor in given netns
Launch cloud-hypervisor binary in the netns provided at the prepare_vm
stage.

Fixes: #6441

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-12-04 13:02:43 -08:00
Hyounggyu Choi
511dd5feac local-build: add support to build IBM Z SE image
This is to add an artifact for IBM Z SE(TEE) to main.

Fixes: #6754

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:08:51 +01:00
Hyounggyu Choi
4de8ef3d18 local-build: add build target boot-image-se
This is to add a build target boot-image-se for s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:08:51 +01:00
Hyounggyu Choi
a63a6959d1 local-build: install s390-tools in Dockerfile
This is to install s390-tools including genprotimg during the docker
build.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:08:51 +01:00
Hyounggyu Choi
6d0dabd81e gha: build secure image for s390x release
This is to add a build target boot-image-se with a host-key-document
config for s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:08:51 +01:00
Hyounggyu Choi
bb1d4adaa9 config: add SE configuration
This is to add SE configuration which is used by kata runtime.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:08:49 +01:00
Gabriela Cervantes
2b05029347 docs: Update cri installation url link
This PR updates the cri installation url link for the containerd
documentation.

Fixes #8539

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-04 20:07:49 +00:00
Hyounggyu Choi
8de4241d3b kata-deploy: add kata-qemu-se runtimeclass
This is to increase resources for relaxing the limitation of hotplug for
SE.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:06:53 +01:00
Hyounggyu Choi
9ede2bcd95 local-build: differentiate build targets based on architecture
This is to rule out unnecessary build targets for s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:06:53 +01:00
GabyCT
1c00a9a6a9 Merge pull request #8524 from GabyCT/topic/addiperfinfo
docs: Update iperf3 network documentation
2023-12-04 14:03:30 -06:00
GabyCT
1b204cc3cb Merge pull request #8550 from GabyCT/topic/enableclhstability
gha: Add cloud runtime rs as part of the stability tests
2023-12-04 11:37:58 -06:00
Gabriela Cervantes
dfc07d1c72 gha: stability: Add cloud-hypervisor (runtime-rs) support
This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs,
as part of the stability tests.

Fixes #8462

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-04 15:32:29 +00:00
Fabiano Fidêncio
8d7e0f7721 Merge pull request #8556 from fidencio/topic/kernel-add-tdx-guest-driver
kernel: Add CONFIG_TDX_GUEST_DRIVER to the tdx.conf
2023-12-04 15:13:57 +01:00
James O. D. Hunt
e4aebb4560 Merge pull request #8549 from jodh-intel/tdx-no-root
libs: protection: x86_64: drop root requirement for querying
2023-12-04 13:03:10 +00:00
Chao Wu
1550ee6767 Merge pull request #8480 from openanolis/chao/add_dbs_pci
dragonball: init dbs-pci lib with pci bus & pci conf
2023-12-04 18:08:40 +08:00
Fabiano Fidêncio
03c3f4275e kernel: Add CONFIG_TDX_GUEST_DRIVER to the tdx.conf
The driver enables the userspace interface to communicate with the TDX
module to request the TDX guest details, like the attestation report.

Fixes: #8555

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-04 10:25:59 +01:00
Biao Lu
b816dca3ed image-builder: fix incorrect part start position
The 'part_start' of image and dax_image should specify exactly the
same location. According to the parted documentation, to specify a
location exactly, the units of start and end should be MiB.

https://www.gnu.org/software/parted/manual/parted.html#IEC-binary-units
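
A small sketch of the arithmetic, assuming the offset is known in bytes: the
value is only expressed in MiB when it falls on a whole MiB boundary, so the
string handed to parted names one exact location. `to_parted_mib` is a
hypothetical helper, not part of image-builder.

```python
MIB = 1024 * 1024  # one mebibyte (IEC binary unit) in bytes

def to_parted_mib(offset_bytes):
    """Format a byte offset as a parted location in MiB, refusing inexact values."""
    if offset_bytes % MIB != 0:
        raise ValueError(f"{offset_bytes} bytes is not a whole number of MiB")
    return f"{offset_bytes // MIB}MiB"

part_start = to_parted_mib(3 * MIB)
```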

Fixes: #8435

Signed-off-by: Biao Lu <biao.lu@intel.com>
2023-12-04 17:20:26 +08:00
Chao Wu
52fd57e49a Merge pull request #8301 from Apokleos/do-direct-volume
runtime-rs: Enhancing DirectVolMount Handling with Patching Support
2023-12-04 16:49:46 +08:00
James O. D. Hunt
7beab11d9e Merge pull request #8547 from jodh-intel/unbreak-logger
libs:logging: Fix logger
2023-12-04 08:38:03 +00:00
alex.lyn
0fabfa336d runtime-rs: bring support for legacy vsock device.
Bring support for legacy vsock and add Vsock to the ResourceConfig
enum type, and add the processing flow of the Vsock device to the
prepare_before_start_vm function.

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-04 15:54:51 +08:00
alex.lyn
6c08cf35d5 runtime-rs: Introduce prepare_vm_socket_config to VirtSandbox.
Introduce prepare_vm_socket_config to VirtSandbox for vm
socket config, including Vsock and Hybrid Vsock.
Use the capabilities() trait of the hypervisor to get the
vm socket supported in VMM.

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-04 15:54:50 +08:00
alex.lyn
60f88da5e1 runtime-rs: add Capability of HybridVsockSupport for Hypervisor.
Add Cap of HybridVsockSupport for hypervisors CLH and Dragonball
which use hybrid-vsock, default for Qemu, which uses legacy vsock.

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-04 15:54:50 +08:00
alex.lyn
c5178dd258 runtime-rs: Introduce Capability of HybridVsockSupport.
Introduce HybridVsock Cap to judge which kind of vm socket will
be supported by the Hypervisor.
Use `is_hybrid_vsock_supported` to tell whether a hypervisor supports
hybrid-vsock; if not, it supports legacy vsock.
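
The check can be sketched as a capability flag test. The Python below is only
an illustration of the idea, not the Rust implementation: the `Capability`
class and `vm_socket_kind` are hypothetical names, and the per-hypervisor
mapping follows what this series states (CLH and Dragonball use hybrid-vsock,
QEMU uses legacy vsock).

```python
from enum import Flag

class Capability(Flag):
    """Hypothetical capability flags; only HYBRID_VSOCK is named in this series."""
    NONE = 0
    HYBRID_VSOCK = 1

# Assumed mapping, based on the commit descriptions in this series.
HYPERVISOR_CAPS = {
    "cloud-hypervisor": Capability.HYBRID_VSOCK,
    "dragonball": Capability.HYBRID_VSOCK,
    "qemu": Capability.NONE,
}

def is_hybrid_vsock_supported(caps):
    """True when the hybrid-vsock capability bit is set."""
    return bool(caps & Capability.HYBRID_VSOCK)

def vm_socket_kind(hypervisor):
    """Pick the vm socket type: hybrid-vsock if supported, otherwise legacy vsock."""
    caps = HYPERVISOR_CAPS[hypervisor]
    return "hybrid-vsock" if is_hybrid_vsock_supported(caps) else "legacy-vsock"
```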

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-12-04 15:54:29 +08:00
James O. D. Hunt
e1caca3e41 kata-ctl: Remove root requirement for "env"
Remove the redundant `kata-ctl` `root` check when running the `env`
command. This check duplicated the `GuestProtection` check, and that
check is now no longer necessary anyway.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-01 15:55:45 +00:00
James O. D. Hunt
f05ada592f libs: protection: x86_64: drop root requirement for querying
It is no longer necessary to be `root` to query the guest protection
(TDX) on `x86_64` systems, so drop the requirement.

> **Note:**
>
> This change drops the `nix` `Uid` import required for the `root` check.
> But at the same time it adds it for PPC64le since that implementation of
> `available_guest_protection()` needs it and it was previously missing.

Fixes: #8548.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-01 15:55:21 +00:00
Fabiano Fidêncio
852021e416 Merge pull request #8483 from fidencio/topic/move-rust-config-files-to-subdir-based-on-jodh-approach
build/kata-deploy: Move rust runtime config files to runtime-rs directory -- based on #8445
2023-12-01 16:22:51 +01:00
James O. D. Hunt
f9f1d3a071 libs:logging: Fix logger
PR #8311 inadvertently broke the logging since no log messages below the
`Info` level are logged now, regardless of the requested log level.

Resolve the issue by storing the requested log level in the
`RuntimeComponentLevelFilter` and using that level in the `log()`
function, rather than hard-coding `Info` as the default where no entry
is found in the `FILTER_RULE` hashmap.
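
In outline, the fix amounts to the following. This is a Python sketch of the
idea rather than the Rust code: `ComponentLevelFilter`, `Level` and
`should_log` are hypothetical stand-ins for `RuntimeComponentLevelFilter`, the
log levels and `log()`.

```python
from enum import IntEnum

class Level(IntEnum):
    # Lower value means more verbose; a record passes when it is at or above the threshold.
    TRACE = 0
    DEBUG = 1
    INFO = 2
    WARN = 3
    ERROR = 4

class ComponentLevelFilter:
    def __init__(self, requested_level, filter_rule=None):
        # Remember the level the user asked for instead of assuming INFO later.
        self.requested_level = requested_level
        self.filter_rule = filter_rule or {}

    def should_log(self, component, record_level):
        # Fall back to the requested level when the component has no rule.
        threshold = self.filter_rule.get(component, self.requested_level)
        return record_level >= threshold
```

With a requested level of `DEBUG`, a `DEBUG` record from a component without
its own rule is now accepted, whereas a hard-coded `INFO` fallback would drop it.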

Fixes: #8546.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-01 12:21:20 +00:00
yuchen.cc
1cd1558a92 mount: support checking multiple kinds of block device driver
Device mapper is the only supported block device driver so far,
which seems limiting. Kata Containers can work well with other
block devices. This change adds support for checking multiple
kinds of host block device driver.

Fixes #4714

Signed-off-by: yuchen.cc <yuchen.cc@alibaba-inc.com>
2023-12-01 11:59:30 +08:00
Chelsea Mafrica
818b8f93b1 Merge pull request #8288 from cmaf/migrate-static-checks
Migrate static checks
2023-11-30 17:44:16 -08:00
Chelsea Mafrica
207a7fef90 Merge pull request #7815 from cmaf/runtime-rs-ch-vsock
runtime-rs: Add Hybrid VSOCK device handling for CH
2023-11-30 12:22:36 -08:00
GabyCT
2bd21f7831 Merge pull request #8531 from GabyCT/topic/fixiperfli
metrics: Fix iperf parallel bandwidth limit
2023-11-30 13:47:00 -06:00
Chao Wu
b3da71f21e dragonball: init dbs-pci lib with pci bus & pci conf
This commit inits dbs-pci lib for Dragonball to use.
It currently contains two implementations:

1. PCI configuration space
2. PCI bus

More information on the design and behavior of those two features can be
found in the README of dbs-pci.

fixes: #8479

Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com>
Signed-off-by: Yang Su <yang.su@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Xin Lin <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-11-30 23:40:26 +08:00
Dan Mihai
38f24c41c0 Merge pull request #8271 from microsoft/danmihai1/exec-test-failure
tests: more k8s-exec-rejected debug output
2023-11-30 07:11:01 -08:00
Greg Kurz
48e5596186 Merge pull request #8456 from cheriL/8447/alpine_bash
osbuilder: add pkg bash for alpine
2023-11-30 13:43:48 +01:00
Steve Horsman
c6110284d5 Merge pull request #8520 from stevenhorsman/hypervisor-ttrpc
runtime: Update hypervisor generated code
2023-11-30 10:01:56 +00:00
Amulya Meka
3d5db65b2e Merge pull request #8526 from Amulyam24/workflow-ppc
gha: fix artefacts build on ppc64le
2023-11-30 15:00:06 +05:30
Fabiano Fidêncio
80fcc56cef Merge pull request #8528 from fidencio/topic/stop-building-and-shipping-log-parser-rs
tools: Stop building / shipping log-parser-rs
2023-11-30 09:14:10 +01:00
Fabiano Fidêncio
9b30d97885 Merge pull request #8533 from fidencio/topic/fix-invalid-cpu-topology-for-tdx
Revert "runtime: confidential: Do not set the max_vcpu to cpu"
2023-11-30 09:06:45 +01:00
Amulyam24
6a922f0e37 gha: fix artefacts build on ppc64le
Add step in the right place to prepare the runner for the builds/tests.

Fixes: #8525

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-11-30 09:50:47 +05:30
soup
811ec07359 osbuilder: add pkg bash for alpine
The bash component is required in the guest for debug console to work properly.

Fixes: #8447

Signed-off-by: soup <lqh348659137@outlook.com>
2023-11-30 09:42:39 +08:00
Fabiano Fidêncio
f15e16b692 Revert "runtime: confidential: Do not set the max_vcpu to cpu"
This reverts commit b0157ad73a.
```
commit b0157ad73a
Refs: 3.3.0-alpha0-124-gb0157ad73
Author:     Fabiano Fidêncio <fabiano.fidencio@intel.com>
AuthorDate: Fri Aug 11 14:55:11 2023 +0200
Commit:     Fabiano Fidêncio <fabiano.fidencio@intel.com>
CommitDate: Fri Nov 10 12:58:20 2023 +0100

    runtime: confidential: Do not set the max_vcpu to cpu

    We don't have to do this since we're relying on the
    `static_sandbox_resource_mgmt` feature, which gives us the correct
    amount of memory and CPUs to be allocated.

    Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
```

This commit was removing a requirement that was made previously, but due
to the SMP issue we're facing with the QEMU used for TDX (see commit
d1b54ede290e95762099fff4e0bcdad10f816126*), QEMU will fail to start due
to:
```
Invalid CPU topology: product of the hierarchy must match maxcpus:
sockets (1) * dies (1) * cores (1) * threads (1) != maxcpus (240)"
```
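
The rule quoted in that error can be sketched as a simple product check. This
is an illustration of the constraint only, written in Python, and not QEMU's
implementation; `check_cpu_topology` is a hypothetical name.

```python
def check_cpu_topology(sockets, dies, cores, threads, maxcpus):
    """Raise if the topology product does not match maxcpus."""
    product = sockets * dies * cores * threads
    if product != maxcpus:
        raise ValueError(
            "Invalid CPU topology: product of the hierarchy must match maxcpus: "
            f"sockets ({sockets}) * dies ({dies}) * cores ({cores}) * "
            f"threads ({threads}) != maxcpus ({maxcpus})"
        )
    return product
```

With a 1 * 1 * 1 * 1 topology, the check only passes when maxcpus is also 1,
which is why maxcpus has to follow the number of vCPUs in this setup.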

This has no effect on the SEV / SNP workflow and hopefully we'll be able
to re-revert this soon enough, when this gets solved on the QEMU side.

Last but not least, this is not a "clean" revert as we're using
conf.NumVCPUs() instead of conf.NumVCPUs, to ensure we're dealing with
uint32.

Fixes: #8532

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-30 00:41:27 +01:00
Fabiano Fidêncio
1284b4e80d tools: Stop building / shipping log-parser-rs
This is a commit that's a pre-req for #6826, as that PR will merge
log-parser-rs into kata-ctl, but that will result in a CI breakage.

So, let's deal with the CI changes here, thanks to GHA and our favourite
`pull_request_target` event, unblocking that PR to be merged.

Fixes: #6797 (not really, but related).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-30 00:32:10 +01:00
Gabriela Cervantes
37633d3cc2 metrics: Fix iperf parallel bandwidth limit
This PR fixes the iperf parallel bandwidth limit for the kata
metrics CI.

Fixes #8530

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-29 19:59:45 +00:00
Dan Mihai
96deea52f2 tests: more k8s-exec-rejected debug output
Print more information useful for debugging. Also, use a separate YAML
file for this test, instead of reusing someone else's file.

Fixes: #8270

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-11-29 18:05:15 +00:00
stevenhorsman
47b8c3181f runtime: remote hypervisor updates to ttrpc
- Update the remote hypervisor code to match the re-genned code for
the ttrpc Hypervisor Service

Fixes: #8519
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-11-29 18:04:40 +00:00
stevenhorsman
613c75ba8c runtime: Update hypervisor generated code
Update to use ttrpc_out instead of grpc_out

Fixes: #8519
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-11-29 18:04:40 +00:00
GabyCT
1f1e5377e5 Merge pull request #8497 from GabyCT/topic/removemetricsstratovirt
gha: Disable stratovirt for gha metrics
2023-11-29 11:16:53 -06:00
Fabiano Fidêncio
8fd39d11c4 tests: Adapt enable_hypervisor to the runtime-rs config location change
As the configuration for the runtime-rs based drivers are now placed in
a different location than the golang ones, we should adapt this script
accordingly.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-29 14:51:35 +01:00
Fabiano Fidêncio
38183acbcb tests: Use kata-ctl instead of kata-runtime for runtime-rs
`kata-ctl` is the tool for runtime-rs, and it should be used instead of
`kata-runtime`.

`kata-ctl` requires sudo, and that's the reason it's also been added as
part of the calls.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-29 14:51:35 +01:00
Fabiano Fidêncio
a5a73a11cb tests: Replace kata-runtime kata-env by kata-runtime env
`kata-runtime env` is an alias for `kata-runtime kata-env`, and calling
it with the `env` parameter allows us to easily extend the scripts to
use `kata-ctl` instead of `kata-runtime` when dealing with runtime-rs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-29 14:51:31 +01:00
Chelsea Mafrica
05efb23261 tests: update go.mod and go.sum
Generate a go.sum file for tests.

Fixes #8187

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-11-28 17:40:41 -08:00
Fabiano Fidêncio
30acb5a0c0 tests: nydus: Adapt the default config file for runtime-rs based drivers
As we've done some changes in the runtime-rs based drivers to install
their configuration into a different location, this should also be
reflected as part of this test.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-28 20:37:59 +01:00
Chelsea Mafrica
6d9cb9325d tests: update scripts for static checks migration
Updates to scripts for static-checks.sh functionality, including common
functions location, the move of several common functions to the existing
common.bash, adding hadolint and xurls to the versions file, and changes
to static checks for running in the main kata containers repo.

The changes to the vendor check include searching for existing go.mod
files but no other changes to expand the test.

Fixes #8187

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
66f3944b52 tests: move github-labels to main repo
Move tool as part of static checks migration.

Fixes #8187

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Derek Lee <derlee@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
7f3c12f1dd tests: move spell check tool to main repo
Move tool as part of static checks migration.

Fixes #8187

Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Dan Middleton <dan.middleton@intel.com>
Signed-off-by: Derek Lee <derlee@redhat.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Signed-off-by: Hui Zhu <teawater@antfin.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Jimmy Xu <xjmmyshcn@gmail.com>
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
8ad433d4ad tests: move markdown check tool to main repo
Move the tool as a dependency for static checks migration.

Fixes #8187

Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Julio Montes <julio.montes@intel.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
eaa6b1b274 tests: move static checks and dependencies from tests
Move static checks scripts and dependencies from tests to
kata-containers repo.

Fixes #8187

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Dan Middleton <dan.middleton@intel.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Derek Lee <derlee@redhat.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Xu Wang <xu@hyper.sh>
Signed-off-by: Yang Bo <bo@hyper.sh>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-11-28 11:13:55 -08:00
Fabiano Fidêncio
61aa84b158 Revert "tests: k8s: Allow passing rust-runtime env var to kata-deploy"
This reverts commit 44899d4cdf, as we've
decided to keep both golang and rust runtime installable and usable at
the same time.

The decision of having both runtimes installable and usable will help
users to test and easily catch any possible differences between those
runtimes, helping us to get on par with both implementations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-28 18:02:07 +01:00
James O. D. Hunt
158ca17ae7 kata-deploy: Add cloud-hypervisor
Now that we have a separate Cloud Hypervisor configuration file for the
rust runtime, add it to the kata-deploy.

See: https://github.com/kata-containers/kata-containers/pull/8250

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-28 18:02:06 +01:00
Fabiano Fidêncio
d4e00238ab kata-deploy: Improve the logic for linking to the rust runtime
This change for now doesn't do much, apart from making it easier to
expand which runtimes should be linked to the runtime-rs containerd shim
binary.

Also, this matches the logic used for the config files.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-28 18:01:27 +01:00
James O. D. Hunt
fc28deee0e kata-deploy: Use rust runtime config files in runtime-rs directory
Update `kata-deploy` to modify the rust runtime configuration files in
their new `runtime-rs/` directory.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-28 18:01:25 +01:00
Gabriela Cervantes
9166d0aabb docs: Update iperf3 network documentation
This PR updates the iperf3 network documentation to include
the parallel bandwidth.

Fixes #8523

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-28 15:59:38 +00:00
Wainer dos Santos Moschetta
48bdca4c49 tests/k8s: add k8s-measured-rootfs.bats
Implements the following test case:

  Scenario: Check incorrect hash fails
  **Given** I have a version of kata installed that has a kernel with the
  initramfs built and config with rootfs_verity.scheme=dm-verity
  rootfs_verity.hash=<incorrect hash of rootfs> set in the kernel_params
  **When** I try to create a basic pod
  **Then** The pod doesn't run
  **And**  Ideally we'd get a helpful message to indicate why

Currently on CI only qemu-tdx is built with measured
rootfs support in the kernel, so the test is restricted to that
runtimeclass.

Fixes #7415
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:54 -03:00
Wainer dos Santos Moschetta
1eae657b91 tests/k8s: add set_node() to lib.sh
Use this new function to set the node where the pod should be scheduled
to.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
c6075c8627 tests/k8s: add setup common
Bring the setup_common() from CCv0 branch test's
integration/kubernetes/confidential/tests_common.sh. It should be used
to reduce boilerplates on the setup() of the tests.

Unlike the original code, this won't export the `test_start_time` variable,
as it wouldn't be accurate for grabbing logs from the worker nodes due to
the date/time mismatch between the machine running the tests and the worker
node. The function exports the `node` variable, which holds the name of
a random node that has kata installed. Apart from that, it exports
`node_start_time`, which captures the date/time when the test started,
relative to the `node`.

Tests that should inspect the logs can schedule pods/resources to the `node`
and use `node_start_time` as the value reference to grep the logs.

Fixes #7590
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
220a2d9a15 tests/k8s: add assert_logs_contain() to lib.sh
Bring the assert_logs_contain() from CCv0 branch tests'
integration/kubernetes/confidential/lib.sh.

Introduced the print_node_journal() which uses `kubectl debug` to print
the systemd's journal of a k8s's node.

Fixes #7590
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
9a9c7a5c6f tests/k8s: add set_metadata_annotation() to lib.sh
This new function allows adding annotations to the metadata section of a
YAML configuration file.

Co-authored-by: Ryan Savino <ryan.savino@amd.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
a13eecf7f3 runtime(-rs): add clean-generated-files target
The new clean-generated-files make target allows for removing the
generated files (including the configuration.toml files).

The tools/packaging/static-build/shim-v2/build.sh script now uses that
target to always force the re-generation of those files.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
36ea1b8ee7 tests/k8s: add new_pod_config() to lib.sh
Copied the new_pod_config() and pod-config.yaml.in from CCv0 branch
tests' integration/kubernetes/confidential/tests_common.sh and fixtures.
Unlike the original version, new_pod_config() now gets the runtimeclass
by parameter as the RUNTIMECLASS environment variable seems not broadly
used on main branch's CI.

The pod-config.yaml.in was changed as the diff shows below. In
particular the imagePullSecrets was removed to avoid it throwing a
warning on the pod's log.

```
--- a/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in
+++ b/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in
@@ -5,12 +5,10 @@
 apiVersion: v1
 kind: Pod
 metadata:
-  name: busybox-cc
+  name: test-e2e
 spec:
   runtimeClassName: $RUNTIMECLASS
   containers:
-  - name: nginx
+  - name: test_container
     image: $IMAGE
-    imagePullPolicy: Always
-  imagePullSecrets:
-  - name: cococred
\ No newline at end of file
+    imagePullPolicy: Always
\ No newline at end of file
```

Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
428daf9ebc tests/k8s: add utilities functions for the tests
The following functions were copied from CCv0's branch test's
integration/kubernetes/confidential/lib.sh. I made only small
refactorings (shortened their names and fixed shellcheck warnings):

- k8s_delete_all_pods_if_any_exists()
- k8s_wait_pod_be_ready()
- k8s_create_pod()
- assert_pod_fail()

Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Jordan Jackson <jordan.jackson@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
ba4f806c30 initramfs: re-wrote devices checking on init.sh
Re-wrote the logic of init.sh to follow the rules:

 * the root device MUST exist always because it will be either mounted
   or verified (then mounted)
 * if rootfs verifier is enabled then the hash device MUST exist. Avoid
   the case where dm-verity is set but the hash device does not exist and
   so the verification is silently skipped
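
The two rules above can be sketched as follows. This is a minimal
illustration in Python, not the shell code of init.sh; the function name,
its parameters and the returned strings are hypothetical.

```python
import os


def check_devices(root_dev, hash_dev, verity_enabled, exists=os.path.exists):
    """Apply the two rules and return the action to take (hypothetical names)."""
    # Rule 1: the root device must always exist.
    if not exists(root_dev):
        raise FileNotFoundError(f"root device {root_dev} not found")
    # Rule 2: with the verifier enabled, a missing hash device is an error,
    # never a reason to silently skip verification.
    if verity_enabled and not exists(hash_dev):
        raise FileNotFoundError(f"hash device {hash_dev} not found")
    return "verify-then-mount" if verity_enabled else "mount"
```

Failing loudly in the second check is the point of the change: a missing
hash device can no longer downgrade a measured boot to a plain mount.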

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
72ef82368c shim-v2: ensure root hash exist when measured rootfs
When measured rootfs is enabled, the shim-v2 build must find the
guest rootfs hash file; otherwise it might silently generate configuration
files with an empty hash.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
1465e58854 kernel: ensure initramfs exist when measured rootfs
The KATA_BUILD_CC variable plus the existence (or not) of the initramfs
were used to determine whether to build the kernel for measured rootfs
or not. The MEASURED_ROOTFS variable is now used to trigger the feature
build, and when it is set the build should expect the initramfs to
exist. In other words, this changes the kernel build so that if
`MEASURED_ROOTFS=yes` then the initramfs file must exist and be found.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
4dbba5215f shim-v2: moved measured rootfs logic to its builder
Moved the measured rootfs logic from kata-deploy-binaries.sh to the
shim-v2's builder script so that the former is less bloated with
component-specific code.

Fixes #6674
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
34be78df19 kernel: moved measured rootfs logic to its builder
Moved the measured rootfs logic from kata-deploy-binaries.sh to the
kernel's builder script so that the former is less bloated with
component-specific code.

Fixes #6674
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
3f16d29593 kernel: measured rootfs as argument to build-kernel.sh
By convention, the caller of tools/packaging/kernel/build-kernel.sh changes
the script's behavior by passing arguments, whereas measured rootfs has
been enabled through an environment variable (MEASURED_ROOTFS). This
refactors the script so that the caller must now pass the "-m" argument
to enable building the kernel with measured rootfs support.

Fixes #6674
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:51 -03:00
Fabiano Fidêncio
80860478bf runtime-rs: Remove the golang config paths
As the configuration files are different, we can safely remove those as
any new installation of the binary should also bring in the new
configurations.

This makes things less error-prone in the future, as we're ensuring that
the rust runtime will only be reading the rust configuration files.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-28 15:16:53 +01:00
James O. D. Hunt
b86ab5aa21 runtime-rs: Update list of config paths to check
Update the `DEFAULT_RUNTIME_CONFIGURATIONS` list to include a number of
rust runtime specific paths to try to load before checking the
"traditional" (golang) runtime configuration paths.
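
A minimal sketch of this kind of ordered lookup, in Python rather than the
runtime's Rust. The list below is illustrative only: the `/etc/...` entries
and the `configuration.toml` file name are assumptions, and the real
contents of `DEFAULT_RUNTIME_CONFIGURATIONS` may differ.

```python
from pathlib import Path

# Illustrative candidates: rust-specific paths first, then traditional ones.
CANDIDATE_PATHS = [
    "/etc/kata-containers/runtime-rs/configuration.toml",
    "/opt/kata/share/defaults/kata-containers/runtime-rs/configuration.toml",
    "/etc/kata-containers/configuration.toml",
    "/opt/kata/share/defaults/kata-containers/configuration.toml",
]


def first_existing(paths, exists=lambda p: Path(p).is_file()):
    """Return the first path that exists, or None if none of them do."""
    for path in paths:
        if exists(path):
            return path
    return None
```

Because the first hit wins, listing the rust-specific paths first means a
golang configuration file is only picked up when no rust one is installed.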

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-28 15:16:53 +01:00
James O. D. Hunt
89ef464b7c build: Install rust config files to runtime-rs directory
Install the rust runtime configuration files to a `runtime-rs/`
directory to distinguish them from the golang config files (which may
have a different syntax).

The default values mean that the rust config files are now installed to
`/opt/kata/share/defaults/kata-containers/runtime-rs/` rather than
`/opt/kata/share/defaults/kata-containers/`.

See: https://github.com/kata-containers/kata-containers/issues/6020

Fixes: #8444.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-28 15:16:53 +01:00
alex.lyn
fe68f25bea runtime-rs: enhancement of vfio volume.
Reimplement vfio volume into direct_volume and do alignment
of rawblock/spdk volume.

Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-28 10:08:05 +08:00
alex.lyn
e3fd403126 runtime-rs: enhancement of spdk volume.
(1) Add enum DirectVolumeType for direct volumes.
(2) Reimplement spdk volume into direct_volume and
do alignment of rawblock volume.

Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-28 10:08:05 +08:00
alex.lyn
f973729029 runtime-rs: Enhancing DirectVolMount Handling for current Infra.
The current infra (K8s, CSI, CRI, containerd) for Kata Containers is
unable to properly handle direct volumes, resulting in the need for
workarounds such as searching/comparing and then patching up the volume type.

This commit reimplements the handling to support raw block volumes
whose backend may be a raw disk or a file in another format.

Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-28 10:08:05 +08:00
alex.lyn
e3becea566 runtime-rs: add support kata/multi-containers sharing one vfio volume.
Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-28 10:07:23 +08:00
Steve Horsman
891f488ee3 Merge pull request #8501 from Amulyam24/containerd-tests
gha: add cri-containerd workflow for ppc64le
2023-11-27 17:22:59 +00:00
James O. D. Hunt
45cc417a4e Merge pull request #8461 from jodh-intel/update-codeowners
CODEOWNERS: Expand scope
2023-11-27 15:38:39 +00:00
Fabiano Fidêncio
bb4c51a5e0 Merge pull request #8494 from ChengyuZhu6/kata_virtual_volume
runtime: Pass `KataVirtualVolume` to the guest as devices in go runtime
2023-11-27 16:02:28 +01:00
Steve Horsman
bee6fba5c7 Merge pull request #8459 from Amulyam24/workflow-1
github: add workflows for building and publishing kata artefacts on ppc64le
2023-11-27 14:31:20 +00:00
Amulyam24
754aec02c3 gha: add cri-containerd workflow for ppc64le
This PR adds a workflow to run containerd tests on Power as part of the CI migration.

Fixes: #8500

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-11-27 17:58:58 +05:30
alex.lyn
6af0592274 runtime-rs: Add vsock device in device manager.
(1) Implement Device Trait for vsock device.
(2) add vsock device in device manager.

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-27 15:23:18 +08:00
alex.lyn
1a6b45d3b7 runtime-rs: Reintroduce Vsock and add it to the DeviceType enum
As the vsock device will be used in QEMU or other VMMs, Vsock
is reintroduced to the DeviceType enum.

Fixes: #8474

Signed-off-by: Pavel Mores <pmores@redhat.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-27 15:12:44 +08:00
alex.lyn
e31dbc94a5 runtime-rs: remove vhost_fd from VsockConfig and make it cloneable.
It is currently difficult to clone VsockConfig because the vhost fd is
implicitly managed within runtime-rs. This responsibility should be
delegated to the VMM (especially QEMU) child process, as it is not a
core responsibility of runtime-rs. Remove the vhost_fd member from
VsockConfig and make VsockConfig/VsockDevice cloneable.

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-27 15:11:21 +08:00
alex.lyn
eb90962b27 runtime-rs: introduce a new function generate_vhost_vsock_cid.
Introduce a new function generate_vhost_vsock_cid to generate
a guest CID and set it for the vsock fd.
This commit introduces no functional change; the code is
just split out of the previous VsockDevice::new().

Fixes: #8474

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-27 15:06:58 +08:00
alex.lyn
b952c5c5ce runtime-rs: add support kata/multi-containers sharing one spdk volume.
Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-25 21:13:03 +08:00
alex.lyn
17d2d465d1 runtime-rs: re-organize the volumes with adding new direct_volumes.
Add a new directory, direct_volumes, containing the spdk, rawblock and vfio volumes.

Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-25 21:04:55 +08:00
alex.lyn
6731466b13 runtime-rs: set a standard NotFound when direct volume path not found.
Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-25 19:51:12 +08:00
alex.lyn
d23867273f runtime-rs: split the block volume into block and rawblock volume
(1) rawblock volume is directvol mount type.
(2) block volume is based on the bind mount type.

Fixes: #8300

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-24 23:30:30 +08:00
Amulyam24
ae2c0c5696 github: add workflows for building and publishing kata artifacts on ppc64le
Adds workflows for building kata static tarball and releasing it.

Fixes: #8458

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-11-24 15:53:38 +05:30
ChengyuZhu6
5318afe273 runtime: support to create VirtualVolume rootfs storages
1) Create storage for every `io.katacontainers.volume=` entry in rootFs.Options,
and aggregate all of these storages into `containerStorages`.
2) Create storage for other data volumes and push it into `volumeStorages`.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-23 23:22:55 +08:00
ChengyuZhu6
0b4f7c2ee7 runtime: redefine and add functions to handle VirtualVolume to storage
1) Extract function `handleBlockVolume` to create Storage only.
2) Add functions to handle KataVirtualVolume device and construct
   corresponding storages.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-23 23:07:32 +08:00
ChengyuZhu6
bd099fbda9 runtime: extend SharedFile to support mutiple storage devices
To improve the construction and management of `KataVirtualVolume` storages,
this commit expands the 'sharedFile' structure to manage both
rootfs storages (`containerStorages`), including `KataVirtualVolume`, and other data volume storages (`volumeStorages`).

NOTE: `volumeStorages` is intended for future extensions to support Kubernetes data volumes.
Currently, `KataVirtualVolume` is exclusively employed for container rootfs, hence only `containerStorages` is actively utilized.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-23 23:05:14 +08:00
ChengyuZhu6
e4f33ac141 runtime: add functions to create devices in KataVirtualVolume
The snapshotter places `KataVirtualVolume` information
into 'rootfs.options' as entries that start with the prefix 'io.katacontainers.volume='.
This commit transforms the encapsulated KataVirtualVolume data into device information.
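
As a rough illustration of the prefix convention only (Python, not the Go
runtime code), the entries of interest can be picked out of an options list
as below. The function name is hypothetical and the payload after the
prefix is treated as opaque, since its encoding is not described here.

```python
PREFIX = "io.katacontainers.volume="


def virtual_volume_payloads(options):
    """Return the payload of every option that carries the volume prefix."""
    # Options without the prefix are ordinary mount options and are ignored.
    return [opt[len(PREFIX):] for opt in options if opt.startswith(PREFIX)]
```

For example, `virtual_volume_payloads(["ro", PREFIX + "abc"])` yields
`["abc"]`, leaving the `ro` option to be handled elsewhere.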

Fixes: #8495

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Feng Wang <feng.wang@databricks.com>
Co-authored-by: Samuel Ortiz <sameo@linux.intel.com>
Co-authored-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-11-23 23:05:13 +08:00
Dan Mihai
756022787c Merge pull request #8239 from Sumynwa/sumsharma/fix_configmap_update_propagation
runtime: Fix configmap/secrets updates with FS sharing disabled
2023-11-23 06:50:53 -08:00
Chelsea Mafrica
98aa291c9e runtime-rs: Add Hybrid VSOCK device handling for CH
Update cloud hypervisor implementation to allow hybrid vsock device to
be handled.

Fixes #6692

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-11-22 14:42:09 -08:00
Gabriela Cervantes
8839ca93ba gha: Disable stratovirt for gha metrics
This PR disables the stratovirt for gha metrics.

Fixes #8496

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-22 16:17:31 +00:00
briwan01
231b9dfd9d runtime-rs/clh: Fix unable to boot container
When Cloud Hypervisor runs on the arm64 architecture,
only the Arm AMBA UART (pl011) is supported as the TTY. Consequently,
when enabling hypervisor debug mode, it is essential to configure
the console as "ttyAMA0" rather than "ttyS0".
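
The selection described above amounts to a per-architecture choice of
console name. A minimal sketch in Python, with a hypothetical function name
and assumed architecture strings (the actual runtime-rs code is Rust and
may decide this differently):

```python
def debug_console(arch: str) -> str:
    """Pick the debug console device name for the given architecture."""
    # arm64 guests under Cloud Hypervisor only expose the pl011 UART.
    if arch in ("aarch64", "arm64"):
        return "ttyAMA0"
    return "ttyS0"
```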

Fixes: #8381

Signed-off-by: briwan01 <brian.wang@arm.com>
2023-11-22 17:52:11 +08:00
GabyCT
358f32e8bb Merge pull request #8467 from GabyCT/topic/fixresult
metrics: Fix result finding in tensorflow benchmark
2023-11-21 13:41:46 -06:00
Fabiano Fidêncio
45a41c3431 Merge pull request #8481 from ChengyuZhu6/guest-kernel
kernel: backport erofs patch to 6.1.52 guest kernel
2023-11-21 12:22:24 +01:00
Fabiano Fidêncio
8425c78c91 Merge pull request #8476 from fidencio/topic/gha-pass-rust-runtime-to-kata-deploy
tests: k8s: Allow passing rust-runtime env var to kata-deploy
2023-11-21 11:09:01 +01:00
Chao Wu
6a6c3c53b5 Merge pull request #8450 from adamqqqplay/vhost-user-general
dragonball: add vhost-user connection management logic
2023-11-21 16:05:17 +08:00
ChengyuZhu6
6de01eacfd kernel: backport erofs patch to 6.1.52 guest kernel
Backport the erofs patch from linux kernel to solve the error #8083

Fixes: #8083

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-11-21 15:22:40 +08:00
Amulyam24
d8a8cc4491 tools: install oras from source on ppc64le
Since the release is not yet out for ppc64le, build oras from source and use it.

Fixes: #8458

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-11-21 11:38:20 +05:30
Amulyam24
08f3603123 tools: fix static build of qemu and shimv2 on ppc64le
- statically linked qemu requires slof.bin to run, hence remove it from blacklist
- By default, initrd is used for Power, modify the configuration.toml accordingly

Fixes: #8458

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-11-21 11:38:20 +05:30
Alex.Lyn
4fd2914a33 Merge pull request #7932 from Apokleos/wrap-virtiofs-in-dm
runtime-rs: bringing virtio-fs device in device-manager
2023-11-21 13:48:15 +08:00
Huang Jianan
a9571398a6 dragonball: add test utils for vhost-user
The test utils will be used by the upcoming feature tests: vhost-user-net,
vhost-user-blk and vhost-user-fs.

Signed-off-by: Beiyue <beiyue@linux.alibaba.com>
Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>
2023-11-21 09:51:56 +08:00
Qinqi Qu
a6a399d5bc dragonball: add vhost-user connection management logic
The vhost-user connection management logic will be used by
the upcoming features: vhost-user-net, vhost-user-blk and
vhost-user-fs.

Fixes: #8448

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>
Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>
2023-11-21 09:51:48 +08:00
Fabiano Fidêncio
9445a967b6 Merge pull request #8471 from ChengyuZhu6/kata-virtual-volume
runtime: Introduce `KataVirtualVolume` structure into go runtime
2023-11-20 21:58:27 +01:00
Fabiano Fidêncio
8002de895a Merge pull request #8439 from fidencio/topic/kata-manager-install-a-given-kata-tarball
utils: kata-manager: Allow installing kata from a given tarball
2023-11-20 20:02:25 +01:00
Wainer Moschetta
728565d1e4 Merge pull request #7046 from stevenhorsman/remote-hypervisor-cherry-picks
CC: Remote hypervisor merge to main
2023-11-20 15:22:37 -03:00
Chao Wu
5ee8829700 Merge pull request #8451 from openanolis/chao/pci
2023-11-21 00:29:22 +08:00
Fabiano Fidêncio
41f3f6f93e Merge pull request #8465 from justxuewei/rename-virtio
dragonball: Uniform the spelling of Virtio
2023-11-20 16:31:33 +01:00
Hyounggyu Choi
506b127df8 Merge pull request #8478 from BbolroC/set-default-allowed_hypervisor_annotations
kata-deploy: Set a default value for ALLOWED_HYPERVISOR_ANNOTATIONS
2023-11-20 15:39:56 +01:00
alex.lyn
fe62e656a7 runtime-rs: Name the ShareFs Mount Option type more accurately
Fixes: #7915

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-20 20:05:50 +08:00
alex.lyn
856315ff87 runtime-rs: bringing virtio-fs device in device-manager
It mainly focuses on two parts:
(1) redesign the ShareFsConfig with ShareFsMountConfig

The device mount operation depends on the sharefs device existing, so
the structure of ShareFsConfig is redesigned and ShareFsMountConfig is
moved into it as an Option type, which describes the relation between
ShareFsConfig and ShareFsMountConfig.

(2) move virtiofs into device manager
Currently, virtio-fs is still outside of the device manager.
To enhance the device manager, bring the virtio-fs device
into it for unified management.

Fixes: #7915

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-20 20:04:47 +08:00
Chao Wu
b3318e59eb Merge pull request #8332 from Apokleos/bugfix-directvol-multicontainers
runitme-rs/bugfix: kata pod with multi-containers sharing one direct volume
2023-11-20 19:37:58 +08:00
Hyounggyu Choi
c489f1f504 kata-deploy: Set a default value for ALLOWED_HYPERVISOR_ANNOTATIONS
As a follow-up PR for #8404, this is to set a default value for an environment variable `ALLOWED_HYPERVISOR_ANNOTATIONS`.
This will prevent a pod launching without an explicit configuration for the variable from getting into a `CrashLoop` state.

Fixes: #8477

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-20 12:33:34 +01:00
Chao Wu
ee55897827 fmt: refactor in pci & balloon
1. merge hashmap get logic according to Xuewei's suggestion.

2. do cargo fmt

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-11-20 17:53:51 +08:00
Chao Wu
baf3db9e6e Dragonball: add PCI bus and PCI interrupt support in mptable Spec
In order to support PCI VFIO functionality in Dragonball, we should
first add PCI bus and PCI device interrupt information to the Dragonball
mptable setup process.

This patch adds:

1. pci_legacy_irqs transferred to the setup_mptable function.
2. pci bus support in mptable mem
3. pci interrupt support in mptable mem

fixes: #8449

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-11-20 17:53:51 +08:00
Xuewei Niu
c305634b4e dragonball: Uniform the spelling of Virtio
The changes are:

- VirtIoError -> VirtioError
- VirtIoResult -> VirtioResult
- VirtIoDevice -> VirtioDevice

Fixes: #8464

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-20 17:00:58 +08:00
Fabiano Fidêncio
44899d4cdf tests: k8s: Allow passing rust-runtime env var to kata-deploy
This will be used for selecting the correct runtimes and runtimeclasses
to be deployed with kata-deploy.

Fixes: #8475

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-20 09:13:05 +01:00
ChengyuZhu6
1353b14e6c runtime: Add KataVirtualVolume struct in runtime
Add the corresponding data structure in the runtime part according to
kata-containers/kata-containers/pull/7698.

Fixes: #8472

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-19 13:30:32 +08:00
Greg Kurz
110574353d Merge pull request #8345 from beraldoleal/issues/8343
Fixes make check errors
2023-11-17 17:38:29 +01:00
Gabriela Cervantes
37916e7a58 metrics: Fix result finding
This PR fixes the result finding for the general throughput for
the tensorflow benchmark.

Fixes #8466

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-17 15:59:51 +00:00
stevenhorsman
ebf9d2725a kata-deploy: Add remote shim
- Add remote to the list of shims in kata-deploy and kata-cleanup

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-11-17 13:38:49 +00:00
Fabiano Fidêncio
d5cf169adf kata-deploy: Add missing kata-remote runtimeclass
It's CCv0 specific for now, and it's needed as the Operator is now
delegating the runtimeclass creation to the kata-deploy daemonset.

Fixes: #7550

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2df6cb7609)
2023-11-17 13:34:40 +00:00
Pradipta Banerjee
39e8c84269 runtime: Add support for key annotations to remote hyp
In order to support different pod VM instance type via
remote hypervisor implementation (cloud-api-adaptor),
we need to pass machine_type, default_vcpus
and default_memory annotations to cloud-api-adaptor.

The cloud-api-adaptor then uses these annotations to spin
up the appropriate cloud instance.

Reference PR for cloud-api-adaptor
https://github.com/confidential-containers/cloud-api-adaptor/pull/1088

Fixes: #7140
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
(based on commit 004f07f076)
2023-11-17 13:33:27 +00:00
Yohei Ueda
2910e333a8 runtime: Use static resource in remote hypervisor
This patch updates the template configuration file for
the remote hypervisor to set static_sandbox_resource_mgmt
to be true.  The remote hypervisor uses the peer pod config
to determine the sandbox size, so requires this to be set to
true by default.

Fixes: #6616
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit 938447803b)
2023-11-17 13:33:27 +00:00
stevenhorsman
26d56678a9 config: Add initial remote hypervisor config
- Remote hypervisor template config
- Add annotation enablement for machine_type, default_memory and
default_vcpus for flexible instance types

Fixes: #6349
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(based on commits 7c9a791d67
and 335a456425)
2023-11-17 13:33:24 +00:00
stevenhorsman
ad63439a3e runtime: Update the remote hypervisor config
Add the SELinux setting to ensure it is passed through to the remote
hypervisor

Fixes: #5936

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 3ef2fd1784)
2023-11-17 13:32:52 +00:00
Lei Li
50e0d43dad runtime: Support privileged containers in peer pod VM
This patch fixes the issue of running containers
with privileged as true.

See the discussion at this URL for the details.
https://github.com/confidential-containers/cloud-api-adaptor/issues/111

Signed-off-by: Lei Li <cdlleili@cn.ibm.com>
(based on commit c3e6b66051)
2023-11-17 13:32:52 +00:00
Yohei Ueda
57d4dd8e57 runtime: Support the remote hypervisor type
This patch adds support for the remote hypervisor type.
The shim opens a Unix domain socket specified in the config file
and sends ttRPC requests to an external process to control
sandbox VMs.

Fixes #4482

Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit f9278f22c3)
2023-11-17 13:32:49 +00:00
Yohei Ueda
8ac9a22097 runtime: Add hypervisor proto to support peer pod VMs
This patch adds a protobuf definition of the remote hypervisor type.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 150e8aba6d)
2023-11-17 13:31:09 +00:00
Fabiano Fidêncio
f8322ffad2 Merge pull request #7796 from WenyuanLau/7794/StratoVirt_VMM_support
StratoVirt: add support for a lightweight VMM StratoVirt in Kata
2023-11-17 10:53:17 +01:00
Fabiano Fidêncio
d6d9b45007 Merge pull request #7931 from BbolroC/migrate-to-gha-s390x
tests|gha: add containerd and k8s tests for s390x
2023-11-17 10:24:14 +01:00
Sumedh Alok Sharma
4aaf54bdad runtime: Fix configmap/secrets update propagation with FS sharing disabled
This PR fixes the propagation of updates to Kubernetes configmaps, secrets, etc. when filesystem sharing is disabled.
The commit introduces below changes with some limitations:
- creates new timestamped directory in guest
- updates the '..data' symlink
- creates user visible symlinks to newly created secrets.
- Limitation: The older timestamped directory and stale user visible symlinks exist in guest
  due to missing DELETE api in agent.
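
The steps listed above follow the usual Kubernetes projected-volume layout.
The sketch below reproduces that layout on a local directory in Python; it
is not the runtime or agent code, and the function name, the `..data_tmp`
temporary link name and the `stamp` argument are assumptions.

```python
import os
import tempfile


def publish(volume_dir, files, stamp):
    """Write files into a new timestamped directory and repoint '..data'."""
    ts_name = f"..{stamp}"
    ts_dir = os.path.join(volume_dir, ts_name)
    os.mkdir(ts_dir)
    for name, content in files.items():
        with open(os.path.join(ts_dir, name), "w") as f:
            f.write(content)
    # Swap the '..data' symlink by renaming a temporary link over it.
    tmp_link = os.path.join(volume_dir, "..data_tmp")
    os.symlink(ts_name, tmp_link)
    os.replace(tmp_link, os.path.join(volume_dir, "..data"))
    # User-visible names are symlinks that go through '..data'.
    for name in files:
        link = os.path.join(volume_dir, name)
        if not os.path.islink(link):
            os.symlink(os.path.join("..data", name), link)


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    publish(d, {"key": "v1"}, "2023_11_17_01")
    publish(d, {"key": "v2"}, "2023_11_17_02")
    print(sorted(os.listdir(d)))  # the first timestamped directory is still present
```

Nothing in the sketch deletes the previous timestamped directory, which
mirrors the limitation noted in the last item of the list.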

Fixes: #7398

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2023-11-17 13:01:23 +05:30
Hyounggyu Choi
0c7aa1f307 gha: Set nightly test for s390x to 5 UTC
This is to push back the time for the s390x nightly test to 5 a.m. UTC.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-17 05:47:44 +01:00
Hyounggyu Choi
ffe1ea52cf tests|gha: add containerd and k8s tests for s390x
As part of the CI migration, this PR is to add workflows for containerd and k8s for s390x.

Fixes: #7930
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-16 18:14:26 +01:00
GabyCT
8586308dcd Merge pull request #8453 from GabyCT/topic/udpreadme
metrics: Add iperf udp information to README
2023-11-16 10:38:56 -06:00
GabyCT
494174a98e Merge pull request #8421 from GabyCT/topic/enablestressng
tests: Enable stressng scalability test
2023-11-16 10:25:05 -06:00
James O. D. Hunt
4a4fc9c648 CODEOWNERS: Expand scope
Improve the `CODEOWNERS` file by specifying more groups.

Since GitHub automatically checks the `CODEOWNERS` file when a PR is
created and adds all matching groups as reviewers for the PR, this may
help reduce the PR backlog since the right people will be alerted and
requested to review the PR. That should improve the quality of reviews
(and thus the quality of the landed code). It may also have a positive
effect on PR velocity.

> **Note:**
>
> This PR combines the other `CODEOWNERS` files so we have
> a single, visible, top-level file.

See: https://github.com/kata-containers/community/issues/253

Fixes: #3804.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-16 16:09:20 +00:00
Fabiano Fidêncio
10996f3bbb Merge pull request #8460 from ldoktor/artifacts
gha: Keep kata tarballs for 15 days
2023-11-16 13:56:25 +01:00
Liu Wenyuan
c77e990c3e tests: Enable tests for StratoVirt hypervisor
This commit enables the StratoVirt hypervisor to be tested in kata GHA,
including k8s, metrics, cri-containerd, nydus and so on.

Meanwhile, adding some unit tests for StratoVirt to make sure it works.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
14d8790d83 kata-deploy: Add StratoVirt support to deploy process
Allow kata-deploy process to pull StratoVirt from release binaries, and
add them as a part of kata release.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
9542211e71 configuration: add configuration for StratoVirt hypervisor.
Add configuration-stratovirt.toml.in to generate the StratoVirt configuration,
and parser to deliver config to StratoVirt.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
561c85be54 build: Makefile for StratoVirt hypervisor
Add support for building StratoVirt hypervisor, including x86_64 and
arm64.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
26966c8469 virtcontainers: Add StratoVirt as a supported hypervisor
Initial support of the MicroVM machine type of StratoVirt
hypervisor for the kata go runtime.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:24 +08:00
Fabiano Fidêncio
edb791315e Merge pull request #7987 from BbolroC/nightly-ci-s390x
tests|gha: add nightly tests for s390x
2023-11-16 11:45:32 +01:00
Lukáš Doktor
8959e3ca05 gha: Keep kata tarballs for 15 days
These tarballs are useful for debugging and re-running jobs, so keep them
for 15 days.

Fixes: #8000

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2023-11-16 10:35:20 +01:00
Gabriela Cervantes
9cc6908b09 stability: Update stressng to run on the gha
This PR updates the stressng test to run on the gha for kata CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 19:34:36 +00:00
Gabriela Cervantes
9d8eb298c3 metrics: Add iperf udp information to README
This PR adds the iperf udp information to the network README
for the kata metrics CI.

Fixes #8452

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 15:22:06 +00:00
Gabriela Cervantes
4b7854b668 stability: Add missing dependencies
This PR adds missing dependencies to run stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 14:51:14 +00:00
Gabriela Cervantes
79177bb9cb tests: Enable stressng scalability test
This PR enables the stressng scalability test for kata CI.

Fixes #8420

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 14:51:14 +00:00
Xuewei Niu
f18794d880 Merge pull request #8426 from justxuewei/vhost-rm-virtio-net
dragonball: Remove vhost-net dependency on virtio-net
2023-11-15 10:39:27 +08:00
alex.lyn
ba632ba825 runitme-rs: kata with multi-containers sharing one direct volume
When multiple containers in a kata pod share one direct volume,
it's important to make sure that the corresponding block device
is only mounted once in the guest. This means that there should
be only one mount entry for the device in the mount information.

Fixes: #8328

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-15 10:37:01 +08:00
alex.lyn
d7594d830c runtime-rs: correct the path from cid to device_id.
When a direct volume is used by multiple containers in Kata,
generating a separate shared path per cid causes I/O errors
because the same direct volume gets mounted more than once.
To correct this, use the device_id instead of the cid, which
ensures that the guest mounts the filesystem only once.
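
The effect of the key choice can be shown with a small Python sketch. The
data layout and the function name are hypothetical and only model the
counting argument, not the runtime-rs implementation.

```python
def plan_mounts(containers, key):
    """Group containers by the given key; each group means one guest mount."""
    mounts = {}
    for container in containers:
        mounts.setdefault(container[key], []).append(container["cid"])
    return mounts


# Two containers in one pod sharing the same direct volume (made-up IDs).
containers = [
    {"cid": "c1", "device_id": "dev0"},
    {"cid": "c2", "device_id": "dev0"},
]
```

Keying by `cid` yields two entries, so the same device would be mounted
twice; keying by `device_id` yields a single entry shared by both
containers.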

Fixes: #8328

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-15 10:30:39 +08:00
Fabiano Fidêncio
906f6b7380 Merge pull request #8431 from UiPath/fix-vsock-packets-drop
kernel: Fix vsock packets drop when the driver initializes
2023-11-14 18:52:53 +01:00
Fabiano Fidêncio
1699b84f13 utils: kata-manager: Remove $enable_debug from the install_kata call
This was added as part of d4d65bed38, but
install_kata has never actually used the passed enable_debug var.

With this in mind, let's just remove it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-14 17:34:03 +01:00
Fabiano Fidêncio
38d2edd83b utils: kata-manager: Allow installing kata from a given tarball
With this change, we give users the chance to try kata-containers
with their own pre-built tarball.

This will become very useful in the CI context, as we won't be
downloading a specific version of kata-containers, but rather installing
whatever was built in previous steps of the CI pipeline.

Fixes: #8438

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-14 17:34:01 +01:00
Fabiano Fidêncio
fd9b6d6837 Merge pull request #7623 from fidencio/topic/runtime-improve-vcpu-allocation-on-host-side
runtime: Improve vCPU allocation for the VMMs
2023-11-14 14:10:54 +01:00
Alexandru Matei
bfd1ce30e1 kernel: Fix vsock packets drop when the vsock driver starts
The virtio vsock driver has a small window during initialization
where it can silently drop replies to connection requests.
Because no reply is sent, kata waits for 10 seconds and in the
end it generates a connection timeout error in HybridVSockDialer.

Fixes: #8291

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-11-14 11:02:52 +02:00
Xuewei Niu
49c2e6e23c dragonball: Remove vhost-net dependency on virtio-net
This patch removes the vhost-net dependency on virtio-net in the
dbs-virtio-devices crate. The vhost-net feature can then be enabled
without also enabling virtio-net's device, error, etc.

Fixes: #8423

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-14 15:35:10 +08:00
Fabiano Fidêncio
dffc6f611c Merge pull request #8432 from justxuewei/rm-ci-docker-and-nerdctl
gha: Remove docker and nerdctl tests from ci.yaml
2023-11-14 08:34:18 +01:00
alex.lyn
4d65c2e8a2 runtime-rs: introduce update_device in trait Hypervisor
Introduce `update_device` in the Hypervisor trait to enable
device updates for VMMs. It will initially be used
for virtiofs mount operations.

Fixes: #7915

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-14 11:56:36 +08:00
Xuewei Niu
481486c6d5 gha: Remove docker and nerdctl tests from CI
Two workflows, run-nerdctl-tests-on-garm.yaml and
run-docker-tests-on-garm.yaml, were removed in commit b481d39. However,
they are still referenced by the CI workflow, which stops the CI from
working properly. This patch removes those references from ci.yaml.

Fixes: #8433

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-14 10:44:14 +08:00
Fabiano Fidêncio
c858ea1460 Merge pull request #8174 from fidencio/topic/re-revert-8115
ci: Re-add tracing tests and move docker/nerdctl to the basic-ci-amd64.yaml file
2023-11-13 18:19:40 +01:00
James O. D. Hunt
a781ce33b0 Merge pull request #8383 from jodh-intel/kata-manager-add-list-option
utils: kata-manager: Add option to list versions
2023-11-13 16:18:36 +00:00
David Esparza
98ec34b04c Merge pull request #8338 from dborquez/improve_metrics_init_environment
metrics: Fix function that completely stops kata containers before running a test
2023-11-13 09:35:27 -06:00
Fabiano Fidêncio
b481d396fc gha: Move docker / nerdctl content to the basic-ci-amd64 file
There's no need to keep those as separate files, and having them in
the basic-ci-amd64.yaml file actually helps us avoid the
undocumented GHA limitation on the number of files imported.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-13 15:34:00 +01:00
Fabiano Fidêncio
3c735c236d ci: tracing: Adapt to basic-ci-amd64.yaml
Peng Tao made this move as part of 1280f85343, and here we're
simply adjusting to the move.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-13 15:27:39 +01:00
Fabiano Fidêncio
ee17fe9d20 Revert "gha: ci: Revert tracing test PR to unbreak CI"
This reverts commit e9bd852113.
2023-11-13 15:27:39 +01:00
James O. D. Hunt
4d5b23b73a Merge pull request #8419 from jodh-intel/2023-11-10-fix-tdx
runtime-rs: ch: Fix TDX
2023-11-13 11:58:16 +00:00
James O. D. Hunt
7f666f783d runtime-rs: ch: Fix TDX
PR #8311 inadvertently broke the runtime-rs / Cloud Hypervisor TDX
handling. It also introduced unrecoverable failure scenarios. Hence,
replace slow, fallible regex matching in logging fast path with single pass
non-failing multi-string log level matching.

Also, added a unit test for `parse_ch_log_level()`.

Fixes: #8418.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-13 08:49:47 +00:00
Xuewei Niu
0a9125e629 Merge pull request #7675 from justxuewei/vhost-net
2023-11-12 20:38:18 +08:00
Xuewei Niu
d1deaf0538 dragonball: Minor changes for a comment from Bian
- Add feature control for InsertNetworkDevice.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:14:10 +08:00
Xuewei Niu
e4f83e27c4 dragonball: vhost-net set_offload with acked features
set_offload() for tap devices depends on acked features.

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:39 +08:00
Xuewei Niu
6cd572dbbb dragonball: Minor changes for Chao's comments
- Remove two panic statements from InsertNetworkDevice test.
- Rename `NUM_QUEUES` to `DEFAULT_NUM_QUEUES`, `QUEUE_SIZE` to
  `DEFAULT_QUEUE_SIZE` for vhost-net and virtio-net.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:39 +08:00
Xuewei Niu
dcdf3c6556 runtime-rs: Supply missing fields of NetworkConfig
`test_networkconfig_to_netconfig` from clh depends on `NetworkConfig`, which
gains some new fields in this PR. Therefore, this commit supplies the
missing fields to the test.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:39 +08:00
Xuewei Niu
58e9709c1f dragonball: Changes for ZizhengBian's comments
- Dragonball's vhost-net feature no longer depends on the virtio-net feature.
- Remove `TapError` from dbs-virtio-devices's Error, and add two fields,
  `VirtioNet` and `VhostNet`.
- Downgrade visibility of two fields of `VhostNetDeviceMgr` from
  `pub(crate)`.
- File an issue to record a todo for network rate limiter.
- Print internal errors with `{0:?}`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:33 +08:00
Fabiano Fidêncio
849253e55c tests: Add a simple test to check the VMM vcpu allocation
As we've done some changes in the VMM vcpu allocation, let's introduce
basic tests to make sure that we're getting the expected behaviour.

The test consists in checking 3 scenarios:
* default_vcpus = 0 | no limits set
  * this should allocate 1 vcpu
* default_vcpus = 0.75 | limits set to 0.25
  * this should allocate 1 vcpu
* default_vcpus = 0.75 | limits set to 1.2
  * this should allocate 2 vcpus

The tests are very basic, but they do ensure we're rounding things up to
what the new logic is supposed to do.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 18:26:01 +01:00
Fabiano Fidêncio
5e9cf75937 vc: utils: Rename CalculateMilliCPUs() to CalculateCPUsF()
With the change done in the last commit, instead of calculating milli
cpus, we're actually converting the CPUs to a fraction number, a float.

Let's update the function name (and associated vars) to represent that
change.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 18:26:01 +01:00
Fabiano Fidêncio
e477ed0e86 runtime: Improve vCPU allocation for the VMMs
First of all, this is a controversial piece, and I know that.

In this commit we're trying to take a less greedy approach regarding the
number of vCPUs we allocate for the VMM, which will be advantageous
mainly when using the `static_sandbox_resource_mgmt` feature, which is
used by the confidential guests.

The current approach we have basically does:
* Gets the amount of vCPUs set in the config (an integer)
* Gets the amount of vCPUs set as limit (an integer)
* Sum those up
* Starts / Updates the VMM to use that total amount of vCPUs

The fact we're dealing with integers is logical, as we cannot request
500m vCPUs from the VMMs.  However, it leads us, in several cases, to
waste one vCPU.

Let's take the example that we know the VMM requires 500m vCPUs to be
running, and the workload sets 250m vCPUs as a resource limit.

In that case, we'd do:
* Gets the amount of vCPUs set in the config: 1
* Gets the amount of vCPUs set as limit: ceil(0.25)
* 1 + ceil(0.25) = 1 + 1 = 2 vCPUs
* Starts / Updates the VMM to use 2 vCPUs

With the logic changed here, what we're doing is considering everything
as float till just before we start / update the VMM. So, the flow
described above would be:
* Gets the amount of vCPUs set in the config: 0.5
* Gets the amount of vCPUs set as limit: 0.25
* ceil(0.5 + 0.25) = 1 vCPUs
* Starts / Updates the VMM to use 1 vCPUs
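
The difference between the two flows can be sketched in a few lines of
Python. This is only an illustration of the arithmetic described above,
not the runtime's actual code; the function names `vcpus_old` and
`vcpus_new` are hypothetical.

```python
import math


def vcpus_old(config_vcpus: int, limit_vcpus: float) -> int:
    """Old flow: round the limit up on its own, then add the integer config value."""
    return config_vcpus + math.ceil(limit_vcpus)


def vcpus_new(config_vcpus: float, limit_vcpus: float) -> int:
    """New flow: keep both values as floats and round up only once, at the end."""
    return math.ceil(config_vcpus + limit_vcpus)


print(vcpus_old(1, 0.25))    # 1 + ceil(0.25) = 2
print(vcpus_new(0.5, 0.25))  # ceil(0.5 + 0.25) = 1
```

Rounding once at the end is what saves the extra vCPU in the 500m + 250m
example.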

In the way I've written this patch we introduce zero regressions, as
the default values set are still the same, and those will only be
changed for the TEE use cases (although I can see firecracker, or any
other user of `static_sandbox_resource_mgmt=true` taking advantage of
this).

There's, though, an implicit assumption in this patch that we'd need to
make explicit, and that's that the default_vcpus / default_memory is the
amount of vcpus / memory required by the VMM, and absolutely nothing
else.  Also, the amount set there should be reflected in the
podOverhead for the specific runtime class.

One other possible approach, which I am not that much in favour of
taking as I think it's **less clear**, is that we could actually get the
podOverhead amount, subtract it from the default_vcpus (treating the
result as a float), then sum up what the user set as limit (as a float),
and finally ceil the result.  It could work, but IMHO this is **less
clear**, and **less explicit** on what we're actually doing, and how the
default_vcpus / default_memory should be used.

Fixes: #6909

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2023-11-10 18:25:57 +01:00
Fabiano Fidêncio
8d958b8c47 Merge pull request #8406 from microsoft/danmihai1/policy-doc
docs: add agent policy documentation
2023-11-10 17:19:04 +01:00
James O. D. Hunt
f588d31324 Merge pull request #8374 from jodh-intel/kata-manager-check-dl-url-count
utils: kata-manager: Ensure only one download URL
2023-11-10 13:19:07 +00:00
Fabiano Fidêncio
b0157ad73a runtime: confidential: Do not set the max_vcpu to cpu
We don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 12:58:20 +01:00
Steve Horsman
b23952c852 Merge pull request #8309 from gkurz/update-release-process-doc
Update release process documentation
2023-11-10 09:44:18 +00:00
James O. D. Hunt
0ead018d0a utils: kata-manager: Add Docker details to list output
Add Docker version details to the output of the list versions
CLI option.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 09:19:56 +00:00
James O. D. Hunt
be3044fd01 utils: kata-manager: Add option to list versions
Add a command-line option to list the installed and available versions
of Kata and containerd.

Fixes: #8355.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 09:19:56 +00:00
James O. D. Hunt
9969f5a94a utils: kata-manager: Make test container name more unique
Rather than creating a container called `test-kata`, prefix with the
script name to make it a bit "more unique" and less likely for users to
have an existing container with the test container name. The new test
container name is `kata-manager-sh-test-kata`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 09:19:56 +00:00
James O. D. Hunt
436d7d1275 utils: kata-manager: Improve usage message
Update the usage to show that the latest Kata version can also be queried using
`kata-ctl`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 08:29:14 +00:00
James O. D. Hunt
1625a5ce48 utils: kata-manager: Improve version check
Update `github_get_latest_release()` to use `sort -V` rather than
sub-sorting on the major, minor and patch level version number elements.

The new approach is safer and more accurate.
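
As a rough Python analogue of what a version sort does for plain
`x.y.z` tags, the sketch below compares the numeric components as
integers rather than character by character. The helper name
`version_key` and the sample tags are made up for illustration, and
`sort -V` itself handles more cases (such as suffixes) than this does.

```python
def version_key(tag: str) -> tuple[int, ...]:
    """Turn a tag such as "3.10.0" into (3, 10, 0) so it compares numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


tags = ["3.2.0", "3.10.0", "3.9.1"]
print(max(tags))                   # character-wise comparison picks 3.9.1
print(max(tags, key=version_key))  # numeric comparison picks 3.10.0
```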

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 08:29:14 +00:00
James O. D. Hunt
c72a27e219 utils: kata-manager: Ensure only one download URL
Add an extra sanity check to ensure that only a single download URL is
found for the specified release version.
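
A check of this kind could look like the Python sketch below. It is
not the script's shell code: the function name `pick_single_url`, the
substring match on the version, and the example URLs are all
assumptions made for illustration.

```python
def pick_single_url(urls: list[str], version: str) -> str:
    """Return the only URL mentioning `version`; fail loudly on zero or several."""
    matches = [url for url in urls if version in url]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly 1 download URL for {version}, found {len(matches)}"
        )
    return matches[0]


# Hypothetical URLs, used only to exercise the check.
urls = [
    "https://example.invalid/kata-static-3.2.0-amd64.tar.xz",
    "https://example.invalid/kata-static-3.1.0-amd64.tar.xz",
]
print(pick_single_url(urls, "3.2.0"))
```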

Fixes: #8364.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 08:27:23 +00:00
James O. D. Hunt
839f6c3d44 utils: kata-manager: Improve info messages
Improve some of the information messages a little by adding
more detail and quoting file names.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-10 08:27:20 +00:00
Archana Shinde
21e45bebc8 Merge pull request #8376 from fidencio/topic/kata-manager-add-support-for-docker-installation
kata-manager: Add support for Docker CLI installation
2023-11-09 22:11:50 -08:00
Chao Wu
a62fb83c91 Merge pull request #8169 from openanolis/chao/fix_typo_shm
runtime-rs: fix a typo in shm
2023-11-10 14:00:11 +08:00
Chao Wu
820b578aa3 Merge pull request #8370 from gaohuatao-1/bugfix
agent: update AGENT_THREADS metrics value
2023-11-10 13:16:29 +08:00
gaohuatao
78df1bb851 agent: update AGENT_THREADS metrics value
Fixes: #8369

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2023-11-10 10:39:57 +08:00
Chao Wu
afb002c25c runtime-rs: fix a typo in shm
is_shim_volume should be is_shm_volume in shm_volume mod.

fixes: #8168
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-11-10 10:36:58 +08:00
Fabiano Fidêncio
2b937400fe Merge pull request #8404 from fidencio/topic/kata-deploy-allow-users-to-enable-hypervisor-annotations
kata-deploy: Allow users to set hypervisor annotations
2023-11-09 17:44:52 +01:00
Dan Mihai
bc49c553ef docs: add agent policy documentation
Add initial agent policy documentation.

Fixes: #7671

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-11-09 16:43:00 +00:00
Fabiano Fidêncio
5d10aed9ba kata-manager: Make containerd_config a global var
As "/etc/containerd/config.toml" is used from more than one place, let's
just make it a global var.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 13:47:52 +01:00
Fabiano Fidêncio
66d1b2c173 kata-manager: Add support for docker installation
Add support for also installing the Docker CLI, giving users the chance
to try Kata Containers with docker in the same way we provide users the
chance to try Kata Containers with `ctr`.

Fixes: #8357

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 13:47:52 +01:00
Fabiano Fidêncio
1a81989d20 tests: k8s: Use the "ALLOWED_HYPERVISOR_ANNOTATIONS"
The current kata-deploy code has been doing a `sed` to add allowed
hypervisor annotations, so CBL mariner can be tested with their own
kernel and initrd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 13:42:31 +01:00
Fabiano Fidêncio
023c4a17cf kata-deploy: Allow users to set hypervisor annotations
Currently the only way one can specify allowed hypervisor annotations is
during build time, which is a big issue for users grabbing kata-deploy
as we provide it.

Fixes: #8403

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 13:42:31 +01:00
Fabiano Fidêncio
0352f1e029 kata-manager: Allow passing a specific tool to test_installation
Right now we're only testing with `ctr` and there's no change in
behaviour with this commit.  However, being able to pass the tool used
to run the tests makes it easier to expand kata-manager to support, for
instance, docker and nerdctl.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 11:24:37 +01:00
Fabiano Fidêncio
50df1129ea Merge pull request #8411 from fidencio/topic/fix-k3s-deployment
gha: Fix regex used to get kubectl version from the k3s version
2023-11-09 10:44:34 +01:00
Fabiano Fidêncio
455b7bf776 gha: k3s: Avoid unnecessary escape
There's no reason to escape the first + on the +k3s[0-9]\+ regex, as
shown here:
```sh
ubuntu@k3s:~$ /usr/local/bin/k3s kubectl version --short 2>/dev/null | \
	grep "Client Version" | \
	sed \
		-e 's/Client Version: //' \
		-e 's/+k3s[0-9]\+//'
v1.27.7
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 08:42:25 +01:00
Fabiano Fidêncio
e7890ee8f6 gha: Fix regex used to get kubectl version from the k3s version
It seems that with the new k3s release, they've bumped their kubectl
version from x.y.z+k3s1 to x.y.z+k3s2.

Let's ensure our regexp is more generic and future proof for such
changes.
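
The same idea, stripping a `+k3sN` suffix for any number N, can be
sketched in Python. This is an illustration only, not the workflow's
`sed` command; the function name `kubectl_version` is hypothetical.
Note that Python's `re` syntax requires the literal `+` to be escaped,
unlike the basic regular expression used with `sed`.

```python
import re


def kubectl_version(k3s_client_version: str) -> str:
    """Strip a "+k3sN" build suffix, whatever the value of N."""
    return re.sub(r"\+k3s[0-9]+", "", k3s_client_version)


print(kubectl_version("v1.27.7+k3s1"))  # v1.27.7
print(kubectl_version("v1.27.7+k3s2"))  # v1.27.7
```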

Fixes: #8410

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 07:08:02 +01:00
Archana Shinde
1611723465 Merge pull request #8379 from likebreath/1103/clh_v36.0
Upgrade to Cloud Hypervisor v36.0
2023-11-08 21:10:41 -08:00
Archana Shinde
268d4d622f Merge pull request #8389 from justxuewei/vm-capable-test
runtime: Fix TestCheckHostIsVMContainerCapable instability issue
2023-11-08 12:14:04 -08:00
Archana Shinde
92a517156c Merge pull request #8367 from amshinde/add-nerdctl-ipvlan-test
network: Fix network hotplug for ipvlan and macvlan endpoints for qemu and add tests
2023-11-08 11:45:13 -08:00
Chelsea Mafrica
83e731328f Merge pull request #8023 from cmaf/runtime-rs-ch-pause-resume
runtime-rs: Update status for pause and resume
2023-11-08 11:34:47 -08:00
Hyounggyu Choi
84b5618733 tests|gha: add internal nightly tests for s390x
This is to add a workflow for internal nightly tests for s390x in Jenkins.

Fixes: #7986
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-08 16:07:41 +01:00
Xuewei Niu
acd9057c7b runtime: Fix TestCheckHostIsVMContainerCapable instability issue
TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate the
case where the kernel modules are not loaded. However,
checkKernelModules() executes modprobe <module> if a module is not
found in that directory. Loading those modules must therefore be denied
temporarily.

Fixes: #8390

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 22:40:08 +08:00
Fupan Li
100a73d2fd Merge pull request #7531 from justxuewei/device-cgroup
agent: Restrict device access at upper node of container's cgroup
2023-11-08 22:01:48 +08:00
Chao Wu
4435c1efd7 Merge pull request #8386 from jodh-intel/runtime-rs-ch-tidy-up
runtime-rs: ch: Simplify VSOCK error handling
2023-11-08 17:31:40 +08:00
Xuewei Niu
023d8dc01e agent: Changes according to Pan's comments
- Disable device cgroup restriction while pod cgroup is not available.
- Remove blacklist-related names and change whitelist-related names to
  allowed_all.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:08 +08:00
Xuewei Niu
136fb76222 tests: Add an integration test for device cgroup
`TestDeviceCgroup` is added to cri-containerd's integration tests. The test
launches two containers. Each container has a block device. It checks the
validity of device cgroup.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
b5f3a8cb39 agent: Fix container launching failure with systemd cgroup
The FSManager of the systemd cgroup manager is responsible for setting up
the cgroup path. Container launch fails if the FSManager is in
read-only mode.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
6477825195 agent: Minor changes according to Zhou's comments
The changes include:

- Change to debug logging level for resources after processed.
- Remove a todo for pod cgroup cleanup.
- Add an anyhow context to `get_paths_and_mounts()`.
- Remove code which denies access to VMROOTFS since it won't take effect. If
  blacklist mode is in use, VMROOTFS is denied by default. Otherwise,
  device cgroups won't be updated in whitelist mode.
- Add a unit test for `default_allowed_devices()`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
cec8044744 agent: Make devcg_info optional for LinuxContainer::new()
runk is a standard OCI runtime that isn't aware of the concept of a sandbox.
Therefore, the `devcg_info` argument of `LinuxContainer::new()` does not
need to be provided.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
ef4c3844a3 agent: Restrict device access at upper node of container's cgroup
The goal is to guarantee that containers cannot escape to access extra
devices, such as the VM rootfs.

Assume that there is a cgroup, such as `/A/B`. `B` is the container cgroup,
and `A` is what we call the pod cgroup. No matter what permissions are
set for the container (`B`), `A`'s permission is always `a *:* rwm`. As a
result, containers could acquire permission to access other devices
in the VM that do not belong to them.

In order to set the devices cgroup properly, the pod cgroup must be set up
first and the container cgroup after it.

The `Sandbox` has a new field, `devcg_info`, to save cgroup states. To
avoid setting the container cgroup too early, initialization must be done
carefully. `inited`, one of the states, is a boolean indicating whether the
pod cgroup is initialized. If not, the pod cgroup is created first and
given default permissions. After that, the pause container cgroup is created
and inherits the permissions from the pod cgroup.

If whitelist mode, which allows containers to access all devices in the VM,
is enabled, then device resources from the OCI spec are ignored.

This feature does not support systemd cgroup or cgroup v2, since:

- The systemd cgroup implementation in the Agent does not support the
  devices subsystem so far, see:
  https://github.com/kata-containers/kata-containers/issues/7506.
- Cgroup v2's device controller depends on eBPF programs, which is out of
  scope of cgroup.

Fixes: #7507

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Archana Shinde
c075fa6817 tests: Add test with nerdctl to verify macvlan support
Add test to verify kata supports macvlan networks.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
Archana Shinde
07db673eb9 tests: Add test with nerdctl to verify ipvlan support
Add test to verify kata supports ipvlan networks.
This test can be a bit tricky as it requires knowledge of the host interface
to be used as the master for the ipvlan network.
However, with GitHub Actions, we can assume that an interface called eth0 is
present on the host and functioning.

Fixes: #8366

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
Archana Shinde
a6272733e7 network: Fix network hotplug for ipvlan and macvlan endpoints.
Since moving from network coldplug to hotplug, the only case verified
was veth endpoints. Support for network hotplug for ipvlan and macvlan was
broken/not added. Fix it.

Fixes: #8391

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
James O. D. Hunt
59d0d4caff runtime-rs: ch: Simplify VSOCK error handling
Remove the redundant `VmConfigError::EmptyVsockSocketPath` error from
the Cloud Hypervisor config crate since this scenario is already handled
by the `VsockConfigError::NoVsockSocketPath` error.

Fixes: #8385.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
James O. D. Hunt
bdb83f8282 runtime-rs: ch: Remove unused function
Remove the redundant `parse_mac()` function: this was never used and we
already have an implementation in `crates/resource/src/network/utils/mod.rs`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
Wainer Moschetta
949ac4d810 Merge pull request #8217 from beraldoleal/issues/8216
tests: fixes permission denied when running test
2023-11-07 12:25:23 -03:00
Wainer Moschetta
7f5d70f48b Merge pull request #8061 from beraldoleal/gogo-removal-v3
Updating containerd to a GogoProtobuf free version
2023-11-07 12:18:50 -03:00
Xuewei Niu
8ea87405ed runtime-rs: Remove virtio config from Backend
Virtio-net and vhost-net share a common virtio config, and vhost-user-net
uses another config, named `VhostUserConfig`. Thus, the virtio config could
be added into `NetworkConfig` instead of `Backend`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-07 19:35:02 +08:00
Xuewei Niu
ad66378bf5 runtime-rs: Move Dragonball stuff out of device drivers
Move Dragonball struct conversions out of device drivers to keep the drivers
neutral. The conversions include `NetworkBackend` to
`DragonballNetworkBackend` and `NetworkConfig` to
`DragonballNetworkConfig`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-07 19:35:02 +08:00
Xuewei Niu
3e0614cdf0 dragonball: Minor changes to comments
Changes include:

- Merge `VhostNetDeviceError` import item.
- Replace if with match in `add_vhost_net_device()`

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-07 19:35:02 +08:00
Xuewei Niu
a047331a34 runtime-rs: Network config distinguishes backends
Network backends determine the virtio dataplane implementations. Common
protocols include virtio-net, vhost-net and vhost-user-net, etc. Network
config has a new field named `backend` to specify which protocol to use.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-07 19:35:02 +08:00
Xuewei Niu
9203371833 dragonball: Introduce vhost-net device
PLEASE NOTE THAT this pull request just implements vhost-net support for
Dragonball, and the adaptation for the Runtime-rs. This pull request
DOESN'T provide an option to configure which backend to use. To sum up,
virtio-net as the default backend is the only choice for the user so far.

This pull request introduces vhost-net device for the Dragonball. In
addition, this pull request includes changes of Runtime-rs to improve
network configuration abilities.

The Dragonball part implements a vhost-net device and a vhost-net device
manager, named `VhostNetDeviceMgr`, to manage vhost-net devices.
`NetworkInterfaceConfig` is introduced as a high-level abstraction for network
config. Then, the Dragonball is able to distinguish network backends, e.g.
virtio-net, vhost-net, vhost-user-net (WIP), etc.

The Runtime-rs part adds support for multiple network backends as well.
`NetworkConfig` has a couple of new fields, like `backend`,
`use_shared_irq`, etc. Dragonball's network config structs implement the
`From` trait, which allows them to be converted conveniently from the
Runtime-rs's network config.

Fixes: #7674

Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-07 19:35:02 +08:00
Greg Kurz
b27b4ce104 doc: No longer release the test repository
Now that most of the test repository got migrated to the main Kata repository,
it is no longer needed to tag the test repository when doing a release.

Update the documentation accordingly by dropping all references to the test
repository and only mention *the* Kata repository.

Fixes #8302

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-11-07 10:28:43 +01:00
Greg Kurz
af2d897fb1 doc: Release now uses the official GitHub CLI
The hub tool is deprecated. Releases are now based on the official gh
CLI. A notable improvement: when properly set up (see [1]), gh allows
one to directly use HTTPS with one's GitHub credentials, instead of having
to set up proper SSH access for pushes to the repo.

Adjust the documentation accordingly.

Fixes #8302

[1] https://docs.github.com/en/github-cli/github-cli/quickstart#prerequisites

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-11-07 10:22:54 +01:00
Greg Kurz
2af9419fa4 doc: No longer run kata-deploy test when releasing
This is already tested by CI for every PR. Drop this step from the release
process documentation.

Fixes #8302

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-11-07 10:19:32 +01:00
Beraldo Leal
dd530ba8ee tests: fixes AMD errors
TestCheckHostIsVMContainerCapable is failing on AMD machines.
kata-check_amd64_test.go:96 has no AMD modules, also getCPUType is
missing.

Fixes #8384.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
7641c19f74 runtime: bump containerd for gogo deprecation
This update includes necessary changes due to the version bump of
containerd and its dependencies. It's part of a broader initiative to
phase out gogo protobuf, which has been deprecated, and to align with
the current supported libraries.

Fixes #7420.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
16fa2c39e6 protocols: replace gogo/types.Empty and Any
by Google versions.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c61f4a8592 protocols: remove unused fieldpath option
The +fieldpath option, specific to gogoprotobuf, enabled dynamic field
access in protobuf messages, allowing nested fields to be accessed via
string paths.

This change is part of a larger effort to transition to the official Go
protobuf library for better maintainability and community support.
Upon review, no instances of dynamic field access were found in the
codebase, confirming that the feature is not in use.

By removing this unused feature, we simplify the build process and make
it easier to complete the transition away from gogoprotobuf.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c87bc60ea0 protocols: removing unused mappings
Those mappings are not used by our .proto files and there is no
difference between .pb.go files generated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c5d845b30a agent: updating Cargo.lock files
Probably previous changes missed updating Cargo.lock.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
5d88c78a6e protocols: generating agent.pb.go
a3b003c345 modified agent but agent.pb.go
was not updated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
David Esparza
28e7b3467b metrics: improving stop and remove running containers
This PR switches from the SIGTERM signal to the SIGKILL signal to
force stop each kata component before running any metric test.

Fixes: #8336

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-11-06 09:54:32 -06:00
Archana Shinde
3b2fb6a604 Merge pull request #8284 from amshinde/runtime-rs-update-device-pci-info
runtime-rs: update device pci info for vfio and virtio-blk devices
2023-11-06 01:09:20 -08:00
Archana Shinde
036b7787dd runtime-rs: Use PCI path from hypervisor for vfio devices
Remove earlier functionality that tries to assign PCI path to vfio
devices from the host assuming pci slots to start from 1.
Get this from the hypervisor instead.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
c3ce6a1d15 runtime-rs: Provide PCI path to the agent for virtio-block
If the PCI path for a block device is not empty, use that as the
identifier for the agent instead of the virt path, which is valid only
for mmio devices.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
a2bbbad711 runtime-rs: change hypervisor add_device trait to return device copy
Block(virtio-blk) and vfio devices are currently not handled correctly
by the agent as the agent is not provided with correct PCI paths for
these devices.

The PCI paths for these devices can be inferred from the PCI information
provided by the hypervisor when the device is added.
Hence changing the add_device trait function to return a device copy
with PCI info potentially provided by the hypervisor. This can then be
provided to the agent to correctly detect devices within the VM.

This commit includes the implementation of the PCI info update for
cloud-hypervisor for virtio-blk devices, with stubs provided for other
hypervisors.

Removing Vsock from the DeviceType enum as Vsock currently does not
implement the Device Trait, it has no attach and detach trait functions
among others. Part of the reason is because these functions require Vsock
to implement Clone trait as these functions need cloned copies to be
passed down the hypervisor.

The change introduced for returning a device copy from the add_device
hypervisor trait explicitly requires a device to implement
Copy trait. Hence removing Vsock from the DeviceType enum for now, as
its implementation is incomplete and not currently used.

Note, one of the blockers for adding the Clone trait to Vsock is that it
currently includes a file handle which cannot be cloned. For Clone and
Device Traits to be implemented for Vsock, it requires an implementation
change in the future for it to be cloneable.

Fixes: #8283

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Bo Chen
071667f1ca runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8378

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-03 10:47:06 -07:00
Bo Chen
d1163141b9 versions: Upgrade to Cloud Hypervisor v36.0
Details of this release can be found in our roadmap project as iteration
v36.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #8378

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-03 10:46:56 -07:00
Fabiano Fidêncio
0aac3c76ee Merge pull request #8365 from fidencio/topic/kata-manager-restrict-containerd-versions-to-be-used
kata-manager: Accept only "lts" or "active" as containerd versions
2023-11-03 11:54:05 +01:00
Fabiano Fidêncio
8b4fc847d7 kata-manager: Accept only "lts" or "active" as containerd versions
kata-manager is a very nice tool, but we shouldn't be trying to take
care of "everything" in "all possible scenarios", and we should focus on
installing Kata Containers dependencies that are supported.

With this in mind, let's limit a little bit the scope of which versions
of containerd can be installed, limiting it to "active" and "lts", which
will then install the latest version of those "flavours".  The default
value will always be "lts" as that's supposed to be the stable one.

NOTE: This is a breaking change, as it changes the behaviour of what the
script takes in its `-c` parameter.  I'm assuming here we're safe to do
so as the majority of the users should / would only be using the full
installation by default.

Fixes: #8356

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-03 10:30:37 +01:00
Fabiano Fidêncio
d395ae8198 Merge pull request #8368 from fidencio/topic/gha-stale-fixes
gha: stale: Fix typo and allow manually triggering it
2023-11-03 10:07:56 +01:00
Fabiano Fidêncio
994615ca28 gha: stale: Allow manually triggering it
This will help us to avoid waiting till the next time cron would trigger
the action to test

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-03 08:17:48 +01:00
Fabiano Fidêncio
6abcf03611 gha: stale: Fix typo action -> actions
This is causing the following error:
```
Unable to resolve action action/stale, repository not found
```

Fixes: #8347

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-03 08:15:18 +01:00
Steve Horsman
a7a14e33d8 Merge pull request #8285 from sazzy4o/patch-1
Docs: Fix Dragonball link
2023-11-02 17:54:47 +00:00
Fabiano Fidêncio
37233622da kata-manager: Ensure we run apt-get update before apt-get install
As that's an operation that can easily fail, and it's quite simple /
cheap for us to run it, let's just do it and avoid the failure.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-02 14:14:32 +01:00
Fabiano Fidêncio
d547798284 Merge pull request #7057 from brianwang12/kata-manager-fix
kata-manager: Fix deployment of containerd on architectures other than amd64.
2023-11-02 14:14:18 +01:00
Fabiano Fidêncio
8905286767 Merge pull request #8348 from fidencio/topic/gha-add-stale-action-for-PRs
gha: Add workflow to close stale PRs
2023-11-02 11:34:35 +01:00
Fabiano Fidêncio
abec287058 gha: Add workflow to close stale PRs
Our goal, as discussed in the Architecture Committee meeting held on
October 31st, 2023, is to take a more aggressive action on issues and
PRs that have been opened for a long time.

This commit is the very first step, and it's **only** targeting
**PRs**.  What this action will do is:
* Mark all the PRs that have no activity for more than 180 days,
  starting from May 1st, 2023, as stale.
  * A message will be added, letting the contributor know that they can
    simply comment on the PR in order to make it "not stale".
* If there's no activity on the PR for 7 days, the PR will be
  automatically closed.
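
The policy listed above can be sketched as a small decision function. This is a Python illustration only, not the workflow itself; the function name is hypothetical, and reading "starting from May 1st, 2023" as "PRs created before that date are skipped" is an assumption.

```python
from datetime import datetime, timedelta

START_DATE = datetime(2023, 5, 1)   # assumed: PRs created earlier are skipped
STALE_AFTER = timedelta(days=180)   # inactivity before a PR is marked stale
CLOSE_AFTER = timedelta(days=7)     # further inactivity before it is closed


def classify_pr(created, last_activity, marked_stale_at, now):
    """Return "skip", "active", "stale" or "close" for a single PR."""
    if created < START_DATE:
        return "skip"
    if marked_stale_at is not None:
        # Any activity after the stale mark (e.g. a comment) revives the PR.
        if last_activity > marked_stale_at:
            return "active"
        if now - marked_stale_at >= CLOSE_AFTER:
            return "close"
        return "stale"
    if now - last_activity > STALE_AFTER:
        return "stale"
    return "active"
```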

Fixes: #8347

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-02 09:19:44 +01:00
briwan.wang
437db15916 kata-manager: Fix Multi-Arch deployment for containerd
Fix: Kata-Manager fails to retrieve the correct Containerd string name
for architectures other than amd64.

Update the 'github_get_release_file_url()' function to make it compatible
with different architecture names, e.g. aarch64/arm64 or x86_64/amd64,
allowing it to acquire the correct URL addresses
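
A minimal sketch of this kind of architecture-name handling, written in Python for illustration. It is not the code of `github_get_release_file_url()`; the alias table and both function names are assumptions.

```python
# Hypothetical alias table: kernel-style names (as printed by `uname -m`)
# mapped to the Go-style names often used in release file names.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def normalize_arch(arch):
    """Return the Go-style name for arch, or arch unchanged if no alias is known."""
    return ARCH_ALIASES.get(arch, arch)


def arch_candidates(arch):
    """Return every spelling of arch worth trying when matching a file name."""
    canonical = normalize_arch(arch)
    spellings = [canonical]
    spellings += [alias for alias, name in ARCH_ALIASES.items() if name == canonical]
    return spellings
```

Trying every spelling, rather than only the one reported by the host, is what lets a lookup succeed whether a release file says `arm64` or `aarch64`.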

Fixes: #7071

Signed-off-by: briwan.wang <briwan.wang@arm.com>
2023-11-02 06:12:04 +00:00
Archana Shinde
004646162e Merge pull request #8308 from gkurz/fully-drop-hub
release: Fully migrate from hub to gh
2023-11-01 22:46:44 -07:00
Peng Tao
b3dbd4f1c7 Merge pull request #8351 from amshinde/update-agent-cargo-lock
cargo: Agent cargo.lock updated
2023-11-02 11:31:24 +08:00
Archana Shinde
58b4d1a264 cargo: Agent cargo.lock updated
The Cargo.lock for agent needs to be updated to include
"safe-path" dependency.

Fixes: #8350

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-01 11:54:33 -07:00
Fabiano Fidêncio
40cc397218 Merge pull request #8255 from cmaf/migrate-checks-fixes-links
docs: Fix broken links
2023-11-01 14:46:30 +01:00
Beraldo Leal
afec54799e libs: fixes dereferenced reference
make check is giving us the following error:

error: this expression creates a reference which is immediately
dereferenced by the compiler.

Fixes #8344

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-10-31 15:55:32 -04:00
Beraldo Leal
c57df607ad libs: fixes comparison to empty slice
Make check gives us an "error: comparison to empty slice".

Fixes #8343

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-10-31 15:51:03 -04:00
Greg Kurz
d20b7381f0 release: Drop obsolete comment in workflow file
This comment belongs to the hub tool that got sunset by 710eb8ab9d.
Just drop it.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 16:03:12 +01:00
Greg Kurz
6236fa4617 release: Drop build_hub helper
Not used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:28:57 +01:00
Greg Kurz
bc4c66caaf release: Migrate tag_repos.sh to GitHub CLI
The hub tool is deprecated. Convert this script to use the
official GitHub CLI gh instead of hub.

A typical gh setup is able to access repos using HTTPS along with
GitHub credentials. It is only needed to patch the remote url when
using SSH.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:11:28 +01:00
Greg Kurz
e331102ba3 release: Migrate update-repository-version.sh to GitHub CLI
The hub tool is deprecated. Convert this script to use the
official GitHub CLI gh instead of hub.

A couple of adjustments had to be made :
- the notes.md temporary file is moved to ${tmp_dir} in order to silence gh,
  otherwise it complains about an untracked file,
- title of a PR no longer goes to the notes.md file since gh requires the
  title to be passed with a dedicated --title option.
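
The two adjustments above can be sketched as follows. This Python illustration only builds the command line and does not run it; the function name is hypothetical, and passing the notes file with `--body-file` is an assumption about how the script hands the body to gh.

```python
import os
import tempfile


def build_pr_command(title, body, tmp_dir=None):
    """Write the PR body outside the work tree and return the gh argv."""
    # Keeping notes.md in a temporary directory means the git work tree
    # never contains an untracked file for gh to complain about.
    tmp_dir = tmp_dir or tempfile.mkdtemp()
    notes = os.path.join(tmp_dir, "notes.md")
    with open(notes, "w", encoding="utf-8") as handle:
        handle.write(body)
    # The title is passed separately instead of being the first line of notes.md.
    return ["gh", "pr", "create", "--title", title, "--body-file", notes]
```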

Fixes #8303

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:10:50 +01:00
Greg Kurz
b83a7149ee release: Introduce helper to get GitHub CLI
If gh isn't installed already, download it from GitHub.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:09:24 +01:00
Fabiano Fidêncio
53cda12a71 Merge pull request #8311 from TimePrinciple/log-system-enhancement
runtime-rs: Log system enhancement
2023-10-31 10:14:41 +01:00
Greg Kurz
ceeabe3714 release: Allow to test release scripts with an alternate repo
We don't want to mess with the official repo when testing a change
in the release scripts. Adapt `update-repository-version.sh` to
be able to use an alternate repo just like `tag_repos.sh` already
does.

This means that the following command :

$ OWNER="$SOME_ORG" ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH"

will only create a PR in this repo :

http://github.com/$SOME_ORG/kata-containers.git
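
How the `OWNER` override selects the target repo can be sketched like this. It is a Python illustration, not the shell code of the release scripts; the function name and the default owner value are assumptions.

```python
import os

DEFAULT_OWNER = "kata-containers"  # assumed default when OWNER is unset


def repo_url(env=None):
    """Build the repo URL, honouring an OWNER override from the environment."""
    env = os.environ if env is None else env
    owner = env.get("OWNER") or DEFAULT_OWNER
    return f"http://github.com/{owner}/kata-containers.git"
```

Passing the environment as a parameter keeps the sketch testable without touching the real process environment.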

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 09:49:27 +01:00
Archana Shinde
148c565b2f Merge pull request #8289 from BbolroC/skip-create-tmpfs-s390x
agent: Skip flaky create_tmpfs on s390x
2023-10-30 22:26:28 -07:00
Ruoqing He
4ad2cfe0c2 runtime-rs: Log system enhancement
Modify the RuntimeLevelFilter drain to improve logging control: isolate
the effect of log-level changes between the components' loggers, and
log clh messages according to the log levels reported by
cloud-hypervisor.

Fixes: #8310

Signed-off-by: Ruoqing He <linuxwatcher@outlook.com>
2023-10-31 04:57:46 +00:00
David Esparza
2a17d3889e Merge pull request #8334 from amshinde/ipvlan-nerdctl-fix
network: Fix network attach for ipvlan and macvlan
2023-10-30 16:00:32 -06:00
David Esparza
5573705800 Merge pull request #8202 from dborquez/enable_fio_checkmetrics
Enable fio checkmetrics
2023-10-30 15:55:37 -06:00
David Esparza
c232869af9 metrics: removes double-quotes in checkmetrics when parsing results
This PR removes double quotes in jq output to return raw strings
as input of checkmetrics tool.
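
The difference between quoted and raw output can be sketched in Python. The helpers below only mimic jq's default output and its `-r` (raw output) mode; whether the PR uses `-r` or another mechanism is not stated here, and the helper names are hypothetical.

```python
import json


def jq_default(value):
    """Mimic jq's default output: the value re-encoded as JSON text."""
    return json.dumps(value)


def jq_raw(value):
    """Mimic `jq -r`: strings are printed as-is, other values stay JSON-encoded."""
    return value if isinstance(value, str) else json.dumps(value)
```

With the default encoding a string result keeps its surrounding double quotes, so a consumer expecting a bare value would receive `"1024.5"` rather than `1024.5`.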

Fixes: #8331

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 09:43:03 -06:00
David Esparza
c42a2f2eda metrics: increase the number of attempts to stop kata
This PR increases the number of attempts to stop kata components
when it is required usually before starting a metrics test.

Fixes: #8307

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 09:43:03 -06:00
David Esparza
1626253d9e metrics: FIO ci test enablement
This PR enables the new FIO test based on the containerd client
which is used to track the I/O metrics in the kata-ci environment.

Additionally this PR fixes the parsing of results.

Fixes: #8199

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 09:42:54 -06:00
David Esparza
873386a349 metrics: update iodepth and job size fio parameters to improve workload
This PR updates the values of the fio parameters for iodepth
requests and for the number of jobs, in order to increase the
number of sequential operations.

Additionally, it adds the list of packages needed to parse the
results.

Fixes: #8198

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 08:43:06 -06:00
James O. D. Hunt
d93275224b Merge pull request #8323 from jodh-intel/utils-kata-manager-fix-version-checks
utils: kata manager: Fix version checks
2023-10-30 12:25:51 +00:00
Chao Wu
7d26604061 Merge pull request #7831 from lisongqian/feat/dragonball_trace
dragonball: add tracing feature for dragonball
2023-10-30 17:27:30 +08:00
James O. D. Hunt
d7e410ad2b Merge pull request #8314 from jodh-intel/kata-ctl-show-confidential-guest
kata-runtime/kata-ctl: Add security details to output
2023-10-30 07:41:22 +00:00
Songqian Li
2f533c3003 dragonball: add tracing feature for dragonball
This PR adds the tracing capability for dragonball and it depends on the tracing::Subscriber of the upper layer.

Fixes: #7249

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-28 19:52:24 +08:00
Chao Wu
f1f4410537 Merge pull request #7695 from lisongqian/feat/legacy_metrics
dragonball: add metrics support for legacy device
2023-10-28 16:48:57 +08:00
Archana Shinde
f53f86884f network: Fix network attach for ipvlan and macvlan
We used the approach of cold-plugging network interface for pre-shimv2
support for docker. Since the hotplug approach was not required,
we never really got to implementing hotplug support for certain network
endpoints, ipvlan and macvlan being among them.

Since moving to shimv2 interface as the default for
runtime, we switched to hotplugging the network interface for supporting
docker and nerdctl. This was done for veth endpoints only.

Implement the hot-attach apis for ipvlan and macvlan as well to support
ipvlan and macvlan networks with docker and nerdctl.

Fixes: #8333

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-27 21:42:37 -07:00
Peng Tao
52a014d9cd Merge pull request #8033 from h56983577/6715/shared-mount
agent: use open_tree()/move_mount() to set up bind mounts between containers directly.
2023-10-28 10:57:34 +08:00
Songqian Li
da77b19449 dragonball: output legacy device metrics to runtime
Legacy device manager adds device metrics to METRICS when a device is created and removes metrics when a device is dropped.

Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-27 14:09:42 +08:00
Songqian Li
65213e9fbe dragonball: unify the metric interface of legacy device
Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-27 14:09:42 +08:00
Chao Wu
b508091305 Merge pull request #8322 from wainersm/git_helper-fix
tests/git-helper: cancel any previous rebase left halfway
2023-10-27 14:07:16 +08:00
Spencer von der Ohe
fee97e219c docs: Fix Dragonball link
Update dragonball link to be the current repo (from archived repo)

Fixes #8324

Signed-off-by: Spencer von der Ohe <s.vonderohe40@gmail.com>
2023-10-26 21:12:31 -06:00
Archana Shinde
f5c17f89a3 Merge pull request #8250 from amshinde/runtime-rs-clh-config
runtime-rs: Add default configuration file for cloud-hypervisor
2023-10-26 14:54:47 -07:00
Chelsea Mafrica
0608e20a01 docs: Fix broken links
Update broken links so that static checks pass.

Fixes #8254

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-10-26 10:17:01 -07:00
Chelsea Mafrica
4ede63fa4d Merge pull request #8317 from cmaf/gha-spellcheck-reqs
gha: add dependencies for spell checker
2023-10-26 10:11:26 -07:00
James O. D. Hunt
ae3ea1421d utils: kata-manager: Fix containerd version check
Containerd release files include the version number without a "v" prefix.
However, the tag for the equivalent release does include it so handle
this distinction and also tighten up the Kata check by specifying an
explicit version number in the regex.
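
The "v" prefix distinction can be sketched as below. This is a Python illustration, not the shell code in `kata-manager.sh`; the function names are hypothetical and the version number used in the usage note is only an example.

```python
def strip_v_prefix(version):
    """Drop a single leading "v" so a tag and a file version can be compared."""
    return version[1:] if version.startswith("v") else version


def same_release(tag, file_version):
    """Return True if a release tag and a file version name the same release."""
    return strip_v_prefix(tag) == strip_v_prefix(file_version)
```

For example, `same_release("v1.7.8", "1.7.8")` is true, while comparing the two strings directly would not be.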

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-26 16:34:56 +01:00
James O. D. Hunt
346f195532 utils: kata-manager: Fix whitespace
Use tabs consistently.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-26 16:06:51 +01:00
Wainer dos Santos Moschetta
0ce0abffa6 tests/git-helper: cancel any previous rebase left halfway
On bare-metal machines the git tree might be left in an unstable state, with
a previous rebase left halfway. So let's attempt to abort any rebase first.

Fixes #8318
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-26 11:50:12 -03:00
James O. D. Hunt
2ac7ac1dd2 utils: kata-manager: Fix "Cannot determine download URL" issue
The archive names for x86_64 [Kata releases](https://github.com/kata-containers/kata-containers/releases)
used to include the tag `x86_64`, but that has now been changed to
`amd64`, which unfortunately broke `kata-manager.sh`:

```
kata-static-3.1.3-x86_64.tar.xz
                  ~~~~~~
                  expected

kata-static-3.2.0-alpha3-x86_64.tar.xz
                         ~~~~~~
                         expected

kata-static-3.2.0-alpha4-amd64.tar.xz
                         ~~~~~
                         changed
```
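
A pattern that accepts both spellings of the architecture tag can be sketched as follows. This is a Python illustration, not the fix applied to `kata-manager.sh`; the regular expression and the function name are assumptions.

```python
import re

# Accept either architecture tag so old and new archive names both match.
KATA_STATIC_RE = re.compile(
    r"^kata-static-"
    r"(?P<version>\d+\.\d+\.\d+(?:-[0-9A-Za-z.]+)?)"
    r"-(?P<arch>amd64|x86_64)"
    r"\.tar\.xz$"
)


def parse_archive_name(name):
    """Return (version, arch) for a Kata static archive name, or None."""
    match = KATA_STATIC_RE.match(name)
    if match is None:
        return None
    return match.group("version"), match.group("arch")
```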

Fixes: #8321.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-26 15:27:37 +01:00
James O. D. Hunt
59bd534827 utils: kata-manager: Lint fixes
Improve the code by fixing some lint issues:

- defining variables before using them.
- Using `grep -E` rather than `egrep`.
- Quoting variables.
- Adding a check for invalid CLI arguments.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-26 15:24:46 +01:00
HanZiyao
a3b003c345 agent: support bind mounts between containers
This feature supports creating bind mounts directly between containers through annotations.

Fixes: #6715

Signed-off-by: HanZiyao <h56983577@126.com>
2023-10-26 16:34:50 +08:00
Archana Shinde
1b8ec08278 Merge pull request #8281 from amshinde/add-clh-config-kata-manager
kata-manager: Add clh config to containerd config file
2023-10-25 13:44:53 -07:00
Chelsea Mafrica
c20aadd7a8 gha: add dependencies for spell checker
In the migration from the tests repo to the kata containers repo we
missed two hunspell dictionaries for static checks; add them.

Fixes #8315

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-10-25 12:49:09 -07:00
James O. D. Hunt
d707fa2c0d kata-runtime/kata-ctl: Add security details to output
Add the hypervisor security details to the output of the `kata-runtime
env` and `kata-ctl env` commands so the user can see, amongst other
things, the value of `confidential_guest`.

Fixes: #8313.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-25 16:34:42 +01:00
Chao Wu
29d863350f Merge pull request #7697 from lisongqian/feat/balloon_metrics
dragonball: add metrics support for balloon device
2023-10-25 02:42:14 -05:00
Fabiano Fidêncio
328ba0da99 Merge pull request #7647 from jongwu/use_pcie_virt
AArch64: runtime: use pcie root port to do pci/pcie device hotplug
2023-10-25 09:17:13 +02:00
Archana Shinde
f99de4d5a1 runtime-rs: Make default kernel params as empty
The default kernel params passed to any hypervisor except dragonball are
empty.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-24 15:50:12 -07:00
Archana Shinde
a813012785 runtime-rs: Add default configuration file for cloud-hypervisor
The config template file for clh is in the new format for runtime-rs.
It is a result of merging the new format file and options supported by
cloud-hypervisor.

Some config options from the golang runtime are missing as they may not
be currently supported by the rust runtime. Examples are the selinux
and rate limiting options, as these are not currently supported or
verified with the rust runtime.

Fixes: #8249

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-24 15:17:24 -07:00
Chao Wu
43675bd485 Merge pull request #8294 from ZizhengBian/jason/for-master
runtime-rs: fix a typo in device manager
2023-10-24 04:52:04 -05:00
Songqian Li
dce365d5b4 dragonball: add conditional compilation for BalloonDeviceMetrics
Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-24 13:33:39 +08:00
GabyCT
4c3a664358 Merge pull request #8278 from GabyCT/topic/udpparallel
metrics: Add parallel udp iperf3 benchmark
2023-10-23 10:30:53 -06:00
Fabiano Fidêncio
a001021721 Merge pull request #8292 from fidencio/topic/release-ensure-gh-is-used-from-a-git-repo
release: Always use actions/checkout to ensure we're in a git repo
2023-10-23 15:16:12 +02:00
Songqian Li
3819f0ee6f dragonball: output balloon device metrics to runtime
Balloon device manager adds balloon device metrics to METRICS when a device is created and removes metrics when a device is dropped.

Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-23 21:15:22 +08:00
Zizheng Bian
7d7c25c1d6 runtime-rs: fix a typo in device manager
Fixes: #8293
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
2023-10-23 20:33:47 +08:00
Fabiano Fidêncio
c5cfad7023 actions: Move all the checkout actions to v4
It's been released for a while now, and we need to be consistent in
what we use.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-23 14:01:53 +02:00
Fabiano Fidêncio
b32c6bf805 release: Always use actions/checkout to ensure we're in a git repo
Otherwise we'll face issues like:
```
Run tag=$(echo $GITHUB_REF | cut -d/ -f3-)
  tag=$(echo $GITHUB_REF | cut -d/ -f3-)
  tarball="kata-static-$tag-amd64.tar.xz"
  mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
  pushd $GITHUB_WORKSPACE
  echo "uploading asset '${tarball}' for tag: ${tag}"
  GITHUB_TOKEN=*** gh release upload "${tag}" "${tarball}"
  popd
  shell: /usr/bin/bash -e {0}
~/work/kata-containers/kata-containers ~/work/kata-containers/kata-containers
uploading asset 'kata-static-3.3.0-alpha0-amd64.tar.xz' for tag: 3.3.0-alpha0
failed to run git: fatal: not a git repository (or any of the parent directories): .git
```
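
The tag and tarball name seen in the log above are derived from `GITHUB_REF`. The sketch below reproduces that derivation in Python for refs of the form `refs/<kind>/<name>`; the function names are hypothetical.

```python
def tag_from_ref(github_ref):
    """Equivalent of `cut -d/ -f3-`: keep everything after the second "/"."""
    return "/".join(github_ref.split("/")[2:])


def tarball_name(tag, arch="amd64"):
    """Build the release asset name used in the upload step."""
    return f"kata-static-{tag}-{arch}.tar.xz"
```

Note that none of this needs a git repository; it is the later `gh release upload` call that fails when run outside one.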

Fixes: #8286 (or better, just a follow up of that)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-23 14:00:39 +02:00
Fabiano Fidêncio
8fe88696c0 Merge pull request #8287 from fidencio/topic/release-use-gh-cli-instead-of-hub
actions: release: Use GH cli instead of hub
2023-10-23 12:40:22 +02:00
Hyounggyu Choi
a0746c8d7b agent: Skip flaky create_tmpfs on s390x
This is to skip a flaky test `create_tmpfs()` on s390x until a root cause is identified and fixed.

Fixes: #4248

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-10-23 11:22:14 +02:00
Fabiano Fidêncio
710eb8ab9d actions: release: Use GH cli instead of hub
hub is now deprecated, which has been causing issues with our release
process.

Let's move to the GH cli (https://cli.github.com/manual), and unblock
this release.

**NOTE**: This commit is purposefully not touching anywhere else hub is
used, as that would require more time and investigation to do the
switch, and right now we just want to unblock the release.

Fixes: #8286

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-23 08:49:55 +02:00
Fabiano Fidêncio
74d4865189 Merge pull request #8275 from fidencio/topic/ci-adapt-kata-deploy-regex-on-repo-version-update
release: Adapt the CIs using the kata-deploy image
2023-10-23 00:37:19 +02:00
Archana Shinde
d3250dff34 kata-manager: Add clh config to containerd config file
kata-manager currently adds only the default config, which is qemu.
Add config for clh as well to containerd configuration.
This should allow new users to get started with clh using kata-manager.

Also add config related to enabling privileged_without_host_devices.
Always good to have this config enabled when users try to run privileged
containers so that devices from host are not inadvertently passed to the
guest.

Fixes: #8280

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-20 18:16:16 -07:00
Gabriela Cervantes
2d0518cbe6 metrics: Add parallel udp iperf3 benchmark
This PR adds the parallel udp iperf3 benchmark for network metrics.

Fixes #8277

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-20 19:54:06 +00:00
Dan Mihai
732fe163f3 Merge pull request #8229 from microsoft/danmihai1/no-config-toml-endpoints
agent: no endpoint blocking from agent-config.toml
2023-10-20 11:30:43 -07:00
Fabiano Fidêncio
026f6a1a4c release: Adapt the CIs using the kata-deploy image
This is needed in order to properly run the CIs in branches that are not
the main one, as the kata-deploy.yaml file on those branches does not have
the `latest` tag, but rather the latest stable release.

Fixes: #8274

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-20 18:59:14 +02:00
Fabiano Fidêncio
124f498830 Merge pull request #8266 from fidencio/3.3.0-alpha0-branch-bump
# Kata Containers 3.3.0-alpha0
2023-10-20 17:40:44 +02:00
GabyCT
8486283012 Merge pull request #8247 from GabyCT/topic/iperfudp
metrics: Add iperf udp benchmark
2023-10-20 09:21:37 -06:00
Fabiano Fidêncio
0fb69ddf6a release: Kata Containers 3.3.0-alpha0
- kata-deploy-stable: Switch to using the ubuntu based payload
- libs: protection: Fix typo in TDX output
- ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
- tests: Enable agent stability test
- docs: Fix paths to build kernel in SNP VMs documentation
- runtime-rs: ch: Add TDX CH features check
- runtime: Validate hypervisor section name in config file
- tests: query data from the OPA service
- release: tag_repos: Stop tagging the `tests` repo
- metrics: fixes common.sh function to always return true
- Memory footprint test removing trailing commas to make json results file valid
- policy: allow access to ReseedRandomDev
- runtime/kata-ctl: update dependencies
- runtime-rs : fix Nydus support for runtime-rs + Dragonball
- metrics: removal of reference in the documentation to the fio dax subtest.
- runtime-rs: ch: Detect Intel TDX version
- runtime-rs: use the same base64 as kata-runtime/direct-volume does
- tests: Enable scalability test for stability CI
- runtime-rs: Add support for adding vfio device for cloud-hypervisor
- tests: Enable soak parallel stability test
- dragonball: vcpu metrics change to be recorded per vcpu
- ci: k8s: adapt gha-run.sh to run locally
- metrics: removes kata components and k8s deployment when test finishes
- GHA: fix up referenced yaml exceeding 20 limit problem
- gha: ci: Revert tracing test PR to unbreak CI
- runtime-rs: ch: Enable feature
- gha: ci: Port runk tests over
- ci: gha: Port tracing tests over
- Enable fio test using containerd client
- gha: Add stability tests workflow for gha
- gha: arm64: Ensure the builder is arm64-builder
- kata-deploy: Build kata-agent as we build all the other components
- versions: migrate out of k8s.gcr.io
- doc: Update crictl pod-config
- gha: Fix k0s deployment
- tests: Add stability test for kata CI
- docs: Update url in kata vra document
- gpu: Adding CDI support for cold and hot-plug of VFIO devices
- kata-deploy: build & ship the rust components from src/tools/
- metrics: Add latency value limits for kata CI
- runtime: fix reading cgroup stats of sandboxes
- Upgrade to Cloud Hypervisor v35.0
- ci: Port kata-monitor tests from Jenkins to GHA
- metrics: Fix latency yamls path
- metrics: Fix metrics README
- metrics: Fix C-Ray documentation
- runtime-rs: ch: Enable Intel TDX
- ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI
- metrics: Enable latency test in gha run script
- local-build: Fix .docker ownership before build-payload
- runtime-rs: Add network support for cloud-hypervisor
- osbuild: Reduce guest components binary size with strip
- gha: Add pandoc as a dependency for static checks
- ci: rootfs-image build-asset is failing
- feat(runtime-rs): introduce huge page mode to select VM RAM's backend
- clh: Direct IO support for block devices
- gha: Install hunspell for static checks
- ci: Trigger payload-after-push on workflow_dispatch
- ci: Actually enable the CRI-O tests
- protocol: remove gogoprotobuff tests
- ci: k8s: Also run tests with CRI-O
- runtime: support kernel params including spaces
- ci: kata-deploy: Fix runner name
- metrics: Enable parallel bandwidth iperf limit
- ci: kata-deploy: Enable all k8s flavours that we support
- ci: Create clusters in individual resource groups
- versions: Bump virtiofsd to v1.8.0
- clh: arm: Use static_sandbox_resource_mgmt=true
- Bump nydus versions and update nydus tests
- runtime/qemu: Rework QMP/HMP support
- clh:arm64: use arm AMBA UART for hypervisor debug
- ci: Use variable size of VMs depending on the tests running
- ci: Rework static checks
- runtime: incorrect handling of non-empty []Endpoint parameter in Remo…
- ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage
- ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component}
- ci: Run some of the GARM tests in smaller instances
- ci: Reduce the size of the AKS VMs
- ci: cache: Allow pushing our artefacts to an OCI registry
- metrics: Add iperf value for cpu utilization
- ci: cache: Export env vars needed to use ORAS
- gha: vfio: Import test script
- tests: fix kernel and initrd annotations
- metrics: Add iperf bandwidth value for kata metrics
- metrics: Add Cassandra Metrics documentation
- metrics: Remove warning from metrics documentation
- ci: docker: nerdctl: Switch to tcp port 80 ping
- runtime: Naming conflict of network devices
- Remove gogoproto.nullable extension
- metrics: Ensure docker is running in init_env
- metrics: this PR skips the FIO test temporarily to fix issues
- ci: Add a very basic nerdctl sanity test
- runtime-rs: hypervisor: Remove debug kernel options
- versions: Bump rust version
- ci: Add a very basic docker sanity test
- dragonball: fix for non-deterministic builds
- runtime-rs: bring hybrid vsock devices in manager.
- ci: use github.ref_name instead of $GITHUB_REF_NAME
- ci: Add more target-branch related fixes
- ci: Fix target-branch usage
- agent: optimize the code of systemd cgroup manager
- gha: Manually rebase PR atop of the target branch before testing
- Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work
- kata-deploy: Fix aarch64 image build
- runtime: Fix more virtiofs args
- kata-deploy: Switch to an alpine image
- metrics: Use TensorFlow optimized image
- metrics: fix FIO test initialization
- ci: k8s: Add clean-up-garm argument for gha-run.sh
- ci: k8s: Second round of fix-ups with the devmapper CI
- metrics: re-enable memory-usage initialization step
- Dragonball: optimize the placement of dbs-upcall features
- ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
- ci: k8s: Add k8s devmapper tests (part 0)
- kata-deploy: Create kata-static.tar with correct ownership
- runtime: run prestart hooks before starting VM for FC
- metrics: Add write 95 percentile FIO value
- runtime: Allow virtio_fs_extra_args annotation
- packaging: do not install docker-compose-plugin for s390x|ppc64le
- runtime-rs: Fix volumes and rootfs cleanup issues
- metrics: Enable iperf benchmark on gha for kata metrics
- CI: switch static-checks-dragonball CI machines to Azure
- metrics: Add README for kata metrics report
- osbuilder: Remove chcon operation for guest SELinux
- kata-sys-util: protection: Update TDX checks
- Improve the way to clean up storage devices for sandbox
- agent: avoid possible leakage of storage device
- tests: add policy to existing tests
- gha: Rebase PR atop of the target branch before testing
- versions: Update alpine to its 3.18 version
- runtime: Fix data race in ioCopy
- metrics: Add grabdata script for metrics report
- Fixes tests on AMD machines
- metrics: Enable FIO limits for kata metrics
- metrics: Add metrics report script
- metrics: Fix memory inside limits for kata metrics
- metrics: fix parsing issue on memory-usage test
- dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…
- tests: Add confidential test
- tdx: Update the components needed for using the 6.2 kernel stack
- tests: delete k8s deployment at the test's end
- tests: use unique test name
- runtime-rs: check peer close in log_forwarder
- gha: Avoid "fail-fast" in tests that are known to be flaky
- Refine storage device management for kata-agent
- metrics: Remove unused variable in tensorflow nhwc script
- kata-deploy: Don't try to remove /opt/kata
- metrics: Add TensorFlow ResNet50 FP32 benchmark
- gha: vfio: Run on Ubuntu 23.04 runner
- kata-agent: use default filemode for block device when it is set to 0
- kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull
- libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
- local-build: Remove GID before creating group
- kata-deploy: Avoid failing on content removal
- runtime: fix image and initrd assets handling
- metrics: Add disk link to README
- metrics: Fix FIO path
- gha: capture additional kata-deploy output
- metrics: Use function from metrics common in pytorch script
- metrics: Enable kata runtime in K8s for FIO test.
- metrics: Fix README for pytorch
- metrics: Remove unused variable in tensorflow mobilenet script
- rootfs: agent: Policy support with AGENT_INIT=yes
- gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy
- metrics: Fix check results for tensorflow benchmark
- metrics: Add Tensorflow ResNet50 int8 benchmark
- kata-deploy: Properly create default runtime class
- agent: simplify error handling
- metrics: Fix MobileNet help me description
- gha: ci: Start running kata-deploy tests
- runk: Modify kill command's error message for containerd tests
- runtime-rs: add driver option
- gha: cri-containerd: Enable tests
- metrics: Rename tensorflow scripts
- gha: tests: Add kata-deploy functional tests -- Part 1
- agent: runtime: add Agent Policy feature
- runk: Support without pid ns
- metrics: Add Cassandra Kubernetes benchmark for kata metrics
- metrics: Add common functions to the common script
- metrics: fix the loop used to stop kata components
- docs: Remove installation step in virtcontainers doc
- Propagate secrets, config maps etc into guest if sharedFS not available
- kata-deploy: Preliminary k0s support
- gha: static-checks: Move to the Azure instances
- versions: Update firecracker version to 1.4.0
- agent: Allow clippy::redundant_clone in the unit tests
- agent: avoid creating new `Vec` instances when easily avoidable
- metrics: compute tensorflow statistics
- metrics: Add network nginx benchmark
- metrics: install kata once and run multiple checks
- ci: unencrypted-image: Fix build context
- ci: create-confidential-image: Add dependent actions
- Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596
- tests: Create image that will be used in the unencrypted confidential tests
- kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests
- tests: upgrade bats version
- Fix minor bugs and improve coding style of agent rpc/sandbox/mount
- deps: Bump dependent crate versions
- fix number of queues handling in dragonball share fs device
- runtime-rs: Introduce directly attachable network
- metrics: General improvements to mobilenet tensorflow test
- gha: Add iperf network metrics
- docs: Use control-plane term instead of master
- agent: avoid unnecessary calls to `Arc::clone`
- metrics: Add network latency test
- Image pulling on the host
- Use version 0.10.4 of `fuse-backend-rs`
- kata-deploy: Use host's systemctl
- release: Revert kata-deploy changes after 3.2.0-rc0 release
- metrics: stop kata components before start a metric test.
- runtime-rs: Add block device handling for cloud hypervisor

a93fdb014 kata-deploy-stable: Adapt to what we're using in the stable branch
36109da93 ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
d01daf749 tests: Adjust timeout for agent stability test
9b14dda14 libs: protection: Fix typo in TDX output
0e0867f15 runtime-rs: ch: Add TDX CH features check
409eadddb runtime-rs: ch: Improve readability of guest protection checks
82a0814fc tests: Enable agent stability test
32be8e3a8 tests: query data from the OPA service
b81c0a669 tests: encode policy file during test
4f9681b41 metrics: fixes common.sh function to always return true
2ef2b2a6d docs: Fix paths to build kernel in SNP VMs documentation
408b59c02 runtime-rs: fix bugs to support Nydus v5
157caea9f Revert "nydus: Temporarily skip tests on dragonball"
678fe3cd3 Dragonball: fix Nydus config serde problem
b6ec62138 policy: allow access to ReseedRandomDev
908519db9 metrics: skips docker restart when it is not installed or is masked.
c2763120a metrics: removing trailing comma characters from json file.
3e8cf6959 runtime: Validate hypervisor section name in config file
ef6388e81 tests: Remove unused function from scalability test
fbc8f8f46 scripts: Use install_yq from the `kata-containers` repo
65b1a2d27 release: tag_repos: Stop tagging / updating the `tests` repo
87b760f56 runtime-rs: ch: Detect Intel TDX version
73e81f5e3 runtime-rs: unify base64 encoding for direct-volume
c6463cb5a tests: Fix path for versions yaml for soak parallel test
89c9454fc metrics: removal of reference in the documentation to the dax test.
30ff58904 tests: Enable scalability test for stability CI
8d6f7b909 runtime-rs: Add support for handling vfio device for cloud-hypervisor
e786b2b01 gha: Add install dependencies for stability tests
dbfe6512f dragonball: vcpu metrics change to be recorded per vcpu
fa60fbe02 dragonball: METRICS is refactored to RwLock<DragonballMetrics>
500d1c5ce kata-ctl: update rustls-webpki/webpki dependency
d7660d82a runtime: unify gopkg.in/yaml.v3 to v3.0.1
fc9a107e8 runtime: unify swag and testify dependency
79ebb959c runtime: update runc dependency to v1.1.9
7f3e8bd65 runtime: unify golang.org/x/text to v0.7.0
df325ae37 runtime: update golang.org/x/net to v0.7.0
bba34910d metrics: stops kata components and k8s deployment when test finishes
84e3d884e gha: Add general dependencies to stability tests
dec3951ca tests: Add soak parallel stability test
0f04d527d tests: Enable soak parallel test
e669282c2 ci: k8s: set KUBERNETES default value
c30c3ff18 tests: run k8s-volume on a given node
666993da8 tests: run k8s-file-volume on a given node
3a00fc910 tests: exec_host() now gets the node name
61c9c17bf tests: add get_one_kata_node() to tests_common.sh
68f083c4d ci: k8s: set KATA_HYPERVISOR default value
6677a61fe ci: k8s: configurable deploy kata timeout
200e54292 ci: k8s: shellcheck fixes to gha-run.sh
4af78be13 kata-deploy: re-format kata-[deploy|cleanup].yaml
d54e6d9cd ci: k8s: run_tests() for kcli
c2ef1f0fb ci: k8s: add deploy-kata-kcli() to gh-run.sh
d2be8eef1 ci: k8s: add cleanup-kcli() to gha-run.sh
cbb9aa15b ci: k8s: set default image for deploy_kata()
89bef7d03 ci: k8s: create k8s clusters with kcli
954d40cce gha: combine coco jobs into a single yaml
b60e0a9b5 gha: combine basic amd64 jobs into a single yaml
e9bd85211 gha: ci: Revert tracing test PR to unbreak CI
b8a46a4b8 runtime-rs: ch: Enable feature
0f2dc8c67 gha: Add containerd stability tests to ci yaml
da91c9df8 ci: Port runk tests to this repo
7f2377276 ci: Add placeholder for runk tests
9205acc3d ci: Move tracing tests here
85d290a04 gha: Add stability gha run script
54f0c8f88 gha: Add stability tests workflow for gha
3bb2923e5 ci: Add placeholder for tracing tests
2c3bf406d ci: Create a function to install docker
119f03de2 gha: arm64: Ensure the builder is arm64-builder
8c498ef5e metrics: Use jq tool to pretty-print json metrics output
a2159a636 metrics: Enables FIO test for kata containers
70e7ec3e2 gha: Fix k0s deployment
560bbffb5 packaging: tools: Remove `set -x` leftover
18fa483d9 packaging: release: Mention newly added images
ca3b88837 packaging: tools: Fix container image env var name
5ca66795c packaging: Allow passing the TOOLS_CONTAINER_BUILDER
02acef957 gha: Build the kata-agent as part of our workflows
5208386ab packaging: Build the kata-agent
1727487ee agent: Allow specifying DESTDIR and AGENT_POLICY via env vars
45c118883 packaging: Add get_agent_image_name()
0db8fb8f9 versions: migrate out of k8s.gcr.io
a1a054367 doc: Fix spelling
6339605a1 tests: Add general stability fixes
59ae24444 doc: Update crictl pod-config
fd19f4082 tests: Add agent stability test
215577032 tests: Add cassandra stress in stability tests
f2d3ea988 tests: Add stressng dockerfile for stability tests
6493aa309 tests: Add stressor CPU test for stability tests
ef68a3a36 metrics: Add stability test for kata CI
7c934dc7d gpu: Fix cold-plug of VFIO devices
8d66ef518 metrics: Increase qemu jitter value
5600e28b5 metrics: Increase jitter value for clh
a6b1f5e21 ci: Build src/tools components as part of our tests / releases
501a168a8 kata-deploy: Build components from src/tools
6ef42db5e static-build: Add scripts to build content from src/tools
4d08ec29b packaging: Add get_tools_image_name()
98097c96d packaging: Use git abbreviated hash
489caf1ad ci: kata-monitor: Move tests over
a3fb067f1 ci: Add placeholder for kata-monitor tests
57cb4ce20 ci: Make install_kata aware of container engines
de1eeee33 ci: Create a generic install_crio function
64a200085 ci: Add install_cni_plugins helper
8132fe15c ci: Modify containerd default config
8cb7df1be metrics: Add checkmetrics for latency test
e90440ae2 metrics: Add qemu latency value limit
a74a8f8a9 metrics: Add latency value limits for kata CI
d7def8317 metrics: Fix general check static warnings
928553d1b docs: Update url in kata vra document
b0a3293d5 runtime-rs: ch: Enable Intel TDX
523399c32 runtime-rs: ch: Add more consts
dea806581 runtime-rs: ch: Remove unused function
995f2c015 runtime-rs: ch: Only handle particular pending device types
b1b96a5c4 runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check
9ac29b8d3 metrics: Add init_env function to latency test
dfd0c9fa9 runtime: clh: Re-generate the client code
8f9f087e3 versions: Upgrade to Cloud Hypervisor v35.0
81c8babca metrics: Fix latency yamls path
481573682 metrics: Fix C-Ray documentation
ef63d67c4 ci: crio: Trail '\r' from exec_host() output
74c12b292 ci: crio: Enable default capabilities
358dc2f56 kata-deploy: Fix CRI-O detection
ebaa4fa4c ci: crio: Pass `-y` to apt
97e73b223 metrics: Fix spelling warnings
36c8cd6f1 metrics: Fix metrics README
15425a2b8 local-build: Fix .docker ownership before build-payload
13ca7d9f9 gha: Add pandoc as a dependency for static checks
08bc8e4db metrics: Add latency benchmark for gha
6776b55d7 metrics: Enable latency test in gha run script
94e2ccc2d runtime: fix reading cgroup stats of sandboxes
d507d189b fc: Add support for noflush cache option
2ca781518 clh: Direct IO support for block devices
0c95697cc ci: Trigger payload-after-push on workflow_dispatch
28cbc3b51 ci: rootfs-image build-asset is failing Fixes: #8027
87a861648 gha: Install hunspell for static checks
8c3c50ca8 ci: Actually enable the CRI-O tests
3a6510ad6 osbuild: Reduce guest components binary size with strip
07a6e63a6 ci: k8s: rke2: Use sudo to call systemd
03b82e848 ci: k8s: Add a CRI-O test
d7105cf7a ci: k8s: Add a method to install CRI-O
54c0a471b ci: k8s: k0s: Allow passing parameters to the k0s installer
730ef5169 deps: updating dependencies
3a2c83d69 ci: kata-deploy: Fix runner name
82ff2db46 runtime: support kernel params including spaces
604a9dd67 protocol: remove gogoprotobuff tests
f7fa7f602 ci: Enable kata-deploy tests for all the supported k8s flavours
2c908b598 ci: kata-deploy: Add the ability to deploy rke2
eaf616491 ci: kata-deploy: Add the ability to deploy k0s
001525763 ci: kata-deploy: Add deploy-k8s argument to gha-run.sh
bf2cb0228 ci: kata-deploy: Expand tests to run on k0s / rke2
b12b9e188 ci: kata-deploy: Add placeholder for tests on GARM
9e1fb8a96 ci: kata-deploy: Export KUBERNETES env var
09cc0ed43 ci: Move deploy_k8s() to gha-run-k8s-common.sh
486fe14c9 ci: Properly set K8S_TEST_UNION
d9ef1352a ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
68267a399 ci: Create clusters in individual resource groups
9aa8d1c91 metrics: Add parallel bandwidth limit for qemu
44c7c082d versions: Bump virtiofsd to v1.8.0
af59d4bf4 metrics: Enable parallel bandwidth iperf limit
aba36ab18 nydus: Temporarily skip tests on dragonball
b8a8dfcd1 nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata`
f6df3d6ef static-build: Fix arch error on nydus build
2f9c9e2e6 tests: nydus: Update nydus tests
c9a4e7e46 versions: Bump nydus and nydus-snapshotter to its latest release
b73bde320 gha: nydus: Populate run()
b3904a1a3 gha: nydus: Populate install_dependencies()
d2b3b67f5 gha: nydus: Actually install kata when `install-kata` is called
0ec00ad42 gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
568439c77 tests: nydus: Add timeout to the crictl calls
5ac3b76eb tests: nydus: Add uid / namespace to the nydus container / sandbox
376574a16 tests: nydus: Decorate some calls with `sudo`
4290fd4b6 tests: nydus: Adapt "source ..." to GHA
a84efa3e8 tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
56a14b395 tests: common: Add install_nydus_snapshotter()
b6563783e tests: common: Add install_nydus()
72599f191 clh: arm: Use static_sandbox_resource_mgmt=true
1f16b6627 runtime/qemu: Rework QMP/HMP support
8b1e9b0c7 ci: static-checks: Clean up static-checks job
2c5ca2eaf ci: static-checks: Run tests depending on KVM
509c309ab ci: static-checks: Move "sudo make test" to the new test matrix
4e963cedf ci: static-checks: Move "make test" to the new test matrix
08f2e5ae0 runtime-rs: Ensure static-checks-build is a dep of `make test`
2bc3a616a kata-ctl: Use `loop` instead of `kvm` module in tests
46daddc50 kata-ctl: Ensure GENERATED_CODE is a dep of `make test`
ec826f328 agent: Ensure GENERATED_CODE is a dep of `make test`
1d32410a8 ci: install_libseccomp: Do not depend on the tests repo
bf888b9a5 ci: static-checks: Move "make check" to the new test matrix
473ec8780 kata-ctl: Add `kata-types` to the Cargo.lock file
ea19549a9 kata-ctl: Ensure GENERATED_CODE is a dep of `make check`
e12577586 tests: install_rust: Also install clippy
e2c61a152 ci: static-checks: Move vendor check to its own job
6794d4c84 tests: Move install_rust.sh from the tests repo
e64508c30 tests: install_go: Remove tests repo dependency
11dff731b tests: Move functions from kata_arch script here
75c974c80 ci: static-checks: Move kernel config check to its own job
9c233bb9e test: Add test to verify try_from for clh Netconfig
c69a1e33b ci: Use variable size of VMs depending on the tests running
9049d311d runtime-rs: Add network support for cloud-hypervisor
eecd5bf2a ci: cache: Fix ovmf-sev cache
86c41074b ci: cache: Check the sha256sum of the component
460988c5f ci: cache: Remove the script used to cache artefacts on Jenkins
4533a7a41 ci: cache: Also store the ${component} sha256sum
eccc76df6 ci: cache: Use the cached artefacts from ORAS
7f5e77bcb kernel: enable Arm pl011 support
241c355e0 clh:arm64: use arm AMBA uart for hypervisor debug
094b6b2cf ci: k8s: Temporarily disable tests that require a bigger VM instance
d0c257b3a ci: cache: Push cached artefacts to ghcr.io
108f1b60d kata-deploy: Generate latest_{artefact,image_builder} files
be2eb7b37 ci: cache: Install ORAS in the kata-deploy binaries builder container
fb24fb0dc ci: k8s: devmapper: Use a smaller / cheaper VM instance
1daf02f5d ci: nydus: Use a smaller / cheaper VM instance
e60d81f55 ci: nerdctl: Use a smaller / cheaper VM instance
4db416997 ci: docker: Use a smaller / cheaper VM instance
32841827b ci: cri-containerd: Use a smaller / cheaper VM instance
92fff129f ci: k8s: Don't set cpu limit request for k8s-inotify test
faf98c062 ci: Reduce the size of the AKS VMs
adc18ecdb ci: cache: For consistency, read all used env vars
c7a851efd ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker
6bd15a85d ci: cache: Export env vars needed to use ORAS
cd4fd1292 metrics: Add iperf cpu utilization limit for qemu
df5cd10ea metrics: Add iperf value for cpu utilization
a96050a7a tests: Apply timeout to 'ctr t kill'
9d9303678 tests/vfio: Bump VM image to Fedora 38
faee59b52 tests/vfio: Accept single device in vfio group for CLH
df3dc1105 tests/vfio: Get rid of sync's
7211c3dcc gha: vfio: Set test timeout to 15m
1b02f89e4 packaging: kernel: Enable VIRTIO_IOMMU on x86_64
3a1db7a86 runtime: clh: Support enabling iommu
9f1a42c6c tests/vfio: Give commands 30s to execute
b46b0ecf8 tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms
bfc93927f runtime: Remove redundant check in checkPCIeConfig
7c4e73b60 runtime: Add test cases for checkPCIeConfig
fc51e4b9e runtime: Check config for supported CLH (cold|hot)_plug_vfio values
509771e6f runtime: clh: Add hot_plug_vfio entry to config
5f6475a28 tests/vfio: Gather debug info and disable tdp_mmu
8fffdc81c tests/vfio: Capture journal from vm
df815087e tests/vfio: Change to get the test working in GHA
a92ddeea1 tests/vfio: Move dependency installation to gha-run.sh
5a551a85b gha: vfio: Import jobs scripts from tests repo
49e2fa189 metrics: Increase jitter value for qemu
49234433a metrics: Increase value limit for jitter in clh
813bfdec0 ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
46bc0b1c0 ci: nerdctl: Create the containerd config
13968aa7f ci: nerdctl: Switch to tcp port 80 ping
e0c811678 ci: docker: Switch to tcp port 80 ping
1636abbe1 runtime: issue with non-empty []Endpoint in RemoveEndpoints
0aa073967 metrics: Add iperf bandwidth value for qemu
c0ad91476 tests: fix kernel and initrd annotations
615c1cbf1 metrics: Add iperf bandwidth value for kata metrics
d53eb73ee metrics: Ensure docker is running in init_env
ad08321b8 metrics: Add Cassandra Metrics documentation
a58ea6659 metrics: this PR skips the FIO test temporarily to fix issues
f536ef5ce ci: docker: Also run the smoke test with runc
c83f167c5 ci: docker: Run the tests after the kata-static is created
12d833d07 ci: Add a very basic nerdctl sanity test
348b8644d ci: Add a very basic docker sanity test
a75fd5eb8 runk: Fix rust unnecessary mut error
a31c14517 kata-ctl: useless-vec warning
c8419fc3b kata-ctl: Resolve non-minimal-cfg warning
3eaf68d95 agent-ctl: Allow clippy lint
1d8b78959 runtime-rs: Fix useless-vec warning
99f3d69e9 runtime-rs: Remove mut
16fbc27b0 dragonball: Allow ambiguous-glob-reexports
bbf191951 dragonball: Resolve non-minimal-cfg warning
75cfdd5d5 agent: config: Allow clippy lint
f3a0fd590 agent: config: Fix useless-vec warning
9e423bd3d libs: Fix clippy unnecessary hashes error
444395050 versions: Bump rust version
a16b0962b chore(cargo): update cargo lock
ca4b6b051 runtime: Naming conflict of network devices
202049f35 feat(runtime-rs): introduce huge page type to select VM RAM's backend
f811b064c ci: use github.ref_name instead of $GITHUB_REF_NAME
6d795c089 ci: Add more target-branch related fixes
8509c3187 ci: Fix target-branch usage
060499dca metrics: Remove warning from metrics documentation
c0f697fcc runtime: Allow kernel_params annotation
b03e49794 dragonball: fix for non-deterministic builds
976d10150 runtime-rs: hypervisor: Remove debug kernel options
fde34610c kernel: Add erofs patches needed for CC related work
dc6a4588a versions: Bump kernel to the latest LTS release (6.1.52)
52f6449b7 kata-manager: Remove initcall_debug kernel option
8b4a0b368 kata-deploy: Remove curl after it's used
139c7f03a kata-deploy: Fix aarch64 image build
470d06541 agent: optimize the code of systemd cgroup manager
bd24afcf7 gha: Manually rebase PR atop of the target branch before testing
72c510d05 runtime/virtiofsd: Drop all references to "--cache=none"
ead724bec protocol: removing gogo.nullable feature
d8e4bb985 protocol: remove unused PROTO_FILE env
5e1106a77 protocol: remove unused import_path
87accaaec protocol: use workdir during build
711a7ed96 protocol: remove mapping definitions
8db84c1bd protocol: force GOPATH to be set
68156d77a protocol: breaking lines to improve readability
670a8e9c7 kata-deploy: Switch to an alpine image
9d74b7ccc k8s: ci: Skip "Pod quota" test with firecracker
f6cd3930c ci: k8s: Remove useless skip statement from tests
3cc20b47a ci: k8s: Also check for "fc" (for firecracker)
b5bad3cb0 ci: k8s: Add clean-up-garm argument for gha-run.sh
aaec5a09f ci: k8s: devmapper tests should be using ubuntu 20.04
27fa7d828 ci: k8s: Add a kata-deploy-garm target
fa62a4c01 ci: k8s: Export KUBERNETES env var
8c9380a79 ci: k8s: Install bats on GARM runners
3de23034f ci: k8s: Wait some time after restarting k3s
adfea55b8 metrics: fix FIO test initialization
2df183fd9 ci: k8s: Append, instead of overwrite, the devmapper config
369a8af8f ci: k8s: Decrease k3s sleep from 4 to 2 minutes
ada65b988 ci: k8s: Use vanilla kubectl with k3s
ad45ab5d3 ci: k8s: Ensure k3s is deployed with --write-kubeconfig-mode=644
028a97e0d ci: k8s: Use the proper command for sleep
3a427795e metrics: Use TensorFlow optimized image
8d99972a8 ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
deed1b927 Dragonball: optimize the placement of dbs-upcall features
0e8bd50cb ci: k8s: Add k8s devmapper tests (part 0)
b28b54df0 ci: k8s: Add a function to configure devmapper for containerd
54f711721 ci: k8s: Add a function to deploy k3s
81536f21a runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr"
b1dd09a4d runtime: Allow virtio_fs_extra_args annotation
2efda20c7 packaging: do not install docker-compose-plugin for s390x|ppc64le
438fbf966 metrics: Add write 95 percentile for FIO for qemu
024b4d2ff metrics: Add write 95 percentile FIO value
e98e5cdea metrics: Add checkmetrics to gha run script
c1edfe551 metrics: Add checkmetrics value for qemu for iperf
6a79ecedf metrics: Add jitter value for clh
f609a9a75 metrics: Add test selector to iperf metrics
5b8db3042 metrics: Enable iperf benchmark on gha for kata metrics
60f733d30 CI: switch static-checks-dragonball CI machines to Azure
7870b33a2 runtime-rs: bring hybridVsock devices in manager.
18c94ebbe kata-deploy: Create kata-static.tar with correct ownership
57e7bf14a agent: refine StorageDeviceGeneric::cleanup()
53edb1937 agent: implement StorageDeviceGeneric::cleanup()
0c63453e2 types: make StorageDevice::cleanup() return possible error code
3a3d77b3b agent: move StorageDeviceGeneric from kata-types into agent
b151cfd14 metrics: re-enable memory-usage initialization step
f3e1a6a94 osbuilder: alpine: Change mirror
ac612aef5 osbuilder: alpine: Match the version on versions.yaml
9cd706d1c agent: avoid possible leakage of storage device
bf21411e9 tests: add policy to k8s tests
d0e061067 runtime: config: use the SEV initrd for SNP
67fed26f1 runtime: Use TDX image with in the qemu-tdx config
ac939c458 gha: Rebase atop of the target branch
82cd14ba3 versions: Update alpine to its 3.18 version
666882575 metrics: Add grabdata script for metrics report
c290eaed8 kata-sys-util: protection: Update TDX checks
d7a996c68 gha: Update to checkout@v3 action
c2ba29c15 runtime: Fix data race in ioCopy
211de08d9 osbuilder: Remove chcon operation for guest SELinux
9f21fa9b3 metrics: Add report generator link to general documentation
c0ed5ea0a metrics: Add README for kata metrics report
a7b59a5bf metrics: Add limit for 90 percentile for qemu value
99db6568e metrics: Add limit for write 90 percentile value for clh
6e06392c5 metrics: Enable FIO limits for kata metrics
2e4c87472 runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure
21204caf2 runtime: fail early when starting docker container with FC
32fd01371 runtime: run prestart hooks before starting VM for FC
00e7ffd98 tests: check vmx only on Intel machines
c8dd3c073 metrics: Fix memory footprint qemu limit
8877ec62f metrics: Fix memory inside limits for kata metrics
80146f207 tests: Fixes cpuType check on AMD machines
7e364716d metrics: Add test setup details to metrics report
17dc1b976 metrics: Add boot lifecycle times to metrics report
3b0d6538f metrics: Add memory inside container to metrics report
79fbb9d24 metrics: Add scaling system footprint in metrics report
8e6d4e6f3 metrics: Add metrics reportgen
139ffd4f7 metrics: Add report file titles
878d1a2e7 metrics: Generate PNGs alongside the PDF report
fce248797 metrics: Add metrics report R files
08812074d metrics: Add report dockerfile
69781fc02 metrics: Add metrics report script
e286e842c tests: Expand confidential test to support TDX
e31f099be tests: Expand confidential test to support SNP
c3b9d4945 tests: Add confidential test for SEV
538c965c2 metrics: fix parsing issue on memory-usage test
3818bf331 local-build: Remove $HOME/.docker/buildx/activity/default
d1b54ede2 qemu: tdx: Workaround SMP issue with TDX 1.5
1e34220c4 qemu: tdx: Adapt to the TDX 1.5 stack
8115a0522 versions: tdx: Update Kernel to 6.2 + TDX
ec18180f3 versions: tdx: Update TDVF to the "edk2-stable202302"
9803b2428 versions: tdx: Update QEMU to v7.2 + TDX v1.10
dffc16e5b runtime-rs: check peer close in log_forwarder
aaa5ab126 agent: simplify storage device by removing StorageDeviceObject
fb49d5d7c gha: Avoid "fail-fast" in tests that are known to be flaky
183f51d6f tests: use unique test name
6a974679f tests: delete k8s deployment at the test's end
32a778b6d metrics: Remove unused variable in tensorflow nhwc script
d8f3ce649 kata-deploy: Don't try to remove /opt/kata
936e8091a gha: vfio: Run on Ubuntu 23.04 runner
0e7248264 agent: move storage device related code into dedicated files
268e84655 runtime-rs: Fix volumes and rootfs cleanup issues
8f49ee33b agent: refine storage related code a bit
60ca12ccb agent: switch to new storage subsystem
fcbda0b41 kata-types: introduce StorageDevice and StorageHandlerManager
b03b1f613 agent: simplify the way to manage storage object
8392c71bf sys-util: support more mount flags in parse_mount_options()
c00d8f3d4 agent: use create_mount_destination() from kata-sys-util
5e867f053 types: add more mount related constants
880e6c9a7 agent: use function from kata-sys-utils to reduce code
3b881fbc0 local-build: Remove GID before creating group
959ca4944 metrics: Add TensorFlow ResNet50 fp32 Dockerfile
4b7d72c4a metrics: Add TensorFlow ResNet50 FP32 benchmark
5cba38c17 kata-deploy: Avoid failing on content removal
18d42da21 runtime/fc: fix image/initrd annotation handling
9fda7059a runtime/clh: fix image/initrd annotation handling
1a0092d63 runtime/qemu: fix image/initrd annotation handling
22d8f335d libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
8afd158ce metrics: Add disk link to README
40914b25d kata-agent: use default filemode for block device when it is set to 0
eee2ee6ee metrics: Fix FIO path
39bc3488f metrics: Use function from metrics common in pytorch script
400eb8874 gha: capture additional kata-deploy output
4aee3eade kata-types: implement serde methods for KataVirtualVolume
b875e3932 kata-types: validate KataVirtualVolume object
fa2fdc105 kata-types: implement two conversion helpers for KataVirtualVolume
6326af20e kata-types: introduce KataVirtualVolume
c8b43f8b3 metrics: Fix README for pytorch
fb571f8be metrics: Enable kata runtime in K8s for FIO test.
cb056f8cb rootfs: agent: Policy support with AGENT_INIT=yes
85c02828e metrics: Update tensorflow name in gha run script
e8a511934 metrics: Fix check results for tensorflow benchmark
2d896ad12 gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
4ffc2c86f gha: kata-deploy: Add the first kata-deploy test
8616c050a metrics: Remove unused variable in tensorflow mobilenet script
285e616b5 tests: common: Ensure test_type is used as part of the cluster's name
790bd3548 tests: common: Don't fail if yq is not part of the cache
ce6adecd0 gha: kata-deploy: Add run-kata-deploy-tests.sh
cfc29c11a gha: k8s: Stop running kata-deploy tests as part of the k8s suite
f4dd15286 tests: k8s: Call ensure_yq() in setup.sh
339569b69 kata-deploy: Properly create default runtime class
2a491e9b1 metrics: Fix MobileNet help me description
d19a75e80 gha: ci: Start running kata-deploy tests
d90f7ac68 runtime-rs: add unit test for block driver
e44919f0d runtime-rs: add load_test_config for unit test
7f48a6937 runtime-rs: add driver option
bade6a5c3 docs: Fix TensorFlow word across the document
1a1b20776 docs: Add Tensorflow Resnet50 documentation
24baededc metrics: Add Dockerfile for ResNet50 int8
6d971ba8d metrics: Add Tensorflow ResNet50 int8 benchmark
25d151bd1 runk: Modify kill command's error message for containerd tests
b3592ab25 gha: cri-containerd: Enable tests
84dd02e0f gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
b29782984 gha: cri-containerd: Show pod before deleting it
ae0930824 gha: cri-containerd: Print kata logs in case of error
6c8b2ffa6 gha: cri-containerd: Group containerd logs
9e898701f gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
76dac8f22 agent: simplify error handling
18a7fd8e4 metrics: Rename tensorflow scripts
e55fa93db tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx
d9ee17aae tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks
ab829d103 agent: runtime: add the Agent Policy feature
831e73ff9 tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
af1b46bbf tests: Add gha-run-k8s-common.sh
416445e7e docs: Remove installation step in virtcontainers doc
72cbcf040 kata-deploy: Add k0s support
767434d50 metrics: fix the loop used to stop kata components #7629
5d0f0d43c metrics: Add cassandra statefulset yaml
c1dcc1396 metrics: Add cassandra service yaml
2297a0d1c metrics: Add block loop pvc yaml for cassandra
e3d511946 metrics: Add block loop pv yaml for cassandra test
989027159 metrics: Add block loop pvc for cassandra test
349b89969 metrics: Add Cassandra Kubernetes benchmark for kata metrics
c52d09052 gha: static-checks: Move to the Azure instances
8815ed066 runtime: Remove config warnings
afe1a6ac5 agent: support copying of directories and symlinks
ab13ef87e runtime: propagate configmap/secrets etc changes for remote-hyp
c074ec4df runtime: Copy shared files recursively
fdcd52ff7 metrics: Add check containers are running in tensorflow mobilenet
36337ee14 metrics: Add check containers are up in tensorflow script
f700f9b0b metrics: Remove unused variable in tensorflow script
833cf7a68 metrics: Add check containers are running function
918c78308 metrics: Add check containers are up in tensorflow mobilenet script
9d57a1fab metrics: Use check containers are up in tensorflow script
1c84680d8 metrics: Add check containers are up in common script
d3e57cf45 metrics: Use collect_results function in tensorflow mobilenet test
286de046a metrics: Remove collect results function definition
9879709aa metrics: Add common functions to the common script
4746fa3da docs: Specify supported Firecracker version using `versions.yaml`
cc922be5e versions: Update firecracker version to 1.4.0
39e67b06e dragonball: vsock add fifo/pipe stream support for passed fd hybridStream
473b0d3a3 metrics: compute tensorflow statistics
03d1fa67b ci: unencrypted-image: Fix build context
eb463b38e ci: unencrypted-image: Don't fail to build on s390x
a2d731ad2 ci: create-confidential-image: Add dependent actions
d1a629622 metrics: Add nginx documentation to network README
498f7c054 metrics: Add nginx kubernetes yaml
f8a5255cf metrics: Add network nginx benchmark
43fe5d1b9 ci: k8s: tees: Ensure PR_NUMBER is exported
54f6a7850 ci: {{ pr-number }} should be {{ inputs.pr-number }}
034d7aab8 tests: k8s: Ensure the runtime classes are properly created
fac8ccf5c ci: Add build-and-publish-tee-confidential-unencrypted-image
ab5f603ff ci: k8s: Add the image used for unencrypted confidential tests
1e8fe131b k8s: tests: Take advantage of `SHIMS` and `DEFAULT_SHIM` env vars
729b2dd61 agent: avoid creating new `Vec` instances when easily avoidable
aeaec9dae tests: upgrade bats version
e66496986 metrics: install kata once and run multiple checks
baabfa9f1 agent: refine implementation of mount related code
98ba211a3 agent: fix a bug in update_ephemeral_mounts()
5333618d7 agent: make add_storage() take &[Storage] instead of Vec<Storage>
37f34781d agent: simplify function online_cpu_memory()
d3c542237 agent: refine style of code related to sandbox
71a9f6778 agent: avoid unwrap() in function do_remove_container()
84badd89d agent: avoid clone objects when possible
b23c5ed15 deps: Bump dependent crate versions
863283716 metrics: General improvements to mobilenet tensorflow test
3c319d8d4 metrics: Add iperf to gha run script
5b5caf890 gha: Add iperf network metrics
66db5b535 metrics: Add latency test to network README
c36572418 agent: avoid unnecessary calls to `Arc::clone`
4fbe0a3a5 runtime: bind-mount mounted block device into container
7e1b1949d runtime: add support for kata overlays
6c867d9e8 agent: add io.katacontainers.fs-opt.overlay-rw option
6163c3565 agent: skip mount options that start with "io.katacontainers."
b2ff97aa0 dragonball: use version 0.10.4 of `fuse-backend-rs`
845eeb4d7 agent: Allow clippy::redundant_clone in the unit tests
1163fc9de release: Revert kata-deploy changes after 3.2.0-rc0 release
3958a39d0 runtime-rs: Introduce directly attachable network
1e15369e5 metrics: Improve naming testing containers in launch times test
5dbe88330 metrics: Clean kata components before start a metric test.
3b45060b6 metrics: Add latency server yaml
9bb8451df metrics: Add latency client yaml
64fdb9870 metrics: Add network latency test
a81ad3b58 runtime-rs: Add block device handling in cloud hypervisor
3230dec95 kata-deploy: Use host's systemctl
1b21a4624 docs: Use control-plane term instead of master
28e5e9c86 runtime-rs: fix number of queues handling in dragonball share fs device
f1d8de9be runk: Allow runk to launch a container without pid namespace

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-20 14:44:50 +02:00
Fabiano Fidêncio
f6e20ac230 Merge pull request #7195 from fidencio/topic/adapt-kata-deploy-stable-to-using-ubuntu
kata-deploy-stable: Switch to using the ubuntu based payload
2023-10-20 14:42:04 +02:00
Fabiano Fidêncio
a93fdb014b kata-deploy-stable: Adapt to what we're using in the stable branch
This is basically to make sure that folks trying to use the kata-deploy
script from the main branch, to deploy **stable** kata-deploy images, do
not have a hard time.

Fixes: #7194

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-20 12:58:42 +02:00
James O. D. Hunt
79ed501a20 Merge pull request #8258 from jodh-intel/protection-fix-tdx-typo
libs: protection: Fix typo in TDX output
2023-10-20 08:36:22 +01:00
Dan Mihai
52aaf10759 agent: no endpoint blocking from agent-config.toml
Remove the ability to block access to kata agent endpoints by using
agent-config.toml. That functionality is now implemented using the
Agent Policy feature (#7573).

The CCv0 branch relied on blocking endpoints using agent-config.toml
but will set-up an equivalent default policy file instead (#8219).

Fixes: #8228

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-20 02:26:54 +00:00
Fabiano Fidêncio
468a3e4b53 Merge pull request #8260 from gkurz/fix-8259
ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
2023-10-19 23:58:22 +02:00
GabyCT
5d6bdbd0a1 Merge pull request #8241 from GabyCT/topic/enableagenttest
tests: Enable agent stability test
2023-10-19 14:12:49 -06:00
Greg Kurz
36109da93f ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
Fixes #8259

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-19 21:53:23 +02:00
GabyCT
dc295600b8 Merge pull request #8157 from GabyCT/topic/fixsevdoc
docs: Fix paths to build kernel in SNP VMs documentation
2023-10-19 11:42:03 -06:00
Gabriela Cervantes
d01daf749b tests: Adjust timeout for agent stability test
This PR adjusts the timeout for the agent stability test
to run on the gha.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-19 16:55:23 +00:00
James O. D. Hunt
9b14dda147 libs: protection: Fix typo in TDX output
Add the missing closing bracket to the output of the TDX details,
so rather than:

```bash
$ sudo kata-ctl env 2>/dev/null | grep available_guest_protection
available_guest_protection = "tdx (major_version: 1, minor_version: 0"
:                                                                    ^
:                                                           Missing ')' !
```

... we now have:

```bash
$ sudo kata-ctl env 2>/dev/null | grep available_guest_protection
available_guest_protection = "tdx (major_version: 1, minor_version: 0)"
:                                                                    ^
:                                                                   Aha!
```

Added a unit test for this scenario.
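
As an illustration only, the corrected output format can be sketched with a `Display` implementation. The `GuestProtection` enum below is a hypothetical, simplified stand-in and not the type used in the Kata libraries; only the output string is taken from the commit message above.

```rust
use std::fmt;

// Hypothetical, simplified stand-in for the guest protection type.
#[derive(Debug, PartialEq)]
enum GuestProtection {
    NoProtection,
    Tdx { major_version: u32, minor_version: u32 },
}

impl fmt::Display for GuestProtection {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GuestProtection::NoProtection => write!(f, "none"),
            // The closing ')' is the character that was missing.
            GuestProtection::Tdx { major_version, minor_version } => write!(
                f,
                "tdx (major_version: {}, minor_version: {})",
                major_version, minor_version
            ),
        }
    }
}
```

A unit test for this scenario would typically compare the formatted string against the full expected value, so a missing bracket fails the comparison.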

Fixes: #8257.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-19 16:06:08 +01:00
James O. D. Hunt
9336e2e492 Merge pull request #8155 from jodh-intel/runtime-rs-check-ch-tdx-build-feature
runtime-rs: ch: Add TDX CH features check
2023-10-19 14:13:08 +01:00
James O. D. Hunt
048cc70654 Merge pull request #8213 from jodh-intel/validate-hypervisor-cfg-name
runtime: Validate hypervisor section name in config file
2023-10-19 07:40:58 +01:00
Dan Mihai
99db6dff24 Merge pull request #8230 from microsoft/danmihai1/opa-data
tests: query data from the OPA service
2023-10-18 15:32:23 -07:00
James O. D. Hunt
0e0867f15d runtime-rs: ch: Add TDX CH features check
If you attempt to create a container (a TD) on a TDX system using a
custom build of Cloud Hypervisor (CH) that was not built with the `tdx`
CH feature, Kata will report the following, somewhat cryptic, CH error:

```
ApiError(VmBoot(InvalidPayload))
```

Newer versions of CH now report their build-time features in the ping
API response message so we now use that, if available, to detect this
scenario and generate a user-friendly error message instead.

This change improves the readability of `handle_guest_protection()` and
adds a couple of additional tests for that method.

Fixes: #8152.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-18 18:07:39 +01:00
James O. D. Hunt
409eadddb2 runtime-rs: ch: Improve readability of guest protection checks
Improve the way `handle_guest_protection()` is structured by inverting
the logic and checking the value of the `confidential_guest` setting
before checking the guest protection. This makes the code easier to
understand.

> **Notes:**
>
> - This change also unconditionally saves the available guest protection
>   (where previously it was only saved when `confidential_guest=true`).
>   This explains the minor unit test fix.
>
> - This change also errors if the CH driver finds an unexpected
>   protection (since only Intel TDX is currently tested).

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-18 18:06:02 +01:00
Greg Kurz
9863805752 Merge pull request #8201 from fidencio/topic/release-tag-repo-stop-tagging-the-tests-repo
release: tag_repos: Stop tagging the `tests` repo
2023-10-18 18:10:39 +02:00
Gabriela Cervantes
a58afe70b8 metrics: Add iperf udp benchmark
This PR adds the iperf UDP benchmark for bandwidth measurement
in the network metrics.

Fixes #8246

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-18 15:52:03 +00:00
Jianyong Wu
f9c9d8f645 runtime: QemuVirt: hotadd virtio-mem dev to pcie root port
Hotplug virtio-mem device to pcie root port for Qemu Virt.

Fixes: #7646
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-10-18 06:35:57 +00:00
Jianyong Wu
ef18c9550c runtime:qemuvirt: hotadd net dev to pcie root port
Hotplug the network device to a PCIe root port, as this is the only way
supported on QemuVirt.

Fixes: #7646
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-10-18 06:35:57 +00:00
Jianyong Wu
f1aec98f9d qemu/virt: use pcie_root_port to do device hotplug for virt
ACPI PCI device hotplug is not supported on the QEMU virt machine type.
The only way to hotplug a PCI device is native PCIe hotplug, so we need
to create PCIe root ports by default.

The number of PCIe root ports depends on the following:
1. one port reserved for the network device by default;
2. the virtio-mem device;
3. enough ports for the vhost-user-blk devices.

Fixes: #7646
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-10-18 06:35:57 +00:00
Jianyong Wu
28a41e1d16 runtime: add a new API for Network interface
Add a GetEndpointsNum API to the Network interface to get the number of
network endpoints. This is used to calculate the number of PCIe root
ports for QemuVirt.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-10-18 06:35:57 +00:00
Songqian Li
09d46450f1 dragonball: add metrics support for balloon device
Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-18 14:02:56 +08:00
Gabriela Cervantes
82a0814fc2 tests: Enable agent stability test
This PR enables the agent stability test for stability gha CI.

Fixes #8240

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-17 15:16:06 +00:00
Dan Mihai
32be8e3a87 tests: query data from the OPA service
Add example for querying json data from the OPA service.

Fixes: #8231

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-17 13:31:43 +00:00
David Esparza
d90d1c5c10 Merge pull request #8243 from dborquez/fix_systemctl_masked_query
metrics: fixes common.sh function to always return true
2023-10-16 20:17:24 -06:00
Dan Mihai
b81c0a6693 tests: encode policy file during test
Encode policy file during test - easier to understand than hard-coding
the encoded file contents.

Fixes: #8214

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-16 15:58:12 -07:00
David Esparza
4f9681b411 metrics: fixes common.sh function to always return true
This PR corrects the init_env() helper function so that
systemctl always returns true when enumerating masked services,
preventing the test from failing.

Fixes: #8242

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-16 15:57:57 -06:00
David Esparza
59e8b1d5a7 Merge pull request #8206 from dborquez/memory_footprint_test_removing_trailing_commas_to_make_json_results_file_valid
Memory footprint test removing trailing commas to make json results file valid
2023-10-16 14:31:28 -06:00
Gabriela Cervantes
2ef2b2a6dc docs: Fix paths to build kernel in SNP VMs documentation
This PR corrects the paths used to set up, build and install
the kernel for SNP.

Fixes #8156

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-16 20:09:02 +00:00
Fabiano Fidêncio
db37692f36 Merge pull request #8226 from microsoft/danmihai1/policy-typo
policy: allow access to ReseedRandomDev
2023-10-16 19:17:31 +02:00
Peng Tao
45e82b6581 Merge pull request #8192 from bergwolf/github/deps
runtime/kata-ctl: update dependencies
2023-10-16 16:39:17 +08:00
Chao Wu
44e602d69a Merge pull request #8014 from openanolis/chao/fix_nydus_break
runtime-rs : fix Nydus support for runtime-rs + Dragonball
2023-10-16 01:30:22 -05:00
Chao Wu
408b59c02c runtime-rs: fix bugs to support Nydus v5
1. enable virtio-fs-pro in Dragonball to have the ability to process nydus backend registry
2. change passthrough for rw layer's readonly config to false to have the accurate read write ability.

Fixes:#8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
Chao Wu
157caea9fe Revert "nydus: Temporarily skip tests on dragonball"
This reverts commit aba36ab188.

Fixes: #8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
Chao Wu
678fe3cd31 Dragonball: fix Nydus config serde problem
Since the Nydus snapshotter was updated in previous commits, the config
passed through to Dragonball during mount_rafs is RafsConfig instead of
ConfigV2, but Dragonball could only deserialize ConfigV2, so it panics.

We need to add support for RafsConfig.

Fixes:#8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
Dan Mihai
b6ec621389 policy: allow access to ReseedRandomDev
Allow access to the ReseedRandomDev endpoint by default. Using false
for ReseedRandomDevRequest was unintended.

Fixes: #8225

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-13 21:18:27 +00:00
David Esparza
908519db9d metrics: skips docker restart when it is not installed or is masked.
To avoid errors when initializing the test environment, the
kill_processes_before_start() helper function needs to verify that
docker is installed before attempting to stop it.
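
A minimal sketch of such a guard, assuming a POSIX shell. The names
`is_installed` and `stop_docker_if_present` are hypothetical and are not the
helpers used in the metrics scripts:

```bash
# Return success only if the named command is found on PATH.
is_installed() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: only try to stop docker when the binary exists.
stop_docker_if_present() {
    if is_installed docker; then
        sudo systemctl stop docker || true
    else
        echo "docker not installed, skipping"
    fi
}
```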

Fixes: #8218

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-13 18:02:00 +00:00
David Esparza
c2763120aa metrics: removing trailing comma characters from json file.
This PR removes trailing commas so that the json results
file is valid.

This PR also changes the way data results are collected by
iterating through the array of memory values to calculate
their average.
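
As an illustration only, averaging a list of values in a shell script could
look like the sketch below. The `average` helper is a hypothetical name and
the two-decimal output format is an assumption, not what the test emits:

```bash
# Print the arithmetic mean of the arguments, or nothing if none are given.
average() {
    printf '%s\n' "$@" |
        awk 'NF { sum += $1; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

average 10 20 30
```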

Fixes: #8204

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-13 18:00:57 +00:00
Beraldo Leal
5ef691528d tests: fixes permission denied when running test
After running cri-containerd/integration-tests twice we receive
permission denied during containerd clean.

Fixes: #8216

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-10-12 19:23:40 +00:00
GabyCT
1974d13122 Merge pull request #8188 from dborquez/metrics_add_fio_readme.md
metrics: removal of reference in the documentation to the fio dax subtest.
2023-10-12 10:53:55 -06:00
James O. D. Hunt
3e8cf6959c runtime: Validate hypervisor section name in config file
Previously, if you accidentally modified the name of the hypervisor
section in the config file, the default golang runtime gives a cryptic
error message ("`VM memory cannot be zero`"). This can be demonstrated
using the `kata-runtime` utility program which uses the same golang
config package as the actual runtime (`containerd-shim-kata-v2`):

```bash
$ kata-runtime env >/dev/null; echo $?
0
$ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml
$ kata-runtime env >/dev/null; echo $?
VM memory cannot be zero
1
```

The hypervisor name is now validated so that the behaviour becomes:

```bash
$ kata-runtime env >/dev/null; echo $?
0
$ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml
$ ./kata-runtime env >/dev/null; echo $?
/etc/kata-containers/configuration.toml: configuration file contains invalid hypervisor section: "foo"
1
```

Fixes: #8212.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-12 13:53:37 +01:00
James O. D. Hunt
45d28998d9 Merge pull request #8149 from jodh-intel/runtime-rs-ch-detect-tdx-version
runtime-rs: ch: Detect Intel TDX version
2023-10-12 10:09:42 +01:00
QuanweiZhou
f904e64155 Merge pull request #8179 from Apokleos/directvol-urlEncode
runitme-rs: use the same base64 as kata-runtime/direct-volume does
2023-10-12 09:04:11 +08:00
GabyCT
bc6eadf4f6 Merge pull request #8197 from GabyCT/topic/enablescability
tests: Enable scability test for stability CI
2023-10-11 16:41:46 -06:00
Archana Shinde
f814b1a0a2 Merge pull request #8073 from amshinde/runtime-rs-vfio-clh
runtime-rs: Add support for adding vfio device for cloud-hypervisor
2023-10-11 15:01:55 -07:00
Gabriela Cervantes
ef6388e815 tests: Remove unused function from scability test
This PR removes an unused function from the scalability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-11 19:44:21 +00:00
Fabiano Fidêncio
fbc8f8f466 scripts: Use install_yq from the kata-containers repo
As the file is already part of the kata-containers repo, and the tests
repo is about to become read-only, we're good to drop the tests
references from here and use everything coming from the
`kata-containers` repo instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-11 12:52:55 +02:00
Fabiano Fidêncio
65b1a2d277 release: tag_repos: Stop tagging / updating the tests repo
As we've moved all the tests to the `kata-containers` repo, the `tests`
repo will become a read-only repo.

Fixes: #8200

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-11 11:45:27 +02:00
James O. D. Hunt
87b760f569 runtime-rs: ch: Detect Intel TDX version
Improve the `GuestProtection` handling to detect the version of
Intel TDX available.

The TDX version is now logged by the Cloud Hypervisor driver.

Fixes: #8147.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-11 09:38:00 +01:00
alex.lyn
73e81f5e39 runitme-rs: unify base64 encoding for direct-volume
Direct-volume needs to use the same base64 character set as
kata-runtime/direct-volume does.
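
For illustration, the difference between the standard and the URL-safe base64
alphabets can be sketched in shell as follows. That the shared character set
is the URL-safe one is an assumption here, and `urlsafe_base64` is a
hypothetical helper name:

```bash
# Encode stdin as URL-safe base64: '+' becomes '-', '/' becomes '_'.
# Assumption: this is the alphabet both runtimes are meant to agree on.
urlsafe_base64() {
    base64 | tr -d '\n' | tr '/+' '_-'
}

printf '%s' '/run/kata/volume' | urlsafe_base64
```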

Fixes: #8175

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-10-11 14:00:13 +08:00
Gabriela Cervantes
c6463cb5ae tests: Fix path for versions yaml for soak parallel test
This PR fixes the path for versions yaml for soak parallel test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 22:29:20 +00:00
David Esparza
89c9454fca metrics: removal of reference in the documentation to the dax test.
This PR removes the reference in the documentation to the DAX
subtest of the FIO benchmark, because this metric is currently
WIP.

Fixes: #8159

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-10 15:55:59 -06:00
Gabriela Cervantes
30ff58904e tests: Enable scability test for stability CI
This PR enables the scalability test for stability CI gha.

Fixes #8196

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 19:59:57 +00:00
GabyCT
538131ab44 Merge pull request #8154 from GabyCT/topic/addstability
tests: Enable soak parallel stability test
2023-10-10 13:53:14 -06:00
Archana Shinde
8d6f7b9096 runtime-rs: Add support for handling vfio device for cloud-hypervisor
This change adds support for adding and removing vfio devices for
cloud-hypervisor.

Fixes: #6691

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-10 12:25:44 -07:00
Gabriela Cervantes
e786b2b019 gha: Add install dependencies for stability tests
This PR adds the install dependencies for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 16:05:48 +00:00
Chao Wu
936553ae79 Merge pull request #7505 from lisongqian/feat/dragonball_metrics
dragonball: vcpu metrics change to be recorded per vcpu
2023-10-10 10:52:40 -05:00
Wainer Moschetta
d311c3dd04 Merge pull request #7621 from wainersm/gha-run-local
ci: k8s: adapt gha-run.sh to run locally
2023-10-10 11:19:19 -03:00
David Esparza
93fef543e0 Merge pull request #8127 from dborquez/fix_iperf_check_kata_processes_issue
metrics: removes kata components and k8s deployment when test finishes
2023-10-10 07:05:24 -06:00
lisongqian
dbfe6512fc dragonball: vcpu metrics change to be recorded per vcpu
In this commit, the vcpu metrics in Dragonball will be changed to record per-vcpu.

Fixes: #7248

Signed-off-by: lisongqian <mail@lisongqian.cn>
2023-10-10 16:22:40 +08:00
lisongqian
fa60fbe023 dragonball: METRICS is refactored to RwLock<DragonballMetrics>
In this commit, the METRICS is refactored to RwLock<DragonballMetrics>.

Fixes: #7248

Signed-off-by: lisongqian <mail@lisongqian.cn>
2023-10-10 16:22:40 +08:00
Peng Tao
500d1c5cee kata-ctl: update rustls-webpki/webpki dependency
The old ones have security issues.
ref: https://github.com/briansmith/webpki/issues/69

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
d7660d82a0 runtime: unify gopkg.in/yaml.v3 to v3.0.1
The older versions have Denial of Service issues.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
fc9a107e8e runtime: unify swag and testify dependency
So that we don't need to depend on that many versions of them.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
79ebb959c5 runtime: update runc dependency to v1.1.9
To pick up security fixes.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
7f3e8bd65e runtime: unify golang.org/x/text to v0.7.0
The older versions contain security issues.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:45 +00:00
Peng Tao
df325ae371 runtime: update golang.org/x/net to v0.7.0
To pick up fix for the following issue:

A maliciously crafted HTTP/2 stream could cause excessive CPU
consumption in the HPACK decoder, sufficient to cause a denial of
service from a small number of small requests.

Fixes: #8190
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-10 03:56:39 +00:00
David Esparza
bba34910df metrics: stops kata components and k8s deployment when test finishes
This PR adds a trap so that whenever the script exits, it deletes the iperf
k8s deployment and k8s services, and deletes the kata components.

This way, when the script finishes, it verifies that there are
indeed no kata components still running.
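
The general pattern is an `EXIT` trap. The sketch below uses placeholder
`echo` commands instead of the real cleanup steps, and the function names are
hypothetical:

```bash
# Placeholder cleanup; the real script would delete the iperf deployment
# and services and remove the kata components here.
cleanup() {
    echo "cleanup: removing test deployment"
}

# The function body runs in a subshell so the EXIT trap fires when it ends,
# whether the benchmark succeeded or failed.
run_benchmark() (
    trap cleanup EXIT
    echo "benchmark: running"
)

run_benchmark
```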

Fixes: #8126

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-09 13:41:43 -06:00
Gabriela Cervantes
84e3d884e4 gha: Add general dependencies to stability tests
This PR adds the general dependencies to stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Gabriela Cervantes
dec3951ca5 tests: Add soak parallel stability test
This PR adds the soak parallel stability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Gabriela Cervantes
0f04d527d9 tests: Enable soak parallel test
This PR enables the soak parallel test for stability test.

Fixes #8153

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Wainer dos Santos Moschetta
e669282c25 ci: k8s: set KUBERNETES default value
The KUBERNETES variable is mostly used by kata-deploy to decide whether to
apply k3s specific deployments or not. It is used to select the type of
kubernetes to be installed (k3s, k0s, rancher...etc) and it is always
set on CI. Running the script locally we want to set a value by default
to avoid `KUBERNETES: unbound variable` errors.
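
A sketch of the usual shell idiom for this, assuming the script runs with
`set -u`. The default value `k3s` and the `resolve_kubernetes` name are
placeholders, not necessarily what the script uses:

```bash
# Print the Kubernetes flavour to use. With `set -u`, a plain "$KUBERNETES"
# would abort when the variable is unset; the ":-" expansion supplies a
# fallback instead.
resolve_kubernetes() (
    set -u
    KUBERNETES="${KUBERNETES:-k3s}"   # "k3s" is an assumed default
    echo "${KUBERNETES}"
)

resolve_kubernetes
```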

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
c30c3ff185 tests: run k8s-volume on a given node
This test can give a false positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
666993da8d tests: run k8s-file-volume on a given node
This test can give a false positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
3a00fc9101 tests: exec_host() now gets the node name
The exec_host() simply fails on multi-node clusters because
`kubectl get node -o name` will return a list of names. Moreover, it will
return control node names, which usually don't have kata installed.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
61c9c17bff tests: add get_one_kata_node() to tests_common.sh
The introduced get_one_kata_node() returns the first node that
has the kata-runtime=true label, i.e., supposedly a node with
kata installed.

This is useful for tests that should run on a specific worker
node on a multi-node cluster.
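
A minimal sketch of how such a helper might look. The full label key
`katacontainers.io/kata-runtime=true` is an assumption (the message above only
says `kata-runtime=true`), and the body may differ from the real
implementation:

```bash
# Print the name of the first node labelled as having kata installed.
get_one_kata_node() {
    local resource_name
    # Assumed label key; "-o name" prints entries as "node/<name>".
    resource_name="$(kubectl get node -l katacontainers.io/kata-runtime=true -o name | head -n 1)"
    # Strip the "node/" prefix so callers get the bare node name.
    echo "${resource_name#node/}"
}
```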

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
68f083c4d0 ci: k8s: set KATA_HYPERVISOR default value
Let KATA_HYPERVISOR be qemu by default in gh-run.sh as this variable
is required to tweak some configurations of kata-deploy.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
6677a61fe4 ci: k8s: configurable deploy kata timeout
The deploy-kata() of gha-run.sh waits 10 minutes for the kata
deploy installation to finish. This allows users of the script to override
that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment
variable.
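
A sketch of the override pattern, under the assumption that the timeout is a
duration string such as `10m`. The `deploy_wait_timeout` helper is
hypothetical:

```bash
# Default to 10 minutes unless the caller exported KATA_DEPLOY_WAIT_TIMEOUT.
# The "10m" duration format is an assumption.
deploy_wait_timeout() {
    echo "${KATA_DEPLOY_WAIT_TIMEOUT:-10m}"
}

# Possible use (needs a cluster, so it is left commented out):
#   kubectl -n kube-system wait --timeout="$(deploy_wait_timeout)" \
#       --for=condition=Ready -l name=kata-deploy pod
deploy_wait_timeout
```

A slower local machine could then run the script with, for example,
`KATA_DEPLOY_WAIT_TIMEOUT=30m` exported.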

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
200e542921 ci: k8s: shellcheck fixes to gha-run.sh
Fixed a couple of warnings shellcheck emitted and disabled others:
 * SC2154 (var is referenced but not assigned)
 * SC2086 (Double quote to prevent globbing and word splitting)

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
4af78be13a kata-deploy: re-format kata-[deploy|cleanup].yaml
The .tests/integration/kubernetes/gh-run.sh script runs `yq write` a
couple of times to edit the kata-[deploy|cleanup].yaml, resulting
in the file being formatted again. This is annoying because it leaves
the git tree dirty.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
d54e6d9cda ci: k8s: run_tests() for kcli
The only difference to the other platforms is that it needs to
export KUBECONFIG.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
c2ef1f0fb0 ci: k8s: add deploy-kata-kcli() to gh-run.sh
The deploy-kata-kcli() behaves like other deploy kata for
bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG
should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
d2be8eef1a ci: k8s: add cleanup-kcli() to gha-run.sh
The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev,
tdx...etc) except that KUBECONFIG should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
cbb9aa15b6 ci: k8s: set default image for deploy_kata()
On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and
DOCKER_TAG are exported to match the built image. However, when running
the script outside of CI context, a developer might just use the latest
image which in this case will be
`quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
89bef7d036 ci: k8s: create k8s clusters with kcli
Adapted the gha-run.sh script to create a Kubernetes cluster locally
using the kcli tool.

Use `./gha-run.sh create-cluster-kcli` to create it, and
`./gha-run.sh delete-cluster-kcli` to delete.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Fabiano Fidêncio
1280f85343 Merge pull request #8171 from bergwolf/github/fix-up-gha
GHA: fix up referenced yaml exceeding 20 limit problem
2023-10-09 09:37:03 +02:00
Peng Tao
954d40cce5 gha: combine coco jobs into a single yaml
So that we don't risk exceeding the GHA limit of 20 referenced yaml files
that easily.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-08 14:22:01 +00:00
Peng Tao
b60e0a9b57 gha: combine basic amd64 jobs into a single yaml
GHA has an undocumented limitation that there can be at most 20
referenced yamls in a single yaml file. We work around it by combining
multiple jobs into a single yaml file.

Fixes: #8161
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-10-08 13:55:01 +00:00
Fabiano Fidêncio
108db0a721 Merge pull request #8162 from sprt/sprt/unbreak-ci
gha: ci: Revert tracing test PR to unbreak CI
2023-10-08 10:13:46 +02:00
Aurélien Bombo
e9bd852113 gha: ci: Revert tracing test PR to unbreak CI
Revert "Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests"

This unbreaks CI as seen in https://github.com/kata-containers/kata-containers/actions/runs/6434757133

Fixes: #8161

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-10-06 14:13:17 -07:00
James O. D. Hunt
16fe81f27c Merge pull request #8124 from jodh-intel/ch-enable-feature
runtime-rs: ch: Enable feature
2023-10-06 13:02:08 +01:00
Fabiano Fidêncio
fa6786d1d7 Merge pull request #8117 from fidencio/topic/ci-add-runk-tests
gha: ci: Port runk tests over
2023-10-06 11:19:55 +02:00
Fabiano Fidêncio
8fec654716 Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests
ci: gha: Port tracing tests over
2023-10-06 10:06:57 +02:00
GabyCT
265f53e594 Merge pull request #8082 from dborquez/enable_fio_on_ctr
Enable fio test using containerd client
2023-10-05 17:26:22 -06:00
GabyCT
c8b9ec1cb5 Merge pull request #8108 from GabyCT/topic/ghastability
gha: Add stability tests workflow for gha
2023-10-05 17:10:10 -06:00
James O. D. Hunt
b8a46a4b85 runtime-rs: ch: Enable feature
Enable the Cloud Hypervisor driver (the `cloud-hypervisor` build feature) for the rust runtime.

Fixes: #6264.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-05 17:58:39 +01:00
Gabriela Cervantes
0f2dc8c675 gha: Add containerd stability tests to ci yaml
This PR adds containerd stability tests to ci yaml.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-05 15:21:24 +00:00
Fabiano Fidêncio
89f73e658d Merge pull request #8110 from fidencio/topic/gha-be-more-specific-about-the-arm-runners
gha: arm64: Ensure the builder is arm64-builder
2023-10-04 21:20:08 +02:00
Fabiano Fidêncio
da91c9df88 ci: Port runk tests to this repo
I'm basically moving the runk tests from the tests repo to this one, and
I'm adding the "Signed-off-by:" of every single contributor to the tests.

Fixes: #8116

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 20:41:29 +02:00
Fabiano Fidêncio
7f23772763 ci: Add placeholder for runk tests
The runk test has been executed as part of the former "ubuntu" jenkins
CI.

We're porting it to GHA and running it against LTS containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 20:40:32 +02:00
Fabiano Fidêncio
9205acc3d2 ci: Move tracing tests here
I'm basically moving the tracing tests from the tests repo to this one,
and I'm adding the "Signed-off-by:" of every single contributor to the
tests.

Fixes: #8114

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-10-04 20:02:27 +02:00
Gabriela Cervantes
85d290a048 gha: Add stability gha run script
This PR adds the stability gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 17:45:45 +00:00
Gabriela Cervantes
54f0c8f88e gha: Add stability tests workflow for gha
This PR adds the stability test workflow for gha for the kata CI.

Fixes #8107

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 16:32:13 +00:00
Fabiano Fidêncio
3bb2923e5d ci: Add placeholder for tracing tests
The tracing tests are currently running as part of the Jenkins CI with
the following setups:
* Container Engines: containerd
* VMMs: QEMU | Cloud Hypervisor
* Snapshotters: overlayfs | devmapper

We'll be restricting those tests to run on the LTS version of
containerd, without devmapper.

As is known, due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next iterations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 18:02:02 +02:00
Fabiano Fidêncio
2c3bf406dc ci: Create a function to install docker
This will be re-used in other tests as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 15:01:51 +02:00
Fabiano Fidêncio
c2cce12de5 Merge pull request #8100 from fidencio/topic/kata-deploy-build-agent
kata-deploy: Build kata-agent as we build all the other components
2023-10-04 11:56:03 +02:00
Steve Horsman
c430cc3707 Merge pull request #8098 from stevenhorsman/k8s-registry-suite
versions: migrate out of k8s.gcr.io
2023-10-04 10:51:39 +01:00
Fabiano Fidêncio
119f03de26 gha: arm64: Ensure the builder is arm64-builder
Otherwise we'll use any arm64 machine that's added as a runner, and
whenever new machines are added those may end up being only used for
running some specific set of the tests.

Fixes: #8109

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 11:08:11 +02:00
Fabiano Fidêncio
59b9380d1c Merge pull request #8093 from stevenhorsman/crictl-pod-config-update
doc: Update crictl pod-config
2023-10-04 10:49:04 +02:00
David Esparza
8c498ef5ee metrics: Use jq tool to pretty-print json metrics output
This PR enables the use of jq pretty-print feature to
improve the formatting of metric results json files.

Fixes: #8081

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-03 23:33:19 -06:00
David Esparza
a2159a6361 metrics: Enables FIO test for kata containers
FIO benchmark is enabled to measure IO in Kata
at different latencies using containerd client,
in order to complement the CI metrics testing set.

This PR also deprecates the previous FIO benchmark
based on k8s.

Fixes: #8080

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-03 23:32:38 -06:00
Fabiano Fidêncio
f337315952 Merge pull request #8106 from fidencio/topic/gha-fix-k0s-related-cis
gha: Fix k0s deployment
2023-10-03 21:47:40 +02:00
GabyCT
d1d9af5de2 Merge pull request #8085 from GabyCT/topic/stabilitytests
tests: Add stability test for kata CI
2023-10-03 11:28:49 -06:00
Fabiano Fidêncio
70e7ec3e23 gha: Fix k0s deployment
The tests are failing when setting up k0s, and that happens because we
download a kubectl binary matching the kubernetes version k0s is using,
and we do that by:
```
sudo k0s kubectl version --short 2>/dev/null | ...
```

With kubectl 1.28, which is now the default on k0s, `kubectl version
--short` has been removed, leaving us with an empty string, which then
causes the error in the CI.
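
One way to extract the version without `--short` is to parse the
`Client Version:` line of plain `kubectl version` output. This is a sketch
with hypothetical helper names, not necessarily the fix that was applied:

```bash
# Read `kubectl version` output on stdin and print the client version.
parse_client_version() {
    sed -n 's/^Client Version: //p' | head -n 1
}

# Fail loudly instead of continuing with an empty version string.
require_version() {
    if [ -z "$1" ]; then
        echo "could not determine kubectl version" >&2
        return 1
    fi
    echo "$1"
}
```

Checking for an empty result up front makes this kind of breakage show up as a
clear error instead of a failed download later on.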

Fixes: #8105

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 17:21:40 +02:00
Fabiano Fidêncio
560bbffb57 packaging: tools: Remove set -x leftover
This was used for debugging, and ended up being merged with that.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
18fa483d90 packaging: release: Mention newly added images
We've added two new containerd builder images recently, one for the
components under `src/tools` and another one for the Kata Containers
agent.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
ca3b888371 packaging: tools: Fix container image env var name
This should be TOOLS_CONTAINER_BUILDER instead of
VIRTIOFSD_CONTAINER_BUILDER.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
5ca66795c7 packaging: Allow passing the TOOLS_CONTAINER_BUILDER
This follows what we've been doing for all the components we're
building, but was missed as part of #8077.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
02acef9575 gha: Build the kata-agent as part of our workflows
The kata-agent binary won't be released, just built so it can be used,
later on, as part of our tests and as part of the rootfs build.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
5208386ab1 packaging: Build the kata-agent
Let's add the needed functions to start building the kata-agent, with or
without the OPA support.

For now this build is not used as part of the rootfs build, but later on
it will be (not as part of this series, though).

Fixes: #8099

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 15:33:55 +02:00
Fabiano Fidêncio
1727487eef agent: Allow specifying DESTDIR and AGENT_POLICY via env vars
This will help to build the agent binary as part of the kata-deploy
localbuild, as we need to pass the DESTDIR to where the agent will be
installed, and also whether we're building the agent with policy support
enabled or not.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 14:18:45 +02:00
Fabiano Fidêncio
45c1188839 packaging: Add get_agent_image_name()
This will be used for building the kata-agent.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 14:17:38 +02:00
Wainer dos Santos Moschetta
0db8fb8f98 versions: migrate out of k8s.gcr.io
The k8s.gcr.io registry has been deprecated for a while now and has been redirected to
registry.k8s.io. However on some bare-metal machines in our testing
pools that redirection is not working, so let's just replace the
registries.

Fixes #8098
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit b2c3bca558c38deff2117d5909d9071c23c05590)
2023-10-03 11:52:59 +01:00
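The registry replacement described above can be sketched as a plain prefix rewrite. This is only an illustration, not the change made in the commit: the function name `migrate_image` and the sample image references are hypothetical.

```python
OLD_REGISTRY = "k8s.gcr.io"
NEW_REGISTRY = "registry.k8s.io"


def migrate_image(ref: str) -> str:
    """Rewrite an image reference hosted on the old registry.

    References that do not start with the old registry are returned unchanged.
    """
    prefix = OLD_REGISTRY + "/"
    if ref.startswith(prefix):
        return NEW_REGISTRY + "/" + ref[len(prefix):]
    return ref


# Illustrative image names only.
print(migrate_image("k8s.gcr.io/pause:3.6"))
print(migrate_image("quay.io/example/image:1.0"))
```

Rewriting the reference directly avoids relying on the server-side redirect, which is the part that was failing on some machines.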
stevenhorsman
a1a0543671 doc: Fix spelling
Spell check failed with:
```
[kata-spell-check.sh:275] WARNING: Word 'overcommitment':
did you mean one of the following?: over commitment, over-commitment,
commitment
```
So update this to pass the static checks

Fixes: #
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-10-03 10:17:38 +01:00
Gabriela Cervantes
6339605a14 tests: Add general stability fixes
This PR adds general stability fixes.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-02 19:42:46 +00:00
stevenhorsman
59ae244442 doc: Update crictl pod-config
- Ensure that our documented crictl pod config file contents have
uid and namespace fields for compatibility with crictl 1.24+

This avoids a user potentially hitting the error:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"

getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```

Fixes: #8092
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(cherry picked from commit 8f8c2215)
2023-10-02 14:53:46 +01:00
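As a rough sketch of the pod config shape the commit above refers to, the snippet below builds a `metadata` section that carries `name`, `namespace`, `uid` and `attempt`. The helper name `make_pod_config` and the randomly generated uid are assumptions made for illustration; the documented file may use different values.

```python
import json
import uuid


def make_pod_config(name: str, namespace: str = "default") -> dict:
    """Build a minimal pod sandbox config with every metadata field populated."""
    return {
        "metadata": {
            "name": name,
            "namespace": namespace,
            "attempt": 1,
            # A non-empty uid is what the quoted error complains about.
            "uid": uuid.uuid4().hex,
        }
    }


config = make_pod_config("nydus-sandbox")
print(json.dumps(config, indent=2))
```

The resulting JSON can be saved to a file and passed to `crictl runp`.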
Gabriela Cervantes
fd19f4082f tests: Add agent stability test
This PR adds the agent stability test to the stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 22:37:02 +00:00
Gabriela Cervantes
215577032f tests: Add cassandra stress in stability tests
This PR adds the cassandra stress test to the stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 22:34:45 +00:00
GabyCT
a890ad3a16 Merge pull request #8066 from GabyCT/topic/urlvra
docs: Update url in kata vra document
2023-09-28 14:59:34 -06:00
Zvonko Kaiser
79e33c211c Merge pull request #7325 from zvonkok/vfio-sandbox-id-debug
gpu: Adding CDI support for cold and hot-plug of VFIO devices
2023-09-28 21:31:12 +02:00
Gabriela Cervantes
f2d3ea988d tests: Add stressng dockerfile for stability tests
This PR adds the stressng dockerfile for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:35:22 +00:00
Gabriela Cervantes
6493aa309e tests: Add stressor CPU test for stability tests
This PR adds the stressor CPU test for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:33:08 +00:00
Gabriela Cervantes
ef68a3a36b metrics: Add stability test for kata CI
This PR adds the stability test for kata containers repository.

Fixes #8084

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:23:36 +00:00
David Esparza
f7ef45b167 Merge pull request #8077 from fidencio/topic/kata-deploy-ship-the-tools
kata-deploy: build & ship the rust components from src/tools/
2023-09-28 09:59:19 -06:00
Zvonko Kaiser
7c934dc7da gpu: Fix cold-plug of VFIO devices
We need to do proper sandbox sizing when we're doing cold-plug. Introduce CDI,
the de-facto standard for enabling devices in containers. containerd
will pass through annotations for accumulated CPU, memory and now CDI
devices. With that information, sandbox sizing can be derived correctly.

Fixes: #7331

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-09-28 09:49:13 +00:00
GabyCT
fcc755fc3b Merge pull request #8068 from GabyCT/topic/limitlatency
metrics: Add latency value limits for kata CI
2023-09-27 13:28:41 -06:00
Greg Kurz
defbb64ac8 Merge pull request #8036 from rye-stripe/bugfix/overhead-metrics
runtime: fix reading cgroup stats of sandboxes
2023-09-27 19:39:55 +02:00
Archana Shinde
95455e6fe8 Merge pull request #8058 from likebreath/0925/clh_v35.0
Upgrade to Cloud Hypervisor v35.0
2023-09-27 10:39:32 -07:00
Gabriela Cervantes
8d66ef5185 metrics: Increase qemu jitter value
This PR increases qemu jitter value.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:31:07 +00:00
Gabriela Cervantes
5600e28b54 metrics: Increase jitter value for clh
This PR increases jitter value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:30:19 +00:00
Fabiano Fidêncio
a6b1f5e21b ci: Build src/tools components as part of our tests / releases
Build those as part of our CI and release workflows.

Fixes #5520 #5348

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:50:25 +02:00
Fabiano Fidêncio
501a168a81 kata-deploy: Build components from src/tools
Let's add targets and actually enable users and ourselves to build those
components in the same way we build the rest of the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:49:02 +02:00
Fabiano Fidêncio
6ef42db5ec static-build: Add scripts to build content from src/tools
As we'd like to ship the content from src/tools, we need to build them
in the very same way we build the other components, and the first step
is providing scripts that can build those inside a container.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:56 +02:00
Fabiano Fidêncio
4d08ec29bc packaging: Add get_tools_image_name()
This will be used for building all the (rust) components from src/tools.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:35 +02:00
Fabiano Fidêncio
98097c96de packaging: Use git abbreviated hash
This will make it easier to build images that rely on the hashes of
several directories.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:30 +02:00
Fabiano Fidêncio
8b25e90027 Merge pull request #8075 from fidencio/topic/ci-add-kata-monitor-tests
ci: Port kata-monitor tests from Jenkins to GHA
2023-09-27 15:48:46 +02:00
Fabiano Fidêncio
489caf1ad0 ci: kata-monitor: Move tests over
Let's move, adapt, and use the kata-monitor tests from the tests repo.
In this PR I'm keeping the SoB of every single contributor who
touched those tests in the past.

Fixes: #8074

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-27 11:40:31 +02:00
Fabiano Fidêncio
a3fb067f1b ci: Add placeholder for kata-monitor tests
The kata-monitor tests are currently running as part of the Jenkins CI
with the following setups:
* Container Engines: CRI-O | containerd
* VMMs: QEMU

When using containerd, we're testing it with:
* Snapshotter: overlayfs | devmapper

We will stop running those tests on devmapper / overlayfs as that would
hardly catch a functionality issue.

Also, we're restricting this to run with the LTS version of containerd,
when containerd is used.

As is known, due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next iterations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
57cb4ce204 ci: Make install_kata aware of container engines
This will help us when running tests using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
de1eeee334 ci: Create a generic install_crio function
This will serve us quite well in the upcoming test additions, which will
also have to be executed using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
64a2000859 ci: Add install_cni_plugins helper
This will become handy when doing tests with CRI-O, as CRI-O doesn't
install the CNI plugins for us.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
8132fe15c9 ci: Modify containerd default config
Let's ensure we have runc running with `SystemdCgroups = false`,
otherwise we'll face failures when running tests depending on runc on
Ubuntu 22.04, with LTS containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:16:12 +02:00
Chelsea Mafrica
a49bc68374 runtime-rs: Update status for pause and resume
Pause and resume task do not currently update the status of the
container to paused or running, so fix this. This is specifically for
pausing the task and not the VM.

Fixes #6434

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-09-26 17:22:47 -07:00
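The status bookkeeping described above can be modelled as a tiny state machine. This is a sketch in Python rather than the runtime-rs code; the `Container` class and `Status` enum are hypothetical names, and the call to the agent is only indicated by a comment.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    PAUSED = "paused"


class Container:
    """Toy model of a container task whose status follows pause/resume."""

    def __init__(self) -> None:
        self.status = Status.RUNNING

    def pause(self) -> None:
        if self.status is not Status.RUNNING:
            raise RuntimeError("only a running task can be paused")
        # (the real runtime would ask the agent to pause the task here)
        self.status = Status.PAUSED  # the update that was previously missing

    def resume(self) -> None:
        if self.status is not Status.PAUSED:
            raise RuntimeError("only a paused task can be resumed")
        # (the real runtime would ask the agent to resume the task here)
        self.status = Status.RUNNING


c = Container()
c.pause()
print(c.status.value)
c.resume()
print(c.status.value)
```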
Gabriela Cervantes
8cb7df1bed metrics: Add checkmetrics for latency test
This PR adds the checkmetrics for latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 19:11:08 +00:00
Gabriela Cervantes
e90440ae24 metrics: Add qemu latency value limit
This PR adds the qemu latency value limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:30:09 +00:00
Gabriela Cervantes
a74a8f8a9d metrics: Add latency value limits for kata CI
This PR adds latency value limits for kata CI.

Fixes #8067

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:29:07 +00:00
Gabriela Cervantes
d7def8317a metrics: Fix general check static warnings
This PR fixes general check static warnings.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 16:30:59 +00:00
GabyCT
309103169d Merge pull request #8056 from GabyCT/topic/fixlatencypath
metrics: Fix latency yamls path
2023-09-26 10:16:55 -06:00
Gabriela Cervantes
928553d1ba docs: Update url in kata vra document
This PR updates the url in kata vra document.

Fixes #8065

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 16:13:12 +00:00
GabyCT
5c0afaacf4 Merge pull request #8018 from GabyCT/topic/fixreadme
metrics: Fix metrics README
2023-09-26 09:51:47 -06:00
David Esparza
83326f89b3 Merge pull request #8054 from GabyCT/topic/fixcrdoc
metrics: Fix C-Ray documentation
2023-09-26 09:50:19 -06:00
James O. D. Hunt
31478b9c33 Merge pull request #7944 from jodh-intel/runtime-rs-ch-enable-tdx
runtime-rs: ch: Enable Intel TDX
2023-09-26 14:11:12 +01:00
James O. D. Hunt
b0a3293d53 runtime-rs: ch: Enable Intel TDX
Allow Cloud Hypervisor to create a confidential guest (a TD or
"Trust Domain") rather than a VM (Virtual Machine) on Intel systems
that provide TDX functionality.

> **Notes:**
>
> - At least currently, when built with the `tdx` feature, Cloud Hypervisor
>   cannot create a standard VM on a TDX capable system: it can only create
>   a TD. This implies that on TDX capable systems, the Kata Configuration
>   option `confidential_guest=` must be set to `true`. If it is not, Kata
>   will detect this and display the following error:
>
>   ```
>   TDX guest protection available and must be used with Cloud Hypervisor (set 'confidential_guest=true')
>   ```
>
> - This change expands the scope of the protection code, changing
>   Intel TDX specific booleans to more generic "available guest protection"
>   code that could be "none" or "TDX", or some other form of guest
>   protection.

Fixes: #6448.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 10:55:25 +01:00
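The rule from the notes above (a TDX capable system requires `confidential_guest=true` with Cloud Hypervisor) can be sketched as follows. The enum and function names are hypothetical and the code is a Python model, not the runtime-rs implementation; only the error text is taken from the commit message.

```python
from enum import Enum


class GuestProtection(Enum):
    """Generic "available guest protection" value instead of a TDX boolean."""
    NONE = "none"
    TDX = "tdx"


TDX_ERROR = (
    "TDX guest protection available and must be used with "
    "Cloud Hypervisor (set 'confidential_guest=true')"
)


def validate_guest_protection(available: GuestProtection,
                              confidential_guest: bool) -> None:
    """Raise if TDX is available but the config does not ask for a TD."""
    if available is GuestProtection.TDX and not confidential_guest:
        raise ValueError(TDX_ERROR)


validate_guest_protection(GuestProtection.TDX, True)    # accepted
validate_guest_protection(GuestProtection.NONE, False)  # accepted
```

Using an enum rather than a boolean leaves room for other forms of guest protection later, which matches the second note in the commit message.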
James O. D. Hunt
523399c329 runtime-rs: ch: Add more consts
Introduce a few new constants (for PCI segment count and FS queues) and
move the disk queue constants to `convert.rs` to allow them to be used
there too.

> **Note:**
>
> This change gives the `ShareFs` code its own set of values rather
> than relying on the disk queue constants.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
dea8065811 runtime-rs: ch: Remove unused function
Delete the `handle_pending_devices_after_boot()` function which is no
longer required.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
995f2c015f runtime-rs: ch: Only handle particular pending device types
Modify the Cloud Hypervisor `add_device()` method to add `ShareFs` and
`Network` devices to the list of pending devices since only these two
device types need to be cached before VM startup. Full details in the
comments.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
b1b96a5c49 runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check
Remove the `VIRTIO_BLK_MMIO` check which appears to have been added
erroneously in the first place.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
Gabriela Cervantes
9ac29b8d38 metrics: Add init_env function to latency test
This PR adds the init_env function to latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 22:06:00 +00:00
Bo Chen
dfd0c9fa9a runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8057

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-25 12:22:37 -07:00
Bo Chen
8f9f087e35 versions: Upgrade to Cloud Hypervisor v35.0
Details of this release can be found in our roadmap project as iteration
v35.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #8057

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-25 12:22:01 -07:00
Fabiano Fidêncio
a4daa86535 Merge pull request #8028 from fidencio/topic/ci-test-with-crio-part-2
ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI
2023-09-25 18:40:42 +02:00
Gabriela Cervantes
81c8babca9 metrics: Fix latency yamls path
This PR fixes the latency yamls path for the latency test for
kata metrics.

Fixes #8055

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:52:24 +00:00
Gabriela Cervantes
4815736820 metrics: Fix C-Ray documentation
This PR fixes the C-Ray documentation for kata metrics.

Fixes #8052

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:27:58 +00:00
Fabiano Fidêncio
ef63d67c41 ci: crio: Trail '\r' from exec_host() output
We've faced this as part of the CI, only happening with the CRI-O tests:
```
 not ok 1 Test readonly volume for pods
 # (from function `exec_host' in file tests_common.sh, line 51,
 #  in test file k8s-file-volume.bats, line 25)
 #   `exec_host "echo "$file_body" > $tmp_file"' failed with status 127
 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass
 # bash: line 1: $'\r': command not found
 #
 # Error from server (NotFound): pods "test-file-volume" not found
```

I must say I didn't dig into figuring out why this is happening, but we
may be safe enough to just strip the '\r', as long as all the tests keep
passing on containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 16:42:18 +02:00
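A minimal sketch of the workaround described above: drop carriage returns from captured command output before it is used again. The actual fix lives in the shell test helpers and may be implemented differently; the function name here is hypothetical.

```python
def strip_carriage_returns(output: str) -> str:
    """Remove every '\\r' so CRLF-terminated output becomes LF-terminated."""
    return output.replace("\r", "")


raw = "file-body\r\n"
clean = strip_carriage_returns(raw)
print(repr(clean))
```

Without this step, a stray `\r` at the end of a line can be interpreted by the shell as a command of its own, which matches the `$'\r': command not found` error quoted in the commit message.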
Fabiano Fidêncio
74c12b2927 ci: crio: Enable default capabilities
We need the default capabilities to be enabled, especially `SYS_CHROOT`,
in order to have tests accessing the host to pass.

A huge thanks to Greg Kurz for spotting this and suggesting the fix.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-25 14:56:15 +02:00
Fabiano Fidêncio
358dc2f569 kata-deploy: Fix CRI-O detection
Some of the "k8s distros" allow using CRI-O in a non-official way, and
if that's done we cannot simply assume they're on containerd, otherwise
kata-deploy will simply not work.

In order to avoid such an issue, let's check for `cri-o` as the container
engine first and only proceed with the checks for the "k8s
distros" after we rule out that CRI-O is being used.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
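The ordering fix described above can be sketched as follows: test for CRI-O first and fall back to the distro-specific checks only afterwards. The function name, the inputs and the distro list are assumptions made for illustration; kata-deploy itself is a shell script and its real checks may differ.

```python
from typing import Optional

# Hypothetical set of distro markers that have their own handling.
KNOWN_DISTROS = ("k3s", "rke2", "k0s")


def detect_runtime(runtime_version: str, distro: Optional[str] = None) -> str:
    """Pick the container engine to configure.

    runtime_version is assumed to look like "cri-o://1.28.0" or
    "containerd://1.7.2".
    """
    # CRI-O must be checked first: a distro may be running it unofficially.
    if "cri-o" in runtime_version:
        return "cri-o"
    if distro in KNOWN_DISTROS:
        return distro
    return "containerd"


print(detect_runtime("cri-o://1.28.0", distro="k0s"))
print(detect_runtime("containerd://1.7.2", distro="k3s"))
```

If the distro check ran first, the first call would return `k0s` and the CRI-O configuration would never be applied.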
Fabiano Fidêncio
ebaa4fa4c1 ci: crio: Pass -y to apt
That was something overlooked during my tests. :-/

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
GabyCT
11cf0e2d28 Merge pull request #8038 from GabyCT/topic/latency
metrics: Enable latency test in gha run script
2023-09-22 16:57:53 -06:00
GabyCT
3ef57b335e Merge pull request #8045 from jepio/fix-docker-ownership
local-build: Fix .docker ownership before build-payload
2023-09-22 14:43:38 -06:00
Archana Shinde
9bb9a3e7a4 Merge pull request #7966 from amshinde/runtime-rs-network-clh
runtime-rs: Add network support for cloud-hypervisor
2023-09-22 13:08:09 -07:00
Gabriela Cervantes
97e73b2234 metrics: Fix spelling warnings
This PR fixes general spelling warnings detected by the spelling check.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:50:51 +00:00
Gabriela Cervantes
36c8cd6f1f metrics: Fix metrics README
This PR fixes the network metrics section in the README by keeping only
the tests that we currently have in our kata metrics.

Fixes #8017

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:28:58 +00:00
Fabiano Fidêncio
c5a5a0c95e Merge pull request #8012 from arronwy/strip
osbuild: Reduce guest components binary size with strip
2023-09-22 15:45:38 +02:00
Fabiano Fidêncio
9d190f2390 Merge pull request #8042 from GabyCT/topic/pandoc
gha: Add pandoc as a dependency for static checks
2023-09-22 15:31:18 +02:00
Jeremi Piotrowski
15425a2b80 local-build: Fix .docker ownership before build-payload
The permissions on .docker/buildx/activity/default are regularly broken by us
passing docker.sock + $HOME/.docker to a container running as root and then
using buildx inside. Fixup ownership before executing docker commands.

Fixes: #8027
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-22 13:44:53 +02:00
Jeremi Piotrowski
a5338e885e Merge pull request #8030 from portersrc/8027-ci-rootfs-image-build-asset-is-failing-oras
ci: rootfs-image build-asset is failing
2023-09-22 11:07:50 +02:00
Chao Wu
6f98fbafde Merge pull request #6706 from guixiongwei/feat/thp
feat(runtime-rs): introduce huge page mode to select VM RAM's backend
2023-09-22 15:27:06 +08:00
Gabriela Cervantes
13ca7d9f97 gha: Add pandoc as a dependency for static checks
To avoid the failure caused by the pandoc command not being found, this PR
adds that package as a dependency for static checks.

Fixes #8041

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 20:14:41 +00:00
Jeremi Piotrowski
28dd5ae91e Merge pull request #7799 from UiPath/clh-directio-support
clh: Direct IO support for block devices
2023-09-21 19:16:08 +02:00
David Esparza
6de9f39895 Merge pull request #8020 from GabyCT/topic/fixhunspell
gha: Install hunspell for static checks
2023-09-21 10:58:40 -06:00
Gabriela Cervantes
08bc8e4db4 metrics: Add latency benchmark for gha
This PR adds the latency benchmark for gha for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 16:14:39 +00:00
Gabriela Cervantes
6776b55d7e metrics: Enable latency test in gha run script
This PR enables the latency test for gha run script for kata metrics.

Fixes #8037

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 16:11:58 +00:00
Peteris Rudzusiks
94e2ccc2d5 runtime: fix reading cgroup stats of sandboxes
The cgroup stats come from resourcecontrol package in the form of pointers
to structs. The sandbox Stat() method incorrectly was expecting structs.
This caused the cpu and memory stats to always be 0, which in turn caused
incorrect pod overhead metrics.

Fixes #8035

Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
2023-09-21 17:00:53 +02:00
Alexandru Matei
d507d189bb fc: Add support for noflush cache option
Firecracker supports noflush semantics via the Unsafe cache type.
There is no support for direct i/o, so remove it from the config file.

Fixes: #7823

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-09-21 14:48:24 +03:00
Alexandru Matei
2ca781518a clh: Direct IO support for block devices
Clh supports direct i/o for disks. It doesn't
offer any support for noflush, so stop passing
that option to the cloud-hypervisor internal config.

Fixes: #7798

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-09-21 14:48:24 +03:00
Fabiano Fidêncio
dd27912f31 Merge pull request #8032 from fidencio/topic/ci-make-push-after-build-be-trigger-by-workflow-dispatch
ci: Trigger payload-after-push on workflow_dispatch
2023-09-21 10:25:24 +02:00
Fabiano Fidêncio
0c95697cc4 ci: Trigger payload-after-push on workflow_dispatch
This will allow us to easily test failures and fixes on that workflow.

Fixes: #8031

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-21 09:24:13 +02:00
Chris Porter
28cbc3b51c ci: rootfs-image build-asset is failing
Fixes: #8027

Signed-off-by: Chris Porter <porter@ibm.com>
2023-09-21 00:58:42 -05:00
Fabiano Fidêncio
21f6f9a173 Merge pull request #8016 from fidencio/topic/ci-test-with-crio-part-1
ci: Actually enable the CRI-O tests
2023-09-21 07:42:27 +02:00
Wainer Moschetta
87e64a07ed Merge pull request #7979 from beraldoleal/gogo-removal
protocol: remove gogoprotobuff tests
2023-09-20 22:38:10 -03:00
Gabriela Cervantes
87a8616488 gha: Install hunspell for static checks
Seems like the static checks are failing due to the missing hunspell
package; this PR fixes that.

Fixes #8019

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-20 16:58:10 +00:00
Fabiano Fidêncio
8c3c50ca8a ci: Actually enable the CRI-O tests
The test has been added to the repo, but we have to also add it to the
list of jobs to be executed.

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 18:01:25 +02:00
David Esparza
03554c799a Merge pull request #8006 from fidencio/topic/ci-test-with-crio-part-0
ci: k8s: Also run tests with CRI-O
2023-09-20 07:45:17 -06:00
Fabiano Fidêncio
c6a9e50c37 Merge pull request #8004 from microsoft/danmihai1/quoted-spaces
runtime: support kernel params including spaces
2023-09-20 12:10:51 +02:00
Wang, Arron
3a6510ad61 osbuild: Reduce guest components binary size with strip
opa_linux_amd64_static 38M => 27M
kata-agent 30M => 23M

ls -alh opa_linux_amd64_static
-rw-rw-r-- 1 arron arron 38M Jul 28 01:59 opa_linux_amd64_static
➜ kata-containers git:(main) ✗ strip opa_linux_amd64_static
➜ kata-containers git:(main) ✗ ls -alh opa_linux_amd64_static
-rw-rw-r-- 1 arron arron 27M Sep 20 16:12 opa_linux_amd64_static

ls -alh ./usr/bin/kata-agent
-rwxr-xr-x. 1 root root 30M Jul 30 23:41 ./usr/bin/kata-agent
ls -alh ./usr/bin/kata-agent
-rwxr-xr-x. 1 root root 23M Sep 20 16:13 ./usr/bin/kata-agent

Fixes: #8011

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-09-20 16:23:17 +08:00
Fabiano Fidêncio
07a6e63a6b ci: k8s: rke2: Use sudo to call systemd
Otherwise we'll face the following error:
```
Failed to enable unit: Interactive authentication required.
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 08:48:29 +02:00
Fabiano Fidêncio
03b82e8484 ci: k8s: Add a CRI-O test
Let's make sure we'll also be testing k8s using CRI-O.

For now, we'll only be running the CRI-O test with QEMU.  Once it
becomes stable we can expand this to other Hypervisors as well.

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
d7105cf7a4 ci: k8s: Add a method to install CRI-O
This is based on the official CRI-O documentation[0] and right now we're
making this specific to Ubuntu as that's what we have as runners.

We may want to expand this in the future, but we're good for now.

[0]:
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
54c0a471b1 ci: k8s: k0s: Allow passing parameters to the k0s installer
We'll need this in order to setup k0s with a different container engine.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
31ef64606c Merge pull request #8007 from fidencio/topic/ci-kata-deploy-fix-garm-runner-name
ci: kata-deploy: Fix runner name
2023-09-20 00:58:33 +02:00
Beraldo Leal
730ef51693 deps: updating dependencies
Updating dependencies after make check, make test.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-19 16:54:35 -04:00
GabyCT
6111ef6fb6 Merge pull request #7990 from GabyCT/topic/parallelbandwidth
metrics: Enable parallel bandwidth iperf limit
2023-09-19 14:52:21 -06:00
Fabiano Fidêncio
3a2c83d69b ci: kata-deploy: Fix runner name
It should be garm-ubuntu-2004-smaller instead of garm-ubuntu-2004-small.

Fixes: #7890

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 22:34:37 +02:00
Dan Mihai
82ff2db460 runtime: support kernel params including spaces
Support quoted kernel command line parameters that include space
characters. Example:

dm-mod.create="dm-verity,,,ro,0 736328 verity 1
/dev/vda1 /dev/vda2 4096 4096 92041 0 sha256
f211b9f1921ef726d57a72bf82be23a510076639fa8549ade10f85e214e0ddb4
065c13dfb5b4e0af034685aa5442bddda47b17c182ee44ba55a373835d18a038"

Fixes: #8003

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-19 20:26:38 +00:00
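To illustrate the behaviour described above, the sketch below splits a kernel command line on whitespace while keeping double-quoted values, spaces included, as part of a single parameter. It is a simplified Python model and not the runtime's parser; the function name is hypothetical and escape sequences are not handled.

```python
from typing import List


def split_kernel_params(cmdline: str) -> List[str]:
    """Split on whitespace, except inside double quotes."""
    params: List[str] = []
    current: List[str] = []
    in_quotes = False
    for ch in cmdline:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)  # keep the quotes so the kernel sees them
        elif ch.isspace() and not in_quotes:
            if current:
                params.append("".join(current))
                current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated double quote in kernel parameters")
    if current:
        params.append("".join(current))
    return params


line = 'quiet dm-mod.create="dm-verity,,,ro,0 736328 verity 1" console=ttyS0'
for param in split_kernel_params(line):
    print(param)
```

A naive split on spaces would break the `dm-mod.create` value into several bogus parameters, which is the case the commit addresses.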
Beraldo Leal
604a9dd673 protocol: remove gogoprotobuff tests
This is part of a bigger effort to drop gogoprotobuff from our code
base. IIUC, those options are basically used by *pb_test.go, and since
we are dropping gogoprotobuff and those are auto generated tests, let's
just remove it.

Fixes #7978.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-19 12:55:42 -04:00
Fabiano Fidêncio
5560e72024 Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support
ci: kata-deploy: Enable all k8s flavours that we support
2023-09-19 17:44:34 +02:00
Fabiano Fidêncio
f7fa7f602a ci: Enable kata-deploy tests for all the supported k8s flavours
Let's ensure we test kata-deploy on RKE2 and k0s as well.

Fixes: #7890

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
2c908b598c ci: kata-deploy: Add the ability to deploy rke2
This will be very useful in the near future, when we start testing
kata-deploy with rke2 as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
eaf6164916 ci: kata-deploy: Add the ability to deploy k0s
This will be very useful in the near future, when we start testing
kata-deploy with k0s as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
0015257636 ci: kata-deploy: Add deploy-k8s argument to gha-run.sh
We'll be using exactly the same code used for the k8s tests, which are
already deploying k3s on GARM.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
bf2cb02283 ci: kata-deploy: Expand tests to run on k0s / rke2
We just need to make sure the correct overlay is applied, following what
we already have been doing for k3s.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
6d5d844e5c Merge pull request #7983 from sprt/resource-group-naming
ci: Create clusters in individual resource groups
2023-09-19 12:54:21 +02:00
Fabiano Fidêncio
b12b9e1886 ci: kata-deploy: Add placeholder for tests on GARM
We'll be testing kata-deploy with different kubernetes flavours as part
of our GARM tests, and this is a place-holder for this.

Once enabled, we'll do nothing, just `return 0`, so we can then properly
add the tests after this commit gets merged.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:42:02 +02:00
Fabiano Fidêncio
9e1fb8a966 ci: kata-deploy: Export KUBERNETES env var
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targeting.

This was also done as part of fa62a4c01b,
for the k8s tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
09cc0ed438 ci: Move deploy_k8s() to gha-run-k8s-common.sh
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
1829f5c049 Merge pull request #7992 from skaegi/virtiofsd-1.8.0
versions: Bump virtiofsd to v1.8.0
2023-09-19 11:52:49 +02:00
Fabiano Fidêncio
486fe14c99 ci: Properly set K8S_TEST_UNION
Otherwise only the first test will be executed

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
d9ef1352af ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but
that exceeds the maximum number of characters allowed for the cluster
name.  With this in mind, let's use the first letter of
K8S_TEST_HOST_TYPE instead.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
68267a3996 ci: Create clusters in individual resource groups
This makes it so that each AKS cluster is created in its own individual
resource group, rather than using the "kataCI" resource group for all
test clusters.

This is to accommodate a tool that we recently introduced in our Azure
subscription which automatically deletes resource groups after a set
amount of time, in order to keep spending under control.

The tool will automatically delete any resource group, unless it has a
tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the
resource group will be retained until the specified date.

Note that I tagged all current resource groups in our subscription with
SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing
resources.

Fixes: #7982

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:55 +02:00
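The retention rule described above can be sketched as a small date check. The cleanup tool itself is not shown in the commit, so the function name and the assumption that the tagged date is inclusive are both hypothetical.

```python
from datetime import date
from typing import Dict

TAG_NAME = "SkipAutoDeleteTill"


def is_protected(tags: Dict[str, str], today: date) -> bool:
    """Return True while a resource group must be kept.

    Assumes the tag value is formatted as YYYY-MM-DD and that the group is
    retained up to and including that day.
    """
    value = tags.get(TAG_NAME)
    if value is None:
        return False
    return today <= date.fromisoformat(value)


print(is_protected({TAG_NAME: "2043-01-01"}, date(2023, 9, 19)))
print(is_protected({}, date(2023, 9, 19)))
```

Test clusters created in their own untagged resource groups are therefore eligible for automatic deletion, while the pre-existing tagged groups are kept.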
Fabiano Fidêncio
84c0d59d23 Merge pull request #7985 from fidencio/topic/clh-use-static_sandbox_resource_mgmt-as-default-on-arm
clh: arm: Use static_sandbox_resource_mgmt=true
2023-09-19 09:25:34 +02:00
Gabriela Cervantes
9aa8d1c917 metrics: Add parallel bandwidth limit for qemu
This PR adds the parallel bandwidth limit for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-18 21:08:54 +00:00
Simon Kaegi
44c7c082d9 versions: Bump virtiofsd to v1.8.0
https://gitlab.com/virtio-fs/virtiofsd/-/releases/v1.8.0 was released two weeks ago. We have fully tested and are using this version.

Also bumps toolchain version to match what virtiofsd used.

Fixes: #7960

Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
2023-09-18 15:21:15 -04:00
Fabiano Fidêncio
5f8e210d3b Merge pull request #7961 from ChengyuZhu6/update_nydus
Bump nydus versions and update nydus tests
2023-09-18 21:02:20 +02:00
Fabiano Fidêncio
c3ee913bf6 Merge pull request #7953 from gkurz/extra-monitor-socket
runtime/qemu: Rework QMP/HMP support
2023-09-18 19:04:14 +02:00
Gabriela Cervantes
af59d4bf4a metrics: Enable parallel bandwidth iperf limit
This PR enables the parallel bandwidth iperf limit for kata metrics.

Fixes #7989

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-18 16:32:11 +00:00
Fabiano Fidêncio
aba36ab188 nydus: Temporarily skip tests on dragonball
We're hitting a specific issue after updating, which will require some
work on dragonball before it can be re-added here.

The issue:
```
...
3: failed to do rafs mount\\n
4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n
5: add share fs mount\\n
6: Mount rafs at
   /rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower
   error: Failed to Mount backend
...

Caused by:
vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a
backend filesystem failed:: missing field `version` at line 1 column
489\\\"))\"): unknown"
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b8a8dfcd15 nydus: Use kata-${KATA_HYPERVISOR} instead of kata
This will ensure we're testing with the correct runtime, instead of
using the `default` one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
ChengyuZhu6
f6df3d6efb static-build: Fix arch error on nydus build
Fix the arch error when downloading the nydus tarball.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Steven Horsman <steven@uk.ibm.com>
2023-09-18 17:40:06 +02:00
ChengyuZhu6
2f9c9e2e63 tests: nydus: Update nydus tests
To support the v0.12.0 nydus-snapshotter, we need to update the config
files and the commandline to start nydus-snapshotter.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
c9a4e7e46d versions: Bump nydus and nydus-snapshotter to its latest release
As we need https://github.com/containerd/nydus-snapshotter/pull/530 in.

Fixes #7984

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b73bde320d gha: nydus: Populate run()
And with this we finally enable the nydus tests to run as part of our
GHA CI.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b3904a1a30 gha: nydus: Populate install_dependencies()
Let's have all the dependencies needed for running the nydus tests
installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
d2b3b67f5d gha: nydus: Actually install kata when install-kata is called
We've been simply doing nothing whenever `install-kata` was called, and
that was the intent when we added the placeholder calls.

Now, let's install kata, as expected. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
0ec00ad42e gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
As we've added install_nydus() and install_nydus_snapshotter(), which do
conform with the pattern we're following on GHA, let's rely on them
rather than relying on the bits coming from nydus_test.sh.

Later on we'll have install_nydus() and install_nydus_snapshotter() as
part of the dependencies install in our `gha-run.sh`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
568439c77b tests: nydus: Add timeout to the crictl calls
Similarly to what's been done for the cri-containerd tests, as part of
84dd02e0f9, we need to add the timeout
here for the crictl calls.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
5ac3b76eb1 tests: nydus: Add uid / namespace to the nydus container / sandbox
Otherwise we may face errors like:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"

getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
376574a16c tests: nydus: Decorate some calls with sudo
Otherwise we cannot properly start the nydus snapshotter, nor properly
kill it after it's been started.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
4290fd4b67 tests: nydus: Adapt "source ..." to GHA
The "source ..." we've been doing was not changed since those tests were
part of the Jenkins tests, and we need to adapt them, either setting the
correct path or entirely removing the ones that are not relevant to us
anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
a84efa3e87 tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
As that's what we've been using as part of the GHA.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
56a14b3950 tests: common: Add install_nydus_snapshotter()
This function will be used to download and install the
nydus-snapshotter, and it follows the same pattern we already have
introduced for downloading and installing other dependencies from
GitHub.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b6563783e2 tests: common: Add install_nydus()
This function will be used to download and install nydus, and it follows
the same pattern we already have introduced for downloading and
installing other dependencies from GitHub.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
72599f1911 clh: arm: Use static_sandbox_resource_mgmt=true
Users have noticed that this is needed, as CLH does not yet implement a
way to hotplug resources on aarch64.

With this patch, when building for x86_64, I can see that this is the
resulting config:
```
$ ARCH=amd64 make
...

$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=false

```

And when building for aarch64:
```
$ ARCH=arm64 make
...

$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=true
```
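
The per-architecture default shown above can be summarised with a small sketch. It only mirrors the two outputs quoted in this message; the function names are hypothetical, and treating every other architecture like amd64 is an assumption:

```python
def default_static_sandbox_resource_mgmt(arch: str) -> bool:
    """Default for the Cloud Hypervisor config, keyed on the build architecture."""
    # arm64/aarch64 cannot hotplug resources with CLH yet, so resources are static.
    # Assumption: every other architecture keeps the previous default (false).
    return arch in ("arm64", "aarch64")


def render_setting(arch: str) -> str:
    """Format the value the way it appears in configuration-clh.toml."""
    value = default_static_sandbox_resource_mgmt(arch)
    return f"static_sandbox_resource_mgmt={str(value).lower()}"
```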

Fixes: #7941

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 14:14:10 +02:00
Jeremi Piotrowski
dfa6af54df Merge pull request #7806 from jongwu/clh_serial
clh:arm64: use arm AMBA UART for hypervisor debug
2023-09-18 12:29:07 +02:00
Greg Kurz
1f16b6627b runtime/qemu: Rework QMP/HMP support
PR #6146 added the possibility to control QEMU with an extra HMP socket
as an aid for debugging. This is great for development or bug chasing
but this raises some concerns in production.

The HMP monitor allows tampering with the VM state in a variety of ways.
This could be intentionally or mistakenly used to inject subtle bugs in
the VM that would be extremely hard if not even impossible to debug. We
definitely don't want that to be enabled by default.

The feature is currently wired to the `enable_debug` setting in the
`[hypervisor.qemu]` section of the configuration file. This setting has
historically been used to control "debug output" and it is used as such
by some downstream users (e.g. OpenShift). Forcing people to have the
extra HMP backdoor at the same time is abusive and dangerous.

A new `extra_monitor_socket` is added to `[hypervisor.qemu]` to give
fine-grained control over whether the HMP socket is wanted or not. This setting
is still gated by `enable_debug = true` to make it clear it is for
debug only. The default is to not have the HMP socket though. This
isn't backward compatible with #6416 but it is for the sake of "better
safe than sorry".

An extra monitor socket makes the QEMU instance untrusted. A warning is
thus logged to the journal when one is requested.

While here, also allow the user to choose between HMP and QMP for the
extra monitor socket. Motivation is that QMP offers way more options to
control or introspect the VM than HMP does. Users can also ask for
pretty json formatting well suited for human reading. This will improve
the debugging experience.

This feature is only made visible in the base and GPU configurations
of QEMU for now.
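
A minimal sketch of the gating described above, written in Python for brevity rather than taken from the runtime. The function name and the exact value spellings ("hmp", "qmp", empty for none) are assumptions:

```python
from typing import Optional

VALID_MONITORS = ("hmp", "qmp")  # assumed spellings of the two choices


def effective_extra_monitor(enable_debug: bool,
                            extra_monitor_socket: str) -> Optional[str]:
    """Return the monitor type to expose, or None when no extra socket is wanted."""
    if not extra_monitor_socket:
        return None  # default: no extra monitor socket
    if extra_monitor_socket not in VALID_MONITORS:
        raise ValueError(f"unsupported monitor type: {extra_monitor_socket!r}")
    if not enable_debug:
        return None  # the setting is gated by enable_debug = true
    return extra_monitor_socket
```

With this shape, `enable_debug = true` alone no longer opens a monitor socket; both settings have to be present.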

Fixes #7952

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-18 12:13:01 +02:00
Greg Kurz
cab46c9e23 Merge pull request #7973 from fidencio/topic/ci-use-bigger-machine-sizes-for-the-needed-tests-part-0
ci: Use variable size of VMs depending on the tests running
2023-09-18 12:06:44 +02:00
Fabiano Fidêncio
0e3bfac3b3 Merge pull request #7976 from fidencio/topic/ci-static-checks-rework-part-0
ci: Rework static checks
2023-09-18 11:01:18 +02:00
Peng Tao
6eedd9b0b9 Merge pull request #7738 from Xuanqing-Shi/7732/handle-non-empty-endpoints-in-RemoveEndpoints
runtime: incorrect handling of non-empty []Endpoint parameter in Remo…
2023-09-18 10:58:28 +08:00
Fabiano Fidêncio
8b1e9b0c75 ci: static-checks: Clean up static-checks job
Now that the static-checks job only takes care of running the
static-checks, let's clean it up, remove all the unneeded steps, make
sure that we're using the actions in their latest version, and have it
running in a cost free runner.

At some point I'd like to see those tests done in parallel, in the same
way that I've organised the build-checks, but that's something for
someone else, at some other time.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 14:23:02 +02:00
Fabiano Fidêncio
2c5ca2eaf8 ci: static-checks: Run tests depending on KVM
With this we're removing the dragonball static-checks CI, as the test is
running here now. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 14:22:38 +02:00
Fabiano Fidêncio
509c309ab2 ci: static-checks: Move "sudo make test" to the new test matrix
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` and `make check` checks.

This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:23 +02:00
Fabiano Fidêncio
4e963cedf4 ci: static-checks: Move "make test" to the new test matrix
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` and `make check` checks.

This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:17 +02:00
Fabiano Fidêncio
08f2e5ae0b runtime-rs: Ensure static-checks-build is a dep of make test
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `config`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:13 +02:00
Fabiano Fidêncio
2bc3a616ae kata-ctl: Use loop instead of kvm module in tests
This makes it possible to run the tests in the cost free runners, which
are not KVM capable.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:08 +02:00
Fabiano Fidêncio
46daddc500 kata-ctl: Ensure GENERATED_CODE is a dep of make test
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `version`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:01 +02:00
Fabiano Fidêncio
ec826f328f agent: Ensure GENERATED_CODE is a dep of make test
Otherwise `make test` will fail with:
```
error[E0583]: file not found for module `version`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:57 +02:00
Fabiano Fidêncio
1d32410a83 ci: install_libseccomp: Do not depend on the tests repo
It makes things way simpler, waaaaay simpler.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:49 +02:00
Fabiano Fidêncio
bf888b9a5e ci: static-checks: Move "make check" to the new test matrix
We're moving it out of the previous "static-checks" confusing matrix,
and adding it to the matrix that was currently being used for the `make
vendor` checks.

This will allow us to have one job per component, and with that we can
easily run those in parallel and on the zero cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:45 +02:00
Fabiano Fidêncio
473ec87806 kata-ctl: Add kata-types to the Cargo.lock file
Commit message covered everything. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:40 +02:00
Fabiano Fidêncio
ea19549a99 kata-ctl: Ensure GENERATED_CODE is a dep of make check
Otherwise `make check` would fail with:
```
Error writing files: failed to resolve mod `version`:
/home/runner/work/kata-containers/kata-containers/src/tools/kata-ctl/src/ops/version.rs
does not exist make: *** [../../../utils.mk:176: standard_rust_check] Error 1
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:36 +02:00
Fabiano Fidêncio
e125775863 tests: install_rust: Also install clippy
clippy is used as part of our tests, so it's useful to have it installed
while we're already installing rust.

In case of developers, they also better be using it. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:31 +02:00
Fabiano Fidêncio
e2c61a152c ci: static-checks: Move vendor check to its own job
Similarly to the static-check jobs, those jobs can be run on the zero
cost runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:30 +02:00
Fabiano Fidêncio
6794d4c843 tests: Move install_rust.sh from the tests repo
We'll use it as part of the refactoring we're doing in the static check
tests.

I can see a lot of other uses of this, but changing all of them to this
one is out of the scope for this PR.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:29 +02:00
Fabiano Fidêncio
e64508c308 tests: install_go: Remove tests repo dependency
We can rely on the functions that are now part of the common.bash.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:28 +02:00
Fabiano Fidêncio
11dff731b7 tests: Move functions from kata_arch script here
We can use this a lot as part of our CI, but right now I'm just moving
those here with the intent to use later on in this series.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:28 +02:00
Fabiano Fidêncio
75c974c802 ci: static-checks: Move kernel config check to its own job
It doesn't make sense to run this for all the bits of the matrix,
nor is it demanding enough to require running this in one of our
Azure sponsored runners.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:25 +02:00
Archana Shinde
9c233bb9e0 test: Add test to verify try_from for clh Netconfig
Add tests to verify conversion from runtime NetworkConfig
to clh specific config.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-09-16 00:24:14 -07:00
Fabiano Fidêncio
c69a1e33bd ci: Use variable size of VMs depending on the tests running
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.

Now, about the commit itself, as we're on the run to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
in 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that require the normal Azure instance (D4s_v5)

With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.
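
The selection logic could look roughly like the sketch below. The host-type labels and the function name are hypothetical; the message above only names the two instance sizes and the bare-metal case:

```python
# Hypothetical labels for the kind of host the tests run on.
SMALL, NORMAL, BAREMETAL = "small", "normal", "baremetal"


def select_tests(host_type: str, small_tests: list, normal_tests: list) -> list:
    """Pick the test files to run for a given kind of host."""
    if host_type == SMALL:
        return list(small_tests)   # fits in a D2s_v5
    if host_type == NORMAL:
        return list(normal_tests)  # needs a D4s_v5
    if host_type == BAREMETAL:
        return list(small_tests) + list(normal_tests)  # run both groups
    raise ValueError(f"unknown host type: {host_type!r}")
```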

Fixes: #7972

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 09:13:54 +02:00
Archana Shinde
9049d311df runtime-rs: Add network support for cloud-hypervisor
This PR adds support for adding a network device before starting the
cloud-hypervisor VM.

Support for adding and removing network devices is not really added to
the resource manager, so supporting this for cloud-hypervisor is not
scoped in this PR.

This also changes "pending_devices" for clh implementation from an
Option of vector to simply a vector. This simplifies the structure a bit
as we can simply iterate over the pending devices instead of having to
check for a "Some" value as this is not really required.

Fixes: #6333

Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-09-15 23:25:20 -07:00
Greg Kurz
79c494eb4e Merge pull request #7969 from fidencio/topic/ci-cache-using-oras-part-3
ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage
2023-09-15 16:30:22 +02:00
Fabiano Fidêncio
eecd5bf2aa ci: cache: Fix ovmf-sev cache
The cached tarball is relying on the component name, thus it's important
to set it correctly, otherwise we'll end up always building it.

With this patch applied:
```
≡ ⨯ make ovmf-sev-tarball
make ovmf-sev-tarball-build
make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers'
/home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh  --build=ovmf-sev
sha256:67cc94e393dc1d5bfc2b77a77e83c9b1c0833d0fbbebaa9e9e36f938bb841fcc
Build kata version 3.2.0-rc0: ovmf-sev
INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/destdir
Downloading a76f5522493f ovmf-sev-builder-image-version
Downloading 7e98c854bd94 kata-static-ovmf-sev.tar.xz
Downloading 559311973ff8 ovmf-sev-version
Downloaded  a76f5522493f ovmf-sev-builder-image-version
Downloading 353b655c2297 ovmf-sev-sha256sum
Downloaded  559311973ff8 ovmf-sev-version
Downloaded  353b655c2297 ovmf-sev-sha256sum
Downloaded  7e98c854bd94 kata-static-ovmf-sev.tar.xz
Pulled [registry] ghcr.io/kata-containers/cached-artefacts/ovmf-sev:latest-main-x86_64
Digest: sha256:933236c2c79e53be3ca7acc0b966d0ddac9c0335edcb1e8cad8b9bb3aaf508ce
kata-static-ovmf-sev.tar.xz: OK
INFO: Using cached tarball of ovmf-sev
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/kata/
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/kata/share/
drwxr-xr-x runner/runner     0 2023-09-15 10:34 ./opt/kata/share/ovmf/
-rwxr-xr-x runner/runner 4194304 2023-09-15 10:34 ./opt/kata/share/ovmf/AMDSEV.fd
~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir
~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir
make[1]: Leaving directory '/home/ffidenci/src/upstream/kata-containers/kata-containers'
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 12:39:22 +02:00
Fabiano Fidêncio
86c41074b4 ci: cache: Check the sha256sum of the component
We've removed this in the part 2 of this effort, as we were not caching
the sha256sum of the component.  Now that this part has been merged,
let's get back to checking it.
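
For illustration only, a checksum comparison of this kind can be sketched as follows. The helper names are hypothetical and this is not the script used in the build; it just shows the check a cached tarball has to pass before it is reused:

```python
import hashlib
from typing import Iterable, Iterator


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash a stream of byte chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path: str, size: int = 1 << 20) -> Iterator[bytes]:
    """Yield a file's content in fixed-size blocks to keep memory use flat."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(size)
            if not block:
                break
            yield block


def cached_tarball_is_valid(path: str, expected_hex: str) -> bool:
    """True when the tarball on disk matches the cached sha256sum."""
    return sha256_hex(read_chunks(path)) == expected_hex.strip().lower()
```

When the comparison fails, the safe fallback is to ignore the cache and build the component.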

Fixes: #7834 -- part 3

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 12:34:30 +02:00
Fabiano Fidêncio
f5e52d02d3 Merge pull request #7964 from fidencio/topic/ci-cache-using-oras-part-2
ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component}
2023-09-15 12:29:28 +02:00
Fabiano Fidêncio
2fe0b494da Merge pull request #7959 from fidencio/topic/ci-run-on-smaller-garm-instances
ci: Run some of the GARM tests in smaller instances
2023-09-15 11:30:13 +02:00
Fabiano Fidêncio
460988c5f7 ci: cache: Remove the script used to cache artefacts on Jenkins
That's not needed anymore, as we've switched to using ORAS and an OCI
registry to cache the artefacts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 10:27:55 +02:00
Fabiano Fidêncio
4533a7a416 ci: cache: Also store the ${component} sha256sum
This is something that was done by our Jenkins jobs, but that I ended up
missing when writing d0c257b3a7.

Now, let's also add the sha256sum to the cached artefact, and in an
upcoming PR (after this one is merged) we will also start checking for
that.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 10:25:26 +02:00
Fabiano Fidêncio
eccc76df63 ci: cache: Use the cached artefacts from ORAS
In the previous series related to the artefacts we build, we've
switched from storing the artefacts on Jenkins, to storing those in the
ghcr.io/kata-containers/cached-artefacts/${artefact_name}.

Now, let's take advantage of that and actually use the artefacts coming
from that "package" (as GitHub calls it).

NOTE: One thing that I've noticed that we're missing, is storing and
checking the sha256sum of the artefact.  The storing part will be done
in a different commit, and the checking the sha256sum will be done in a
different PR, as we need to ensure those were pushed to the registry
before actually taking the bullet to check for them.

Fixes: #7834 -- part 2

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 10:13:47 +02:00
Jeremi Piotrowski
6f30d00ae7 Merge pull request #7956 from fidencio/topic/ci-reduce-the-machine-size-used
ci: Reduce the size of the AKS VMs
2023-09-15 08:49:08 +02:00
Steve Horsman
1b8f3fa9ae Merge pull request #7957 from fidencio/topic/ci-cache-using-oras-part-1
ci: cache: Allow pushing our artefacts to an OCI registry
2023-09-15 07:45:24 +01:00
Jianyong Wu
7f5e77bcb8 kernel: enable Arm pl011 support
Enable pl011 (ttyAMA0) support in kernel for aarch64.

Fixes: #5080
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-09-15 01:45:16 +00:00
Jianyong Wu
241c355e07 clh:arm64: use arm AMBA uart for hypervisor debug
Cloud Hypervisor on arm64 only supports the Arm AMBA UART (pl011) as
a tty. So, the console should be set to "ttyAMA0" instead of "ttyS0"
when enabling hypervisor debug mode.

Fixes: #5080
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-09-15 01:44:23 +00:00
Fabiano Fidêncio
094b6b2cf8 ci: k8s: Temporarily disable tests that require a bigger VM instance
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI

We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 01:33:19 +02:00
GabyCT
6fe5cd3bd5 Merge pull request #7937 from GabyCT/topic/iperfbandwidth
metrics: Add iperf value for cpu utilization
2023-09-14 16:47:19 -06:00
Fabiano Fidêncio
d0c257b3a7 ci: cache: Push cached artefacts to ghcr.io
Let's push the artefacts to ghcr.io and stop relying on jenkins for
that.

Fixes: #7834 -- part 1

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:39:57 +02:00
Fabiano Fidêncio
108f1b60dd kata-deploy: Generate latest_{artefact,image_builder} files
Right now this is not used, but it'll be used when we start caching the
artefacts using ORAS.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:39:57 +02:00
Fabiano Fidêncio
be2eb7b378 ci: cache: Install ORAS in the kata-deploy binaries builder container
ORAS is the tool which will help us to deal with our artefacts being
pushed to and pulled from a container registry.

As both the push to and the pull from will be done inside the
kata-deploy binaries builder container, we need it installed there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:39:57 +02:00
Fabiano Fidêncio
fb24fb0dc1 ci: k8s: devmapper: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:27:05 +02:00
Fabiano Fidêncio
1daf02f5d4 ci: nydus: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:41 +02:00
Fabiano Fidêncio
e60d81f554 ci: nerdctl: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:41 +02:00
Fabiano Fidêncio
4db416997c ci: docker: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:41 +02:00
Fabiano Fidêncio
32841827b8 ci: cri-containerd: Use a smaller / cheaper VM instance
We don't need to run on a D4s_v5, as those tests are not CPU / memory
intensive.  With this in mind, let's use a smaller version of the
instance, the D2s_v5 one.

Fixes: #7958

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 00:25:35 +02:00
Fabiano Fidêncio
92fff129fd ci: k8s: Don't set cpu limit request for k8s-inotify test
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Fabiano Fidêncio
faf98c0623 ci: Reduce the size of the AKS VMs
We do **not** need a very powerful machine for our tests, as we're not
building anything there.

The instance we switched to (Standard_D2s_v5) still has nested virt
available, as shown here[0], but has half of the amount of vCPUs /
Memory, which should be fine only for running the tests, costing us
basically half of the price[1].

[0]:
https://learn.microsoft.com/en-us/azure/virtual-machines/dv5-dsv5-series
[1]:
https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/#pricing

Fixes: #7955

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Fabiano Fidêncio
adc18ecdb1 ci: cache: For consistency, read all used env vars
Instead of having some of them only being considered if explicitly
passed to the script.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 20:24:48 +02:00
Fabiano Fidêncio
c7a851efd7 ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker
As the environment variables are now being passed down from the GitHub
Actions, let's make sure they're exposed to the container used to build
the kata-deploy binaries, and during the build process we'll be able to
use those to log in and push the artefacts to the OCI registry, using
ORAS.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 20:24:48 +02:00
Fabiano Fidêncio
2e8b41f39c Merge pull request #7954 from fidencio/topic/ci-cache-using-oras-part-0
ci: cache: Export env vars needed to use ORAS
2023-09-14 20:23:55 +02:00
Fabiano Fidêncio
6bd15a85d5 ci: cache: Export env vars needed to use ORAS
We do the build of our artefacts inside a container image, and we need
to expose some env vars to the container so ORAS can be used there to
push the artefacts we want to cache to ghcr.io.

The env vars we're exposing are:
* ARTEFACT_REGISTRY: The registry where we're going to save the
  artefacts.
* ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as
  ORAS does not use the same json file used by docker.
* ARTEFACT_REGISTRY_PASSWORD: The password to log in to the registry,
  as ORAS does not use the same json file used by docker.
* TARGET_BRANCH: The target branch, which will be part of the tag of the
  artefact, as we may end up caching the artefacts for both main and
  stable branches.
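
A small sketch of how a script might check that these variables are present before attempting a push. The helper names are hypothetical, and the `latest-<branch>-<arch>` tag layout is an assumption modelled on the cached-artefacts references seen elsewhere in this log:

```python
import os

REQUIRED_VARS = (
    "ARTEFACT_REGISTRY",
    "ARTEFACT_REGISTRY_USERNAME",
    "ARTEFACT_REGISTRY_PASSWORD",
    "TARGET_BRANCH",
)


def missing_vars(env: dict) -> list:
    """Return the required variables that are unset or empty, in a stable order."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def artefact_tag(target_branch: str, arch: str) -> str:
    """Assumed tag layout: the branch is part of the tag, as described above."""
    return f"latest-{target_branch}-{arch}"


if __name__ == "__main__":
    absent = missing_vars(dict(os.environ))
    if absent:
        print("not pushing, missing: " + ", ".join(absent))
```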

Fixes: #7834 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 19:36:33 +02:00
Gabriela Cervantes
cd4fd1292a metrics: Add iperf cpu utilization limit for qemu
This PR adds the iperf cpu utilization limit for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-14 17:17:47 +00:00
Gabriela Cervantes
df5cd10ea0 metrics: Add iperf value for cpu utilization
This PR adds the iperf value for cpu utilization for kata metrics.

Fixes #7936

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-14 16:06:49 +00:00
Jeremi Piotrowski
b54dd8cdf4 Merge pull request #7704 from jepio/vfio-part-1
gha: vfio: Import test script
2023-09-14 16:45:31 +02:00
Jeremi Piotrowski
a96050a7ad tests: Apply timeout to 'ctr t kill'
This task has been observed to hang at times.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
9d93036783 tests/vfio: Bump VM image to Fedora 38
We need a very recent L2 guest kernel to fix all the bugs that occur in nested
virtualization.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
faee59b520 tests/vfio: Accept single device in vfio group for CLH
cloud hypervisor does not emulate pcie switches or pci bridges, so we need to
accept a lonely device.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
df3dc1105c tests/vfio: Get rid of sync's
It is fine to start a VM with the disk image without syncing it as we now run
the test in an ephemeral Azure instance.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
7211c3dccc gha: vfio: Set test timeout to 15m
Sometimes the test gets stuck running commands in the container - need to
investigate why later.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
1b02f89e4f packaging: kernel: Enable VIRTIO_IOMMU on x86_64
Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is
enabled. We need to add it to the whitelist because dragonball uses kernel
v5.10 which restricted VIRTIO_IOMMU to ARM64 only.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
3a1db7a86b runtime: clh: Support enabling iommu
by enabling IOMMU on the default PCI segment. For hotplug to work we need a
virtualized iommu and clh exposes one if there is some device or PCI segment
that requests it. I would have preferred to add a separate PCI segment for
hotplugging vfio devices but unfortunately kata assumes there is only one
segment all over the place. See create_pci_root_bus_path(),
split_vfio_pci_option() and grep for '0000'.

Enabling the IOMMU on the default PCI segment requires enabling IOMMU on
every device that is attached to it, which is why it is sprinkled all over the
place.

CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for
that device.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
9f1a42c6cc tests/vfio: Give commands 30s to execute
This is to catch the case of the guest getting stuck.
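
The idea, sketched in Python rather than in the test's own shell (the function name is hypothetical and the 30s default simply mirrors the subject line):

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(argv: Sequence[str],
                     timeout_s: float = 30.0) -> Optional[subprocess.CompletedProcess]:
    """Run a command, returning None instead of hanging when it exceeds the timeout."""
    try:
        return subprocess.run(list(argv), capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None  # the child is killed by subprocess.run before this is raised


if __name__ == "__main__":
    result = run_with_timeout([sys.executable, "-c", "print('guest is alive')"])
    print("stuck" if result is None else result.stdout.strip())
```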

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
b46b0ecf8b tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms
This shouldn't be hiding behind only a qemu check, we need this for clh as
well.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
bfc93927fb runtime: Remove redundant check in checkPCIeConfig
There is no way for this branch to be hit, as port is only set when it is
different than config.NoPort.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
7c4e73b609 runtime: Add test cases for checkPCIeConfig
These test cases show which options are valid for CLH/Qemu, and test that we
correctly catch unsupported combinations.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
fc51e4b9eb runtime: Check config for supported CLH (cold|hot)_plug_vfio values
The only supported options are hot_plug_vfio=root-port or no-port.
cold_plug_vfio is not supported yet.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
509771e6f5 runtime: clh: Add hot_plug_vfio entry to config
hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to
CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have
not implemented support for cold_plug_vfio in CLH yet.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
5f6475a28a tests/vfio: Gather debug info and disable tdp_mmu
tdp_mmu had some issues up until around Linux v6.3 that make it work
particularly badly when running nested on Hyper-V. Reload the module at the start
of the test and disable the tdp_mmu param.

Gather debug info at the end of the test to make it easier to figure out what
went wrong. This uses github actions group syntax so that each section can be
collapsed.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
8fffdc81c5 tests/vfio: Capture journal from vm
For debugging (though this doesn't get exposed yet).

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
df815087e7 tests/vfio: Change to get the test working in GHA
- reduce memory and cpu usage to fit in a D4s_v5
- source correct lib
- mount workspace from 9p
- disable cpu mitigations for speed
- drop unused commands and variables
- install containerd
- install kata from built artifacts

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
a92ddeea15 tests/vfio: Move dependency installation to gha-run.sh
To match the flow of other github actions workflows.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
5a551a85b1 gha: vfio: Import jobs scripts from tests repo
This imports the vfio test scripts from github.com/kata-containers/tests. The test
case doesn't work yet but doing the changes in a separate commit will make it
easier to track the changes. The only change in this commit is renaming
vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh

Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Fabiano Fidêncio
a1e3fa7ac4 Merge pull request #7905 from microsoft/danmihai1/mariner-annotations
tests: fix kernel and initrd annotations
2023-09-14 10:37:42 +02:00
GabyCT
1d331124ad Merge pull request #7925 from GabyCT/topic/bandwidthlimit
metrics: Add iperf bandwidth value for kata metrics
2023-09-13 17:43:55 -06:00
Gabriela Cervantes
49e2fa189c metrics: Increase jitter value for qemu
This PR increases the jitter value for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-13 22:36:09 +00:00
Gabriela Cervantes
49234433a7 metrics: Increase value limit for jitter in clh
This PR increases the value limit for jitter in clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-13 21:27:08 +00:00
David Esparza
0a24d3f718 Merge pull request #7923 from GabyCT/topic/addcassandradoc
metrics: Add Cassandra Metrics documentation
2023-09-13 10:17:00 -06:00
GabyCT
c565053bac Merge pull request #7895 from GabyCT/topic/removewarning
metrics: Remove warning from metrics documentation
2023-09-13 10:16:38 -06:00
Fabiano Fidêncio
8b9df1d32e Merge pull request #7929 from fidencio/topic/use-tcp-port-ping-on-docker-nerdctl-tests
ci: docker: nerdctl: Switch to tcp port 80 ping
2023-09-13 15:46:31 +02:00
Peng Tao
55ca7e8aec Merge pull request #7907 from Xuanqing-Shi/7876/network-devices-naming-conflict
runtime: Naming conflict of network devices
2023-09-13 19:29:41 +08:00
Fabiano Fidêncio
813bfdec01 ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
This will ensure that we're calling the correct binary for the
hypervisor.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:10:14 +02:00
Fabiano Fidêncio
46bc0b1c01 ci: nerdctl: Create the containerd config
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.

This is mostly needed because the nerdctl-full tarball doesn't provide a
containerd configuration, just the binary, as containerd does not
actually require a configuration file to run with the default config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
13968aa7f6 ci: nerdctl: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without explicit outbound
connectivity defined.

This leads to issues when using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments; after all, this is the
internet. :-)

Anyway, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test to one that provides nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
e0c811678b ci: docker: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without explicit outbound
connectivity defined.

This leads to issues when using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments; after all, this is the
internet. :-)

Anyway, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test to one that provides nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
shixuanqing
1636abbe1c runtime: issue with non-empty []Endpoint in RemoveEndpoints
In RemoveEndpoints(), when the endpoints parameter isn't empty,
using idx may result in the wrong endpoints being removed. To fix this,
pass the endpoint itself, which makes it possible to
locate the correct elements within n.eps.

Fixes: #7732

Signed-off-by: shixuanqing <1356292400@qq.com>

Update src/runtime/virtcontainers/network_linux.go

Co-authored-by: Xuewei Niu <justxuewei@apache.org>
2023-09-13 09:47:18 +00:00
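
The index problem described in 1636abbe1c can be illustrated with a minimal sketch. This is not the Go code from network_linux.go; the function names and the string endpoints are hypothetical and only contrast the two removal patterns.

```python
def remove_by_subset_index(all_eps, subset):
    """Buggy pattern: an index into `subset` is reused as an index into `all_eps`."""
    remaining = list(all_eps)
    for idx in reversed(range(len(subset))):
        del remaining[idx]
    return remaining


def remove_by_endpoint(all_eps, subset):
    """Safer pattern: look each endpoint up in the full list before removing it."""
    remaining = list(all_eps)
    for ep in subset:
        if ep in remaining:
            remaining.remove(ep)
    return remaining


# Hypothetical endpoint names, used only for illustration.
all_eps = ["eth0", "eth1", "eth2"]
buggy = remove_by_subset_index(all_eps, ["eth2"])  # drops "eth0" instead of "eth2"
fixed = remove_by_endpoint(all_eps, ["eth2"])      # drops "eth2" as requested
```

The two lists only agree when the subset happens to be a prefix of the full list, which is why the bug shows up only for a non-empty endpoints argument.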
Peng Tao
9766f9090c Merge pull request #7719 from beraldoleal/nullable
Remove gogoproto.nullable extension
2023-09-13 15:11:56 +08:00
David Esparza
c2b2a00ad9 Merge pull request #7899 from GabyCT/topic/startdocker
metrics: Ensure docker is running in init_env
2023-09-12 23:01:26 -06:00
Gabriela Cervantes
0aa073967d metrics: Add iperf bandwidth value for qemu
This PR adds the iperf bandwidth value for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 20:57:14 +00:00
Dan Mihai
c0ad914766 tests: fix kernel and initrd annotations
Fix kernel and initrd annotations in the k8s tests on Mariner. These
annotations must be applied to the spec.template for Deployment, Job
and ReplicationController resources.

Fixes: #7764

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-12 20:15:25 +00:00
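
The placement rule from c0ad914766 can be sketched as follows. This is an assumption-laden illustration, not the test helper from the repository: `add_annotations` is a hypothetical function, manifests are plain dicts, and the kernel path is a placeholder.

```python
# Resource kinds whose pods are created from spec.template (per the commit message).
TEMPLATED_KINDS = {"Deployment", "Job", "ReplicationController"}


def add_annotations(manifest, annotations):
    """Attach annotations where the resulting pod will actually carry them."""
    if manifest.get("kind") in TEMPLATED_KINDS:
        # Annotate the pod template, not the top-level resource.
        spec = manifest.setdefault("spec", {})
        metadata = spec.setdefault("template", {}).setdefault("metadata", {})
    else:
        metadata = manifest.setdefault("metadata", {})
    metadata.setdefault("annotations", {}).update(annotations)
    return manifest


# Placeholder value; the real path depends on the installation.
kernel = {"io.katacontainers.config.hypervisor.kernel": "/path/to/vmlinuz"}
pod = add_annotations({"kind": "Pod"}, kernel)
deployment = add_annotations({"kind": "Deployment"}, kernel)
```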
Gabriela Cervantes
615c1cbf19 metrics: Add iperf bandwidth value for kata metrics
This PR adds the iperf bandwidth value for kata metrics.

Fixes #7924

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 19:30:24 +00:00
Gabriela Cervantes
d53eb73eec metrics: Ensure docker is running in init_env
This PR ensures that docker is running as part of the init_env function
in kata metrics, to avoid failures such as docker not running, which make
the kata metrics CI fail.

Fixes #7898

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 19:13:09 +00:00
GabyCT
c0d502493e Merge pull request #7921 from dborquez/metrics_disable_fio_test
metrics: this PR skips the FIO test temporarily to fix issues
2023-09-12 12:08:48 -06:00
Gabriela Cervantes
ad08321b83 metrics: Add Cassandra Metrics documentation
This PR adds the Cassandra Metrics documentation for kata metrics.

Fixes #7922

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 16:30:35 +00:00
David Esparza
a58ea66592 metrics: this PR skips the FIO test temporarily to fix issues
FIO test is showing ongoing issues when running in k8s.
Working on running FIO on the ctr client which has been
shown to be stable.

Fixes: #7920

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-12 10:23:57 -06:00
Fabiano Fidêncio
2d8447fc6b Merge pull request #7916 from fidencio/topic/add-functional-nerdctl-tests
ci: Add a very basic nerdctl sanity test
2023-09-12 17:47:08 +02:00
James O. D. Hunt
7feb8de9dc Merge pull request #7887 from jodh-intel/hypervisor-remove-debug-kernel-options
runtime-rs: hypervisor: Remove debug kernel options
2023-09-12 16:31:48 +01:00
Fabiano Fidêncio
f536ef5ce1 ci: docker: Also run the smoke test with runc
This will help us to make sure that the failure is actually related to
Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:54:02 +02:00
Fabiano Fidêncio
c83f167c59 ci: docker: Run the tests after the kata-static is created
There's no reason to wait till the payload is created to run the tests,
as we rely on the tarball, not on the kata-deploy payload.

That was a mistake on my side, and that's already fixed for the nerdctl
tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:53:47 +02:00
Fabiano Fidêncio
12d833d07d ci: Add a very basic nerdctl sanity test
Let's add a very basic sanity test to check that we can spawn a
container using nerdctl + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7911

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:52:55 +02:00
Greg Kurz
be71a0ab4e Merge pull request #7811 from stevenhorsman/bump-rust-to-1.72
versions: Bump rust version
2023-09-12 15:30:35 +02:00
Fabiano Fidêncio
b020912629 Merge pull request #7913 from fidencio/topic/add-functional-docker-tests
ci: Add a very basic docker sanity test
2023-09-12 15:28:49 +02:00
Fabiano Fidêncio
348b8644d6 ci: Add a very basic docker sanity test
Let's add a very basic sanity test to check that we can spawn a
container using docker + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 15:15:26 +02:00
stevenhorsman
a75fd5eb81 runk: Fix rust unnecessary mut error
- Fix `error: variable does not need to be mutable`
in rust 1.72

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
a31c145172 kata-ctl: useless-vec warning
- Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
c8419fc3bb kata-ctl: Resolve non-minimal-cfg warning
- In rust 1.72, clippy warned clippy::non-minimal-cfg
as the cfg has only one condition, so doesn't
need to be wrapped in the any combinator.

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
3eaf68d954 agent-ctl: Allow clippy lint
- Allow `clippy::redundant-closure-call`
which has issues with the guard function passed into
the `run_if_auto_values` macro

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
1d8b78959d runtime-rs: Fix useless-vec warning
Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
99f3d69e94 runtime-rs: Remove mut
Fix `error: variable does not need to be mutable`

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
16fbc27b09 dragonball: Allow ambiguous-glob-reexports
The bindgen generated code is triggering lots of
ambiguous-glob-reexports warnings in rust 1.70+

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
bbf1919516 dragonball: Resolve non-minimal-cfg warning
- In rust 1.72, clippy warned clippy::non-minimal-cfg
as the cfg has only one condition, so doesn't
need to be wrapped in the all combinator.

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
75cfdd5d59 agent: config: Allow clippy lint
- Allow `clippy::redundant-closure-call` in `from_cmdline`
which has issues with the guard function passed into
the `parse_cmdline_param` macro

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
f3a0fd5907 agent: config: Fix useless-vec warning
Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
9e423bd3d6 libs: Fix clippy unnecessary hashes error
- Fix error: unnecessary hashes around raw string literal

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
444395050a versions: Bump rust version
Bump rust to 1.72.0 to test what extra warnings/issues we get

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
Yipeng Yin
a16b0962b5 chore(cargo): update cargo lock
Update cargo lock for runtime-rs, agent and kata-ctl.

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 15:27:38 +08:00
Chao Wu
c800d0739f Merge pull request #7889 from UiPath/fix-dragonball-build
dragonball: fix for non-deterministic builds
2023-09-12 14:06:18 +08:00
shixuanqing
ca4b6b051d runtime: Naming conflict of network devices
When creating a new endpoint, we check existing endpoint names and automatically adjust the naming of the new endpoint to ensure uniqueness.

Fixes: #7876

Signed-off-by: shixuanqing <1356292400@qq.com>
2023-09-12 04:29:51 +00:00
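
One way to picture the uniqueness check in ca4b6b051d is the sketch below. The commit message does not spell out the naming scheme, so the prefix-plus-counter approach and the name `unique_endpoint_name` are assumptions made for illustration only.

```python
def unique_endpoint_name(prefix, existing_names):
    """Return the first `prefix<N>` name that is not already in use."""
    taken = set(existing_names)
    idx = 0
    while f"{prefix}{idx}" in taken:
        idx += 1
    return f"{prefix}{idx}"


# Hypothetical device names.
first = unique_endpoint_name("eth", [])
after_two = unique_endpoint_name("eth", ["eth0", "eth1"])
fills_gap = unique_endpoint_name("eth", ["eth0", "eth2"])
```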
Guixiong Wei
202049f35e feat(runtime-rs): introduce huge page type to select VM RAM's backend
This commit allows us to specify the huge page backend when enabling huge
pages. Currently, we support two backends, thp and hugetlbfs; the default
is hugetlbfs.

To ensure backward compatibility, we introduce another configuration item
"hugepage_type" to select the memory backend, which is available only when
"enable_hugepages" is true. Besides, we add an annotation
"io.katacontainers.config.hypervisor.hugepage_type" to configure huge page
type per pod.

Fixes: #6703

Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 11:28:27 +08:00
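
The interaction between "enable_hugepages" and "hugepage_type" described in 202049f35e can be sketched like this. It is a simplified model, not the runtime-rs implementation: `resolve_hugepage_type` is a hypothetical helper, and rejecting unknown values with an error is an assumption.

```python
SUPPORTED_BACKENDS = ("hugetlbfs", "thp")
DEFAULT_BACKEND = "hugetlbfs"


def resolve_hugepage_type(enable_hugepages, hugepage_type=None):
    """Pick the VM RAM backend; hugepage_type only matters when huge pages are on."""
    if not enable_hugepages:
        return None  # hugepage_type is ignored in this case
    if hugepage_type is None:
        return DEFAULT_BACKEND
    if hugepage_type not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported hugepage_type: {hugepage_type!r}")
    return hugepage_type


default_backend = resolve_hugepage_type(True)
thp_backend = resolve_hugepage_type(True, "thp")
disabled = resolve_hugepage_type(False, "thp")
```

Keeping hugetlbfs as the default means existing configurations that only set "enable_hugepages" behave as before.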
Zhongtao Hu
e1f54f96d0 Merge pull request #7766 from Apokleos/wrap-vsock-virtiofs
runtime-rs: bring hybrid vsock devices in manager.
2023-09-12 09:27:34 +08:00
GabyCT
af29eeb8b1 Merge pull request #7901 from fidencio/topic/ci-target-branch-fixes-follow-up-3
ci: use github.ref_name instead of $GITHUB_REF_NAME
2023-09-11 15:31:29 -06:00
Fabiano Fidêncio
f811b064ca ci: use github.ref_name instead of $GITHUB_REF_NAME
As, regardless of what's mentioned in the documentation, it seems that
$GITHUB_REF_NAME is passed down as a literal string.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 22:14:55 +02:00
Fabiano Fidêncio
dc0b350e49 Merge pull request #7900 from fidencio/topic/ci-target-branch-fixes-follow-up-2
ci: Add more target-branch related fixes
2023-09-11 21:26:26 +02:00
Fabiano Fidêncio
6d795c089e ci: Add more target-branch related fixes
The ones for payload-after-push.yaml and ci-nightly.yaml are not that
important right now, but they're needed for when we start running
those on stable branches as well.

The other ones were missed during
bd24afcf73.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 20:42:57 +02:00
Fabiano Fidêncio
07d0ad0ad7 Merge pull request #7897 from fidencio/topic/ci-devmapper-do-the-rebase-as-well
ci: Fix target-branch usage
2023-09-11 20:30:53 +02:00
Fabiano Fidêncio
d7f991d139 Merge pull request #7151 from Yuan-Zhuo/fix-systemd-cgroup
agent: optimize the code of systemd cgroup manager
2023-09-11 20:15:51 +02:00
Fabiano Fidêncio
8509c31870 ci: Fix target-branch usage
We missed those as part of bd24afcf73.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 20:10:27 +02:00
Gabriela Cervantes
060499dcae metrics: Remove warning from metrics documentation
Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation.

Fixes #7894

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-11 16:41:48 +00:00
GabyCT
b384757ac7 Merge pull request #7874 from fidencio/topic/manually-rebase-branches-atop-of-the-target-one
gha: Manually rebase PR atop of the target branch before testing
2023-09-11 10:35:01 -06:00
Fabiano Fidêncio
46e73cf7a2 Merge pull request #7884 from fidencio/topic/update-kernel-to-the-latest-lts-plus-bring-in-erofs-patches
Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work
2023-09-11 13:58:43 +02:00
James O. D. Hunt
c0f697fcc5 runtime: Allow kernel_params annotation
To support the removal of the `initcall_debug` and `earlyprintk=`
options from the default guest kernel cmdline, add `kernel_params` to the list
of enabled annotations to allow those kernel options (or others) to be
set using `kata-deploy` for either runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 12:12:12 +01:00
Alexandru Matei
b03e49794e dragonball: fix for non-deterministic builds
Fixes: #7888

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-09-11 14:07:10 +03:00
Fabiano Fidêncio
93bad13769 Merge pull request #7875 from fidencio/topic/kata-deploy-fix-arm64-image-build
kata-deploy: Fix aarch64 image build
2023-09-11 11:36:52 +02:00
James O. D. Hunt
976d10150c runtime-rs: hypervisor: Remove debug kernel options
Removed the following kernel command line options:

- `earlyprintk=ttyS0`
- `initcall_debug`

Both these options are only useful when debugging a guest kernel failure
which is not a common occurrence.

Further, the `earlyprintk=` option can have a large negative performance
impact (it can increase the VM boot time significantly).

If the user wishes to use either of these options, they can add them to the
`kernel_params=` setting in the Kata configuration file's hypervisor
stanza.

Fixes: #7886.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 09:43:39 +01:00
Fabiano Fidêncio
fde34610cd kernel: Add erofs patches needed for CC related work
All the patches have already been merged upstream and they've just been
cherry-picked to this branch.

Fixes: #7885

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 10:39:37 +02:00
Fabiano Fidêncio
dc6a4588a2 versions: Bump kernel to the latest LTS release (6.1.52)
We're bumping here in order to make our lives easier backporting EROFS
patches needed for the CC related work.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-11 10:32:16 +02:00
James O. D. Hunt
52f6449b70 kata-manager: Remove initcall_debug kernel option
Removed the addition of the `initcall_debug` kernel option when agent
debugging enabled. This option has nothing to do with the agent.

If the user wishes to use this option, they can add it to the
`kernel_params=` setting in the Kata configuration file's hypervisor
stanza.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 09:31:44 +01:00
Fabiano Fidêncio
6cd5d83a37 Merge pull request #7865 from gkurz/fix-more-virtiofs-args
runtime: Fix more virtiofs args
2023-09-09 21:30:16 +02:00
Fabiano Fidêncio
8b4a0b368f kata-deploy: Remove curl after it's used
There's no need to keep curl there after the kubectl binary has already
been downloaded.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-09 10:52:05 +02:00
Fabiano Fidêncio
139c7f03ab kata-deploy: Fix aarch64 image build
Similarly to what's been done for x86_64 -> amd64, we need to do a
aarch64 -> arm64 change in order to be able to download the kubectl
binary.

Fixes: #7861

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-09 10:51:52 +02:00
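
The architecture renaming mentioned in 139c7f03ab amounts to a small lookup. The sketch below is illustrative only: the actual change lives in the kata-deploy image build, and `kubectl_arch` is a hypothetical name.

```python
# Machine names as reported by `uname -m`, mapped to the names used for
# kubectl release binaries (per the commit message: x86_64 -> amd64, aarch64 -> arm64).
ARCH_ALIASES = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def kubectl_arch(machine):
    """Translate a machine name; names without an alias pass through unchanged."""
    return ARCH_ALIASES.get(machine, machine)


arm = kubectl_arch("aarch64")
x86 = kubectl_arch("x86_64")
other = kubectl_arch("s390x")
```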
Fabiano Fidêncio
94f5a69346 Merge pull request #7862 from fidencio/topic/kata-deploy-use-alpine-as-base-image
kata-deploy: Switch to an alpine image
2023-09-09 09:02:13 +02:00
Yuan-Zhuo
470d065415 agent: optimize the code of systemd cgroup manager
1. Directly support CgroupManager::freeze through systemd API.
2. Avoid always passing unit_name by storing it into DBusClient.
3. Realize CgroupManager::destroy more accurately by killing the systemd unit rather than stopping it.
4. Ignore no such unit error when destroying systemd unit.
5. Update zbus version and corresponding interface file.

Acknowledgement: error handling for no such systemd unit error refers to

Fixes: #7080, #7142, #7143, #7166

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2023-09-09 13:56:43 +08:00
GabyCT
fa818bfad1 Merge pull request #7867 from GabyCT/topic/optimizedimage
metrics: Use TensorFlow optimized image
2023-09-08 11:34:21 -06:00
Fabiano Fidêncio
bd24afcf73 gha: Manually rebase PR atop of the target branch before testing
We're changing what's been done as part of ac939c458c, as we've
noticed issues using `github.event.pull_request.merge_commit_sha`.

Basically, whenever a force-push would happen, the reference of
merge_commit_sha wouldn't be updated, leading us to test PRs with the
old code. :-/

In order to get the rebase properly working, we need to ensure we pull
the hash of the commit as part of checkout action, and ensure
fetch-depth is set to 0.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 18:56:31 +02:00
GabyCT
dc7414f5c1 Merge pull request #7870 from dborquez/metrics_fio_fix_clean_env_order
metrics: fix FIO test initialization
2023-09-08 10:28:10 -06:00
Greg Kurz
72c510d057 runtime/virtiofsd: Drop all references to "--cache=none"
This syntax belongs to the legacy C virtiofsd implementation that
we don't support anymore since kata-containers 3.1.3 because
of other API breaking changes.

People have been warned to switch from "none" to "never" since
kata-containers 2.5.2. Let's officially do that.

The compat code that would convert "none" to "never" isn't
needed anymore. Just drop it.

Fixes #7864

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-08 17:57:30 +02:00
Beraldo Leal
ead724bec1 protocol: removing gogo.nullable feature
gogo.nullable is the main gogo.protobuf feature used here. Since we are
trying to remove gogo.protobuf, the first reasonable step seems to be
removing this feature. This is a core update, and it will change how the
structs are defined. I could spot only a few places using those structs,
based on make check/build.

Fixes #7723.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
d8e4bb9859 protocol: remove unused PROTO_FILE env
There is no reference to PROTO_FILE and this is not working. Also we are
not inside a Makefile, so it makes sense to adapt the usage to reflect the
script instead of a make command.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
5e1106a770 protocol: remove unused import_path
import_path is used as the default package when no input files specify
go_package. However, all the files we are currently building already
have a go_package definition, making this behavior both redundant and
error-prone.

Additionally, one of our files (types.pb.go) resides outside the grpc
directory, indicating that it's indeed ignored but also inconsistent.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
87accaaecb protocol: use workdir during build
Currently, the script searches for .proto files within $GOPATH/.
Consequently, modifications to a definition file in the current working
directory won't influence the output .pb.go if the directory is outside
of $GOPATH. For developers, it's more intuitive to alter the local
codebase than the version stored in $GOPATH.

With this modification, the generated .pb.go files will be relative to
the current working directory, removing the need to clone this project
under $GOPATH/src/github.com/kata-containers.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
711a7ed965 protocol: remove mapping definitions
The definitions are already specified in the .proto files using the
go_package option. Centralizing them in one location reduces the
potential for errors and simplifies the script.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
8db84c1bd2 protocol: force GOPATH to be set
Currently, if GOPATH is not set, errors will be raised since protoc is using
GOPATH to find packages.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Beraldo Leal
68156d77ac protocol: breaking lines to improve readability
Just a small change to improve the readability of modules before the
actual changes.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-09-08 11:49:01 -04:00
Fabiano Fidêncio
670a8e9c73 kata-deploy: Switch to an alpine image
This will make our image smaller, and still ensure its multi-arch
support.

Fixes: #7861

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 17:39:51 +02:00
Fabiano Fidêncio
0b26a5d053 Merge pull request #7871 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-3
ci: k8s: Add clean-up-garm argument for gha-run.sh
2023-09-08 17:27:57 +02:00
Fabiano Fidêncio
9d74b7ccc9 k8s: ci: Skip "Pod quota" test with firecracker
The test is failing, and an issue has been opened to track it.
For now, let's skip it.

Issue:
https://github.com/kata-containers/kata-containers/issues/7873

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 15:51:46 +02:00
Fabiano Fidêncio
f6cd3930c5 ci: k8s: Remove useless skip statement from tests
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.

We're only touching the files here that were touched in the previous
commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:29 +02:00
Fabiano Fidêncio
3cc20b47a6 ci: k8s: Also check for "fc" (for firecracker)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc` for the runtime class)
instead of `firecracker`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:24 +02:00
Fabiano Fidêncio
b5bad3cb0f ci: k8s: Add clean-up-garm argument for gha-run.sh
The tests are failing to finish as the argument is invalid.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:04:50 +02:00
Fabiano Fidêncio
05e2e7636e Merge pull request #7868 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-2
ci: k8s: Second round of fix-ups with the devmapper CI
2023-09-08 11:02:20 +02:00
Fabiano Fidêncio
aaec5a09f3 ci: k8s: devmapper tests should be using ubuntu 20.04
That's what we've been using as part of Jenkins, so let's ensure things
will work as they did before, and only after that consider upgrading the
base OS used for the tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
27fa7d828d ci: k8s: Add a kata-deploy-garm target
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
fa62a4c01b ci: k8s: Export KUBERNETES env var
So we have better control over which flavour of kubernetes kata-deploy
is expected to be targeting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
8c9380a798 ci: k8s: Install bats on GARM runners
GARM runners do not come with the whole set of tools we need, or are
used to when it comes to the GHA runners, so we need to manually install
bats on those.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
3de23034f8 ci: k8s: Wait some time after restarting k3s
Let's put a 1 minute sleep, just to make sure everything is back up
again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:46:58 +02:00
David Esparza
adfea55b8f metrics: fix FIO test initialization
This PR changes the order in which the FIO test first
cleans the environment and then checks if the environment
is indeed clean.

Fixes: #7869

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-07 15:41:59 -06:00
Fabiano Fidêncio
2df183fd99 ci: k8s: Append, instead of overwrite, the devmapper config
As we were using `tee` without the `-a` (or `--append`) option, the
containerd config would be overwritten, leading to a NotReady state of
the Node.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
369a8af8f7 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
It should be plenty, and worked well in local tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ada65b988a ci: k8s: Use vanilla kubectl with k3s
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```

The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.

Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.
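
The copy step could look roughly like the sketch below. The helper name
`copy_kubeconfig` and the scratch paths are assumptions made for
illustration; the real CI script may do this differently.

```sh
# Copy a kubeconfig into <home>/.kube/config, the default location that
# vanilla kubectl reads. Both arguments are parameters of this sketch; in the
# scenario described above they would be /etc/rancher/k3s/k3s.yaml and the
# runner's home directory.
copy_kubeconfig() {
    src="$1"
    home_dir="$2"
    mkdir -p "${home_dir}/.kube"
    cp "${src}" "${home_dir}/.kube/config"
}

# Demo on scratch paths with a placeholder file instead of a real k3s.yaml.
scratch="$(mktemp -d)"
printf 'placeholder kubeconfig\n' > "${scratch}/k3s.yaml"
copy_kubeconfig "${scratch}/k3s.yaml" "${scratch}/home"
copied="$(cat "${scratch}/home/.kube/config")"
printf '%s\n' "${copied}"
rm -rf "${scratch}"
```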

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ad45ab5d33 ci: k8s: Ensure k3s is deployed with --write-kubeconfig-mode=644
Otherwise /etc/rancher/k3s/k3s.yaml is not readable by users other
than root.

As --write-kubeconfig-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
028a97e0d5 ci: k8s: Use the proper command for sleep
`wait` waits for a job to complete, not for a number of seconds.  Not sure
how I got that wrong in the first place, but it is what it is.
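
A short sketch of why the two are not interchangeable (the 2-second
duration is arbitrary and only keeps the demo quick):

```sh
# `wait` with no operands blocks until the shell's background jobs finish.
# With no jobs running it returns immediately with status 0. A numeric
# operand would be interpreted as a process ID, not as a duration.
wait
wait_status=$?

# `sleep` is the command that pauses for a number of seconds.
start="$(date +%s)"
sleep 2
end="$(date +%s)"
elapsed=$((end - start))

printf 'wait returned %s immediately; sleep paused for about %s seconds\n' \
    "${wait_status}" "${elapsed}"
```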

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
David Esparza
34f580901f Merge pull request #7824 from dborquez/fix_memory_usage_initialization
metrics: re-enable memory-usage initialization step
2023-09-07 14:24:27 -06:00
Gabriela Cervantes
3a427795ea metrics: Use TensorFlow optimized image
This PR replaces the ubuntu image for one which has TensorFlow optimized
for kata metrics.

Fixes #7866

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-07 15:38:51 +00:00
Chao Wu
cd8c217ee1 Merge pull request #6879 from openanolis/chao/update_upstream_upcall_feature
Dragonball: optimize the placement of dbs-upcall features
2023-09-07 18:07:53 +08:00
Fabiano Fidêncio
dfa1cce916 Merge pull request #7860 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-1
ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
2023-09-07 11:48:30 +02:00
Fabiano Fidêncio
8d99972a8a ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml
integrations -> integration
integrtion -> integration

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 11:31:30 +02:00
Fabiano Fidêncio
0483d3d16d Merge pull request #7841 from fidencio/topic/ci-add-k8s-devmapper-tests
ci: k8s: Add k8s devmapper tests (part 0)
2023-09-07 10:53:09 +02:00
Jeremi Piotrowski
f6cc01d77c Merge pull request #7833 from jepio/kata-static-fix-ownership
kata-deploy: Create kata-static.tar with correct ownership
2023-09-07 10:16:23 +02:00
Peng Tao
435e890cd9 Merge pull request #7703 from bergwolf/github/nerdctl-fc
runtime: run prestart hooks before starting VM for FC
2023-09-07 10:55:31 +08:00
Chao Wu
deed1b927d Dragonball: optimize the placement of dbs-upcall features
Currently, the dbs-upcall features have 2 problems that need to be
fixed:

- There are redundant dbs-upcall features that need to be removed.
- Some places should be controlled by dbs-upcall, but this is not implemented.

This commit fixes both problems.

fixes: #6878

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-09-07 10:27:29 +08:00
Fabiano Fidêncio
0e8bd50cbb ci: k8s: Add k8s devmapper tests (part 0)
Let's enable the devmapper kubernetes tests to match exactly what's been
tested as part of the Jenkins CI.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 23:08:38 +02:00
Fabiano Fidêncio
b28b54df04 ci: k8s: Add a function to configure devmapper for containerd
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.

This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers.  OTOH, this is exactly what we've always been doing as
part of the tests.

We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.

It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targeting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-06 23:08:17 +02:00
Fabiano Fidêncio
54f7117212 ci: k8s: Add a function to deploy k3s
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.

As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, an approach which has proven to be really good at
letting things rot.

With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.

It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targeting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 23:07:41 +02:00
David Esparza
cf258090aa Merge pull request #7843 from GabyCT/topic/ffiolimit
metrics: Add write 95 percentile FIO value
2023-09-06 14:52:00 -06:00
Fabiano Fidêncio
c5e1e7ddc3 Merge pull request #7854 from fidencio/topic/runtime-allow-virtio_fs_extra_args-annotation
runtime: Allow virtio_fs_extra_args annotation
2023-09-06 19:20:40 +02:00
Greg Kurz
81536f21af runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr"
The "-o" syntax belongs to the legacy C virtiofsd. It is deprecated
with the rust implementation.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-06 17:50:35 +02:00
Fabiano Fidêncio
b1dd09a4d3 runtime: Allow virtio_fs_extra_args annotation
Some use cases may just require passing extra arguments to virtiofsd,
and having this disabled by default makes it impossible to set when
using kata-deploy, as changes in the configuration file would be
overwritten by the daemon-set.

With this in mind, let's allow users to pass whatever they need (and
here I'm specifically looking at `--xattr`) as a virtio_fs_extra_arg.

Fixes: #7853

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 17:11:16 +02:00
Hyounggyu Choi
d27fe18167 Merge pull request #7849 from BbolroC/hot-fix-dockerbuild
packaging: do not install docker-compose-plugin for s390x|ppc64le
2023-09-06 13:13:25 +02:00
Hyounggyu Choi
2efda20c77 packaging: do not install docker-compose-plugin for s390x|ppc64le
This PR is to skip installing docker-compose-plugin while building a `build-kata-deploy` image for s390x|ppc64le.
It is a temporary solution to fix current CI failures for s390x regarding `hash sum mismatch`.
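
One way to express such an architecture gate is sketched below. The helper
name `wants_compose_plugin` is hypothetical and the actual Dockerfile logic
may differ.

```sh
# Succeed (return 0) when docker-compose-plugin should be installed for the
# given architecture, and fail (return 1) for the two skipped architectures.
wants_compose_plugin() {
    case "$1" in
        s390x|ppc64le) return 1 ;;
        *) return 0 ;;
    esac
}

# Show the decision for a few architecture names.
for arch in x86_64 aarch64 s390x ppc64le; do
    if wants_compose_plugin "${arch}"; then
        printf '%s: install docker-compose-plugin\n' "${arch}"
    else
        printf '%s: skip docker-compose-plugin\n' "${arch}"
    fi
done
```

In a build script the argument would typically come from `uname -m`.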

Fixes: #7848
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-09-06 11:12:03 +02:00
Zhongtao Hu
aa85e0b3ec Merge pull request #7714 from justxuewei/volumes-cleanup
runtime-rs: Fix volumes and rootfs cleanup issues
2023-09-06 10:13:55 +08:00
Gabriela Cervantes
438fbf9669 metrics: Add write 95 percentile for FIO for qemu
This PR adds the write 95 percentile for FIO for qemu for
checkmetrics for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 22:50:31 +00:00
Gabriela Cervantes
024b4d2ffe metrics: Add write 95 percentile FIO value
This PR adds the write 95 percentile FIO value for checkmetrics
for kata metrics.

Fixes #7842

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 21:00:05 +00:00
GabyCT
3e3a91fd2c Merge pull request #7577 from GabyCT/topic/enableiperfm
metrics: Enable iperf benchmark on gha for kata metrics
2023-09-05 14:53:47 -06:00
Gabriela Cervantes
e98e5cdea2 metrics: Add checkmetrics to gha run script
This PR adds the checkmetrics to gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 17:05:03 +00:00
Gabriela Cervantes
c1edfe5511 metrics: Add checkmetrics value for qemu for iperf
This PR adds the checkmetrics value for qemu for iperf benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
6a79ecedf9 metrics: Add jitter value for clh
This PR adds jitter value for clh for iperf metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
f609a9a754 metrics: Add test selector to iperf metrics
This PR adds test selector to iperf metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
5b8db30422 metrics: Enable iperf benchmark on gha for kata metrics
This PR enables the iperf benchmark to run on the gha for kata metrics.

Fixes #7575

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Jeremi Piotrowski
cf46b056fd Merge pull request #7839 from openanolis/chao/switch_to_azure
CI: switch static-checks-dragonball CI machines to Azure
2023-09-05 10:59:02 +02:00
Chao Wu
60f733d301 CI: switch static-checks-dragonball CI machines to Azure
Previously, static-checks-dragonball is using machines from Alibaba
Cloud to run all the CI jobs.

Currently, we are going through an internal process to apply for the new
machines for Dragonball CI. Before the internal process is over, we will
temporarily use Azure VM to run static-checks-dragonball jobs.

fixes: #7838

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-09-05 15:19:07 +08:00
alex.lyn
7870b33a2d runtime-rs: bring hybridVsock devices in manager.
Currently, virtio_vsock devices are still outside of the device
manager. This causes some management issues, such as the
inability to unify PCI address management.

Just do some work for hybrid vsock.

Fixes: #7655

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-09-05 08:46:56 +08:00
Jeremi Piotrowski
18c94ebbe3 kata-deploy: Create kata-static.tar with correct ownership
Pass --owner and --group to the tar invocation to prevent the GitHub runner user
from leaking into release artifacts.
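
A minimal sketch of the effect, assuming GNU tar. The values `0` for owner
and group are an assumption for illustration, since the commit message does
not state which values are passed.

```sh
# Build a tiny payload in a scratch directory.
workdir="$(mktemp -d)"
mkdir -p "${workdir}/payload"
printf 'demo\n' > "${workdir}/payload/file.txt"

# --owner/--group override the ownership recorded in the archive, so the
# identity of the user running tar is not stored (GNU tar options).
tar --owner=0 --group=0 -cf "${workdir}/demo.tar" -C "${workdir}" payload

# --numeric-owner makes the listing print numeric IDs instead of names.
listing="$(tar --numeric-owner -tvf "${workdir}/demo.tar")"
printf '%s\n' "${listing}"
rm -rf "${workdir}"
```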

Fixes: #7832
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-04 17:24:00 +02:00
Fabiano Fidêncio
b663ec21ac Merge pull request #7803 from GabyCT/topic/readmereportdoc
metrics: Add README for kata metrics report
2023-09-03 21:57:13 +02:00
Fabiano Fidêncio
e490b0bc76 Merge pull request #7808 from ManaSugi/fix/remove-manual-chcon
osbuilder: Remove chcon operation for guest SELinux
2023-09-03 21:55:02 +02:00
Fabiano Fidêncio
27dab249a0 Merge pull request #7800 from jodh-intel/kata-sys-util-update-tdx-protection-checks
kata-sys-util: protection: Update TDX checks
2023-09-02 14:47:51 +02:00
Jiang Liu
d5729e818c Merge pull request #7819 from jiangliu/storage-cleanup
Improve the way to clean up storage devices for sandbox
2023-09-02 17:02:51 +08:00
Jiang Liu
57e7bf14a6 agent: refine StorageDeviceGeneric::cleanup()
Refine StorageDeviceGeneric::cleanup() to improve safety.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 14:22:21 +08:00
Jiang Liu
53edb19374 agent: implement StorageDeviceGeneric::cleanup()
Refactor cleanup_sandbox_storage as StorageDeviceGeneric::cleanup().

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 14:00:26 +08:00
Jiang Liu
0c63453e28 types: make StorageDevice::cleanup() return possible error code
Make StorageDevice::cleanup() return possible error code.

Fixes: #7818

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 13:27:06 +08:00
Jiang Liu
3a3d77b3b5 agent: move StorageDeviceGeneric from kata-types into agent
Move StorageDeviceGeneric from kata-types into agent, so we can
refactor code later.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-02 13:12:17 +08:00
Jiang Liu
d848126b61 Merge pull request #7821 from jiangliu/storage-leak
agent: avoid possible leakage of storage device
2023-09-02 12:40:40 +08:00
Fabiano Fidêncio
4f92e6df90 Merge pull request #7683 from microsoft/danmihai1/policy-tests
tests: add policy to existing tests
2023-09-01 23:52:15 +02:00
David Esparza
b151cfd140 metrics: re-enable memory-usage initialization step
This PR re-enables the initialization step disabled
on 538c965c2b.

Fixes: #7804

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-01 14:29:34 -06:00
Fabiano Fidêncio
f3e1a6a94f osbuilder: alpine: Change mirror
As we're hitting a lot of:
```
ERROR: https://dl-5.alpinelinux.org/alpine/v3.18/main: operation timed
out
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 16:01:42 +00:00
Fabiano Fidêncio
ac612aef5e osbuilder: alpine: Match the version on versions.yaml
We've switched to 3.18 as part of
82cd14ba39.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 16:01:33 +00:00
Jiang Liu
9cd706d1c9 agent: avoid possible leakage of storage device
When a storage device is used by more than one container, the second
and subsequent instances will leak the storage device reference count,
and thus leak the storage device. The reason is that add_storages()
increases the reference count of the existing storage device, but
forgets to add the device to the `mount_list` array, thus leaking the
reference count.

Fixes: #7820

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-09-01 22:52:42 +08:00
Dan Mihai
bf21411e90 tests: add policy to k8s tests
Use AGENT_POLICY=yes when building the Guest images, and add a
permissive test policy to the k8s tests for:
- CBL-Mariner
- SEV
- SNP
- TDX

Also, add an example of policy rejecting ExecProcessRequest.

Fixes: #7667

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-01 14:28:08 +00:00
Dan Mihai
d0e0610679 runtime: config: use the SEV initrd for SNP
Thanks Unmesh Deodhar!

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-01 14:28:08 +00:00
Fabiano Fidêncio
67fed26f18 runtime: Use TDX image in the qemu-tdx config
Let's make sure we use the TDX image as part of the QEMU TDX
configuration, which will help us to have the policies tested here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 14:28:08 +00:00
Fabiano Fidêncio
f65ffb23da Merge pull request #7814 from fidencio/topic/gha-rebase-prs-atop-of-main-for-the-tests
gha: Rebase PR atop of the target branch before testing
2023-09-01 16:26:32 +02:00
Fabiano Fidêncio
ef70aeb6b8 Merge pull request #7817 from fidencio/topic/update-alpine-to-its-latest-release
versions: Update alpine to its 3.18 version
2023-09-01 14:51:58 +02:00
Fabiano Fidêncio
ac939c458c gha: Rebase atop of the target branch
There are two scenarios we care about here: jobs triggered by
`pull_request` events and jobs triggered by `pull_request_target` events.

`pull_request` event:
When using the checkout action, it'll already provide a "rebased atop of
main" repo for us, nothing else is needed, and that's basically what we
already have as part of the jobs in our CI.

`pull_request_target` event:
This one is a little bit tricky, as the checkout action, unless passed
a specific repo, gives us the PR checked out rebased atop of the HEAD of
the PR branch.  Jeremi Piotrowski nicely pointed out that we could use
github.event.pull_request.merge_commit_sha instead, which is the result
of merging the PR's branch with the official repo target branch.

Now, the only case where the contributor's rebase would still be needed
is when the action itself has been changed.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 11:23:31 +02:00
Jeremi Piotrowski
bde06758b1 Merge pull request #7761 from jepio/iocopy-fix-race
runtime: Fix data race in ioCopy
2023-09-01 09:30:54 +02:00
Fabiano Fidêncio
82cd14ba39 versions: Update alpine to its 3.18 version
3.15 will reach end of life 2 months from now.

Fixes: #7816

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-31 23:02:54 +02:00
GabyCT
d75c7b5f9c Merge pull request #7813 from GabyCT/topic/genreport
metrics: Add grabdata script for metrics report
2023-08-31 13:33:38 -06:00
Gabriela Cervantes
6668825752 metrics: Add grabdata script for metrics report
This PR adds the grabdata script so it can be used for the metrics report
for kata metrics.

Fixes #7812

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-31 16:17:29 +00:00
James O. D. Hunt
c290eaed8c kata-sys-util: protection: Update TDX checks
Update the protection checking code to detect newer versions of Intel
TDX (whose userland interface has now stabilised).

> **Note:** we don't need to retain the existing behaviour since:
>
> - We haven't yet landed the TDX feature (#6448).
> - Systems wishing to use TDX will need to use the latest available
>   system components (such as firmware and host kernel).

Also added an explicit TDX unit test.

Fixes: #7384.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-08-31 16:15:15 +01:00
Fabiano Fidêncio
d7a996c686 gha: Update to checkout@v3 action
At this point we should always be using the latest checkout action.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-31 16:02:31 +02:00
Jeremi Piotrowski
d7612440b8 Merge pull request #7789 from beraldoleal/tests/amd
Fixes tests on AMD machines
2023-08-31 11:23:51 +02:00
Jeremi Piotrowski
c2ba29c15b runtime: Fix data race in ioCopy
IoCopy is a tricky function (I don't claim to fully understand its contract),
but here is what I see: The goroutine that runs it spawns 3 goroutines - one
for each stream to handle (stdin/stdout/stderr). The goroutine then waits for
the stream goroutines to exit. The idea is that when the process exits and is
closed, the stdout goroutine will be unblocked and close stdin - this should
unblock the stdin goroutine. The stderr goroutine will exit at the same time as
the stdout goroutine. The iocopy routine then closes all tty.io streams.

The problem is that the stdout goroutine decrements the WaitGroup before
closing the stdin stream, which causes the iocopy goroutine to race to close
the streams. Move the wg.Done() of the stdout routine past the close so that
*this* race becomes impossible. I can't guarantee that this doesn't affect some
unspecified behavior.

Fixes: #5031
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-31 10:17:38 +02:00
Manabu Sugimoto
211de08d9e osbuilder: Remove chcon operation for guest SELinux
Remove the `chcon` operation which adds `container_runtime_exec_t` label to
the `kata-agent` binary because the container-selinux package including
the 39f83cc74d
commit has been released officially.
Ref. https://centos.pkgs.org/9-stream/centos-appstream-x86_64/container-selinux-2.221.0-1.el9.noarch.rpm.html

The container-selinux package is installed in a guest rootfs when we create it with `SELinux = yes`,
and `restorecon` sets `container_runtime_exec_t` to the `kata-agent`.

Fixes: #7807

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-31 16:44:32 +09:00
GabyCT
b467f2ef68 Merge pull request #7772 from GabyCT/topic/fiolimit
metrics: Enable FIO limits for kata metrics
2023-08-30 14:49:04 -06:00
Gabriela Cervantes
9f21fa9b39 metrics: Add report generator link to general documentation
This PR adds the report generator link to general documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 16:55:14 +00:00
Gabriela Cervantes
c0ed5ea0ad metrics: Add README for kata metrics report
This PR adds the README for kata metrics report.

Fixes #7802

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 16:36:08 +00:00
Fabiano Fidêncio
aa2b51a831 Merge pull request #7783 from GabyCT/topic/makereport
metrics: Add metrics report script
2023-08-30 17:11:39 +02:00
Gabriela Cervantes
a7b59a5bf9 metrics: Add limit for 90 percentile for qemu value
This PR adds the limit for 90 percentile for qemu value for
FIO kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
99db6568e9 metrics: Add limit for write 90 percentile value for clh
This PR adds the limit for write 90 percentile value for clh for
FIO metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
6e06392c55 metrics: Enable FIO limits for kata metrics
This PR enables the FIO limits for kata metrics.

Fixes #7771

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
David Esparza
924d06a7f5 Merge pull request #7787 from GabyCT/topic/fixmemoryinsidelimit
metrics: Fix memory inside limits for kata metrics
2023-08-30 07:45:17 -06:00
Peng Tao
2e4c874726 runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure
If we are running FC hypervisor, it is not started when prestart hooks
are executed. So we should just ignore such an error and just go ahead and
run the hooks.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 03:06:11 +00:00
Peng Tao
21204caf20 runtime: fail early when starting docker container with FC
FC does not support network device hotplug. Let's add a check to fail
early when starting containers created by docker.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 02:52:01 +00:00
Peng Tao
32fd013716 runtime: run prestart hooks before starting VM for FC
Add a new hypervisor capability to tell if it supports device hotplug.
If not, we should run prestart hooks before starting new VMs as nerdctl
is using the prestart hooks to set up netns. To make nerdctl + FC
to work, we need to run the prestart hooks before starting new VMs.

Fixes: #6384
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 02:52:01 +00:00
Beraldo Leal
00e7ffd988 tests: check vmx only on Intel machines
When running on AMD machines, those tests will fail because there is no
vmx flag. Following other tests that check for cpuType, let's adapt
them to restrict the vmx check to Intel machines.

Fixes #7788.
Related #5066

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-08-29 20:04:31 -04:00
Gabriela Cervantes
c8dd3c0737 metrics: Fix memory footprint qemu limit
This PR fixes the memory footprint qemu limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 22:51:21 +00:00
Gabriela Cervantes
8877ec62fb metrics: Fix memory inside limits for kata metrics
This PR fixes the memory inside limit for clh for kata metrics due
to the recent changes that we had in the script, which impacted
the performance measurement.

Fixes #7786

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 21:38:18 +00:00
Beraldo Leal
80146f2078 tests: Fixes cpuType check on AMD machines
cpuType is not initialized yet, so it gets 0 (Intel) by default, failing on
AMD machines.

Fixes #7785

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-08-29 17:04:07 -04:00
Gabriela Cervantes
7e364716dd metrics: Add test setup details to metrics report
This PR adds test setup details to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:56:53 +00:00
Gabriela Cervantes
17dc1b9760 metrics: Add boot lifecycle times to metrics report
This PR adds the boot lifecycle times to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:55:44 +00:00
Gabriela Cervantes
3b0d6538f2 metrics: Add memory inside container to metrics report
This PR adds memory inside container to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:53:17 +00:00
Gabriela Cervantes
79fbb9d243 metrics: Add scaling system footprint in metrics report
This PR adds scaling system footprint in metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:51:27 +00:00
Gabriela Cervantes
8e6d4e6f3d metrics: Add metrics reportgen
This PR adds metrics reportgen for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:45:36 +00:00
Gabriela Cervantes
139ffd4f75 metrics: Add report file titles
This PR adds report file titles for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:43:06 +00:00
GabyCT
8f2dae7b53 Merge pull request #7775 from dborquez/fix_memory_usage_parsing_results
metrics: fix parsing issue on memory-usage test
2023-08-29 11:26:13 -06:00
Gabriela Cervantes
878d1a2e7d metrics: Generate PNGs alongside the PDF report
This PR generates the PNGs for the kata metrics PDF report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:50:32 +00:00
Gabriela Cervantes
fce2487971 metrics: Add metrics report R files
This PR adds the metrics report R files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:45:22 +00:00
Gabriela Cervantes
08812074d1 metrics: Add report dockerfile
This PR adds the report dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:28:32 +00:00
Gabriela Cervantes
69781fc027 metrics: Add metrics report script
This PR adds metrics report script for kata metrics.

Fixes #7782

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:25:14 +00:00
Chao Wu
e4fb20c74a Merge pull request #7585 from lifupan/main
dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…
2023-08-29 23:39:21 +08:00
Fabiano Fidêncio
50e51bcafe Merge pull request #7185 from UnmeshDeodhar/add-cc-sev-test
tests: Add confidential test
2023-08-29 15:32:25 +02:00
Fabiano Fidêncio
e286e842c1 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
e31f099be1 tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
c3b9d4945e tests: Add confidential test for SEV
Add a test case for the launch of an unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:34 +02:00
David Esparza
538c965c2b metrics: fix parsing issue on memory-usage test
This PR fixes an issue in the results-parsing stage,
by collecting just the n-results from the n-running
containers, discarding irrelevant data.

Fixes: #7774

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-28 23:39:46 -06:00
Fabiano Fidêncio
708b0a3052 Merge pull request #7768 from fidencio/topic/update-tdx-to-the-6.2-kernel-based-stack
tdx: Update the components needed for using the 6.2 kernel stack
2023-08-28 19:27:15 +02:00
Fabiano Fidêncio
3818bf3311 local-build: Remove $HOME/.docker/buildx/activity/default
The file can be removed between builds without causing any issue, and
leaving it around has been causing us some headache due to:
```
ERROR: open /home/runner/.docker/buildx/activity/default: permission denied
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Fabiano Fidêncio
d1b54ede29 qemu: tdx: Workaround SMP issue with TDX 1.5
`...,sockets=1,cores=numvcpus,threads=1,...` must be used.
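
As an illustration only, the fixed part of that topology string could be
generated from the vCPU count as below. The helper name `smp_topology` is
hypothetical, and the real change lives in the runtime's QEMU command-line
builder, not in a shell script.

```sh
# Print the topology fragment for a given vCPU count: a single socket, one
# core per vCPU and one thread per core. The elided parts of the option
# string are intentionally not reproduced here.
smp_topology() {
    printf 'sockets=1,cores=%s,threads=1' "$1"
}

# Example with an arbitrary vCPU count.
fragment="$(smp_topology 4)"
printf '%s\n' "${fragment}"
```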

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Archana Shinde
1e34220c41 qemu: tdx: Adapt to the TDX 1.5 stack
QEMU for TDX 1.5 makes use of private memory map/unmap.
Make changes to govmm to support this. Support for private backing fd
for memory is added as knob to the qemu config.

Userspace's map/unmap operations are done by fallocate() ioctl on the
backing store fd.
Reference:
https://lore.kernel.org/linux-mm/20220519153713.819591-1-chao.p.peng@linux.intel.com/

Fixes: #7770

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Fabiano Fidêncio
8115a0522d versions: tdx: Update Kernel to 6.2 + TDX
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:34 +02:00
Fabiano Fidêncio
ec18180f34 versions: tdx: Update TDVF to the "edk2-stable202302"
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:34 +02:00
Fabiano Fidêncio
9803b24286 versions: tdx: Update QEMU to v7.2 + TDX v1.10
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:27 +02:00
Fabiano Fidêncio
02a08c956b Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment
tests: delete k8s deployment at the test's end
2023-08-27 17:52:00 +02:00
Fabiano Fidêncio
98037ced52 Merge pull request #7755 from microsoft/danmihai1/unique-test-name
tests: use unique test name
2023-08-27 17:27:40 +02:00
Zhongtao Hu
f0440a9cfe Merge pull request #7742 from frezcirno/fix-log-forwarder-loop
runtime-rs: check peer close in log_forwarder
2023-08-26 10:44:09 +08:00
Fabiano Fidêncio
16a610d788 Merge pull request #7758 from fidencio/topic/gha-avoid-fail-fast-till-everything-is-ultra-stable
gha: Avoid "fail-fast" in tests that are known to be flaky
2023-08-25 16:49:26 +02:00
Jiang Liu
91db888d83 Merge pull request #7602 from jiangliu/agent-storage
Refine storage device management for kata-agent
2023-08-25 22:20:18 +08:00
Zixuan Tan
dffc16e5b3 runtime-rs: check peer close in log_forwarder
The log_forwarder task does not check if the peer has closed, causing a
meaningless loop during the period between “kata vm exit”, when the peer
closes, and “ShutdownContainer RPC received”, which aborts the log forwarder.

This patch fixes the problem.

Fixes: #7741

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2023-08-25 19:00:07 +08:00
Jiang Liu
aaa5ab1264 agent: simplify storage device by removing StorageDeviceObject
Simplify storage device implementation by removing StorageDeviceObject.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-25 17:23:16 +08:00
Fabiano Fidêncio
fb49d5d7ce gha: Avoid "fail-fast" in tests that are known to be flaky
Otherwise we'll have to re-run all the tests due to a flaky behaviour in
one of the parts.

Fixes: #7757

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-25 10:00:17 +02:00
Dan Mihai
183f51d6f6 tests: use unique test name
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.

Fixes: #7753

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:41:06 +00:00
Dan Mihai
6a974679f2 tests: delete k8s deployment at the test's end
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.

Fixes: #7752

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:34:37 +00:00
David Esparza
686eb3878b Merge pull request #7751 from GabyCT/topic/unusednhwc
metrics: Remove unused variable in tensorflow nhwc script
2023-08-24 18:34:06 -06:00
Fabiano Fidêncio
f1d8e1f513 Merge pull request #7747 from fidencio/topic/kata-deploy-dont-try-to-remove-opt-kata
kata-deploy: Don't try to remove /opt/kata
2023-08-24 18:56:52 +02:00
Gabriela Cervantes
32a778b6da metrics: Remove unused variable in tensorflow nhwc script
This PR removes an unused variable in the tensorflow nhwc script.

Fixes #7750

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-24 15:54:27 +00:00
David Esparza
875a85ee14 Merge pull request #7736 from GabyCT/topic/tensorflowfp32
metrics: Add TensorFlow ResNet50 FP32 benchmark
2023-08-24 08:56:24 -06:00
Fabiano Fidêncio
d8f3ce6497 kata-deploy: Don't try to remove /opt/kata
The directory is a host path mount and cannot be removed from within the
container.  What we actually want to remove is whatever is inside that
directory.

This may raise errors like:
```
rm: cannot remove '/opt/kata/': Device or resource busy
```

Fixes: #7746

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-24 13:57:36 +02:00
Jeremi Piotrowski
71c90b994a Merge pull request #7745 from jepio/vfio-part-0
gha: vfio: Run on Ubuntu 23.04 runner
2023-08-24 12:15:19 +02:00
Greg Kurz
9991772b26 Merge pull request #7718 from littlejawa/fix_filemode_when_zero
kata-agent: use default filemode for block device when it is set to 0
2023-08-24 11:40:28 +02:00
Jeremi Piotrowski
936e8091a7 gha: vfio: Run on Ubuntu 23.04 runner
The vfio test requires nested-nested virtualization:

L0 Azure host
-> L1 Ubuntu VM
  -> L2 Fedora VM
    -> L3 Kata

This hits a kernel bug on v5.15 but works quite nicely on the v6.2 kernel
included in Ubuntu 23.04. We can switch back to Ubuntu 22.04 when they roll out
v6.2.

Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-24 10:10:02 +02:00
Jiang Liu
0e7248264d agent: move storage device related code into dedicated files
Move storage device related code into dedicated files.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:48:51 +08:00
Xuewei Niu
268e846558 runtime-rs: Fix volumes and rootfs cleanup issues
There are several processes for container exit:

- Non-detach mode: `Wait` request is sent by containerd, then
  `wait_process()` will be called eventually.
- Detach mode: `Wait` request is not sent, the `wait_process()` won’t be
  called.
    - Killed by ctr: For example, a container runs `tail -f /dev/null`, and
      is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is
      sent, then `kill_process()` will be called. User executes `sudo ctr c
      rm <CID>`, `Delete` request is sent, then `delete_process()` will be
      called.
    - Exited on its own: For example, a container runs `sleep 1s`. The
      container’s state goes to `Stopped` after 1 second. User executes
      the delete command as below.

Where do we do container cleanup things?

- `wait_process()`: No, because it won’t be called in detach mode.
- `delete_process()`: No, because it depends on when the user executes the
  delete command.
- `run_io_wait()`: Yes. A container is considered exited once its IO has
  ended, and this is always called once a container is launched.

Fixes: #7713

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-24 13:23:47 +08:00
Jiang Liu
8f49ee33b2 agent: refine storage related code a bit
Refine storage related code by:
- remove the STORAGE_HANDLER_LIST
- define type alias
- move code near to its caller

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:09:10 +08:00
Jiang Liu
60ca12ccb0 agent: switch to new storage subsystem
Switch to new storage subsystem to create a StorageDevice for each
storage object.

Fixes: #7614

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:09:09 +08:00
Jiang Liu
fcbda0b419 kata-types: introduce StorageDevice and StorageHandlerManager
Introduce StorageDevice and StorageHandlerManager, which will be used
to refine storage device management for kata-agent.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 13:08:55 +08:00
Jiang Liu
b03b1f6134 agent: simplify the way to manage storage object
Simplify the way to manage storage objects, and introduce
StorageStateCommon structures for coming extensions.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:58:24 +08:00
Jiang Liu
8392c71bf2 sys-util: support more mount flags in parse_mount_options()
Support more mount flags in parse_mount_options().

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:39 +08:00
Jiang Liu
c00d8f3d48 agent: use create_mount_destination() from kata-sys-util
Use create_mount_destination() from kata-sys-util crate to reduce
redundant code.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:38 +08:00
Jiang Liu
5e867f0538 types: add more mount related constants
Add more mount related constants.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:36 +08:00
Jiang Liu
880e6c9a76 agent: use function from kata-sys-utils to reduce code
Use function get_linux_mount_info() from kata-sys-util crate to share
common code.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-24 12:17:34 +08:00
QuanweiZhou
a6921dd837 Merge pull request #7698 from jiangliu/virtual-volume
kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull
2023-08-24 11:50:39 +08:00
Fabiano Fidêncio
7705c5962e Merge pull request #7728 from ManaSugi/fix/typo-test-toml
libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
2023-08-23 23:55:41 +02:00
GabyCT
c1712e1930 Merge pull request #7737 from jepio/fix-local-build
local-build: Remove GID before creating group
2023-08-23 12:26:39 -06:00
Jeremi Piotrowski
3b881fbc0e local-build: Remove GID before creating group
docker install now creates a group with gid 999 which happens to match what we
need to get docker-in-docker to work. Remove the group first as we don't need
it.

Fixes: #7726
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-23 18:58:38 +02:00
David Esparza
ebce5d25a9 Merge pull request #7734 from fidencio/topic/kata-deploy-fix-removal
kata-deploy: Avoid failing on content removal
2023-08-23 10:29:57 -06:00
Gabriela Cervantes
959ca49447 metrics: Add TensorFlow ResNet50 fp32 Dockerfile
This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-23 16:24:58 +00:00
Gabriela Cervantes
4b7d72c4a8 metrics: Add TensorFlow ResNet50 FP32 benchmark
This PR adds TensorFlow ResNet50 FP32 benchmark for kata metrics.

Fixes #7735

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-23 16:21:09 +00:00
Fabiano Fidêncio
e7e4cc2182 Merge pull request #7716 from bergwolf/github/image-initrd-assets
runtime: fix image and initrd assets handling
2023-08-23 18:02:15 +02:00
Fabiano Fidêncio
5cba38c175 kata-deploy: Avoid failing on content removal
We can simply use `rm -f` all over the place and avoid the container
returning any error.

Fixes: #7733

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-23 16:49:26 +02:00
Peng Tao
18d42da21e runtime/fc: fix image/initrd annotation handling
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.

Make sure annotations are preferred over config options in image and initrd
path handling.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-23 03:47:28 +00:00
Peng Tao
9fda7059a5 runtime/clh: fix image/initrd annotation handling
We should make sure annotations are preferred over
config options in image and initrd path handling.

Fixes: #7705
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-23 03:47:28 +00:00
Peng Tao
1a0092d631 runtime/qemu: fix image/initrd annotation handling
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.

Add a helper function ImageOrInitrdAssetPath to make sure annotations
are preferred over config options in image and initrd path handling.
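
The precedence rule can be sketched as follows. This is an illustration only, written in Rust rather than the runtime's own language: `BootAsset`, `AssetPaths`, `pick`, and `image_or_initrd_asset` are hypothetical names, and checking the image before the initrd within a single source is an assumption.

```rust
/// Which boot asset was selected (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum BootAsset {
    Image(String),
    Initrd(String),
}

/// Image and initrd paths coming from one source (annotations or config file).
struct AssetPaths {
    image: Option<String>,
    initrd: Option<String>,
}

/// Picks an asset from a single source; image first is an assumed ordering.
fn pick(paths: &AssetPaths) -> Option<BootAsset> {
    if let Some(p) = &paths.image {
        return Some(BootAsset::Image(p.clone()));
    }
    paths.initrd.as_ref().map(|p| BootAsset::Initrd(p.clone()))
}

/// Annotations win; the config file is consulted only if they set nothing.
fn image_or_initrd_asset(annotations: &AssetPaths, config: &AssetPaths) -> Option<BootAsset> {
    pick(annotations).or_else(|| pick(config))
}
```

With an image annotation and an initrd in the config file, this returns the annotated image, which is the case the commit message describes as previously broken.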

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-23 03:47:27 +00:00
Manabu Sugimoto
22d8f335d6 libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml
Change `pdisable_guest_seccomp` to `disable_guest_seccomp`

Fixes: #7727

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-23 12:08:18 +09:00
GabyCT
b8990c0490 Merge pull request #7722 from GabyCT/topic/adddiskreadme
metrics: Add disk link to README
2023-08-22 12:29:54 -06:00
GabyCT
514d3d42b8 Merge pull request #7712 from GabyCT/topic/fixfiopath
metrics: Fix FIO path
2023-08-22 12:28:28 -06:00
Gabriela Cervantes
8afd158cef metrics: Add disk link to README
This PR adds disk link to README documentation for kata metrics.

Fixes #7721

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-22 16:20:31 +00:00
Julien Ropé
40914b25d4 kata-agent: use default filemode for block device when it is set to 0
When the FileMode field for the device is unset (0), use a default value instead
to allow the use of the device from the container.
This behaviour is seen from cri-o typically.

Note: this is what runc is doing, which is why regular containers don't have an
issue. This change makes sure kata behaves the same as runc.
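
A minimal sketch of the fallback described above, not the actual agent code: `effective_file_mode` and `DEFAULT_DEVICE_FILE_MODE` are hypothetical names, and the `0o666` default is an assumption made for illustration.

```rust
/// Assumed default permission bits for a device node (illustrative value).
const DEFAULT_DEVICE_FILE_MODE: u32 = 0o666;

/// Returns the mode to use when creating the device node.
/// An unset or zero mode falls back to the default so the container
/// can still open the device.
fn effective_file_mode(requested: Option<u32>) -> u32 {
    match requested {
        Some(mode) if mode != 0 => mode,
        _ => DEFAULT_DEVICE_FILE_MODE,
    }
}
```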

Fixes: #7717

Signed-off-by: Julien Ropé <jrope@redhat.com>
2023-08-22 16:08:14 +02:00
Fabiano Fidêncio
8032797418 Merge pull request #7708 from microsoft/danmihai1/kata-deploy-log
gha: capture additional kata-deploy output
2023-08-21 23:43:51 +02:00
David Esparza
d2c130ea69 Merge pull request #7710 from GabyCT/topic/fixpytorch1
metrics: Use function from metrics common in pytorch script
2023-08-21 15:31:24 -06:00
Gabriela Cervantes
eee2ee6eeb metrics: Fix FIO path
This PR fixes the FIO path for the FIO files.

Fixes #7711

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 21:06:04 +00:00
David Esparza
9347051592 Merge pull request #7666 from dborquez/metrics_improve_fio_test
metrics: Enable kata runtime in K8s for FIO test.
2023-08-21 13:51:57 -06:00
Gabriela Cervantes
39bc3488f5 metrics: Use function from metrics common in pytorch script
This PR uses a common function in the pytorch script.

Fixes #7709

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 16:12:35 +00:00
Dan Mihai
400eb88743 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-21 15:58:57 +00:00
GabyCT
700759232f Merge pull request #7690 from GabyCT/topic/fixpytorch
metrics: Fix README for pytorch
2023-08-21 09:50:14 -06:00
Jiang Liu
6e038e66e4 Merge pull request #7680 from GabyCT/topic/removetime
metrics: Remove unused variable in tensorflow mobilenet script
2023-08-21 23:39:07 +08:00
Jiang Liu
4aee3eade0 kata-types: implement serde methods for KataVirtualVolume
Implement serialization/deserialization methods for KataVirtualVolume.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:46:56 +08:00
Jiang Liu
b875e39323 kata-types: validate KataVirtualVolume object
Implement method validate() for KataVirtualVolume to validate message
format.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:42:07 +08:00
Jiang Liu
fa2fdc1057 kata-types: implement two conversion helpers for KataVirtualVolume
Enable conversions from NydusExtraOptions/DirectVolumeMountInfo to
KataVirtualVolume.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:35:26 +08:00
Jiang Liu
6326af20e3 kata-types: introduce KataVirtualVolume
Introduce structure KataVirtualVolume to encapsulate information
for extra mount options and direct volumes, so we could build a common
infrastructure to handle these cases.

Fixes: #7699

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-21 16:19:47 +08:00
Gabriela Cervantes
c8b43f8b3e metrics: Fix README for pytorch
This PR fixes the pytorch reference in the README file.

Fixes #7689

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-18 20:14:49 +00:00
Aurélien
fa34d61805 Merge pull request #7664 from microsoft/danmihai1/agent-init-policy
rootfs: agent: Policy support with AGENT_INIT=yes
2023-08-18 10:51:55 -07:00
Fabiano Fidêncio
7e66d1f6b5 Merge pull request #7649 from fidencio/topic/k8s-tests-remove-kata-deploy-tests
gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy
2023-08-18 07:47:26 +02:00
David Esparza
fb571f8be9 metrics: Enable kata runtime in K8s for FIO test.
This PR configures the corresponding kata runtime in K8s
based on the tested hypervisor.

This PR also enables FIO metrics test in the kata metrics-ci.

Fixes: #7665

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-17 17:11:27 -06:00
Dan Mihai
cb056f8cb3 rootfs: agent: Policy support with AGENT_INIT=yes
When building with AGENT_POLICY=yes and AGENT_INIT=yes:
1. Include OPA and the Policy settings in rootfs.
2. Start OPA from the kata agent.

Before these changes, building with both AGENT_POLICY=yes and
AGENT_INIT=yes was unsupported.

Starting OPA from systemd (when AGENT_INIT=no) was already supported.

Fixes: #7615

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-17 22:37:58 +00:00
GabyCT
c358056a3f Merge pull request #7685 from GabyCT/topic/changename
metrics: Fix check results for tensorflow benchmark
2023-08-17 15:39:43 -06:00
Gabriela Cervantes
85c02828e1 metrics: Update tensorflow name in gha run script
This PR updates the tensorflow name in the gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 20:17:48 +00:00
Gabriela Cervantes
e8a5119343 metrics: Fix check results for tensorflow benchmark
This PR fixes the check results for the tensorflow benchmark now
that we have changed the name of the test.

Fixes #7684

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 19:52:45 +00:00
Fabiano Fidêncio
2d896ad12f gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
Instead of doing this as part of the test itself, let's ensure it's done
before running the tests and during the tests cleanup.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
Fabiano Fidêncio
4ffc2c86f3 gha: kata-deploy: Add the first kata-deploy test
This test, at least for now, only checks whether the runtimeclasses
have been properly created.

This is just a migration from a test we had as part of the k8s suite.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
GabyCT
4ba684e6e4 Merge pull request #7653 from GabyCT/topic/tensorflowfp32
metrics: Add Tensorflow ResNet50 int8 benchmark
2023-08-17 10:44:25 -06:00
Gabriela Cervantes
8616c050ae metrics: Remove unused variable in tensorflow mobilenet script
This PR removes an unused variable in the tensorflow mobilenet script.

Fixes #7679

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 16:04:18 +00:00
Fabiano Fidêncio
285e616b5e tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:16 +02:00
Fabiano Fidêncio
790bd3548d tests: common: Don't fail if yq is not part of the cache
This may happen on external runners.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:14 +02:00
Fabiano Fidêncio
ce6adecd0a gha: kata-deploy: Add run-kata-deploy-tests.sh
This will have the same function as run-k8s-tests.sh has, but for
kata-deploy.

Right now it doesn't have any tests, and the command to actually run the
tests is commented out; this is just a placeholder that will be
populated sooner rather than later.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:49:03 +02:00
Fabiano Fidêncio
cfc29c11a3 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
In a follow-up series, we'll add a whole suite for the kata-deploy
tests.  With this in mind, let's already get rid of this one and keep
more kata-deploy tests from landing here.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:48:54 +02:00
Fabiano Fidêncio
e470a650e0 Merge pull request #7654 from sprt/ci-fixes
kata-deploy: Properly create default runtime class
2023-08-17 09:43:34 +02:00
Wedson Almeida Filho
962378606e Merge pull request #7627 from wedsonaf/error-conv
agent: simplify error handling
2023-08-16 21:02:38 -03:00
Aurélien Bombo
f4dd152863 tests: k8s: Call ensure_yq() in setup.sh
It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing
the yq error so let's try this instead.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 14:13:56 -07:00
GabyCT
3d0cfc88c9 Merge pull request #7662 from GabyCT/topic/fixhelptensorflow
metrics: Fix MobileNet help me description
2023-08-16 14:13:39 -06:00
Aurélien Bombo
339569b69c kata-deploy: Properly create default runtime class
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.

Fixes: #7663

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 11:04:44 -07:00
Gabriela Cervantes
2a491e9b1f metrics: Fix MobileNet help me description
This PR fixes MobileNet help me description in the
tensorflow script.

Fixes #7661

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-16 15:25:39 +00:00
Fabiano Fidêncio
606e419fac Merge pull request #7660 from fidencio/topic/add-kata-deploy-tests-as-part-of-the-ci
gha: ci: Start running kata-deploy tests
2023-08-16 16:44:08 +02:00
Fabiano Fidêncio
d19a75e80c gha: ci: Start running kata-deploy tests
Let's add the tests as part of the ci.yaml, so they can be triggered as
part of each PR.

For this PR those tests won't be triggered, courtesy of the
`pull_request_target` event we rely on.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-16 16:08:05 +02:00
Fabiano Fidêncio
4adcf2192e Merge pull request #7651 from ManaSugi/runk/containerd-test
runk: Modify kill command's error message for containerd tests
2023-08-16 15:37:48 +02:00
Zhongtao Hu
5c8a61a4c8 Merge pull request #7558 from openanolis/fix/driver_option
runtime-rs: add driver option
2023-08-16 13:56:29 +08:00
Zhongtao Hu
d90f7ac689 runtime-rs: add unit test for block driver
add unit test for block driver

Fixes: #7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:45:27 +08:00
Zhongtao Hu
e44919f0da runtime-rs: add load_test_config for unit test
add load_test_config for unit test

Fixes: #7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:56 +08:00
Zhongtao Hu
7f48a69379 runtime-rs: add driver option
Add the driver option when handling Linux devices.

Fixes: #7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:49 +08:00
Gabriela Cervantes
bade6a5c3b docs: Fix TensorFlow word across the document
This PR fixes the spelling of TensorFlow so that it is uniform across
the document.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 20:13:05 +00:00
Fabiano Fidêncio
0bc48eab60 Merge pull request #7640 from fidencio/topic/gha-cri-containerd-enable-tests
gha: cri-containerd: Enable tests
2023-08-15 21:18:28 +02:00
Gabriela Cervantes
1a1b207760 docs: Add Tensorflow Resnet50 documentation
This PR adds the Tensorflow Resnet50 documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:46:44 +00:00
Gabriela Cervantes
24baededc0 metrics: Add Dockerfile for ResNet50 int8
This PR adds the dockerfile for ResNet50 int8 benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:38:26 +00:00
Gabriela Cervantes
6d971ba8df metrics: Add Tensorflow ResNet50 int8 benchmark
This PR adds the Tensorflow ResNet50 int8 script for kata metrics.

Fixes #7652

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:30:22 +00:00
Manabu Sugimoto
25d151bd1b runk: Modify kill command's error message for containerd tests
The error message when the kill command is executed with the container's
state == Stopped should be "container not running" because the containerd
tests expect that OCI runtimes return the error message and compare it.
If the error message is different from the expected one, the tests fail.

Fixes: #7650

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-16 00:39:50 +09:00
GabyCT
0bbabeaaf8 Merge pull request #7644 from GabyCT/topic/renametensorflow
metrics: Rename tensorflow scripts
2023-08-15 09:23:24 -06:00
Fabiano Fidêncio
46d25d908d Merge pull request #7643 from fidencio/topic/add-functional-kata-deploy-tests
gha: tests: Add kata-deploy functional tests -- Part 1
2023-08-15 15:23:48 +02:00
Fabiano Fidêncio
b3592ab25c gha: cri-containerd: Enable tests
As the cri-containerd tests have been fully migrated to GHA, let's make
sure we get them running.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:32:42 +02:00
Fabiano Fidêncio
84dd02e0f9 gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
As part of the runners, we're hitting a timeout that I cannot reproduce,
at all, when allocating the same instance and running the tests
manually.

The default timeout to connect to the server is 2s when using `crictl`.
Let's increase this to 20s.

It's fairly important to mention that in the first tests I used a
timeout of 10s, and that helped but we still hit issues every now and
then.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
b29782984a gha: cri-containerd: Show pod before deleting it
It'll help us to debug failures with the pod stop / pod delete.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
ae0930824a gha: cri-containerd: Print kata logs in case of error
We need this to fully understand the issues we're facing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
6c8b2ffa60 gha: cri-containerd: Group containerd logs
This improves readability in case of failures by a lot.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
9e898701f5 gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
Short commit log says it all.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Wedson Almeida Filho
76dac8f22c agent: simplify error handling
We extend the `Result` and `Option` types with associated types that
allow converting a `Result<T, E>` and `Option<T>` into
`ttrpc::Result<T>`.

This allows the elimination of many `match` statements in favor of
calling the map function plus the `?` operator. This transformation
simplifies the code.
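
The pattern can be sketched with extension traits. This is an illustration under stated assumptions, not the agent's code: `RpcError` and `RpcResult` are hypothetical stand-ins for the ttrpc types, the trait and method names are invented, and the numeric codes are arbitrary illustrative values.

```rust
use std::fmt::Display;

/// Hypothetical stand-in for a ttrpc status error.
#[derive(Debug, PartialEq)]
struct RpcError {
    code: i32,
    message: String,
}

/// Hypothetical stand-in for `ttrpc::Result<T>`.
type RpcResult<T> = Result<T, RpcError>;

// Illustrative status codes, chosen only for this sketch.
const CODE_NOT_FOUND: i32 = 5;
const CODE_INTERNAL: i32 = 13;

/// Extension trait: turn any displayable error into an `RpcError`.
trait ResultToRpcResult<T> {
    fn map_rpc_err(self, context: &str) -> RpcResult<T>;
}

impl<T, E: Display> ResultToRpcResult<T> for Result<T, E> {
    fn map_rpc_err(self, context: &str) -> RpcResult<T> {
        self.map_err(|e| RpcError {
            code: CODE_INTERNAL,
            message: format!("{}: {}", context, e),
        })
    }
}

/// Extension trait: turn a missing value into an `RpcError`.
trait OptionToRpcResult<T> {
    fn ok_or_rpc_err(self, message: &str) -> RpcResult<T>;
}

impl<T> OptionToRpcResult<T> for Option<T> {
    fn ok_or_rpc_err(self, message: &str) -> RpcResult<T> {
        self.ok_or_else(|| RpcError {
            code: CODE_NOT_FOUND,
            message: message.to_string(),
        })
    }
}

/// A handler body that needs no `match`: each step converts and uses `?`.
fn parse_port(raw: Option<&str>) -> RpcResult<u16> {
    let text = raw.ok_or_rpc_err("port missing")?;
    let port = text.parse::<u16>().map_rpc_err("invalid port")?;
    Ok(port)
}
```

Each conversion that previously needed its own `match` arm collapses into one method call followed by `?`.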

Fixes: #7624

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-15 06:55:27 -03:00
Fabiano Fidêncio
e107d1d94e Merge pull request #7574 from microsoft/danmihai1/policy
agent: runtime: add Agent Policy feature
2023-08-15 11:29:13 +02:00
Bin Liu
ea81eb6c2e Merge pull request #7169 from chethanah/runk/support-no-pid-ns
runk: Support without pid ns
2023-08-15 13:00:40 +08:00
Gabriela Cervantes
18a7fd8e4e metrics: Rename tensorflow scripts
This PR renames the tensorflow scripts to include the data format
that is being used as we will have multiple tests with different
data and model formats for tensorflow so this will help us to
distinguish them.

Fixes #7645

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-14 20:40:35 +00:00
GabyCT
a740c80251 Merge pull request #7626 from GabyCT/topic/cassandrak
metrics: Add Cassandra Kubernetes benchmark for kata metrics
2023-08-14 14:22:52 -06:00
GabyCT
4e5e39e8b3 Merge pull request #7618 from GabyCT/topic/addfunctionscommon
metrics: Add common functions to the common script
2023-08-14 14:22:30 -06:00
GabyCT
a19d471c01 Merge pull request #7629 from dborquez/metrics_improve_stopping_kata_components
metrics: fix the loop used to stop kata components
2023-08-14 14:22:06 -06:00
Fabiano Fidêncio
e55fa93db9 tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx
This will not be tested as part of the PR, thanks to the
`pull_request_target` event, but we want it to be added so we can build
on top of that in an upcoming series.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 21:38:00 +02:00
Fabiano Fidêncio
d9ee17aaec tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks
This will not be tested as part of the PR, thanks to the
`pull_request_target` event, but we want it to be added so we can build
on top of that in an upcoming series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 21:37:52 +02:00
Chelsea Mafrica
22465d22f0 Merge pull request #7638 from ManaSugi/fix/virtcontainers-doc
docs: Remove installation step in virtcontainers doc
2023-08-14 10:21:57 -07:00
Dan Mihai
ab829d1038 agent: runtime: add the Agent Policy feature
Fixes: #7573

To enable this feature, build your rootfs using AGENT_POLICY=yes. The
default is AGENT_POLICY=no.

Building rootfs using AGENT_POLICY=yes has the following effects:

1. The kata-opa service gets included in the Guest image.

2. The agent gets built using AGENT_POLICY=yes.

After this patch, the shim calls SetPolicy if and only if a Policy
annotation is attached to the sandbox/pod. When creating a sandbox/pod
that doesn't have an attached Policy annotation:

1. If the agent was built using AGENT_POLICY=yes, the new sandbox uses
   the default agent settings, that might include a default Policy too.

2. If the agent was built using AGENT_POLICY=no, the new sandbox is
   executed the same way as before this patch.

Any SetPolicy calls from the shim to the agent fail if the agent was
built using AGENT_POLICY=no.

If the agent was built using AGENT_POLICY=yes:

1. The agent reads the contents of a default policy file during sandbox
   start-up.

2. The agent then connects to the OPA service on localhost and sends
   the default policy to OPA.

3. If the shim calls SetPolicy:

   a. The agent checks if SetPolicy is allowed by the current
      policy (the current policy is typically the default policy
      mentioned above).

   b. If SetPolicy is allowed, the agent deletes the current policy
      from OPA and replaces it with the new policy it received from
      the shim.

   A typical new policy from the shim doesn't allow any future SetPolicy
   calls.

4. For every agent rpc API call, the agent asks OPA if that call
   should be allowed. OPA allows or rejects a call based on the current
   policy, the name of the agent API, and the API call's inputs. The
   agent rejects any calls that are rejected by OPA.
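
The gating flow in steps 3 and 4 can be sketched roughly as below. This is a heavily simplified illustration, not the agent's implementation: `PolicyEngine` is a hypothetical in-memory stand-in for the OPA service, a policy is reduced to a set of allowed API names (real policies also evaluate the call's inputs), and the `SetPolicyRequest` and `ExecProcessRequest` names are assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for OPA: the policy is just a set of allowed API names.
struct PolicyEngine {
    allowed: HashSet<String>,
}

impl PolicyEngine {
    fn new(allowed: &[&str]) -> Self {
        PolicyEngine {
            allowed: allowed.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Asks the current policy whether an API call may proceed.
    fn is_allowed(&self, api: &str) -> bool {
        self.allowed.contains(api)
    }

    /// Replaces the current policy, but only if the current policy permits it.
    fn set_policy(&mut self, new_allowed: &[&str]) -> Result<(), String> {
        if !self.is_allowed("SetPolicyRequest") {
            return Err("SetPolicyRequest is blocked by policy".to_string());
        }
        // Drop the old policy and install the new one.
        *self = PolicyEngine::new(new_allowed);
        Ok(())
    }
}

/// Every agent API call is checked before it is handled.
fn handle_call(engine: &PolicyEngine, api: &str) -> Result<(), String> {
    if engine.is_allowed(api) {
        Ok(())
    } else {
        Err(format!("{} is blocked by policy", api))
    }
}
```

In this sketch, a new policy that omits `SetPolicyRequest` locks itself in: later `set_policy` calls fail, matching the note that a typical policy from the shim does not allow further SetPolicy calls.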

When building using AGENT_POLICY_DEBUG=yes, additional Policy logging
gets enabled in the agent. In particular, information about the inputs
for agent rpc API calls is logged in /tmp/policy.txt, on the Guest VM.
These inputs can be useful for investigating API calls that might have
been rejected by the Policy. Examples:

1. Load a failing policy file test1.rego on a different machine:

opa run --server --addr 127.0.0.1:8181 test1.rego

2. Collect the API inputs from Guest's /tmp/policy.txt and test on the
   machine where the failing policy has been loaded:

curl -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \
--data-binary @test1-inputs.json

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-14 17:07:35 +00:00
Fabiano Fidêncio
831e73ff91 tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
Right now this file does nothing, as it's not even called by any GHA.
However, it'll be populated later on as part of a different series,
where we'll have kata-deploy specific tests running here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:46:10 +02:00
Fabiano Fidêncio
af1b46bbf2 tests: Add gha-run-k8s-common.sh
Let's split a good portion of `tests/integration/kubernetes/gha-run.sh`
out, and put it in a place where it can be used by the soon-to-come
kata-deploy specific tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:45:58 +02:00
Jeremi Piotrowski
a57e7ffe14 Merge pull request #7211 from stevenhorsman/propogate-secrets
Propagate secrets, config maps, etc. into guest if sharedFS is not available
2023-08-14 11:24:47 +02:00
Manabu Sugimoto
416445e7eb docs: Remove installation step in virtcontainers doc
Remove the installation step in the virtcontainers doc
because the virtcontainers install/uninstall targets have
been removed by 86723b51ae
and they are not used anymore.

Fixes: #7637

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-14 15:15:24 +09:00
Fabiano Fidêncio
b975c27793 Merge pull request #7547 from stevefan1999-personal/patch-k0s
kata-deploy: Preliminary k0s support
2023-08-12 14:28:13 +02:00
Fabiano Fidêncio
6ed57d1e9a Merge pull request #7447 from fidencio/topic/gha-move-static-jenkins-to-azure-instances
gha: static-checks: Move to the Azure instances
2023-08-12 13:31:54 +02:00
Steve Fan
72cbcf040b kata-deploy: Add k0s support
Add k0s support to kata-deploy, in the very same way kata-containers
already supports k3s, and rke2.

k0s support requires v1.27.1, which is noted as part of the kata-deploy
documentation, as it's the way to use dynamic configuration on
containerd CRI runtimes.

This support will only be part of the `main` branch, as it's not a bug
fix that can be backported to the `stable-3.2` branch, and this is also
noted as part of the documentation.

Fixes: #7548
Signed-off-by: Steve Fan <29133953+stevefan1999-personal@users.noreply.github.com>
2023-08-11 21:17:23 +02:00
David Esparza
767434d50a metrics: fix the loop used to stop kata components #7629
This PR fixed the loop that stops the kata-shim and the
hypervisors used in metrics checks.

Fixes: #7628

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-11 12:32:41 -06:00
Gabriela Cervantes
5d0f0d43c7 metrics: Add cassandra statefulset yaml
This PR adds cassandra statefulset yaml for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:39 +00:00
Gabriela Cervantes
c1dcc1396f metrics: Add cassandra service yaml
This PR adds the cassandra service yaml for the benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:36 +00:00
Gabriela Cervantes
2297a0d1c5 metrics: Add block loop pvc yaml for cassandra
This PR adds block loop pvc yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:33 +00:00
Gabriela Cervantes
e3d511946f metrics: Add block loop pv yaml for cassandra test
This PR adds the block loop pv yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:29 +00:00
Gabriela Cervantes
9890271594 metrics: Add block loop pvc for cassandra test
This PR adds the block loop pvc for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:19 +00:00
Gabriela Cervantes
349b89969a metrics: Add Cassandra Kubernetes benchmark for kata metrics
This PR adds Cassandra Kubernetes benchmark for kata metrics tests.

Fixes #7625

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:21:48 +00:00
Fabiano Fidêncio
c52d090522 gha: static-checks: Move to the Azure instances
The GHA runners are not exactly powerful, which makes the static-checks
take way too long (almost an hour).

Let's give it a try and move those to the same size of Azure instances used
as part of our CI, which should reduce this time.

Fixes: #7446

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-11 18:47:47 +02:00
stevenhorsman
8815ed0665 runtime: Remove config warnings
Remove configuration file shared_fs = none warnings
now that there is a solution to updating configMaps, secrets etc

Fixes: #7210
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-08-11 16:31:08 +01:00
Yohei Ueda
afe1a6ac5a agent: support copying of directories and symlinks
This patch allows copying of directories and symlinks when
static file copying is used between host and guest. This change is
necessary to support recursive file copying between shim and agent.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(cherry picked from commit de232b8030)
2023-08-11 16:31:08 +01:00
Pradipta Banerjee
ab13ef87ee runtime: propagate configmap/secrets etc changes for remote-hyp
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.

Note that configmap updates take significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.

Fixes: #7210

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
(cherry picked from commit 3081cd5f8e)
(cherry picked from commit 68ec673bc4d9cd853eee51b21a0e91fcec149aad)
2023-08-11 16:31:08 +01:00
Yohei Ueda
c074ec4df1 runtime: Copy shared files recursively
This patch enables recursive file copying
when filesystem sharing is not used.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
(cherry picked from commit 5422a056f2)
(cherry picked from commit 16055ce040bbd724be2916bc518d89b69c9e0ca5)

Fixes: #7210
2023-08-11 16:16:52 +01:00
Peng Tao
a39fd6c066 Merge pull request #7611 from ManaSugi/fix/fc-version
versions: Update firecracker version to 1.4.0
2023-08-11 16:43:37 +08:00
Chao Wu
7031b5db07 Merge pull request #7535 from ManaSugi/fix/allow-redundant-clone
agent: Allow clippy::redundant_clone in the unit tests
2023-08-11 14:17:56 +08:00
Gabriela Cervantes
fdcd52ff78 metrics: Add check containers are running in tensorflow mobilenet
This PR adds check containers are running in tensorflow mobilenet
that is being defined in common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:17:20 +00:00
Gabriela Cervantes
36337ee146 metrics: Add check containers are up in tensorflow script
This PR adds the check containers are up function from common
in tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:15:18 +00:00
Gabriela Cervantes
f700f9b0ba metrics: Remove unused variable in tensorflow script
This PR removes an unused variable in tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:13:37 +00:00
Gabriela Cervantes
833cf7a684 metrics: Add check containers are running function
This PR adds the check containers are running function to the common metrics
script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:12:22 +00:00
Gabriela Cervantes
918c783084 metrics: Add check containers are up in tensorflow mobilenet script
This PR adds the check containers are up in the common script
in the tensorflow mobilenet script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:06:40 +00:00
Gabriela Cervantes
9d57a1fab4 metrics: Use check containers are up in tensorflow script
This PR uses the check containers are up from the common script
in the tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:42:09 +00:00
Gabriela Cervantes
1c84680d8c metrics: Add check containers are up in common script
This PR adds check containers are up in common script for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:39:24 +00:00
Gabriela Cervantes
d3e57cf454 metrics: Use collect_results function in tensorflow mobilenet test
This PR uses the collect results function defined in common for
the tensorflow mobilenet test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:34:30 +00:00
Gabriela Cervantes
286de046af metrics: Remove collect results function definition
This PR removes the collect results function from tensorflow script
as it is going to be referenced in the common metrics script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:31:23 +00:00
Gabriela Cervantes
9879709aae metrics: Add common functions to the common script
This PR adds the collect results function to the common metrics
script.

Fixes #7617

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:27:11 +00:00
Fabiano Fidêncio
a89c9cd620 Merge pull request #7557 from wedsonaf/no-new-vecs
agent: avoid creating new `Vec` instances when easily avoidable
2023-08-10 18:43:46 +02:00
Manabu Sugimoto
4746fa3daa docs: Specify supported Firecracker version using versions.yaml
Specify the supported version of Firecracker using our `versions.yaml`
to improve the maintainability of the documentation.

Fixes: #7610

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-10 16:49:45 +09:00
Manabu Sugimoto
cc922be5ec versions: Update firecracker version to 1.4.0
This patch upgrades Firecracker version from v1.1.0 to v1.4.0.

* Generate swagger models for v1.4.0 (from `firecracker.yaml`)
  - The version of go-swagger used is v0.30.0
* The firecracker v1.4.0 includes the following changes.
  - Added
    * Added support for custom CPU templates allowing users to adjust vCPU features
    exposed to the guest via CPUID, MSRs and ARM registers.
    * Introduced V1N1 static CPU template for ARM to represent Neoverse V1 CPU
    as Neoverse N1.
    * Added support for the virtio-rng entropy device. The device is optional. A
    single device can be enabled per VM using the /entropy endpoint.
    * Added a cpu-template-helper tool for assisting with creating and managing
    custom CPU templates.
  - Changed
    * Set FDP_EXCPTN_ONLY bit (CPUID.7h.0:EBX[6]) and ZERO_FCS_FDS bit
    (CPUID.7h.0:EBX[13]) in Intel's CPUID normalization process.
  - Fixed
    * Fixed feature flags in T2S CPU template on Intel Ice Lake.
    * Fixed CPUID leaf 0xb to be exposed to guests running on AMD host.
    * Fixed a performance regression in the jailer logic for closing open file
    descriptors.
    * A race condition that has been identified between the API thread and the VMM
    thread due to a misconfiguration of the api_event_fd.
    * Fixed CPUID leaf 0x1 to disable perfmon and debug feature on x86 host.
    * Fixed passing through cache information from host in CPUID leaf 0x80000006.
    * Fixed the T2S CPU template to set the RRSBA bit of the IA32_ARCH_CAPABILITIES
    MSR to 1 in accordance with an Intel microcode update.
    * Fixed the T2CL CPU template to pass through the RSBA and RRSBA bits of the
    IA32_ARCH_CAPABILITIES MSR from the host in accordance with an Intel microcode
    update.
    * Fixed passing through cache information from host in CPUID leaf 0x80000005.
    * Fixed the T2A CPU template to disable SVM (nested virtualization).
    * Fixed the T2A CPU template to set EferLmsleUnsupported bit
    (CPUID.80000008h:EBX[20]), which indicates that EFER[LMSLE] is not supported.

Fixes: #7610

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-10 16:48:13 +09:00
Fupan Li
39e67b06e9 dragonball: vsock add fifo/pipe stream support for passed fd hybridStream
Since the passed fd through unix socket would be any
stream fd such as pipe/fifo fd or any other socket
fd, thus we should deal with it as a normal hybrid
stream instead of a unix stream.

Fixes: #7584

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2023-08-10 11:07:10 +08:00
David Esparza
7bf994827d Merge pull request #7609 from dborquez/tensorflow_check_completion
metrics: compute tensorflow statistics
2023-08-09 18:47:47 -06:00
David Esparza
dcdb3b067f Merge pull request #7606 from GabyCT/topic/nginx
metrics: Add network nginx benchmark
2023-08-09 16:14:13 -06:00
David Esparza
2defdcc598 Merge pull request #7579 from dborquez/simplify_gha_metrics_workflow
metrics: install kata once and run multiple checks
2023-08-09 14:45:09 -06:00
David Esparza
473b0d3a31 metrics: compute tensorflow statistics
This PR computes average results for TF bench.
Additionally, it improves the data parsing from
all running containers.

Fixes: #7603

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-09 14:42:30 -06:00
Fabiano Fidêncio
0a8208c670 Merge pull request #7608 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-3
ci: unencrypted-image: Fix build context
2023-08-09 21:00:46 +02:00
Fabiano Fidêncio
03d1fa67b1 ci: unencrypted-image: Fix build context
The build context should be the folder where the Dockerfile is present,
otherwise the files copied into the image won't be found.

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 20:32:36 +02:00
Fabiano Fidêncio
eb463b38ec ci: unencrypted-image: Don't fail to build on s390x
Let's make sure that we don't fail in case we're building non x86_64.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 20:32:36 +02:00
Fabiano Fidêncio
ebc86091d1 Merge pull request #7607 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-2
ci: create-confidential-image: Add dependent actions
2023-08-09 19:53:49 +02:00
Fabiano Fidêncio
a2d731ad26 ci: create-confidential-image: Add dependent actions
Following the example on https://github.com/docker/build-push-action,
it's clear that the actions to "Set up QEMU" and "Set up Docker Buildx"
are missing.

Let's add them, and also take the advantage to bump the
build-push-action to its v4, which, by the way, had a typo on its name
(build-and-push-action does **NOT** exist, build-push-action does).

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 18:36:51 +02:00
Gabriela Cervantes
d1a6296221 metrics: Add nginx documentation to network README
This PR adds nginx documentation to network README for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-09 16:17:46 +00:00
Gabriela Cervantes
498f7c0549 metrics: Add nginx kubernetes yaml
This PR adds the nginx kubernetes yaml.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-09 16:14:04 +00:00
Gabriela Cervantes
f8a5255cf7 metrics: Add network nginx benchmark
This PR adds the network nginx benchmark for kata metrics.

Fixes #7605

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-09 16:12:21 +00:00
Fabiano Fidêncio
86f705d98b Merge pull request #7604 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-1
Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596
2023-08-09 18:05:46 +02:00
Fabiano Fidêncio
43fe5d1b90 ci: k8s: tees: Ensure PR_NUMBER is exported
Right now this is not being used, but it will be, as the image generated
for the confidential tests has it as part of its tag.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 17:45:42 +02:00
Fabiano Fidêncio
54f6a78500 ci: {{ pr-number }} should be {{ inputs.pr-number }}
One of the joys of relying on `pull_request_target` is that such mistakes
can only be caught after they are merged.

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 17:41:07 +02:00
Fabiano Fidêncio
5cdf981a2b Merge pull request #7596 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests
tests: Create image that will be used in the unencrypted confidential tests
2023-08-09 17:06:07 +02:00
Fabiano Fidêncio
c932369f42 Merge pull request #7492 from fidencio/topic/adapt-tests-to-the-new-kata-deploy-env-vars
kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests
2023-08-09 12:55:03 +02:00
Fabiano Fidêncio
034d7aab87 tests: k8s: Ensure the runtime classes are properly created
With these 2 simple checks we can ensure that we do not regress on the
behaviour of allowing the runtime classes / default runtime class to be
created by the kata-deploy payload.

Fixes: #7491

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:46:04 +02:00
Fabiano Fidêncio
fac8ccf5cd ci: Add build-and-publish-tee-confidential-unencrypted-image
This will be done before running TEE tests, and it's a hard dependency
for them.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:36:10 +02:00
Fabiano Fidêncio
ab5f603ffa ci: k8s: Add the image used for unencrypted confidential tests
Let's add here the image we'll be using for unencrypted confidential
tests.  Later on, we'll make sure to build and use this image as part of
our CI.

The image can easily be built as a multi-arch image, and has `cpuid`
installed in case of `x86_64` build, so it can be used to detect whether
we're running on a TEE guest without having to rely on `dmesg | grep
...`.

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:33:18 +02:00
Fabiano Fidêncio
36d53dd2af Merge pull request #7598 from UnmeshDeodhar/upgrade-bats-version
tests: upgrade bats version
2023-08-09 11:18:56 +02:00
Fabiano Fidêncio
1e8fe131bd k8s: tests: Take advantage of SHIMS and DEFAULT_SHIM env vars
We don't have to do any sed to replace the runtimeclass being used
once we start taking advantage of the `DEFAULT_SHIM` environment
variable exposed in the previous commits.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:15:34 +02:00
Wedson Almeida Filho
729b2dd611 agent: avoid creating new Vec instances when easily avoidable
There are many places where the code currently creates new `Vec`
instances when it's not really needed. The result is a perf hit because
it allocates memory, copies all elements, then frees the memory; in some
cases, copying elements also involves extra allocations (e.g., when
elements are strings, or structs containing strings).

This patch addresses a number of these cases.
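
As a rough illustration of the pattern described above (not code from the agent itself), the sketch below contrasts a function that takes an owned `Vec<String>` with one that borrows a slice. The function names are hypothetical and exist only for this example.

```rust
// Hypothetical helpers for illustration; they are not part of the kata agent.

/// Takes ownership of a Vec, so a caller that still needs its data must
/// clone the whole vector (and every String in it) before calling.
fn total_len_owned(options: Vec<String>) -> usize {
    options.iter().map(|o| o.len()).sum()
}

/// Borrows a slice instead: no new Vec, no element copies, no extra frees.
fn total_len_borrowed(options: &[String]) -> usize {
    options.iter().map(|o| o.len()).sum()
}
```

A caller holding `opts: Vec<String>` can pass `&opts` to the borrowing version and keep using `opts` afterwards, whereas the owning version forces `opts.clone()` in that situation.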

Fixes: #7203

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-09 02:38:36 -03:00
Jiang Liu
311671abb5 Merge pull request #7552 from jiangliu/agent-r1
Fix minor bugs and improve coding style of agent rpc/sandbox/mount
2023-08-09 13:19:02 +08:00
Unmesh Deodhar
aeaec9dae9 tests: upgrade bats version
Instead of using the package manager to install bats, build
it from source. This gives us an updated version of bats
that supports functions such as setup_file and
teardown_file.
We can use these functions in our current tests.

Fixes: #7597

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-08 18:16:39 -05:00
David Esparza
e664969862 metrics: install kata once and run multiple checks
This PR changes the metrics workflow in order to just install
kata once, and run the checks for multiple hypervisor variations.

In this way we save time avoiding installing kata for each
hypervisor to be tested.

Fixes: #7578

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-08 10:25:13 -06:00
Jiang Liu
baabfa9f1f agent: refine implementation of mount related code
Refine implementation of mount by:
- log message with `path.display()` instead of `{:?}`
- add prefix "_" to unused variables
- pass by reference instead of by value to avoid creating redundant
  array
- exactly matching prefix "fsgid=" instead of "fsgid"
- avoid redundant clone() operations
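
To illustrate why matching the exact prefix "fsgid=" matters, here is a minimal sketch. The helper `parse_fsgid` is hypothetical and not the agent's actual function.

```rust
/// Hypothetical helper: extracts the numeric value of an `fsgid=<gid>` option.
fn parse_fsgid(option: &str) -> Option<u32> {
    // `strip_prefix("fsgid=")` succeeds only on the exact key followed by '='.
    // A looser `starts_with("fsgid")` test would also accept "fsgidx=5".
    option.strip_prefix("fsgid=")?.parse().ok()
}
```

With the exact prefix, an unrelated option that merely begins with the same letters is rejected rather than misparsed.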

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:03:03 +08:00
Jiang Liu
98ba211a34 agent: fix a bug in update_ephemeral_mounts()
There's a bug in function update_ephemeral_mounts() which only handles
the first storage object and ignores all other storage objects.
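
The commit message does not show the faulty code, so the sketch below is only one plausible shape of such a bug: an early `return` inside the loop. The `Storage` struct and both function names are hypothetical stand-ins.

```rust
/// Hypothetical stand-in for a storage object; the real type has more fields.
struct Storage {
    mount_point: String,
}

/// Buggy shape: the unconditional `return` ends the loop after one element.
fn handle_first_only(storages: &[Storage]) -> Vec<String> {
    let mut handled = Vec::new();
    for s in storages {
        handled.push(s.mount_point.clone());
        return handled; // later storage objects are never visited
    }
    handled
}

/// Fixed shape: every storage object is visited.
fn handle_all(storages: &[Storage]) -> Vec<String> {
    storages.iter().map(|s| s.mount_point.clone()).collect()
}
```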

Fixes: #7551

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:03:02 +08:00
Jiang Liu
5333618d70 agent: make add_storage() take &[Storage] instead of Vec<Storage>
Simplify add_storage() by taking &[Storage] instead of Vec<Storage>.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:03:01 +08:00
Jiang Liu
37f34781d1 agent: simplify function online_cpu_memory()
Simplify function online_cpu_memory() by only calling update_cpuset_path()
for containers with cpuset configured.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:03:00 +08:00
Jiang Liu
d3c5422379 agent: refine style of code related to sandbox
Refine style of code related to sandbox by:
- remove unnecessary comments asking the caller to take the lock, as we
  already take `&mut self`.
- change "*count < 1 " to "*count == 0", as `count` is of type u32.
- make remove_sandbox_storage() take `&mut self` instead of `&self`.
- group related functions together
- avoid searching the map twice in function find_process()
- avoid unwrap() in function run_oom_event_monitor()
- avoid unwrap() in online_resources()
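
Two of the items above, searching a map only once and avoiding `unwrap()`, can be sketched together. The `Process` struct and the function names are hypothetical; the agent's real `find_process()` has a different signature.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tracked process.
struct Process {
    exit_code: i32,
}

/// Two lookups: `contains_key` followed by `get_mut(...).unwrap()`.
fn set_exit_twice(procs: &mut HashMap<u32, Process>, pid: u32, code: i32) -> bool {
    if procs.contains_key(&pid) {
        procs.get_mut(&pid).unwrap().exit_code = code;
        true
    } else {
        false
    }
}

/// One lookup and no `unwrap()`: match on the Option returned by `get_mut`.
fn set_exit_once(procs: &mut HashMap<u32, Process>, pid: u32, code: i32) -> bool {
    match procs.get_mut(&pid) {
        Some(p) => {
            p.exit_code = code;
            true
        }
        None => false,
    }
}
```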

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:02:59 +08:00
Jiang Liu
71a9f67781 agent: avoid unwrap() in function do_remove_container()
Avoid unwrap() in function do_remove_container(), and also make
implementation symmetric for both timeout and non-timeout cases.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:02:58 +08:00
Jiang Liu
84badd89d7 agent: avoid clone objects when possible
Optimize agent rpc implementation by:
- avoid cloning objects when possible
- avoid unwrap() when possible
- explicitly drop object to ensure order

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-08 18:02:56 +08:00
Chao Wu
b098960442 Merge pull request #7581 from justxuewei/bump-versions
deps: Bump dependent crate versions
2023-08-08 15:16:57 +08:00
Chao Wu
24bf637835 Merge pull request #7500 from pmores/fix-queue-num-in-dragonball-share-fs
fix number of queues handling in dragonball share fs device
2023-08-08 12:07:25 +08:00
Xuewei Niu
b23c5ed155 deps: Bump dependent crate versions
This pull request is mainly for updating vm-memory and vmm-sys-util.

The affected crates include:

- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4

Fixes: #0000

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-08 11:54:09 +08:00
Fupan Li
5a20d8dcaf Merge pull request #7383 from justxuewei/dan
runtime-rs: Introduce directly attachable network
2023-08-08 09:54:28 +08:00
Chelsea Mafrica
553fd79ea9 Merge pull request #7572 from GabyCT/topic/resnet50fp32
metrics: General improvements to mobilenet tensorflow test
2023-08-07 13:33:28 -07:00
GabyCT
194120b679 Merge pull request #7540 from GabyCT/topic/enableiperf
gha: Add iperf network metrics
2023-08-07 13:40:02 -06:00
Gabriela Cervantes
863283716d metrics: General improvements to mobilenet tensorflow test
This PR renames the mobilenet tensorflow test to have a more specific
tensorflow name mainly because tensorflow has different configurations
and we will add more tensorflow tests so we want to distinguish each
tensorflow test.

Fixes #7571

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-07 16:50:00 +00:00
Gabriela Cervantes
3c319d8d4c metrics: Add iperf to gha run script
This PR adds iperf to gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-07 16:20:00 +00:00
Gabriela Cervantes
5b5caf8908 gha: Add iperf network metrics
This PR adds the iperf network metrics to the github actions
for kata metrics.

Fixes #7535

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-07 16:20:00 +00:00
Chelsea Mafrica
4559caf619 Merge pull request #7467 from ManaSugi/doc/use-k8-control-plane
docs: Use control-plane term instead of master
2023-08-06 23:40:51 -07:00
Fabiano Fidêncio
b365bef570 Merge pull request #7191 from wedsonaf/avoid-clones
agent: avoid unnecessary calls to `Arc::clone`
2023-08-06 15:34:07 +02:00
GabyCT
7144acb2a5 Merge pull request #7527 from GabyCT/topic/latency
metrics: Add network latency test
2023-08-04 15:54:07 -06:00
Gabriela Cervantes
66db5b5350 metrics: Add latency test to network README
This PR adds latency test to network README for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-04 20:27:27 +00:00
Wedson Almeida Filho
c36572418f agent: avoid unnecessary calls to Arc::clone
These calls cause two extra atomic instructions each time they're used,
one to increment and another one to decrement the refcount.

Since we don't need them because the referred value is guaranteed to
outlive the function, remove the calls.
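
The effect on the reference count can be observed with `Arc::strong_count`. The two functions below are hypothetical and only illustrate the difference between cloning an `Arc` and borrowing through it.

```rust
use std::sync::Arc;

/// Clones the Arc: one atomic increment here, and one atomic decrement
/// when `_local` is dropped at the end of the function.
fn count_while_cloned(shared: &Arc<String>) -> usize {
    let _local = Arc::clone(shared);
    Arc::strong_count(shared)
}

/// Borrows through the existing Arc: the refcount is never touched.
fn count_while_borrowed(shared: &Arc<String>) -> usize {
    Arc::strong_count(shared)
}
```

Borrowing is sound here for the reason the message gives: the caller's `Arc` keeps the value alive for the whole call.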

Fixes: #7190

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-03 20:53:05 -03:00
Fabiano Fidêncio
8c03deac3a Merge pull request #7106 from wedsonaf/image-pulling
Image pulling on the host
2023-08-04 01:08:42 +02:00
Wedson Almeida Filho
4fbe0a3a53 runtime: bind-mount mounted block device into container
When the mounted block device isn't a layer, we want to mount it into
containers, but since it's already mounted with the correct fs (e.g.,
tar, ext4, etc.) in the pod, we just bind-mount it into the container.

Fixes: #7536

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-03 17:58:39 -03:00
Wedson Almeida Filho
7e1b1949d4 runtime: add support for kata overlays
When at least one `io.katacontainers.fs-opt.layer` option is added to
the rootfs, it gets inserted into the VM as a layer, and the file system
is mounted as an overlay of all layers using the overlayfs driver.

Additionally, if the `io.katacontainers.fs-opt.block_device=file` option
is present in a layer, it is mounted as a block device backed by a file
on the host.

Fixes: #7536

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-03 17:58:39 -03:00
Wedson Almeida Filho
6c867d9e86 agent: add io.katacontainers.fs-opt.overlay-rw option
This causes the overlay-fs driver to add the `upperdir` and `workdir`
options to an overlay-fs mount so that the mount becomes writable using
a discardable directory under the container id.
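
A minimal sketch of how such an option string could be assembled follows. `lowerdir`, `upperdir` and `workdir` are the standard overlayfs mount options; the function name and the `upper`/`work` subdirectory layout under the scratch directory are assumptions made for this example.

```rust
/// Builds the option string for an overlayfs mount.
/// `lower` lists the read-only layers; `rw_base` is a hypothetical scratch
/// directory (for example, one derived from the container id).
fn overlay_options(lower: &[&str], rw_base: Option<&str>) -> String {
    let mut opts = vec![format!("lowerdir={}", lower.join(":"))];
    if let Some(base) = rw_base {
        // Adding upperdir and workdir is what makes the overlay writable.
        opts.push(format!("upperdir={}/upper", base));
        opts.push(format!("workdir={}/work", base));
    }
    opts.join(",")
}
```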

Fixes: #7536

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-03 17:58:39 -03:00
Wedson Almeida Filho
6163c35657 agent: skip mount options that start with "io.katacontainers."
This is so that file systems don't fail when we pass kata-specific
options from the snapshotter to kata.
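
The filtering idea can be sketched as below. The function name `fs_options` is hypothetical; only the "io.katacontainers." prefix comes from the commit message.

```rust
/// Prefix of options that are meant for kata, not for the file system.
const KATA_PREFIX: &str = "io.katacontainers.";

/// Returns only the options that should be handed to the mount call.
fn fs_options<'a>(options: &[&'a str]) -> Vec<&'a str> {
    options
        .iter()
        .copied()
        .filter(|o| !o.starts_with(KATA_PREFIX))
        .collect()
}
```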

Fixes: #7536

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-03 17:58:39 -03:00
Fabiano Fidêncio
fa35afa982 Merge pull request #7542 from wedsonaf/ci-fix
Use version 0.10.4 of `fuse-backend-rs`
2023-08-03 22:50:11 +02:00
Wedson Almeida Filho
b2ff97aa01 dragonball: use version 0.10.4 of fuse-backend-rs
Version 0.10.5, which was just released, breaks `nydus-storage`.

This is a workaround to fix the CI which is blocking other PRs.

Fixes: #7541

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-08-03 14:15:17 -03:00
Fabiano Fidêncio
ebdae7cfdf Merge pull request #7520 from jepio/host-systemctl
kata-deploy: Use host's systemctl
2023-08-03 13:53:28 +02:00
Manabu Sugimoto
845eeb4d7b agent: Allow clippy::redundant_clone in the unit tests
Allow `clippy::redundant_clone` in the agent's unit tests
because clippy with rustc>=1.70 reports them as false positives.
These `clone()` calls are required because the code that follows
still refers to the variable, but clippy misjudges them because
its analysis is conservative and limited.
Ref. https://rust-lang.github.io/rust-clippy/master/index.html#/redundant_clone

Fixes: #7534

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-03 19:07:40 +09:00
Fabiano Fidêncio
e2755a47b8 Merge pull request #7524 from fidencio/revert-kata-deploy-changes-after-3.2.0-rc0-release
release: Revert kata-deploy changes after 3.2.0-rc0 release
2023-08-03 11:28:43 +02:00
Fabiano Fidêncio
1163fc9de2 release: Revert kata-deploy changes after 3.2.0-rc0 release
As 3.2.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-03 10:08:20 +02:00
Xuewei Niu
3958a39d07 runtime-rs: Introduce directly attachable network
Kata Containers, being VM-based, can run in the host netns, because the
network can still be isolated at L2. Network performance benefits from
this architecture, which eliminates as many hops as possible. We call it
a Directly Attachable Network (DAN for short).

The network devices are placed in the host netns by the CNI plugins. The
configs are saved at {dan_conf}/{sandbox_id}.json in JSON format,
including device name, type, and network info. At this early stage,
the DAN only supports host tap devices. More devices, like the DPDK, will
be supported in later versions.

The file format looks like this:

```json
{
	"netns": "/path/to/netns",
	"devices": [{
		"name": "eth0",
		"guest_mac": "xx:xx:xx:xx:xx",
		"device": {
			"type": "vhost-user",
			"path": "/tmp/test",
			"queue_num": 1,
			"queue_size": 1
		},
		"network_info": {
			"interface": {
				"ip_addresses": ["192.168.0.1/24"],
				"mtu": 1500,
				"ntype": "tuntap",
				"flags": 0
			},
			"routes": [{
				"dest": "172.18.0.0/16",
				"source": "172.18.0.1",
				"gateway": "172.18.31.1",
				"scope": 0,
				"flags": 0
			}],
			"neighbors": [{
				"ip_address": "192.168.0.3/16",
				"device": "",
				"state": 0,
				"flags": 0,
				"hardware_addr": "xx:xx:xx:xx:xx"
			}]
		}
	}]
}
```

Fixes: #1922

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-03 15:33:34 +08:00
David Esparza
7d1c48c881 Merge pull request #7530 from dborquez/fix_check_running_processes
metrics: stop kata components before start a metric test.
2023-08-02 23:51:27 -06:00
Zhongtao Hu
e719423262 Merge pull request #7127 from cmaf/runtime-rs-ch-blk-2
runtime-rs: Add block device handling for cloud hypervisor
2023-08-03 09:46:32 +08:00
David Esparza
1e15369e59 metrics: Improve naming testing containers in launch times test
This commit provides a new way to name the containers used
in the launch-times-test in this form:
'kata_launch_times_RANDOM_NUMBER', where RANDOM_NUMBER is
in the 0-1000 range.

Fixes: #7529

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-02 17:04:55 -06:00
David Esparza
5dbe88330f metrics: Clean kata components before start a metric test.
This PR kills all kata components before starting a new
metric test.

Fixes: #7528

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-02 17:04:51 -06:00
Fabiano Fidêncio
d424f3c595 Merge pull request #7523 from fidencio/3.2.0-rc0-branch-bump
# Kata Containers 3.2.0-rc0
2023-08-02 20:04:37 +02:00
Zvonko Kaiser
cf8899f260 Merge pull request #7494 from zvonkok/vfio-mode
vfio: Fix vfio device ordering
2023-08-02 19:45:22 +02:00
Gabriela Cervantes
3b45060b61 metrics: Add latency server yaml
This PR adds latency server yaml for kubernetes test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-02 16:52:17 +00:00
Gabriela Cervantes
9bb8451df5 metrics: Add latency client yaml
This PR adds latency client yaml for the kubernetes test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-02 16:50:51 +00:00
Gabriela Cervantes
64fdb98704 metrics: Add network latency test
This PR adds network latency test for kata metrics.

Fixes #7526

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-02 16:46:48 +00:00
Chelsea Mafrica
a81ad3b587 runtime-rs: Add block device handling in cloud hypervisor
Add functions for adding a block device to a container for CH.

Fixes #6690

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-08-02 09:18:48 -07:00
David Esparza
542012c8be Merge pull request #7503 from GabyCT/topic/ghafio
metrics: Add FIO test to gha for kata metrics CI
2023-08-02 10:05:09 -06:00
David Esparza
5979f3790b Merge pull request #7516 from GabyCT/topic/addiperf
metrics: Add iperf3 network test
2023-08-02 10:04:51 -06:00
Fabiano Fidêncio
006ecce49a release: Kata Containers 3.2.0-rc0
- ci-on-push: Make the CI also run for the stable-* branches
- ci: k8s: Do not fail when gathering info on AKS nodes
- kata-deploy: enable cross build for non-x86
- runtime-rs: add support for gather metrics in runtime-rs
- kata-ctl: add monitor subcommand for runtime-rs
- release: release-note.sh: Fix typos and reference to images
- metrics: Add sysbench performance test
- Simplify implementation of runtime-rs/service

6ad16d497 release: Adapt kata-deploy for 3.2.0-rc0
025596b28 ci-on-push: Make the CI also run for the stable-* branches
7ffc0c122 static-build: enable cross build for qemu
35d6d86ab static-build: enable cross-build for image build
2205fb9d0 static-build: enable cross build for virtiofsd
11631c681 static-build: enable cross build for shim-v2
7923de899 static-build: cross build kernel
e2c31fce2 kata-deploy: enable cross build for kata deploy script
2fc5f0e2e kata-deploy: prepare env for cross build in lib.sh
f5e9985af release: release-note.sh: Fix typos and reference to images
f910c66d6 ci: k8s: Do not fail when gathering info on AKS nodes
632818176 metrics: Add k8s sysbench documentation
b3901c46d runtime-rs: ignore errors during clean up sandbox resources
5a1b5d367 metrics: Add sysbench pod yaml
ad413d164 metrics: Add sysbench dockerfile
151256011 metrics: Add sysbench performance test
62e328ca5 runtime-rs: refine implementation of TaskService
458e1bc71 runtime-rs: make send_message() a method of ServiceManager
1cc1c81c9 runtime-rs: fix possible bug in ServiceManager::run()
1a5f90dc3 runtime-rs: simplify implementation of service crate
731e7c763 kata-ctl: add monitor subcommand for runtime-rs The previous kata-monitor in golang could not communicate with runtime-rs to gather metrics due to different sandbox addresses. This PR adds the subcommand monitor in kata-ctl to gather metrics from runtime-rs and monitor itself.
d74639d8c kata-ctl: provide the global TIMEOUT for creating MgmtClient
02cc4fe9d runtime-rs: add support for gather metrics in runtime-rs

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-02 16:59:41 +02:00
Fabiano Fidêncio
6ad16d4977 release: Adapt kata-deploy for 3.2.0-rc0
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-02 16:59:41 +02:00
Fabiano Fidêncio
4e812009f5 Merge pull request #7519 from fidencio/topic/gha-ci-run-on-stable-branches
ci-on-push: Make the CI also run for the stable-* branches
2023-08-02 16:13:06 +02:00
Jeremi Piotrowski
3230dec950 kata-deploy: Use host's systemctl
when interacting with systemd. We have occasionally faced issues with
compatibility between the systemctl version used inside the kata-deploy
container and the systemd version on the host. Instead of using a containerized
systemctl with bind mounted sockets, nsenter the host and run systemctl from
there. This provides less coupling between the kata-deploy container and the
host.

Fixes: #7511
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-02 15:32:01 +02:00
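The approach described in "kata-deploy: Use host's systemctl" can be sketched roughly as follows. This is only an illustration, not the actual kata-deploy code: the helper name `host_systemctl_command`, the choice of PID 1 as the nsenter target, and entering only the mount namespace are assumptions made for the example.

```python
def host_systemctl_command(*systemctl_args, host_pid=1):
    """Build an argv that runs the host's systemctl via nsenter.

    Assumption: the container can see the host PID namespace, so PID 1
    is the host's init and its mount namespace holds the host systemctl.
    """
    return [
        "nsenter",
        "--target", str(host_pid),  # enter namespaces of this process
        "--mount",                  # only the mount namespace (assumed)
        "--",
        "systemctl",
        *systemctl_args,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, since running needs privileges.
    print(" ".join(host_systemctl_command("restart", "containerd")))
```

Because the binary comes from the host's own filesystem, its version always matches the host's systemd, which is the coupling problem the commit message describes.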
Fabiano Fidêncio
29855ed0c6 Merge pull request #7510 from fidencio/topic/ci-k8s-aks-do-not-fail-gathering-info
ci: k8s: Do not fail when gathering info on AKS nodes
2023-08-02 09:44:19 +02:00
Fabiano Fidêncio
025596b289 ci-on-push: Make the CI also run for the stable-* branches
As we only support one stable branch, it'll be used as part of the
stable-3.2 and onwards.

Fixes: #7518

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-02 09:26:24 +02:00
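A branch filter such as `stable-*` is a glob pattern. The sketch below shows how such a filter selects branches, using Python's `fnmatch` as a stand-in for the workflow engine's own matcher. The pattern list (including `main`) and the function name `ci_runs_on` are assumptions for illustration. Note also that `fnmatch` lets `*` match `/`, which workflow filters may treat differently.

```python
from fnmatch import fnmatchcase

# Hypothetical branch filter: the main branch plus any stable-* branch.
CI_BRANCH_PATTERNS = ["main", "stable-*"]


def ci_runs_on(branch):
    """Return True if the branch name matches one of the CI patterns."""
    return any(fnmatchcase(branch, pattern) for pattern in CI_BRANCH_PATTERNS)


if __name__ == "__main__":
    for name in ("main", "stable-3.2", "topic/some-feature"):
        print(name, ci_runs_on(name))
```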
Fabiano Fidêncio
e1a69c0c92 Merge pull request #6586 from jongwu/cross_build
kata-deploy: enable cross build for non-x86
2023-08-02 09:11:56 +02:00
Fupan Li
1a6b27bf6a Merge pull request #5797 from Yuan-Zhuo/add-metrics-for-runtime-rs
runtime-rs: add support for gather metrics in runtime-rs
2023-08-02 13:40:22 +08:00
Fupan Li
a536d4a7bf Merge pull request #6672 from Yuan-Zhuo/add-monitor-in-kata-ctl
kata-ctl: add monitor subcommand for runtime-rs
2023-08-02 13:39:02 +08:00
Gabriela Cervantes
ad6e53c399 metrics: Modify boot time values
This PR modifies boot time values limit.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 23:34:15 +00:00
Jianyong Wu
7ffc0c1225 static-build: enable cross build for qemu
Relying on the multiarch feature of Ubuntu, we can set up a cross build
environment easily and achieve build performance as good as a native
build.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 23:28:52 +02:00
Jianyong Wu
35d6d86ab5 static-build: enable cross-build for image build
Cross building the agent with docker buildx takes too long, so we
cross build the rootfs in a container that has a cross compile toolchain
of gcc and rust with musl libc. This gives us a build as fast as a
native one.

The rootfs initrd cross build is disabled, as no cross compile toolchain
for rust with musl libc has been found for alpine, and building with
docker buildx takes too long.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 23:28:52 +02:00
Gabriela Cervantes
f764248095 gha: Add FIO test to run metrics yaml
This PR adds FIO test to run metrics yaml.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 20:29:16 +00:00
Jianyong Wu
2205fb9d05 static-build: enable cross build for virtiofsd
Use messense/rust-musl-cross, which offers a musl libc cross build
environment, to cross compile virtiofsd.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 22:10:46 +02:00
Jianyong Wu
11631c681a static-build: enable cross build for shim-v2
shim-v2 has go and rust code. For the rust code, we use
messense/rust-musl-cross to speed up the build, as it doesn't depend on
qemu emulation. The go code is built with docker buildx, as it doesn't
support cross build for now.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 22:10:46 +02:00
Jianyong Wu
7923de8999 static-build: cross build kernel
Prepare cross build environment based on current Dockerfile.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 22:10:46 +02:00
Jianyong Wu
e2c31fce23 kata-deploy: enable cross build for kata deploy script
kata-deploy-binaries-in-docker.sh is the entry point for building kata
components. Set some environment variables to facilitate the cross
build work that follows.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 22:10:46 +02:00
Jianyong Wu
2fc5f0e2e0 kata-depoly: prepare env for cross build in lib.sh
We leverage three environment variables: TARGET_ARCH is the build
target tuple; ARCH has nearly the same meaning as TARGET_ARCH, but is
already widely used in kata; CROSS_BUILD indicates whether to cross
compile.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 22:10:46 +02:00
Fabiano Fidêncio
c0171ea0a7 Merge pull request #7508 from fidencio/topic/fix-release-notes-typos-and-references
release: release-note.sh: Fix typos and reference to images
2023-08-01 22:05:32 +02:00
Gabriela Cervantes
58f9a57c20 metrics: Add network reference to general README metrics
This PR adds network reference to the general metrics README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 16:54:00 +00:00
Gabriela Cervantes
07694ef3ae metrics: Add Kata Containers network metrics README
This PR adds the Kata Containers network metrics README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 16:49:09 +00:00
Gabriela Cervantes
d8439dba89 metrics: Add iperf3 deployment yaml
This PR adds the iperf3 deployment yaml.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 16:45:01 +00:00
Gabriela Cervantes
bda83cee5d metrics: Add iperf3 daemonset for k8s
This PR adds the iperf3 daemonset for k8s.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 16:42:15 +00:00
Gabriela Cervantes
badff23c71 metrics: Add iperf3 service yaml for k8s
This PR adds the iperf3 service yaml for k8s.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 16:37:19 +00:00
Gabriela Cervantes
27c02367f9 metrics: Add iperf3 network test
This PR adds the iperf3 benchmark test for kata metrics.

Fixes #7515

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-01 16:30:46 +00:00
GabyCT
a0a524efc2 Merge pull request #7486 from kata-containers/topic/addsysbench
metrics: Add sysbench performance test
2023-08-01 10:17:48 -06:00
Fabiano Fidêncio
f5e9985afe release: release-note.sh: Fix typos and reference to images
diferent -> different

And also let's make sure we escape the backticks around the kata-deploy
environment variables, otherwise bash will try to interpret those.

Fixes: #7497

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-01 12:42:03 +02:00
Fabiano Fidêncio
f910c66d6f ci: k8s: Do not fail when gathering info on AKS nodes
Otherwise the VM deletion may not happen, leaving several machines
behind.

Fixes: #7509

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-01 12:36:33 +02:00
Manabu Sugimoto
1b21a46246 docs: Use control-plane term instead of master
Replace `master` with `control-plane` in the context of K8s
because `master` is a legacy term and is no longer used.

Ref. https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint

Fixes: #7466

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-01 17:41:40 +09:00
Chao Wu
1a94aad44f Merge pull request #7480 from jiangliu/rt-service
Simplify implementation of runtime-rs/service
2023-08-01 16:05:33 +08:00
Chao Wu
2d13e2d71c Merge pull request #7504 from fidencio/topic/gha-release-fix-upload-versions-yaml
release: Fix upload-versions-yaml
2023-08-01 13:58:07 +08:00
GabyCT
b77d69aeee Merge pull request #7396 from GabyCT/topic/addghatensorflow
metrics: Enable Tensorflow metrics for kata CI
2023-07-31 17:13:24 -06:00
Fabiano Fidêncio
743291c6c4 release: Fix upload-versions-yaml
This requires the GITHUB_UPLOAD_TOKEN.  While we're here, let's also fix
the name of the action and remove the "-tarball" suffix, as it's not
really a tarball.

Fixes: #7497

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-31 23:57:33 +02:00
Fabiano Fidêncio
a71d35c764 Merge pull request #7499 from fidencio/topic/gha-release-ensure-stage-is-defined-for-amr64-s300x
gha: release: `stage` must be defined for arm64 / s390x yamls
2023-07-31 22:55:54 +02:00
Gabriela Cervantes
6328181762 metrics: Add k8s sysbench documentation
This PR adds k8s sysbench documentation to the general density documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-31 20:28:37 +00:00
Chelsea Mafrica
f74b7aba18 Merge pull request #7488 from cmaf/docs-k8s-links
docs: Update links for pods and kubelet
2023-07-31 12:44:24 -07:00
Gabriela Cervantes
8933d54428 metrics: Add FIO to gha run script
This PR adds FIO to gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-31 17:51:11 +00:00
Gabriela Cervantes
8a584589ff metrics: Add DAX FIO README
This PR adds DAX FIO README information.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-31 17:42:44 +00:00
Gabriela Cervantes
21f5b65233 metrics: Add FIO information in storage general README
This PR adds FIO information in storage general README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-31 17:33:39 +00:00
Gabriela Cervantes
69f05cf9e6 metrics: Add FIO general README
This PR adds FIO general README information.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-31 17:30:05 +00:00
Gabriela Cervantes
87d41b3dfa metrics: Add FIO test to gha for kata metrics CI
This PR adds FIO test to gha for kata metrics CI.

Fixes #7502

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-31 16:50:16 +00:00
Pavel Mores
28e5e9c86e runtime-rs: fix number of queues handling in dragonball share fs device
Looks like a copy/paste error...

Fixes #7501

Signed-off-by: Pavel Mores <pmores@redhat.com>
2023-07-31 17:25:47 +02:00
Fabiano Fidêncio
ff8d7e7e41 Merge pull request #7496 from fidencio/topic/topic/kata-deploy-take-nfd-into-consideration-pre-work
k8s: Rely on the USING_NFD environment variable passed by the jobs
2023-07-31 14:56:15 +02:00
Fabiano Fidêncio
1b111a9aab gha: release: stage must be defined for arm64 / s390x yamls
`stage`  has been added, but only hooked up to the amd64 logic, leaving
arm64 and s390x behind.

Let's fix this right now, and make sure no error occurs when passing
this down to the yaml files.

Fixes: #7497

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-31 14:41:35 +02:00
Fabiano Fidêncio
684a6e1a55 Revert "gha: release: stage must be a string"
This reverts commit 7c857d38c1.

I've misunderstood the error given by github action, let's fix this in
the next commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-31 14:37:52 +02:00
Fabiano Fidêncio
99711f107f Merge pull request #7498 from fidencio/topic/gha-release-stage-must-be-a-string
gha: release: `stage` must be a string
2023-07-31 14:32:47 +02:00
Fabiano Fidêncio
7c857d38c1 gha: release: stage must be a string
Otherwise we'll face the following error as part of our GHA:
```
The workflow is not valid.
kata-containers/kata-containers/.github/workflows/release-$foo.yaml
(Line: 13, Col: 14): Invalid input, stage is not defined in the
referenced workflow.
```

Fixes: #7497

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-31 13:39:13 +02:00
Fabiano Fidêncio
28e171bf73 Merge pull request #7490 from fidencio/3.2.0-alpha4-branch-bump
# Kata Containers 3.2.0-alpha4
2023-07-31 13:34:15 +02:00
Fabiano Fidêncio
91e1e612c3 k8s: Rely on the USING_NFD environment variable passed by the jobs
Let's make sure we can rely on the tests passing down whether they want
to be tested using Node Feature Discovery or not.

Right now, only the TDX job has this option set to "true", all the other
jobs have this option set to "false".

We can and have to merge this one before merging the NFD related patches
as:
1) It causes no harm in exporting this environment variable, but not
   having it used
2) It will allow us to test the NFD after this one is merged, as changes
   in the yaml file, in the case of the pull_request_target event,  are
   not taken into consideration before they're merged

Fixes: #7495

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-31 13:30:18 +02:00
Zvonko Kaiser
cddcde1d40 vfio: Fix vfio device ordering
If modeVFIO is enabled we first need to attach the VFIO control group
device /dev/vfio/vfio and then the actual device(s). Sort the devices
so that device #1 is the VFIO control group device, followed by the
actual device(s)
/dev/vfio/<group>

Fixes: #7493

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-31 11:26:27 +00:00
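The ordering rule from "vfio: Fix vfio device ordering" can be sketched as below. The actual fix lives in the Go runtime, so this Python version only illustrates the idea. The function name `sort_vfio_devices` is hypothetical, and keeping the group devices in their original relative order is an assumption, since the commit message does not say how they are ordered among themselves.

```python
VFIO_CONTROL_DEVICE = "/dev/vfio/vfio"


def sort_vfio_devices(paths):
    """Return the device paths with the VFIO control device first.

    sorted() is stable, so all other devices keep their original
    relative order (an assumption of this sketch).
    """
    return sorted(paths, key=lambda path: 0 if path == VFIO_CONTROL_DEVICE else 1)


if __name__ == "__main__":
    devices = ["/dev/vfio/12", "/dev/vfio/vfio", "/dev/vfio/3"]
    print(sort_vfio_devices(devices))
```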
Fabiano Fidêncio
7edc7172c0 release: Kata Containers 3.2.0-alpha4
- tests: Add `k8s-volume` and `k8s-file-volume` tests to GHA CI
- metrics: Update boot time for kata metrics
- metrics: Add FIO report files for kata metrics
- kata-deploy: Allow runtimeclasses to be created by the daemonset
- runtime-rs: change block index to 0
- agent: fix typo in constant
- metrics: Add FIO benchmark for metrics tests
- gha: dragonball: Run only on the dragonball labeled machine
- tests: Fix `k8s-job` test
- agent,libs: Remove unused 'mut' keywords
- runtime-rs: remove unneeded 'mut' keywords
- tests: QoL improvements for running tests locally
- agent: exclude symlinks from recursive ownership change
- cache: kernel: Fix kernel caching
- runk: Add Docker guide to README
- metrics: General improvements to json.bash script
- kata-deploy: Allow shim creation based on what's passed to the daemonset
- gha: ci: Add skeleton of vfio job
- s390x: Fixing device.Bus assignment
- release: Mention the container images used to build the project
- kata-deploy-binaries: kernel_cache: Take module_dir into account
- ci: nydus: Fix typo in "source"
- gha: ci: Add no-op nydus tests to our CI
- Dragonball: migrate dragonball-sandbox crates to Kata
- ci: gha: Add cri-containerd tests (but still do not enable them)
- packaging/tools: Add kata-debug and use it as part of our CI
- cache: kernel: Consider changes in tools/packaging/kernel
- kata-deploy: Properly get the path of the versions.yaml file
- kata-deploy: Add VERSION and versions.yaml to the final tarball
- metrics: Add C-Ray performance test
- metrics: enable TensorFlow benchmark to be run on gha
- metrics: Add function to memory inside container script
- Revert "metrics: Replace backslashes used to escape double quoted key in jq expr"
- versions: Bump virtiofsd to v1.7.0
- metrics: stop hypervirsor and shim at init_env stage
- ci: k8s: Adapt "source ..." to the new location of gha-run.sh
- ci: Move `tests/integration/gha-run.sh`  to `tests/integration/kuberentes/` ... and also remove KUBECONFIG from the tdx envs
- versions: Update kernel to version v6.1.x
- agent: Fix exec hang issues with a backgroud process
- agent: Ignore already mounted dev/fs/pseudo-fs
- ci: k8s: Bring TDX tests back
- metrics: Update machine learning documentation
- gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo
- tests: Add MobileNet Tensorflow performance benchmark
- metrics: replace backslashes used to escape double quoted jq key expr.
- runtime-rs: enhancement of Device Manager for network endpoints.
- feat(Tracing): tracing in Rust runtime
- runtime-rs: ignore unconfigured network interfaces
- metrics: Stop running kata-env before kata is properly installed.
- metrics: use rm -f to remove the oldest continerd config file.
- kernel: Update kernel config name
- kata-deploy: Add a debug option to kata-deploy (and also use it as part of our CI)
- runtime-rs: add parameter for propagation of (u)mount events
- kata-ctl: Move GuestProtection code to kata-sys-util
- tests: Add function before function name in common.bash for metrics
- tests: Add metrics storage documentation
- metrics: Fix metrics ts generator to treat numbers as decimals
- gha: ci: Add cri-containerd tests skeleton -- follow up 1
- dragonball/agent: Add some optimization for Makefile and bugfixes of unit tests on aarch64
- metrics: Enable blogbench test
- tests: Add machine learning performance tests
- tests: gha: ci: Add cri-containerd tests skeleton
- metrics: Enable memory inside container metrics
- tools: Use a consistent target name when building mariner initrd
- gha: ci: Gather info about the node / pods
- runtime-rs: Do not scan network if network model is "none"
- gha: k8s: tdx: Temporarily disable TDX tests
- metrics: Update memory usage script
- gha: Cancel previous jobs if a PR is updated
- gha: nightly: Fix long name of AKS clusters issue and make the CI easier to test
- README: Add badge for our Nightly CI
- gha: Do not run all the tests if only docs are updated
- bugfix: plus default_memory when calculating mem size
- gha: ci: Use github.sha to get the last commit reference
- dragonball: Don't fail if a request asks for more CPUs than allowed
- gha: ci: Fix refernce passed to checkout@v3
- gha: ci: Avoid using env also in the ci-nightly and payload-after-push
- gha: k8s: Ensure cluster doesn't exist before creating it
- gha: ci: More follow up fixes after adding a nightly CI
- tests: Enable running k8s tests on Mariner
- gha: ci: Avoid using env unless it's really needed
- gha: ci: Follow up fixes for the nightly jobs
- tests: Enable memory usage metrics tests
- gha: Add nightly jobs
- metrics: storing metrics workflow artifacts
- gha: k8s: Ensure tests are running on a specific namespace
- metrics: Adds blogbench and webtool metrics tests
- gha: dragonball: Correctly propagate PATH update
- versions: Upgrade to Cloud Hypervisor v33.0
- Convert `is_allowed`, `ttrpc_error` and `sl` to functions
- gha: release: Use a specific release of hub
- metrics: Add checkmetrics to gha-run.sh for metrics CI
- packaging: Fix indentation of build.sh script at ovmf
- doc: Add documentation for the virtualization reference architecture
- gpu: Update kernel building to the latest changes
- runtime: fix PCIe topology for GPUDirect use-case
- metrics: Add memory footprint tests
- runtime: Add "none" as a shared_fs option
- metrics: Uniformity across function names in gha-run.sh
- runtime-rs:  support physical endpoint using device manager
- runtime-rs: bugfix for direct volume path's validation.
- metrics: Fix retrieving hypervisor version on metrics
- runtime-rs: fix build error on AArch64
- checkmetrics: Add checkmetrics makefile and documentation
- docs: Add boot time metrics documentation
- runtime-rs: add support spdk/vhost-user based volume.
- static-build: Remove kata-version parameter
- dragonball: avoid obtaining lock twice in create_stdio_console
- metrics: Add checkmetrics for kata metrics CI
- metrics: enable launch-times test on gha-run metrics script
- docs: Add general metrics documentation
- add support vfio device manager
- gha: Don't automatically trigger CI
- kata-ctl: Check for vm capability
- docs: fix spelling of "crate"
- packaging: Fix indentation in init.sh script
- gha: Fix gha actions
- metrics: install kata and launch-times test
- tests: Move tests helper script to this repo
- tests: Add json script for metrics tests
- Cherry pick initramfs caching updates from CCv0
- gha: Fix format for run launchtimes metrics yaml
- tests: Add tests lib common script
- Fix deprecated virtiofsd args (go shim only)
- gha: Add base branch on SHA on pull requst
- gha: ci-on-push: Run metrics tests
- docs: Update Developer Guide
- runtime-rs: Enhance flexibility of virtio-fs config
- versions: Update firecracker version to 1.3.3
- tools: Fix no-op builds
- runtime-rs: update Cargo.lock
- gha: Fix `stage` definition in matrix
- feat(runtime): vcpu resize capability
- packaging: Remove snap package
- gha: Add new build targets for Mariner
- Dragonball: support resize memory
- Port Measured rootfs feature from CCv0 branch to main
- add support direct volume and refactor device manager
- gha: Fix gha-run.sh and unbreak CI
- kata-ctl: Switch to slog logging; add --log-level and --json-logging arguments
- log-parser: Update log parser link at README
- gha: aks: Extract `run` commands to a script
- runtime-rs: handle copy files when share_fs is not available
- agent-ctl: fix the compile error
- agent: fix the issue of exec hang with a backgroud process
- runtime-rs: bugfix: update Cargo.lock
- gha: aks: Use short SHA in cluster name
- README: Display badge for the "Publish Artefacts" job and update the Kata Containers logo
- kata-deploy: Change how we get the Ubuntu k8s key
- gha: aks: Ensure host_os is used everywhere needed
- kubernetes: add agnhost command in pod yaml
- main | release: Standardize kata static file name
- packaging: make BUILDER_REGISTRY configurable
- gha: aks: Add the host_os as part of the aks cluster's name
- kernel: Modify build-kernel.sh to accomodate for changes in version.yaml
- gha: Fix Mariner cluster creation
- gha: Unbreak CI and fix cluster creation step
- Dragonball: support vcpu hotplug on aarch64
- runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts
- runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.
- gha: Create Mariner host as part of k8s tests
- netlink: Fix the issue of update_interface
- gha: Increase timeout for AKS jobs and give more time to start running the tests
- runtime: sending SIGKILL to qemu
- dragonball: convert BlockDeviceMgr and VirtioNetDeviceMgr functions to methods
- dragonball: Remove virtio-net and vsock devices gracefully
- kata-deploy: Improve shim backup / restore
- doc: Update git commands
- kata-deploy: Fix indentation on kata deploy merge script

8353aae41 ci: k8s: Rework get_nodes_and_pods_info()
6ad5d7112 ci: k8s: Do not gather node info before running the tests
5261e3a60 ci: k8s: Group messages to improve readability
9cc6b5f46 ci: k8s: Get logs from kata-deploy
9d285c622 ci: k8s: Let kata-deploy take care of the runtimeclasses
87568ed98 gha: Test split out runtimeclasses are in sync with all-in-one file
39192c608 kata-deploy: Print variables passed to the script
0e157be6f kata-deploy: Allow runtimeclasses to be created by the daemonset
a27433324 kata-deploy: Change default values of DEBUG
69535b808 kata-deploy: runtimeclass: Split out entries
9e1710674 kata-runtimeClasses: Alphabetically sort the enrties
6222bd910 tests: Add k8s-file-volume test
187a72d38 tests: Add k8s-volume test
0c8427035 metrics: Add boot time value for qemu
6520dfee3 metrics: Update boot time for kata metrics
ff2279061 metrics: Update runtime and configuration paths
a5d4e3388 metrics: Add compare virtiofsd dax script
5e937fa62 metrics: Update general FIO tests
b0bea47c5 metrics: Add makefile to report generator
73c57b9a1 metrics: Add FIO report files for kata metrics
c8fcd29d9 runtime-rs: use device manager to handle virtio-pmem
901c19225 runtime-rs: support configure vm_rootfs_driver
5d6199f9b runtime-rs: use device manager to handle vm rootfs
20f1f62a2 runtime-rs: change block index to 0
662f87539 metrics: Add general FIO makefile
c5a87eed2 tests: gha: Add timeout to cluster creation
6daeb08e6 tests: k8s: Clean up node debuggers after running
3aa6c77a0 gha: dragonball: Run only on the dragonball labeled machine
37641a543 metrics: Add example config for fio jobs
314aec73d agent: fix typo in constant
4703434b1 tests: k8s: Allow using custom resource group
350f3f70b tests: Import `common.bash` in `run_kubernetes_tests.sh`
d7f04a64a tests: k8s: Leave `runtimeclass_workloads/` alone
bdde6aa94 tests: k8s: Split deployment and testing commands
91a0b3b40 tests: aks: Simply delete cluster when cleaning up
3c1044d9d metrics: Update FIO paths for k8s runner
6177a0db3 metrics: Add env files for FIO
a45900324 metrics: Add fio exec
ea198fddc metrics: Add FIO runner k8s
8f7ef41c1 metrics: Add FIO vendor code
6293c17bd metrics: Add FIO benchmark for metrics tests
ff4cfcd8a runk: Add Docker guide to README
c8ac56569 cache: kernel: Harmonize commit with fetching side
81775ab1b cache: kernel: Fix SEV kernel caching
717f775f3 gha: ci: Add skeleton of vfio job
b9f100b39 agent,libs: Remove unused 'mut' keywords
a56f96bb2 kata-deploy: Allow shim creation based on what's passed to the daemonset
4a5ab38f1 metrics: General improvements to json.bash script
d4eba3698 kata-deploy-binaries: kernel_cache: Take module_dir into account
b7c9867d6 release: Mention the container images used to build the project
7c4b59781 ci: nydus: Fix typo in "source"
6a680e241 gha: ci: Add placeholder for the nydus tests as part of the CI
fb4f7a002 gha: nydus: Add a no-op GHA for nydus
4a207a16f gha: nydus: Bring tests as they are from the tests repo
2c8f83424 runtime-rs: remove unneeded 'mut' keywords
1fc715bc6 s390x: Add AP Attach/Detach test
e91f5edba ci: cri-containerd: Fix default typo for testContainerStart()
8b8aef09a ci: cri-containerd: Temporarily disable TestContainerSwap
56767001c ci: cri-containerd: Add namespace / uid to the pods
a84773652 ci: cri-containerd: Always use sudo to call crictl
99ba86a1b ci: cri-containerd: Add /usr/local/go/bin to the PATH
7f3b30999 ci: cri-containerd: Add `function` before each function
fde22d6bc ci: cri-containerd: Assume podman is always used
9465a0496 ci: cri-containerd: Adapt "source ..." to this repo
df8d14411 ci: cri-containerd: Remove CI variable
f90570aef ci: cri-containerd: Remove unused runc_runtime_bin
c3637039f ci: cri-containerd: Remove KILL_VMM_TEST env var
bc4919f9b ci: cri-containerd: Always run shim-v2 tests
f9e332c6d ci: cri-containerd: Stop cloning containerd
cfd662fee ci: cri-containerd: Remove ununsed SNAP_CI var
d36c3395c ci: cri-containerd: Update copyright
b5be8a4a8 ci: cri-containerd: Move integration-tests.sh as it was
f2e00c95c ci: cri-containerd: Populate install_dependencies()
897955252 versions: Add "latest" field for cri-tools
1bbcbafa6 ci: Add clone_cri_container()
f66c68a2b ci: Add install_cri_tools()
4dd828414 ci: Add install_cri_containerd()
ad47d1b9f ci: Add download_github_project_tarball()
788c562a9 ci: Add get_latest_patch_release_from_a_github_project()
6742f3a89 ci: Use `function` before each install_go.sh function
5eacecffc ci: Adjust paths for install_go.sh
8ed1595f9 ci: Update copyright for install_go.sh
6123d0db2 ci: Move install_go.sh as it was
8653be71b ci: Do not take cross-build into consideration for kata-arch.sh
6a76bf92c ci: Fix style / identation if kata-arch.sh
72743851c ci: Add `function` before each kata-arch.sh function
9f6d4892c ci: Update copyright for kata-arch.sh
6f73a7283 ci: Move kata-arch.sh as it was
3615d7343 ci: Add get_from_kata_deps()
34779491e gha: kubernetes: Avoid declaring repo_root_dir
f3738beac tests: Use $HOME/go as fallback for $GOPATH
b87ed2741 tests: Move `ensure_yq` to common.bash
124e39033 tests: common: Fix quoting when globbing
db77c9a43 tests: Make install_kata take care of the links
13715db1f tests: Do not call `install_check_metrics` when installing kata
630634c5d ci: k8s: Group logs to make them easier to read
228b30f31 ci: k8s: Gather node info during the cleanup
81f99543e ci: k8s: Cleanup cluster before deleting it
38a7b5325 packaging/tools: Add kata-debug
ae6e8d2b3 kata-deploy: Properly get the path of the versions.yaml file
309e23255 cache: kernel: Consider changes in tools/packaging/kernel
59fdd69b8 kata-deploy: Add VERSION and versions.yaml to the final tarball
5dddd7c5d release: Upload versions.yaml as part of the release
bad3ac84b metrics: Rename C-Ray to cpu performance tests
87d99a71e versions: Remove "kernel-experimental"
545de5042 vfio: Fix tests
62aa6750e vfio: Added better handling of VFIO Control Devices
dd422ccb6 vfio: Remove obsolete HotplugVFIOonRootBus
114542e2b s390x: Fixing device.Bus assignment
371a118ad agent: exclude symlinks from recursive ownership change
e64edf41e metrics: Add tensorflow function in gha-run script
67a6fff4f metrics: Enable tensorflow benchmark on gha
01450deb6 Revert "metrics: Replace backslashes used to escape double quoted key in jq expr."
843006805 metrics: Add function to memory inside container script
bbd3c1b6a Dragonball: migrate dragonball-sandbox crates to Kata
fad801d0f ci: k8s: Adapt "source ..." to the new location of gha-run.sh
55e2f0955 metrics: stop hypervirsor and shim at init_env stage
556e663fc metrics: Add disk link to general metrics README
98c121709 metrics: Add C-Ray README
8e7d9926e metrics: Add C-Ray Dockerfile
e2ee76978 metrics: Add C-Ray performance test
2ee2cd307 ci: k8s: Move gha-run.sh to the kubernetes dir
88eaff533 ci: tdx: Adjust KUBECONFIG
c09e268a1 versions: Downgrade SEV(-SNP) kernel back to v5.19.x
6a7a32365 versions: Bump virtiofsd to v1.7.0
ac5f5353b ci: k8s: Bring TDX tests back
950b89ffa versions: Update kernel to version v6.1.38
8ccc1e5c9 metrics: Update machine learning documentation
f50d2b066 gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo
620b94597 metrics: Add Tensorflow Mobilenet documentation
6c91af0a2 agent: Fix exec hang issues with a backgroud process
59f4731bb metrics: Stop running kata-env before kata is properly installed.
468f017e2 metrics: Replace backslashes used to escape double quoted key in jq expr.
64f013f3b ci: k8s: Enable debug when running the tests
8f4b1df9c kata-deploy: Give users the ability to run it on DEBUG mode
2c8dfde16 kernel: Update kernel config name
150e54d02 runtime-rs: ignore unconfigured network interfaces
3ae02f920 metrics: use rm -f to remove older continerd config file.
a864d0e34 tests: Add tensorflow mobilenet dockerfile
788d2a254 tests: Add tensorflow mobilenet performance test
3fed61e7a tests: Add storage link to general metrics documentation
b34dda4ca tests: Add storage blogbench metrics documentation
6787c6390 runtime-rs: add parameter for propagation of (u)mount events
6e5679bc4 tests: Add function before function name in common.bash for metrics
62080f83c kata-sys-util: Fix compilation errors
02d99caf6 static-checks: Make cargo clippy pass.
982420682 agent: Make the static checks pass for agent
61e4032b0 kata-ctl: Remove all utility functions to get platform protection
a24dbdc78 kata-sys-util: Move utilities to get platform protection
dacdf7c28 kata-ctl: Remove cpu related functions from kata-ctl
f5d195717 kata-sys-util: Move additional functionality to cpu.rs
304b9d914 kata-sys-util: Move CPU info functions
7319cff77 ci: cri-containerd: Add LTS / Active versions for containerd
2a957d41c ci: cri-containerd: Export GOPATH
75a294b74 ci: cri-containerd: Ensure deps are installed
6924d14df metrics: Fix metrics ts generator to treat numbers as decimals
9e048c8ee checkmetrics: Add blogbench read value for qemu
2935aeb7d checkmetrics: Add blogbench write value for qemu
02031e29a checkmetrics: Add blogbench read value for clh
107fae033 checkmetrics: Add blogbench write value for clh
8c75c2f4b metrics: Update blogbench Dockerfile
49723a9ec metrics: Add double quotes to variables
dc67d902e metrics: Enable blogbench test
438fe3b82 gha: ci: Add cri-containerd tests skeleton
bd08d745f tests: metrics: Move metrics specific function to metrics gha-run.sh
3ffd48bc1 tests: common: Move a few utility functions to common.bash
7f961461b tests: Add machine learning README
bb2ef4ca3 tests: Add `function` before each function
063f7aa7c tests: Add Pytorch Dockerfile
1af03b9b3 tests: Add Pytorch performance test
4cecd6237 tests: Add tensorflow Dockerfile
c4094f62c tests: Add metrics machine learning performance tests
89b622dcb gha: k8s: tdx: Temporarily disable TDX tests
8c9d08e87 gha: ci: Gather info about the node / pods
283f809dd runtime-rs: Enhancing Device Manager for network endpoints.
a65291ad7 agent: rustjail: update test_mknod_dev
46b81dd7d agent: clippy: fix cargo clippy warnings
c4771d9e8 agent: Makefile: enable set SECCOMP dynamically
a88212e2c utils.mk: update BUILD_TYPE argument
883b4db38 dragonball: fix cargo test on aarch64
6822029c8 runtime-rs: Do not scan network if network model is "none"
ce54e43eb metrics: Update memory usage script
fbc2a91ab gha: Cancel previous jobs if a PR is updated
307cfc8f7 tools: Use a consistent target name when building mariner initrd
d780cc08f gha: nightly: Also use `workflow_dispatch` to trigger it
b99ff3026 gha: nightly: Fix name size limit for AKS
aedc586e1 dragonball: Makefile: add coverage target
310e069f7 checkmetrics: Enable checkmetrics for memory inside test
1363fbbf1 README: Add badge for our Nightly CI
1776b18fa gha: Do not run all the tests if only docs are updated
28c29b248 bugfix: plus default_memory when calculating mem size
0c1cbd01d gha: ci: after-push: Use github.sha to get the last commit reference
37a955678 gha: ci: nightly: Use github.sha to get the last commit reference
ed23b47c7 tracing: Add tracing to runtime-rs
96e9374d4 dragonball: Don't fail if a request asks for more CPUs than allowed
38f0aaa51 Revert "gha: k8s: dragonball: Skip k8s-number-cpus"
828a72183 gha: k8s: dragonball: Skip k8s-oom
a79505b66 gha: k8s: dragonball: Skip k8s-number-cpus
275c84e7b Revert "agent: fix the issue of exec hang with a backgroud process"
2be342023 checkmetrics: Add memory usage inside container value for qemu
6ca34f949 checkmetrics: Add memory inside container value for clh
6c6892423 metrics: Enable memory inside container metrics
0ad298895 gha: ci: Fix refernce passed to checkout@v3
86904909a gha: ci: Avoid using env also in the ci-nightly and payload-after-push
f72cb2fc1 agent: Remove shadowed function, add slog-term
1d05b9cc7 gha: ci: Pass down secrets to ci-on-push / ci-nightly
c5b4164cb gha: ci: Fix tarball-suffix passed to the metrics tests
07810bf71 agent: Ignore already mounted dev/fs/pseudo-fs
11e3ccfa4 gha: ci: Avoid using env unless it's really needed
c45f646b9 gha: k8s: Ensure cluster doesn't exist before creating it
1a7bbcd39 gha: ci: Fix typo pull_requesst -> pull_request
ddf4afb96 gha: ci: Fix set-fake-pr-number job
8a0a66655 gha: ci: schedule expects a list, not a map
5c0269dc5 gha: ci: Add pr-number input to the correct job
de83cd9de gha: ci: Use $VAR instead of ${{ env.VAR }}
6acce83e1 metrics: Fix the call to check_metrics function
e067d1833 gha: Add a nightly CI job
7c0de8703 gha: k8s: Ensure tests are running on a specific namespace
106e30571 gha: Create a re-usable `ci.yaml` file
cc3993d86 gha: Pass event specific info from the caller workflow
4e396e728 metrics: Add function keyword to to helper metrics functions
1ca17c2f7 metrics: storing metrics workflow artifacts
5a61065ab checkmetrics: Add checkmetrics value for memory usage in qemu
78086ed1f checkmetrics: Add memory usage value for clh
1c3dbafbf metrics: Fix function of how to retrieve multiple values
18968f428 metrics: Add function to have uniformity
35d096b60 metrics: Adds blogbench and webtool metrics tests
d8f90e89d metrics: Rename function at memory usage script
b9d66e0d5 metrics: Fix double quotes variables in memory usage script
476a11194 tests: Enable memory usage metrics tests
b568c7f7d tests/integration: Provide default value for KATA_HOST_OS
d6e96ea06 tests/integration: Use AzureLinux instead of Mariner
40c46c75e tests/integration: Perform yq install in run_tests()
d8b8f7e94 metrics: Enable launch tests time metrics
72fd562bd gha: release: Use a specific release of hub
0502354b4 checkmetrics: Add checkmetrics json for qemu
b481ef188 makefile: Add -buildvcs=false flag to go build
e94aaed3c ci_worker: Add checkmetrics ci worker for cloud hypervisor
917576e6f metrics: Add double quotes in all variables
cc8f0a24e metrics: Add checkmetrics to gha-run.sh for metrics CI
477856c1e gha: dragonball: Correctly propagate PATH update
1c211cd73 gha: Swap asset/release in build matrix
0152c9aba tools: Introduce `USE_CACHE` environment variable
2b5975689 tests: Build CLH with glibc for Mariner
80c78eadc tests: Use baked-in kernel with Mariner
532755ce3 tests: Build Mariner rootfs initrd
6a21e20c6 runtime: Add "none" as a shared_fs option
5681caad5 versions: Upgrade to Cloud Hypervisor v33.0
b2ce8b4d6 metrics: Add memory footprint tests to the CI
d035955ef doc: Add documentation for the virtualization reference architecture
0f454d0c0 gpu: Fixing typos for PCIe topology changes
6bb2ea819 packaging: Fix indentation of build.sh script at ovmf
0504bd725 agent: convert the `sl` macros to functions
0860fbd41 agent: convert the `ttrpc_error` macro to a function
0e5d6ce6d agent: convert the `is_allowed` macro to a function
f680fc52b agent: change `AGENT_CONFIG`'s lazy type to just `AgentConfig`
beb706368 metrics: Uniformity across function names
1f3e837e4 runtime-rs: fix build error on AArch64
6fd25968c runtime-rs: bugfix for direct volume path's validation.
415578cf3 docs: Add general README
bff4672f7 runtime-rs: support physical endpoint using device manager
32cba7e44 metrics: Fix retrieving hypervisor version on metrics
aa7946de4 checkmetrics: Add general checkmetrics documentation
2fac2b72f checkmetrics: Add checkmetrics makefile
e45899ae0 docs: Add time tests documentation reference
28130d3ce docs: Add boot time metrics documentation
0df2fc270 runtime-rs: add support spdk/vhost-user based volume.
17198089e vendor: Add vendor checkmetrics dependencies
f1dfea6e8 docs: Add metrics documentation reference
8330fb8ee gpu: Update unit tests
859359424 metrics: enable launch-times test on gha-run metrics script
c4ee601bf metrics: Add checkmetrics for kata metrics CI
e0d6475b4 gha: Don't automatically trigger CI
b535c7cbd tests: Enable running k8s tests on Mariner
71071bdb6 docs: Add general metrics documentation
610f7986e check: Relax the unrestricted_guest check when running in a VM
1b406b9d0 kata-ctl:Implement functionality to check host is capable of running VM
adf88eaa8 static-build: Remove kata-version parameter
09720babc docs: fix spelling of "crate"
7185afc50 gha: Fix gha actions
21294b868 packaging: Fix indentation in init.sh script
fad3ac9f5 metrics: install kata and launch-times test
4bbfcfaf1 tests: Move tests helper script to this repo
f152f0e8c metrics: Add launch-times to metrics tests
59510cfee runtime-rs: add support vfio device based volume
1e3b372bb runtime-rs: add support vfio device manager
6b0848930 gha: Fix format for run launchtimes metrics yaml
3cefa43e7 tests: Add json script for metrics tests
6a3710055 initramfs: Build dependencies as part of the Dockerfile
aa2380fdd packaging: Add infra to push the initramfs builder image
1c7fcc6cb packaging: Use existing image to build the initramfs
a43ea24df virtiofsd: Convert legacy `-o` sub-options to their `--` replacement
8e00dc694 virtiofsd: Drop `-o no_posix_lock`
2a15ad978 virtiofsd: Stop using deprecated `-f` option
c3043a6c6 tests: Add tests lib common script
b16e0de73 gha: Add base branch on SHA on pull requst
72f2cb84e gpu: Reset cold or hot plug after overriding
fbacc0964 gpu: PCIe topology, consider vhost-user-block in Virt
bc152b114 gha: ci-on-push: Run metrics tests
dad731d5c docs: Update Developer Guide
b11246c3a gpu: Various fixes for virt machine type
40101ea7d vfio: Added annotation for hot(cold) plug
8f0d4e261 vfio: Cleanup of Cold and Hot Plug
b5c4677e0 vfio: Rearrange the bus assignemnt
b1aa8c8a2 gpu: Moved the PCIe configs to drivers
55a66eb7f gpu: Add config to TOML
da42801c3 gpu: Add config settings tests for hot-plug
de39fb7d3 runtime: Add support for GPUDirect and GPUDirect RDMA PCIe topology
9318e022a gpu: Add CC relates configs
b7932be4b gpu: Add Arm64 Kernel Settings
211b0ab26 gpu: Update Kernel Config
5f103003d gpu: Update kernel building to the latest changes
35e4938e8 tools: Fix no-op builds
347385b4e runtime-rs: Enhance flexibility of virtio-fs config
21d227853 versions: Update firecracker version to 1.3.3
0e2379909 gha: Fix `stage` definition in matrix
ae2cfa826 doc: add vcpu handlint doc for runtime-rs
7b1e67819 fix(clippy): fix clippy error
67972ec48 feat(runtime-rs): calculate initial size
aaa96c749 feat(runtime-rs): modify onlineCpuMemRequest
d66f7572d feat(runtime-rs): clear cpuset in runtime side
a0385e138 feat(runtime-rs): update linux resource when stop_process
a39e1e6cd feat(runtime-rs): merge the update_cgroups in update_linux_resources
fa6dff9f7 feat(runtime-rs): support vcpu resizing on runtime side
8cb4238b4 packaging: Remove snap package
213773998 runtime-rs: update Cargo.lock
56d2ea9b7 kata-ctl: Refactor kernel module check
9f7a45996 gha: Add `rootfs-initrd-mariner` build target
f28a62164 gha: Add `cloud-hypervisor-glibc` build target
8fb7ab751 dragonball: introduce virtio-balloon device
7ed949497 dragonball: introduce virtio-mem device
776a15e09 runtime-rs: add support direct volume.
a8e0f51c5 dragonball: extend DeviceOpContext
abae11404 runtime-rs: refactor device manager implementation
210a15794 dragonball: avoid obtaining lock twice in create_stdio_console
69668ce87 tests: gha-run: Use correct env variable for repo
f487199ed gha: aks: Fix argument in call to gha-run.sh
f6afae9c7 packaging: Add rootfs-image-tdx-tarball target
f62b2670c config: Add root hash value and measure config to kernel params
008058807 kernel: Integrate initramfs into Guest kernel
28b264562 initramfs: Add build script to generate initramfs
5cb02a806 image-build: generate root hash as an separate partition for rootfs
31c0ad207 packaging: Add cryptsetup support in Guest kernel and rootfs
980d084f4 log-parser: Update log parser link at README
410bc1814 agent-ctl: fix the compile error
77519fd12 kata-ctl: Switch to slog logging; add --log-level, --json-logging args
aab603096 gha: aks: Extract `run` commands to a script
e4eb664d2 runtime-rs: update rust to 1.69.0
ed37715e0 runtime-rs: handle copy files when share_fs is not available
5f6fc3ed7 runtime-rs: bugfix: update Cargo.lock
1c6d22c80 gha: aks: Use short SHA in cluster name
3c1f6d36d readme: Update Kata Containers logo
388684113 readme: Add status badge for the "Publish Artefacts" job
26f752038 kata-deploy: Change how we get the Ubuntu k8s key
aebd3b47d gha: aks: Ensure host_os is used everywhere needed
0c8282c22 gha: aks: Add the host_os as part of the aks cluster's name
4b89a6bda release: Standardize kata static file name
9228815ad  kernel: Modify build-kernel.sh to accomodate for changes in version.yaml
03027a739 gha: Fix Mariner cluster creation
43e73bdef packaging: make BUILDER_REGISTRY configurable
ffe3157a4 dragonball: add arm64 patches for upcall
560442e6e dragonball: add vcpu_boot_onlined vector
e31772cfe dragonball: add support resize_vcpu on aarch64
64c764c14 dragonball: update dbs-boot to v0.4.0
fd9b41464 dragonball: update comment for init_microvm
af16d3fca gha: Unbreak CI and fix cluster creation step
5ddc4f94c runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.
25d2fb0fd agent: fix the issue of exec hang with a backgroud process
4af4ced1a gha: Create Mariner host as part of k8s tests
eee7aae71 runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts
557b84081 gha: aks: Wait longer to start running the tests
c04c872c4 gha: aks: Increase the timeout time
428041624 kata-deploy: Improve shim backup / restore
14c3f1e9f kata-deploy: Fix indentation on kata deploy merge script
0e47cfc4c runtime: sending SIGKILL to qemu
6a0035e41 doc: Update git commands
433b5add4 kubernetes: add agnhost command in pod yaml
c477ac551 dragonball: Convert VirtioNetDeviceMgr function to method
4659facb7 dragonball: Convert BlockDeviceMgr function to method
ee6deef09 dragonball: Remove virtio-net and vsock devices gracefully
2bda92fac netlink: Fix the issue of update_interface

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-31 09:02:07 +02:00
Jiang Liu
b3901c46d6 runtime-rs: ignore errors during clean up sandbox resources
Ignore errors during clean up sandbox resources as much as we can.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-31 13:07:43 +08:00
Chelsea Mafrica
8a2c201719 docs: Update links for pods and kubelet
The links for pods and kubelets no longer work so update to new links
with relevant info.

Fixes #7487

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-07-29 00:38:35 +00:00
Gabriela Cervantes
5a1b5d3672 metrics: Add sysbench pod yaml
This PR adds the sysbench pod yaml for the sysbench performance test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 20:03:15 +00:00
Gabriela Cervantes
ad413d1646 metrics: Add sysbench dockerfile
This PR adds sysbench dockerfile.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 19:58:10 +00:00
Gabriela Cervantes
1512560111 metrics: Add sysbench performance test
This PR adds the sysbench performance test for kata CI.

Fixes #7485

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 19:54:12 +00:00
Gabriela Cervantes
bee1a628bd metrics: Fix json result for tensorflow
This PR fixes the json result for tensorflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 17:02:16 +00:00
Jiang Liu
62e328ca5c runtime-rs: refine implementation of TaskService
Refine implementation of TaskService by making handler_message() a
method.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:33 +08:00
Jiang Liu
458e1bc712 runtime-rs: make send_message() as an method of ServiceManager
Simplify implementation by making send_message() a method of
ServiceManager.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:31 +08:00
Jiang Liu
1cc1c81c9a runtime-rs: fix possibe bug in ServiceManager::run()
Multiple instances of the task service may get registered by
ServiceManager::run(). Fix it by making the operation symmetric.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:30 +08:00
Jiang Liu
1a5f90dc3f runtime-rs: simplify implementation of service crate
Simplify implementation of service crate.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:28 +08:00
Gabriela Cervantes
51cd99c927 metrics: Round axelnet and resnet results
This PR rounds the AlexNet and Resnet results so that they can be
extracted properly.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
3b883bf5a7 metrics: Fix atoi invalid syntax
This PR avoids the strconv.Atoi parsing error when retrieving the
results from the JSON.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
f9dec11a8f checkmetrics: Move checkmetrics to gha-run script
This PR moves checkmetrics to the gha-run script in order to gather
tensorflow information.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
53af71cfd0 checkmetrics: Add AlexNet value for qemu
This PR adds AlexNet value for qemu for checkmetrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
a435d36fe1 checkmetrics: Add Resnet value for qemu
This PR adds the Resnet value for qemu for checkmetrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
a79a3a8e1d checkmetrics: Add alexnet value for clh
This PR adds the AlexNet value for clh for checkmetrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
3c32875046 checkmetrics: Add Resnet value for clh
This PR adds the checkmetrics Resnet value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
08dfaa97aa metrics: General improvements to the tensorflow script
This PR adds general improvements to the tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Gabriela Cervantes
63b8534b41 metrics: Enable Tensorflow metrics for kata CI
This PR enables the Tensorflow benchmark metrics for kata CI.

Fixes #7395

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-28 16:15:22 +00:00
Aurélien
e8f8641988 Merge pull request #7132 from sprt/aks-volume-tests
tests: Add `k8s-volume` and `k8s-file-volume` tests to GHA CI
2023-07-28 08:58:03 -07:00
Fabiano Fidêncio
68b9acfd02 Merge pull request #7474 from GabyCT/topic/upboo
metrics: Update boot time for kata metrics
2023-07-28 17:55:43 +02:00
David Esparza
f89abcbad8 Merge pull request #7473 from GabyCT/topic/addfioreport
metrics: Add FIO report files for kata metrics
2023-07-28 09:37:21 -06:00
Fabiano Fidêncio
c9742d6fa9 Merge pull request #7411 from fidencio/topic/kata-deploy-create-runtime-classes
kata-deploy: Allow runtimeclasses to be created by the daemonset
2023-07-28 16:05:49 +02:00
Yuan-Zhuo
731e7c763f kata-ctl: add monitor subcommand for runtime-rs
The previous kata-monitor in golang could not communicate with runtime-rs
to gather metrics due to different sandbox addresses.
This PR adds the subcommand monitor in kata-ctl to gather metrics from
runtime-rs and monitor itself.

Fixes: #5017

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2023-07-28 17:30:08 +08:00
Yuan-Zhuo
d74639d8c6 kata-ctl: provide the global TIMEOUT for creating MgmtClient
Several functions in kata-ctl need to establish a connection with runtime-rs through MgmtClient.
This PR provides a global TIMEOUT to avoid multiple definitions.

Fixes: #5017

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2023-07-28 17:23:37 +08:00
Yuan-Zhuo
02cc4fe9db runtime-rs: add support for gather metrics in runtime-rs
1. Implemented metrics collection for runtime-rs shim and dragonball hypervisor.
2. Described the currently supported metrics in runtime-rs (docs/design/kata-metrics-in-runtime-rs.md).

Fixes: #5017

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2023-07-28 17:16:51 +08:00
Fabiano Fidêncio
8353aae41a ci: k8s: Rework get_nodes_and_pods_info()
The amount of info we've added seemed unnecessary, and ends up making
our lives even harder when trying to find errors.

Let's just rely on the kata-debug container to collect the needed info
for us.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
6ad5d7112e ci: k8s: Do not gather node info before running the tests
It's been proven to not be useful, and ends up making things more
confusing due to the amount of logs printed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
5261e3a60c ci: k8s: Group messages to improve readability
Right now it's getting way too easy to get lost in the logs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
9cc6b5f461 ci: k8s: Get logs from kata-deploy
Let's make sure we can debug kata-deploy in case something goes wrong
during its execution.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
9d285c6226 ci: k8s: Let kata-deploy take care of the runtimeclasses
By doing this we can test the change done for the daemonset. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
87568ed985 gha: Test split out runtimeclasses are in sync with all-in-one file
This is needed in order to not lose track of what's been created and
what's been added here and there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
39192c6084 kata-deploy: Print variables passed to the script
This will help folks to debug / understand what's been passed to the
kata-deploy.sh script.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
0e157be6f2 kata-deploy: Allow runtimeclasses to be created by the daemonset
Let's allow the daemonset to create the runtimeclasses, which removes
one manual step a user of kata-deploy has to take, and also helps us in
the Confidential Containers land as the Operator can just delegate it
to this script.

Fixes: #7409

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
a274333248 kata-deploy: Change default values of DEBUG
This can be easily done as there was no official release with the
previous values.

The reason we're doing so is because when using `yq` to replace the
value, even when forcing `--tag '!!str' "yes"`, the content is placed
without quotes, causing errors in our CI.

While here, we're also removing the fallback value for DEBUG, as it is
**always** set in the kata-deploy.yaml file.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 09:50:39 +02:00
Fabiano Fidêncio
69535b8089 kata-deploy: runtimeclass: Split out entries
This will make things simpler to only create the handlers defined by the
kata-deploy user.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 09:43:45 +02:00
Fabiano Fidêncio
9e1710674a kata-runtimeClasses: Alphabetically sort the enrties
This will come in handy in the near future, as we want to have separate
entries for each file, while still keeping this one.

Having the entries sorted will make it easier to test that those are
always in sync.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 09:43:45 +02:00
Zhongtao Hu
61a8eabf8e Merge pull request #7139 from openanolis/fix/devmanager
runtime-rs: change block index to 0
2023-07-28 14:04:19 +08:00
Aurélien Bombo
6222bd9103 tests: Add k8s-file-volume test
This imports the k8s-file-volume test from the tests repo and modifies
it slightly to set up the host volume on the AKS host.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-27 14:07:55 -07:00
Aurélien Bombo
187a72d381 tests: Add k8s-volume test
This imports the k8s-volume test from the tests repo and modifies it
slightly to set up the host volume on the AKS host.

Fixes: #6566

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-27 14:06:43 -07:00
Gabriela Cervantes
0c84270357 metrics: Add boot time value for qemu
This PR adds the boot time value and limit for qemu.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 20:06:24 +00:00
Gabriela Cervantes
6520dfee37 metrics: Update boot time for kata metrics
This PR updates the boot time limit for kata metrics.

Fixes #7475

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 19:14:19 +00:00
Gabriela Cervantes
ff22790617 metrics: Update runtime and configuration paths
This PR updates the runtime and configuration paths for kata containers.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 17:14:03 +00:00
Gabriela Cervantes
a5d4e33880 metrics: Add compare virtiofsd dax script
This PR adds the compare virtiofsd dax script for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 16:53:50 +00:00
Gabriela Cervantes
5e937fa622 metrics: Update general FIO tests
This PR updates general FIO tests by adding the recent date of a change.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 16:47:17 +00:00
Gabriela Cervantes
b0bea47c53 metrics: Add makefile to report generator
This PR adds the makefile to report generator for the FIO test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 16:42:11 +00:00
Gabriela Cervantes
73c57b9a19 metrics: Add FIO report files for kata metrics
This PR adds FIO report files for kata metrics.

Fixes #7472

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-27 16:39:35 +00:00
Chelsea Mafrica
e941b3a094 Merge pull request #7456 from alakesh/agent-fix-typo
agent: fix typo in constant
2023-07-27 09:31:24 -07:00
David Esparza
ba8a8fcbf2 Merge pull request #7442 from GabyCT/topic/addgofilesfio
metrics: Add FIO benchmark for metrics tests
2023-07-27 10:20:43 -06:00
Zhongtao Hu
c8fcd29d9b runtime-rs: use device manager to handle virtio-pmem
use device manager to handle virtio-pmem device

Fixes: #7119
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-07-27 20:18:49 +08:00
Zhongtao Hu
901c192251 runtime-rs: support configure vm_rootfs_driver
Support configuring vm_rootfs_driver in the TOML config.

Fixes: #7119
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-07-27 20:12:53 +08:00
Zhongtao Hu
5d6199f9bc runtime-rs: use device manager to handle vm rootfs
Use the device manager to handle the VM rootfs. After attaching the block
device of the VM rootfs, we need to increase the index number.

Fixes: #7119
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-07-27 20:12:45 +08:00
James O. D. Hunt
20f1f62a2a runtime-rs: change block index to 0
Change block index in SharedInfo to 0 for vda.

Fixes #7119

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-07-27 20:11:44 +08:00
Chao Wu
ede1dae65d Merge pull request #7465 from fidencio/topic/fix-dragonball-static-check-runner-selector
gha: dragonball: Run only on the dragonball labeled machine
2023-07-27 10:19:26 +08:00
Gabriela Cervantes
662f87539e metrics: Add general FIO makefile
This PR adds a general FIO makefile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-26 20:46:02 +00:00
Fabiano Fidêncio
f28af98ac6 Merge pull request #7453 from sprt/fix-ci-node-debugger
tests: Fix `k8s-job` test
2023-07-26 22:27:21 +02:00
Fabiano Fidêncio
8a22b5f075 Merge pull request #7439 from ManaSugi/fix/remove-unused-mut
agent,libs: Remove unused 'mut' keywords
2023-07-26 21:25:41 +02:00
Fabiano Fidêncio
9792ac49fe Merge pull request #7425 from jongwu/remove_mut
runtime-rs: remove unneeded 'mut' keywords
2023-07-26 21:24:40 +02:00
Fabiano Fidêncio
24564a8499 Merge pull request #7455 from sprt/local-tests
tests: QoL improvements for running tests locally
2023-07-26 21:23:43 +02:00
Aurélien Bombo
c5a87eed29 tests: gha: Add timeout to cluster creation
This has been intermittently taking a while lately so let's add a
timeout.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-26 10:19:07 -07:00
Aurélien Bombo
6daeb08e69 tests: k8s: Clean up node debuggers after running
This deletes node debugger pods after execution since their presence may
affect tests that assume only test workloads pods are present.

For example, in `k8s-job` we wait for *any* pod to be in the `Succeeded`
state before proceeding, which causes failures.

Fixes: #7452

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-26 10:19:07 -07:00
Fabiano Fidêncio
3aa6c77a01 gha: dragonball: Run only on the dragonball labeled machine
Static checks for dragonball are landing on any of the self-hosted
runners, and the reason for that is because "self-hosted" was the label
selector used.

Let's use "dragonball" instead, as the machine has that label as well.

Fixes: #7464

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-26 18:15:04 +02:00
Gabriela Cervantes
37641a5430 metrics: Add example config for fio jobs
This PR adds example config for fio jobs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-26 16:03:12 +00:00
Alakesh Haloi
314aec73d4 agent: fix typo in constant
It fixes a constant name to have the right spelling

Fixes: #7457
Signed-off-by: Alakesh Haloi <a_haloi@apple.com>
2023-07-26 00:06:34 -05:00
Aurélien Bombo
4703434b12 tests: k8s: Allow using custom resource group
This simply allows setting a custom resource group when debugging
locally, so as to prevent name collisions and not pollute the namespace.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-25 15:45:44 -07:00
Aurélien Bombo
350f3f70b7 tests: Import common.bash in run_kubernetes_tests.sh
Not sure why this works in GHA, but the `info` call on line 65 would
fail locally.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-25 15:45:44 -07:00
Aurélien Bombo
d7f04a64a0 tests: k8s: Leave runtimeclass_workloads/ alone
Makes it so that `setup.sh` doesn't make changes in
`runtimeclass_workloads/` directly. Instead we treat that as a template
directory and we use the new directory `runtimeclass_workloads_work/` as
a work dir.

This has two advantages:

 * Allows rerunning tests without the assumption that `setup.sh` must be
   idempotent. E.g. the `set_runtime_class()` step would break.
 * Doesn't pollute your git environment with a bunch of changes when
   developing.
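
A minimal sketch of the template/work-dir pattern described above, not the
actual `setup.sh`. The `add_runtime_class` helper and the `pod.yaml` contents
are hypothetical stand-ins for a non-idempotent step; only the two directory
names come from this commit.

```sh
set -eu

# Throwaway sandbox so the sketch never touches a real checkout.
root="$(mktemp -d)"
trap 'rm -rf "${root}"' EXIT

template_dir="${root}/runtimeclass_workloads"
work_dir="${root}/runtimeclass_workloads_work"

mkdir -p "${template_dir}"
printf 'kind: Pod\nspec:\n' > "${template_dir}/pod.yaml"

# Hypothetical non-idempotent step: appends a line on every call.
add_runtime_class() {
    printf '  runtimeClassName: kata\n' >> "$1"
}

# Recreate the work dir from the pristine template on every run,
# then modify only the copy.
setup() {
    rm -rf "${work_dir}"
    cp -R "${template_dir}" "${work_dir}"
    add_runtime_class "${work_dir}/pod.yaml"
}

setup
setup  # rerunning is safe because the work dir is rebuilt each time

work_count="$(grep -c 'runtimeClassName' "${work_dir}/pod.yaml")"
template_count="$(grep -c 'runtimeClassName' "${template_dir}/pod.yaml" || true)"
echo "work copy: ${work_count}, template: ${template_count}"
```

Running `setup` twice leaves a single `runtimeClassName` line in the work copy
and none in the template, which is what both advantages listed above rely on.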

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-25 15:45:44 -07:00
Aurélien Bombo
bdde6aa948 tests: k8s: Split deployment and testing commands
This splits deploying Kata and running the tests into separate commands
to make it possible to rerun tests locally without having to redeploy
Kata each time.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-25 15:44:46 -07:00
Aurélien Bombo
91a0b3b406 tests: aks: Simply delete cluster when cleaning up
If we're going to delete the cluster anyway, no need to call
kata-cleanup.

Fixes: #7454

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-25 15:44:46 -07:00
Gabriela Cervantes
3c1044d9d5 metrics: Update FIO paths for k8s runner
This PR updates the FIO paths for k8s runner.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-25 20:50:03 +00:00
Eric Ernst
5385ddc560 Merge pull request #7365 from alakesh/symlink-fix
agent: exclude symlinks from recursive ownership change
2023-07-25 11:27:48 -07:00
Gabriela Cervantes
6177a0db3e metrics: Add env files for FIO
This PR adds the env files for FIO for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-25 17:48:45 +00:00
Gabriela Cervantes
a45900324d metrics: Add fio exec
This PR adds fio exec for the FIO benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-25 17:36:08 +00:00
Gabriela Cervantes
ea198fddcc metrics: Add FIO runner k8s
Add program to execute FIO workloads using k8s.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-25 17:34:29 +00:00
Gabriela Cervantes
8f7ef41c14 metrics: Add FIO vendor code
This PR adds the FIO vendor code.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-25 17:24:29 +00:00
Gabriela Cervantes
6293c17bde metrics: Add FIO benchmark for metrics tests
This PR adds the FIO benchmark scripts and resources for the metrics
tests section.

Fixes #7441

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-25 16:36:33 +00:00
Fabiano Fidêncio
cdf04e5018 Merge pull request #7437 from jepio/fix-sev-kernel-cache
cache: kernel: Fix kernel caching
2023-07-25 18:10:03 +02:00
GabyCT
7a3b55ce67 Merge pull request #7432 from ManaSugi/runk/doc-docker
runk: Add Docker guide to README
2023-07-25 09:56:02 -06:00
GabyCT
c1bd527163 Merge pull request #7430 from GabyCT/topic/fixjson
metrics: General improvements to json.bash script
2023-07-25 09:45:53 -06:00
Fabiano Fidêncio
6efd684a46 Merge pull request #7408 from fidencio/topic/kata-deploy-add-SHIMS-and-SHIM_DEFAULT-as-env
kata-deploy: Allow shim creation based on what's passed to the daemonset
2023-07-25 16:56:46 +02:00
Fabiano Fidêncio
5b82268d2c Merge pull request #7436 from jepio/vfio-gha
gha: ci: Add skeleton of vfio job
2023-07-25 14:44:04 +02:00
Manabu Sugimoto
ff4cfcd8a2 runk: Add Docker guide to README
`runk` can launch containers using Docker, so add the guide
to its README.

```sh
$ sudo dockerd --experimental --add-runtime="runk=/usr/local/bin/runk"
$ sudo docker run -it --rm --runtime runk busybox echo hello runk
hello runk
```

Fixes: #7431

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-07-25 20:10:49 +09:00
Jeremi Piotrowski
c8ac56569a cache: kernel: Harmonize commit with fetching side
kata-deploy-binaries.sh uses the last commit in
tools/packaging/static-build/kernel for its version check, while the cache
generation uses tools/packaging/kernel. Use tools/packaging/static-build/kernel
as $kata_config_version is already part of the version string and covers any
changes to tools/packaging/kernel.

Fixes: #7403
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-25 12:23:05 +02:00
Jeremi Piotrowski
81775ab1b3 cache: kernel: Fix SEV kernel caching
The SEV kernel cache calls create_cache_asset() twice, once for the kernel and
once for modules. Both calls need to use the same version string, otherwise the
second call overwrites the "latest" file of the first one and the cache is not
used.

Fixes: #7403
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-25 11:58:19 +02:00
Jeremi Piotrowski
717f775f30 gha: ci: Add skeleton of vfio job
This job will run on a nested virt capable Azure VM (improving test
concurrency). This is just a placeholder while we adapt the test to GHA.

Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-25 11:13:04 +02:00
Manabu Sugimoto
b9f100b391 agent,libs: Remove unused 'mut' keywords
Remove unused `mut` because the agent compilation fails
when the rust compiler is >= 1.71. This is related to #7425

Fixes: #7438

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-07-25 17:41:08 +09:00
Fabiano Fidêncio
a56f96bb2b kata-deploy: Allow shim creation based on what's passed to the daemonset
Instead of hardcoding shims as part of the script, let's ensure we can
allow them to be created based on environment variables passed to the
daemonset.

This brings no functional change, as the default values in the
daemonset are exactly what has been used as part of the scripts.

Fixes: #7407

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-25 08:30:00 +02:00
Fabiano Fidêncio
5ce0b4743f Merge pull request #7382 from zvonkok/vfio-ap-debug
s390x: Fixing device.Bus assignment
2023-07-25 08:26:25 +02:00
David Esparza
b11d618a3f Merge pull request #7413 from fidencio/topic/release-publish-builder-images
release: Mention the container images used to build the project
2023-07-24 15:46:31 -06:00
Fabiano Fidêncio
56fdeb1247 Merge pull request #7417 from fidencio/topic/kata-deploy-binaries-cached-kernel-fix
kata-deploy-binaries: kernel_cache: Take module_dir into account
2023-07-24 22:26:09 +02:00
Gabriela Cervantes
4a5ab38f16 metrics: General improvements to json.bash script
This PR adds general improvements, such as putting the `function` keyword
before function names and declaring variables consistently, to have
uniformity across the metrics scripts.

Fixes #7429

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-24 16:51:38 +00:00
Fabiano Fidêncio
d4eba36980 kata-deploy-binaries: kernel_cache: Take module_dir into account
`module_dir` has been passed to the function but was never assigned to a
var, leading to errors when trying to use it.

Fixes: #7416

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-24 18:19:13 +02:00
Fabiano Fidêncio
b7c9867d60 release: Mention the container images used to build the project
This is a small step towards build reproducibility.

Fixes: #7412

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-24 18:01:57 +02:00
Wainer Moschetta
2e9853c761 Merge pull request #7427 from fidencio/topic/gha-port-nydus-tests-follow-up-1
ci: nydus: Fix typo in "source"
2023-07-24 11:20:05 -03:00
Fabiano Fidêncio
7c4b597816 ci: nydus: Fix typo in "source"
We should source from `nydus_dir`, instead of `cri_containerd_dir`, and
that was a leftover from fb4f7a002c.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-24 14:55:09 +02:00
Fabiano Fidêncio
589672d510 Merge pull request #7426 from fidencio/topic/gha-port-nydus-tests
gha: ci: Add no-op nydus tests to our CI
2023-07-24 13:56:57 +02:00
Fabiano Fidêncio
6a680e241b gha: ci: Add placeholder for the nydus tests as part of the CI
This will trigger the nydus tests, but as they currently are they'll just
return "okay" without actually executing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-24 13:37:36 +02:00
Fabiano Fidêncio
fb4f7a002c gha: nydus: Add a no-op GHA for nydus
This newly added GHA does nothing, is not even triggered, and it's just
a placeholder that we'll grow in the next commits / PRs, so we can
actually start running the nydus tests as part of our CI.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-24 13:37:33 +02:00
Fupan Li
0ae987973b Merge pull request #7367 from openanolis/chao/migrate_dragonball_sandbox
Dragonball: migrate dragonball-sandbox crates to Kata
2023-07-24 17:52:11 +08:00
Fabiano Fidêncio
4a207a16f9 gha: nydus: Bring tests as they are from the tests repo
Let's bring the nydus tests, without any kind of modification, from the
tests repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-24 10:56:41 +02:00
Jianyong Wu
2c8f83424d runtime-rs: remove unneeded 'mut' keywords
These unneeded 'mut' keywords block builds with Rust 1.71.0. Remove them.

Fixes: #7424
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-07-24 08:47:15 +00:00
Zvonko Kaiser
1fc715bc65 s390x: Add AP Attach/Detach test
Now that we have proper AP device support, add a
unit test for testing the correct Attach/Detach of AP devices.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-23 13:44:19 +00:00
Fabiano Fidêncio
e1a4040a6c Merge pull request #7326 from fidencio/topic/gha-ci-add-cri-containerd-tests
ci: gha: Add cri-containerd tests (but still do not enable them)
2023-07-21 19:29:38 +02:00
Fabiano Fidêncio
6a59e227b6 Merge pull request #7399 from fidencio/topic/add-kata-debug
packaging/tools: Add kata-debug and use it as part of our CI
2023-07-21 17:05:27 +02:00
Fabiano Fidêncio
e91f5edba0 ci: cri-containerd: Fix default typo for testContainerStart()
It must be {1:-0}, instead of {1-0}.
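
For illustration only, the difference between the two expansions can be
sketched with two hypothetical helpers (they are not the real
testContainerStart()): `${1:-0}` falls back to 0 when the argument is unset
or empty, while `${1-0}` falls back only when it is unset.

```sh
# Hypothetical helpers, used only to compare the two expansions.
with_colon() { echo "${1:-0}"; }      # default applies if $1 is unset OR empty
without_colon() { echo "${1-0}"; }    # default applies only if $1 is unset

unset_colon="$(with_colon)"           # no argument passed
unset_plain="$(without_colon)"        # no argument passed
empty_colon="$(with_colon "")"        # empty argument passed
empty_plain="$(without_colon "")"     # empty argument passed

echo "unset: '${unset_colon}' vs '${unset_plain}'"
echo "empty: '${empty_colon}' vs '${empty_plain}'"
```

The two forms only differ for an empty argument, which is the case the
`:-` form guards against.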

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
8b8aef09af ci: cri-containerd: Temporarily disable TestContainerSwap
The test is currently failing with GHA, and I don't think it makes sense
to block all the other tests to get merged while it's happening.

For now, let's disable it and re-enable it as soon as we have it
passing.

Reference: https://github.com/kata-containers/kata-containers/issues/7410

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
56767001cb ci: cri-containerd: Add namespace / uid to the pods
Otherwise crictl will fail to remove them with:
```
getting sandbox status of pod "$pod": metadata.Name, metadata.Namespace
or metadata.Uid is not in metadata "..."
```

A huge shout out to Steven Horsman for helping to debug this one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
a84773652c ci: cri-containerd: Always use sudo to call crictl
Otherwise we may get the following error:
```
time="2023-07-15T21:12:13Z" level=fatal msg="validate service connection: validate CRI v1 runtime API for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: permission denied\""
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
99ba86a1b2 ci: cri-containerd: Add /usr/local/go/bin to the PATH
Otherwise go is not picked up.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
7f3b309997 ci: cri-containerd: Add function before each function
We've been doing this for all files moved to this repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
fde22d6bce ci: cri-containerd: Assume podman is always used
For this set of tests, we'll always be using podman in order to avoid
having containerd pulled in by docker.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
9465a04963 ci: cri-containerd: Adapt "source ..." to this repo
Let's adapt what we "source" to the kata-containers repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
df8d144119 ci: cri-containerd: Remove CI variable
We always want to run the tests using as much debug as possible.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
f90570aef0 ci: cri-containerd: Remove unused runc_runtime_bin
The variable is not used anywhere in our tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
c3637039f4 ci: cri-containerd: Remove KILL_VMM_TEST env var
We don't need the env var, we just need to restrict the test according
to the KATA_HYPERVISOR used, as right now it's very specific to QEMU.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
bc4919f9b2 ci: cri-containerd: Always run shim-v2 tests
We only have shim-v2 as the runtime type, so we always need to run tests
using it. :-)

We had to adjust the script in order to properly run the tests with the
current logic.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
f9e332c6db ci: cri-containerd: Stop cloning containerd
It's already done as part of the install_dependencies()

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
cfd662fee9 ci: cri-containerd: Remove ununsed SNAP_CI var
We don't support SNAP anymore, thus we can remove the var.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
d36c3395c0 ci: cri-containerd: Update copyright
As we're touching the file already, let's update its Copyright info.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
b5be8a4a8f ci: cri-containerd: Move integration-tests.sh as it was
Let's move the `integration/containerd/cri/integration-tests.sh` file
from the tests repo to this one.

The file has been moved as it is, it's not used, and in the following
commits we'll clean it up before actually using it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
f2e00c95c0 ci: cri-containerd: Populate install_dependencies()
Let's install all the dependencies needed for running the
`cri-containerd` tests.

The list of dependencies we have are:
* From the system
  - build-essential
  - jq
  - podman-docker
* From our own repo
  - yq
  - go
* From GitHub projects
  - containerd
  - cri-tools

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
8979552527 versions: Add "latest" field for cri-tools
As we don't want to disrupt what we have on the `tests` repo, let's
create a "latest" entry and use that for the GitHub actions tests.

Once we deprecate the `tests` repo we can decide whether we want to
stick to using "latest" or switch back to "version".

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
1bbcbafa67 ci: Add clone_cri_container()
This function will simply clone containerd repo, specifically on a tag
we want to use to test.

This can be expanded for different projects, and it will be the case as
soon as we grow the tests.  But, for now, let's keep it simple.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
f66c68a2bf ci: Add install_cri_tools()
This function will install cri-tools in the host, and soon enough (as
part of this PR) we'll be using it to install cri-tools as part of the
cri-containerd tests.

I've decided to have this as part of the `common.bash` as other tests
that will be added in the future will require cri-tools to be installed
as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
4dd828414f ci: Add install_cri_containerd()
This function will install cri-containerd in the host, and soon enough
(as part of this PR) we'll be using it to install cri-containerd as part
of the cri-containerd tests.

I've decided to have this as part of the `common.bash` as other tests
that will be added in the future will require cri-containerd to be
installed as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
ad47d1b9f8 ci: Add download_github_project_tarball()
This function will help us to get the tarball, from a GitHub project,
that we're going to use as part of our tests.

Right now this is not used anywhere, but it'll soon enough (as part of
this series) be used to download the cri-containerd / cri-tools / cni
tarballs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
788c562a95 ci: Add get_latest_patch_release_from_a_github_project()
This function will help us to get the latest patch release from a
GitHub project.

The idea behind this function is that we don't have to keep updating
versions.yaml that frequently (or worse, have it outdated as it
currently is), and always test against the latest patch release of a
given project's version that we care about.
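
A minimal sketch of the idea, assuming the available tags are already at
hand, one per line on stdin. The real function queries GitHub, which this
sketch does not do, and `latest_patch` is a hypothetical name. It relies on
`sort -V` (version sort) being available:

```sh
# Hypothetical helper: pick the newest patch release of a given minor version.
# $1: base version such as "1.7"; stdin: one tag per line (e.g. "v1.7.2").
latest_patch() {
    # Escape the dots so "1.7" does not also match e.g. "1x7".
    base="$(printf '%s' "$1" | sed 's/\./\\./g')"
    grep -E "^v${base}\.[0-9]+$" | sort -V | tail -n 1
}

# Made-up tag list, for illustration only.
latest="$(printf 'v1.7.2\nv1.7.10\nv1.6.9\nv1.7.9\n' | latest_patch 1.7)"
echo "${latest}"
```

A version sort matters here: a plain lexical sort would rank v1.7.9 above
v1.7.10.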

Although right now this is not used anywhere, this will be used with the
coming cri-containerd tests, which will be part of this series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
6742f3a898 ci: Use function before each install_go.sh function
We've been doing this for all files moved to this repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
5eacecffc3 ci: Adjust paths for install_go.sh
Let's adjust paths for what we source and the scripts we call, after
moving from the tests repo to this one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
8ed1595f96 ci: Update copyright for install_go.sh
As we're touching the file already, let's update its Copyright info.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
6123d0db2c ci: Move install_go.sh as it was
Let's move `.ci/install_go.sh` file from the tests repo to this one.

The file has been moved as it is, it's not used, and in the following
commits we'll clean it up before actually using it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
8653be71b2 ci: Do not take cross-build into consideration for kata-arch.sh
Right now we'd need to import lib.sh just in order to get cross-build
information for rust, and it seems a little bit premature to do so at
this stage and only for rust.

Let's skip it and keep this transition simple.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
6a76bf92cb ci: Fix style / identation if kata-arch.sh
We've been using:
```
function foo() {
}
```

instead of
```
function foo()
{
}
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
72743851c1 ci: Add function before each kata-arch.sh function
We've been doing this for all files moved to this repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
9f6d4892c8 ci: Update copyright for kata-arch.sh
As we're touching the file already, let's update its Copyright info.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
6f73a72839 ci: Move kata-arch.sh as it was
Let's move `.ci/kata-arch.sh` file from the tests repo to this one.

The file has been moved as it is, it's not used, and in the following
commits we'll clean it up before actually using it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
3615d73433 ci: Add get_from_kata_deps()
First of all, I'm 100% aware that I'm duplicating this function here as
I've copied it from the packaging stuff, and I'm not exactly proud of
that.

However, right now it seems a little bit premature to combine that set
of scripts with this set of scripts in a single one and make them used
by both pieces of our project.

Anyway, this function helps to get information from the
`versions.yaml` file, and it'll be used as part of the cri-containerd
tests and a few others in the future.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
34779491e0 gha: kubernetes: Avoid declaring repo_root_dir
This is already declared as part of the `common.bash` file, so let's
just make sure we use it from there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
f3738beaca tests: Use $HOME/go as fallback for $GOPATH
Considering that someone may want to run the tests locally, we shouldn't
rely on having GITHUB_WORKSPACE exported, and should fall back to $HOME/go if
needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
b87ed27416 tests: Move ensure_yq to common.bash
As this function will be used by different scripts, let's move it to a
common place.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Jeremi Piotrowski
124e390333 tests: common: Fix quoting when globbing
When the glob star is inside quotes, there is only one iteration of the loop
and b holds all matches at once. Move the glob out of the quotes so that we
actually iterate over matched paths.
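
The behaviour can be reproduced with a small, self-contained sketch (the
temporary directory and file names are made up for the example). With the
star inside the quotes the loop body runs once and b holds the pattern as a
single word, which only expands to all matches when used unquoted later.
With the star outside the quotes the loop runs once per matching path.

```sh
# Scratch directory with two files to match against.
tmpdir="$(mktemp -d)"
touch "${tmpdir}/a.log" "${tmpdir}/b.log"

# Glob inside the quotes: no pathname expansion, a single iteration.
quoted=0
for b in "${tmpdir}/*.log"; do
    quoted=$((quoted + 1))
done

# Glob outside the quotes: one iteration per matching path.
unquoted=0
for b in "${tmpdir}"/*.log; do
    unquoted=$((unquoted + 1))
done

rm -rf "${tmpdir}"
echo "quoted=${quoted} unquoted=${unquoted}"
```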

Fixes: #6543
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
db77c9a438 tests: Make install_kata take care of the links
It makes the kata-containers installation more complete.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
13715db1f8 tests: Do not call install_check_metrics when installing kata
The `install_kata` function was moved from the metrics' `gha-run.sh`
file to the `common.bash` in the commit 3ffd48bc16, but I didn't notice
that it brought with it a call to `install_check_metrics`, which is
totally unrelated to installing Kata Containers.

Let's remove the call so the function is a little bit less specific, and
move the call to install_check_metrics to the metrics `gha-run.sh` file.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 16:54:27 +02:00
Fabiano Fidêncio
e149a3c783 Merge pull request #7404 from fidencio/topic/cache-consider-changes-in-the-scripts-used-to-build-the-kernel
cache: kernel: Consider changes in tools/packaging/kernel
2023-07-21 15:05:01 +02:00
Fabiano Fidêncio
630634c5df ci: k8s: Group logs to make them easier to read
Otherwise it becomes really hard to find the info you're looking for.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 14:05:30 +02:00
Fabiano Fidêncio
228b30f31c ci: k8s: Gather node info during the cleanup
This will make our lives easier to debug issues with the CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 14:05:30 +02:00
Fabiano Fidêncio
81f99543ec ci: k8s: Cleanup cluster before deleting it
This will help us on two fronts:
* catching possible issues related to kata-deploy cleanup
* do more (like, in the future, collect logs) after the tests run

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 14:05:30 +02:00
Fabiano Fidêncio
38a7b5325f packaging/tools: Add kata-debug
kata-debug is a tool that is used as part of the Kata Containers CI to gather
information from the node, in order to help debugging issues with Kata
Containers.

As one can imagine, this can be expanded and used outside of the CI context,
and any contribution back to the script is very much welcome.

The resulting container is stored at the [Kata Containers quay.io
space](https://quay.io/repository/kata-containers/kata-debug) and can
be used as shown below:
```sh
kubectl debug $NODE_NAME -it --image=quay.io/kata-containers/kata-debug:latest
```

Fixes: #7397

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 14:05:30 +02:00
Fabiano Fidêncio
a0fd41fd37 Merge pull request #7406 from fidencio/topic/merge-tarball-fix-version-yaml-not-found
kata-deploy: Properly get the path of the versions.yaml file
2023-07-21 14:04:18 +02:00
Fabiano Fidêncio
ae6e8d2b38 kata-deploy: Properly get the path of the versions.yaml file
We need to correctly get the full path of the versions.yaml file as part
of the merge-builds.sh script, as we do a `pushd` there and that leads
to a failure when merging the artefacts, as the `versions.yaml` file does not
exist in that path.
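
A sketch of the failure mode, using a throwaway directory layout that is an
assumption for this example and not the real merge-builds.sh. A relative
path stops resolving once the working directory changes, while a path made
absolute beforehand keeps working. Plain `cd` stands in for `pushd` to keep
the sketch portable.

```sh
start_dir="$(pwd)"
workdir="$(mktemp -d)"
mkdir -p "${workdir}/repo/build"
echo "placeholder" > "${workdir}/repo/versions.yaml"

cd "${workdir}/repo"
versions_rel="versions.yaml"
# Resolve to an absolute path *before* changing directory.
versions_abs="$(pwd)/${versions_rel}"

cd build   # stands in for the pushd done by the script

if [ -e "${versions_rel}" ]; then rel_found=yes; else rel_found=no; fi
if [ -e "${versions_abs}" ]; then abs_found=yes; else abs_found=no; fi

cd "${start_dir}"
rm -rf "${workdir}"
echo "relative: ${rel_found}, absolute: ${abs_found}"
```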

Fixes: #7405

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 12:02:11 +02:00
Fabiano Fidêncio
309e232553 cache: kernel: Consider changes in tools/packaging/kernel
Any change in the script used to build the kernel should invalidate the
cache.

Fixes: #7403

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-21 11:48:29 +02:00
GabyCT
f95a7896b1 Merge pull request #7394 from fidencio/topic/ship-VERSIOB-and-versions.yaml-as-part-of-release-tarball
kata-deploy: Add VERSION and versions.yaml to the final tarball
2023-07-20 14:38:21 -06:00
GabyCT
14025baafe Merge pull request #7376 from GabyCT/topic/addcray
metrics: Add C-Ray performance test
2023-07-20 14:37:53 -06:00
GabyCT
b629f6a822 Merge pull request #7363 from GabyCT/topic/enabletensorflow
metrics: enable TensorFlow benchmark to be run on gha
2023-07-20 13:36:55 -06:00
Fabiano Fidêncio
59fdd69b85 kata-deploy: Add VERSION and versions.yaml to the final tarball
Let's make things simpler to figure out which version of Kata
Containers has been deployed, and also which artefacts come with it.

This will help us immensely in the future, for the TEEs use case, so we
can easily know whether we can deploy a specific guest kernel for a
specific host kernel.

Fixes: #7394

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-20 18:33:14 +02:00
Fabiano Fidêncio
5dddd7c5d1 release: Upload versions.yaml as part of the release
Although this file is far from being an SBOM, it'll help folks to
easily visualise which components are part of a release, and even have
SBOMs generated from that.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-20 18:31:21 +02:00
Gabriela Cervantes
bad3ac84b0 metrics: Rename C-Ray to cpu performance tests
This PR renames C-Ray tests to cpu category.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-20 15:56:02 +00:00
Fabiano Fidêncio
87d99a71ec versions: Remove "kernel-experimental"
We've not been using nor shipping this kernel for a very long time.

Regardless, we're leaving behind the logic in the kernel scripts to
build it, in case it becomes necessary in the future.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-20 17:14:22 +02:00
Zvonko Kaiser
545de5042a vfio: Fix tests
Now with more elaborate checking of cold|hot plug ports
we needed to update some of the tests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-20 13:42:44 +00:00
Zvonko Kaiser
62aa6750ec vfio: Added better handling of VFIO Control Devices
Depending on the vfio_mode we need to mount the
VFIO control device additionally into the container.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-20 13:42:42 +00:00
Fabiano Fidêncio
fe07ac662d Merge pull request #7387 from GabyCT/topic/fixmemoryinsidec
metrics: Add function to memory inside container script
2023-07-20 10:06:15 +02:00
Zvonko Kaiser
dd422ccb69 vfio: Remove obsolete HotplugVFIOonRootBus
Remove HotplugVFIOonRootBus, which is obsolete with the latest PCI
topology changes. Users can set cold_plug_vfio or hot_plug_vfio either
in the configuration.toml or via annotations.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-20 07:25:40 +00:00
Zvonko Kaiser
114542e2ba s390x: Fixing device.Bus assignment
The device.Bus was reset if a specific combination of
configuration parameters was not met. With the new
PCIe topology this should not happen anymore.

Fixes: #7381

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-20 07:24:26 +00:00
Alakesh Haloi
371a118ad0 agent: exclude symlinks from recursive ownership change
Currently, when fsGroup is used with direct-assign, the kata agent
recursively changes ownership and permission for each file, including
symlinks. However, the problem with symlinks is that the permission of
the symlink itself may not be the same as that of the underlying file. So while
doing recursive ownership and permission changes we should skip
symlinks.
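
The agent itself is written in Rust. As a rough shell analogue only, and not
the agent's code, the same selection rule can be sketched with `find`, which
does not follow symlinks by default and can exclude the symlink entries
themselves with `! -type l`. The directory layout below is made up for the
example.

```sh
# Scratch tree: one regular file and one symlink pointing at it.
root="$(mktemp -d)"
echo data > "${root}/file"
ln -s file "${root}/link"

# Entries a recursive ownership/permission change would touch: not symlinks.
targets="$(find "${root}" ! -type l | sort)"
target_count="$(printf '%s\n' "${targets}" | wc -l | tr -d ' ')"

# Entries that are skipped: the symlinks themselves.
skipped="$(find "${root}" -type l)"

rm -rf "${root}"
echo "would change ${target_count} entries, skipping: ${skipped}"
```

In this layout the directory and the regular file are selected, and the
symlink is left alone.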

Fixes: #7364
Signed-off-by: Alakesh Haloi <a_haloi@apple.com>
2023-07-19 20:42:55 -07:00
Gabriela Cervantes
e64edf41e5 metrics: Add tensorflow function in gha-run script
This PR adds the tensorflow function in gha-run script in order to
be triggered in the gha.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-19 21:31:51 +00:00
Gabriela Cervantes
67a6fff4f7 metrics: Enable tensorflow benchmark on gha
This PR enables the TensorFlow benchmark on gha for the kata metrics CI.

Fixes #7362

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-19 21:31:51 +00:00
GabyCT
c3f21c36f3 Merge pull request #7388 from dborquez/revert-commit-broke-checkmetrics-baseline-values
Revert "metrics: Replace backslashes used to escape double quoted key in jq expr"
2023-07-19 14:36:16 -06:00
David Esparza
01450deb6a Revert "metrics: Replace backslashes used to escape double quoted key in jq expr."
This reverts commit 468f017e21.

Fixes: #7385

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-19 10:07:11 -06:00
Gabriela Cervantes
8430068058 metrics: Add function to memory inside container script
This PR adds the `function` keyword before the function names in the memory
inside container script in order to have uniformity across the script.

Fixes #7386

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-19 16:00:53 +00:00
Chao Wu
bbd3c1b6ab Dragonball: migrate dragonball-sandbox crates to Kata
In order to make it easier for developers to contribute to Dragonball,
we have decided to migrate all dragonball-sandbox crates to Kata.

fixes: #7262

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-07-19 19:41:57 +08:00
Chao Wu
7153b51578 Merge pull request #7372 from fidencio/topic/bump-virtiofsd-to-v1.7.0
versions: Bump virtiofsd to v1.7.0
2023-07-19 10:51:49 +08:00
GabyCT
8c662916ab Merge pull request #7377 from dborquez/add_verbosity_to_blogbench
metrics: stop hypervirsor and shim at init_env stage
2023-07-18 15:57:54 -06:00
Fabiano Fidêncio
5f7da301fd Merge pull request #7378 from fidencio/topic/ci-k8s-fix-source-path
ci: k8s: Adapt "source ..." to the new location of gha-run.sh
2023-07-18 22:30:55 +02:00
Fabiano Fidêncio
fad801d0fb ci: k8s: Adapt "source ..." to the new location of gha-run.sh
This is a follow up of 2ee2cd307b, which
changed the location of gha-run.sh

Fixes: #7373

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-18 21:26:41 +02:00
David Esparza
55e2f0955b metrics: stop hypervirsor and shim at init_env stage
This PR kills the hypervisor and the kata shim in the
init_env stage prior to launching any metric test.
Additionally this PR adds info messages in the main blocks
of the blogbench test to help in debugging.

Fixes: #7366

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-18 12:05:29 -06:00
Gabriela Cervantes
556e663fce metrics: Add disk link to general metrics README
This PR adds the disk link information to the general metrics README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-18 16:42:35 +00:00
Gabriela Cervantes
98c1217093 metrics: Add C-Ray README
This PR adds the C-Ray documentation to the README file.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-18 16:35:54 +00:00
Gabriela Cervantes
8e7d9926e4 metrics: Add C-Ray Dockerfile
This PR adds the C-Ray Dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-18 16:33:55 +00:00
Gabriela Cervantes
e2ee769783 metrics: Add C-Ray performance test
This PR adds C-Ray performance test in order to be part of the kata
metrics CI.

Fixes #7375

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-18 16:32:23 +00:00
Fabiano Fidêncio
2011e3d72a Merge pull request #7374 from fidencio/topic/ci-tdx-adjust-kubeconfig-path
ci: Move `tests/integration/gha-run.sh`  to `tests/integration/kuberentes/` ... and also remove KUBECONFIG from the tdx envs
2023-07-18 17:32:57 +02:00
Fabiano Fidêncio
8e09e04f48 Merge pull request #6788 from jepio/kernel-update-6.1-lts
versions: Update kernel to version v6.1.x
2023-07-18 17:29:21 +02:00
Chao Wu
935432c36d Merge pull request #7352 from justxuewei/exec-hang
agent: Fix exec hang issues with a backgroud process
2023-07-18 23:02:18 +08:00
Fabiano Fidêncio
2ee2cd307b ci: k8s: Move gha-run.sh to the kubernetes dir
The file belongs there, as it's only used for k8s related tests.

Fixes: #7373

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-18 15:45:06 +02:00
Fabiano Fidêncio
88eaff5330 ci: tdx: Adjust KUBECONFIG
We don't need to export KUBECONFIG there.  Let's just make sure we have
the server correctly setup and avoid doing that.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-18 15:39:52 +02:00
Jeremi Piotrowski
c09e268a1b versions: Downgrade SEV(-SNP) kernel back to v5.19.x
CC-GPU seems to have issues with v6.1, so downgrade the kernels used for
SEV-SNP to a known-working version. It is worth mentioning that TDX is also
still on 5.19.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-18 15:29:46 +02:00
Fabiano Fidêncio
25d80fcec2 Merge pull request #6993 from zvonkok/kata-agent-init-mount
agent: Ignore already mounted dev/fs/pseudo-fs
2023-07-18 14:11:44 +02:00
Fabiano Fidêncio
4687f2bf9d Merge pull request #7369 from fidencio/topic/gha-ci-bring-tdx-back
ci: k8s: Bring TDX tests back
2023-07-18 13:28:33 +02:00
Fabiano Fidêncio
6a7a323656 versions: Bump virtiofsd to v1.7.0
https://gitlab.com/virtio-fs/virtiofsd/-/releases/v1.7.0 was released
today.

Fixes: #7371

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-18 12:33:13 +02:00
Fabiano Fidêncio
ac5f5353ba ci: k8s: Bring TDX tests back
Now that we have a new TDX machine plugged into our CI, let's re-enable
the TDX tests.

Fixes: #7368

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-18 10:33:43 +02:00
Jeremi Piotrowski
950b89ffac versions: Update kernel to version v6.1.38
Kernel v6.1.38 is the current latest LTS version, switch to it.  No
patches should be necessary. Some CONFIG options have been removed:

- CONFIG_MEMCG_SWAP is covered by CONFIG_SWAP and CONFIG_MEMCG
- CONFIG_ARCH_RANDOM is unconditionally compiled in
- CONFIG_ARM64_CRYPTO is covered by CONFIG_CRYPTO and ARCH=arm64

Fixes: #6086
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-18 10:04:21 +02:00
GabyCT
7729d82e6e Merge pull request #7360 from GabyCT/topic/updategraldoc
metrics: Update machine learning documentation
2023-07-17 15:30:13 -06:00
Fabiano Fidêncio
26d525fcf3 Merge pull request #7361 from fidencio/topic/gha-ci-add-cri-containerd-tests-skeleton-follow-up-2
gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo
2023-07-17 22:38:50 +02:00
GabyCT
b4852c8544 Merge pull request #7335 from kata-containers/topic/addmobilenet
tests: Add MobileNet Tensorflow performance benchmark
2023-07-17 14:36:59 -06:00
Gabriela Cervantes
8ccc1e5c93 metrics: Update machine learning documentation
This PR updates the machine learning documentation related to the
TensorFlow and PyTorch benchmarks.

Fixes #7359

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-17 20:32:49 +00:00
Fabiano Fidêncio
f50d2b0664 gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo
KATA_HYPERVSIOR should be KATA_HYPERVISOR

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-17 21:56:51 +02:00
David Esparza
687596ae41 Merge pull request #7320 from dborquez/fix_jq_checkmetrics_checkvar_expression
metrics: replace backslashes used to escape double quoted jq key expr.
2023-07-17 13:50:18 -06:00
Gabriela Cervantes
620b945975 metrics: Add Tensorflow Mobilenet documentation
This PR adds the TensorFlow MobileNet documentation to the machine
learning README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-17 17:39:05 +00:00
Zhongtao Hu
d50f3888af Merge pull request #7219 from Apokleos/network-refactor
runtime-rs: enhancement of Device Manager for network endpoints.
2023-07-17 14:13:51 +08:00
QuanweiZhou
ce14f26d82 Merge pull request #5450 from openanolis/trace_rs
feat(Tracing): tracing in Rust runtime
2023-07-17 09:27:13 +08:00
Manabu Sugimoto
f1d8de9be6 runk: Allow runk to launch a container without pid namespace
Allow runk to launch a container even when users don't specify the
pid namespace in `config.json`, because general container runtimes
such as runc can also launch a container without that namespace.
Kata Containers, on the other hand, doesn't allow this for security
reasons, so this feature should be enabled only in runk.

Fixes: #7168

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-07-16 23:31:14 +05:30
Zhongtao Hu
419f8a5db7 Merge pull request #7021 from cheriL/7020/ignore-unconfigured-netinterface
runtime-rs: ignore unconfigured network interfaces
2023-07-16 10:11:15 +08:00
Xuewei Niu
6c91af0a26 agent: Fix exec hang issues with a backgroud process
Issue #4747 and pull request #4748 fix exec hang issues where the exec
command hangs when a process's stdout is not closed. However, the PR might
cause the exec command not to work as expected, leading to CI failure. The
PR was reverted in #7042. This PR resolves the exec hang issues and has
undergone 1000 rounds of testing to verify that it would not cause any CI
failures.

Fixes: #4747

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-07-16 08:32:45 +08:00
David Esparza
5a9829996c Merge pull request #7349 from dborquez/fix_extract_kata_env_for_metrics
metrics: Stop running kata-env before kata is properly installed.
2023-07-14 15:20:52 -06:00
David Esparza
59f4731bb2 metrics: Stop running kata-env before kata is properly installed.
This PR makes sure kata-env is called only after some metrics have
completed their workload. This fixes a bug that occurred when
kata-env was called before kata was installed on the
testing platform.

Fixes: #7348

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-14 13:40:48 -06:00
David Esparza
468f017e21 metrics: Replace backslashes used to escape double quoted key in jq expr.
This PR uses square brackets in a jq expression to access
key values corresponding to metric results in json format.

The values are the data inputs into the checkmetrics tool.

Fixes: #7319

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-14 18:41:41 +00:00
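The bracket form described in the commit above can be sketched as follows. This is illustrative Python that only builds a jq filter string; the helper name and the key names are hypothetical, and the actual expression used by the metrics scripts is not shown in this log.

```python
import json


def jq_bracket_path(keys):
    """Build a jq path that accesses each key with square brackets."""
    # json.dumps quotes each key, so keys containing dashes or spaces
    # need no backslash escaping inside the jq expression.
    return "." + "".join(f"[{json.dumps(key)}]" for key in keys)


# Hypothetical key names, for illustration only.
print(jq_bracket_path(["boot-times", "Results"]))
```

The resulting string can be passed to jq inside single quotes, which avoids escaping the double quotes around each key.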
GabyCT
b9535fb187 Merge pull request #7337 from dborquez/fix_remove_old_metrics_config
metrics: use rm -f to remove the oldest continerd config file.
2023-07-14 09:19:41 -06:00
Fabiano Fidêncio
7a854507cc Merge pull request #7333 from zvonkok/main
kernel: Update kernel config name
2023-07-14 13:49:27 +02:00
Fabiano Fidêncio
cfc90fad84 Merge pull request #7344 from fidencio/topic/kata-deploy-add-a-debug-option
kata-deploy: Add a debug option to kata-deploy (and also use it as part of our CI)
2023-07-14 13:16:55 +02:00
Fabiano Fidêncio
64f013f3bf ci: k8s: Enable debug when running the tests
This will help us to gather more information about Kata Containers in
case of failure.

Fixes: #7343

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-14 12:18:11 +02:00
Fabiano Fidêncio
8f4b1df9cf kata-deploy: Give users the ability to run it on DEBUG mode
The DEBUG env var introduced to the kata-deploy / kata-cleanup yaml file
will be responsible for:
* Setting up the CRI Engine to run with the debug log level set to debug
  * The default is usually info
* Setting up Kata Containers to enable:
  * debug logs
  * debug console
  * agent logs

This will greatly help folks trying to debug Kata Containers while using
kata-deploy, and will also let us always run with DEBUG=yes as part of
our CI.

Fixes: #7342

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-14 12:18:08 +02:00
Chao Wu
9b3dc572ae Merge pull request #7018 from nubificus/feat_bindmount_propagation
runtime-rs: add parameter for propagation of (u)mount events
2023-07-14 15:21:41 +08:00
Zvonko Kaiser
2c8dfde168 kernel: Update kernel config name
Fixes: #7294

When installing the kernel config adjust the name like
the vmlinuz and vmlinux files so that any added suffixes
are also reflected in the kernel config name.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-14 06:50:35 +00:00
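A minimal sketch of the naming rule described in the commit above, assuming the installed kernel image is named with a `vmlinuz` prefix followed by version and suffixes. The helper name and the example file names are hypothetical; the real logic lives in the kernel build scripts and is not shown in this log.

```python
def kernel_config_name(vmlinuz_name):
    """Derive a config file name carrying the same suffixes as the kernel image."""
    prefix = "vmlinuz"
    if not vmlinuz_name.startswith(prefix):
        raise ValueError(f"unexpected kernel image name: {vmlinuz_name}")
    # Keep everything after the prefix (version plus any added suffixes).
    return "config" + vmlinuz_name[len(prefix):]


# Hypothetical file name, for illustration only.
print(kernel_config_name("vmlinuz-6.1.38-nvidia-gpu"))
```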
Archana Shinde
b9b8ccca0c Merge pull request #7236 from amshinde/move-guestprotection
kata-ctl: Move GuestProtection code to kata-sys-util
2023-07-13 23:50:17 -07:00
soup
150e54d02b runtime-rs: ignore unconfigured network interfaces
Fixes: #7020

Signed-off-by: soup <lqh348659137@outlook.com>
2023-07-14 14:16:03 +08:00
David Esparza
3ae02f9202 metrics: use rm -f to remove older continerd config file.
In order to run kata metrics we need to check that the containerd
config file is properly set. When this is not the case, we
need to remove that file, and generate a valid one.

This PR runs rm -f in order to ignore errors in case the
file to delete does not exist.

Fixes: #7336

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-13 16:20:03 -06:00
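The `rm -f` behaviour relied on in the commit above (ignore a file that is already gone) has a direct equivalent in Python. This is only an illustration of the idea; the helper name is hypothetical and the metrics scripts themselves are shell.

```python
from pathlib import Path


def remove_if_present(path):
    """Delete a file, silently succeeding when it does not exist (like `rm -f`)."""
    # missing_ok requires Python 3.8 or newer.
    Path(path).unlink(missing_ok=True)
```

Calling the helper twice on the same path is safe, which is the property the commit needs when the containerd config file may or may not exist.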
David Esparza
22d4e4c5a6 Merge pull request #7328 from GabyCT/topic/updatecommon
tests: Add function before function name in common.bash for metrics
2023-07-13 16:11:30 -06:00
Gabriela Cervantes
a864d0e349 tests: Add tensorflow mobilenet dockerfile
This PR adds the tensorflow mobilenet dockerfile.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-13 21:24:40 +00:00
Gabriela Cervantes
788d2a254e tests: Add tensorflow mobilenet performance test
This PR adds tensorflow mobilenet performance test for
kata metrics.

Fixes #7334

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-13 21:18:25 +00:00
David Esparza
e8917d7321 Merge pull request #7330 from GabyCT/topic/storagedoc
tests: Add metrics storage documentation
2023-07-13 15:10:53 -06:00
GabyCT
8db43eae44 Merge pull request #7318 from dborquez/fix_timestamp_generator_on_metrics
metrics: Fix metrics ts generator to treat numbers as decimals
2023-07-13 11:21:09 -06:00
Gabriela Cervantes
3fed61e7a4 tests: Add storage link to general metrics documentation
This PR adds storage link to general metrics README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-13 16:03:49 +00:00
Gabriela Cervantes
b34dda4ca6 tests: Add storage blogbench metrics documentation
This PR adds the storage metrics documentation for blogbench for kata
metrics.

Fixes #7329

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-13 16:00:14 +00:00
Anastassios Nanos
6787c63900 runtime-rs: add parameter for propagation of (u)mount events
Add an extra parameter in `bind_mount_unchecked` to specify
the propagation type: "shared" or "slave".

Fixes: #7017

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2023-07-13 15:58:22 +00:00
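As a rough illustration of the two propagation types named in the commit above, the sketch below maps them to the Linux mount(2) propagation flags. The real change is in Rust, in `bind_mount_unchecked`; the Python helper name here is hypothetical, and how the runtime actually applies the flag is an assumption.

```python
# Propagation flag values from the Linux mount(2) API.
MS_SLAVE = 1 << 19
MS_SHARED = 1 << 20


def propagation_flag(propagation):
    """Translate a propagation type name into the matching mount flag."""
    flags = {"shared": MS_SHARED, "slave": MS_SLAVE}
    try:
        return flags[propagation]
    except KeyError:
        raise ValueError(f"unsupported propagation type: {propagation}") from None
```

Rejecting unknown names keeps a typo in the parameter from silently falling back to a default propagation type.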
Gabriela Cervantes
6e5679bc46 tests: Add function before function name in common.bash for metrics
This PR adds the `function` keyword before each function name in the
common.bash script, for uniformity across the whole script.

Fixes #7327

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-13 15:48:47 +00:00
Archana Shinde
62080f83cb kata-sys-util: Fix compilation errors
Fix compilation errors for aarch64 and s390x

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:09:43 +05:30
Archana Shinde
02d99caf6d static-checks: Make cargo clippy pass.
Get rid of cargo clippy warnings.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Archana Shinde
9824206820 agent: Make the static checks pass for agent
The static checks for the agent require Cargo.lock to be updated.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Archana Shinde
61e4032b08 kata-ctl: Remove all utility functions to get platform protection
Since these have been added to kata-sys-util, remove these from
kata-ctl. Change all invocations to get platform protection to make use
of kata-sys-util.

Fixes: #7144

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Archana Shinde
a24dbdc781 kata-sys-util: Move utilities to get platform protection
Add utilities to get platform protection to kata-sys-util

Fixes: #7144

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Archana Shinde
dacdf7c282 kata-ctl: Remove cpu related functions from kata-ctl
Remove cpu related functions which have been moved to kata-sys-util.
Change invocations in kata-ctl to make use of functions now moved to
kata-sys-util.

Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Archana Shinde
f5d1957174 kata-sys-util: Move additional functionality to cpu.rs
Make certain imports architecture specific as these are not used on all
architectures.
Move additional constants and functionality to cpu.rs.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Nathan Whyte
304b9d9146 kata-sys-util: Move CPU info functions
Move get_single_cpu_info and get_cpu_flags into kata-sys-util.
Add new functions that get a list of flags and check if a flag
exists in that list.

Fixes #6383

Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
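The two helpers described in the commit above (get a list of CPU flags, then check whether a flag is in that list) can be sketched as follows. The real functions are Rust code in kata-sys-util; the Python names, the `tag` parameter, and the sample text are illustrative assumptions only.

```python
def get_cpu_flags(cpuinfo_text, tag="flags"):
    """Return the flag list from the first matching line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == tag:
            return value.split()
    return []


def contains_cpu_flag(flags, flag):
    """Check whether a single flag is present in a flag list."""
    return flag in flags


# Shortened, made-up sample in the /proc/cpuinfo layout.
sample = "processor\t: 0\nflags\t\t: fpu vme sse4_2 vmx\n"
print(contains_cpu_flag(get_cpu_flags(sample), "vmx"))
```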
Fabiano Fidêncio
eed3c7c046 Merge pull request #7322 from fidencio/topic/gha-ci-add-cri-containerd-tests-skeleton-follow-up
gha: ci: Add cri-containerd tests skeleton -- follow up 1
2023-07-13 13:53:48 +02:00
Fabiano Fidêncio
7319cff77a ci: cri-containerd: Add LTS / Active versions for containerd
As we'll be testing against the LTS and the Active versions of
containerd, let's add those entries to the versions.yaml file and make
sure we export what we want to use for the tests as an env var.

The approach taken should not break the current way of getting the
containerd version.

LTS and Active versions of containerd can be found at:
https://containerd.io/releases/#support-horizon

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-13 12:05:47 +02:00
Fabiano Fidêncio
2a957d41c8 ci: cri-containerd: Export GOPATH
Let's make sure this is exported, as it'll be needed in order to install
`yq`, which will be used to get the versions of the dependencies to be
installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-13 12:05:47 +02:00
Fabiano Fidêncio
75a294b74b ci: cri-containerd: Ensure deps are installed
Let's make sure we install the needed dependencies for running the
`cri-containerd` tests.

Right now this commit is basically adding a placeholder, and later on,
when we'll actually be able to test the job, we'll add the logic of
installing the needed dependencies.

The obvious dependencies we've spotted so far are:
* From the OS
  * jq
  * curl (already present)
* From our repo
  * yq (using the install_yq script)
* From GitHub
  * cri-containerd
  * cri-tools
  * cni plugins

We may need a few more packages, but we will only figure this out as
part of the actual work.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-13 12:04:22 +02:00
Zhongtao Hu
b69cdb5c21 Merge pull request #7286 from xuejun-xj/xuejun/up-fix
dragonball/agent: Add some optimization for Makefile and bugfixes of unit tests on aarch64
2023-07-13 09:39:23 +08:00
GabyCT
ee17097e88 Merge pull request #7282 from GabyCT/topic/enableblogbench
metrics: Enable blogbench test
2023-07-12 16:35:52 -06:00
David Esparza
f63673838b Merge pull request #7315 from GabyCT/topic/machinelearning
tests: Add machine learning performance tests
2023-07-12 15:57:11 -06:00
David Esparza
6924d14df5 metrics: Fix metrics ts generator to treat numbers as decimals
Use bc tool to perform math operations even when variables contain
values with leading zero.

Fixes: #7317

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-12 20:57:33 +00:00
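The leading-zero problem behind the commit above comes from shell arithmetic reading a number such as 08 as octal, while bc reads it as decimal. The sketch below imitates both interpretations in Python; it is a simplified model (hexadecimal and other bases are ignored) and the function names are hypothetical.

```python
def parse_like_shell_arithmetic(text):
    """Mimic shell arithmetic: a leading 0 selects base 8."""
    if len(text) > 1 and text.startswith("0") and text.isdigit():
        # Raises ValueError for digits 8 and 9, like the shell's
        # "value too great for base" error.
        return int(text, 8)
    return int(text)


def parse_as_decimal(text):
    """Mimic bc: leading zeros are ignored and the value is decimal."""
    return int(text, 10)
```

With these definitions, "010" is 8 under the shell-like rule but 10 as a decimal, and "08" is rejected by the shell-like rule while parsing cleanly as the decimal 8.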
Gabriela Cervantes
9e048c8ee0 checkmetrics: Add blogbench read value for qemu
This PR adds the blogbench read value for qemu.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:38:27 +00:00
Gabriela Cervantes
2935aeb7d7 checkmetrics: Add blogbench write value for qemu
This PR adds the blogbench write value for qemu limit.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:37:27 +00:00
Gabriela Cervantes
02031e29aa checkmetrics: Add blogbench read value for clh
This PR adds the blogbench read value for clh limit.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:37:27 +00:00
Gabriela Cervantes
107fae033b checkmetrics: Add blogbench write value for clh
This PR adds the blogbench write value limit for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:37:27 +00:00
Gabriela Cervantes
8c75c2f4bd metrics: Update blogbench Dockerfile
This PR updates the blogbench Dockerfile to run in non-interactive mode.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:37:27 +00:00
Gabriela Cervantes
49723a9ecf metrics: Add double quotes to variables
This PR adds double quotes to variables in the blogbench script to
have uniformity across all the tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:37:27 +00:00
Gabriela Cervantes
dc67d902eb metrics: Enable blogbench test
This PR enables the blogbench performance test for the kata metrics CI.

Fixes #7281

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 20:37:24 +00:00
Fabiano Fidêncio
3f38f75918 Merge pull request #7314 from fidencio/topic/gha-ci-add-cri-containerd-tests-skeleton
tests: gha: ci: Add cri-containerd tests skeleton
2023-07-12 22:21:47 +02:00
Fabiano Fidêncio
438fe3b829 gha: ci: Add cri-containerd tests skeleton
This PR builds the foundation for us to start migrating the
cri-containerd tests from Jenkins to GitHub Actions.

Right now the test does nothing and should always finish successfully.
The coming PRs will actually introduce logic to the `gha-run.sh` script
where we'll be able to run the tests and make sure those pass before
having them actually merged.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-12 20:57:39 +02:00
Fabiano Fidêncio
bd08d745f4 tests: metrics: Move metrics specific function to metrics gha-run.sh
`compress_metrics_results_dir()` is only used by the metrics GHA.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-12 20:56:55 +02:00
Fabiano Fidêncio
3ffd48bc16 tests: common: Move a few utility functions to common.bash
Those functions were originally introduced as part of the
`metrics/gha-run.sh` file, but they will be very handy when we
start adding more tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-12 20:55:05 +02:00
Gabriela Cervantes
7f961461bd tests: Add machine learning README
This PR adds machine learning README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 16:37:15 +00:00
Fabiano Fidêncio
bb2ef4ca34 tests: Add function before each function
Let's just keep this standardised.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-12 18:36:09 +02:00
Gabriela Cervantes
063f7aa7cb tests: Add Pytorch Dockerfile
This PR adds Pytorch Dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 16:34:17 +00:00
Fabiano Fidêncio
b6282f7053 Merge pull request #7255 from GabyCT/topic/memoryinsideenabled
metrics: Enable memory inside container metrics
2023-07-12 18:33:36 +02:00
Gabriela Cervantes
1af03b9b32 tests: Add Pytorch performance test
This PR adds Pytorch performance test for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 16:33:02 +00:00
Gabriela Cervantes
4cecd62370 tests: Add tensorflow Dockerfile
This PR adds the tensorflow Dockerfile.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 16:31:32 +00:00
Gabriela Cervantes
c4094f62c9 tests: Add metrics machine learning performance tests
This PR adds metrics machine learning performance tests like
Tensorflow and Pytorch.

Fixes #7313

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-12 16:28:25 +00:00
Jeremi Piotrowski
b9a63d66a4 Merge pull request #7297 from jepio/fix-mariner-cache
tools: Use a consistent target name when building mariner initrd
2023-07-12 13:43:47 +02:00
Fabiano Fidêncio
1ab99bd6bb Merge pull request #7276 from fidencio/topic/gha-debug-gha-tests-start
gha: ci: Gather info about the node / pods
2023-07-12 12:35:10 +02:00
Chao Wu
f6a51a8a78 Merge pull request #7306 from justxuewei/none-network-model
runtime-rs: Do not scan network if network model is "none"
2023-07-12 14:53:52 +08:00
Zvonko Kaiser
4e352a73ee Merge pull request #7308 from fidencio/topic/gha-temporarily-disable-tdx-runs
gha: k8s: tdx: Temporarily disable TDX tests
2023-07-12 08:39:02 +02:00
Fabiano Fidêncio
89b622dcb8 gha: k8s: tdx: Temporarily disable TDX tests
TDX tests need to be temporarily disabled, as the machine currently
allocated for them will be off for some time, and a new machine will
only be added next week.

Fixes: #7307

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-12 08:26:10 +02:00
Fabiano Fidêncio
8c9d08e872 gha: ci: Gather info about the node / pods
This is a very simple addition, which should be expanded by
https://github.com/kata-containers/kata-containers/pull/7185, and it
aims to gather more info that will help us debug CI failures.

Fixes: #7296

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-12 08:04:37 +02:00
alex.lyn
283f809dda runtime-rs: Enhancing Device Manager for network endpoints.
Currently, network endpoints are separate from the device manager
and need to be included for proper management. In order to do so,
we need to refactor the implementation of the network endpoints.

The first step is to restructure the NetworkConfig and NetworkDevice
structures.
Next, we will implement the virtio-net driver and add the Network
device to the Device Manager.
Finally, we'll unify entries with do_handle_device for each endpoint.

Fixes: #7215

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-07-12 11:27:12 +08:00
xuejun-xj
a65291ad72 agent: rustjail: update test_mknod_dev
When running cargo test in container, test_mknod_dev may fail sometimes
because of "Operation not permitted". Change the device path to
"/dev/fifo-test" to avoid this case.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
46b81dd7d2 agent: clippy: fix cargo clippy warnings
Replace "if let Ok(_) = ..." with ".is_ok()" method.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
c4771d9e89 agent: Makefile: enable set SECCOMP dynamically
Change ":=" to "?=".

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
a88212e2c5 utils.mk: update BUILD_TYPE argument
Allow the BUILD_TYPE argument to be set dynamically.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
883b4db380 dragonball: fix cargo test on aarch64
1. Update memory end assert because address space layout differs between
x86 and arm.
2. Set guest_addr for aarch64 in test_handler_insert_region case.

Fixes: #7284
TODO: #7290

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:31 +08:00
Xuewei Niu
6822029c81 runtime-rs: Do not scan network if network model is "none"
Skip scanning the network from the netns if the network model is set to
"none".

Fixes: #7305

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-07-12 10:00:50 +08:00
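The early exit described in the commit above can be sketched as follows. The real change is in the Rust runtime; the function name and the `scan_netns` callback are hypothetical stand-ins for whatever performs the netns scan.

```python
def endpoints_to_configure(network_model, scan_netns):
    """Return the endpoints to set up, skipping the scan for the "none" model."""
    if network_model == "none":
        # Nothing to discover: do not touch the network namespace at all.
        return []
    return scan_netns()
```

Passing the scan as a callable makes it easy to verify that it is never invoked when the model is "none".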
Fabiano Fidêncio
ae55893deb Merge pull request #7303 from GabyCT/topic/cleanupmemoryusage
metrics: Update memory usage script
2023-07-11 23:52:05 +02:00
Gabriela Cervantes
ce54e43ebe metrics: Update memory usage script
This PR updates the memory usage script to call clean_env_ctr in main,
in order to avoid failures caused by processes that were left behind.

Fixes #7302

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-11 17:03:25 +00:00
Fabiano Fidêncio
ceb5c69ee8 Merge pull request #7299 from fidencio/topic/gha-stop-previous-workflows-if-a-pr-is-updated
gha: Cancel previous jobs if a PR is updated
2023-07-11 16:22:47 +02:00
Fabiano Fidêncio
fbc2a91ab5 gha: Cancel previous jobs if a PR is updated
Let's make sure we cancel previous runs, mainly as we have some of those
that take a lot of time to run, whenever the PR is updated.

This is based on the following stack overflow suggestion:
https://stackoverflow.com/questions/66335225/how-to-cancel-previous-runs-in-the-pr-when-you-push-new-commitsupdate-the-curre

This is very much needed, as we don't want to wait a long time for
access to a runner because other runners are still busy performing
a task that became meaningless due to the PR update.

Fixes: #7298

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-11 14:37:10 +02:00
Jeremi Piotrowski
307cfc8f7a tools: Use a consistent target name when building mariner initrd
Currently a mixture of cbl-mariner and mariner is used when creating the
mariner initrd. The kata-static tarball has mariner in the name, but the
jenkins url uses cbl-mariner. This breaks cache usage.

Use mariner as the target name throughout the build, so that caching works.

Fixes: #7292
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-11 14:17:14 +02:00
Fabiano Fidêncio
aa484dc0e3 Merge pull request #7288 from fidencio/topic/add-nightly-jobs-follow-up-7
gha: nightly: Fix long name of AKS clusters issue and make the CI easier to test
2023-07-11 11:16:09 +02:00
Fabiano Fidêncio
d780cc08f4 gha: nightly: Also use workflow_dispatch to trigger it
This is a very nice suggestion from Steve Horsman, as with that we can
manually trigger the workflow anytime we need to test it, instead of
waiting for a full day for it to be retriggered via the `schedule`
event.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-11 10:42:40 +02:00
Fabiano Fidêncio
b99ff30267 gha: nightly: Fix name size limit for AKS
Passing the commit hash as the "pr-number" has proven problematic, as it
would make the AKS cluster name longer than what's accepted by AKS.

One easy way to solve this is just passing "nightly" as the PR number,
as that's only used to create the cluster.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-11 09:59:13 +02:00
xuejun-xj
aedc586e14 dragonball: Makefile: add coverage target
Add "coverage" target to compute code coverage for dragonball.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-11 14:36:25 +08:00
Fabiano Fidêncio
52100bb3dd Merge pull request #7280 from fidencio/topic/gha-add-badge-for-our-tests
README: Add badge for our Nightly CI
2023-07-10 19:35:33 +02:00
Gabriela Cervantes
310e069f73 checkmetrics: Enable checkmetrics for memory inside test
This PR enables the checkmetrics to include the memory inside
container test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-10 17:05:13 +00:00
Fabiano Fidêncio
b61b15aab6 Merge pull request #7259 from fidencio/topic/gha-restrict-job-run-according-to-files-touched
gha: Do not run all the tests if only docs are updated
2023-07-10 18:12:29 +02:00
Fabiano Fidêncio
1363fbbf12 README: Add badge for our Nightly CI
This will help folks to monitor the history of the failing tests, as
we've done in Jenkins with the "Green Effort CI".

Fixes: #7279

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-10 17:31:51 +02:00
Fabiano Fidêncio
9dc63fe338 Merge pull request #7273 from openanolis/runtime-rs-fix-mem-ci
bugfix: plus default_memory when calculating mem size
2023-07-10 15:12:05 +02:00
Zvonko Kaiser
fab2e6a93f Merge pull request #7277 from fidencio/topic/add-nightly-jobs-follow-up-6
gha: ci: Use github.sha to get the last commit reference
2023-07-10 13:36:31 +02:00
Fabiano Fidêncio
1776b18fa0 gha: Do not run all the tests if only docs are updated
We should not go through the trouble of running all our tests on AKS /
Azure / baremetal machines in case a PR only changes our documentation.

Fixes: #7258

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-10 10:30:46 +02:00
Yushuo
28c29b248d bugfix: plus default_memory when calculating mem size
We've noticed this caused regressions with the k8s-oom tests, and then
decided to take a step back and do this in the same way it was done
before 67972ec48a.

Moreover, this step back is also more reasonable in terms of the
controlling logic.

And by doing this we can re-enable the k8s-oom.bats tests, which is done
as part of this PR.

Fixes: #7271
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-10 15:53:04 +08:00
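One way to read the title of the commit above is that the sandbox memory size is the configured default plus whatever the containers request. The sketch below shows only that arithmetic; the function name, the units, and the idea that container limits are what gets summed are assumptions, not details taken from the runtime-rs code.

```python
def sandbox_memory_mb(default_memory_mb, container_limits_mb):
    """Add the configured default memory on top of the summed container limits."""
    # Assumption: all values are in MiB and limits are simply summed.
    return default_memory_mb + sum(container_limits_mb)
```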
Fabiano Fidêncio
0c1cbd01d8 gha: ci: after-push: Use github.sha to get the last commit reference
As we need to pass down the commit sha to the jobs that will be
triggered from the `push` event, we must be careful on what exactly
we're using there.

At first we were using ${{ github.ref }}, but this turns out to be the
**branch name**, rather than the commit hash.  In order to actually get
the commit hash, let's use ${{ github.sha }} instead.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-10 09:39:33 +02:00
Fabiano Fidêncio
37a9556789 gha: ci: nightly: Use github.sha to get the last commit reference
As we need to pass down the commit sha to the jobs that will be
triggered from the `schedule` event, we must be careful on what exactly
we're using there.

At first we were using ${{ github.ref }}, but this turns out to be the
**branch name**, rather than the commit hash.  In order to actually get
the commit hash, let's use ${{ github.sha }} instead, as described by
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-10 09:39:26 +02:00
Fabiano Fidêncio
afbc1f94d7 Merge pull request #7272 from fidencio/topic/dragonball-k8s-number-cpus-fix
dragonball: Don't fail if a request asks for more CPUs than allowed
2023-07-10 08:25:06 +02:00
Ji-Xinyou
ed23b47c71 tracing: Add tracing to runtime-rs
Introduce tracing into runtime-rs, only some functions are instrumented.

Fixes: #5239

Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-09 22:09:43 +08:00
Fabiano Fidêncio
96e9374d4b dragonball: Don't fail if a request asks for more CPUs than allowed
Let's take the same approach as the go runtime instead, and allocate
the maximum allowed number of vcpus.

Fixes: #7270

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 15:50:23 +02:00
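The behaviour described in the commit above amounts to clamping the request instead of rejecting it. A minimal sketch, with a hypothetical function name and parameters (the real code is Rust, in the dragonball integration):

```python
def vcpus_to_allocate(requested, max_vcpus):
    """Cap a vCPU request at the maximum allowed instead of failing."""
    return min(requested, max_vcpus)
```

A request within the limit is returned unchanged, while an oversized request is reduced to the limit.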
Fabiano Fidêncio
38f0aaa516 Revert "gha: k8s: dragonball: Skip k8s-number-cpus"
This reverts commit a79505b667.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 14:43:49 +02:00
Fabiano Fidêncio
828a721838 gha: k8s: dragonball: Skip k8s-oom
Let's skip the k8s-oom, as the test is currently failing.

We have an issue open for that, and we'll be working on re-enabling it
as soon as possible.

Reference:
https://github.com/kata-containers/kata-containers/issues/7271

Fixes: #7253

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 14:27:49 +02:00
Fabiano Fidêncio
a79505b667 gha: k8s: dragonball: Skip k8s-number-cpus
Let's skip the k8s-number-cpus, as the test is currently failing.

We have an issue open for that, and we'll be working on re-enabling it
as soon as possible.

Reference:
https://github.com/kata-containers/kata-containers/issues/7270

Fixes: #7253

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 14:27:42 +02:00
Fabiano Fidêncio
275c84e7b5 Revert "agent: fix the issue of exec hang with a backgroud process"
This reverts commit 25d2fb0fde.

We're reverting the commit to check whether it's the cause of the
regression in the devmapper tests.

Fixes: #7253
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 14:27:40 +02:00
Gabriela Cervantes
2be342023b checkmetrics: Add memory usage inside container value for qemu
This PR adds the memory usage inside container value for qemu.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-07 16:28:28 +00:00
Gabriela Cervantes
6ca34f949e checkmetrics: Add memory inside container value for clh
Add memory inside container value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-07 16:28:28 +00:00
Gabriela Cervantes
6c68924230 metrics: Enable memory inside container metrics
This PR will enable the memory inside container metrics for the Kata CI.

Fixes #7254

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-07 16:28:28 +00:00
Fabiano Fidêncio
b7c58320a5 Merge pull request #7267 from fidencio/topic/add-nightly-jobs-follow-up-5
gha: ci: Fix refernce passed to checkout@v3
2023-07-07 18:26:44 +02:00
Fabiano Fidêncio
0ad298895e gha: ci: Fix refernce passed to checkout@v3
On cc3993d860 we introduced a regression,
where we started passing inputs.commit-hash, instead of
github.event.pull_request.head.sha. However, we have been setting
commit-hash to github.event.pull_request.sha, meaning that we're missing
a `.head.` there.

github.event.pull_request.sha is empty for the pull_request_target
event, leading the CI to pull the content from `main` instead of the
content from the PR.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-07 17:55:11 +02:00
Fabiano Fidêncio
48d9f8769e Merge pull request #7264 from fidencio/topic/add-nightly-jobs-follow-up-4
gha: ci: Avoid using env also in the ci-nightly and payload-after-push
2023-07-07 17:10:43 +02:00
Fabiano Fidêncio
86904909aa gha: ci: Avoid using env also in the ci-nightly and payload-after-push
The latter workflow is breaking as it doesn't recognise ${GITHUB_REF},
the former would most likely break as well, but it didn't get triggered
yet.

The error we're facing is:
```
Determining the checkout info
  /usr/bin/git branch --list --remote origin/${GITHUB_REF}
  /usr/bin/git tag --list ${GITHUB_REF}
  Error: A branch or tag with the name '${GITHUB_REF}' could not be found
```

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-07 14:46:30 +02:00
Fabiano Fidêncio
48c3cec1f4 Merge pull request #7243 from sprt/ensure-cluster-no-exist
gha: k8s: Ensure cluster doesn't exist before creating it
2023-07-07 14:03:41 +02:00
Fabiano Fidêncio
3e2b723487 Merge pull request #7263 from fidencio/topic/add-nightly-jobs-follow-up-3
gha: ci: More follow up fixes after adding a nightly CI
2023-07-07 13:58:26 +02:00
Fabiano Fidêncio
18bd2d6e4a Merge pull request #6839 from sprt/sprt/mariner-ci-tests
tests: Enable running k8s tests on Mariner
2023-07-07 13:36:28 +02:00
Zvonko Kaiser
f72cb2fc12 agent: Remove shadowed function, add slog-term
Remove shadowed get_mounts(), added slog-term as a new crate,
slog can directly log to stdout and we can capture output
in the test-cases that are created in the function to be tested.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-07 11:28:14 +00:00
Fabiano Fidêncio
1d05b9cc71 gha: ci: Pass down secrets to ci-on-push / ci-nightly
We have to do this, otherwise we cannot log into azure.

This is a regression introduced by
106e305717.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-07 12:00:33 +02:00
Fabiano Fidêncio
c5b4164cb1 gha: ci: Fix tarball-suffix passed to the metrics tests
Instead of passing "-${{ inputs.tag }}-amd64", we must only pass
"-${{ inputs.tag }}".

This is a regression introduced by
106e305717.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-07 12:00:24 +02:00
Fabiano Fidêncio
fa0f9954a1 Merge pull request #7261 from fidencio/topic/add-nightly-jobs-follow-up-2
gha: ci: Avoid using env unless it's really needed
2023-07-07 10:13:25 +02:00
Zvonko Kaiser
07810bf71f agent: Ignore already mounted dev/fs/pseudo-fs
When using an initrd and setting KATA_INIT=yes, meaning we're using the
kata-agent as the init process, we need to make sure that the agent does not
segfault if the mounts have already happened. Some workloads need to configure
several things in the initrd before the kata-agent starts, which involves
having /proc or /sys already mounted.

Fixes: #6992

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-07 07:36:04 +00:00
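The idea of skipping mounts that already exist can be sketched roughly as below. This is only an illustration, not the agent's actual code: `is_mounted` and `mounts_to_perform` are hypothetical helper names, and the mount table is assumed to be text in the `/proc/mounts` layout, where the second whitespace-separated field is the mount point.

```rust
/// Hypothetical helper: returns true if `target` is listed as a mount point
/// in a table that uses the /proc/mounts layout (device, mount point, ...).
pub fn is_mounted(mount_table: &str, target: &str) -> bool {
    mount_table
        .lines()
        // The second field of each line is the mount point.
        .filter_map(|line| line.split_whitespace().nth(1))
        .any(|mount_point| mount_point == target)
}

/// Hypothetical helper: keeps only the wanted mount points that are not
/// already present, so an init process can skip what the initrd mounted.
pub fn mounts_to_perform<'a>(mount_table: &str, wanted: &[&'a str]) -> Vec<&'a str> {
    wanted
        .iter()
        .copied()
        .filter(|target| !is_mounted(mount_table, target))
        .collect()
}
```

With a table that already lists /proc and /sys, asking for /proc, /sys and /dev would leave only /dev to be mounted.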
Fabiano Fidêncio
11e3ccfa4d gha: ci: Avoid using env unless it's really needed
de83cd9de7 tried to solve an issue, but it
clearly seems that I'm using env wrongly, as what ended up being passed
as input was "$VAR", instead of the content of the VAR variable.

As we can simply avoid using those here, let's do it and save us a
headache.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-07 07:31:10 +02:00
Aurélien Bombo
c45f646b9d gha: k8s: Ensure cluster doesn't exist before creating it
The cluster cleanup step will sometimes fail to run, meaning the next
run would fail in the cluster creation step. This PR addresses that.

Example: https://github.com/kata-containers/kata-containers/actions/runs/5349582743/jobs/9867845852

Fixes: #7242

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-07-06 15:06:30 -07:00
GabyCT
58e921eace Merge pull request #7260 from fidencio/topic/add-nightly-jobs-follow-up-1
gha: ci: Follow up fixes for the nightly jobs
2023-07-06 15:45:13 -06:00
GabyCT
54da0d7c91 Merge pull request #7230 from GabyCT/topic/enabmemory
tests: Enable memory usage metrics tests
2023-07-06 14:30:56 -06:00
Fabiano Fidêncio
1a7bbcd398 gha: ci: Fix typo pull_requesst -> pull_request
Thanks David Esparza for pointing this one out.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 22:29:00 +02:00
Fabiano Fidêncio
ddf4afb961 gha: ci: Fix set-fake-pr-number job
It has to have steps declared, and we need to make it a dependency for
the nightly kata-containers-ci-on-push job.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 22:02:08 +02:00
Fabiano Fidêncio
8a0a66655d gha: ci: schedule expects a list, not a map
And because of that we need to declare '- cron', instead of 'cron'.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 22:02:08 +02:00
Fabiano Fidêncio
5c0269dc5a gha: ci: Add pr-number input to the correct job
It must have been an input for the AKS jobs, not the SNP one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 22:02:08 +02:00
Fabiano Fidêncio
de83cd9de7 gha: ci: Use $VAR instead of ${{ env.VAR }}
Otherwise we'll get the following error from the workflow:
```
The workflow is not valid. .github/workflows/ci-on-push.yaml (Line: 24,
Col: 20): Unrecognized named-value: 'env'. Located at position 1 within
expression: env.COMMIT_HASH .github/workflows/ci-on-push.yaml (Line: 25,
Col: 18): Unrecognized named-value: 'env'. Located at position 1 within
expression: env.PR_NUMBER
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 22:02:08 +02:00
Wainer Moschetta
1a4ae1ef47 Merge pull request #6953 from fidencio/topic/add-nightly-jobs
gha: Add nightly jobs
2023-07-06 14:50:10 -03:00
Gabriela Cervantes
6acce83e12 metrics: Fix the call to check_metrics function
This PR fixes the call to the check_metrics function, as KATA_HYPERVISOR
does not need to be passed.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-06 17:22:49 +00:00
David Esparza
0bd21c173a Merge pull request #7240 from dborquez/storing_metrics_artifacts
metrics: storing metrics workflow artifacts
2023-07-06 09:49:45 -06:00
Fabiano Fidêncio
152e2509ca Merge pull request #7238 from fidencio/topic/gha-run-tests-on-specific-namespace
gha: k8s: Ensure tests are running on a specific namespace
2023-07-06 17:25:00 +02:00
Fabiano Fidêncio
e067d18333 gha: Add a nightly CI job
The idea is to mimic what's been done with Jenkins and the "Green CI"
effort, but now using our GHA and the GHA infrastructure.

Fixes: #7247

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 14:39:49 +02:00
Fabiano Fidêncio
7c0de8703c gha: k8s: Ensure tests are running on a specific namespace
Let's make sure we run our tests in a specific namespace, as in case of
any kind of issue, we will just get rid of the namespace itself, which
will take care of cleaning up any leftover from failing tests.

One important thing to mention is why we can get rid of the `namespace:
${namespace}` on the tests that are already using it, and let's do it in
parts:
* namespace: default
  We can easily get rid of this as that's the default namespace where
  pods are created, so it was a no-op so far.
* namespace: test-quota-ns
  My understanding is that we'd need this in order to get a clean
  namespace where we'd be setting a quota for.  Doing this in the
  namespace that's only used for tests should **not** cause any
  side-effect on the tests, as we're running those in serial and there's
  no other pods running on the `kata-containers-k8s-tests` namespace

Last but not least, we're not dynamically creating namespaces as the
tests are **never** running in parallel, neither in the case of having
2 tests being run at the same time, nor in the case of having 2 jobs
being scheduled to the same machine.

Fixes: #6864

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 14:14:50 +02:00
Fabiano Fidêncio
106e305717 gha: Create a re-usable ci.yaml file
This is based on the `ci-on-push.yaml` file, and it's called from there.
The reason to split on a new file is that we can easily introduce a
`ci-nightly.yaml` file and re-use the `ci.yaml` file there as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 13:07:59 +02:00
Fabiano Fidêncio
cc3993d860 gha: Pass event specific info from the caller workflow
Let's ensure we're not relying, on any of the called workflows, on event
specific information.

Right now, the two pieces of information we've been relying on are:
* PR number, coming from github.event.pull_request.number
* Commit hash, coming from github.event.pull_request.head.sha

As we want to, in the future, add nightly jobs, which will be triggered
by a different event (thus, having different fields populated), we
should ensure that those are not used unless it's in the "top action"
that's triggered by the event.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-06 11:23:17 +02:00
David Esparza
4e396e7285 metrics: Add function keyword to helper metrics functions
Use the 'function' keyword to prevent bash aliases from colliding
with other function's name.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-05 20:59:21 -06:00
David Esparza
1ca17c2f70 metrics: storing metrics workflow artifacts
This PR enables storing metrics workflow artifacts in two
separate flavours: clh and qemu.

Fixes: #7239

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-05 20:57:10 -06:00
David Esparza
a3fc673121 Merge pull request #7181 from dborquez/add_blogbench_and_webtooling
metrics: Adds blogbench and webtool metrics tests
2023-07-05 20:37:33 -06:00
Gabriela Cervantes
5a61065ab7 checkmetrics: Add checkmetrics value for memory usage in qemu
This PR adds the checkmetrics value for memory usage in qemu.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-05 19:22:12 +00:00
Gabriela Cervantes
78086ed1fe checkmetrics: Add memory usage value for clh
This PR adds the memory usage value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-05 19:19:04 +00:00
Gabriela Cervantes
1c3dbafbf0 metrics: Fix function of how to retrieve multiple values
This PR fixes the function of how to add multiple values of pss memory.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-05 18:19:36 +00:00
Gabriela Cervantes
18968f428f metrics: Add function to have uniformity
This PR adds the function keyword before the function names to have
uniformity across all the tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-05 18:15:31 +00:00
David Esparza
35d096b607 metrics: Adds blogbench and webtool metrics tests
This PR adds blogbench and webtooling metrics checks to this repo.
The function running the test intentionally returns zero, so
the test will be enabled in another PR once the workflow is
green.

Fixes: #7069

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-07-04 14:38:52 -06:00
Gabriela Cervantes
d8f90e89d5 metrics: Rename function at memory usage script
This PR renames the function name for the memory usage script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-04 19:58:09 +00:00
Gabriela Cervantes
b9d66e0d53 metrics: Fix double quotes variables in memory usage script
This PR uses double quotes in all the variables as well as general fixes
to the memory usage script in order to have uniformity.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-04 19:51:36 +00:00
Gabriela Cervantes
476a11194a tests: Enable memory usage metrics tests
This PR enables the memory usage metrics tests for kata CI.

Fixes #7229

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-04 16:11:54 +00:00
Fabiano Fidêncio
a25d5b9807 Merge pull request #7222 from jepio/fix-dragonball-check
gha: dragonball: Correctly propagate PATH update
2023-07-04 15:59:13 +02:00
Jeremi Piotrowski
b568c7f7d8 tests/integration: Provide default value for KATA_HOST_OS
Non AKS k8s tests (SEV/SNP/TDX) don't currently set KATA_HOST_OS, so provide a
default empty value for the variable so that those tests can run.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-04 14:28:29 +02:00
Fabiano Fidêncio
6d2e6ed7b6 Merge pull request #7217 from likebreath/0630/clh_v33.0
versions: Upgrade to Cloud Hypervisor v33.0
2023-07-04 12:52:26 +02:00
Jeremi Piotrowski
d6e96ea06d tests/integration: Use AzureLinux instead of Mariner
as OSSKU value, to get rid of this warning when creating the AKS cluster:

WARNING: The osSKU "AzureLinux" should be used going forward instead of
"CBLMariner" or "Mariner". The osSKUs "CBLMariner" and "Mariner" will
eventually be deprecated.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-04 12:49:07 +02:00
Jeremi Piotrowski
40c46c75ed tests/integration: Perform yq install in run_tests()
We only need to install in run_tests() so that the yq install is picked up by
kubernetes/setup.sh as well. We also need to either use (sudo &&
INSTALL_IN_GOPATH=false) || (INSTALL_IN_GOPATH=true).

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-04 12:49:07 +02:00
Bin Liu
f214058b07 Merge pull request #7202 from wedsonaf/macros
Convert `is_allowed`, `ttrpc_error` and `sl` to functions
2023-07-04 14:23:08 +08:00
Peng Tao
f5658c7833 Merge pull request #7224 from fidencio/topic/gha-release-fix-hub-download
gha: release: Use a specific release of hub
2023-07-04 10:21:17 +08:00
GabyCT
5950df7d95 Merge pull request #7199 from GabyCT/topic/installchem
metrics: Add checkmetrics to gha-run.sh for metrics CI
2023-07-03 17:49:18 -06:00
Gabriela Cervantes
d8b8f7e94d metrics: Enable launch tests time metrics
This PR enables the launch tests metrics for kata CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-03 22:38:04 +00:00
Fabiano Fidêncio
72fd562bd6 gha: release: Use a specific release of hub
Ideally we should never ever use hub again, and switch to a supported /
release tool instead.  However, in order to get v3.1.3 released, let's
just stick to the last released version of hub, as trying to get its
release is leading to:
```
curl -s "https://api.github.com/repos/github/hub/releases/latest"
{
  "message": "Moved Permanently",
  "url": "https://api.github.com/repositories/401025/releases/latest",
  "documentation_url": "https://docs.github.com/v3/#http-redirects"
}
```

And that breaks the release process. :-/

Fixes: #7223

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-03 22:00:55 +02:00
Fabiano Fidêncio
a7340a63a4 Merge pull request #7209 from GabyCT/topic/fixbuildovmf
packaging: Fix indentation of build.sh script at ovmf
2023-07-03 20:06:29 +02:00
Gabriela Cervantes
0502354b42 checkmetrics: Add checkmetrics json for qemu
This PR adds checkmetrics json file for qemu metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-03 16:47:03 +00:00
Gabriela Cervantes
b481ef1883 makefile: Add -buildvcs=false flag to go build
This PR adds the -buildvcs=false flag to the go build of checkmetrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-03 16:42:51 +00:00
Gabriela Cervantes
e94aaed3c7 ci_worker: Add checkmetrics ci worker for cloud hypervisor
This PR adds the checkmetrics ci worker file for cloud hypervisor in
order to check the boot times limit.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-03 16:42:51 +00:00
Gabriela Cervantes
917576e6fb metrics: Add double quotes in all variables
This PR adds double quotes in all variables to have uniformity across
all the gha-run.sh script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-03 16:42:50 +00:00
Gabriela Cervantes
cc8f0a24e4 metrics: Add checkmetrics to gha-run.sh for metrics CI
This PR adds checkmetrics installation for gha-run.sh in order to compare
results limits as part of the metrics CI.

Fixes #7198

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-07-03 16:41:31 +00:00
Jeremi Piotrowski
477856c1e3 gha: dragonball: Correctly propagate PATH update
cargo/rust is installed in one step, we need to write the PATH update to
GITHUB_ENV so that it becomes visible in the next steps.

Fixes: #7221
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-03 17:05:12 +02:00
Fupan Li
b6307c2744 Merge pull request #5444 from zvonkok/vra
doc: Add documentation for the virtualization reference architecture
2023-07-03 10:14:20 +08:00
Peng Tao
c85aff7ef4 Merge pull request #6949 from zvonkok/kernel-fixes
gpu: Update kernel building to the latest changes
2023-07-03 09:53:08 +08:00
Peng Tao
581be92b25 Merge pull request #4492 from zvonkok/pcie-topology
runtime: fix PCIe topology for GPUDirect use-case
2023-07-03 09:17:12 +08:00
David Esparza
d01762dc35 Merge pull request #7174 from dborquez/add_memory_footprint_test
metrics: Add memory footprint tests
2023-06-30 16:32:10 -06:00
Fabiano Fidêncio
00b0755e3e Merge pull request #7200 from fidencio/topic/add-virtiofs-none-option
runtime: Add "none" as a shared_fs option
2023-06-30 22:45:39 +02:00
Aurélien Bombo
1c211cd730 gha: Swap asset/release in build matrix
This simply displays the asset name first in GH's UI, so that the
release name (always "test") is truncated rather than the asset name.
Makes things slightly easier to read.

e.g.

    build-asset (cloud-hypervisor-glibc, te...

instead of

    build-asset (test, cloud-hypervisor-gli...

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-30 12:51:40 -07:00
Aurélien Bombo
0152c9aba5 tools: Introduce USE_CACHE environment variable
This allows setting `USE_CACHE=no` to test building e2e during
development without having to comment code blocks and so forth.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-30 12:51:40 -07:00
Aurélien Bombo
2b59756894 tests: Build CLH with glibc for Mariner
This enables building CLH with glibc and the mshv feature as required
for Mariner. At test time, it also configures Kata to use that CLH
flavor when running Mariner.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-30 12:51:40 -07:00
Aurélien Bombo
80c78eadce tests: Use baked-in kernel with Mariner
Mariner ships a bleeding-edge kernel that might be ahead of upstream, so
we use that to guarantee compatibility with the host.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-30 12:51:40 -07:00
Aurélien Bombo
532755ce31 tests: Build Mariner rootfs initrd
 * Adds a new `rootfs-initrd-mariner` build target.
 * Sets the custom initrd path via annotation in `setup.sh` at test
   time.
 * Adapts versions.yaml to specify a `cbl-mariner` initrd variant.
 * Introduces env variable `HOST_OS` at deploy time to enable using a
   custom initrd.
 * Refactors the image builder so that its caller specifies the desired
   guest OS.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-30 12:51:40 -07:00
Fabiano Fidêncio
6a21e20c63 runtime: Add "none" as a shared_fs option
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.

This is *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.

However, there are still use-cases where users can live with just copying
those files into the guest at the pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no extra obvious benefit, consuming memory and even increasing the
attack surface used by Kata Containers.

For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.

Fixes: #7207

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-30 20:45:00 +02:00
Bo Chen
5681caad5c versions: Upgrade to Cloud Hypervisor v33.0
Details of this release can be found in our roadmap project as iteration
v33.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #7216

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-06-30 09:37:27 -07:00
David Esparza
b2ce8b4d61 metrics: Add memory footprint tests to the CI
This PR adds memory footprint metrics to the tests/metrics/density
folder.

Intentionally, each test exits w/ zero in all test cases to ensure
that tests would be green when added, and will be enabled in a
subsequent PR.

A workflow matrix was added to define hypervisor variation on
each job, in order to run them sequentially.

The launch-times test was updated to make use of the matrix
environment variables.

Fixes: #7066

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-30 09:52:27 -06:00
David Esparza
5e3f617cb6 Merge pull request #7197 from GabyCT/topic/fixfunctionname
metrics: Uniformity across function names in gha-run.sh
2023-06-30 09:37:15 -06:00
Zvonko Kaiser
d035955ef5 doc: Add documentation for the virtualization reference architecture
Fixes: #4041

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-30 12:30:37 +00:00
Zvonko Kaiser
0f454d0c04 gpu: Fixing typos for PCIe topology changes
Some comments and functions had typos and wrong capitalization.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-30 08:42:55 +00:00
Gabriela Cervantes
6bb2ea8195 packaging: Fix indentation of build.sh script at ovmf
This PR fixes the indentation of build.sh script at ovmf.

Fixes #7208

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-29 15:46:54 +00:00
Fupan Li
4288b935e1 Merge pull request #7104 from openanolis/physical/endpoint
runtime-rs: support physical endpoint using device manager
2023-06-29 14:43:44 +08:00
GabyCT
19890133e9 Merge pull request #7189 from Apokleos/direct-vol-bugfix
runtime-rs: bugfix for direct volume path's validation.
2023-06-28 12:26:22 -06:00
Wedson Almeida Filho
0504bd7254 agent: convert the sl macros to functions
There is nothing in them that requires them to be macros. Converting
them to functions allows for better error messages.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
0860fbd410 agent: convert the ttrpc_error macro to a function
There is nothing in it that requires it to be a macro. Converting it to
a function allows for better error messages.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
0e5d6ce6d7 agent: convert the is_allowed macro to a function
Having a function allows for better error messages from the type checker
and it makes it clearer to callers what can happen. For example:

is_allowed!(req);

Gives no indication that it may result in an early return, and no simple
way for callers to modify the behaviour. It also makes it look like
ownership of `req` is being transferred.

On the other hand,

is_allowed(&req)?;

Indicates that `req` is being borrowed (immutably) and may fail. The
question mark indicates that the caller wants an early return on
failure.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
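The contrast described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the agent's code: `Request`, `NotAllowed`, `ALLOWED` and `handle` are hypothetical stand-ins for the ttrpc request types and the agent's endpoint configuration.

```rust
/// Hypothetical error type; the real agent returns a ttrpc error instead.
#[derive(Debug, PartialEq)]
pub struct NotAllowed(pub String);

/// Hypothetical request type carrying only the endpoint name.
pub struct Request {
    pub endpoint: &'static str,
}

/// Hypothetical allow-list; the real agent reads this from its configuration.
const ALLOWED: &[&str] = &["CreateContainer", "StartContainer"];

/// As a function, the signature states that `req` is only borrowed
/// and that the check can fail.
pub fn is_allowed(req: &Request) -> Result<(), NotAllowed> {
    if ALLOWED.contains(&req.endpoint) {
        Ok(())
    } else {
        Err(NotAllowed(format!("{} is blocked", req.endpoint)))
    }
}

/// The `?` makes the early return visible at the call site,
/// and `req` is still usable afterwards because it was never moved.
pub fn handle(req: Request) -> Result<&'static str, NotAllowed> {
    is_allowed(&req)?;
    Ok(req.endpoint)
}
```

A caller that wants different behaviour on failure can match on the `Result` instead of using `?`, which a macro with a hidden `return` would not allow.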
Wedson Almeida Filho
f680fc52be agent: change AGENT_CONFIG's lazy type to just AgentConfig
Since it is never modified, it doesn't really need a lock of any kind.
Removing the `RwLock` wrapper allows us to remove all `.read().await`
calls when accessing it.

Additionally, `AGENT_CONFIG` already has a static lifetime, so there is
no need to wrap it in a ref-counted heap allocation.

Fixes: #5409

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:27 -03:00
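A lock-free, read-only global of this kind might look like the sketch below. It assumes the standard library's `OnceLock` rather than whatever lazy-initialization crate the agent actually uses, and the `AgentConfig` fields and default values are made up for illustration.

```rust
use std::sync::OnceLock;

/// Hypothetical, reduced configuration struct; the fields are illustrative only.
#[derive(Debug)]
pub struct AgentConfig {
    pub debug_console: bool,
    pub server_addr: String,
}

/// Initialized once and never modified afterwards, so no RwLock is needed,
/// and being a static it needs no reference-counted heap wrapper either.
static AGENT_CONFIG: OnceLock<AgentConfig> = OnceLock::new();

/// Returns a plain `&'static` reference; callers need no `.read().await`.
pub fn agent_config() -> &'static AgentConfig {
    AGENT_CONFIG.get_or_init(|| AgentConfig {
        debug_console: false,
        server_addr: String::from("vsock://-1:1024"),
    })
}
```

Every call to `agent_config()` hands back a reference to the same value, so reads stay synchronous and cheap even from async code.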
GabyCT
3f87d0fbfe Merge pull request #7180 from dborquez/run_ret_hypervisor_version_w_sudo
metrics: Fix retrieving hypervisor version on metrics
2023-06-28 10:54:23 -06:00
Gabriela Cervantes
beb7063683 metrics: Uniformity across function names
This PR adds the word function before the function names in order to have
uniformity across the script as some are using this and some are not.

Fixes #7196

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-28 16:09:19 +00:00
Fabiano Fidêncio
c8d33da8a4 Merge pull request #7188 from jongwu/fix_vfio
runtime-rs: fix build error on AArch64
2023-06-28 15:43:14 +02:00
Jianyong Wu
1f3e837e4b runtime-rs: fix build error on AArch64
VFIO support introduced a build error on AArch64. Removing the arch-related
annotation avoids this error.

Fixes: #7187
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-06-28 07:10:43 +00:00
alex.lyn
6fd25968c6 runtime-rs: bugfix for direct volume path's validation.
The failure is mainly caused by the encoded volume path and
the mount/src. The src is validated with stat, but it is
encoded and not a full path, which causes the stat of the
mount source to fail.

Fixes: #7186

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-28 10:07:07 +08:00
GabyCT
3885ba4910 Merge pull request #7173 from GabyCT/topic/addcheckm
checkmetrics: Add checkmetrics makefile and documentation
2023-06-27 16:30:44 -06:00
Gabriela Cervantes
415578cf3b docs: Add general README
This PR adds links to the unreferenced docs in the cmd path to make
them more discoverable.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-27 20:29:37 +00:00
Zhongtao Hu
c76583a08f Merge pull request #7171 from GabyCT/topic/enabletimedoc
docs: Add boot time metrics documentation
2023-06-27 10:28:56 +08:00
Zhongtao Hu
bff4672f7d runtime-rs: support physical endpoint using device manager
use device manager to attach physical endpoint

Fixes: #7103
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-06-27 10:25:51 +08:00
David Esparza
32cba7e44a metrics: Fix retrieving hypervisor version on metrics
This PR makes use of sudo to retrieve the hypervisor version.

Fixes: #7178

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-26 16:26:27 -06:00
Gabriela Cervantes
aa7946de47 checkmetrics: Add general checkmetrics documentation
This PR adds the general checkmetrics documentation for kata metrics tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-26 17:07:57 +00:00
Gabriela Cervantes
2fac2b72fe checkmetrics: Add checkmetrics makefile
This PR adds checkmetrics makefile which is used to process the
metrics json results files.

Fixes #7172

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-26 16:31:55 +00:00
Gabriela Cervantes
e45899ae0e docs: Add time tests documentation reference
This PR adds time tests documentation reference in the general README
for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-26 16:30:20 +00:00
Gabriela Cervantes
28130d3cef docs: Add boot time metrics documentation
This PR adds boot time metrics documentation for kata metrics tests.

Fixes #7170

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-26 16:19:28 +00:00
Zhongtao Hu
ce8e3cc091 Merge pull request #7073 from Apokleos/spdk-vol
runtime-rs: add support spdk/vhost-user based volume.
2023-06-26 11:34:44 +08:00
alex.lyn
0df2fc2702 runtime-rs: add support spdk/vhost-user based volume.
Unlike the previous usage which requires creating
/dev/xxx by mknod on the host, the new approach will
fully utilize the DirectVolume-related usage method,
and pass the spdk controller to vmm.

A user guide on using the spdk volume when running
kata-containers is also added; it can be found in docs/how-to.

Fixes: #6526

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-25 16:23:19 +08:00
GabyCT
4cf552c151 Merge pull request #7097 from stevenhorsman/remove-unecessary-kata-versions
static-build: Remove kata-version parameter
2023-06-23 16:53:57 -06:00
GabyCT
388b55175e Merge pull request #7056 from FuuuOverclocking/fuu/fix-console_manager
dragonball: avoid obtaining lock twice in create_stdio_console
2023-06-23 16:47:00 -06:00
GabyCT
1a80fd66a2 Merge pull request #7161 from GabyCT/topic/enablemetricslimits
metrics: Add checkmetrics for kata metrics CI
2023-06-23 16:45:16 -06:00
Gabriela Cervantes
17198089ee vendor: Add vendor checkmetrics dependencies
This PR adds the vendor for the checkmetrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-23 20:55:30 +00:00
David Esparza
cfd6da9467 Merge pull request #7159 from dborquez/enable_launchtimes_test
metrics: enable launch-times test on gha-run metrics script
2023-06-23 12:59:46 -06:00
GabyCT
d6ff48f4e7 Merge pull request #7158 from GabyCT/topic/addmetricsreadme
docs: Add general metrics documentation
2023-06-23 11:28:00 -06:00
Gabriela Cervantes
f1dfea6e87 docs: Add metrics documentation reference
This PR adds the metrics documentation as a general reference in the
main README for kata containers.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-23 16:26:34 +00:00
Zvonko Kaiser
8330fb8ee7 gpu: Update unit tests
Some tests are now failing due to the changes in how PCIe is
handled. Update the tests accordingly.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-23 11:16:25 +00:00
David Esparza
8593594247 metrics: enable launch-times test on gha-run metrics script
This PR enables launch-times test on gha metrics workflow.

Fixes: #7049

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-22 18:05:46 -06:00
Fupan Li
469c678425 Merge pull request #7058 from Apokleos/vfio-dev
add support vfio device manager
2023-06-22 17:51:22 -06:00
Gabriela Cervantes
c4ee601bf4 metrics: Add checkmetrics for kata metrics CI
This PR adds the checkmetrics scripts that will be used for the kata metrics CI.

Fixes #7160

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-22 21:06:46 +00:00
Steve Horsman
267e97f9c0 Merge pull request #7162 from sprt/trusted-pr-authors
gha: Don't automatically trigger CI
2023-06-22 20:55:10 +01:00
Aurélien Bombo
e0d6475b49 gha: Don't automatically trigger CI
We have GH configured so that manual approval is required for CI runs
triggered by outside contributors. However, because CI is triggered by
the `pull_request_target` event, this setting isn't being honored
(see [1]). This means that an attacker could trivially extract secrets
by submitting a PR.

This change aims to mitigate this issue by preventing PRs from
triggering CI unless the `ok-to-test` label is set.

Note: For further context, we use the `pull_request_target` event and
manually check out the PR branch because it is the only way to both
access secrets and test incoming code changes.

Fixes: #7163

 [1]: https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-22 11:05:53 -07:00
Aurélien Bombo
b535c7cbd8 tests: Enable running k8s tests on Mariner
This removes the gate and lets CI run tests on Mariner.

Fixes: #6840

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-22 10:30:52 -07:00
Archana Shinde
2d329125fd Merge pull request #6800 from amshinde/check-vm-capability
kata-ctl: Check for vm capability
2023-06-21 23:52:46 -07:00
Zhongtao Hu
4b793222ab Merge pull request #7154 from cheriL/7153/fix_spellings
docs: fix spelling of "crate"
2023-06-22 10:48:58 +08:00
Gabriela Cervantes
71071bdb63 docs: Add general metrics documentation
This PR adds a general metrics introduction documentation for the kata CI.

Fixes #7157

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-21 17:19:36 +00:00
Archana Shinde
610f7986e4 check: Relax the unrestricted_guest check when running in a VM
When running on a VM, the kernel parameter "unrestricted_guest" for
kernel module "kvm_intel" is not required. So, return success when running
on a VM without checking value of this kernel parameter.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-21 07:30:35 -07:00
Archana Shinde
1b406b9d0c kata-ctl: Implement functionality to check host is capable of running VM
Implement functionality to add to the env output if the host is capable
of running a VM.

Fixes: #6727

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-21 07:30:22 -07:00
David Esparza
90408d66c0 Merge pull request #7148 from GabyCT/topic/fixtabsinitscript
packaging: Fix indentation in init.sh script
2023-06-21 07:24:25 -06:00
stevenhorsman
adf88eaa89 static-build: Remove kata-version parameter
- Remove the unnecessary kata-version passed as a second parameter

Fixes: #7096
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-21 10:15:42 +01:00
soup
09720babc3 docs: fix spelling of "crate"
Fixes: #7153

Signed-off-by: soup <lqh348659137@outlook.com>
2023-06-21 16:10:54 +08:00
David Esparza
84b214d9d2 Merge pull request #7150 from GabyCT/topic/fixworkflows
gha: Fix gha actions
2023-06-20 18:08:23 -06:00
Gabriela Cervantes
7185afc50e gha: Fix gha actions
This PR removes an unrecognized value located in one of the yamls for the
gha in order to make the CI work again.

Fixes #7149

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-20 23:13:25 +00:00
Gabriela Cervantes
21294b868d packaging: Fix indentation in init.sh script
This PR replaces single spaces with tabs in order to fix the indentation
in the init.sh script.

Fixes #7147

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-20 22:06:52 +00:00
GabyCT
90e36f43ff Merge pull request #7138 from dborquez/setup-kata-and-configure-launchtimes-test
metrics: install kata and launch-times test
2023-06-20 16:00:38 -06:00
David Esparza
fad3ac9f58 metrics: install kata and launch-times test
This PR installs the kata static tarball on the metrics runner
and runs the launch-times tests.

Fixes: #7049

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-20 13:58:09 -06:00
David Esparza
d071a87c7b Merge pull request #7109 from dborquez/add_common_libs_for_metrics
tests: Move tests helper script to this repo
2023-06-19 19:02:37 -06:00
David Esparza
4bbfcfaf15 tests: Move tests helper script to this repo
The common.sh script includes helper functions used in
our metrics tests, so we are gradually adding more
metrics used in kata.

Fixes: #7108

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-19 12:14:25 -06:00
David Esparza
f152f0e8c3 metrics: Add launch-times to metrics tests
This test measures the duration of a workload that starts, and then
immediately stops the container. It also measures the workload period,
the time to quit period, and the time to kernel period.

Fixes: #7049

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-19 10:40:16 -06:00
GabyCT
decbe77e28 Merge pull request #7129 from GabyCT/topic/metrlibjson
tests: Add json script for metrics tests
2023-06-19 09:59:41 -06:00
Fabiano Fidêncio
ef8b360711 Merge pull request #7085 from stevenhorsman/cherry-pick-initramfs
Cherry pick initramfs caching updates from CCv0
2023-06-19 11:59:00 +02:00
alex.lyn
59510cfee0 runtime-rs: add support vfio device based volume
A new choice of using a vfio device based volume for kata-containers.
With the help of kata-ctl direct-volume, users are able to add a
device specified by its BDF or IOMMU group ID.

To help users use it smoothly, a how-to doc is added in
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:07:05 +08:00
alex.lyn
1e3b372bbb runtime-rs: add support vfio device manager
Limitations:
As no rust vmm vfio manager is ready yet, runtime-rs only supports
part of vfio. The remaining part is to call the vmm
interfaces related to vfio add/remove.

So when the vmm/vfio manager is ready, a new PR will be pushed to
narrow the gap.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:05:59 +08:00
David Esparza
61e819ea8e Merge pull request #7131 from GabyCT/topic/fixrunner
gha: Fix format for run launchtimes metrics yaml
2023-06-16 18:30:57 -06:00
Gabriela Cervantes
6b08489301 gha: Fix format for run launchtimes metrics yaml
This PR fixes the format for the run launchtimes metrics yaml which
is causing the workflow to fail.

Fixes #7130

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-16 22:00:36 +00:00
Gabriela Cervantes
3cefa43e75 tests: Add json script for metrics tests
This PR adds the json script which allows us to save the metrics results
into a json file which will be used in the kata containers metrics.

Fixes #7128

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-16 19:45:26 +00:00
GabyCT
7976a0ac72 Merge pull request #7114 from GabyCT/topic/libcommontests
tests: Add tests lib common script
2023-06-16 11:48:19 -06:00
Greg Kurz
27045798bf Merge pull request #7112 from gkurz/fix-virtiofsd-args
Fix deprecated virtiofsd args (go shim only)
2023-06-16 18:13:24 +02:00
Fabiano Fidêncio
6a3710055b initramfs: Build dependencies as part of the Dockerfile
This will help to not have to build those on every CI run, and rather
take advantage of the cached image.

Fixes: #7084

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c720869eef)
2023-06-16 10:58:12 +01:00
Fabiano Fidêncio
aa2380fdd6 packaging: Add infra to push the initramfs builder image
Let's add the needed infra for only building and pushing the initramfs
builder image to the Kata Containers' quay.io registry.

Fixes: #7084

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 111ad87828)
2023-06-16 10:58:12 +01:00
Fabiano Fidêncio
1c7fcc6cbb packaging: Use existing image to build the initramfs
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder for the initramfs.

This will save us some CI time.

Fixes: #7084

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ebf6c83839)
2023-06-16 10:58:12 +01:00
Greg Kurz
a43ea24dfc virtiofsd: Convert legacy -o sub-options to their -- replacement
The `-o` option is the legacy way to configure virtiofsd, inherited
from the C implementation. The rust implementation honours it for
compatibility but it logs deprecation warnings.

Let's use the replacement options in the go shim code. Also drop
references to `-o` from the configuration TOML file.

Fixes #7111

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-06-16 11:42:54 +02:00
Greg Kurz
8e00dc6944 virtiofsd: Drop -o no_posix_lock
The C implementation of virtiofsd had some kind of limited support
for remote POSIX locks that was causing some workflows to fail with
kata. Commit 432f9bea6e hard coded `-o no_posix_lock` in order
to enforce guest local POSIX locks and avoid the issues.

We've switched to the rust implementation of virtiofsd since then,
but it emits a warning about `-o` being deprecated.

According to https://gitlab.com/virtio-fs/virtiofsd/-/issues/53 :

   The C implementation of the daemon has limited support for
   remote POSIX locks, restricted exclusively to non-blocking
   operations. We tried to implement the same level of
   functionality in #2, but we finally decided against it because,
   in practice most applications will fail if non-blocking
   operations aren't supported.

   Implementing support for non-blocking isn't trivial and will
   probably require extending the kernel interface before we can
   even start working on the daemon side.

There is thus no justification to pass `-o no_posix_lock` anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-06-16 11:42:39 +02:00
Greg Kurz
2a15ad9788 virtiofsd: Stop using deprecated -f option
The rust implementation of virtiofsd always runs foreground and
spits a deprecation warning when `-f` is passed.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-06-16 10:30:40 +02:00
David Esparza
b9d92f4577 Merge pull request #7117 from dborquez/add_checkout_metrics_workflow
gha: Add base branch on SHA on pull request
2023-06-15 17:06:16 -06:00
Gabriela Cervantes
c3043a6c60 tests: Add tests lib common script
This PR adds the test lib common script that is going to be used
for kata containers metrics.

Fixes #7113

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-15 21:23:00 +00:00
David Esparza
b16e0de734 gha: Add base branch on SHA on pull request
The run-launchtimes-metrics workflow needs to get the commit ID
for the last commit to the head branch of the PR.

Fixes: #7116

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-15 13:11:33 -06:00
Zvonko Kaiser
72f2cb84e6 gpu: Reset cold or hot plug after overriding
If we override the cold, hot plug with an annotation
we need to reset the other plugging mechanism to NoPort
otherwise both will be enabled.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-15 17:51:01 +00:00
Zvonko Kaiser
fbacc09646 gpu: PCIe topology, consider vhost-user-block in Virt
In Virt the vhost-user-block is a PCIe device so
we need to make sure to consider it as well. We're keeping
track of vhost-user-block devices and deduce the correct
amount of PCIe root ports.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-15 17:39:55 +00:00
GabyCT
0f24f427d7 Merge pull request #7101 from dborquez/add_initial_metrics_gh_workflow
gha: ci-on-push: Run metrics tests
2023-06-15 10:08:56 -06:00
David Esparza
bc152b1141 gha: ci-on-push: Run metrics tests
This gh-workflow prints a simple msg, but is the base for future
PRs that will gradually add the jobs corresponding to the kata
metrics test.

Fixes: #7100

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-06-14 15:15:08 -06:00
GabyCT
a3180d0cb8 Merge pull request #7095 from GabyCT/topic/updatedebugconse
docs: Update Developer Guide
2023-06-14 13:49:37 -06:00
Gabriela Cervantes
dad731d5c1 docs: Update Developer Guide
This PR updates the "connect to the debug console" section of the
developer guide.

Fixes #7094

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-14 15:36:51 +00:00
Zhongtao Hu
11692a76e1 Merge pull request #7092 from Apokleos/virtiofs-enhancement
runtime-rs: Enhance flexibility of virtio-fs config
2023-06-14 20:01:46 +08:00
Zvonko Kaiser
b11246c3aa gpu: Various fixes for virt machine type
The PCI QOM path was not deduced correctly; added a regex for correct
path walking.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:33:57 +00:00
Zvonko Kaiser
40101ea7db vfio: Added annotation for hot(cold) plug
Now it is possible to configure the PCIe topology via annotations
and added a simple test, checking for Invalid and RootPort

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
8f0d4e2612 vfio: Cleanup of Cold and Hot Plug
Removed the configuration of PCIeRootPort and PCIeSwitchPort, those
values can be deduced in createPCIeTopology

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
b5c4677e0e vfio: Rearrange the bus assignment
Refactor the bus assignment so that the call to GetAllVFIODevicesFromIOMMUGroup
can be used by any module without affecting the topology.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
b1aa8c8a24 gpu: Moved the PCIe configs to drivers
The hypervisor_state file was the wrong location for the PCIe Port
settings, moved everything under device umbrella, where it can be
consumed more easily and we do not get into circular deps.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
55a66eb7fb gpu: Add config to TOML
Update cold-plug and hot-plug setting to include bridge, root and
switch-port

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
da42801c38 gpu: Add config settings tests for hot-plug
Updated all references and config settings for hot-plug to match
cold-plug

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
de39fb7d38 runtime: Add support for GPUDirect and GPUDirect RDMA PCIe topology
Fixes: #4491

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
9318e022af gpu: Add CC related configs
For the GPU CC use case we need to set several crypto algorithms.
The driver relies on them in the CC case.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 07:56:53 +00:00
Zvonko Kaiser
b7932be4b6 gpu: Add Arm64 Kernel Settings
For different archs we need different settings; use ${ARCH} to choose
the right fragment

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 07:56:53 +00:00
Zvonko Kaiser
211b0ab268 gpu: Update Kernel Config
Newer drivers need more symbols so let's enable them

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 07:56:53 +00:00
Zvonko Kaiser
5f103003d6 gpu: Update kernel building to the latest changes
Now use the sev.conf rather than the snp.conf.
Devices can be presented in two different ways in the
container: (1) as vfio devices /dev/vfio/<num>, or
(2) the device is managed by whatever driver in
the VM kernel claims it.

Fixes: #6844

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 07:56:53 +00:00
Fabiano Fidêncio
95bec479ca Merge pull request #7090 from GabyCT/topic/ufcversion
versions: Update firecracker version to 1.3.3
2023-06-14 01:24:02 +02:00
Fabiano Fidêncio
8aa4a87fae Merge pull request #7099 from sprt/fix-new-targets
tools: Fix no-op builds
2023-06-14 01:23:39 +02:00
Aurélien Bombo
35e4938e8c tools: Fix no-op builds
This fixes the builds of `cloud-hypervisor-glibc` and
`rootfs-initrd-mariner` to properly create the `build/` directory.

Fixes: #7098

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-13 10:56:49 -07:00
Zhongtao Hu
da8dde0c24 Merge pull request #7079 from HerlinCoder/herlincoder/vpa
runtime-rs: update Cargo.lock
2023-06-13 21:44:45 +08:00
Fabiano Fidêncio
ff38937246 Merge pull request #7087 from sprt/fix-gha-stage
gha: Fix `stage` definition in matrix
2023-06-13 12:17:25 +02:00
alex.lyn
347385b4ee runtime-rs: Enhance flexibility of virtio-fs config
Support more flexible options for inline virtiofs.

Fixes: #7091

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-13 15:12:47 +08:00
Zhongtao Hu
355a24e0e1 Merge pull request #6289 from openanolis/runtime_vcpu_resize
feat(runtime): vcpu resize capability
2023-06-13 10:54:11 +08:00
Chelsea Mafrica
1763b1f69f Merge pull request #7082 from jodh-intel/remove-snap
packaging: Remove snap package
2023-06-12 17:05:00 -07:00
Gabriela Cervantes
21d2278539 versions: Update firecracker version to 1.3.3
This PR updates the firecracker version to 1.3.3 which includes the following
changes:
- Fixed passing through cache information from host in CPUID leaf 0x80000006.
- Fixed a race condition identified between the API thread and the VMM
  thread due to a misconfiguration of the api_event_fd.

Fixes #7089

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-12 20:32:02 +00:00
Aurélien Bombo
0e2379909b gha: Fix stage definition in matrix
This defines `stage` as a list instead of a literal to fix the GHA CI.

Fixes: #7086

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-12 11:24:45 -07:00
Fabiano Fidêncio
977309a281 Merge pull request #7027 from sprt/sprt/mariner-build-targets
gha: Add new build targets for Mariner
2023-06-12 19:19:22 +02:00
Yushuo
ae2cfa8263 doc: add vcpu handling doc for runtime-rs
Kubernetes and Containerd will help calculate the Sandbox Size and pass it to
Kata Containers through annotations.

In order to accommodate this favorable change and be compatible with the past,
we have implemented the handling of the number of vCPUs in runtime-rs. This is
slightly different from the original runtime-go design.

This doc introduces how we handle vCPU size in runtime-rs.

Fixes: #5030

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 19:23:11 +08:00
Yushuo
7b1e67819c fix(clippy): fix clippy error
Fixes: #5030

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 17:53:16 +08:00
Yushuo
67972ec48a feat(runtime-rs): calculate initial size
In this commit, we refactored the logic of static resource management.

We defined the sandbox size calculated from PodSandbox's annotation and
SingleContainer's spec as initial size, which will always be the sandbox
size when booting the VM.

The configuration static_sandbox_resource_mgmt controls whether we will
modify the sandbox size in the following container operations.

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 17:53:16 +08:00
Yushuo
aaa96c749b feat(runtime-rs): modify onlineCpuMemRequest
Some vmms, such as dragonball, will actively help us
perform online cpu operations when doing cpu hotplug.
Under the old onlineCpuMem interface, it is difficult
to adapt to this situation.

So we modify the semantics of nb_cpus in onlineCpuMemRequest.
In the original semantics, nb_cpus represents the number of
newly added CPUs that need to be online. The modified
semantics become that the number of online CPUs in the guest
needs to be guaranteed.
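
The difference between the two semantics can be sketched as follows. This is
only an illustration: the function names are hypothetical and are not taken
from the agent or runtime-rs code.

```python
def to_online_old(nb_cpus: int, online_now: int) -> int:
    """Old semantics: nb_cpus is the count of newly added CPUs to bring online."""
    return nb_cpus


def to_online_new(nb_cpus: int, online_now: int) -> int:
    """New semantics: nb_cpus is the total that must be online in the guest."""
    return max(0, nb_cpus - online_now)


# A guest had 2 CPUs, 2 more were hotplugged, and the VMM already onlined them.
online_now = 4
print(to_online_old(2, online_now))  # still asks for 2 more CPUs
print(to_online_new(4, online_now))  # nothing left to bring online
```

With the total-based request, a VMM that has already onlined the hotplugged
CPUs leaves nothing for the agent to do, which is the case the old interface
handled poorly.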

Fixes: #5030

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 17:53:16 +08:00
Yushuo
d66f7572dd feat(runtime-rs): clear cpuset in runtime side
The declaration of the cpu number in the cpuset is greater
than the actual number of vcpus, which will cause an error when
updating the cgroup in the guest.

This problem is difficult to solve, so we temporarily clean up
the cpuset in the container spec before passing it to the agent.

Fixes: #5030

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 17:53:16 +08:00
Yushuo
a0385e1383 feat(runtime-rs): update linux resource when stop_process
Update the resources when deleting a container, which happens in
stop_process in runtime-rs.

Fixes: #5030

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 17:53:16 +08:00
Yushuo
a39e1e6cd1 feat(runtime-rs): merge the update_cgroups in update_linux_resources
Updating vCPU resources and memory resources of the sandbox and
updating cgroups on the host will always happen together, and
they are all updated based on the linux resources declarations of
all the containers.

So we merge update_cgroups into the update_linux_resources, so we
can better manage the resources allocated to one pod in the host.

Fixes: #5030

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-06-12 17:53:16 +08:00
Ji-Xinyou
fa6dff9f70 feat(runtime-rs): support vcpu resizing on runtime side
Support vcpu resizing on runtime side:
1. Calculate vcpu numbers in resource_manager using all the containers'
   linux_resources in the spec.
2. Call the hypervisor(vmm) to do the vcpu resize.
3. Call the agent to online vcpus.

Fixes: #5030
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-06-12 17:53:16 +08:00
James O. D. Hunt
8cb4238b46 packaging: Remove snap package
Nobody has volunteered to maintain the (currently broken) snap build, so
remove it.

Fixes: #6769.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-06-12 09:24:09 +01:00
Helin Guo
2137739987 runtime-rs: update Cargo.lock
After we support memory resize in Dragonball, we need to update
Cargo.lock in runtime-rs.

Fixes: #6719

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
2023-06-12 11:25:59 +08:00
Chao Wu
2988553305 Merge pull request #6998 from HerlinCoder/herlincoder/vpa
Dragonball: support resize memory
2023-06-11 17:21:12 +08:00
Archana Shinde
56d2ea9b78 kata-ctl: Refactor kernel module check
Add vhost and vhost-net to the kernel modules. These do not require
any kernel module parameters to be checked. Currently, kernel params is
a required field, so make it optional. It could be an <Option>,
but a slice is used instead, as a module could have multiple kernel
params. Refactor the function that checks kernel modules into
two: one specifically checks if the module is loaded and the other
checks the module parameters.

Refactor some of the tests to take into account these changes.
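
A minimal sketch of the split described above, written in Python rather than
the Rust used by kata-ctl. The function names are hypothetical, and the sketch
assumes the usual sysfs layout where a loaded module appears as a directory
with a `parameters/` subdirectory. The parameter list may be empty, as for
vhost and vhost-net.

```python
import tempfile
from pathlib import Path


def module_loaded(sys_module: Path, name: str) -> bool:
    """Return True if a directory for the module exists under sys_module."""
    return (sys_module / name).is_dir()


def mismatched_params(sys_module: Path, name: str, params: list) -> list:
    """Return (param, expected, actual) for every parameter that does not match.

    `params` is a list of (param, expected_value) pairs and may be empty.
    """
    bad = []
    for param, expected in params:
        path = sys_module / name / "parameters" / param
        actual = path.read_text().strip() if path.is_file() else None
        if actual != expected:
            bad.append((param, expected, actual))
    return bad


# Demo against a fake tree, so the sketch runs without the real modules loaded.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "vhost").mkdir()
    (root / "kvm_intel" / "parameters").mkdir(parents=True)
    (root / "kvm_intel" / "parameters" / "unrestricted_guest").write_text("Y\n")
    print(module_loaded(root, "vhost"), mismatched_params(root, "vhost", []))
    print(mismatched_params(root, "kvm_intel", [("unrestricted_guest", "Y")]))
```

On a real host the first argument would be `Path("/sys/module")`.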

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-09 14:10:31 -07:00
Aurélien Bombo
9f7a45996c gha: Add rootfs-initrd-mariner build target
This adds the Mariner guest image build target to the list of assets
as preparation for #6839.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-09 11:36:42 -07:00
Aurélien Bombo
f28a62164a gha: Add cloud-hypervisor-glibc build target
This adds the glibc flavor of CLH to the list of assets as preparation
for #6839. Mariner Kata is only tested with glibc.

Fixes: #7026

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-09 11:35:50 -07:00
Fabiano Fidêncio
b50f62ce48 Merge pull request #6756 from arronwy/measured_rootfs
Port Measured rootfs feature from CCv0 branch to main
2023-06-09 12:35:05 +02:00
Helin Guo
8fb7ab7518 dragonball: introduce virtio-balloon device
We introduce virtio-balloon device to support memory resize.
virtio-balloon device could reclaim memory from guest to host.

Fixes: #6719

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
2023-06-09 17:47:27 +08:00
Helin Guo
7ed9494973 dragonball: introduce virtio-mem device
We introduce virtio-mem device to support memory resize. virtio-mem
device could hot-plug more memory blocks to guest and could also
hot-unplug them from guest.

Fixes: #6719

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
2023-06-09 17:47:21 +08:00
Chao Wu
c7c45626c9 Merge pull request #6973 from Apokleos/direct-vol
add support direct volume and refactor device manager
2023-06-09 11:29:00 +08:00
alex.lyn
776a15e092 runtime-rs: add support direct volume.
As block and direct volumes use similar steps for adding devices,
making full use of the block volume code is a better way to
handle direct volumes.

The only difference is that a direct volume will use
DirectVolume and get_volume_mount_info to parse mountinfo.json
from the direct volume path. That is to say, direct volume needs
the help of `kata-ctl direct-volume ...`.

For details, see Advanced Topics:
[How to run Kata Containers with kinds of Block Volumes]
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.md

Fixes: #5656

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-09 08:16:26 +08:00
Helin Guo
a8e0f51c52 dragonball: extend DeviceOpContext
In order to support virtio-mem and virtio-balloon devices, we need to
extend DeviceOpContext with VmConfigInfo and InstanceInfo.

Fixes: #6719

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
2023-06-08 22:04:31 +08:00
alex.lyn
abae114046 runtime-rs: refactor device manager implementation
The key aspects of the DM implementation refactoring are as follows:

1. Reduce duplicated code.
   Many scenarios have similar steps when adding devices. To reduce
   duplicated code, we create a common method, do_handle_device, and
   use it in the various scenarios:
   (1) new_device with DeviceConfig, returning the device_id;
   (2) try_add_device with the device_id, which actually adds the device;
   (3) return the device's info.

2. Return full info from the Device trait's get_device_info:
   replace the original type DeviceConfig with the full-info DeviceType.

3. Refactor the find_device method.

Fixes: #5656

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-08 08:47:08 +08:00
Fabiano Fidêncio
08d10d38be Merge pull request #7048 from sprt/sprt/fix-gha
gha: Fix gha-run.sh and unbreak CI
2023-06-07 23:40:02 +02:00
James O. D. Hunt
452f286552 Merge pull request #6764 from byron-marohn/fix_5401
kata-ctl: Switch to slog logging; add --log-level and --json-logging arguments
2023-06-07 16:08:53 +01:00
Fuu
210a15794c dragonball: avoid obtaining lock twice in create_stdio_console
Fixes #7055

Signed-off-by: Fuu <fuu-open@linux.alibaba.com>
2023-06-07 16:12:22 +08:00
Aurélien Bombo
69668ce87f tests: gha-run: Use correct env variable for repo
s/DOCKER_IMAGE/DOCKER_REPO

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-06 11:54:43 -07:00
Aurélien Bombo
f487199edf gha: aks: Fix argument in call to gha-run.sh
Fixes: #7047

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-06 11:51:18 -07:00
GabyCT
5ad8aaf9df Merge pull request #7035 from GabyCT/topic/logparserdoc
log-parser: Update log parser link at README
2023-06-06 12:02:25 -06:00
Fabiano Fidêncio
de2e507483 Merge pull request #6972 from sprt/sprt/gha-run-script
gha: aks: Extract `run` commands to a script
2023-06-06 14:54:03 +02:00
Wang, Arron
f6afae9c73 packaging: Add rootfs-image-tdx-tarball target
Add rootfs-image-tdx target:
./tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh --build=rootfs-image-tdx
./opt/kata/share/kata-containers/kata-containers-tdx.img
./opt/kata/share/kata-containers/kata-ubuntu-latest-tdx.image

Fixes: #6674

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-06-06 12:34:20 +02:00
Wang, Arron
f62b2670c0 config: Add root hash value and measure config to kernel params
After we have a guest kernel with a builtin initramfs which
provides the rootfs measurement capability, and a Kata rootfs
image with a hash device, we need to set the related root hash value
and measure config in the kernel params in the kata configuration file.

Fixes: #6674

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-06-06 12:34:13 +02:00
Wang, Arron
0080588075 kernel: Integrate initramfs into Guest kernel
Integrate initramfs into guest kernel as one binary,
which will be measured by the firmware together.

Fixes: #6674

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-06-06 12:33:41 +02:00
Wang, Arron
28b2645624 initramfs: Add build script to generate initramfs
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.

Fixes: #6674

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-06-06 12:33:28 +02:00
Wang, Arron
5cb02a8067 image-build: generate root hash as a separate partition for rootfs
Generate rootfs hash data when creating the kata rootfs.
The current kata image only has one partition, so we add another
partition as a hash device to save the hash data of the rootfs data blocks.

Fixes: #6674

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-06-06 12:31:14 +02:00
Arron Wang
31c0ad2076 packaging: Add cryptsetup support in Guest kernel and rootfs
Add required kernel config for dm-crypt/dm-integrity/dm-verity
and related crypto config.

Add userspace command line tools for disk encryption support
and ext4 file system utilities.

Fixes: #6674

Signed-off-by: Arron Wang <arron.wang@intel.com>
2023-06-06 12:30:07 +02:00
Fabiano Fidêncio
eb1bfa922b Merge pull request #6980 from nubificus/feat_sharefs_files
runtime-rs: handle copy files when share_fs is not available
2023-06-06 12:26:55 +02:00
Chao Wu
b0c6cd05a2 Merge pull request #7033 from openanolis/fix-agent-ctl
agent-ctl: fix the compile error
2023-06-06 11:55:15 +08:00
Gabriela Cervantes
980d084f47 log-parser: Update log parser link at README
This PR updates the link to the corresponding Developer Guide section on
enabling full containerd debug in the kata 2.0 documentation.

Fixes #7034

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-06-05 15:59:52 +00:00
Yushuo
410bc18143 agent-ctl: fix the compile error
When the version of libc is upgraded to 0.2.145, older getrandom cannot adapt
to the new API, and this makes agent-ctl fail to compile.

We upgrade the version of `rand`, so the old version of getrandom is no longer
needed.

Fixes: #7032

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-06-05 21:48:36 +08:00
Jayant Singh
77519fd120 kata-ctl: Switch to slog logging; add --log-level, --json-logging args
Fixes: #5401, #6654

- Switch kata-ctl from eprintln!()/println!() to structured logging via
  the logging library which uses slog.
- Adds a new create_term_logger() library call which enables printing
  log messages to the terminal via a less verbose / more human readable
  terminal format with colors.
- Adds --log-level argument to select the minimum log level of printed messages.
- Adds --json-logging argument to switch to logging in JSON format.

Co-authored-by: Byron Marohn <byron.marohn@intel.com>
Co-authored-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Jayant Singh <jayant.singh@intel.com>
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com>
Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>
2023-06-02 20:13:22 +00:00
Aurélien Bombo
aab6030962 gha: aks: Extract run commands to a script
Github Actions reads and runs workflow files from the main branch,
rather than from the PR branch. This means that PRs that modify workflow
files aren't being tested with the updated workflows coming from the PR,
but rather with the old workflows from the main branch. AFAIK, this
behavior isn't avoidable for workflow files (but is for other scripts).

This makes it very hard to reliably test workflow changes before they're
actually merged into main and leads to issues that we have to hotfix
(see #6983, #6995).

This PR aims to mitigate that by extracting the commands used in
workflows to a separate script file. The way our CI is set up, those
script files are read from the PR branch and thus changes would be
reflected in the CI checks.

Fixes: #6971

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-06-02 10:22:35 -07:00
Fupan Li
465f5a5ced Merge pull request #4748 from lifupan/main_fix
agent: fix the issue of exec hang with a background process
2023-06-02 10:46:43 +08:00
Chao Wu
2128fa2b4e Merge pull request #7013 from xuejun-xj/xuejun/bugfix
runtime-rs: bugfix: update Cargo.lock
2023-06-02 10:08:27 +08:00
Anastassios Nanos
e4eb664d27 runtime-rs: update rust to 1.69.0
We are probably hitting this:
https://github.com/rust-lang/rust/issues/63033

Seems like it is worth a try to upgrade to 1.69.0

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2023-06-01 21:40:56 +00:00
Anastassios Nanos
ed37715e05 runtime-rs: handle copy files when share_fs is not available
In hypervisors that do not support virtiofs we have to copy files in
the VM sandbox to properly setup the network (resolv.conf, hosts, and hostname).

To do that, we construct the volume as before, with the addition of an extra
variable that designates the path where the file will reside in the sandbox.

In this case, we issue a `copy_file` agent request *and* we patch the spec
to account for this change.

Fixes: #6978

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
2023-06-01 21:40:56 +00:00
Fabiano Fidêncio
18b1a019d4 Merge pull request #7011 from jepio/fix-aks-cluster-name
gha: aks: Use short SHA in cluster name
2023-06-01 15:56:20 +02:00
Fabiano Fidêncio
5ab42d87fb Merge pull request #7009 from fidencio/topic/display-badge-for-the-publish-artefacts-job
README: Display badge for the "Publish Artefacts" job and update the Kata Containers logo
2023-06-01 15:13:41 +02:00
Fabiano Fidêncio
eb1f44f111 Merge pull request #7007 from fidencio/topic/try-to-fix-ubuntu-k8s-key-not-available
kata-deploy: Change how we get the Ubuntu k8s key
2023-06-01 15:13:22 +02:00
xuejun-xj
5f6fc3ed76 runtime-rs: bugfix: update Cargo.lock
When dragonball updated the dbs-boot crate in commit
64c764c147, the Cargo.lock in runtime-rs
should also be updated.

Fixes: #6969

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-06-01 20:25:35 +08:00
Jeremi Piotrowski
1c6d22c803 gha: aks: Use short SHA in cluster name
Full SHA is 40 characters, while AKS cluster name has a limit of 63. Trim the
SHA to 12 characters, which is widely considered to be unique enough and is
short enough to be used in the cluster name
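
The trimming can be sketched as below. The helper name and the `kata-ci`
prefix are made up for illustration and do not come from the workflow; only
the 63 character limit and the 12 character short SHA are taken from the text
above.

```python
AKS_NAME_LIMIT = 63  # cluster name limit quoted in this commit message
SHORT_SHA_LEN = 12   # length the SHA is trimmed to


def cluster_name(prefix: str, sha: str) -> str:
    """Build a cluster name from a prefix and a trimmed commit SHA."""
    name = f"{prefix}-{sha[:SHORT_SHA_LEN]}"
    if len(name) > AKS_NAME_LIMIT:
        raise ValueError(f"cluster name too long: {len(name)} > {AKS_NAME_LIMIT}")
    return name


full_sha = "1c6d22c803" + "0" * 30  # placeholder 40-character SHA
print(cluster_name("kata-ci", full_sha))
```

Trimming the SHA leaves the rest of the 63 characters for the other name
components.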

Fixes: #7010
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-06-01 14:03:53 +02:00
Fabiano Fidêncio
3c1f6d36dc readme: Update Kata Containers logo
Let's use the horizontal logo, as it makes better use of the space that we
have.

The logo comes from:
https://openinfra.dev/brand/logos

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-01 12:25:13 +02:00
Fabiano Fidêncio
3886841131 readme: Add status badge for the "Publish Artefacts" job
Let's start adding the status of our jobs as part of our main page, so
folks monitoring those can easily check whether they're okay, or if
someone has to be pinged about those.

Fixes: #7008

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-01 12:25:01 +02:00
Fabiano Fidêncio
26f7520387 kata-deploy: Change how we get the Ubuntu k8s key
The current method has been failing every now and then, and was reported
on https://github.com/kubernetes/release/issues/2862.

Ding poked me and suggested to do this change here, so here we go. :-)

Fixes: #7006

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-01 12:10:30 +02:00
Fabiano Fidêncio
9ec2bca101 Merge pull request #7002 from fidencio/topic/follow-up-on-7000
gha: aks: Ensure host_os is used everywhere needed
2023-06-01 08:51:27 +02:00
Fabiano Fidêncio
8cbb80da66 Merge pull request #6929 from LindaYu17/dev
kubernetes: add agnhost command in pod yaml
2023-06-01 08:39:58 +02:00
Fabiano Fidêncio
aebd3b47d9 gha: aks: Ensure host_os is used everywhere needed
We added that to create the cluster name, but I forgot to add that to
the part where we get the k8s config file, or to the part where we delete the
AKS cluster.

Fixes: #6999

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-31 20:50:55 +02:00
Fabiano Fidêncio
e01f75723a Merge pull request #6997 from singhwang/main
main | release: Standardize kata static file name
2023-05-31 15:22:30 +02:00
Fabiano Fidêncio
1ed917a079 Merge pull request #6989 from BbolroC/configurable-build-registry
packaging: make BUILDER_REGISTRY configurable
2023-05-31 15:18:51 +02:00
Fabiano Fidêncio
de22783124 Merge pull request #7000 from fidencio/topic/use-a-different-name-for-the-ubuntu-and-mariner-aks-clusters
gha: aks: Add the host_os as part of the aks cluster's name
2023-05-31 15:18:17 +02:00
Archana Shinde
141c26f307 Merge pull request #6985 from amshinde/kernel-tdx-build
kernel: Modify build-kernel.sh to accommodate for changes in version.yaml
2023-05-31 01:57:20 -07:00
Fabiano Fidêncio
0c8282c224 gha: aks: Add the host_os as part of the aks cluster's name
We need to do so, otherwise we'll create two clusters for testing Cloud
Hypervisor with exactly the same name, one using Ubuntu, and one using
Mariner.

Fixes: #6999

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-31 05:20:04 +02:00
SinghWang
4b89a6bdac release: Standardize kata static file name
The strings representing the architectures aarch64 and x86_64 need to be changed to arm64 and amd64 for the release.

Fixes: #6986
Signed-off-by: SinghWang <wangxin_0611@126.com>
2023-05-31 10:24:45 +08:00
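
The renaming described in 4b89a6bdac amounts to a small lookup. The sketch below is a hedged Python illustration only: the name `release_arch` is hypothetical, and passing other architectures through unchanged is an assumption, since the commit message only names the two mappings.

```python
# The two mappings named in the commit message.
RELEASE_ARCH = {"aarch64": "arm64", "x86_64": "amd64"}


def release_arch(machine: str) -> str:
    """Translate a machine name to the release naming.

    Architectures not listed in the commit message are assumed to pass
    through unchanged.
    """
    return RELEASE_ARCH.get(machine, machine)
```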
Fabiano Fidêncio
51e42a9972 Merge pull request #6995 from sprt/sprt/fix-mariner-ci
gha: Fix Mariner cluster creation
2023-05-31 00:23:36 +02:00
Archana Shinde
9228815ad2 kernel: Modify build-kernel.sh to accommodate for changes in version.yaml
There were recent changes for the tdx kernel in the version.yaml that are
not currently accounted for in the build-kernel.sh script.
Attempts to set up a tdx kernel to build local changes seemed to not download
the tdx kernel. Instead the mainline kernel is downloaded which has no
tdx-related changes.

The version.yaml has a new entry for tdx kernel. Use that instead for
setting up and downloading the tdx kernel.

Fixes: #6984

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-05-30 13:44:58 -07:00
Aurélien Bombo
03027a7399 gha: Fix Mariner cluster creation
While the Mariner Kata host is in preview, we need the `aks-preview`
extension to enable the `--workload-runtime KataMshvVmIsolation` flag.

Fixes: #6994

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-05-30 13:26:49 -07:00
Hyounggyu Choi
43e73bdef7 packaging: make BUILDER_REGISTRY configurable
This PR is to make an environment variable `BUILDER_REGISTRY` configurable
so that those who want to use their own registry for build can set up
the registry.

Fixes: #6988
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-05-30 14:40:02 +02:00
Fabiano Fidêncio
2e2d7243d2 Merge pull request #6983 from sprt/sprt/fix-gha-ci
gha: Unbreak CI and fix cluster creation step
2023-05-30 12:58:10 +02:00
Zhongtao Hu
8b6cb2cd75 Merge pull request #6806 from xuejun-xj/xuejun/vcpuhotplug
Dragonball: support vcpu hotplug on aarch64
2023-05-30 18:47:50 +08:00
xuejun-xj
ffe3157a46 dragonball: add arm64 patches for upcall
The vcpu hotplug/hotunplug feature is implemented with upcall. This commit
adds three patches to support the feature on aarch64. Patches:
> 0005: add support of upcall on aarch64
> 0006: skip activate offline cpus' MSI interrupt
> 0007: set the correct boot cpu number

Fixes: #6010

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-05-30 15:51:08 +08:00
xuejun-xj
560442e6ed dragonball: add vcpu_boot_onlined vector
This commit implements the vcpu_boot_onlined vector in get_fdt_vm_info.

"boot_enabled" means whether this vcpu should be onlined at first boot.
It will be used by fdt, which writes an attribute called boot_enabled,
and will be handled by guest kernel to pass the correct cpu number to
function "bringup_nonboot_cpus".

Fixes: #6010

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-05-30 15:51:08 +08:00
xuejun-xj
e31772cfea dragonball: add support resize_vcpu on aarch64
This commit adds support for resize_vcpu on aarch64. As kvm will check
whether vgic is initialized when calling KVM_CREATE_VCPU ioctl, all the
vcpu fds should be created before vm is booted.

To support resizing vcpu scenario, we use max_vcpu_count for
create_vcpus and setup_interrupt_controller interfaces. The
SetVmConfiguration API will ensure max_vcpu_count >= boot_vcpu_count.

Fixes: #6010

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-05-30 15:51:08 +08:00
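
The sizing rule described in e31772cfea can be modelled roughly as below. This is a Python sketch under stated assumptions, not Dragonball's Rust code: `VcpuConfig`, `validate`, `vcpu_fds_to_create` and `can_resize_to` are hypothetical names, and the lower bound of 1 for a resize target is an assumption. Only the `max_vcpu_count >= boot_vcpu_count` invariant and the idea of creating every vCPU before boot come from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VcpuConfig:
    boot_vcpu_count: int  # vCPUs online at first boot
    max_vcpu_count: int   # upper bound reachable through resize_vcpu


def validate(cfg: VcpuConfig) -> None:
    """Enforce the invariant the commit attributes to SetVmConfiguration."""
    if cfg.max_vcpu_count < cfg.boot_vcpu_count:
        raise ValueError("max_vcpu_count must be >= boot_vcpu_count")


def vcpu_fds_to_create(cfg: VcpuConfig) -> int:
    """All vCPU fds are created before boot, so size by the maximum."""
    validate(cfg)
    return cfg.max_vcpu_count


def can_resize_to(cfg: VcpuConfig, target: int) -> bool:
    """A resize can only use vCPUs created up front (lower bound of 1 assumed)."""
    validate(cfg)
    return 1 <= target <= cfg.max_vcpu_count
```

Sizing by the maximum means a later resize never needs to create a vCPU fd after the vgic has been initialized.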
xuejun-xj
64c764c147 dragonball: update dbs-boot to v0.4.0
dbs-boot-v0.4.0 refactors the create_fdt interface. It simplifies the
parameters needed to be passed and abstracts them into three structs.

It also reserves some interfaces for future features: numa
passthrough and cache passthrough.

Fixes: #6969

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-05-30 15:51:08 +08:00
xuejun-xj
fd9b414646 dragonball: update comment for init_microvm
Rewrite the comment of Vm::init_microvm method for aarch64.

Fixes cargo test warnings on aarch64.

Fixes: #6969

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-05-30 15:51:08 +08:00
Aurélien Bombo
af16d3fca4 gha: Unbreak CI and fix cluster creation step
This fixes the regression introduced by #6686 by properly injecting the
`--os-sku mariner --workload-runtime KataMshvVmIsolation` flags.

Error reference:
https://github.com/kata-containers/kata-containers/actions/runs/5111460297/jobs/9188819103

Fixes: #6982

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-05-29 13:32:47 -07:00
Zhongtao Hu
099b4b0d0e Merge pull request #6598 from Apokleos/sandbox_bind_mounts
runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts
2023-05-28 12:00:39 +08:00
Zhongtao Hu
cb962b0dc9 Merge pull request #6702 from Apokleos/directvol-common
runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.
2023-05-28 12:00:12 +08:00
Fabiano Fidêncio
44546a4a57 Merge pull request #6686 from sprt/sprt/mariner-ci
gha: Create Mariner host as part of k8s tests
2023-05-27 05:34:28 +02:00
alex.lyn
5ddc4f94c5 runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.
Move the get_volume_mount_info to kata-types/src/mount.rs.
This way, it becomes a common method of DirectVolumeMountInfo
and reduces duplicated code.

Fixes: #6701

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-26 11:18:29 +08:00
Fupan Li
25d2fb0fde agent: fix the issue of exec hang with a background process
When running an exec process in the background without a tty, the
exec hangs and never terminates.

For example:

crictl -i <container id> sh -c 'nohup tail -f /dev/null &'

Fixes: #4747

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2023-05-26 10:56:46 +08:00
Tim Zhang
5231aff90f Merge pull request #6860 from lifupan/main
netlink: Fix the issue of update_interface
2023-05-26 10:54:07 +08:00
Aurélien Bombo
4af4ced1aa gha: Create Mariner host as part of k8s tests
The current testing setup only supports running Kata on top of an Ubuntu
host. This adds Mariner to the matrix of testable hosts for k8s
tests, with Cloud Hypervisor as a VMM.

As preparation for the upcoming PR that will change only the actual test
code (rather than workflow YAMLs), this also introduces a new file
`setup.sh` that will be used to set host-specific parameters at test
run-time.

Fixes: #6961

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-05-25 14:29:46 -07:00
Fabiano Fidêncio
59cefa719c Merge pull request #6965 from fidencio/topic/gha-increase-aks-creation-waiting-time
gha: Increase timeout for AKS jobs and give more time to start running the tests
2023-05-25 17:23:17 +02:00
Greg Kurz
837f7a2fe6 Merge pull request #6959 from beraldoleal/issues/6757
runtime: sending SIGKILL to qemu
2023-05-25 16:24:37 +02:00
alex.lyn
eee7aae71d runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts
sandbox_bind_mounts supports several kinds of mount patterns, for example:

(1) "/path/to", default readonly mode.
(2) "/path/to:ro", same as (1).
(3) "/path/to:rw", readwrite mode.

Both configuration and annotation are supported:
(1) [runtime]
sandbox_bind_mounts=["/path/to", "/path/to:rw", "/mnt/to:ro"]
(2) annotation will also be supported, restricted as below:
io.katacontainers.config.runtime.sandbox_bind_mounts
                         = "/path/to /path/to:rw /mnt/to:ro"

Fixes: #6597

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-25 20:00:25 +08:00
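
The mount patterns listed in eee7aae71d can be parsed along these lines. This is a minimal Python sketch, not the runtime-rs implementation: `parse_bind_mount` and `parse_annotation` are hypothetical names, and rejecting any suffix other than `ro` or `rw` is an assumption. The three patterns and the space-separated annotation form come from the commit message.

```python
from typing import List, Tuple


def parse_bind_mount(entry: str) -> Tuple[str, bool]:
    """Return (path, read_only) for one sandbox_bind_mounts entry."""
    path, sep, mode = entry.rpartition(":")
    if not sep:
        # Pattern (1): no suffix means the default, read-only mode.
        return entry, True
    if mode == "ro":
        return path, True   # pattern (2)
    if mode == "rw":
        return path, False  # pattern (3)
    # Assumption: any other suffix is treated as an error.
    raise ValueError(f"unknown mount mode {mode!r} in {entry!r}")


def parse_annotation(value: str) -> List[Tuple[str, bool]]:
    """Parse the space-separated annotation form of sandbox_bind_mounts."""
    return [parse_bind_mount(entry) for entry in value.split()]
```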
Fupan Li
62b2838962 Merge pull request #6846 from ZhangShuaiyi/DeviceMgrMethod
dragonball: convert BlockDeviceMgr and VirtioNetDeviceMgr functions to methods
2023-05-25 18:11:44 +08:00
QuanweiZhou
377b7735f5 Merge pull request #6872 from justxuewei/rm-virtio-devices
dragonball: Remove virtio-net and vsock devices gracefully
2023-05-25 17:08:36 +08:00
Fabiano Fidêncio
3d5d6eb361 Merge pull request #6958 from fidencio/topic/kata-deploy-improve-backup-restore
kata-deploy: Improve shim backup / restore
2023-05-25 10:54:06 +02:00
Fabiano Fidêncio
3f0735a7e8 Merge pull request #6952 from stevenhorsman/git-clone-doc-fix
doc: Update git commands
2023-05-25 10:36:08 +02:00
Fabiano Fidêncio
557b840814 gha: aks: Wait longer to start running the tests
We're still facing issues related to the time taken to deploy the
kata-deploy daemonset and start running the tests.

Ideally, we should solve this with a readiness probe, and that's the
approach we want to take in the future.  However, for now, let's just
make sure those tests are not in the way of the community.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-25 10:13:19 +02:00
Fabiano Fidêncio
c04c872c42 gha: aks: Increase the timeout time
We've seen tests being aborted close to the end of the run due to the
timeout.  Let's increase it, to avoid hitting such cases again.

Fixes: #6964

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-25 10:13:08 +02:00
GabyCT
8d98484230 Merge pull request #6926 from GabyCT/topic/fixtabsmerge
kata-deploy: Fix indentation on kata deploy merge script
2023-05-24 14:55:51 -06:00
Fabiano Fidêncio
428041624a kata-deploy: Improve shim backup / restore
We're currently backing up and restoring all the possible shim files,
except the default one ("containerd-shim-kata-v2").

Let's ensure this is also backed up and restored.

Fixes: #6957

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-24 18:39:27 +02:00
Gabriela Cervantes
14c3f1e9f5 kata-deploy: Fix indentation on kata deploy merge script
This PR fixes the indentation on the kata deploy merge script,
which uses a tab instead of spaces.

Fixes #6925

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-05-24 16:01:10 +00:00
Beraldo Leal
0e47cfc4c7 runtime: sending SIGKILL to qemu
There is a race condition when virtiofsd is killed without finishing all
the clients. Because of that, when a pod is stopped, QEMU detects
virtiofsd is gone, which is legitimate.

Sending a SIGTERM first before killing could introduce some latency
during the shutdown.

Fixes #6757.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-05-24 11:31:28 -04:00
stevenhorsman
6a0035e419 doc: Update git commands
Fix bad migrations from `go get` to `git clone` and update the cloned
directory path

Fixes: #6951
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-24 13:16:48 +01:00
Fabiano Fidêncio
7c9faab523 Merge pull request #6947 from fidencio/topic/gha-release-fix-payload-tagging
gha: release: Simplify the process for tagging the payload
2023-05-24 11:22:09 +02:00
Fabiano Fidêncio
f636c1f8a4 gha: release: Simplify the process for tagging the payload
We previously were doing:
* Create a new image on kata-deploy-ci using the commit hash of the
  latest tag
  * This was used to test on AKS, which is no longer needed as we test
    on AKS on every PR
* Create a new image on kata-deploy using the release tag and "latest"
  or "stable", by tagging the kata-deploy-ci image accordingly

As part of cfe63527c5, we broke the
workflow described above, as in the first step we would save the PKG_SHA
to be used in the second step, but that part ended up being removed.

Anyways, this back and forth is not needed anymore and we can simplify
the process by doing:
* Create a new image on kata-deploy, using:
  - The tag received as ref from the event that triggered this workflow
  - "latest" or "stable" tag, depending on whether it's a stable release
    or not

Fixes: #6946

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-24 08:54:43 +02:00
Fabiano Fidêncio
01827911f4 Merge pull request #6943 from fidencio/topic/gha-login-dont-specify-the-registry-if-using-docker-io
gha: release: login-action: Don't specify docker.io registry
2023-05-24 07:33:12 +02:00
Fabiano Fidêncio
1c9ad4435a Merge pull request #6939 from GabyCT/topic/updatenydus
versions: Update nydus version to 2.2.1
2023-05-24 00:12:57 +02:00
Fabiano Fidêncio
d10c9be603 gha: release: login-action: Don't specify docker.io registry
For some bizarre reason, the login-action will simply fail to
authenticate to docker.io if it's specified as a registry.  The way to
proceed, instead, is to *not* specify any registry as it'd be used by
default.

Fixes: #6943

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-23 22:38:12 +02:00
Fabiano Fidêncio
9aae333343 Merge pull request #6871 from kmjohansen/bugfix/ptmx
runtime: make debug console work with sandbox_cgroup_only
2023-05-23 22:24:51 +02:00
Fabiano Fidêncio
df77fefce8 Merge pull request #6941 from fidencio/3.2.0-alpha3-branch-bump
# Kata Containers 3.2.0-alpha3
2023-05-23 22:21:03 +02:00
Fabiano Fidêncio
c54363114d release: Kata Containers 3.2.0-alpha3
- release: Fix `docker/login-action` version

f3702268d release: Fix `docker/login-action` version

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-23 18:39:16 +02:00
Fabiano Fidêncio
c7a77f980b Merge pull request #6935 from fidencio/topic/release-fix-docker-login-action-version
release: Fix `docker/login-action` version
2023-05-23 18:35:03 +02:00
Gabriela Cervantes
0b1c5ea5bb versions: Update nydus version to 2.2.1
This PR updates the nydus version to 2.2.1. This change includes:
nydus-image: fix a underflow issue in get_compressed_size()
backport fix/feature to stable 2.2
[backport] contrib: upgrade runc to v1.1.5
service: add README for nydus-service
nydus: fix a possible panic caused by SubCmdArgs::is_present
Backports two bugfixes from master into stable/v2.2
[backport stable/v2.2] action: upgrade golangci-lint to v1.51.2
[backport] action: fix smoke test for branch pattern

Fixes #6938

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-05-23 15:39:04 +00:00
Fabiano Fidêncio
f3702268d1 release: Fix docker/login-action version
`docker/login-action@v3` does *not* exist and `docker/login-action@v2`
should be used instead.

Fixes: #6934

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-23 14:11:03 +02:00
Fabiano Fidêncio
c82ac57e30 Merge pull request #6930 from fidencio/3.2.0-alpha2-branch-bump
# Kata Containers 3.2.0-alpha2
2023-05-23 13:50:58 +02:00
Linda Yu
433b5add4a kubernetes: add agnhost command in pod yaml
Fixes: #6928

Signed-off-by: Linda Yu <linda.yu@intel.com>
2023-05-23 18:11:45 +08:00
Fupan Li
170336517f Merge pull request #5441 from openanolis/device_manager_dev
runtime-rs: device manager for runtime-rs
2023-05-23 16:50:07 +08:00
Fabiano Fidêncio
fc09d0f5dd release: Kata Containers 3.2.0-alpha2
- Fix cache for OVMF and rootfs-initrd (both x86_64)
- Upgrade to Cloud Hypervisor v32.0
- osbuilder: Bump fedora image version
- local-build: Standardise what's set for the local build scripts
- gha: aks: Wait a little bit more before run the tests
- docs: Update container network model url
- gha: release: Fix s390x workflow
- cache: Fix OVMF caching
- gha: payload-after-push: Pass secrets down
- tools: Fix arch bug

22154e0a3 cache: Fix OVMF tarball name for different flavours
b7341cd96 cache: Use "initrd" as `initrd_type` to build rootfs-initrd
b8ffcd1b9 osbuilder: Bump fedora image version
636539bf0 kata-deploy: Use apt-key.gpg from k8s.io
ae24dc73c local-build: Standardise what's set for the local build scripts
35c3d7b4b runtime: clh: Re-generate the client code
cfee99c57 versions: Upgrade to Cloud Hypervisor v32.0
ad324adf1 gha: aks: Wait a little bit more before run the tests
191b6dd9d gha: release: Fix s390x workflow
cfd8f4ff7 gha: payload-after-push: Pass secrets down
75330ab3f cache: Fix OVMF caching
a89b44aab tools: Fix arch bug
11a34a72e docs: Update container network model url

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-23 09:06:44 +02:00
Fabiano Fidêncio
160d9aae4d Merge pull request #6918 from fidencio/topic/fix-cache-x86_64-ovmf-rootfs-initrd
Fix cache for OVMF and rootfs-initrd (both x86_64)
2023-05-22 21:34:56 +02:00
Zhongtao Hu
4719802c8d runtime-rs: add virtio-blk-mmio
add virtio-blk-mmio option for dragonball

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:58:10 +08:00
Zhongtao Hu
f9bded4484 runtime-rs: add devicetype enum
use device type to store the config information for different kinds of
devices

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:55:35 +08:00
Zhongtao Hu
6800d30fdb runtime-rs: remove device
Support removing devices after the container stops

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:54:22 +08:00
Zhongtao Hu
f16012a1eb runtime-rs: support linux device
support linux device in runtime-rs

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:54:13 +08:00
Zhongtao Hu
fe9ec67644 runtime-rs: block volume
support block volume in runtime-rs

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:54:04 +08:00
Zhongtao Hu
a8bfac90b1 runtime-rs: support block rootfs
support devmapper for block rootfs

Fixes: #5375

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:53:30 +08:00
Zhongtao Hu
b076d46db3 agent: handle hotplug virtio-mmio device
As dragonball supports hotplugging virtio-mmio devices, we should handle it in the agent

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:53:22 +08:00
Zhongtao Hu
6e273d6ccc runtime-rs: implement trait for vhost-user device
add the trait implementation for vhost-user device

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-05-23 00:53:16 +08:00
Zhongtao Hu
cc9c915384 runtime-rs: implement trait for vfio device
add the trait implementation for vfio device

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:53:10 +08:00
Archana Shinde
2c9efbe04c Merge pull request #6907 from likebreath/0519/clh_v32.0
Upgrade to Cloud Hypervisor v32.0
2023-05-22 09:53:05 -07:00
Zhongtao Hu
e4c5c74a75 runtime-rs: device manager
Support device manager for runtime-rs, add block device handler for
device manager

Fixes: #5375
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-05-23 00:53:04 +08:00
Fabiano Fidêncio
22154e0a3b cache: Fix OVMF tarball name for different flavours
75330ab3f9 tried to fix OVMF caching, but
didn't consider that the "vanilla" OVMF tarball name is not
"kata-static-ovmf-x86_64.tar.xz", but rather "kata-static-ovmf.tar.xz".

Missing that led to the cache builds of OVMF failing, and to
the need to build the component on every single PR.

Fixes: #6917 (hopefully for good this time).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-22 18:12:30 +02:00
Fabiano Fidêncio
b7341cd968 cache: Use "initrd" as initrd_type to build rootfs-initrd
We've been defaulting to "", which would lead to a mismatch with the
latest version from the cache, causing a miss, and finally having to
build the rootfs-initrd as part of the tests, every single time.

Fixes: #6917

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-22 18:12:30 +02:00
Fabiano Fidêncio
a28cefd538 Merge pull request #6924 from stevenhorsman/fedora-bump
osbuilder: Bump fedora image version
2023-05-22 18:10:57 +02:00
Fabiano Fidêncio
7f350d3ec6 Merge pull request #6913 from fidencio/topic/gha-build-and-upload-payload-can-silently-fail
local-build: Standardise what's set for the local build scripts
2023-05-22 18:04:51 +02:00
stevenhorsman
b8ffcd1b9b osbuilder: Bump fedora image version
- Swap out an EoL fedora image for the latest

Fixes: #6923
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-22 13:48:00 +01:00
Fabiano Fidêncio
636539bf0c kata-deploy: Use apt-key.gpg from k8s.io
We're facing some issues downloading / using the public key provided by
google for installing kubernetes as part of the kata-deploy image.
```
The following signatures couldn't be verified because the public key is
not available: NO_PUBKEY B53DC80D13EDEF05
Reading package lists... Done
W: GPG error: https://packages.cloud.google.com/apt kubernetes-xenial
   InRelease: The following signatures couldn't be verified because the
   public key is not available: NO_PUBKEY B53DC80D13EDEF05 E: The
   repository 'https://apt.kubernetes.io kubernetes-xenial InRelease' is
   not signed.
N: Updating from such a repository can't be done securely, and is
   therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user
   configuration details.
```

Let's work around this by following the suggestion made by @dims, at:
https://github.com/kubernetes/k8s.io/pull/4837#issuecomment-1446426585

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-22 11:06:01 +02:00
Fabiano Fidêncio
ae24dc73c1 local-build: Standardise what's set for the local build scripts
There's a discrepancy in what's set across the scripts used to build the
Kata Containers artefacts locally.

Some of those were missing a way to easily debug them in case a
failure happens, but one specific one (build-and-upload-payload.sh)
could actually silently fail.

All of those have been changed as part of this commit.

Fixes: #6908

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-22 08:36:01 +02:00
Steve Horsman
a2e69c5b66 Merge pull request #6906 from fidencio/topic/gh-aks-wait-a-little-more-before-start-the-tests
gha: aks: Wait a little bit more before run the tests
2023-05-20 08:01:20 +01:00
GabyCT
6796af511b Merge pull request #6890 from GabyCT/topic/fixurlvirt
docs: Update container network model url
2023-05-19 15:10:26 -06:00
Bo Chen
35c3d7b4bc runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v32.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #6632

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-05-19 12:49:45 -07:00
Bo Chen
cfee99c577 versions: Upgrade to Cloud Hypervisor v32.0
Details of this release can be found in our roadmap project as iteration
v32.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #6682

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-05-19 12:11:13 -07:00
Steve Horsman
98fa436627 Merge pull request #6904 from fidencio/topic/gha-fix-s390x-release-workflow
gha: release: Fix s390x workflow
2023-05-19 19:00:57 +01:00
Steve Horsman
d5355dee20 Merge pull request #6898 from fidencio/topic/fix-ovmf-caching
cache: Fix OVMF caching
2023-05-19 18:24:51 +01:00
Fabiano Fidêncio
dfa9301eac Merge pull request #6900 from fidencio/topic/gha-fix-payload-after-push
gha: payload-after-push: Pass secrets down
2023-05-19 17:23:00 +02:00
Fabiano Fidêncio
ad324adf1d gha: aks: Wait a little bit more before run the tests
fa832f4709 increased the timeout, which
helped a lot, mainly in the TEE machines.  However, we're still seeing
some failures here and there with the AKS tests.

Let's bump it yet again and, hopefully, those errors when starting the tests
will go away.

Fixes: #6905

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-19 16:40:35 +02:00
Fabiano Fidêncio
191b6dd9dd gha: release: Fix s390x workflow
GitHub is warning us that:
"""
The workflow is not valid. In .github/workflows/release.yaml (Line: 21,
Col: 11): Error from called workflow
kata-containers/kata-containers/.github/workflows/release-s390x.yaml@d2e92c9ec993f56537044950a4673e50707369b5
(Line: 14, Col: 12): Job 'kata-deploy' depends on unknown job
'create-kata-tarball'.
"""

This is happening as we need to reference
"build-kata-static-tarball-s390x" instead of "create-kata-tarball".

Fixes: #6903

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-19 16:21:49 +02:00
Fabiano Fidêncio
cfd8f4ff76 gha: payload-after-push: Pass secrets down
The "build-assets-${arch}" jobs need to have access to the secrets in
order to log into the container registry in the cases where
"push-to-registry", which is used to push the builder containers to
quay.io, is set to "yes".

Now that "build-assets-${arch}" passes the secrets down, we need to log
into the container registry in the "build-kata-static-tarball-${arch}"
files, in case "push-to-registry" is set to "yes".

Fixes: #6899

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-19 15:00:06 +02:00
Fabiano Fidêncio
7abae8ee9c Merge pull request #6896 from stevenhorsman/firecracker-arch-case
tools: Fix arch bug
2023-05-19 14:26:14 +02:00
Fabiano Fidêncio
75330ab3f9 cache: Fix OVMF caching
OVMF has been cached, but it's not been used from cache as the `version`
set in the cached builds has always been empty.

The reason for that is that we've been trying to look for
`externals.ovmf.ovmf.version`, while we should be actually looking for
`externals.ovmf.x86_64.version`.

Setting `x86_64` as the OVMF_FLAVOUR would cause another bug, as the
expected tarball name would then be `kata-static-x86_64.tar.xz`, instead
of `kata-static-ovmf-x86_64.tar.xz`.

With everything said, let's simplify the OVMF_FLAVOUR usage, by using it
as it's passed, and only adapting the tarball name for the TDVF case,
which is the abnormal one.

Fixes: #6897

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-19 14:00:39 +02:00
Fabiano Fidêncio
d2e92c9ec9 Merge pull request #6892 from fidencio/3.2.0-alpha1-branch-bump
# Kata Containers 3.2.0-alpha1
2023-05-19 12:31:33 +02:00
stevenhorsman
a89b44aabf tools: Fix arch bug
Fix mismatched case of `arch`

Fixes: #6895
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-19 09:28:22 +01:00
Fabiano Fidêncio
f527f614c1 release: Kata Containers 3.2.0-alpha1
- runtime: Use static_sandbox_resource_mgmt=true for TEEs
- update tokio dependency
- resource-control: fix setting CPU affinities on Linux
- runtime: use enable_vcpus_pinning from toml
- gha: k8s: Make the tests more reliable
- gha: Enable SEV-SNP tests on main
- gha: tdx: Use the k3s overlay for kata-cleanup
- runtime: Port sev package to main
- gpu: Rename the last bits from `gpu` to `nvidia-gpu`
- deploy: fix shell script error
- ppc64le: switch virtiofsd from C to rust version
- osbuilder: Fix indentation in rootfs.sh
- virtcontainers/qemu_test.go: Improve coverage
- agent: Add context to errors that may occur when AgentConfig file is …
- virtcontainers/pkg/compatoci/: Improved coverage for Kata 2.0
- kata-manager: Fix '-o' syntax and logic error
- kata-ctl:  Add the option to install kata-ctl to a user specified directory
- runtime-rs: fix building instructions to use correct required Rust ve…
- Dragonball: use LinuxBootConfigurator::write_bootparams
- kata-deploy: Add http_proxy as part of the docker build
- kata-deploy: Do not ship the kata tarball
- kata-deploy: Build improvements
- deploy: Fix arch in image tag
- Revert "kata-deploy: Use readinessProbe to ensure everything is ready"
- virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
- main | release: Fix multi-arch publishing is not supported
- cache: More fixes to nvidia-gpu kernels caching
- runtime: remove overriding ARCH value by default for ppc64le
- gha: Fix Body Line Length action flagging empty body commit messages
- gha: Fix snap creation workflow
- cache: Fix nvidia-gpu version
- cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu
- packaging: Add SEV-SNP artifacts to main
- docs: Mark snap installation method as unmaintained
- packaging: Add sev artifacts to main
- kata-ctl: add generic kvm check & unit test
- Log-parser-rs
- warning_fix: fix warnings when build with cargo-1.68.0
- cross-compile: Include documentation and configuration for cross-compile
- runtime: Fix virtiofs fd leak
- gpu: cold plug VFIO devices
- pkg/signals: Improved test coverage 60% to 100%
- virtcontainers/persist: Improved test coverage 65% to 87.5%
- virtcontainers/clh_test.go: improve unit test coverage
- virtcontainers/factory: Improved test coverage
- gha: Also run k8s tests on qemu-snp
- gha: sev: fix for kata-deploy error
- gha: Also run k8s tests on qemu-sev
- Implement the "kata-ctl env" command
- runtime-rs: support keep_abnormal in toml config
- gpu: Build and Ship a GPU enabled Kernel
- kata-ctl: checks for kvm, kvm_intel modules loaded
- osbuilder: Fix D-Bus enabling in the dracut case
- snap: fix docker start fail issue
- kata-manager: Fix containerd download
- agent: Fix ut issue caused by fd double closed
- Bump ttrpc to 0.7.2 and protobuf to 3.2.0
- gpu: Add GPU enabled configuration and runtime
- gpu: Do not pass-through PCI (Host) Bridges
- cache-components: Fix caching of TDVF and QEMU for TDX
- gha: tdx: Ensure kata-deploy is removed after the tests run
- versions: Upgrade to Cloud Hypervisor v31.0
- osbuilder: Enable dbus in the dracut case
- runtime: Don't create socket file in /run/kata
- nydus_rootfs/prefetch_files: add prefetch_files for RAFS
- runtime-rs/virtio-fs: add support extra handler for cache mode.
- runtime-rs: enable nerdctl to setup cni plugin
- tdx: Add artefacts from the latest TDX tools release into main
- runtime: support non-root for clh
- gha: ci-on-push: Run k8s tests with dragonball
- rustjail: Use CPUWeight with systemd and CgroupsV2
- gha: k8s-on-aks: {create,delete} AKS must be a coded-in step
- docs: update the rust version from version.yaml
- gha: k8s-on-aks: Set {create,delete}_aks as steps
- gha: k8s-on-aks: Fix cluster name
- gha: Also run k8s tests on AKS with dragonball
- gha: Only push images to registry after merging a PR
- gha: aks: Use D4s_v5 instance
- tools: Avoid building the kernel twice
- rustjail: Fix panic when cgroup manager fails
- runtime: add filter metrics with specific names
- gha: Use ghcr.io for the k8s CI
- GHA |Switch "kubernetes tests" from jenkins to GitHub actions
- docs: Update CNM url in networking document
- kata-ctl: add function to get platform protection.

f6e1b1152 agent: update tokio dependency
4cb83dc21 kata-ctl: update tokio dependency
df615ff25 runk: update tokio dependency
ca6892ddb runtime-rs: update tokio dependency
ca1531fe9 runtime: Use static_sandbox_resource_mgmt=true for TEEs
fa832f470 gha: k8s: Make the tests more reliable
cbb9fe8b8 config: Use standard OVMF with SEV
724437efb kata-deploy: add kata-qemu-sev runtimeclass
521dad2a4 Tests: skip CPU constraints test on SEV and SNP
72308ddb0 gha: ci-on-push: Don't skip tests for SEV
da0f92cef gha: ci-on-push: Don't skip tests for SEV-SNP
12f43bea0 gha: tdx: Use the k3s overlay for kata-cleanup
1a3f8fc1a deploy: fix shell script error
87cb98c01 osbuilder: Fix indentation in rootfs.sh
c5a59caca ppc64le: switch virtiofsd from C to rust version
bfdf0144a versions: Bump virtiofsd to 1.6.1
dd7562522 runtime: pkg/sev: Add kbs utility package for SEV pre-attestation
05de7b260 runtime: Add sev package
3a9d3c72a gpu: Rename the last bits from `gpu` to `nvidia-gpu`
4cde844f7 local-build: Fix kernel-nvidia-gpu target name
593840e07 kata-ctl: Allow INSTALL_PATH= to be specified
bdb75fb21 runtime: use enable_vcpus_pinning from toml
20cb87508 virtcontainers/qemu_test.go: Improve test coverage
b9a1db260 kata-deploy: Add http_proxy as part of the docker build
3e85bf5b1 resource-control: fix setting CPU affinities on Linux
5f3f844a1 runtime-rs: fix building instructions with respect to required Rust version
777c3dc8d kata-deploy: Do not ship the kata tarball
50cc9c582 tests: Improve coverage for virtcontainers/pkg/compatoci/ for Kata 2.0
136e2415d static-build: Download firecracker instead of building it
3bf767cfc static-build: Adjust ARCH for nydus
ac88d34e0 static-build: Use released binary for CLH (aarch64)
73913c8eb kata-manager: Fix '-o' syntax and logic error
2856d3f23 deploy: Fix arch in image tag
e8f81ee93 Revert "kata-deploy: Use readinessProbe to ensure everything is ready"
cfe63527c release: Fix multi-arch publishing is not supported
197c33651 Dragonball: use LinuxBootConfigurator::write_bootparams to writes the boot parameters into guest memory.
4d17ea4a0 cache: Fix nvidia-snp caching version
a133fadbf cache: Fix nvidia-gpu-tdx-experimental cache URL
b9990c201 cache: Fix nvidia-gpu version
c9bf7808b cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu
3665b4204 gpu: Rename `gpu` targets to `nvidia-gpu`
2c90cac75 local-build: fixup alphabetization
4da6eb588 kata-deploy: Add qemu-snp shim
14dd05375 kata-deploy: add kata-qemu-snp runtimeclass
0bb37bff7 config: Add SNP configuration
af7f2519b versions: update SEV kernel description
dbcc3b5cc local-build: fix default values for OVMF build
b8bbe6325 gha: build OVMF for tests and release
cf0ca265f local-build: Add x86_64 OVMF target
db095ddeb cache: add SNP flavor to comments
f4ee00576 gha: Build and ship QEMU for SNP
7a58a91fa docs: update SNP guide
879333bfc versions: update SNP QEMU version
38ce4a32a local-build: add support to build QEMU for SEV-SNP
5f8008b69 kata-ctl: add unit test for kvm check
a085a6d7b kata-ctl: add generic kvm check
772d4db26 gha: Build and ship SEV initrd
45fa36692 gha: Build and ship SEV OVMF
4770d3064 gha: Build and ship SEV kernel.
fb9c1fc36 runtime: Add qemu-sev config
813e4c576 runtimeClasses: add sev runtime class
af18806a8 static-build: Add caching support to sev ovmf
76ae7a3ab packaging: adding caching capability for kernel
12c5ef902 packaging: add support to build OVMF for SEV
b87820ee8 packaging: add support to build initrd for sev
e1f3b871c docs: Mark snap installation method as unmaintained
022a33de9 agent: Add context to errors when AgentConfig file is missing
b0e6a094b packaging: Add sev kernel build capability
a4c0303d8 virtcontainers: Fixed static checks for improved test coverage for fc.go
8495f830b cross-compile: Include documentation and configuration for cross-compile
13d7f39c7 gpu: Check for VFIO port assignments
6594a9329 tools: made log-parser-rs
03a8cd69c virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
9e2b7ff17 gha: sev: fix for kata-deploy error
5c9246db1 gha: Also run k8s tests on qemu-snp
c57a44436 gha: Add the ability to test qemu-snp
406419289 env: Utilize arch specific functionality to get cpu details
fb40c71a2 env: Check for root privileges
1016bc17b config: Add api to fetch config from default config path
b908a780a kata-env: Pass cmd option for file path
b1920198b config: Workaround the way agent and hypervisor configs are fetched
f2b2621de kata-env: Implement the kata-env command.
c849bdb0a gha: Also run k8s tests on qemu-sev
6bf1fc605 virtcontainers/factory: Improved test coverage
0d49ceee0 gha: Fix snap creation workflow warnings
138ada049 gpu: Cold Plug VFIO toml setting
defb64334 runtime: remove overriding ARCH value by default for ppc64le
f7ad75cb1 gpu: Cold-plug extend the api.md
0fec2e698 gpu: Add cold-plug test
f2ebdd81c utils: Get rid of spurious print statement left behind.
9a94f1f14 make: Export VERSION and COMMIT
2f81f48da config: Add file under /opt as another location to look for the config
07f7d17db config: Make the pipe_size field optional
68f635773 config: Make function to get the default conf file public
7565b3356 kata-ctl: Implement Display trait for GuestProtection enum
94a00f934 utils: Make certain constants in utils.rs public
572b338b3 gitignore: Ignore .swp and .swo editor backup files
376884b8a cargo: Update version of clap to 4.1.13
17daeb9dd warning_fix: fix warnings when build with cargo-1.68.0
521519d74 gha: Add the ability to test qemu-sev
205909fbe runtime: Fix virtiofs fd leak
5226f15c8 gha: Fix Body Line Length action flagging empty body commit messages
0f45b0faa virtcontainers/clh_test.go: improve unit test coverage
dded731db gpu: Add OVMF setting for MMIO aperture
2a830177c gpu: Add fwcfg helper function
131f056a1 gpu: Extract VFIO Functions to drivers
c8cf7ed3b gpu: Add ColdPlug of VFIO devices with devManager
e2b5e7f73 gpu: Add Rawdevices to hypervisor
6107c32d7 gpu: Assign default value to cold-plug
377ebc2ad gpu: Add configuration option for cold-plug VFIO
c18ceae10 gpu: Add new struct PCIePort
9c38204f1 virtcontainers/persist: Improved test coverage 65% to 87.5%
1c1ee8057 pkg/signals: Improved test coverage 60% to 100%
cc8ea3232 runtime-rs: support keep_abnormal in toml config
96e8470db kata-manager: Fix containerd download
432d40744 kata-ctl: checks for kvm, kvm_intel modules loaded
b1730e4a6 gpu: Add new kernel build option to usage()
3e7b90226 osbuilder: Fix D-Bus enabling in the dracut case
53c749a9d agent: Fix ut issue caused by fd double closed
2e3f19af9 agent: fix clippy warnings caused by protobuf3
4849c56fa agent: Fix unit test issue caused by protobuf upgrade
0a582f781 trace-forwarder: remove unused crate protobuf
73253850e kata-ctl: remove unused crate ttrpc
76d2e3054 agent-ctl: Bump ttrpc from 0.6.0 to 0.7.1
eb3d20dcc protocols: Add ut for Serde
59568c79d protocols: add support for Serde
a6b4d92c8 runtime-rs: Bump ttrpc from 0.6.0 to 0.7.1
ac7c63bc6 gpu: Add containerd shim for qemu-gpu
a0cc8a75f gpu: Add a kube runtime class
a81fff706 gpu: Adding a GPU enabled configuration
8af6fc77c agent: Bump ttrpc from 0.6.0 to 0.7.1
009b42dbf protocols: Fix unit test
392732e21 protocols: Bump ttrpc from 0.6.0 to 0.7.1
f4f958d53 gpu: Do not pass-through PCI (Host) Bridges
825e76948 gpu: Add GPU support to default kernel without any TEE
e4ee07f7d gpu: Add GPU TDX experimental kernel
a1272bcf1 gha: tdx: Fix typo overlay -> overlays
3fa0890e5 cache-components: Fix TDVF caching
80e3a2d40 cache-components: Fix TDX QEMU caching
87ea43cd4 gpu: Add configuration fragment
aca6ff728 gpu: Build and Ship a GPU enabled Kernel
dc662333d runtime: Increase the dial_timeout
eb1762e81 osbuilder: Enable dbus in the dracut case
f478b9115 clh: tdx: Update timeouts for confidential guest
3b76abb36 kata-deploy: Ensure node is ready after CRI Engine restart
5ec9ae0f0 kata-deploy: Use readinessProbe to ensure everything is ready
ea386700f kata-deploy: Update podOverhead for TDX
e31efc861 gha: tdx: Use the k3s overlay
542bb0f3f gha: tdx: Set KUBECONFIG env at the job level
d7fdf19e9 gha: tdx: Delete kata-deploy after the tests finish
da35241a9 tests: k8s: Skip k8s-cpu-ns when testing TDX
db2cac34d runtime: Don't create socket file in /run/kata
6d315719f snap: fix docker start fail issue
e4b3b0887 gpu: Add proper CONFIG_LOCALVERSION depending on TEE
69ba2098f runtime-rs: remove network entities and netns
b31f103d1 runtime-rs: enable nerdctl cni plugin
69d7a959c gha: ci-on-push: Run tests on TDX
5a0727ecb kata-deploy: Ship kata-qemu-tdx runtimeClass
98682805b config: Add configuration for QEMU TDX
3e1580019 govmm: Directly pass the firmware using -bios with TDX
3c5ffb0c8 govmm: Set "sept-ve-disable=on"
ed145365e runtime/qemu: Drop "kvm-type=tdx"
25b3cdd38 virtcontainers: Drop check for the `tdx` CPU flag
01bdacb4e virtcontainers: Also check /sys/firmwares/tdx for TDX
9feec533c cache: Add ability to cache OVMF
ce8d98251 gha: Build and ship the OVMF for TDX
39c3fab7b local-build: Add support to build OVMF for TDX
054174d3e versions: Bump OVMF for TDX
800fb49da packaging: Add get_ovmf_image_name() helper
fbf03d7ac cache: Document kernel-tdx-experimental
5d79e9696 cache: Add a space to ease the reading of the kernel flavours
6e4726e45 cache: Fix typos
fc22ed0a8 gha: Build and ship the Kernel for TDX
502844ced local-build: Add support to build Kernel for TDX
b2585eecf local-build: Avoid code duplication building the kernel
f33345c31 versions: Update Kernel TDX version
20ab2c242 versions: Move Kernel TDX to its own experimental entry
3d9ce3982 cache: Allow specifying the QEMU_FLAVOUR
33dc6c65a gha: Build and ship QEMU for TDX
eceaae30a local-build: Add support to build QEMU for TDX
f7b7c187e static-build: Improve qemu-experimental build script
3018c9ad5 versions: Update QEMU TDX version
800ee5cd8 versions: Move QEMU TDX to its own experimental entry
1315bb45f local-build: Add dragonball kernel to the `all` target
73e108136 local-build: Rename non vanilla kernel build functions
1d851b4be local-build: Cosmetic changes in build targets
49ce685eb gha: k8s-on-aks: Always delete the AKS cluster
e2a770df5 gha: ci-on-push: Run k8s tests with dragonball
d1f550bd1 docs: update the rust version from versions.yaml
f3595e48b nydus_rootfs/prefetch_files: add prefetch_files for RAFS
3bfaafbf4 fix: oci hook
c1fbaae8d rustjail: Use CPUWeight with systemd and CgroupsV2
375187e04 versions: Upgrade to Cloud Hypervisor v31.0
79f3047f0 gha: k8s-on-aks: {create,delete} AKS must be a coded-in step
2f35b4d4e gha: ci-on-push: Only run on `main` branch
e7bd2545e Revert "gha: ci-on-push: Depend on Commit Message Check"
0d96d4963 Revert "gha: ci-on-push: Adjust to using workflow_run"
c7ee45f7e Revert "gha: ci-on-push: Adapt chained jobs to workflow_run"
5d4d72064 Revert "gha: k8s-on-aks: Fix cluster name"
13d857a56 gha: k8s-on-aks: Set {create,delete}_aks as steps
dc6569dbb runtime-rs/virtio-fs: add support extra handler for cache mode.
85cc5bb53 gha: k8s-on-aks: Fix cluster name
1688e4f3f gha: aks: Use D4s_v5 instance
108d80a86 gha: Add the ability to also test Dragonball
2550d4462 gha: build-kata-static-tarball: Only push to registry after merge
e81b8b8ee local-build: build-and-upload-payload is not quay.io specific
13929fc61 gha: publish-kata-deploy-payload: Improve registry login
41026f003 gha: payload-after-push: Pass registry / repo as inputs
7855b4306 gha: ci-on-push: Adapt chained jobs to workflow_run
3a760a157 gha: ci-on-push: Adjust to using workflow_run
a159ffdba gha: ci-on-push: Depend on Commit Message Check
8086c75f6 gha: Also run k8s tests on AKS with dragonball
fe86c08a6 tools: Avoid building the kernel twice
3215860a4 gha: Set ci-on-push to run on `pull_request_target`
d17dfe4cd gha: Use ghcr.io for the k8s CI
b661e0cf3 rustjail: Add anyhow context for D-Bus connections
60c62c3b6 gha: Remove kata-deploy-test.yaml
43894e945 gha: Remove kata-deploy-push.yaml
cab9ca043 gha: Add a CI pipeline for Kata Containers
53b526b6b gha: k8s: Add snippet to run k8s tests on aks clusters
c444c24bc gha: aks: Add snippets to create / delete aks clusters
11e0099fb tests: Move k8s tests to this repo
73be4bd3f gha: Update actions for release.yaml
d38d7fbf1 gha: Remove code duplication from release.yaml
56331bd7b gha: Split payload-after-push-*.yaml
a552a1953 docs: Update CNM url in networking document
7796e6ccc rustjail: Fix minor grammatical error in function name
41fdda1d8 rustjail: Do not unwrap potential error with cgroup manager
a914283ce kata-ctl: add function to get platform protection.
0f7351556 runtime: add filter metrics with specific names
cbe6ad903 runtime: support non-root for clh
d3bb25418 utils: Add function to check vhost-vsock

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-19 09:26:36 +02:00
Fabiano Fidêncio
0364620844 Merge pull request #6819 from fidencio/topic/use-static-sandbox-resource-mgmt-for-TEEs
runtime: Use static_sandbox_resource_mgmt=true for TEEs
2023-05-18 22:38:31 +02:00
Fabiano Fidêncio
2ea8acaaa5 Merge pull request #6882 from bergwolf/github/tokio
update tokio dependency
2023-05-18 20:35:16 +02:00
Krister Johansen
eff6ed2d5f runtime: make debug console work with sandbox_cgroup_only
If a hypervisor debug console is enabled and sandbox_cgroup_only is set,
the hypervisor can fail to open /dev/ptmx, which prevents the sandbox
from launching.

This is caused by the absence of a device cgroup entry to allow access
to /dev/ptmx.  When sandbox_cgroup_only is not set, the hypervisor
inherits the default unrestricted device cgroup, but with it enabled it
runs into allow / deny list restrictions.

Fix by adding an allowlist entry for /dev/ptmx when debug is enabled,
sandbox_cgroup_only is true, and no /dev/ptmx is already in the list of
devices.

Fixes: #6870

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
2023-05-18 10:36:24 -07:00
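The condition described in this commit can be sketched as follows. This is only an illustration in Python, not the runtime's actual code: `DeviceRule` and `ensure_ptmx_allowed` are hypothetical names, and the rule fields are assumptions made for the example.

```python
from dataclasses import dataclass

PTMX_PATH = "/dev/ptmx"


@dataclass(frozen=True)
class DeviceRule:
    """Hypothetical stand-in for one device cgroup allowlist entry."""
    path: str
    allow: bool = True
    access: str = "rwm"


def ensure_ptmx_allowed(devices, debug, sandbox_cgroup_only):
    """Return a copy of the device list, adding /dev/ptmx only when needed.

    An entry is appended only if the debug console is enabled,
    sandbox_cgroup_only is set, and /dev/ptmx is not already listed.
    """
    rules = list(devices)
    if not (debug and sandbox_cgroup_only):
        return rules
    if any(rule.path == PTMX_PATH for rule in rules):
        return rules
    rules.append(DeviceRule(PTMX_PATH))
    return rules
```

All three conditions from the commit message have to hold before the list changes, so a configuration that already lists /dev/ptmx is left untouched.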
Gabriela Cervantes
11a34a72e2 docs: Update container network model url
This PR updates the container network model url that is part of the
virtcontainers documentation.

Fixes #6889

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-05-18 15:08:08 +00:00
Peng Tao
f6e1b1152c agent: update tokio dependency
Update to 1.28.1 to bring in the latest fixes.

Fixes: #6881
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-05-18 09:36:06 +00:00
Shuaiyi Zhang
c477ac551f dragonball: Convert VirtioNetDeviceMgr function to method
Convert VirtioNetDeviceMgr::insert_device and
VirtioNetDeviceMgr::update_device_ratelimiters to methods.

Fixes: #6880

Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>
2023-05-18 16:57:01 +08:00
Shuaiyi Zhang
4659facb74 dragonball: Convert BlockDeviceMgr function to method
Convert BlockDeviceMgr::insert_device, BlockDeviceMgr::remove_device
and BlockDeviceMgr::update_device_ratelimiters to methods.

Fixes: #6880

Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>
2023-05-18 16:56:49 +08:00
Peng Tao
4cb83dc219 kata-ctl: update tokio dependency
Update to 1.28.1 to pick up the latest fixes.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-05-18 08:25:13 +00:00
Peng Tao
df615ff252 runk: update tokio dependency
Update to 1.28.1 to pick up latest fixes.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-05-18 08:24:41 +00:00
Peng Tao
ca6892ddb1 runtime-rs: update tokio dependency
Unify it to the latest 1.28.1 version.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-05-18 08:18:22 +00:00
Fabiano Fidêncio
3a4b924226 Merge pull request #6833 from rye-stripe/bugfix/vcpu-pinning
resource-control: fix setting CPU affinities on Linux
2023-05-18 08:12:39 +02:00
Xuewei Niu
ee6deef09d dragonball: Remove virtio-net and vsock devices gracefully
This MR implements graceful removal of virtio-net and virtio-vsock devices when
shutting down the VMM.

Fixes: #6684

Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-05-18 12:11:20 +08:00
Fabiano Fidêncio
e762f70920 Merge pull request #6838 from rye-stripe/bugfix/use-enable-vcpus-pinning-from-toml
runtime: use enable_vcpus_pinning from toml
2023-05-17 21:30:44 +02:00
Fabiano Fidêncio
ca1531fe9d runtime: Use static_sandbox_resource_mgmt=true for TEEs
When this option is enabled the runtime will attempt to determine the
appropriate sandbox size (memory, CPU) before booting the virtual
machine.

As TEEs do not support memory and CPU hotplug, this approach must be
used.

Fixes: #6818

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-17 19:21:52 +02:00
Fabiano Fidêncio
851b97fa51 Merge pull request #6866 from fidencio/topic/gha-improve-actions
gha: k8s: Make the tests more reliable
2023-05-17 19:19:18 +02:00
Fabiano Fidêncio
8ce14e709a Merge pull request #6810 from fitzthum/snp-enable
gha: Enable SEV-SNP tests on main
2023-05-17 15:29:54 +02:00
Greg Kurz
206df04b99 Merge pull request #6858 from fidencio/topic/gha-tdx-fix-cleanup
gha: tdx: Use the k3s overlay for kata-cleanup
2023-05-17 15:04:56 +02:00
Wainer Moschetta
259158f1c3 Merge pull request #6789 from dubek/add-sev-package
runtime: Port sev package to main
2023-05-17 10:02:19 -03:00
Fabiano Fidêncio
fa832f4709 gha: k8s: Make the tests more reliable
Whether we like it or not, every now and then we'll have to deal with flaky
tests, and our tests using GHA are not exempt from that fact.

With this simple commit, we're trying to improve the reliability of the
tests on a few different fronts:

* Giving enough time for the script used by kata-deploy to be executed
  * We've hit issues as the kata-deploy pod is considered "Ready" at the
    moment it starts running, not when it finishes the needed setup. We
    should also be looking into how to solve this on the kata-deploy side
    but, for now, let's ensure our tests do not break with the current
    kata-deploy behavior.

* Merging the "Deploy kata-deploy" and "Run tests" steps
  * We've hit issues re-running tests and seeing even more failures than
    the ones we're trying to debug, as a step is simply treated as
    succeeded during the re-run if it was successfully executed
    as part of the first run.  This causes issues with the kata-deploy
    deployment, as the tests would start running before even having the
    node set up for running Kata Containers.

Fixes: #6865 #6649

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-17 13:38:08 +02:00
Tobin Feldman-Fitzthum
cbb9fe8b81 config: Use standard OVMF with SEV
The AmdSev firmware package should be used with
measured direct boot. If the expected hashes are not
injected into the firmware binary by the VMM, the
guest will not boot. This is required for security.

Currently the main branch does not have the extended
shim support for SEV, which tells the VMM to inject
the expected hashes.

We ship the standard OVMF package to use with SNP,
so let's switch SEV to that for now. This will need
to be changed back when shim support for SEV(-ES)
is added to main.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-17 11:36:04 +02:00
Tobin Feldman-Fitzthum
724437efb3 kata-deploy: add kata-qemu-sev runtimeclass
In order to populate containerd config file with
support for SEV, we need to add the qemu-sev shim
to the kata-deploy script.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-17 11:36:02 +02:00
Tobin Feldman-Fitzthum
521dad2a47 Tests: skip CPU constraints test on SEV and SNP
Currently Kata does not support memory / CPU hotplug for SEV or
SEV-SNP so we need to skip tests that rely on it.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-17 11:35:13 +02:00
Tobin Feldman-Fitzthum
72308ddb07 gha: ci-on-push: Don't skip tests for SEV
Now that SEV artifacts are built by GHA, remove
conditional that skips tests when using qemu-sev.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-17 11:35:13 +02:00
Tobin Feldman-Fitzthum
da0f92cef8 gha: ci-on-push: Don't skip tests for SEV-SNP
Now that we have SNP artifacts in place and they are built via gha,
remove the condition that skips the tests for SNP.

Fixes: #6809

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-17 11:35:13 +02:00
fupan
2bda92face netlink: Fix the issue of update_interface
When updating an interface, there may be an existing interface
whose name is the same as the requested new name, in which case
the update fails with an "interface name exists" error. We should
therefore rename the existing interface to a temporary name first
and, as the last step, swap it over to the previous interface
name.

Fixes: #6842

Signed-off-by: fupan <fupan.lfp@antgroup.com>
2023-05-17 16:45:49 +08:00
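The rename-and-swap order described in this commit can be illustrated with a small Python sketch. It is not the agent's netlink code: interfaces are modelled as a plain dict from name to device id, and `rename_with_swap` and the `_tmp` suffix are hypothetical choices made for the example.

```python
def rename_with_swap(interfaces, old_name, new_name):
    """Rename old_name to new_name, swapping names if new_name is taken.

    `interfaces` maps interface name -> device id. A new dict is returned;
    the input is left unchanged.
    """
    result = dict(interfaces)
    if old_name == new_name:
        return result
    if old_name not in result:
        raise KeyError(old_name)

    clash = new_name in result
    tmp_name = f"{new_name}_tmp"  # hypothetical temporary name
    if clash:
        # Step 1: move the interface that holds the wanted name out of the way.
        result[tmp_name] = result.pop(new_name)

    # Step 2: the rename that would otherwise fail with "name exists".
    result[new_name] = result.pop(old_name)

    if clash:
        # Step 3 (last): give the displaced interface the previous name.
        result[old_name] = result.pop(tmp_name)
    return result
```

For example, `rename_with_swap({"eth0": 1, "eth1": 2}, "eth1", "eth0")` leaves device 2 named `eth0` and device 1 named `eth1`. The sketch assumes the temporary name is itself unused.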
Fabiano Fidêncio
12f43bea0f gha: tdx: Use the k3s overlay for kata-cleanup
As the TDX CI runs on k3s, we must ensure the cleanup, as already done
for the deploy, uses the k3s overlay.

Fixes: #6857

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-17 09:50:29 +02:00
Fabiano Fidêncio
9630c13ac0 Merge pull request #6845 from fidencio/topic/yet-more-nvidia-gpu-naming-fixes
gpu: Rename the last bits from `gpu` to `nvidia-gpu`
2023-05-17 09:05:12 +02:00
Steve Horsman
e4a458035c Merge pull request #6852 from stevenhorsman/container-image-arch-consistency
deploy: fix shell script error
2023-05-17 08:01:39 +01:00
Amulya Meka
3ccc29030d Merge pull request #6780 from Amulyam24/rust-virtfs
ppc64le: switch virtiofsd from C to rust version
2023-05-17 09:36:28 +05:30
GabyCT
e0e46de12d Merge pull request #6849 from GabyCT/topic/fixtabs
osbuilder: Fix indentation in rootfs.sh
2023-05-16 16:47:09 -06:00
stevenhorsman
1a3f8fc1a2 deploy: fix shell script error
- Remove local introduced by bad copy-paste

Fixes: #6814
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-16 19:30:32 +01:00
Salvador Fuentes
b76058c979 Merge pull request #6721 from nedsouza/virtcontainers-qemu-go-coverage
virtcontainers/qemu_test.go: Improve coverage
2023-05-16 11:11:43 -06:00
Feng Wang
ebc8e8e2fd Merge pull request #6773 from jepio/agent-config-error-context
agent: Add context to errors that may occur when AgentConfig file is …
2023-05-16 09:21:34 -07:00
Gabriela Cervantes
87cb98c01d osbuilder: Fix indentation in rootfs.sh
This PR replaces single spaces with tabs in order to fix the
indentation of the rootfs script.

Fixes #6848

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-05-16 15:30:50 +00:00
James O. D. Hunt
a96fcfd5be Merge pull request #6735 from nedsouza/258/tests-coverage-compatoci
virtcontainers/pkg/compatoci/: Improved coverage for Kata 2.0
2023-05-16 15:36:35 +01:00
Amulyam24
c5a59caca1 ppc64le: switch virtiofsd from C to rust version
We have been using the C version of virtiofsd on ppc64le. Now that the issues with
the Rust virtiofsd have been fixed, let's switch to it.

Fixes: #4259

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-05-16 14:46:19 +02:00
Amulyam24
bfdf0144aa versions: Bump virtiofsd to 1.6.1
virtiofsd v1.6.1 has been released with the fixes required for running
successfully on ppc64le.

Fixes: #4259

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-05-16 14:46:16 +02:00
Dov Murik
dd7562522a runtime: pkg/sev: Add kbs utility package for SEV pre-attestation
Supports both online and offline modes of interaction with simple-kbs
for SEV/SEV-ES confidential guests.

Fixes: #6795

Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
2023-05-16 15:27:32 +03:00
Dov Murik
05de7b2607 runtime: Add sev package
The sev package provides utilities for launching AMD SEV and SEV-ES
confidential guests.

Fixes: #6795

Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
2023-05-16 15:27:32 +03:00
Fabiano Fidêncio
3a9d3c72aa gpu: Rename the last bits from gpu to nvidia-gpu
Let's specifically name the `gpu` runtime class as `nvidia-gpu`.  By
doing this we keep the door open and ease the life of the next vendor
adding GPU support for Kata Containers.

Fixes: #6553

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-16 13:47:52 +02:00
Fabiano Fidêncio
4cde844f70 local-build: Fix kernel-nvidia-gpu target name
It must have `-tarball` as part of its name.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-16 13:34:52 +02:00
Archana Shinde
8d10d157b3 Merge pull request #6823 from jodh-intel/utils-kata-manager-containerd-fix
kata-manager: Fix '-o' syntax and logic error
2023-05-15 21:44:35 -07:00
Bin Liu
47a02dcc7f Merge pull request #6767 from ngpatel6/Issue-5403
kata-ctl:  Add the option to install kata-ctl to a user specified directory
2023-05-16 10:43:40 +08:00
Chao Wu
911d8a5a7f Merge pull request #6804 from pmores/fix-rust-version-in-docs
runtime-rs: fix building instructions to use correct required Rust ve…
2023-05-16 10:14:05 +08:00
Bin Liu
2cd2d02d1f Merge pull request #6812 from ZhangShuaiyi/dev/write_bootparams
Dragonball: use LinuxBootConfigurator::write_bootparams
2023-05-16 09:54:41 +08:00
GabyCT
3d8185863d Merge pull request #6835 from GabyCT/topic/buildkataproxy
kata-deploy: Add http_proxy as part of the docker build
2023-05-15 16:15:27 -06:00
Narendra Patel
593840e075 kata-ctl: Allow INSTALL_PATH= to be specified
Update the kata-ctl install rule to allow it to be installed to a given directory

The Makefile was updated to use an INSTALL_PATH variable to track where the
kata-ctl binary should be installed.  If the user doesn't specify anything,
then it uses the default path that cargo uses.  Otherwise, it will install it
in the directory that the user specified.  The README.md file was also updated
to show how to use the new option.

Fixes #5403

Co-authored-by: Cesar Tamayo <cesar.tamayo@intel.com>
Co-authored-by: Kevin Mora Jimenez <kevin.mora.jimenez@intel.com>
Co-authored-by: Narendra Patel <narendra.g.patel@intel.com>
Co-authored-by: Ray Karrenbauer <ray.karrenbauer@intel.com>
Co-authored-by: Srinath Duraisamy <srinath.duraisamy@intel.com>
Signed-off-by: Narendra Patel <narendra.g.patel@intel.com>
2023-05-15 17:21:49 -04:00
Peteris Rudzusiks
bdb75fb21e runtime: use enable_vcpus_pinning from toml
Set the default value of the runtime's EnableVCPUsPinning to the value read from the .toml file.

Fixes: #6836

Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
2023-05-15 21:41:20 +02:00
Tamas K Lengyel
20cb875087 virtcontainers/qemu_test.go: Improve test coverage
Rework the TestQemuCreateVM routine to be a table-driven test with
various config variations passed to it. After CreateVM a handful
of additional functions are exercised to improve code coverage.
Also add partial coverage for StartVM routine.

Currently improving from 19.7% to 35.7%

Credit PR to Hackathon Team3

Fixes: #267

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
2023-05-15 15:26:35 -04:00
Fabiano Fidêncio
da877a603d Merge pull request #6829 from fidencio/topic/kata-deploy-remove-tarball-from-payload-image
kata-deploy: Do not ship the kata tarball
2023-05-15 19:01:14 +02:00
Gabriela Cervantes
b9a1db2601 kata-deploy: Add http_proxy as part of the docker build
Add http_proxy and https_proxy as part of the docker build arguments
in order to build properly when we are behind a proxy.

Fixes #6834

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-05-15 15:57:29 +00:00
Peteris Rudzusiks
3e85bf5b17 resource-control: fix setting CPU affinities on Linux
With this fix the vCPU pinning feature chooses the correct
physical cores to pin the vCPU threads on rather than always using core 0.

Fixes #6831

Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
2023-05-15 16:46:36 +02:00
Pavel Mores
5f3f844a1e runtime-rs: fix building instructions with respect to required Rust version
Fixes: #6803

Signed-off-by: Pavel Mores <pmores@redhat.com>
2023-05-15 16:30:41 +02:00
Fabiano Fidêncio
9e83795fca Merge pull request #6825 from fidencio/topic/kata-deploy-build-improvements
kata-deploy: Build improvements
2023-05-15 13:49:15 +02:00
Fabiano Fidêncio
802cd2f673 Merge pull request #6821 from stevenhorsman/container-image-arch-consistency
deploy: Fix arch in image tag
2023-05-15 11:16:01 +02:00
Fabiano Fidêncio
815b4e8dac Merge pull request #6816 from fidencio/topic/kata-deploy-fixes
Revert "kata-deploy: Use readinessProbe to ensure everything is ready"
2023-05-15 10:24:58 +02:00
Fabiano Fidêncio
777c3dc8d2 kata-deploy: Do not ship the kata tarball
There's absolutely no reason to ship the kata-static tarball as part of
the payload image, as:
* The tarball is already part of the release process
* The payload image already has uncompressed content of the tarball
* The tarball itself is not used anywhere by the kata-deploy scripts

Fixes: #6828

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-15 09:22:39 +02:00
LiuWeijie
50cc9c582f tests: Improve coverage for virtcontainers/pkg/compatoci/ for Kata 2.0
Add test cases for ParseConfigJson function and GetContainerSpec function

Fixes: #258

Signed-off-by: LiuWeijie <weijie.liu@intel.com>
2023-05-15 11:58:17 +08:00
Fabiano Fidêncio
136e2415da static-build: Download firecracker instead of building it
There's no reason for us to build firecracker instead of simply
downloading the official released tarball, as tarballs are provided for
the architectures we want to use them on.

Fixes: #6770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-12 22:05:33 +02:00
Fabiano Fidêncio
3bf767cfcd static-build: Adjust ARCH for nydus
When building from aarch64, just use "arm64" as that's what's used in
the name of the released nydus tarballs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-12 22:05:33 +02:00
Fabiano Fidêncio
ac88d34e0c static-build: Use released binary for CLH (aarch64)
There's no need to build Cloud Hypervisor aarch64 as, for a few releases
already, Cloud Hypervisor provides an official release binary for the
architecture.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-12 22:05:01 +02:00
Archana Shinde
32b39ee347 Merge pull request #6763 from nedsouza/266/tests_coverage_virtcontainers_fc
virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
2023-05-12 11:53:27 -07:00
James O. D. Hunt
73913c8eb7 kata-manager: Fix '-o' syntax and logic error
Fix the syntax and logic error that is only displayed if the user runs
the script with `-o`. This option requests that "only" Kata Containers
is installed and stops containerd from being installed.

Fixes: #6822.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-05-12 16:44:24 +01:00
stevenhorsman
2856d3f23d deploy: Fix arch in image tag
`uname -m` produces `x86_64`, but container image convention
is to use `amd64`, so update this in the tag

Fixes: #6820
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-12 16:14:19 +01:00
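A minimal Python sketch of the mapping this commit describes. The deploy scripts themselves are shell and are not reproduced here; `image_arch` is a hypothetical helper. Only the `x86_64` to `amd64` pair comes from this commit message; the `aarch64` to `arm64` pair is an assumption based on the common container naming convention.

```python
import platform

# Kernel machine names (as printed by `uname -m`) -> container image arch names.
_IMAGE_ARCH = {
    "x86_64": "amd64",   # stated in the commit message
    "aarch64": "arm64",  # assumption: common container convention
}


def image_arch(machine=None):
    """Return the arch string to use in an image tag.

    Falls back to the machine name unchanged when no mapping is known.
    """
    machine = machine or platform.machine()
    return _IMAGE_ARCH.get(machine, machine)
```

Names without an entry in the table pass through unchanged, so an unknown architecture is never silently rewritten.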
Fabiano Fidêncio
42dce15b1f Merge pull request #6450 from singhwang/main
main | release: Fix multi-arch publishing is not supported
2023-05-12 15:25:59 +02:00
Fabiano Fidêncio
e8f81ee93d Revert "kata-deploy: Use readinessProbe to ensure everything is ready"
This reverts commit 5ec9ae0f04, for two
main reasons:
* The readinessProbe was misinterpreted by myself when working on the
  original PR
* It's actually causing issues, as the pod ends up marked as not
  healthy.
2023-05-12 14:28:23 +02:00
SinghWang
cfe63527c5 release: Fix multi-arch publishing is not supported
When a release is published, the kata-deploy payload and the kata-static
package can support multi-arch publishing.

Fixes: #6449

Signed-off-by: SinghWang <wangxin_0611@126.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-12 13:36:44 +02:00
Shuaiyi Zhang
197c336516 Dragonball: use LinuxBootConfigurator::write_bootparams to write
the boot parameters into guest memory.

Fixes: #6813

Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>
2023-05-12 16:07:44 +08:00
Fabiano Fidêncio
181017d1d8 Merge pull request #6811 from fidencio/topic/yet-more-fixes-for-nvidia-gpu-kernels
cache: More fixes to nvidia-gpu kernels caching
2023-05-12 10:02:08 +02:00
Amulya Meka
76f975e5e6 Merge pull request #6742 from Amulyam24/agent-build
runtime: remove overriding ARCH value by default for ppc64le
2023-05-12 12:34:50 +05:30
Archana Shinde
20ac3917ad Merge pull request #6739 from byron-marohn/fix_5561
gha: Fix Body Line Length action flagging empty body commit messages
2023-05-11 15:17:07 -07:00
Archana Shinde
1ad442e656 Merge pull request #6748 from nedsouza/fix-snap
gha: Fix snap creation workflow
2023-05-11 15:09:22 -07:00
Fabiano Fidêncio
4d17ea4a01 cache: Fix nvidia-snp caching version
All the kernel-foo instances, such as "kernel-sev" or "kernel-snp",
should be transformed into "kernel.foo" when looking at the
versions.yaml file.

This was already done for SEV, but missed on the SNP case.

Fixes: #6777

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-11 21:26:58 +02:00
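The name transformation described above could be sketched as follows. This is a hedged Python illustration, not the cache script itself: the function name `versions_key` is hypothetical, and only the two examples from the commit message ("kernel-sev" and "kernel-snp") are known inputs. How compound flavours are handled is not stated, so the sketch only swaps the separator that follows "kernel".

```python
def versions_key(target: str) -> str:
    """Map a build target such as "kernel-snp" to a dotted lookup key."""
    prefix = "kernel-"
    if not target.startswith(prefix):
        # Plain "kernel" (or anything else) is left untouched.
        return target
    # Replace only the separator after "kernel": kernel-foo -> kernel.foo
    return "kernel." + target[len(prefix):]
```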
Fabiano Fidêncio
a133fadbfa cache: Fix nvidia-gpu-tdx-experimental cache URL
We were passing "kernel-nvidia-gpu-tdx", missing the "-experimental"
part, leading to a non-valid URL.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-11 21:20:06 +02:00
Fabiano Fidêncio
a7dd6cbadd Merge pull request #6807 from fidencio/topic/fix-nvidia-gpu-cache
cache: Fix nvidia-gpu version
2023-05-11 17:40:41 +02:00
Fabiano Fidêncio
b9990c2017 cache: Fix nvidia-gpu version
c9bf7808b6 introduced the logic to
properly get the version of nvidia-gpu kernels, but one important part
was dropped during the rebase into main, which is actually getting the
correct version of the kernel.

Fixing this now, and using the old issue as reference.

Fixes: #6777

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-11 13:55:14 +02:00
Fabiano Fidêncio
14939d00ad Merge pull request #6778 from fidencio/topic/cache-gpu-related-kernels
cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu
2023-05-11 13:14:45 +02:00
Fabiano Fidêncio
c9bf7808b6 cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu
We need to make sure that, when caching a `-nvidia-gpu` kernel, we still
look at the version of the base kernel used to build the nvidia-gpu
drivers, as the ${vendor}-gpu kernels are based on already existing
entries in the versions.yaml file and do not require a new entry to be
added.

Fixes: #6777

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-11 10:56:13 +02:00
Fabiano Fidêncio
3665b42045 gpu: Rename gpu targets to nvidia-gpu
This will make it easier for other GPU vendors to add the needed bits in
the future.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-11 10:55:55 +02:00
Fabiano Fidêncio
edfaae85cb Merge pull request #6700 from fitzthum/snp-artifacts
packaging: Add SEV-SNP artifacts to main
2023-05-11 10:47:10 +02:00
James O. D. Hunt
fe33015075 Merge pull request #6794 from jodh-intel/docs-mark-snap-as-unmaintained
docs: Mark snap installation method as unmaintained
2023-05-11 09:14:25 +01:00
Fabiano Fidêncio
c937d0a5d4 Merge pull request #6591 from UnmeshDeodhar/add-sev-artifacts-to-main
packaging: Add sev artifacts to main
2023-05-11 09:09:36 +02:00
Tobin Feldman-Fitzthum
2c90cac751 local-build: fixup alphabetization
A few pieces of the local-build tooling are supposed to be
alphabetized. Fixup a couple minor issues that have accumulated.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 21:23:38 +00:00
Tobin Feldman-Fitzthum
4da6eb588d kata-deploy: Add qemu-snp shim
Now that we have the SNP components in place, make sure that
kata-deploy knows about the qemu-snp shim so that it will be
added to containerd config.

Fixes: #6575

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:55:36 +00:00
Tobin Feldman-Fitzthum
14dd053758 kata-deploy: add kata-qemu-snp runtimeclass
Since SEV-SNP has limited hotplug support, increase
the pod overhead to account for fixed resource usage.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:55:36 +00:00
Tobin Feldman-Fitzthum
0bb37bff78 config: Add SNP configuration
SNP requires many specific configurations, so let's make
a new SNP configuration file that we can use with the
kata-qemu-snp runtime class.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2023-05-10 20:55:36 +00:00
Chelsea Mafrica
13f9ba2298 Merge pull request #6379 from cmaf/kata-ctl-check-kvm-1
kata-ctl: add generic kvm check & unit test
2023-05-10 13:33:57 -07:00
Tobin Feldman-Fitzthum
af7f2519bf versions: update SEV kernel description
SNP and SEV will share a (guest) kernel. Update the description
in versions.yaml to mention this.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:27:12 +00:00
Tobin Feldman-Fitzthum
dbcc3b5cc8 local-build: fix default values for OVMF build
Existing value has wrong name and compression type
leading to installation failure.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:27:12 +00:00
Tobin Feldman-Fitzthum
b8bbe6325f gha: build OVMF for tests and release
The x86_64 package of OVMF is required for deployments
that don't use kernel hashes, which includes SEV-SNP
in the short term. We should keep this in the bundle
in the long term in case someone wants to disable
kernel hashes.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:27:12 +00:00
Tobin Feldman-Fitzthum
cf0ca265f9 local-build: Add x86_64 OVMF target
Add targets to build the "plain" x86_64 OVMF.

This will be used by anyone who is using SEV or SNP
without kernel hashes. The SNP QEMU does not yet
support kernel hashes so the OvmfPkg will be used
by default.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2023-05-10 20:24:51 +00:00
Tobin Feldman-Fitzthum
db095ddeb4 cache: add SNP flavor to comments
Update comments to include new SNP QEMU option

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum
f4ee00576a gha: Build and ship QEMU for SNP
Now that we can build SNP QEMU, let's do that for tests and release.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum
7a58a91fa6 docs: update SNP guide
Since we reshuffled versions.yaml, update the guide so that
we can find the SNP QEMU info.

Once runtime support is merged we should overhaul or remove
this guide, but let's keep it for now.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum
879333bfc7 versions: update SNP QEMU version
Refactor SNP QEMU entry in versions.yaml to match
qemu-experimental and qemu-tdx-experimental.

Also, update the version of QEMU to what we are using
in CCv0. This is the non-UPM QEMU and it does not
have kernel hashes support.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2023-05-10 20:19:56 +00:00
Tobin Feldman-Fitzthum
38ce4a32af local-build: add support to build QEMU for SEV-SNP
Add Make targets and helper functions to build the QEMU
needed for SEV-SNP.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2023-05-10 20:19:56 +00:00
Chelsea Mafrica
5f8008b69c kata-ctl: add unit test for kvm check
Check that kvm test fails when run as non-root and when device specified
is not /dev/kvm.

Fixes #5338

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-05-10 10:29:20 -07:00
Chelsea Mafrica
a085a6d7b4 kata-ctl: add generic kvm check
Add kvm check using ioctl macro to create a syscall that checks the kvm
api version and if creation of a vm is successful.

Fixes #5338

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-05-10 10:29:20 -07:00
Unmesh Deodhar
772d4db262 gha: Build and ship SEV initrd
We have code that builds initrd for SEV.
Thus, adding that to the test and release process.

Fixes: #6572

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:56 -05:00
Unmesh Deodhar
45fa366926 gha: Build and ship SEV OVMF
SEV requires special OVMF to work. Thus, building that for test and release.

Fixes: #6572

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:56 -05:00
Unmesh Deodhar
4770d3064a gha: Build and ship SEV kernel.
SEV requires custom kernel arguments when building.
Thus, adding it to the test and release process.

Fixes: #6572

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:56 -05:00
Unmesh Deodhar
fb9c1fc36e runtime: Add qemu-sev config
Adding config file that can be used with qemu-sev runtime class.
Since SEV has limited hotplug support, increase
the pod overhead to account for fixed resource usage.

Fixes: #6572

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:56 -05:00
Unmesh Deodhar
813e4c576f runtimeClasses: add sev runtime class
Adding kata-qemu-sev runtime class.

Fixes: #6572

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:56 -05:00
Unmesh Deodhar
af18806a8d static-build: Add caching support to sev ovmf
SEV requires special OVMF.
Now that we have the ability to build this custom OVMF, let's optimize
it by caching so that we don't have to build it for every run.

Fixes: #6572

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:55 -05:00
Unmesh Deodhar
76ae7a3abe packaging: adding caching capability for kernel
The SEV initrd build requires kernel modules.
So, for SEV case, we need to cache kernel modules tarball in
addition to kernel tarball.

Fixes: #6572

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:55 -05:00
Unmesh Deodhar
12c5ef9020 packaging: add support to build OVMF for SEV
SEV requires special OVMF to work with kernel hashes.
Thus, adding changes that builds this custom OVMF for SEV.

Fixes: #6572

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:55 -05:00
Unmesh Deodhar
b87820ee8c packaging: add support to build initrd for sev
We need special initrd for SEV. The work on SEV initrd is based on
Ubuntu. Thus, adding another entry in versions.yaml
This binary will have '-sev' suffix to distinguish it from the generic
binary.

Fixes: #6572

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
2023-05-10 12:19:55 -05:00
James O. D. Hunt
e1f3b871cd docs: Mark snap installation method as unmaintained
The snap package is no longer being maintained so update the docs to
warn readers.

We'll remove the snap installation docs in a few weeks.

See: #6769.
Fixes: #6793.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-05-10 18:02:46 +01:00
Jeremi Piotrowski
022a33de92 agent: Add context to errors when AgentConfig file is missing
When the agent config file is missing, the panic message says "no such file or
directory" but doesn't inform the user about which file was missing. Add
context to the parsing (with filename) and to the from_config_file() calls
(with information where the path is coming from).

Fixes: #6771
Depends-on: github.com/kata-containers/tests#5627
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-05-10 08:43:16 +02:00
Fabiano Fidêncio
6881b9558b Merge pull request #6512 from gabevenberg/log-parser-rs
Log-parser-rs
2023-05-10 08:22:59 +02:00
Chao Wu
7218229af0 Merge pull request #6594 from Apokleos/warning_fix_1.68.0
warning_fix: fix warnings when build with cargo-1.68.0
2023-05-10 09:51:45 +08:00
Unmesh Deodhar
b0e6a094be packaging: Add sev kernel build capability
Adding code that builds sev kernel.

Fixes: #6572

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-05-09 13:47:22 -05:00
Tim Zhang
b0b5d7082e Merge pull request #6753 from amshinde/add-cross-building-with-cross
cross-compile: Include documentation and configuration for cross-compile
2023-05-09 16:31:40 +08:00
Feng Wang
4e0dce6802 Merge pull request #6738 from fengwang666/oss-fix-fd-leak
runtime: Fix virtiofs fd leak
2023-05-08 10:52:36 -07:00
Eduardo Berrocal
a4c0303d89 virtcontainers: Fixed static checks for improved test coverage for fc.go
Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%.
Fixed very simple static check fail on line 202.

Fixes: #266

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-05-07 00:17:36 -07:00
Peng Tao
65670e6b0a Merge pull request #6699 from zvonkok/cold-plug-vfio
gpu: cold plug VFIO devices
2023-05-05 10:04:29 +08:00
Archana Shinde
b86d32aba9 Merge pull request #6728 from nedsouza/256/tests_coverage_pkg_signals
pkg/signals: Improved test coverage 60% to 100%
2023-05-04 16:19:12 -07:00
Archana Shinde
9443c4aea7 Merge pull request #6729 from nedsouza/259/tests_coverage_virtcontainers_persist
virtcontainers/persist: Improved test coverage 65% to 87.5%
2023-05-04 16:18:55 -07:00
Archana Shinde
09134c30de Merge pull request #6737 from nedsouza/265/virtcontainers-clh-go-coverage
virtcontainers/clh_test.go: improve unit test coverage
2023-05-04 16:15:43 -07:00
Archana Shinde
8495f830b7 cross-compile: Include documentation and configuration for cross-compile
`cross` is an open source tool that provides zero-setup cross compile
for rust binaries. Add documentation on this tool for compiling
kata-ctl tool and Cross.toml file that provides required configuration
for installing dependencies for various targets.
This is pretty useful for a developer to make sure code compiles and
passes checks for various architectures.

Fixes: #6765

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-05-04 14:13:00 -07:00
Bin Liu
e57ac2ae18 Merge pull request #6749 from nedsouza/260/tests_coverage_virtcontainers_factory
virtcontainers/factory: Improved test coverage
2023-05-04 10:54:40 +08:00
Zvonko Kaiser
13d7f39c71 gpu: Check for VFIO port assignments
Bail out early if the port is wrong. Allowed port settings are
no-port, root-port, switch-port.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-05-03 12:32:33 +00:00
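The early bail-out described above amounts to validating the setting against a fixed list. Below is a minimal Python sketch of that idea. The three allowed values are taken from the commit message; the function name `check_vfio_port` and the error wording are assumptions, and the real check lives in the Go runtime.

```python
ALLOWED_PORTS = ("no-port", "root-port", "switch-port")  # from the commit message


def check_vfio_port(value: str) -> str:
    """Return the port setting unchanged, or fail early if it is not allowed."""
    if value not in ALLOWED_PORTS:
        raise ValueError(
            f"invalid port {value!r}; allowed: {', '.join(ALLOWED_PORTS)}"
        )
    return value
```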
Gabe Venberg
6594a9329d tools: made log-parser-rs
Eventual replacement of kata-log-parser, but for now replicates its
functionality for the new runtime-rs syntax. Takes in log files,
parses, sorts by timestamp, spits them out in json, csv, xml, toml, and
a few others.

Fixes #5350

Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
2023-05-02 13:16:54 -05:00
Wainer Moschetta
f5ff975560 Merge pull request #6723 from ryansavino/gha-k8s-also-test-snp
gha: Also run k8s tests on qemu-snp
2023-05-01 10:37:12 -03:00
Fabiano Fidêncio
b6e54676eb Merge pull request #6759 from ryansavino/gha-sev-kata-deploy-fix
gha: sev: fix for kata-deploy error
2023-05-01 11:42:16 +02:00
Eduardo Berrocal
03a8cd69c2 virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%.

Fixes: #266

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-28 15:40:45 -07:00
Ryan Savino
9e2b7ff177 gha: sev: fix for kata-deploy error
kubectl commands need a '-f' instead of a '-k'

Fixes: #6758

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2023-04-28 14:54:36 -05:00
Ryan Savino
5c9246db19 gha: Also run k8s tests on qemu-snp
Added the k8s tests for qemu-snp

Fixes: #6722

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2023-04-28 14:43:53 -05:00
Ryan Savino
c57a44436c gha: Add the ability to test qemu-snp
With the changes proposed as part of this PR, a qemu-snp cluster
will be created but no tests will be performed.

GitHub Actions will only run the tests using the workflows that are
part of the **target** branch, instead of using the ones coming
from the PR. No way to work around this for now.

After this commit is merged, the tests (not the yaml files for the
actions) will be altered in order for the checkout action to help in
this case.

Fixes: #6722

Signed-off-by: Ryan Savino <ryan.savino@amd.com>
2023-04-28 13:07:13 -05:00
Wainer Moschetta
29785a43d7 Merge pull request #6712 from ryansavino/gha-k8s-also-test-sev
gha: Also run k8s tests on qemu-sev
2023-04-28 14:22:03 -03:00
Archana Shinde
65c61785fc Merge pull request #6660 from amshinde/kata-ctl-cmd
Implement the "kata-ctl env" command
2023-04-28 01:33:28 -07:00
Archana Shinde
4064192896 env: Utilize arch specific functionality to get cpu details
Have kata-env call architecture specific function to get cpu details
instead of generic function to get cpu details that works only for
certain architectures. The functionality for cpu details has been fully
implemented for x86_64 and arm architectures, but needs to be
implemented for s390 and powerpc.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-27 16:45:41 -07:00
Archana Shinde
fb40c71a21 env: Check for root privileges
Check for root privileges early on.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-27 16:45:41 -07:00
Archana Shinde
1016bc17b7 config: Add api to fetch config from default config path
Add api to fetch config from default config path and use that in
kata-ctl tool.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-27 16:45:41 -07:00
Archana Shinde
b908a780a0 kata-env: Pass cmd option for file path
Add ability to write the environment information to a file
or stdout if file path is absent.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-27 16:45:41 -07:00
Archana Shinde
b1920198be config: Workaround the way agent and hypervisor configs are fetched
This is essentially a workaround for the issue:
https://github.com/kata-containers/kata-containers/issues/5954

runtime-rs changes the Kata config format, adding agent_name and
hypervisor_name which are then used as keys to fetch the agent and
hypervisor configs. This will not work for older configs.
So use the first entry in the hashmaps to fetch the configs as a
workaround while the config change issue is resolved.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-27 16:45:41 -07:00
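The workaround above can be pictured with a small Python sketch. It is an illustration under assumptions, not the Rust code from the commit: the function name `pick_config` is hypothetical, and trying the named key first is an addition made here to show how old and new config formats could coexist. Note that a Python dict preserves insertion order, whereas "first entry" of a Rust HashMap is whatever the iterator yields first.

```python
from typing import Any, Dict, Optional


def pick_config(configs: Dict[str, Any], name: Optional[str] = None) -> Any:
    """Fetch a config by name, falling back to the first entry in the map."""
    if name and name in configs:
        return configs[name]
    if not configs:
        raise KeyError("no configuration entries found")
    # Workaround path: older configs carry no agent_name/hypervisor_name,
    # so take the first entry instead of looking up by key.
    return next(iter(configs.values()))
```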
Archana Shinde
f2b2621dec kata-env: Implement the kata-env command.
Command implements functionality to get user environment settings.

Fixes: #5339

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-27 16:45:41 -07:00
Ryan Savino
c849bdb0a5 gha: Also run k8s tests on qemu-sev
Added the k8s tests for qemu-sev

Fixes: #6711

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2023-04-27 15:24:08 -05:00
Eduardo Berrocal
6bf1fc6051 virtcontainers/factory: Improved test coverage
Expanded tests on factory_test.go to cover more lines of code. Coverage went from 34% to 41.5% in the case of user-mode run tests,
and from 77.7% to 84% in the case of privileged-mode run tests.

Fixes: #260

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-27 13:08:35 -07:00
Tamas K Lengyel
0d49ceee0b gha: Fix snap creation workflow warnings
Fix recurring issues of failing to install dependencies due to stale apt cache.
Uprev actions/checkout to v3 to resolve issue "Node.js 12 actions are deprecated."

Fixes: #5659
Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
2023-04-27 18:40:02 +00:00
Zvonko Kaiser
138ada049c gpu: Cold Plug VFIO toml setting
Added the cold_plug_vfio setting to the qemu-toml.in with some
explanation

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-27 11:04:45 +00:00
Amulyam24
defb643346 runtime: remove overriding ARCH value by default for ppc64le
Currently, ARCH value is being set to powerpc64le by default.
powerpc64le is only right in context of rust and any operation
which might use this variable for a different purpose would fail on ppc64le.

Fixes: #6741

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-04-27 16:17:48 +05:30
Zvonko Kaiser
f7ad75cb12 gpu: Cold-plug extend the api.md
Make the hypervisorconfig consistent in code and api.md

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-27 09:35:05 +00:00
Zvonko Kaiser
0fec2e6986 gpu: Add cold-plug test
Cold plug setting is now correctly decoded in toml

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-27 09:30:24 +00:00
Archana Shinde
f2ebdd81c2 utils: Get rid of spurious print statement left behind.
The print was used for debugging; get rid of it.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
9a94f1f149 make: Export VERSION and COMMIT
These will be consumed by kata-ctl, so export these so that
they can be used to replace variables available to the rust binary.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
2f81f48dae config: Add file under /opt as another location to look for the config
Most of kata installation tools use this path for installation, so
add this to the paths to look for the configuration.toml file.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
07f7d17db5 config: Make the pipe_size field optional
Add the serde default attribute to the field so that parsing
can continue if this field is not present.
The agent assumes a default value for this, so it is not required
by the user to provide a value here.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
68f6357731 config: Make function to get the default conf file public
This will be used by the kata-env command.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
7565b33568 kata-ctl: Implement Display trait for GuestProtection enum
Implement Display for enum to display in env output.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
94a00f9346 utils: Make certain constants in utils.rs public
These would be used outside of utils.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
572b338b3b gitignore: Ignore .swp and .swo editor backup files
Ignore temporary files created by vim editor.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
Archana Shinde
376884b8a4 cargo: Update version of clap to 4.1.13
This version includes macros related to using command options.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-26 22:12:30 -07:00
alex.lyn
17daeb9dd7 warning_fix: fix warnings when build with cargo-1.68.0
Fixes: #6593

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-04-27 10:29:50 +08:00
Ryan Savino
521519d745 gha: Add the ability to test qemu-sev
With the changes proposed as part of this PR, a qemu-sev cluster will
be created but no tests will be performed.

GitHub Actions will only run the tests using the workflows that are
part of the **target** branch, instead of using the ones coming
from the PR. No way to work around this for now.

After this commit is merged, the tests (not the yaml files for the
actions) will be altered in order for the checkout action to help in this
case.

Fixes: #6711

Signed-off-by: Ryan Savino <ryan.savino@amd.com>
2023-04-26 17:56:28 -05:00
Feng Wang
205909fbed runtime: Fix virtiofs fd leak
The kata runtime invokes removeStaleVirtiofsShareMounts after
a container is stopped to clean up the stale virtiofs file caches.

Fixes: #6455
Signed-off-by: Feng Wang <fwang@confluent.io>
2023-04-26 15:53:39 -07:00
Byron Marohn
5226f15c84 gha: Fix Body Line Length action flagging empty body commit messages
Change the Body Line Length workflow to not trigger when the commit
message contains only a message without a body. Other workflows will
flag the missing body sections, and it was confusing to have an error
message that said 'Body line too long (max 150)' when this was not
actually the case.

Fixes: #5561

Co-authored-by: Jayant Singh <jayant.singh@intel.com>
Co-authored-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Jayant Singh <jayant.singh@intel.com>
Signed-off-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com>
Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>
2023-04-26 17:29:16 -04:00
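The behaviour change described above, skipping the length check when a commit message has no body, could look roughly like this in Python. The 150-character limit is the one quoted in the error message; the function name `long_body_lines` is hypothetical and the actual workflow is a GitHub Actions job, so treat this purely as a sketch of the rule.

```python
MAX_BODY_LINE = 150  # limit quoted in the error message above


def long_body_lines(message: str, limit: int = MAX_BODY_LINE) -> list:
    """Return the body lines that exceed the limit; subject-only messages pass."""
    lines = message.splitlines()
    body = lines[1:]  # everything after the subject line
    if not any(line.strip() for line in body):
        # No body at all: leave that complaint to the other workflows.
        return []
    return [line for line in body if len(line) > limit]
```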
Tamas K Lengyel
0f45b0faa9 virtcontainers/clh_test.go: improve unit test coverage
Credit PR to Hackathon Team3

Fixes: #265

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
2023-04-26 19:12:51 +00:00
Zvonko Kaiser
dded731db3 gpu: Add OVMF setting for MMIO aperture
The default size of OVMF's aperture is too small to
initialize PCIe devices with huge BARs

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
2a830177ca gpu: Add fwcfg helper function
Added driver util function for easier handling of VFIO
devices outside of the VFIO module. At the sandbox level
we may need to set options depending on whether we have a VFIO/PCIe
device, like the fwCfg for confidential guests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
131f056a12 gpu: Extract VFIO Functions to drivers
Some functions may be used in modules other than
the VFIO module; extract them and make them available to
other layers like sandbox.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
c8cf7ed3bc gpu: Add ColdPlug of VFIO devices with devManager
If we have a VFIO device and cold-plug is enabled
we mark each device as ColdPlug=true and let the VFIO
module do the attaching.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
e2b5e7f73b gpu: Add Rawdevices to hypervisor
RawDevices are used to get PCIe device info early before the sandbox
is started to make better PCIe topology decisions

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
6107c32d70 gpu: Assign default value to cold-plug
Make sure the configuration is propagated to the right structs
and the default value is assigned.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
377ebc2ad1 gpu: Add configuration option for cold-plug VFIO
Users can set cold-plug="root-port" to cold plug a VFIO device in QEMU

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
c18ceae109 gpu: Add new struct PCIePort
For the hypervisor to distinguish between PCIe components, adding
a new enum that can be used for hot-plug and cold-plug of PCIe devices

Fixes: #6687

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Bin Liu
509bc8b6c8 Merge pull request #6718 from openanolis/mengze/keep_abnormal
runtime-rs: support keep_abnormal in toml config
2023-04-26 12:36:52 +08:00
Bin Liu
b6d880510a Merge pull request #6595 from zvonkok/gpu-snp-tdx-kernel
gpu: Build and Ship a GPU enabled Kernel
2023-04-26 12:33:51 +08:00
Eduardo Berrocal
9c38204f13 virtcontainers/persist: Improved test coverage 65% to 87.5%
Expanded tests on manager_test.go to cover more lines of code.

Fixes: #259

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-25 23:53:46 +00:00
Eduardo Berrocal
1c1ee8057c pkg/signals: Improved test coverage 60% to 100%
Expanded tests on signals_test.go to cover more lines of code. 'go test' won't show 100% coverage (only 66.7%), because one test needs to spawn a new
process (since it is testing a function that calls os.Exit(1)).

Fixes: #256

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-25 23:34:13 +00:00
mengze
cc8ea3232e runtime-rs: support keep_abnormal in toml config
This patch adds keep_abnormal in runtime config. If keep_abnormal =
true, it means that 1) if the runtime exits abnormally, the cleanup
process will be skipped, and 2) the runtime will not exit even if the
health check fails.

This option is typically used to retain abnormal information for
debugging and should NOT be enabled by default.

Fixes: #6717

Signed-off-by: mengze <mengze@linux.alibaba.com>
Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>
2023-04-25 13:47:44 +08:00
David Esparza
7fdaab49bc Merge pull request #6295 from dborquez/add_kernel_module_checks_kvm
kata-ctl: checks for kvm, kvm_intel modules loaded
2023-04-24 13:33:18 -06:00
Greg Kurz
0ca6d3b726 Merge pull request #6681 from Vlad1mir-D/6677-fix-kata-agent-dbus-connection
osbuilder: Fix D-Bus enabling in the dracut case
2023-04-24 17:31:13 +02:00
Bin Liu
3d8688f92e Merge pull request #6620 from jongwu/docker_fail_start_snap
snap: fix docker start fail issue
2023-04-24 10:53:16 +08:00
Archana Shinde
97291d88e9 Merge pull request #6696 from amshinde/kata-manager-containerd-fix
kata-manager: Fix containerd download
2023-04-21 09:54:30 -07:00
Archana Shinde
96e8470dbe kata-manager: Fix containerd download
Newer containerd releases have an additional static package published.
Because of this, download_url contains two URLs, causing curl to fail.
To resolve this, pick the first url from the containerd releases to
download containerd.

Fixes: #6695

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-04-20 23:08:51 -07:00
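The fix above boils down to taking the first URL when a release lookup returns more than one. A minimal Python sketch follows, assuming the URLs arrive as a whitespace-separated string; the function name `first_download_url` and the example URLs are hypothetical, and the real change is in the kata-manager shell script.

```python
def first_download_url(download_url: str) -> str:
    """Pick the first URL from a whitespace-separated list of candidates."""
    urls = download_url.split()
    if not urls:
        raise ValueError("no download URL found")
    return urls[0]
```

For example, `first_download_url("https://example.invalid/a.tar.gz\nhttps://example.invalid/a-static.tar.gz")` yields only the first entry, which can then be passed to a single download call.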
David Esparza
432d407440 kata-ctl: checks for kvm, kvm_intel modules loaded
Ensure that kvm and kvm_intel modules are loaded.
Renames the get_cpu_info() function to read_file_contents()

Fixes #5332

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-04-20 11:29:36 -06:00
Zvonko Kaiser
b1730e4a67 gpu: Add new kernel build option to usage()
With each release make sure we ship a GPU enabled kernel

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-20 07:48:30 +00:00
Fupan Li
ceefd50bd0 Merge pull request #6680 from Tim-Zhang/fix-ut-bad-fd
agent: Fix ut issue caused by fd double closed
2023-04-20 11:18:27 +08:00
Fupan Li
a7b4b69230 Merge pull request #6673 from Tim-Zhang/upgrade-ttrpc-protobuf
Bump ttrpc to 0.7.2 and protobuf to 3.2.0
2023-04-20 10:13:43 +08:00
Fupan Li
a1568cd2f5 Merge pull request #6676 from zvonkok/gpu-runtime
gpu: Add GPU enabled configuration and runtime
2023-04-19 13:01:49 +08:00
Vladimir
3e7b902265 osbuilder: Fix D-Bus enabling in the dracut case
- D-Bus enabling now occurs only in setup_rootfs (instead of
prepare_overlay and setup_rootfs)
- Adjust permissions of / so dbus-broker will be able to traverse FS

These changes enable kata-agent to successfully communicate with D-Bus.

Fixes #6677

Signed-off-by: Vladimir <amigo.elite@gmail.com>
2023-04-18 23:17:34 +03:00
Tim Zhang
53c749a9de agent: Fix ut issue caused by fd double closed
Never ever try to close the same fd twice, even in a unit test.

A file descriptor is a number that gets reused, so closing the same
number twice may close a different file descriptor the second time.
The wrongly closed fd then fails with 'Bad file descriptor (os error 9)'
when it is next used.

Fixes: #6679

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-18 23:19:10 +08:00
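The failure mode described above can be reproduced outside the agent. The Python sketch below is an illustration only (the agent and its tests are written in Rust, and the function name `stale_close_errno` is hypothetical). It relies on the POSIX rule that a new descriptor gets the lowest free number, so in a single-threaded process the second open normally reuses the number that was just closed.

```python
import errno
import os
import tempfile


def stale_close_errno():
    """Close an fd twice and report what happens to the fd that reused its number.

    Returns errno.EBADF when the stale close hit the new descriptor, or None
    if the number was not reused (nothing to demonstrate in that case).
    """
    flags = os.O_CREAT | os.O_WRONLY
    with tempfile.TemporaryDirectory() as tmp:
        first = os.open(os.path.join(tmp, "a"), flags)
        os.close(first)                    # legitimate close
        second = os.open(os.path.join(tmp, "b"), flags)
        if second != first:                # number was not reused
            os.close(second)
            return None
        os.close(first)                    # stale close: actually closes `second`
        try:
            os.write(second, b"data")      # the innocent owner now fails
        except OSError as exc:
            return exc.errno
        os.close(second)
        return 0
```

When the number is reused, the write fails with `errno.EBADF`, the same "Bad file descriptor" condition quoted in the commit message.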
Hyounggyu Choi
5c032c64ac Merge pull request #6664 from zvonkok/vfio-fix
gpu: Do not pass-through PCI (Host) Bridges
2023-04-18 19:50:15 +09:00
Tim Zhang
2e3f19af92 agent: fix clippy warnings caused by protobuf3
Fix warnings introduced by protobuf upgrade.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 20:15:49 +08:00
Tim Zhang
4849c56faa agent: Fix unit test issue caused by protobuf upgrade
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
0a582f7815 trace-forwarder: remove unused crate protobuf
Remove unused crate protobuf.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
73253850e6 kata-ctl: remove unused crate ttrpc
Remove unused crate ttrpc.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
76d2e30547 agent-ctl: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
eb3d20dccb protocols: Add ut for Serde
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
59568c79dd protocols: add support for Serde
rust-protobuf@3 does not support Serde natively anymore.
So we need to do it by ourselves.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
a6b4d92c84 runtime-rs: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:20 +08:00
Zvonko Kaiser
ac7c63bc66 gpu: Add containerd shim for qemu-gpu
Last but not least, add the containerd shim configuration
pointing to the correct configuration-<shim>.toml

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:45:04 +00:00
Zvonko Kaiser
a0cc8a75f2 gpu: Add a kube runtime class
With the added configuration add the corresponding kube
runtime class.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:42:04 +00:00
Zvonko Kaiser
a81fff706f gpu: Adding a GPU enabled configuration
We need to set hotplug on pci root port and enable at least one
root port. Also set the guest-hooks-dir to the correct path

Fixes: #6675

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:40:09 +00:00
Tim Zhang
8af6fc77cd agent: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 18:31:41 +08:00
Tim Zhang
009b42dbff protocols: Fix unit test
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 18:31:41 +08:00
Tim Zhang
392732e213 protocols: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 18:31:35 +08:00
Zvonko Kaiser
f4f958d53c gpu: Do not pass-through PCI (Host) Bridges
On some systems a GPU is in an IOMMU group with a PCI Bridge and
PCI Host Bridge. By default, no PCI Bridge needs to be passed through.
When scanning the IOMMU group, ignore devices with a 0x60 class ID prefix.
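
The filter can be sketched as follows. This is Python for illustration only, not the runtime's actual code; the function names and the sample group are hypothetical, and it assumes the prefix mentioned above refers to PCI base class 0x06 (bridge devices), which sysfs reports as values such as 0x060000 (host bridge) and 0x060400 (PCI-to-PCI bridge):

```python
PCI_BASE_CLASS_BRIDGE = 0x06


def is_pci_bridge(class_id: str) -> bool:
    """class_id is the text of /sys/bus/pci/devices/<BDF>/class, e.g. '0x060400'."""
    return (int(class_id.strip(), 16) >> 16) == PCI_BASE_CLASS_BRIDGE


def passthrough_candidates(iommu_group: dict) -> list:
    """Keep every device of the group except bridges (dict maps BDF -> class)."""
    return [bdf for bdf, cls in iommu_group.items() if not is_pci_bridge(cls)]


# Hypothetical IOMMU group: host bridge, PCI-to-PCI bridge, and a 3D controller.
group = {
    "0000:00:00.0": "0x060000",
    "0000:00:01.0": "0x060400",
    "0000:01:00.0": "0x030200",
}
print(passthrough_candidates(group))
```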

Fixes: #6663

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:08:23 +00:00
Zvonko Kaiser
825e769483 gpu: Add GPU support to default kernel without any TEE
With each release make sure we ship a GPU enabled kernel

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 09:58:58 +00:00
Zvonko Kaiser
e4ee07f7d4 gpu: Add GPU TDX experimental kernel
With each release make sure we ship a GPU and TEE enabled kernel
This adds tdx-experimental kernel support

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 09:58:52 +00:00
Fabiano Fidêncio
243cb2e3af Merge pull request #6670 from fidencio/topic/fix-caching-of-tdvf-and-tdx-qemu
cache-components: Fix caching of TDVF and QEMU for TDX
2023-04-16 09:04:04 +02:00
Fabiano Fidêncio
a1272bcf1d gha: tdx: Fix typo overlay -> overlays
The beauty of GHA not allowing us to easily test changes in the yaml
files as part of the PR has hit us again. :-/

The correct path for the k3s deployment is
tools/packaging/kata-deploy/kata-deploy/overlays/k3s instead of
tools/packaging/kata-deploy/kata-deploy/overlay/k3s.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-15 15:00:06 +02:00
Fabiano Fidêncio
3fa0890e5e cache-components: Fix TDVF caching
TDVF caching is not working as the tarball name is incorrect. The result
expected is kata-static-tdvf.tar.xz, but it's looking for
kata-static-tdx.tar.xz.

This happens because logic to convert tdx -> tdvf was added to the
build scripts, but I missed doing the same in the caching
scripts.

Fixes: #6669

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-15 14:12:29 +02:00
Fabiano Fidêncio
80e3a2d408 cache-components: Fix TDX QEMU caching
TDX QEMU caching is not working as expected, as we're checking for its
version looking at "assets.hypervisor.${QEMU_FLAVOUR}.version", which is
correct for standard QEMU. However, for TDX QEMU we should be checking
for "assets.hypervisor.${QEMU_FLAVOUR}.tag"
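
A minimal sketch of the lookup difference, in Python for brevity (the caching scripts themselves are shell). The helper names and the sample values are hypothetical; only the two key paths come from the text above, and `qemu-tdx-experimental` is assumed to be the TDX value of QEMU_FLAVOUR:

```python
def qemu_version_key(flavour: str = "qemu") -> str:
    """Return the versions.yaml key that pins the given QEMU flavour."""
    field = "tag" if flavour == "qemu-tdx-experimental" else "version"
    return f"assets.hypervisor.{flavour}.{field}"


def lookup(tree: dict, dotted_key: str):
    """Walk a nested dict using a dotted key path."""
    node = tree
    for part in dotted_key.split("."):
        node = node[part]
    return node


# Placeholder data shaped like the relevant part of versions.yaml.
versions = {
    "assets": {
        "hypervisor": {
            "qemu": {"version": "vX.Y.Z"},
            "qemu-tdx-experimental": {"tag": "some-tdx-tag"},
        }
    }
}
print(lookup(versions, qemu_version_key("qemu")))
print(lookup(versions, qemu_version_key("qemu-tdx-experimental")))
```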

Fixes: #6668

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-15 14:12:26 +02:00
Fabiano Fidêncio
fffe2c6082 Merge pull request #6648 from fidencio/topic/gha-tdx-improvements-and-fixes
gha: tdx: Ensure kata-deploy is removed after the tests run
2023-04-15 00:21:31 +02:00
Bo Chen
a819ce145f Merge pull request #6633 from likebreath/0406/clh_v31.0
versions: Upgrade to Cloud Hypervisor v31.0
2023-04-14 13:52:19 -07:00
Zvonko Kaiser
87ea43cd4e gpu: Add configuration fragment
Adding configuration fragment for the kernel,
depending on the TEE kernel update the LOCALVERSION

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-14 07:52:51 +00:00
Zvonko Kaiser
aca6ff7289 gpu: Build and Ship a GPU enabled Kernel
With each release make sure we ship a GPU and TEE enabled kernel

Fixes: #6553

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-14 07:52:42 +00:00
Fabiano Fidêncio
dc662333df runtime: Increase the dial_timeout
When testing on AKS, we've been hitting the dial_timeout every now and
then.  Let's increase it to 45 seconds (instead of 30) for all the VMMs,
and to 60 seconds in case of TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 22:42:52 +02:00
Greg Kurz
897c0bc67e Merge pull request #6658 from gkurz/osbuilder-dracut-dbus
osbuilder: Enable dbus in the dracut case
2023-04-13 19:03:15 +02:00
Greg Kurz
eb1762e813 osbuilder: Enable dbus in the dracut case
The agent now offloads cgroup configuration to systemd when
possible. This requires enabling D-Bus in order to communicate
with systemd.

Fixes #6657

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-04-13 14:16:50 +02:00
Greg Kurz
f9a94f8fc5 Merge pull request #6623 from UiPath/fix-no-space-device
runtime: Don't create socket file in /run/kata
2023-04-13 10:36:20 +02:00
Fabiano Fidêncio
f478b9115e clh: tdx: Update timeouts for confidential guest
Booting up TDX takes more time than booting up a normal VM.  Those
values are being already used as part of the CCv0 branch, and we're just
bringing them to the `main` branch as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
3b76abb366 kata-deploy: Ensure node is ready after CRI Engine restart
Let's ensure the node is ready after the CRI Engine restart, otherwise
we may proceed and scripts may simply fail if they try to deploy a pod
while the CRI Engine is not yet restarted (and, consequently, the node
is not Ready).

Related: #6649

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
5ec9ae0f04 kata-deploy: Use readinessProbe to ensure everything is ready
readinessProbe will help us to only have the kata-deploy pod marked as
Ready when it finishes all the needed configurations in the node.

Related: #6649

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
ea386700fe kata-deploy: Update podOverhead for TDX
As TEEs cannot hotplug memory / CPU, we *must* consider the default
values for those as part of the podOverhead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
e31efc861c gha: tdx: Use the k3s overlay
As the TDX machine is using k3s, let's make sure we're deploying
kata-deploy using the k3s overlay.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
542bb0f3f3 gha: tdx: Set KUBECONFIG env at the job level
By doing this we avoid having to set it up on every step.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
d7fdf19e9b gha: tdx: Delete kata-deploy after the tests finish
We must ensure that no kata-deploy is left behind after the tests
finish, otherwise it may interfere with the next run.

Fixes: #6647

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
da35241a91 tests: k8s: Skip k8s-cpu-ns when testing TDX
TEEs do not support CPU / memory hotplug, thus this test must be
skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Alexandru Matei
db2cac34d8 runtime: Don't create socket file in /run/kata
The socket file for shim management is created in /run/kata
and it isn't deleted after the container is stopped. After
running and stopping thousands of containers /run folder
will run out of space.

Fixes #6622
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Co-authored-by: Greg Kurz <groug@kaod.org>
2023-04-13 10:21:29 +03:00
Jianyong Wu
6d315719f0 snap: fix docker start fail issue
In the Arm baseline CI, docker fails to start with the error: "no sockets found via
socket activation: make sure the service was started by systemd". I found
a solution in [1] to fix it.

[1] https://forums.docker.com/t/failed-to-load-listeners-no-sockets-found-via-socket-activation-make-sure-the-service-was-started-by-systemd/62505

Fixes: #6619
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-04-13 09:35:40 +08:00
Zhongtao Hu
328793bb27 Merge pull request #6585 from Apokleos/nydus_prefetch_files
nydus_rootfs/prefetch_files: add prefetch_files for RAFS
2023-04-12 19:58:36 +08:00
Zvonko Kaiser
e4b3b08871 gpu: Add proper CONFIG_LOCALVERSION depending on TEE
If conf_guest is set we need to update the CONFIG_LOCALVERSION
to match the suffix created in install_kata
-nvidia-gpu-{snp|tdx}, the linux headers will be named the very
same if built with make deb-pkg for TDX or SNP.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-12 11:30:59 +00:00
Zhongtao Hu
fef531f565 Merge pull request #6618 from Apokleos/virtiofs_extra_cache_mode
runtime-rs/virtio-fs: add support extra handler for cache mode.
2023-04-12 14:40:05 +08:00
Bin Liu
9327bb0912 Merge pull request #6639 from openanolis/nerdctl
runtime-rs: enable nerdctl to setup cni plugin
2023-04-12 12:04:37 +08:00
Zhongtao Hu
69ba2098f8 runtime-rs: remove network entities and netns
remove network entities and netns

Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-04-12 10:21:06 +08:00
Zhongtao Hu
b31f103d12 runtime-rs: enable nerdctl cni plugin
1. When nerdctl is used to set up the network for kata, nerdctl does not
create a netns, so kata needs to create the netns on its own.

2. After the VM starts, nerdctl calls the cni plugin via an oci hook, so we
need to rescan the netns after the interfaces have been created and hotplug
the network devices into the VM.

Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-04-12 10:21:04 +08:00
Fabiano Fidêncio
3b3656d96d Merge pull request #6522 from fidencio/topic/add-tdx-artefacts-from-2023ww01-to-main
tdx: Add artefacts from the latest TDX tools release into main
2023-04-11 20:43:02 +02:00
Fabiano Fidêncio
50ce33b02d Merge pull request #6205 from fengwang666/non-root-clh
runtime: support non-root for clh
2023-04-11 19:34:00 +02:00
Fabiano Fidêncio
4751adbea1 Merge pull request #6610 from fidencio/topic/gha-run-dragonball-k8s-tests
gha: ci-on-push: Run k8s tests with dragonball
2023-04-11 18:16:14 +02:00
Fabiano Fidêncio
69d7a959c8 gha: ci-on-push: Run tests on TDX
Now that we've added a TDX capable external runner, let's make sure we
also run the basic tests using TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 16:10:35 +02:00
Fabiano Fidêncio
5a0727ecb4 kata-deploy: Ship kata-qemu-tdx runtimeClass
Let's make sure we configure containerd for the kata-qemu-tdx handler
and ship the kata-qemu-tdx runtime class for kubernetes.

Fixes: #6537

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 16:10:35 +02:00
Fabiano Fidêncio
98682805be config: Add configuration for QEMU TDX
As the QEMU configuration for TDX differs quite a lot from the normal
QEMU configuration, let's add a new configuration file for the QEMU TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 16:10:35 +02:00
Fabiano Fidêncio
3e15800199 govmm: Directly pass the firmware using -bios with TDX
Since TDX doesn't support readonly memslot, TDVF cannot be mapped as
pflash device and it actually works as RAM. "-bios" option is chosen to
load TDVF.

OVMF is the opensource firmware that implements the TDVF support. Thus
the command line to specify and load TDVF is ``-bios OVMF.fd``
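
A sketch of the resulting choice of arguments, written in Python rather than taken from govmm. `firmware_args` is a hypothetical helper, and the pflash form shown for the non-TDX case is one common way to map firmware read-only in QEMU, not necessarily what govmm emits:

```python
def firmware_args(firmware_path: str, tdx: bool) -> list:
    """Build the QEMU arguments used to load the guest firmware."""
    if tdx:
        # No read-only memslots under TDX, so TDVF is loaded via -bios.
        return ["-bios", firmware_path]
    # Otherwise map the firmware as a read-only pflash device.
    return [
        "-drive",
        f"if=pflash,format=raw,readonly=on,file={firmware_path}",
    ]


print(firmware_args("OVMF.fd", tdx=True))
```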

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
3c5ffb0c85 govmm: Set "sept-ve-disable=on"
This is needed since 22ww49.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
ed145365ec runtime/qemu: Drop "kvm-type=tdx"
This is not supported since 22ww49.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
25b3cdd38c virtcontainers: Drop check for the tdx CPU flag
In the recent kernels provided by Intel the `tdx` CPU flag is not
present anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
01bdacb4e4 virtcontainers: Also check /sys/firmware/tdx for TDX
Let's make sure we also check /sys/firmware/tdx for TDX guest
protection, as the location may depend on whether TDX Seam is being used
or not.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
9feec533ce cache: Add ability to cache OVMF
Let's add the ability to cache OVMF, which right now we're only building
and shipping it for TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
ce8d982512 gha: Build and ship the OVMF for TDX
Let's build the OVMF with TDX support as part of our tests, and let's
ship it as part of our releases.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
39c3fab7b1 local-build: Add support to build OVMF for TDX
Let's add the needed targets and modifications to be able to build
OVMF for TDX as part of the local-build scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
054174d3e6 versions: Bump OVMF for TDX
Let's update the OVMF for TDX version to what's the latest tested
release of the Intel TDX tools with Kata Containers.

This change requires a newer version of `nasm` than the one provided by
the container used to build the project.  This change will also be
needed for SEV-SNP and was originally done by Alex Carter (thanks!).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
800fb49da1 packaging: Add get_ovmf_image_name() helper
As we'll be using this from different places in the near future, let's
create a helper function as part of the libs.sh.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
fbf03d7aca cache: Document kernel-tdx-experimental
Let's make users of cache_components_main.sh aware that they can
also cache the kernel-tdx-experimental builds.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
5d79e96966 cache: Add a space to ease the reading of the kernel flavours
Right now it's quite hard to read those, let's improve it a little bit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
6e4726e454 cache: Fix typos
Let's just fix a few simple typos:
* kernek -> kernel
* experimetnal -> experimental

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
fc22ed0a8a gha: Build and ship the Kernel for TDX
Let's build the kernel with TDX support as part of our tests, and let's
ship it as part of our releases.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
502844ced9 local-build: Add support to build Kernel for TDX
Let's add the needed targets and modifications to be able to build
kernel-tdx-experimental as part of the local-build scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
b2585eecff local-build: Avoid code duplication building the kernel
Let's create a `install_kernel_helper()` function, as it was already
done for QEMU, and rely on that when calling `install_kernel` and
`install_kernel_dragonball_experimental`.

This helps us to reduce the code duplication by a fair amount.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
f33345c311 versions: Update Kernel TDX version
Let's update the Kernel TDX version to what's the latest tested release
of the Intel TDX tools with Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
20ab2c2420 versions: Move Kernel TDX to its own experimental entry
Although we've been providing users a way to build kernel with TDX
support, this must be moved to its own experimental entry instead of how
it currently is.

The reason for that is because the patches are not yet merged into
kernel, and this is still an experimental build of the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
3d9ce3982b cache: Allow specifying the QEMU_FLAVOUR
Let's do what we already did when caching the kernel, and allow passing
a FLAVOUR of the project to build.

By doing this we can re-use the same function used to cache QEMU to also
cache any kind of experimental QEMU that we may happen to have.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
33dc6c65aa gha: Build and ship QEMU for TDX
Let's build QEMU TDX as part of our tests, and let's ship it as part of
our releases.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
eceaae30a5 local-build: Add support to build QEMU for TDX
Let's add the needed targets and modifications to be able to build
qemu-tdx-experimental as part of the local-build scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
f7b7c187ec static-build: Improve qemu-experimental build script
Let's make sure the `qemu_suffix` and `qemu_tarball_name` can be
specified.  With this we make it really easy to reuse this script for
any additional flavour of an experimental QEMU that ends up having to be
built (specifically looking at the ones for Confidential Containers
here).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:17:04 +02:00
Fabiano Fidêncio
3018c9ad51 versions: Update QEMU TDX version
Let's update the QEMU TDX version to what's the latest tested release of
the Intel TDX tools with Kata Containers.

In order to do such update, we had to relax the checks on the QEMU
version for some of the configuration options, as those were removed
right after the window was open for the 7.1.0 development (thus the
7.0.50 check).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:17:04 +02:00
Fabiano Fidêncio
800ee5cd88 versions: Move QEMU TDX to its own experimental entry
Although we've been providing users a way to build QEMU with TDX
support, this must be moved to its own experimental entry instead of how
it currently is.

The reason for that is because the patches are not yet merged into QEMU,
and this is still an experimental build of the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:17:04 +02:00
Fabiano Fidêncio
1315bb45f9 local-build: Add dragonball kernel to the all target
As the dragonball kernel is shipped as part of our releases, it must be
added to the `all` target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:17:04 +02:00
Fabiano Fidêncio
73e108136a local-build: Rename non vanilla kernel build functions
In order to make it easier to read, let's just rename the
install_dragonball_experimental_kernel and install_experimental_kernel
to install_kernel_dragonball_experimental and
install_kernel_experimental, respectively.

This allows us to quickly get to those functions when looking for
`install_kernel`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:17:04 +02:00
Fabiano Fidêncio
1d851b4be3 local-build: Cosmetic changes in build targets
This is a simple cosmetic change, adding a space between the function
call and the `;;`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:17:04 +02:00
Fabiano Fidêncio
49ce685ebf gha: k8s-on-aks: Always delete the AKS cluster
Regardless of the tests succeeding or failing, the AKS cluster **must be
deleted**.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 13:40:40 +02:00
Fabiano Fidêncio
e2a770df55 gha: ci-on-push: Run k8s tests with dragonball
Now that the infra for running dragonball tests has been enabled, let's
actually make sure to have them running on each PR.

The tests skipped are:
* `k8s-cpu-ns.bats`, as CPU resize doesn't seem to be yet properly
  supported on runtime-rs
  * https://github.com/kata-containers/kata-containers/issues/6621

Fixes: #6605

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 11:47:47 +02:00
Fabiano Fidêncio
aee6174a53 Merge pull request #6637 from gkurz/cpu-shares-to-weight
rustjail: Use CPUWeight with systemd and CgroupsV2
2023-04-11 10:55:48 +02:00
GabyCT
dc74133e74 Merge pull request #6631 from fidencio/topic/gha-create-delete-aks-cannot-be-workflows
gha: k8s-on-aks: {create,delete} AKS must be a coded-in step
2023-04-10 14:05:24 -06:00
Zhongtao Hu
8cdec5707e Merge pull request #6540 from houstar/main
docs: update the rust version from version.yaml
2023-04-10 16:53:21 +08:00
Qingyuan Hou
d1f550bd1e docs: update the rust version from versions.yaml
Fixes: #6539
Signed-off-by: Qingyuan Hou <lenohou@gmail.com>
2023-04-10 03:34:15 +00:00
alex.lyn
f3595e48b0 nydus_rootfs/prefetch_files: add prefetch_files for RAFS
Add a sandbox annotation that specifies the path of the
prefetch_files.list for the container image being used. The runtime
passes it to the hypervisor to locate the corresponding prefetch file.
The format looks like:
"io.katacontainers.config.hypervisor.prefetch_files.list"
      = /path/to/<uid>/xyz.com/fedora:36/prefetch_file.list

Fixes: #6582

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-04-10 10:05:52 +08:00
Zhongtao Hu
3bfaafbf44 fix: oci hook
1. When deserializing the oci hooks, we should use camel case for
createRuntime.

2. We should pass the bundle directory instead of the path of
config.json.
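
Both points can be illustrated with a short Python sketch (this is not runtime-rs code; the helper names and sample paths are hypothetical). The OCI runtime spec spells the hook list `createRuntime`, and a hook expects the bundle directory rather than the config.json inside it:

```python
import json
import os


def create_runtime_hooks(config_json: str) -> list:
    """Return the createRuntime hooks from an OCI config.json document."""
    spec = json.loads(config_json)
    # The key is camelCase in the spec; "create_runtime" would match nothing.
    return spec.get("hooks", {}).get("createRuntime", [])


def bundle_dir(config_path: str) -> str:
    """The bundle is the directory that contains config.json."""
    return os.path.dirname(config_path)


sample = '{"hooks": {"createRuntime": [{"path": "/usr/bin/example-hook"}]}}'
print(create_runtime_hooks(sample))
print(bundle_dir("/run/example/bundle/config.json"))
```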

Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-04-10 09:53:43 +08:00
Greg Kurz
c1fbaae8d6 rustjail: Use CPUWeight with systemd and CgroupsV2
The CPU shares property belongs to CgroupsV1. CgroupsV2 uses CPU weight
instead. The correct value is computed in the latter case but it is passed
to systemd using the legacy property. Systemd rejects the request and the
agent exits with the following error:

        Value specified in CPUShares is out of range: unknown

Replace the "shares" wording with "weight" in the CgroupsV2 code to
avoid confusion. Use the "CPUWeight" property since this is what
systemd expects in this case.
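
For illustration, the conversion and the property choice can be sketched in Python (the agent's own code is not shown here; `shares_to_weight` and `systemd_cpu_property` are hypothetical names). The formula is the shares-to-weight mapping described in the crun document listed in the references, which maps the CgroupsV1 range [2, 262144] onto the CgroupsV2 range [1, 10000]:

```python
def shares_to_weight(shares: int) -> int:
    """Map CgroupsV1 cpu.shares (2..262144) to CgroupsV2 cpu.weight (1..10000)."""
    if shares == 0:
        return 0  # treated here as "not set"
    return 1 + ((shares - 2) * 9999) // 262142


def systemd_cpu_property(shares: int, cgroups_v2: bool) -> tuple:
    """Pick the systemd property name and value for the cgroup version in use."""
    if cgroups_v2:
        return ("CPUWeight", shares_to_weight(shares))
    return ("CPUShares", shares)


print(systemd_cpu_property(1024, cgroups_v2=True))   # ('CPUWeight', 39)
print(systemd_cpu_property(1024, cgroups_v2=False))  # ('CPUShares', 1024)
```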

Fixes #6636

References:

https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#CPUWeight=weight
https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#systemd%20252
https://github.com/containers/crun/blob/main/crun.1.md#cpu-controller

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-04-07 17:57:26 +02:00
Bo Chen
375187e045 versions: Upgrade to Cloud Hypervisor v31.0
Details of this release can be found in our new roadmap project as
iteration v31.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #6632

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-06 14:35:26 -07:00
Fabiano Fidêncio
79f3047f06 gha: k8s-on-aks: {create,delete} AKS must be a coded-in step
I should have seen this coming, but currently the "create" and "delete"
AKS workflows cannot be imported and used as a job's step, resulting in
an error when trying to find the corresponding action.yaml file for those.

Fixes: #6630

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 22:56:08 +02:00
Fabiano Fidêncio
ee5dda012b Merge pull request #6629 from fidencio/topic/gha-refactor-run-k8s-tests-on-aks
gha: k8s-on-aks: Set {create,delete}_aks as steps
2023-04-06 22:02:34 +02:00
Fabiano Fidêncio
2f35b4d4e5 gha: ci-on-push: Only run on main branch
Let's ensure we're only running this workflow when PRs are opened
against the main branch.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 19:11:24 +02:00
Fabiano Fidêncio
e7bd2545ef Revert "gha: ci-on-push: Depend on Commit Message Check"
This reverts commit a159ffdba7.

Unfortunately we have to revert the PRs related to the switch to
using `workflow_run` instead of `pull_request_target`.  The reason is
that we can only mark jobs as required if they are targeting PRs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 19:11:14 +02:00
Fabiano Fidêncio
0d96d49633 Revert "gha: ci-on-push: Adjust to using workflow_run"
This reverts commit 3a760a157a.

Unfortunately we have to revert the PRs related to the switch to
using `workflow_run` instead of `pull_request_target`.  The reason is
that we can only mark jobs as required if they are targeting PRs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 19:11:06 +02:00
Fabiano Fidêncio
c7ee45f7e5 Revert "gha: ci-on-push: Adapt chained jobs to workflow_run"
This reverts commit 7855b43062.

Unfortunately we have to revert the PRs related to the switch to
using `workflow_run` instead of `pull_request_target`.  The reason is
that we can only mark jobs as required if they are targeting PRs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 19:09:54 +02:00
Fabiano Fidêncio
5d4d720647 Revert "gha: k8s-on-aks: Fix cluster name"
This reverts commit 85cc5bb534.

Unfortunately we have to revert the PRs related to the switch to
using `workflow_run` instead of `pull_request_target`.  The reason is
that we can only mark jobs as required if they are targeting PRs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 19:07:04 +02:00
Fabiano Fidêncio
13d857a56d gha: k8s-on-aks: Set {create,delete}_aks as steps
We're currently using {create,delete}_aks as jobs.  However, this
means that if the tests fail we end up deleting the AKS cluster (as
expected) but have no way to recreate it without re-running all
jobs, which is a waste of resources.

Fixes: #6628

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 16:54:15 +02:00
Fabiano Fidêncio
abaf881f4a Merge pull request #6612 from fidencio/topic/gha-k8s-on-aks-fix-cluster-name
gha: k8s-on-aks: Fix cluster name
2023-04-06 10:48:38 +02:00
alex.lyn
dc6569dbbc runtime-rs/virtio-fs: add support extra handler for cache mode.
Add support for virtiofsd when users specify virtio_fs_extra_args with
"-o cache auto, ...".

Fixes: #6615

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-04-06 16:31:02 +08:00
Fabiano Fidêncio
85cc5bb534 gha: k8s-on-aks: Fix cluster name
This was missed from the last series, as GHA will use the "target
branch" yaml file to start the workflow.

Basically we changed the name of the cluster created to stop relying on
the PR number, as that's not easily accessible on `workflow_run`.

Fixes: #6611

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-06 08:50:07 +02:00
Fabiano Fidêncio
68cb5689f5 Merge pull request #6584 from fidencio/topic/gha-k8s-also-test-dragonball
gha: Also run k8s tests on AKS with dragonball
2023-04-05 22:50:14 +02:00
Fabiano Fidêncio
ae488cc09f Merge pull request #6596 from fidencio/topic/gha-only-push-to-registry-when-merging-content
gha: Only push images to registry after merging a PR
2023-04-05 22:07:13 +02:00
Fabiano Fidêncio
2c38e17ef0 Merge pull request #6607 from fidencio/topic/gha-switch-to-using-a-D4_v5-instance
gha: aks: Use D4s_v5 instance
2023-04-05 22:06:40 +02:00
Archana Shinde
6af52cef3a Merge pull request #6590 from zvonkok/build-kernel-fix
tools: Avoid building the kernel twice
2023-04-05 11:45:59 -07:00
Greg Kurz
a3e3b0591f Merge pull request #6562 from c3d/issue/6561-unwrap-panic
rustjail: Fix panic when cgroup manager fails
2023-04-05 16:58:13 +02:00
James O. D. Hunt
cbe6f04194 Merge pull request #6501 from shippomx/dev_metrics
runtime: add filter metrics with specific names
2023-04-05 15:15:09 +01:00
Fabiano Fidêncio
1688e4f3f0 gha: aks: Use D4s_v5 instance
It's been pointed out that D4s_v5 instances are more powerful than the
D4s_v3 ones, and have the very same price.  With this in mind, let's
switch to the newer machines.

Fixes: #6606

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 16:02:17 +02:00
Fabiano Fidêncio
108d80a86d gha: Add the ability to also test Dragonball
With the changes proposed as part of this PR, an AKS cluster will be
created but no tests will be performed.

The reason we have to do this is that GitHub Actions will only run
the tests using the workflows that are part of the **target** branch,
instead of using the ones coming from the PR, and we haven't yet found
a way to work around this.

Once this commit is in, we'll actually change the tests themselves (not
the yaml files for the actions), as those will be the ones we want, since
the checkout action helps us in this case.

Fixes: #6583

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 15:53:03 +02:00
Fabiano Fidêncio
2550d4462d gha: build-kata-static-tarball: Only push to registry after merge
56331bd7bc overlooked the fact that we
mistakenly tried to push the build containers to the registry for a PR,
rather than doing so only when the code is merged.

As the workflow is now shared between different actions, let's introduce
an input variable to specify which are the cases we actually need to
perform a push to the registry.

Fixes: #6592

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 13:57:26 +02:00
Fabiano Fidêncio
e81b8b8ee5 local-build: build-and-upload-payload is not quay.io specific
Let's just print "to the registry" instead of printing "to quay.io", as
the registry used is not tied to quay.io.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 12:54:44 +02:00
Fabiano Fidêncio
13929fc610 gha: publish-kata-deploy-payload: Improve registry login
Let's only try to login to the registry that's being passed as an input
argument.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 12:54:44 +02:00
Fabiano Fidêncio
41026f003e gha: payload-after-push: Pass registry / repo as inputs
We made registry / repo mandatory, but we only adapted that to the amd64
job.  Let's fix it now and make sure this is also passed to the arm64
and s390x jobs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 12:54:44 +02:00
Fabiano Fidêncio
7855b43062 gha: ci-on-push: Adapt chained jobs to workflow_run
As we're using the `workflow_run` event, the checkout action would
pull the **current target branch** instead of the PR one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 12:54:44 +02:00
Fabiano Fidêncio
3a760a157a gha: ci-on-push: Adjust to using workflow_run
The way previously used to get the PR's commit sha can only be used with
`pull_request*` kind of events.

Let's adapt it to the `workflow_run` now that we're using it.

With this change we ended up dropping the PR number from the tarball
suffix, as that's not straightforward to get and, to be honest, not a
unique differentiator that would justify the effort.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 12:54:44 +02:00
Fabiano Fidêncio
a159ffdba7 gha: ci-on-push: Depend on Commit Message Check
Let's make this workflow dependent on the commit message check, and only
start it if the commit message check passes.

As a side effect, this allows us to run this specific workflow using
secrets, without having to rely on `pull_request_target`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-05 12:54:40 +02:00
Fabiano Fidêncio
8086c75f61 gha: Also run k8s tests on AKS with dragonball
As already done for Cloud Hypervisor and QEMU, let's make sure we can
run the AKS tests using dragonball.

Fixes: #6583

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-04 10:58:47 +02:00
Fabiano Fidêncio
1c6d7cb0f7 Merge pull request #6589 from fidencio/topic/gha-k8s-use-ghcr-instead-of-quay
gha: Use ghcr.io for the k8s CI
2023-04-04 10:48:16 +02:00
Zvonko Kaiser
fe86c08a63 tools: Avoid building the kernel twice
Two different kernel build targets (build, install) both had instructions to
build the kernel, hence the build was executed twice. Install should only do
install and build should only do build.

Fixes: #6588

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-04 05:44:44 +00:00
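The separation described in commit fe86c08a63 can be sketched roughly as below. This is a minimal illustration only: `build_kernel`, `install_kernel`, `main` and the `build_count` counter are hypothetical names, not the actual kernel build script. The point is that each subcommand runs only its own step, so a `build` followed by an `install` compiles once.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: each target performs only its own step.
build_count=0

build_kernel() {
    # Stand-in for the real (expensive) kernel compilation.
    build_count=$((build_count + 1))
    echo "building kernel"
}

install_kernel() {
    # Installs what was already built; it must not trigger another build.
    echo "installing kernel"
}

main() {
    case "$1" in
        build)   build_kernel ;;
        install) install_kernel ;;
        *)       echo "unknown target: $1" >&2; return 1 ;;
    esac
}

main build
main install
echo "kernel builds executed: ${build_count}"
```

If `install_kernel` also called `build_kernel`, the counter would end at 2, which is the double build the commit removes.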
Fabiano Fidêncio
3215860a47 gha: Set ci-on-push to run on pull_request_target
This is less secure than running the PR on `pull_request`, and will
require using an additional `ok-to-test` label to make sure someone
deliberately ran the actions coming from a forked repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-03 20:50:36 +02:00
Fabiano Fidêncio
d17dfe4cdd gha: Use ghcr.io for the k8s CI
Let's switch to using the `ghcr.io` registry for the k8s CI, as this
will save us some trouble when running the CI with PRs coming from forked
repos.

Fixes: #6587

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-03 15:52:33 +02:00
Fabiano Fidêncio
e1f972fb1d Merge pull request #6568 from kata-containers/topic/add-k8s-tests-as-part-of-gha
GHA |Switch "kubernetes tests" from jenkins to GitHub actions
2023-04-03 14:25:35 +02:00
Christophe de Dinechin
b661e0cf3f rustjail: Add anyhow context for D-Bus connections
In cases where the D-Bus connection fails, add a little additional context about
the origin of the error.

Fixes: #6561

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Suggested-by: Archana Shinde <archana.m.shinde@intel.com>
Spell-checked-by: Greg Kurz <gkurz@redhat.com>
2023-04-03 14:09:34 +02:00
Fabiano Fidêncio
60c62c3b69 gha: Remove kata-deploy-test.yaml
This workflow becomes redundant as we're already testing kubernetes
using kata-deploy, and also testing it on AKS.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 21:55:41 +02:00
Fabiano Fidêncio
43894e9459 gha: Remove kata-deploy-push.yaml
This becomes redundant now that its steps are covered as part of the
`ci-on-push.yaml`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 21:55:41 +02:00
Fabiano Fidêncio
cab9ca0436 gha: Add a CI pipeline for Kata Containers
This is the very first step to replacing the Jenkins CI, and I've
decided to start with an x86_64 approach only (although easily
expansible for other arches as soon as they're ready to switch), and to
start running our kubernetes tests (now running on AKS).

Fixes: #6541

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 21:55:41 +02:00
Fabiano Fidêncio
53b526b6bd gha: k8s: Add snippet to run k8s tests on aks clusters
This will be shortly used as part of a newly created GitHub action which
will replace our Jenkins CI.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 21:55:41 +02:00
Fabiano Fidêncio
c444c24bc5 gha: aks: Add snippets to create / delete aks clusters
Those will be shortly used as part of a newly added GitHub action for
running k8s tests on Azure.

They've been created using the secrets we already have exposed as part
of our GitHub, and they follow a similar way to authenticate to Azure /
create an AKS cluster as done in the `/test-kata-deploy` action.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 21:55:41 +02:00
Fabiano Fidêncio
11e0099fb5 tests: Move k8s tests to this repo
The first part of simplifying things to have all our tests using GitHub
actions is moving the k8s tests to this repo, as those will be the first
vict^W targets to be migrated to GitHub actions.

Those tests have been slightly adapted, mainly related to what they load
/ import, so they are more self-contained and do not require us bringing
a lot of scripts from the tests repo here.

A few scripts were also dropped along the way, as we no longer plan to
deploy kubernetes as part of every single run, but rather assume there
will always be k8s running whenever we land to run those tests.

It's important to mention that a few tests were not added here:

* k8s-block-volume:
* k8s-file-volume:
* k8s-volume:
* k8s-ro-volume:
  These tests depend on some sort of volume being created on the
  kubernetes node where the test will run, and this won't fly as the
  tests will run from a GitHub runner, targeting a different machine
  where kubernetes will be running.
  * https://github.com/kata-containers/kata-containers/issues/6566

* k8s-hugepages: This test depends a whole lot on the host where it
  lands and right now we cannot assume anything about that anymore, as
  the tests will run from a GitHub runner, targeting a different
  machine where kubernetes will be running.
  * https://github.com/kata-containers/kata-containers/issues/6567

* k8s-expose-ip: This is simply hanging when running on AKS and has to
  be debugged in order to figure out the root cause of that, and then
  adapted to also work on AKS.
  * https://github.com/kata-containers/kata-containers/issues/6578

Till those issues are solved, we'll keep running a Jenkins job with
those tests to avoid any possible regression.

Last but not least, I've decided to **not** keep the history when
bringing those tests here, otherwise we'd end up heavily polluting the
history of this repo, without any clear benefit in doing so.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 21:55:41 +02:00
David Esparza
5d89d08fc4 Merge pull request #6564 from GabyCT/topic/updateneturl
docs: Update CNM url in networking document
2023-03-31 09:58:55 -06:00
Fabiano Fidêncio
73be4bd3f9 gha: Update actions for release.yaml
checkout@v2 should no longer be used; please see:
https://github.blog/changelog/2022-09-22-github-actions-all-actions-will-begin-running-on-node16-instead-of-node12/

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 13:24:26 +02:00
Fabiano Fidêncio
d38d7fbf1a gha: Remove code duplication from release.yaml
We can easily re-use the newly added build-kata-static-tarball-*.yaml as
part of the release.yaml file.

By doing this we consolidate how we build the components across our
actions.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 13:24:26 +02:00
Fabiano Fidêncio
56331bd7bc gha: Split payload-after-push-*.yaml
Let's split those actions into two different ones:
* Build the kata-static tarball
* Publish the kata-deploy payload

We're doing this as, later in this series, we'll start taking advantage
of both pieces.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-31 13:24:26 +02:00
Gabriela Cervantes
a552a1953a docs: Update CNM url in networking document
This PR updates the url for the Container Network Model
in the network document.

Fixes #6563

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-30 16:20:33 +00:00
Christophe de Dinechin
7796e6ccc6 rustjail: Fix minor grammatical error in function name
Rename `unit_exist` function to `unit_exists` to match English grammar rules.

Fixes: #6561

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2023-03-30 16:13:37 +02:00
Christophe de Dinechin
41fdda1d84 rustjail: Do not unwrap potential error with cgroup manager
There can be an error while connecting to the cgroups manager, for
example an `ENOENT` if a file is not found. Make sure that this is
reported through the proper channels instead of causing a `panic()`
that does not provide much information.

Fixes: #6561

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Reported-by: Greg Kurz <gkurz@redhat.com>
2023-03-30 16:09:13 +02:00
Archana Shinde
07e49c63e1 Merge pull request #6257 from amshinde/kata-ctl-env
kata-ctl: add function to get platform protection.
2023-03-29 11:55:07 -07:00
Archana Shinde
a914283ce0 kata-ctl: add function to get platform protection.
This function checks for TDX, SEV or SNP protection on x86
platforms.

Fixes: #1000

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-03-28 15:40:25 -07:00
Fabiano Fidêncio
245ed2cecf Merge pull request #6536 from gkurz/3.2.0-alpha0-branch-bump
# Kata Containers 3.2.0-alpha0
2023-03-28 16:05:10 +02:00
Wainer Moschetta
d0f79e66b9 Merge pull request #6513 from fidencio/topic/use-kata-deploy-local-build-as-part-of-the-snap-stuff
snap: Build the artefacts using kata-deploy
2023-03-28 09:59:31 -03:00
Miao Xia
0f73515561 runtime: add filter metrics with specific names
The kata monitor metrics API returns a huge response when there are
many containers or sandboxes, which makes it harder to focus on the
metrics we need.

Fixes: #6500

Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>
2023-03-28 14:56:13 +08:00
Greg Kurz
4a246309ee release: Kata Containers 3.2.0-alpha0
- nydus: upgrad to v2.2.0
- osbuilder: Add support for CBL-Mariner
- kata-deploy: Fix bash semantics error
- make only_kata work without -f
- runtime-rs: ch: Implement confidential guest handling
- qemu/arm64: disable image nvdimm once no firmware offered
- static checks workflow improvements
- A couple of kata-deploy fixes
- agent: Bring in VFIO-AP device handling again
- bugfix: set hostname in CreateSandboxRequest
- packaging / kata-deploy builds:  Add the ability to cache and consume cached components
- versions: Update firecracker version
- dependency: update cgroups-rs
- Built-in Sandbox: add more unit tests for dragonball. Part 6
- runtime: add support for Hyper-V
- runtime-rs: update load_config comment
- Add support for ephemeral mounts to occupy entire sandbox's memory
- runtime-rs: fix default kernel location and add more default config paths
- Implement direct-volume commands handler for shim-mgmt
- bugfix: modify tty_win info in runtime when handling ResizePtyRequest
- bugfix: add get_ns_path API for Hypervisor
- runtime-rs: add the missing default trait
- packaging: Simplify get_last_modification()
- utils: Make kata-manager.sh runs checks
- dragonball: support pmu on aarch64
- docs: fix typo in key filename in AWS installation guide
- backport rustjail systemd cgroup fix #6331 to 3.1
- main | kata-deploy: Fix kata deploy arm64 image build error
- workflows: Yet more fixes for publishing the kata-deploy payload after every PR merged
- rustjail: fix cgroup handling in agent-init mode
- runtime/Makefile: Fix install-containerd-shim-v2 dependency
- fix wrong notes for func GetSandboxesStoragePathRust()
- fix(runtime-rs): add exited state to ensure cleanup
- runtime-rs: add oci hook support
- utils: Remove kata-manager.sh cgroups v2 check
- workflows:  Fixes for the `payload-after-push` action
- Dragonball: update dependencies
- workflows: Do not install docker
- workflows: Publish kata-deploy payload after a merge
- src: Fixed typo mod.rs
- actions: Use `git-diff` to get changes in kernel dir
- agent: don't set permission of existing directory in copy_file
- runtime: use filepath.Clean() to clean the mount path
- Upgrade to Cloud Hypervisor v30.0
- feat(runtime): make static resource management consistent with 2.0
- osbuilder: Include minimal set of device nodes in ubuntu initrd
- kata-ctl/exec: add new command exec to enter guest VM.
- kernel: Add CONFIG_SEV_GUEST to SEV kernel config
- runtime-rs: Improve Cloud Hypervisor config handling
- virtiofsd: update to a valid path on ppc64le
- runtime-rs: cleanup kata host share path
- osbuilder: fix default build target in makefile
- devguide: Add link to the contribution guidelines
- kata-deploy: Ensure go binaries can run on Ubuntu 20.04
- dragonball: config_manager: preserve device when update
- Revert "workflows: Push the builder image to quay.io"
- Remove all remaining unsafe impl
- kata-deploy: Fix building the kata static firecracker arm64 package occurred an error
- shim-v2: Bump Ubuntu container image  to 22.04
- packaging: Cache the container used to build the kata-deploy artefacts
- utils: always check some dependencies.
- versions: Use ubuntu as the default distro for the rootfs-image
- github-action: Replace deprecated command with environment file
- docs: Change the order of release step
- runtime-rs: remove unnecessary Send/Sync trait implement
- runtime-rs: Don't build on Power, don't break on Power.
- runtime-rs: handle sys_dir bind volume
- sandbox: set the dns for the sandbox
- packaging/shim-v2: Only change the config if the file exists
- runtime-rs: Add basic CH implementation
- release: Revert kata-deploy changes after 3.1.0-rc0 release

8b008fc743 kata-deploy: fix bash semantics error
74ec38cf02 osbuilder: Add support for CBL-Mariner
ac58588682 runtime-rs: ch: Generate Cloud Hypervisor config for confidential guests
96555186b3 runtime-rs: ch: Honour debug setting
e3c2d727ba runtime-rs: ch: clippy fix
ece5edc641 qemu/arm64: disable image nvdimm if no firmware offered
dd23f452ab utils: renamed only_kata to skip_containerd
59c81ed2bb utils: informed pre-check about only_kata
4f0887ce42 kata-deploy: fix install failing to chmod runtime-rs/bin/*
09c4828ac3 workflows: add missing artifacts on payload-after-push
fbf891fdff packaging: Adapt `get_last_modification()`
82a04dbce1 local-build: Use cached VirtioFS when possible
3b99004897 local-build: Use cached shim v2 when possible
1b8c5474da local-build: Use cached RootFS when possible
09ce4ab893 local-build: Use cached QEMU when possible
1e1c843b8b local-build: Use cached Nydus when possible
64832ab65b local-build: Use cached Kernel when possible
04fb52f6c9 local-build: Use cached Firecracker when possible
8a40f6f234 local-build: Use cached Cloud Hypervisor when possible
194d5dc8a6 tools: Add support for caching VirtioFS artefacts
a34272cf20 tools: Add support for caching shim v2 artefacts
7898db5f79 tools: Add support for caching RootFS artefacts
e90891059b tools: Add support for caching QEMU artefacts
7aed8f8c80 tools: Add support for caching Nydus artefacts
cb4cbe2958 tools: Add support for caching Kernel artefacts
762f9f4c3e tools: Add support for caching Firecracker artefacts
6b1b424fc7 tools: Add support for caching Cloud Hypervisor artefacts
08fe49f708 versions: Adjust kernel names to match kata-deploy build targets
99505c0f4f versions: Update firecracker version
f4938c0d90 bugfix: set hostname
96baa83895 agent: Bring in VFIO-AP device handling again
f666f8e2df agent: Add VFIO-AP device handling
b546eca26f runtime: Generalize VFIO devices
4c527d00c7 agent: Rename VFIO handling to VFIO PCI handling
db89c88f4f agent: Use cfg-if for s390x CCW
68a586e52c agent: Use a constant for CCW root bus path
a8b55bf874 dependency: update cgroups-rs
97cdba97ea runtime-rs: update load_config comment
974a5c22f0 runtime: add support for Hyper-V
40f4eef535 build: Use the correct kernel name
a6c67a161e runtime: add support for ephemeral mounts to occupy entire sandbox memory
844bf053b2 runtime-rs: add the missing default trait
e7bca62c32 bugfix: modify tty_win info in runtime when handling ResizePtyRequest
30e235f0a1 runtime-rs: impl volume-resize trait for sandbox
e029988bc2 bugfix: add get_ns_path API for Hypervisor
42b8867148 runtime-rs: impl volume-stats trait for sandbox
462d4a1af2 workflows: static-checks: Free disk space before running checks
e68186d9af workflows: static-checks: Set GOPATH only once
439ff9d4c4 tools/osbuilder/tests: Remove TRAVIS variable
43ce3f7588 packaging: Simplify get_last_modification()
33c5c49719 packaging: Move repo_root_dir to lib.sh
16e2c3cc55 agent: implement update_ephemeral_mounts api
3896c7a22b protocol: add updateEphemeralMounts proto
23488312f5 agent: always use cgroupfs when running as init
8546387348 agent: determine value of use_systemd_cgroup before LinuxContainer::new()
736aae47a4 rustjail: print type of cgroup manager
dbae281924 workflows: Properly set the kata-tarball architecture
76b4591e2b tools: Adjust the build-and-upload-payload.sh script
cd2aaeda2a kata-deploy: Switch to using an ubuntu image
2d43e13102 docs: fix typo in AWS installation guide
760f78137d dragonball: support pmu on aarch64
9bc7bef3d6 kata-deploy: Fix path to the Dockerfile
78ba363f8e kata-deploy: Use different images for s390x and aarch64
6267909501 kata-deploy: Allow passing BASE_IMAGE_{NAME,TAG}
3443f558a6 nydus: upgrad nydus to v2.2.0
395645e1ce runtime: hybrid-mode cause error in the latest nydusd
f8e44172f6 utils: Make kata-manager.sh runs checks
f31c79d210 workflows: static-checks: Remove TRAVIS_XXX variables
8030e469b2 fix(runtime-rs): add exited state to ensure cleanup
7d292d7fc3 workflows: Fix the path of imported workflows
e07162e79d workflows: Fix action name
dd2713521e Dragonball: update dependencies
bd1ed26c8d workflows: Publish kata-deploy payload after a merge
fea7e8816f runtime-rs: Fixed typo mod.rs
a9e2fc8678 runtime/Makefile: Fix install-containerd-shim-v2 dependency
b6880c60d3 logging: Correct the code notes
12cfad4858 runtime-rs: modify the transfer to oci::Hooks
828d467222 workflows: Do not install docker
4b8a5a1a3d utils: Remove kata-manager.sh cgroups v2 check
2c4428ee02 runtime-rs: move pre-start hooks to sandbox_start
e80c9f7b74 runtime-rs: add StartContainer hook
977f281c5c runtime-rs: add CreateContainer hook support
875f2db528 runtime-rs: add oci hook support
ecac3a9e10 docs: add design doc for Hooks
3ac6f29e95 runtime: clh: Re-generate the client code
262daaa2ef versions: Upgrade to Cloud Hypervisor v30.0
192df84588 agent: always use cgroupfs when running as init
b0691806f1 agent: determine value of use_systemd_cgroup before LinuxContainer::new()
dc86d6dac3 runtime: use filepath.Clean() to clean the mount path
c4ef5fd325 agent: don't set permission of existing directory
3483272bbd runtime-rs: ch: Enable initrd usage
fbee6c820e runtime-rs: Improve Cloud Hypervisor config handling
1bff1ca30a kernel: Add CONFIG_SEV_GUEST to SEV kernel config Adding kernel config to sev case since it is needed for SNP and SNP will use the SEV kernel. Incrementing kernel config version to reflect changes
ad8968c8d9 rustjail: print type of cgroup manager
b4a1527aa6 kata-deploy: Fix static shim-v2 build on arm64
2c4f8077fd Revert "shim-v2: Bump Ubuntu container image  to 22.04"
afaccf924d Revert "workflows: Push the builder image to quay.io"
4c39c4ef9f devguide: Add link to the contribution guidelines
76e926453a osbuilder: Include minimal set of device nodes in ubuntu initrd
697ec8e578 kata-deploy: Fix kata static firecracker arm64 package build error
ced3c99895 dragonball: config_manager: preserve device when update
da8a6417aa runtime-rs: remove all remaining unsafe impl
0301194851 dragonball: use crossbeam_channel in VmmService instead of mpsc::channel
9d78bf9086 shim-v2: Bump Ubuntu container image  to 22.04
3cfce5a709 utils: improved unsupported distro message.
919d19f415 feat(runtime): make static resource management consistent with 2.0
b835c40bbd workflows: Push the builder image to quay.io
781ed2986a packaging: Allow passing a container builder to the scripts
45668fae15 packaging: Use existing image to build td-shim
e8c6bfbdeb packaging: Use existing image to build td-shim
3fa24f7acc packaging: Add infra to push the OVMF builder image
f076fa4c77 packaging: Use existing image to build OVMF
c7f515172d packaging: Add infra to push the QEMU builder image
fb7b86b8e0 packaging: Use existing image to build QEMU
d0181bb262 packaging: Add infra to push the virtiofsd builder image
7c93428a18 packaging: Use existing image to build virtiofsd
8c227e2471 virtiofsd: Pass the expected toolchain to the build container
7ee00d8e57 packaging: Add infra to push the shim-v2 builder image
24767d82aa packaging: Use existing image to build the shim-v2
e84af6a620 virtiofsd: update to a valid path on ppc64le
6c3c771a52 packaging: Add infra to push the kernel builder image
b9b23112bf packaging: Use existing image to build the kernel
869827d77f packaging: Add push_to_registry()
e69a6f5749 packaging: Add get_last_modification()
6c05e5c67a packaging: Add and export BUILDER_REGISTRY
1047840cf8 utils: always check some dependencies.
95e3364493 runtime-rs: remove unnecessary Send/Sync trait implement
a96ba99239 actions: Use `git-diff` to get changes in kernel dir
619ef54452 docs: Change the order of release step
a161d11920 versions: Use ubuntu as the default distro for the rootfs-image
be40683bc5 runtime-rs: Add a generic powerpc64le-options.mk
47c058599a packaging/shim-v2: Install the target depending on the arch/libc
b582c0db86 kata-ctl/exec: add new command exec to enter guest VM.
07802a19dc runtime-rs: handle sys_dir bind volume
04e930073c sandbox: set the dns for the sandbox
32ebe1895b agent: fix the issue of creating the dns file
44aaec9020 github-action: Replace deprecated command with environment file
a68c5004f8 packaging/shim-v2: Only change the config if the file exists
ee76b398b3 release: Revert kata-deploy changes after 3.1.0-rc0 release
bbc733d6c8 docs: runtime-rs: Add CH status details
37b594c0d2 runtime-rs: Add basic CH implementation
545151829d kata-types: Add Cloud Hypervisor (CH) definitions
2dd2421ad0 runtime-rs: cleanup kata host share path
0a21ad78b1 osbuilder: fix default build target in makefile
9a01d4e446 dragonball: add more unit test for virtio-blk device.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-03-28 08:40:06 +02:00
Bin Liu
75987aae72 Merge pull request #6408 from jongwu/nydus_rm_hybrid
nydus: upgrad to v2.2.0
2023-03-28 11:07:56 +08:00
Fabiano Fidêncio
4a95375dc8 Merge pull request #6465 from dallasd1/mariner-rootfs
osbuilder: Add support for CBL-Mariner
2023-03-27 22:18:31 +02:00
Fabiano Fidêncio
43dd4440f4 snap: Build the artefacts using kata-deploy
Our CI and release process are currently taking advantage of the
kata-deploy local build scripts to build the artefacts.

Having snap doing the same is the next logical step, and it will also
help to reduce, by a lot, the CI time as we only build the components
that a PR is touching (otherwise we just pull the cached component).

Fixes: #6514

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-27 17:34:43 +02:00
Fabiano Fidêncio
293119df78 Merge pull request #6515 from xyz-li/main
kata-deploy: Fix bash semantics error
2023-03-24 13:18:10 +01:00
Chelsea Mafrica
bbc699ddd8 Merge pull request #6419 from gabevenberg/containerd-pre-check
make only_kata work without -f
2023-03-23 10:02:32 -07:00
xyz-li
8b008fc743 kata-deploy: fix bash semantics error
The argument of return must be numeric.

Fixes: #6521

Signed-off-by: xyz-li <hui0787411@163.com>
2023-03-23 22:47:54 +08:00
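The rule behind commit 8b008fc743 is that bash's `return` only accepts a numeric exit status, so a function that needs to hand back a string should print it and let the caller capture stdout. The sketch below illustrates this with a hypothetical `get_shim_name` helper; it is not the actual kata-deploy code.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
get_shim_name() {
    local shim="$1"
    if [ -z "${shim}" ]; then
        echo "no shim given" >&2
        # `return "some string"` would fail with "numeric argument required";
        # only an integer exit status is valid here.
        return 1
    fi
    # String results go to stdout, not through `return`.
    echo "kata-${shim}"
    return 0
}

# Capture the string via command substitution and the status via $?.
name="$(get_shim_name qemu)"
status=$?
echo "${name} (status ${status})"
```

The caller gets both pieces of information: the printed value in `name` and the numeric result in `status`.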
James O. D. Hunt
da676872b1 Merge pull request #6439 from jodh-intel/runtime-rs-ch-confidential-guest
runtime-rs: ch: Implement confidential guest handling
2023-03-23 13:01:47 +00:00
Dallas Delaney
74ec38cf02 osbuilder: Add support for CBL-Mariner
Add osbuilder support to build a rootfs and image
based on the CBL-Mariner Linux distro

Fixes: #6462

Signed-off-by: Dallas Delaney <dadelan@microsoft.com>
2023-03-22 11:45:32 -07:00
James O. D. Hunt
ac58588682 runtime-rs: ch: Generate Cloud Hypervisor config for confidential guests
This change provides a preliminary implementation for the Cloud Hypervisor (CH)
feature ([currently disabled](https://github.com/kata-containers/kata-containers/pull/6201))
to allow it to generate the CH configuration for handling confidential guests.

This change also introduces concrete errors using the `thiserror` crate
(see `src/runtime-rs/crates/hypervisor/ch-config/src/errors.rs`) and a
lot of unit tests for the conversion code that generates the CH
configuration from the generic Hypervisor configuration.

Fixes: #6430.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-03-22 14:38:38 +00:00
James O. D. Hunt
96555186b3 runtime-rs: ch: Honour debug setting
Enable Cloud Hypervisor debug based on the specified configuration
rather than hard-coding debug to be disabled.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-03-22 14:38:38 +00:00
James O. D. Hunt
e3c2d727ba runtime-rs: ch: clippy fix
Simplify the code to keep Rust's `clippy` happy.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-03-22 14:38:38 +00:00
James O. D. Hunt
f06f72b5e9 Merge pull request #6467 from jongwu/qemu-uefi-path
qemu/arm64: disable image nvdimm once no firmware offered
2023-03-22 08:43:01 +00:00
Steve Horsman
adaabd141a Merge pull request #6406 from jepio/jepio/static-checks-workflow-improvements
static checks workflow improvements
2023-03-20 17:12:54 +00:00
Wainer Moschetta
20da7f3ec8 Merge pull request #6495 from wainersm/fix-kata-deploy-ci
A couple of kata-deploy fixes
2023-03-20 13:48:02 -03:00
Fabiano Fidêncio
2fe0733dcb Merge pull request #4582 from BbolroC/vfio-ap
agent: Bring in VFIO-AP device handling again
2023-03-20 11:43:13 +01:00
Jianyong Wu
ece5edc641 qemu/arm64: disable image nvdimm if no firmware offered
For now, image nvdimm on qemu/arm64 depends on UEFI/ACPI, so if there
is no firmware offered, it should be disabled.

Fixes: #6468
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-03-20 18:03:05 +08:00
Zhongtao Hu
1e8005ff88 Merge pull request #6477 from openanolis/runtime-rs-hostname
bugfix: set hostname in CreateSandboxRequest
2023-03-20 12:43:29 +08:00
Gabe Venberg
dd23f452ab utils: renamed only_kata to skip_containerd
Renamed for greater clarity as to what that flag does.

Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
2023-03-17 16:09:45 -05:00
Gabe Venberg
59c81ed2bb utils: informed pre-check about only_kata
Passed the only_kata variable through to pre_check, so that with only_kata
set the install is not aborted when containerd is already installed.

fixes #6385

Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
2023-03-17 15:58:57 -05:00
Fabiano Fidêncio
96252db787 Merge pull request #6481 from fidencio/topic/cache-artefacts
packaging / kata-deploy builds:  Add the ability to cache and consume cached components
2023-03-17 20:54:42 +01:00
Wainer dos Santos Moschetta
4f0887ce42 kata-deploy: fix install failing to chmod runtime-rs/bin/*
The kata-deploy install method tried to `chmod +x /opt/kata/runtime-rs/bin/*`,
but /opt/kata/runtime-rs/bin/ does not always exist. For example, the
s390x payload does not build the kernel-dragonball-experimental
artifacts. So let's ensure the dir exists before issuing the command.

Fixes #6494
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-03-17 16:09:21 -03:00
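The guard described in commit 4f0887ce42 can be sketched as follows. This is an assumption-laden illustration rather than the real kata-deploy script: `make_bins_executable` is a hypothetical helper, and the demo uses a temporary directory instead of `/opt/kata`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: only chmod when the directory is actually present.
make_bins_executable() {
    local dir="$1"
    # Some payloads do not ship this directory at all; that is not an error.
    [ -d "${dir}" ] || return 0
    local f
    for f in "${dir}"/*; do
        # An empty directory leaves the glob unexpanded, so check each entry.
        if [ -f "${f}" ]; then
            chmod +x "${f}"
        fi
    done
    return 0
}

# Demo in a scratch directory.
demo_dir="$(mktemp -d)"
mkdir "${demo_dir}/bin"
touch "${demo_dir}/bin/tool"

make_bins_executable "${demo_dir}/bin"       # directory exists: files become executable
make_bins_executable "${demo_dir}/missing"   # directory missing: silently skipped
missing_status=$?

result="not executable"
if [ -x "${demo_dir}/bin/tool" ]; then
    result="executable"
fi
rm -rf "${demo_dir}"
echo "tool is ${result}; missing dir status ${missing_status}"
```

Looping over entries instead of passing the raw glob to `chmod` also avoids a failure when the directory exists but is empty.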
Wainer dos Santos Moschetta
09c4828ac3 workflows: add missing artifacts on payload-after-push
The kata-deploy-ci payloads for amd64 and arm64 were missing the shim-v2
and kernel-dragonball-experimental artifacts.

Fixes #6493
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-03-17 15:31:21 -03:00
Fabiano Fidêncio
fbf891fdff packaging: Adapt get_last_modification()
The function is returning "" when called from the script used to cache
the artefacts and one difference noted between this version and the
already working one from the CCv0 is that we make sure to `pushd
${repo_root_dir}` in the CCv0 version.

Let's give it a try here and see if it solves the issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
82a04dbce1 local-build: Use cached VirtioFS when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
3b99004897 local-build: Use cached shim v2 when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
1b8c5474da local-build: Use cached RootFS when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
09ce4ab893 local-build: Use cached QEMU when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
1e1c843b8b local-build: Use cached Nydus when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
64832ab65b local-build: Use cached Kernel when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
04fb52f6c9 local-build: Use cached Firecracker when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
8a40f6f234 local-build: Use cached Cloud Hypervisor when possible
As we've added the support for caching components, let's use them
whenever those are available.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 16:27:34 +01:00
Fabiano Fidêncio
194d5dc8a6 tools: Add support for caching VirtioFS artefacts
Let's add support for caching VirtioFS artefacts that are generated using
the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:43:01 +01:00
Fabiano Fidêncio
a34272cf20 tools: Add support for caching shim v2 artefacts
Let's add support for caching shim v2 artefacts that are generated using
the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:43:01 +01:00
Fabiano Fidêncio
7898db5f79 tools: Add support for caching RootFS artefacts
Let's add support for caching RootFS artefacts that are generated using
the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:43:01 +01:00
Fabiano Fidêncio
e90891059b tools: Add support for caching QEMU artefacts
Let's add support for caching QEMU artefacts that are generated using
the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:43:01 +01:00
Fabiano Fidêncio
7aed8f8c80 tools: Add support for caching Nydus artefacts
Let's add support for caching Nydus artefacts that are generated using
the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:43:01 +01:00
Fabiano Fidêncio
cb4cbe2958 tools: Add support for caching Kernel artefacts
Let's add support for caching Kernel artefacts that are generated using
the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:43:01 +01:00
Fabiano Fidêncio
762f9f4c3e tools: Add support for caching Firecracker artefacts
Let's add support for caching Firecracker artefacts that are generated
using the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:28:56 +01:00
Fabiano Fidêncio
6b1b424fc7 tools: Add support for caching Cloud Hypervisor artefacts
Let's add support for caching Cloud Hypervisor artefacts that are
generated using the kata-deploy local-build scripts.

Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.

Fixes: #6480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-17 11:28:56 +01:00
Fabiano Fidêncio
08fe49f708 versions: Adjust kernel names to match kata-deploy build targets
Let's adjust the kernel names in versions.yaml so those can match the
names used as part of the kata-deploy local build scripts.

Right now this doesn't bring any benefit nor drawback, but it'll make
our life easier later on in this same series.

Depends-on: github.com/kata-containers/tests#5534

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-17 11:28:56 +01:00
Fabiano Fidêncio
d281d1b90a Merge pull request #6483 from GabyCT/topic/updatefcv
versions: Update firecracker version
2023-03-17 10:37:22 +01:00
Gabriela Cervantes
99505c0f4f versions: Update firecracker version
This PR updates the firecracker version being used in kata containers
versions.yaml

The changes in version 1.3.1 are

Added

- Introduced T2CL (Intel) and T2A (AMD) CPU templates to provide
  instruction set feature parity between Intel and AMD CPUs when using
  these templates.
- Added Graviton3 support (c7g instance type).

Changed

- Improved error message when invalid network backend provided.
- Improved TCP throughput by between 5% and 15% (depending on CPU) by using
  scatter-gather I/O in the net device's TX path.
- Upgraded Rust toolchain from 1.64.0 to 1.66.0.
- Made seccompiler output bit-reproducible.

Fixed

- Fixed feature flags in T2 CPU template on Intel Ice Lake.

Fixes #6482

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-03-16 17:34:33 +00:00
Yushuo
f4938c0d90 bugfix: set hostname
Setting hostname according to the spec.

Fixes: #6247

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-03-16 17:16:06 +08:00
Hyounggyu Choi
96baa83895 agent: Bring in VFIO-AP device handling again
This PR continues the work started in kata-containers#3679.

This generalizes the previous VFIO device handling which only
focuses on PCI to include AP (IBM Z specific).

Fixes: kata-containers#3678
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-03-16 18:14:12 +09:00
Greg Kurz
e6e719699f Merge pull request #6471 from etrunko/main
dependency: update cgroups-rs
2023-03-16 08:01:07 +01:00
QuanweiZhou
56c63a9b1c Merge pull request #6186 from wllenyj/dragonball-ut-6
Built-in Sandbox: add more unit tests for dragonball. Part 6
2023-03-16 11:02:05 +08:00
Jakob Naucke
f666f8e2df agent: Add VFIO-AP device handling
Initial VFIO-AP support (#578) was simple, but somewhat hacky; a
different code path would be chosen for performing the hotplug, and
agent-side device handling was bound to knowing the assigned queue
numbers (APQNs) through some other means; plus the code for awaiting
them was written for the Go agent and never released. This code also
artificially increased the hotplug timeout to wait for the (relatively
expensive, thus limited to 5 seconds at the quickest) AP rescan, which
is impractical for e.g. common k8s timeouts.

Since then, the general handling logic was improved (#1190), but it
assumed PCI in several places.

In the runtime, introduce and parse AP devices. Annotate them as such
when passing to the agent, and include information about the associated
APQNs.

The agent awaits the passed APQNs through uevents and triggers a
rescan directly.

Fixes: #3678
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 10:07:48 +09:00
Jakob Naucke
b546eca26f runtime: Generalize VFIO devices
Generalize VFIO devices to allow for adding AP in the next patch.
The logic for VFIOPciDeviceMediatedType() has been changed and IsAPVFIOMediatedDevice() has been removed.

The rationale for the removal is:

- VFIODeviceMediatedType is divided into 2 subtypes for AP and PCI
- Logic of checking a subtype of mediated device is included in GetVFIODeviceType()
- VFIOPciDeviceMediatedType() can simply fulfill the device addition based
on a type categorized by GetVFIODeviceType()

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 10:06:37 +09:00
Jakob Naucke
4c527d00c7 agent: Rename VFIO handling to VFIO PCI handling
e.g., split_vfio_option is PCI-specific and should instead be named
split_vfio_pci_option. This mutually affects the runtime, most notably
how the labels are named for the agent.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 07:43:39 +09:00
Jakob Naucke
db89c88f4f agent: Use cfg-if for s390x CCW
Uses fewer lines in upcoming VFIO-AP support.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 07:43:39 +09:00
Jakob Naucke
68a586e52c agent: Use a constant for CCW root bus path
The CCW root bus path used a function, as PCI does, but this is not necessary.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 07:43:39 +09:00
Fabiano Fidêncio
814d07af58 Merge pull request #6463 from sprt/sprt/mshv-compat
runtime: add support for Hyper-V
2023-03-15 18:03:25 +01:00
Eduardo Lima (Etrunko)
a8b55bf874 dependency: update cgroups-rs
Huge pages failure with cgroups v2.
https://github.com/kata-containers/cgroups-rs/issues/112

Fixes: #6470

Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>
2023-03-15 12:21:12 -03:00
Chao Wu
530b2a7685 Merge pull request #6458 from openanolis/chao/update_comments
runtime-rs: update load_config comment
2023-03-15 19:32:07 +08:00
Chao Wu
97cdba97ea runtime-rs: update load_config comment
Since the shimv2 create task option is already implemented, we need to update the
corresponding comments.

The ordering is also updated to match the code.

fixes: #3961

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-03-15 14:44:47 +08:00
Eric Ernst
dc42f0a33b Merge pull request #6411 from wlan0/empty-dir
Add support for ephemeral mounts to occupy entire sandbox's memory
2023-03-13 20:07:27 -07:00
Henry Beberman
974a5c22f0 runtime: add support for Hyper-V
This adds /dev/mshv to the list of sandbox devices so that VMMs can
create Hyper-V VMs.

In our testing, this also doesn't error out in case /dev/mshv isn't
present.

Fixes #6454.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-03-13 17:13:51 -07:00
Fabiano Fidêncio
ab0bd7a1ee Merge pull request #6292 from fidencio/topic/runtime-rs-small-fixes
runtime-rs: fix default kernel location and add more default config paths
2023-03-13 16:53:30 +01:00
Fabiano Fidêncio
40f4eef535 build: Use the correct kernel name
When calling `MAKE_KERNEL_NAME` we're considering the default kernel
name will be `vmlinux.container` or `vmlinuz.container`, which is not
the case as the runtime-rs, when used with dragonball, relies on the
`vmlinu[zx]-dragonball-experimental.container` kernel.

Other hypervisors will have to introduce a similar
`MAKE_KERNEL_NAME_${HYPERVISOR}` to adapt this to the kernel they want
to use, similarly to what's already done for the go runtime.

By doing this we also ensure that no changes in the configuration file
will be required to run runtime-rs, with dragonball, as part of our CI
or as part of kata-deploy.

Fixes: #6290

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-13 13:47:20 +01:00
James O. D. Hunt
ae9be1d94b Merge pull request #5840 from tzY15368/feat-runtimers-direct-vol
Implement direct-volume commands handler for shim-mgmt
2023-03-13 07:58:40 +00:00
Chelsea Mafrica
4b877b0a3e Merge pull request #6426 from openanolis/runtime-rs-resize-pty
bugfix: modify tty_win info in runtime when handling ResizePtyRequest
2023-03-10 14:08:41 -08:00
Sidhartha Mani
a6c67a161e runtime: add support for ephemeral mounts to occupy entire sandbox memory
On hotplug of memory as containers are started, remount all ephemeral mounts with size option set to the total sandbox memory

Fixes: #6417

Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>
2023-03-10 13:36:02 -08:00
James O. D. Hunt
99a4eaa898 Merge pull request #6443 from openanolis/runtime-rs-get-netns
bugfix: add get_ns_path API for Hypervisor
2023-03-10 20:16:22 +00:00
Fabiano Fidêncio
44bc222ca4 Merge pull request #5578 from Richardhongyu/main
runtime-rs: add the missing default trait
2023-03-10 18:01:43 +01:00
Li Hongyu
844bf053b2 runtime-rs: add the missing default trait
Some structs in the runtime-rs don't implement Default trait.
This commit adds the missing Default.

Fixes: #5463

Signed-off-by: Li Hongyu <lihongyu1999@bupt.edu.cn>
2023-03-10 08:19:56 +00:00
Yushuo
e7bca62c32 bugfix: modify tty_win info in runtime when handling ResizePtyRequest
Currently, we only create the new exec process in the runtime, which causes an error
when the following requests need to be handled:

- Task: exec process
- Task: resize process pty
- ...

The agent does not call do_exec_process when we handle ExecProcess, so we cannot find
any process information in the guest when we handle ResizeProcessPty, and an error is
reported.

In this commit, the handling is changed as follows:
* Modify the process tty_win information in the runtime
* If the exec process is not running, just return; the actual pty resize will
happen when start_process runs
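
A minimal sketch of the deferred resize described above, not the actual runtime-rs code. `WinSize`, `Status`, `ExecProcess` and the `sent_to_agent` vector (a stand-in for the agent RPC) are hypothetical names used only for illustration:

```rust
/// Terminal window size remembered by the runtime (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct WinSize {
    height: u32,
    width: u32,
}

#[derive(Debug, PartialEq)]
enum Status {
    Created,
    Running,
}

/// Runtime-side view of an exec process.
struct ExecProcess {
    status: Status,
    tty_win: Option<WinSize>,
    /// Stand-in for resize requests forwarded to the agent.
    sent_to_agent: Vec<WinSize>,
}

impl ExecProcess {
    fn new() -> Self {
        ExecProcess {
            status: Status::Created,
            tty_win: None,
            sent_to_agent: Vec::new(),
        }
    }

    /// Always record the new size; only forward it if the process is running.
    fn resize_pty(&mut self, win: WinSize) {
        self.tty_win = Some(win);
        if self.status != Status::Running {
            // Nothing exists in the guest yet, so just return.
            return;
        }
        self.sent_to_agent.push(win);
    }

    /// On start, apply the size that was recorded earlier, if any.
    fn start(&mut self) {
        self.status = Status::Running;
        if let Some(win) = self.tty_win {
            self.sent_to_agent.push(win);
        }
    }
}
```

In this sketch a resize that arrives before `start` is only stored, and the stored size is forwarded once when the process starts.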

Fixes: #6248

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-03-10 14:33:51 +08:00
Tingzhou Yuan
30e235f0a1 runtime-rs: impl volume-resize trait for sandbox
Implements resize-volume handlers in shim-mgmt,
trait for sandbox and add RPC calls to agent.
Note the actual rpc handler for the resize request is currently not
implemented, refer to issue #3694.

Fixes #5369

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2023-03-10 01:27:06 -05:00
Yushuo
e029988bc2 bugfix: add get_ns_path API for Hypervisor
For external hypervisors (qemu, cloud-hypervisor, ...), the namespace in which they
launch the VM is different from that of the internal hypervisor (dragonball). The
CreateContainer hook relies on the netns path, so we add a get_ns_path API.

Fixes: #6442

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-03-10 13:57:00 +08:00
Tingzhou Yuan
42b8867148 runtime-rs: impl volume-stats trait for sandbox
Implements get-volume-stats trait for sandbox,
handler for shim-mgmt and add RPC calls to
agent. Also added type conversions in trans.rs

Fixes #5369

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2023-03-10 00:48:02 -05:00
Jeremi Piotrowski
462d4a1af2 workflows: static-checks: Free disk space before running checks
We've been seeing the 'sudo make test' job occasionally run out of space in
/tmp, which is part of the root filesystem. Removing dotnet and
`AGENT_TOOLSDIRECTORY` frees around 10GB of space and in my tests the job still
has 13GB of space left after running.

Fixes: #6401
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-09 13:30:09 +01:00
Jeremi Piotrowski
e68186d9af workflows: static-checks: Set GOPATH only once
{{ runner.workspace }}/kata-containers and {{ github.workspace }} resolve to
the same value, but they're being used multiple times in the workflow. Remove
multiple definitions and define the GOPATH var at job level once.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-09 13:30:09 +01:00
Jeremi Piotrowski
439ff9d4c4 tools/osbuilder/tests: Remove TRAVIS variable
The last remaining user of the TRAVIS variable in this repo is
tools/osbuilder/tests and it is only used to skip spinning up VMs. Travis
didn't support virtualization and the same is true for github actions hosted
runners. Replace the variable with KVM_MISSING and determine availability of
/dev/kvm at runtime.
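
A minimal sketch of the runtime check described above. The real test suite is a shell script; this Rust version only illustrates the idea, and the function names `kvm_missing` and `should_boot_vm_tests` are hypothetical:

```rust
use std::path::Path;

/// Returns true when the given KVM device node does not exist,
/// i.e. when VM-based tests should be skipped.
fn kvm_missing(kvm_device: &Path) -> bool {
    !kvm_device.exists()
}

/// VM-based tests only make sense when the device node is present.
fn should_boot_vm_tests(kvm_device: &Path) -> bool {
    !kvm_missing(kvm_device)
}
```

A caller would pass `Path::new("/dev/kvm")` and skip spinning up VMs when `kvm_missing` returns true.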

TRAVIS is also used by '.ci/setup.sh' in kata-containers/tests to reduce the
set of dependencies that gets installed, but this is also in the process of
being removed.

Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-09 13:29:49 +01:00
Christophe de Dinechin
7566a7eae4 Merge pull request #6432 from fidencio/topic/simplify-get-last-modification
packaging: Simplify get_last_modification()
2023-03-09 10:57:58 +01:00
Fabiano Fidêncio
43ce3f7588 packaging: Simplify get_last_modification()
There's no need to pass repo_root_dir to get_last_modification() as the
variable used everywhere is exported from that very same file.

Fixes: #6431

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-08 21:22:03 +01:00
Fabiano Fidêncio
33c5c49719 packaging: Move repo_root_dir to lib.sh
This is used in several parts of the code, and can have a single
declaration as part of the `lib.sh` file, which is already imported by
all the places where it's used.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-08 21:10:53 +01:00
James O. D. Hunt
614d1817ce Merge pull request #6410 from tg5788re/kata-manager-use-runtime-checks
utils: Make kata-manager.sh runs checks
2023-03-08 09:55:03 +00:00
Chao Wu
fef268a7de Merge pull request #6413 from xuejun-xj/xuejun/pmu
dragonball: support pmu on aarch64
2023-03-08 14:24:31 +08:00
Steve Horsman
cc1821fb8b Merge pull request #6409 from Sig00rd/patch-1
docs: fix typo in key filename in AWS installation guide
2023-03-07 15:19:46 +00:00
Fabiano Fidêncio
861552c305 Merge pull request #6414 from jepio/jepio/backport-3.1-rustjail-systemd-cgroup-fix-6331
backport rustjail systemd cgroup fix #6331 to 3.1
2023-03-07 12:51:08 +01:00
Sidhartha Mani
16e2c3cc55 agent: implement update_ephemeral_mounts api
- implement update_ephemeral_mounts rpc
- for each mountpoint passed in, remount it with new options

Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>
2023-03-06 13:44:14 -08:00
Sidhartha Mani
3896c7a22b protocol: add updateEphemeralMounts proto
- adds a new rpc call to the agent service named `updateEphemeralMounts`
- this call takes a list of grpc.Storage objects

Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>
2023-03-06 13:43:47 -08:00
Jeremi Piotrowski
23488312f5 agent: always use cgroupfs when running as init
The logic to decide which cgroup driver is used is currently based on the
cgroup path that the host provides. This requires host and guest to use the
same cgroup driver. If the guest uses kata-agent as init, then systemd can't be
used as the cgroup driver. If the host requests a systemd cgroup, this
currently results in a rustjail panic:

  thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: I/O error: No such file or directory (os error 2)

  Caused by:
      No such file or directory (os error 2)', rustjail/src/cgroups/systemd/manager.rs:44:51
  stack backtrace:
     0:     0x7ff0fe77a793 - std::backtrace_rs::backtrace::libunwind::trace::h8c197fa9a679d134
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
     1:     0x7ff0fe77a793 - std::backtrace_rs::backtrace::trace_unsynchronized::h9ee19d58b6d5934a
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
     2:     0x7ff0fe77a793 - std::sys_common::backtrace::_print_fmt::h4badc450600fc417
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5
     3:     0x7ff0fe77a793 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had334ddb529a2169
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22
     4:     0x7ff0fdce815e - core::fmt::write::h1aa7694f03e44db2
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17
     5:     0x7ff0fe74e0c4 - std::io::Write::write_fmt::h61b2bdc565be41b5
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15
     6:     0x7ff0fe77cd3f - std::sys_common::backtrace::_print::h4ec69798b72ff254
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5
     7:     0x7ff0fe77cd3f - std::sys_common::backtrace::print::h0e6c02048dec3c77
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9
     8:     0x7ff0fe77c93f - std::panicking::default_hook::{{closure}}::hcdb7e705dc37ea6e
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22
     9:     0x7ff0fe77d9b8 - std::panicking::default_hook::he03a933a0f01790f
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9
    10:     0x7ff0fe77d9b8 - std::panicking::rust_panic_with_hook::he26b680bfd953008
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13
    11:     0x7ff0fe77d482 - std::panicking::begin_panic_handler::{{closure}}::h559120d2dd1c6180
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13
    12:     0x7ff0fe77d3ec - std::sys_common::backtrace::__rust_end_short_backtrace::h36db621fc93b005a
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18
    13:     0x7ff0fe77d3c1 - rust_begin_unwind
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
    14:     0x7ff0fda52ee2 - core::panicking::panic_fmt::he7679b415d25c5f4
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
    15:     0x7ff0fda53182 - core::result::unwrap_failed::hb71caff146724b6b
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
    16:     0x7ff0fe5bd738 - <rustjail::cgroups::systemd::manager::Manager as rustjail::cgroups::Manager>::apply::hd46958d9d807d2ca
    17:     0x7ff0fe606d80 - <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}::h1de806d91fcb878f
    18:     0x7ff0fe604a76 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1749c148adcc235f
    19:     0x7ff0fdc0c992 - kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}::hc1b87a15dfdf2f64
    20:     0x7ff0fdb80ae4 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h846a8c9e4fb67707
    21:     0x7ff0fe3bb816 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h53de16ff66ed3972
    22:     0x7ff0fdb519cb - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1cbece980286c0f4
    23:     0x7ff0fdf4019c - <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::hc8e72d155feb8d1f
    24:     0x7ff0fdfa5fd8 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h0a407ffe2559449a
    25:     0x7ff0fdf033a1 - tokio::runtime::task::raw::poll::h1045d9f1db9742de
    26:     0x7ff0fe7a8ce2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4924ae3464af7fbd
    27:     0x7ff0fe7afb85 - tokio::runtime::task::raw::poll::h5c843be39646b833
    28:     0x7ff0fe7a05ee - std::sys_common::backtrace::__rust_begin_short_backtrace::ha7777c55b98a9bd1
    29:     0x7ff0fe7a9bdb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h27ec83c953360cdd
    30:     0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hed812350c5aef7a8
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
    31:     0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc7df8e435a658960
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
    32:     0x7ff0fe7801d5 - std::sys::unix::thread::Thread::new::thread_start::h575491a8a17dbb33
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17

Forward the value of "init_mode" to AgentService, so that we can force cgroupfs
when systemd is unavailable.
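
A minimal sketch of the selection logic described above, not the actual rustjail or agent code. `CgroupDriver`, `looks_like_systemd_path` and `select_driver` are hypothetical names, and treating a three-field `slice:prefix:name` path as a systemd request is an assumption made for illustration:

```rust
#[derive(Debug, PartialEq)]
enum CgroupDriver {
    Cgroupfs,
    Systemd,
}

/// Assumption: a host that wants the systemd driver sends a path with
/// three colon-separated fields (slice:prefix:name).
fn looks_like_systemd_path(cgroups_path: &str) -> bool {
    cgroups_path.split(':').count() == 3
}

/// When the agent runs as init there is no systemd in the guest,
/// so cgroupfs is forced regardless of what the host requested.
fn select_driver(cgroups_path: &str, init_mode: bool) -> CgroupDriver {
    if init_mode {
        return CgroupDriver::Cgroupfs;
    }
    if looks_like_systemd_path(cgroups_path) {
        CgroupDriver::Systemd
    } else {
        CgroupDriver::Cgroupfs
    }
}
```

With this shape, the decision is made once from the path and `init_mode`, before the container object is constructed.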

Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-06 20:34:21 +01:00
Jeremi Piotrowski
8546387348 agent: determine value of use_systemd_cgroup before LinuxContainer::new()
Right now LinuxContainer::new() gets passed a CreateOpts struct, but then
modifies the use_systemd_cgroup field inside that struct. Pull the cgroups path
parsing logic into do_create_container, so that CreateOpts can be immutable in
LinuxContainer::new. This is just moving things around, there should be no
functional changes.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-06 20:34:21 +01:00
Jeremi Piotrowski
736aae47a4 rustjail: print type of cgroup manager
Since the cgroup manager is wrapped in a dyn now, the print in
LinuxContainer::new has been useless and just says "CgroupManager". Extend the
Debug trait for 'dyn Manager' to print the type of the cgroup manager so that
it's easier to debug issues.
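
A minimal sketch of implementing `Debug` for a trait object so that the concrete manager type shows up in logs. The trait, the `name` method, the two manager structs and the output format are hypothetical and are not taken from rustjail:

```rust
use std::fmt;

/// Hypothetical stand-in for the cgroup manager trait.
trait Manager {
    /// Short name identifying the concrete manager.
    fn name(&self) -> &str;
}

struct FsManager;
struct SystemdManager;

impl Manager for FsManager {
    fn name(&self) -> &str {
        "cgroupfs"
    }
}

impl Manager for SystemdManager {
    fn name(&self) -> &str {
        "systemd"
    }
}

/// Debug for the trait object itself: delegate to `name()` so the
/// output says which manager is behind the `dyn`.
impl fmt::Debug for dyn Manager {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "CgroupManager({})", self.name())
    }
}
```

Formatting a `Box<dyn Manager>` with `{:?}` then goes through this implementation, because `Box` forwards `Debug` to the value it holds.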

Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-06 20:34:21 +01:00
Fabiano Fidêncio
0749657c73 Merge pull request #6359 from singhwang/main
main | kata-deploy: Fix kata deploy arm64 image build error
2023-03-06 16:48:03 +01:00
Fabiano Fidêncio
dbae281924 workflows: Properly set the kata-tarball architecture
Let's make sure the kata-tarball architecture uploaded / downloaded / used
is exactly the one we need for the architecture we're using to generate
the image.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-06 13:18:51 +01:00
Fabiano Fidêncio
76b4591e2b tools: Adjust the build-and-upload-payload.sh script
Now that we've switched the base container image to using Ubuntu instead
of CentOS, we don't need any kind of extra logic to correctly build the
image for different architectures, as Ubuntu is a multi-arch image that
supports all the architectures we're targeting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-06 13:18:51 +01:00
SinghWang
cd2aaeda2a kata-deploy: Switch to using an ubuntu image
Let's make sure we use a multi-arch image for building kata-deploy.
A few changes were also added in order to get systemd working inside the
kata-deploy image, due to the switch from CentOS to Ubuntu.

Fixes: #6358
Signed-off-by: SinghWang <wangxin_0611@126.com>
2023-03-06 13:18:51 +01:00
Szymon Fugas
2d43e13102 docs: fix typo in AWS installation guide
Fixes referring to previously created key file with .pen extension instead of .pem.

Fixes: #6412
Signed-off-by: Sig00rd <sfugas@virtuslab.com>
2023-03-06 13:18:08 +01:00
xuejun-xj
760f78137d dragonball: support pmu on aarch64
This commit adds support for pmu virtualization on aarch64. The
initialization of pmu is in the following order:
1. Receive pmu parameter(vpmu_feature) from runtime-rs to determine the
VpmuFeatureLevel.
2. Judge whether to initialize pmu devices and add pmu device node into
fdt on aarch64, according to VpmuFeatureLevel.

Fixes: #6168

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-03-06 18:55:13 +08:00
Fabiano Fidêncio
93a40cb35e Merge pull request #6402 from fidencio/topic/yet-more-fixes-for-the-publish-kata-deploy-payload-work
workflows: Yet more fixes for publishing the kata-deploy payload after every PR merged
2023-03-06 10:43:32 +01:00
Fabiano Fidêncio
df35f8f885 Merge pull request #6331 from jepio/jepio/fix-agent-init-cgroups
rustjail: fix cgroup handling in agent-init mode
2023-03-05 20:29:40 +01:00
Fabiano Fidêncio
98d611623f Merge pull request #6361 from etrunko/main
runtime/Makefile: Fix install-containerd-shim-v2 dependency
2023-03-04 13:47:11 +01:00
Fabiano Fidêncio
9bc7bef3d6 kata-deploy: Fix path to the Dockerfile
As part of bd1ed26c8d, we've pointed to
the Dockerfile that's used in the CC branch, which is wrong.

For what we're doing on main, we should be pointing to the one under the
`kata-deploy` folder, and not the one under the non-existent
`kata-deploy-cc` one.

Fixes: #6343

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-04 12:18:38 +01:00
Fabiano Fidêncio
78ba363f8e kata-deploy: Use different images for s390x and aarch64
As the image provided as part of registry.centos.org is not a multi-arch
one, at least not for CentOS 7, we need to expand the script used to
build the image to pass images that are known to work for s390x (ClefOS)
and aarch64 (CentOS, but coming from dockerhub).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-04 12:18:32 +01:00
Fabiano Fidêncio
6267909501 kata-deploy: Allow passing BASE_IMAGE_{NAME,TAG}
Let's break the IMAGE build parameter into BASE_IMAGE_NAME and
BASE_IMAGE_TAG, as it makes it easier to replace the default CentOS
image by something else.

Spoiler alert, the default CentOS image is **not** multi-arch, and we do
want to support at least aarch64 and s390x in the near term future.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-04 12:16:41 +01:00
Jianyong Wu
3443f558a6 nydus: upgrad nydus to v2.2.0
Use the latest nydus so that it may work on arm64.

Fixes: #6407
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-03-04 12:58:48 +08:00
Jianyong Wu
395645e1ce runtime: hybrid-mode cause error in the latest nydusd
When updating nydusd to 2.2, the argument "--hybrid-mode" causes
the following error:

thread 'main' panicked at 'ArgAction::SetTrue / ArgAction::SetFalse is defaulted'

Remove the argument so that nydusd can be upgraded.

Fixes: #6407
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-03-04 12:58:48 +08:00
tg5788re
f8e44172f6 utils: Make kata-manager.sh runs checks
Updated the `kata-manager.sh` script to make it run all the checks on
the host system before attempting to create a container. If any checks
fail, they will indicate to the user what the problem is in a clearer
manner than those reported by the container manager.

Fixes: #6281.

Signed-off-by: tg5788re <jfokugas@gmail.com>
2023-03-03 09:56:12 -06:00
Chelsea Mafrica
ebe916b372 Merge pull request #6355 from yanggangtony/fix-wrong-notes
fix wrong notes for func GetSandboxesStoragePathRust()
2023-03-03 07:55:54 -08:00
Jeremi Piotrowski
f31c79d210 workflows: static-checks: Remove TRAVIS_XXX variables
These variables are unused since we don't use Travis CI. This also allows
removing two steps:

- 'Setup GOPATH' only printed variables
- 'Setup travis reference' modified some shell local variables that don't have
  any influence on the rest of the steps

The TRAVIS var is still used by tools/osbuilder/tests to determine if
virtualization is available.

Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-03-03 11:38:34 +01:00
Zhongtao Hu
60bb9d114a Merge pull request #6399 from yipengyin/fix-cleanup
fix(runtime-rs): add exited state to ensure cleanup
2023-03-03 17:41:16 +08:00
Chao Wu
6fc4c8b099 Merge pull request #5788 from openanolis/runtime-rs-ocihook
runtime-rs: add oci hook support
2023-03-03 01:06:21 +08:00
James O. D. Hunt
4a7a859592 Merge pull request #6377 from pembek01/remove-cgroupsv2-check
utils: Remove kata-manager.sh cgroups v2 check
2023-03-02 17:00:46 +00:00
Fabiano Fidêncio
b20d5289cb Merge pull request #6400 from fidencio/topic/fixes-for-generating-the-kata-deploy-payload
workflows:  Fixes for the `payload-after-push` action
2023-03-02 14:20:24 +01:00
Yipeng Yin
8030e469b2 fix(runtime-rs): add exited state to ensure cleanup
Set the process status to exited at the end of the io wait. This only
indicates that the process has exited; stopping the process has not
finished yet. Without this state, cleanup_container is skipped.

Fixes: #6393

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-03-02 18:14:20 +08:00
Fabiano Fidêncio
7d292d7fc3 workflows: Fix the path of imported workflows
In `payload-after-push.yaml` we ended up mentioning cc-*.yaml workflows,
which are non existent in the main branch.

Let's adapt the name to the correct ones.

Fixes: #6343

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-02 10:18:10 +01:00
Fabiano Fidêncio
e07162e79d workflows: Fix action name
We have a few actions in the `payload-after-push.*.yaml` that are
referring to Confidential Containers, but they should be referring to
Kata Containers instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-02 10:17:18 +01:00
Chao Wu
572c385774 Merge pull request #6269 from openanolis/chao/update_dragonball_version
Dragonball: update dependencies
2023-03-02 17:15:39 +08:00
Fabiano Fidêncio
7286f8f706 Merge pull request #6391 from fidencio/topic/do-not-install-docker-as-part-of-the-actions
workflows: Do not install docker
2023-03-02 10:12:15 +01:00
Fabiano Fidêncio
7201279647 Merge pull request #6344 from fidencio/topic/generate-a-kata-deploy-payload-on-each-PR-merged
workflows: Publish kata-deploy payload after a merge
2023-03-02 09:02:34 +01:00
Chao Wu
dd2713521e Dragonball: update dependencies
Since rust-vmm and dragonball-sandbox have introduced several updates
such as vPMU support for aarch64, we also need to update Dragonball
dependencies to include those changes.

Update:
virtio-queue to v0.6.0
kvm-ioctls to v0.12.0
dbs-upcall to v0.2.0
dbs-virtio-devices to v0.2.0
kvm-bindings to v0.6.0

Also, several aarch64 features are updated because of dependencies
changes:
1. update vcpu hotplug API.
2. update vpmu related API.
3. adjust unit test cases for aarch64 Dragonball.

fixes: #6268

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-03-02 14:53:04 +08:00
Chao Wu
2934ab4a3c Merge pull request #6380 from Christopher-C-Robinson/#6256-typo-fix
src: Fixed typo mod.rs
2023-03-02 14:31:33 +08:00
Fabiano Fidêncio
bd1ed26c8d workflows: Publish kata-deploy payload after a merge
For the architectures we know that `make kata-tarball` works as
expected, let's start publishing the kata-deploy payload after each
merge.

This will help to:
* Easily test the content of current `main` or `stable-*` branch
* Easily bisect issues
* Start providing some sort of CI/CD content pipeline for those who
  need that

This is a forward-port from the `CCv0` branch and groups together
patches that I've worked on with the work that Choi did in order to
support different architectures.

Fixes: #6343

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-03-02 02:19:10 +01:00
Domesticcadiz
fea7e8816f runtime-rs: Fixed typo mod.rs
Fixed the typo in comment in the delete method located in mod.rs file.

Fixes: #6256.

Signed-off-by: Domesticcadiz <christopher.cadiz.robinson@gmail.com>
2023-03-01 18:03:41 -06:00
Archana Shinde
65fa19fe92 Merge pull request #6305 from amshinde/update-action-kernel-check
actions: Use `git-diff` to get changes in kernel dir
2023-03-01 13:46:50 -08:00
Eduardo Lima (Etrunko)
a9e2fc8678 runtime/Makefile: Fix install-containerd-shim-v2 dependency
$ make install
make: *** No rule to make target 'containerd-shim-kata-v2', needed by 'install-containerd-shim-v2'.  Stop.

Spotted when building kata-runtime with a different name for
SHIMV2_OUTPUT. For instance, trying to keep different runtime binaries
installed at the same time, one from master and another from, let's say,
the CCv0 branch, with the following small change applied.

diff --git a/src/runtime/Makefile b/src/runtime/Makefile
index 95efaff78..2bab9eb75 100644
--- a/src/runtime/Makefile
+++ b/src/runtime/Makefile
@@ -231,7 +231,7 @@ SED = sed

 CLI_DIR = cmd
 SHIMV2 = containerd-shim-kata-v2
-SHIMV2_OUTPUT = $(CURDIR)/$(SHIMV2)
+SHIMV2_OUTPUT = $(CURDIR)/$(SHIMV2)-ccv0
 SHIMV2_DIR = $(CLI_DIR)/$(SHIMV2)

 MONITOR = kata-monitor

Fixes: #6398

Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>
2023-03-01 15:57:30 -03:00
yanggang
b6880c60d3 logging: Correct the code notes
Fix wrong notes for func GetSandboxesStoragePathRust()

Fixes: #6394

Signed-off-by: yanggang <gang.yang@daocloud.io>
2023-03-01 19:20:25 +08:00
Yushuo
12cfad4858 runtime-rs: modify the transfer to oci::Hooks
In this commit, we have done:
    * modify the transfer process from grpc::Hooks to oci::Hooks, so the code
      is cleaner
    * add more tests for create_runtime, create_container, start_container hooks

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-03-01 10:35:10 +08:00
Fabiano Fidêncio
828d467222 workflows: Do not install docker
The latest ubuntu runners already have docker installed and trying to
install it manually will cause the following issue:
```
Run curl -fsSL https://test.docker.com/ -o test-docker.sh
Warning: the "docker" command appears to already exist on this system.

If you already have Docker installed, this script can cause trouble, which is
why we're displaying this warning and provide the opportunity to cancel the
installation.

If you installed the current Docker package using this script and are using it
again to update Docker, you can safely ignore this message.

You may press Ctrl+C now to abort this script.
+ sleep 20
+ sudo -E sh -c apt-get update -qq >/dev/null
E: The repository 'https://packages.microsoft.com/ubuntu/22.04/prod jammy Release' is no longer signed.
```

Fixes: #6390

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-28 23:53:28 +01:00
Alec Pemberton
4b8a5a1a3d utils: Remove kata-manager.sh cgroups v2 check
Removed the part in the `kata-manager.sh` script that checks if the host system only runs cgroups v2.

Fixes: #6259.

Signed-off-by: Alec Pemberton <pembek1901@gmail.com>
2023-02-28 11:23:51 -06:00
Steve Horsman
785310fe18 Merge pull request #6368 from yoheiueda/dir-perm
agent: don't set permission of existing directory in copy_file
2023-02-28 14:48:10 +00:00
Chelsea Mafrica
703589c279 Merge pull request #6369 from XDTG/6082/Fix-path-check-bypassed
runtime: use filepath.Clean() to clean the mount path
2023-02-27 17:24:50 -08:00
Bo Chen
ba9227184e Merge pull request #6376 from likebreath/0224/clh_v30.0
Upgrade to Cloud Hypervisor v30.0
2023-02-27 11:48:52 -08:00
Yushuo
2c4428ee02 runtime-rs: move pre-start hooks to sandbox_start
In some cases, network endpoints are configured through a Prestart
Hook, so they may need to be added (hotplugged) after the VM is
started and the Prestart Hook has been executed.

We move the execution of the pre-start hook functions to sandbox_start
so that hooks can run between vm_start and netns_scan, which keeps the
lifecycle API cleaner.

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-02-27 21:56:43 +08:00
Yushuo
e80c9f7b74 runtime-rs: add StartContainer hook
StartContainer hooks are executed in the guest container namespace in Kata.
The hook path of this kind of hook is also in the guest container namespace.

StartContainer is executed after the start operation is called, and it
should be executed before the user-specified command is executed.

Fixes: #5787

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-02-27 21:56:43 +08:00
Yushuo
977f281c5c runtime-rs: add CreateContainer hook support
CreateContainer hook is one kind of OCI hook. In kata, it will be
executed after VM is started, before container is created, and after
CreateRuntime is executed.

The hook path of CreateContainer hook is in host runtime namespace, but
it will be executed in host vmm namespace.

Fixes: #5787

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-02-27 21:56:43 +08:00
Yushuo
875f2db528 runtime-rs: add oci hook support
According to the OCI runtime spec, there can be some hook
operations in the lifecycle of the container. In these hook
operations, the runtime can execute some commands. There are different
points in time in the container lifecycle, and different hook types
can be executed at each of them.

In this commit, we add support for 4 types of hooks (the same as in
runtime-go): Prestart hook, CreateRuntime hook, Poststart hook and
Poststop hook.

Fixes: #5787

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-02-27 21:56:43 +08:00
Yushuo
ecac3a9e10 docs: add design doc for Hooks
Fixes: #5787

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-02-27 21:56:43 +08:00
Bin Liu
e90989b16b Merge pull request #6314 from openanolis/static_doc
feat(runtime): make static resource management consistent with 2.0
2023-02-27 16:43:27 +08:00
Bo Chen
3ac6f29e95 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v30.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #6375

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-02-24 10:20:29 -08:00
Bo Chen
262daaa2ef versions: Upgrade to Cloud Hypervisor v30.0
Details of this release can be found in our new roadmap project as
iteration v30.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #6375

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-02-24 10:19:46 -08:00
Jeremi Piotrowski
192df84588 agent: always use cgroupfs when running as init
The logic to decide which cgroup driver is used is currently based on the
cgroup path that the host provides. This requires host and guest to use the
same cgroup driver. If the guest uses kata-agent as init, then systemd can't be
used as the cgroup driver. If the host requests a systemd cgroup, this
currently results in a rustjail panic:

  thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: I/O error: No such file or directory (os error 2)

  Caused by:
      No such file or directory (os error 2)', rustjail/src/cgroups/systemd/manager.rs:44:51
  stack backtrace:
     0:     0x7ff0fe77a793 - std::backtrace_rs::backtrace::libunwind::trace::h8c197fa9a679d134
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
     1:     0x7ff0fe77a793 - std::backtrace_rs::backtrace::trace_unsynchronized::h9ee19d58b6d5934a
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
     2:     0x7ff0fe77a793 - std::sys_common::backtrace::_print_fmt::h4badc450600fc417
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5
     3:     0x7ff0fe77a793 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had334ddb529a2169
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22
     4:     0x7ff0fdce815e - core::fmt::write::h1aa7694f03e44db2
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17
     5:     0x7ff0fe74e0c4 - std::io::Write::write_fmt::h61b2bdc565be41b5
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15
     6:     0x7ff0fe77cd3f - std::sys_common::backtrace::_print::h4ec69798b72ff254
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5
     7:     0x7ff0fe77cd3f - std::sys_common::backtrace::print::h0e6c02048dec3c77
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9
     8:     0x7ff0fe77c93f - std::panicking::default_hook::{{closure}}::hcdb7e705dc37ea6e
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22
     9:     0x7ff0fe77d9b8 - std::panicking::default_hook::he03a933a0f01790f
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9
    10:     0x7ff0fe77d9b8 - std::panicking::rust_panic_with_hook::he26b680bfd953008
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13
    11:     0x7ff0fe77d482 - std::panicking::begin_panic_handler::{{closure}}::h559120d2dd1c6180
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13
    12:     0x7ff0fe77d3ec - std::sys_common::backtrace::__rust_end_short_backtrace::h36db621fc93b005a
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18
    13:     0x7ff0fe77d3c1 - rust_begin_unwind
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
    14:     0x7ff0fda52ee2 - core::panicking::panic_fmt::he7679b415d25c5f4
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
    15:     0x7ff0fda53182 - core::result::unwrap_failed::hb71caff146724b6b
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
    16:     0x7ff0fe5bd738 - <rustjail::cgroups::systemd::manager::Manager as rustjail::cgroups::Manager>::apply::hd46958d9d807d2ca
    17:     0x7ff0fe606d80 - <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}::h1de806d91fcb878f
    18:     0x7ff0fe604a76 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1749c148adcc235f
    19:     0x7ff0fdc0c992 - kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}::hc1b87a15dfdf2f64
    20:     0x7ff0fdb80ae4 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h846a8c9e4fb67707
    21:     0x7ff0fe3bb816 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h53de16ff66ed3972
    22:     0x7ff0fdb519cb - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1cbece980286c0f4
    23:     0x7ff0fdf4019c - <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::hc8e72d155feb8d1f
    24:     0x7ff0fdfa5fd8 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h0a407ffe2559449a
    25:     0x7ff0fdf033a1 - tokio::runtime::task::raw::poll::h1045d9f1db9742de
    26:     0x7ff0fe7a8ce2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4924ae3464af7fbd
    27:     0x7ff0fe7afb85 - tokio::runtime::task::raw::poll::h5c843be39646b833
    28:     0x7ff0fe7a05ee - std::sys_common::backtrace::__rust_begin_short_backtrace::ha7777c55b98a9bd1
    29:     0x7ff0fe7a9bdb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h27ec83c953360cdd
    30:     0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hed812350c5aef7a8
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
    31:     0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc7df8e435a658960
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
    32:     0x7ff0fe7801d5 - std::sys::unix::thread::Thread::new::thread_start::h575491a8a17dbb33
                                 at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17

Forward the value of "init_mode" to AgentService, so that we can force cgroupfs
when systemd is unavailable.
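
The decision described above can be sketched roughly as follows. This is
an illustration only, not the agent's actual code: `CgroupDriver`,
`looks_like_systemd_path` and `select_cgroup_driver` are hypothetical
names, and the `slice:prefix:name` shape is an assumption about how a
systemd-style cgroup path is recognised.

```rust
/// Cgroup driver the agent would pick for a container (hypothetical enum).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum CgroupDriver {
    Cgroupfs,
    Systemd,
}

/// Returns true if `path` looks like "<name>.slice:<prefix>:<name>".
/// The exact format is an assumption made for this sketch.
pub fn looks_like_systemd_path(path: &str) -> bool {
    let parts: Vec<&str> = path.split(':').collect();
    parts.len() == 3 && parts[0].ends_with(".slice")
}

/// Picks the driver: when the agent runs as init there is no systemd in
/// the guest, so cgroupfs is forced regardless of what the host asked for.
pub fn select_cgroup_driver(cgroups_path: &str, init_mode: bool) -> CgroupDriver {
    if !init_mode && looks_like_systemd_path(cgroups_path) {
        CgroupDriver::Systemd
    } else {
        CgroupDriver::Cgroupfs
    }
}
```

With this shape, a host request for a systemd cgroup degrades to cgroupfs
instead of reaching the systemd manager and panicking.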

Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-02-24 14:02:11 +01:00
Jeremi Piotrowski
b0691806f1 agent: determine value of use_systemd_cgroup before LinuxContainer::new()
Right now LinuxContainer::new() gets passed a CreateOpts struct, but then
modifies the use_systemd_cgroup field inside that struct. Pull the cgroups path
parsing logic into do_create_container, so that CreateOpts can be immutable in
LinuxContainer::new. This is just moving things around, there should be no
functional changes.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-02-24 13:46:37 +01:00
XDTG
dc86d6dac3 runtime: use filepath.Clean() to clean the mount path
Fix the path check bypass issue introduced by #6082:
use filepath.Clean() to clean the path before the check.
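
Why cleaning must come before the check can be illustrated with a small
Rust analogue. This is not the Go code touched by this commit:
`lexical_clean` and `is_under` are hypothetical helpers that only
approximate what filepath.Clean() does (for example, an empty result is
not turned into ".").

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalizes `path` without touching the filesystem:
/// drops "." and resolves ".." against earlier components.
pub fn lexical_clean(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                let last_is_normal =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                let at_root =
                    matches!(out.components().next_back(), Some(Component::RootDir));
                if last_is_normal {
                    out.pop();
                } else if !at_root {
                    // Relative path climbing above its start: keep the "..".
                    out.push("..");
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Checks containment on the cleaned path, so "base/../../etc" cannot
/// pass a naive prefix comparison.
pub fn is_under(base: &Path, candidate: &Path) -> bool {
    lexical_clean(candidate).starts_with(lexical_clean(base))
}
```

A plain string-prefix test on the uncleaned path would accept
"/run/shared/../../etc/passwd" as being under "/run/shared"; the cleaned
form is "/etc/passwd", which is rejected.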

Fixes: #6082

Signed-off-by: XDTG <click1799@163.com>
2023-02-24 15:48:09 +08:00
Yohei Ueda
c4ef5fd325 agent: don't set permission of existing directory
This patch fixes the issue that do_copy_file changes
the directory permission of the parent directory of
a target file, even when the parent directory already
exists.
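
The intended behaviour can be sketched as below. This is a minimal
illustration under assumptions, not the agent's do_copy_file code:
`ensure_parent_dir` and `dir_mode` are hypothetical helpers, and only the
leaf directory gets the requested mode.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Makes sure `dir` exists. The mode is applied only when the directory
/// is created here; an existing directory keeps its permissions.
/// Returns Ok(true) if the directory was created by this call.
pub fn ensure_parent_dir(dir: &Path, mode: u32) -> io::Result<bool> {
    if dir.is_dir() {
        return Ok(false);
    }
    fs::create_dir_all(dir)?;
    fs::set_permissions(dir, fs::Permissions::from_mode(mode))?;
    Ok(true)
}

/// Returns the permission bits of `dir` (lower 9 bits only).
pub fn dir_mode(dir: &Path) -> io::Result<u32> {
    Ok(fs::metadata(dir)?.permissions().mode() & 0o777)
}
```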

Fixes #6367

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2023-02-24 16:43:59 +09:00
Feng Wang
cbe6ad9034 runtime: support non-root for clh
This change enables running the cloud-hypervisor VMM as a non-root user
when the rootless flag is set to true in the configuration.

Fixes: #2567

Signed-off-by: Feng Wang <fwang@confluent.io>
2023-02-22 13:57:09 -08:00
Fabiano Fidêncio
44a780f262 Merge pull request #6262 from jepio/jepio/initrd-dev-nodes
osbuilder: Include minimal set of device nodes in ubuntu initrd
2023-02-22 20:34:13 +01:00
GabyCT
a0b1f81867 Merge pull request #5958 from Apokleos/kata-ctl-exec
kata-ctl/exec: add new command exec to enter guest VM.
2023-02-22 12:07:44 -06:00
Fabiano Fidêncio
109071855d Merge pull request #6124 from Alex-Carter01/snp-kernel-config
kernel: Add CONFIG_SEV_GUEST to SEV kernel config
2023-02-22 18:42:35 +01:00
David Esparza
5e2fe5f932 Merge pull request #6332 from jodh-intel/runtime-rs-ch-config-convert
runtime-rs: Improve Cloud Hypervisor config handling
2023-02-22 10:15:50 -06:00
GabyCT
5c6e56931f Merge pull request #6312 from Amulyam24/virtiofsd-fix
virtiofsd: update to a valid path on ppc64le
2023-02-22 08:57:51 -06:00
James O. D. Hunt
3483272bbd runtime-rs: ch: Enable initrd usage
Allow an initrd/initramfs image to be used with Cloud Hypervisor, which
is handled differently to the default rootfs image type.

Fixes: #6335.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-02-22 10:55:01 +00:00
James O. D. Hunt
fbee6c820e runtime-rs: Improve Cloud Hypervisor config handling
Replace `cloud_hypervisor_vm_create_cfg()` with a set of `TryFrom` trait
implementations in the new CH specific `convert.rs` to allow the generic
`Hypervisor` configuration to be converted into the CH specific
`VmConfig` type.
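
The conversion pattern can be sketched as follows. All types and field
names here are simplified, hypothetical stand-ins (they are not the real
`Hypervisor` or `VmConfig` definitions); the point is only to show how a
set of `TryFrom` implementations composes into one fallible conversion.

```rust
use std::convert::TryFrom;

/// Generic, hypervisor-agnostic settings (hypothetical subset).
#[derive(Debug, Clone)]
pub struct HypervisorConfig {
    pub default_vcpus: u32,
    pub default_memory_mib: u64,
    pub kernel_path: String,
}

/// Simplified CH-flavoured types (hypothetical).
#[derive(Debug, PartialEq)]
pub struct CpusConfig {
    pub boot_vcpus: u8,
}

#[derive(Debug, PartialEq)]
pub struct MemoryConfig {
    pub size_bytes: u64,
}

#[derive(Debug, PartialEq)]
pub struct VmConfig {
    pub cpus: CpusConfig,
    pub memory: MemoryConfig,
    pub kernel: String,
}

impl TryFrom<&HypervisorConfig> for CpusConfig {
    type Error = String;

    fn try_from(cfg: &HypervisorConfig) -> Result<Self, Self::Error> {
        if cfg.default_vcpus == 0 {
            return Err("at least one vCPU is required".to_string());
        }
        // Reject values that do not fit the narrower target type.
        let boot_vcpus = u8::try_from(cfg.default_vcpus)
            .map_err(|_| format!("too many vCPUs: {}", cfg.default_vcpus))?;
        Ok(CpusConfig { boot_vcpus })
    }
}

impl TryFrom<&HypervisorConfig> for MemoryConfig {
    type Error = String;

    fn try_from(cfg: &HypervisorConfig) -> Result<Self, Self::Error> {
        let size_bytes = cfg
            .default_memory_mib
            .checked_mul(1024 * 1024)
            .ok_or_else(|| "memory size overflows u64".to_string())?;
        Ok(MemoryConfig { size_bytes })
    }
}

impl TryFrom<&HypervisorConfig> for VmConfig {
    type Error = String;

    fn try_from(cfg: &HypervisorConfig) -> Result<Self, Self::Error> {
        if cfg.kernel_path.is_empty() {
            return Err("kernel path must not be empty".to_string());
        }
        // Each section is converted by its own TryFrom implementation.
        Ok(VmConfig {
            cpus: CpusConfig::try_from(cfg)?,
            memory: MemoryConfig::try_from(cfg)?,
            kernel: cfg.kernel_path.clone(),
        })
    }
}
```

Keeping one `TryFrom` per section means a bad value is reported at the
point of conversion rather than surfacing later from hard-coded defaults.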

Note that device configuration is not currently handled in `convert.rs`
(it's handled in `inner_device.rs`).

This change removes the old hard-coded CH specific configuration.

Fixes: #6203.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-02-22 10:48:05 +00:00
Chao Wu
578f2e7c2e Merge pull request #6080 from openanolis/rem
runtime-rs: cleanup kata host share path
2023-02-22 17:45:24 +08:00
GabyCT
7aff118c82 Merge pull request #6236 from jepio/jepio/osbuilder-fix-default-make-target
osbuilder: fix default build target in makefile
2023-02-21 17:00:21 -06:00
Alex Carter
1bff1ca30a kernel: Add CONFIG_SEV_GUEST to SEV kernel config
Adding kernel config to sev case since it is needed for SNP and SNP will use the SEV kernel.
Incrementing kernel config version to reflect changes

Fixes: #6123
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2023-02-21 16:48:45 +00:00
GabyCT
fc5c62a5a1 Merge pull request #6330 from c3d/issue/6329-contribution-link-in-devguide
devguide: Add link to the contribution guidelines
2023-02-21 09:17:20 -06:00
Fabiano Fidêncio
ab5b45f615 Merge pull request #6340 from fidencio/topic/ensure-go-binaries-can-still-run-on-ubuntu-2004
kata-deploy: Ensure go binaries can run on Ubuntu 20.04
2023-02-21 13:52:18 +01:00
Zhongtao Hu
4f20cb7ced Merge pull request #6325 from HerlinCoder/herlincoder/config-manager
dragonball: config_manager: preserve device when update
2023-02-21 17:51:41 +08:00
Jeremi Piotrowski
ad8968c8d9 rustjail: print type of cgroup manager
Since the cgroup manager is wrapped in a dyn now, the print in
LinuxContainer::new has been useless and just says "CgroupManager". Extend the
Debug trait for 'dyn Manager' to print the type of the cgroup manager so that
it's easier to debug issues.
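
A minimal sketch of the idea, assuming the trait exposes some way to name
the concrete manager. The `name()` method, the two manager structs and
the output format are assumptions made for illustration, not the actual
rustjail definitions.

```rust
use std::fmt;

/// Minimal stand-in for the cgroup manager trait (hypothetical shape).
pub trait Manager {
    /// Human-readable name of the concrete manager; assumed for this sketch.
    fn name(&self) -> &str;
}

/// Implementing Debug on the trait object lets `{:?}` report which
/// concrete manager sits behind the `dyn`.
impl<'a> fmt::Debug for dyn Manager + 'a {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "CgroupManager({})", self.name())
    }
}

pub struct FsManager;
pub struct SystemdManager;

impl Manager for FsManager {
    fn name(&self) -> &str {
        "cgroupfs"
    }
}

impl Manager for SystemdManager {
    fn name(&self) -> &str {
        "systemd"
    }
}

/// Formats any manager through the trait-object Debug implementation.
pub fn describe(manager: &dyn Manager) -> String {
    format!("{:?}", manager)
}
```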

Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-02-21 10:07:03 +01:00
SinghWang
b4a1527aa6 kata-deploy: Fix static shim-v2 build on arm64
Following Jong Wu's suggestion, let's link /usr/bin/musl-gcc to
/usr/bin/aarch64-linux-musl-gcc.

Fixes: #6320
Signed-off-by: SinghWang <wangxin_0611@126.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-21 10:00:28 +01:00
Fabiano Fidêncio
2c4f8077fd Revert "shim-v2: Bump Ubuntu container image to 22.04"
This reverts commit 9d78bf9086.

Golang binaries are built statically by default, unless linking against
CGO, which we do.  In this case we dynamically link against glibc,
causing us troubles when running a binary built with Ubuntu 22.04 on
Ubuntu 20.04 (which will still be supported for the next few years ...)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-21 10:00:28 +01:00
Fabiano Fidêncio
73d0ca0bd5 Merge pull request #6334 from fidencio/topic/fix-push-to-registry-behaviour
Revert "workflows: Push the builder image to quay.io"
2023-02-21 10:00:13 +01:00
Bin Liu
5c16e98d4f Merge pull request #6322 from Tim-Zhang/remove-remain-unsafe-impl
Remove all remaining unsafe impl
2023-02-21 14:08:05 +08:00
Fabiano Fidêncio
afaccf924d Revert "workflows: Push the builder image to quay.io"
This reverts commit b835c40bbd.

Right now I'm reverting this one as this should only run *after* commits
get pushed to our repo, not on every PR.
2023-02-20 18:37:28 +01:00
Fabiano Fidêncio
b1fd4b093b Merge pull request #6319 from singhwang/main
kata-deploy: Fix error when building the kata static firecracker arm64 package
2023-02-20 18:04:31 +01:00
Christophe de Dinechin
4c39c4ef9f devguide: Add link to the contribution guidelines
New developers are often confused by some of our requirements, notably porting
labels. While our CONTRIBUTING.md file points to the solution, the developer's
guide does not. Add a link there.

Fixes: #6329

Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
2023-02-20 15:27:19 +01:00
Fabiano Fidêncio
a3b615919e Merge pull request #6323 from fidencio/topic/fix-make-shim-v2-tarball-on-aarch64
shim-v2: Bump Ubuntu container image to 22.04
2023-02-20 14:57:34 +01:00
Jeremi Piotrowski
76e926453a osbuilder: Include minimal set of device nodes in ubuntu initrd
When starting an initrd the kernel expects to find /dev/console in the initrd,
so that it can connect it as stdin/stdout/stderr to the /init process. If the
device node is missing the kernel will complain that it was unable to open an
initial console. If kata-agent is the initrd init process, it will also result
in log messages not being logged to console and thus not forwarded to host
syslog.

Add a set of standard device nodes for completeness, so that console logging
works. To do that we install the makedev package which provides a MAKEDEV helper
that knows the major/minor numbers. Unfortunately the debian package tries to
create devnodes from postinst, which can be suppressed if systemd-detect-virt
is present. That's why we create a small dummy script that matches what
systemd-detect-virt would output (anything is enough to suppress mknod).

Fixes: #6261
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-02-20 11:15:56 +01:00
Fabiano Fidêncio
6a0ac2b3a5 Merge pull request #6310 from kata-containers/topic/cache-artefacts-container-builder
packaging: Cache the container used to build the kata-deploy artefacts
2023-02-20 11:02:53 +01:00
James O. D. Hunt
0dea57c452 Merge pull request #6309 from gabevenberg/always-check-deps
utils: always check some dependencies.
2023-02-20 08:31:56 +00:00
SinghWang
697ec8e578 kata-deploy: Fix kata static firecracker arm64 package build error
When building the kata static arm64 package, the stages of firecracker report errors.

Fixes: #6318
Signed-off-by: SinghWang <wangxin_0611@126.com>
2023-02-20 16:10:18 +08:00
Helin Guo
ced3c99895 dragonball: config_manager: preserve device when update
DeviceConfigInfo contains both the config and the device, so when we
want to do an update we can simply update the config part of the info,
and the device is not changed during the update.
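
A rough sketch of that idea follows. Only the `DeviceConfigInfo` name and
its two parts come from the description above; the generic parameters and
the `new` / `update_config` methods are hypothetical.

```rust
/// Pairs a configuration with the (optional) live device built from it.
pub struct DeviceConfigInfo<C, D> {
    pub config: C,
    pub device: Option<D>,
}

impl<C, D> DeviceConfigInfo<C, D> {
    /// Creates an entry that has a config but no device attached yet.
    pub fn new(config: C) -> Self {
        Self { config, device: None }
    }

    /// Replaces only the config; the attached device is left untouched.
    pub fn update_config(&mut self, config: C) {
        self.config = config;
    }
}
```

Replacing the whole entry on update would instead drop the device that
was already attached to it.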

Fixes: #6324

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
2023-02-20 14:34:09 +08:00
Tim Zhang
da8a6417aa runtime-rs: remove all remaining unsafe impl
Fixes: #6307

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-02-20 14:29:59 +08:00
Tim Zhang
0301194851 dragonball: use crossbeam_channel in VmmService instead of mpsc::channel
crossbeam_channel has more features and better performance than
mpsc::channel, and Rust itself replaced its channel implementation with
crossbeam_channel in version 1.67.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-02-20 14:29:57 +08:00
Fabiano Fidêncio
9d78bf9086 shim-v2: Bump Ubuntu container image to 22.04
Let's bump the base container image to use the 22.04 version of Ubuntu,
as it does bring up-to-date package dependencies that we need to
statically build the runtime-rs on aarch64.

Fixes: #6320

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-20 07:14:09 +01:00
Fabiano Fidêncio
299fc35c37 Merge pull request #6304 from fidencio/topic/switch-the-default-x86_64-rootfs-image-to-ubuntu
versions: Use ubuntu as the default distro for the rootfs-image
2023-02-17 19:29:10 +01:00
Gabe Venberg
3cfce5a709 utils: improved unsupported distro message.
Previously, when installing on an unknown distro, the script would tell
the user that their distro was unsupported. Changed the error message to
prompt the user to install the dependencies manually, then retry.

Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
2023-02-17 09:06:26 -06:00
Bin Liu
f44dae75c9 Merge pull request #6267 from jongwooo/github-action/replace-deprecated-command-with-environment-file
github-action: Replace deprecated command with environment file
2023-02-17 22:54:12 +08:00
Fabiano Fidêncio
6a29088b81 Merge pull request #6298 from amshinde/update-release-doc
docs: Change the order of release step
2023-02-17 15:46:12 +01:00
Ji-Xinyou
919d19f415 feat(runtime): make static resource management consistent with 2.0
* add doc in the configuration
* make entry consistent with 2.0

Fixes: #6313
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2023-02-17 21:36:56 +08:00
Bin Liu
b7fe29f033 Merge pull request #6308 from Tim-Zhang/remove-unnecessary-send-and-sync
runtime-rs: remove unnecessary Send/Sync trait implement
2023-02-17 19:53:54 +08:00
Fabiano Fidêncio
b835c40bbd workflows: Push the builder image to quay.io
Let's push the builder images to a registry, so we can take advantage of
those on each step of our building process.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
781ed2986a packaging: Allow passing a container builder to the scripts
This, combined with the effort of caching builder images *and* only
performing the build itself inside the builder images, is the very first
step for reproducible builds for the project.

Reproducible builds are quite important when we talk about Confidential
Containers, as users may want to verify the content used / provided by
the CSPs, and this is the first step towards that direction.

Fixes: #5517

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
45668fae15 packaging: Use existing image to build td-shim
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the td-shim.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
e8c6bfbdeb packaging: Use existing image to build td-shim
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the td-shim.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
3fa24f7acc packaging: Add infra to push the OVMF builder image
Let's add the needed infra for building and pushing the OVMF builder
image to the Kata Containers' quay.io registry.

Fixes: #5477

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
f076fa4c77 packaging: Use existing image to build OVMF
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for OVMF.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
c7f515172d packaging: Add infra to push the QEMU builder image
Let's add the needed infra for only building and pushing the QEMU
builder image to the Kata Containers' quay.io registry.

Fixes: #5481

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
fb7b86b8e0 packaging: Use existing image to build QEMU
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for QEMU.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
d0181bb262 packaging: Add infra to push the virtiofsd builder image
Let's add the needed infra for only building and pushing the virtiofsd
builder image to the Kata Containers' quay.io registry.

Fixes: #5480

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
7c93428a18 packaging: Use existing image to build virtiofsd
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the virtiofsd.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
8c227e2471 virtiofsd: Pass the expected toolchain to the build container
Let's ensure we're building virtiofsd with a specific toolchain that's
known to not cause any issues, instead of always using the latest one.

On each bump of the virtiofsd, we'll make sure to adjust this according
to what's been used by the virtiofsd community.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:48 +01:00
Fabiano Fidêncio
7ee00d8e57 packaging: Add infra to push the shim-v2 builder image
Let's add the needed infra for only building and pushing the shim-v2
builder image to the Kata Containers' quay.io registry.

Fixes: #5478

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:47 +01:00
Fabiano Fidêncio
24767d82aa packaging: Use existing image to build the shim-v2
Let's try to pull a pre-existing image, instead of building our own, to
be used as a builder for the shim-v2.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 12:06:24 +01:00
Amulyam24
e84af6a620 virtiofsd: update to a valid path on ppc64le
Currently the symbolic link for virtiofsd which is used as
a valid path is not updated on every CI run. Fix it by
using the actual path of installation.

Fixes: #6311

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-02-17 16:22:39 +05:30
Fabiano Fidêncio
6c3c771a52 packaging: Add infra to push the kernel builder image
Let's add the needed infra for only building and pushing the kernel
builder image to the Kata Containers' quay.io registry.

Fixes: #5476

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 11:30:28 +01:00
Fabiano Fidêncio
b9b23112bf packaging: Use existing image to build the kernel
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the kernel.

This will save us some CI time.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 11:30:28 +01:00
Fabiano Fidêncio
869827d77f packaging: Add push_to_registry()
This function will push a specific tag to a registry, whenever the
PUSH_TO_REGISTRY environment variable is set, otherwise it's a no-op.

This will be used in the future to avoid replicating that logic in every
builder used by the kata-deploy scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 11:30:21 +01:00
Fabiano Fidêncio
e69a6f5749 packaging: Add get_last_modification()
Let's add a function to get the hash of the last commit modifying a
specific file.

This will help to avoid writing `git rev-list ...` into every single
build script used by the kata-deploy.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 10:39:33 +01:00
Fabiano Fidêncio
6c05e5c67a packaging: Add and export BUILDER_REGISTRY
BUILDER_REGISTRY, which points to quay.io/kata-containers/builder, will be
used for storing the builder images used to build the artefacts via the
kata-deploy scripts.

The plan is to tag, whenever it's possible and makes sense, images like:
* ${BUILDER_REGISTRY}:${component}-${unique_identifier}
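
The tag scheme above can be illustrated with a tiny helper. The packaging
scripts themselves are shell; this Rust snippet is only a sketch of the
naming rule, and `builder_image` is a hypothetical function.

```rust
/// Registry named in the commit message above.
pub const BUILDER_REGISTRY: &str = "quay.io/kata-containers/builder";

/// Builds "<registry>:<component>-<unique_identifier>".
pub fn builder_image(component: &str, unique_identifier: &str) -> String {
    format!("{}:{}-{}", BUILDER_REGISTRY, component, unique_identifier)
}
```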

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-17 10:39:33 +01:00
Fabiano Fidêncio
bd9af5569f Merge pull request #6296 from fidencio/topic/dont-build-runtime-rs-for-ppc64le-2nd-try
runtime-rs: Don't build on Power, don't break on Power.
2023-02-17 10:08:39 +01:00
Gabe Venberg
1047840cf8 utils: always check some dependencies.
Every dependency in check_deps is used inside the script (apart from
git, which may be a historical artifact), and therefore should be
checked even when the -f option is passed to the script. Simply changed
at what point check_deps is called in order to always run it.

Fixes #6302.

Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
2023-02-16 23:00:19 -06:00
Tim Zhang
95e3364493 runtime-rs: remove unnecessary Send/Sync trait implement
Send and Sync are automatically derived traits: if a type is composed
entirely of Send or Sync types, then it is itself Send or Sync.
Almost all primitives are Send and Sync,
so we don't need to implement them manually most of the time.

Fixes: #6307

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-02-17 11:51:13 +08:00
Archana Shinde
a96ba99239 actions: Use git-diff to get changes in kernel dir
Use `git-diff` instead of legacy `git-whatchanged` to get
differences in the packaging/kernel directory. This also fixes
a bug by grepping for the kernel directory in the output of the
git command.
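
The approach can be sketched as follows. The helper name and the base ref in the usage comment are assumptions, not the exact commands used by the workflow.

```shell
# Directory of interest, as named in this commit message.
KERNEL_DIR="${KERNEL_DIR:-packaging/kernel}"

# Hypothetical helper: read changed paths on stdin and print only
# those that lie in the kernel directory.
filter_kernel_changes() {
    grep -F "${KERNEL_DIR}/" || true
}

# Typical use inside a git checkout (base ref is an assumption):
#   git diff --name-only origin/main...HEAD | filter_kernel_changes
```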

Fixes: #6210

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-16 17:33:41 -08:00
Archana Shinde
619ef54452 docs: Change the order of release step
When a new stable branch is created, it is necessary to change the
references in the tests repo from main to the new stable branch.

However this step needs to be performed after the repos have been tagged
as the `tags_repos.sh` script is the one that creates the new branch.
Clarify this in the documentation and move the step to change branch
references in test repo after repos have been tagged.

Fixes: #1824

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-16 12:12:21 -08:00
Fabiano Fidêncio
a161d11920 versions: Use ubuntu as the default distro for the rootfs-image
Currently ubuntu is already the default distro for all the architectures
but x86_64, which uses clearlinux.  However, our CI does *not* test the
clearlinux image we ship.

Taking a look at our CI code [0], we've been using ubuntu as base for
the tests for a few years already, if not forever.

The minimum we can do is to switch to distributing ubuntu, as the tested
rootfs-image, and then decide later on whether we should switch back to
clearlinux (once we switch our CI to using that, and make sure all tests
will be green), or if we move to a slimmer distro, such as alpine.

[0]: 0a39dd1a01/.ci/install_kata_image.sh (L44)

Fixes: #6303

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-16 20:30:40 +01:00
Fabiano Fidêncio
be40683bc5 runtime-rs: Add a generic powerpc64le-options.mk
There's a check in the runtime-rs Makefile that basically checks whether
the `arch/$arch-options.mk` exists or not and, if it doesn't, the build
is just aborted.

With this in mind, let's create a generic powerpc64le-options.mk file
and not bail when building for this architecture.

Fixes: #6142

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-16 16:29:24 +01:00
Fabiano Fidêncio
47c058599a packaging/shim-v2: Install the target depending on the arch/libc
In the `install_go_rust.sh` file we're adding a
x86_64-unknown-linux-musl target unconditionally.  That should be,
instead, based on the ARCH of the host and the appropriate LIBC to be
used with that host.
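
One way to derive the target from the architecture and libc is sketched below. The helper name `rust_target` is hypothetical, and the choice of libc per architecture is left to the caller since it is not spelled out here.

```shell
# Hypothetical helper: build a Rust target triple from ARCH and LIBC.
rust_target() {
    arch="$1"
    libc="$2"
    # Rust calls 64-bit little-endian Power "powerpc64le".
    case "${arch}" in
        ppc64le) arch="powerpc64le" ;;
    esac
    printf '%s-unknown-linux-%s\n' "${arch}" "${libc}"
}

# Example use, assuming rustup is installed:
#   rustup target add "$(rust_target "$(uname -m)" musl)"
```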

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-16 16:29:24 +01:00
Fabiano Fidêncio
c1602c848a Merge pull request #6300 from openanolis/footloose
runtime-rs: handle sys_dir bind volume
2023-02-16 12:53:15 +01:00
alex.lyn
b582c0db86 kata-ctl/exec: add new command exec to enter guest VM.
This patch set helps users easily enter the guest VM through the debug
console socket.

To enter the guest VM smoothly, users need to configure one of the
following options:
(1) Set debug_console_enabled = true with default vport 1026.
(2) Or add agent.debug_console agent.debug_console_vport=<PORT>
into kernel_params, where <PORT> is the vport you set.

The detail of usage:
$ kata-ctl exec -h
kata-ctl-exec
Enter into guest VM by debug console

USAGE:
kata-ctl exec [OPTIONS] <SANDBOX_ID>

ARGS:
<SANDBOX_ID> pod sandbox ID

Fixes: #5340

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-02-16 17:05:53 +08:00
Yushuo
07802a19dc runtime-rs: handle sys_dir bind volume
In some cases, users mount system directories as bind volumes.
We should not bind mount these kinds of directories on the host, as it
does not make sense.

Fixes: #6299

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-02-16 15:45:33 +08:00
Bin Liu
629a31ec6e Merge pull request #6287 from lifupan/main
sandbox: set the dns for the sandbox
2023-02-16 15:00:01 +08:00
Fabiano Fidêncio
f5b28736ce Merge pull request #6294 from fidencio/topic/only-change-configs-if-the-config-files-exist
packaging/shim-v2: Only change the config if the file exists
2023-02-16 07:13:28 +01:00
Fupan Li
04e930073c sandbox: set the dns for the sandbox
The Rust agent already supports setting the guest DNS
server in the start sandbox request, so add the DNS
settings on the runtime side.

Fixes:#6286

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2023-02-16 11:25:02 +08:00
Fupan Li
32ebe1895b agent: fix the issue of creating the dns file
We should make sure the parent directory of the DNS
source file exists; otherwise, creating the file
directly would fail.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2023-02-16 11:24:54 +08:00
Peng Tao
139ad8e95f Merge pull request #6201 from jodh-intel/runtime-rs-add-cloud-hypervisor
runtime-rs: Add basic CH implementation
2023-02-16 11:23:04 +08:00
Archana Shinde
eba2bb275d Merge pull request #6284 from amshinde/revert-kata-deploy-changes-after-3.1.0-rc0-release
release: Revert kata-deploy changes after 3.1.0-rc0 release
2023-02-15 14:50:12 -08:00
Archana Shinde
4a35d5fa6e Merge pull request #6283 from amshinde/3.1.0-rc0-branch-bump
# Kata Containers 3.1.0-rc0
2023-02-15 13:00:43 -08:00
Chelsea Mafrica
f9db0c5a86 Merge pull request #6285 from cmaf/assisted-pr-4216
Assisted PR | docs: Update how-to-use-kata-containers-with-firecracker.md
2023-02-15 09:40:01 -08:00
jongwooo
44aaec9020 github-action: Replace deprecated command with environment file
In workflows, the `set-output` command is deprecated and will be disabled soon.
This commit replaces the deprecated `set-output` command with writing a
value to the environment file `$GITHUB_OUTPUT`.
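
The replacement pattern looks roughly like this. The helper name `set_step_output` and the output name in the usage note are illustrative, not taken from the workflow files.

```shell
# Old, deprecated form (for comparison only):
#   echo "::set-output name=<name>::<value>"

# Hypothetical helper: append name=value to the file that the
# GITHUB_OUTPUT environment variable points to.
set_step_output() {
    printf '%s=%s\n' "$1" "$2" >> "${GITHUB_OUTPUT}"
}
```

A step would then call something like `set_step_output pr_number 1234`, and later steps can read the value through the step's outputs.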

Fixes #6266

Signed-off-by: jongwooo <jongwooo.han@gmail.com>
2023-02-16 01:41:03 +09:00
Hyounggyu Choi
a68c5004f8 packaging/shim-v2: Only change the config if the file exists
Let's not try to sed a file that doesn't exist, which may be the case
depending on the architecture we're building the shim-v2 for.
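
A minimal sketch of the guard, assuming a hypothetical helper `sed_if_exists`; the real script's sed expressions and file paths are not shown here.

```shell
# Hypothetical helper: apply a sed expression in place, but only when
# the target file actually exists.
sed_if_exists() {
    if [ -f "$2" ]; then
        sed -i -e "$1" "$2"
    fi
}
```

When the file is missing, the helper returns success without touching anything, so the build can continue on architectures that do not ship that config.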

This is a partial-forward port of
f24c47ea47.

Fixes: #6293

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-02-15 17:00:53 +01:00
Willem Dendauw
9304889330 docs: Update how-to-use-kata-containers-with-firecracker.md
Removed the `` around containerd, because when you execute this as a
script it runs the containerd command within the script, which it should
not do.

Fixes #4217

Signed-off-by: Willem Dendauw <willem.dendauw@hotmail.com>
2023-02-14 15:53:26 -08:00
Archana Shinde
ee76b398b3 release: Revert kata-deploy changes after 3.1.0-rc0 release
As 3.1.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-14 15:47:51 -08:00
Archana Shinde
5988199ada release: Kata Containers 3.1.0-rc0
- kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile
- runtime: tracing: Fix missing ctx return
- runtime: add reconnect timeout for vhost user block
- SEV: Update ReducedPhysBits
- shim-v2/build.sh: Only build runtime-rs for the supported arches
- kata-ctl: Expand unit tests for CPU check
- runtime: support cgroup v2 metrics marshal guest metrics
- Typo: change tabs in comment to spaces
- rootfs: support EROFS filesystem
- versions: Update runc version
- runtime: Improve documentation of appendFDs
- Minor cleanups in make file
- main | docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md
- Action check kernel config version
- clh: Enforce API timeout only for vm.boot request
- virtiofsd: change cache mod to const
- runtime-rs: ignore "no such process" error when delete cgroup for a thread to let it go
- kernel: Add console kernel config for s390
- runtime: remove not used shim configurations
- improvement: Fix naming conventions for span name and log subsystem
- Dragonball: add cpu resize ability
- arm64/CI: fix unit test failure on arm64
- CI: Make docker version stick to v20.10 in ubuntu:20.04 for s390x|ppc64le
- virtiofsd: fix the build on ppc64le
- runtime:fix stat uds path
- cni: Update cni plugins version to 1.2.0
- Built-in Sandbox: add more unit tests for dragonball. Part 5
- runtime: Drop QEMU log file support
- docs: Add documentation for building agent with seccomp support.
- Add kernel-dragonball-experimental to kata-deploy, kata-deploy-test, and the release
- runtime-rs: add missing config section for share-fs
- runtime: Add hmp for qemu
- upcall: add document for upcall
- runtime: Start QEMU undaemonized and get logs
- docs: Update url link in QAT documentation
- versions: update cni plugins version
- versions: Upgrade to Cloud Hypervisor v29.0
- runtime: Use consts in `kata-runtime check`
- versions: Bump QEMU to v7.2.0
- agent: Eliminate unnecessary metrics
- runtime:all APIs are hang in the service.mu
- Utility functions for kata-env
- versions: Update conmon version
- runtime: pass enablevhostuserstore annotation to hypervisor config
- runk: Upgrade liboci-cli to v0.0.4
- runtime: use system pagesize for hugepage test
- dependency: update cgroups-rs
- runtime: Use git rev-parse for the kata-monitor tag
- virtcontainers: split out linux-specific bits for mount, factory
- Add darwin skeletons
- vendor: revendor netlink to get latest
- Address issues with the initial vCPU pinning functionality
- virtcontainers: Fix misspelling in error message
- runtime: add test generated file to .gitignore
- runtime: fix up disable_netns handling
- docs: add hint of probing loop module
- tools: add --locked option for cargo install
- runtime-rs: add Single Container support
- virtcontainers: tests: Ensure Linux specific tests are just run on Linux
- Change cache mode from none to never
- tools: Fix indentation for setup aks script
- virtcontainers: fs_share: Add Darwin skeleton
- virtcontainers: Add a Virtualization.framework skeleton
- kata-ctl: remove get_kata_version_by_url function
- kata-ctl: fix build error on s390x
- virtcontainers: Introduce hypervisor_darwin
- runtime: Define Darwin handled signals list
- nydus: net-ns handling needs to be only executed on Linux hosts
- clh: Ensure it works with Docker / Moby
- agent: refactor guest hooks
- fix moby prestart hook handling
- schedcore: Make buildable on !linux
- Built-in Sandbox: add more unit tests for dragonball. Part 4
- runtime-rs: cleanup the run dir of hypervisor when shut down
- Feat: implementation of kata-ctl direct-volume operations
- Runtime: Clarify mutability of global var
- kata-runtime: add rust runtime path for kata-runtime exec
- versions: Upgrade to Cloud Hypervisor v28.1
- runtime-rs: add dbs-upcall feature
- runtime/Makefile: Get some bits happy on darwin
- docs: remove old and misleading instructions for minikube
- packaging: fix indents in build-kernel.sh
- kernel: adding kmod to do docker env
- versions: Update the rust toolchain to 1.66.0
- kata-ctl: skip test if access GitHub.com fail
- agent: unset `CC` for cross-build
- runtime-rs: enable hugepage
- runtime-rs: Clean up mount points shared to guest
- kata-ctl: fix checkcpu bug in non-x86 arches

d144ded12 release: Adapt kata-deploy for 3.1.0-rc0
8e3863cec kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile
c45391991 runtime: tracing: Fix missing ctx return
4139d68d5 runtime-rs: Include target install in conditional branch
ca02c9f51 runtime: add reconnect timeout for vhost user block
2f5bc0f40 kata-ctl: Expand unit tests for CPU check
67b8f0773 SEV: Update ReducedPhysBits
bdf20b5d2 rootfs: support EROFS filesystem
fff0e50a7 versions: Update runc version
ed02c8a05 docs: add guide for building rootfs with EROFS
01765e173 runtime: support cgroup v2 metrics marshal guest metrics
49326fe4e fix(clippy): fix hypervisor clippy checks
94b1d9814 cargo: Update Cargo.lock files
f1855594a make: Get rid of verbose output while creating tar
c3836010a make: clean up obsolete targets
ac64b021a clh: Enforce API timeout only for vm.boot request
56071c6e7 virtiofsd: change cache mod to const
5d37d31ac cgroups: upgrade cgroupfs to 0.3.1
ab59a65c9 runtime-rs: neglect a certain error when delete cgroup
390916b33 runtime: remove not used shim configurations
9794c52c6 improvement: Fix naming conventions for span name and log subsystem
f49b89b63 CI: Set docker version to v20.10 in ubuntu:20.04 for s390x|ppc64le
3c24e2340 README: Update Readme under packaging/kernel
d73f3a8a2 github-action: Add step to verify kernel config version id updated
59f104c02 runtime: skip unit test that fail regularly on aarch64
b7dd97cac kata-ctl: fix permission deny issue in test_add_remove
57c5e5629 Dragonball: add cpu resize ability
3c48f2202 runtime: Improve documentation of appendFDs
856ab6687 virtiofsd: fix the build on ppc64le
f83115a83 docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md
e071d9251 Typo: change tabs in comment to spaces
56f0a27fe kernel: Add console kernel config for s390
334c4b8bd runtime: Drop QEMU log file support
3a63e3c1f cni: Update cni plugins version to 1.2.0
510798155 dragonball: Improve test cases
dc90c6e30 dragonball: add more unit test for vm
c07135535 runtime-rs: Improve s390x error message
4e2db96ef runtime-rs: Don't try to build on Power
8e8c720d5 kata-deploy-push: Ensure we build Dragonball specific kernel
1e531b44d runtime:fix stat uds path
9092c23a2 runtime: Add hmp for qemu
b7f4e96ff kata-deploy-test: Ensure we build dragonball specific kernel
063dec37c release: Add the dragonball-experimental kernel
0b3c91d2a kata-deploy: Add kernel-dragonball-experimental target
00dcd900f docs: Add documentation for building agent with seccomp support.
2b779cba0 docs: Update url link in QAT documentation
39fe4a4b6 runtime: Collect QEMU's stderr
a5319c6be runtime: Start QEMU undaemonized
bf4e3a618 runtime: Launch QEMU with cmd.Start()
8a1723a5c runtime: Pre-establish the QMP connection
8a4f08cb0 govmm: Optionally pass QMP listener to QEMU
219bb8e7d govmm: Optionally start QMP with a pre-configured connection
a85d0e465 versions: update cni plugins version
676d02850 versions: Bump QEMU to v7.2.0
861c38b6a versions: Upgrade to Cloud Hypervisor v29.0
ba87e0afe runtime: Use consts in `kata-runtime check`
9f490d16f upcall: add document for upcall
596037e20 versions: Update conmon version
095e8fdef runk: Use the original Kill command instead of the customed it.
0f9e23a3d runk: Upgrade liboci-cli to v0.0.4
69fc8de71 runtime:all APIs are hang in the service.mu
8d4c2cf1b kata-ctl: Allow certain constants to go unused
64c11a66f kata-ctl: Have function to get cpu details to run on specific arch
923cd3fda virtcontainers: split out Linux parts from mount
cf1bae352 runtime: pass enablevhostuserstore annotation to hypervisor config
1592a385e dependency: update cgroups-rs
60ff230d8 virtcontainers: Split the factory package into Linux and Darwin bits
76437a972 runtime: Use git rev-parse for the kata-monitor tag
a9626682a virtcontainers: resourcecontrol: Add skeleton for Darwin
ea06fe3af virtcontainers: Add a Network API skeleton for Darwin
6ee550e9a runtime: vCPUs pinning is sandbox specific, not hypervisor
6199b6917 runtime-rs: change cache mode
a33a22ccd runtime-rs: add missing config section for share-fs
e3d3b72fa virtcontainers: use resource control for setting CPU affinity
f137048be resource-control: add helper function for setting CPU affinity
73216a810 vendor: revendor netlink to get latest
fc17d7cc4 virtcontainers: Fix misspelling in error message
12fd6ffc1 runtime: fix up disable_netns handling
64c9114a3 tools: add --locked option for cargo install
7eb43cec1 runtime: add test generated file to .gitignore
8551853cf runtime: use system pagesize for hugepage test
86a82cace runtime: change cache mode from none to never
82c59efd6 runtime-rs: change cache mode from none to never
7b309b578 kata-types: change cache mode from none to never
fee4e7c7c docs: change cache mode from none to never
594b57d08 utils: Add utility functions to get cpu and distro details.
d33e34361 check: Move PROC_CPUINFO from architecture specific files
f8a93a1de tools: Fix indentation for setup aks script
03de5f41b kata-ctl: remove get_kata_version_by_url function
464d4c94d runtime-rs: process single_container
5f9c892e4 kata-types: add single_container support
fa9ae9362 virtcontainers: Add a Virtualization.framework skeleton
d48b22bb1 virtcontainers: fs_share: add Darwin skeleton
fafc7a8b1 virtcontainers: tests: Ensure Linux specific tests are just run on Linux
efa4fc0b2 clh: Add hotplug support for network devices
1074d2c1d clh: Make vmAddNetPutRequest capable of doing hotplugs
9ec8a1398 virtcontainers: introduce hypervisor_darwin
8bb68a9f2 vc/network: skip existing endpoints when scanning for new ones
c21a8d5ff kata-ctl: fix build error on s390x
3b4420eb8 runtime: Define Darwin handled signals list
24b05a99b schedcore: Make buildable on !linux
3886aad19 nydus: net-ns handling needs to be only executed on Linux hosts
e256903af runtime-rs: cleanup the run dir of hypervisor when shut down
937a41346 kata-ctl: add unit tests for volume ops
8451db7c0 kata-ctl: direct-volume: add Add and Remove handlers
2d4b2cf72 runtime-rs: add POST method to shim-client
cae78a685 kata-ctl: add constants for direct-volume commands
652021ad9 versions: Upgrade to Cloud Hypervisor v28.1
d08538912 vc: fix up UT for CreateSandbox API change
578a9c25f vc: rescan network endpoints after running prestart hooks
cb84b0fb0 katautils: run prestart hooks after starting VM
079462d2e runk: Fix needless_borrow warning
2c24fcf34 runtime-rs: Fix clippy::bool-to-int-with-if warnings
025e78341 runtime-rs: Fix needless_borrow warnings
4fb163d57 runtime-rs: Allow clippy:box_default warnings
20121fcda runtime-rs: Fix unnecessary_cast warnings
b95364a14 dragonball: Allow question_mark warning in allocate_device_resources()
0b2f060bf dragonball: Fix unnecessary_cast warnings
a545a6593 agent: Allow clippy::question_mark warning in Namespace{}
9ced34dd2 agent: Fix explicit_auto_deref warnings
f77220490 agent: Fix needless_borrow warnings
7bcdc9049 rustjail: Fix unnecessary_cast warnings
41d7dbaae rustjail: Fix needless_borrow warnings
2a73e057d kata-types: Fix unnecessary_cast warnings
cf9ef1833 kata-types: Fix needless_borrow warnings
126187e81 safe-path: Fix needless_borrow warnings
bb78d35db kata-sys-util: Fix "match-like-matches-macro" warning
668e65240 kata-sys-util: Fix unnecessary_cast warnings
c1a8d89a7 kata-sys-util: Fix needless_borrow warnings
c9c38e6d0 logging: Allow clippy::type-complexity warning
ffd6fbb6b logging: Fix needless_borrow warnings
60df30015 protocols: Fix unnecessary_cast warnings
56e7b5d0f runtime/Makefile: Get some bits happy on darwin
0bbeb34b4 protocols: Fix needless_borrow warnings
dfea6c7d2 versions: Update the rust toolchain to 1.66.0
86ee24b33 Runtime: Clarify mutability of global var
dae667062 kata-runtime: add rust runtime path for kata-runtime exec
a2e3715e0 upcall: remove upcall client when stopping vm
31591d791 dragonball: fix unit test failure case about Kvm.
2b02e0a9b dragonball: add more unit test for vcpu manager
85f9094f1 agent: refactor guest hooks
360506225 runtime-rs: add dbs-upcall feature
03a0c9d78 kata-ctl: skip test if access GitHub.com fail
1dcbda3f0 kata-ctl: update Cargo.lock
b4b5d8150 docs: remove old and misleading instructions for minikube
0fe24e08b packaging: fix indents in build-kernel.sh
3480780bd kata-ctl: add check framework support for non-x86
1bd533f10 kata-ctl: let check framework arch-agnostic
fd77eebd4 runtime-rs: fix the issues mentioned in the code review
0e6920790 runtime-rs: Clean up mount points shared to guest
ecb28e2b1 kernel: adding kmod to do docker env
087515a46 agent: unset `CC` for cross-build
bf8848f92 agent: Eliminate unnecessary metrics
f8a48ab41 docs: add hint of probing loop module
afaf17f42 runtime-rs: enable container hugepage
fc4a67eec runtime-rs: enable vm hugepage

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-14 15:47:44 -08:00
Archana Shinde
d144ded12c release: Adapt kata-deploy for 3.1.0-rc0
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-14 15:47:44 -08:00
Fabiano Fidêncio
0d2a7f8324 Merge pull request #6273 from BbolroC/fix-protobuf-s390x-ppc64le
kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile
2023-02-14 22:25:20 +01:00
James O. D. Hunt
bbc733d6c8 docs: runtime-rs: Add CH status details
Add a few details about the current state of the Cloud Hypervisor (CH)
runtime-rs external hypervisor implementation with pointers to the
appropriate issues.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-02-14 15:38:46 +00:00
James O. D. Hunt
37b594c0d2 runtime-rs: Add basic CH implementation
Add a basic runtime-rs `Hypervisor` trait implementation for Cloud
Hypervisor (CH).

> **Notes:**
>
> - This only supports a default Kata configuration for CH currently.
>
> - Since this feature is still under development, `cargo` features have
>   been added to enable the feature optionally. The default is to not enable
>   currently since the code is not ready for general use.
>
>   To enable the feature for testing and development, enable the
>   `cloud-hypervisor` feature in the `virt_container` crate and enable the
>   `cloud-hypervisor` feature for its `hypervisor` dependency.

Fixes: #5242.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-02-14 15:38:39 +00:00
James O. D. Hunt
5f6d747e6d Merge pull request #6272 from cmaf/tracing-clh-returnctx-startVM
runtime: tracing: Fix missing ctx return
2023-02-14 08:17:45 +00:00
Bin Liu
e812c5ce66 Merge pull request #6076 from zhaojizhuang/reconnect
runtime: add reconnect timeout for vhost user block
2023-02-14 10:39:20 +08:00
Archana Shinde
7b4e5751ca Merge pull request #5007 from larrydewey/update-rpb-main
SEV: Update ReducedPhysBits
2023-02-13 14:56:38 -08:00
Hyounggyu Choi
87d197ef20 Merge pull request #6143 from fidencio/topic/only-build-runtime-rs-for-x86_64-and-arm
shim-v2/build.sh: Only build runtime-rs for the supported arches
2023-02-13 23:43:10 +01:00
Hyounggyu Choi
8e3863cecb kata-deploy: Install protobuf-compiler explicitly in shim-v2 Dockerfile
This installs the missing protoc binary in the shim-v2 Dockerfile.

Fixes: #6244

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
(cherry picked from commit 10603e3def)
2023-02-13 22:29:19 +01:00
Chelsea Mafrica
c453919911 runtime: tracing: Fix missing ctx return
Normally we return the context when creating a trace span so that the
ordering of spans w.r.t. calls is maintained in tracing output. Add
missing context for StartVM() for Cloud Hypervisor.

Fixes #6271

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-02-13 12:37:52 -08:00
Chelsea Mafrica
036d3a4088 Merge pull request #5920 from cmaf/kata-ctl-check-cpu-unit-tests-1
kata-ctl: Expand unit tests for CPU check
2023-02-13 12:21:58 -08:00
Hyounggyu Choi
4139d68d51 runtime-rs: Include target install in conditional branch
The Makefile target `install` should be included in the conditional branch,
as `default` and `test` are.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-02-13 21:13:32 +01:00
James O. D. Hunt
545151829d kata-types: Add Cloud Hypervisor (CH) definitions
Implement `ConfigPlugin` trait for Cloud Hypervisor (CH).

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-02-13 10:25:29 +00:00
zhaojizhuang
ca02c9f512 runtime: add reconnect timeout for vhost user block
Fixes: #6075
Signed-off-by: zhaojizhuang <571130360@qq.com>
2023-02-13 14:33:46 +08:00
Zhongtao Hu
2dd2421ad0 runtime-rs: cleanup kata host share path
Clean up the /run/kata-containers/shared/sandboxes/pid path.

Fixes:#5975
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-02-13 13:07:07 +08:00
Bin Liu
95602c8c08 Merge pull request #5999 from yaoyinnan/5998/feat/cgroup-metrics
runtime: support cgroup v2 metrics marshal guest metrics
2023-02-11 19:26:24 +08:00
Bin Liu
8a9392fd9d Merge pull request #6188 from yahaa/Typo-fix
Typo: change tabs in comment to spaces
2023-02-11 11:19:11 +08:00
Bin Liu
ecbd94d80c Merge pull request #6064 from yaoyinnan/6063/feat/rootfs-erofs
rootfs: support EROFS filesystem
2023-02-11 11:10:23 +08:00
Chelsea Mafrica
2f5bc0f408 kata-ctl: Expand unit tests for CPU check
Change unit tests for CPU check to table-driven tests and expand test
cases including temp files for cpuinfo.

Fixes #5919

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-02-10 14:18:44 -08:00
Larry Dewey
67b8f0773f SEV: Update ReducedPhysBits
Update this field, as `cpuid` provides host-level data, which is not
what a guest would expect for Reduced Physical Bits. In almost all
cases, we should be using `1` for the value here.

Amend: Adding unit test change.

Fixes: #5006

Signed-off-by: Larry Dewey <larry.dewey@amd.com>
2023-02-10 13:19:33 -06:00
yaoyinnan
bdf20b5d26 rootfs: support EROFS filesystem
For Kata Containers, the rootfs is used read-only.
EROFS can noticeably decrease metadata overhead.

In addition to supporting the EROFS file system, this adds a config
parameter to switch the file system used by the rootfs.

Fixes: #6063

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-02-11 00:44:13 +08:00
GabyCT
bd1e8a2a24 Merge pull request #6252 from GabyCT/topic/upruncversion
versions: Update runc version
2023-02-10 08:46:26 -06:00
GabyCT
86501d5f6f Merge pull request #6200 from gkurz/improve-appendFDs-doc
runtime: Improve documentation of appendFDs
2023-02-09 15:50:37 -06:00
Gabriela Cervantes
fff0e50a73 versions: Update runc version
This PR updates the runc version. The new version includes
changes in:
- Fix mounting via wrong proc fd. When the user and mount namespaces are
used, and the bind mount is followed by the cgroup mount in the spec,
the cgroup was mounted using the bind mount's mount fd.
- Switch kill() in libcontainer/nsenter to sane_kill().
- Fix "permission denied" error from runc run on noexec fs.
- Fix failed exec after systemctl daemon-reload. Due to a regression
in v1.1.3, the DeviceAllow=char-pts rwm rule was no longer added and
was causing an error open /dev/pts/0: operation not permitted: unknown when systemd was reloaded.

Fixes #6251

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-02-09 21:16:41 +00:00
Archana Shinde
b67a1da187 Merge pull request #6166 from amshinde/make-cleanup
Minor cleanups in make file
2023-02-09 11:24:48 -08:00
yaoyinnan
ed02c8a051 docs: add guide for building rootfs with EROFS
Add guide for building rootfs with EROFS.

Fixes: #6063

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-02-09 20:07:51 +08:00
yaoyinnan
01765e1734 runtime: support cgroup v2 metrics marshal guest metrics
Support using cgroup v2 metrics to marshal guest metrics.

Fixes: #5998

Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-02-09 19:14:09 +08:00
yaoyinnan
49326fe4e1 fix(clippy): fix hypervisor clippy checks
Fix hypervisor clippy checks.

Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-02-09 14:32:27 +08:00
Jianyong Wu
6f86fb8e27 Merge pull request #6183 from singhwang/main
main | docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md
2023-02-09 09:26:11 +08:00
Archana Shinde
94b1d9814c cargo: Update Cargo.lock files
The Cargo.lock files under src/libs and agent-ctl seem to be outdated.
Update them.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-08 13:50:54 -08:00
Archana Shinde
f1855594a2 make: Get rid of verbose output while creating tar
We already have verbose output while merging the builds from various
build targets. Getting rid of verbose output to speed up.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-08 13:41:41 -08:00
Archana Shinde
c3836010a8 make: clean up obsolete targets
Clean up targets that were removed in the past when the
makefile for kata-deploy was included.
Instead, add targets from the makefile under local-build kata-deploy.

Fixes: #6165

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-08 13:41:40 -08:00
Archana Shinde
a482b0d410 Merge pull request #6209 from amshinde/action-check-kernel-config-version
Action check kernel config version
2023-02-08 10:34:54 -08:00
Bin Liu
407d3146e6 Merge pull request #6234 from UiPath/fix-clh-timeout
clh: Enforce API timeout only for vm.boot request
2023-02-08 21:33:56 +08:00
Tim Zhang
d4f8f3a779 Merge pull request #6152 from liubin/fix/6151-refactor-cache-mod-const
virtiofsd: change cache mod to const
2023-02-08 17:53:57 +08:00
Alexandru Matei
ac64b021a6 clh: Enforce API timeout only for vm.boot request
launchClh already has a timeout of 10 seconds for launching clh. If, for
example, launchClh or setupVirtiofsDaemon takes a few seconds, the context's
deadline will already have expired by the time it reaches bootVM.

Fixes #6240
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-02-08 11:14:51 +02:00
Bin Liu
56071c6e7b virtiofsd: change cache mod to const
Change the cache mode values from literals to consts and keep them in one place.

Also set default cache mode from `none` to `never` in
`pkg/katautils/config-settings.go.in`.

Fixes: #6151

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-02-08 15:06:52 +08:00
Zhongtao Hu
2752225360 Merge pull request #6193 from jongwu/cgroup_del_err
runtime-rs: ignore "no such process" error when delete cgroup for a thread to let it go
2023-02-08 10:30:12 +08:00
Bin Liu
93b3d0a28e Merge pull request #6163 from BbolroC/kernel-config-s390
kernel: Add console kernel config for s390
2023-02-08 10:02:38 +08:00
Bin Liu
71a3b73cb0 Merge pull request #6223 from d3c3mber/rm-unused-shim-config
runtime: remove not used shim configurations
2023-02-08 10:00:52 +08:00
Jeremi Piotrowski
0a21ad78b1 osbuilder: fix default build target in makefile
The .dracut_rootfs.done file is accidentally being picked up as the default
target, regardless of BUILD_METHOD. Move the 'all' target definition up, so
that it's the default (=first) target in the makefile. Additionally make the
.dracut_rootfs.done target conditional on the right BUILD_METHOD being
selected, as building it doesn't make sense with BUILD_METHOD=distro.

Fixes: #6235
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-02-07 18:36:03 +01:00
Jianyong Wu
5d37d31ac7 cgroups: upgrade cgroupfs to 0.3.1
The trait method `cause` of std::error::Error is deprecated, so replace
it with the `source` method for cgroups-fs::error::ErrorKind.

Fixes: #6192
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-02-07 18:09:31 +08:00
Jianyong Wu
ab59a65c92 runtime-rs: neglect a certain error when delete cgroup
Deleting the cgroup for a thread that may already have exited can lead to
a panic. Ignoring that error is harmless and avoids this failure.

Fixes: #6192
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-02-07 18:09:31 +08:00
wllenyj
9a01d4e446 dragonball: add more unit test for virtio-blk device.
Added more unit tests for virtio-blk device.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2023-02-07 17:16:11 +08:00
d3c3mber
390916b33c runtime: remove not used shim configurations
ShimPath and ShimDebug are not needed anymore.

Fixes: #6147

Signed-off-by: d3c3mber <tangbo_gl_2022@163.com>
2023-02-07 14:06:12 +08:00
Bin Liu
8ae14f6a55 Merge pull request #6208 from joannejchen/fix-naming-conventions
improvement: Fix naming conventions for span name and log subsystem
2023-02-07 13:43:37 +08:00
joannejchen
9794c52c65 improvement: Fix naming conventions for span name and log subsystem
Normally, the span name should be the same as the function name, and the log subsystem should not contain spaces.

Fixes #6153

Signed-off-by: joannejchen <chenjjoanne@gmail.com>
2023-02-06 08:25:49 -06:00
Bin Liu
df93439c3b Merge pull request #6009 from openanolis/dragonball/add_cpu_resize
Dragonball: add cpu resize ability
2023-02-05 19:54:08 +08:00
Archana Shinde
d3bb254188 utils: Add function to check vhost-vsock
Add function to check if the host-system has the vhost-vsock
kernel module.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-03 15:41:59 -08:00
GabyCT
7fc35f19eb Merge pull request #6056 from jongwu/perm_deny
arm64/CI: fix unit test failure on arm64
2023-02-03 10:53:38 -06:00
Greg Kurz
1660d5651f Merge pull request #6212 from BbolroC/fix-docker-buildx-s390x
CI: Make docker version stick to v20.10 in ubuntu:20.04 for s390x|ppc64le
2023-02-03 17:05:55 +01:00
Hyounggyu Choi
f49b89b632 CI: Set docker version to v20.10 in ubuntu:20.04 for s390x|ppc64le
This pins the docker version to v20.10 in the upstream docker image ubuntu:20.04 for s390x and ppc64le.

Fixes: #6211

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-02-03 14:21:23 +01:00
Archana Shinde
3c24e23409 README: Update Readme under packaging/kernel
Update Readme to instruct users to increment the kata config version
for any changes made to configs or patches under packaging/kernel.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-02 22:43:24 -08:00
Archana Shinde
d73f3a8a26 github-action: Add step to verify kernel config version id updated
The version mentioned in the `kata_config_version` needs to be
updated for any kernel config change or change to the patches applied.
Without this, CI would not test with the latest kernel changes.
We used to enforce this earlier as part of CI when `packaging` was
a standalone repo.

Add back this check as part of a github action so that the check is
performed early on instead of a CI job.

Fixes: #6210

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-02-02 22:42:54 -08:00
Jianyong Wu
59f104c022 runtime: skip unit test that fail regularly on aarch64
Several unit test cases fail regularly on aarch64, including
TestIOCopy and create_tmpfs. Skip them for now and re-enable them
once they are fixed.

Fixes: #6194
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-02-03 11:34:39 +08:00
Jianyong Wu
b7dd97cac6 kata-ctl: fix permission deny issue in test_add_remove
test_add_remove and test_get_sandbox_id_for_volume need root user, but
test_drop_privs can temporarily change the user to "nobody" that can
lead to the failure of these tests.

Serialising these three tests fixes it.

Fixes: #6055
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-02-03 11:34:39 +08:00
GabyCT
968f5b4031 Merge pull request #6140 from Amulyam24/rust-vitiofsd
virtiofsd: fix the build on ppc64le
2023-02-02 14:30:26 -06:00
Chao Wu
57c5e5629b Dragonball: add cpu resize ability
Add cpu resize ability on top of the upcall communication channel. Runtime could
use ResizeVcpu VmmAction and pass the desired vCPU number to the
Dragonball hypervisor.
Dragonball will trigger the device manager service in guest kernel's
upcall server to do cpu resize.

Fixes: #6008
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-02-03 00:26:33 +08:00
Greg Kurz
3c48f2202c runtime: Improve documentation of appendFDs
The cmd.ExtraFiles feature that is used to implement appendFDs takes an
array of arbitrary file descriptors and internally renumbers them to be
consecutive starting from 3, using dup2().

This isn't especially obvious : document it for the sake of clarity.

Fixes #6199

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-02-02 12:52:10 +01:00
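The renumbering described in the appendFDs commit above can be illustrated with a minimal sketch. The `config` type and its `appendFDs` method here are hypothetical stand-ins, not the runtime's code; the only fact relied on is that entry i of `cmd.ExtraFiles` becomes descriptor 3+i in the child.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// config is a hypothetical holder for files that will be passed to a child.
type config struct {
	fds []*os.File
}

// appendFDs queues files and returns the descriptor numbers the child will
// see for them. The numbers depend only on the position in the queue, not
// on the descriptor numbers the files have in the parent.
func (c *config) appendFDs(files ...*os.File) []int {
	childFDs := make([]int, 0, len(files))
	for _, f := range files {
		// 0, 1 and 2 are stdin, stdout and stderr, so extra files start at 3.
		childFDs = append(childFDs, 3+len(c.fds))
		c.fds = append(c.fds, f)
	}
	return childFDs
}

func main() {
	var cfg config
	fmt.Println(cfg.appendFDs(os.Stdin, os.Stdout)) // [3 4]
	fmt.Println(cfg.appendFDs(os.Stderr))           // [5]

	// The queued files would be attached to the command before Start.
	cmd := &exec.Cmd{Path: "/bin/true", ExtraFiles: cfg.fds}
	fmt.Println(len(cmd.ExtraFiles)) // 3
}
```

Any command line argument that refers to such a file (for example an `fd=` option) has to use the child-side number returned here, not the parent-side one.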
Amulyam24
856ab66871 virtiofsd: fix the build on ppc64le
link-self-contained is not supported on ppc64le rust target.
Hence, do not pass it while building virtiofsd.

Fixes: #6195

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-02-02 13:59:12 +05:30
SinghWang
f83115a838 docs: Fix missing critical steps in how-to-hotplug-memory-arm64.md
The key steps in how-to-hotplug-memory-arm64.md are missing, resulting in the kata qemu pod not being created successfully.

Fixes: #6105
Signed-off-by: SinghWang <wangxin_0611@126.com>
2023-02-02 12:12:39 +08:00
yahaa
e071d9251f Typo: change tabs in comment to spaces
Fixes: #6150

Signed-off-by: yahaa <1477765176@qq.com>
2023-02-02 12:08:33 +08:00
Peng Tao
a34f36f8f4 Merge pull request #6149 from openanolis/fix_kata_runtime
runtime:fix stat uds path
2023-02-02 11:00:07 +08:00
GabyCT
d6945200cc Merge pull request #6170 from amshinde/update-cni-version
cni: Update cni plugins version to 1.2.0
2023-02-01 09:18:14 -06:00
Chao Wu
c282a1c709 Merge pull request #5616 from wllenyj/dragonball-ut-5
Built-in Sandbox: add more unit tests for dragonball. Part 5
2023-01-31 21:12:05 +08:00
Peng Tao
09d416fe43 Merge pull request #6174 from gkurz/remove-qemu-log-file
runtime: Drop QEMU log file support
2023-01-31 17:56:04 +08:00
Hyounggyu Choi
56f0a27fef kernel: Add console kernel config for s390
This config is to update console kernel config for s390.

Fixes: #6162

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-01-31 10:44:07 +01:00
Greg Kurz
334c4b8bdc runtime: Drop QEMU log file support
The QEMU log file is essentially about fine grain tracing of QEMU
internals and mostly useful for developers, not production. Notably,
the log file isn't limited in size, nor rotated in any way. It means
that a container running in the VM could possibly flood the log file
with a guest triggerable trace. For example, on openshift, the log
file is supposed to reside on a per-VM 14 GiB tmpfs mount. This means
that each pod running with the kata runtime could potentially consume
this amount of host RAM which is not acceptable.

Error messages are best collected from QEMU's stderr as kata is doing
now since PR #5736 was merged. Drop support for the QEMU log file
because it doesn't bring any value but can certainly do harm.

Fixes #6173

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-31 09:20:29 +01:00
Archana Shinde
3a63e3c1f7 cni: Update cni plugins version to 1.2.0
A new release was made for the cni plugins. Use the new
version for the CI.

Fixes: #6169

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-01-30 22:33:34 -08:00
Chelsea Mafrica
1648b85e2d Merge pull request #6137 from amshinde/agent-seccomp-doc
docs: Add documentation for building agent with seccomp support.
2023-01-30 19:08:15 -08:00
wllenyj
510798155d dragonball: Improve test cases
The same EpollManager should be used instead of creating two.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2023-01-31 10:51:51 +08:00
wllenyj
dc90c6e30b dragonball: add more unit test for vm
Added more unit tests for vm module.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2023-01-31 10:51:51 +08:00
Fabiano Fidêncio
c071355359 runtime-rs: Improve s390x error message
Nothing much to add, let's just make the message more clear.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-30 20:32:07 +01:00
Fabiano Fidêncio
4e2db96ef7 runtime-rs: Don't try to build on Power
As done for s390x, let's just skip the runtime-rs build for Power.

Fixes: #6142

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-30 20:32:07 +01:00
Bin Liu
b29cbbfd2c Merge pull request #6141 from fidencio/topic/upcall-follow-up
Add kernel-dragonball-experimental to kata-deploy, kata-deploy-test, and the release
2023-01-30 19:48:18 +08:00
Fabiano Fidêncio
8e8c720d51 kata-deploy-push: Ensure we build Dragonball specific kernel
As the dragonball specific kernel is now part of the release, let's make
sure we build it as part of the kata-deploy-push action.

Fixes: #5859

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-30 09:40:34 +01:00
Zhongtao Hu
c1dd9b9777 Merge pull request #6023 from openanolis/missing_config
runtime-rs: add missing config section for share-fs
2023-01-30 15:45:22 +08:00
Bin Liu
653e00dff8 Merge pull request #6146 from zhaojizhuang/add-hmp
runtime: Add hmp for qemu
2023-01-30 15:43:53 +08:00
Peng Tao
de45f62096 Merge pull request #6081 from openanolis/chao/update_upcall_doc
upcall: add document for upcall
2023-01-30 12:03:11 +08:00
Zhongtao Hu
1e531b44dc runtime:fix stat uds path
os.Stat("unix:///run/vc/sbs/sid/shim-monitor.sock") will fail,
should be os.Stat("/run/vc/sbs/sid/shim-monitor.sock")

Fixes:#6148
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-01-29 15:08:13 +08:00
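A minimal sketch of the fix described above, assuming the socket address always uses the `unix://` scheme shown in the commit message. The helper names `socketPath` and `socketExists` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// socketPath strips the unix:// scheme so the result is a plain filesystem
// path. os.Stat knows nothing about URL schemes and would otherwise look
// for a relative path that starts with "unix:".
func socketPath(addr string) string {
	return strings.TrimPrefix(addr, "unix://")
}

// socketExists reports whether the file behind a socket address exists.
func socketExists(addr string) bool {
	_, err := os.Stat(socketPath(addr))
	return err == nil
}

func main() {
	addr := "unix:///run/vc/sbs/sid/shim-monitor.sock"
	fmt.Println(socketPath(addr)) // /run/vc/sbs/sid/shim-monitor.sock
	fmt.Println(socketExists(addr))
}
```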
zhaojizhuang
9092c23a2e runtime: Add hmp for qemu
Fixes: #6092
Signed-off-by: zhaojizhuang <571130360@qq.com>
2023-01-29 14:22:04 +08:00
Fabiano Fidêncio
b7f4e96ff3 kata-deploy-test: Ensure we build dragonball specific kernel
As the dragonball specific kernel is now part of the release, let's make
sure we build it as part of the kata-deploy-test action.

Fixes: #5859

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-28 10:55:39 +01:00
Fabiano Fidêncio
063dec37c2 release: Add the dragonball-experimental kernel
Let's add the dragonball specific kernel, which takes advantage of
upcall, as part of the release tarball, so it can be used from the
release tarball / kata-deploy.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-28 10:55:39 +01:00
Fabiano Fidêncio
0b3c91d2a2 kata-deploy: Add kernel-dragonball-experimental target
As Chao Wu added the support for building the dragonball kernel as a new
experimental kernel, let's make sure we reflect that as part of the
kata-deploy build scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-28 10:55:39 +01:00
Greg Kurz
af125b1498 Merge pull request #5736 from gkurz/no-qemu-daemonize
runtime: Start QEMU undaemonized and get logs
2023-01-27 16:33:48 +01:00
Archana Shinde
00dcd900f9 docs: Add documentation for building agent with seccomp support.
The default for the agent today is building with seccomp support.
However, additional steps need to be taken for building against
musl such as installing the static seccomp library for musl.
Add documentation to explain this.

Fixes #6136

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-01-26 10:58:38 -08:00
Archana Shinde
461b32491f Merge pull request #6131 from GabyCT/topic/updateqatdoc
docs: Update url link in QAT documentation
2023-01-25 17:07:54 -08:00
Gabriela Cervantes
2b779cba00 docs: Update url link in QAT documentation
This PR updates the url link in QAT documentation.

Fixes #6130

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-01-25 15:27:29 +00:00
Fabiano Fidêncio
392c87550f Merge pull request #6111 from littlejawa/bump_cni_plugins_to_120
versions: update cni plugins version
2023-01-25 12:40:55 +01:00
Greg Kurz
39fe4a4b6f runtime: Collect QEMU's stderr
LaunchQemu now connects a pipe to QEMU's stderr and makes it
usable by callers through a Go io.ReadCloser object. As
explained in [0], all messages should be read from the pipe
before calling cmd.Wait : introduce a LogAndWait helper to handle
that.

Fixes #5780

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:17 +01:00
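The ordering constraint mentioned above (read everything from the pipe, then call cmd.Wait) can be sketched as follows. This `logAndWait` is a hypothetical stand-in for the LogAndWait helper named in the commit, with the reader and the wait function passed in so the sketch runs without starting a real process.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// logAndWait drains stderr until EOF, handing each line to logf, and only
// then calls wait (cmd.Wait in real use). Wait closes the pipe, so calling
// it first could lose messages that were not read yet.
func logAndWait(stderr io.Reader, logf func(string), wait func() error) error {
	scanner := bufio.NewScanner(stderr)
	for scanner.Scan() {
		logf(scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		logf("reading stderr: " + err.Error())
	}
	return wait()
}

// simulate feeds canned text through logAndWait and records the order of
// events, so the "drain first, wait last" sequence is visible.
func simulate(stderrText string) ([]string, error) {
	var events []string
	err := logAndWait(strings.NewReader(stderrText),
		func(line string) { events = append(events, line) },
		func() error { events = append(events, "(wait called)"); return nil })
	return events, err
}

func main() {
	events, err := simulate("qemu: first error\nqemu: second error\n")
	fmt.Println(events, err)
}
```

With a real process, `stderr` would come from `cmd.StderrPipe()` obtained before `cmd.Start()`, and `wait` would be `cmd.Wait`.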
Greg Kurz
a5319c6be6 runtime: Start QEMU undaemonized
QEMU has always been started daemonized since the beginning. I
could not find any justification for that though, but it certainly
introduces a problem : QEMU stops logging errors when started this
way, which isn't acceptable from a support standpoint. The QEMU
community discourages the use of -daemonize ; mostly because
libvirt, QEMU's primary consumer, doesn't use this option and
prefers getting errors from QEMU's stderr through a pipe in order
to enforce rollover.

Now that virtcontainers knows how to start QEMU with a pre-
established QMP connection, let's start QEMU without -daemonize.
This requires handling the reaping of QEMU when it terminates.
Since cmd.Wait() is blocking, call it from a goroutine.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:11 +01:00
Greg Kurz
bf4e3a618f runtime: Launch QEMU with cmd.Start()
LaunchCustomQemu() currently starts QEMU with cmd.Run() which is
supposed to block until the child process terminates. This assumes
that QEMU daemonizes itself, otherwise LaunchCustomQemu() would
block forever. The virtcontainers package indeed enables the
Daemonize knob in the configuration but having such an implicit
dependency on a supposedly configurable setting is ugly and fragile.

cmd.Run() is :

func (c *Cmd) Run() error {
	if err := c.Start(); err != nil {
		return err
	}
	return c.Wait()
}

Let's open-code this : govmm calls cmd.Start() and returns the
cmd to virtcontainers which calls cmd.Wait().

If QEMU doesn't start, e.g. missing binary, there won't be any
errors to collect from QEMU output. Just drop these lines in govmm.
Similarly there won't be any log file to read from in virtcontainers.
Drop that as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:11 +01:00
Greg Kurz
8a1723a5cb runtime: Pre-establish the QMP connection
Running QEMU daemonized ensures that the QMP socket is ready to
accept connections when LaunchQemu() returns. In order to be
able to run QEMU undaemonized, let's handle that part upfront.
Create a listener socket and connect to it. Pass the listener
to QEMU and pass the connected socket to QMP : this ensures
that we cannot fail to establish QMP connection and that we
can detect if QEMU exits before accepting the connection.
This is basically what libvirt does.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:11 +01:00
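The pre-established connection described above can be sketched with the standard net package. This is an illustration under assumptions, not the runtime's code: `preEstablish` and `roundTrip` are hypothetical names, and the late Accept in `roundTrip` plays the role of QEMU.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
)

// preEstablish creates a listening UNIX socket at path and connects to it
// right away. The kernel queues the connection, so the dial succeeds before
// anyone has called Accept.
func preEstablish(path string) (*net.UnixListener, net.Conn, error) {
	listener, err := net.ListenUnix("unix", &net.UnixAddr{Name: path, Net: "unix"})
	if err != nil {
		return nil, nil, err
	}
	conn, err := net.Dial("unix", path)
	if err != nil {
		listener.Close()
		return nil, nil, err
	}
	return listener, conn, nil
}

// roundTrip plays both sides: the server side accepts late and sends a
// greeting, the already-connected client side reads it.
func roundTrip(greeting string) (string, error) {
	dir, err := os.MkdirTemp("", "qmp")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	listener, conn, err := preEstablish(filepath.Join(dir, "qmp.sock"))
	if err != nil {
		return "", err
	}
	defer listener.Close()
	defer conn.Close()

	// In a real launch, listener.File() would be handed to the child
	// process instead of accepting here.
	peer, err := listener.Accept()
	if err != nil {
		return "", err
	}
	defer peer.Close()
	if _, err := peer.Write([]byte(greeting)); err != nil {
		return "", err
	}
	buf := make([]byte, len(greeting))
	if _, err := io.ReadFull(conn, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	fmt.Println(roundTrip("hello from the server side"))
}
```

If the child exits before accepting, reads on the client side fail once the listener is closed everywhere, which is what makes the early exit detectable.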
Greg Kurz
8a4f08cb0f govmm: Optionally pass QMP listener to QEMU
QEMU's -qmp option can be passed the file descriptor of a socket that
is already in listening mode. This is done by passing `fd=XXX`
to `-qmp` instead of a path. Note that these two options are mutually
exclusive : QEMU errors out if both are passed, so we check that as
well in the validation function.

While here add the `path=` stanza in the path based case for clarity.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:08:48 +01:00
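The mutual exclusion check described above might look like the following sketch. The `qmpSocket` type and its `stanza` method are hypothetical; only the `fd=` and `path=` stanzas come from the commit message, and the rest of the `-qmp` argument is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// qmpSocket is a hypothetical, trimmed-down description of a QMP socket.
// A negative FD means that no descriptor was provided.
type qmpSocket struct {
	Path string
	FD   int
}

// stanza validates the socket description and returns the stanza that
// selects the socket: exactly one of path and fd must be set.
func (s qmpSocket) stanza() (string, error) {
	hasPath, hasFD := s.Path != "", s.FD >= 0
	switch {
	case hasPath && hasFD:
		return "", errors.New("qmp: path and fd are mutually exclusive")
	case hasFD:
		return fmt.Sprintf("fd=%d", s.FD), nil
	case hasPath:
		return "path=" + s.Path, nil
	default:
		return "", errors.New("qmp: either path or fd is required")
	}
}

func main() {
	fmt.Println(qmpSocket{Path: "/run/qmp.sock", FD: -1}.stanza())
	fmt.Println(qmpSocket{FD: 3}.stanza())
	fmt.Println(qmpSocket{Path: "/run/qmp.sock", FD: 3}.stanza())
}
```

Rejecting the conflicting case up front gives a clearer error than letting the hypervisor fail at startup.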
Greg Kurz
219bb8e7d0 govmm: Optionally start QMP with a pre-configured connection
When QEMU is launched daemonized, we have the guarantee that the
QMP socket is available. In order to launch a non-daemonized QEMU,
the QMP connection should be created before QEMU is started in order
to avoid a race. Introduce a variant of QMPStart() that can use such
an existing connection.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 19:16:47 +01:00
Julien Ropé
a85d0e465c versions: update cni plugins version
Use cni plugins v1.2.0 to get latest fixes.

Fixes: #6110

Signed-off-by: Julien Ropé <jrope@redhat.com>
2023-01-23 14:24:29 +01:00
Bo Chen
40c6904324 Merge pull request #6098 from likebreath/0117/clh_v29.0
versions: Upgrade to Cloud Hypervisor v29.0
2023-01-18 10:59:40 -08:00
GabyCT
421a33f846 Merge pull request #6096 from dcantah/kataruntime-use_hyp_consts
runtime: Use consts in `kata-runtime check`
2023-01-18 10:54:42 -06:00
Fabiano Fidêncio
980a2c7794 Merge pull request #6103 from fidencio/topic/bump-qemu-to-7.2.0
versions: Bump QEMU to v7.2.0
2023-01-18 17:38:47 +01:00
Fabiano Fidêncio
676d028504 versions: Bump QEMU to v7.2.0
As QEMU released its v7.2.0 version in December last year, let's do the
bump on our side.

A few configuration options have been removed between the v6.2.0 (the
version we currently use) and v7.2.0, so those have also been dropped
from our configure-hypervisor.sh script (for this specific version).

Also, we're explicitly setting --disable-virtiofsd for the platforms
that we're testing using the rust version.
See: a8d6abe129/docs/about/deprecated.rst (virtiofsd)

Fixes: #6102

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-18 13:11:12 +01:00
Bin Liu
083facd5ae Merge pull request #5256 from Yuan-Zhuo/fix-agent-metrics
agent: Eliminate unnecessary metrics
2023-01-18 11:43:37 +08:00
Peng Tao
7d1a604bad Merge pull request #6060 from ls-ggg/6055/service.mu-deadlock
runtime:all APIs are hang in the service.mu
2023-01-18 10:50:00 +08:00
Chelsea Mafrica
fa1f08f5da Merge pull request #5812 from amshinde/kata-ctl-env-util
Utility functions for kata-env
2023-01-17 18:45:54 -08:00
Bo Chen
861c38b6aa versions: Upgrade to Cloud Hypervisor v29.0
Details of this release can be found in our new roadmap project as
iteration v29.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #6097

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-01-17 15:45:23 -08:00
David Esparza
c8596a4065 Merge pull request #6085 from GabyCT/topic/uconmonversion
versions: Update conmon version
2023-01-17 11:33:02 -06:00
Danny Canter
ba87e0afea runtime: Use consts in kata-runtime check
Fixes: #6095

We're already importing the virtcontainers package so might as well
use the constants for the hypervisor types we're checking against instead
of typing the names out in the switch cases.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-17 06:55:36 -08:00
Chao Wu
9f490d16fe upcall: add document for upcall
To give users a better understanding of upcall features, add this
document, which explains what upcall is and how to enable it.

fixes: #6054
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-01-17 14:53:47 +08:00
Bin Liu
790f45190b Merge pull request #6074 from zhaojizhuang/enablevhostuserstore
runtime: pass enablevhostuserstore annotation to hypervisor config
2023-01-17 11:43:43 +08:00
Bin Liu
42efe013c1 Merge pull request #6078 from utam0k/libcli-0.4.0
runk: Upgrade liboci-cli to v0.0.4
2023-01-17 09:48:09 +08:00
Gabriela Cervantes
596037e20c versions: Update conmon version
This PR updates the conmon version that we are using in our versions.yaml

Fixes #6084

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-01-16 22:20:53 +00:00
utam0k
095e8fdef4 runk: Use the original Kill command instead of the custom one.
We can remove the custom kill command.

Fixes: #6083

Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-01-16 21:35:47 +09:00
utam0k
0f9e23a3d9 runk: Upgrade liboci-cli to v0.0.4
https://github.com/containers/youki/releases/tag/v0.0.4

Fixes: #6083

Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-01-16 21:35:09 +09:00
Tim Zhang
20196048bf Merge pull request #6030 from liubin/fix/6029-use-system-hugepagesize
runtime: use system pagesize for hugepage test
2023-01-16 16:57:55 +08:00
Fupan Li
a1a7ed98df Merge pull request #6040 from liubin/fix/6039-update-cgroup-rs
dependency: update cgroups-rs
2023-01-16 16:51:41 +08:00
ls
69fc8de712 runtime:all APIs are hang in the service.mu
When the vmm process exits abnormally, a goroutine sets s.monitor
to nil in the 'watchSandbox' function without taking service.mu.
This causes another goroutine to block when sending a message
to s.monitor while it holds service.mu, which leads to a deadlock.
For example, the wait function in the file
.../pkg/containerd-shim-v2/wait.go sends a message to s.monitor
after obtaining service.mu, but s.monitor may be nil at this time.

Fixes: #6059

Signed-off-by: ls <335814617@qq.com>
2023-01-16 14:45:37 +08:00
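The hazard described above comes from the fact that a send on a nil channel blocks forever in Go. The following is a minimal sketch of one way to avoid it, not the actual fix from the commit: the `service` type is a reduced, hypothetical version, and the non-blocking send is an extra safety choice made for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// service is a reduced, hypothetical version of the shim service: a mutex
// and a channel used to report sandbox events.
type service struct {
	mu      sync.Mutex
	monitor chan error
}

// stopMonitor is what a watcher goroutine would call when the VMM exits.
// It takes mu before clearing the channel, so no sender can observe a
// half-updated state.
func (s *service) stopMonitor() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.monitor = nil
}

// notify reports err to the monitor and returns whether it was delivered.
// The nil check happens under the lock, and the select never blocks, so
// the lock is always released.
func (s *service) notify(err error) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.monitor == nil {
		return false
	}
	select {
	case s.monitor <- err:
		return true
	default:
		return false
	}
}

func main() {
	s := &service{monitor: make(chan error, 1)}
	fmt.Println(s.notify(nil)) // true: the buffered channel accepts the event
	s.stopMonitor()
	fmt.Println(s.notify(nil)) // false: no monitor, and no deadlock
}
```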
Archana Shinde
8d4c2cf1b9 kata-ctl: Allow certain constants to go unused
The generic constants for cpu vendor and model may be superseded
by architecture specific constants. Allow these to be marked as
dead code to ignore warnings on architectures where they are overridden.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-01-15 18:07:35 -08:00
Archana Shinde
64c11a66fd kata-ctl: Have function to get cpu details to run on specific arch
This function relies on the get_single_cpu function, which is configured
to compile on amd64 and s390x.
Make the function get_generic_cpu_details compile only on these
architectures until we resolve the compilation for functions defined
in check.rs. This is a temporary solution until we cleanup check.rs to
make it build on all architectures.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-01-15 18:07:35 -08:00
Eric Ernst
807eeaafd0 Merge pull request #6047 from egernst/build-kata-monitor-on-darwin
runtime: Use git rev-parse for the kata-monitor tag
2023-01-13 15:29:00 -08:00
Eric Ernst
3d573ba579 Merge pull request #6050 from egernst/goos-the-vc
virtcontainers: split out linux-specific bits for mount, factory
2023-01-13 15:28:42 -08:00
Eric Ernst
458fe865ea Merge pull request #6052 from egernst/add-darwin-skeletons
Add darwin skeletons
2023-01-13 13:14:16 -08:00
Eric Ernst
923cd3fda1 virtcontainers: split out Linux parts from mount
Mount handling is often unique in Linux. Let's ensure that the common
parts remain in mount.go, while Linux specific parts are within a linux
file.

Fixes: #6049

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-13 11:14:56 -08:00
Eric Ernst
54f2b296e3 Merge pull request #6048 from egernst/revendor-netlink
vendor: revendor netlink to get latest
2023-01-13 11:08:47 -08:00
Eric Ernst
f82918f872 Merge pull request #6045 from egernst/fix-6044
Address issues with the initial vCPU pinning functionality
2023-01-13 11:06:42 -08:00
GabyCT
9c6e90fd55 Merge pull request #6043 from GabyCT/topic/fixerrormsg
virtcontainers: Fix misspelling in error message
2023-01-13 09:16:34 -06:00
zhaojizhuang
cf1bae3521 runtime: pass enablevhostuserstore annotation to hypervisor config
Fixes: #6073
Signed-off-by: zhaojizhuang <571130360@qq.com>
2023-01-13 17:07:38 +08:00
Bin Liu
1592a385eb dependency: update cgroups-rs
Update cgroups-rs.

Fixes: #6039

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-13 14:00:51 +08:00
Eric Ernst
60ff230d80 virtcontainers: Split the factory package into Linux and Darwin bits
- split template
- split factory
- add stubs for darwin

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-12 16:51:28 -08:00
Samuel Ortiz
76437a9721 runtime: Use git rev-parse for the kata-monitor tag
The .git-commit can be a multiple line file, potentially confusing
the Darwin linker for example.

Fixes: #6046

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-12 16:01:58 -08:00
Samuel Ortiz
a9626682af virtcontainers: resourcecontrol: Add skeleton for Darwin
Cgroups do not exist on Darwin, so use an empty implementation for
resourcecontrol for the time being. In the process, ensure that the
utilized cgroup handling (ie, isSystemdCgroup) is kept in general file,
since we use this to help assess/constrain the container spec we pass to
the guest.

Fixes: #6051

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-12 15:53:28 -08:00
Samuel Ortiz
ea06fe3afc virtcontainers: Add a Network API skeleton for Darwin
Empty for now.

Fixes: #6051

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-12 15:53:28 -08:00
Eric Ernst
6ee550e9a5 runtime: vCPUs pinning is sandbox specific, not hypervisor
While at it, make sure we persist this and fix a misc typo.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-12 15:44:25 -08:00
Zhongtao Hu
6199b69178 runtime-rs: change cache mode
use never as the cache mode if none is configured

Fixes:#6020
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-01-12 18:13:50 +08:00
Zhongtao Hu
a33a22ccd1 runtime-rs: add missing config section for share-fs
add missing config sections for share-fs

Fixes:#6020
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-01-12 18:12:37 +08:00
Peng Tao
2b4b825228 Merge pull request #6032 from liubin/fix/6031-add-test-file-to-gitignore
runtime: add test generated file to .gitignore
2023-01-12 15:38:46 +08:00
Peng Tao
4a4232b851 Merge pull request #6037 from bergwolf/github/no-netns
runtime: fix up disable_netns handling
2023-01-12 09:58:24 +08:00
Eric Ernst
e3d3b72fa2 virtcontainers: use resource control for setting CPU affinity
Let's abstract the CPU affinity, instead of calling linux only code from
sandbox.

Fixes: #6044

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-11 17:55:53 -08:00
Eric Ernst
f137048be3 resource-control: add helper function for setting CPU affinity
Let's abstract the CPU affinity

Fixes: #6044

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-11 17:55:53 -08:00
Eric Ernst
73216a8104 vendor: revendor netlink to get latest
This'll address issue where netlink couldn't build on Darwin hosts.

Fixes: #6026

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2023-01-11 17:23:15 -08:00
Gabriela Cervantes
fc17d7cc41 virtcontainers: Fix misspelling in error message
This PR fixes a misspelling in the error message when it tries to run
a system without Confidential computing support.

Fixes #6042

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-01-11 21:58:07 +00:00
GabyCT
c6b7f69040 Merge pull request #5837 from deagon/doc-fix
docs: add hint of probing loop module
2023-01-11 12:20:47 -06:00
Tim Zhang
c91b142587 Merge pull request #6035 from liubin/fix/5376-set-a-fixed-cgroups-version
tools: add --locked option for cargo install
2023-01-11 20:44:23 +08:00
Peng Tao
12fd6ffc1f runtime: fix up disable_netns handling
With `disable_netns=true`, we should never scan the sandbox netns which
is the host netns in such case.

Fixes: #6021
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-01-11 12:25:24 +00:00
Bin Liu
64c9114a39 tools: add --locked option for cargo install
There is a broken release of cgroups-rs, and cargo install does not use
the versions pinned in Cargo.lock by default, so add the `--locked` option
to make it use the versions specified in Cargo.lock.

Fixes: #5376

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-11 19:34:46 +08:00
Bin Liu
7eb43cec15 runtime: add test generated file to .gitignore
Add test generated file to .gitignore to avoid making the
working directory dirty.

Fixes: #6031

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-11 17:16:06 +08:00
Bin Liu
8551853cfe runtime: use system pagesize for hugepage test
TestHandleHugepages performs mount operations with different page sizes,
but some systems only support the 2M page size, so the test for a 1G page size fails.

Fix this by only mounting the page sizes listed under `/sys/kernel/mm/hugepages`,
which are the ones the OS supports.

Fixes: #6029

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-11 17:02:58 +08:00
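Discovering the supported sizes, as described above, can be sketched like this. It assumes the usual sysfs layout in which each supported size has a directory named `hugepages-<N>kB`; the function names are hypothetical and not taken from the test itself.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

const hugepagesDir = "/sys/kernel/mm/hugepages"

// parseHugepageDir turns a directory name such as "hugepages-2048kB" into
// a size in bytes. It returns false for names that do not match.
func parseHugepageDir(name string) (uint64, bool) {
	if !strings.HasPrefix(name, "hugepages-") || !strings.HasSuffix(name, "kB") {
		return 0, false
	}
	digits := strings.TrimSuffix(strings.TrimPrefix(name, "hugepages-"), "kB")
	kb, err := strconv.ParseUint(digits, 10, 64)
	if err != nil {
		return 0, false
	}
	return kb * 1024, true
}

// supportedHugepageSizes lists the page sizes, in bytes, that the running
// kernel exposes. It returns an error on systems without hugepage support.
func supportedHugepageSizes() ([]uint64, error) {
	entries, err := os.ReadDir(hugepagesDir)
	if err != nil {
		return nil, err
	}
	var sizes []uint64
	for _, entry := range entries {
		if size, ok := parseHugepageDir(entry.Name()); ok {
			sizes = append(sizes, size)
		}
	}
	return sizes, nil
}

func main() {
	fmt.Println(parseHugepageDir("hugepages-2048kB")) // 2097152 true
	fmt.Println(supportedHugepageSizes())
}
```

A test can then iterate over the returned sizes instead of a hard-coded list, so it never tries to mount a size the host cannot provide.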
Bin Liu
0ec4aa1a86 Merge pull request #6007 from jongwu/single_container
runtime-rs: add Single Container support
2023-01-11 10:55:50 +08:00
Eric Ernst
07e77f5be7 Merge pull request #5994 from dcantah/virtcontainers_tests_darwin
virtcontainers: tests: Ensure Linux specific tests are just run on Linux
2023-01-10 17:13:28 -08:00
Fabiano Fidêncio
147c56bb8d Merge pull request #6019 from liubin/fix/6018-virtiofsd-cache-mod
Change cache mode from none to never
2023-01-10 23:12:13 +01:00
Bin Liu
709483425f Merge pull request #6014 from GabyCT/topic/fixinidentationaks
tools: Fix indentation for setup aks script
2023-01-10 17:49:27 +08:00
Bin Liu
8225d8044e Merge pull request #6003 from dcantah/fs-skeleton
virtcontainers: fs_share: Add Darwin skeleton
2023-01-10 17:48:45 +08:00
Bin Liu
86a82cace9 runtime: change cache mode from none to never
New Rust virtiofsd's `cache` mode doesn't support `none` mode,
we should use `never` to replace it.

Fixes: #6018

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-10 17:29:48 +08:00
Bin Liu
82c59efd65 runtime-rs: change cache mode from none to never
New Rust virtiofsd's `cache` mode doesn't support `none` mode,
we should use `never` to replace it.

Fixes: #6018

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-10 16:14:59 +08:00
Bin Liu
7b309b578d kata-types: change cache mode from none to never
New Rust virtiofsd's `cache` mode doesn't support `none` mode,
we should use `never` to replace it.

Fixes: #6018

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-10 14:21:30 +08:00
Bin Liu
fee4e7c7c4 docs: change cache mode from none to never
New Rust virtiofsd's `cache` mode doesn't support `none` mode,
we should use `never` to replace it.

Fixes: #6018

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-10 14:19:25 +08:00
Eric Ernst
4d53303a7d Merge pull request #6005 from dcantah/vfw-skeleton
virtcontainers: Add a Virtualization.framework skeleton
2023-01-09 15:50:04 -08:00
Archana Shinde
594b57d082 utils: Add utility functions to get cpu and distro details.
These functions are meant to be used for the kata-env command.

Fixes: #5688

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-01-09 14:36:36 -08:00
Archana Shinde
d33e343613 check: Move PROC_CPUINFO from architecture specific files
Move PROC_CPUINFO into check.rs. This file is used across
architectures and does not need to be in arch-specific files.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-01-09 14:31:33 -08:00
Gabriela Cervantes
f8a93a1ded tools: Fix indentation for setup aks script
This PR fixes the indentation for setup aks script being used
in tools.

Fixes #6013

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-01-09 15:27:50 +00:00
Tim Zhang
6628891666 Merge pull request #5982 from liubin/fix/5981-remove-tests-func
kata-ctl: remove get_kata_version_by_url function
2023-01-09 18:18:21 +08:00
Bin Liu
03de5f41b2 kata-ctl: remove get_kata_version_by_url function
In `src/tools/kata-ctl/src/check.rs`, there is a function
`get_kata_version_by_url` in the tests mod,
indeed we can use the `get_kata_all_releases_by_url` in the main mod
to replace it.

Fixes: #5981

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-09 15:32:16 +08:00
Fupan Li
2b34f0a54f Merge pull request #5992 from liubin/fix/5987-kata-ctl-s390x-build-error
kata-ctl: fix build error on s390x
2023-01-09 15:28:37 +08:00
Bin Liu
1bae41a4d4 Merge pull request #5996 from dcantah/vfw-initial
virtcontainers: Introduce hypervisor_darwin
2023-01-09 11:37:02 +08:00
Jianyong Wu
464d4c94de runtime-rs: process single_container
Process single_container like pod_sandbox when creating a container, but like
pod_container when getting the memory/cpu size info from the oci/spec.

Fixes: #6006
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-01-09 10:29:01 +08:00
Jianyong Wu
5f9c892e48 kata-types: add single_container support
For now, only pod_sandbox and pod_container are supported. This doesn't cover
a container started by ctr, which is a single_container as defined
in kata 2.0. Port the single_container kata type from kata 2.0 to kata 3.0.

Fixes: #6006
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-01-09 10:29:01 +08:00
Samuel Ortiz
fa9ae9362c virtcontainers: Add a Virtualization.framework skeleton
Fixes: #6004

A Virtualization.framework based Hypervisor implementation.
This is just stubs for now to eventually get this building.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-08 07:40:21 -08:00
Eric Ernst
d48b22bb13 virtcontainers: fs_share: add Darwin skeleton
Fixes: #6002

As a first pass for testing, let's add a skeleton for filesystem
sharing support on Darwin.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-07 19:56:47 -08:00
Bin Liu
2c10b37172 Merge pull request #5991 from dcantah/darwin-sigs
runtime: Define Darwin handled signals list
2023-01-07 11:19:48 +08:00
Bin Liu
bc8a6423e0 Merge pull request #5986 from dcantah/nydus-nonetns
nydus: net-ns handling needs to be only executed on Linux hosts
2023-01-07 11:19:07 +08:00
Bo Chen
8265aad380 Merge pull request #6001 from fidencio/topic/add-network-hotplug-support-for-clh
clh: Ensure it works with Docker / Moby
2023-01-06 13:06:57 -08:00
Eric Ernst
fafc7a8b1a virtcontainers: tests: Ensure Linux specific tests are just run on Linux
Fixes: #5993

Several tests utilize linux'isms like Mounts, bindmounts, vsock etc.

Let's ensure that these are still tested on Linux, but that we also skip
these tests when on other operating systems (Darwin). This commit just
moves tests; there shouldn't be any functional test changes. While the
tests still won't be runnable on Darwin/other hosts yet, this is a necessary
step forward.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-06 11:09:11 -08:00
Fabiano Fidêncio
efa4fc0b25 clh: Add hotplug support for network devices
This is needed in order to have Moby / Docker working properly with
Cloud Hypervisor, as Moby / Docker relies on hotplugging a network
device to the VM as a preStartHook.

Fixes: #5997

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-06 18:59:47 +01:00
Fabiano Fidêncio
1074d2c1d3 clh: Make vmAddNetPutRequest capable of doing hotplugs
The only change needed for vmAddNetPutRequest() to be capable of
dealing with hotplugs, instead of only coldplugs, is making sure it
doesn't error out when a `200` response is returned.

The 200 response means:
"""
The new device was successfully added to the VM instance.
"""

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-06 18:55:55 +01:00
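The status check described in the commit above can be sketched as follows. This is only an illustration in Rust: the real vmAddNetPutRequest() is part of the Go runtime, the function name `add_net_succeeded` is hypothetical, and treating `204` as the coldplug success code is an assumption that the commit message does not state.

```rust
/// Hypothetical helper: decides whether an add-net response counts as success.
/// Assumption: 204 is what a coldplug returns; 200 is the hotplug response
/// quoted in the commit message ("The new device was successfully added to
/// the VM instance.").
fn add_net_succeeded(status: u16) -> bool {
    matches!(status, 200 | 204)
}
```

With a check of this shape, a hotplug that returns `200` no longer surfaces as an error, while other status codes still do.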
Zhongtao Hu
ec18368aba Merge pull request #5858 from openanolis/refactor-guest-hook
agent: refactor guest hooks
2023-01-06 22:28:09 +08:00
Fabiano Fidêncio
175794458f Merge pull request #5972 from bergwolf/github/hook
fix moby prestart hook handling
2023-01-06 14:54:39 +01:00
Eric Ernst
9ec8a13985 virtcontainers: introduce hypervisor_darwin
Fixes: #5995

Placeholder skeleton at this point - implementation will be added after
basic build refactoring lands.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-06 02:03:34 -08:00
Peng Tao
8bb68a9f28 vc/network: skip existing endpoints when scanning for new ones
So that addAllEndpoints() becomes re-entrant and we can use it to scan
netns changes.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-01-06 10:01:19 +00:00
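A minimal sketch of what "re-entrant" means for the scan in the commit above, written in Rust for illustration only. The actual addAllEndpoints() lives in the Go virtcontainers code; the `Network` struct, its field and the method signature below are all hypothetical.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the network state kept by the runtime.
#[derive(Default)]
struct Network {
    endpoints: BTreeSet<String>,
}

impl Network {
    /// Records every endpoint name in `found` that is not known yet and
    /// returns only the newly added ones. Calling it again with the same
    /// input adds nothing, which is what makes repeated scans safe.
    fn add_all_endpoints(&mut self, found: &[&str]) -> Vec<String> {
        let mut added = Vec::new();
        for &name in found {
            // `insert` returns false when the name was already present.
            if self.endpoints.insert(name.to_string()) {
                added.push(name.to_string());
            }
        }
        added
    }
}
```

A second scan that sees `eth0` and `eth1` after a first scan that saw only `eth0` would report just `eth1` as new.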
Bin Liu
c21a8d5ff8 kata-ctl: fix build error on s390x
Some type is not imported in s390x's mod file.

Fixes: #5987

Signed-off-by: Bin Liu <bin@hyper.sh>
2023-01-06 13:27:28 +08:00
Bin Liu
31abe170fc Merge pull request #5984 from dcantah/schedcore-nonlinux
schedcore: Make buildable on !linux
2023-01-06 10:38:39 +08:00
Samuel Ortiz
3b4420eb8e runtime: Define Darwin handled signals list
Fixes: #5990

Some signals, such as SIGSTKFLT, may not be defined on non-Linux host
OSes. SIGSTKFLT is also not defined on certain architectures, but that
is irrelevant here.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-05 17:50:47 -08:00
Danny Canter
24b05a99b6 schedcore: Make buildable on !linux
Fixes: #5983

sched-core only makes sense on Linux hosts. Let's add stub/error for
other platforms.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-05 11:51:04 -08:00
Danny Canter
3886aad199 nydus: net-ns handling needs to be only executed on Linux hosts
Fixes: #5985

With nydus not being its own pkg, it is challenging to implement cleanly
in a virtcontainers package that isn't necessarily Linux-only. The
existing code utilizes network namespace code in order to ensure nydus
is launched in the host netns. This is very Linux specific - so let's
make sure we only carry this out in a linux specific file.

In the Darwin case, to allow for compilation at least, let's add a stub
for doNetNS. Ideally the nydus and vc code can be refactored /
decoupled.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-05 11:48:43 -08:00
Bin Liu
1b46d4fb50 Merge pull request #5611 from wllenyj/dragonball-ut-4
Built-in Sandbox: add more unit tests for dragonball. Part 4
2023-01-05 15:21:36 +08:00
Bin Liu
a40fca1f57 Merge pull request #5976 from yaoyinnan/5825/fix/cleanup-hypervisor
runtime-rs: cleanup the run dir of hypervisor when shut down
2023-01-05 15:14:21 +08:00
Zhongtao Hu
8c4c0d2715 Merge pull request #5467 from tzY15368/feat-katactl-direct-vol
Feat: implementation of kata-ctl direct-volume operations
2023-01-05 14:06:18 +08:00
Bin Liu
4ab9364aa6 Merge pull request #5946 from dcantah/clarify-var
Runtime: Clarify mutability of global var
2023-01-05 13:08:45 +08:00
Bin Liu
649d2d4b8d Merge pull request #5964 from openanolis/kata-runtime
kata-runtime: add rust runtime path for kata-runtime exec
2023-01-05 09:35:21 +08:00
Fabiano Fidêncio
db372d8897 Merge pull request #5974 from likebreath/0103/clh_v28.1
versions: Upgrade to Cloud Hypervisor v28.1
2023-01-04 19:02:35 +01:00
yaoyinnan
e256903af2 runtime-rs: cleanup the run dir of hypervisor when shut down
Cleanup the run dir of hypervisor when shut down.

Fixes: #5825

Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-01-04 22:36:39 +08:00
Bin Liu
e2c7e5f172 Merge pull request #5950 from openanolis/upcall_fea
runtime-rs: add dbs-upcall feature
2023-01-04 16:20:40 +08:00
Tingzhou Yuan
937a41346e kata-ctl: add unit tests for volume ops
Added table-driven unit tests and a
functionality test for functions in volume_ops.

`join_path` relies on safe_path::scoped_join
to validate the unsafe part of the input.
The test case also takes into account the possibility of a specially
constructed string that would get b64-encoded into a path-like string.

Fixes #5341

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2023-01-04 01:34:40 -05:00
Tingzhou Yuan
8451db7c0c kata-ctl: direct-volume: add Add and Remove handlers
This commit adds direct-volume command handlers for kata-ctl,
including add, remove, stats and resize. Stats and resize
make HTTP over UDS calls to runtime-rs, while add and remove
run locally on the host.

Fixes #5341

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2023-01-04 01:34:38 -05:00
Tingzhou Yuan
2d4b2cf72c runtime-rs: add POST method to shim-client
partly refactored shim-client to reuse code, added POST method
support, and made path string constants public for client imports.

Fixes #5341

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2023-01-04 01:33:53 -05:00
Tingzhou Yuan
cae78a6851 kata-ctl: add constants for direct-volume commands
added direct-volume mountinfo struct and constant path strings to kata-types

Fixes #5341

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2023-01-04 01:33:51 -05:00
Bin Liu
38a6bc570d Merge pull request #5947 from dcantah/yq-darwin
runtime/Makefile: Get some bits happy on darwin
2023-01-04 14:24:43 +08:00
Bin Liu
3bda4a8194 Merge pull request #5943 from liubin/fix/5942-remove-old-description
docs: remove old and misleading instructions for minikube
2023-01-04 12:02:53 +08:00
Bin Liu
5b11201848 Merge pull request #5945 from liubin/fix/5944-indents
packaging: fix indents in build-kernel.sh
2023-01-04 11:00:49 +08:00
Bo Chen
652021ad95 versions: Upgrade to Cloud Hypervisor v28.1
This patch upgrades Cloud Hypervisor to its latest bug-fix release, v28.1:
https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v28.1

Fixes: #5973

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-01-03 14:09:44 -08:00
Fabiano Fidêncio
156e4e673b Merge pull request #5908 from Alex-Carter01/kmod_warning
kernel: adding kmod to do docker env
2023-01-03 20:35:22 +01:00
Fabiano Fidêncio
67f0fd505d Merge pull request #5967 from fidencio/topic/bump-rust-toolchain-to-1.66.0
versions: Update the rust toolchain to 1.66.0
2023-01-03 18:50:16 +01:00
Fabiano Fidêncio
5f5f6ce7a7 Merge pull request #5951 from liubin/fix/5948-check_latest_version
kata-ctl: skip test if access GitHub.com fail
2023-01-03 18:49:57 +01:00
Peng Tao
d085389127 vc: fix up UT for CreateSandbox API change
Need to adapt the UT as well.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-01-03 22:30:42 +08:00
Peng Tao
578a9c25f0 vc: rescan network endpoints after running prestart hooks
Moby relies on the prestart hooks to configure network endpoints. We
should rescan the netns after running them so that the newly added
endpoints can be found and plugged to the guest.

Fixes: #5941
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-01-03 22:30:41 +08:00
Fabiano Fidêncio
a3e1257708 Merge pull request #5891 from jtumber-ibm/foreign-cc
agent: unset `CC` for cross-build
2023-01-03 14:38:24 +01:00
Peng Tao
cb84b0fb02 katautils: run prestart hooks after starting VM
So that we can pass the hypervisor pid to the hook instead of the
runtime process's.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-01-03 10:52:32 +00:00
Fabiano Fidêncio
079462d2eb runk: Fix needless_borrow warning
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 17:14:13 +01:00
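For readers unfamiliar with the needless_borrow lint named in the commit above, here is a small sketch of the pattern it reports and the usual fix. The function names are made up for illustration and are not taken from runk.

```rust
/// Hypothetical helper that only needs a string slice.
fn name_len(name: &str) -> usize {
    name.len()
}

/// The flagged form: `name` is already a `&str`, so `&name` builds a
/// `&&str` that the compiler immediately dereferences again.
/// The `allow` keeps this "before" version clippy-clean for the example.
#[allow(clippy::needless_borrow)]
fn len_before(name: &str) -> usize {
    name_len(&name)
}

/// The fixed form: pass the existing reference through unchanged.
fn len_after(name: &str) -> usize {
    name_len(name)
}
```

Both versions behave identically; the fix only removes the redundant `&`.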
Fabiano Fidêncio
2c24fcf34c runtime-rs: Fix clippy::bool-to-int-with-if warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to boolean to int conversion using if.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#bool_to_int_with_if

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 17:14:13 +01:00
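The bool_to_int_with_if lint referenced above concerns code shaped like the first function below. This is a generic sketch with hypothetical names, not the code that was changed in runtime-rs.

```rust
/// The flagged form: an `if` used only to turn a bool into 1 or 0.
#[allow(clippy::bool_to_int_with_if)]
fn flag_before(enabled: bool) -> u32 {
    if enabled { 1 } else { 0 }
}

/// The form clippy suggests: convert the bool directly.
fn flag_after(enabled: bool) -> u32 {
    u32::from(enabled)
}
```

`u32::from(bool)` yields 1 for `true` and 0 for `false`, so the two functions are equivalent.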
Fabiano Fidêncio
025e78341e runtime-rs: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 17:14:13 +01:00
Fabiano Fidêncio
4fb163d570 runtime-rs: Allow clippy:box_default warnings
The rust toolchain bump to 1.66.0 raised a
warning about using Box::default() instead of specifying a type.

For now that's something we don't need to change, so let's ignore the
warning in this very specific case.

See:
https://rust-lang.github.io/rust-clippy/master/index.html#box_default

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 17:14:01 +01:00
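As an illustration of the box_default lint the commit above chose to allow, the sketch below shows the two spellings involved. `Settings` is a hypothetical type, and the exact expression that triggered the warning in runtime-rs is not shown in this log.

```rust
/// Hypothetical configuration type used only for this example.
#[derive(Default, Debug, PartialEq)]
struct Settings {
    vcpus: u32,
    debug: bool,
}

/// The form the lint reports: boxing an explicitly constructed default.
/// Allowing the lint, as the commit does, keeps this spelling.
#[allow(clippy::box_default)]
fn boxed_before() -> Box<Settings> {
    Box::new(Settings::default())
}

/// The form clippy prefers: let `Box` build the default itself.
fn boxed_after() -> Box<Settings> {
    Box::default()
}
```

Both functions return the same value, which is why silencing the warning is a reasonable choice when the explicit type reads better.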
Fabiano Fidêncio
20121fcda7 runtime-rs: Fix unnecessary_cast warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to unnecessary_cast.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 16:16:39 +01:00
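A short sketch of what an unnecessary_cast looks like and how it is typically fixed. The function and parameter names are hypothetical and do not come from runtime-rs.

```rust
/// The flagged form: both operands are already `u64`, so the casts do nothing.
#[allow(clippy::unnecessary_cast)]
fn total_before(pages: u64, page_size: u64) -> u64 {
    (pages as u64) * (page_size as u64)
}

/// The fixed form: drop the casts to the type the values already have.
fn total_after(pages: u64, page_size: u64) -> u64 {
    pages * page_size
}
```

Such casts often appear after a field or return type has been changed to the target type and the old conversions were left behind.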
Fabiano Fidêncio
b95364a140 dragonball: Allow question_mark warning in allocate_device_resources()
The rust toolchain bump to 1.66.0 raised a
warning about the code being able to be refactored to use `?`.

For now that's something we don't need to change, so let's ignore the
warning in this very specific case.

See:
https://rust-lang.github.io/rust-clippy/master/index.html#question_mark

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 15:55:49 +01:00
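The question_mark lint allowed in the commit above targets early returns that could be written with `?`. The sketch below uses invented functions to show both spellings; it is not the body of allocate_device_resources().

```rust
/// The flagged form: a manual "return None if missing" check.
/// `#[allow(...)]` silences the lint for this item only, which is the
/// approach the commit describes.
#[allow(clippy::question_mark)]
fn first_even_before(values: &[u32]) -> Option<u32> {
    let found = values.iter().copied().find(|v| v % 2 == 0);
    if found.is_none() {
        return None;
    }
    found.map(|v| v * 10)
}

/// The form clippy suggests: `?` returns early on `None`.
fn first_even_after(values: &[u32]) -> Option<u32> {
    let found = values.iter().copied().find(|v| v % 2 == 0)?;
    Some(found * 10)
}
```

Scoping the `allow` to a single function keeps the lint active for the rest of the crate.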
Fabiano Fidêncio
0b2f060bf3 dragonball: Fix unnecessary_cast warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to unnecessary_cast.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 15:55:42 +01:00
Fabiano Fidêncio
a545a65934 agent: Allow clippy::question_mark warning in Namespace{}
The rust toolchain bump to 1.66.0 raised a
warning about the code being able to be refactored to use `?`.

For now that's something we don't need to change, so let's ignore the
warning in this very specific case.

See:
https://rust-lang.github.io/rust-clippy/master/index.html#question_mark

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 15:22:20 +01:00
Fabiano Fidêncio
9ced34dd22 agent: Fix explicit_auto_deref warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to explicit_auto_deref.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#explicit_auto_deref

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:59:50 +01:00
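To show what explicit_auto_deref refers to, here is a minimal sketch with hypothetical names. It mirrors the kind of expression the lint reports, not the agent's actual code.

```rust
/// Hypothetical consumer that takes a string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

/// The flagged form: `&*name` spells out a dereference that auto-deref
/// would perform anyway when a `&str` is expected.
#[allow(clippy::explicit_auto_deref)]
fn shout_before() -> String {
    let name = String::from("agent");
    let s: &str = &*name;
    shout(s)
}

/// The fixed form: `&name` coerces from `&String` to `&str` on its own.
fn shout_after() -> String {
    let name = String::from("agent");
    let s: &str = &name;
    shout(s)
}
```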
Fabiano Fidêncio
f77220490e agent: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:58:13 +01:00
Fabiano Fidêncio
7bcdc9049a rustjail: Fix unnecessary_cast warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to unnecessary_cast.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:42:58 +01:00
Fabiano Fidêncio
41d7dbaaea rustjail: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:42:25 +01:00
Fabiano Fidêncio
2a73e057db kata-types: Fix unnecessary_cast warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to unnecessary_cast.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
Fabiano Fidêncio
cf9ef1833c kata-types: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
Fabiano Fidêncio
126187e814 safe-path: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
Fabiano Fidêncio
bb78d35db8 kata-sys-util: Fix "match-like-matches-macro" warning
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to "match-like-matches-macro".

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#match_like_matches_macro

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
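The match-like-matches-macro warning fixed above is about boolean `match` expressions that can be replaced by `matches!`. The enum and functions below are hypothetical examples, not code from kata-sys-util.

```rust
/// Hypothetical filesystem kinds used only for this example.
enum FsType {
    Overlay,
    Virtiofs,
    Other,
}

/// The flagged form: a match whose arms only produce `true` or `false`.
#[allow(clippy::match_like_matches_macro)]
fn is_virtiofs_before(fs: &FsType) -> bool {
    match fs {
        FsType::Virtiofs => true,
        _ => false,
    }
}

/// The fixed form: `matches!` expresses the same test in one line.
fn is_virtiofs_after(fs: &FsType) -> bool {
    matches!(fs, FsType::Virtiofs)
}
```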
Fabiano Fidêncio
668e652401 kata-sys-util: Fix unnecessary_cast warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to unnecessary_cast.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
Fabiano Fidêncio
c1a8d89a72 kata-sys-util: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
Fabiano Fidêncio
c9c38e6d01 logging: Allow clippy::type-complexity warning
The rust toolchain bump to 1.66.0 raised a
warning about the type complexity used for the closure. That's
something we don't want to change, so let's ignore the warning in this
very specific case.

See:
https://rust-lang.github.io/rust-clippy/master/index.html#type_complexity

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:28:07 +01:00
Fabiano Fidêncio
ffd6fbb6b6 logging: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:18:14 +01:00
Fabiano Fidêncio
60df30015b protocols: Fix unnecessary_cast warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to unnecessary_cast.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 14:18:14 +01:00
Danny Canter
56e7b5d0fd runtime/Makefile: Get some bits happy on darwin
Substitution in the yq install script doesn't like zsh, and additionally
the version of yq we're using doesn't have a darwin/arm64 build so grab
the amd64 version and let rosetta work its magic.

Additionally swap to abspath from readlink -m for the printing of what binaries
to install, as the -m flag doesn't exist on the BSD variant, and this
should be the same behavior.

Fixes: #5970

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-02 04:19:58 -08:00
Fabiano Fidêncio
0bbeb34b4c protocols: Fix needless_borrow warnings
As we bumped the rust toolchain to 1.66.0, some new warnings have been
raised due to needless_borrow.

Let's fix them all here.

For more info about the warnings, please, take a look at:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 12:41:29 +01:00
Fabiano Fidêncio
dfea6c7d21 versions: Update the rust toolchain to 1.66.0
We're doing the bump on main, as we'll need this as part of the CCv0
branch due to the dependencies we have there.

Link to the 1.66.0 release:
https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1660-2022-12-15

Fixes: #5966

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-01-02 11:34:00 +01:00
Danny Canter
86ee24b33c Runtime: Clarify mutability of global var
Was about to change `urandomdev` to a constant when I realized it's
intentionally mutable so it can be mocked in tests. There's other
comments to the same effect so clarify here as well.

Fixes: #5965

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-02 01:13:34 -08:00
Zhongtao Hu
dae6670628 kata-runtime: add rust runtime path for kata-runtime exec
add rust runtime path for kata-runtime exec

Fixes: #5963
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-30 13:34:34 +08:00
Chao Wu
a2e3715e01 upcall: remove upcall client when stopping vm
To avoid a resource leak, we need to remove the upcall client in the vm
and vcpu manager when stopping the vm.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-12-28 20:23:39 +08:00
wllenyj
31591d7915 dragonball: fix unit test failure case about Kvm.
Due to the wrong use of as_raw_fd, Kvm was dropped twice.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-12-26 11:32:31 +08:00
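The log does not show the dragonball code involved, so the following is only a guess at the general failure mode, sketched for Unix targets. `as_raw_fd` lends out a descriptor while the original owner keeps it; wrapping that borrowed descriptor in a second owner means both owners close it on drop. When ownership really has to move, `into_raw_fd` releases it first. The function name `rewrap` is hypothetical.

```rust
use std::fs::File;
use std::os::unix::io::{FromRawFd, IntoRawFd};

/// Moves a file descriptor into a new `File` without a double close.
/// `into_raw_fd` consumes `file`, so its destructor will not run and the
/// descriptor has exactly one owner afterwards.
fn rewrap(file: File) -> File {
    let fd = file.into_raw_fd();
    // SAFETY: `fd` was just released by `into_raw_fd`, so it is open and
    // nothing else owns it.
    unsafe { File::from_raw_fd(fd) }
}
```

Had `as_raw_fd` been used in place of `into_raw_fd` here, both the original `File` and the returned one would try to close the same descriptor.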
wllenyj
2b02e0a9bf dragonball: add more unit test for vcpu manager
Added more unit tests for Vcpu Manager.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-12-26 11:31:42 +08:00
Yushuo
85f9094f17 agent: refactor guest hooks
We have to execute some hooks both on the host and in the guest, and the
common operations are implemented in /libs/kata-sys-util/src/hooks.rs.

In this commit, we are going to refactor the code of guest hooks using
code in /libs/kata-sys-util/src/hooks.rs. At the same time, we move
function valid_env to kata-sys-util to make it usable by both agent and
runtime.

Fixes: #5857

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2022-12-26 10:15:19 +08:00
Chao Wu
1511587a9a Merge pull request #5601 from openanolis/hugepage
runtime-rs: enable hugepage
2022-12-25 22:35:06 +08:00
Zhongtao Hu
3605062258 runtime-rs: add dbs-upcall feature
add dbs-upcall feature to dragonball

Fixes: #5949

Depends-on: github.com/kata-containers/tests#5355

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-25 19:02:42 +08:00
Bin Liu
03a0c9d78e kata-ctl: skip test if access GitHub.com fail
This commit calls `error_for_status` after `send`; the call
returns an error if the status code is in the 400-499 or 500-599 range.

Access to github.com sometimes fails; in that case we
skip the test to prevent the CI from failing.

Fixes: #5948

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-23 15:12:12 +08:00
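The skip logic described above can be modelled with the standard library alone. In the sketch below, `error_for_status` is a hand-written stand-in that mimics the 400-599 behaviour the commit describes (the real call belongs to the HTTP client used by kata-ctl), and `Outcome` and `check_latest` are hypothetical names.

```rust
/// Hypothetical result of a version-check test run.
#[derive(Debug, PartialEq)]
enum Outcome {
    Checked(String),
    Skipped(String),
}

/// Stand-in for the client's status check: 4xx and 5xx become errors.
fn error_for_status(status: u16) -> Result<u16, String> {
    if (400..600).contains(&status) {
        Err(format!("HTTP status {}", status))
    } else {
        Ok(status)
    }
}

/// Runs the check only when the request succeeded; otherwise reports a
/// skip so that a GitHub outage does not fail the test.
fn check_latest(status: u16, body: &str) -> Outcome {
    match error_for_status(status) {
        Ok(_) => Outcome::Checked(body.to_string()),
        Err(e) => Outcome::Skipped(e),
    }
}
```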
Bin Liu
1dcbda3f0f kata-ctl: update Cargo.lock
kata-ctl depends on runtime-rs, and this commit:
fbf294da3f

added a new dependency named shim-interface, so this Cargo.lock should be updated too.

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-23 15:06:50 +08:00
Bin Liu
b4b5d8150e docs: remove old and misleading instructions for minikube
Some instructions are outdated; delete them to avoid misleading readers.

Fixes: #5942

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-23 12:02:46 +08:00
Bin Liu
0fe24e08bb packaging: fix indents in build-kernel.sh
In the function get_kernel, the lines are indented with two tabs
where one tab is expected.

Fixes: #5944

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-22 14:56:06 +08:00
Fupan Li
dc9c8d3357 Merge pull request #5901 from justxuewei/fix/mpleak
runtime-rs: Clean up mount points shared to guest
2022-12-21 09:59:25 +08:00
Bin Liu
92b843ac5a Merge pull request #5924 from jongwu/kata-ctl-checkcpu
kata-ctl: fix checkcpu bug in non-x86 arches
2022-12-21 09:16:53 +08:00
Jianyong Wu
3480780bd8 kata-ctl: add check framework support for non-x86
The check framework was changed for x86. Enable those changes for non-x86 accordingly.

Fixes: #5923
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-12-20 11:41:00 +08:00
Jianyong Wu
1bd533f10b kata-ctl: let check framework arch-agnostic
The current check framework is specific to x86. Refactor the code
to make it arch-agnostic.

Fixes: #5923
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-12-20 11:41:00 +08:00
Fabiano Fidêncio
2e54c8e887 Merge pull request #5921 from fidencio/3.1.0-alpha1-branch-bump
# Kata Containers 3.1.0-alpha1
2022-12-19 15:45:53 +01:00
Bin Liu
6039516802 Merge pull request #5925 from xinydev/fix-docs
docs: Remove duplicate sentences
2022-12-19 17:12:15 +08:00
Peng Tao
473f5ff7da Merge pull request #5861 from mflagey/Docs_Change_build_virtiofsd_in_developer_guide_#5860
docs: Update virtiofsd build script in the developer guide
2022-12-19 17:02:35 +08:00
Bin Liu
0cf443a612 Merge pull request #5915 from openanolis/legacy_device
dragonball: refactor legacy device initialization
2022-12-19 13:31:45 +08:00
Xuewei Niu
fd77eebd4d runtime-rs: fix the issues mentioned in the code review
In order to avoid cloning, changed the signature of
`ShareFsMount::share_rootfs`, `ShareFsMount::share_volume`, and
`ShareFsMount::umount_rootfs` to receive a reference to a config.

Fixes: #5898

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2022-12-19 11:46:50 +08:00
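The signature change described above, taking a config by reference so callers no longer clone it, follows a common Rust pattern. The sketch below uses a hypothetical `ShareConfig` type and free functions; the real `ShareFsMount` methods and their config types are not shown in this log.

```rust
/// Hypothetical config; the real structs carry more fields.
#[derive(Clone)]
struct ShareConfig {
    source: String,
    target: String,
}

/// Before: taking ownership forces a caller that still needs the config
/// to pass `cfg.clone()`.
fn share_by_value(cfg: ShareConfig) -> String {
    format!("{} -> {}", cfg.source, cfg.target)
}

/// After: borrowing is enough for read-only use, so no clone is needed.
fn share_by_ref(cfg: &ShareConfig) -> String {
    format!("{} -> {}", cfg.source, cfg.target)
}
```

A caller holding `cfg` can call `share_by_ref(&cfg)` repeatedly, whereas `share_by_value` needs `cfg.clone()` on every call but the last.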
Xuewei Niu
0e69207909 runtime-rs: Clean up mount points shared to guest
Fixed issues where shared volumes couldn't umount correctly.

The rootfs of each container is cleaned up after the container is killed, except
for `NydusRootfs`. `ShareFsRootfs::cleanup()` calls
`VirtiofsShareMount::umount_rootfs()` to umount mount points shared to the
guest, and umounts the bundle rootfs.

Fixes: #5898

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2022-12-19 11:46:14 +08:00
Xin Yang
74fa10a235 docs: remove duplicate sentences
remove duplicate sentences in spdk docs
Fixes: #5926

Signed-off-by: Xin Yang <xinydev@gmail.com>
2022-12-17 11:26:36 +00:00
Bin Liu
e4645642d0 Merge pull request #5877 from openanolis/fix_start_bundle
runtime-rs: enable start container from bundle
2022-12-17 08:10:08 +08:00
Wainer Moschetta
339ef99669 Merge pull request #5867 from Alex-Carter01/sev_module_unload
kernel building: Add module unload to SEV kernel config
2022-12-16 17:17:53 -03:00
Alex Carter
ecb28e2b13 kernel: adding kmod to do docker env
adding kmod to kernel building docker env to remove warning

Fixes: #5866
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2022-12-16 17:02:47 +00:00
Alex Carter
9f465a58af kernel: Add "unload" module to SEV config
Fixes: #5866
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2022-12-16 16:56:56 +00:00
Fabiano Fidêncio
b0896126cf release: Kata Containers 3.1.0-alpha1
- tools: Add some new gitignore items
- shim: return hypervisor's pid not shim's pid
- Dragonball: introduce upcall
- refactor(shim-mgmt): move client side to libs
- kata-ctl: Add --list option
- kata-ctl: check: only-list-releases and include-all-releases options
- basic framework for QEMU support in runtime-rs
- tools: Fix indentation on build kernel script
- runtime-rs: fix standalone share fs
- runtime-rs: fix sandbox_pidns calculation and oci spec amending
- runtime,agent: Add SELinux support for containers inside the guest
- kata-sys-util: fix issues where umount2 couldn't get the correct path
- agent: Drop the Option for LinuxContainer.cgroup_manager
- dragonball: enable kata3.0/dragonball CI on Arm
- fix kata deploy error after node reboot.
- tools: Fix indentation for ovmf script
- runtime: prevent waiting 50 ms minimum for a process exit
- runtime-rs: fix high cpu
- agent: remove `sysinfo` dependency
- runtime-rs: bind mount volumes in sandbox level
- docs: Update the rust version in the installation documentation
- runtime-rs: fix some variable names and typos
- kata-ctl: add host check for aarch64
- kata-ctl: fix dependency version conflict
- workflow: fix cargo-deny-runner.yaml syntax error
- runtime: Add identification in version for runtime-rs
- workflow: call cargo in user's $PATH
- runtime-rs: remove the version number from the commit display message
- runk: Re-implement start operation using the agent codes
- build: update golang version to 1.19.3
- snap: Fix snapcraft setup (unbreak snap releases)
- fix(agent): fix iptables binary path in guest
- runtime-rs: moving only vCPU threads into sandbox controller
- tools: Remove extra tab spaces from kata deploy binaries script
- ci: let static checks don't depend on build
- actions: use matrix to refactor static checks
- agent: support systemd cgroup for kata agent.
- actions: skip some jobs using "paths-ignore" filter
- runtime: go fix code for 1.19
- doc: update runtime-rs "Build and Install"
- runtime: don't fail mkdir if the folder is already created by another process
- kernel: add CONFIG_X86_SGX into whitelist
- runtime-rs: block on the current thread when setup the network to avoid be take over by other task
- Refactor(runtime-rs): add conditional compile for virt-sandbox persist
- runtime: add log record to the qemu config method `appendDevices` for…
- runtime: Use containerd v1.6.8
- tools: Fix indentation of build static firecracker script
- package: add nydus to release artifacts
- agent: check if command exist before do ip_tables test
- runtime: Support virtiofs queue size for qemu and make it configurable
- docs: change mount-info.json to mountInfo.json
- docs: update doc "NVIDIA GPU passthrough"
- runtime-rs: support vhost-vsock
- utils: Add utility function to fetch the kernel version.
- versions: update nydusd version
- runtime-rs: support nydus v5 and v6 rootfs
- Upgrade to Cloud Hypervisor v28.0
- docs: update doc "Setup swap device in guest kernel"
- Rust fixes + Golang bump
- clh: avoid race condition when stopping clh
- tools: Fix indentation of build static virtiofsd script
- docs: Fix configuration path
- runtime-rs : fix the shim source in the documentation test is ambiguous
- versions: update vmm-sys-util and related crates to v0.11.0
- runtime-rs: delete all cargo patches
- feat(shim-mgmt): iptables handler
- tools: Remove empty spaces from build kernel script
- Built-in Sandbox: add more unit tests for dragonball. Part 3
- Dragonball: enable mem_file_path config into hugetlbfs process
- runtime-rs:add hypervisor interface capabilities
- cloud-hypervisor: Fix GetThreadIDs function
- github: Parallelise static checks
- runtime-rs: blanks filled & fixes made to virtiofsd launch
- vCPUs pinning support for Kata Containers
- runtime-rs: fix shared volume permission issue
- runk: Ignore an error when calling kill cmd with --all option
- runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
- snap: Unbreak docker install
- add EnterNetNS in virtcontainers
- tools: Fix indentation of build static clh script
- virtiofsd: Not use "link-self-contained=yes" on s390x
- Kata ctl drop privs
- versions: bump golangci-lint version
- runtime-rs: generate config files with the default target
- docs: Fix volumeMounts in SGX usage example
- versions: Update Cloud Hypervisor to b4e39427080
- docs: update rust runtime installation guide
- rustjail: Upgrade libseccomp crate to v0.3.0
- makefile: remove sudo when create symbolic link
- agent: remove redundant checks
- shim: Ensure pagesize is set when reporting hugetlb stats
- kata-ctl: Re-enable network tests on s390x (fixes 5438)
- agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
- fix readme content error at doc directory
- agent: validate hugepage size is supported
- Makefile: fix an typo in runtime-rs makefile
- qemu: Re-work static-build Dockerfile
- Modify agent-url return value in runtime-rs
- runtime-rs: regulate the comment in runtime-rs makefile
- doc: Update how-to-run-kata-containers-with-SNP-VMs.md
- kata-ctl: Disable network check on s390x
- virtiofsd: Build inside a container
- Dragonball: remove redundant comments in event manager
- versions: Update TDX QEMU
- runtime-rs: fix typo get_contaier_type to get_container_type
- kata-ctl: improve command descriptions for consistency
- runtime-rs: force shutdown shim process in it can't exit
- versions: Update TDX kernel
- ci: skip s390x for dragonball.
- Dragonball: delete redundant comments in blk_dev_mgr
- kata-ctl: Move development to main branch
- runtime-rs: support ephemeral storage for emptydir
- docs: fix a typo in rust-runtime-installation-guide
- Built-in Sandbox: add more unit tests for dragonball
- readme: remove libraries mentioning

b5cfd0958 kata-ctl: Fixed format for check release options
fbf294da3 refactor(shim-mgmt): move client side to libs
ae0dcacd4 tools: Add some new gitignore items
99485d871 shim: return hypervisor's pid not shim's pid
1f28ff683 runtime-rs: add binary to exercise shim proper w/o containerd dependencies
eb8c9d38f runtime-rs: add launch of a simple qemu process to start_vm()
2f6d0d408 runtime-rs: support qemu in VirtContainer
1413dfe91 runtime-rs: add basic empty boilerplate for qemu driver
a81ced0e3 upcall: add upcall into kernel build script
f5c34ed08 Dragonball: introduce upcall
8dbfc3dc8 kata-ctl: Fixed format for check release options
f3091a9da kata-ctl: Add kata-ctl check release options
a577df8b7 tools: Fix indentation on build kernel script
b087667ac kata-deploy: Fix the pod of kata deploy starts to occur an error
79cf38e6e runtime-rs: clear OCI spec namespace path
62f4603e8 runtime-rs: reset rdma cgroup
5b6596f54 runtime-rs: CreateContainerRequest has Default
e9e82ce28 runtime-rs: fix is_pid_namespace_enabled check
8079a9732 kata-sys-util: fix issues where umount2 couldn't get the correct path
4661ea8d3 runtime-rs: fix standalone share fs
c5abc5ed4 config: speed up rng init when kernel boot for arm64
3e6114b2e tools: Fix indentation for ovmf script
7fdbbcda8 agent: Drop the Option for LinuxContainer.cgroup_manager
d04d45ea0 runtime: use pidfd to wait for processes on Linux
e9ba0c11d runtime: use exponential backoff for process wait
748f22e7d agent: remove sysinfo dependency
0019d653d runtime-rs: fix high cpu
46b38458a docs: Update the rust version in the installation documentation
71491a69c runtime: move process wait logic to another function
92ebe61fe runtime: reap force killed processes
fdf0a7bb1 runtime-rs: fix the issues mentioned in the code review
1d823c4f6 runtime-rs: umount and permission controls in sandbox level
527b87141 runtime-rs: bind mount volumes in sandbox level
9ccf2ebe8 agent: add signal value to log
fb2c142f1 runtime-rs: fix some variable names and typos
737420469 kata-ctl: fix dependency version conflict
89574f03f workflow: call cargo in user's $PATH
d4321ab48 runtime: Add identification in version for runtime-rs
f7fc436be workflow: fix cargo-deny-runner.yaml syntax error
78532154d docs: Add description for guest SELinux support
c617bbe70 runtime: Pass SELinux policy for containers to the agent
935476928 agent: Add SELinux support for containers
a75f99d20 osbuilder: Create guest image for SELinux
a9c746f28 kernel: Add kernel configs for SELinux
86cb05883 snap: Fix snapcraft setup (unbreak snap releases)
f443b7853 build: update golang version to 1.19.3
e12db92e4 runk: Re-implement start operation using the agent codes
e723bad0a ci: let static checks don't depend on build
69aae0227 actions: use matrix to refactor static checks
a5e4cad4b kata-ctl: add host check for aarch64
2edbe389d runtime-rs: moving only vCPU threads into sandbox controller
340e24f17 actions: skip some job using "paths-ignore" filter
2426ea9bd doc: update runtime-rs "Build and Install"
67fe703ff runtime-rs: remove the version number from the commit display message
1d93a9346 fix(agent): fix iptables binary path in guest
1dfd845f5 runtime: go fix code for 1.19
cd85a44a0 tools: Remove extra tab spaces from kata deploy binaries script
cb199e0ec kernel: add CONFIG_X86_SGX into whitelist
4b45e1386 runtime: don't fail mkdir if the folder is already created
b987bbc57 runtime-rs: block on the current thread when setup the network
abb9ebeec package: add nydus to release artifacts
30a7ebf43 runtime: Log invalid devices in QEMU config
2539f3186 runtime: Use containerd v1.6.8
993d05a42 docs: change mount-info.json to mountInfo.json
d808adef9 runtime-rs: support vhost-vsock
6b2ef66f0 runtime-rs: add conditional compile for virt-sandbox persist
6c1e153a6 docs: update doc "NVIDIA GPU passthrough"
b53171b60 agent: check command before do test_ip_tables
a636d426d versions: update nydusd version
3bb145c63 runtime: Support virtiofs queue size for qemu and make it configurable
e80a9f09f utils: Add utility function to fetch the kernel version.
36545aa81 runtime: clh: Re-generate the client code
f4b02c224 versions: Upgrade to Cloud Hypervisor v28.0
e4a6fbadf docs: update doc "Setup swap device in guest kernel"
2f5f575a4 log-parser: Simplify check
d94718fb3 runtime: Fix gofmt issues
16b837509 golang: Stop using io/ioutils
66aa330d0 versions: Update golangci-lint
b3a4a1629 versions: bump containerd version
eab8d6be1 build: update golang version to 1.19.2
e80dbc15d runtime-rs: workaround Dragonball compilation problem
c3f1922df fix(fmt): fix cargo fmt to pass static check
a4099dab8 tools: Fix indentation of build static firecracker script
c46814b26 runtime-rs:support nydus v5 and v6
a04afab74 qemu: early exit from Check if the process was stopped
7e481f217 qemu: set stopped only if StopVM is successful
0e3ac66e7 clh: return faster with dead clh process from isClhRunning
9ef68e0c7 clh: fast exit from isClhRunning if the process was stopped
2631b08ff clh: don't try to stop clh multiple times
f45fe4f90 versions: update vmm-sys-util and related crates to v0.11.0
8be081730 tools: Fix indentation of build static virtiofsd script
f8f97c1e2 feat(shim-mgmt): iptables handler
29c75cf12 runtime-rs: delete all cargo patches
9f70a6949 tools: Remove empty spaces from build kernel script
57336835d dragonball: add more unit test for device manager
233370023 dragonball: add test utils.
3e9c3f12c docs: Fix configuration path
2adb1c182 Dragonball: enable mem_file_path config into hugetlbfs process
daeee26a1 cloud-hypervisor: Fix GetThreadIDs function
40d514aa2 github: Parallelise static checks
2508d39b7 runtime: added vcpus pinning logics Core VCPU threads pinning logics for issue 4476. Also provided docs.
fef8e92af runtime-rs:add hypervisor interface capabilities
27b191358 runtime-rs: blanks filled & fixes made to virtiofsd launch
990e6359b snap: Unbreak docker install
ca69a9ad6 snap: Use metadata for dependencies
df092185e runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
16dca4ecd runk: Ignore an error when calling kill cmd with --all option
b74c18024 runtime-rs: fix shared volume permission issue
936fe35ac runtime-rs : fix shim source is ambiguous
0ed7da30d tools: Fix indentation of build static clh script
43fcb8fd0 virtiofsd: Not use "link-self-contained=yes" on s390x The compile option link-self-contained=yes asks rustc to use C library startup object files that come with the compiler, which are not available on the target s390x-unknown-linux-gnu. A build does not contain any startup files leading to a broken executable entry point (causing segmentation fault).
219919e9f docs: Fix volumeMounts in SGX usage example
c0f5bc81b cargo: Add Cargo.lock to version control
474927ec9 gitignore: Add gitignore file
699f821e1 utils: Add function to drop priveleges
a6fb4e2a6 versions: bump golangci-lint version
b015f34af runtime-rs: generate config files with the default target
d7bb4b551 agent: support systemd cgroup for kata agent
144efd1a7 docs: update rust runtime installation guide
abf4f9b29 docs: kata 3.0 Architecture fix readme content error
44d8de892 agent: remove redundant checks
9d286af7b versions: Update Cloud Hypervisor to b4e39427080
081ee4871 agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
e95089b71 kata-ctl: add basic cpu check for s390x
871d2cf2c kata-ctl: Limit running tests to x86 and use native-tls on s390x
cbd84c3f5 rustjail: Upgrade libseccomp crate to v0.3.0
748be0fe3 makefile: remove sudo when create symbolic link
227e717d2 qemu: Re-work static-build Dockerfile
72738dc11 agent: validate hugepage size is supported
f74e328ff Makefile: fix an typo in runtime-rs makefile
f205472b0 Makefile: regulate the comment style for the runtime-rs comments
9f2c7e47c Revert "kata-ctl: Disable network check on s390x"
ac403cfa5 doc: Update how-to-run-kata-containers-with-SNP-VMs.md
00981b3c0 kata-ctl: Disable network check on s390x
39363ffbf runtime: remove same function
c322d1d12 kata-ctl: arch: Improve check call
0bc5baafb snap: Build virtiofsd using the kata-deploy scripts
cb4ef4734 snap: Create a task for installing docker
7e5941c57 virtiofsd: Build inside a container
35d52d30f versions: Update TDX QEMU
4d9dd8790 runtime-rs: fix typo get_contaier_type to get_container_type
70676d4a9 kata-ctl: improve command descriptions for consistency
9eb73d543 versions: Update TDX kernel
00a42f69c kata-ctl: cargo: 2021 -> 2018
fb6327474 kata-ctl: rustfmt + clippy fixes
1f1901e05 dragonball: fix clippy warning for aarch64
a343c570e dragonball: enhance dragonball ci
6a64fb0eb ci: skip s390x for dragonball.
a743e37da Dragonball: delete redundant comments in blk_dev_mgr
2b345ba29 build: Add kata-ctl to tools list
f7010b806 kata-ctl: docs: Write basic documentation
862eaef86 docs: fix a typo in rust-runtime-installation-guide
26c043dee ci: Add dragonball test
781e604c3 docs: Reference kata-ctl README
15c343cbf kata-ctl: Don't rely on system ssl libs
c23584994 kata-ctl: clippy: Resolve warnings and reformat
133690434 kata-ctl: implement CLI argument --check-version-only
eb5423cb7 kata-ctl: switch to use clap derive for CLI handling
018aa899c kata-ctl: Add cpu check
7c9f9a5a1 kata-ctl: Make arch test run at compile time
b63ba66dc kata-ctl: Formatting tweaks
cca7e32b5 kata-ctl: Lint fixes to allow the branch to be built
8e7bb8521 kata-ctl: add code for framework for arch
303fc8b11 kata-ctl: Add unit tests cases
d0b33e9a3 versions: Add kata-ctl version entry
002b18054 kata-ctl: Add initial rust code for kata-ctl
b62b18bf1 dragonball: fix clippy warning
2ddc948d3 Makefile: add dragonball components.
3fe81fe4a dragonball-ut: use skip_if_not_root to skip root case
72259f101 dragonball: add more unit test for vmm actions
9717dc3f7 Dragonball: remove redundant comments in event manager
9c1ac3d45 runtime-rs: return port on agent-url req
89e62d4ed shim: Ensure pagesize is set when reporting hugetbl stats
8d4ced3c8 runtime-rs: support ephemeral storage for emptydir
046ddc646 readme: remove libraries mentioning
86ad832e3 runtime-rs: force shutdown shim process in it can't exit

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-12-16 09:12:07 +01:00
Zhongtao Hu
21ec766d29 docs: add documents for using bundle to start container
Add a document on using a bundle to start a container.

Fixes: #5872
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-16 11:13:25 +08:00
Yushuo
d14c3af35c dragonball: refactor legacy device initialization
If the serial path is given, legacy_manager should create a socket console
based on that path. Otherwise, the console should be created based on stdio.

Fixes: #5914

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2022-12-15 20:55:01 +08:00
Fabiano Fidêncio
1d266352ea Merge pull request #5902 from Bevisy/fix-too-many-git-file
tools: Add some new gitignore items
2022-12-15 11:29:32 +01:00
Zhongtao Hu
ca39a07a14 runtime-rs: enable start container from bundle
Enable starting a container from a bundle, as follows:

$ ls ./bundle
config.json  rootfs
$ sudo ctr run -d --runtime io.containerd.kata.v2 --config bundle/config.json test_kata

Fixes: #5872
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-15 17:28:13 +08:00
Peng Tao
ebb73df6bc Merge pull request #5899 from Bevisy/fix-outdated-comments
shim: return hypervisor's pid not shim's pid
2022-12-15 14:55:54 +08:00
Peng Tao
7210905deb Merge pull request #5712 from openanolis/chao/upcall
Dragonball: introduce upcall
2022-12-15 14:44:56 +08:00
Chao Wu
fad229b853 Merge pull request #5875 from Ji-Xinyou/xyji/refactor-shim-mgmt
refactor(shim-mgmt): move client side to libs
2022-12-15 10:59:45 +08:00
David Esparza
1dbd6c8057 Merge pull request #5735 from dborquez/kata-ctl-cli-list
kata-ctl: Add --list option
2022-12-14 15:03:21 -06:00
Alex
b5cfd09583 kata-ctl: Fixed format for check release options
Fixed formatting for check release options

Fixes: #5345

Signed-off-by: Alex <alee23@bu.edu>
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2022-12-14 09:42:57 -06:00
James O. D. Hunt
2e15af777c Merge pull request #5786 from alexlee-23/main
kata-ctl: check: only-list-releases and include-all-releases options
2022-12-14 11:25:36 +00:00
Ji-Xinyou
fbf294da3f refactor(shim-mgmt): move client side to libs
The client side is moved to libs. This is to solve the problem
that including clients will bring about messy dependencies.

Fixes: #5874
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-12-14 17:42:25 +08:00
Peng Tao
856d4b7361 Merge pull request #5798 from pmores/qemu-support
basic framework for QEMU support in runtime-rs
2022-12-14 15:05:33 +08:00
Binbin Zhang
ae0dcacd4a tools: Add some new gitignore items
Add some new ignore items so that local builds do not cause git to track a lot of files.

Fixes: #5900

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2022-12-14 11:38:23 +08:00
Binbin Zhang
99485d871c shim: return hypervisor's pid not shim's pid
update outdated code comments

Fixes: #3234

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2022-12-14 11:16:11 +08:00
GabyCT
b637d12d19 Merge pull request #5884 from GabyCT/topic/fixbuildscript
tools: Fix indentation on build kernel script
2022-12-13 15:28:24 -06:00
Chao Wu
bb4be2a666 Merge pull request #5690 from yipengyin/fix-virtiofsd
runtime-rs: fix standalone share fs
2022-12-14 00:16:10 +08:00
James Tumber
087515a46e agent: unset CC for cross-build
When `HOST_ARCH` != `ARCH` unset `CC`

Specifying a foreign CC is incompatible with building libgit2. Thus after the RUSTFLAGS linker
has been set we can safely unset CC to avoid passing this value through the build.

Fixes: #5890

Signed-off-by: James Tumber <james.tumber@ibm.com>
2022-12-13 15:30:06 +00:00
Pavel Mores
1f28ff6838 runtime-rs: add binary to exercise shim proper w/o containerd dependencies
After building the binary as usual with `cargo build` run it as follows.

It needs a configuration.toml in which only qemu keys `path`, `kernel`
and `initrd` will initially need to be set.  Point them to respective
files e.g. from a kata distribution tarball.

It also needs to be launched from an exported container bundle
directory.  One can be created by running

mkdir rootfs
podman export $(podman create busybox) | tar -C ./rootfs -xvf -
runc spec -b .

in a suitable directory.

Then launch the program like this:

KATA_CONF_FILE=/path/to/configuration-qemu.toml /path/to/shim-ctl

Fixes: #5817

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:55:21 +01:00
Pavel Mores
eb8c9d38ff runtime-rs: add launch of a simple qemu process to start_vm()
The point here is just to get the simplest Kata VM running.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:54:26 +01:00
Pavel Mores
2f6d0d408b runtime-rs: support qemu in VirtContainer
Added registration of qemu config plugin and support for creating Qemu
Hypervisor instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:54:26 +01:00
Pavel Mores
1413dfe91c runtime-rs: add basic empty boilerplate for qemu driver
This does almost literally nothing so far apart from getting and setting
HypervisorConfig.  It's mostly copied from/inspired by dragonball.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:53:45 +01:00
Bin Liu
3952fedcd0 Merge pull request #5882 from bergwolf/github/oci-namespaces
runtime-rs: fix sandbox_pidns calculation and oci spec amending
2022-12-13 18:32:02 +08:00
Fabiano Fidêncio
f1381eb361 Merge pull request #4813 from ManaSugi/fix/add-selinux-agent
runtime,agent: Add SELinux support for containers inside the guest
2022-12-13 11:24:53 +01:00
Yuan-Zhuo
bf8848f926 agent: Eliminate unnecessary metrics
DEFAULT_REGISTRY pre-registers many metrics that we don't need or have duplicated.
This PR uses a custom register for metrics without interference and ensures that
the registration process is executed only once when the program is running.

Fixes: #5255

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-12-13 16:18:33 +08:00
Fupan Li
015674df16 Merge pull request #5873 from justxuewei/fix/umount2
kata-sys-util: fix issues where umount2 couldn't get the correct path
2022-12-13 15:52:32 +08:00
Chao Wu
a81ced0e3f upcall: add upcall into kernel build script
In order to let upcall be used by Kata Containers, we need to add
those patches to the kernel build script.

Currently, the upcall patches are applied to build a 5.10 guest kernel
only when both experimental (-e) and hypervisor type dragonball
(-t dragonball) are enabled.

example commands: sh ./build-kernel.sh -e -t dragonball -d setup

fixes: #5642

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-12-13 15:44:55 +08:00
Chao Wu
f5c34ed088 Dragonball: introduce upcall
Upcall is a direct communication tool between the VMM and the guest, built
on vsock. The server side of the upcall is a driver in the guest kernel
(kernel patches are needed for this feature), and it starts serving
requests after the kernel starts. The client side is in the
Dragonball VMM; it is a thread that communicates with vsock through
uds.

We want to keep the VM lightweight by implementing upcall, through
which we can achieve vCPU hotplug and virtio-mmio hotplug without
implementing complex and heavy virtualization features
such as ACPI virtualization.

fixes: #5642

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-12-13 15:44:47 +08:00
Bin Liu
03b6124fc6 Merge pull request #5848 from Yuan-Zhuo/drop-cgmr-option
agent: Drop the Option for LinuxContainer.cgroup_manager
2022-12-13 12:09:39 +08:00
Guoqiang Ding
f8a48ab41d docs: add hint of probing loop module
If the `loop` module is not probed, it causes an error like "losetup: cannot find an unused loop device".

Fixes: #5887
Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2022-12-13 11:33:42 +08:00
Alex
8dbfc3dc82 kata-ctl: Fixed format for check release options
Fixed formatting for check release options

Fixes: #5345

Signed-off-by: Alex <alee23@bu.edu>
2022-12-13 03:10:19 +00:00
Bin Liu
add2486259 Merge pull request #5853 from jongwu/test_kata3.0_arm
dragonball: enable kata3.0/dragonball CI on Arm
2022-12-13 11:05:17 +08:00
Alex
f3091a9da4 kata-ctl: Add kata-ctl check release options
This pull request adds kata-ctl check only-list-releases and include-all-releases

Fixes: #5345

Signed-off-by: Alex <alee23@bu.edu>
2022-12-13 03:04:30 +00:00
Gabriela Cervantes
a577df8b71 tools: Fix indentation on build kernel script
This PR fixes the indentation on the build kernel script.

Fixes #5883

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-12-12 16:37:47 +00:00
Fabiano Fidêncio
740387b569 Merge pull request #5829 from singhwang/main
fix kata deploy error after node reboot.
2022-12-12 14:20:14 +01:00
singhwang
b087667ac5 kata-deploy: Fix the pod of kata deploy starts to occur an error
If a pod of kata is deployed on a machine, after the machine restarts, the pod status of kata-deploy will be CrashLoopBackOff.

Fixes: #5868
Signed-off-by: SinghWang <wangxin_0611@126.com>
2022-12-12 19:11:38 +08:00
Peng Tao
79cf38e6ea runtime-rs: clear OCI spec namespace path
None of the host namespace paths make sense in the guest. Let's clear
them all before sending the spec to the agent.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 11:07:14 +00:00
Peng Tao
62f4603e81 runtime-rs: reset rdma cgroup
We don't support rdma cgroups yet. Let's make sure it is reset to empty.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:57:24 +00:00
Peng Tao
5b6596f54e runtime-rs: CreateContainerRequest has Default
We can just use it to initialize the default fields.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:57:24 +00:00
Peng Tao
e9e82ce28b runtime-rs: fix is_pid_namespace_enabled check
We should test is_pid_namespace_enabled before amending the container
spec, because amending clears the pid namespace path, which results in
sandbox_pidns always being false.

Fixes: #5881
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:54:48 +00:00
Zhongtao Hu
afaf17f423 runtime-rs: enable container hugepage
Enable the use of hugepages in containers.

Fixes: #5560
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-12 17:49:31 +08:00
Xuewei Niu
8079a9732d kata-sys-util: fix issues where umount2 couldn't get the correct path
Strings in Rust don't have a \0 at the end, but C strings do, so `umount2`
in libc can't get the correct path. Besides, calling `nix::mount::umount2`
avoids an unsafe block and is a more robust solution.

Fixes: #5871

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2022-12-12 11:50:32 +08:00
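The NUL-terminator mismatch described in commit 8079a9732d above can be illustrated with a minimal sketch that uses only the Rust standard library. The helper name `to_c_path` and the example path are hypothetical and are not taken from kata-sys-util; the sketch only shows the string conversion that a raw C call would need, and it does not call `umount2` itself.

```rust
use std::ffi::{CString, NulError};

/// Convert a Rust path string into a NUL-terminated C string.
/// Hypothetical helper for illustration; not the kata-sys-util code.
fn to_c_path(path: &str) -> Result<CString, NulError> {
    // CString::new copies the bytes and appends a single trailing NUL.
    // It fails if the input already contains an interior NUL byte.
    CString::new(path)
}

fn main() {
    // Hypothetical mount point, used only as sample data.
    let path = "/run/kata/shared";

    // A Rust &str is a pointer plus a length: there is no trailing NUL.
    assert_ne!(path.as_bytes().last(), Some(&0u8));

    let c_path = to_c_path(path).expect("path contains an interior NUL byte");

    // The CString is exactly one byte longer and ends with NUL.
    assert_eq!(c_path.as_bytes_with_nul().last(), Some(&0u8));
    assert_eq!(c_path.as_bytes_with_nul().len(), path.len() + 1);

    // c_path.as_ptr() is what a C API taking `const char *` expects.
    println!("{:?}", c_path);
}
```

A safe wrapper such as the `nix::mount::umount2` call named in the commit message takes a path argument and performs this kind of conversion internally, which is why it removes the need for a hand-written unsafe block.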
Yipeng Yin
4661ea8d3b runtime-rs: fix standalone share fs
Standalone share fs should add the virtiofs device in setup_device_before_start_vm
and return the storages to mount the directory in the guest. It also uses the
hypervisor's jailer root directly instead of the jail config.

Besides, we tweaked the parameter so that it works with the Rust version of
virtiofsd. Its cache policy that forbids caching is now "never" instead of
"none", hence we change the default cache mode.

Fixes: #5655

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2022-12-12 10:58:09 +08:00
GabyCT
67e82804c5 Merge pull request #5865 from GabyCT/topic/fixspacesovmfscript
tools: Fix indentation for ovmf script
2022-12-09 15:33:49 -06:00
Jianyong Wu
c5abc5ed4d config: speed up rng init when kernel boot for arm64
For now, rng init is too slow for kata3.0/dragonball. Enabling
random_trust_cpu can speed up rng init when the kernel boots.

Fixes: #5870
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-12-09 14:20:18 +08:00
Gabriela Cervantes
3e6114b2ef tools: Fix indentation for ovmf script
This PR fixes the indentation for the ovmf script for packaging.

Fixes #5864

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-12-08 16:12:20 +00:00
Zhongtao Hu
fc4a67eec3 runtime-rs: enable vm hugepage
Support vm hugepage: set the hugetlbfs mount point as the vm memory path.

Fixes: #5560
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-09 00:01:16 +08:00
Greg Kurz
5ef7ed72ae Merge pull request #5610 from UiPath/fix-process-wait
runtime: prevent waiting 50 ms minimum for a process exit
2022-12-08 11:02:39 +01:00
Mathias Flagey
ebe5c5adf9 docs: Update virtiofsd build script in the developer guide
The script used to build virtiofsd was changed in #5426, but the doc was not updated. This commit updates the developer guide.

Fixes: #5860

Signed-off-by: Mathias Flagey <mathiasflagey1201@gmail.com>
2022-12-08 09:29:10 +01:00
Peng Tao
0a1d1ec2fa Merge pull request #5830 from openanolis/fix-high-cpu
runtime-rs: fix high cpu
2022-12-08 12:16:06 +08:00
Steve Horsman
39394fa2a8 Merge pull request #5844 from jtumber-ibm/patch-1
agent: remove `sysinfo` dependency
2022-12-07 16:35:05 +00:00
Fupan Li
cce316b5e9 Merge pull request #5607 from justxuewei/feat/sandbox-level-volume
runtime-rs: bind mount volumes in sandbox level
2022-12-07 19:23:38 +08:00
Chelsea Mafrica
1ff4185111 Merge pull request #5842 from cyyzero/update_install_guide
docs: Update the rust version in the installation documentation
2022-12-06 23:40:35 -08:00
Yuan-Zhuo
7fdbbcda82 agent: Drop the Option for LinuxContainer.cgroup_manager
Cgroup manager for a container will always be created.
Thus, dropping the option for LinuxContainer.cgroup_manager
is feasible and could simplify the code.

Fixes: #5778

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-12-07 13:40:38 +08:00
Alexandru Matei
d04d45ea05 runtime: use pidfd to wait for processes on Linux
Use pidfd_open and poll on newer versions of Linux to wait
for the process to exit. For older versions use existing wait logic

Fixes: #5617

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-06 16:31:05 +02:00
Alexandru Matei
e9ba0c11d0 runtime: use exponential backoff for process wait
Initial wait period between checks is 1ms, and the
next ones are min(wait_period*5, 50ms)

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-06 16:30:58 +02:00
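The backoff rule stated in commit e9ba0c11d0 above (start at 1ms, then min(wait_period*5, 50ms)) can be sketched as follows. This is an illustration in Rust and not the Go runtime's actual code; the names `next_wait` and `wait_schedule` are hypothetical.

```rust
use std::cmp::min;
use std::time::Duration;

// Values taken from the commit message above.
const INITIAL_WAIT: Duration = Duration::from_millis(1);
const MAX_WAIT: Duration = Duration::from_millis(50);

/// Next wait period: five times the current one, capped at MAX_WAIT.
fn next_wait(current: Duration) -> Duration {
    min(current * 5, MAX_WAIT)
}

/// Wait periods used before each of the first `checks` checks.
/// Hypothetical helper that only makes the schedule visible.
fn wait_schedule(checks: usize) -> Vec<Duration> {
    let mut wait = INITIAL_WAIT;
    let mut schedule = Vec::with_capacity(checks);
    for _ in 0..checks {
        schedule.push(wait);
        wait = next_wait(wait);
    }
    schedule
}

fn main() {
    for (i, wait) in wait_schedule(5).iter().enumerate() {
        println!("check {}: wait {:?}", i + 1, wait);
    }
}
```

Under this rule the first five waits are 1 ms, 5 ms, 25 ms, 50 ms and 50 ms, so a process that exits quickly is noticed within a few milliseconds instead of after a full 50 ms interval.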
James Tumber
748f22e7d0 agent: remove sysinfo dependency
Removes the redundant dependency `sysinfo`.

Fixes: #5843

Signed-off-by: James Tumber <james.tumber@ibm.com>
2022-12-06 10:18:53 +00:00
Quanwei Zhou
0019d653d6 runtime-rs: fix high cpu
Fixed an issue where, with nonblocking I/O, `tokio::io::copy()` needed
to handle EAGAIN, resulting in high CPU usage.

Fixes: #5740
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-12-06 14:25:33 +08:00
Chao Wu
326d589ff5 Merge pull request #5822 from liubin/fix/5820-var-name-and-typo
runtime-rs: fix some variable names and typos
2022-12-06 14:24:11 +08:00
Zhongtao Hu
c12bb5008d Merge pull request #5769 from jongwu/check_host_arm
kata-ctl: add host check for aarch64
2022-12-06 14:05:52 +08:00
Chen Yiyang
46b38458af docs: Update the rust version in the installation documentation
Rust version in the installation documentation does not match the
requirements. Just fix it.

Fixes: #5841

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-12-06 12:50:32 +08:00
Chao Wu
538bddf4ee Merge pull request #5811 from tzY15368/fix-katactl-conflict-dependency
kata-ctl: fix dependency version conflict
2022-12-06 10:44:48 +08:00
Alexandru Matei
71491a69c3 runtime: move process wait logic to another function
extract process wait logic to another function

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-05 13:32:04 +02:00
Alexandru Matei
92ebe61fea runtime: reap force killed processes
reap child processes after sending SIGKILL

Fixes #5739

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-05 13:31:58 +02:00
Xuewei Niu
fdf0a7bb14 runtime-rs: fix the issues mentioned in the code review
Removed the `Debug` trait for `ShareFs` etc. Renamed
`ShareFsMount::upgrade()` and `ShareFsMount::downgrade()` to
`upgrade_to_rw()` and `downgrade_to_ro()`. Protected `mounted_info_set`
with a mutex to avoid race conditions.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 11:18:26 +08:00
Xuewei Niu
1d823c4f65 runtime-rs: umount and permission controls in sandbox level
This commit implements umount controls and permission controls. When a volume
is no longer referenced, it is umounted immediately. When a volume is mounted
with readonly permission and a newly arriving container needs readwrite permission,
the volume is upgraded to readwrite permission. Conversely, if a volume is
mounted with readwrite permission and no container needs readwrite, the
volume is downgraded.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 10:58:13 +08:00
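The umount and permission rules described in commit 1d823c4f65 above amount to reference counting per volume. The sketch below shows one way to model them; it is an assumption-laden illustration, not the runtime-rs implementation, and the names `VolumeRefs`, `MountState`, `attach` and `detach` are hypothetical.

```rust
/// Mount state implied by the current set of users (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum MountState {
    Unmounted,
    ReadOnly,
    ReadWrite,
}

/// Per-volume reference counts, split by requested permission.
#[derive(Default)]
struct VolumeRefs {
    ro_users: usize,
    rw_users: usize,
}

impl VolumeRefs {
    /// Readwrite if any user needs it, readonly if only readonly users
    /// remain, unmounted once nobody references the volume.
    fn state(&self) -> MountState {
        if self.rw_users > 0 {
            MountState::ReadWrite
        } else if self.ro_users > 0 {
            MountState::ReadOnly
        } else {
            MountState::Unmounted
        }
    }

    /// A container starts using the volume; may trigger an upgrade.
    fn attach(&mut self, readwrite: bool) -> MountState {
        if readwrite {
            self.rw_users += 1;
        } else {
            self.ro_users += 1;
        }
        self.state()
    }

    /// A container stops using the volume; may trigger a downgrade
    /// or an umount.
    fn detach(&mut self, readwrite: bool) -> MountState {
        if readwrite {
            self.rw_users = self.rw_users.saturating_sub(1);
        } else {
            self.ro_users = self.ro_users.saturating_sub(1);
        }
        self.state()
    }
}

fn main() {
    let mut vol = VolumeRefs::default();
    assert_eq!(vol.attach(false), MountState::ReadOnly); // first user, readonly
    assert_eq!(vol.attach(true), MountState::ReadWrite); // upgrade
    assert_eq!(vol.detach(true), MountState::ReadOnly); // downgrade
    assert_eq!(vol.detach(false), MountState::Unmounted); // umount
    println!("final state: {:?}", vol.state());
}
```

In a real sandbox these counters would be shared between containers, so they would need a lock around them, in line with the mutex mentioned for `mounted_info_set` in the follow-up commit.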
Xuewei Niu
527b871414 runtime-rs: bind mount volumes in sandbox level
Implemented bind mount related management on the sandbox side: bind
mount a volume if it has not been mounted before, and upgrade its permission
to readwrite if a new container needs it.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 10:58:13 +08:00
Bin Liu
9ccf2ebe8a agent: add signal value to log
For signal_process call, log the signal value in logs.

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-02 14:53:58 +08:00
Bin Liu
fb2c142f18 runtime-rs: fix some variable names and typos
Fix some imperfect variable names and some typos in logs.

Fixes: #5820

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-02 14:52:34 +08:00
Bin Liu
8246de821f Merge pull request #5809 from liubin/fix/cargo-deny-workflow-error
workflow: fix cargo-deny-runner.yaml syntax error
2022-12-02 12:19:44 +08:00
Bin Liu
514b7778a2 Merge pull request #5807 from liubin/fix/5806-add-shim-lanuage
runtime: Add identification in version for runtime-rs
2022-12-02 11:36:55 +08:00
Bin Liu
c1f5a93b66 Merge pull request #5814 from liubin/fix/5813-test-dragonball-error
workflow: call cargo in user's $PATH
2022-12-02 11:36:19 +08:00
Tingzhou Yuan
737420469a kata-ctl: fix dependency version conflict
Also added crate `runtime-rs/crates/runtimes` as dependency as it's
immediately depended upon by the `direct-volume` feature, see issue
5341 and PR 5467.

Fixes #5810

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2022-12-01 17:53:21 +00:00
Bin Liu
89574f03f8 workflow: call cargo in user's $PATH
Calling cargo in root's HOME may lead to a permission error; we should
call the cargo installed in the user's HOME/PATH.

Fixes: #5813

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-01 15:37:16 +08:00
Bin Liu
d4321ab489 runtime: Add identification in version for runtime-rs
We now support two runtimes/shims: the Go version
and the Rust version. For debugging purposes, we can
add an identification in the version info
to tell us which runtime/shim is used.

Fixes: #5806

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-01 15:14:08 +08:00
Bin Liu
7fabfb2cf0 Merge pull request #5756 from chentt10/remove-version-number-from-commit-message
runtime-rs: remove the version number from the commit display message
2022-12-01 13:11:47 +08:00
Bin Liu
f7fc436bed workflow: fix cargo-deny-runner.yaml syntax error
There is a syntax error in .github/workflows/cargo-deny-runner.yaml

Fixes: #5808

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-01 12:32:00 +08:00
Fabiano Fidêncio
212325a9db Merge pull request #5649 from ManaSugi/runk/refactor-start-using-agent-code
runk: Re-implement start operation using the agent codes
2022-11-29 20:45:16 +01:00
Fabiano Fidêncio
ac1b2d2a18 Merge pull request #5774 from UiPath/fix-go-panic
build: update golang version to 1.19.3
2022-11-29 13:17:53 +01:00
Fabiano Fidêncio
d8d9aae123 Merge pull request #5781 from jodh-intel/snap-fix-release
snap: Fix snapcraft setup (unbreak snap releases)
2022-11-29 13:11:34 +01:00
Manabu Sugimoto
78532154d9 docs: Add description for guest SELinux support
Add the description about how to enable SELinux for containers
running inside the guest.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Manabu Sugimoto
c617bbe70d runtime: Pass SELinux policy for containers to the agent
Pass SELinux policy for containers to the agent if `disable_guest_selinux`
is set to `false` in the runtime configuration. The `container_t` type
is applied to the container process inside the guest by default.
Users can also set a custom SELinux policy to the container process using
`guest_selinux_label` in the runtime configuration. This will be an
alternative configuration of Kubernetes' security context for SELinux
because users cannot specify the policy in Kata through Kubernetes's security
context. To apply SELinux policy to the container, the guest rootfs must
be CentOS that is created and built with `SELINUX=yes`.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Manabu Sugimoto
9354769286 agent: Add SELinux support for containers
The kata-agent supports SELinux for containers inside the guest
to comply with the OCI runtime specification.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Bin Liu
588f81a23c Merge pull request #5612 from openanolis/fix-iptables
fix(agent): fix iptables binary path in guest
2022-11-29 16:57:06 +08:00
Bin Liu
1da2d0603c Merge pull request #5761 from gaohuatao-1/ght_overhead
runtime-rs: moving only vCPU threads into sandbox controller
2022-11-29 13:53:01 +08:00
Manabu Sugimoto
a75f99d20d osbuilder: Create guest image for SELinux
Create a guest image to support SELinux for containers inside the guest
if `SELINUX=yes` is specified. This works only if the guest rootfs is
CentOS and the init service is systemd, not the agent init. To enable
labeling the guest image on the host, selinuxfs must be mounted on the
host. The kata-agent will be labeled as `container_runtime_exec_t` type.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 13:32:26 +09:00
Manabu Sugimoto
a9c746f284 kernel: Add kernel configs for SELinux
Add kernel configs related to SELinux in order to add the
support for containers running inside the guest.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 13:32:26 +09:00
GabyCT
681d946644 Merge pull request #5748 from GabyCT/topic/removeextratabspacesdocker
tools: Remove extra tab spaces from kata deploy binaries script
2022-11-28 15:34:12 -06:00
James O. D. Hunt
86cb058833 snap: Fix snapcraft setup (unbreak snap releases)
Setup the snapcraft environment manually as the action we had been using
for this does not appear to be actively maintained currently.

Related to this, switch to specifying the snapcraft store credentials
using the `SNAPCRAFT_STORE_CREDENTIALS` secret. This unbreaks
`snapcraft upload`, which Canonical appear to have broken by removing
the previous facility.

Fixes: #5772.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-11-28 15:51:47 +00:00
Alexandru Matei
f443b78537 build: update golang version to 1.19.3
This Go release fixes golang/go#56309

Fixes #5773
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-28 17:03:29 +02:00
GabyCT
013752667b Merge pull request #5776 from liubin/tmp/debug-static-check
ci: let static checks don't depend on build
2022-11-28 07:51:42 -06:00
Fabiano Fidêncio
527e6c99e9 Merge pull request #5766 from liubin/fix/5763-use-composite-action-refactor-static-checks
actions: use matrix to refactor static checks
2022-11-28 14:12:27 +01:00
Bin Liu
6af037d379 Merge pull request #5154 from Yuan-Zhuo/main
agent: support systemd cgroup for kata agent.
2022-11-28 18:40:10 +08:00
Manabu Sugimoto
e12db92e4d runk: Re-implement start operation using the agent codes
This commit re-implements the `start` operation by leveraging the agent code.
Currently, `runk` has its own `start` mechanism even though the agent already
has a feature for starting a container. This hurts maintainability,
and `runk` cannot easily keep up with changes on the agent side.
Hence, `runk` replaces its own implementation with the agent's.

Fixes: #5648

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-28 19:11:21 +09:00
Fabiano Fidêncio
74531114c3 Merge pull request #5762 from liubin/fix/5759-skip-action-by-path
actions: skip some jobs using "paths-ignore" filter
2022-11-28 11:04:34 +01:00
Bin Liu
e723bad0af ci: let static checks don't depend on build
The build is a time-consuming operation, so skip it to let
CI run faster.

Fixes: #5777

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-28 15:26:04 +08:00
Bin Liu
a55eb78c32 Merge pull request #5752 from liubin/fix/5750-go-fix-1.19
runtime: go fix code for 1.19
2022-11-26 02:09:02 +08:00
Bin Liu
57c80ad65c Merge pull request #5758 from chentt10/update-runtime-rs-build-and-install
doc: update runtime-rs "Build and Install"
2022-11-26 02:08:48 +08:00
Bin Liu
69aae02276 actions: use matrix to refactor static checks
Use a matrix to reduce the duplication of similar code.

Fixes: #5763

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-26 00:32:15 +08:00
Jianyong Wu
a5e4cad4b6 kata-ctl: add host check for aarch64
For now, we can check whether the host supports running Kata by checking
whether "/dev/kvm" exists on aarch64.

Fixes: #5768
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-11-25 18:55:32 +08:00
gaohuatao
2edbe389d8 runtime-rs: moving only vCPU threads into sandbox controller
When the overhead controller exists, constrain only the vCPU threads
in the sandbox controller.

Fixes:#5760

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2022-11-25 17:53:21 +08:00
Peng Tao
e32c023d96 Merge pull request #5714 from UiPath/fix-mkdir
runtime: don't fail mkdir if the folder is already created by another process
2022-11-25 17:52:56 +08:00
Bin Liu
ae1001a9d1 Merge pull request #5742 from openanolis/chao/SGX_whitelist
kernel: add CONFIG_X86_SGX into whitelist
2022-11-25 17:36:26 +08:00
Bin Liu
340e24f175 actions: skip some job using "paths-ignore" filter
If only docs/images are changed, some jobs should not run.

Fixes: #5759

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-25 15:33:32 +08:00
Chen Taotao
2426ea9bdc doc: update runtime-rs "Build and Install"
When compiling runtime-rs from source, the documentation should
describe the environment setup and compilation steps in detail
to avoid errors caused by missing or mismatched dependent
packages.

Fixes:#5757

Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>
2022-11-25 13:13:00 +08:00
Chen Taotao
67fe703ff5 runtime-rs: remove the version number from the commit display message
The displayed commit message and version message are partially duplicated.
Remove the version number from the commit display message.

Fixes:#5735

Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>
2022-11-25 13:00:01 +08:00
Ji-Xinyou
1d93a93468 fix(agent): fix iptables binary path in guest
Some rootfs images put iptables-save and iptables-restore
under /usr/sbin instead of /sbin. This PR checks both
and returns the one that exists.

Fixes: #5608
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-11-25 11:57:34 +08:00
Bin Liu
1dfd845f51 runtime: go fix code for 1.19
We have started using golang 1.19, and some features will no longer
be supported in later releases, so run `go fix` to update the code.

Fixes: #5750

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-25 11:29:18 +08:00
Zhongtao Hu
f02bb1a9cb Merge pull request #5729 from openanolis/netnsref
runtime-rs: block on the current thread when setup the network to avoid be take over by other task
2022-11-25 08:09:10 +08:00
Gabriela Cervantes
cd85a44a04 tools: Remove extra tab spaces from kata deploy binaries script
This PR removes extra tab spaces from the kata deploy binaries
script.

Fixes #5747

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-24 17:57:36 +00:00
Chao Wu
cb199e0ecf kernel: add CONFIG_X86_SGX into whitelist
CONFIG_X86_SGX was introduced in kernel 5.11, and it is a
default x86_64 config for the Kata build-kernel.sh script.
However, using -v to specify any kernel version below 5.11 causes an
unavoidable error, because CONFIG_X86_SGX is not supported in older
kernels. This is a problem whenever a kernel version below 5.11 is
needed.

So I propose adding CONFIG_X86_SGX to whitelist.conf to avoid breaking
guest kernel builds below 5.11.

fixes: #5741

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-24 20:43:58 +08:00
Alexandru Matei
4b45e13869 runtime: don't fail mkdir if the folder is already created
Use MkdirAll instead of Mkdir so it doesn't generate an
error when the folder is created by another process

Fixes #5713

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-24 11:20:56 +02:00
Chao Wu
9bde32daa1 Merge pull request #5707 from openanolis/ref
Refactor(runtime-rs): add conditional compile for virt-sandbox persist
2022-11-24 15:24:06 +08:00
Zhongtao Hu
b987bbc576 runtime-rs: block on the current thread when setup the network
As the number of I/O-intensive tasks increases, two issues can arise:

 1. When the future is blocked, the current thread (which is in the network namespace)
    might be taken over by other tasks. After the future finishes, the thread that takes
    over the current task might not be in the pod network namespace.
 2. When network setup finishes, the current thread is set back to the host namespace,
    but the task that took over the thread would still stay in the pod network namespace.

 To avoid that, we need to block the future on the current thread.

Fixes:#5728
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-24 13:48:05 +08:00
Bin Liu
06a604b753 Merge pull request #5720 from YchauWang/wyc-docs-test-22
runtime: add log record to the qemu config method `appendDevices` for…
2022-11-24 13:15:06 +08:00
Peng Tao
b4d0a39f6d Merge pull request #5723 from fidencio/topic/runtime-bump-containerd-to-v1.6.8
runtime: Use containerd v1.6.8
2022-11-24 11:28:58 +08:00
GabyCT
6d1b5d47fb Merge pull request #5664 from GabyCT/topic/fixfirecrackerscript
tools: Fix indentation of build static firecracker script
2022-11-23 15:00:07 -06:00
Fabiano Fidêncio
82aa876903 Merge pull request #5727 from liubin/feat/add-nydus-to-release
package: add nydus to release artifacts
2022-11-23 14:39:26 +01:00
Bin Liu
abb9ebeece package: add nydus to release artifacts
Install nydus related binaries under /opt/kata/libexec/

Fixes: #5726

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-23 15:17:58 +08:00
Fabiano Fidêncio
5cbf879659 Merge pull request #5693 from jongwu/test_ip_table
agent: check if command exist before do ip_tables test
2022-11-23 08:15:08 +01:00
wangyongchao.bj
30a7ebf430 runtime: Log invalid devices in QEMU config
When the user tries to add new devices to the VM, no error information is reported for
an invalid device. This PR adds a log record to `appendDevices` for invalid devices in
the qemu config.

Fixes: #5719

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2022-11-23 09:09:45 +08:00
Fabiano Fidêncio
df3d9878d5 Merge pull request #5695 from darfux/virtiofs-queue-size
runtime: Support virtiofs queue size for qemu and make it configurable
2022-11-22 20:04:30 +01:00
Archana Shinde
e7f8d21bb7 Merge pull request #5717 from Kvasscn/fix_direct_blk_mount_info
docs: change mount-info.json to mountInfo.json
2022-11-22 10:19:02 -08:00
Fabiano Fidêncio
2539f31862 runtime: Use containerd v1.6.8
Let's follow the binary bump used in the CI and also bump the vendored
version of containerd to v1.6.8.

Fixes: #5722

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-22 18:28:30 +01:00
Fabiano Fidêncio
732123b9ab Merge pull request #5709 from kinderyj/main
docs: update doc "NVIDIA GPU passthrough"
2022-11-22 16:53:51 +01:00
Chao Wu
8b04ba95cb Merge pull request #5691 from yipengyin/support-vhost-vsock
runtime-rs: support vhost-vsock
2022-11-22 14:59:55 +08:00
Jason Zhang
993d05a42e docs: change mount-info.json to mountInfo.json
mount-info.json should be mountInfo.json according to the description in the doc.

Fixes: #5716

Signed-off-by: Jason Zhang <zhanghj.lc@inspur.com>
2022-11-22 14:25:57 +08:00
Yipeng Yin
d808adef95 runtime-rs: support vhost-vsock
Rename the old VsockConfig to HybridVsockConfig, and add VsockConfig to
support vhost-vsock. We follow kata's old approach of trying a random vhost fd
up to 50 times to generate a unique fd.

Fixes: #5654

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2022-11-22 10:03:52 +08:00
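The bounded retry mentioned in the commit above can be sketched as follows. This Go example assumes that each attempt draws a random vsock context ID and asks a caller-supplied `claim` callback whether it is free; the callback, the function name `findFreeCID`, and the ID range are assumptions for illustration, not the runtime-rs implementation.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// maxAttempts mirrors the 50 tries mentioned in the commit message.
const maxAttempts = 50

// findFreeCID draws random context IDs until claim accepts one or the
// attempts run out. claim is a hypothetical callback standing in for the
// real "is this ID free?" check.
func findFreeCID(claim func(cid uint32) bool) (uint32, error) {
	for i := 0; i < maxAttempts; i++ {
		// IDs 0-2 are reserved and 0xFFFFFFFF means "any", so draw from
		// the range [3, 0xFFFFFFFF).
		cid := uint32(3 + rand.Int63n(int64(^uint32(0))-3))
		if claim(cid) {
			return cid, nil
		}
	}
	return 0, errors.New("no free context ID found after 50 attempts")
}

func main() {
	taken := map[uint32]bool{}
	cid, err := findFreeCID(func(cid uint32) bool {
		if taken[cid] {
			return false
		}
		taken[cid] = true
		return true
	})
	fmt.Println(cid, err)
}
```

Capping the attempts keeps sandbox creation from hanging when the ID space is unexpectedly crowded.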
Zhongtao Hu
6b2ef66f0f runtime-rs: add conditional compile for virt-sandbox persist
code refactoring, add conditional compile for virt-sandbox persist

Fixes: #5706
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-21 19:51:43 +08:00
Matt Wang
6c1e153a6f docs: update doc "NVIDIA GPU passthrough"
We should make sure the hook shell
`nvidia-container-toolkit.sh` is executable.

Fixes: #5594

Signed-off-by: Matt Wang <kinder_yj@hotmail.com>
2022-11-21 17:31:20 +08:00
Jianyong Wu
b53171b605 agent: check command before do test_ip_tables
The test_ip_tables test depends on the iptables tools, but we can't
ensure these tools exist. It's better to skip the test
if the tools are not available.

Fixes: #5697
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-11-21 14:56:51 +08:00
Bin Liu
7c8d474959 Merge pull request #5689 from kata-containers/kata-ctl-util
utils: Add utility function to fetch the kernel version.
2022-11-21 14:44:05 +08:00
Peng Tao
be31a0fb41 Merge pull request #5638 from bergwolf/github/nydusd
versions: update nydusd version
2022-11-21 09:53:11 +08:00
Peng Tao
a636d426d9 versions: update nydusd version
To the latest stable v2.1.1.

Depends-on: github.com/kata-containers/tests#5246
Fixes: #5635
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-11-19 16:33:29 +00:00
liyuxuan.darfux
3bb145c63a runtime: Support virtiofs queue size for qemu and make it configurable
The default vhost-user-fs queue-size of qemu is 128 now. Set it to 1024
by default, which is the same as clh. Also make this value configurable.

Fixes: #5694

Signed-off-by: liyuxuan.darfux <liyuxuan.darfux@bytedance.com>
2022-11-19 15:38:11 +08:00
Archana Shinde
e80a9f09fa utils: Add utility function to fetch the kernel version.
Add functionality to get kernel version and related unit tests.
This is intended to be used in the kata-env command going forward.

Fixes: #5688

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-11-18 15:39:57 -08:00
Bin Liu
7506237420 Merge pull request #5144 from openanolis/nydus-dev
runtime-rs: support nydus v5 and v6 rootfs
2022-11-18 14:05:04 +08:00
Bo Chen
65686dbbdc Merge pull request #5684 from likebreath/1117/clh_v28.0
Upgrade to Cloud Hypervisor v28.0
2022-11-17 15:18:51 -08:00
Chelsea Mafrica
85f818743b Merge pull request #5679 from liubin/fix/5678-update-swap-doc
docs: update doc "Setup swap device in guest kernel"
2022-11-17 13:23:57 -08:00
Bo Chen
36545aa81a runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v28.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #5683

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-17 09:45:27 -08:00
Bo Chen
f4b02c2244 versions: Upgrade to Cloud Hypervisor v28.0
Details of this release can be found in our new roadmap project as
iteration v28.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #5683

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-17 09:44:49 -08:00
Fabiano Fidêncio
81c0945afa Merge pull request #5669 from fidencio/topic/rust-fixes-plus-golang-bump
Rust fixes + Golang bump
2022-11-17 16:02:17 +01:00
Bin Liu
e4a6fbadf8 docs: update doc "Setup swap device in guest kernel"
`crictl runp` command needs `--runtime kata` option
to start a Kata Containers pod.

Fixes: #5678

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-17 22:57:22 +08:00
Fabiano Fidêncio
2f5f575a43 log-parser: Simplify check
```
14:13:15 parse.go:306:5: S1009: should omit nil check; len() for github.com/kata-containers/kata-containers/src/tools/log-parser.kvPairs is defined as zero (gosimple)
14:13:15 	if pairs == nil || len(pairs) == 0 {
14:13:15 	   ^
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-17 14:17:29 +01:00
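The gosimple S1009 finding quoted in the commit above rests on the fact that `len` of a nil slice is zero in Go. The sketch below demonstrates this; `kvPair` and `kvPairs` here are stand-in types defined for the example and are assumed, not copied, from the log-parser source.

```go
package main

import "fmt"

// kvPair and kvPairs are stand-in types for illustration only.
type kvPair struct{ key, value string }
type kvPairs []kvPair

// isEmpty needs no separate nil check: len of a nil slice is 0.
func isEmpty(pairs kvPairs) bool {
	return len(pairs) == 0
}

func main() {
	var nilPairs kvPairs // nil slice
	fmt.Println(isEmpty(nilPairs), isEmpty(kvPairs{}), isEmpty(kvPairs{{"a", "b"}}))
	// prints: true true false
}
```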
Fabiano Fidêncio
d94718fb30 runtime: Fix gofmt issues
It seems that, after bumping the versions of golang and golangci-lint,
new format changes are required.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-17 14:16:12 +01:00
Fabiano Fidêncio
16b8375095 golang: Stop using io/ioutils
The package has been deprecated as part of 1.16 and the same
functionality is now provided by either the io or the os package.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-17 13:43:25 +01:00
Fabiano Fidêncio
66aa330d0d versions: Update golangci-lint
Let's bump the golangci-lint in order to fix issues that popped up after
updating Golang to its 1.19.2 version.

Depends-on: github.com/kata-containers/tests#5257

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-16 19:03:02 +01:00
Peng Tao
b3a4a16294 versions: bump containerd version
v1.5.2 cannot be built from source by newer golang. Let's bump
containerd version to 1.6.8. The GO runtime dependency has
been moved to v1.6.6 for some time already.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-11-16 19:02:41 +01:00
Peng Tao
eab8d6be13 build: update golang version to 1.19.2
So that we get the latest language fixes.

There is little use in maintaining compiler backward compatibility.
Let's just set the default golang version to the latest 1.19.2.

Fixes: #5494
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-11-16 19:02:39 +01:00
Chao Wu
e80dbc15d8 runtime-rs: workaround Dragonball compilation problem
The upstream rust-vmm is currently changing its dependency style towards
caret requirements (more information: rust-vmm/vm-memory#199), and this
breaks Dragonball compilation frequently.

rust-vmm is expected to finish the changes this week. In order not to
break Kata CI due to Dragonball's compilation error, we will add a
Cargo.lock file into /src/dragonball first and remove it later when
rust-vmm is stable.

fixes: #5657
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-16 12:44:41 +01:00
Ji-Xinyou
c3f1922df6 fix(fmt): fix cargo fmt to pass static check
Fix cargo fmt

Fixes: #5639
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-11-16 12:44:38 +01:00
Greg Kurz
1bbcb413c9 Merge pull request #5597 from UiPath/fix-clh-wait
clh: avoid race condition when stopping clh
2022-11-16 07:39:27 +01:00
Gabriela Cervantes
a4099dab8f tools: Fix indentation of build static firecracker script
This PR fixes the indentation of the build static firecracker script.

Fixes #5663

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-15 16:01:36 +00:00
Bin Liu
b8dbb35bb7 Merge pull request #5631 from GabyCT/topic/fixvirtiofsdscript
tools: Fix indentation of build static virtiofsd script
2022-11-11 14:31:26 +08:00
Bin Liu
dff78593c0 Merge pull request #5505 from Joffref/patch-1
docs: Fix configuration path
2022-11-11 14:26:40 +08:00
Zhongtao Hu
7d91150185 Merge pull request #5536 from chentt10/fix-name-shim-source-ambiguous
runtime-rs : fix the shim source in the documentation test is ambiguous
2022-11-11 14:07:05 +08:00
Zhongtao Hu
c46814b26a runtime-rs:support nydus v5 and v6
add nydus v5 and v6 support for container rootfs

Fixes:#5142
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-11 10:15:35 +08:00
Alexandru Matei
a04afab74d qemu: early exit from Check if the process was stopped
Fixes: #5625

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
7e481f2179 qemu: set stopped only if StopVM is successful
Fixes: #5624

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
0e3ac66e76 clh: return faster with dead clh process from isClhRunning
Through proactively checking if Cloud Hypervisor process is dead,
this patch provides a faster path for isClhRunning

Fixes: #5623

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
9ef68e0c7a clh: fast exit from isClhRunning if the process was stopped
Use atomic operations instead of acquiring a mutex in isClhRunning.
This stops isClhRunning from generating a deadlock by trying to
reacquire an already-acquired lock when called via StopVM->terminate.

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
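The locking change described in the commit above can be illustrated with a reduced Go model. Everything here (the `vmm` type, its fields, and its method names) is a hypothetical stand-in for the Cloud Hypervisor driver. The point is that a status check backed by an atomic flag is safe to call while the mutex is already held.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// vmm is a hypothetical stand-in for the hypervisor driver object.
type vmm struct {
	mu      sync.Mutex
	stopped atomic.Bool
}

// isRunning reads the flag atomically and never touches mu, so it can be
// called from code that already holds mu.
func (v *vmm) isRunning() bool {
	return !v.stopped.Load()
}

// stop holds mu for the whole shutdown and calls isRunning in the middle.
// If isRunning locked mu as well, this call would deadlock, because
// sync.Mutex is not reentrant. It returns true only for the call that
// actually performs the stop.
func (v *vmm) stop() bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if !v.isRunning() {
		return false
	}
	v.stopped.Store(true)
	return true
}

func main() {
	v := &vmm{}
	fmt.Println(v.isRunning(), v.stop(), v.stop(), v.isRunning())
	// prints: true true false false
}
```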
Alexandru Matei
2631b08ff1 clh: don't try to stop clh multiple times
Avoid executing StopVM concurrently when virtiofs dies as a result of clh
being stopped in StopVM.

Fixes: #5622

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
James O. D. Hunt
56641bc230 Merge pull request #5637 from openanolis/chao/update_cargo_lock
versions: update vmm-sys-util and related crates to v0.11.0
2022-11-10 13:49:24 +00:00
Chao Wu
f45fe4f90d versions: update vmm-sys-util and related crates to v0.11.0
Since the upstream vmm-sys-util was upgraded to 0.11.0, some crates
automatically upgrade to v0.11.0 and some stay at v0.10.0 (depending
on how they write the version dependency in `Cargo.toml`), which causes
a compile error in runtime-rs.

In order to fix this problem, we need to upgrade all vmm-sys-util
dependencies in runtime-rs to v0.11.0.

fixes: #5636

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-10 19:13:23 +08:00
quanweiZhou
bbc93260c9 Merge pull request #5615 from openanolis/chao/delete_cargo_patch
runtime-rs: delete all cargo patches
2022-11-10 10:18:19 +08:00
Gabriela Cervantes
8be0817305 tools: Fix indentation of build static virtiofsd script
This PR removes single spaces and fixes the indentation of the script.

Fixes #5630

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-09 17:09:13 +00:00
Zhongtao Hu
071ac4693a Merge pull request #5613 from openanolis/iptables
feat(shim-mgmt): iptables handler
2022-11-09 17:21:45 +08:00
Bin Liu
1d59137c6f Merge pull request #5620 from GabyCT/topic/removeemptysspaces
tools: Remove empty spaces from build kernel script
2022-11-09 17:02:29 +08:00
Ji-Xinyou
f8f97c1e22 feat(shim-mgmt): iptables handler
Support the handlers in runtime, which are used by kata-ctl iptables series of commands in runtime.

Fixes: #5370
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-11-09 10:39:50 +08:00
Chao Wu
29c75cf12b runtime-rs: delete all cargo patches
The cargo patches in the cargo.toml seem to make the whole runtime-rs
build take longer, and they also make it harder to build runtime-rs in
an environment without network access.

We should delete all patches from the cargo.toml file and publish all
the crates that were once patched.

fixes: #5614 #5527 #5526 #5449

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-09 10:02:58 +08:00
Gabriela Cervantes
9f70a6949b tools: Remove empty spaces from build kernel script
This PR removes some extra empty spaces at the build kernel script.

Fixes #5619

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-08 17:49:57 +00:00
Chao Wu
f5f25d9379 Merge pull request #5431 from wllenyj/dragonball-ut-3
Built-in Sandbox: add more unit tests for dragonball. Part 3
2022-11-08 15:48:16 +08:00
Zhongtao Hu
351bdbfacd Merge pull request #5567 from openanolis/chao/fix_mem_file_path_error
Dragonball: enable mem_file_path config into hugetlbfs process
2022-11-08 09:00:13 +08:00
wllenyj
57336835da dragonball: add more unit test for device manager
Added more unit tests for device manager.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-11-08 00:45:17 +08:00
wllenyj
2333700237 dragonball: add test utils.
Added some tools for dragonball unit testing.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-11-08 00:45:17 +08:00
Bin Liu
bfe9157abc Merge pull request #5570 from openanolis/capability
runtime-rs:add hypervisor interface capabilities
2022-11-07 23:04:55 +08:00
Mathis Joffre
3e9c3f12ce docs: Fix configuration path
On install you generate a configuration-fc.toml
file when building the kata-runtime and
copy it to either /etc/kata-containers/configuration-fc.toml
or /usr/share/defaults/kata-containers/configuration-fc.toml.
To reflect that the path must be one of the above,
we can fix the path in doc.

Fixes: #5589

Signed-off-by: Mathis Joffre <mariusjoffre@gmail.com>
2022-11-07 10:19:47 +01:00
Chao Wu
2adb1c1823 Dragonball: enable mem_file_path config into hugetlbfs process
In the current Dragonball code, mem_file_path config is not used when
hugetlbfs is enabled.
In this commit we add mem_file_path into hugetlbfs enable process.

fixes: #5566
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-07 16:07:57 +08:00
Fabiano Fidêncio
7250be3601 Merge pull request #5584 from fengyehong/clh-thread
cloud-hypervisor: Fix GetThreadIDs function
2022-11-07 08:22:40 +01:00
Fabiano Fidêncio
3b1750e8e8 Merge pull request #5586 from fidencio/topic/paralelise-static-checks
github: Parallelise static checks
2022-11-07 07:54:48 +01:00
Bin Liu
824ea83c3c Merge pull request #5573 from pmores/fill-in-virtiofsd-standalone-impl
runtime-rs: blanks filled & fixes made to virtiofsd launch
2022-11-07 14:19:45 +08:00
Bin Liu
83d052f82b Merge pull request #4476 from LitFlwr0/vcpu-pinning-frq
vCPUs pinning support for Kata Containers
2022-11-07 10:37:22 +08:00
Guanglu Guo
daeee26a1e cloud-hypervisor: Fix GetThreadIDs function
Get vcpu thread-ids by reading cloud-hypervisor process tasks information.

Fixes: #5568

Signed-off-by: Guanglu Guo <guoguanglu@qiyi.com>
2022-11-05 17:23:19 +08:00
Bin Liu
427b01e298 Merge pull request #5548 from justxuewei/fix/share-fs-permission
runtime-rs: fix shared volume permission issue
2022-11-04 21:21:50 +08:00
Fabiano Fidêncio
40d514aa2c github: Parallelise static checks
Although introducing an awful amount of code duplication, let's
parallelise the static checks in order to reduce its time and the space
used in the VMs running those.

While I understand there may be ways to make the whole setup less
repetitive and error prone, I'm taking the approach of:
* Make it work
* Make it right
* Make it fast

So, it's clear that I'm only attempting to make it work, and I'd
appreciate community help in order to improve the situation here.  But,
for now, this is a stopgap solution.

JFYI, the time needed to run the tests on the `main` branch went down
from ~110 minutes to ~60 minutes.  Plus, we're not running those on a
single VM anymore, which decreases the chance of hitting the space limit.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/3393468605/jobs/5640842041

Ideally, each one of the following tests should be also split into
smaller tests, each test for one component, for instance.
* static-checks
* compiler-checks
* unit-tests
* unit-tests-as-root

Fixes: #5585

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-04 13:41:16 +01:00
LitFlwr0
2508d39b7c runtime: added vcpus pinning logics
Core VCPU threads pinning logics for issue 4476. Also provided docs.

Fixes:#4476
Signed-off-by: LitFlwr0 <861690705@qq.com>
2022-11-04 17:52:42 +08:00
Zhongtao Hu
fef8e92af1 runtime-rs:add hypervisor interface capabilities
1. be able to check whether the hypervisor supports block devices, block
device hotplug, multi-queue, and file sharing

2. be able to set the hypervisor capabilities for block devices, block
device hotplug, multi-queue, and file sharing

Fixes: #5569
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-04 09:24:36 +08:00
Bin Liu
b0c7bcce7c Merge pull request #5556 from ManaSugi/runk/fix-kill-behavior
runk: Ignore an error when calling kill cmd with --all option
2022-11-04 08:42:27 +08:00
Bin Liu
02fa6b8dad Merge pull request #5557 from ManaSugi/runk/update-cargolock-libseccomp
runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
2022-11-04 08:41:45 +08:00
Fabiano Fidêncio
bb38901550 Merge pull request #5571 from jodh-intel/snap-unbreak-docker
snap: Unbreak docker install
2022-11-03 23:47:07 +01:00
Pavel Mores
27b1913584 runtime-rs: blanks filled & fixes made to virtiofsd launch
The 'config' argument to ShareVirtioFsStandalone::new() is now actually
used, taking care of an explicit TODO.

If a shared path doesn't exist in ShareVirtioFsStandalone::virtiofsd_args()
it is now created instead of returning an error, thus following suit with
ShareVirtioFsInline.

The '-o vhost_user_socket=...' command line argument doesn't seem to be
supported by newer versions of virtiofsd so we replace it with
'--socket-path' which should be functionally equivalent according to docs.

Fixes #5572

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-11-03 08:38:59 +01:00
James O. D. Hunt
990e6359b7 snap: Unbreak docker install
It appears that _either_ the GitHub workflow runners have changed their
environment, or the Ubuntu archive has changed package dependencies,
resulting in the following error when building the snap:

```
Installing build dependencies: bc bison build-essential cpio curl docker.io ...

    :

The following packages have unmet dependencies:
docker.io : Depends: containerd (>= 1.2.6-0ubuntu1~)
E: Unable to correct problems, you have held broken packages.
```

This PR uses the simplest solution: install the `containerd` and `runc`
packages. However, we might want to investigate alternative solutions in
the future given that the docker and containerd packages seem to have
gone wild in the Ubuntu GitHub workflow runner environment. If you
include the official docker repo (which the snap uses), a _subset_ of
the related packages is now:

- `containerd`
- `containerd.io`
- `docker-ce`
- `docker.io`
- `moby-containerd`
- `moby-engine`
- `moby-runc`
- `runc`

Fixes: #5545.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-11-02 10:09:03 +00:00
James O. D. Hunt
ca69a9ad6d snap: Use metadata for dependencies
Rather than hard-coding the package manager into the docker part,
use the `build-packages` section to specify the parts package
dependencies in a distro agnostic manner.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-11-02 09:50:29 +00:00
Manabu Sugimoto
df092185ee runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
The libseccomp crate was upgraded to v0.3.0 by 4696ead,
but `Cargo.lock` of runk wasn't updated by mistake.
So, this commit updates `Cargo.lock` of runk to the latest dependencies.

Fixes: #5487

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-01 20:26:33 +09:00
Manabu Sugimoto
16dca4ecd4 runk: Ignore an error when calling kill cmd with --all option
Ignore the error that is triggered when the kill command is called
with the `--all` option on a stopped container.

High-level container runtimes such as containerd call the kill command with
`--all` option in order to terminate all processes inside the container
even if the container already is stopped. Hence, a low-level runtime
should allow `kill --all` regardless of the container state like runc.

This commit reverts to the previous behavior.

Fixes: #5555

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-01 20:24:29 +09:00
Xuewei Niu
b74c18024a runtime-rs: fix shared volume permission issue
Fix the issue where shared volumes always have read-write permission even if
read-only permission is enough.

Fixes: #5549

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-11-01 18:42:19 +08:00
Chen TaoTao
936fe35acb runtime-rs : fix shim source is ambiguous
In the documentation test, the name shim has multiple potential
sources of import, now give it a clear source.

Fixes: #5535

Signed-off-by: Chen TaoTao <chentt10@chinatelecom.cn>
2022-10-31 19:54:22 -07:00
snir911
288e337a6f Merge pull request #5434 from Rouzip/remove-doNetNS
add EnterNetNS in virtcontainers
2022-10-30 11:19:07 +02:00
GabyCT
e04ad49c1b Merge pull request #5530 from GabyCT/topic/fixclhscript
tools: Fix indentation of build static clh script
2022-10-28 11:52:56 -05:00
Gabriela Cervantes
0ed7da30d7 tools: Fix indentation of build static clh script
This PR removes single spaces and fixes the indentation of the script.

Fixes #5528

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-10-27 21:09:34 +00:00
Bin Liu
0bb005093e Merge pull request #5523 from BbolroC/s390x-virtiofsd
virtiofsd: Not use "link-self-contained=yes" on s390x
2022-10-27 20:42:57 +08:00
Hyounggyu Choi
43fcb8fd09 virtiofsd: Not use "link-self-contained=yes" on s390x
The compile option link-self-contained=yes asks rustc to use
C library startup object files that come with the compiler,
which are not available on the target s390x-unknown-linux-gnu.
A build does not contain any startup files leading to a
broken executable entry point (causing segmentation fault).

Fixes: #5522

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2022-10-26 23:43:22 +02:00
David Esparza
37f0cd1c8f Merge pull request #5436 from amshinde/kata-ctl-drop-privs
Kata ctl drop privs
2022-10-26 11:37:27 -05:00
David Esparza
8b0c830a23 Merge pull request #5513 from bergwolf/github/golang-ci-lint
versions: bump golangci-lint version
2022-10-26 07:36:45 -05:00
Bin Liu
059b09b0a8 Merge pull request #5510 from bergwolf/github/runtime-rs-makefile
runtime-rs: generate config files with the default target
2022-10-26 20:29:17 +08:00
David Esparza
4d6c3bd0fa Merge pull request #5515 from cmaf/docs-fix-sgx-k8s-volumemount
docs: Fix volumeMounts in SGX usage example
2022-10-26 07:24:31 -05:00
Chelsea Mafrica
219919e9f7 docs: Fix volumeMounts in SGX usage example
The /dev/sgx is not mounted and the enclave is not available,
causing the demo job to report an error in the logs. Add volumeMounts to
container in order to have the device available in the container.

Fixes: #5514

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-25 23:20:49 -07:00
Archana Shinde
c0f5bc81b7 cargo: Add Cargo.lock to version control
Add Cargo.lock to capture state of build.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-10-25 20:34:40 -07:00
Archana Shinde
474927ec90 gitignore: Add gitignore file
Ignore autogenerated version.rs

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-10-25 20:34:40 -07:00
Archana Shinde
699f821e12 utils: Add function to drop priveleges
This function is meant to be used before operations
such as accessing the network, to make sure those operations
are not performed as a privileged user.

Fixes: #5331

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-10-25 20:34:40 -07:00
Peng Tao
a6fb4e2a68 versions: bump golangci-lint version
There is little point in maintaining backward compatibility for
golangci-lint. Let's just use a unified version of it.

Fixes: #5512
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-26 10:41:24 +08:00
Peng Tao
b015f34aff runtime-rs: generate config files with the default target
Right now it is not generated with a simple `make`.

Fixes: #5509
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-26 10:25:29 +08:00
Yuan-Zhuo
d7bb4b5512 agent: support systemd cgroup for kata agent
1. Implement a Rust module for operating cgroups through systemd with the help of zbus (src/agent/rustjail/src/cgroups/systemd).
2. Add support for optional cgroup configuration through fs and systemd in the agent (src/agent/rustjail/src/container.rs).
3. Describe the usage and supported properties of the agent systemd cgroup (docs/design/agent-systemd-cgroup.md).

Fixes: #4336

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-10-25 13:57:09 +08:00
Bo Chen
a151d8ee50 Merge pull request #5493 from fidencio/topic/update-clh
versions: Update Cloud Hypervisor to b4e39427080
2022-10-24 07:54:02 -07:00
Bin Liu
0f7088a4b1 Merge pull request #5501 from openanolis/update_install_guide
docs: update rust runtime installation guide
2022-10-24 17:49:34 +08:00
Bin Liu
4696eadfeb Merge pull request #5488 from ManaSugi/fix/update-libseccomp-crate
rustjail: Upgrade libseccomp crate to v0.3.0
2022-10-24 17:03:30 +08:00
Bin Liu
badb2600b3 Merge pull request #5474 from openanolis/makefile
makefile: remove sudo when create symbolic link
2022-10-24 17:03:20 +08:00
Bin Liu
ab5f97759d Merge pull request #5497 from Rouzip/remove-redundant
agent: remove redundant checks
2022-10-24 16:41:49 +08:00
Fabiano Fidêncio
190e623c40 Merge pull request #5317 from Champ-Goblem/fix-containerd-stats
shim: Ensure pagesize is set when reporting hugetlb stats
2022-10-24 10:24:49 +02:00
Fabiano Fidêncio
7248cf51c5 Merge pull request #5447 from hbrueckner/fix-5438
kata-ctl: Re-enable network tests on s390x (fixes 5438)
2022-10-24 10:23:35 +02:00
Zhongtao Hu
144efd1a7a docs: update rust runtime installation guide
As kata-deploy now supports the Rust runtime, we need to update the installation docs.

Fixes: #5500
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-10-24 15:55:30 +08:00
James O. D. Hunt
65ef2a0a0b Merge pull request #5089 from liubin/fix/4895-ignore-exit-error
agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
2022-10-24 08:46:54 +01:00
Zhongtao Hu
164ecca3f0 Merge pull request #5499 from zhaoxuat/main
fix readme content error at doc directory
2022-10-24 14:15:52 +08:00
zhaoxu
abf4f9b299 docs: kata 3.0 Architecture
fix readme content error

Fixes: #5498
Signed-off-by: zhaoxu <zhaoxu@megvii.com>
2022-10-24 11:07:34 +08:00
snir911
ee189d2ebe Merge pull request #5455 from kata-containers/main-validate-hp-size
agent: validate hugepage size is supported
2022-10-23 08:15:05 +03:00
Rouzip
44d8de8923 agent: remove redundant checks
Remove redundant checks for executable files.

Fixes: #3730

Signed-off-by: Rouzip <1226015390@qq.com>
2022-10-22 23:31:18 +08:00
Fabiano Fidêncio
9d286af7b4 versions: Update Cloud Hypervisor to b4e39427080
An API change, done a long time ago, has been exposed on Cloud
Hypervisor and we should update it on the Kata Containers side to ensure
it doesn't affect Cloud Hypervisor CI and because the change is needed
for an upcoming work to get QAT working with Cloud Hypervisor.

Fixes: #5492

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-21 20:52:54 +02:00
Bin Liu
081ee48713 agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
Sometimes we hit an EEXIST error when adding an ARP neighbour.
Using NLM_F_REPLACE instead of NLM_F_EXCL avoids the failure when the
entry already exists.

See https://man7.org/linux/man-pages/man7/netlink.7.html
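
The difference between the two flags can be illustrated with a toy model. This is only a sketch: the flag values are the ones defined by Linux netlink, but `add_neighbour` and the dict-based neighbour table are hypothetical stand-ins for the kernel, not the agent's rtnetlink code.

```python
import errno

# Flag values from the Linux netlink headers.
NLM_F_REQUEST = 0x001
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400


def add_neighbour(table, ip, lladdr, flags):
    """Toy model of a neighbour add request (hypothetical helper)."""
    if ip in table:
        if flags & NLM_F_EXCL:
            # Exclusive create: an existing entry is an error.
            raise FileExistsError(errno.EEXIST, "File exists")
        if flags & NLM_F_REPLACE:
            # Replace: overwrite the existing entry instead of failing.
            table[ip] = lladdr
        return
    if flags & NLM_F_CREATE:
        table[ip] = lladdr
    else:
        raise FileNotFoundError(errno.ENOENT, "No such entry")
```

With `NLM_F_CREATE | NLM_F_EXCL` a second add for the same address raises EEXIST; with `NLM_F_CREATE | NLM_F_REPLACE` it updates the entry.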

Fixes: #4895

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-10-21 21:19:14 +08:00
Hendrik Brueckner
e95089b716 kata-ctl: add basic cpu check for s390x
Add a basic s390x cpu check for the "sie" feature to be present.
Also re-enable cpu check testing.
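
As a rough illustration of such a check (not the kata-ctl code, which is written in Rust), the sketch below scans cpuinfo-style text for a flag on the `features` line. The helper name `has_cpu_feature` and the idea of passing the text in as a string are assumptions made for this example.

```python
def has_cpu_feature(cpuinfo_text, feature):
    """Return True if `feature` is listed on the "features" line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            # Features are whitespace-separated tokens.
            return feature in value.split()
    return False


if __name__ == "__main__":
    # Assumes a Linux host; on s390x the interesting flag is "sie".
    with open("/proc/cpuinfo") as f:
        print(has_cpu_feature(f.read(), "sie"))
```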

Fixes: #5438

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
2022-10-21 12:04:28 +00:00
Hendrik Brueckner
871d2cf2c0 kata-ctl: Limit running tests to x86 and use native-tls on s390x
For s390x, use native-tls for reqwest because the rustls-tls/ring
dependency is not available for s390x.

Also exclude s390x, powerpc64le, and aarch64 from running the cpu
check due to the lack of the arch-specific implementation. In this
case, rust complains about unused functions in src/check.rs (both
normal and test context).

Fixes: #5438

Co-authored-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
2022-10-21 11:54:26 +00:00
Manabu Sugimoto
cbd84c3f5a rustjail: Upgrade libseccomp crate to v0.3.0
The libseccomp crate v0.3.0 has been released, so use it in the agent.

Fixes: #5487

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-10-21 15:40:05 +09:00
Bin Liu
1bf64c9a11 Merge pull request #5453 from openanolis/chao/fix_comment_typo
Makefile: fix an typo in runtime-rs makefile
2022-10-21 14:36:39 +08:00
David Esparza
1c159d83ea Merge pull request #5465 from fidencio/topic/re-work-QEMU-dockerfile
qemu: Re-work static-build Dockerfile
2022-10-20 13:32:03 -05:00
Zhongtao Hu
748be0fe3d makefile: remove sudo when create symbolic link
When using mock to package RPMs, we cannot use sudo.

Fixes: #5473
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-10-20 22:13:21 +08:00
Bin Liu
cd27ad144e Merge pull request #5219 from openanolis/krt-modify
Modify agent-url return value in runtime-rs
2022-10-20 11:17:29 +08:00
Fabiano Fidêncio
227e717d27 qemu: Re-work static-build Dockerfile
Unlike every other component in our repo, QEMU has been using a single
Dockerfile that prepares an environment where the project can be built,
but that *also* builds the project as part of that very same
Dockerfile.

This is a problem, for several different reasons, including:
* It's very hard to have a reproducible build if you don't have an
  archived image of the builder
* One cannot cache / upload the image of the builder, as it already
  contains a specific version of QEMU
* Every single CI run we end up building the builder image, which
  includes building dependencies (such as liburing)

Let's split the logic into a new build script, and pass the build script
to be executed inside the builder image, which will be only responsible
for providing an environment where QEMU can be built.

Fixes: #5464

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-19 21:34:36 +02:00
Bin Liu
faf363db75 Merge pull request #5414 from openanolis/chao/regulate_runtime_rs_makefile_comments
runtime-rs: regulate the comment in runtime-rs makefile
2022-10-19 15:36:00 +08:00
Snir Sheriber
72738dc11f agent: validate hugepage size is supported
Validate the size before setting a limit; otherwise paths may not be found.
A guest supporting a different hugepage size is more likely with peer-pods,
where the podvm may use a different flavor.
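
A minimal sketch of this kind of validation, assuming the supported sizes are discovered from the kernel's `/sys/kernel/mm/hugepages/hugepages-<N>kB` directories. The agent itself is written in Rust and may detect sizes differently; both function names here are hypothetical.

```python
import os
import re


def supported_hugepage_sizes(sysfs_dir="/sys/kernel/mm/hugepages"):
    """Return the set of hugepage sizes (in bytes) listed under sysfs_dir."""
    sizes = set()
    try:
        entries = os.listdir(sysfs_dir)
    except FileNotFoundError:
        return sizes  # no hugepage support exposed
    for name in entries:
        match = re.fullmatch(r"hugepages-(\d+)kB", name)
        if match:
            sizes.add(int(match.group(1)) * 1024)
    return sizes


def check_hugepage_size(size_bytes, supported):
    """Raise before any limit is written for an unsupported size."""
    if size_bytes not in supported:
        raise ValueError(f"hugepage size {size_bytes} is not supported by the guest")
```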

Fixes: #5191
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-10-19 09:55:33 +03:00
Chao Wu
f74e328fff Makefile: fix an typo in runtime-rs makefile
There is a typo in runtime-rs makefile.
_dragonball should be _DB

fixes: #5452

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-19 14:12:48 +08:00
Chao Wu
f205472b01 Makefile: regulate the comment style for the runtime-rs comments
In runtime-rs makefile, we use
```
```
to let make help print out help information for variables and targets,
but later commits forgot this rule.
So we need to follow the previous rule and change the current comments.

fixes: #5413
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-19 12:12:50 +08:00
Fabiano Fidêncio
c97b7b18e7 Merge pull request #5416 from zvonkok/patch-1
doc: Update how-to-run-kata-containers-with-SNP-VMs.md
2022-10-18 22:45:05 +02:00
Hendrik Brueckner
9f2c7e47c9 Revert "kata-ctl: Disable network check on s390x"
This reverts commit 00981b3c0a.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
2022-10-18 11:12:18 +00:00
James O. D. Hunt
dd60a0298d Merge pull request #5439 from jodh-intel/kata-ctl-s390x-disable-tls
kata-ctl: Disable network check on s390x
2022-10-18 09:58:09 +01:00
Zvonko Kaiser
ac403cfa5a doc: Update how-to-run-kata-containers-with-SNP-VMs.md
If the needed libraries (for virtfs) are installed on the host,
QEMU will pick them up and enable virtfs. If they are not installed
and you do not enable the flag, QEMU will just ignore it, and you end
up without 9p support. Enabling it explicitly will fail if the
needed libs are not installed, so this way we can be sure that
it gets built.

Fixes: #5418

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-10-17 05:56:19 -07:00
James O. D. Hunt
00981b3c0a kata-ctl: Disable network check on s390x
s390x apparently does not support rust-tls, which is required by the
network check (due to the `reqwest` crate dependency).

Disable the network check on s390x until we can find a solution to the
problem.

> **Note:**
>
> This fix is assumed to be a temporary one until we find a solution.
> Hence, I have not moved the network check code (which should be entirely
> generic) into an architecture specific module.

Fixes: #5435.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-17 10:24:06 +01:00
Rouzip
39363ffbfb runtime: remove same function
Add EnterNetNS in virtcontainers to remove the duplicated function.

Fixes: #5394

Signed-off-by: Rouzip <1226015390@qq.com>
2022-10-17 10:59:13 +08:00
James O. D. Hunt
c322d1d12a kata-ctl: arch: Improve check call
Rework the architecture-specific `check()` call by moving all the
conditional logic out of the function.

Fixes: #5402.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-15 11:41:53 +01:00
Fabiano Fidêncio
ff8bfdfe3b Merge pull request #5426 from fidencio/topic/build-virtiofsd-in-a-2nd-layer-container
virtiofsd: Build inside a container
2022-10-15 00:26:56 +02:00
Fabiano Fidêncio
0bc5baafb9 snap: Build virtiofsd using the kata-deploy scripts
Let's build virtiofsd using the kata-deploy build scripts, which
simplifies and unifies the way we build our components.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-14 13:44:03 +02:00
Fabiano Fidêncio
cb4ef4734f snap: Create a task for installing docker
Let's have the docker installation / configuration as part of its own
task, which can be set as a dependency of other tasks which may or may
not depend on docker.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-14 12:41:21 +02:00
Fabiano Fidêncio
7e5941c578 virtiofsd: Build inside a container
When moving to building the CI artefacts using the kata-deploy scripts,
we've noticed that the build would fail on any machine where the tarball
wasn't officially provided.

This happens as rust is missing from the 1st layer container.  However,
it's a very common practice to leave the 1st layer container with the
minimum possible dependencies and install whatever is needed for
building a specific component in a 2nd layer container, which virtiofsd
never had.

In this commit we introduce the second layer containers (yes,
containers), one for building virtiofsd using musl, and one for building
virtiofsd using glibc.  The reason for taking this approach was to
actually simplify the scripts and avoid building the dependencies
(libseccomp, libcap-ng) using musl libc.

Fixes: #5425

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-14 12:41:21 +02:00
Zhongtao Hu
5d17cbeef7 Merge pull request #5383 from openanolis/chao/update_comments_in_event_manager
Dragonball: remove redundant comments in event manager
2022-10-14 15:50:37 +08:00
Fabiano Fidêncio
c745d6648d Merge pull request #5420 from fidencio/topic/update-tdx-qemu-repo
versions: Update TDX QEMU
2022-10-13 20:57:37 +02:00
Bin Liu
b23a24ab2f Merge pull request #5417 from liubin/fix/typo-get_contaier_type
runtime-rs: fix typo get_contaier_type to get_container_type
2022-10-13 22:35:23 +08:00
Bin Liu
c7b38532f0 Merge pull request #5412 from tzY15368/improve-cmd-descriptions
kata-ctl: improve command descriptions for consistency
2022-10-13 19:17:42 +08:00
Fabiano Fidêncio
35d52d30fd versions: Update TDX QEMU
The previously used repo will be removed by Intel, as done with the one
used for TDX kernel.  The TDX team has already worked on providing the
patches that were hosted atop of the QEMU commit with the following hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0 as a tarball in the
https://github.com/intel/tdx-tools repo, see
https://github.com/intel/tdx-tools/pull/162.

On the Kata Containers side, in order to simplify the process and to
avoid adding hundreds of patches to our repo, we've revived the
https://github.com/kata-containers/qemu repo, and created a branch and a
tag with those hundreds of patches atop of the QEMU commit hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0.  The branch is called
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0-plus-TDX-v3.1 and the tag is
called TDX-v3.1.

Knowing the whole background, let's switch the repo we're getting the
TDX QEMU from.

Fixes: #5419

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-13 11:53:29 +02:00
Bin Liu
4d9dd8790d runtime-rs: fix typo get_contaier_type to get_container_type
Change get_contaier_type to get_container_type

Fixes: #5415

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-10-13 17:12:43 +08:00
Bin Liu
2de29b6f69 Merge pull request #5088 from liubin/fix/5087-force-shutdown-shim
runtime-rs: force shutdown shim process in it can't exit
2022-10-13 16:55:05 +08:00
Fabiano Fidêncio
d934d87482 Merge pull request #5404 from fidencio/topic/update-tdx-kernel-repo
versions: Update TDX kernel
2022-10-13 09:14:44 +02:00
Tingzhou Yuan
70676d4a99 kata-ctl: improve command descriptions for consistency
This change improves the command descriptions for kata-ctl and can avoid certain confusions in command functionality.

Fixes #5411

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2022-10-13 04:10:23 +00:00
Bin Liu
3b70c72436 Merge pull request #5395 from wllenyj/dragonball-s390
ci: skip s390x for dragonball.
2022-10-13 09:03:08 +08:00
Bin Liu
157d3cdcb1 Merge pull request #5397 from openanolis/chao/delete_redundant_dragonball_comment
Dragonball: delete redundant comments in blk_dev_mgr
2022-10-13 09:01:59 +08:00
Fabiano Fidêncio
9eb73d543a versions: Update TDX kernel
The previously used repo has been removed by Intel.  As this happened,
the TDX team worked on providing the patches that were hosted atop of
the v5.15 kernel as a tarball present in the
https://github.com/intel/tdx-tools repos, see
https://github.com/intel/tdx-tools/pull/161.

On the Kata Containers side, in order to simplify the process and to
avoid adding ~1400 kernel patches to our repo, we've revived the
https://github.com/kata-containers/linux repo, and created a branch and
a tag with those ~1400 patches atop of the v5.15.  The branch is called
v5.15-plus-TDX, and the tag is called 5.15-plus-TDX (in order to avoid
having to change how the kernel builder script deals with versioning).

Knowing the whole background, let's switch the repo we're getting the
TDX kernel from.

Fixes: #5326

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-12 16:54:43 +02:00
James O. D. Hunt
d3ee8d9f1b Merge pull request #5388 from jodh-intel/kata-ctl
kata-ctl: Move development to main branch
2022-10-12 14:29:35 +01:00
James O. D. Hunt
00a42f69c0 kata-ctl: cargo: 2021 -> 2018
Revert to the 2018 edition of rust for consistency with other rust
components.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-12 11:46:51 +01:00
James O. D. Hunt
fb63274747 kata-ctl: rustfmt + clippy fixes
Make this file conform to the standard rust layout conventions and
simplify the code as recommended by `clippy`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-12 11:46:48 +01:00
wllenyj
1f1901e059 dragonball: fix clippy warning for aarch64
Added aarch64 check.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-12 18:29:00 +08:00
wllenyj
a343c570e4 dragonball: enhance dragonball ci
Unified use of Makefile instead of calling `cargo test` directly.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-12 17:53:01 +08:00
wllenyj
6a64fb0eb3 ci: skip s390x for dragonball.
Currently, Dragonball only supports x86_64 and aarch64 platforms.

Fixes: #4381

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-12 15:27:45 +08:00
Bin Liu
7aacba0abc Merge pull request #5282 from liubin/fix/4730-rs-emptydir
runtime-rs: support ephemeral storage for emptydir
2022-10-12 09:53:59 +08:00
Chao Wu
a743e37daf Dragonball: delete redundant comments in blk_dev_mgr
Delete the redundant derive part for BlockDeviceMgr.

fixes: #5396

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-11 19:41:47 +08:00
Chao Wu
d2bf2f5dd0 Merge pull request #5393 from LetFu/5392/fixInstallKata30RustRuntimeShimGuideTypo
docs: fix a typo in rust-runtime-installation-guide
2022-10-11 19:27:31 +08:00
James O. D. Hunt
2b345ba29d build: Add kata-ctl to tools list
Update the top-level Makefile to build the `kata-ctl` tool by default.

Fixes: #4499, #5334.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-11 10:05:16 +01:00
James O. D. Hunt
f7010b8061 kata-ctl: docs: Write basic documentation
Provide a basic document explaining a little about the `kata-ctl`
command.

Fixes: #5351.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-11 10:04:48 +01:00
Bin Liu
ffdd7e1ad8 Merge pull request #4961 from wllenyj/dragonball-ut-2
Built-in Sandbox: add more unit tests for dragonball
2022-10-11 14:12:25 +08:00
Bin Liu
39702c19d5 Merge pull request #5276 from bergwolf/github/readme
readme: remove libraries mentioning
2022-10-11 13:19:18 +08:00
chmod100
862eaef863 docs: fix a typo in rust-runtime-installation-guide
Fixes: #5392

Signed-off-by: chmod100 <letfu@outlook.com>
2022-10-11 02:31:29 +00:00
wllenyj
26c043dee7 ci: Add dragonball test
Enhanced Static-Check of CI to support nested virtualization.

Fixes: #5378

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-11 00:36:20 +08:00
James O. D. Hunt
781e604c39 docs: Reference kata-ctl README
Add a link to the `kata-ctl` tool's README.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 16:49:53 +01:00
James O. D. Hunt
15c343cbf2 kata-ctl: Don't rely on system ssl libs
Build using the rust TLS implementation rather than the system ones.
This resolves the `reqwest` crate build failure: it doesn't appear to
build against the native libssl libraries due to Kata defaulting to
using the musl libc.

Fixes: #5387.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:51 +01:00
James O. D. Hunt
c23584994a kata-ctl: clippy: Resolve warnings and reformat
Resolved a couple of clippy warnings and applied standard `rustfmt`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:51 +01:00
David Esparza
133690434c kata-ctl: implement CLI argument --check-version-only
This kata-ctl argument returns the latest stable Kata
release by hitting github.com.
Adds check-version unit tests.

Fixes: #11

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2022-10-10 13:42:51 +01:00
David Esparza
eb5423cb7f kata-ctl: switch to use clap derive for CLI handling
Switch from the functional version of `clap` to the declarative
methodology.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:51 +01:00
Chelsea Mafrica
018aa899cb kata-ctl: Add cpu check
Add architecture-specific code for x86_64 and generic calls handling
checks for CPU flags and attributes.

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-10 13:42:50 +01:00
James O. D. Hunt
7c9f9a5a1d kata-ctl: Make arch test run at compile time
Changed the `panic!()` call to a `compile_error!()` one to ensure it
fires at compile time rather than runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:50 +01:00
James O. D. Hunt
b63ba66dc3 kata-ctl: Formatting tweaks
Automatic format updates.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:50 +01:00
James O. D. Hunt
cca7e32b54 kata-ctl: Lint fixes to allow the branch to be built
Remove return value for branches that call `unimplemented!()`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:50 +01:00
Chelsea Mafrica
8e7bb8521c kata-ctl: add code for framework for arch
Add framework for different architectures for check. In the existing
kata-runtime check, the network checks do not appear to be
architecture-specific while the kernel module, cpu, and kvm checks do
have separate implementations for different architectures.

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-10 13:42:50 +01:00
David Esparza
303fc8b118 kata-ctl: Add unit tests cases
Add more unit test cases for the --version argument.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:43 +01:00
David Esparza
d0b33e9a32 versions: Add kata-ctl version entry
As we're switching to using the rust version of the kata-ctl, let's
provide it with its own entry in the kata-ctl command line.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:35 +01:00
Chelsea Mafrica
002b18054d kata-ctl: Add initial rust code for kata-ctl
Use agent-ctl tool rust code as an example for a skeleton for the new
kata-ctl tool.

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-10 10:10:37 +01:00
wllenyj
b62b18bf1c dragonball: fix clippy warning
Fixed:
- unnecessary_lazy_evaluations
- derive_partial_eq_without_eq
- redundant_closure
- single_match
- question_mark
- unused-must-use
- redundant_clone
- needless_return

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:40 +08:00
wllenyj
2ddc948d30 Makefile: add dragonball components.
Enable ci to run dragonball unit tests.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:40 +08:00
wllenyj
3fe81fe4ab dragonball-ut: use skip_if_not_root to skip root case
Use skip_if_not_root to skip when unit test requires privileges.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:40 +08:00
wllenyj
72259f101a dragonball: add more unit test for vmm actions
Added more unit tests for vmm actions.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:39 +08:00
Peng Tao
acd72c44d4 Merge pull request #5380 from bergwolf/3.1.0-alpha0-branch-bump
# Kata Containers 3.1.0-alpha0
2022-10-09 16:16:36 +08:00
Chao Wu
9717dc3f75 Dragonball: remove redundant comments in event manager
handle_events for EventManager doesn't take max_events as arguments, so
we need to update the comments for it.

p.s. max_events is defined when initializing the EventManager.

fixes: #5382

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-09 14:38:12 +08:00
Peng Tao
ee74231b1c release: Kata Containers 3.1.0-alpha0
- libs/kata-types: adjust default_vcpus correctly
- runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const
- Enable ACRN hypervisor support for Kata 2.x release
- agent: reduce reference count for failed mount
- agent: don't exit early if signal fails due to ESRCH
- kata-sys-util: delete duplicated get_bundle_path
- packaging: Mount $HOME/.docker in the 1st layer container
- Upgrade to Cloud Hypervisor v27.0
- microvm: Remove kernel_irqchip=on option
- kata-sys-util: fix typo `unknow`
- dragonball: update ut for kernel config
- versions: Update gperf url to avoid libseccomp random failures
- versions: Update oci version
- dragonball: fix no "as_str" error on Arm
- tools: release: fix bogus version check
- runtime-rs: update Cargo.lock
- refactor(runtime-rs): Use RwLock in runtime-agent
- runtime-rs: fix shim close_io call to support kubectl cp
- runtime-rs: add comments for runtime-rs shared directory
- workflow: trigger test-kata-deploy with pull_request and fix workflow_dispatch
- Dragonball: update linux_loader to 0.6.0
- modify virtio_net_dev_mgr.rs wrong code comments
- docs: Update urls in runk documentation
- runtime-rs: support watchable mount
- runtime-rs: debug console support in runtime
- kata-deploy: ship the rustified runtime binary
- runtime-rs: define VFIO unbind path as a const
- runtime-rs: set agent timeout to 0 for stream RPCs
- Added SNP-Support for Kata-Containers
- packaging: fix typo in configure-hypervisor.sh
- runtime/runtime-rs: update dependency
- release: Revert kata-deploy changes after 3.0.0-rc0 release
- runtime-rs: add test for StaticResource
- runtime-rs: remove hardcoded string
- docs: add README for runtime-rs hypervisor crate
- runtime-rs: use Path.is_file to check regular files
- osbuilder: Export directory variables for libseccomp
- runtime-rs: add unit tests for network resource
- runtime-rs/resource: use macro to reduce duplicated code
- runtime-rs: fix incorrect comments
- kernel: Add crypto kernel config for s390
- Non-root hypervisor uid reuse bug
- Build-in Sandbox: update dragonball-sandbox dependencies
- docs: Update url in virtualization document
- dragonball: Fix problem that stdio console cannot connect to stdout
- runtime-rs: call TomlConfig's validate function after load
- feat(Shimmgmt): Shim management server and client

53f209af4 libs/kata-types: adjust default_vcpus correctly
ef5a2dc3b agent: don't exit early if signal fails due to ESRCH
435c8f181 acrn: Enable ACRN hypervisor support for Kata 2.x release
c31cf7269 agent: reduce reference count for failed mount
4da743f90 packaging: Mount $HOME/.docker in the 1st layer container
067e2b1e3 runtime: clh: Use the new API to boot with TDX firmware (td-shim)
5d63fcf34 runtime: clh: Re-generate the client code
fe6107042 versions: Upgrade to Cloud Hypervisor v27.0
17de94e11 microvm: Remove kernel_irqchip=on option
3aeaa6459 runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const
43ae97233 kata-sys-util: delete duplicated get_bundle_path
ac0483122 kata-sys-util: fix typo `unknow`
a24127659 versions: Update gperf url to avoid libseccomp random failures
a617a6348 versions: Update oci version
6d585d591 dragonball: fix no "as_str" error on Arm
421729f99 tools: release: fix bogus version check
457b0beaf runtime-rs: update Cargo.lock
f89ada2de dragonball: update ut for kernel config
0e899669e runtime-rs: fix shim close_io call to support kubectl cp
96cf21fad runtime-rs: add comments for runtime-rs shared directory
9bd941098 docs: Update urls in runk documentation
90ecc015e Dragonball: update linux_loader to 0.6.0
4a763925e runtime-rs: support watchable mount
abc26b00b dragonball: modify wrong code comments modify virtio_net_dev_mgr.rs wrong code comments
20bcaf0e3 runtime-rs: set agent timeout to 0 for stream RPCs
274de024c docs: add README for runtime-rs hypervisor crate
a4a23457c osbuilder: Export directory variables for libseccomp
d663f110d kata-deploy: get the config path from cri options
c6b3dcb67 kata-deploy: support kata-deploy for runtime-rs
46965739a runtime-rs: remove hardcoded string
a394761a5 kata-deploy: add installation for runtime-rs
50299a329 refactor(runtime-rs): Use RwLock in runtime agent
9628c7df0 runtime: update runc dependency
7fbc88387 runtime-rs: drop dependency on rustc-serialize
bf2be0cf7 release: Revert kata-deploy changes after 3.0.0-rc0 release
e23bfd615 runtime-rs: make function name more understandable
426a43678 runtime-rs: add unit test and eliminate raw string
87959cb72 runtime-rs: debug console support in runtime
d55cf9ab7 docs: Update url in virtualization document
0399da677 runtime-rs: update dependencies
f6f19917a dragonball: update dragonball-sandbox dependencies
2caee1f38 runtime-rs: define VFIO unbind path as a const
3f65ff2d0 runtime-rs: fix incorrect comments
9670a3caa runtime-rs: use Path.is_file to check regular files
d9e6eb11a docs: Guide to use SNP-VMs with Kata-Containers
ded60173d runtime: Enable choice between AMD SEV and SNP
22bda0838 runtime: Support for AMD SEV-SNP VMs
a2bbd2942 kernel: Introduce SNP kernel
0e69405e1 docs: Developer-Guide updated
105eda5b9 runtime: Initrd path option added to config
a8a8a28a3 runtime-rs/resource: use macro to reduce duplicated code
7622452f4 Dragonball: Fix the problem about stdio console
208233288 runtime-rs: add test for StaticResource
adb33a412 packaging: fix typo in configure-hypervisor.sh
f91431987 runtime: store the user name in hypervisor config
86a02c5f6 kernel: Add crypto kernel config for s390
5cafe2177 runtime: make StopVM thread-safe
c3015927a runtime: add more debug logs for non-root user operation
5add50aea runtime-rs: timeout for shim management client
9f13496e1 runtime-rs: shim management client
aaf6d6908 runtime-rs: call TomlConfig's validate function after load
e891295e1 runtime-rs: shim management - agent-url
59aeb776b runtime-rs: shim management
a828292b4 runtime-rs: add unit tests for network resource
7676cde0c workflow: trigger test-kata-deploy with pull_request
f10827357 workflow: require PR num input on test-kata-deploy workflow_dispatch
428d6dc80 workflow: Revert "workflow: trigger test-kata-deploy with pull_request"

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-09 11:50:42 +08:00
Peng Tao
102a9dda71 workflow: Revert "workflow: trigger test-kata-deploy with pull_request"
This reverts commit 7676cde0c5.
It turns out that when triggered from a PR, the docker login command is
failing with
```
Error: Cannot perform an interactive login from a non TTY device
```

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-09 11:50:42 +08:00
Fupan Li
2c88e1cd80 Merge pull request #5302 from liubin/fix/5285-SetFsSharingSupport-comment
runtime: fix incorrect comment for SetFsSharingSupport function
2022-10-09 09:40:31 +08:00
Bin Liu
b556c9b986 Merge pull request #5235 from YchauWang/wyc-qmp-log
virtcontainers: add warn log record for qmp hotplug cpu error
2022-10-09 08:29:09 +08:00
Bin Liu
07201c7fe5 Merge pull request #5111 from liubin/fix/5110-adjust-default-vcpus
libs/kata-types: adjust default_vcpus correctly
2022-10-08 20:29:53 +08:00
Bin Liu
53f209af44 libs/kata-types: adjust default_vcpus correctly
With default_maxvcpus = 0 and default_vcpus = 1, default_vcpus is
reset to 0 and the VM fails to start.

default_maxvcpus is not handled correctly when it is set to 0,
and default_vcpus ends up set to 0.

The correct behaviour is to set default_maxvcpus to the maximum number
of CPUs or MAX_DRAGONBALL_VCPUS, and to set default_vcpus to the
desired value if that value is between 0 and
default_maxvcpus.
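
The intended adjustment can be sketched as follows. This is an illustration only, not the kata-types code: `MAX_VCPUS_CAP` is a placeholder for MAX_DRAGONBALL_VCPUS (its real value is not stated here), and the fallbacks for out-of-range default_vcpus values are assumptions.

```python
import os

# Placeholder standing in for MAX_DRAGONBALL_VCPUS; the value is an assumption.
MAX_VCPUS_CAP = 256


def adjust_vcpus(default_vcpus, default_maxvcpus, host_cpus=None):
    """Return (default_vcpus, default_maxvcpus) after normalisation."""
    if host_cpus is None:
        host_cpus = os.cpu_count() or 1
    limit = min(host_cpus, MAX_VCPUS_CAP)

    # 0 (or anything out of range) means "use the maximum available".
    if default_maxvcpus <= 0 or default_maxvcpus > limit:
        default_maxvcpus = limit

    # Keep the requested value when it lies in (0, default_maxvcpus].
    if default_vcpus <= 0:
        default_vcpus = 1  # assumed fallback
    elif default_vcpus > default_maxvcpus:
        default_vcpus = default_maxvcpus  # assumed clamp
    return default_vcpus, default_maxvcpus
```

With default_maxvcpus = 0 and default_vcpus = 1 on an 8-CPU host, this yields 1 vCPU and a maximum of 8, rather than 0 vCPUs.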

Fixes: #5110

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-10-08 16:52:05 +08:00
Bin Liu
dd34540b8a Merge pull request #5305 from liubin/fix/5301-delete-duplicated-PASSTHROUGH_FS_DIR
runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const
2022-10-08 16:39:03 +08:00
Ji-Xinyou
9c1ac3d457 runtime-rs: return port on agent-url req
Add the server vport (1024) when requesting agent-url

Fixes: #5213
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-10-08 16:14:21 +08:00
Fabiano Fidêncio
ce73bc6dac Merge pull request #5015 from vijaydhanraj/enable_acrn_kata2.x
Enable ACRN hypervisor support for Kata 2.x release
2022-10-08 09:27:59 +02:00
Bin Liu
4616363eec Merge pull request #5365 from fengwang666/mount-bug-fix
agent: reduce reference count for failed mount
2022-10-08 14:27:38 +08:00
Fupan Li
1b7272c7ca Merge pull request #5367 from fengwang666/signal-bug-fix
agent: don't exit early if signal fails due to ESRCH
2022-10-08 14:21:50 +08:00
Feng Wang
ef5a2dc3bf agent: don't exit early if signal fails due to ESRCH
ESRCH usually means the process has exited. In this case,
the execution should continue to kill remaining container processes.
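
A minimal sketch of the behaviour described, written in Python rather than the agent's Rust. `signal_all` is a hypothetical helper; the `kill` parameter exists only so the loop can be exercised without real processes.

```python
import errno
import os


def signal_all(pids, sig, kill=os.kill):
    """Send `sig` to every pid, skipping processes that no longer exist."""
    signalled = []
    for pid in pids:
        try:
            kill(pid, sig)
        except OSError as err:
            if err.errno == errno.ESRCH:
                # The process has already exited: keep going with the rest.
                continue
            raise  # any other error is still fatal
        signalled.append(pid)
    return signalled
```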

Fixes: #5366

Signed-off-by: Feng Wang <feng.wang@databricks.com>
[Fix up cargo updates]
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-08 12:15:12 +08:00
Bin Liu
5ace4e2354 Merge pull request #5304 from liubin/fix/5299-delete-duplicated-get_bundle_path
kata-sys-util: delete duplicated get_bundle_path
2022-10-08 10:57:52 +08:00
Vijay Dhanraj
435c8f181a acrn: Enable ACRN hypervisor support for Kata 2.x release
Currently, ACRN hypervisor support in Kata 2.x releases is broken.
This commit re-enables ACRN hypervisor support and also refactors
the code to remove the dependency on Sandbox.

Fixes #3027

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
2022-10-07 07:40:32 -07:00
Feng Wang
c31cf7269e agent: reduce reference count for failed mount
The kata agent adds a reference for each storage object before mounting
it and skips the mount if the storage object is already known. We need
to remove the object reference if the mount fails.

Fixes: #5364

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-10-06 21:37:59 -07:00
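The reference handling described in the commit above could look roughly like this. It is a Python sketch under assumptions, not the agent's implementation: `add_storage`, the `refcounts` dictionary and the `do_mount` callable are hypothetical.

```python
def add_storage(refcounts, path, do_mount):
    """Mount `path` once; later callers only take a reference."""
    if path in refcounts:
        refcounts[path] += 1  # already known: skip the mount
        return
    refcounts[path] = 1  # the reference is taken before mounting
    try:
        do_mount(path)
    except Exception:
        # Drop the reference so a later retry attempts the mount again.
        del refcounts[path]
        raise
```

Without the rollback in the `except` branch, a failed mount would leave the object marked as known and every later request would skip the mount.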
Fabiano Fidêncio
ff62cedd26 Merge pull request #5323 from fidencio/topic/fix-kata-deploy-build-behind-proxy
packaging: Mount $HOME/.docker in the 1st layer container
2022-10-05 21:18:29 +02:00
Fabiano Fidêncio
4da743f90b packaging: Mount $HOME/.docker in the 1st layer container
In order to ensure that the proxy configuration is passed to the 2nd
layer container, let's ensure the $HOME/.docker/config.json file is
exposed inside the 1st layer container.

For some reason which I still don't fully understand exporting
https_proxy / http_proxy / no_proxy was not enough to get those
variables exported to the 2nd layer container.

In this commit we're creating a "$HOME/.docker" directory, and removing
it after the build, in case it doesn't exist yet.  The reason we do this
is to avoid docker not running in case "$HOME/.docker" doesn't exist.

This was not tested with podman, but if there's an issue with podman,
the issue was already there beforehand and should be treated as a
different problem than the one addressed in this commit.

Fixes: #5077

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-05 15:25:07 +02:00
Archana Shinde
6e2d39c588 Merge pull request #5311 from likebreath/0930/clh_v27.0
Upgrade to Cloud Hypervisor v27.0
2022-10-04 10:56:00 -07:00
Fabiano Fidêncio
d5572d5fd5 Merge pull request #5106 from norbjd/fix/microvm-machine-options
microvm: Remove kernel_irqchip=on option
2022-10-04 12:19:37 +02:00
Champ-Goblem
89e62d4edf shim: Ensure pagesize is set when reporting hugetbl stats
The containerd stats method and metrics API are broken with Kata 2.5.x: the stats fail to load and the metrics API responds with status code 500.

This seems to be down to the conversion of the stats reported by the agent RPC `StatsContainer`, where the field `Pagesize` is not
populated by the `setHugetlbStats` method. When stats for multiple page sizes are reported, this causes containerd to register two metrics
with the same label set, rather than each being partitioned by the `page` label.

Fixes: #5316
Signed-off-by: Champ-Goblem <cameron@northflank.com>
2022-10-04 09:16:30 +01:00
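To illustrate why the missing page size matters, here is a small Python sketch. It does not reproduce the shim's Go code or containerd's metric types; `hugetlb_metrics` and the dictionary layout are assumptions chosen for the example.

```python
def hugetlb_metrics(agent_stats):
    """Convert {pagesize: usage} into one labelled entry per page size."""
    metrics = [
        # Leaving "page" unset here would give every entry the same labels.
        {"page": pagesize, "usage": usage}
        for pagesize, usage in sorted(agent_stats.items())
    ]
    labels = [m["page"] for m in metrics]
    # A metrics registry rejects two entries with identical label sets.
    assert len(labels) == len(set(labels)), "duplicate label set"
    return metrics
```

Each entry carries its own `page` label, so entries for different page sizes stay distinct.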
Bo Chen
067e2b1e33 runtime: clh: Use the new API to boot with TDX firmware (td-shim)
The new way to boot from TDX firmware (e.g. td-shim) is using the
combination of '--platform tdx=on' with '--firmware tdshim'.

Fixes: #5309

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-03 10:30:54 -07:00
Bo Chen
5d63fcf344 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v27.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Fixes: #5309

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-03 10:30:42 -07:00
Bo Chen
fe61070426 versions: Upgrade to Cloud Hypervisor v27.0
This release has been tracked in our new [roadmap project](https://github.com/orgs/cloud-hypervisor/projects/6) as iteration v27.0.

**Community Engagement**
A new mailing list has been created to support broader community discussions.
Please consider [subscribing](https://lists.cloudhypervisor.org/g/dev/); a regular meeting will be
announced via this list shortly.

**Prebuilt Packages**
Prebuilt packages are now available. Please see this [document](https://github.com/cloud-hypervisor/obs-packaging/blob/main/README.md)
on how to install. These packages also include packages for the different
firmware options available.

**Network Device MTU Exposed to Guest**
The MTU for the TAP device associated with a virtio-net device is now exposed
to the guest. If the user provides a MTU with --net mtu=.. then that MTU is
applied to created TAP interfaces. This functionality is also exposed for
vhost-user-net devices including those created with the reference backend.

**Boot Tracing**
Support for generating a trace report for the boot time has been added
including a script for generating an SVG from that trace.

**Simplified Build Feature Flags**
The set of feature flags, for e.g. experimental features, have been simplified:

* mshv and kvm features provide support for those specific hypervisors
(with kvm enabled by default),
* tdx provides support for Intel TDX; and although there is no MSHV support
now it is now possible to compile with the mshv feature,
* tracing adds support for boot tracing,
* guest_debug now covers both support for gdbing a guest (formerly gdb
feature) and dumping guest memory.

The following feature flags were removed as the functionality was enabled by
default: amx, fwdebug, cmos and common.

**Asynchronous Kernel Loading**
AArch64 has gained support for loading the guest kernel asynchronously like
x86-64.

**GDB Support for AArch64**
GDB stub support (accessed through --gdb under guest_debug feature) is now
available on AArch64 as well as x86-64.

**Notable Bug Fixes**
* This version incorporates a version of virtio-queue that addresses an issue
where a rogue guest can potentially DoS the VMM,
* Improvements around PTY handling for virtio-console and serial devices,
* Improved error handling in virtio devices.

**Deprecations**
Deprecated features will be removed in a subsequent release and users should
plan to use alternatives.

* Booting legacy firmware (compiled without a PVH header) has been deprecated.
All the firmware options (Cloud Hypervisor OVMF and Rust Hypervisor Firmware)
support booting with PVH so support for loading firmware in a legacy mode is no
longer needed. This functionality will be removed in the next release.

Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v27.0

Note: To have the new API of loading firmware for booting (e.g. boot
from td-shim), a specific commit revision after the v27.0 release is
used as the Cloud Hypervisor version from the 'versions.yaml'.

Fixes: #5309

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-03 10:25:04 -07:00
Fabiano Fidêncio
0143036b84 Merge pull request #5303 from liubin/fix/5296-typo-unknow
kata-sys-util: fix typo `unknow`
2022-10-03 15:29:45 +02:00
norbjd
17de94e118 microvm: Remove kernel_irqchip=on option
The `kernel_irqchip` option doesn't seem to bring any benefits and, on the
contrary, its usage causes issues when using the microvm machine type.

With this in mind, let's remove it.

Fixes: #1984, #4386

Signed-off-by: norbjd <norbjd@users.noreply.github.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-03 11:48:05 +02:00
Bin Liu
3aeaa6459d runtime-rs: delete duplicated PASSTHROUGH_FS_DIR const
The const PASSTHROUGH_FS_DIR is defined twice; delete one.

Fixes: #5301

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-30 15:53:08 +08:00
Bin Liu
43ae972335 kata-sys-util: delete duplicated get_bundle_path
get_bundle_path is already defined in spec.rs;
delete it from fs.rs.

Fixes: #5299

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-30 15:50:58 +08:00
Bin Liu
ac04831223 kata-sys-util: fix typo unknow
Change `unknow` to `unknown`.

Fixes: #5296

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-30 15:47:34 +08:00
Bin Liu
68e8a86aec runtime: fix incorrect comment for SetFsSharingSupport function
The comment for SetFsSharingSupport is incorrect; fix the
function name in it.

Fixes: #5285

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-30 15:44:44 +08:00
Bin Liu
805e80b2a2 Merge pull request #5278 from openanolis/chao/update_linux_loader_ut
dragonball: update ut for kernel config
2022-09-30 11:12:29 +08:00
Bin Liu
357d323803 Merge pull request #5244 from GabyCT/topic/debugosbuilder
versions: Update gperf url to avoid libseccomp random failures
2022-09-30 10:10:54 +08:00
Bin Liu
8d4ced3c86 runtime-rs: support ephemeral storage for emptydir
Add support for ephemeral storage and k8s emptydir.

Depends-on:github.com/kata-containers/tests#5161

Fixes: #4730

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-30 09:10:20 +08:00
David Esparza
9b033f174b Merge pull request #5292 from GabyCT/topic/updateoci
versions: Update oci version
2022-09-29 16:29:11 -05:00
Greg Kurz
7b4c3c0cab Merge pull request #5288 from jongwu/fix_cmdline_arm
dragonball: fix no "as_str" error on Arm
2022-09-29 18:59:00 +02:00
Gabriela Cervantes
a241276592 versions: Update gperf url to avoid libseccomp random failures
This PR updates the gperf url to avoid random failures when installing
libseccomp, as it seems that the mirror url produces random network
failures in multiple CIs.

Fixes #5294

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-09-29 16:52:46 +00:00
Gabriela Cervantes
a617a63481 versions: Update oci version
This PR updates the oci version that we are using in kata containers.

Fixes #5291

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-09-29 15:32:48 +00:00
Jianyong Wu
6d585d5919 dragonball: fix no "as_str" error on Arm
The Cmdline struct was updated in the latest linux-loader lib and its
as_str method was changed to as_cstring, so we need to fix the places
where the old as_str method is still used.

Fixes: #5287
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-09-29 21:06:31 +08:00
Bin Liu
68f6dbb202 Merge pull request #5284 from gkurz/fix-release-script
tools: release: fix bogus version check
2022-09-29 20:46:11 +08:00
Greg Kurz
421729f991 tools: release: fix bogus version check
Shell expands `*"rc"*` to the top-level `src` directory. This results
in comparing a version with a directory name. This doesn't make sense
and causes the script to choose the wrong branch of the `if`.

The intent of the check is actually to detect `rc` in the version.

Fixes: #5283
Signed-off-by: Greg Kurz <groug@kaod.org>
2022-09-29 11:31:43 +02:00
Bin Liu
949ffcc457 Merge pull request #5281 from liubin/fix/5280-update-cargo-lock
runtime-rs: update Cargo.lock
2022-09-29 17:16:21 +08:00
Bin Liu
1352e31180 Merge pull request #5200 from openanolis/agent_rwlock
refactor(runtime-rs): Use RwLock in runtime-agent
2022-09-29 13:15:41 +08:00
Bin Liu
457b0beaf0 runtime-rs: update Cargo.lock
src/dragonball/Cargo.toml was updated, but the Cargo.lock was not
committed to the repo.

Fixes: #5280

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-29 13:15:01 +08:00
Bin Liu
abbdf89a06 Merge pull request #5271 from liubin/fix/4729-add-close-io-for-kubectl-cp
runtime-rs: fix shim close_io call to support kubectl cp
2022-09-29 13:10:49 +08:00
Peng Tao
046ddc6463 readme: remove libraries mentioning
The rust libraries are mentioned twice in README.md.
Let's just remove both mentions, as the section is intended to list core
Kata components rather than general libraries.

Fixes: #5275
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-29 12:10:50 +08:00
Chao Wu
f89ada2de1 dragonball: update ut for kernel config
Since linux-loader was updated in Dragonball and the API for Cmdline
has changed (as_str() became as_cstring()), we need to update the
unit tests in Dragonball.

fixes: #5277

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-09-29 11:35:45 +08:00
Bin Liu
0e899669ee runtime-rs: fix shim close_io call to support kubectl cp
Add close_io to shim and call agent's close_stdin in close_io.

Depends-on:github.com/kata-containers/tests#5155

Fixes: #4729

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-29 09:35:17 +08:00
quanweiZhou
901893163f Merge pull request #5198 from openanolis/share-fs-comment
runtime-rs: add comments for runtime-rs shared directory
2022-09-29 09:12:01 +08:00
Greg Kurz
7294e2fa9e Merge pull request #4387 from snir911/tmp-workflow-main
workflow: trigger test-kata-deploy with pull_request and fix workflow_dispatch
2022-09-28 16:42:51 +02:00
Zhongtao Hu
96cf21fad0 runtime-rs: add comments for runtime-rs shared directory
add comments for runtime-rs shared directory

Fixes:#5197
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-09-28 15:46:34 +08:00
Zhongtao Hu
2f1a4b02ee Merge pull request #5254 from openanolis/chao/update_linux_loader
Dragonball: update linux_loader to 0.6.0
2022-09-28 15:04:09 +08:00
Bin Liu
0f6884b8c3 Merge pull request #5252 from zhaoxuat/main
modify virtio_net_dev_mgr.rs wrong code comments
2022-09-28 11:34:20 +08:00
Bin Liu
d0be4a285e Merge pull request #5260 from GabyCT/topic/fixrunkdoc
docs: Update urls in runk documentation
2022-09-28 11:30:39 +08:00
Zhongtao Hu
ff053b0808 Merge pull request #5220 from liubin/fix/5184-rs-inotify
runtime-rs: support watchable mount
2022-09-28 11:19:53 +08:00
Zhongtao Hu
319caa8e74 Merge pull request #5097 from openanolis/dbg-console
runtime-rs: debug console support in runtime
2022-09-28 10:30:22 +08:00
Peng Tao
33b0720119 Merge pull request #5193 from openanolis/origin/kata-deploy
kata-deploy: ship the rustified runtime binary
2022-09-28 10:19:16 +08:00
Gabriela Cervantes
9bd941098e docs: Update urls in runk documentation
This PR updates the urls that we have in the runk documentation.

Fixes #5259

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-09-27 15:45:43 +00:00
Chao Wu
90ecc015e0 Dragonball: update linux_loader to 0.6.0
Since linux-loader 0.4.0 and 0.5.0 were yanked due to a null terminator bug,
we need to update linux-loader to 0.6.0.

The uses of the as_str() function need to be changed as well.

fixes: #5253

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-09-27 23:01:44 +08:00
Bin Liu
c64e56327f Merge pull request #5190 from liubin/fix/5189-unbind-as-a-const
runtime-rs: define VFIO unbind path as a const
2022-09-27 21:04:18 +08:00
Bin Liu
4a763925e5 runtime-rs: support watchable mount
Use watchable mount to support inotify for virtio-fs.

Fixes: #5184

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-27 19:08:25 +08:00
zhaoxu
abc26b00bb dragonball: modify wrong code comments
modify virtio_net_dev_mgr.rs wrong code comments

Fixes: #5252

Signed-off-by: zhaoxu <zhaoxu@megvii.com>
2022-09-27 18:32:13 +08:00
Bin Liu
c95cf6dce7 Merge pull request #5250 from liubin/fix/5249-set-timeout-to-zero-for-stream-rpc
runtime-rs: set agent timeout to 0 for stream RPCs
2022-09-27 17:39:35 +08:00
Peng Tao
8a2df6b31c Merge pull request #4931 from jpecholt/snp-support
Added SNP-Support for Kata-Containers
2022-09-27 14:17:54 +08:00
Bin Liu
41a3bd87a5 Merge pull request #5161 from liubin/fix/5160-typo-in-configure-hypervisor-sh
packaging: fix typo in configure-hypervisor.sh
2022-09-27 13:03:39 +08:00
Bin Liu
20bcaf0e36 runtime-rs: set agent timeout to 0 for stream RPCs
For stream RPCs:
- write_stdin
- read_stdout
- read_stderr

there should be no timeout (by setting it to 0).

Fixes: #5249

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-27 11:47:37 +08:00
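The rule from the commit above fits in a few lines. The sketch below is Python, not the runtime-rs agent client; `rpc_timeout` and `STREAM_RPCS` are hypothetical names, and only the three RPC names come from the commit message.

```python
STREAM_RPCS = {"write_stdin", "read_stdout", "read_stderr"}


def rpc_timeout(method, default_timeout):
    """Return the timeout for an agent RPC; 0 means 'no timeout'."""
    # Stream RPCs may block for as long as the process produces or
    # consumes data, so they must not be cut off by the default timeout.
    return 0 if method in STREAM_RPCS else default_timeout
```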
Bin Liu
407e46b1b7 Merge pull request #5218 from bergwolf/github/deps
runtime/runtime-rs: update dependency
2022-09-27 11:02:46 +08:00
Bin Liu
414c6a1578 Merge pull request #5175 from bergwolf/revert-kata-deploy-changes-after-3.0.0-rc0-release
release: Revert kata-deploy changes after 3.0.0-rc0 release
2022-09-27 11:02:24 +08:00
Bin Liu
a2f207b923 Merge pull request #5163 from liubin/fix/5162-add-test-for-StaticResource
runtime-rs: add test for StaticResource
2022-09-26 17:44:20 +08:00
Zhongtao Hu
9d67f5a7e2 Merge pull request #5230 from openanolis/nohc
runtime-rs: remove hardcoded string
2022-09-26 16:01:41 +08:00
quanweiZhou
ad87c7ac56 Merge pull request #5206 from openanolis/hypervisor/readme
docs: add README for runtime-rs hypervisor crate
2022-09-26 16:01:12 +08:00
Bin Liu
5a98fb8d2b Merge pull request #5186 from liubin/fix/5185
runtime-rs: use Path.is_file to check regular files
2022-09-26 12:33:47 +08:00
GabyCT
f7f05f238e Merge pull request #5233 from GabyCT/topic/exportlibseccomp
osbuilder: Export directory variables for libseccomp
2022-09-23 13:54:14 -05:00
Zhongtao Hu
4a36bb9e21 Merge pull request #4924 from openanolis/runtime-rs-netUT
runtime-rs: add unit tests for network resource
2022-09-23 17:45:24 +08:00
Zhongtao Hu
274de024c5 docs: add README for runtime-rs hypervisor crate
add README for runtime-rs hypervisor crate

Fixes:#4634
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-09-23 15:20:02 +08:00
Chao Wu
9cf5de0b4e Merge pull request #5171 from liubin/fix/5170-use-macro
runtime-rs/resource: use macro to reduce duplicated code
2022-09-23 10:59:53 +08:00
wangyongchao.bj
04bbce8dc3 virtcontainers: add warn log record for qmp hotplug cpu error
The error returned by a failed QMP CPU hotplug command was hidden, which
made it hard for users to trace CPU hotplug failures. This PR improves the
CPU hotplug error log by adding the real QEMU command error to the
`failed to hot add vCPUs` message, so that the reason the QMP command
failed can be read from the error message.

Fixes: #5234

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2022-09-23 08:22:30 +08:00
Gabriela Cervantes
a4a23457ca osbuilder: Export directory variables for libseccomp
To avoid random failures when building the rootfs, which seem to happen
because the values of the libseccomp and gperf directories are not found,
this PR exports these variables.

Fixes #5232

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-09-22 21:45:20 +00:00
Chelsea Mafrica
de869f2565 Merge pull request #5188 from liubin/fix/5187-incorrect-comments-in-kata-types-hypervisor
runtime-rs: fix incorrect comments
2022-09-22 14:09:20 -07:00
Zhongtao Hu
d663f110d7 kata-deploy: get the config path from cri options
get the config path for runtime-rs from cri options

Fixes: #5000
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-09-22 17:39:25 +08:00
Zhongtao Hu
c6b3dcb67d kata-deploy: support kata-deploy for runtime-rs
support kata-deploy for runtime-rs

Fixes:#5000
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-09-22 17:39:20 +08:00
Ji-Xinyou
46965739a4 runtime-rs: remove hardcoded string
Use KATA_PATH instead of "run/kata"

Fixes: #5229
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-22 16:06:51 +08:00
Zhongtao Hu
a394761a5c kata-deploy: add installation for runtime-rs
setup the compile environment and installation path for the Rust runtime

Fixes:#5000
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-09-22 15:59:44 +08:00
Peng Tao
ce22a9f134 Merge pull request #5159 from BbolroC/s390-config
kernel: Add crypto kernel config for s390
2022-09-22 15:36:24 +08:00
Peng Tao
a2c13bad45 Merge pull request #5156 from fengwang666/uid-reuse-bug
Non-root hypervisor uid reuse bug
2022-09-22 15:35:39 +08:00
Peng Tao
af174c2b6d Merge pull request #5195 from wllenyj/update-dbs
Build-in Sandbox: update dragonball-sandbox dependencies
2022-09-22 15:07:11 +08:00
Ji-Xinyou
50299a3292 refactor(runtime-rs): Use RwLock in runtime agent
Use RwLock for Agent in runtime, for better concurrency.

Fixes: #5199
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-21 17:43:40 +08:00
Peng Tao
9628c7df0c runtime: update runc dependency
To bring in the fix for CVE-2022-29162.

Fixes: #5217
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-21 17:21:37 +08:00
Peng Tao
7fbc883879 runtime-rs: drop dependency on rustc-serialize
We are not using it and it hasn't got any updates for more than five
years, leaving open CVEs unresolved.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-21 17:19:58 +08:00
Peng Tao
bf2be0cf7a release: Revert kata-deploy changes after 3.0.0-rc0 release
As 3.0.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-21 15:19:38 +08:00
snir911
cb977c04bd Merge pull request #5204 from GabyCT/topic/updatevirt
docs: Update url in virtualization document
2022-09-21 10:05:13 +03:00
Ji-Xinyou
e23bfd615e runtime-rs: make function name more understandable
Change kparams to kernel_params for understandability.

Fixes: #5068
Signed-Off-By: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-21 11:48:11 +08:00
Ji-Xinyou
426a436780 runtime-rs: add unit test and eliminate raw string
Add two unit tests for coverage and eliminate raw strings to constant.

Fixes: #5068
Signed-Off-By: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-21 11:47:07 +08:00
Ji-Xinyou
87959cb72d runtime-rs: debug console support in runtime
Read debug console configuration in kernel params.

Fixes: #5068
Signed-Off-By: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-21 11:46:55 +08:00
Bin Liu
a2e7434a0f Merge pull request #5082 from QiliangFan/main
dragonball: Fix problem that stdio console cannot connect to stdout
2022-09-21 11:12:19 +08:00
Gabriela Cervantes
d55cf9ab71 docs: Update url in virtualization document
This PR updates the url for the cloud hypervisor in the virtualization
document.

Fixes #5203

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-09-20 16:52:24 +00:00
wllenyj
0399da677d runtime-rs: update dependencies
Updated Cargo.lock.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-09-20 15:00:14 +08:00
wllenyj
f6f19917a8 dragonball: update dragonball-sandbox dependencies
Updated vmm-sys-util to 0.10.0
Updated virtio-queue to 0.4.0
Updated vm-memory to 0.9.0
Updated linux-loader to 0.5.0

Fixes: #5194

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-09-20 14:48:09 +08:00
Zhongtao Hu
e05e42fd3c Merge pull request #5113 from liubin/fix/5112-call-TomlConfig-validate-func
runtime-rs: call TomlConfig's validate function after load
2022-09-20 14:38:42 +08:00
Zhongtao Hu
fc65e96ad5 Merge pull request #5133 from openanolis/shimmgmt
feat(Shimmgmt): Shim management server and client
2022-09-20 14:37:19 +08:00
Bin Liu
2caee1f38d runtime-rs: define VFIO unbind path as a const
In src/runtime-rs/crates/hypervisor/src/device/vfio.rs,
the path of new_id is defined as a const, but the path of unbind is
a local variable; both should be defined as consts.

Fixes: #5189

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-19 16:08:35 +08:00
Bin Liu
3f65ff2d07 runtime-rs: fix incorrect comments
Some comments for types are incorrect in file
 src/libs/kata-types/src/config/hypervisor/mod.rs

Fixes: #5187

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-19 16:03:06 +08:00
Bin Liu
9670a3caac runtime-rs: use Path.is_file to check regular files
Use Path.is_file to replace using `stat` to check the file type.

Fixes: #5185

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-19 15:57:07 +08:00
Joana Pecholt
d9e6eb11ae docs: Guide to use SNP-VMs with Kata-Containers
The guide describes how to set Kata-Containers up so that AMD SEV-SNP
encrypted VMs are used when deploying confidential containers.

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-09-16 17:51:41 +02:00
Joana Pecholt
ded60173d4 runtime: Enable choice between AMD SEV and SNP
This is based on a patch from @niteeshkd that adds a config
parameter to choose between AMD SEV and SEV-SNP VMs as the
confidential guest type in case both types are supported. SEV is
the default.

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-09-16 17:51:41 +02:00
Joana Pecholt
22bda0838c runtime: Support for AMD SEV-SNP VMs
This commit adds AMD SEV-SNP as a confidential guest option to the
runtime. Information on required components such as OVMF, QEMU and
a kernel supporting SEV-SNP are defined in the versions file and
corresponding configs are added.

Note: The CPU model 'host' provided by the current SNP-QEMU does
not support all SNP capabilities yet, which is why this option is
changed to EPYC-v4.

Note: The guest's physical address space reduction specified with
ReducedPhysBits is 1. Details can be found in Section 15.34.6
here https://www.amd.com/system/files/TechDocs/24593.pdf

Fixes #4437

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-09-16 17:51:41 +02:00
Joana Pecholt
a2bbd29422 kernel: Introduce SNP kernel
This introduces the SNP kernel as a confidential computing guest.

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-09-16 17:51:41 +02:00
Joana Pecholt
0e69405e16 docs: Developer-Guide updated
Developer-Guide.md is updated to work using current golang versions.
Related Readmes are also updated.

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-09-16 17:51:41 +02:00
Joana Pecholt
105eda5b9a runtime: Initrd path option added to config
Adds initrd configuration option to the configuration.toml that is
generated for the setup using QEMU.

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-09-16 17:51:41 +02:00
Tim Zhang
32a9d6d66d Merge pull request #5174 from bergwolf/3.0.0-rc0-branch-bump
# Kata Containers 3.0.0-rc0
2022-09-16 16:59:55 +08:00
Peng Tao
583591099d release: Kata Containers 3.0.0-rc0
- runtime-rs: delete some allow(dead_code) attributes
- kata-types: don't check virtio_fs_daemon for inline-virtio-fs
- kata-types: change return type of getting CPU period/quota function
- runtime-rs: fix host device check pattern
- runtime-rs: remove meaningless comment
- runtime-rs: update rust runtime roadmap
- runk: Enable seccomp support by default
- config: add "inline-virtio-fs" as a "shared_fs" type
- runtime-rs: add README.md
- runk: Refactor container builder
- kernel: fix kernel tarball name for SEV
- libs/kata-types: replace tabs by spaces in comments
- gperf: point URL to mirror site

be242a3c3 release: Adapt kata-deploy for 3.0.0-rc0
156e1c324 runtime-rs: delete some allow(dead_code) attributes
62cf6e6fc runtime-rs: remove meaningless comment
bcf6bf843 runk: Enable seccomp support by default
2b1d05857 runtime-rs: fix host device check pattern
85b49cee0 runtime-rs: add README.md
36d805fab config: add "inline-virtio-fs" as a "shared_fs" type
b948a8ffe kernel: fix kernel tarball name for SEV
50f912615 libs/kata-types: replace tabs by spaces in comments
96c8be715 libs/kata-types: change return type of getting CPU period/quota
fc9c6f87a kata-types: don't check virtio_fs_daemon for inline-virtio-fs
968c2f6e8 runk: Refactor container builder
84268f871 runtime-rs: update rust runtime roadmap
566656b08 gperf: point URL to mirror site

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-16 03:53:44 +00:00
Peng Tao
be242a3c3c release: Adapt kata-deploy for 3.0.0-rc0
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-16 03:53:43 +00:00
Bin Liu
a8a8a28a34 runtime-rs/resource: use macro to reduce duplicated code
Some device types have the same definition, so they can be implemented
with a macro to reduce code.

This commit also deletes the `peer_name` field of the structs, which
was never used.

Fixes: #5170

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-15 15:45:26 +08:00
Bin Liu
be22e8408d Merge pull request #5165 from liubin/fix/5164-remove-dead_code
runtime-rs: delete some allow(dead_code) attributes
2022-09-15 09:32:10 +08:00
Bin Liu
156e1c3247 runtime-rs: delete some allow(dead_code) attributes
Some #![allow(dead_code)] attributes and some code are not actually needed.

Fixes: #5164

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-14 20:50:30 +08:00
qiliangfan
7622452f4b Dragonball: Fix the problem about stdio console
Let the stdout stream connect to the com1_device.

Fixes: #5083

Signed-off-by: qiliangfan <fanqiliang@mail.nankai.edu.cn>
2022-09-14 15:53:57 +08:00
Bin Liu
208233288a runtime-rs: add test for StaticResource
Add a test case for StaticResource; the old test does not
cover the StaticResource struct.

Fixes: #5162

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-14 11:45:07 +08:00
Bin Liu
adb33a4121 packaging: fix typo in configure-hypervisor.sh
`powwer` is a typo of `power`, and many spaces should
be replaced by tabs for indent.

Fixes: #5160

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-14 11:38:01 +08:00
Feng Wang
f914319874 runtime: store the user name in hypervisor config
The user name will be used to delete the user instead of relying on
uid lookup because uid can be reused.

Fixes: #5155

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-09-13 10:32:55 -07:00
Hyounggyu Choi
86a02c5f6a kernel: Add crypto kernel config for s390
This config update supports new crypto algorithms for s390.

Fixes: #5158

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2022-09-13 18:13:57 +02:00
Feng Wang
5cafe21770 runtime: make StopVM thread-safe
StopVM can be invoked by multiple threads and needs to be thread-safe

Fixes: #5155

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-09-12 21:56:15 -07:00
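The commit above does not say how StopVM was made thread-safe. One common approach, shown here purely as an assumption, is to guard the stop path with a lock and a "stopped" flag. The Python class `VM` and its fields are hypothetical and are not taken from the Go runtime.

```python
import threading


class VM:
    """Hypothetical VM handle whose stop() is safe to call concurrently."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stopped = False
        self.stop_calls = 0  # counts how often the real teardown ran

    def stop(self):
        with self._lock:
            if self._stopped:
                return  # another caller already stopped the VM
            self.stop_calls += 1  # stand-in for the real teardown work
            self._stopped = True
```

However many threads call `stop()`, the teardown body runs once and later callers return immediately.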
Feng Wang
c3015927a3 runtime: add more debug logs for non-root user operation
Previously the logging was insufficient and made debugging difficult

Fixes: #5155

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-09-12 21:38:57 -07:00
Bin Liu
a58feba9bb Merge pull request #5105 from liubin/fix/5104-ignore-virtiofs-daemon-for-inline-mode
kata-types: don't check virtio_fs_daemon for inline-virtio-fs
2022-09-13 10:33:56 +08:00
Bin Liu
42d4da9b6c Merge pull request #5101 from liubin/fix/5100-cpu-period-quota-data-type
kata-types: change return type of getting CPU period/quota function
2022-09-13 10:33:29 +08:00
Tim Zhang
8ec4edcf4f Merge pull request #5146 from liubin/fix/5145-check-host-dev
runtime-rs: fix host device check pattern
2022-09-13 10:33:05 +08:00
Tim Zhang
447521c6da Merge pull request #5151 from liubin/fix/5150-remove-comment
runtime-rs: remove meaningless comment
2022-09-13 10:32:53 +08:00
Bin Liu
2f830c09a3 Merge pull request #5073 from openanolis/update
runtime-rs: update rust runtime roadmap
2022-09-13 10:32:25 +08:00
Bin Liu
62cf6e6fc3 runtime-rs: remove meaningless comment
The comment for the `generate_mount_path` function is a copy-paste
mistake and should be deleted.

Fixes: #5150

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-09 16:07:35 +08:00
Bin Liu
55f4f3a95b Merge pull request #4897 from ManaSugi/runk/enable-seccomp
runk: Enable seccomp support by default
2022-09-09 14:11:35 +08:00
Manabu Sugimoto
bcf6bf843c runk: Enable seccomp support by default
Enable seccomp support in `runk` by default.
Due to this, `runk` is built with `gnu libc` by default,
because building `runk` statically linked against `libseccomp`
and `musl` requires additional configuration.
Also, general container runtimes are built with `gnu libc` as
dynamically linked binaries by default.
The user can disable seccomp by `make SECCOMP=no`.

Fixes: #4896

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-09-09 10:55:16 +09:00
GabyCT
be462baa7e Merge pull request #5103 from liubin/fix/5102-add-inline-virtiofs-config
config: add "inline-virtio-fs" as a "shared_fs" type
2022-09-08 10:33:20 -05:00
GabyCT
bcbce8317d Merge pull request #5061 from liubin/fix/5022-runtime-rs-readme
runtime-rs: add README.md
2022-09-08 10:32:08 -05:00
bin liu
2b1d058572 runtime-rs: fix host device check pattern
Host device paths should be matched against the prefix `/dev/` rather than `/dev`.

Fixes: #5145

Signed-off-by: bin liu <liubin0329@gmail.com>
2022-09-08 22:44:46 +08:00
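The difference between the two prefixes in the commit above is easy to show. This Python sketch is not the runtime-rs check, which may apply further conditions; `is_host_device` is a hypothetical name.

```python
def is_host_device(path):
    """Return True only for paths located under the /dev/ directory."""
    # Matching "/dev" alone would also accept unrelated paths such as
    # "/devices/foo" or "/dev-backup/disk".
    return path.startswith("/dev/")
```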
Bin Liu
85b49cee02 runtime-rs: add README.md
Add README.md for runtime-rs.

Fixes: #5022

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-08 16:03:45 +08:00
Bin Liu
7cfc357c6e Merge pull request #5034 from ManaSugi/runk/refactor-container-builder
runk: Refactor container builder
2022-09-08 11:30:07 +08:00
Ji-Xinyou
5add50aea2 runtime-rs: timeout for shim management client
Let the client side apply a timeout when a timeout value is set.
If no timeout is set, the request is executed directly.

Fixes: #5114
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-08 11:11:33 +08:00
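The "timeout if set, otherwise execute directly" behaviour described in commit 5add50aea2 can be sketched with the standard library alone. This is an assumption-laden illustration: the function name `call_with_optional_timeout` is hypothetical, and it uses a thread and a channel instead of whatever HTTP client runtime-rs actually uses.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `request` and returns its result. With `Some(limit)` the caller
/// stops waiting after `limit`; with `None` the request runs directly.
/// Hypothetical helper, not the runtime-rs API.
pub fn call_with_optional_timeout<T, F>(
    request: F,
    timeout: Option<Duration>,
) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    match timeout {
        // No timeout configured: execute on the calling thread.
        None => Ok(request()),
        Some(limit) => {
            let (tx, rx) = mpsc::channel();
            thread::spawn(move || {
                // The receiver may already be gone if the caller timed out.
                let _ = tx.send(request());
            });
            rx.recv_timeout(limit).map_err(|err| match err {
                mpsc::RecvTimeoutError::Timeout => {
                    format!("request timed out after {:?}", limit)
                }
                mpsc::RecvTimeoutError::Disconnected => {
                    "request ended without a result".to_string()
                }
            })
        }
    }
}
```

In this sketch a timed-out request keeps running in its background thread; only the caller stops waiting for it.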
Bin Liu
36d805fab9 config: add "inline-virtio-fs" as a "shared_fs" type
"inline-virtio-fs" is newly supported by Kata 3.0 as a "shared_fs" type,
so it should be described in the configuration file.

"inline-virtio-fs" is the same as "virtio-fs", but it runs in
the same process as the shim and does not need an external virtiofsd process.

Fixes: #5102

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-08 11:05:01 +08:00
Fabiano Fidêncio
5793685a4b Merge pull request #5095 from ryansavino/sev-kernel-build-fix
kernel: fix kernel tarball name for SEV
2022-09-07 17:50:17 +02:00
Bin Liu
5df6ff991d Merge pull request #5116 from liubin/fix/5115-replace-tab-by-space
libs/kata-types: replace tabs by spaces in comments
2022-09-07 15:53:34 +08:00
Ji-Xinyou
9f13496e13 runtime-rs: shim management client
Add a public client-side function to establish HTTP connections (PUT,
POST, GET) to the long-running shim management server.

Fixes: #5114
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-07 15:39:14 +08:00
Fabiano Fidêncio
e94d38c97b Merge pull request #5058 from ryansavino/gperf-url-fix
gperf: point URL to mirror site
2022-09-07 09:25:13 +02:00
Bin Liu
aaf6d69089 runtime-rs: call TomlConfig's validate function after load
Call TomlConfig's validate function after it is loaded and
adjusted by annotations.

Fixes: #5112

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-07 11:34:08 +08:00
Bin Liu
fe55f6afd7 Merge pull request #5124 from amshinde/revert-arp-neighbour-api
Revert arp neighbour api
2022-09-07 11:14:53 +08:00
Ji-Xinyou
e891295e10 runtime-rs: shim management - agent-url
Add agent-url to its handler. The general framework of registering URL
handlers is done.

Fixes: #5114
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-07 11:13:21 +08:00
Chelsea Mafrica
051dabb0fe Merge pull request #5099 from liubin/fix/5098-add-default-config-for-runtime-rs
runtime-rs: add default agent/runtime/hypervisor for configuration
2022-09-06 17:49:42 -07:00
Archana Shinde
d23779ec9b Revert "agent: fix unittests for arp neighbors"
This reverts commit 81fe51ab0b.
2022-09-06 15:41:42 -07:00
Archana Shinde
d340564d61 Revert "agent: use rtnetlink's neighbours API to add neighbors"
This reverts commit 845c1c03cf.

Fixes: #5126
2022-09-06 15:41:42 -07:00
Archana Shinde
188d37badc kata-deploy: Add debug statement
Adding this so that we can see the status of running pods in
case of failure.

Fixes: #5126

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-09-06 15:41:14 -07:00
Ryan Savino
b948a8ffe6 kernel: fix kernel tarball name for SEV
'linux-' prefix needed for tarball name in SEV case. Output to same file name.

Fixes: #5094

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-09-06 11:04:29 -05:00
Bin Liu
50f9126153 libs/kata-types: replace tabs by spaces in comments
Replace tabs by spaces in the comments of file
libs/kata-types/src/annotations/mod.rs.

Fixes: #5115

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-06 17:32:57 +08:00
Ji-Xinyou
59aeb776b0 runtime-rs: shim management
Add shim management http server and boot it as a light-weight thread
when the sandbox is created.

Fixes: #5114
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-06 16:44:16 +08:00
Bin Liu
96c8be715b libs/kata-types: change return type of getting CPU period/quota
The period should be a u64 and the quota an i64, so the functions
that read the CPU period and quota from annotations should
return those same data types.

Fixes: #5100

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-06 11:35:52 +08:00
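A short sketch of why the two return types in commit 96c8be715b differ: a CPU period is never negative, while a quota of -1 conventionally means "no limit" in the Linux CFS bandwidth controller. The function names and the annotation keys passed by the caller are hypothetical, not the kata-types API.

```rust
use std::collections::HashMap;

/// Reads a CPU period annotation. A period is never negative, hence `u64`.
/// Hypothetical helper; the key is supplied by the caller.
pub fn get_cpu_period(
    annotations: &HashMap<String, String>,
    key: &str,
) -> Result<Option<u64>, String> {
    annotations
        .get(key)
        .map(|v| {
            v.parse::<u64>()
                .map_err(|e| format!("invalid CPU period {:?}: {}", v, e))
        })
        .transpose()
}

/// Reads a CPU quota annotation. A quota may be -1 (unlimited), hence `i64`.
pub fn get_cpu_quota(
    annotations: &HashMap<String, String>,
    key: &str,
) -> Result<Option<i64>, String> {
    annotations
        .get(key)
        .map(|v| {
            v.parse::<i64>()
                .map_err(|e| format!("invalid CPU quota {:?}: {}", v, e))
        })
        .transpose()
}
```

Both helpers return `Ok(None)` when the annotation is absent, so callers can tell "not set" apart from "set but invalid".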
Bin Liu
fc9c6f87a3 kata-types: don't check virtio_fs_daemon for inline-virtio-fs
If the shared_fs is set to "inline-virtio-fs", the "virtio_fs_daemon"
should be ignored.

Fixes: #5104

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-05 17:44:28 +08:00
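The rule from commit fc9c6f87a3 can be sketched as a small validation function. The struct and function names are hypothetical, and only the two `shared_fs` values named in these commits are handled; the real configuration supports more fields and possibly more types.

```rust
/// Hypothetical, trimmed-down view of the shared filesystem settings.
pub struct SharedFsConfig {
    pub shared_fs: String,
    pub virtio_fs_daemon: String,
}

/// Validates the settings; `virtio_fs_daemon` is ignored for
/// "inline-virtio-fs" because no external daemon is started.
pub fn validate_shared_fs(cfg: &SharedFsConfig) -> Result<(), String> {
    match cfg.shared_fs.as_str() {
        // Runs inside the shim process, so the daemon path is not checked.
        "inline-virtio-fs" => Ok(()),
        "virtio-fs" if cfg.virtio_fs_daemon.is_empty() => Err(
            "virtio_fs_daemon must be set when shared_fs is \"virtio-fs\"".to_string(),
        ),
        "virtio-fs" => Ok(()),
        // Simplification for this sketch: everything else is rejected.
        other => Err(format!("unsupported shared_fs type in this sketch: {}", other)),
    }
}
```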
James O. D. Hunt
662ce3d6f2 Merge pull request #5086 from Yuan-Zhuo/main
docs: fix unix socket address in agent-ctl doc
2022-09-05 09:24:28 +01:00
Bin Liu
e879270a0c runtime-rs: add default agent/runtime/hypervisor for configuration
Kata 3.0 introduced 3 new configurations under runtime section:

name="virt_container"
hypervisor_name="dragonball"
agent_name="kata"
Blank values will cause startup to fail.

Adding default values makes it easier for users to migrate to Kata 3.0.

Fixes: #5098

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-05 15:55:28 +08:00
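One way to picture the fallback in commit e879270a0c is a helper that fills blank names with the three defaults listed there. This is a sketch under assumptions: the struct `RuntimeNames` and the method `with_defaults` are hypothetical, and the commit itself may set the defaults elsewhere (for example in the configuration file).

```rust
/// Hypothetical holder for the three names listed in the commit message.
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeNames {
    pub name: String,
    pub hypervisor_name: String,
    pub agent_name: String,
}

impl RuntimeNames {
    /// Replaces blank values with the defaults named in the commit message.
    pub fn with_defaults(mut self) -> Self {
        if self.name.is_empty() {
            self.name = "virt_container".to_string();
        }
        if self.hypervisor_name.is_empty() {
            self.hypervisor_name = "dragonball".to_string();
        }
        if self.agent_name.is_empty() {
            self.agent_name = "kata".to_string();
        }
        self
    }
}
```

Values that the user did set are left untouched; only blank fields are filled.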
Bin Liu
e5437a7084 Merge pull request #5063 from liubin/fix/5062-split-amend-spec
runtime-rs: split amend_spec function
2022-09-05 15:00:31 +08:00
Manabu Sugimoto
968c2f6e8e runk: Refactor container builder
Refactor the container builder code (`InitContainer` and `ActivatedContainer`)
to make it easier to understand and to maintain.

The details:

1. Separate the existing `builder.rs` into an `init_builder.rs` and
`activated_builder.rs` to make them easy to read and maintain.

2. Move the `create_linux_container` function from `builder.rs` to
`container.rs` because it is shared by both files.

3. Move some validation functions such as `validate_spec` from `builder.rs`
to `utils.rs` because they will also be used by other components as
utilities in the future.

Fixes: #5033

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-09-05 14:36:30 +09:00
Bin Liu
ba013c5d0f Merge pull request #4744 from openanolis/runtime-rs-static_resource_mgmt
runtime-rs: support functionality of static resource management
2022-09-05 11:17:09 +08:00
Wainer Moschetta
e81a73b622 Merge pull request #4719 from bookinabox/cargo-deny
github-actions: Add cargo-deny
2022-09-02 17:24:50 -03:00
Fabiano Fidêncio
1ccd883103 Merge pull request #5090 from fidencio/topic/keep-passing-build-suffix-to-qemu
qemu: Keep passing BUILD_SUFFIX
2022-09-02 19:37:22 +02:00
Fabiano Fidêncio
373dac2dbb qemu: Keep passing BUILD_SUFFIX
In the commit 54d6d01754 we ended up
removing the BUILD_SUFFIX argument passed to QEMU as it only seemed to
be used to generate the HYPERVISOR_NAME and PKGVERSION, which were added
as arguments to the dockerfile.

However, it turns out BUILD_SUFFIX is used by the `qemu-build-post.sh`
script, so it can rename the QEMU binary accordingly.

Let's just bring it back.

Fixes: #5078

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-09-02 15:47:48 +02:00
Fabiano Fidêncio
9cf4eaac13 Merge pull request #5079 from ryansavino/tdx-qemu-tarball-path-fix
qemu: fix tdx qemu tarball directories
2022-09-02 14:04:50 +02:00
Bin Liu
86ad832e37 runtime-rs: force shutdown shim process in it can't exit
In some cases the cleanup call from the shim to the service manager fails
and the shim process keeps running, which leaks the process.

This commit forces the shim process to shut down when any error occurs in
the service crate.

Fixes: #5087

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-02 19:43:50 +08:00
Yuan-Zhuo
5f4f5f2400 docs: fix unix socket address in agent-ctl doc
Following the instructions in the guide results in ECONNREFUSED,
so the unix socket address must be kept consistent between the two commands.

Fixes: #5085

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-09-02 17:37:44 +08:00
Peng Tao
b5786361e9 Merge pull request #4862 from egernst/memory-hotplug-limitation
Address Memory hotplug limitation
2022-09-02 16:11:46 +08:00
Ryan Savino
59e3850bfd qemu: create no_patches.txt file for SPR-BKC-QEMU-v2.5
Patches failing without the no_patches.txt file for SPR-BKC-QEMU-v2.5.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-09-01 21:07:30 -05:00
Bin Liu
6de4bfd860 Merge pull request #5076 from GabyCT/topic/updatedeveloperguide
docs: Update url in the Developer Guide
2022-09-02 10:01:02 +08:00
Ryan Savino
54d6d01754 qemu: fix tdx qemu tarball directories
Dockerfile cannot decipher multiple conditional statements in the main RUN call.
Cannot segregate statements in Dockerfile with '{}' braces without wrapping entire statement in 'bash -c' statement.
Dockerfile does not support setting variables by bash command.
Must set HYPERVISOR_NAME and PKGVERSION from parent script: build-base-qemu.sh

Fixes: #5078

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-09-01 20:36:28 -05:00
Archana Shinde
f79ef1ad90 Merge pull request #5048 from amshinde/3.0.0-alpha1-branch-bump
# Kata Containers 3.0.0-alpha1
2022-09-02 06:42:16 +05:30
Gabriela Cervantes
e83b821316 docs: Update url in the Developer Guide
This PR updates the url for containerd in the Developer Guide.

Fixes #5075

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-09-01 15:33:29 +00:00
Zhongtao Hu
84268f8716 runtime-rs: update rust runtime roadmap
Update the status and plan for the Rust runtime development

Fixes: #4884
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-09-01 22:53:30 +08:00
GabyCT
9bce2beebf Merge pull request #5040 from GabyCT/topic/updatecni
versions: Update cni plugins version
2022-09-01 09:31:06 -05:00
Bin Liu
69b82023a8 Merge pull request #5065 from liubin/fix/5064-specify-language-for-code-in-markdown
docs: Specify language in markdown for syntax highlight
2022-09-01 16:11:23 +08:00
Bin Liu
41ec71169f runtime-rs: split amend_spec function
amend_spec does two things:

- modify the spec
- check if the pid namespace is enabled

This makes it confusing, so split it into two functions.

Fixes: #5062

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-01 14:44:54 +08:00
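The split described in commit 41ec71169f separates a function that mutates from one that only answers a question. The sketch below is illustrative only: the `Spec` struct is a hypothetical stand-in for the OCI spec, the modification shown is a placeholder, and `is_pid_namespace_enabled` is an assumed name.

```rust
/// Hypothetical stand-in for the OCI spec, reduced to two fields.
pub struct Spec {
    pub namespaces: Vec<String>,
    pub annotations: Vec<(String, String)>,
}

/// Only modifies the spec. The change made here is a placeholder,
/// not the modification performed by runtime-rs.
pub fn amend_spec(spec: &mut Spec) {
    spec.annotations
        .push(("example.amended".to_string(), "true".to_string()));
}

/// Only inspects the spec and reports whether a pid namespace is listed.
pub fn is_pid_namespace_enabled(spec: &Spec) -> bool {
    spec.namespaces.iter().any(|ns| ns.as_str() == "pid")
}
```

Taking `&mut Spec` in one function and `&Spec` in the other makes the difference between "changes the spec" and "reads the spec" visible in the signatures.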
Bin Liu
749a6a2480 docs: Specify language in markdown for syntax highlight
Specify language for code block in docs/Unit-Test-Advice.md
for syntax highlight.

Fixes: #5064

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-01 13:54:31 +08:00
Ji-Xinyou
a828292b47 runtime-rs: add unit tests for network resource
Add UTs for network resource

Fixes: #4923
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-09-01 10:13:09 +08:00
Eric Ernst
9997ab064a sandbox_test: Add test to verify memory hotplug behavior
Augment the mock hypervisor so that we can validate that ACPI memory hotplug
is carried out as expected.

We'll augment the number of memory slots in the hypervisor config each
time the memory of the hypervisor is changed. In this way we can ensure
that large memory hotplugs are broken up into appropriately sized
pieces in the unit test.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-08-31 10:32:30 -07:00
Eric Ernst
f390c122f0 sandbox: don't hotplug too much memory at once
If we're using ACPI hotplug for memory, there's a limitation on the
amount of memory which can be hotplugged at a single time.

During hotplug, we'll allocate memory for the memmap for each page,
resulting in a 64 byte per 4KiB page allocation. As an example, hotplugging 12GiB
of memory requires ~192 MiB of *free* memory, which is about the limit
we should expect for an idle 256 MiB guest (conservative heuristic of 75%
of provided memory).

From experimentation, at pod creation time we can reliably add 48 times
what is provided to the guest. (a factor of 48 results in using 75% of
provided memory for hotplug). Using prior example of a guest with 256Mi
RAM, 256 Mi * 48 = 12 Gi; 12GiB is upper end of what we should expect
can be hotplugged successfully into the guest.

Note: It isn't expected that we'll need to hotplug large amounts of RAM
after workloads have already started -- container additions are expected
to occur first in pod lifecycle. Based on this, we expect that provided
memory should be freely available for hotplug.

If virtio-mem is being utilized, there isn't such a limitation - we can
hotplug the max allowed memory at a single time.

Fixes: #4847

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-08-31 10:32:30 -07:00
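The arithmetic in commit f390c122f0 can be checked with a short sketch. The constants come from the commit message (64 bytes of memmap per 4 KiB page, a factor of 48). The function names and the step-splitting loop are assumptions for illustration and are not the Go runtime's implementation.

```rust
const PAGE_SIZE: u64 = 4096; // bytes per page
const MEMMAP_BYTES_PER_PAGE: u64 = 64; // memmap cost per hotplugged page
const ACPI_HOTPLUG_FACTOR: u64 = 48; // 75% of (4096 / 64)

/// Free guest memory (in bytes) needed to hotplug `hotplug_bytes`.
pub fn memmap_overhead_bytes(hotplug_bytes: u64) -> u64 {
    hotplug_bytes / PAGE_SIZE * MEMMAP_BYTES_PER_PAGE
}

/// Upper bound (in MiB) for one ACPI hotplug, given the memory the
/// guest currently has.
pub fn max_hotplug_mib(current_total_mib: u64) -> u64 {
    current_total_mib * ACPI_HOTPLUG_FACTOR
}

/// Splits a request into steps, each at most 48 times the memory
/// present before that step. Hypothetical helper.
pub fn hotplug_steps_mib(mut current_total_mib: u64, mut requested_mib: u64) -> Vec<u64> {
    let mut steps = Vec::new();
    while requested_mib > 0 {
        let step = requested_mib.min(max_hotplug_mib(current_total_mib));
        if step == 0 {
            break; // a guest with no memory cannot hotplug anything
        }
        steps.push(step);
        current_total_mib += step;
        requested_mib -= step;
    }
    steps
}
```

For the commit's example, hotplugging 12 GiB costs 12 GiB / 4 KiB × 64 B = 192 MiB of memmap, which is 75% of a 256 MiB guest, and 256 MiB × 48 = 12 GiB. Under this sketch a 20 GiB request on a 256 MiB guest is split into a 12 GiB step followed by an 8 GiB step.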
Ryan Savino
566656b085 gperf: point URL to mirror site
gperf download fails intermittently.
Changing to mirror site will hopefully increase download reliability.

Fixes: #5057

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-31 10:02:53 -05:00
Fabiano Fidêncio
08d230c940 Merge pull request #5046 from fidencio/topic/fix-regression-on-building-tdx-kernel
kernel: Re-work get_tee_kernel()
2022-08-31 13:16:26 +02:00
Greg Kurz
380af44043 Merge pull request #5036 from jpecholt/whitelist-cleanup
kernel: Whitelist cleanup
2022-08-31 11:08:32 +02:00
Fabiano Fidêncio
a1fdc08275 kernel: Re-work get_tee_kernel()
00aadfe20a introduced a regression on
`make cc-tdx-kernel-tarball` as we stopped passing all the needed
information to the `build-kernel.sh` script, leading to requiring `yq`
installed in the container used to build the kernel.

This commit partially reverts the faulty one, rewriting it in a way the
old behaviour is brought back, without changing the behaviour that was
added by the faulty commit.

Fixes: #5043

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-31 10:08:12 +02:00
Peng Tao
f1276180b1 Merge pull request #4996 from liubin/fix/4995-delete-socket-option-for-shim
runtime-rs: delete socket from shim command-line options
2022-08-31 14:16:56 +08:00
Bin Liu
515bdcb138 Merge pull request #4900 from wllenyj/dragonball-ut
Built-in Sandbox: add more unit tests for dragonball.
2022-08-31 14:00:07 +08:00
Eric Ernst
e0142db24f hypervisor: Add GetTotalMemoryMB to interface
It'll be useful to get the total memory provided to the guest
(hotplugged + coldplugged). We'll use this information when calculating
how much memory we can add at a time when utilizing ACPI hotplug.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-08-30 16:37:47 -07:00
Archana Shinde
0ab49b233e release: Kata Containers 3.0.0-alpha1
- Initrd fixes for ubuntu systemd
- kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments
- Fix kata-deploy to work on CI context
- github-actions: Auto-backporting
- runtime-rs: add support for core scheduling
- ci: Use versions.yaml for the libseccomp
- runk: Add cli message for init command
- agent: add some logs for mount operation
- Use iouring for qemu block devices
- logging: Replace nix::Error::EINVAL with more descriptive msgs
- kata-deploy: fix threading conflicts
- kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels
- runtime-rs: support loading kernel modules in guest vm
- TDX: Get TDX working again with Cloud Hypervisor + a minor change on QEMU's code
- runk: Move delete logic to libcontainer
- runtime: cri-o annotations have been moved to podman
- Fix depbot reported rust crates dependency security issues
- UT: test_load_kernel_module needs root
- enable vmx for vm factory
- runk: add pause/resume commands
- kernel: upgrade guest kernel support to 5.19
- Drop-in cfg files support in runtime-rs
- agent: do some rollback works if case of do_create_container failed
- network: Fix error message for setting hardware address on TAP interface
- Upgrade to Cloud Hypervisor v26.0
- runtime: tracing: End root span at end of trace
- ci: Update libseccomp version
- dep: update nix dependency
- Updated the link target of CRI-O
- libs/test-utils: share test code by create a new crate

dc32c4622 osbuilder: fix ubuntu initrd /dev/ttyS0 hang
cc5f91dac osbuilder: add systemd symlinks for kata-agent
c08a8631e agent: add some logs for mount operation
0a6f0174f kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels
6cf16c4f7 agent-ctl: fix clippy error
4b57c04c3 runtime-rs: support loading kernel modules in guest vm
dc90eae17 qemu: Drop unnecessary `tdx_guest` kernel parameter
d4b67613f clh: Use HVC console with TDX
c0cb3cd4d clh: Avoid crashing when memory hotplug is not allowed
9f0a57c0e clh: Increase API and SandboxStop timeouts for TDX
b535bac9c runk: Add cli message for init command
c142fa254 clh: Lift the sharedFS restriction used with TDX
bdf8a57bd runk: Move delete logic to libcontainer
a06d819b2 runtime: cri-o annotations have been moved to podman
ffd1c1ff4 agent-ctl/trace-forwarder: udpate thread_local dependency
69080d76d agent/runk: update regex dependency
e0ec09039 runtime-rs: update async-std dependency
763ceeb7b logging: Replace nix::Error::EINVAL with more descriptive msgs
4ee2b99e1 kata-deploy: fix threading conflicts
731d39df4 kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments
96d903734 github-actions: Auto-backporting
a6fbaac1b runk: add pause/resume commands
8e201501e kernel: fix for set_kmem_limit error
00aadfe20 kernel: SEV guest kernel upgrade to 5.19.2
0d9d8d63e kernel: upgrade guest kernel support to 5.19.2
57bd3f42d runtime-rs: plug drop-in decoding into config-loading code
87b97b699 runtime-rs: add filesystem-related part of drop-in handling
cf785a1a2 runtime-rs: add core toml::Value tree merging
92f7d6bf8 ci: Use versions.yaml for the libseccomp
f508c2909 runtime: constify splitIrqChipMachineOptions
2b0587db9 runtime: VMX is migratible in vm factory case
fa09f0ec8 runtime: remove qemuPaths
326f1cc77 agent: enrich some error code path
4f53e010b agent: skip test_load_kernel_module if non-root
3a597c274 runtime: clh: Use the new 'payload' interface
16baecc5b runtime: clh: Re-generate the client code
50ea07183 versions: Upgrade to Cloud Hypervisor v26.0
f7d41e98c kata-deploy: export CI in the build container
4f90e3c87 kata-deploy: add dockerbuild/install_yq.sh to gitignore
8ff5c10ac network: Fix error message for setting hardware address on TAP interface
338c28295 dep: update nix dependency
78231a36e ci: Update libseccomp version
34746496b libs/test-utils: share test code by create a new crate
3829ab809 docs: Update CRI-O target link
fcc1e0c61 runtime: tracing: End root span at end of trace
c1e3b8f40 govmm: Refactor qmp functions for adding block device
598884f37 govmm: Refactor code to get rid of redundant code
00860a7e4 qmp: Pass aio backend while adding block device
e1b49d758 config: Add block aio as a supported annotation
ed0f1d0b3 config: Add "block_device_aio" as a config option for qemu
b6cd2348f govmm: Add io_uring as AIO type
81cdaf077 govmm: Correct documentation for Linux aio.
a355812e0 runtime-rs: fixed bug on core-sched error handling
591dfa4fe runtime-rs: add support for core scheduling
09672eb2d agent: do some rollback works if case of do_create_container failed

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-30 12:59:10 -07:00
Derek Lee
52bbc3a4b0 cargo.lock: update crates to comply with checks
Updates versions of crossbeam-channel because 0.52.0 is a yanked package
(creators mark version as not for release except as a dependency for
another package)

Updates chrono to use >0.42.0 to avoid:
https://rustsec.org/advisories/RUSTSEC-2020-0159

Updates lz4-sys.

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-08-30 10:08:41 -07:00
Derek Lee
aa581f4b28 cargo.toml: Add oci to src/libs workplace
Adds oci under the src/libs workspace.

oci shares a Cargo.lock file with the rest of src/libs but was not
listed as a member of the workspace.

There is no clear reason why it is not included in the workspace, so
adding it so cargo-deny stops complaining

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-08-30 09:30:03 -07:00
Derek Lee
7914da72c9 cargo.tomls: Added Apache 2.0 to cargo.tomls
One of the checks done by cargo-deny is ensuring all crates have a valid
license. As the rust programs import each other, cargo.toml files
without licenses trigger the check. While I could disable this check
this would be bad practice.

This adds an Apache-2.0 license in the Cargo.toml files.

Some of these files already had a header comment saying it is an Apache
license. As the entire project itself is under an Apache-2.0 license, I
assumed all individual components would also be covered under that
license.

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-08-30 09:30:03 -07:00
Derek Lee
bed4aab7ee github-actions: Add cargo-deny
Adds cargo-deny to scan for vulnerabilities and license issues regarding
rust crates.

GitHub Actions does not have an obvious way to loop over each of the
Cargo.toml files. To avoid hardcoding it, I worked around the problem
using a composite action that first generates the cargo-deny action by
finding all Cargo.toml files before calling this new generated action in
the master workflow.

Uses recommended deny.toml from cargo-deny repo with the following
modifications:

 ignore = ["RUSTSEC-2020-0071"]
  because chrono is dependent on the version of time with the
  vulnerability and there is no simple workaround

 multiple-versions = "allow"
  Because of the above error and other packages, there are instances
  where some crates require different versions of a crate.

 unknown-git = "allow"
  I don't see a particular issue with allowing crates from other repos.
  An alternative would be to manually set each repo we want in an
  allow-git list, but I see this as more of a nuisance than it's worth.
  We could leave this as a warning (default), but to avoid clutter I'm
  going to allow it.

If deny.toml needs to be edited in the future, here's the guide:
https://embarkstudios.github.io/cargo-deny/index.html

Fixes #3359

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-08-30 09:30:03 -07:00
Gabriela Cervantes
b1a8acad57 versions: Update cni plugins version
This PR updates the cni plugins version that is being used in the kata CI.

Fixes #5039
Depends-on: github.com/kata-containers/tests#5088

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-08-30 16:04:45 +00:00
Joana Pecholt
a6581734c2 kernel: Whitelist cleanup
This removes two options that are not needed (any longer). These
are not set for any kernel so they do not need to be ignored either.

Fixes #5035

Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
2022-08-30 13:24:12 +02:00
Fabiano Fidêncio
1b92a946d6 Merge pull request #4987 from ryansavino/initrd-fixes-for-ubuntu-systemd
Initrd fixes for ubuntu systemd
2022-08-30 09:16:43 +02:00
GabyCT
630eada0d3 Merge pull request #4956 from shippomx/main
kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments
2022-08-29 14:31:46 -05:00
GabyCT
3426da66df Merge pull request #4951 from wainersm/fix_kata-deploy-ci
Fix kata-deploy to work on CI context
2022-08-29 14:30:59 -05:00
Wainer Moschetta
cd5be6d55a Merge pull request #4775 from bookinabox/auto-backport
github-actions: Auto-backporting
2022-08-29 14:08:12 -03:00
Bin Liu
11383c2c0e Merge pull request #4797 from openanolis/runtime-rs-coresched
runtime-rs: add support for core scheduling
2022-08-29 14:28:30 +08:00
Bin Liu
25f54bb999 Merge pull request #4942 from ManaSugi/fix/use-versions-yaml-for-libseccomp
ci: Use versions.yaml for the libseccomp
2022-08-29 11:22:35 +08:00
Archana Shinde
c174eb809e Merge pull request #4983 from ManaSugi/runk/add-init-msg
runk: Add cli message for init command
2022-08-27 00:15:25 +05:30
Ryan Savino
dc32c4622f osbuilder: fix ubuntu initrd /dev/ttyS0 hang
Guest log is showing a hang on systemd getty start.
Adding symlink for /dev/ttyS0 resolves issue.

Fixes: #4932

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-26 04:59:36 -05:00
Ryan Savino
cc5f91dac7 osbuilder: add systemd symlinks for kata-agent
AGENT_INIT=no (systemd) add symlinks for kata-agent service.

Fixes: #4932

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-26 04:59:36 -05:00
Fupan Li
63959b0be6 Merge pull request #5011 from liubin/fix/4962-add-logs
agent: add some logs for mount operation
2022-08-26 17:12:15 +08:00
Bin Liu
c08a8631e0 agent: add some logs for mount operation
Log information is missing in some places. Adding more details about
the storage, and logging when errors occur, will help understand
what happened.

Fixes: #4962

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-26 14:09:56 +08:00
Archana Shinde
7d52934ec1 Merge pull request #4798 from amshinde/use-iouring-qemu
Use iouring for qemu block devices
2022-08-26 04:00:24 +05:30
Wainer Moschetta
cbe5e324ae Merge pull request #4815 from bookinabox/improve-agent-errors
logging: Replace nix::Error::EINVAL with more descriptive msgs
2022-08-25 14:27:56 -03:00
Fabiano Fidêncio
1eea3d9920 Merge pull request #4965 from ryansavino/kata-deploy-threading-fix
kata-deploy: fix threading conflicts
2022-08-25 19:11:52 +02:00
Fabiano Fidêncio
70cd4f1320 Merge pull request #4999 from fidencio/topic/ignore-CONFIG_SPECULATION_MITIGATIONS-for-older-kernels
kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels
2022-08-25 17:43:57 +02:00
Fabiano Fidêncio
0a6f0174f5 kernel: Ignore CONFIG_SPECULATION_MITIGATIONS for older kernels
TDX kernel is based on a kernel version which doesn't have the
CONFIG_SPECULATION_MITIGATIONS option.

Having this in the allow list for missing configs avoids a breakage in
the TDX CI.

Fixes: #4998

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-25 10:51:13 +02:00
Bin Liu
cce99c5c73 runtime-rs: delete socket from shim command-line options
The socket is not used to specify the socket address, but
an ENV variable is used for runtime-rs.

Fixes: #4995

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-25 15:32:17 +08:00
Bin Liu
a7e64b1ca9 Merge pull request #4892 from openanolis/shuoyu/runtime-rs
runtime-rs: support loading kernel modules in guest vm
2022-08-25 15:01:23 +08:00
Fabiano Fidêncio
ddc94e00b0 Merge pull request #4982 from fidencio/topic/improve-cloud-hypervisor-plus-tdx-support
TDX: Get TDX working again with Cloud Hypervisor + a minor change on QEMU's code
2022-08-25 08:53:10 +02:00
Bin Liu
875d946fb4 Merge pull request #4976 from ManaSugi/runk/refactor-delete-func
runk: Move delete logic to libcontainer
2022-08-25 14:30:30 +08:00
Yushuo
6cf16c4f76 agent-ctl: fix clippy error
Fixes: #4988

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2022-08-25 11:00:49 +08:00
Yushuo
4b57c04c33 runtime-rs: support loading kernel modules in guest vm
Users can specify the kernel module to be loaded through the agent
configuration in kata configuration file or in pod annotation file.

And information of those modules will be sent to kata agent when
sandbox is created.

Fixes: #4894

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2022-08-25 10:38:04 +08:00
Peng Tao
aa6bcacb7d Merge pull request #4973 from bergwolf/github/go-depbot
runtime: cri-o annotations have been moved to podman
2022-08-25 10:12:06 +08:00
Peng Tao
78af76b72a Merge pull request #4969 from bergwolf/github/depbot
Fix depbot reported rust crates dependency security issues
2022-08-25 10:11:54 +08:00
Fabiano Fidêncio
dc90eae17b qemu: Drop unnecessary tdx_guest kernel parameter
With the current TDX kernel used with Kata Containers, `tdx_guest` is
not needed, as TDX_GUEST is now a kernel configuration.

With this in mind, let's just drop the kernel parameter.

Fixes: #4981

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-24 20:02:43 +02:00
Fabiano Fidêncio
d4b67613f0 clh: Use HVC console with TDX
As right now the TDX guest kernel doesn't support "serial" console,
let's switch to using HVC in this case.

Fixes: #4980

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-24 20:02:40 +02:00
Fabiano Fidêncio
c0cb3cd4d8 clh: Avoid crashing when memory hotplug is not allowed
The runtime will crash when trying to resize memory when memory hotplug
is not allowed.

This happens because we cannot simply set the hotplug amount to zero,
leading us to not set memory hotplug at all, and then later trying to
access the value of a nil pointer.

Fixes: #4979

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-24 20:02:22 +02:00
Fabiano Fidêncio
9f0a57c0eb clh: Increase API and SandboxStop timeouts for TDX
While doing tests using `ctr`, I've noticed that I've been hitting those
timeouts more frequently than expected.

Till we find the root cause of the issue (which is *not* in Kata
Containers), let's increase the timeouts when dealing with a
Confidential Guest.

Fixes: #4978

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-24 20:02:12 +02:00
Manabu Sugimoto
b535bac9c3 runk: Add cli message for init command
Add cli message for init command to tell the user
not to run this command directly.

Fixes: #4367

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-08-25 00:32:35 +09:00
Fabiano Fidêncio
c142fa2541 clh: Lift the sharedFS restriction used with TDX
When booting the TDX kernel with `tdx_disable_filter`, as it's been done
for QEMU, VirtioFS can work without any issues.

Whether this will be part of the upstream kernel or not is a different
story, but it easily could make it there as Cloud Hypervisor relies on
the VIRTIO_F_IOMMU_PLATFORM feature, which forces the guest to use the
DMA API, making these devices compatible with TDX.

See Sebastien Boeuf's explanation of this in the
3c973fa7ce208e7113f69424b7574b83f584885d commit:
"""
By using DMA API, the guest triggers the TDX codepath to share some of
the guest memory, in particular the virtqueues and associated buffers so
that the VMM and vhost-user backends/processes can access this memory.
"""

Fixes: #4977

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-24 17:14:05 +02:00
Manabu Sugimoto
bdf8a57bdb runk: Move delete logic to libcontainer
Move delete logic to `libcontainer` crate to make the code clean
like other commands.

Fixes: #4975

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-08-24 19:12:36 +09:00
Peng Tao
a06d819b24 runtime: cri-o annotations have been moved to podman
Let's switch to depending on podman, which also simplifies the indirect
dependency on kubernetes components. It also helps to avoid cri-o
security issues like CVE-2022-1708.

Fixes: #4972
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-24 18:11:37 +08:00
Peng Tao
ffd1c1ff4f agent-ctl/trace-forwarder: udpate thread_local dependency
To bring in fix to CWE-362.

Fixes: #4968
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-24 17:10:49 +08:00
Peng Tao
69080d76da agent/runk: update regex dependency
To bring in fix to CVE-2022-24713.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-24 17:02:15 +08:00
Peng Tao
e0ec09039d runtime-rs: update async-std dependency
So that we bump several indirect dependencies like crossbeam-channel,
crossbeam-utils to bring in fixes to known security issues like CVE-2020-15254.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-24 16:56:29 +08:00
Bin Liu
2b5dc2ad39 Merge pull request #4705 from bergwolf/github/agent-ut-improve
UT: test_load_kernel_module needs root
2022-08-24 16:22:55 +08:00
Bin Liu
6551d4f25a Merge pull request #4051 from bergwolf/github/vmx-vm-factory
enable vmx for vm factory
2022-08-24 16:22:37 +08:00
Bin Liu
ad91801240 Merge pull request #4870 from cyyzero/runk-cgroup
runk: add pause/resume commands
2022-08-24 14:44:43 +08:00
Derek Lee
763ceeb7ba logging: Replace nix::Error::EINVAL with more descriptive msgs
Replaces instances of anyhow!(nix::Error::EINVAL) with other messages to
make it easier to debug.

Fixes #954

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-08-23 13:44:46 -07:00
Ryan Savino
4ee2b99e1e kata-deploy: fix threading conflicts
Fix threading conflicts when kata-deploy 'make kata-tarball' is called.
Force the creation of rootfs tarballs to happen serially instead of in parallel.

Fixes: #4787

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-23 12:35:23 -05:00
Miao Xia
731d39df45 kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments
The Kata guest OS cgroup does not work properly when the guest kernel config option
CONFIG_CGROUP_HUGETLB is not set, leading to:

root@clr-b08d402cc29d44719bb582392b7b3466 ls /sys/fs/cgroup/hugetlb/
ls: cannot access '/sys/fs/cgroup/hugetlb/': No such file or directory

Fixes: #4953

Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>
2022-08-23 12:31:13 +02:00
Derek Lee
96d9037347 github-actions: Auto-backporting
An implementation of semi-automating the backporting
process.

This implementation has two steps:
1. Checking whether any associated issues are marked as bugs

   If they are, mark the PR with the `auto-backport` label

2. On a successful merge, if there is an `auto-backport` label and there
   are any tags of `backport-to-BRANCHNAME`, it calls an action that
   cherry-picks the commits in the PR and automatically creates a PR to
   those branches.

This action uses https://github.com/sqren/backport-github-action

Fixes #3618

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-08-22 16:19:09 -07:00
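
The label check in step 2 of the auto-backporting commit above can be
sketched as follows. This is only an illustration of the rule stated in the
commit message, not the workflow's actual implementation: the function name
and the plain list of label strings are assumptions, and the real work is
done by the GitHub action referenced in the commit.

```python
# Label names taken from the commit message above.
AUTO_BACKPORT_LABEL = "auto-backport"
BRANCH_LABEL_PREFIX = "backport-to-"


def backport_targets(labels, merged):
    """Return the branches a PR should be cherry-picked to.

    `labels` is a list of label names on the PR and `merged` says whether
    the PR was merged successfully. Both are hypothetical inputs chosen
    for this sketch.
    """
    # Nothing happens unless the PR was merged and carries the marker label.
    if not merged or AUTO_BACKPORT_LABEL not in labels:
        return []
    # Every `backport-to-BRANCHNAME` label names one target branch.
    return [
        label[len(BRANCH_LABEL_PREFIX):]
        for label in labels
        if label.startswith(BRANCH_LABEL_PREFIX)
        and len(label) > len(BRANCH_LABEL_PREFIX)
    ]
```

A PR labelled `auto-backport` and `backport-to-stable-2.5` would yield the
single target `stable-2.5`; without the `auto-backport` label the result is
empty.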
Chen Yiyang
a6fbaac1bd runk: add pause/resume commands
To make cgroup v1 and v2 both work well, I now use `cgroups::cgroup` in
`Container` to manage cgroups. `CgroupManager` in rustjail has some
drawbacks. First, the methods in the Manager trait are not visible, so we
would need to modify rustjail and make them public. Second,
CgroupManager.cgroup is private too, and it can't be serialized, so we
can't load/save it in the status file. One solution is adding a
getter/setter in rustjail, then creating a `cgroup` and setting it when
loading the status. To keep the modifications to rustjail to a minimum, I
use `cgroups::cgroup` directly. It works on cgroup v1 or v2, since
cgroups-rs handles the difference.

Fixes: #4364 #4821

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-08-22 23:11:50 +08:00
Fabiano Fidêncio
d797036b77 Merge pull request #4861 from ryansavino/upgrade-kernel-support-5.19
kernel: upgrade guest kernel support to 5.19
2022-08-22 14:57:00 +02:00
Bin Liu
8c8e97a495 Merge pull request #4772 from pmores/drop-in-cfg-files-support-rs
Drop-in cfg files support in runtime-rs
2022-08-22 13:41:56 +08:00
Bin Liu
eb91ee45be Merge pull request #4754 from liubin/fix/4749-rollback-when-creating-container-failed
agent: do some rollback works if case of do_create_container failed
2022-08-22 10:44:11 +08:00
Ryan Savino
8e201501ef kernel: fix for set_kmem_limit error
Fixes: #4390

Fix in cargo cgroups-rs crate - Updated crate version to 0.2.10

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-19 13:08:14 -05:00
Ryan Savino
00aadfe20a kernel: SEV guest kernel upgrade to 5.19.2
kernel: Update SEV guest kernel to 5.19.2

Kernel 5.19.2 has all the needed patches for running SEV, thus let's update it and stop using the version coming from confidential-containers.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-19 13:08:14 -05:00
Ryan Savino
0d9d8d63ea kernel: upgrade guest kernel support to 5.19.2
kernel: Upgrade guest kernel support to 5.19.2

Let's update to the latest 5.19.x released kernel.

CONFIG modifications necessary:
fragments/common/dax.conf - CONFIG_DEV_PAGEMAP_OPS no longer configurable:
https://www.kernelconfig.io/CONFIG_DEV_PAGEMAP_OPS?q=CONFIG_DEV_PAGEMAP_OPS&kernelversion=5.19.2
fragments/common/dax.conf - CONFIG_ND_BLK no longer supported:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f8669f1d6a86a6b17104ceca9340ded280307ac1
fragments/x86_64/base.conf - CONFIG_SPECULATION_MITIGATIONS is a dependency for CONFIG_RETPOLINE:
https://www.kernelconfig.io/config_retpoline?q=&kernelversion=5.19.2
fragments/s390/network.conf - removed from kernel since 5.9.9:
https://www.kernelconfig.io/CONFIG_PACK_STACK?q=CONFIG_PACK_STACK&kernelversion=5.19.2

Updated vmlinux path in build-kernel.sh for arch s390

Fixes #4860

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-08-19 13:08:13 -05:00
Fabiano Fidêncio
9806ce8615 Merge pull request #4937 from chenhengqi/fix-error-msg
network: Fix error message for setting hardware address on TAP interface
2022-08-19 17:54:58 +02:00
Pavel Mores
57bd3f42d3 runtime-rs: plug drop-in decoding into config-loading code
To plug drop-in support into existing config-loading code in a robust
way, more specifically to create a single point where this needs to be
handled, load_from_file() and load_raw_from_file() were refactored.
Seeing as the original implementations of both functions were identical
apart from adjust_config() calls in load_from_file(), load_from_file()
was reimplemented in terms of load_raw_from_file().

Fixes #4771

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-08-19 11:01:29 +02:00
Pavel Mores
87b97b6994 runtime-rs: add filesystem-related part of drop-in handling
The central function being added here is load() which takes a path to a
base config file and uses it to load the base config file itself, find
the corresponding drop-in directory (get_dropin_dir_path()), iterate
through its contents (update_from_dropins()) and load each drop-in in
turn and merge its contents with the base file (update_from_dropin()).

Also added is a test of load() which mirrors the corresponding test in
the golang runtime (TestLoadDropInConfiguration() in config_test.go).

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-08-19 11:01:29 +02:00
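
The call chain described in the drop-in handling commit above (load() ->
get_dropin_dir_path() -> update_from_dropins() -> update_from_dropin()) can
be sketched roughly as below. The function names come from the commit
message; everything else is an assumption made for illustration: the
drop-in directory name `config.d`, the alphabetical processing order, the
use of JSON instead of TOML (so the sketch needs only the standard
library), and the shallow `dict.update()` merge.

```python
import json
import tempfile
from pathlib import Path

# Assumption: drop-ins live in a "config.d" directory next to the base file.
DROPIN_DIR_NAME = "config.d"


def get_dropin_dir_path(base_path):
    """Return the drop-in directory that corresponds to a base config file."""
    return Path(base_path).parent / DROPIN_DIR_NAME


def update_from_dropin(config, dropin_path, parse):
    """Load one drop-in and merge it into `config` (shallow merge here)."""
    config.update(parse(dropin_path.read_text()))


def update_from_dropins(config, dropin_dir, parse):
    """Apply every drop-in file in the directory, in sorted order."""
    if not dropin_dir.is_dir():
        return  # no drop-in directory: the base config is used unchanged
    for path in sorted(dropin_dir.iterdir()):
        if path.is_file():
            update_from_dropin(config, path, parse)


def load(base_path, parse=json.loads):
    """Load the base config, then layer the drop-ins on top of it."""
    config = parse(Path(base_path).read_text())
    update_from_dropins(config, get_dropin_dir_path(base_path), parse)
    return config


def demo():
    """Build a throwaway base file with two drop-ins and load the result."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "configuration.json"
        base.write_text(json.dumps({"a": 1, "b": 2}))
        dropin_dir = get_dropin_dir_path(base)
        dropin_dir.mkdir()
        (dropin_dir / "10-first.json").write_text(json.dumps({"b": 3}))
        (dropin_dir / "20-second.json").write_text(json.dumps({"b": 5, "c": 4}))
        return load(base)
```

In `demo()`, the later drop-in wins for `b`, the untouched key `a` is
preserved, and `c` is added.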
Pavel Mores
cf785a1a23 runtime-rs: add core toml::Value tree merging
This is the core functionality of merging config file fragments into the
base config file.  Our TOML parser crate doesn't seem to allow working
at the level of TomlConfig instances like BurntSushi, used in the Golang
runtime, does, so we implement the required functionality at the level of
toml::Value trees.

Tests to verify basic requirements are included.  Values set by a base
config file and not touched by a subsequent drop-in should be preserved.
Drop-in config file fragments should be able to change values set by the
base config file and add settings not present in the base.  Conversion
of a merged tree into a mock TomlConfig-style structure is tested as
well.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-08-19 11:01:29 +02:00
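
The merge requirements listed in the toml::Value commit above (untouched
base values are preserved, drop-ins can change existing values and add new
ones) can be illustrated with a recursive merge over nested dictionaries.
This is a sketch in Python rather than the commit's Rust code; plain dicts
stand in for toml::Value tables, and the key names in the usage note are
only examples.

```python
import copy


def merge(base, dropin):
    """Recursively merge `dropin` into `base` and return `base`.

    Nested tables (dicts) are merged key by key; any other value in the
    drop-in replaces the value from the base.
    """
    for key, value in dropin.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            # Both sides are tables: descend instead of replacing wholesale.
            merge(base[key], value)
        else:
            # Scalars, arrays, and new keys: the drop-in value wins.
            base[key] = copy.deepcopy(value)
    return base
```

For example, merging `{"hypervisor": {"qemu": {"default_vcpus": 4}}}` into a
base that sets both `default_vcpus` and `default_memory` under the same
table changes only `default_vcpus` and leaves `default_memory` as it was.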
Manabu Sugimoto
92f7d6bf8f ci: Use versions.yaml for the libseccomp
It would be nice to use `versions.yaml` for maintainability.
Previously, we specified the `libseccomp` and the `gperf` versions
directly in this script without using `versions.yaml` because the current
snap workflow is incomplete and fails.
This is because the snap CI environment does not have the kata-containers
repository under ${GOPATH}. To avoid the failure, `rootfs.sh` extracts the
libseccomp version and URL in advance and passes them to
`install_libseccomp.sh` as environment variables.

Fixes: #4941

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-08-19 09:05:08 +09:00
Fabiano Fidêncio
828383bc39 Merge pull request #4933 from likebreath/0816/prepare_clh_v26.0
Upgrade to Cloud Hypervisor v26.0
2022-08-18 18:36:53 +02:00
James O. D. Hunt
6d6edb0bb3 Merge pull request #4903 from cmaf/tracing-defer-rootSpan-end
runtime: tracing: End root span at end of trace
2022-08-18 08:51:41 +01:00
Peng Tao
f508c2909a runtime: constify splitIrqChipMachineOptions
A simple cleanup.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-18 10:09:20 +08:00
Peng Tao
2b0587db95 runtime: VMX is migratible in vm factory case
We are not spinning up any L2 guests in vm factory, so the L1 guest
migration is expected to work even with VMX.

See https://www.linux-kvm.org/page/Nested_Guests

Fixes: #4050
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-18 10:08:43 +08:00
Peng Tao
fa09f0ec84 runtime: remove qemuPaths
It is broken in that it doesn't list the QemuVirt machine type. In fact we
don't need it at all. Just drop it.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-18 10:06:10 +08:00
Peng Tao
326f1cc773 agent: enrich some error code path
So that it is easier to find out why some function fails.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-18 10:02:12 +08:00
Peng Tao
4f53e010b4 agent: skip test_load_kernel_module if non-root
We need root privilege to load a real kernel module.

Fixes: #4704
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-18 10:02:12 +08:00
Bin Liu
cc4b9ac7cd Merge pull request #4940 from ManaSugi/fix/update-libseccomp-version
ci: Update libseccomp version
2022-08-18 08:36:59 +08:00
Bin Liu
c7b7bb701a Merge pull request #4930 from bergwolf/github/depbot
dep: update nix dependency
2022-08-18 08:05:14 +08:00
Bo Chen
3a597c2742 runtime: clh: Use the new 'payload' interface
The new 'payload' interface now contains the 'kernel' and 'initramfs'
config.

Fixes: #4952

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-17 12:23:43 -07:00
Bo Chen
16baecc5b1 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v26.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Fixes: #4952

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-17 12:23:12 -07:00
Bo Chen
50ea071834 versions: Upgrade to Cloud Hypervisor v26.0
Highlights from the Cloud Hypervisor release v26.0:

**SMBIOS Improvements via `--platform`**
`--platform` and the appropriate API structure has gained support for supplying
OEM strings (primarily used to communicate metadata to systemd in the guest)

**Unified Binary MSHV and KVM Support**
Support for both the MSHV and KVM hypervisors can be compiled into the same
binary with the detection of the hypervisor to use made at runtime.

**Notable Bug Fixes**
* The prefetchable flag is preserved on BARs for VFIO devices
* PCI Express capabilities for functionality we do not support are now filtered
out
* GDB breakpoint support is more reliable
* SIGINT and SIGTERM signals are now handled before the VM has booted
* Multiple API event loop handling bug fixes
* Incorrect assumptions in virtio queue numbering were addressed, allowing
the virtio-fs driver in OVMF to be used
* VHDX file format header fix
* The same VFIO device cannot be added twice
* SMBIOS tables were being incorrectly generated

**Deprecations**
Deprecated features will be removed in a subsequent release and users should
plan to use alternatives.

The top-level `kernel` and `initramfs` members on the `VmConfig` have been
moved inside a `PayloadConfig` as the `payload` member. The OpenAPI document
has been updated to reflect the change and the old API members continue to
function and are mapped to the new version. The expectation is that these old
versions will be removed in the v28.0 release.

**Removals**
The following functionality has been removed:

The unused poll_queue parameter has been removed from --disk and
equivalent. This was residual from the removal of the vhost-user-block
spawning feature.

Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v26.0

Fixes: #4952

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-17 12:20:26 -07:00
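
The deprecation described in the Cloud Hypervisor v26.0 notes above moves
the top-level `kernel` and `initramfs` members of `VmConfig` under a
`payload` member. A rough sketch of that reshaping on a plain dictionary is
shown below. The function name and the dictionary representation are
assumptions for illustration only; the values are moved as-is, and the
actual field shapes should be taken from the OpenAPI document rather than
from this sketch.

```python
# Members that the release notes say moved under `payload`.
MOVED_MEMBERS = ("kernel", "initramfs")


def to_payload_config(vm_config):
    """Return a copy of `vm_config` with the moved members under `payload`.

    The input dict is left unmodified. Values are carried over unchanged,
    which is an assumption of this sketch.
    """
    new_config = {k: v for k, v in vm_config.items() if k not in MOVED_MEMBERS}
    # Start from any payload that is already present.
    payload = dict(new_config.get("payload") or {})
    for member in MOVED_MEMBERS:
        if vm_config.get(member) is not None:
            # Do not overwrite a value already given in the new location.
            payload.setdefault(member, vm_config[member])
    if payload:
        new_config["payload"] = payload
    return new_config
```

A config with top-level `kernel` and `initramfs` entries comes back with
both under `payload`, and a config without either is returned unchanged.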
wllenyj
c75970b816 dragonball: add more unit test for config manager
Added more unit tests for config manager.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-08-17 23:46:26 +08:00
Wainer dos Santos Moschetta
f7d41e98cb kata-deploy: export CI in the build container
The clone_tests_repo() in ci/lib.sh relies on the CI variable to decide
whether to check out the tests repository or not. So it is required to
pass that variable down to the build container of kata-deploy, otherwise
it can fail in some scenarios.

Fixes #4949
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2022-08-17 10:42:49 -03:00
Wainer dos Santos Moschetta
4f90e3c87e kata-deploy: add dockerbuild/install_yq.sh to gitignore
The install_yq.sh is copied to tools/packaging/kata-deploy/local-build/dockerbuild
so that it is added in the kata-deploy build image. Let's tell git to
ignore that file.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2022-08-17 10:00:28 -03:00
Bin Liu
9d6d236003 Merge pull request #4869 from PrajwalBorkar/prajwal-patch
Updated the link target of CRI-O
2022-08-17 17:55:40 +08:00
Hengqi Chen
8ff5c10ac4 network: Fix error message for setting hardware address on TAP interface
Error out with the correct interface name and hardware address instead.

Fixes: #4944

Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>
2022-08-17 16:42:07 +08:00
Peng Tao
338c282950 dep: update nix dependency
To fix CVE-2021-45707 that affects nix < 0.20.2.

Fixes: #4929
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-17 16:06:26 +08:00
James O. D. Hunt
82ad43f9bf Merge pull request #4928 from liubin/fix/4925-share-test-utils-for-rust
libs/test-utils: share test code by create a new crate
2022-08-17 08:31:11 +01:00
Manabu Sugimoto
78231a36e4 ci: Update libseccomp version
Updates the libseccomp version that is being used in the Kata CI.

Fixes: #4858, #4939

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-08-17 15:39:22 +09:00
Bin Liu
8cd1e50eb6 Merge pull request #4921 from liubin/fix/2920-delete-vergen
runtime-rs: delete vergen dependency
2022-08-17 10:09:12 +08:00
Bin Liu
34746496b7 libs/test-utils: share test code by create a new crate
As more and more Rust code is introduced, the test utils originally in the
agent should be made easy to share. Moving them into a new crate makes
them easy to share between different crates.

Fixes: #4925

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-17 00:12:44 +08:00
GabyCT
dd93d4ad5a Merge pull request #4922 from bergwolf/github/release
workflow: trigger release for 3.x releases
2022-08-16 10:20:33 -05:00
Peng Tao
6d6c068692 workflow: trigger release for 3.x releases
So that we can push 3.x artifacts to the release page.

Fixes: #4919
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-16 17:55:51 +08:00
Bin Liu
eab7c8f28f runtime-rs: delete vergen dependency
vergen is a build dependency, but it is not being used.
We process the version/commit hash via the make command, not via vergen.

Fixes: #4920

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-16 15:31:24 +08:00
Bin Liu
828574d27c Merge pull request #4893 from openanolis/runtime-rs-main
Runtime-rs: support persist file
2022-08-16 14:42:22 +08:00
Bin Liu
334c7b3355 Merge pull request #4916 from GabyCT/topic/fixurl
docs: Update url in containerd documentation
2022-08-16 13:45:58 +08:00
Bin Liu
f9d3181533 Merge pull request #4911 from bergwolf/3.0.0-alpha0-branch-bump
# Kata Containers 3.0.0-alpha0
2022-08-16 13:44:49 +08:00
Gabriela Cervantes
3e9077f6ee docs: Update url in containerd documentation
This PR updates the url that we have in our kata containerd
documentation.

Fixes #4915

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-08-15 19:04:29 +00:00
Bin Liu
830fb266e6 Merge pull request #4854 from openanolis/runtime-rs-delete
runtime-rs: delete route model
2022-08-15 20:48:58 +08:00
Prajwal Borkar
3829ab809f docs: Update CRI-O target link
Fixes #4767

Signed-off-by: Prajwal Borkar <prajwalborkar5075@gmail.com>
2022-08-15 16:48:32 +05:30
Peng Tao
52133ef66e release: Kata Containers 3.0.0-alpha0
- runtime-rs: fix design doc's typo
- docs: use curl as default downloader for runtime-rs
- runtime-rs: update Cargo.lock
- Fix some GitHub actions workflow issues
- versions: Update libseccomp version
- runtime-rs:merge runtime rs to main
- nydus: wait nydusd API server ready before mounting share fs
- versions: Update TD-shim due to build breakage
- agent-ctl: Add an empty [workspace]
- packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x
- docs: Improve SGX documentation
- runtime: explicitly mark the source of the log is from qemu.log
- runtime: add unlock before return in sendReq
- docs: add back host network limitation
- runk: add ps sub-command
- Depends-on:github.com/kata-containers/tests#4986
- runtime-rs:update rtnetlink version
- runtime-rs:skip the build process when the arch is s390x
- docs: Improve SGX documentation
- agent: Use rtnetlink's neighbours API to add neighbors
- Bump TDX dependencies (QEMU and Kernel)
- OVMF / td-shim: Adjust final tarball location
- libs: fix CI error for protocols
- runtime-rs: merge main to runtime-rs
- packaging: Add support for building TDVF
- versions: Track and add support for building TD-shim
- versions: Upgrade rust version
- Merge Main into runtime-rs branch
- agent: log RPC calls for debugging
- runtime-rs: fix stop failed in azure
- Add support AmdSev build of OVMF
- runtime: Support for host cgroupv2
- versions: Update runc version
- qemu: Add liburing to qemu build
- runtime-rs: fix set share sandbox pid namespace
- Docs: fix tables format error
- versions: Update Firecracker version to v1.1.0
- agent: Fix stream fd's double close
- container: kill all of the processes in a container when it terminated
- fix network failed for kata ci
- runtime-rs: handle default_vcpus greator than default_maxvcpu
- agent: fix fd-double-close problem in ut test_do_write_stream
- runtime-rs: add functionalities support for macvlan and vlan endpoints
- Docs: add rust environment setup for kata 3.0
- rustjail: check result to let it return early
- upgrade nydus version
- support disable_guest_seccomp
- cgroups: remove unnecessary get_paths()
- versions: Update firecracker version
- kata-monitor: fix can't monitor /run/vc/sbs
- runtime-rs: fix sandbox_cgroup_only=false panic
- runtime-rs: fix ctr exit failed
- docs: add installation guide for kata 3.0
- runtime-rs: support functionalities of ipvlan endpoint
- runtime-rs: remove the value of hypervisor path in DB config
- kata-sys-util: upgrade nix version
- runtime-rs: fix some bugs to make runtime-rs on aarch64
- runk: Support `exec` sub-command
- runtime-rs: hypervisor part
- clh: Don't crash if no network device is set by the upper layer
- packaging: Rework how ${BUILD_SUFFIX} is used with the QEMU builder scripts
- versions: Update Cloud Hypervisor to v25.0
- Runtime-rs merge main
- kernel: Deduplicate code used for building TEE kernels
- runtime-rs: Dragonball-sandbox - add virtio device feature support for aarch64
- packaging: Simplify config path handling
- build: save lines for repository_owner check
- kata 3.0 Architecture
- Fix clh tarball build
- runtime-rs: built-in Dragonball sandbox part III - virtio-blk, virtio-fs, virtio-net and VMM API support
- runtime: Fix DisableSelinux config
- docs: Update URL links for containerd documentation
- docs: delete CRI containerd plugin statement
- release: Revert kata-deploy changes after 2.5.0-rc0 release
- tools/snap: simplify nproc
- action: revert commit message limit to 150 bytes
- runtime-rs: Dragonball sandbox - add Vcpu::configure() function for aarch64
- runtime-rs: makefile for dragonball
- runtime-rs:refactor network model with netlink
- runtime-rs: Merge Main into runtime-rs branch
- runtime-rs: built-in Dragonball sandbox part II - vCPU manager
- runtime-rs: runtime-rs merge main
- runtime-rs: built-in Dragonball sandbox part I - resource and device managers

caada34f1 runtime-rs: fix design doc's typo
b61dda40b docs: use curl as default downloader for runtime-rs
ca9d16e5e runtime-rs: update Cargo.lock
99a7b4f3e workflow: Revert "static-checks: Allow Merge commit to be >75 chars"
d14e80e9f workflow: Revert "docs: modify move-issues-to-in-progress.yaml"
1f4b6e646 versions: Update libseccomp version
8a4e69008 versions: Update TD-shim due to build breakage
065305f4a agent-ctl: Add an empty [workspace]
1444d7ce4 packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x
2ae807fd2 nydus: wait nydusd API server ready before mounting share fs
c8d4ea84e docs: Improve SGX documentation
d8ad16a34 runtime: add unlock before return in sendReq
8bbffc42c runtime-rs:update rtnetlink version
c5452faec docs: Improve SGX documentation
389ae9702 runtime-rs:skip the test when the arch is s390x
945e02227 runtime-rs:skip the build process when the arch is s390x
8d1cb1d51 td-shim: Adjust final tarball location
62f05d4b4 ovmf: Adjust final tarball location
9972487f6 versions: Bump Kernel TDX version
c9358155a kernel: Sort the TDX configs alphabetically
dd397ff1b versions: Bump QEMU TDX version
230a22905 runk: add ps sub-command
889557ecb docs: add back host network limitation
c9b5bde30 versions: Track and build TDVF
e6a5a5106 packaging: Generate a tarball as OVMF build result
42eaf19b4 packaging: Simplify OVMF repo clone
4d33b0541 packaging: Don't hardcode "edk2" as the cloned repo's dir.
7247575fa runtime-rs:fix cargo clippy
b06bc8228 versions: Track and add support for building TD-shim
86ac653ba libs: fix CI error for protocols
81fe51ab0 agent: fix unittests for arp neighbors
845c1c03c agent: use rtnetlink's neighbours API to add neighbors
9b1940e93 versions: update rust version
638c2c416 static-build: Add AmdSev option for OVMF builder Introduces new build of firmware needed for SEV
f0b58e38d static-build: Add build script for  OVMF
fa0b11fc5 runtime-rs: fix stdin hang in azure
5c3155f7e runtime: Support for host cgroup v2
4ab45e5c9 docs: Update support for host cgroupv2
326eb2f91 versions: Update runc version
f5aa6ae46 agent: Fix stream fd's double close problem
6e149b43f Docs: fix tables format error
85f4e7caf runtime: explicitly mark the source of the log is from qemu.log
56d49b507 versions: Update Firecracker version to v1.1.0
b3147411e runtime-rs:add unit test for set share pid ns
1ef3f8eac runtime-rs: set share sandbox pid namespace
57c556a80 runtime-rs: fix stop failed in azure
0e24f47a4 agent: log RPC calls for debugging
c825065b2 runtime-rs: fix tc filter setup failed
e0194dcb5 runtime-rs: update route destination with prefix
fa85fd584 docs: add rust environment setup for kata 3.0
896478c92 runtime-rs: add functionalities support for macvlan and vlan endpoints
df79c8fe1 versions: Update firecracker version
912641509 agent: fix fd-double-close problem in ut test_do_write_stream
43045be8d runtime-rs: handle default_vcpus greator than default_maxvcpu
0d7cb7eb1 agent: delete agent-type property in announce
eec9ac81e rustjail: check result to let it return early.
402bfa0ce nydus: upgrade nydus/nydus-snapshotter version
54f53d57e runtime-rs: support disable_guest_seccomp
4331ef80d Runtime-rs: add installation guide for rust-runtime
72dbd1fcb kata-monitor: fix can't monitor /run/vc/sbs.
e9988f0c6 runtime-rs: fix sandbox_cgroup_only=false panic
cebbebbe8 runtime-rs: fix ctr exit failed
62182db64 runtime-rs: add unit test for ipvlan endpoint
99654ce69 runtime-rs: update dbs-xxx dependencies
f4c3adf59 runtime-rs: Add compile option file
545ae3f0e runtime-rs: fix warning
19eca71cd runtime-rs: remove the value of hypervisor path in DB config
d8920b00c runtime-rs: support functionalities of ipvlan endpoint
2b01e9ba4 dragonball: fix warning
996a6b80b kata-sys-util: upgrade nix version
f690b0aad qemu: Add liburing to qemu build
d93e4b939 container: kill all of the processes in this container
3c989521b dragonball: update for review
274598ae5 kata-runtime: add dragonball config check support.
1befbe673 runtime-rs: Cargo lock for fix version problem
3d6156f6e runtime-rs: support dragonball and runtime-binary
3f6123b4d libs: update configuration and annotations
9ae2a45b3 cgroups: remove unnecessary get_paths()
be31207f6 clh: Don't crash if no network device is set by the upper layer
051181249 packaging: Add a "-" in the dir name if $BUILD_DIR is available
dc3b6f659 versions: Update Cloud Hypervisor to v25.0
201ff223f packaging: Use the $BUILD_SUFFIX when renaming the qemu binary
1a25afcdf kernel: Allow passing the URL to download the tarball
80c68b80a kernel: Deduplicate code used for building TEE kernels
d2584991e dragonball: fix dependency unused warning
458f6f42f dragonball: use const string for legacy device type
939959e72 docs: add Dragonball to hypervisors
f6f96b8fe dragonball: add legacy device support for aarch64
7a4183980 dragonball: add device info support for aarch64
f7ccf92dc kata-deploy: Rely on the configured config path
386a523a0 kata-deploy: Pass the config path to CRI-O
13df57c39 build: save lines for repository_owner check
57c2d8b74 docs: Update URL links for containerd documentation
e57a1c831 build: Mark git repos as safe for build
2551924bd docs: delete CRI containerd plugin statement
9cee52153 fmt: do cargo fmt and add a dependency for blk_dev
47a4142e0 fs: change vhostuser and virtio into const
e14e98bbe cpu_topo: add handle_cpu_topology function
5d3b53ee7 downtime: add downtime support
6a1fe85f1 vfio: add vfio as TODO
5ea35ddcd refractor: remove redundant by_id
b646d7cb3 config: remove ht_enabled
cb54ac6c6 memory: remove reserve_memory_bytes
bde6609b9 hotplug: add room for other hotplug solution
d88b1bf01 dragonball: update vsock dependency
dd003ebe0 Dragonball: change error name and fix compile error
38957fe00 UT: fix compile error in unit tests
11b3f9514 dragonball: add virtio-fs device support
948381bdb dragonball: add virtio-net device support
3d20387a2 dragonball: add virtio-blk device support
87d38ae49 Doc: add document for Dragonball API
2bb1eeaec docs: further questions related to upcall
026aaeecc docs: add FAQ to the report
fffcb8165 docs: update the content of the report
42ea854eb docs: kata 3.0 Architecture
efdb92366 build: Fix clh source build as normal user
0e40ecf38 tools/snap: simplify nproc
f59939a31 runk: Support `exec` sub-command
4d89476c9 runtime: Fix DisableSelinux config
090de2dae dragonball: fix the clippy errors.
a1593322b dragonball: add vsock api to api server
89b9ba860 dragonball: add set_vm_configuration api
95fa0c70c dragonball: add start microvm support
5c1ccc376 dragonball: add Vmm struct
4d234f574 dragonball: refactor code layout
cfd5dae47 dragonball: add vm struct
527b73a8e dragonball: remove unused feature in AddressSpaceMgr
3bafafec5 action: extend commit message line limit to 150 bytes
5010c643c release: Revert kata-deploy changes after 2.5.0-rc0 release
7120afe4e dragonball: add vcpu test function for aarch64
648d285a2 dragonball: add vcpu support for aarch64
7dad7c89f dragonball: update dbs-xxx dependency
07231b2f3 runtime-rs:refactor network model with netlink
c8a905206 build: format files
242992e3d build: put install methods in utils.mk
8a697268d build: makefile for dragonball config
9c526292e runtime-rs:refactor network model with netlink
71db2dd5b hotplug: add room for future acpi hotplug mechanism
8bb00a3dc dragonball: fix a bug when generating kernel boot args
2aedd4d12 doc: add document for vCPU, api and device
bec22ad01 dragonball: add api module
07f44c3e0 dragonball: add vcpu manager
78c971875 dragonball: add upcall support
7d1953b52 dragonball: add vcpu
468c73b3c dragonball: add kvm context
e89e6507a dragonball: add signal handler
b6cb2c4ae dragonball: add metrics system
e80e0c464 dragonball: add io manager wrapper
d5ee3fc85 safe-path: fix clippy warning
93c10dfd8 runtime-rs: add crosvm license in Dragonball
dfe6de771 dragonball: add dragonball into kata README
39ff85d61 dragonball: green ci
71f24d827 dragonball: add Makefile.
a1df6d096 Doc: Update Dragonball Readme and add document for device
8619f2b3d dragonball: add virtio vsock device manager.
52d42af63 dragonball: add device manager.
c1c1e5152 dragonball: add kernel config.
6850ef99a dragonball: add configuration manager.
0bcb422fc dragonball: add legacy devices manager
3c45c0715 dragonball: add console manager.
3d38bb300 dragonball: add address space manager.
aff604055 dragonball: add resource manager support.
8835db6b0 dragonball: initial commit
9cb15ab4c agent: add the FSGroup support
ff7874bc2 protobuf: upgrade the protobuf version to 2.27.0
06f398a34 runtime-rs: use withContext to evaluate lazily
fd4c26f9c runtime-rs: support network resource
4be7185aa runtime-rs: runtime part implement
10343b1f3 runtime-rs: enhance runtimes
9887272db libs: enhance kata-sys-util and kata-types
3ff0db05a runtime-rs: support rootfs volume for resource
234d7bca0 runtime-rs: support cgroup resource
75e282b4c runtime-rs: hypervisor base define
bdfee005f runtime-rs: service and runtime framework
4296e3069 runtime-rs: agent implements
d3da156ee runtime-rs: uint FsType for s390x
e705ee07c runtime-rs: update containerd-shim-protos to 0.2.0
8c0a60e19 runtime-rs: modify the review suggestion
278f843f9 runtime-rs: shim implements for runtime-rs
641b73610 libs: enhance kata-sys-util
69ba1ae9e trans: fix the issue of wrong swapness type
d2a9bc667 agent: agent-protocol support async
aee9633ce libs/sys-util: provide functions to execute hooks
8509de0ae libs/sys-util: add function to detect and update K8s emptyDir volume
6d59e8e19 libs/sys-util: introduce function to get device id
5300ea23a libs/sys-util: implement reflink_copy()
1d5c898d7 libs/sys-util: add utilities to parse NUMA information
87887026f libs/sys-util: add utilities to manipulate cgroup
ccd03e2ca libs/sys-util: add wrappers for mount and fs
45a00b4f0 libs/sys-util: add kata-sys-util crate under src/libs
48c201a1a libs/types: make the variable name easier to understand
b9b6d70aa libs/types: modify implementation details
05ad026fc libs/types: fix implementation details
d96716b4d libs/types:fix styles and implementation details
6cffd943b libs/types:return Result to handle parse error
6ae87d9d6 libs/types: use contains to make code more readable
45e5780e7 libs/types: fixed spelling and grammer error
2599a06a5 libs/types:use include_str! in test file
8ffff40af libs/types:Option type to handle empty tomlconfig
626828696 libs/types: add license for test-config.rs
97d8c6c0f docs: modify move-issues-to-in-progress.yaml
8cdd70f6c libs/types: change method to update config by annotation
e19d04719 libs/types: implement KataConfig to wrap TomlConfig
387ffa914 libs/types: support load Kata agent configuration from file
69f10afb7 libs/types: support load Kata hypervisor configuration from file
21cc02d72 libs/types: support load Kata runtime configuration from file
5b89c1df2 libs/types: add kata-types crate under src/libs
4f62a7618 libs/logging: fix clippy warnings
6f8acb94c libs: refine Makefile rules
7cdee4980 libs/logging: introduce a wrapper writer for logging
426f38de9 libs/logging: implement rotator for log files
392f1ecdf libs: convert to a cargo workspace
575df4dc4 static-checks: Allow Merge commit to be >75 chars

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-08-15 07:23:13 +00:00
Ji-Xinyou
ff7c78e0e8 runtime-rs: static resource mgmt default to false
Static resource management should default to false. If it defaults to
true, later sandbox update operations, e.g. resize, will not work.

Fixes: #4742
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-08-15 14:42:38 +08:00
Ji-Xinyou
00f3a6de12 runtime-rs: make static resource mgmt idiomatic
Make the get value process (cpu and mem) more idiomatic.

Fixes: #4742
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-08-15 11:18:35 +08:00
Zhongtao Hu
4d7f3edbaf runtime-rs: support the functionality of cleanup
Cleanup sandbox resource

Fixes: #4891
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-13 15:56:38 +08:00
Zhongtao Hu
5aa83754e5 runtime-rs: support save to persist file and restore
Support the functionality of save and restore for sandbox state

Fixes: #4891
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-13 15:44:13 +08:00
Chelsea Mafrica
fcc1e0c617 runtime: tracing: End root span at end of trace
The root span should exist for the duration of the trace. Defer ending the
span until the end of the trace instead of the end of the function. Add the
span to the service struct to do so.

Fixes #4902

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-08-12 13:15:39 -07:00
GabyCT
97b7fe438a Merge pull request #4898 from openanolis/fixdoc
runtime-rs: fix design doc's typo
2022-08-12 10:06:44 -05:00
Bin Liu
2cd964ca79 Merge pull request #4881 from openanolis/runtime-rs-curl
docs: use curl as default downloader for runtime-rs
2022-08-12 19:46:39 +08:00
Bin Liu
6a8e8dfc8e Merge pull request #4876 from liubin/fix/4875-update-Cargo-lock
runtime-rs: update Cargo.lock
2022-08-12 19:41:02 +08:00
Ji-Xinyou
caada34f1d runtime-rs: fix design doc's typo
Fix the typo in docs/design/architecture_3.0, in both the source and the png.

Fixes: #4883
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-08-12 17:38:13 +08:00
Bin Liu
bfa86246f8 Merge pull request #4872 from liubin/fix/4871-github-actions-fix
Fix some GitHub actions workflow issues
2022-08-11 19:26:15 +08:00
Zhongtao Hu
c280d6965b runtime-rs: delete route model
The route model is used for a specific internal scenario and is not a
general requirement, so delete it.

Fixes:#4838
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-11 15:56:43 +08:00
Zhongtao Hu
b61dda40b7 docs: use curl as default downloader for runtime-rs
use curl as default downloader for runtime-rs

Fixes: #4879
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-11 15:52:13 +08:00
Fabiano Fidêncio
881c87a25c Merge pull request #4859 from GabyCT/topic/updatelibse
versions: Update libseccomp version
2022-08-11 09:34:44 +02:00
Bin Liu
ca9d16e5ea runtime-rs: update Cargo.lock
Update Cargo.lock

Fixes: #4875

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-11 10:34:36 +08:00
Ji-Xinyou
4a54876dde runtime-rs: support static resource management functionality
Supports functionalities of static resource management, enabled by
default.

Fixes: #4742
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-08-11 09:46:44 +08:00
Bin Liu
99a7b4f3e1 workflow: Revert "static-checks: Allow Merge commit to be >75 chars"
This reverts commit 575df4dc4d.

Fixes: #4871

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-11 08:59:02 +08:00
Bin Liu
d14e80e9fd workflow: Revert "docs: modify move-issues-to-in-progress.yaml"
This reverts commit 97d8c6c0fa.

Fixes: #4871

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-11 08:58:43 +08:00
Bin Liu
cb7f9524be Merge pull request #4804 from openanolis/anolis/merge_runtime_rs_to_main
runtime-rs:merge runtime rs to main
2022-08-11 08:40:41 +08:00
Tim Zhang
4813a3cef9 Merge pull request #4711 from liubin/fix/4710-wait-nydusd-api-server-ready
nydus: wait nydusd API server ready before mounting share fs
2022-08-10 17:20:17 +08:00
Gabriela Cervantes
1f4b6e6460 versions: Update libseccomp version
This PR updates the libseccomp version in versions.yaml that is
being used in the kata CI.

Fixes #4858

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-08-09 14:27:59 +00:00
GabyCT
4d07c86cf1 Merge pull request #4846 from fidencio/topic/update-td-shim-due-to-build-breakage
versions: Update TD-shim due to build breakage
2022-08-08 11:50:49 -05:00
Fabiano Fidêncio
b0fa44165e Merge pull request #4844 from fidencio/topic/agent-ctl-add-an-empty-workspace
agent-ctl: Add an empty [workspace]
2022-08-08 17:08:43 +02:00
Fabiano Fidêncio
a8176d0218 Merge pull request #4842 from fidencio/topic/packaging-create-no_patches.txt-for-the-SPR-BKC-PC-v9.6.x-kernel
packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x
2022-08-08 17:05:26 +02:00
Fabiano Fidêncio
8a4e690089 versions: Update TD-shim due to build breakage
"We need a newer nightly 1.62 rust to deal with the change
rust-lang/libc@576f778 on crate libc which breaks the compilation."

This comes from a pull request raised on the TD-shim repo,
https://github.com/confidential-containers/td-shim/pull/354, which fixes
the issues with the commit being used with Kata Containers.

Let's bump to a newer commit of TD-shim and to a newer version of the
nightly toolchain as part of our versions file.

Fixes: #4840

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-08 15:53:57 +02:00
Fabiano Fidêncio
8854b4de2c Merge pull request #4836 from cmaf/sgx-update-docs-2
docs: Improve SGX documentation
2022-08-08 12:15:04 +02:00
Fabiano Fidêncio
065305f4a1 agent-ctl: Add an empty [workspace]
"An empty [workspace] can be used with a package to conveniently create a
workspace with the package and all of its path dependencies", according
to the https://doc.rust-lang.org/cargo/reference/workspaces.html

This also matches the suggestion provided by Cargo itself,
due to the errors faced with the Cloud Hypervisor CI:
```
10:46:23 this may be fixable by adding `go/src/github.com/kata-containers/kata-containers/src/tools/agent-ctl` to the `workspace.members` array of the manifest located at: /tmp/jenkins/workspace/kata-containers-2-clh-PR/Cargo.toml
10:46:23 Alternatively, to keep it out of the workspace, add the package to the `workspace.exclude` array, or add an empty `[workspace]` table to the package's manifest.
```

Fixes: #4843

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-08 11:24:39 +02:00
Fabiano Fidêncio
1444d7ce42 packaging: Create no_patches.txt for the SPR-BKC-PC-v9.6.x
The file was added as part of the commit that tested these changes in the
CCv0 branch, but forgotten when re-writing it to the `main` branch.

Fixes: #4841

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-08 11:00:23 +02:00
liubin
2ae807fd29 nydus: wait nydusd API server ready before mounting share fs
If the API server is not ready, the mount call will fail, so before
mounting the share fs, we should wait until nydusd is started and
the API server is ready.

Fixes: #4710

Signed-off-by: liubin <liubin0329@gmail.com>
Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-08 16:18:38 +08:00
Tim Zhang
8d4d98587f Merge pull request #4746 from liubin/fix/4745-add-log-field
runtime: explicitly mark the source of the log is from qemu.log
2022-08-08 15:21:01 +08:00
Bin Liu
9516286f6d Merge pull request #4829 from LetFu/fix/addUnlock
runtime: add unlock before return in sendReq
2022-08-08 14:42:44 +08:00
Archana Shinde
c1e3b8f40f govmm: Refactor qmp functions for adding block device
Instead of passing a bunch of arguments to qmp functions for
adding block devices, use govmm BlockDevice structure to reduce these.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-05 13:16:34 -07:00
Archana Shinde
598884f374 govmm: Refactor code to get rid of redundant code
Get rid of redundant return values from function.
args and blockdevArgs used to return different values to maintain
compatibility between qemu versions. These are exactly the same now.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-05 13:16:34 -07:00
Archana Shinde
00860a7e43 qmp: Pass aio backend while adding block device
Allow govmm to pass aio backend while adding block device.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-05 13:16:34 -07:00
Archana Shinde
e1b49d7586 config: Add block aio as a supported annotation
Allow Block AIO to be passed as a per pod annotation.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-05 13:16:34 -07:00
Archana Shinde
ed0f1d0b32 config: Add "block_device_aio" as a config option for qemu
This configuration will allow users to choose between different
I/O backends for qemu, with the default being io_uring.
This will allow users to fall back to a different I/O mechanism while
running on kernels older than 5.1.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-05 13:16:34 -07:00
Archana Shinde
83a919a5ea Merge pull request #4795 from liubin/fix/4794-update-limitation
docs: add back host network limitation
2022-08-05 23:00:47 +05:30
Chelsea Mafrica
c8d4ea84e3 docs: Improve SGX documentation
Remove line about annotations support in CRI-O and containerd since it
has been supported for a couple years.

Fixes #4819

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-08-05 09:57:50 -07:00
Fabiano Fidêncio
e2968b177d Merge pull request #4763 from cyyzero/runk-ps
runk: add ps sub-command
2022-08-05 16:28:38 +02:00
chmod100
d8ad16a34e runtime: add unlock before return in sendReq
An unlock is required before returning, so add the missing unlock.

Fixes: #4827

Signed-off-by: chmod100 <letfu@outlook.com>
2022-08-05 13:30:12 +00:00
Peng Tao
b828190158 Merge pull request #4823 from openanolis/runtime-rs-merge-main-runtime-rs
Depends-on:github.com/kata-containers/tests#4986
Runtime-rs:merge main runtime rs
2022-08-05 14:42:22 +08:00
Peng Tao
f791169efc Merge pull request #4826 from openanolis/runtime-rs-version
runtime-rs:update rtnetlink version
2022-08-05 14:28:46 +08:00
Zhongtao Hu
8bbffc42cf runtime-rs:update rtnetlink version
update rtnetlink version for runtime-rs

Fixes:#4824
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-05 11:18:09 +08:00
Zhongtao Hu
e403838131 runtim-rs: Merge remote-tracking branch 'origin/main' into runtime-rs
To keep runtime-rs up to date, we will merge main into runtime-rs every
week.

Fixes:kata-containers#4822
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-05 10:49:33 +08:00
Bin Liu
931251105b Merge pull request #4817 from openanolis/runtime-rs-s390x-fail
runtime-rs:skip the build process when the arch is s390x
2022-08-05 08:23:13 +08:00
Salvador Fuentes
587c0c5e55 Merge pull request #4820 from cmaf/sgx-update-docs-1
docs: Improve SGX documentation
2022-08-04 15:59:33 -05:00
Chelsea Mafrica
c5452faec6 docs: Improve SGX documentation
Update documentation with details regarding
intel-device-plugins-for-kubernetes setup and dependencies.

Fixes #4819

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-08-04 12:49:01 -07:00
GabyCT
2764bd7522 Merge pull request #4770 from justxuewei/refactor/agent/netlink-neighbor
agent: Use rtnetlink's neighbours API to add neighbors
2022-08-04 12:09:30 -05:00
Zhongtao Hu
389ae97020 runtime-rs:skip the test when the arch is s390x
github.com/kata-containers/tests#4986. To avoid returning an error when
running the CI, we just skip the test if the arch is s390x.

Fixes: #4816
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-04 21:13:50 +08:00
Zhongtao Hu
945e02227c runtime-rs:skip the build process when the arch is s390x
github.com/kata-containers/tests#4986. To avoid returning an error when running the CI, we just skip the build
process if the arch is s390x.

Fixes: #4816
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-04 21:13:40 +08:00
Archana Shinde
b6cd2348f5 govmm: Add io_uring as AIO type
io_uring was introduced as a new kernel IO interface in kernel 5.1.
It is designed for higher performance than the older Linux AIO API.
This feature was added in qemu 5.0.

Fixes #4645

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-03 10:43:12 -07:00
Archana Shinde
81cdaf0771 govmm: Correct documentation for Linux aio.
The comments for "native" aio are incorrect. Correct these.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-08-03 10:41:50 -07:00
Fabiano Fidêncio
578121124e Merge pull request #4805 from fidencio/topic/bump-tdx-dependencies
Bump TDX dependencies (QEMU and Kernel)
2022-08-03 19:31:26 +02:00
Fabiano Fidêncio
869e408516 Merge pull request #4810 from fidencio/topic/adjust-final-tarball-location-for-tdvf-and-td-shim
OVMF / td-shim: Adjust final tarball location
2022-08-03 16:55:14 +02:00
Fabiano Fidêncio
8d1cb1d513 td-shim: Adjust final tarball location
Let's create the td-shim tarball in the directory where the script was
called from, instead of doing it in the $DESTDIR.

This aligns with the logic being used for creating / extracting the
tarball content, which is already in use by the kata-deploy local build
scripts.

Fixes: #4809

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-03 14:58:44 +02:00
Fabiano Fidêncio
62f05d4b48 ovmf: Adjust final tarball location
Let's create the OVMF tarball in the directory where the script was
called from, instead of doing it in the $DESTDIR.

This aligns with the logic being used for creating / extracting the
tarball content, which is already in use by the kata-deploy local build
scripts.

Fixes: #4808

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-03 14:58:29 +02:00
Fabiano Fidêncio
9972487f6e versions: Bump Kernel TDX version
The latest kernel with TDX support should be pulled from a different
repo (https://github.com/intel/linux-kernel-dcp, instead of
https://github.com/intel/tdx), and the latest version to be used is
SPR-BKC-PC-v9.6.

With the new version being used, let's make sure we enable the
INTEL_TDX_ATTESTATION config option, and all the dependencies needed to
do so.

Fixes: #4803

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-03 12:00:49 +02:00
Fabiano Fidêncio
c9358155a2 kernel: Sort the TDX configs alphabetically
Let's just re-order the TDX configs alphabetically. No new config has
been added or removed, thus no need to bump the kernel version.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-03 11:57:02 +02:00
Fabiano Fidêncio
dd397ff1bf versions: Bump QEMU TDX version
Let's use the latest tag provided in the
"https://github.com/intel/qemu-dcp" repo, "SPR-BKC-QEMU-v2.5".

Fixes: #4802

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-03 11:00:36 +02:00
Ji-Xinyou
a355812e05 runtime-rs: fixed bug on core-sched error handling
The kernel code returns -errno, so the error check should test for negative values.
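
As an illustration only (not the code from this commit), the convention of
returning -errno can be checked as sketched below. The helper name
`check_kernel_ret` is hypothetical.

```rust
use std::io;

/// Hypothetical helper: interpret a raw return value that follows the
/// "non-negative on success, -errno on failure" convention.
fn check_kernel_ret(ret: i64) -> io::Result<i64> {
    if ret < 0 {
        // A negative value encodes the error number as -errno.
        Err(io::Error::from_raw_os_error((-ret) as i32))
    } else {
        Ok(ret)
    }
}
```

With this sketch, a return value of -22 maps to an `io::Error` whose raw OS
error is 22, while 0 and positive values pass through unchanged.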

Fixes: #4429
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-08-03 15:26:48 +08:00
Bin Liu
8b0e1859cb Merge pull request #4784 from openanolis/fix-protocol-ci-err
libs: fix CI error for protocols
2022-08-03 11:03:02 +08:00
Bin Liu
b337390c28 Merge pull request #4791 from openanolis/runtime-rs-merge-main-1
runtime-rs: merge main to runtime-rs
2022-08-03 11:00:54 +08:00
Chelsea Mafrica
873e75b915 Merge pull request #4773 from fidencio/topic/build-tdvf
packaging: Add support for building TDVF
2022-08-02 09:14:13 -07:00
Chen Yiyang
230a229052 runk: add ps sub-command
The ps command supports two formats, `json` and `table`. The `json` format just
outputs the pids in the container. The `table` format uses the `ps` utility on
the host to search for and output all processes in the container. Add a struct
`container` to represent a spawned container. Move the `kill`
implementation from kill.rs to a method of `container`.

Fixes: #4361

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-08-02 20:45:50 +08:00
Ji-Xinyou
591dfa4fe6 runtime-rs: add support for core scheduling
Linux 5.14 supports core scheduling to have better security control
for SMT siblings. This PR supports that.

Fixes: #4429
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-08-02 17:54:04 +08:00
Bin Liu
889557ecb1 docs: add back host network limitation
Kata Containers doesn't support the host network namespace, which is
a common issue for new users. The limitation was deleted from the
docs; this commit adds it back.

Also, Docker now supports running containers using
Kata Containers, so delete Docker from the unsupported list.

This commit reverts parts of #3710

Fixes: #4794

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-08-02 15:58:16 +08:00
Fabiano Fidêncio
c9b5bde30b versions: Track and build TDVF
TDVF is the firmware used by QEMU to start TDX capable VMs.  Let's start
tracking it as it'll become part of the Confidential Containers sooner
or later.

TDVF lives in the public https://github.com/tianocore/edk2-staging repo
and we use, as its version, the tags that are consumed internally at
Intel.

Fixes: #4624

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-02 09:51:47 +02:00
Fabiano Fidêncio
e6a5a5106d packaging: Generate a tarball as OVMF build result
Instead of having as a result the directory where OVMF artefacts were
installed, let's follow what we do with the other components and have a
tarball as a result of the OVMF build.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-02 09:48:59 +02:00
Fabiano Fidêncio
42eaf19b43 packaging: Simplify OVMF repo clone
Instead of cloning the repo, and then switching to a specific branch,
let's take advantage of `--branch` and directly clone the specific
branch / tag.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-02 09:48:59 +02:00
Fabiano Fidêncio
4d33b0541d packaging: Don't hardcode "edk2" as the cloned repo's dir.
As TDVF comes from a different repo, the edk2-staging one, we cannot
simply hardcode the name.  Instead, let's get the name of the directory
from the name of the git repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-02 09:48:59 +02:00
Zhongtao Hu
7247575fa2 runtime-rs:fix cargo clippy
fix cargo clippy

Fixes: #4791
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-02 13:17:37 +08:00
Zhongtao Hu
9803393f2f runtime-rs: Merge branch 'main' into runtime-rs-merge-main-1
To keep runtime-rs up to date, we will merge main into runtime-rs every
week.

Fixes: #4790
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-02 10:53:01 +08:00
Fabiano Fidêncio
7503bdab6e Merge pull request #4783 from fidencio/topic/build-td-shim
versions: Track and add support for building TD-shim
2022-08-01 20:50:58 +02:00
Fabiano Fidêncio
b06bc82284 versions: Track and add support for building TD-shim
TD-shim is a simplified TDX virtual firmware, used by Cloud Hypervisor,
in order to create a TDX capable VM.

TD-shim is heavily under development, and is hosted as part of the
Confidential Containers project:
https://github.com/confidential-containers/td-shim

The version chosen for this commit, is a version that's being tested
inside Intel, but we, most likely, will need to change it before we have
it officially packaged as part of an official release.

Fixes: #4779

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-08-01 16:36:12 +02:00
Bin Liu
8d9135a7ce Merge pull request #4765 from ryansavino/ccv0-rust-upgrade
versions: Upgrade rust version
2022-08-01 17:15:05 +08:00
Quanwei Zhou
86ac653ba7 libs: fix CI error for protocols
Fix CI error for protocols.

Fixes: #4781
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-08-01 16:26:52 +08:00
Xuewei Niu
81fe51ab0b agent: fix unittests for arp neighbors
Set an ARP address explicitly before netlink::test_add_one_arp_neighbor() running.

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-08-01 16:19:25 +08:00
Xuewei Niu
845c1c03cf agent: use rtnetlink's neighbours API to add neighbors
Bump rtnetlink version from 0.8.0 to 0.11.0. Use rtnetlink's API to
add neighbors and fix issues to adapt to the new version of rtnetlink.

Fixes: #4607

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-08-01 13:44:07 +08:00
Bin Liu
993ae24080 Merge pull request #4777 from openanolis/runtime-rs-merge
Merge Main into runtime-rs branch
2022-08-01 13:04:31 +08:00
Zhongtao Hu
adfad44efe Merge remote-tracking branch 'origin/main' into runtime-rs-merge-tmp
To keep runtime-rs up to date, we will merge main into runtime-rs every
week.

Fixes:#4776
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-08-01 11:12:48 +08:00
Ryan Savino
9b1940e93e versions: update rust version
Fixes #4764

versions: update rust version to fix ccv0 attestation-agent build error
static-checks: kata tools, libs, and agent fixes

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2022-07-29 18:41:43 -05:00
Peng Tao
0aefab4d80 Merge pull request #4739 from liubin/fix/4738-trace-rpc-calls
agent: log RPC calls for debugging
2022-07-29 14:18:23 +08:00
Peng Tao
5457deb034 Merge pull request #4741 from openanolis/fix-stop-failed-in-azure
runtime-rs: fix stop failed in azure
2022-07-29 11:41:16 +08:00
Fabiano Fidêncio
54147db921 Merge pull request #4170 from Alex-Carter01/build-amdsev-ovmf
Add support AmdSev build of OVMF
2022-07-28 19:42:50 +02:00
Alex Carter
638c2c4164 static-build: Add AmdSev option for OVMF builder
Introduces new build of firmware needed for SEV

Fixes: kata-containers#4169

Signed-off-by: Alex Carter <alex.carter@ibm.com>
2022-07-28 09:56:06 -05:00
Alex Carter
f0b58e38d2 static-build: Add build script for OVMF
Introduces a build script for OVMF. Defaults to X86_64 build (x64 in OVMF)

Fixes: #4169

Signed-off-by: Alex Carter <alex.carter@ibm.com>
2022-07-28 09:07:49 -05:00
Quanwei Zhou
fa0b11fc52 runtime-rs: fix stdin hang in azure
Fix stdin hang in azure.

Fixes: #4740
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-28 16:16:37 +08:00
Bin Liu
a67402cc1f Merge pull request #4397 from yaoyinnan/3073/ftr/host-cgroupv2
runtime: Support for host cgroupv2
2022-07-28 14:30:03 +08:00
Tim Zhang
229ff29c0f Merge pull request #4758 from GabyCT/topic/updaterunc
versions: Update runc version
2022-07-28 14:12:58 +08:00
yaoyinnan
5c3155f7e2 runtime: Support for host cgroup v2
Support cgroup v2 on the host. Update vendor containerd/cgroups to add cgroup v2.

Fixes: #3073

Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2022-07-28 10:30:45 +08:00
yaoyinnan
4ab45e5c93 docs: Update support for host cgroupv2
Currently cgroup v2 is supported. Remove the note that host cgroup v2 is not supported.

Fixes: #3073

Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2022-07-28 10:30:44 +08:00
GabyCT
9dfd949f23 Merge pull request #4646 from amshinde/add-liburing-qemu
qemu: Add liburing to qemu build
2022-07-27 15:47:49 -05:00
Gabriela Cervantes
326eb2f910 versions: Update runc version
This PR updates the runc version to v1.1.0.

Fixes #4757

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-07-27 16:19:11 +00:00
Bin Liu
50b0b7cc15 Merge pull request #4681 from Tim-0731-Hzt/runtime-rs-sharepid
runtime-rs: fix set share sandbox pid namespace
2022-07-27 21:43:58 +08:00
Bin Liu
557229c39d Merge pull request #4724 from yahaa/fix-docs
Docs: fix tables format error
2022-07-27 21:13:29 +08:00
Bin Liu
09672eb2da agent: do some rollback works if case of do_create_container failed
In some cases do_create_container may return an error, mostly due to the
`container.start(process)` call. This commit does some rollback
work if this function fails.

Fixes: #4749

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-07-27 10:23:46 +08:00
Archana Shinde
1b01ea53d9 Merge pull request #4735 from nubificus/feature-fc-v1.1
versions: Update Firecracker version to v1.1.0
2022-07-27 04:50:32 +05:30
Peng Tao
27c82018d1 Merge pull request #4753 from Tim-Zhang/agent-fix-stream-fd-double-close
agent: Fix stream fd's double close
2022-07-27 00:54:07 +08:00
Bin Liu
6fddf031df Merge pull request #4664 from lifupan/main
container: kill all of the processes in a container when it terminated
2022-07-26 23:12:11 +08:00
Tim Zhang
f5aa6ae467 agent: Fix stream fd's double close problem
The fd is closed when the PipeStream is dropped, so we should
not close it again.

Fixes: #4752

Signed-off-by: Tim Zhang <tim@hyper.sh>
2022-07-26 20:05:06 +08:00
yahaa
6e149b43f7 Docs: fix tables format error
Fixes: #4725

Signed-off-by: yahaa <1477765176@qq.com>
2022-07-26 19:05:09 +08:00
Bin Liu
85f4e7caf6 runtime: explicitly mark the source of the log is from qemu.log
In qemu.StopVM(), if debug is enabled, the shim will dump logs
from qemu.log, but users don't know which logs are from qemu.log
and which are from the shim itself. Adding some additional messages
will help users distinguish these logs.

Fixes: #4745

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-07-26 16:08:59 +08:00
Peng Tao
129335714b Merge pull request #4727 from openanolis/anolis-fix-network
fix network failed for kata ci
2022-07-26 15:10:55 +08:00
Peng Tao
71384b60f3 Merge pull request #4713 from openanolis/adjust_default_vcpu
runtime-rs: handle default_vcpus greator than default_maxvcpu
2022-07-26 15:02:34 +08:00
gntouts
56d49b5073 versions: Update Firecracker version to v1.1.0
This patch upgrades Firecracker version from v0.23.4 to v1.1.0

* Generate swagger models for v1.1.0 (from firecracker.yaml)
* Replace ht_enabled param to smt (API change)
* Remove NUMA-related jailer param --node 0

Fixes: #4673
Depends-on: github.com/kata-containers/tests#4968

Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2022-07-26 07:01:26 +00:00
Zhongtao Hu
b3147411e3 runtime-rs:add unit test for set share pid ns
Fixes:#4680
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-26 14:42:00 +08:00
Zhongtao Hu
1ef3f8eac6 runtime-rs: set share sandbox pid namespace
Set the share sandbox pid namespace from spec

Fixes:#4680
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-26 14:41:59 +08:00
Quanwei Zhou
57c556a801 runtime-rs: fix stop failed in azure
Fix the stop failure in Azure.

Fixes: #4740
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-26 12:16:32 +08:00
liubin
0e24f47a43 agent: log RPC calls for debugging
We can log all RPC calls to the agent for debugging purposes
to check which RPC is called, which can help us to understand
the container lifespan.

Fixes: #4738

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-26 10:32:44 +08:00
Tim Zhang
e764a726ab Merge pull request #4715 from Tim-Zhang/fix-ut-test_do_write_stream
agent: fix fd-double-close problem in ut test_do_write_stream
2022-07-25 17:34:26 +08:00
Peng Tao
3f4dd92c2d Merge pull request #4702 from openanolis/runtime-rs-endpoint-dev
runtime-rs: add functionalities support for macvlan and vlan endpoints
2022-07-25 17:04:45 +08:00
Peng Tao
a3127a03f3 Merge pull request #4721 from openanolis/install-guide-2
Docs: add rust environment setup for kata 3.0
2022-07-25 16:50:20 +08:00
Tim Zhang
427b29454a Merge pull request #4709 from liubin/fix/4708-unwrap-error
rustjail: check result to let it return early
2022-07-25 15:05:20 +08:00
Tim Zhang
0337377838 Merge pull request #4695 from liubin/4694/upgrade-nydus-version
upgrade nydus version
2022-07-25 15:05:04 +08:00
Quanwei Zhou
c825065b27 runtime-rs: fix tc filter setup failed
Fix a bug in the tc filter setup: the protocol needs to be in network byte order.
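
As a rough sketch of the byte-order issue (not the code from this commit), a
16-bit EtherType can be converted to network order with `u16::to_be()`. The
helper name `protocol_to_network_order` is hypothetical; `ETH_P_ALL` mirrors
the value defined in `<linux/if_ether.h>`.

```rust
/// EtherType that matches every protocol (ETH_P_ALL in <linux/if_ether.h>).
const ETH_P_ALL: u16 = 0x0003;

/// Hypothetical helper: convert a host-order protocol value into
/// network (big-endian) byte order.
fn protocol_to_network_order(proto: u16) -> u16 {
    // On little-endian hosts this swaps the two bytes;
    // on big-endian hosts it is a no-op.
    proto.to_be()
}
```

Whatever the host endianness, the in-memory bytes of the result are
`[0x00, 0x03]`, which is what a consumer expecting network order reads.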

Fixes: #4726
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-25 11:16:33 +08:00
Quanwei Zhou
e0194dcb5e runtime-rs: update route destination with prefix
Update route destination with prefix.

Fixes: #4726
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-25 11:16:22 +08:00
Bin Liu
534a4920b1 Merge pull request #4692 from openanolis/support_disable_guest_seccomp
support disable_guest_seccomp
2022-07-25 11:08:41 +08:00
Zhongtao Hu
fa85fd584e docs: add rust environment setup for kata 3.0
add more details for rust set up in kata 3.0 install guide

Fixes: #4720
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-25 09:48:18 +08:00
Wainer Moschetta
0b4a91ec1a Merge pull request #4644 from bookinabox/optimize-get-paths
cgroups: remove unnecessary get_paths()
2022-07-22 17:01:01 -03:00
Ji-Xinyou
896478c92b runtime-rs: add functionalities support for macvlan and vlan endpoints
Add macvlan and vlan support to runtime-rs code and corresponding unit
tests.

Fixes: #4701
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-07-22 10:09:11 +08:00
GabyCT
68c265587c Merge pull request #4718 from GabyCT/topic/updatefirecrackerversion
versions: Update firecracker version
2022-07-21 14:26:57 -05:00
Gabriela Cervantes
df79c8fe1d versions: Update firecracker version
This PR updates the firecracker version that is being
used in kata CI.

Fixes #4717

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-07-21 16:10:29 +00:00
Tim Zhang
912641509e agent: fix fd-double-close problem in ut test_do_write_stream
The fd will be closed when struct Process is dropped, so don't
close it again manually.

Fixes: #4598

Signed-off-by: Tim Zhang <tim@hyper.sh>
2022-07-21 19:37:15 +08:00
Zhongtao Hu
43045be8d1 runtime-rs: handle default_vcpus greator than default_maxvcpu
When default_vcpus is greater than default_maxvcpus, the default
vcpu number should be set equal to default_maxvcpus.
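
A minimal sketch of that rule, assuming both values are plain unsigned
integers (the function name `effective_default_vcpus` is hypothetical and
not taken from runtime-rs):

```rust
/// Hypothetical helper: cap the default vCPU count at the configured maximum.
fn effective_default_vcpus(default_vcpus: u32, default_maxvcpus: u32) -> u32 {
    // If the default exceeds the maximum, fall back to the maximum.
    default_vcpus.min(default_maxvcpus)
}
```

For example, a default of 8 with a maximum of 4 yields 4, while a default of
2 with the same maximum is left at 2.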

Fixes: #4712
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-21 16:37:56 +08:00
liubin
0d7cb7eb16 agent: delete agent-type property in announce
Since there is only one type of agent now, the
agent-type is not needed anymore.

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-21 14:53:01 +08:00
liubin
eec9ac81ef rustjail: check result to let it return early.
check the result to let it return early if there are some errors

Fixes: #4708

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-21 14:51:30 +08:00
liubin
402bfa0ce3 nydus: upgrade nydus/nydus-snapshotter version
Upgrade nydus/nydus-snapshotter to the latest version.

Fixes: #4694

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-21 14:39:14 +08:00
Quanwei Zhou
54f53d57ef runtime-rs: support disable_guest_seccomp
support disable_guest_seccomp

Fixes: #4691
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-21 07:46:28 +08:00
Peng Tao
6d56cdb9ac Merge pull request #4686 from xujunjie-cover/issue4685
kata-monitor: fix can't monitor /run/vc/sbs
2022-07-19 23:40:14 +08:00
Bin Liu
540303880e Merge pull request #4688 from quanweiZhou/fix_sandbox_cgroup_false
runtime-rs: fix sandbox_cgroup_only=false panic
2022-07-19 20:38:57 +08:00
Peng Tao
7c146a5d95 Merge pull request #4684 from quanweiZhou/fix-ctr-exit-error
runtime-rs: fix ctr exit failed
2022-07-19 16:02:20 +08:00
Peng Tao
08a6581673 Merge pull request #4662 from openanolis/runtime-rs-user-manaul
docs: add installation guide for kata 3.0
2022-07-19 15:58:55 +08:00
Zhongtao Hu
4331ef80d0 Runtime-rs: add installation guide for rust-runtime
add installation guide for rust-runtime

Fixes:#4661
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-19 13:12:13 +08:00
Peng Tao
4c3bd6b1d1 Merge pull request #4656 from openanolis/runtime-rs-ipvlan
runtime-rs: support functionalities of ipvlan endpoint
2022-07-19 11:15:31 +08:00
xujunjie-cover
72dbd1fcb4 kata-monitor: fix can't monitor /run/vc/sbs.
Need to bind the host dir /run/vc/sbs/ to kata monitor

Fixes: #4685

Signed-off-by: xujunjie-cover <xujunjielxx@163.com>
2022-07-19 09:52:54 +08:00
Bin Liu
960f2a7f70 Merge pull request #4678 from Tim-0731-Hzt/runtime-rs-makefile-2
runtime-rs: remove the value of hypervisor path in DB config
2022-07-19 09:34:45 +08:00
Quanwei Zhou
e9988f0c68 runtime-rs: fix sandbox_cgroup_only=false panic
When running with the configuration `sandbox_cgroup_only=false`, we call
`gen_overhead_path()` to get the overhead path. `cgroup-rs` appends this
path to the subsystem prefix with `PathBuf::push()`. When the pushed path
starts with "/", it replaces the existing path, for example:
```
let mut path = PathBuf::from("/tmp");
path.push("/etc");
assert_eq!(path, PathBuf::from("/etc"));
```
So we should not set the overhead path with the prefix "/".

Fixes: #4687
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-19 08:30:34 +08:00
Quanwei Zhou
cebbebbe8a runtime-rs: fix ctr exit failed
During use, there will be cases where the container is in the stopped state
and gets another stop. In this case, the second stop needs to be ignored.
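
The idea can be sketched as an idempotent stop. This is an illustration
under assumed types, not the runtime-rs implementation: `State`, `Container`
and `stop_count` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Running,
    Stopped,
}

struct Container {
    state: State,
    /// Counts how many times the real stop logic actually ran.
    stop_count: u32,
}

impl Container {
    /// Stop the container; a stop on an already stopped container is ignored.
    fn stop(&mut self) -> Result<(), String> {
        if self.state == State::Stopped {
            // Second (or later) stop: nothing to do, and not an error.
            return Ok(());
        }
        self.stop_count += 1;
        self.state = State::Stopped;
        Ok(())
    }
}
```

Calling `stop()` twice on a running container succeeds both times, but the
stop logic runs only once.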

Fixes: #4683
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-19 07:43:22 +08:00
Bin Liu
758cc47b32 Merge pull request #4671 from liubin/4670-upgrade-nix
kata-sys-util: upgrade nix version
2022-07-18 23:31:07 +08:00
Bin Liu
25be4d00fd Merge pull request #4676 from openanolis/xuejun/runtime-rs
runtime-rs: fix some bugs to make runtime-rs on aarch64
2022-07-18 23:29:32 +08:00
Ji-Xinyou
62182db645 runtime-rs: add unit test for ipvlan endpoint
Add unit test to check the integrity of IPVlanEndpoint::new(...)

Fixes: #4655
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-07-18 15:56:06 +08:00
xuejun-xj
99654ce694 runtime-rs: update dbs-xxx dependencies
Update dbs-xxx commit ID for aarch64 in runtime-rs/Cargo.toml file to add
dependencies for aarch64.

Fixes: #4676

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
2022-07-18 13:46:46 +08:00
xuejun-xj
f4c3adf596 runtime-rs: Add compile option file
Add file aarch64-options.mk for compiling on aarch64 architectures.

Fixes: #4676

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
2022-07-18 13:46:46 +08:00
xuejun-xj
545ae3f0ee runtime-rs: fix warning
Module anyhow::anyhow is only used on x86_64 architecture in
crates/hypervisor/src/device/vfio.rs file.

Fixes: #4676

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
2022-07-18 13:46:39 +08:00
Zhongtao Hu
19eca71cd9 runtime-rs: remove the value of hypervisor path in DB config
As Dragonball is a built-in VMM, the path, jailer path and ctlpath are not
needed for it. So we don't generate those values in the Makefile.

Fixes: #4677
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-18 13:37:51 +08:00
Ji-Xinyou
d8920b00cd runtime-rs: support functionalities of ipvlan endpoint
Add support for ipvlan endpoint

Fixes: #4655
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-07-18 11:34:03 +08:00
xuejun-xj
2b01e9ba40 dragonball: fix warning
Add map_err for vcpu_manager.set_reset_event_fd() function.

Fixes: #4676

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
2022-07-18 09:52:13 +08:00
liubin
996a6b80bc kata-sys-util: upgrade nix version
Newer nix versions support UMOUNT_NOFOLLOW, so upgrade the nix
version to use this flag instead of the self-defined flag.

Fixes: #4670

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-15 17:38:15 +08:00
Archana Shinde
f690b0aad0 qemu: Add liburing to qemu build
io_uring is a Linux API for asynchronous I/O, with support introduced in qemu 5.0.
It is designed to perform better than the older aio API.
We could leverage this in order to get better storage performance.

We should be adding liburing-dev to the qemu build to leverage this feature.
However, the liburing-dev package is not available in ubuntu 20.04;
it is available in 22.04.

Upgrading the ubuntu version in the dockerfile to 22.04 is causing
issues in the static qemu build related to libpmem.

So instead we are building liburing from source until those build issues
are solved.

Fixes: #4645

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-07-14 19:21:47 -07:00
Fupan Li
d93e4b939d container: kill all of the processes in this container
When a container terminates, we should make sure there are no processes
left after destroying the container.

Before this commit, kata-agent depended on the kernel's pidns
to destroy all of the processes in a container after process 1
exits in the container. This works for containers using a
separate pidns, but in the case of a shared pidns within the
sandbox, the container exit wouldn't trigger the pidns termination,
and some daemon processes could be left behind in the container, which
wasn't expected.

Fixes: #4663

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2022-07-14 16:39:49 +08:00
Bin Liu
575b5eb5f5 Merge pull request #4506 from cyyzero/runk-exec
runk: Support `exec` sub-command
2022-07-14 14:22:24 +08:00
Bin Liu
9f49f7adca Merge pull request #4493 from openanolis/runtime-rs-dev
runtime-rs: hypervisor part
2022-07-14 13:49:34 +08:00
Quanwei Zhou
3c989521b1 dragonball: update for review
update for review

Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-07-14 10:43:59 +08:00
wllenyj
274598ae56 kata-runtime: add dragonball config check support.
add dragonball config check support.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-14 10:43:50 +08:00
Chao Wu
1befbe6738 runtime-rs: Cargo lock for fix version problem
Cargo lock for fix version problem

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-14 08:49:39 +08:00
Quanwei Zhou
3d6156f6ec runtime-rs: support dragonball and runtime-binary
Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-14 08:49:30 +08:00
Zhongtao Hu
3f6123b4dd libs: update configuration and annotations
1. support annotation for runtime.name, hypervisor_name, agent_name.
2. fix parse memory from annotation

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-14 08:49:17 +08:00
Derek Lee
9ae2a45b38 cgroups: remove unnecessary get_paths()
Change get_mounts to get paths from a borrowed argument rather than
calling get_paths a second time.

Fixes #3768

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-07-13 09:17:14 -07:00
Bin Liu
0cc20f014d Merge pull request #4647 from fidencio/topic/fix-clh-crash-when-booting-up-with-no-network-device
clh: Don't crash if no network device is set by the upper layer
2022-07-13 21:28:46 +08:00
Fabiano Fidêncio
418a03a128 Merge pull request #4639 from fidencio/topic/packaging-rework-qemu-build-suffix
packaging: Rework how ${BUILD_SUFFIX} is used with the QEMU builder scripts
2022-07-13 15:03:19 +02:00
Fabiano Fidêncio
be31207f6e clh: Don't crash if no network device is set by the upper layer
`ctr` doesn't set a network device when creating the sandbox, which
leads to Cloud Hypervisor's driver crashing, see the log below:
```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x55641c23b248]
goroutine 32 [running]:
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.glob..func1(0xc000397900)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:163 +0x128
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*cloudHypervisor).vmAddNetPut(...)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:1348
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*cloudHypervisor).bootVM(0xc000397900, {0x55641c76dfc0, 0xc000454ae0})
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:1378 +0x5a2
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*cloudHypervisor).StartVM(0xc000397900, {0x55641c76dff8, 0xc00044c240},
0x55641b8016fd)
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:659 +0x7ee
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*Sandbox).startVM.func2()
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/sandbox.go:1219 +0x190
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*LinuxNetwork).Run.func1({0xc0004a8910, 0x3b})
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:319 +0x1b
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.doNetNS({0xc000048440, 0xc00044c240}, 0xc0005d5b38)
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:1045 +0x163
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*LinuxNetwork).Run(0xc000150c80, {0x55641c76dff8, 0xc00044c240}, 0xc00014e4e0)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:318 +0x105
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*Sandbox).startVM(0xc000107d40, {0x55641c76dff8, 0xc0005529f0})
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/sandbox.go:1205 +0x65f
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.createSandboxFromConfig({_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1},
{0x55641d033260, 0x0, ...}, ...}, ...)
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/api.go:91 +0x346
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.CreateSandbox({_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1},
{0x55641d033260, 0x0, ...}, ...}, ...)
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/api.go:51 +0x150
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(*VCImpl).CreateSandbox(_, {_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1},
{0x55641d033260, ...}, ...})
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/implementation.go:35 +0x74
github.com/kata-containers/kata-containers/src/runtime/pkg/katautils.CreateSandbox({_, _}, {_, _}, {{0xc0004806c0, 0x9}, 0xc000140110, 0xc00000f7a0,
{0x0, 0x0}, ...}, ...)
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/katautils/create.go:175 +0x8b6
github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.create({0x55641c76dff8, 0xc0004129f0}, 0xc00034a000, 0xc00036a000)
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/create.go:147 +0xdea
github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.(*service).Create.func2()
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/service.go:401 +0x32
created by github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.(*service).Create
        /home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/service.go:400 +0x534
```

This bug has been introduced as part of the
https://github.com/kata-containers/kata-containers/pull/4312 PR, which
changed how we add the network device.

In order to avoid the crash, let's simply check whether we have a device
to be added before iterating the list of network devices.
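
The guard described above can be sketched as follows. This is a Python
illustration of the check only, not the Go change in clh.go; the function
name `add_net_devices` and the `add_one` callback are hypothetical.

```python
def add_net_devices(net_devices, add_one):
    """Add each network device, doing nothing when none was provided.

    net_devices may be None (for example when the upper layer, such as
    `ctr`, did not set a network device) or an empty list.
    """
    # Check that there is something to add before iterating.
    if not net_devices:
        return 0
    for dev in net_devices:
        add_one(dev)
    return len(net_devices)
```

Calling `add_net_devices(None, handler)` returns 0 instead of failing.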

Fixes: #4618

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-13 10:40:21 +02:00
Peng Tao
39974fbacc Merge pull request #4642 from fidencio/topic/clh-bump-to-v25.0-release
versions: Update Cloud Hypervisor to v25.0
2022-07-13 16:08:01 +08:00
Fabiano Fidêncio
051181249c packaging: Add a "-" in the dir name if $BUILD_DIR is available
Currently $BUILD_DIR will be used to create a directory as:
/opt/kata/share/kata-qemu${BUILD_DIR}

It means that when passing a BUILD_DIR, like "foo", a name would be
built like /opt/kata/share/kata-qemufoo
We should, instead, be building it as /opt/kata/share/kata-qemu-foo.
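
A small sketch of the naming rule, written in Python for illustration only
(the packaging scripts themselves are shell, and `qemu_share_dir` is a
hypothetical helper name):

```python
def qemu_share_dir(build_dir=""):
    """Return the share directory, adding "-" only when build_dir is set."""
    base = "/opt/kata/share/kata-qemu"
    # An empty build_dir keeps the plain directory name.
    return f"{base}-{build_dir}" if build_dir else base
```

Here `qemu_share_dir("foo")` gives /opt/kata/share/kata-qemu-foo, and an
empty value leaves the name unchanged.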

Fixes: #4638

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-12 21:27:41 +02:00
Fabiano Fidêncio
dc3b6f6592 versions: Update Cloud Hypervisor to v25.0
Cloud Hypervisor v25.0 has been released on July 7th, 2022, and brings
the following changes:

**ch-remote Improvements**
The ch-remote command has gained support for creating the VM from a JSON
config and support for booting and deleting the VM from the VMM.

**VM "Coredump" Support**
Under the guest_debug feature flag it is now possible to extract the memory
of the guest for use in debugging with e.g. the crash utility.
(https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4012)

**Notable Bug Fixes**
* Always restore console mode on exit
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4249,
   https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4248)
* Restore vCPUs in numerical order which fixes aarch64 snapshot/restore
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4244)
* Don't try and configure IFF_RUNNING on TAP devices
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4279)
* Propagate configured queue size through to vhost-user backend
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4286)
* Always Program vCPU CPUID before running the vCPU to fix running on Linux
  5.16
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4156)
* Enable ACPI MADT "Online Capable" flag for hotpluggable vCPUs to fix newer
  Linux guest

**Removals**
The following functionality has been removed:

* The mergeable option from the virtio-pmem support has been removed
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3968)
* The dax option from the virtio-fs support has been removed
  (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3889)

Fixes: #4641

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-12 14:47:58 +00:00
Fabiano Fidêncio
201ff223f6 packaging: Use the $BUILD_SUFFIX when renaming the qemu binary
Instead of always naming the binary as "-experimental", let's take
advantage of the $BUILD_SUFFIX that's already passed and correctly name
the binary according to it.

Fixes: #4638

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-12 15:09:31 +02:00
Bin Liu
f3335c99ce Merge pull request #4614 from Tim-0731-Hzt/runtime-rs-merge-main
Runtime-rs merge main
2022-07-12 19:25:11 +08:00
Bin Liu
9f0e4bb775 Merge pull request #4628 from fidencio/topic/rework-tee-kernel-builds
kernel: Deduplicate code used for building TEE kernels
2022-07-12 17:25:04 +08:00
Bin Liu
b424cf3c90 Merge pull request #4544 from openanolis/anolis/virtio_device_aarch64
runtime-rs: Dragonball-sandbox - add virtio device feature support for aarch64
2022-07-12 12:39:31 +08:00
Fabiano Fidêncio
cda1919a0a Merge pull request #4609 from fidencio/topic/kata-deploy-simplify-config-path-handling
packaging: Simplify config path handling
2022-07-11 23:48:54 +02:00
Fabiano Fidêncio
1a25afcdf5 kernel: Allow passing the URL to download the tarball
Passing the URL to be used to download the kernel tarball is useful in
various scenarios, mainly when doing a downstream build, thus let's add
this new option.

This new option also works around a known issue of the Dockerfile used
to build the kernel not having `yq` installed.

Fixes: #4629

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-11 14:23:49 +02:00
snir911
0024b8d10a Merge pull request #4617 from Yuan-Zhuo/main
build: save lines for repository_owner check
2022-07-11 15:04:35 +03:00
Fabiano Fidêncio
80c68b80a8 kernel: Deduplicate code used for building TEE kernels
There's no need to have the entire function for building SEV / TDX
duplicated.

Let's remove those functions and create a `get_tee_kernel()` which takes
the TEE as the argument.

Fixes: #4627

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-11 13:25:17 +02:00
xuejun-xj
d2584991eb dragonball: fix dependency unused warning
Fix the warning "unused import: `dbs_arch::gic::Error as GICError`" and
"unused import: `dbs_arch::gic::GICDevice`" in file src/vm/mod.rs when
compiling.

Fixes: #4544

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-11 17:55:04 +08:00
xuejun-xj
458f6f42f6 dragonball: use const string for legacy device type
As string "com1", "com2" and "rtc" are used in two files
(device_manager/mod.rs and device_manager/legacy.rs), we use public
const variables COM1, COM2 and RTC to replace them respectively.

Fixes: #4544

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-11 17:46:10 +08:00
James O. D. Hunt
58b0fc4794 Merge pull request #4192 from Tim-0731-Hzt/runtime-rs
kata 3.0 Architecture
2022-07-11 09:34:17 +01:00
Zhongtao Hu
0826a2157d Merge remote-tracking branch 'origin/main' into runtime-rs-1
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-11 09:47:23 +08:00
Zhongtao Hu
939959e726 docs: add Dragonball to hypervisors
Fixes:#4193
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-11 09:38:17 +08:00
xuejun-xj
f6f96b8fee dragonball: add legacy device support for aarch64
Implement RTC device for aarch64.

Fixes: #4544

Signed-off-by: xuejun-xj <jiyunxue@alibaba.linux.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-10 17:35:30 +08:00
xuejun-xj
7a4183980e dragonball: add device info support for aarch64
Implement generate_virtio_device_info() and
get_virtio_mmio_device_info() functions to support the mmio_device_info
member, which is used by FDT.

Fixes: #4544

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-10 17:09:59 +08:00
Fabiano Fidêncio
46fd7ce025 Merge pull request #4595 from amshinde/fix-clh-tarball-build
Fix clh tarball build
2022-07-08 20:15:30 +02:00
Peng Tao
30da3fb954 Merge pull request #4515 from openanolis/anolis/dragonball-3
runtime-rs: built-in Dragonball sandbox part III - virtio-blk, virtio-fs, virtio-net and VMM API support
2022-07-08 23:14:01 +08:00
Fabiano Fidêncio
f7ccf92dc8 kata-deploy: Rely on the configured config path
Instead of passing a `KATA_CONF_FILE` environment variable, let's rely
on the configured (in the container engine) config path, as both
containerd and CRI-O support it, and we're using this for both of them.

Fixes: #4608

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-08 15:02:26 +02:00
Fabiano Fidêncio
33360f1710 Merge pull request #4600 from ManaSugi/fix/selinux-hypervisor-config
runtime: Fix DisableSelinux config
2022-07-08 13:05:25 +02:00
Fabiano Fidêncio
386a523a05 kata-deploy: Pass the config path to CRI-O
As we're already doing for containerd, let's also pass the configuration
path to CRI-O, as all the supported CRI-O versions do support this
configuration option.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-07-08 12:36:47 +02:00
Yuan-Zhuo
13df57c393 build: save lines for repository_owner check
The repository_owner check in docs-url-alive-check.yaml is currently specified for each step; it can be set at the job level to save lines.

Fixes: #4611

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-07-08 10:40:30 +08:00
Bin Liu
f36bc8bc52 Merge pull request #4616 from GabyCT/topic/updatecontainerddoc
docs: Update URL links for containerd documentation
2022-07-08 08:49:06 +08:00
Gabriela Cervantes
57c2d8b749 docs: Update URL links for containerd documentation
This PR updates some URL links related to the containerd documentation.

Fixes #4615

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-07-07 21:48:18 +00:00
Archana Shinde
e57a1c831e build: Mark git repos as safe for build
This is not an issue when the build is run as a non-privileged user.
Marking these as safe in case the build is run as root
or some other user.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-07-07 12:11:00 -07:00
GabyCT
ee3f5558ae Merge pull request #4606 from liubin/fix/4605-delete-cri-containerd-plugin
docs: delete CRI containerd plugin statement
2022-07-07 09:35:36 -05:00
Fabiano Fidêncio
c09634dbc7 Merge pull request #4592 from fidencio/revert-kata-deploy-changes-after-2.5.0-rc0-release
release: Revert kata-deploy changes after 2.5.0-rc0 release
2022-07-07 08:59:43 +02:00
liubin
2551924bda docs: delete CRI containerd plugin statement
There is no independent CRI containerd plugin for new containerd,
the related documentation should be updated too.

Fixes: #4605

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-07 12:06:25 +08:00
Bin Liu
bee7915932 Merge pull request #4533 from bookinabox/simplify-nproc
tools/snap: simplify nproc
2022-07-07 11:38:29 +08:00
Chao Wu
9cee52153b fmt: do cargo fmt and add a dependency for blk_dev
fmt: do cargo fmt and add a dependency for blk_dev

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
47a4142e0d fs: change vhostuser and virtio into const
change fs mode vhostuser and virtio into const.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
e14e98bbeb cpu_topo: add handle_cpu_topology function
add handle_cpu_topology function to make it easier to understand the
set_vm_configuration function.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
5d3b53ee7b downtime: add downtime support
add downtime support in `resume_all_vcpus_with_downtime`

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
6a1fe85f10 vfio: add vfio as TODO
We add vfio as TODO in this commit and create a github issue for this.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
5ea35ddcdc refractor: remove redundant by_id
remove redundant by_id in get_vm_by_id_mut and get_vm_by_id. They are
optimized to get_vm_mut and get_vm.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
b646d7cb37 config: remove ht_enabled
Since the cpu topology can tell whether hyper-threading is enabled or not, we
removed the ht_enabled config from VmConfigInfo

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
cb54ac6c6e memory: remove reserve_memory_bytes
This is currently an unsupported feature and we will remove it from the
current code.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
bde6609b93 hotplug: add room for other hotplug solution
Add room in the code for other hotplug solution without upcall

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
wllenyj
d88b1bf01c dragonball: update vsock dependency
1. fix vsock device init failed
2. fix VsockDeviceConfigInfo not found

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
dd003ebe0e Dragonball: change error name and fix compile error
Change error name from `StartMicrovm` to `StartMicroVm`,
`StartMicrovmError` to `StartMicroVmError`.

Besides, we fix a compile error in config_manager.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
38957fe00b UT: fix compile error in unit tests
fix compile error in unit tests for DummyConfigInfo.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
wllenyj
11b3f95140 dragonball: add virtio-fs device support
Virtio-fs devices are supported.

Fixes: #4257

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
wllenyj
948381bdbe dragonball: add virtio-net device support
Virtio-net devices are supported.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
wllenyj
3d20387a25 dragonball: add virtio-blk device support
Virtio-blk devices are supported.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-07 10:32:35 +08:00
Chao Wu
87d38ae49f Doc: add document for Dragonball API
add detailed explanation for Dragonball API

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 10:32:26 +08:00
Zhongtao Hu
2bb1eeaecc docs: further questions related to upcall
add questions and answers for upcall

Fixes:#4193
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-07-07 09:52:50 +08:00
Zhongtao Hu
026aaeeccc docs: add FAQ to the report
1. Provide answers to the questions that will be frequently asked

2. Format the document

Fixes:#4193
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-07 09:52:50 +08:00
Christophe de Dinechin
fffcb81652 docs: update the content of the report
1. Explain why the current situation is a problem.

2. We are beyond a simple introduction now, it's a real proposal.

3. Explain why you think it is solid, and fix a grammatical error.

4. The Rust rationale does not really belong to the initial paragraph.
   Also, I rephrased it to highlight the contrast with Go and the Kata community's
   past experience switching to Rust for the agent.

Fixes:#4193
Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
2022-07-07 09:52:46 +08:00
Zhongtao Hu
42ea854eb6 docs: kata 3.0 Architecture
An introduction for kata 3.0 architecture design

Fixes:#4193
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
2022-07-07 09:47:07 +08:00
Archana Shinde
efdb92366b build: Fix clh source build as normal user
While running make as non-privileged user, the make errors out with
the following message:
"INFO: Build cloud-hypervisor enabling the following features: tdx
Got permission denied while trying to connect to the Docker daemon
socket at unix:///var/run/docker.sock: Post
"http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=cloudhypervisor%2Fdev&tag=20220524-0":
dial unix /var/run/docker.sock: connect: permission denied"

Even though the user may be part of the docker group, the clh build from
source does a docker in docker build. It is necessary for the user of
the nested container to be part of the docker group for the build to
succeed.

Fixes #4594

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-07-06 18:28:00 -07:00
Derek Lee
0e40ecf383 tools/snap: simplify nproc
Replaces calls of nproc with

nproc ${CI:+--ignore 1}

to run nproc with one less processing unit than the maximum to prevent
DOS-ing the local machine.

If the process is being run in a container (determined via whether $CI is
null), all processing units available will be used.
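
For reference, the shell expansion `${CI:+--ignore 1}` yields `--ignore 1`
only when CI is set and non-empty, and nothing otherwise. The Python sketch
below mimics that expansion; `nproc_args` is a hypothetical helper, not part
of the snap tooling.

```python
def nproc_args(env):
    """Build the nproc argument list the way `${CI:+--ignore 1}` would."""
    # ${VAR:+word} expands to word only if VAR is set and not empty.
    ci = env.get("CI", "")
    return ["nproc", "--ignore", "1"] if ci else ["nproc"]
```

For example, `nproc_args({"CI": "true"})` includes `--ignore 1`, while an
unset or empty CI gives a plain `nproc` call.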

Fixes #3967

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-07-06 15:04:08 -07:00
Chen Yiyang
f59939a31f runk: Support exec sub-command
`exec` will execute a command inside a container which exists and is not
frozen or stopped. *Inside* means that the new process shares namespaces
and cgroup with the container init process. The command can be specified by
the `--process` parameter to read from a file, or from other parameters such
as arg, env, etc. In order to be compatible with the `create`/`run`
commands, I refactored libcontainer. `Container` in builder.rs is divided
into `InitContainer` and `ActivatedContainer`. `InitContainer` is used
for the `create`/`run` commands. It will load the spec from the given bundle path.
`ActivatedContainer` is used by the `exec` command, and will read the
container's status file, which stores the spec and `CreateOpt` for
creating the rustjail::LinuxContainer. It adapts the spec by replacing the
process with the given options and updating the namespaces with some paths
to join the container. I also renamed `ContainerContext` to
`ContainerLauncher`, which is only used to spawn processes now. It uses
the `LinuxContainer` in rustjail as the runner. For `create`/`run`, the
`launch` method will create a new container and run the first process.
For `exec`, the `launch` method will spawn a process which joins a
container.

Fixes #4363

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-07-06 21:11:30 +08:00
Bin Liu
be68cf0712 Merge pull request #4597 from bergwolf/github/action
action: revert commit message limit to 150 bytes
2022-07-06 17:13:15 +08:00
Manabu Sugimoto
4d89476c91 runtime: Fix DisableSelinux config
Enable the Kata runtime to handle the `disable_selinux` flag properly, so
that the runtime configuration can control whether the runtime applies the
SELinux label to the VMM process.

Fixes: #4599
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-07-06 15:50:28 +09:00
Fabiano Fidêncio
ac91fb7a12 Merge pull request #4591 from fidencio/2.5.0-rc0-branch-bump
# Kata Containers 2.5.0-rc0
2022-07-06 08:24:14 +02:00
wllenyj
090de2dae2 dragonball: fix the clippy errors.
fix clippy errors and do fmt in this PR.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-06 11:29:49 +08:00
wllenyj
a1593322bd dragonball: add vsock api to api server
Enables vsock to use the api for device configuration.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-06 11:29:49 +08:00
wllenyj
89b9ba8603 dragonball: add set_vm_configuration api
Set virtual machine configurations.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-06 11:29:49 +08:00
wllenyj
95fa0c70c3 dragonball: add start microvm support
We add microvm start related support in this pull request.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-06 11:29:49 +08:00
wllenyj
5c1ccc376b dragonball: add Vmm struct
The Vmm struct is the global coordinator that manages API servers, virtual
machines etc.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-06 11:29:49 +08:00
Jiang Liu
4d234f5742 dragonball: refactor code layout
Refactored some code layout.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2022-07-06 11:29:49 +08:00
wllenyj
cfd5dae47c dragonball: add vm struct
The vm struct manages resources and controls the states of a virtual
machine instance.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-07-06 11:29:46 +08:00
wllenyj
527b73a8e5 dragonball: remove unused feature in AddressSpaceMgr
log_dirty_pages is useless now and will be redesigned to support live
migration in the future.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-07-06 11:28:32 +08:00
Peng Tao
3bafafec58 action: extend commit message line limit to 150 bytes
So that we can add more info there, and few people use such small
terminals nowadays.

Fixes: #4596
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-07-06 11:19:08 +08:00
Fabiano Fidêncio
5010c643c4 release: Revert kata-deploy changes after 2.5.0-rc0 release
As 2.5.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2022-07-05 22:23:49 +02:00
Fabiano Fidêncio
2d29791c19 release: Kata Containers 2.5.0-rc0
- Drop in cfg files support
- agent: enhance get handled signal
- oci: fix serde skip serializing condition
- agent: Run OCI poststart hooks after a container is launched
- agent: Replace some libc functions with nix ones
- runtime: overwrite mount type to bind for bind mounts
- build: Set safe.directory for runtime repo
- ci/cd: update check-commit-message
- Set safe.directory against tests repository
- runtime: delete Console from Cmd type
- Add `default_maxmemory` config option
- shim: set a non-zero return code if the wait process call failed.
- Refactor how hypervisor config validation is handled
- packaging: Remove unused kata docker configure script
- kata-with-k8s: Add cgroupDriver for containerd
- shim: support shim v2 logging plugin
- device package cleanup/refactor
- versions: Update kernel to latest LTS version 5.15.48
- agent: Allow BUILD_TYPE=debug
- Fix clippy warnings and update agent's vendored code
- block: Leverage multiqueue for virtio-block
- kernel: Add CONFIG_EFI=y as part of the TDX fragments
- runtime: Add heuristic to get the right value(s) for mem-reserve
- runtime: enable sandbox feature on qemu
- snap: fix snap build on ppc64le
- packaging: Remove unused publish kata image script
- rootfs: Fix chronyd.service failing on boot
- tracing: Remove whitespace from root span
- workflow: Removing man-db, workflow kept failing
- docs: Update outdated URLs and keep them available
- runtime: fix error when trying to parse sandbox sizing annotations
- snap: Fix debug cli option
- deps: Resolve dependabot bumps of containerd, crossbeam-utils, regex
- Allow Cloud Hypervisor to run under the `container_kvm_t`
- docs: Update containerd url link
- agent: refactor reading file timing for debugging
- safe-path: fix clippy warning
- kernel building: efi_secret module
- runtime: Switch to using the rust version of virtiofsd (all arches but powerpc)
- shim: change the log level for GetOOMEvent call failures
- docs: Add more kata monitor details
- Allow io.katacontainers.config.hypervisor.enable_iommu annotation by …
- versions: Bump virtiofsd to v1.3.0
- docs: Add storage limits to arch doc
- docs: Update source for cri-tools
- tools: Enable extra detail on error
- docs: Add agent-ctl examples section

f4eea832a release: Adapt kata-deploy for 2.5.0-rc0
0ddb34a38 oci: fix serde skip serializing condition
fbb2e9bce agent: Replace some libc functions with nix ones
acd3302be agent: Run OCI poststart hooks after a container is launched
1f363a386 runtime: overwrite mount type to bind for bind mounts
4e48509ed build: Set safe.directory for runtime repo
48ccd4233 ci: Set safe.directory against tests repository
2a4fbd6d8 agent: enhance get handled signal
433816cca ci/cd: update check-commit-message
a5a25ed13 runtime: delete Console from Cmd type
96553e8bd runtime: Add documentation of drop-in config file fragments
c656457e9 runtime: Add tests of drop-in config file decoding
99f5ca80f runtime: Plug drop-in decoding into decodeConfig()
0f9856c46 runtime: Scan drop-in directory, read files and decode them
2c1efcc69 runtime: Add helpers to copy fields between tomlConfig instances
20f11877b runtime: Add framework to manipulate config structs via reflection
ab5f1c956 shim: set a non-zero return code if the wait process call failed.
e5be5cb08 runtime: device: cleanup outdated comments
5f936f268 virtcontainers: config validation is host specific
323271403 virtcontainers: Remove unused function
0939f5181 config: Expose default_maxmemory
58ff2bd5c clh,qemu: Adapt to using default_maxmemory
1a78c3df2 packaging: Remove unused kata docker configure script
afdc96042 hypervisor: Add default_maxmemory configuration
4e30e11b3 shim: support shim v2 logging plugin
bdf5e5229 virtcontainers: validate hypervisor config outside of hypervisor itself
469e09854 katautils: don't do validation when loading hypervisor config
e32bf5331 device: deduplicate state structures
f97d9b45c runtime: device/persist: drop persist dependency from device pkgs
f9e96c650 runtime: device: move to top level package
3880e0c07 agent: refactor reading file timing for debugging
c70d3a2c3 agent: Update the dependencies
612fd79ba random: Fix "nonminimal-bool" clippy warning
d4417f210 netlink: Fix "or-fun-call" clippy warnings
93874cb3b packaging: Restrict kernel patches applied to top-level dir
07b1367c2 versions: Update kernel to latest LTS version 5.15.48
1b7d36fdb agent: Allow BUILD_TYPE=debug
9ff10c083 kernel: Add CONFIG_EFI=y as part of the TDX fragments
e227b4c40 block: Leverage multiqueue for virtio-block
e7e7dc9df runtime: Add heuristic to get the right value(s) for mem-reserve
c7dd10e5e packaging: Remove unused publish kata image script
0bbbe7068 snap: fix snap build on ppc64le
ef925d40c runtime: enable sandbox feature on qemu
28995301b tracing: Remove whitespace from root span
9941588c0 workflow: Removing man-db, workflow kept failing
90a7763ac snap: Fix debug cli option
a305bafee docs: Update outdated URLs and keep them available
bee770343 docs: Update containerd url link
ac5dbd859 clh: Improve logging related to the net dev addition
0b75522e1 network: Set queues to 1 to ensure we get the network fds
93b61e0f0 network: Add IFF_NO_PI to the netlink flags
bf3ddc125 clh: Pass the tuntap fds down to Cloud Hypervisor
55ed32e92 clh: Take care of the VmAddNetPut request ourselves
01fe09a4e clh: Hotplug the network devices
2e0753833 clh: Expose VmAddNetPut
1ef0b7ded runtime: Switch to using the rust version of virtiofsd (all but power)
bb26bd73b safe-path: fix clippy warning
1a5ba31cb agent: refactor reading file timing for debugging
721ca72a6 runtime: fix error when trying to parse sandbox sizing annotations
9773838c0 virtiofsd: export env vars needed for building it
b0e090f40 versions: Bump virtiofsd to v1.3.0
db5048d52 kernel: build efi_secret module for SEV
1b845978f docs: Add storage limits to arch doc
412441308 docs: Add more kata monitor details
eff4e1017 shim: change the log level for GetOOMEvent call failures
5d7fb7b7b build(deps): bump github.com/containerd/containerd in /src/runtime
d0ca2fcbb build(deps): bump crossbeam-utils in /src/tools/trace-forwarder
a60dcff4d build(deps): bump regex from 1.5.4 to 1.5.6 in /src/tools/agent-ctl
dbf50672e build(deps): bump crossbeam-utils in /src/tools/agent-ctl
8e2847bd5 build(deps): bump crossbeam-utils from 0.8.6 to 0.8.8 in /src/libs
e9ada165f build(deps): bump regex from 1.5.4 to 1.5.5 in /src/agent
adad9cef1 build(deps): bump crossbeam-utils from 0.8.5 to 0.8.8 in /src/agent
34bcef884 docs: Add agent-ctl examples section
815157bf0 docs: Remove erroneous whitespace
f5099620f tools: Enable extra detail on error
8f10e13e0 config: Allow enable_iommu pod annotation by default
7ae11cad6 docs: Update source for cri-tools
0e2459d13 docs: Add cgroupDriver for containerd
1b7fd19ac rootfs: Fix chronyd.service failing on boot

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2022-07-05 22:23:05 +02:00
Fabiano Fidêncio
f4eea832a1 release: Adapt kata-deploy for 2.5.0-rc0
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2022-07-05 22:23:05 +02:00
Fabiano Fidêncio
071dd4c790 Merge pull request #4109 from pmores/drop-in-cfg-files-support
Drop in cfg files support
2022-07-05 22:21:24 +02:00
Peng Tao
514b4e7235 Merge pull request #4543 from openanolis/anolis/add_vcpu_configure_aarch64
runtime-rs: Dragonball sandbox - add Vcpu::configure() function for aarch64
2022-07-05 17:47:40 +08:00
Bin Liu
d9e868f44e Merge pull request #4479 from quanweiZhou/enhance-get-handled-signal
agent: enhance get handled signal
2022-07-05 15:18:21 +08:00
Bin Liu
b33ad7e57a Merge pull request #4574 from jelipo/fix-serde-serializing
oci: fix serde skip serializing condition
2022-07-05 13:51:43 +08:00
Bin Liu
0189738283 Merge pull request #4576 from ManaSugi/fix/oci-poststart-hook
agent: Run OCI poststart hooks after a container is launched
2022-07-05 11:08:49 +08:00
Peng Tao
cd2d8c6fe2 Merge pull request #4580 from ManaSugi/fix/replace-libc-with-nix
agent: Replace some libc functions with nix ones
2022-07-05 10:53:42 +08:00
Peng Tao
a1de394e51 Merge pull request #4550 from liubin/fix/4548-overwrite-mount-type-for-bind-mount
runtime: overwrite mount type to bind for bind mounts
2022-07-04 19:56:26 +08:00
Peng Tao
44ec9684d8 Merge pull request #4573 from amshinde/unsafe-repo-runtime-shimv2
build: Set safe.directory for runtime repo
2022-07-04 19:51:00 +08:00
haining.cao
0ddb34a38d oci: fix serde skip serializing condition
There is an extra space on the serde serialization condition.

Fixes: #4578

Signed-off-by: haining.cao <haining.cao@daocloud.io>
2022-07-04 16:16:04 +08:00
xuejun-xj
7120afe4ed dragonball: add vcpu test function for aarch64
add create_vcpu() function in vcpu test unit for aarch64

Fixes: #4445

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-04 15:23:43 +08:00
xuejun-xj
648d285a24 dragonball: add vcpu support for aarch64
add configure() function for aarch64 vcpu

Fixes: #4543

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-04 15:23:37 +08:00
xuejun-xj
7dad7c89f3 dragonball: update dbs-xxx dependency
change to up-to-date commit ID

Fixes: #4543

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
2022-07-04 15:23:11 +08:00
Manabu Sugimoto
fbb2e9bce9 agent: Replace some libc functions with nix ones
Replace `libc::setgroups()`, `libc::fchown()`, and `libc::sethostname()`
functions with nix crate ones for safety and maintainability.

Fixes: #4579

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-07-04 14:49:38 +09:00
Manabu Sugimoto
acd3302bef agent: Run OCI poststart hooks after a container is launched
The OCI `poststart` hooks must be called after the
user-specified process is executed but before the `start`
operation returns, in accordance with the OCI runtime spec.

Fixes: #4575

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-07-03 18:03:51 +09:00
GabyCT
635fa543a3 Merge pull request #4560 from bookinabox/update-commit-message-check
ci/cd: update check-commit-message
2022-07-01 11:30:03 -05:00
James O. D. Hunt
59cab9e835 Merge pull request #4380 from Tim-0731-Hzt/rund/makefile
runtime-rs: makefile for dragonball
2022-07-01 09:12:38 +01:00
liubin
1f363a386c runtime: overwrite mount type to bind for bind mounts
Some clients like nerdctl may pass a mount type of none for volumes/bind mounts,
which causes the container start to fail.

Referring to runc, it overwrites the mount type to bind and ignores the input value.
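
The overwrite described above can be sketched as follows. This is a minimal
illustration, not the runtime's code: the dict layout and the rule that a
mount counts as a bind mount when its options contain "bind" or "rbind" are
assumptions made for the example.

```python
def normalize_mount_type(mount):
    """Return a copy of the mount with its type forced to "bind" for bind mounts."""
    options = mount.get("options", [])
    # Assumption: "bind" or "rbind" in the options marks a bind mount.
    if "bind" in options or "rbind" in options:
        return {**mount, "type": "bind"}  # the input type (e.g. "none") is ignored
    return mount


# Hypothetical mount as a client might pass it, with type "none".
client_mount = {
    "destination": "/data",
    "source": "/srv/data",
    "type": "none",
    "options": ["rbind", "rw"],
}
print(normalize_mount_type(client_mount)["type"])
```

Mounts without a bind option are returned unchanged, so other mount types
are not affected.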

Fixes: #4548

Signed-off-by: liubin <liubin0329@gmail.com>
2022-07-01 12:13:01 +08:00
Archana Shinde
4e48509ed9 build: Set safe.directory for runtime repo
While doing a docker build for shim-v2, we see this:

```
fatal: unsafe repository
('/home/${user}/go/src/github.com/kata-containers/kata-containers' is
owned by someone else)
To add an exception for this directory, call:

        git config --global --add safe.directory
/home/${user}/go/src/github.com/kata-containers/kata-containers
```

This is because the docker container build is run as root while the
runtime repo is checked out as normal user.

Unlike the same error in the rootfs build, which makes that build fail,
the error here does not actually cause `make shim-v2-tarball` to fail.

However, it's good to get rid of this error message showing during the
make process.

Fixes: #4572

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-06-30 20:52:44 -07:00
Bin Liu
18093251ec Merge pull request #4527 from Tim-0731-Hzt/rund-new/netlink
runtime-rs:refactor network model with netlink
2022-07-01 11:12:54 +08:00
Archana Shinde
c29038a2e2 Merge pull request #4562 from ManaSugi/git-safe-repo
Set safe.directory against tests repository
2022-06-30 16:13:15 -07:00
GabyCT
02a51e75a7 Merge pull request #4554 from liubin/fix/delete-not-used-console-from-container-config
runtime: delete Console from Cmd type
2022-06-30 11:40:07 -05:00
Fabiano Fidêncio
aa561b49f5 Merge pull request #4540 from fidencio/topic/default_maxmemory
Add `default_maxmemory` config option
2022-06-30 12:08:15 +02:00
Manabu Sugimoto
48ccd42339 ci: Set safe.directory against tests repository
Set `safe.directory` against `kata-containers/tests` repository
before checkout because the user in the docker container is root,
but the `tests` repository on the host machine is usually owned
by the normal user.
This works when we already have the `tests` repository which is
not owned by root on the host machine and try to create a rootfs
using Docker (`USE_DOCKER=true`).

Fixes: #4561

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-06-30 17:36:29 +09:00
quanweiZhou
2a4fbd6d8c agent: enhance get handled signal
For runC, the signal is sent to the init process directly.
For kata, we try to send `SIGKILL` instead of `SIGTERM` when the process
has not installed a handler for `SIGTERM`.
The `is_signal_handled` function determines whether the container
process handles a given signal, but currently it only checks the
caught set (SigCgt). A `SIGTERM` that the container process ignores
(SigIgn) or blocks (SigBlk) should not be converted to `SIGKILL`
either. For example, with terminationGracePeriodSeconds, k8s sends
`SIGTERM` first and then `SIGKILL`; if the container ignores
`SIGTERM`, we should send `SIGTERM`, not `SIGKILL`, to the container.
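
A rough sketch of the intended check, assuming the masks are read from the
SigCgt, SigIgn and SigBlk fields of /proc/<pid>/status as hexadecimal
bitmasks (signal N is bit N-1). The agent itself is written in Rust; the
function names below are illustrative only.

```python
import signal


def parse_mask(status_text, field):
    """Extract a hexadecimal signal mask (e.g. SigCgt) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == field:
            return int(value.strip(), 16)
    return 0


def is_signal_handled(status_text, signum):
    """True if the signal is caught, ignored, or blocked by the process."""
    bit = 1 << (signum - 1)  # signal N is stored in bit N-1
    return any(parse_mask(status_text, f) & bit for f in ("SigCgt", "SigIgn", "SigBlk"))


def signal_to_send(status_text, requested):
    """Convert SIGTERM to SIGKILL only when the process does nothing with SIGTERM."""
    if requested == signal.SIGTERM and not is_signal_handled(status_text, requested):
        return signal.SIGKILL
    return requested


# Sample status excerpt: SIGTERM (15) is ignored, i.e. bit 14 (0x4000) in SigIgn.
ignoring = "SigBlk:\t0000000000000000\nSigIgn:\t0000000000004000\nSigCgt:\t0000000000000000\n"
print(signal_to_send(ignoring, signal.SIGTERM) == signal.SIGTERM)
```

With all three masks empty, the same call returns `SIGKILL`, which is the
conversion the commit keeps for processes without any `SIGTERM` disposition.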

Fixes: #4478
Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>
2022-06-30 14:44:46 +08:00
Derek Lee
433816cca2 ci/cd: update check-commit-message
check-commit-message was recently added to the tests repository, with
minor changes to the action. For consistency's sake, those changes are
copied over here as well.

tests - https://github.com/kata-containers/tests/pull/4878

Minor Changes:
   1. Body length check is now 75 and consistent with guidelines
   2. Lines without spaces are not counted in body length check
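
The two rules above can be sketched as below. This is only an illustration
of the described behaviour, not the action's implementation; treating 75 as
an inclusive maximum is an assumption.

```python
MAX_BODY_LINE_LENGTH = 75  # assumed to be an inclusive maximum


def overlong_body_lines(body):
    """Return the body lines that exceed the limit, skipping lines without spaces."""
    offending = []
    for line in body.splitlines():
        if " " not in line:
            continue  # e.g. a bare URL: not counted in the length check
        if len(line) > MAX_BODY_LINE_LENGTH:
            offending.append(line)
    return offending


body = "short line\n" + "https://example.invalid/" + "x" * 100 + "\n" + "word " * 20
print(len(overlong_body_lines(body)))
```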

Fixes #4559

Signed-off-by: Derek Lee <derlee@redhat.com>
2022-06-29 16:55:43 -07:00
GabyCT
2a94261df5 Merge pull request #4549 from liubin/fix/4419-set-status-if-wait-process-failed
shim: set a non-zero return code if the wait process call failed.
2022-06-29 17:04:53 -05:00
Fabiano Fidêncio
1e12d56512 Merge pull request #4469 from egernst/config-validation-refactor
Refactor how hypervisor config validation is handled
2022-06-29 14:42:11 +02:00
liubin
a5a25ed13d runtime: delete Console from Cmd type
There is much code related to this property, but it is not used anymore.

Fixes: #4553

Signed-off-by: liubin <liubin0329@gmail.com>
2022-06-29 17:36:32 +08:00
Pavel Mores
96553e8bd2 runtime: Add documentation of drop-in config file fragments
Added user manual for the drop-in config file fragments feature.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-06-29 10:56:53 +02:00
Pavel Mores
c656457e90 runtime: Add tests of drop-in config file decoding
The tests ensure that interactions between drop-ins and the base
configuration.toml and among drop-ins themselves work as intended,
basically that files are evaluated in the correct order (base file
first, then drop-ins in alphabetical order) and the last one to set
a specific key wins.
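
The evaluation order can be illustrated with plain dictionaries standing in
for decoded TOML. This is a sketch only: the file names and keys are
hypothetical, and Python's default string sort is assumed to match the
alphabetical order used by the runtime.

```python
import copy


def merge_tables(target, update):
    """Recursively copy keys from update into target; later values win."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge_tables(target[key], value)
        else:
            target[key] = copy.deepcopy(value)


def effective_config(base, drop_ins):
    """Apply the base config first, then each drop-in in alphabetical file order."""
    result = copy.deepcopy(base)
    for name in sorted(drop_ins):
        merge_tables(result, drop_ins[name])
    return result


base = {"hypervisor": {"qemu": {"default_vcpus": 1, "default_memory": 2048}}}
drop_ins = {
    "20-more-cpus.toml": {"hypervisor": {"qemu": {"default_vcpus": 4}}},
    "10-site.toml": {"hypervisor": {"qemu": {"default_vcpus": 2, "enable_iommu": True}}},
}
print(effective_config(base, drop_ins))
```

Here `20-more-cpus.toml` is evaluated last, so its `default_vcpus` wins,
while keys set only in the base file or in `10-site.toml` are preserved.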

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-06-29 09:54:39 +02:00
Pavel Mores
99f5ca80fc runtime: Plug drop-in decoding into decodeConfig()
Fixes #4108

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-06-29 09:54:38 +02:00
Pavel Mores
0f9856c465 runtime: Scan drop-in directory, read files and decode them
updateFromDropIn() uses the infrastructure built by previous commits to
ensure no contents of 'tomlConfig' are lost during decoding.   To do
this, we preserve the current contents of our tomlConfig in a clone and
decode a drop-in into the original.  At this point, the original
instance is updated but its Agent and/or Hypervisor fields are
potentially damaged.

To merge, we update the clone's Agent/Hypervisor from the original
instance.   Now the clone has the desired Agent/Hypervisor and the
original instance has the rest, so to finish, we just need to move the
clone's Agent/Hypervisor to the original.
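
The clone-and-merge sequence can be sketched as follows. Dictionaries stand
in for 'tomlConfig', and `lossy_decode()` is a made-up stand-in that
imitates the decoder replacing map-stored entries wholesale; none of these
names come from the runtime code.

```python
import copy

MAP_SECTIONS = ("hypervisor", "agent")  # sections stored in maps


def lossy_decode(config, drop_in):
    """Simulated decoder: entries of map-stored sections are replaced wholesale."""
    for section, content in drop_in.items():
        if section in MAP_SECTIONS:
            for name, table in content.items():
                config.setdefault(section, {})[name] = dict(table)  # old keys are lost
        else:
            config.setdefault(section, {}).update(content)


def update_from_drop_in(config, drop_in):
    clone = copy.deepcopy(config)      # 1. preserve the current contents
    lossy_decode(config, drop_in)      # 2. decode the drop-in into the original
    for section in MAP_SECTIONS:       # 3. update the clone from the original
        for name, table in drop_in.get(section, {}).items():
            dest = clone.setdefault(section, {}).setdefault(name, {})
            for key in table:
                dest[key] = config[section][name][key]
    for section in MAP_SECTIONS:       # 4. move the clone's sections to the original
        if section in clone:
            config[section] = clone[section]


config = {
    "runtime": {"enable_debug": False},
    "hypervisor": {"qemu": {"path": "/usr/bin/qemu", "default_vcpus": 1}},
}
update_from_drop_in(config, {"runtime": {"enable_debug": True},
                             "hypervisor": {"qemu": {"default_vcpus": 4}}})
print(config)
```

After the call, `path` is still present next to the updated
`default_vcpus`, which is the property the merge step exists to guarantee.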

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-06-29 09:54:38 +02:00
Pavel Mores
2c1efcc697 runtime: Add helpers to copy fields between tomlConfig instances
These functions take a TOML key - an array of individual components,
e.g. ["agent" "kata" "enable_tracing"], as returned by BurntSushi - and
two 'tomlConfig' instances.  They copy the value of the struct field
identified by the key from the source instance to the target one if
necessary.

This is only done if the TOML key points to structures stored in
maps by 'tomlConfig', i.e. 'hypervisor' and 'agent'.  Nothing needs to
be done in other cases.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-06-29 09:54:38 +02:00
Pavel Mores
20f11877be runtime: Add framework to manipulate config structs via reflection
For 'tomlConfig' substructures stored in Golang maps - 'hypervisor' and
'agent' - BurntSushi doesn't preserve their previous contents as it does
for substructures stored directly (e.g. 'runtime').  We use reflection
to work around this.

This commit adds three primitive operations to work with struct fields
identified by their `toml:"..."` tags - one to get a field value, one to
set a field value and one to assign a source struct field value to the
corresponding field of a target.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-06-29 09:54:38 +02:00
liubin
ab5f1c9564 shim: set a non-zero return code if the wait process call failed.
The return code is an int32, so if an error occurs its default value
of zero would be treated as a normal exit code.

Setting the return code to 255 lets the caller (for example Kubernetes)
know that there is a problem with the pod/container.
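
In outline, the change amounts to something like the sketch below. The
function and constant names are hypothetical, and a raised exception stands
in for the failed wait process call.

```python
WAIT_FAILED_EXIT_CODE = 255  # non-zero, so a failed wait is never read as success


def exit_code_from_wait(wait_process):
    """Return the process exit code, or 255 if the wait call itself failed."""
    try:
        return wait_process()
    except Exception:
        # Without this, the caller would see the zero default as a normal exit.
        return WAIT_FAILED_EXIT_CODE


def failing_wait():
    raise RuntimeError("wait process call failed")


print(exit_code_from_wait(failing_wait))
```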

Fixes: #4419

Signed-off-by: liubin <liubin0329@gmail.com>
2022-06-29 12:33:32 +08:00
Zhongtao Hu
07231b2f3f runtime-rs:refactor network model with netlink
add unit test for tcfilter

Fixes: #4289
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-29 11:38:23 +08:00
Zhongtao Hu
c8a9052063 build: format files
add a newline at the end of the file

Fixes: #4379
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-29 11:19:10 +08:00
Zhongtao Hu
242992e3de build: put install methods in utils.mk
put install methods in utils.mk to avoid duplication

Fixes: #4379
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-29 11:19:10 +08:00
Zhongtao Hu
8a697268d0 build: makefile for dragonball config
use makefile to generate dragonball config file

Fixes: #4379
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-29 11:19:07 +08:00
Zhongtao Hu
9c526292e7 runtime-rs:refactor network model with netlink
refactor tcfilter with netlink

Fixes: #4289
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-29 11:03:29 +08:00
Eric Ernst
e5be5cb086 runtime: device: cleanup outdated comments
Prior device config move didn't update the comments. Let's address this,
and make sure comments match the new path...

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-28 18:22:28 -07:00
Eric Ernst
5f936f268f virtcontainers: config validation is host specific
Ideally this config validation would be in a separate package
(katautils?), but that would introduce circular dependency since we'd
call it from vc, and it depends on vc types (which, shouldn't be vc, but
probably a hypervisor package instead).

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-28 18:22:28 -07:00
Fabiano Fidêncio
323271403e virtcontainers: Remove unused function
While working on the previous commits, some of the functions became
unused.  Let's simply remove them.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-28 21:19:24 +02:00
Fabiano Fidêncio
0939f5181b config: Expose default_maxmemory
Expose the newly added `default_maxmemory` to the project's Makefile and
to the configuration files.

Fixes: #4516

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-28 21:19:24 +02:00
Fabiano Fidêncio
58ff2bd5c9 clh,qemu: Adapt to using default_maxmemory
Let's adapt Cloud Hypervisor's and QEMU's code to properly behave to the
newly added `default_maxmemory` config.

While implementing this, a change of behaviour (or a bug fix, depending
on how you see it) has been introduced: if a pod requests more memory
than the amount available in the host, instead of failing to start the
pod, we simply hotplug the maximum amount of memory available, better
mimicking the runc behaviour.
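
The clamping can be sketched as below. The function names, the MiB units,
and capping a configured limit at the host memory are assumptions for the
example; a value of 0 meaning "the whole physical RAM" follows the
`default_maxmemory` description in this series.

```python
def memory_ceiling_mib(default_maxmemory_mib, host_memory_mib):
    """Upper bound for VM memory; 0 means the whole physical RAM is the limit."""
    if default_maxmemory_mib == 0:
        return host_memory_mib
    # Assumption: a configured limit above the host memory is capped as well.
    return min(default_maxmemory_mib, host_memory_mib)


def hotplug_size_mib(current_mib, requested_mib, ceiling_mib):
    """Memory to hotplug: the request is clamped to the ceiling instead of failing."""
    target = min(requested_mib, ceiling_mib)
    return max(target - current_mib, 0)


ceiling = memory_ceiling_mib(0, 16384)
print(hotplug_size_mib(2048, 32768, ceiling))
```

A pod asking for 32768 MiB on a 16384 MiB host thus still starts, with the
VM grown only up to the ceiling.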

Fixes: #4516

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-28 21:19:24 +02:00
Fabiano Fidêncio
ad055235a5 Merge pull request #4547 from GabyCT/topic/removeunuseddocker
packaging: Remove unused kata docker configure script
2022-06-28 20:09:15 +02:00
GabyCT
b2c0387993 Merge pull request #4130 from surajssd/add-cgroup-driver-info
kata-with-k8s: Add cgroupDriver for containerd
2022-06-28 10:30:18 -05:00
GabyCT
12c1b9e6d6 Merge pull request #4536 from Tim-0731-Hzt/runtime-rs-kata-main
runtime-rs: Merge Main into runtime-rs branch
2022-06-28 10:27:35 -05:00
Gabriela Cervantes
1a78c3df2e packaging: Remove unused kata docker configure script
This PR removes an unused kata configure docker script which was used
in packaging for kata 1.x but is no longer used in kata 2.x

Fixes #4546

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-06-28 15:10:39 +00:00
Zhongtao Hu
f3907aa127 runtime-rs:Merge remote-tracking branch 'origin/main' into runtime-rs-newv
Fixes:#4536
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-28 20:58:40 +08:00
Bin Liu
badbbcd8be Merge pull request #4400 from openanolis/anolis/dragonball-2
runtime-rs: built-in Dragonball sandbox part II - vCPU manager
2022-06-28 20:41:36 +08:00
Tim Zhang
916ffb75d7 Merge pull request #4432 from liubin/fix/4420-binary-log
shim: support shim v2 logging plugin
2022-06-28 16:29:07 +08:00
Fabiano Fidêncio
afdc960424 hypervisor: Add default_maxmemory configuration
Let's add a `default_maxmemory` configuration, which allows the admins
to set the maximum amount of memory to be used by a VM, considering the
initial amount + whatever ends up being hotplugged via the pod limits.

By default this value is 0 (zero), and it means that the whole physical
RAM is the limit.

Fixes: #4516

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-28 08:32:15 +02:00
Bin Liu
4e30e11b31 shim: support shim v2 logging plugin
Currently the kata shim only supports FIFO stdout/stderr from
containerd/CRI-O, but shim v2 supports logging plugins,
and nerdctl uses the binary scheme for logs by default.

This commit adds the other types of log plugins:

- file
- binary

In case of binary, kata shim will receive a stdout/stderr like:

binary:///nerdctl?_NERDCTL_INTERNAL_LOGGING=/var/lib/nerdctl/1935db59

That means the nerdctl process will handle the logs (stdout/stderr).
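
As an illustration of how such a stdout/stderr value could be told apart by
scheme, here is a small sketch. It is not the shim's code; treating a value
without a scheme as a FIFO path is an assumption.

```python
from urllib.parse import parse_qs, urlsplit

SUPPORTED_SCHEMES = ("fifo", "file", "binary")


def classify_log_target(uri):
    """Split a stdout/stderr URI into (scheme, path, query parameters)."""
    parts = urlsplit(uri)
    scheme = parts.scheme or "fifo"  # assumption: a bare path is a FIFO
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported log scheme: {scheme}")
    return scheme, parts.path, parse_qs(parts.query)


target = classify_log_target(
    "binary:///nerdctl?_NERDCTL_INTERNAL_LOGGING=/var/lib/nerdctl/1935db59"
)
print(target)
```

For the binary case the path names the program that receives the logs and
the query string carries its parameters.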

Fixes: #4420

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-06-28 13:54:22 +08:00
Eric Ernst
bdf5e5229b virtcontainers: validate hypervisor config outside of hypervisor itself
Depending on the user of it, the hypervisor from hypervisor interface
could have a differing view on what is valid or not. To help decouple,
let's instead check the hypervisor config validity as part of the
sandbox creation, rather than as part of the CreateVM call within the
hypervisor interface implementation.

Fixes: #4251

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-27 11:53:41 -07:00
Eric Ernst
469e098543 katautils: don't do validation when loading hypervisor config
Policy for what's valid/invalid within the config varies by VMM, host,
and by silicon architecture. Let's keep katautils simple for just
translating a toml to the hypervisor config structure, and leave
validation to virtcontainers.

Without this change, we're doing duplicate validation.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-27 10:13:26 -07:00
Chao Wu
71db2dd5b8 hotplug: add room for future acpi hotplug mechanism
In order to support ACPI hotplug in the future with the cooperative work
from the Kata community, we add ACPI feature and dbs-upcall feature to
add room for ACPI hotplug.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-27 21:52:36 +08:00
Zizheng Bian
8bb00a3dc8 dragonball: fix a bug when generating kernel boot args
We should refuse to generate boot args when hotplugging, not cold starting.

Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
2022-06-27 18:12:50 +08:00
Chao Wu
2aedd4d12a doc: add document for vCPU, api and device
Create the document for vCPU and api.

Add some detail in the device document.

Fixes: #4257

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-27 18:12:50 +08:00
wllenyj
bec22ad01f dragonball: add api module
It is used to define the vmm communication interface.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-27 18:12:50 +08:00
wllenyj
07f44c3e0a dragonball: add vcpu manager
Manage vcpu related operations.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-27 18:12:48 +08:00
wllenyj
78c9718752 dragonball: add upcall support
Upcall is a direct communication tool between VMM and guest developed
upon vsock. It is used to implement device hotplug.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
2022-06-27 17:04:47 +08:00
wllenyj
7d1953b52e dragonball: add vcpu
Virtual CPU manager for virtual machines.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-27 17:04:42 +08:00
wllenyj
468c73b3cb dragonball: add kvm context
KVM operation context for virtual machines.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-27 16:02:06 +08:00
Bin Liu
27b1bb5ed9 Merge pull request #4467 from egernst/device-pkg
device package cleanup/refactor
2022-06-27 14:40:53 +08:00
Eric Ernst
e32bf53318 device: deduplicate state structures
Before, we maintained almost identical structures between our persist
API and what we keep for our devices, with the persist API being a
slight subset of device structures.

Let's deduplicate this, now that persist is importing device package.
Json unmarshal of prior persist structure will work fine, since it was
an exact subset of fields.

Fixes: #4468

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-26 21:31:29 -07:00
Eric Ernst
f97d9b45c8 runtime: device/persist: drop persist dependency from device pkgs
Rather than have device package depend on persist, let's define the
(almost duplicate) structures within device itself, and have the Kata
Container's persist pkg import these.

This'll help avoid unnecessary dependencies within our core packages.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-26 21:31:29 -07:00
Eric Ernst
f9e96c6506 runtime: device: move to top level package
Let's move device package to runtime/pkg instead of being buried under
virtcontainers.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-26 21:31:29 -07:00
Bin Liu
3880e0c077 agent: refactor reading file timing for debugging
The original code re-reads the mountstats file and returns its
content in the error, but by that time the file may have
changed. We should instead return the content that was parsed
line by line, to check why there is no fstype option.

Fixes: #4246

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-06-26 21:27:43 -07:00
Archana Shinde
2488a0f6c0 Merge pull request #4439 from amshinde/update-kernel-to-5.15.46
versions: Update kernel to latest LTS version 5.15.48
2022-06-24 11:03:32 -07:00
Fabiano Fidêncio
083ca5f217 Merge pull request #4505 from yoheiueda/agent-debug-build
agent: Allow BUILD_TYPE=debug
2022-06-24 14:04:23 +02:00
Fabiano Fidêncio
03fca8b459 Merge pull request #4526 from fidencio/topic/fix-clippy-warnings-and-update-agent-vendored-code
Fix clippy warnings and update agent's vendored code
2022-06-24 14:02:28 +02:00
Fabiano Fidêncio
c70d3a2c35 agent: Update the dependencies
Let's run a `cargo update` and ensure the deps are up-to-date before we
cut the "-rc0" release.

Fixes: #4525

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-24 11:37:25 +02:00
Fabiano Fidêncio
612fd79bae random: Fix "nonminimal-bool" clippy warning
The error shown below was caught during a dependency bump in the CCv0
branch, but we better fix it here first.
```
error: this boolean expression can be simplified
  --> src/random.rs:85:21
   |
85 |             assert!(!ret.is_ok());
   |                     ^^^^^^^^^^^^ help: try: `ret.is_err()`
   |
   = note: `-D clippy::nonminimal-bool` implied by `-D warnings`
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool

error: this boolean expression can be simplified
  --> src/random.rs:93:17
   |
93 |         assert!(!ret.is_ok());
   |                 ^^^^^^^^^^^^ help: try: `ret.is_err()`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool
```

Fixes: #4523

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-24 11:37:05 +02:00
Fabiano Fidêncio
d4417f210e netlink: Fix "or-fun-call" clippy warnings
The error shown below was caught during a dependency bump in the CCv0
branch, but we better fix it here first.
```
error: use of `ok_or` followed by a function call
   --> src/netlink.rs:526:14
    |
526 |             .ok_or(anyhow!(nix::Error::EINVAL))?;
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(|| anyhow!(nix::Error::EINVAL))`
    |
    = note: `-D clippy::or-fun-call` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call
error: use of `ok_or` followed by a function call
   --> src/netlink.rs:615:49
    |
615 |         let v = u8::from_str_radix(split.next().ok_or(anyhow!(nix::Error::EINVAL))?, 16)?;
    |                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(|| anyhow!(nix::Error::EINVAL))`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call
```

Fixes: #4523

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-24 11:37:01 +02:00
Archana Shinde
93874cb3bb packaging: Restrict kernel patches applied to top-level dir
The apply_patches.sh script applies all patches in the patches
directory,  as well as subdirectories. This means if there is a sub-dir
called "experimental" under a major kernel version directory,
experimental patches would be applied to the default kernel supported by
Kata.
We did not come across this issue earlier as typically the experimental
kernel version was different from the default kernel.
With both the default kernel and the arm-experimental kernel having the
same major kernel version (5.15.x) at this time, trying to update the
kernel patch version revealed that arm-experimental patches were being
applied to the default kernel.

Restricting the patches to be applied to the top level directory will
solve the issue. The apply_patches script should ignore any
sub-directories meant for experimental patches.
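
The difference between the two behaviours can be sketched as below.
apply_patches.sh is a shell script; this only illustrates the selection
rule, and the directory layout and file names are hypothetical.

```python
import tempfile
from pathlib import Path


def top_level_patches(patches_dir):
    """List *.patch files directly inside patches_dir, ignoring sub-directories."""
    return sorted(p for p in Path(patches_dir).glob("*.patch") if p.is_file())


# Hypothetical layout: one default patch plus one under "experimental".
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "5.15.x"
    (root / "experimental").mkdir(parents=True)
    (root / "0001-default.patch").write_text("")
    (root / "experimental" / "0002-arm.patch").write_text("")
    names = [p.name for p in top_level_patches(root)]
    recursive_names = sorted(p.name for p in root.rglob("*.patch"))
    print(names, recursive_names)
```

The recursive listing picks up the experimental patch as well, which is the
behaviour this commit removes.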

Fixes #4520

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-06-23 10:43:52 -07:00
Archana Shinde
07b1367c2b versions: Update kernel to latest LTS version 5.15.48
This brings in a few security fixes.
Removing arm patches related to virtio-mem that are no longer required
as they have been merged.

Fixes #4438

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-06-23 10:43:52 -07:00
Fabiano Fidêncio
133528dd14 Merge pull request #4503 from amshinde/multi-queue-block
block: Leverage multiqueue for virtio-block
2022-06-23 12:17:11 +02:00
Fabiano Fidêncio
f186a52b16 Merge pull request #4511 from fidencio/topic/add-config-efi-to-the-tdx-kernel
kernel: Add CONFIG_EFI=y as part of the TDX fragments
2022-06-23 12:15:30 +02:00
Yohei Ueda
1b7d36fdb0 agent: Allow BUILD_TYPE=debug
The cargo command creates debug build binaries, when the --release
option is not specified. Specifying --debug option causes an error.
This patch specifies --release option when BUILD_TYPE=release,
and does not specify any build type option when BUILD_TYPE=debug.
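
The mapping from BUILD_TYPE to cargo arguments can be sketched as follows.
The agent's build uses a Makefile; the function name here is hypothetical
and only illustrates the rule described above.

```python
def cargo_build_command(build_type):
    """Build the cargo command line for a given BUILD_TYPE value."""
    if build_type == "release":
        return ["cargo", "build", "--release"]
    if build_type == "debug":
        # Debug is cargo's default profile; `cargo build` has no --debug option.
        return ["cargo", "build"]
    raise ValueError(f"unknown BUILD_TYPE: {build_type}")


print(cargo_build_command("debug"))
```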

Fixes #4504

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2022-06-23 13:54:32 +09:00
Fabiano Fidêncio
9ff10c0830 kernel: Add CONFIG_EFI=y as part of the TDX fragments
Otherwise `./build-kernel.sh -x tdx setup` will fail with the following
error:
```
$ ./build-kernel.sh -x tdx setup
INFO: Config version: 92
INFO: Kernel version: tdx-guest-v5.15-4
INFO: kernel path does not exist, will download kernel
INFO: Apply patches from
/home/ffidenci/go/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/patches/tdx-guest-v5.15-4.x
INFO: Found 0 patches
INFO: Enabling config for 'tdx' confidential guest protection
INFO: Constructing config from fragments:
/home/ffidenci/go/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/configs/fragments/x86_64/.config

WARNING: unmet direct dependencies detected for UNACCEPTED_MEMORY
  Depends on [n]: EFI [=n] && EFI_STUB [=n]
  Selected by [y]:
  - INTEL_TDX_GUEST [=y] && HYPERVISOR_GUEST [=y] && X86_64 [=y] &&
    CPU_SUP_INTEL [=y] && PARAVIRT [=y] && SECURITY [=y] &&
     X86_X2APIC[=y]
INFO: Some CONFIG elements failed to make the final .config:
INFO: Value requested for CONFIG_EFI_STUB not in final .config
INFO: Generated config file can be found in
/home/ffidenci/go/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/configs/fragments/x86_64/.config
ERROR: Failed to construct requested .config file
ERROR: failed to find default config
```

Fixes: #4510

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-22 15:21:30 +02:00
Fabiano Fidêncio
78e27de6c3 Merge pull request #4358 from zvonkok/memreserve
runtime: Add heuristic to get the right value(s) for mem-reserve
2022-06-22 13:41:23 +02:00
Archana Shinde
e227b4c404 block: Leverage multiqueue for virtio-block
Similar to network, we can use multiple queues for virtio-block
devices. This can help improve storage performance.
This commit changes the number of queues for block devices to
the number of cpus for cloud-hypervisor and qemu.

Today the default number of cpus a VM starts with is 1, so only one
queue is used. This change improves performance when the default number
of cold-plugged cpus is raised above one in the config file. It may also
help when we use the sandboxing feature with k8s, which passes the sum
of the required resources down to Kata.

Fixes #4502

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-06-21 12:38:53 -07:00
Eric Ernst
72049350ae Merge pull request #4288 from fengwang666/enable-qemu-sandbox
runtime: enable sandbox feature on qemu
2022-06-21 09:22:26 -07:00
GabyCT
8eac22ac53 Merge pull request #4495 from Amulyam24/snap-fix
snap: fix snap build on ppc64le
2022-06-21 09:21:23 -05:00
Zvonko Kaiser
e7e7dc9dfe runtime: Add heuristic to get the right value(s) for mem-reserve
Fixes: #2938

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-06-21 03:44:28 -07:00
Bin Liu
e422730c7f Merge pull request #4497 from GabyCT/topic/removeunusedref
packaging: Remove unused publish kata image script
2022-06-21 17:46:45 +08:00
James O. D. Hunt
e11fcf7d3c Merge pull request #4168 from Champ-Goblem/patch/fix-chronyd-failure-on-boot
rootfs: Fix chronyd.service failing on boot
2022-06-21 09:43:13 +01:00
Gabriela Cervantes
c7dd10e5ed packaging: Remove unused publish kata image script
This PR removes the unused publish kata image script, which
was used on kata 1.x when we had OBS packages. These are no
longer used on kata 2.x.

Fixes #4496

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-06-20 14:43:39 +00:00
Amulyam24
0bbbe70687 snap: fix snap build on ppc64le
Fixes the syntax error while building rustdeps.

Fixes: #4494

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2022-06-20 19:26:27 +05:30
Fabiano Fidêncio
6fd40085ef Merge pull request #4484 from cmaf/tracing-update-rootspan-name
tracing: Remove whitespace from root span
2022-06-20 08:37:45 +02:00
Fupan Li
98f041ed8e Merge pull request #4486 from openanolis/runtime-rs-merge-main
runtime-rs: runtime-rs merge main
2022-06-20 13:52:14 +08:00
Bin Liu
2c1b68d6e4 Merge pull request #4481 from zvonkok/fix-action
workflow: Removing man-db, workflow kept failing
2022-06-20 11:10:48 +08:00
Chao Wu
86123f49f2 Merge branch 'main' into runtime-rs
In order to keep up to date with main, we will update runtime-rs every
week.

Fixes: #4485
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-20 10:01:58 +08:00
Liang Zhou
ef925d40ce runtime: enable sandbox feature on qemu
Enabling "-sandbox on" in qemu introduces another protection layer
on the host, making the secure container more secure.

The option is disabled by default because this feature may introduce some
performance cost, even though the user can enable
/proc/sys/net/core/bpf_jit_enable to reduce the impact.

Fixes: #2266

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-06-17 15:30:46 -07:00
Chelsea Mafrica
28995301b3 tracing: Remove whitespace from root span
Remove space from root span name to follow camel casing of other tracing
span names in the runtime and to make parsing easier in testing.

Fixes #4483

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-06-17 12:07:37 -07:00
Zvonko Kaiser
9941588c00 workflow: Removing man-db, workflow kept failing
Fixes: #4480

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-06-17 04:55:12 -07:00
wllenyj
e89e6507a4 dragonball: add signal handler
Used to register dragonball's signal handler.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-16 17:31:58 +08:00
Fabiano Fidêncio
f30fe86dc1 Merge pull request #4456 from Bevisy/fixIssue4454
docs: Update outdated URLs and keep them available
2022-06-16 10:26:24 +02:00
Bin Liu
553ec46115 Merge pull request #4436 from alex-matei/fix/sandbox-mem-overflow
runtime: fix error when trying to parse sandbox sizing annotations
2022-06-16 11:18:24 +08:00
James O. D. Hunt
0d33b28802 Merge pull request #4459 from jodh-intel/snap-fix-cli-options
snap: Fix debug cli option
2022-06-15 17:10:15 +01:00
James O. D. Hunt
9766a285a4 Merge pull request #4422 from snir911/dependabot_bumps
deps: Resolve dependabot bumps of containerd, crossbeam-utils, regex
2022-06-15 15:57:53 +01:00
James O. D. Hunt
90a7763ac6 snap: Fix debug cli option
`snap`/`snapcraft` seems to have changed recently. Since `snap`
auto-updates all `snap` packages and since we use the `snapcraft` `snap`
for building snaps, this is impacting all our CI jobs which now show:

```
Installing Snapcraft for Linux…
snapcraft 7.0.4 from Canonical* installed

Run snapcraft -d snap --destructive-mode
Usage: snapcraft [options] command [args]...
Try 'snapcraft pack -h' for help.
Error: unrecognized arguments: -d
Error: Process completed with exit code 1.
```

Move the debug option to make it a sub-command (long) option to resolve
this issue.

Fixes: #4457.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-15 10:00:56 +01:00
James O. D. Hunt
d06dd8fcdc Merge pull request #4312 from fidencio/topic/pass-the-tuntap-fd-to-clh
Allow Cloud Hypervisor to run under the `container_kvm_t`
2022-06-15 09:37:49 +01:00
Binbin Zhang
a305bafeef docs: Update outdated URLs and keep them available
By comparing the content of the old url and the new url,
ensure that their content is consistent and does not contain ambiguities

Fixes: #4454

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2022-06-15 16:34:28 +08:00
Archana Shinde
185360cb9a Merge pull request #4452 from GabyCT/topic/updatedeveloperguide
docs: Update containerd url link
2022-06-14 16:13:35 -07:00
Chelsea Mafrica
db2a4d6cdf Merge pull request #4441 from liubin/fix/refactor-reading-mountstat-log
agent: refactor reading file timing for debugging
2022-06-14 14:18:14 -07:00
Gabriela Cervantes
bee7703436 docs: Update containerd url link
This PR updates the containerd url link in the Developer Guide.

Fixes #4451

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-06-14 15:35:03 +00:00
Fabiano Fidêncio
ac5dbd8598 clh: Improve logging related to the net dev addition
Let's improve the log so we make it clear that we're only *actually*
adding the net device to the Cloud Hypervisor configuration when calling
our own version of VmAddNetPut().

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:53:09 +00:00
Fabiano Fidêncio
0b75522e1f network: Set queues to 1 to ensure we get the network fds
We want to have the file descriptors of the opened tuntap device to pass
them down to the VMMs, so the VMMs don't have to explicitly open a new
tuntap device themselves, as the `container_kvm_t` label does not allow
such a thing.

With this change we ensure that what's currently done when using QEMU as
the hypervisor, can be easily replicated with other VMMs, even if they
don't support multiqueue.

As a side effect of this, we need to close the received file descriptors
in the code of the VMMs which are not going to use them.

Fixes: #3533

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:53:09 +00:00
Fabiano Fidêncio
93b61e0f07 network: Add IFF_NO_PI to the netlink flags
Adding IFF_NO_PI to the netlink flags causes no harm to the supported
and tested hypervisors, as when opening the device by its name Cloud
Hypervisor[0], Firecracker[1], and QEMU[2] do set the flag already.

However, when receiving the file descriptor of an opened tuntap device,
Cloud Hypervisor is not able to set the flag, leaving the guest without
connectivity.

To avoid such an issue, let's simply add the IFF_NO_PI flag to the
netlink flags and ensure, from our side, that the VMMs don't have to set
it on their side when dealing with an already opened tuntap device.

Note that there's a PR opened[3] just for testing that this change
doesn't cause any breakage.

[0]: e52175c2ab/net_util/src/tap.rs (L129)
[1]: b6d6f71213/src/devices/src/virtio/net/tap.rs (L126)
[2]: 3757b0d08b/net/tap-linux.c (L54)
[3]: https://github.com/kata-containers/kata-containers/pull/4292

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:53:09 +00:00
Fabiano Fidêncio
bf3ddc125d clh: Pass the tuntap fds down to Cloud Hypervisor
This is basically a no-op right now, as:
* netPair.TapInterface.VMFds is nil
* the tap name is still passed to Cloud Hypervisor, which is the Cloud
  Hypervisor's first choice when opening a tap device.

In the very near future we'll stop passing the tap name to Cloud
Hypervisor, and start passing the file descriptors of the opened tap
instead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:53:09 +00:00
Fabiano Fidêncio
55ed32e924 clh: Take care of the VmAddNetPut request ourselves
Knowing that VmAddNetPut works as expected, let's switch to manually
building the request and writing it to the appropriate socket.

Doing this gives us more flexibility to, later on, pass the file
descriptor of the tuntap device to Cloud Hypervisor, as OpenAPI doesn't
support such an operation (it has no notion of SCM Rights).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:53:09 +00:00
Fabiano Fidêncio
01fe09a4ee clh: Hotplug the network devices
Instead of creating the VM with the network device already plugged in,
let's actually add the network device *after* the VM is created, but
*before* the VM is actually booted.

Although there appears to be no functional difference between what was
done in the past and what this commit introduces, this will be used to
work around a limitation in OpenAPI when it comes to passing down
the network device's file descriptor to Cloud Hypervisor, so Cloud
Hypervisor can use it instead of opening the device by its name on the
VMM side.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:51:02 +00:00
Fabiano Fidêncio
2e07538334 clh: Expose VmAddNetPut
VmAddNetPut is the API provided by the Cloud Hypervisor client (auto
generated) code to hotplug a new network device to the VM.

Let's expose it now as it'll be used as part of this series, mostly to
guide the reviewer through the process of what we have to do, as later
on, spoiler alert, it'll end up being removed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-14 10:27:30 +00:00
Bin Liu
c84a425250 Merge pull request #4442 from openanolis/anolis/fix_safepath_clippy
safe-path: fix clippy warning
2022-06-14 14:02:42 +08:00
Chelsea Mafrica
1d5448fbca Merge pull request #4180 from Alex-Carter01/build-kernel-efi-secret
kernel building: efi_secret module
2022-06-13 13:34:06 -07:00
Fabiano Fidêncio
a80eb33cd6 Merge pull request #4308 from fidencio/topic/virtiofsd-switch-to-using-the-rust-version-on-all-arches
runtime: Switch to using the rust version of virtiofsd (all arches but powerpc)
2022-06-13 13:45:51 +02:00
Bin Liu
81acfc1286 Merge pull request #4425 from liubin/fix/4376-change-log-level-of-getoomevent
shim: change the log level for GetOOMEvent call failures
2022-06-13 17:53:11 +08:00
James O. D. Hunt
9b93db0220 Merge pull request #4417 from jodh-intel/docs-monitor-considerations
docs: Add more kata monitor details
2022-06-13 10:51:52 +01:00
Fabiano Fidêncio
1ef0b7ded0 runtime: Switch to using the rust version of virtiofsd (all but power)
So far this has been done for x86_64.  Now that the support for building
and testing has been added for all arches, let's do the second part of
the switch.

We're still not done yet for powerpc, as a virtiofsd crash in the
rust version has been found by the maintainer.

Fixes: #4258, #4260

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-13 10:41:26 +02:00
wllenyj
b6cb2c4ae3 dragonball: add metrics system
metrics system is added for collecting Dragonball metrics to analyze the
system.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-13 13:51:51 +08:00
wllenyj
e80e0c4645 dragonball: add io manager wrapper
Wrapper over IoManager to support device hotplug.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: jingshan <jingshan@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-13 13:51:46 +08:00
Chao Wu
bb26bd73b1 safe-path: fix clippy warning
fix clippy warnings in safe-path lib to make clippy happy.

fixes: #4443

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-13 13:38:37 +08:00
Bin Liu
1a5ba31cb0 agent: refactor reading file timing for debugging
The original code reads the mountstats file and returns its content
in the error, but by that time the file may have changed. We should
instead return the file content that was parsed line by line, to check
why there is no fstype option.

Fixes: #4246

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-06-13 10:56:51 +08:00
Bin Liu
f23d7092e3 Merge pull request #4265 from openanolis/anolis/dragonball-1
runtime-rs: built-in Dragonball sandbox part I - resource and device managers
2022-06-12 12:17:57 +08:00
Chao Wu
d5ee3fc856 safe-path: fix clippy warning
fix clippy warnings in safe-path lib to make clippy happy.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-12 10:24:05 +08:00
Alexandru Matei
721ca72a64 runtime: fix error when trying to parse sandbox sizing annotations
Changed bitsize for parsing functions to 64-bit in order to avoid
parsing errors.

Fixes #4435

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-06-11 18:51:10 +03:00
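The kind of failure fixed by 721ca72a64 can be illustrated with a minimal sketch. The actual runtime code is Go and is not shown in this log; the helper name `parse_mem_bytes` and the sample value are assumptions made only for illustration.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parse a sandbox memory sizing value given in bytes.
/// Using a 64-bit target type leaves room for values above 4 GiB.
fn parse_mem_bytes(value: &str) -> Result<u64, ParseIntError> {
    value.trim().parse::<u64>()
}

fn main() {
    // 8 GiB expressed in bytes, which is larger than u32::MAX.
    let eight_gib = "8589934592";

    // A 32-bit parse rejects the value with an overflow error...
    println!("as u32: {:?}", eight_gib.parse::<u32>());
    // ...while a 64-bit parse accepts it.
    println!("as u64: {:?}", parse_mem_bytes(eight_gib));
}
```

The same reasoning applies to any annotation whose legitimate values can exceed the range of the type it is parsed into.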
Chao Wu
93c10dfd86 runtime-rs: add crosvm license in Dragonball
add THIRD-PARTY file to add license for crosvm.

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:24:58 +08:00
Chao Wu
dfe6de7714 dragonball: add dragonball into kata README
add dragonball description into kata README to help introduce dragonball
sandbox.

Fixes: #4257

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:24:56 +08:00
wllenyj
39ff85d610 dragonball: green ci
Revert this patch after dragonball-sandbox is ready and all
subsequent implementations are submitted.

Fixes: #4257

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-11 17:24:17 +08:00
wllenyj
71f24d8271 dragonball: add Makefile.
Currently supported: build, clippy, check, format, test, clean

Fixes: #4257

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-06-11 17:24:17 +08:00
Chao Wu
a1df6d0969 Doc: Update Dragonball Readme and add document for device
Update Dragonball Readme to fix style problems and add github issues for
TODOs.

Add document for devices in dragonball. This is the document for the
current dragonball device status and we'll keep updating it when we
introduce more devices in later pull requests.

Fixes: #4257

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:24:17 +08:00
wllenyj
8619f2b3d6 dragonball: add virtio vsock device manager.
Added VsockDeviceMgr struct to manage all vsock devices.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:23:56 +08:00
wllenyj
52d42af636 dragonball: add device manager.
Device manager to manage IO devices for a virtual machine. Also added
DeviceManagerTx to provide operation transactions for device management,
and DeviceManagerContext to provide operation context for device management.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:23:56 +08:00
wllenyj
c1c1e5152a dragonball: add kernel config.
It is used for holding guest kernel configuration information.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:23:46 +08:00
wllenyj
6850ef99ae dragonball: add configuration manager.
It is used for managing a group of configuration information.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:23:39 +08:00
wllenyj
0bcb422fcb dragonball: add legacy devices manager
The legacy devices manager is used for managing legacy devices.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:23:33 +08:00
wllenyj
3c45c0715f dragonball: add console manager.
Console manager to manage frontend and backend console devices.

A virtual console is composed of two parts: a frontend in the virtual
machine and a backend in the host OS. A frontend may be a serial port,
virtio-console etc.; a backend may be stdio or a Unix domain socket. The
manager connects the frontend with the backend.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:23:27 +08:00
wllenyj
3d38bb3005 dragonball: add address space manager.
Address space abstraction to manage a virtual machine's physical address space.
Added the AddressSpaceMgr struct to manage the address space.

Fixes: #4257

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:21:41 +08:00
wllenyj
aff6040555 dragonball: add resource manager support.
Resource manager manages all resources of a virtual machine instance.

Fixes: #4257

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:21:41 +08:00
wllenyj
8835db6b0f dragonball: initial commit
The dragonball crate initial commit that includes dragonball README and
basic code structure.

Fixes: #4257

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-06-11 17:21:41 +08:00
Fupan Li
9cb15ab4c5 agent: add the FSGroup support
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2022-06-11 11:30:51 +08:00
Fupan Li
ff7874bc23 protobuf: upgrade the protobuf version to 2.27.0
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2022-06-11 10:05:52 +08:00
Archana Shinde
aefe11b9ba Merge pull request #4331 from dgibson/config-enable-iommu-annotation
Allow io.katacontainers.config.hypervisor.enable_iommu annotation by …
2022-06-10 17:43:27 -07:00
Chelsea Mafrica
7deb87dcbc Merge pull request #4434 from fidencio/topic/bump-virtiofsd-release
versions: Bump virtiofsd to v1.3.0
2022-06-10 12:08:33 -07:00
GabyCT
f811c8b60e Merge pull request #4431 from jodh-intel/docs-arch-storage-limits
docs: Add storage limits to arch doc
2022-06-10 11:52:45 -05:00
Zhongtao Hu
06f398a34f runtime-rs: use withContext to evaluate lazily
Fixes: #4129
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 22:03:13 +08:00
Quanwei Zhou
fd4c26f9c1 runtime-rs: support network resource
Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 22:02:58 +08:00
Tim Zhang
4be7185aa4 runtime-rs: runtime part implement
Fixes: #3785
Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 22:01:12 +08:00
Zhongtao Hu
10343b1f3d runtime-rs: enhance runtimes
1. support oom event
2. use ContainerProcess to store container_id and exec_id
3. support stats

Fixes: #3785
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 22:01:05 +08:00
Quanwei Zhou
9887272db9 libs: enhance kata-sys-util and kata-types
Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 21:59:47 +08:00
Quanwei Zhou
3ff0db05a7 runtime-rs: support rootfs volume for resource
Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:58:01 +08:00
Tim Zhang
234d7bca04 runtime-rs: support cgroup resource
Fixes: #3785
Signed-off-by: Tim Zhang <tim@hyper.sh>
2022-06-10 19:57:53 +08:00
Quanwei Zhou
75e282b4c1 runtime-rs: hypervisor base define
Responsible for VM manager, such as Qemu, Dragonball

Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:57:45 +08:00
Quanwei Zhou
bdfee005fa runtime-rs: service and runtime framework
1. service: Responsible for processing services, such as task service, image service
2. runtime: Responsible for implementing different runtimes, such as Virt-container,
Linux-container, Wasm-container

Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:57:36 +08:00
Quanwei Zhou
4296e3069f runtime-rs: agent implements
Responsible for communicating with the agent, such as kata-agent in the VM

Fixes: #3785
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:57:29 +08:00
Jakob Naucke
d3da156eea runtime-rs: uint FsType for s390x
statfs type on s390x should be c_uint, not __fsword_t

Fixes: #3888
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-06-10 19:57:23 +08:00
quanwei.zqw
e705ee07c5 runtime-rs: update containerd-shim-protos to 0.2.0
Fixes: #3866
Signed-off-by: quanwei.zqw <quanwei.zqw@alibaba-inc.com>
2022-06-10 19:57:14 +08:00
quanwei.zqw
8c0a60e191 runtime-rs: modify the review suggestion
Fixes: #3876
Signed-off-by: quanwei.zqw <quanwei.zqw@alibaba-inc.com>
2022-06-10 19:57:07 +08:00
Zack
278f843f92 runtime-rs: shim implements for runtime-rs
Responsible for processing shim related commands: start, delete.

This patch is extracted from Alibaba Cloud's internal repository *runD*
Thanks to all contributors!

Fixes: #3785
Signed-off-by: acetang <aceapril@126.com>
Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Signed-off-by: Fupan Li <lifupan@gmail.com>
Signed-off-by: gexuyang <gexuyang@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
Signed-off-by: Hui Zhu <teawater@gmail.com>
Signed-off-by: Issac Hai <hjwissac@linux.alibaba.com>
Signed-off-by: Jiahuan Chao <jhchao@linux.alibaba.com>
Signed-off-by: lichenglong9 <lichenglong9@163.com>
Signed-off-by: mengze <mengze@linux.alibaba.com>
Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
Signed-off-by: shiqiangzhang <shiyu.zsq@linux.alibaba.com>
Signed-off-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: wanglei01 <wllenyj@linux.alibaba.com>
Signed-off-by: Wei Yang <wei.yang1@linux.alibaba.com>
Signed-off-by: yanlei <yl.on.the.way@gmail.com>
Signed-off-by: Yiqun Leng <yqleng@linux.alibaba.com>
Signed-off-by: yuchang.xu <yuchang.xu@linux.alibaba.com>
Signed-off-by: Yves Chan <lingfu@linux.alibaba.com>
Signed-off-by: Zack <zmlcc@linux.alibaba.com>
Signed-off-by: Zhiheng Tao <zhihengtao@linux.alibaba.com>
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
2022-06-10 19:56:59 +08:00
Quanwei Zhou
641b736106 libs: enhance kata-sys-util
1. move verify_cid from agent to libs/kata-sys-util
2. enhance kata-sys-util/k8s

Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:55:39 +08:00
Fupan Li
69ba1ae9e4 trans: fix the issue of wrong swappiness type
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2022-06-10 19:46:25 +08:00
Quanwei Zhou
d2a9bc6674 agent: agent-protocol support async
1. support async.
2. update ttrpc and protobuf
update ttrpc to 0.6.0
update protobuf to 2.23.0
3. support trans from oci

Fixes: #3746
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:36:55 +08:00
Fabiano Fidêncio
9773838c01 virtiofsd: export env vars needed for building it
@jongwu mentioned on a PR[0] that env vars should be exported to
ensure that virtiofsd is statically built for non-x86_64 architectures.

[0]: https://github.com/kata-containers/kata-containers/pull/4308#issuecomment-1137125592

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-10 13:27:02 +02:00
Liu Jiang
aee9633ced libs/sys-util: provide functions to execute hooks
Provide functions to execute OCI hooks.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Huamin Tang <huamin.thm@alibaba-inc.com>
Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com>
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:24:30 +08:00
Liu Jiang
8509de0aea libs/sys-util: add function to detect and update K8s emptyDir volume
Add function to detect and update K8s emptyDir volume.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>
2022-06-10 19:15:59 +08:00
Liu Jiang
6d59e8e197 libs/sys-util: introduce function to get device id
Introduce get_devid() to get major/minor number of a block device.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2022-06-10 19:15:28 +08:00
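A minimal sketch of what retrieving a device's major/minor numbers can look like on Linux, using only the standard library. This is not the `get_devid()` implementation from 6d59e8e197; the names `split_dev` and `devid_of` are hypothetical, and the bit layout assumes the glibc encoding of `dev_t`.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;

/// Split a Linux `dev_t` into (major, minor), assuming the glibc encoding.
fn split_dev(dev: u64) -> (u64, u64) {
    let major = ((dev >> 32) & 0xffff_f000) | ((dev >> 8) & 0x0000_0fff);
    let minor = ((dev >> 12) & 0xffff_ff00) | (dev & 0x0000_00ff);
    (major, minor)
}

/// Hypothetical stand-in for `get_devid()`: stat a device node and
/// return the major/minor pair stored in its `st_rdev` field.
fn devid_of(path: &str) -> io::Result<(u64, u64)> {
    let meta = fs::metadata(path)?;
    Ok(split_dev(meta.rdev()))
}

fn main() {
    // /dev/null is used only because it exists on most Linux systems;
    // a block device path such as a disk node works the same way.
    match devid_of("/dev/null") {
        Ok((major, minor)) => println!("/dev/null is {}:{}", major, minor),
        Err(e) => eprintln!("cannot stat /dev/null: {}", e),
    }
}
```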
Liu Jiang
5300ea23ad libs/sys-util: implement reflink_copy()
Implement reflink_copy() to copy file by reflink, and fallback to normal
file copy.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2022-06-10 19:15:20 +08:00
Liu Jiang
1d5c898d7f libs/sys-util: add utilities to parse NUMA information
Add utilities to parse NUMA information.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>
Signed-off-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
2022-06-10 19:15:12 +08:00
Liu Jiang
87887026f6 libs/sys-util: add utilities to manipulate cgroup
Add utilities to manipulate cgroup, currently only v1 is supported.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
Signed-off-by: Jiahuan Chao <jhchao@linux.alibaba.com>
Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
Signed-off-by: Tim Zhang <tim@hyper.sh>
2022-06-10 19:14:59 +08:00
Fabiano Fidêncio
b0e090f40b versions: Bump virtiofsd to v1.3.0
Changes since v1.2.0:
!123  Update rust-vmm dependencies                           (main) ← (update-deps)
!121  implement std::error::Error trait                      (main) ← (fix-impl-error)
!120  Show the nofile hard limit value in the warning me...  (main) ← (fix-rlimit-warn)
!119  Do not create tmpdir and bind mount /proc/self/fd ...  (main) ← (remove-tmp-dir-for-proc)
!116  Disable killpriv_v2 by default                         (main) ← (no-killpriv-default)

The one that affected Kata Containers the most was !119, as virtiofsd
would get denied when SELinux was set to run on enforcing mode.

Fixes: #4433

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-10 13:14:58 +02:00
Liu Jiang
ccd03e2cae libs/sys-util: add wrappers for mount and fs
Add some wrappers for mount and fs syscall.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Fupan Li <lifupan@gmail.com>
Signed-off-by: Huamin Tang <huamin.thm@alibaba-inc.com>
Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com>
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-06-10 19:14:06 +08:00
Liu Jiang
45a00b4f02 libs/sys-util: add kata-sys-util crate under src/libs
The kata-sys-util crate is a collection of modules that provides helpers
and utilities used by multiple Kata Containers components.

Fixes: #3305

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 19:10:40 +08:00
Zhongtao Hu
48c201a1ac libs/types: make the variable name easier to understand
1. modify default values for hypervisor
2. change the variable name
3. check the min memory limit

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:01:31 +08:00
Zhongtao Hu
b9b6d70aae libs/types: modify implementation details
1. fix nit problems
2. use generic type when parsing different type

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:01:24 +08:00
Zhongtao Hu
05ad026fc0 libs/types: fix implementation details
Use ok_or_else to handle get_mut(hypervisor) instead of unwrap.

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:01:17 +08:00
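The pattern named in 05ad026fc0 can be sketched as follows. The `Hypervisor` struct, the `set_vcpus` helper and the `String` error type are assumptions for illustration; the real code in libs/types is not shown in this log.

```rust
use std::collections::HashMap;

#[derive(Debug, Default)]
struct Hypervisor {
    default_vcpus: u32,
}

/// Hypothetical helper: update one entry of a hypervisor table.
fn set_vcpus(
    hypervisors: &mut HashMap<String, Hypervisor>,
    name: &str,
    vcpus: u32,
) -> Result<(), String> {
    // `ok_or_else` turns a missing entry into an error instead of a panic,
    // and only builds the message when the lookup actually fails.
    let hv = hypervisors
        .get_mut(name)
        .ok_or_else(|| format!("hypervisor '{}' is not configured", name))?;
    hv.default_vcpus = vcpus;
    Ok(())
}

fn main() {
    let mut hypervisors = HashMap::new();
    hypervisors.insert("qemu".to_string(), Hypervisor::default());

    println!("{:?}", set_vcpus(&mut hypervisors, "qemu", 2));
    println!("{:?}", set_vcpus(&mut hypervisors, "dragonball", 2));
    println!("{:?}", hypervisors);
}
```

With `unwrap`, the second call would abort the process; here the caller receives an error it can report.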
Zhongtao Hu
d96716b4d2 libs/types:fix styles and implementation details
1. Some Nit problems are fixed
2. Make the code more readable
3. Modify some implementation details

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:01:09 +08:00
Zhongtao Hu
6cffd943be libs/types:return Result to handle parse error
If there is a parse error when we are trying to get the annotations, we
will return Result<Option<type>> to handle that.

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:00:58 +08:00
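One way to read the `Result<Option<type>>` shape described in 6cffd943be is sketched below: an absent annotation, a valid one and a malformed one are three distinct outcomes. The helper `get_annotation`, the short keys and the `String` error type are hypothetical and chosen only to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::str::FromStr;

/// Hypothetical helper: look up an annotation and parse it.
/// - `Ok(None)`: the annotation is absent.
/// - `Ok(Some(v))`: the annotation is present and parsed.
/// - `Err(..)`: the annotation is present but malformed.
fn get_annotation<T: FromStr>(
    annotations: &HashMap<String, String>,
    key: &str,
) -> Result<Option<T>, String> {
    match annotations.get(key) {
        None => Ok(None),
        Some(raw) => raw
            .trim()
            .parse::<T>()
            .map(Some)
            .map_err(|_| format!("invalid value '{}' for annotation '{}'", raw, key)),
    }
}

fn main() {
    let mut annotations = HashMap::new();
    annotations.insert("default_vcpus".to_string(), "4".to_string());
    annotations.insert("default_memory".to_string(), "lots".to_string());

    println!("{:?}", get_annotation::<u32>(&annotations, "default_vcpus"));
    println!("{:?}", get_annotation::<u32>(&annotations, "default_memory"));
    println!("{:?}", get_annotation::<bool>(&annotations, "enable_iommu"));
}
```

Making the helper generic over `FromStr` lets one function serve integer, boolean and other annotation types.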
Zhongtao Hu
6ae87d9d66 libs/types: use contains to make code more readable
Use contains when validating the hypervisor block_device_driver.

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:00:50 +08:00
Zhongtao Hu
45e5780e7c libs/types: fixed spelling and grammar error
Fixed spelling and grammar errors in some files.

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 19:00:43 +08:00
Zhongtao Hu
2599a06a56 libs/types:use include_str! in test file
Use include_str! to load the toml file as a string.

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 18:28:14 +08:00
Zhongtao Hu
8ffff40af4 libs/types:Option type to handle empty tomlconfig
Loading from an empty string is only used to identify that the config is
not initialized yet, so Option<TomlConfig> is a better option.

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 18:28:05 +08:00
Zhongtao Hu
626828696d libs/types: add license for test-config.rs
add SPDX license identifier: Apache-2.0

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 18:27:57 +08:00
Zhongtao Hu
97d8c6c0fa docs: modify move-issues-to-in-progress.yaml
change issue backlog to runtime-rs

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 18:27:49 +08:00
Liu Jiang
8cdd70f6c2 libs/types: change method to update config by annotation
Some annotations are used to override hypervisor configurations, which
is dangerous. We must be careful when overriding the hypervisor
configuration by annotations, to avoid security flaws.
There are two existing mechanisms to prevent attacks by annotations:
1) config.hypervisor.enable_annotations defines the allowed annotation
keys for config.hypervisor.
2) config.hypervisor.xxxx_paths defines allowed values for specific keys.

The access methods for config.hypervisor.xxx enforce the permission
checks for the above rules.

To update the config, traverse the annotation hashmap and check whether
each key is enabled in the hypervisor.
If it is enabled: for a path related annotation, check whether it is
valid before updating the config; for a cpu or memory related annotation,
check whether it is within the limits for DB and qemu before updating the
config.

If it is not enabled, there are three possibilities: an agent related
annotation, a runtime related annotation, or a hypervisor related
annotation that is not enabled. The function handles agent and runtime
annotations first; whatever is left is an invalid hypervisor annotation,
and an error message is returned.

Add more edge case tests for updating the config.

Clean up unused functions, delete unused files and fix warnings.

Fixes: #3523

Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 18:27:36 +08:00
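The two checks described in 8cdd70f6c2 (allowed keys, then allowed values) can be sketched roughly as below. This is a simplified illustration, not the libs/types code: every struct field, key name and limit is an assumption, and the separate handling of agent and runtime annotations is left out, so any key that is not enabled is simply rejected.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Default)]
struct Hypervisor {
    enable_annotations: Vec<String>, // keys that annotations may override
    valid_kernel_paths: Vec<String>, // allowed values for the "kernel" key
    kernel: String,
    default_vcpus: u32,
    max_vcpus: u32,
}

/// Hypothetical sample configuration used by the example.
fn sample_hypervisor() -> Hypervisor {
    Hypervisor {
        enable_annotations: vec!["kernel".into(), "default_vcpus".into()],
        valid_kernel_paths: vec!["/opt/kata/vmlinux".into()],
        kernel: "/opt/kata/vmlinux".into(),
        default_vcpus: 1,
        max_vcpus: 8,
    }
}

fn apply_annotations(
    hv: &mut Hypervisor,
    annotations: &HashMap<String, String>,
) -> Result<(), String> {
    for (key, value) in annotations {
        // Rule 1: the key must be explicitly enabled.
        if !hv.enable_annotations.contains(key) {
            return Err(format!("annotation '{}' is not enabled", key));
        }
        match key.as_str() {
            // Rule 2: path values must be in the allow-list.
            "kernel" => {
                if !hv.valid_kernel_paths.contains(value) {
                    return Err(format!("kernel path '{}' is not allowed", value));
                }
                hv.kernel = value.clone();
            }
            // CPU values must parse and stay within the configured limit.
            "default_vcpus" => {
                let n: u32 = value
                    .parse()
                    .map_err(|_| format!("bad vcpus value '{}'", value))?;
                if n == 0 || n > hv.max_vcpus {
                    return Err(format!("vcpus {} is out of range", n));
                }
                hv.default_vcpus = n;
            }
            other => return Err(format!("unsupported annotation '{}'", other)),
        }
    }
    Ok(())
}

fn main() {
    let base = sample_hypervisor();
    let mut annotations = HashMap::new();
    annotations.insert("default_vcpus".to_string(), "4".to_string());

    // Patch a copy so a rejected annotation leaves the base config untouched.
    let mut patched = base.clone();
    match apply_annotations(&mut patched, &annotations) {
        Ok(()) => println!("patched: {:?}", patched),
        Err(e) => println!("rejected: {}", e),
    }
}
```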
Liu Jiang
e19d04719f libs/types: implement KataConfig to wrap TomlConfig
The TomlConfig structure is a parsed form of Kata configuration file,
but it's a little inconvenient to access that configuration
information directly. So introduce a wrapper KataConfig to easily
access that configuration information.

Two singletons of KataConfig are provided:
- KATA_DEFAULT_CONFIG: the original version directly loaded from Kata
configuration file.
- KATA_ACTIVE_CONFIG: the active version is the KATA_DEFAULT_CONFIG
patched by annotations.

The recommended way to use these two singletons is:
- Load TomlConfig from configuration file and set it as the default one.
- Clone the default one and patch it with values from annotations.
- Use the default one for permission checks, such as to check for
  allowed annotation keys/values.
- The patched version may be set as the active one or passed to clients.
- Clients directly access information from the active/passed one,
  and do not need to check annotations for overrides.
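
A minimal sketch of that workflow, assuming heavily simplified types: the
`TomlConfig` fields, the function names and the single supported annotation
key are hypothetical and only illustrate the default/active split, not the
actual kata-types API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical, much reduced stand-in for the parsed configuration file.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct TomlConfig {
    pub default_vcpus: u32,
    /// Annotation keys that may override values in this config.
    pub enable_annotations: Vec<String>,
}

/// Wrapper that gives convenient access to the parsed configuration.
#[derive(Debug, Clone, Default)]
pub struct KataConfig {
    pub config: TomlConfig,
}

// The two singletons: the version loaded from file and the patched version.
static DEFAULT_CONFIG: Mutex<Option<Arc<KataConfig>>> = Mutex::new(None);
static ACTIVE_CONFIG: Mutex<Option<Arc<KataConfig>>> = Mutex::new(None);

pub fn set_default_config(cfg: KataConfig) {
    *DEFAULT_CONFIG.lock().unwrap() = Some(Arc::new(cfg));
}

pub fn get_default_config() -> Option<Arc<KataConfig>> {
    DEFAULT_CONFIG.lock().unwrap().clone()
}

pub fn get_active_config() -> Option<Arc<KataConfig>> {
    ACTIVE_CONFIG.lock().unwrap().clone()
}

/// Clone the default config, patch it from annotations, and make the result
/// the active config. Permission checks always consult the default config.
pub fn activate_with_annotations(
    annotations: &HashMap<String, String>,
) -> Result<Arc<KataConfig>, String> {
    let base = get_default_config().ok_or("default config not set")?;
    let mut patched = (*base).clone();
    for (key, value) in annotations {
        if !base.config.enable_annotations.iter().any(|k| k == key) {
            return Err(format!("annotation {key} is not allowed"));
        }
        match key.as_str() {
            "default_vcpus" => {
                patched.config.default_vcpus = value
                    .parse::<u32>()
                    .map_err(|e| format!("{key}: {e}"))?;
            }
            _ => return Err(format!("annotation {key} is not supported")),
        }
    }
    let active = Arc::new(patched);
    *ACTIVE_CONFIG.lock().unwrap() = Some(Arc::clone(&active));
    Ok(active)
}
```

Because the default config is never mutated, a rejected annotation leaves
both the default and any previously activated config unchanged.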

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 18:26:48 +08:00
Liu Jiang
387ffa914e libs/types: support load Kata agent configuration from file
Add structures to load Kata agent configuration from configuration files.
Also define a mechanism for vendor to extend the Kata configuration
structure.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 18:26:37 +08:00
Liu Jiang
69f10afb71 libs/types: support load Kata hypervisor configuration from file
Add structures to load Kata hypervisor configuration from configuration
files. Also define mechanisms:
1) for hypervisors to handle the configuration info.
2) for vendors to extend the Kata configuration structure.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
21cc02d724 libs/types: support load Kata runtime configuration from file
Add structures to load Kata runtime configuration from configuration
files. Also define a mechanism for vendor to extend the Kata
configuration structure.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
5b89c1df2f libs/types: add kata-types crate under src/libs
Add kata-types crate to host constants and data types shared by multiple
Kata Containers components.

Fixes: #3305

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Fupan Li <lifupan@gmail.com>
Signed-off-by: Huamin Tang <huamin.thm@alibaba-inc.com>
Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com>
Signed-off-by: yanlei <yl.on.the.way@gmail.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
4f62a7618c libs/logging: fix clippy warnings
Fix clippy warnings of libs/logging.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
6f8acb94c2 libs: refine Makefile rules
Refine Makefile rules to better support the KATA ci env.

Fixes: #3536

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
7cdee4980c libs/logging: introduce a wrapper writer for logging
Introduce a wrapper writer `LogWriter` which converts every line written
to it into a log record.
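
As a rough illustration of the idea (not the `LogWriter` from this commit), a
line-buffering `std::io::Write` wrapper could look like the sketch below. The
type name `LineLogWriter` and the callback used as a stand-in for the logger
are assumptions.

```rust
use std::io::{self, Write};

/// Hypothetical writer that turns every complete line into one log record.
/// `emit` stands in for the real logging call.
pub struct LineLogWriter<F: FnMut(&str)> {
    buf: Vec<u8>,
    emit: F,
}

impl<F: FnMut(&str)> LineLogWriter<F> {
    pub fn new(emit: F) -> Self {
        Self { buf: Vec::new(), emit }
    }
}

impl<F: FnMut(&str)> Write for LineLogWriter<F> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.buf.extend_from_slice(data);
        // Emit one record per newline-terminated line; keep any partial
        // line buffered until the rest of it arrives.
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            let line: Vec<u8> = self.buf.drain(..=pos).collect();
            let text = String::from_utf8_lossy(&line[..line.len() - 1]);
            (self.emit)(&*text);
        }
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Emit whatever is left as a final record without a trailing newline.
        if !self.buf.is_empty() {
            let text = String::from_utf8_lossy(&self.buf).into_owned();
            (self.emit)(&text);
            self.buf.clear();
        }
        Ok(())
    }
}
```

Such a writer can be handed to anything that expects `impl Write`, for
example as the stdout of a helper whose output should end up in the log.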

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Wei Yang <wei.yang1@linux.alibaba.com>
Signed-off-by: yanlei <yl.on.the.way@gmail.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
426f38de94 libs/logging: implement rotator for log files
Add FileRotator to rotate log files.

The FileRotator structure may be used as writer for create_logger()
and limits the storage space occupied by log files.

Fixes: #3304

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Wei Yang <wei.yang1@linux.alibaba.com>
Signed-off-by: yanlei <yl.on.the.way@gmail.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
392f1ecdf5 libs: convert to a cargo workspace
Convert libs into a Cargo workspace, so all libraries could share the
build infrastructure.

Fixes #3282

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2022-06-10 18:25:24 +08:00
Liu Jiang
575df4dc4d static-checks: Allow Merge commit to be >75 chars
Some generated merge commit messages are >75 chars
Allow these to not trigger the subject line length failure

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2022-06-10 18:25:24 +08:00
Alex Carter
db5048d52c kernel: build efi_secret module for SEV
Add kernel fork for sev to kernel builder with efi_secret. Additionally, install efi_secret module for sev.

Fixes: #4179
Signed-off-by: Alex Carter <alex.carter@ibm.com>
2022-06-09 12:28:43 -05:00
Snir Sheriber
7676cde0c5 workflow: trigger test-kata-deploy with pull_request
event that changes VERSION (i.e. a release PR)

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-09 18:17:47 +03:00
Snir Sheriber
f10827357e workflow: require PR num input on test-kata-deploy workflow_dispatch
This requires a PR number to be set when triggering the test-kata-deploy workflow manually.
Also make sure user variables are set correctly when workflow_dispatch is used.

Fixes: #4349
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-09 18:14:43 +03:00
James O. D. Hunt
1b845978f9 docs: Add storage limits to arch doc
Updated the architecture document to explain that if you wish to
constrain the amount of disk space a container uses, you need to use an
existing facility such as `quota(1)`s or device mapper limits.

Fixes: #4430.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-09 10:52:17 +01:00
James O. D. Hunt
412441308b docs: Add more kata monitor details
Add more detail to the `kata-monitor` doc to allow an admin to make a
more informed decision about where and how to run the daemon.

Fixes: #4416.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-09 09:20:11 +01:00
Bin Liu
ae911d0cd3 Merge pull request #4378 from cmaf/update-containerd-docs-critools
docs: Update source for cri-tools
2022-06-09 15:12:37 +08:00
Bin Liu
05022975c8 Merge pull request #4413 from jodh-intel/tools-full-err-output
tools: Enable extra detail on error
2022-06-09 13:52:08 +08:00
Chelsea Mafrica
aaa74e8a2b Merge pull request #4415 from jodh-intel/agent-ctl-doc-examples
docs: Add agent-ctl examples section
2022-06-08 09:51:30 -07:00
snir911
a57515bdae Merge pull request #4384 from snir911/2.5.0-alpha2-branch-bump
# Kata Containers 2.5.0-alpha2
2022-06-08 19:32:57 +03:00
Eric Ernst
4ebf9d38b9 Merge pull request #4310 from egernst/core-sched
shim: add support for core scheduling
2022-06-08 17:42:45 +02:00
Bin Liu
eff4e1017d shim: change the log level for GetOOMEvent call failures
GetOOMEvent is a blocking call that will fail if
the container exits; in this case, it's not an error or warning.

Changing the log level for GetOOMEvent call failures
will reduce log noise in a large cluster where pods are
created and deleted frequently.

Fixes: #4376

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-06-08 22:17:24 +08:00
Snir Sheriber
eb24e97150 release: Kata Containers 2.5.0-alpha2
- docs: Update storage documentation link
- rustjail: get home dir using nix crate
- runk: Support `list` sub-command
- docs: Update vGPU use-case
- runtime: ignore ESRCH error from stop container
- docs: Update configuration reference for snap documentation
- workflows: add workflow_dispatch triggering to test-kata-deploy
- snap: Use helper script and cleanup
- feature: add ability to interact with IPTables within the guest
- agent: return mount file content if parse mountinfo failed
- docs: Update Intel QAT documentation links
- osbuilder: add iptables package
- runk: Return error when tty is used without console socket
- runk: Add Podman guide in README
- agent: Pass standard I/O to container launched by runk
- agent, runk: Enable test for the agent built with standard-oci-runtime feature
- runk: Handle rootfs path in config.json properly
- Update containerd docs
- clh: Update to v24.0
- snap: Build and package rust version of virtiofsd
- runk: merge oci-kata-agent into runk
- virtiofsd: static build virtiofsd from rust code for non-x86
- Fix issues with direct-volume stats feature
- runtime: fix incorrect Action function for direct-volume stats
- runtime: Adding the correct detection of mediated PCIe devices
- runtime: remove duplicate 'types' import
- runtime: sync docstrings with function names
- qemu: allow using legacy serial device for the console
- docs: Remove clear containers reference in README
- runtime: do not check for EOF error in console watcher
- kernel: Remove nemu.conf from packaging
- tools: delete unused param from get_from_kata_deps callers
- agent: Fix is_signal_handled failing parsing str to u64
- Improve Go unit test script
- packaging: Add kernel config option for SGX in Gramine
- ci: Don't run Docs URL Alive Check workflow on forks
- tools: Add QEMU patches for SGX numa support
- docs: Update runc containerd runtime
- Build and distribute the rust version of virtiofsd
- doc: Update log parser link
- Move the kata-log-parser from the tests repo
- versions: Upgrade to Cloud Hypervisor v23.1
- agent: Add a macro to skip a loop easier
- runk: use custom Kill command to support --all option
- agent: add test coverage for functions find_process and online_resources

fe3c1d9cd docs: Update storage documentation link
9d27c1fce agent: ignore ESRCH error when destroying containers
9726f56fd runtime: force stop container after the container process exits
168f325c4 docs: Update configuration reference for snap documentation
38a318820 runk: Support `list` sub-command
b9fc24ff3 docs: update release process github token instructions
c1476a174 docs: update release process with latest workflow triggering
002f2cd10 snap: Use helper script and cleanup
2e04833fb docs: Update Intel QAT documentation links
8b57bf97a workflows: add workflow_dispatch triggering to test-kata-deploy
6d0ff901a docs: Update vGPU use-case
9b108d993 docs: Improve snap formatting
894f661cc docs: Add warning to snap build
d759f6c3e snap: Fix CH architecture check
590381574 agent: Pass standard I/O to container launched by runk
af2ef3f7a agent-ctl: introduce handle for iptables get/set
65f0cef16 kata-runtime: add iptables CLI to test http endpoint
3201ad083 shim-client: ensure we check resp status for Put/Post
0706fb28a kata-runtime: shmgmt: make url usage consistent
2a09378dd shim-client: add support for DoPut
640173cfc shim-mgmt: Add endpoint handler for interacting with iptables
0136be22c virtcontainers: plumb iptable set/get from sandbox to agent
bd50d463b agent: iptables: get/set handling for iptables
7c4049aab osbuilder: add iptables package
03176a9e0 proto: update generated code based on proto update
38ebbc705 proto: update to add set/get iptables
78d45b434 agent: return mount file content if parse mountinfo failed
c7b3941c9 runk: Enable test for the agent built with standard-oci-runtime feature
6dbce7c3d agent: Remove unused import in console test
6ecea84bc rustjail: get home dir using nix crate
648b8d0ae runk: Return error when tty is used without console socket
5205efd9b runk: Add Podman guide in README
d862ca059 runk: Handle rootfs path in config.json properly
56591804b docs: Improve snap build instructions
cb2b30970 snap: Build using destructive mode
60823abb9 docs: Move snap README
fff832874 clh: Update to v24.0
49361749e snap: Build and package rust version of virtiofsd
27d903b76 snap: Put the yq binary in the staging bin directory
d7b4ce049 snap: Remove unused variable
43de5440e snap: Fix unbound variable error
c9b291509 snap: Fix whitespace
122a85e22 agent: remove bin oci-kata-agent
35619b45a runk: merge oci-kata-agent into runk
10c13d719 qemu: remove virtiofsd option in qemu config
d20bc5a4d virtiofsd: build rust based virtiofsd from source for non-x86_64
c95ba63c0 docs: Remove information related to Kata 1.x
34b80382b docs: Get rid of note related to networking.
dfad5728a docs: Mention --cni flag while invoking ctr
8e7c5975c agent: fix direct-assigned volume stats
4428ceae1 runtime: direct-volume stats use correct name
ffdc065b4 runtime: direct-volume stats update to use GET parameter
f29595318 runtime: fix incorrect Action function for direct-volume stats
7a5ccd126 runtime: sync docstrings with function names
ce2e521a0 runtime: remove duplicate 'types' import
834f93ce8 docs: fix annotations example
f4994e486 runtime: allow annotation configuration to use_legacy_serial
24a2b0f6a docs: Remove clear containers reference in README
abad33eba kernel: Remove nemu.conf from packaging
e87eb13c4 tools: delete unused param from get_from_kata_deps callers
8052fe62f runtime: do not check for EOF error in console watcher
c67b9d297 qemu: allow using legacy serial device for the console
44814dce1 qemu: treat console kernel params within appendConsole
4f586d2a9 packaging: Add kernel config option for SGX in Gramine
4b437d91f agent: Fix is_signal_handled failing parsing str to u64
88fb9b72e docs: Update runc containerd runtime
d1f2852d8 tools: Stop building virtiofsd with qemu (for x86_64)
c39852e83 runtime: Use ${LIBEXEC}/virtiofsd as the default virtiofsd path
b4b9068cb tools: Add QEMU patches for SGX numa support
a475956ab workflows: Add support for building virtiofsd
71f59f3a7 local-build: Add support for building virtiofsd
c7ac55b6d dockerbuild: Install unzip
8e2042d05 tools: add script to pull virtiofsd
dbedea508 versions: Add virtiofsd entry
e73b70baf runtime: Don't run unit tests verbose by default
f24a6e761 runtime: Consolidate flags setting in unit tests script
cf465feb0 runtime: Don't change test behaviour based on $CI or $KATA_DEV_MODE
34c4ac599 runtime: Remove redundant subcommands from go-test.sh
0aff5aaa3 runtime: Simplify package listing in go-test.sh
557c4cfd0 runtime: Don't chmod coverage files in Go tests
04c8b52e0 runtime: Remove HTML coverage option from go-test.sh
7f7691442 runtime: Add coverage.txt.tmp to gitignore
13c257700 runtime: Move go testing script locally
421064680 doc: Update log parser link
271933fec log-parser: fix some of the documentation
c7dacb121 log-parser: move the kata-log-parser from the tests repo
82ea01828 versions: Upgrade to Cloud Hypervisor v23.1
2a1d39414 runtime: Adding the correct detection of mediated PCIe devices
7bc4ab68c ci: Don't run Docs URL Alive Check workflow on forks
475e3bf38 agent: add test coverage for functions find_process and online_resources
383be2203 agent: Add a macro to skip a loop easier
97d7b1845 runk: use custom Kill command to support --all option

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-08 11:56:30 +03:00
dependabot[bot]
5d7fb7b7b0 build(deps): bump github.com/containerd/containerd in /src/runtime
Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.6.1 to 1.6.6.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.6.1...v1.6.6)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
...

Fixes: #4421
Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:54:46 +03:00
dependabot[bot]
d0ca2fcbbc build(deps): bump crossbeam-utils in /src/tools/trace-forwarder
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.5 to 0.8.8.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.5...crossbeam-utils-0.8.8)

---
updated-dependencies:
- dependency-name: crossbeam-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:47:58 +03:00
dependabot[bot]
a60dcff4d8 build(deps): bump regex from 1.5.4 to 1.5.6 in /src/tools/agent-ctl
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.6.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.6)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:47:58 +03:00
dependabot[bot]
dbf50672e1 build(deps): bump crossbeam-utils in /src/tools/agent-ctl
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.5 to 0.8.8.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.5...crossbeam-utils-0.8.8)

---
updated-dependencies:
- dependency-name: crossbeam-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:47:58 +03:00
dependabot[bot]
8e2847bd52 build(deps): bump crossbeam-utils from 0.8.6 to 0.8.8 in /src/libs
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.6 to 0.8.8.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.6...crossbeam-utils-0.8.8)

---
updated-dependencies:
- dependency-name: crossbeam-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:47:58 +03:00
dependabot[bot]
e9ada165ff build(deps): bump regex from 1.5.4 to 1.5.5 in /src/agent
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.5.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.5)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:47:58 +03:00
dependabot[bot]
adad9cef18 build(deps): bump crossbeam-utils from 0.8.5 to 0.8.8 in /src/agent
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.5 to 0.8.8.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.5...crossbeam-utils-0.8.8)

---
updated-dependencies:
- dependency-name: crossbeam-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-08 10:47:58 +03:00
James O. D. Hunt
34bcef8846 docs: Add agent-ctl examples section
Add a new `Examples` section to the `agent-ctl` docs giving some
examples of how to use the tool with QEMU and stand-alone.

Fixes: #4414.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-08 08:39:38 +01:00
James O. D. Hunt
815157bf02 docs: Remove erroneous whitespace
Deleted an extra blank line.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-08 08:39:38 +01:00
GabyCT
5bd81ba232 Merge pull request #4399 from GabyCT/topic/updatestoragedoc
docs: Update storage documentation link
2022-06-07 09:13:45 -05:00
James O. D. Hunt
f5099620f1 tools: Enable extra detail on error
The `agent-ctl` and `trace-forwarder` tools make use of
`anyhow::Context` to provide additional call site information on error.

However, previously neither tool was using the "alternate debug" format
to display the error, meaning full error output was not displayed.

Fixes: #4411.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-07 14:00:29 +01:00
Gabriela Cervantes
fe3c1d9cdd docs: Update storage documentation link
This PR updates the storage documentation link for the devicemapper
snapshotter.

Fixes #4398

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-06-06 14:48:34 +00:00
Bin Liu
a238d8c6bd Merge pull request #4300 from justxuewei/fix/rustjail/home-env
rustjail: get home dir using nix crate
2022-06-06 11:03:46 +08:00
Bin Liu
f981190621 Merge pull request #4383 from cyyzero/runk-list
runk: Support `list` sub-command
2022-06-06 10:25:33 +08:00
Bin Liu
f7b22eb777 Merge pull request #4344 from zvonkok/vgpu-documentation
docs: Update vGPU use-case
2022-06-06 10:25:05 +08:00
David Gibson
8f10e13e07 config: Allow enable_iommu pod annotation by default
Since #902 the `io.katacontainers.config.hypervisor` pod annotations
have only been permitted if explicitly allowed in the global
configuration.  The default global configuration allows no such
annotations.  That's important because several of those annotations
would cause Kata to execute arbitrary binaries, and so were wildly
unsafe.

However, this is inconvenient for the
`io.katacontainers.config.hypervisor.enable_iommu` annotation
specifically, which controls whether the sandbox VM includes a vIOMMU.
A guest side vIOMMU is necessary to implement VFIO passthrough devices
with `vfio_mode = vfio`, so enabling that mode of operation currently
requires a global configuration change, and can't just be enabled
per-pod.

Unlike some of the other hypervisor annotations, the `enable_iommu`
annotation is quite safe.  By default the vIOMMU is not present, so
allowing a user to override it for a pod only improves their
facilities for isolation.  Even if the global default were changed to
enable the vIOMMU, that doesn't compel the guest kernel to use it, so
allowing a user to disable the vIOMMU doesn't materially affect
isolation either.

Therefore, allow the io.katacontainers.config.hypervisor.enable_iommu
annotation to work in the default configurations.

fixes #4330

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-06-04 13:02:05 +10:00
Eric Ernst
430da47215 Merge pull request #4360 from fengwang666/shim-leak
runtime: ignore ESRCH error from stop container
2022-06-02 12:42:19 -07:00
GabyCT
9c9e5984ba Merge pull request #4342 from GabyCT/topic/updatesnapdoc
docs: Update configuration reference for snap documentation
2022-06-02 14:00:22 -05:00
Feng Wang
9d27c1fced agent: ignore ESRCH error when destroying containers
destroy() method should ignore the ESRCH error from signal::kill
and continue the operation as ESRCH is often considered harmless.
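
The pattern can be sketched as below. This is an assumption-laden
illustration rather than the agent's code: it uses `std::io::Error` in place
of the `nix` error type returned by `signal::kill`, and hard-codes the Linux
value of `ESRCH`.

```rust
use std::io;

/// Linux errno value for "No such process" (assumption: Linux target; real
/// code would take this constant from the `libc` or `nix` crate).
pub const ESRCH: i32 = 3;

/// Treat "the process is already gone" as success, pass everything else on.
pub fn ignore_esrch(result: io::Result<()>) -> io::Result<()> {
    match result {
        // The target process no longer exists: nothing left to kill.
        Err(e) if e.raw_os_error() == Some(ESRCH) => Ok(()),
        other => other,
    }
}
```

Wrapping the kill call this way lets the rest of the destroy path continue
when the process has already exited.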

Fixes: #4359

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-06-02 08:19:48 -07:00
Feng Wang
9726f56fdc runtime: force stop container after the container process exits
Set the stop container force flag to true so that the container state is always set to
“StateStopped” after the container wait goroutine is finished. This is necessary for
the following delete container step to succeed.

Fixes: #4359

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-06-02 08:17:08 -07:00
Gabriela Cervantes
168f325c43 docs: Update configuration reference for snap documentation
This PR updates the url link for the kata containers configuration
for the general snap documentation.

Fixes #4341

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-06-02 14:55:06 +00:00
Chen Yiyang
38a3188206 runk: Support list sub-command
Support list sub-command. It will traverse the root directory, parse
the status file and print basic information about containers. Behavior and
print format are consistent with runc. To handle races with runk delete
or modifications by a system user, the loop continues to traverse when
errors are encountered.
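
The "keep going on errors" traversal might look roughly like this. The
on-disk layout (`<root>/<id>/status` as a plain-text file) and the
`ContainerInfo` type are hypothetical and chosen only for illustration; they
are not runk's actual state format.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical summary of one container found under the root directory.
#[derive(Debug, PartialEq)]
pub struct ContainerInfo {
    pub id: String,
    pub status: String,
}

/// List containers under `root`, skipping entries that cannot be read,
/// for example because they are being deleted concurrently.
pub fn list_containers(root: &Path) -> io::Result<Vec<ContainerInfo>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(root)? {
        // A failing directory entry is skipped instead of aborting the walk.
        let entry = match entry {
            Ok(e) => e,
            Err(_) => continue,
        };
        // Assumed layout: <root>/<container-id>/status
        let status = match fs::read_to_string(entry.path().join("status")) {
            Ok(s) => s.trim().to_string(),
            Err(_) => continue, // missing or unreadable status: skip
        };
        out.push(ContainerInfo {
            id: entry.file_name().to_string_lossy().into_owned(),
            status,
        });
    }
    // Sort for stable output; read_dir order is platform dependent.
    out.sort_by(|a, b| a.id.cmp(&b.id));
    Ok(out)
}
```

Only a failure to open the root directory itself is reported to the caller;
per-container failures are treated as "this container is gone or in flux".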

Fixes: #4362

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-06-02 18:24:51 +08:00
snir911
a0805742d6 Merge pull request #4350 from snir911/fix_workflow
workflows: add workflow_dispatch triggering to test-kata-deploy
2022-06-02 13:19:13 +03:00
Fabiano Fidêncio
24182d72d9 Merge pull request #4322 from jodh-intel/snap-cleanup
snap: Use helper script and cleanup
2022-06-02 11:47:02 +02:00
Peng Tao
295a01f9b1 Merge pull request #4159 from egernst/topic/iptables
feature: add ability to interact with IPTables within the guest
2022-06-02 11:19:41 +08:00
Tim Zhang
b8e98b175c Merge pull request #4355 from liubin/fix/add-debug-info-for-parse-mount-error
agent: return mount file content if parse mountinfo failed
2022-06-02 10:31:46 +08:00
GabyCT
e8d0be364f Merge pull request #4375 from GabyCT/topic/updateqat
docs: Update Intel QAT documentation links
2022-06-01 15:52:02 -05:00
Chelsea Mafrica
7ae11cad67 docs: Update source for cri-tools
Kubernetes-incubator was previously deprecated in favor of
kubernetes-sigs.

Fixes #4377

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-06-01 12:48:48 -07:00
Chelsea Mafrica
25b1317ead Merge pull request #4357 from egernst/iptables-pkg
osbuilder: add iptables package
2022-06-01 09:28:38 -07:00
Snir Sheriber
b9fc24ff3a docs: update release process github token instructions
and fix the gpg generating key url

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-01 19:08:41 +03:00
Snir Sheriber
c1476a174b docs: update release process with latest workflow triggering
instructions

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-01 19:08:25 +03:00
James O. D. Hunt
002f2cd109 snap: Use helper script and cleanup
Move the common shell code to a helper script that is sourced by all
parts.

Add extra quoting to some variables in the snap config file
and simplify.

Fixes: #4304.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-01 16:09:29 +01:00
Gabriela Cervantes
2e04833fb9 docs: Update Intel QAT documentation links
This PR updates some Intel QAT documentation url links.

Fixes #4374

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-06-01 14:41:00 +00:00
Snir Sheriber
8b57bf97ab workflows: add workflow_dispatch triggering to test-kata-deploy
This allows the test-kata-deploy workflow to be triggered manually from
any branch instead of always using the one that is defined on main

See: https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/

Fixes: #4349
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-01 16:21:01 +03:00
Zvonko Kaiser
6d0ff901ab docs: Update vGPU use-case
Now that #4213 is merged we need updated documentation for vGPU time-sliced or vGPU MIG-backed.

Fixes: #4343

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-06-01 05:58:46 -07:00
James O. D. Hunt
9b108d9937 docs: Improve snap formatting
Improve the snap docs by using more consistent formatting and proper
shell code in the shell example.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-01 12:00:40 +01:00
James O. D. Hunt
894f661cc4 docs: Add warning to snap build
Since we must build with `--destructive-mode`, add a warning that the
host environment could change the behaviour of the build, depending on
the packages installed on the system.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-01 12:00:40 +01:00
James O. D. Hunt
d759f6c3e5 snap: Fix CH architecture check
Correct the `cloud-hypervisor` part architecture check to use `x86_64`, not
`x64_64`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-06-01 12:00:38 +01:00
Bin Liu
3e2817f7b5 Merge pull request #4325 from ManaSugi/runk/error-terminal
runk: Return error when tty is used without console socket
2022-06-01 13:58:38 +08:00
Bin Liu
a9a3074828 Merge pull request #4339 from ManaSugi/runk/add-podman-instruction
runk: Add Podman guide in README
2022-06-01 11:05:42 +08:00
Bin Liu
9f81c2dbf0 Merge pull request #4328 from ManaSugi/runk/output-stdout
agent: Pass standard I/O to container launched by runk
2022-06-01 11:00:26 +08:00
Manabu Sugimoto
5903815746 agent: Pass standard I/O to container launched by runk
The `kata-agent` passes its standard I/O file descriptors
through to the container process that will be launched
by `runk` without manipulation or modification in order to
allow the container process to handle its I/O operations.

Fixes: #4327

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-06-01 10:19:57 +09:00
Bin Liu
9658c6218e Merge pull request #4353 from ManaSugi/runk/enable-agent-unit-tests
agent, runk: Enable test for the agent built with standard-oci-runtime feature
2022-06-01 07:39:01 +08:00
Eric Ernst
d2df1209a5 docs: describe kata handling for core-scheduling
Add initial documentation for core-scheduling.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 16:17:00 -07:00
Michael Crosby
22b6a94a84 shim: add support for core scheduling
In linux 5.14 and hopefully some backports, core scheduling allows processes to
be co scheduled within the same domain on SMT enabled systems.

Containerd impl sets the core sched domain when launching a shim. This
allows a clean way for each shim (container/pod) to be in its own domain and
for any additional containers (v2 pods) to be launched with the same domain,
as well as any exec'd process added to the container.

kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html

For Kata specifically, we will look for SCHED_CORE environment variable
to be set to indicate we should create a new core scheduling domain.

This is equivalent to the containerd shim's PR: e48bbe8394

Fixes: #4309

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Michael Crosby <michael@thepasture.io>
2022-05-31 10:10:40 -07:00
Eric Ernst
af2ef3f7a5 agent-ctl: introduce handle for iptables get/set
Add support for the updated agent API for iptables

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
65f0cef16c kata-runtime: add iptables CLI to test http endpoint
While end users can connect directly to the shim, let's provide a way to
easily get/set iptables from kata-runtime itself.

Fixes: #4080
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
3201ad0830 shim-client: ensure we check resp status for Put/Post
Without this, potential errors are silently dropped. Let's ensure we
return the error code as well as potential data from the response.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
0706fb28ac kata-runtime: shmgmt: make url usage consistent
Before, we had a mix of slash, etc. Unfortunately, when cleaning URL
paths, serve mux seems to mangle the request method, resulting in each
request being a GET (instead of PUT or POST).

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
2a09378dd9 shim-client: add support for DoPut
While at it, make sure we check for nil in DoPost

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
640173cfc2 shim-mgmt: Add endpoint handler for interacting with iptables
Add two endpoints: ip6tables, iptables.

Each url handler supports GET and PUT operations. PUT expects
the request's data to be []bytes, and to contain iptables information in
a format that can be consumed by iptables-restore.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
0136be22ca virtcontainers: plumb iptable set/get from sandbox to agent
Introduce get/set iptable handling. We add a sandbox API for getting and
setting the IPTables within the guest. This routes it from sandbox
interface, through kata-agent, ultimately making requests to the guest
agent.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
bd50d463b2 agent: iptables: get/set handling for iptables
Initial support for getting and setting iptables in the guest.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:27:58 -07:00
Eric Ernst
7c4049aabb osbuilder: add iptables package
Since we are introducing an agent API for interacting with guest
iptables, let's ensure that our example rootfs' have iptables-save/restore
installed.

Fixes: #4356

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 09:21:02 -07:00
Eric Ernst
03176a9e09 proto: update generated code based on proto update
Update the generated agent.pb.go code based on proto update.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 08:45:59 -07:00
Eric Ernst
38ebbc705b proto: update to add set/get iptables
Update the agent protocol definition to introduce support for setting
and getting iptables from the guest.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-05-31 08:45:59 -07:00
Bin Liu
78d45b434f agent: return mount file content if parse mountinfo failed
Include mount file content in error message when parsing
mountinfo failed for debug.

Fixes: #4246, #4103

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-05-31 23:36:14 +08:00
Manabu Sugimoto
c7b3941c96 runk: Enable test for the agent built with standard-oci-runtime feature
This enables tests for the kata-agent for runk that is built
with standard-oci-runtime feature in CI.

Fixes: #4351

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-05-31 21:54:28 +09:00
Manabu Sugimoto
6dbce7c3de agent: Remove unused import in console test
Remove some unused imports in console test module
used by runk's test.

Fixes: #4351

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-05-31 21:54:02 +09:00
Xuewei Niu
6ecea84bc5 rustjail: get home dir using nix crate
Get user's home dir using `nix::unistd` crate instead of `utils` crate,
and remove useless code from agent.

Fixes: #4209

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-05-31 15:04:33 +08:00
Manabu Sugimoto
648b8d0aec runk: Return error when tty is used without console socket
runk always launches containers with detached mode,
so users have to use a console socket with run or
create operation when a terminal is used.
If users set `terminal` to `true` in `config.json` and
try to launch a container without specifying a console
socket, runk returns an error with a message early.

Fixes: #4324

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-05-31 09:55:39 +09:00
James O. D. Hunt
96c8df40b5 Merge pull request #4335 from ManaSugi/runk/fix-invalid-rootfs
runk: Handle rootfs path in config.json properly
2022-05-30 14:03:58 +01:00
Manabu Sugimoto
5205efd9b4 runk: Add Podman guide in README
runk can launch containers using Podman, so add the guide
in README.

Fixes: #4338

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-05-30 19:06:46 +09:00
James O. D. Hunt
d157f9b71e Merge pull request #3871 from amshinde/update-containerd-docs
Update containerd docs
2022-05-30 08:38:07 +01:00
Manabu Sugimoto
d862ca0590 runk: Handle rootfs path in config.json properly
This commit enables runk to handle `root.path` in `config.json`
properly even if the path is specified by a relative path that
includes the single (`.`) or the double (`..`) dots.
For example, with a bundle at `/to/bundle` and a rootfs directly
under `/to/bundle` such as `/to/bundle/{bin,dev,etc,home,...}`,
the `root.path` value can be either `/to/bundle` or just `.`.
This behavior conforms to OCI runtime spec.
Accordingly, a bundle path managed by runk's status file
(`status.json`) always is statically stored as a canonical path.
Previously, a bundle path was obtained from `oci_state()` of rustjail's
API that returns the path as the parent directory path of a rootfs
(`root.path`). In case of the kata-agent, this works properly because
the kata containers assume that the rootfs path is always
`/to/bundle/rootfs`. However in case of standard OCI runtimes,
a rootfs can be placed anywhere under a bundle, so the rootfs path
doesn't always have to be at a `/to/bundle/rootfs`.
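
The path handling described above can be sketched as follows. This is an illustration in Go, not runk's Rust implementation: `resolveRootfs` is a hypothetical helper, and it only normalizes the path lexically, whereas a real canonicalization would presumably also resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveRootfs turns the `root.path` value from config.json into an
// absolute, cleaned path. Relative values are interpreted against the
// bundle directory. Hypothetical helper; lexical cleanup only.
func resolveRootfs(bundle, rootPath string) string {
	if filepath.IsAbs(rootPath) {
		return filepath.Clean(rootPath)
	}
	// Join also cleans the result, collapsing "." and ".." components.
	return filepath.Join(bundle, rootPath)
}

func main() {
	for _, p := range []string{".", "rootfs", "../bundle/rootfs", "/to/bundle"} {
		fmt.Printf("%-18s -> %s\n", p, resolveRootfs("/to/bundle", p))
	}
}
```

Because the rootfs may live anywhere under the bundle, the bundle path has to be recorded separately rather than derived as the parent directory of the rootfs.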

Fixes: #4334

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-05-30 14:41:26 +09:00
snir911
d50937435d Merge pull request #4318 from fidencio/topic/update-clh-to-v24.0
clh: Update to v24.0
2022-05-29 15:06:17 +03:00
James O. D. Hunt
56591804b3 docs: Improve snap build instructions
Make it clearer how to build the snap package manually.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-26 15:56:36 +01:00
James O. D. Hunt
cb2b30970d snap: Build using destructive mode
Destructive mode is required to build the Kata Containers snap. See:

```
.github/workflows/snap-release.yaml
.github/workflows/snap.yaml
```

Hence, update the last file that we forgot to update with
`--destructive-mode`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-26 15:56:36 +01:00
James O. D. Hunt
60823abb9c docs: Move snap README
Move the snap README to a subdirectory to resolve the warning given by
`snapcraft` (folded and reformatted slightly for clarity):

```
The 'snap' directory is meant specifically for snapcraft,
but it contains the following non-snapcraft-related paths,
which is unsupported and will cause unexpected behavior:

- README.md

If you must store these files within the 'snap' directory,
move them to 'snap/local', which is ignored by snapcraft.
```

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-26 15:56:36 +01:00
James O. D. Hunt
4134beee39 Merge pull request #4301 from jodh-intel/snap-package-rust-virtiofsd
snap: Build and package rust version of virtiofsd
2022-05-26 15:55:06 +01:00
Fabiano Fidêncio
fff832874e clh: Update to v24.0
This release has been tracked through the v24.0 project.

virtio-iommu specification describes how a device can be attached by default
to a bypass domain. This feature is particularly helpful for booting a VM with
guest software which doesn't support virtio-iommu but still needs to access
the device. Now that Cloud Hypervisor supports this feature, it can boot a VM
with Rust Hypervisor Firmware or OVMF even if the virtio-block device exposing
the disk image is placed behind a virtual IOMMU.

Multiple checks have been added to the code to prevent devices with identical
identifiers from being created, and therefore avoid unexpected behaviors at boot
or whenever a device was hot plugged into the VM.

Sparse mmap support has been added to both VFIO and vfio-user devices. This
allows the device regions that are not fully mappable to be partially mapped.
And the more a device region can be mapped into the guest address space, the
fewer VM exits will be generated when this device is accessed. This directly
impacts the performance related to this device.

A new serial_number option has been added to --platform, allowing a user to
set a specific serial number for the platform. This number is exposed to the
guest through the SMBIOS.

* Fix loading RAW firmware (#4072)
* Reject compressed QCOW images (#4055)
* Reject virtio-mem resize if device is not activated (#4003)
* Fix potential mmap leaks from VFIO/vfio-user MMIO regions (#4069)
* Fix algorithm finding HOB memory resources (#3983)

* Refactor interrupt handling (#4083)
* Load kernel asynchronously (#4022)
* Only create ACPI memory manager DSDT when resizable (#4013)

Deprecated features will be removed in a subsequent release and users should
plan to use alternatives:

* The mergeable option from the virtio-pmem support has been deprecated
(#3968)
* The dax option from the virtio-fs support has been deprecated (#3889)

Fixes: #4317

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-26 08:51:18 +00:00
James O. D. Hunt
49361749ed snap: Build and package rust version of virtiofsd
Update the snap config file to build the rust version of `virtiofsd` for
x86_64, but build QEMU's C version for other platforms.

Fixes: #4261.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-25 17:04:05 +01:00
James O. D. Hunt
27d903b76a snap: Put the yq binary in the staging bin directory
Rather than putting the `yq` binary in the staging directory itself,
put it in the `bin/` sub-directory.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-25 09:40:09 +01:00
James O. D. Hunt
d7b4ce049e snap: Remove unused variable
Remove the unused `kata_url` variable and use the value in the `website`
YAML metadata instead.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-25 09:40:09 +01:00
James O. D. Hunt
43de5440e5 snap: Fix unbound variable error
Don't assume `GITHUB_REF` is set.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-25 09:40:09 +01:00
James O. D. Hunt
c9b291509d snap: Fix whitespace
Remove trailing space.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-05-25 09:40:09 +01:00
Fupan Li
62d1ed0651 Merge pull request #4290 from Tim-Zhang/remove-oci-kata-agent
runk: merge oci-kata-agent into runk
2022-05-25 11:31:25 +08:00
Fabiano Fidêncio
8a2b82ff51 Merge pull request #4276 from jongwu/build_rust_virtiofsd
virtiofsd: static build virtiofsd from rust code for non-x86
2022-05-24 14:57:21 +02:00
Eric Ernst
6d00701ec9 Merge pull request #4298 from yibozhuang/fix-direct-volume
Fix issues with direct-volume stats feature
2022-05-23 15:23:51 -07:00
Tim Zhang
122a85e222 agent: remove bin oci-kata-agent
Fixes: #4291

Signed-off-by: Tim Zhang <tim@hyper.sh>
2022-05-23 16:55:16 +08:00
Tim Zhang
35619b45aa runk: merge oci-kata-agent into runk
Merge two bins into one.

Fixes: #4291

Signed-off-by: Tim Zhang <tim@hyper.sh>
2022-05-23 16:54:09 +08:00
Fabiano Fidêncio
b9315af092 Merge pull request #4294 from yibozhuang/direct-volume-stats
runtime: fix incorrect Action function for direct-volume stats
2022-05-23 10:22:29 +02:00
Jianyong Wu
10c13d719a qemu: remove virtiofsd option in qemu config
As virtiofsd will be built based on Rust, the "virtiofsd" option is no longer
needed in qemu.

Fixes: #4258
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-05-23 12:57:59 +08:00
Jianyong Wu
d20bc5a4d2 virtiofsd: build rust based virtiofsd from source for non-x86_64
Based on @fidencio's opinion,
On Arm: static build virtiofsd using musl lib;
on ppc64 & s390: static build virtiofsd using gnu lib;

Fixes: #4258
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-05-23 12:57:59 +08:00
Archana Shinde
c95ba63c0c docs: Remove information related to Kata 1.x
Since Kata 2.x does not support runtime cli, remove information
related to it. Update the configuration snippet accordingly.

Fixes #3870

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-05-21 07:19:28 +05:30
Archana Shinde
34b80382b6 docs: Get rid of note related to networking.
One may want to use standalone containerd without k8s
and still have network enabled for the container.
Getting rid of note due to inaccuracy.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-05-21 07:19:28 +05:30
Archana Shinde
dfad5728a7 docs: Mention --cni flag while invoking ctr
Specify that the `--cni` flag needs to be passed to the `ctr` tool
while starting a container in order to have networking enabled for the
container. This flag allows containerd to call into the configured
network plugin which in turn creates a network interface for the
container.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-05-21 07:19:28 +05:30
Yibo Zhuang
8e7c5975c6 agent: fix direct-assigned volume stats
The current implementation of walking the
disks to match with the requested volume path
in agent doesn't work because the volume path
provided by the shim to the agent is the mount
path within the guest and not the device name.
The current logic is trying to match the
device name to the volume path which will never
match.

This change will simplify the
get_volume_capacity_stats and
get_volume_inode_stats to just call statfs and
get the bytes and inodes usage of the volume
path directly.

Fixes: #4297

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-20 18:43:27 -07:00
Yibo Zhuang
4428ceae16 runtime: direct-volume stats use correct name
Today the shim does a translation when doing
direct-volume stats where it takes the source and
returns the mount path within the guest.

The source for a direct-assigned volume is actually
the device path on the host and not the publish
volume path.

This change will perform a lookup of the mount info
during direct-volume stats to ensure that the
device path is provided to the shim for querying
the volume stats.

Fixes: #4297

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-20 18:42:47 -07:00
Yibo Zhuang
ffdc065b4c runtime: direct-volume stats update to use GET parameter
The go default http mux AFAIK doesn’t support pattern
routing so right now client is padding the url
for direct-volume stats with a subpath of the volume
path and this will always result in 404 not found returned
by the shim.

This change will update the shim to take the volume
path as a GET query parameter instead of a subpath.
If the parameter is missing or empty, then return
400 BadRequest to the client.
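
A minimal sketch of the approach described above, under stated assumptions: the `/direct-volume/stats` route, the `path` parameter name and the `statsHandler` function are hypothetical and are not taken from the shim's code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// statsHandler reads the volume path from a GET query parameter and
// rejects requests where it is missing or empty. Hypothetical handler.
func statsHandler(w http.ResponseWriter, r *http.Request) {
	volumePath := r.URL.Query().Get("path")
	if volumePath == "" {
		http.Error(w, "missing volume path", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "stats for %s", volumePath)
}

// statusFor registers the handler on an exact (non-subtree) pattern and
// returns the status code the mux produces for the given request target.
func statusFor(target string) int {
	mux := http.NewServeMux()
	mux.HandleFunc("/direct-volume/stats", statsHandler)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	// Volume path passed as an escaped query parameter: handled.
	fmt.Println(statusFor("/direct-volume/stats?path=" + url.QueryEscape("/mnt/vol1")))
	// Parameter missing: 400 Bad Request.
	fmt.Println(statusFor("/direct-volume/stats"))
	// Volume path appended as a subpath: the exact pattern does not match.
	fmt.Println(statusFor("/direct-volume/stats/mnt/vol1"))
}
```

The last call shows the failure mode the commit fixes: a pattern registered without a trailing slash matches only that exact path, so anything appended to it falls through to 404.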

Fixes: #4297

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-20 18:41:51 -07:00
Yibo Zhuang
f295953183 runtime: fix incorrect Action function for direct-volume stats
The action function expects a function that returns error
but the current direct-volume stats Action returns
(string, error) which is invalid.

This change fixes the format and prints out the stats from
the command instead.

Fixes: #4293

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-20 14:55:00 -07:00
Peng Tao
2c238c8504 Merge pull request #4213 from zvonkok/vfio
runtime: Adding the correct detection of mediated PCIe devices
2022-05-20 15:00:23 +08:00
Fabiano Fidêncio
811ac6a8ce Merge pull request #4282 from r4f4/runtime-dedup-types-import
runtime: remove duplicate 'types' import
2022-05-19 22:15:36 +02:00
Chelsea Mafrica
d8be0f8e9f Merge pull request #4281 from r4f4/runtime-qemu-comments
runtime: sync docstrings with function names
2022-05-19 09:17:38 -07:00
Rafael Fonseca
7a5ccd1264 runtime: sync docstrings with function names
The functions were renamed but their docstrings were not.

Fixes #4006

Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>
2022-05-19 14:31:47 +02:00
Greg Kurz
fa61bd43ee Merge pull request #4238 from snir911/wip/legacy_console
qemu: allow using legacy serial device for the console
2022-05-19 14:30:59 +02:00
Rafael Fonseca
ce2e521a0f runtime: remove duplicate 'types' import
Fallout of 09f7962ff

Fixes #4285

Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>
2022-05-19 13:49:47 +02:00
Snir Sheriber
834f93ce8a docs: fix annotations example
annotation value should always be quoted, regardless of its type

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-19 09:52:30 +03:00
GabyCT
d7aded7238 Merge pull request #4279 from GabyCT/topic/updateosbuilderreadme
docs: Remove clear containers reference in README
2022-05-18 14:26:56 -05:00
Snir Sheriber
f4994e486b runtime: allow annotation configuration to use_legacy_serial
and update the docs and test

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-18 18:58:21 +03:00
Gabriela Cervantes
24a2b0f6a2 docs: Remove clear containers reference in README
This PR removes the clear containers reference from the rootfs builder
README, as it is no longer being used and is deprecated.

Fixes #4278

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-05-18 14:53:17 +00:00
Fabiano Fidêncio
c88a48be21 Merge pull request #4271 from r4f4/runtime-err-check-fix
runtime: do not check for EOF error in console watcher
2022-05-18 09:49:48 +02:00
GabyCT
9458cc0053 Merge pull request #4273 from GabyCT/topic/removenemuconf
kernel: Remove nemu.conf from packaging
2022-05-17 16:06:45 -05:00
Greg Kurz
42c64b3d2c Merge pull request #4269 from r4f4/remove-unused-param-get_kata_deps
tools: delete unused param from get_from_kata_deps callers
2022-05-17 18:54:47 +02:00
Gabriela Cervantes
abad33eba0 kernel: Remove nemu.conf from packaging
This PR removes the nemu.conf as we are no longer using NEMU from
the kernel configurations.

Fixes #4272

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-05-17 16:23:17 +00:00
Chelsea Mafrica
04bd8f16f0 Merge pull request #4252 from Champ-Goblem/patch/fix-is-signal-handled
agent: Fix is_signal_handled failing parsing str to u64
2022-05-17 08:31:48 -07:00
GabyCT
12f0ab120a Merge pull request #4191 from dgibson/go-test-script
Improve Go unit test script
2022-05-17 10:27:04 -05:00
Rafael Fonseca
e87eb13c4f tools: delete unused param from get_from_kata_deps callers
The param was deleted by a09e58fa80, so
update the callers not to use it.

Fixes #4245

Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>
2022-05-17 15:18:41 +02:00
Rafael Fonseca
8052fe62fa runtime: do not check for EOF error in console watcher
The documentation of the bufio package explicitly says

"Err returns the first non-EOF error that was encountered by the
Scanner."

When io.EOF happens, `Err()` will return `nil` and `Scan()` will return
`false`.
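
The behaviour quoted above can be checked with a small program. This is a sketch for illustration; `drainLines` is a hypothetical helper, not the console watcher's code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// drainLines reads every line from the input and returns the lines along
// with whatever scanner.Err() reports once Scan() has returned false.
func drainLines(input string) ([]string, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	var lines []string
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	// Reaching the end of input is not reported as an error: Err() is nil
	// here, so comparing it against io.EOF can never be true.
	return lines, scanner.Err()
}

func main() {
	lines, err := drainLines("first\nsecond\n")
	fmt.Println(len(lines), err)
}
```

A check of the form `err != nil && err != io.EOF` is therefore equivalent to `err != nil` for a `bufio.Scanner`.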

Fixes #4079

Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>
2022-05-17 15:14:33 +02:00
Fabiano Fidêncio
5d43718494 Merge pull request #4267 from cmaf/packaging-config-add-numa
packaging: Add kernel config option for SGX in Gramine
2022-05-17 13:10:24 +02:00
Snir Sheriber
c67b9d2975 qemu: allow using legacy serial device for the console
This allows to get guest early boot logs which are usually
missed when virtconsole is used.
- It utilizes previous work on the govmm side:
https://github.com/kata-containers/govmm/pull/203
- unit test added

Fixes: #4237
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-17 12:06:11 +03:00
Snir Sheriber
44814dce19 qemu: treat console kernel params within appendConsole
as it is tightly coupled with the appended console device
additionally have it tested

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-17 12:05:31 +03:00
Fupan Li
856c8e81f1 Merge pull request #4220 from liubin/fix/4219
ci: Don't run Docs URL Alive Check workflow on forks
2022-05-17 12:19:55 +08:00
Chelsea Mafrica
4f586d2a91 packaging: Add kernel config option for SGX in Gramine
For the Gramine Shielded Containers guest kernel, CONFIG_NUMA must be
enabled.

Fixes #4266

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-05-16 16:58:26 -07:00
Champ-Goblem
4b437d91f0 agent: Fix is_signal_handled failing parsing str to u64
In the is_signal_handled function, when parsing the hex string returned
from `/proc/<pid>/status` the space/tab character after the colon
is not removed.

This patch trims the result of SigCgt so that
all whitespace characters are removed. It also extends the existing
test cases to check for this scenario.
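
The parsing issue can be illustrated with the following sketch. It is written in Go for illustration only (the agent itself is written in Rust), and `signalHandled` is a hypothetical function, not a port of `is_signal_handled`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// signalHandled reports whether signum is set in the SigCgt mask found in
// the text of a /proc/<pid>/status file. Hypothetical helper.
func signalHandled(status string, signum uint) (bool, error) {
	if signum < 1 || signum > 64 {
		return false, fmt.Errorf("signal %d out of range", signum)
	}
	for _, line := range strings.Split(status, "\n") {
		if !strings.HasPrefix(line, "SigCgt:") {
			continue
		}
		// The value is separated from the colon by whitespace (a tab);
		// without trimming, the hex parse below fails.
		field := strings.TrimSpace(strings.TrimPrefix(line, "SigCgt:"))
		mask, err := strconv.ParseUint(field, 16, 64)
		if err != nil {
			return false, err
		}
		// Signal N corresponds to bit N-1 of the mask.
		return mask&(uint64(1)<<(signum-1)) != 0, nil
	}
	return false, fmt.Errorf("SigCgt line not found")
}

func main() {
	status := "Name:\tsleep\nSigCgt:\t0000000000004002\n"
	for _, sig := range []uint{1, 2, 15} {
		handled, err := signalHandled(status, sig)
		fmt.Println(sig, handled, err)
	}
}
```

With the sample mask `0x4002`, bits 1 and 14 are set, so signals 2 and 15 are reported as handled and signal 1 is not.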

Fixes: #4250
Signed-off-by: Champ-Goblem <cameron@northflank.com>
2022-05-16 20:34:26 +02:00
Fabiano Fidêncio
6ffdebd202 Merge pull request #4255 from cmaf/tools-patch-qemu-sgx-numa
tools: Add QEMU patches for SGX numa support
2022-05-16 18:10:41 +02:00
Chelsea Mafrica
ee9ee77388 Merge pull request #4264 from GabyCT/topic/updatecontainerdrunt
docs: Update runc containerd runtime
2022-05-16 08:56:26 -07:00
Gabriela Cervantes
88fb9b72e2 docs: Update runc containerd runtime
As we are using a containerd version > 1.4 we need to update
the runc containerd runtime.

Fixes #4263

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-05-16 14:33:48 +00:00
Suraj Deshmukh
0e2459d13e docs: Add cgroupDriver for containerd
This commit updates the "Run Kata Containers with Kubernetes" to include
cgroupDriver configuration via "KubeletConfiguration". Without this
setting kubeadm defaults to systemd cgroupDriver. Containerd with Kata
cannot spawn containers with systemd cgroup driver.

Fixes: #4262

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
2022-05-16 17:32:57 +05:30
Fabiano Fidêncio
d1f2852d8b tools: Stop building virtiofsd with qemu (for x86_64)
As we finally can move to using the rust virtiofs daemon, let's stop
building and packaging the C version of the virtiofsd for x86_64.

Fixes: #4249
Depends-on: github.com/kata-containers/tests#4785

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-16 09:30:24 +02:00
Fabiano Fidêncio
c39852e83f runtime: Use ${LIBEXEC}/virtiofsd as the default virtiofsd path
As now we build and ship the rust version of virtiofsd, which is not
tied to QEMU, we need to update its default location to match with where
we're installing this binary.

Fixes: #4249

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-16 09:30:24 +02:00
Chelsea Mafrica
b4b9068cb7 tools: Add QEMU patches for SGX numa support
There are a few patches for SGX numa support in QEMU added after the
6.2.0 release. Add them for SGX support in Kata.

Fixes #4254

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-05-13 16:34:57 -07:00
Fabiano Fidêncio
b780be99d7 Merge pull request #4233 from fidencio/topic/virtiofsd-switch-to-the-rust-version
Build and distribute the rust version of virtiofsd
2022-05-13 19:38:01 +02:00
Fabiano Fidêncio
a475956abd workflows: Add support for building virtiofsd
As already done for the other assets we rely on, let's build (well, pull
in this very specific case) the virtiofsd binary, as we're relying on
its standalone rust version from now on.

Fixes: #4234

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-13 11:37:36 +02:00
Fabiano Fidêncio
71f59f3a7b local-build: Add support for building virtiofsd
As done for the other binaries we release, let's add support for
"building" (or pulling down) the static binary we ship as part of the
kata-containers static tarball (the same one used by kata-deploy).

Right now the virtiofsd is installed in /opt/kata/libexec/virtiofsd, a
different path than the virtiofsd that comes with QEMU.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-13 11:37:36 +02:00
Fabiano Fidêncio
c7ac55b6d7 dockerbuild: Install unzip
As virtiofsd comes in the `zip` format, let's install unzip in the
containers and then be able to access the virtiofsd binary.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-13 11:37:36 +02:00
Fabiano Fidêncio
8e2042d055 tools: add script to pull virtiofsd
Right now this is very much x86_64 specific, but I'd like to count on
the maintainers of the other architectures to expand it.

Also, the name as it's now may be misleading, as we're actually only
pulling the binary that's statically built using `musl` and released as
part of virtiofsd official releases.  But we'll need to build it for the
other architectures, thus I'm following the naming of the scripts used
by the other components.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-13 11:37:21 +02:00
Fabiano Fidêncio
dbedea5086 versions: Add virtiofsd entry
As we're switching to using the rust version of the virtiofsd, let's
give it its own entry in the versions.yaml file, as it's no longer part
of QEMU.

It's important to mention that GitLab doesn't provide a well formed URL
for the releases.  Instead, it adds there a hash, leading us to have to
add the specific link for the tarball.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-13 11:23:39 +02:00
David Gibson
e73b70baff runtime: Don't run unit tests verbose by default
go-test.sh by default adds the -v option to 'go test' meaning that output
will be printed from all the passing tests as well as any failing ones.
This results in a lot of output in which it's often difficult to locate the
failing tests you're interested in.

So, remove -v from the default flags.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:22:31 +10:00
David Gibson
f24a6e761f runtime: Consolidate flags setting in unit tests script
One of the responsibilities of the go-test.sh script is setting up the
default flags for 'go test'.  This is constructed across several different
places in the script using several unneeded intermediate variables though.

Consolidate all the flag construction into one place.

fixes #4190

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:22:29 +10:00
David Gibson
cf465feb02 runtime: Don't change test behaviour based on $CI or $KATA_DEV_MODE
go-test.sh changes behaviour based on both the $CI and $KATA_DEV_MODE
variables, but not in a way that makes a lot of sense.

If either one is set it uses the test_coverage path, instead of the
test_local path.  That collects coverage information, as the name
suggests, but it also means it runs the tests twice as root and
non-root, which is very non-obvious.

It's not clear what use case the test_local path is for at all.
Developer local builds will typically have $KATA_DEV_MODE set and CI
builds will have $CI set.  There's essentially no downside to running
coverage all the time - it has little impact on the test runtime.

In addition, if *both* $CI and $KATA_DEV_MODE are set, the script
refuses to run things as root, considering it "unsafe".  While having
both set might be unwise in a general sense, there's not really any
way running sudo can be any more unsafe than it is with either one
set.

So, simplify everything by just always running the test_coverage path.
This leaves the test_local path unused, so we can remove it entirely.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
David Gibson
34c4ac599c runtime: Remove redundant subcommands from go-test.sh
go-test.sh accepts subcommands, however invoking it in the usual way via
the Makefile doesn't use them.  In fact the only remaining subcommand is
"help" and we already have another way of getting the usage information
(-h or --help).  We don't need a second way, so just drop subcommand
handling.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
David Gibson
0aff5aaa39 runtime: Simplify package listing in go-test.sh
go-test.sh defaults to testing all the packages listed by go list, except
for a number filtered out.  It turns out that none of those filters are
necessary any more:
  * We've long required a Go newer than 1.9 which means the vendor filter
    isn't needed
  * The agent filter doesn't do anything now that we've moved to the Kata
    2.x unified repo
  * The tests filters don't hit anything on the list of modules in
    src/runtime (which is the only user of the script)

But since we don't need to filter anything out any more, we don't even need
to iterate through a list ourselves.  We can simply pass "./..." directly
to go test and it will iterate through all the sub-packages itself.

Interestingly this more than doubles the speed of "make test" for me - I
suspect because go test's internal parallelism works better over a larger
pool of tests.

This also lets us remove handling of non-existent coverage files from
test_go_package(), since with default options we will no longer test
packages without tests.  If the user explicitly requests testing of a
package with no tests, then failing makes sense.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
David Gibson
557c4cfd00 runtime: Don't chmod coverage files in Go tests
The go-test.sh script has an explicit chmod command, run as root, to
set the mode of the temporary coverage files to 0644.  AFAICT the
point of this is specifically the 004 bit allowing world read access,
so that we can then merge the temporary coverage file into the main
coverage file.

That's a convoluted way of doing things.  Instead we can just run the tail
command which reads the temporary file as the same user that generated it.

In addition, go-test.sh became root to remove that temporary coverage
file.  This is not necessary, since deleting a regular file just requires
write access to the directory, not the file itself.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
David Gibson
04c8b52e04 runtime: Remove HTML coverage option from go-test.sh
The html-coverage option to this script doesn't really alter behaviour;
it just does the same thing as normal coverage, then converts the
report to HTML.  That conversion is a single command, plus a chmod to
make the final output mode 0644.  That overrides any umask the user
has set, which doesn't seem like a policy decision this script should
be making.

Nothing in the kata-containers or tests repository uses this, so it doesn't
really make sense to keep this logic inside this script.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
David Gibson
7f76914422 runtime: Add coverage.txt.tmp to gitignore
In addition to coverage.txt, the go-test.sh script creates
coverage.txt.tmp files while running.  These are temporary and
certainly shouldn't be committed, so add them to the gitignore file.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
David Gibson
13c2577004 runtime: Move go testing script locally
The go unit tests for the runtime are invoked by the helper script
ci/go-test.sh.  Which calls the run_go_test() function in ci/lib.sh.  Which
calls into .ci/go-test.sh from the tests repository.

But.. the runtime is the only user of this script, and generally stuff for
unit tests (rather than functional or integration tests) lives in the main
repository, not the tests repository.

So, just move the actual script into src/runtime.  A change to remove it
from the tests repo will follow.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-05-13 13:14:37 +10:00
Wainer Moschetta
97425a7fe6 Merge pull request #4240 from stevenhorsman/dev-guide-broken-link
doc: Update log parser link
2022-05-12 11:51:51 -03:00
stevenhorsman
4210646802 doc: Update log parser link
- Update log-parser link to reflect new location
- Also update the link to be relative

Fixes: #4239
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2022-05-12 14:23:13 +01:00
snir911
51fa4ab671 Merge pull request #4165 from snir911/mv_parser
Move the kata-log-parser from the tests repo
2022-05-11 10:33:36 +03:00
Bo Chen
79fb4fc5cb Merge pull request #4223 from likebreath/0509/clh_v23.1
versions: Upgrade to Cloud Hypervisor v23.1
2022-05-10 10:40:22 -07:00
Snir Sheriber
271933fec0 log-parser: fix some of the documentation
minor fixes of links and text

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-10 13:23:25 +03:00
Snir Sheriber
c7dacb1211 log-parser: move the kata-log-parser from the tests repo
to the kata-containers repo under the src/tools/log-parser folder
and vendor the modules

Fixes: #4100
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-10 13:23:25 +03:00
GabyCT
61a167139c Merge pull request #4186 from liubin/fix/4185-skip-loop-by-user
agent: Add a macro to skip a loop easier
2022-05-09 16:58:29 -05:00
Bo Chen
82ea018281 versions: Upgrade to Cloud Hypervisor v23.1
The following issues have been addressed from the latest bug fix release
v23.1 of Cloud Hypervisor: 1) Add some missing seccomp rules; 2) Remove
virtio-fs filesystem entries from config on removal; 3) Do not delete
API socket on API server start; 4) Reject virtio-mem resize if the guest
doesn't activate the device; 5) Fix OpenAPI naming of I/O throttling
knobs;

Fixes: #4222

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-05-09 14:15:12 -07:00
Fupan Li
8aad2c59c5 Merge pull request #4184 from liubin/fix/4182-runk-kill-all
runk: use custom Kill command to support --all option
2022-05-09 17:56:10 +08:00
Zvonko Kaiser
2a1d394147 runtime: Adding the correct detection of mediated PCIe devices
Fixes #4212

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-05-09 00:57:06 -07:00
Bin Liu
7bc4ab68c3 ci: Don't run Docs URL Alive Check workflow on forks
This workflow is a scheduled job that runs at 23:00
every Sunday. It should only run on the main repo,
not on the forked ones.

Fixes: #4219

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-05-09 11:54:25 +08:00
James O. D. Hunt
79d93f1fe7 Merge pull request #4137 from Shensd/sandbox-tests-online_resources
agent: add test coverage for functions find_process and online_resources
2022-05-06 09:20:57 +01:00
Jack Hance
475e3bf38f agent: add test coverage for functions find_process and online_resources
Add test coverage for the functions find_process and online_resources in src/sandbox.rs.

Fixes #4085
Fixes #4136

Signed-off-by: Jack Hance <jack.hance@ndsu.edu>
2022-05-03 16:00:24 -05:00
Bin Liu
383be2203a agent: Add a macro to skip a loop easier
Add a macro to skip a loop easier without using an
if {} else {} condition check.

Fixes: #4185

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-04-30 20:45:41 +08:00
Bin Liu
97d7b1845b runk: use custom Kill command to support --all option
runk uses liboci-cli crate to parse command line options,
but liboci-cli does not support --all option for kill command,
though this is the runtime spec behavior.

However, crictl issues a kill --all command when stopping containers,
so as a workaround we use a custom kill command instead of the one
provided by liboci-cli.

Fixes: #4182

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-04-30 19:34:18 +08:00
Champ-Goblem
1b7fd19acb rootfs: Fix chronyd.service failing on boot
In at least kata versions 2.3.3 and 2.4.0 it was noticed that the guest
operating system's clock would drift out of sync slowly over time
whilst the pod was running.

This had previously been raised and fixed in the old repository via [1].
In essence kvm_ptp and chrony were paired together in order to
keep the system clock up to date with the host.

In the recent versions of kata mentioned above,
the chronyd.service fails upon boot with status `266/NAMESPACE`
which seems to be due to the fact that the `/var/lib/chrony`
directory no longer exists.

This change sets the `/var/lib/chrony` directory for the `ReadWritePaths`
to be ignored when the directory does not exist, as per [2].

[1] https://github.com/kata-containers/runtime/issues/1279
[2] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#ReadWritePaths=

Fixes: #4167
Signed-off-by: Champ-Goblem <cameron_mcdermott@yahoo.co.uk>
2022-04-29 17:15:29 +01:00
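
The `ReadWritePaths` change described in the commit above relies on the systemd rule that a path prefixed with `-` is ignored when it does not exist. A minimal sketch of that kind of edit follows. The sample unit content and the sed expression are assumptions made for illustration and are not taken from the commit itself.

```bash
#!/bin/bash
# Sketch only: the unit content below is a hypothetical stand-in for
# chronyd.service, and the sed expression is an assumption, not the
# command used by the commit.
unit_file=$(mktemp)
cat > "${unit_file}" << 'EOF'
[Service]
ReadWritePaths=/run /var/lib/chrony -/var/log
EOF

# Prefix the path with "-" so that systemd ignores it when the
# directory is missing instead of failing with 226/NAMESPACE-style errors.
patched_line=$(sed -n 's|\([ =]\)/var/lib/chrony|\1-/var/lib/chrony|p' "${unit_file}")
echo "${patched_line}"
rm -f "${unit_file}"
```

With the sample input, only the `/var/lib/chrony` entry gains the `-` prefix; the other entries are left as they were.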
5890 changed files with 1230095 additions and 225159 deletions

.github/actionlint.yaml

@@ -0,0 +1,24 @@
# Copyright (c) 2024 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
# Configuration file with rules for the actionlint tool.
#
self-hosted-runner:
# Labels of self-hosted runner that linter should ignore
labels:
- ubuntu-22.04-arm
- garm-ubuntu-2004
- garm-ubuntu-2004-smaller
- garm-ubuntu-2204
- garm-ubuntu-2304
- garm-ubuntu-2304-smaller
- garm-ubuntu-2204-smaller
- k8s-ppc64le
- metrics
- ppc64le
- sev
- sev-snp
- s390x
- s390x-large
- tdx


@@ -0,0 +1,40 @@
#!/bin/bash
#
# Copyright (c) 2022 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
script_dir=$(dirname "$(readlink -f "$0")")
parent_dir=$(realpath "${script_dir}/../..")
cidir="${parent_dir}/ci"
source "${cidir}/../tests/common.bash"
cargo_deny_file="${script_dir}/action.yaml"
cat cargo-deny-skeleton.yaml.in > "${cargo_deny_file}"
changed_files_status=$(run_get_pr_changed_file_details)
changed_files_status=$(echo "$changed_files_status" | grep "Cargo\.toml$" || true)
changed_files=$(echo "$changed_files_status" | awk '{print $NF}' || true)
if [ -z "$changed_files" ]; then
cat >> "${cargo_deny_file}" << EOF
- run: echo "No Cargo.toml files to check"
shell: bash
EOF
fi
for path in $changed_files
do
cat >> "${cargo_deny_file}" << EOF
- name: ${path}
continue-on-error: true
shell: bash
run: |
pushd $(dirname ${path})
cargo deny check
popd
EOF
done
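
The filtering done by the script above (keep only changed `Cargo.toml` files, then take the last field as the path) can be tried in isolation. This is a sketch under one assumption: that `run_get_pr_changed_file_details` prints one `<status> <path>` line per changed file. The sample lines below are hypothetical.

```bash
#!/bin/bash
# Hypothetical input: one "<status> <path>" line per changed file.
# The real output format of run_get_pr_changed_file_details is assumed.
changed_files_status=$(printf '%s\n' \
    "M src/agent/Cargo.toml" \
    "A docs/README.md" \
    "M src/tools/kata-ctl/Cargo.toml" \
    "M src/agent/Cargo.toml.orig")

# Keep only lines ending in Cargo.toml; "|| true" keeps the script
# going when grep finds nothing and exits non-zero.
changed_files_status=$(echo "$changed_files_status" | grep "Cargo\.toml$" || true)

# The path is the last whitespace-separated field of each line.
changed_files=$(echo "$changed_files_status" | awk '{print $NF}' || true)
echo "$changed_files"
```

The escaped dot and the `$` anchor are what exclude `Cargo.toml.orig` here; an empty result is what triggers the "No Cargo.toml files to check" branch in the script.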


@@ -0,0 +1,30 @@
#
# Copyright (c) 2022 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
name: 'Cargo Crates Check'
description: 'Checks every Cargo.toml file using cargo-deny'
env:
CARGO_TERM_COLOR: always
runs:
using: "composite"
steps:
- name: Install Rust
uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: nightly
override: true
- name: Cache
uses: Swatinem/rust-cache@v2
- name: Install Cargo deny
shell: bash
run: |
which cargo
cargo install --locked cargo-deny || true


@@ -9,9 +9,13 @@ on:
       - labeled
       - unlabeled
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
 jobs:
   pr_wip_check:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     name: WIP Check
     steps:
       - name: WIP Check

.github/workflows/actionlint.yaml

@@ -0,0 +1,33 @@
name: Lint GHA workflows
on:
workflow_dispatch:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
paths:
- '.github/workflows/**'
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
run-actionlint:
env:
GH_TOKEN: ${{ github.token }}
runs-on: ubuntu-24.04
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install actionlint gh extension
run: gh extension install https://github.com/cschleiden/gh-actionlint
- name: Run actionlint
run: gh actionlint


@@ -11,9 +11,13 @@ on:
       - opened
       - reopened
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
 jobs:
   add-new-issues-to-backlog:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     steps:
       - name: Install hub
         run: |
@@ -29,13 +33,13 @@ jobs:
         run: |
           # Clone into a temporary directory to avoid overwriting
           # any existing github directory.
-          pushd $(mktemp -d) &>/dev/null
+          pushd "$(mktemp -d)" &>/dev/null
           git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
           sudo install hub-util.sh /usr/local/bin
           popd &>/dev/null
       - name: Checkout code to allow hub to communicate with the project
-        uses: actions/checkout@v2
+        uses: actions/checkout@v4
       - name: Add issue to issue backlog
         env:


@@ -12,18 +12,31 @@ on:
       - reopened
       - synchronize
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
 jobs:
   add-pr-size-label:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     steps:
       - name: Checkout code
-        uses: actions/checkout@v1
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ github.event.pull_request.head.sha }}
+          fetch-depth: 0
+      - name: Rebase atop of the latest target branch
+        run: |
+          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
+        env:
+          TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
       - name: Install PR sizing label script
         run: |
           # Clone into a temporary directory to avoid overwriting
           # any existing github directory.
-          pushd $(mktemp -d) &>/dev/null
+          pushd "$(mktemp -d)" &>/dev/null
           git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
           sudo install pr-add-size-label.sh /usr/local/bin
           popd &>/dev/null
@@ -33,6 +46,8 @@ jobs:
           GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_PR_SIZE_TOKEN }}
         run: |
           pr=${{ github.event.number }}
+          # Removing man-db, workflow kept failing, fixes: #4480
+          sudo apt -y remove --purge man-db
           sudo apt -y install diffstat patchutils
           pr-add-size-label.sh -p "$pr"

.github/workflows/basic-ci-amd64.yaml

@@ -0,0 +1,428 @@
name: CI | Basic amd64 tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-cri-containerd:
strategy:
# We can set this to true whenever we're 100% sure that
# the all the tests are not flaky, otherwise we'll fail
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'dragonball', 'qemu', 'stratovirt', 'cloud-hypervisor', 'qemu-runtime-rs']
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts
- name: Run cri-containerd tests
timeout-minutes: 10
run: bash tests/integration/cri-containerd/gha-run.sh run
run-containerd-sandboxapi:
strategy:
# We can set this to true whenever we're 100% sure that
# the all the tests are not flaky, otherwise we'll fail
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
containerd_version: ['latest']
vmm: ['dragonball', 'cloud-hypervisor', 'qemu-runtime-rs']
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
#the latest containerd from 2.0 need to set the CGROUP_DRIVER for e2e testing
CGROUP_DRIVER: ""
SANDBOXER: "shim"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts
- name: Run containerd-sandboxapi tests
timeout-minutes: 10
run: bash tests/integration/cri-containerd/gha-run.sh run
run-containerd-stability:
strategy:
fail-fast: false
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'cloud-hypervisor', 'dragonball', 'qemu', 'stratovirt']
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
SANDBOXER: "podsandbox"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/stability/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/stability/gha-run.sh install-kata kata-artifacts
- name: Run containerd-stability tests
timeout-minutes: 15
run: bash tests/stability/gha-run.sh run
run-nydus:
strategy:
# We can set this to true whenever we're 100% sure that
# the all the tests are not flaky, otherwise we'll fail
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
containerd_version: ['lts', 'active']
vmm: ['clh', 'qemu', 'dragonball', 'stratovirt']
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/nydus/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/nydus/gha-run.sh install-kata kata-artifacts
- name: Run nydus tests
timeout-minutes: 10
run: bash tests/integration/nydus/gha-run.sh run
run-runk:
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run runk tests
timeout-minutes: 10
run: bash tests/integration/runk/gha-run.sh run
run-tracing:
strategy:
fail-fast: false
matrix:
vmm:
- clh # cloud-hypervisor
- qemu
# TODO: enable me when https://github.com/kata-containers/kata-containers/issues/9763 is fixed
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2204-smaller
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/functional/tracing/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/functional/tracing/gha-run.sh install-kata kata-artifacts
- name: Run tracing tests
timeout-minutes: 15
run: bash tests/functional/tracing/gha-run.sh run
run-vfio:
strategy:
fail-fast: false
matrix:
vmm:
- clh
- qemu
# TODO: enable with clh when https://github.com/kata-containers/kata-containers/issues/9764 is fixed
# TODO: enable with qemu when https://github.com/kata-containers/kata-containers/issues/9851 is fixed
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2304
env:
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/functional/vfio/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Run vfio tests
timeout-minutes: 15
run: bash tests/functional/vfio/gha-run.sh run
run-docker-tests:
strategy:
# We can set this to true whenever we're 100% sure that
# all the tests are not flaky, otherwise we'll fail them
# all due to a single flaky instance.
fail-fast: false
matrix:
vmm:
- clh
- qemu
- dragonball
- cloud-hypervisor
runs-on: ubuntu-22.04
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/docker/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/docker/gha-run.sh install-kata kata-artifacts
- name: Run docker smoke test
timeout-minutes: 5
run: bash tests/integration/docker/gha-run.sh run
run-nerdctl-tests:
strategy:
# We can set this to true whenever we're 100% sure that
# all the tests are not flaky, otherwise we'll fail them
# all due to a single flaky instance.
fail-fast: false
matrix:
vmm:
- clh
- dragonball
- qemu
- cloud-hypervisor
runs-on: ubuntu-22.04
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/nerdctl/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/nerdctl/gha-run.sh install-kata kata-artifacts
- name: Run nerdctl smoke test
timeout-minutes: 5
run: bash tests/integration/nerdctl/gha-run.sh run
- name: Collect artifacts ${{ matrix.vmm }}
if: always()
run: bash tests/integration/nerdctl/gha-run.sh collect-artifacts
continue-on-error: true
- name: Archive artifacts ${{ matrix.vmm }}
uses: actions/upload-artifact@v4
with:
name: nerdctl-tests-garm-${{ matrix.vmm }}
path: /tmp/artifacts
retention-days: 1
run-kata-agent-apis:
strategy:
fail-fast: false
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/functional/kata-agent-apis/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/functional/kata-agent-apis/gha-run.sh install-kata kata-artifacts
- name: Run kata agent api tests with agent-ctl
run: bash tests/functional/kata-agent-apis/gha-run.sh run

.github/workflows/build-checks.yaml

@@ -0,0 +1,108 @@
on:
workflow_call:
inputs:
instance:
required: true
type: string
name: Build checks
jobs:
check:
runs-on: ${{ inputs.instance }}
strategy:
fail-fast: false
matrix:
component:
- agent
- dragonball
- runtime
- runtime-rs
- agent-ctl
- kata-ctl
- trace-forwarder
- genpolicy
command:
- "make vendor"
- "make check"
- "make test"
- "sudo -E PATH=\"$PATH\" make test"
include:
- component: agent
component-path: src/agent
- component: dragonball
component-path: src/dragonball
- component: runtime
component-path: src/runtime
- component: runtime-rs
component-path: src/runtime-rs
- component: agent-ctl
component-path: src/tools/agent-ctl
- component: kata-ctl
component-path: src/tools/kata-ctl
- component: trace-forwarder
component-path: src/tools/trace-forwarder
- install-libseccomp: no
- component: agent
install-libseccomp: yes
- component: genpolicy
component-path: src/tools/genpolicy
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE" "$HOME"
sudo rm -rf "$GITHUB_WORKSPACE"/* || { sleep 10 && sudo rm -rf "$GITHUB_WORKSPACE"/*; }
sudo rm -f /tmp/kata_hybrid* # Sometime we got leftover from test_setup_hvsock_failed()
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install yq
run: |
./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install golang
if: ${{ matrix.component == 'runtime' }}
run: |
./tests/install_go.sh -f -p
echo "/usr/local/go/bin" >> "$GITHUB_PATH"
- name: Install rust
if: ${{ matrix.component != 'runtime' }}
run: |
./tests/install_rust.sh
echo "${HOME}/.cargo/bin" >> "$GITHUB_PATH"
- name: Install musl-tools
if: ${{ matrix.component != 'runtime' }}
run: sudo apt-get -y install musl-tools
- name: Install devicemapper
if: ${{ matrix.command == 'make check' && matrix.component == 'agent' }}
run: sudo apt-get -y install libdevmapper-dev
- name: Install libseccomp
if: ${{ matrix.command != 'make vendor' && matrix.command != 'make check' && matrix.install-libseccomp == 'yes' }}
run: |
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
echo "LIBSECCOMP_LINK_TYPE=static" >> "$GITHUB_ENV"
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> "$GITHUB_ENV"
- name: Install protobuf-compiler
if: ${{ matrix.command != 'make vendor' && (matrix.component == 'agent' || matrix.component == 'genpolicy' || matrix.component == 'agent-ctl') }}
run: sudo apt-get -y install protobuf-compiler
- name: Install clang
if: ${{ matrix.command == 'make check' && (matrix.component == 'agent' || matrix.component == 'agent-ctl') }}
run: sudo apt-get -y install clang
- name: Setup XDG_RUNTIME_DIR for the `runtime` tests
if: ${{ matrix.command != 'make vendor' && matrix.command != 'make check' && matrix.component == 'runtime' }}
run: |
XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-$USER.XXX" | tee >(xargs chmod 0700))
echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"
- name: Running `${{ matrix.command }}` for ${{ matrix.component }}
run: |
cd ${{ matrix.component-path }}
${{ matrix.command }}
env:
RUST_BACKTRACE: "1"
SKIP_GO_VERSION_CHECK: "1"


@@ -0,0 +1,331 @@
name: CI | Build kata-static tarball for amd64
on:
workflow_call:
inputs:
stage:
required: false
type: string
default: test
tarball-suffix:
required: false
type: string
push-to-registry:
required: false
type: string
default: no
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
runs-on: ubuntu-22.04
permissions:
contents: read
packages: write
id-token: write
attestations: write
strategy:
matrix:
asset:
- agent
- agent-ctl
- busybox
- cloud-hypervisor
- cloud-hypervisor-glibc
- coco-guest-components
- csi-kata-directvolume
- firecracker
- genpolicy
- kata-ctl
- kata-manager
- kernel
- kernel-confidential
- kernel-dragonball-experimental
- kernel-nvidia-gpu
- kernel-nvidia-gpu-confidential
- nydus
- ovmf
- ovmf-sev
- pause-image
- qemu
- qemu-snp-experimental
- stratovirt
- trace-forwarder
- virtiofsd
stage:
- ${{ inputs.stage }}
exclude:
- asset: cloud-hypervisor-glibc
stage: release
env:
PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
id: build
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: Parse OCI image name and digest
id: parse-oci-segments
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
run: |
oci_image="$(<"build/${{ matrix.asset }}-oci-image")"
echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"
echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"
- uses: oras-project/setup-oras@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
version: "1.2.0"
# for pushing attestations to the registry
- uses: docker/login-action@v3
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/attest-build-provenance@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}
subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
- name: store-extratarballs-artifact ${{ matrix.asset }}
if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.xz
retention-days: 15
if-no-files-found: error
build-asset-rootfs:
runs-on: ubuntu-22.04
needs: build-asset
strategy:
matrix:
asset:
- rootfs-image
- rootfs-image-confidential
- rootfs-image-mariner
- rootfs-initrd
- rootfs-initrd-confidential
- rootfs-nvidia-gpu-initrd
- rootfs-nvidia-gpu-confidential-initrd
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
# We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: build-asset-rootfs
strategy:
matrix:
asset:
- busybox
- coco-guest-components
- kernel-nvidia-gpu-headers
- kernel-nvidia-gpu-confidential-headers
- pause-image
steps:
- uses: geekyeggo/delete-artifact@v5
with:
name: kata-artifacts-amd64-${{ matrix.asset}}${{ inputs.tarball-suffix }}
# We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs
remove-rootfs-binary-artifacts-for-release:
runs-on: ubuntu-22.04
needs: build-asset-rootfs
strategy:
matrix:
asset:
- agent
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-amd64-${{ matrix.asset}}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: ubuntu-22.04
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts, remove-rootfs-binary-artifacts-for-release]
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
MEASURED_ROOTFS: yes
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 15
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-22.04
needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
- name: store-artifacts
uses: actions/upload-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-static.tar.xz
retention-days: 15
if-no-files-found: error
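
The "Parse OCI image name and digest" step in the workflow above splits a `name@digest` reference with two shell parameter expansions. A minimal sketch of how they behave follows. The image reference is a hypothetical placeholder, and the values are printed rather than appended to `$GITHUB_OUTPUT`.

```bash
#!/bin/bash
# Hypothetical image reference; the name and digest are placeholders.
oci_image="ghcr.io/example/agent@sha256:0123abcd"

# "%@*" strips the shortest suffix starting at "@", leaving the name.
oci_name="${oci_image%@*}"
# "#*@" strips the shortest prefix ending at "@", leaving the digest.
oci_digest="${oci_image#*@}"

echo "oci-name=${oci_name}"
echo "oci-digest=${oci_digest}"
```

Both expansions are plain POSIX shell, so the step needs no external tools to separate the two parts.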


@@ -0,0 +1,301 @@
name: CI | Build kata-static tarball for arm64
on:
workflow_call:
inputs:
stage:
required: false
type: string
default: test
tarball-suffix:
required: false
type: string
push-to-registry:
required: false
type: string
default: no
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
runs-on: ubuntu-22.04-arm
permissions:
contents: read
packages: write
id-token: write
attestations: write
strategy:
matrix:
asset:
- agent
- busybox
- cloud-hypervisor
- firecracker
- kernel
- kernel-dragonball-experimental
- kernel-nvidia-gpu
- nydus
- qemu
- stratovirt
- virtiofsd
env:
PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: Parse OCI image name and digest
id: parse-oci-segments
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
run: |
oci_image="$(<"build/${{ matrix.asset }}-oci-image")"
echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"
echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"
- uses: oras-project/setup-oras@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
version: "1.2.0"
# for pushing attestations to the registry
- uses: docker/login-action@v3
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/attest-build-provenance@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}
subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
- name: store-extratarballs-artifact ${{ matrix.asset }}
if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.xz
retention-days: 15
if-no-files-found: error
build-asset-rootfs:
runs-on: ubuntu-22.04-arm
needs: build-asset
strategy:
matrix:
asset:
- rootfs-image
- rootfs-initrd
- rootfs-nvidia-gpu-initrd
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
# The binaries installed in the rootfs are not needed in the release tarball, so we can delete them now that the rootfs is built
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04-arm
needs: build-asset-rootfs
strategy:
matrix:
asset:
- busybox
- kernel-nvidia-gpu-headers
steps:
- uses: geekyeggo/delete-artifact@v5
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
# For release builds, the agent artifact must not end up in the release payload either, so delete it once the rootfs is built
remove-rootfs-binary-artifacts-for-release:
runs-on: ubuntu-22.04-arm
needs: build-asset-rootfs
strategy:
matrix:
asset:
- agent
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: ubuntu-22.04-arm
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts, remove-rootfs-binary-artifacts-for-release]
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 15
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-22.04-arm
needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
- name: store-artifacts
uses: actions/upload-artifact@v4
with:
name: kata-static-tarball-arm64${{ inputs.tarball-suffix }}
path: kata-static.tar.xz
retention-days: 15
if-no-files-found: error
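
The "Parse OCI image name and digest" step in the arm64 workflow above splits a `name@digest` reference with shell parameter expansion. A minimal sketch of that split follows; the image reference is a made-up value for illustration (the workflow reads the real one from `build/<asset>-oci-image`).

```shell
#!/usr/bin/env bash
# Hypothetical reference, for illustration only; the workflow reads the
# real value from build/<asset>-oci-image.
oci_image="ghcr.io/example/agent@sha256:0123abcd"

# "%@*" strips the shortest suffix matching "@*", leaving the image name.
oci_name="${oci_image%@*}"
# "#*@" strips the shortest prefix matching "*@", leaving the digest.
oci_digest="${oci_image#*@}"

echo "oci-name=${oci_name}"
echo "oci-digest=${oci_digest}"
```

In the workflow the two `echo` lines are appended to `$GITHUB_OUTPUT`, which is what makes them available as `steps.parse-oci-segments.outputs.*` to the attestation step.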

@@ -0,0 +1,260 @@
name: CI | Build kata-static tarball for ppc64le
on:
workflow_call:
inputs:
stage:
required: false
type: string
default: test
tarball-suffix:
required: false
type: string
push-to-registry:
required: false
type: string
default: no
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
runs-on: ppc64le
strategy:
matrix:
asset:
- agent
- kernel
- qemu
- virtiofsd
stage:
- ${{ inputs.stage }}
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
build-asset-rootfs:
runs-on: ppc64le
needs: build-asset
strategy:
matrix:
asset:
- rootfs-initrd
stage:
- ${{ inputs.stage }}
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
# The binaries installed in the rootfs are not needed in the release tarball, so we can delete them now that the rootfs is built
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: build-asset-rootfs
strategy:
matrix:
asset:
- agent
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: ppc64le
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 1
if-no-files-found: error
create-kata-tarball:
runs-on: ppc64le
needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
- name: store-artifacts
uses: actions/upload-artifact@v4
with:
name: kata-static-tarball-ppc64le${{ inputs.tarball-suffix }}
path: kata-static.tar.xz
retention-days: 1
if-no-files-found: error
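
The `get-artifacts` steps in the ppc64le workflow above select artifacts with the glob `kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}`. The sketch below approximates that selection with a shell `case` glob, to show which names such a pattern would pick up. The suffix value and the candidate names are assumptions for illustration, and shell globbing is only an approximation of the matcher used by `actions/download-artifact`.

```shell
#!/usr/bin/env bash
# Hypothetical tarball suffix; in CI it is passed in as inputs.tarball-suffix.
suffix="-1234-abcdef"
pattern="kata-artifacts-ppc64le-*${suffix}"

matched=""
for name in \
    "kata-artifacts-ppc64le-agent${suffix}" \
    "kata-artifacts-ppc64le-shim-v2${suffix}" \
    "kata-artifacts-ppc64le-agent-other" \
    "kata-static-tarball-ppc64le${suffix}"; do
  case "${name}" in
    # Left unquoted on purpose so the expanded value is treated as a glob.
    ${pattern}) matched="${matched} ${name}"; echo "match: ${name}" ;;
    *)          echo "skip:  ${name}" ;;
  esac
done
```

Only the per-asset artifacts carrying the same suffix match; the merged `kata-static-tarball-*` artifact and artifacts from a run with a different suffix are skipped.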

@@ -0,0 +1,324 @@
name: CI | Build kata-static tarball for s390x
on:
workflow_call:
inputs:
stage:
required: false
type: string
default: test
tarball-suffix:
required: false
type: string
push-to-registry:
required: false
type: string
default: no
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-asset:
runs-on: s390x
permissions:
contents: read
packages: write
id-token: write
attestations: write
strategy:
matrix:
asset:
- agent
- coco-guest-components
- kernel
- kernel-confidential
- pause-image
- qemu
- virtiofsd
env:
PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
id: build
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: Parse OCI image name and digest
id: parse-oci-segments
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
run: |
oci_image="$(<"build/${{ matrix.asset }}-oci-image")"
echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"
echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"
# for pushing attestations to the registry
- uses: docker/login-action@v3
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/attest-build-provenance@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}
subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
build-asset-rootfs:
runs-on: s390x
needs: build-asset
strategy:
matrix:
asset:
- rootfs-image
- rootfs-image-confidential
- rootfs-initrd
- rootfs-initrd-confidential
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
build-asset-boot-image-se:
runs-on: s390x
needs: [build-asset, build-asset-rootfs]
steps:
- uses: actions/checkout@v4
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Place a host key document
run: |
mkdir -p "host-key-document"
cp "${CI_HKD_PATH}" "host-key-document"
env:
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
- name: Build boot-image-se
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "boot-image-se"
make boot-image-se-tarball
build_dir=$(readlink -f build)
sudo cp -r "${build_dir}" "kata-build"
sudo chown -R "$(id -u)":"$(id -g)" "kata-build"
env:
HKD_PATH: "host-key-document"
- name: store-artifact boot-image-se
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x${{ inputs.tarball-suffix }}
path: kata-build/kata-static-boot-image-se.tar.xz
retention-days: 1
if-no-files-found: error
# The binaries installed in the rootfs are not needed in the release tarball, so we can delete them now that the rootfs is built
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: [build-asset-rootfs, build-asset-boot-image-se]
strategy:
matrix:
asset:
- agent
- coco-guest-components
- pause-image
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: s390x
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
MEASURED_ROOTFS: yes
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 15
if-no-files-found: error
create-kata-tarball:
runs-on: s390x
needs:
- build-asset
- build-asset-rootfs
- build-asset-boot-image-se
- build-asset-shim-v2
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
- name: store-artifacts
uses: actions/upload-artifact@v4
with:
name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}
path: kata-static.tar.xz
retention-days: 15
if-no-files-found: error

@@ -0,0 +1,30 @@
name: Cargo Crates Check Runner
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
cargo-deny-runner:
runs-on: ubuntu-22.04
steps:
- name: Checkout Code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v4
- name: Generate Action
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: bash cargo-deny-generator.sh
working-directory: ./.github/cargo-deny-composite-action/
env:
GOPATH: ${{ github.workspace }}/kata-containers
- name: Run Action
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: ./.github/cargo-deny-composite-action

@@ -0,0 +1,19 @@
name: Kata Containers CoCo Stability Tests Weekly
on:
schedule:
- cron: '0 0 * * 0'
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
kata-containers-ci-on-push:
uses: ./.github/workflows/ci-weekly.yaml
with:
commit-hash: ${{ github.sha }}
pr-number: "weekly"
tag: ${{ github.sha }}-weekly
target-branch: ${{ github.ref_name }}
secrets: inherit

.github/workflows/ci-devel.yaml
@@ -0,0 +1,13 @@
name: Kata Containers CI (manually triggered)
on:
workflow_dispatch:
jobs:
kata-containers-ci-on-push:
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.sha }}
pr-number: "dev"
tag: ${{ github.sha }}-dev
target-branch: ${{ github.ref_name }}
secrets: inherit

.github/workflows/ci-nightly-s390x.yaml
@@ -0,0 +1,21 @@
on:
schedule:
- cron: '0 5 * * *'
name: Nightly CI for s390x
jobs:
check-internal-test-result:
runs-on: s390x
strategy:
fail-fast: false
matrix:
test_title:
- kata-vfio-ap-e2e-tests
- cc-se-e2e-tests
steps:
- name: Fetch a test result for ${{ matrix.test_title }}
run: |
file_name="${TEST_TITLE}-$(date +%Y-%m-%d).log"
"/home/${USER}/script/handle_test_log.sh" download "$file_name"
env:
TEST_TITLE: ${{ matrix.test_title }}

.github/workflows/ci-nightly.yaml
@@ -0,0 +1,18 @@
name: Kata Containers Nightly CI
on:
schedule:
- cron: '0 0 * * *'
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
kata-containers-ci-on-push:
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.sha }}
pr-number: "nightly"
tag: ${{ github.sha }}-nightly
target-branch: ${{ github.ref_name }}
secrets: inherit

.github/workflows/ci-on-push.yaml
@@ -0,0 +1,39 @@
name: Kata Containers CI
on:
pull_request_target:
branches:
- 'main'
- 'stable-*'
types:
# Adding 'labeled' to the list of activity types that trigger this event
# (default: opened, synchronize, reopened) so that we can run this
# workflow when the 'ok-to-test' label is added.
# Reference: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target
- opened
- synchronize
- reopened
- labeled
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
skipper:
if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}
uses: ./.github/workflows/gatekeeper-skipper.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
kata-containers-ci-on-push:
needs: skipper
if: ${{ needs.skipper.outputs.skip_build != 'yes' }}
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
pr-number: ${{ github.event.pull_request.number }}
tag: ${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
skip-test: ${{ needs.skipper.outputs.skip_test }}
secrets: inherit

.github/workflows/ci-weekly.yaml
@@ -0,0 +1,87 @@
name: Run the CoCo Kata Containers Stability CI
on:
workflow_call:
inputs:
commit-hash:
required: true
type: string
pr-number:
required: true
type: string
tag:
required: true
type: string
target-branch:
required: false
type: string
default: ""
jobs:
build-kata-static-tarball-amd64:
uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
publish-kata-deploy-payload-amd64:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/publish-kata-deploy-payload-amd64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-and-publish-tee-confidential-unencrypted-image:
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Kata Containers ghcr.io
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Docker build and push
uses: docker/build-push-action@v5
with:
tags: ghcr.io/kata-containers/test-images:unencrypted-${{ inputs.pr-number }}
push: true
context: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/
platforms: linux/amd64
file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile
run-kata-coco-stability-tests:
needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-kata-coco-stability-tests.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
tarball-suffix: -${{ inputs.tag }}
secrets: inherit

.github/workflows/ci.yaml
@@ -0,0 +1,300 @@
name: Run the Kata Containers CI
on:
workflow_call:
inputs:
commit-hash:
required: true
type: string
pr-number:
required: true
type: string
tag:
required: true
type: string
target-branch:
required: false
type: string
default: ""
skip-test:
required: false
type: string
default: no
jobs:
build-kata-static-tarball-amd64:
uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
publish-kata-deploy-payload-amd64:
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/publish-kata-deploy-payload-amd64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-kata-static-tarball-arm64:
uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
publish-kata-deploy-payload-arm64:
needs: build-kata-static-tarball-arm64
uses: ./.github/workflows/publish-kata-deploy-payload-arm64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-arm64
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-kata-static-tarball-s390x:
uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-kata-static-tarball-ppc64le:
uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
publish-kata-deploy-payload-s390x:
needs: build-kata-static-tarball-s390x
uses: ./.github/workflows/publish-kata-deploy-payload-s390x.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-s390x
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
publish-kata-deploy-payload-ppc64le:
needs: build-kata-static-tarball-ppc64le
uses: ./.github/workflows/publish-kata-deploy-payload-ppc64le.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-ppc64le
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
build-and-publish-tee-confidential-unencrypted-image:
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Kata Containers ghcr.io
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Docker build and push
uses: docker/build-push-action@v5
with:
tags: ghcr.io/kata-containers/test-images:unencrypted-${{ inputs.pr-number }}
push: true
context: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/
platforms: linux/amd64, linux/s390x
file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile
publish-csi-driver-amd64:
needs: build-kata-static-tarball-amd64
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64-${{ inputs.tag }}
path: kata-artifacts
- name: Install tools
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Copy binary into Docker context
run: |
# Copy to the location where the Dockerfile expects the binary.
mkdir -p src/tools/csi-kata-directvolume/bin/
cp /opt/kata/bin/csi-kata-directvolume src/tools/csi-kata-directvolume/bin/directvolplugin
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Kata Containers ghcr.io
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Docker build and push
uses: docker/build-push-action@v5
with:
tags: ghcr.io/kata-containers/csi-kata-directvolume:${{ inputs.pr-number }}
push: true
context: src/tools/csi-kata-directvolume/
platforms: linux/amd64
file: src/tools/csi-kata-directvolume/Dockerfile
run-kata-monitor-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-kata-monitor-tests.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-k8s-tests-on-aks:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-aks.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-amd64:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-amd64.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-kata-coco-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs:
- publish-kata-deploy-payload-amd64
- build-and-publish-tee-confidential-unencrypted-image
- publish-csi-driver-amd64
uses: ./.github/workflows/run-kata-coco-tests.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-zvsi:
if: ${{ inputs.skip-test != 'yes' }}
needs: [publish-kata-deploy-payload-s390x, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-k8s-tests-on-zvsi.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-s390x
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets: inherit
run-k8s-tests-on-ppc64le:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-ppc64le
uses: ./.github/workflows/run-k8s-tests-on-ppc64le.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-ppc64le
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
run-metrics-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-metrics.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-basic-amd64-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/basic-ci-amd64.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-cri-containerd-tests-s390x:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-s390x
uses: ./.github/workflows/run-cri-containerd-tests-s390x.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
run-cri-containerd-tests-ppc64le:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-ppc64le
uses: ./.github/workflows/run-cri-containerd-tests-ppc64le.yaml
with:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}


@@ -0,0 +1,31 @@
name: Cleanup dangling Azure resources
on:
schedule:
- cron: "0 0 * * *"
workflow_dispatch:
jobs:
cleanup-resources:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
- name: Log into Azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
run: bash tests/integration/kubernetes/gha-run.sh login-azure
- name: Install Python dependencies
run: |
pip3 install --user --upgrade \
azure-identity==1.16.0 \
azure-mgmt-resource==23.0.1
- name: Cleanup resources
env:
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
CLEANUP_AFTER_HOURS: 24 # Clean up resources created more than this many hours ago.
run: python3 tests/cleanup_resources.py


@@ -6,6 +6,10 @@ on:
- reopened
- synchronize
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
env:
error_msg: |+
See the document below for help on formatting commits for the project.
@@ -14,7 +18,9 @@ env:
jobs:
commit-message-check:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
env:
PR_AUTHOR: ${{ github.event.pull_request.user.login }}
name: Commit Message Check
steps:
- name: Get PR Commits
@@ -28,7 +34,10 @@ jobs:
#
# Revert "<original-subject-line>"
#
filter_out_pattern: '^Revert "'
# The format of a re-revert commit is as follows:
#
# Reapply "<original-subject-line>"
filter_out_pattern: '^Revert "|^Reapply "'
- name: DCO Check
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
@@ -43,7 +52,7 @@ jobs:
commits: ${{ steps.get-pr-commits.outputs.commits }}
- name: Check Subject Line Length
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
with:
commits: ${{ steps.get-pr-commits.outputs.commits }}
@@ -52,7 +61,7 @@ jobs:
post_error: ${{ env.error_msg }}
- name: Check Body Line Length
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
with:
commits: ${{ steps.get-pr-commits.outputs.commits }}
@@ -62,8 +71,12 @@ jobs:
# to be specified at the start of the regex as the action is passed
# the entire commit message.
#
# - This check will pass if the commit message only contains a subject
# line, as other body message properties are enforced elsewhere.
#
# - Body lines *can* be longer than the maximum if they start
# with a non-alphabetic character.
# with a non-alphabetic character or if there is no whitespace in
# the line.
#
# This allows stack traces, log file snippets, emails, long URLs,
# etc to be specified. Some of these naturally "work" as they start
@@ -74,23 +87,12 @@ jobs:
#
# - A SoB comment can be any length (as it is unreasonable to penalise
# people with long names/email addresses :)
pattern: '^.+(\n([a-zA-Z].{0,149}|[^a-zA-Z\n].*|Signed-off-by:.*|))+$'
error: 'Body line too long (max 72)'
pattern: '(^[^\n]+$|^.+(\n([a-zA-Z].{0,150}|[^a-zA-Z\n].*|[^\s\n]*|Signed-off-by:.*|))+$)'
error: 'Body line too long (max 150)'
post_error: ${{ env.error_msg }}
- name: Check Fixes
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
with:
commits: ${{ steps.get-pr-commits.outputs.commits }}
pattern: '\s*Fixes\s*:?\s*(#\d+|github\.com\/kata-containers\/[a-z-.]*#\d+)|^\s*release\s*:'
flags: 'i'
error: 'No "Fixes" found'
post_error: ${{ env.error_msg }}
one_pass_all_pass: 'true'
- name: Check Subsystem
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
with:
commits: ${{ steps.get-pr-commits.outputs.commits }}
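The updated body-length pattern can be exercised outside the workflow. The following is a minimal sketch, assuming Python's `re` module behaves like the action's own regex engine for this pattern; the commit messages and the `prose_line` helper are hypothetical and exist only for illustration. It also shows that, as written, `[a-zA-Z].{0,150}` admits lines of up to 151 characters, one more than the "max 150" in the error message.

```python
import re

# Pattern copied verbatim from the updated "Check Body Line Length" step.
BODY_PATTERN = re.compile(
    r'(^[^\n]+$|^.+(\n([a-zA-Z].{0,150}|[^a-zA-Z\n].*|[^\s\n]*|Signed-off-by:.*|))+$)'
)


def body_ok(message: str) -> bool:
    """Return True if the whole commit message satisfies the pattern."""
    return BODY_PATTERN.search(message) is not None


def prose_line(length: int) -> str:
    """Hypothetical helper: a line of `length` characters that starts with a
    letter and contains spaces, so only the first alternative can match it."""
    return ("ab " * length)[:length]


# Hypothetical commit messages, one per rule described in the comments above.
examples = {
    "subject only": "docs: fix typo",
    "short body": "docs: fix typo\n\nA short explanation.\n\n"
                  "Signed-off-by: Jane Doe <jane@example.com>",
    "long line without whitespace": "docs: fix link\n\nhttps://example.com/" + "a" * 200,
    "long line starting with non-letter": "agent: fix panic\n\n#0 " + "frame " * 40,
    "151-character prose line": "docs: fix typo\n\n" + prose_line(151),
    "152-character prose line": "docs: fix typo\n\n" + prose_line(152),
}

for name, message in examples.items():
    print(f"{name}: {'pass' if body_ok(message) else 'Body line too long'}")
```

Only the last example is rejected: it starts with a letter, contains whitespace, and exceeds the length that the first alternative can consume.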


@@ -6,20 +6,20 @@ on:
- reopened
- synchronize
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
name: Darwin tests
jobs:
test:
strategy:
matrix:
go-version: [1.16.x, 1.17.x]
os: [macos-latest]
runs-on: ${{ matrix.os }}
runs-on: macos-latest
steps:
- name: Install Go
uses: actions/setup-go@v2
uses: actions/setup-go@v5
with:
go-version: ${{ matrix.go-version }}
go-version: 1.22.11
- name: Checkout code
uses: actions/checkout@v2
uses: actions/checkout@v4
- name: Build utils
run: ./ci/darwin-test.sh


@@ -5,40 +5,28 @@ on:
name: Docs URL Alive Check
jobs:
test:
strategy:
matrix:
go-version: [1.17.x]
os: [ubuntu-20.04]
runs-on: ${{ matrix.os }}
runs-on: ubuntu-22.04
# don't run this action on forks
if: github.repository_owner == 'kata-containers'
env:
target_branch: ${{ github.base_ref }}
steps:
- name: Install Go
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/setup-go@v2
uses: actions/setup-go@v5
with:
go-version: ${{ matrix.go-version }}
go-version: 1.22.11
env:
GOPATH: ${{ runner.workspace }}/kata-containers
GOPATH: ${{ github.workspace }}/kata-containers
- name: Set env
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
echo "GOPATH=${{ github.workspace }}" >> "$GITHUB_ENV"
echo "${{ github.workspace }}/bin" >> "$GITHUB_PATH"
- name: Checkout code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Setup
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
env:
GOPATH: ${{ runner.workspace }}/kata-containers
# docs url alive check
- name: Docs URL Alive Check
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make docs-url-alive-check
cd "${GOPATH}/src/github.com/${{ github.repository }}" && make docs-url-alive-check


@@ -0,0 +1,52 @@
name: Skipper
# This workflow sets various "skip_*" output values that can be used to
# determine what workflows/jobs are expected to be executed. Sample usage:
#
# skipper:
# uses: ./.github/workflows/gatekeeper-skipper.yaml
# with:
# commit-hash: ${{ github.event.pull_request.head.sha }}
# target-branch: ${{ github.event.pull_request.base.ref }}
#
# your-workflow:
# needs: skipper
# if: ${{ needs.skipper.outputs.skip_build != 'yes' }}
on:
workflow_call:
inputs:
commit-hash:
required: true
type: string
target-branch:
required: false
type: string
default: ""
outputs:
skip_build:
value: ${{ jobs.skipper.outputs.skip_build }}
skip_test:
value: ${{ jobs.skipper.outputs.skip_test }}
skip_static:
value: ${{ jobs.skipper.outputs.skip_static }}
jobs:
skipper:
runs-on: ubuntu-22.04
outputs:
skip_build: ${{ steps.skipper.outputs.skip_build }}
skip_test: ${{ steps.skipper.outputs.skip_test }}
skip_static: ${{ steps.skipper.outputs.skip_static }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- id: skipper
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
run: |
python3 tools/testing/gatekeeper/skips.py | tee -a "$GITHUB_OUTPUT"
shell: /usr/bin/bash -x {0}
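The `skipper` step above pipes the script's stdout into the file named by `$GITHUB_OUTPUT`, where each `name=value` line becomes a step output. The diff does not show `skips.py`, so the following is only a sketch of that mechanism, assuming the script prints lines such as `skip_build=yes`; the function names and the example values are hypothetical.

```python
import os
import tempfile


def format_outputs(outputs: dict) -> str:
    """Render step outputs as one name=value line per entry."""
    return "".join(f"{name}={value}\n" for name, value in outputs.items())


def append_outputs(outputs: dict, path: str) -> None:
    """Append the rendered lines to the given file, like `tee -a` does."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(format_outputs(outputs))


# Hypothetical values; the real decisions are made by skips.py.
example = {"skip_build": "no", "skip_test": "yes", "skip_static": "no"}

# GITHUB_OUTPUT is only set inside a workflow run, so fall back to a
# temporary file when this sketch is executed locally.
output_path = os.environ.get("GITHUB_OUTPUT") or os.path.join(
    tempfile.mkdtemp(), "github_output"
)
append_outputs(example, output_path)
print(format_outputs(example), end="")
```

A calling workflow would then read these values through `needs.skipper.outputs.skip_build` and its siblings, as in the sample usage at the top of the file.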

.github/workflows/gatekeeper.yaml

@@ -0,0 +1,44 @@
name: Gatekeeper
# Gatekeeper uses "skips.py" to determine which job names/regexps are
# required for a given PR, waits for them to either complete or fail,
# and reports the status.
on:
pull_request_target:
types:
- opened
- synchronize
- reopened
- labeled
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
gatekeeper:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- id: gatekeeper
env:
TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
COMMIT_HASH: ${{ github.event.pull_request.head.sha }}
GH_PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
#!/usr/bin/env bash -x
mapfile -t lines < <(python3 tools/testing/gatekeeper/skips.py -t)
export REQUIRED_JOBS="${lines[0]}"
export REQUIRED_REGEXPS="${lines[1]}"
export REQUIRED_LABELS="${lines[2]}"
echo "REQUIRED_JOBS: $REQUIRED_JOBS"
echo "REQUIRED_REGEXPS: $REQUIRED_REGEXPS"
echo "REQUIRED_LABELS: $REQUIRED_LABELS"
python3 tools/testing/gatekeeper/jobs.py
exit $?
shell: /usr/bin/bash -x {0}


@@ -1,83 +0,0 @@
name: kata deploy build
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
paths:
- tools/**
- versions.yaml
jobs:
build-asset:
runs-on: ubuntu-latest
strategy:
matrix:
asset:
- kernel
- shim-v2
- qemu
- cloud-hypervisor
- firecracker
- rootfs-image
- rootfs-initrd
steps:
- uses: actions/checkout@v2
- name: Install docker
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
curl -fsSL https://test.docker.com -o test-docker.sh
sh test-docker.sh
- name: Build ${{ matrix.asset }}
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r --preserve=all "${build_dir}" "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-latest
needs: build-asset
steps:
- uses: actions/checkout@v2
- name: get-artifacts
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/download-artifact@v2
with:
name: kata-artifacts
path: build
- name: merge-artifacts
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
make merge-builds
- name: store-artifacts
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/upload-artifact@v2
with:
name: kata-static-tarball
path: kata-static.tar.xz
make-kata-tarball:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: make kata-tarball
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
make kata-tarball
sudo make install-tarball


@@ -1,147 +0,0 @@
on:
issue_comment:
types: [created, edited]
name: test-kata-deploy
jobs:
check-comment-and-membership:
runs-on: ubuntu-latest
if: |
github.event.issue.pull_request
&& github.event_name == 'issue_comment'
&& github.event.action == 'created'
&& startsWith(github.event.comment.body, '/test_kata_deploy')
steps:
- name: Check membership
uses: kata-containers/is-organization-member@1.0.1
id: is_organization_member
with:
organization: kata-containers
username: ${{ github.event.comment.user.login }}
token: ${{ secrets.GITHUB_TOKEN }}
- name: Fail if not member
run: |
result=${{ steps.is_organization_member.outputs.result }}
if [ $result == false ]; then
user=${{ github.event.comment.user.login }}
echo Either ${user} is not part of the kata-containers organization
echo or ${user} has its Organization Visibility set to Private at
echo https://github.com/orgs/kata-containers/people?query=${user}
echo
echo Ensure you change your Organization Visibility to Public and
echo trigger the test again.
exit 1
fi
build-asset:
runs-on: ubuntu-latest
needs: check-comment-and-membership
strategy:
matrix:
asset:
- cloud-hypervisor
- firecracker
- kernel
- qemu
- rootfs-image
- rootfs-initrd
- shim-v2
steps:
- name: get-PR-ref
id: get-PR-ref
run: |
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
echo "reference for PR: " ${ref}
echo "##[set-output name=pr-ref;]${ref}"
- uses: actions/checkout@v2
with:
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
- name: Install docker
run: |
curl -fsSL https://test.docker.com -o test-docker.sh
sh test-docker.sh
- name: Build ${{ matrix.asset }}
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-latest
needs: build-asset
steps:
- name: get-PR-ref
id: get-PR-ref
run: |
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
echo "reference for PR: " ${ref}
echo "##[set-output name=pr-ref;]${ref}"
- uses: actions/checkout@v2
with:
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
- name: get-artifacts
uses: actions/download-artifact@v2
with:
name: kata-artifacts
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v2
with:
name: kata-static-tarball
path: kata-static.tar.xz
kata-deploy:
needs: create-kata-tarball
runs-on: ubuntu-latest
steps:
- name: get-PR-ref
id: get-PR-ref
run: |
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
echo "reference for PR: " ${ref}
echo "##[set-output name=pr-ref;]${ref}"
- uses: actions/checkout@v2
with:
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
- name: get-kata-tarball
uses: actions/download-artifact@v2
with:
name: kata-static-tarball
- name: build-and-push-kata-deploy-ci
id: build-and-push-kata-deploy-ci
run: |
PR_SHA=$(git log --format=format:%H -n1)
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t quay.io/kata-containers/kata-deploy-ci:$PR_SHA $GITHUB_WORKSPACE/tools/packaging/kata-deploy
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
docker push quay.io/kata-containers/kata-deploy-ci:$PR_SHA
mkdir -p packaging/kata-deploy
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
echo "::set-output name=PKG_SHA::${PR_SHA}"
- name: test-kata-deploy-ci-in-aks
uses: ./packaging/kata-deploy/action
with:
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
env:
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}


@@ -0,0 +1,36 @@
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
kata-deploy-runtime-classes-check:
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Ensure the split out runtime classes match the all-in-one file
run: |
pushd tools/packaging/kata-deploy/runtimeclasses/
echo "::group::Combine runtime classes"
for runtimeClass in $(find . -type f \( -name "*.yaml" -and -not -name "kata-runtimeClasses.yaml" \) | sort); do
echo "Adding ${runtimeClass} to the resultingRuntimeClasses.yaml"
cat "${runtimeClass}" >> resultingRuntimeClasses.yaml;
done
echo "::endgroup::"
echo "::group::Displaying the content of resultingRuntimeClasses.yaml"
cat resultingRuntimeClasses.yaml
echo "::endgroup::"
echo ""
echo "::group::Displaying the content of kata-runtimeClasses.yaml"
cat kata-runtimeClasses.yaml
echo "::endgroup::"
echo ""
diff resultingRuntimeClasses.yaml kata-runtimeClasses.yaml
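The check above concatenates the split-out runtime class files in sorted order and diffs the result against the all-in-one file. A rough Python equivalent is sketched below for local experimentation; it is not part of the workflow, the file names other than `kata-runtimeClasses.yaml` are hypothetical, and the ordering may differ from `sort` under some locales.

```python
import difflib
import tempfile
from pathlib import Path

ALL_IN_ONE = "kata-runtimeClasses.yaml"


def combined_runtime_classes(directory: Path) -> str:
    """Concatenate every split-out *.yaml file in sorted path order."""
    parts = sorted(
        (p for p in directory.rglob("*.yaml") if p.is_file() and p.name != ALL_IN_ONE),
        key=lambda p: p.relative_to(directory).as_posix(),
    )
    return "".join(p.read_text(encoding="utf-8") for p in parts)


def runtime_classes_diff(directory: Path) -> list:
    """Return unified diff lines; an empty list means the files are in sync."""
    actual = combined_runtime_classes(directory).splitlines(keepends=True)
    expected = (directory / ALL_IN_ONE).read_text(encoding="utf-8").splitlines(keepends=True)
    return list(
        difflib.unified_diff(actual, expected, "resultingRuntimeClasses.yaml", ALL_IN_ONE)
    )


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # Hypothetical split-out files and their all-in-one counterpart.
    (directory / "kata-clh.yaml").write_text("---\nname: kata-clh\n", encoding="utf-8")
    (directory / "kata-qemu.yaml").write_text("---\nname: kata-qemu\n", encoding="utf-8")
    (directory / ALL_IN_ONE).write_text(
        "---\nname: kata-clh\n---\nname: kata-qemu\n", encoding="utf-8"
    )
    in_sync = runtime_classes_diff(directory)

    # Change one split-out file without updating the all-in-one file.
    (directory / "kata-qemu.yaml").write_text("---\nname: kata-qemu-changed\n", encoding="utf-8")
    drifted = runtime_classes_diff(directory)

print("in sync:", not in_sync)
print("drift detected:", bool(drifted))
```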


@@ -13,7 +13,7 @@ on:
jobs:
move-linked-issues-to-in-progress:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
steps:
- name: Install hub
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
@@ -31,14 +31,24 @@ jobs:
run: |
# Clone into a temporary directory to avoid overwriting
# any existing github directory.
pushd $(mktemp -d) &>/dev/null
pushd "$(mktemp -d)" &>/dev/null
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
sudo install hub-util.sh /usr/local/bin
popd &>/dev/null
- name: Checkout code to allow hub to communicate with the project
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
- name: Move issue to "In progress"
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
@@ -52,20 +62,19 @@ jobs:
grep -v "^\#" |\
cut -d';' -f3 || true)
# PR doesn't have any linked issues
# (it should, but maybe a new user forgot to add a "Fixes: #XXX" commit).
# If the PR has no linked issues, warn and exit successfully.
[ -z "$linked_issue_urls" ] && {
echo "::error::No linked issues for PR $pr"
exit 1
echo "::warning::No linked issues for PR $pr"
exit 0
}
project_name="Issue backlog"
project_type="org"
project_column="In progress"
for issue_url in $(echo "$linked_issue_urls")
for issue_url in $linked_issue_urls
do
issue=$(echo "$issue_url"| awk -F\/ '{print $NF}' || true)
issue=$(echo "$issue_url"| awk -F/ '{print $NF}' || true)
[ -z "$issue" ] && {
echo "::error::Cannot determine issue number from $issue_url for PR $pr"


@@ -0,0 +1,107 @@
name: CI | Publish Kata Containers payload
on:
push:
branches:
- main
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
jobs:
build-assets-amd64:
uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
build-assets-arm64:
uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
build-assets-s390x:
uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
build-assets-ppc64le:
uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml
with:
commit-hash: ${{ github.sha }}
push-to-registry: yes
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-amd64:
needs: build-assets-amd64
uses: ./.github/workflows/publish-kata-deploy-payload-amd64.yaml
with:
commit-hash: ${{ github.sha }}
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-latest-amd64
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-arm64:
needs: build-assets-arm64
uses: ./.github/workflows/publish-kata-deploy-payload-arm64.yaml
with:
commit-hash: ${{ github.sha }}
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-latest-arm64
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-s390x:
needs: build-assets-s390x
uses: ./.github/workflows/publish-kata-deploy-payload-s390x.yaml
with:
commit-hash: ${{ github.sha }}
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-latest-s390x
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-kata-deploy-payload-ppc64le:
needs: build-assets-ppc64le
uses: ./.github/workflows/publish-kata-deploy-payload-ppc64le.yaml
with:
commit-hash: ${{ github.sha }}
registry: quay.io
repo: kata-containers/kata-deploy-ci
tag: kata-containers-latest-ppc64le
target-branch: ${{ github.ref_name }}
secrets: inherit
publish-manifest:
runs-on: ubuntu-22.04
needs: [publish-kata-deploy-payload-amd64, publish-kata-deploy-payload-arm64, publish-kata-deploy-payload-s390x, publish-kata-deploy-payload-ppc64le]
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Login to Kata Containers quay.io
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Push multi-arch manifest
run: |
./tools/packaging/release/release.sh publish-multiarch-manifest
env:
KATA_DEPLOY_IMAGE_TAGS: "kata-containers-latest"
KATA_DEPLOY_REGISTRIES: "quay.io/kata-containers/kata-deploy-ci"


@@ -0,0 +1,66 @@
name: CI | Publish kata-deploy payload for amd64
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
- name: Login to Kata Containers quay.io
if: ${{ inputs.registry == 'quay.io' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Login to Kata Containers ghcr.io
if: ${{ inputs.registry == 'ghcr.io' }}
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}


@@ -0,0 +1,66 @@
name: CI | Publish kata-deploy payload for arm64
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
runs-on: ubuntu-22.04-arm
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-arm64${{ inputs.tarball-suffix }}
- name: Login to Kata Containers quay.io
if: ${{ inputs.registry == 'quay.io' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Login to Kata Containers ghcr.io
if: ${{ inputs.registry == 'ghcr.io' }}
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}


@@ -0,0 +1,76 @@
name: CI | Publish kata-deploy payload for ppc64le
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
runs-on: ppc64le
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Adjust a permission for repo
run: |
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-ppc64le${{ inputs.tarball-suffix }}
- name: Login to Kata Containers quay.io
if: ${{ inputs.registry == 'quay.io' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Login to Kata Containers ghcr.io
if: ${{ inputs.registry == 'ghcr.io' }}
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}


@@ -0,0 +1,66 @@
name: CI | Publish kata-deploy payload for s390x
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
kata-payload:
runs-on: s390x
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}
- name: Login to Kata Containers quay.io
if: ${{ inputs.registry == 'quay.io' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Login to Kata Containers ghcr.io
if: ${{ inputs.registry == 'ghcr.io' }}
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: build-and-push-kata-payload
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

.github/workflows/release-amd64.yaml (new file)

@@ -0,0 +1,59 @@
name: Publish Kata release artifacts for amd64
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-kata-static-tarball-amd64:
uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-amd64
runs-on: ubuntu-22.04
steps:
- name: Login to Kata Containers docker.io
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Login to Kata Containers quay.io
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64
- name: build-and-push-kata-deploy-ci-amd64
id: build-and-push-kata-deploy-ci-amd64
run: |
# We need this trick here because the format of $GITHUB_REF
# is "refs/tags/<tag>"
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
tags=("${tag}" "latest")
else
tags=("${tag}")
fi
for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done
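
The tag handling in the step above can be tried outside of CI. This is a minimal sketch, assuming a POSIX shell; `derive_tag` is a hypothetical helper name that does not exist in the repository, and the refs passed to it are sample values.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): mirrors the
# `cut -d/ -f3-` call used by the workflow step.
derive_tag() {
    # Keep everything from the third "/"-separated field onwards, so
    # "refs/tags/<tag>" becomes "<tag>", even if the tag contains "/".
    printf '%s\n' "$1" | cut -d/ -f3-
}

derive_tag "refs/tags/3.14.0"   # prints 3.14.0
derive_tag "refs/heads/main"    # prints main
```

When the workflow is started manually from the main branch, `GITHUB_REF` would be `refs/heads/main`, which is presumably why the step checks for `main` and asks `release.sh release-version` for the version instead.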

.github/workflows/release-arm64.yaml (new file)

@@ -0,0 +1,59 @@
name: Publish Kata release artifacts for arm64
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-kata-static-tarball-arm64:
uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-arm64
runs-on: ubuntu-22.04-arm
steps:
- name: Login to Kata Containers docker.io
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Login to Kata Containers quay.io
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-arm64
- name: build-and-push-kata-deploy-ci-arm64
id: build-and-push-kata-deploy-ci-arm64
run: |
# We need this trick here because the format of $GITHUB_REF
# is "refs/tags/<tag>"
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
tags=("${tag}" "latest")
else
tags=("${tag}")
fi
for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done

.github/workflows/release-ppc64le.yaml (new file)

@@ -0,0 +1,65 @@
name: Publish Kata release artifacts for ppc64le
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-kata-static-tarball-ppc64le:
uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-ppc64le
runs-on: ppc64le
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
bash "${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers docker.io
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Login to Kata Containers quay.io
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-ppc64le
- name: build-and-push-kata-deploy-ci-ppc64le
id: build-and-push-kata-deploy-ci-ppc64le
run: |
# We need this trick here because the format of $GITHUB_REF
# is "refs/tags/<tag>"
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
tags=("${tag}" "latest")
else
tags=("${tag}")
fi
for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done

.github/workflows/release-s390x.yaml (new file)

@@ -0,0 +1,59 @@
name: Publish Kata release artifacts for s390x
on:
workflow_call:
inputs:
target-arch:
required: true
type: string
jobs:
build-kata-static-tarball-s390x:
uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml
with:
push-to-registry: yes
stage: release
secrets: inherit
kata-deploy:
needs: build-kata-static-tarball-s390x
runs-on: s390x
steps:
- name: Login to Kata Containers docker.io
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Login to Kata Containers quay.io
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-s390x
- name: build-and-push-kata-deploy-ci-s390x
id: build-and-push-kata-deploy-ci-s390x
run: |
# We need this trick here because the format of $GITHUB_REF
# is "refs/tags/<tag>"
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
tags=("${tag}" "latest")
else
tags=("${tag}")
fi
for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done

@@ -1,177 +1,206 @@
name: Publish Kata 2.x release artifacts
name: Release Kata Containers
on:
push:
tags:
- '2.*'
workflow_dispatch
jobs:
build-asset:
runs-on: ubuntu-latest
strategy:
matrix:
asset:
- cloud-hypervisor
- firecracker
- kernel
- qemu
- rootfs-image
- rootfs-initrd
- shim-v2
release:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v2
- name: Install docker
run: |
curl -fsSL https://test.docker.com -o test-docker.sh
sh test-docker.sh
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Build ${{ matrix.asset }}
- name: Create a new release
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-copy-yq-installer.sh
./tools/packaging/kata-deploy/local-build/kata-deploy-binaries-in-docker.sh --build="${KATA_ASSET}"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
./tools/packaging/release/release.sh create-new-release
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
GH_TOKEN: ${{ github.token }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
if-no-files-found: error
build-and-push-assets-amd64:
needs: release
uses: ./.github/workflows/release-amd64.yaml
with:
target-arch: amd64
secrets: inherit
create-kata-tarball:
runs-on: ubuntu-latest
needs: build-asset
build-and-push-assets-arm64:
needs: release
uses: ./.github/workflows/release-arm64.yaml
with:
target-arch: arm64
secrets: inherit
build-and-push-assets-s390x:
needs: release
uses: ./.github/workflows/release-s390x.yaml
with:
target-arch: s390x
secrets: inherit
build-and-push-assets-ppc64le:
needs: release
uses: ./.github/workflows/release-ppc64le.yaml
with:
target-arch: ppc64le
secrets: inherit
publish-multi-arch-images:
runs-on: ubuntu-22.04
needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le]
steps:
- uses: actions/checkout@v2
- name: get-artifacts
uses: actions/download-artifact@v2
with:
name: kata-artifacts
path: kata-artifacts
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
- name: store-artifacts
uses: actions/upload-artifact@v2
with:
name: kata-static-tarball
path: kata-static.tar.xz
- name: Checkout repository
uses: actions/checkout@v4
kata-deploy:
needs: create-kata-tarball
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: get-kata-tarball
uses: actions/download-artifact@v2
- name: Login to Kata Containers docker.io
uses: docker/login-action@v3
with:
name: kata-static-tarball
- name: build-and-push-kata-deploy-ci
id: build-and-push-kata-deploy-ci
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Login to Kata Containers quay.io
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Get the image tags
run: |
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
pushd $GITHUB_WORKSPACE
git checkout $tag
pkg_sha=$(git rev-parse HEAD)
popd
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
docker push katadocker/kata-deploy-ci:$pkg_sha
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha
mkdir -p packaging/kata-deploy
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
echo "::set-output name=PKG_SHA::${pkg_sha}"
- name: test-kata-deploy-ci-in-aks
uses: ./packaging/kata-deploy/action
with:
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
release_version=$(./tools/packaging/release/release.sh release-version)
echo "KATA_DEPLOY_IMAGE_TAGS=$release_version latest" >> "$GITHUB_ENV"
- name: Publish multi-arch manifest on docker.io and quay.io
run: |
./tools/packaging/release/release.sh publish-multiarch-manifest
env:
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
- name: push-tarball
run: |
# tag the container image we created and push to DockerHub
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tags=($tag)
tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))
for tag in ${tags[@]}; do \
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag} && \
docker tag quay.io/kata-containers/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} quay.io/kata-containers/kata-deploy:${tag} && \
docker push katadocker/kata-deploy:${tag} && \
docker push quay.io/kata-containers/kata-deploy:${tag}; \
done
KATA_DEPLOY_REGISTRIES: "quay.io/kata-containers/kata-deploy docker.io/katadocker/kata-deploy"
upload-static-tarball:
needs: kata-deploy
runs-on: ubuntu-latest
upload-multi-arch-static-tarball:
needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le]
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v2
- name: download-artifacts
uses: actions/download-artifact@v2
- name: Checkout repository
uses: actions/checkout@v4
- name: Set KATA_STATIC_TARBALL env var
run: |
tarball=$(pwd)/kata-static.tar.xz
echo "KATA_STATIC_TARBALL=${tarball}" >> "$GITHUB_ENV"
- name: Download amd64 artifacts
uses: actions/download-artifact@v4
with:
name: kata-static-tarball
- name: install hub
name: kata-static-tarball-amd64
- name: Upload amd64 static tarball to GitHub
run: |
HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')
wget -q -O- https://github.com/github/hub/releases/download/v$HUB_VER/hub-linux-amd64-$HUB_VER.tgz | \
tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub
- name: push static tarball to github
./tools/packaging/release/release.sh upload-kata-static-tarball
env:
GH_TOKEN: ${{ github.token }}
ARCHITECTURE: amd64
- name: Download arm64 artifacts
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-arm64
- name: Upload arm64 static tarball to GitHub
run: |
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tarball="kata-static-$tag-x86_64.tar.xz"
mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
pushd $GITHUB_WORKSPACE
echo "uploading asset '${tarball}' for tag: ${tag}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
popd
./tools/packaging/release/release.sh upload-kata-static-tarball
env:
GH_TOKEN: ${{ github.token }}
ARCHITECTURE: arm64
- name: Download s390x artifacts
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-s390x
- name: Upload s390x static tarball to GitHub
run: |
./tools/packaging/release/release.sh upload-kata-static-tarball
env:
GH_TOKEN: ${{ github.token }}
ARCHITECTURE: s390x
- name: Download ppc64le artifacts
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-ppc64le
- name: Upload ppc64le static tarball to GitHub
run: |
./tools/packaging/release/release.sh upload-kata-static-tarball
env:
GH_TOKEN: ${{ github.token }}
ARCHITECTURE: ppc64le
upload-versions-yaml:
needs: release
runs-on: ubuntu-22.04
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Upload versions.yaml to GitHub
run: |
./tools/packaging/release/release.sh upload-versions-yaml-file
env:
GH_TOKEN: ${{ github.token }}
upload-cargo-vendored-tarball:
needs: upload-static-tarball
runs-on: ubuntu-latest
needs: release
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v2
- name: generate-and-upload-tarball
- name: Checkout repository
uses: actions/checkout@v4
- name: Generate and upload vendored code tarball
run: |
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tarball="kata-containers-$tag-vendor.tar.gz"
pushd $GITHUB_WORKSPACE
bash -c "tools/packaging/release/generate_vendor.sh ${tarball}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
popd
./tools/packaging/release/release.sh upload-vendored-code-tarball
env:
GH_TOKEN: ${{ github.token }}
upload-libseccomp-tarball:
needs: upload-cargo-vendored-tarball
runs-on: ubuntu-latest
needs: release
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v2
- name: download-and-upload-tarball
env:
GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}
GOPATH: ${HOME}/go
- name: Checkout repository
uses: actions/checkout@v4
- name: Download libseccomp tarball and upload it to GitHub
run: |
pushd $GITHUB_WORKSPACE
./ci/install_yq.sh
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
versions_yaml="versions.yaml"
version=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.version")
repo_url=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.url")
download_url="${repo_url}/releases/download/v${version}"
tarball="libseccomp-${version}.tar.gz"
asc="${tarball}.asc"
curl -sSLO "${download_url}/${tarball}"
curl -sSLO "${download_url}/${asc}"
# "-m" option should be empty to re-use the existing release title
# without opening a text editor.
# For the details, check https://hub.github.com/hub-release.1.html.
hub release edit -m "" -a "${tarball}" "${tag}"
hub release edit -m "" -a "${asc}" "${tag}"
popd
./tools/packaging/release/release.sh upload-libseccomp-tarball
env:
GH_TOKEN: ${{ github.token }}
upload-helm-chart-tarball:
needs: release
runs-on: ubuntu-22.04
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install helm
uses: azure/setup-helm@v4.2.0
id: install
- name: Generate and upload helm chart tarball
run: |
./tools/packaging/release/release.sh upload-helm-chart-tarball
env:
GH_TOKEN: ${{ github.token }}
publish-release:
needs: [ build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le, publish-multi-arch-images, upload-multi-arch-static-tarball, upload-versions-yaml, upload-cargo-vendored-tarball, upload-libseccomp-tarball ]
runs-on: ubuntu-22.04
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Publish a release
run: |
./tools/packaging/release/release.sh publish-release
env:
GH_TOKEN: ${{ github.token }}

@@ -1,54 +0,0 @@
# Copyright (c) 2020 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
name: Ensure PR has required porting labels
on:
pull_request_target:
types:
- opened
- reopened
- labeled
- unlabeled
branches:
- main
jobs:
check-pr-porting-labels:
runs-on: ubuntu-latest
steps:
- name: Install hub
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
HUB_ARCH="amd64"
HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\
jq -r .tag_name | sed 's/^v//')
curl -sL \
"https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\
tar xz --strip-components=2 --wildcards '*/bin/hub' && \
sudo install hub /usr/local/bin
- name: Checkout code to allow hub to communicate with the project
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
- name: Install porting checker script
run: |
# Clone into a temporary directory to avoid overwriting
# any existing github directory.
pushd $(mktemp -d) &>/dev/null
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
sudo install pr-porting-checks.sh /usr/local/bin
popd &>/dev/null
- name: Stop PR being merged unless it has a correct set of porting labels
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
env:
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
run: |
pr=${{ github.event.number }}
repo=${{ github.repository }}
pr-porting-checks.sh "$pr" "$repo"

@@ -0,0 +1,69 @@
name: CI | Run cri-containerd tests on ppc64le
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-cri-containerd:
strategy:
# We can set this to true once we're 100% sure that
# none of the tests are flaky; otherwise we'll fail
# all the tests due to a single flaky instance
fail-fast: false
matrix:
containerd_version: ['active']
vmm: ['qemu']
runs-on: ppc64le
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- name: Adjust a permission for repo
run: sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
bash "${HOME}/scripts/prepare_runner.sh" cri-containerd
sudo rm -rf "$GITHUB_WORKSPACE"/*
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
timeout-minutes: 15
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-ppc64le${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts
- name: Run cri-containerd tests
run: bash tests/integration/cri-containerd/gha-run.sh run
- name: Cleanup actions for the self hosted runner
run: bash "${HOME}/scripts/cleanup_runner.sh"

@@ -0,0 +1,56 @@
name: CI | Run cri-containerd tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-cri-containerd:
strategy:
# We can set this to true once we're 100% sure that
# none of the tests are flaky; otherwise we'll fail
# all the tests due to a single flaky instance
fail-fast: false
matrix:
containerd_version: ['active']
vmm: ['qemu', 'qemu-runtime-rs']
runs-on: s390x-large
env:
CONTAINERD_VERSION: ${{ matrix.containerd_version }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts
- name: Run cri-containerd tests
run: bash tests/integration/cri-containerd/gha-run.sh run

@@ -0,0 +1,133 @@
name: CI | Run kubernetes tests on AKS
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
host_os:
- ubuntu
vmm:
- clh
- dragonball
- qemu
- qemu-runtime-rs
- stratovirt
- cloud-hypervisor
instance-type:
- small
- normal
include:
- host_os: cbl-mariner
vmm: clh
instance-type: small
genpolicy-pull-method: oci-distribution
auto-generate-policy: yes
- host_os: cbl-mariner
vmm: clh
instance-type: small
genpolicy-pull-method: containerd
auto-generate-policy: yes
- host_os: cbl-mariner
vmm: clh
instance-type: normal
auto-generate-policy: yes
runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HOST_OS: ${{ matrix.host_os }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "vanilla"
USING_NFD: "false"
K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}
GENPOLICY_PULL_METHOD: ${{ matrix.genpolicy-pull-method }}
AUTO_GENERATE_POLICY: ${{ matrix.auto-generate-policy }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
- name: Log into the Azure account
run: bash tests/integration/kubernetes/gha-run.sh login-azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
- name: Create AKS cluster
uses: nick-fields/retry@v3
with:
timeout_minutes: 15
max_attempts: 20
retry_on: error
retry_wait_seconds: 10
command: bash tests/integration/kubernetes/gha-run.sh create-cluster
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/integration/kubernetes/gha-run.sh install-kubectl
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks
- name: Run tests
timeout-minutes: 60
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete AKS cluster
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-cluster

@@ -0,0 +1,109 @@
name: CI | Run kubernetes tests on amd64
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests-amd64:
strategy:
fail-fast: false
matrix:
vmm:
- clh #cloud-hypervisor
- dragonball
- fc #firecracker
- qemu
- cloud-hypervisor
container_runtime:
- containerd
snapshotter:
- devmapper
k8s:
- k3s
include:
- vmm: qemu
container_runtime: crio
snapshotter: ""
k8s: k0s
runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
KUBERNETES_EXTRA_PARAMS: ${{ matrix.container_runtime == 'crio' && '--cri-socket remote:unix:///var/run/crio/crio.sock --kubelet-extra-args --cgroup-driver="systemd"' || '' }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
K8S_TEST_HOST_TYPE: all
CONTAINER_RUNTIME: ${{ matrix.container_runtime }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Configure CRI-O
if: matrix.container_runtime == 'crio'
run: bash tests/integration/kubernetes/gha-run.sh setup-crio
- name: Deploy ${{ matrix.k8s }}
run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s
- name: Configure the ${{ matrix.snapshotter }} snapshotter
if: matrix.snapshotter != ''
run: bash tests/integration/kubernetes/gha-run.sh configure-snapshotter
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Collect artifacts ${{ matrix.vmm }}
if: always()
run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts
continue-on-error: true
- name: Archive artifacts ${{ matrix.vmm }}
uses: actions/upload-artifact@v4
with:
name: k8s-tests-${{ matrix.vmm }}-${{ matrix.snapshotter }}-${{ matrix.k8s }}-${{ inputs.tag }}
path: /tmp/artifacts
retention-days: 1
- name: Delete kata-deploy
if: always()
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh cleanup

@@ -0,0 +1,83 @@
name: CI | Run kubernetes tests on Power(ppc64le)
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
vmm:
- qemu
k8s:
- kubeadm
runs-on: k8s-ppc64le
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
USING_NFD: "false"
TARGET_ARCH: "ppc64le"
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
bash "${HOME}/scripts/prepare_runner.sh" kubernetes
sudo rm -rf "$GITHUB_WORKSPACE"/*
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install golang
run: |
./tests/install_go.sh -f -p
echo "/usr/local/go/bin" >> "$GITHUB_PATH"
- name: Prepare the runner for k8s cluster creation
run: bash "${HOME}/scripts/k8s_cluster_cleanup.sh"
- name: Create k8s cluster using kubeadm
run: bash "${HOME}/scripts/k8s_cluster_create.sh"
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-kubeadm
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete cluster and post cleanup actions
run: bash "${HOME}/scripts/k8s_cluster_cleanup.sh"

@@ -0,0 +1,137 @@
name: CI | Run kubernetes tests on IBM Cloud Z virtual server instance (zVSI)
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests:
strategy:
fail-fast: false
matrix:
snapshotter:
- overlayfs
- devmapper
- nydus
vmm:
- qemu
- qemu-runtime-rs
- qemu-coco-dev
k8s:
- kubeadm
include:
- snapshotter: devmapper
pull-type: default
using-nfd: true
deploy-cmd: configure-snapshotter
- snapshotter: nydus
pull-type: guest-pull
using-nfd: false
deploy-cmd: deploy-snapshotter
exclude:
- snapshotter: overlayfs
vmm: qemu
- snapshotter: overlayfs
vmm: qemu-coco-dev
- snapshotter: devmapper
vmm: qemu-runtime-rs
- snapshotter: devmapper
vmm: qemu-coco-dev
- snapshotter: nydus
vmm: qemu
- snapshotter: nydus
vmm: qemu-runtime-rs
runs-on: s390x-large
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HOST_OS: "ubuntu"
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
PULL_TYPE: ${{ matrix.pull-type }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: ${{ matrix.using-nfd }}
TARGET_ARCH: "s390x"
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Set SNAPSHOTTER to empty if overlayfs
run: echo "SNAPSHOTTER=" >> "$GITHUB_ENV"
if: ${{ matrix.snapshotter == 'overlayfs' }}
- name: Set KBS and KBS_INGRESS if qemu-coco-dev
run: |
echo "KBS=true" >> "$GITHUB_ENV"
echo "KBS_INGRESS=nodeport" >> "$GITHUB_ENV"
if: ${{ matrix.vmm == 'qemu-coco-dev' }}
# qemu-runtime-rs only works with overlayfs
# See: https://github.com/kata-containers/kata-containers/issues/10066
- name: Configure the ${{ matrix.snapshotter }} snapshotter
run: bash tests/integration/kubernetes/gha-run.sh ${{ matrix.deploy-cmd }}
if: ${{ matrix.snapshotter != 'overlayfs' }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-zvsi
- name: Uninstall previous `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client
if: ${{ matrix.vmm == 'qemu-coco-dev' }}
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
if: ${{ matrix.vmm == 'qemu-coco-dev' }}
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
if: ${{ matrix.vmm == 'qemu-coco-dev' }}
- name: Run tests
timeout-minutes: 60
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-zvsi
- name: Delete CoCo KBS
if: always()
run: |
if [ "${KBS}" == "true" ]; then
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
fi
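
For readers tracing which jobs the matrix in this workflow actually produces, here is a minimal sketch; it is not part of the change itself. It enumerates the snapshotter and vmm values and drops the pairs listed under `exclude`. The function name `expand_matrix` is hypothetical, and the `include` entries (which only attach `pull-type`, `using-nfd` and `deploy-cmd` to matching combinations) are left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch only: approximates how the snapshotter x vmm matrix is reduced by
# its `exclude` list. GitHub Actions performs this expansion internally.

# Pairs copied from the workflow's `exclude` section, as "snapshotter:vmm".
excluded=" overlayfs:qemu overlayfs:qemu-coco-dev devmapper:qemu-runtime-rs devmapper:qemu-coco-dev nydus:qemu nydus:qemu-runtime-rs "

# Hypothetical helper: prints one surviving "snapshotter:vmm" pair per line.
expand_matrix() {
  local snapshotter vmm
  for snapshotter in overlayfs devmapper nydus; do
    for vmm in qemu qemu-runtime-rs qemu-coco-dev; do
      case "${excluded}" in
        # Skip any pair that appears in the exclude list.
        *" ${snapshotter}:${vmm} "*) continue ;;
      esac
      echo "${snapshotter}:${vmm}"
    done
  done
}

expand_matrix
```

Under these assumptions three combinations remain: overlayfs with qemu-runtime-rs, devmapper with qemu, and nydus with qemu-coco-dev.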

@@ -0,0 +1,129 @@
name: CI | Run Kata CoCo k8s Stability Tests
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
tarball-suffix:
required: false
type: string
jobs:
# Generate jobs for testing CoCo on non-TEE environments
run-stability-k8s-tests-coco-nontee:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-coco-dev
snapshotter:
- nydus
pull-type:
- guest-pull
runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
# Some tests rely on that variable to run (or not)
KBS: "true"
# Set the KBS ingress handler (empty string disables handling)
KBS_INGRESS: "aks"
KUBERNETES: "vanilla"
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
- name: Log into the Azure account
run: bash tests/integration/kubernetes/gha-run.sh login-azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
- name: Create AKS cluster
uses: nick-fields/retry@v3
with:
timeout_minutes: 15
max_attempts: 20
retry_on: error
retry_wait_seconds: 10
command: bash tests/integration/kubernetes/gha-run.sh create-cluster
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/integration/kubernetes/gha-run.sh install-kubectl
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials
- name: Deploy Snapshotter
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Run stability tests
timeout-minutes: 300
run: bash tests/stability/gha-stability-run.sh run-tests
- name: Delete AKS cluster
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-cluster
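
The "Create AKS cluster" step in this workflow wraps its command in `nick-fields/retry@v3` with `max_attempts: 20` and `retry_wait_seconds: 10`. As a rough illustration of that retry-on-error behavior, a plain shell loop might look like the sketch below. The function name `retry` is hypothetical, and the per-attempt `timeout_minutes` limit is deliberately not modeled.

```shell
#!/usr/bin/env bash
# Sketch only: retry a command up to MAX_ATTEMPTS times, sleeping
# WAIT_SECONDS between failed attempts. The per-attempt timeout that the
# workflow also configures is not modeled here.
# Usage: retry MAX_ATTEMPTS WAIT_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" wait_seconds="$2" attempt=1
  shift 2
  while true; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${wait_seconds}"
  done
}
```

With the values from the step above, the equivalent call would be `retry 20 10 bash tests/integration/kubernetes/gha-run.sh create-cluster`.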

@@ -0,0 +1,303 @@
name: CI | Run kata coco tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-k8s-tests-on-tdx:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-tdx
snapshotter:
- nydus
pull-type:
- guest-pull
runs-on: tdx
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "vanilla"
USING_NFD: "true"
KBS: "true"
K8S_TEST_HOST_TYPE: "baremetal"
KBS_INGRESS: "nodeport"
SNAPSHOTTER: ${{ matrix.snapshotter }}
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
ITA_KEY: ${{ secrets.ITA_KEY }}
AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy Snapshotter
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx
- name: Uninstall previous `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Deploy CSI driver
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
timeout-minutes: 100
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-tdx
- name: Delete Snapshotter
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter
- name: Delete CoCo KBS
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
- name: Delete CSI driver
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver
# AMD has deprecated SEV support on Kata and henceforth SNP will be the only feature supported for Kata Containers.
run-k8s-tests-sev-snp:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-snp
snapshotter:
- nydus
pull-type:
- guest-pull
runs-on: sev-snp
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBECONFIG: /home/kata/.kube/config
KUBERNETES: "vanilla"
USING_NFD: "false"
KBS: "true"
KBS_INGRESS: "nodeport"
K8S_TEST_HOST_TYPE: "baremetal"
SNAPSHOTTER: ${{ matrix.snapshotter }}
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy Snapshotter
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-snp
- name: Uninstall previous `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Deploy CSI driver
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
timeout-minutes: 50
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snp
- name: Delete Snapshotter
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter
- name: Delete CoCo KBS
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
- name: Delete CSI driver
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver
# Generate jobs for testing CoCo on non-TEE environments
run-k8s-tests-coco-nontee:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-coco-dev
snapshotter:
- nydus
pull-type:
- guest-pull
runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
# Some tests rely on that variable to run (or not)
KBS: "true"
# Set the KBS ingress handler (empty string disables handling)
KBS_INGRESS: "aks"
KUBERNETES: "vanilla"
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
- name: Log into the Azure account
run: bash tests/integration/kubernetes/gha-run.sh login-azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
- name: Create AKS cluster
uses: nick-fields/retry@v3
with:
timeout_minutes: 15
max_attempts: 20
retry_on: error
retry_wait_seconds: 10
command: bash tests/integration/kubernetes/gha-run.sh create-cluster
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/integration/kubernetes/gha-run.sh install-kubectl
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials
- name: Deploy Snapshotter
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks
- name: Deploy CoCo KBS
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Deploy CSI driver
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
timeout-minutes: 80
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete AKS cluster
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-cluster

@@ -0,0 +1,96 @@
name: CI | Run kata-deploy tests on AKS
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-kata-deploy-tests:
strategy:
fail-fast: false
matrix:
host_os:
- ubuntu
vmm:
- clh
- dragonball
- qemu
- qemu-runtime-rs
include:
- host_os: cbl-mariner
vmm: clh
runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HOST_OS: ${{ matrix.host_os }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: "vanilla"
USING_NFD: "false"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Download Azure CLI
run: bash tests/functional/kata-deploy/gha-run.sh install-azure-cli
- name: Log into the Azure account
run: bash tests/functional/kata-deploy/gha-run.sh login-azure
env:
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
- name: Create AKS cluster
uses: nick-fields/retry@v3
with:
timeout_minutes: 15
max_attempts: 20
retry_on: error
retry_wait_seconds: 10
command: bash tests/integration/kubernetes/gha-run.sh create-cluster
- name: Install `bats`
run: bash tests/functional/kata-deploy/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/functional/kata-deploy/gha-run.sh install-kubectl
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/functional/kata-deploy/gha-run.sh get-cluster-credentials
- name: Run tests
run: bash tests/functional/kata-deploy/gha-run.sh run-tests
- name: Delete AKS cluster
if: always()
run: bash tests/functional/kata-deploy/gha-run.sh delete-cluster

@@ -0,0 +1,69 @@
name: CI | Run kata-deploy tests on GARM
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-kata-deploy-tests:
strategy:
fail-fast: false
matrix:
vmm:
- clh
- qemu
k8s:
- k0s
- k3s
- rke2
# TODO: A couple of vmm/k8s combinations are failing (https://github.com/kata-containers/kata-containers/issues/9854),
# so the entire kata-deploy-tests job on GARM is disabled for maintenance.
# TODO: Transition to free runner (see #9940).
if: false
runs-on: garm-ubuntu-2004-smaller
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
USING_NFD: "false"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy ${{ matrix.k8s }}
run: bash tests/functional/kata-deploy/gha-run.sh deploy-k8s
- name: Install `bats`
run: bash tests/functional/kata-deploy/gha-run.sh install-bats
- name: Run tests
run: bash tests/functional/kata-deploy/gha-run.sh run-tests

@@ -0,0 +1,64 @@
name: CI | Run kata-monitor tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-monitor:
strategy:
fail-fast: false
matrix:
vmm:
- qemu
container_engine:
- crio
- containerd
# TODO: enable when https://github.com/kata-containers/kata-containers/issues/9853 is fixed
#include:
# - container_engine: containerd
# containerd_version: lts
exclude:
# TODO: enable with containerd when https://github.com/kata-containers/kata-containers/issues/9761 is fixed
- container_engine: containerd
vmm: qemu
runs-on: ubuntu-22.04
env:
CONTAINER_ENGINE: ${{ matrix.container_engine }}
#CONTAINERD_VERSION: ${{ matrix.containerd_version }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/functional/kata-monitor/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/functional/kata-monitor/gha-run.sh install-kata kata-artifacts
- name: Run kata-monitor tests
run: bash tests/functional/kata-monitor/gha-run.sh run

.github/workflows/run-metrics.yaml (new file)

@@ -0,0 +1,94 @@
name: CI | Run test metrics
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
setup-kata:
name: Kata Setup
runs-on: metrics
env:
GOPATH: ${{ github.workspace }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/metrics/gha-run.sh install-kata kata-artifacts
run-metrics:
needs: setup-kata
strategy:
# We can set this to true once we're 100% sure that
# none of the tests are flaky; otherwise a single flaky
# instance would fail all the tests.
fail-fast: false
matrix:
vmm: ['clh', 'qemu']
max-parallel: 1
runs-on: metrics
env:
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- name: enabling the hypervisor
run: bash tests/metrics/gha-run.sh enabling-hypervisor
- name: run launch times test
run: bash tests/metrics/gha-run.sh run-test-launchtimes
- name: run memory foot print test
run: bash tests/metrics/gha-run.sh run-test-memory-usage
- name: run memory usage inside container test
run: bash tests/metrics/gha-run.sh run-test-memory-usage-inside-container
- name: run blogbench test
run: bash tests/metrics/gha-run.sh run-test-blogbench
- name: run tensorflow test
run: bash tests/metrics/gha-run.sh run-test-tensorflow
- name: run fio test
run: bash tests/metrics/gha-run.sh run-test-fio
- name: run iperf test
run: bash tests/metrics/gha-run.sh run-test-iperf
- name: run latency test
run: bash tests/metrics/gha-run.sh run-test-latency
- name: make metrics tarball ${{ matrix.vmm }}
run: bash tests/metrics/gha-run.sh make-tarball-results
- name: archive metrics results ${{ matrix.vmm }}
uses: actions/upload-artifact@v4
with:
name: metrics-artifacts-${{ matrix.vmm }}
path: results-${{ matrix.vmm }}.tar.gz
retention-days: 1
if-no-files-found: error

.github/workflows/run-runk-tests.yaml (new file)

@@ -0,0 +1,48 @@
name: CI | Run runk tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
jobs:
run-runk:
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run runk tests
run: bash tests/integration/runk/gha-run.sh run

.github/workflows/shellcheck.yaml (new file)

@@ -0,0 +1,29 @@
# https://github.com/marketplace/actions/shellcheck
name: Check shell scripts
on:
workflow_dispatch:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
shellcheck:
runs-on: ubuntu-24.04
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/checkout@v4
- name: Run ShellCheck
uses: ludeeus/action-shellcheck@master

@@ -1,39 +0,0 @@
name: Release Kata 2.x in snapcraft store
on:
push:
tags:
- '2.*'
jobs:
release-snap:
runs-on: ubuntu-20.04
steps:
- name: Check out Git repository
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Install Snapcraft
uses: samuelmeuli/action-snapcraft@v1
with:
snapcraft_token: ${{ secrets.snapcraft_token }}
- name: Build snap
run: |
sudo apt-get install -y git git-extras
kata_url="https://github.com/kata-containers/kata-containers"
latest_version=$(git ls-remote --tags ${kata_url} | egrep -o "refs.*" | egrep -v "\-alpha|\-rc|{}" | egrep -o "[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+" | sort -V -r | head -1)
current_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"
# Check semantic versioning format (x.y.z) and if the current tag is the latest tag
if echo "${current_version}" | grep -q "^[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+$" && echo -e "$latest_version\n$current_version" | sort -C -V; then
# Current version is the latest version, build it
snapcraft -d snap --destructive-mode
fi
- name: Upload snap
run: |
snap_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"
snap_file="kata-containers_${snap_version}_amd64.snap"
# Upload the snap if it exists
if [ -f ${snap_file} ]; then
snapcraft upload --release=stable ${snap_file}
fi
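
The removed "Build snap" step only built when the pushed tag was a plain x.y.z version and was not older than the newest release tag. A minimal sketch of that comparison follows, assuming GNU `sort` (for `-V`). The function name `is_latest_release` is hypothetical, and the latest version is passed in as an argument rather than fetched with `git ls-remote`.

```shell
#!/usr/bin/env bash
# Sketch only: succeed when CURRENT is a plain x.y.z version and is not
# older than LATEST, mirroring the check in the removed workflow.
# Usage: is_latest_release CURRENT LATEST
is_latest_release() {
  local current="$1" latest="$2"
  # Reject anything that is not strictly x.y.z (for example, -rc or -alpha tags).
  printf '%s\n' "${current}" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$' || return 1
  # `sort -C -V` exits 0 only if the two lines are already in version order,
  # that is, when LATEST <= CURRENT.
  printf '%s\n%s\n' "${latest}" "${current}" | sort -C -V
}
```

Version ordering matters here: a plain string comparison would rank 2.9.1 above 2.10.0, whereas `sort -V` compares the numeric components.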

@@ -1,27 +0,0 @@
name: snap CI
on:
pull_request:
types:
- opened
- synchronize
- reopened
- edited
jobs:
test:
runs-on: ubuntu-20.04
steps:
- name: Check out
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Install Snapcraft
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: samuelmeuli/action-snapcraft@v1
- name: Build snap
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
snapcraft -d snap --destructive-mode

.github/workflows/stale.yaml (new file)

@@ -0,0 +1,17 @@
name: 'Automatically close stale PRs'
on:
schedule:
- cron: '0 0 * * *'
workflow_dispatch:
jobs:
stale:
runs-on: ubuntu-22.04
steps:
- uses: actions/stale@v9
with:
stale-pr-message: 'This PR has been open with no activity for 180 days. Comment on the PR, otherwise it will be closed in 7 days.'
days-before-pr-stale: 180
days-before-pr-close: 7
days-before-issue-stale: -1
days-before-issue-close: -1

@@ -0,0 +1,34 @@
on:
pull_request:
types:
- opened
- synchronize
- reopened
- labeled # a workflow runs only when the 'ok-to-test' label is added
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
name: Static checks self-hosted
jobs:
skipper:
if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}
uses: ./.github/workflows/gatekeeper-skipper.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
build-checks:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
strategy:
fail-fast: false
matrix:
instance:
- "ubuntu-22.04-arm"
- "s390x"
- "ppc64le"
uses: ./.github/workflows/build-checks.yaml
with:
instance: ${{ matrix.instance }}

@@ -6,91 +6,120 @@ on:
- reopened
- synchronize
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
name: Static checks
jobs:
test:
strategy:
matrix:
go-version: [1.16.x, 1.17.x]
os: [ubuntu-20.04]
runs-on: ${{ matrix.os }}
env:
TRAVIS: "true"
TRAVIS_BRANCH: ${{ github.base_ref }}
TRAVIS_PULL_REQUEST_BRANCH: ${{ github.head_ref }}
TRAVIS_PULL_REQUEST_SHA : ${{ github.event.pull_request.head.sha }}
RUST_BACKTRACE: "1"
target_branch: ${{ github.base_ref }}
skipper:
uses: ./.github/workflows/gatekeeper-skipper.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
check-kernel-config-version:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
runs-on: ubuntu-22.04
steps:
- name: Install Go
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/setup-go@v2
with:
go-version: ${{ matrix.go-version }}
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Setup GOPATH
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "TRAVIS_BRANCH: ${TRAVIS_BRANCH}"
echo "TRAVIS_PULL_REQUEST_BRANCH: ${TRAVIS_PULL_REQUEST_BRANCH}"
echo "TRAVIS_PULL_REQUEST_SHA: ${TRAVIS_PULL_REQUEST_SHA}"
echo "TRAVIS: ${TRAVIS}"
- name: Set env
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
- name: Checkout code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Setup travis references
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "TRAVIS_BRANCH=${TRAVIS_BRANCH:-$(echo $GITHUB_REF | awk 'BEGIN { FS = \"/\" } ; { print $3 }')}"
target_branch=${TRAVIS_BRANCH}
- name: Setup
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Installing rust
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh
PATH=$PATH:"$HOME/.cargo/bin"
rustup target add x86_64-unknown-linux-musl
rustup component add rustfmt clippy
- name: Setup seccomp
run: |
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
# Check whether the vendored code is up-to-date & working as the first thing
- name: Check vendored code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make vendor
- name: Static Checks
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make static-checks
- name: Run Compiler Checks
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make check
- name: Run Unit Tests
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make test
- name: Run Unit Tests As Root User
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && sudo -E PATH="$PATH" make test
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Ensure the kernel config version has been updated
run: |
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"
modified_files=$(git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD)
if git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then
echo "Kernel directory has changed, checking if $kernel_version_file has been updated"
if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then
echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)
else
echo "Readme file changed, no need for kernel config version update."
fi
echo "Check passed"
fi
build-checks:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
uses: ./.github/workflows/build-checks.yaml
with:
instance: ubuntu-22.04
build-checks-depending-on-kvm:
runs-on: ubuntu-22.04
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
strategy:
fail-fast: false
matrix:
component:
- runtime-rs
include:
- component: runtime-rs
command: "sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test"
- component: runtime-rs
component-path: src/dragonball
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install system deps
run: |
sudo apt-get install -y build-essential musl-tools
- name: Install yq
run: |
sudo -E ./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install rust
run: |
export PATH="$PATH:/usr/local/bin"
./tests/install_rust.sh
- name: Running `${{ matrix.command }}` for ${{ matrix.component }}
run: |
export PATH="$PATH:${HOME}/.cargo/bin"
cd ${{ matrix.component-path }}
${{ matrix.command }}
env:
RUST_BACKTRACE: "1"
static-checks:
runs-on: ubuntu-22.04
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
strategy:
fail-fast: false
matrix:
cmd:
- "make static-checks"
env:
GOPATH: ${{ github.workspace }}
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Install yq
run: |
cd "${GOPATH}/src/github.com/${{ github.repository }}"
./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install golang
run: |
cd "${GOPATH}/src/github.com/${{ github.repository }}"
./tests/install_go.sh -f -p
echo "/usr/local/go/bin" >> "$GITHUB_PATH"
- name: Install system dependencies
run: |
sudo apt-get -y install moreutils hunspell hunspell-en-gb hunspell-en-us pandoc
- name: Run check
run: |
export PATH="${PATH}:${GOPATH}/bin"
cd "${GOPATH}/src/github.com/${{ github.repository }}" && ${{ matrix.cmd }}
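
The `check-kernel-config-version` job in this workflow decides whether `kata_config_version` must be bumped by inspecting the list of changed paths. The sketch below restates that decision as a standalone function so it can be tried without a git checkout. The function name `check_kernel_bump` is hypothetical, and reading the paths from stdin (instead of `git diff --name-only`) is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: same decision as the workflow step, but the list of changed
# paths is read from stdin instead of `git diff`, so it can run anywhere.

kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"

# Returns 0 when no bump is needed or the bump is present, 1 otherwise.
check_kernel_bump() {
  local modified_files
  modified_files="$(cat)"

  # Changes outside the kernel directory, or to README.md only, need no bump.
  if ! printf '%s\n' "${modified_files}" | grep -v "README.md" | grep -q "^${kernel_dir}"; then
    return 0
  fi

  # Otherwise the version file itself must be among the changed paths.
  printf '%s\n' "${modified_files}" | grep -qxF "${kernel_version_file}"
}
```

In a checkout, something like `git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD | check_kernel_bump || echo "Please bump version in ${kernel_version_file}"` would approximate the step's behavior.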

.gitignore

@@ -4,10 +4,15 @@
**/*.rej
**/target
**/.vscode
**/.idea
**/.fleet
**/*.swp
**/*.swo
pkg/logging/Cargo.lock
src/agent/src/version.rs
src/agent/kata-agent.service
src/agent/protocols/src/*.rs
!src/agent/protocols/src/lib.rs
build
src/tools/log-parser/kata-log-parser
tools/packaging/static-build/agent/install_libseccomp.sh

View File

@@ -1,4 +1,4 @@
# Copyright (c) 2019 Intel Corporation
# Copyright (c) 2019-2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
@@ -9,4 +9,83 @@
# Order in this file is important. Only the last match will be
# used. See https://help.github.com/articles/about-code-owners/
*.md @kata-containers/documentation
/CODEOWNERS @kata-containers/codeowners
VERSION @kata-containers/release
# The versions database needs careful handling
versions.yaml @kata-containers/release @kata-containers/ci @kata-containers/tests
Makefile* @kata-containers/build
*.mak @kata-containers/build
*.mk @kata-containers/build
# Documentation related files could also appear anywhere
# else in the repo.
*.md @kata-containers/documentation
*.drawio @kata-containers/documentation
*.jpg @kata-containers/documentation
*.png @kata-containers/documentation
*.svg @kata-containers/documentation
*.bash @kata-containers/shell
*.sh @kata-containers/shell
**/completions/ @kata-containers/shell
Dockerfile* @kata-containers/docker
/ci/ @kata-containers/ci
*.bats @kata-containers/tests
/tests/ @kata-containers/tests
*.rs @kata-containers/rust
*.go @kata-containers/golang
/utils/ @kata-containers/utils
# FIXME: Maybe a new "protocol" team would be better?
#
# All protocol changes must be reviewed.
# Note, we include all subdirs, including the vendor dir, as at present there are no .proto files
# in the vendor dir. Later we may have to extend this matching rule if that changes.
/src/libs/protocols/*.proto @kata-containers/architecture-committee @kata-containers/builder @kata-containers/packaging
# GitHub Actions
/.github/workflows/ @kata-containers/action-admins @kata-containers/ci
/ci/ @kata-containers/ci @kata-containers/tests
/docs/ @kata-containers/documentation
/src/agent/ @kata-containers/agent
/src/runtime*/ @kata-containers/runtime
/src/runtime/ @kata-containers/golang
src/runtime-rs/ @kata-containers/rust
src/libs/ @kata-containers/rust
src/dragonball/ @kata-containers/dragonball
/tools/osbuilder/ @kata-containers/builder
/tools/packaging/ @kata-containers/packaging
/tools/packaging/kernel/ @kata-containers/kernel
/tools/packaging/kata-deploy/ @kata-containers/kata-deploy
/tools/packaging/qemu/ @kata-containers/qemu
/tools/packaging/release/ @kata-containers/release
**/vendor/ @kata-containers/vendoring
# Handle arch specific files last so they match more specifically than
# the kernel packaging files.
**/*aarch64* @kata-containers/arch-aarch64
**/*arm64* @kata-containers/arch-aarch64
**/*amd64* @kata-containers/arch-amd64
**/*x86-64* @kata-containers/arch-amd64
**/*x86_64* @kata-containers/arch-amd64
**/*ppc64* @kata-containers/arch-ppc64le
**/*s390x* @kata-containers/arch-s390x

View File

@@ -1,4 +1,4 @@
# Copyright (c) 2020 Intel Corporation
# Copyright (c) 2020-2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
@@ -6,25 +6,29 @@
# List of available components
COMPONENTS =
COMPONENTS += libs
COMPONENTS += agent
COMPONENTS += dragonball
COMPONENTS += runtime
COMPONENTS += runtime-rs
# List of available tools
TOOLS =
TOOLS += agent-ctl
TOOLS += trace-forwarder
TOOLS += kata-ctl
TOOLS += log-parser
TOOLS += runk
TOOLS += trace-forwarder
STANDARD_TARGETS = build check clean install test vendor
STANDARD_TARGETS = build check clean install static-checks-build test vendor
# Variables for the build-and-publish-kata-debug target
KATA_DEBUG_REGISTRY ?= ""
KATA_DEBUG_TAG ?= ""
default: all
all: logging-crate-tests build
logging-crate-tests:
make -C src/libs/logging
include utils.mk
include ./tools/packaging/kata-deploy/local-build/Makefile
@@ -37,19 +41,19 @@ generate-protocols:
make -C src/agent generate-protocols
# Some static checks rely on generated source files of components.
static-checks: build
bash ci/static-checks.sh
static-checks: static-checks-build
bash tests/static-checks.sh github.com/kata-containers/kata-containers
docs-url-alive-check:
bash ci/docs-url-alive-check.sh
build-and-publish-kata-debug:
bash tools/packaging/kata-debug/kata-debug-build-and-upload-payload.sh ${KATA_DEBUG_REGISTRY} ${KATA_DEBUG_TAG}
.PHONY: \
all \
binary-tarball \
kata-tarball \
install-tarball \
default \
install-binary-tarball \
logging-crate-tests \
static-checks \
docs-url-alive-check

View File

@@ -1,4 +1,6 @@
<img src="https://www.openstack.org/assets/kata/kata-vertical-on-white.png" width="150">
<img src="https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-images-prod/openstack-logo/kata/SVG/kata-1.svg" width="900">
[![CI | Publish Kata Containers payload](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml/badge.svg)](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml) [![Kata Containers Nightly CI](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml/badge.svg)](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml)
# Kata Containers
@@ -71,6 +73,7 @@ See the [official documentation](docs) including:
- [Developer guide](docs/Developer-Guide.md)
- [Design documents](docs/design)
- [Architecture overview](docs/design/architecture)
- [Architecture 3.0 overview](docs/design/architecture_3.0/)
## Configuration
@@ -116,10 +119,11 @@ The table below lists the core parts of the project:
| Component | Type | Description |
|-|-|-|
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| [runtime-rs](src/runtime-rs) | core | The Rust version runtime. |
| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |
| [`dragonball`](src/dragonball) | core | An optional built-in VMM brings out-of-the-box Kata Containers experience with optimizations on container workloads |
| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |
| [libraries](src/libs) | core | Library crates shared by multiple Kata Container components or published to [`crates.io`](https://crates.io/index.html) |
| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |
| [tests](tests) | tests | Excludes unit tests which live with the main code. |
### Additional components
@@ -130,18 +134,28 @@ The table below lists the remaining parts of the project:
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [kata-debug](tools/packaging/kata-debug/README.md) | infrastructure | Utility tool to gather Kata Containers debug information from Kubernetes clusters. |
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |
| [`ci`](.github/workflows) | CI | Continuous Integration configuration files and scripts. |
| [`ocp-ci`](ci/openshift-ci/README.md) | CI | Continuous Integration configuration for the OpenShift pipelines. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
| [`Webhook`](tools/testing/kata-webhook/README.md) | utility | Example of a simple admission controller webhook to annotate pods with the Kata runtime class |
### Packaging and releases
Kata Containers is now
[available natively for most distributions](docs/install/README.md#packaged-installation-methods).
However, packaging scripts and metadata are still used to generate snap and GitHub releases. See
the [components](#components) section for further details.
## General tests
See the [tests documentation](tests/README.md).
## Metrics tests
See the [metrics documentation](tests/metrics/README.md).
## Glossary of Terms

View File

@@ -1 +1 @@
2.5.0-alpha1
3.14.0

400
ci/README.md Normal file

@@ -0,0 +1,400 @@
# Kata Containers CI
> [!WARNING]
> While this project's CI has several areas for improvement, it is constantly
> evolving. This document attempts to describe its current state, but due to
> ongoing changes, you may notice some outdated information here. Feel free to
> modify/improve this document as you use the CI and notice anything odd. The
> community appreciates it!
## Introduction
The Kata Containers CI relies on [GitHub Actions][gh-actions], where the actions
themselves can be found in the `.github/workflows` directory, and they may call
helper scripts, which are located under the `tests` directory, to actually
perform the tasks required for each test case.
## The different workflows
There are a few different sets of workflows running as part of our CI, and
here we're going to cover the ones that are less likely to become outdated.
With this said, if you find something that is outdated, opening an issue
pointing to the problem is a nice way to help, and providing a fix for it is
an even better one.
### Jobs that run automatically when a PR is raised
These are a bunch of tests that automatically run as soon as a PR is opened.
They mostly run on "cost free" runners, and they do some pre-checks to
evaluate whether your PR is ready to start getting reviewed.
Mind, though, that the community expects contributors to, at least, build
their code before submitting a PR, which the community sees as a very fair
request.
Without getting into the weeds with details on this, those jobs are the ones
responsible for ensuring that:
- The commit message is in the expected format
- There's no missing Developer's Certificate of Origin
- Static checks are passing
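
To get a feel for the first two checks before opening a PR, here's a minimal
local sketch. It assumes a `subsystem: summary` subject line and a
`Signed-off-by:` trailer, which is what recent commits in this repository look
like; the function name `check_commit_message` is made up for this example,
and the real checks are defined by the workflows and may be stricter.

```bash
#!/usr/bin/env bash
# Sketch only: the two rules below are assumptions for illustration, not a
# copy of what the CI workflows actually enforce.

# check_commit_message MESSAGE
# Returns 0 when MESSAGE has a "subsystem: summary" subject line and a
# "Signed-off-by:" trailer; prints a reason to stderr and returns 1 otherwise.
check_commit_message() {
    local message="$1"
    local subject
    subject="$(printf '%s\n' "$message" | head -n 1)"

    # Assumed subject format: "<subsystem>: <summary>"
    if ! printf '%s\n' "$subject" | grep -qE '^[^: ]+: .+'; then
        echo "subject is not in 'subsystem: summary' form: $subject" >&2
        return 1
    fi

    # DCO: at least one Signed-off-by trailer with a name and an email
    if ! printf '%s\n' "$message" | grep -qE '^Signed-off-by: .+ <.+@.+>$'; then
        echo "missing Signed-off-by line" >&2
        return 1
    fi
}
```

You could call it as `check_commit_message "$(git log -1 --format=%B)"` to
check the most recent commit on your branch.
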
### Jobs that require a maintainer's approval to run
Some tests, which make up our so-called "CI", require a maintainer's approval
to run, as parts of those jobs run on "paid runners", which currently use
Azure infrastructure.
Once a maintainer of the project gives "the green light" (currently by adding an
`ok-to-test` label to the PR, soon to be changed to commenting "/test" as part
of a PR review), the following tests will be executed:
- Build all the components (runs on cost free runners, or bare-metal depending on the architecture)
- Create a tarball with all the components (runs on cost free runners, or bare-metal depending on the architecture)
- Create a kata-deploy payload with the tarball generated in the previous step (runs on cost free runners, or bare-metal depending on the architecture)
- Run the following tests:
  - Tests depending on the generated tarball
    - Metrics (runs on bare-metal)
    - `docker` (runs on cost free runners)
    - `nerdctl` (runs on cost free runners)
    - `kata-monitor` (runs on cost free runners)
    - `cri-containerd` (runs on cost free runners)
    - `nydus` (runs on cost free runners)
    - `vfio` (runs on cost free runners)
  - Tests depending on the generated kata-deploy payload
    - kata-deploy (runs on cost free runners)
      - Tests are performed using different "Kubernetes flavors", such as k0s, k3s, rke2, and Azure Kubernetes Service (AKS).
    - Kubernetes (runs in Azure small and medium instances depending on what's required by each test, and on TEE bare-metal machines)
      - Tests are performed with different runtime engines, such as CRI-O and containerd.
      - Tests are performed with different snapshotters for containerd, namely OverlayFS and devmapper.
      - Tests are performed with all the supported hypervisors, which are Cloud Hypervisor, Dragonball, Firecracker, and QEMU.
For all the tests relying on Azure instances, real money is being spent, so the
community asks the maintainers to be mindful of those, and to avoid abusing
them to merely debug issues.
## The different runners
In the previous section we mentioned using different runners. In this section we'll go through each type of runner used.
- Cost free runners: Those are the runners provided by GitHub itself, and
those are fairly small machines with virtualization capabilities enabled.
- Azure small instances: Those are runners which have virtualization
capabilities enabled, 2 CPUs, and 8GB of RAM. These runners have a "-smaller"
suffix to their name.
- Azure normal instances: Those are runners which have virtualization
capabilities enabled, 4 CPUs, and 16GB of RAM. These runners are usually
`garm` ones with no "-smaller" suffix.
- Bare-metal runners: Those are runners provided by community contributors,
and they may vary in architecture, size and virtualization capabilities.
Builder runners don't actually require any virtualization capabilities, while
runners which will be actually performing the tests must have virtualization
capabilities and a reasonable amount of CPU and RAM available (at least
matching the Azure normal instances).
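
If you want to know how a machine of your own compares to the sizes listed
above, a rough sketch could look like the one below. The function names and
the `normal` / `small` / `too-small` labels are made up for this example; only
the CPU and RAM thresholds come from the list above, and the check for
`/dev/kvm` is just one assumed way of detecting virtualization support on
Linux.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and output labels are hypothetical.

# classify_runner CPUS MEM_GB
# Prints which Azure instance size (as described above) the numbers match.
classify_runner() {
    local cpus="$1" mem_gb="$2"
    if [ "$cpus" -ge 4 ] && [ "$mem_gb" -ge 16 ]; then
        echo "normal"
    elif [ "$cpus" -ge 2 ] && [ "$mem_gb" -ge 8 ]; then
        echo "small"
    else
        echo "too-small"
    fi
}

# describe_local_machine
# Classifies the machine this runs on (Linux only) and reports whether KVM
# looks usable.
describe_local_machine() {
    local cpus mem_kb mem_gb
    cpus="$(nproc)"
    mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
    # Round up to whole GiB, as the kernel reports slightly less than installed
    mem_gb=$(( (mem_kb + 1048575) / 1048576 ))
    echo "size: $(classify_runner "$cpus" "$mem_gb")"
    if [ -e /dev/kvm ]; then
        echo "kvm: available"
    else
        echo "kvm: missing"
    fi
}
```

Running `describe_local_machine` on a candidate bare-metal box gives a quick
hint on whether it is at least comparable to the Azure normal instances.
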
## Adding new tests
Before adding a new test, we strongly recommend going through the
[GitHub Actions Documentation][gh-actions],
which provides the background needed to read and understand the tests we
currently have, and to become familiar with how to write a new one.
In Kata Containers land, there are basically two sets of tests: "standalone"
and "part of something bigger".
The "standalone" tests, for example the commit message check, won't be covered
here as they're better covered by the GitHub Actions documentation linked above.
The "part of something bigger" is the more complicated one and not so
straightforward to add, so we'll be focusing our efforts on describing the
addition of those.
> [!NOTE]
> TODO: Currently, this document refers to "tests" when it actually means the
> jobs (or workflows) of GitHub. In an ideal world, except in some specific cases,
> new tests should be added without the need to add new workflows. In the
> not-too-distant future (hopefully), we will improve the workflows to support
> this.
### Adding a new test that's "part of something bigger"
The first important thing here is to align expectations, and we must say that
the community strongly prefers receiving tests that already come with:
- Instructions on how to run them
- A proven run where it's passing
There are several ways to achieve those two requirements, and an example of that
can be seen in PR #8115.
With the expectations aligned, adding a test consists of:
- Adding a new yaml file for your test, and ensuring it's called from the
"bigger" yaml. See the [Kata Monitor test example][monitor-ex01].
- Adding the helper scripts needed for your test to run. Again, use the [Kata Monitor script as an example][monitor-ex02].
Following those examples, the community advice during the review, and even
asking the community directly on Slack are the best ways to get your test
accepted.
## Required tests
In our CI we have two categories of jobs - required and non-required:
- Required jobs all need to pass for a PR to be merged normally and
should cover all the core features of Kata Containers that we want to
ensure don't have regressions.
- The non-required jobs are for unstable tests, or for features that
are experimental and not fully supported. Ideally we'd like those tests to
also pass on all PRs, but they don't block merging if they fail, as a failure
is not necessarily an indication of the PR code causing regressions.
### Transitioning between required and non-required status
Required jobs that fail block merging of PRs, so we want to ensure that
jobs are stable and maintained before we make them required.
The [Kata Containers CI Dashboard](https://kata-containers.github.io/)
is a useful resource to check when collecting evidence of job stability.
At the time of writing it reports the last ten days of Kata CI nightly test
results for each job. This isn't perfect as it doesn't currently capture
results on PRs, but is a good guideline for stability.
> [!NOTE]
> Below are general guidelines about jobs being marked as
> required/non-required, but they are subject to change and the Kata
> Architecture Committee may overrule these guidelines at their
> discretion.
#### Initial marking as required
For new jobs, or jobs that haven't been marked as required recently,
the criterion for being initially marked as required is ten days
of passing tests, with no relevant PR failures reported in that time.
Required jobs also need one or more nominated maintainers that are
responsible for the stability of their jobs.
> [!NOTE]
> We don't currently have a good place to record the job maintainers, but
> once we have this, the intention is to show it on the CI Dashboard so
> people can find the contact easily.
#### Expectation of required job maintainers
Because the Kata Containers community has contributors spread around the
world, required jobs being blocked due to infrastructure or test issues can
have a big impact on work. As such, the expectation is that when a problem
with a required job is noticed or reported, the maintainers have one working
day to acknowledge the issue, perform an initial investigation and then either
fix it, or get it marked as non-required whilst the investigation and/or fix
is done.
### Re-marking of required status
Once a job has been removed from the required list, it requires two
consecutive successful nightly test runs before being made required
again.
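
The two thresholds described in this section (ten days of passing tests for
the initial marking, two consecutive successful nightly runs for re-marking)
can be sketched as one small helper. This is only an illustration: the
function name `last_n_passed` and the `pass` / `fail` vocabulary are made up,
and the results would have to be collected by hand, for instance from the CI
Dashboard.

```bash
#!/usr/bin/env bash
# Sketch only: names and result values are hypothetical.

# last_n_passed N RESULT...
# RESULT values are listed oldest first. Succeeds when at least N results are
# given and the most recent N of them are all "pass".
last_n_passed() {
    local n="$1"
    shift
    # Not enough history to decide
    [ "$#" -ge "$n" ] || return 1
    # Drop everything but the most recent N results
    shift $(( $# - n ))
    local result
    for result in "$@"; do
        [ "$result" = "pass" ] || return 1
    done
    return 0
}
```

For example, `last_n_passed 2 fail pass pass` succeeds, so a job with that
nightly history would qualify for being re-marked as required, while
`last_n_passed 10 ...` expresses the initial ten-day criterion when one result
per day is given.
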
## Running tests
### Running the tests as part of the CI
If you're a maintainer of the project, you'll be able to kick off the tests
yourself. With the current approach, you just need to add the `ok-to-test`
label and the tests will automatically start. We're moving, though, to a
`/test` command as part of a GitHub review comment, which will simplify this
process.
If you're not a maintainer, please send a message on Slack or wait till one of
the maintainers reviews your PR. Maintainers will then kick off the tests on
your behalf.
In case a test fails and you suspect it is due to flakiness in the test
itself, please create an issue for us, and then re-run (or ask the
maintainers to re-run) the tests following these steps:
- Locate which test is failing
- Click on "details"
- In the top right corner, click on "Re-run jobs"
- And then on "Re-run failed jobs"
- And finally click on the green "Re-run jobs" button
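
If you prefer the command line, the same re-run can likely be triggered with
the GitHub CLI, as sketched below. This assumes a reasonably recent `gh` that
supports `gh run rerun --failed`, and that you have the permissions needed to
re-run jobs. The wrapper name `rerun_failed_jobs` and the `DRY_RUN` variable
are made up for this example, and the run ID has to be taken from the URL of
the failing workflow run.

```bash
#!/usr/bin/env bash
# Sketch only: wrapper name and DRY_RUN are hypothetical conveniences.

# rerun_failed_jobs RUN_ID [REPO]
# Re-runs only the failed jobs of a workflow run.
# With DRY_RUN=1 the command is printed instead of executed.
rerun_failed_jobs() {
    local run_id="$1"
    local repo="${2:-kata-containers/kata-containers}"
    local cmd=(gh run rerun "$run_id" --failed --repo "$repo")

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

Using `DRY_RUN=1 rerun_failed_jobs <run-id>` first is a cheap way to double
check what would be executed.
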
> [!NOTE]
> TODO: We need figures here
### Running the tests locally
In this section, aligning expectations is also very important, as you will
not be able to run the tests in exactly the same way they run in the CI, since
you most likely won't have access to an Azure subscription.
However, we're trying our best here to provide you with instructions on how to
run the tests in an environment that's "close enough" and will help you debug
issues you find with the current tests, or even provide a proof-of-concept for
the new test you're trying to add.
The basic steps, which we cover in detail below, are:
1. Create a VM matching the configuration of the target runner
2. Generate the artifacts you'll need for the test, or download them from a
current failed run
3. Follow the steps provided in the action itself to run the tests.
Although the general overview looks easy, we know that some tricks need to be
shared, and we'll go through the general process of debugging one non-Kubernetes
and one Kubernetes specific test for educational purposes.
One important thing to note is that "Create a VM" can be done in innumerable
different ways, using the tools of your choice. For the sake of simplicity in
this guide, we'll be using `kcli`, which we strongly recommend in case you're
an inexperienced user and happen to be developing on a Linux box.
For both non-Kubernetes and Kubernetes cases, we'll be using PR #8070 as an
example, which at the time of writing serves the purpose very well, as you can
see that it has `nerdctl` and Kubernetes tests failing.
## Debugging tests
### Debugging a non-Kubernetes test
As shown above, the `nerdctl` test is failing.
As a developer you can go to the details of the job, and expand the job
that's failing in order to gather more information.
But when that doesn't help, we need to set up our own environment to debug
what's going on.
Taking a look at the `nerdctl` test, which is located here, you can easily see
that it `runs-on` a `garm-ubuntu-2304-smaller` virtual machine.
The important parts to understand are `ubuntu-2304`, which is the OS the
test runs on, and "smaller", which means we're running it on a machine
with 2 CPUs and 8GB of RAM.
With this information, we can go ahead and create a similar VM locally using `kcli`.
```bash
$ sudo kcli create vm -i ubuntu2304 -P disks=[60] -P numcpus=2 -P memory=8192 -P cpumodel=host-passthrough debug-nerdctl-pr8070
```
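
The mapping from runner name to `kcli` parameters used above can be sketched
as a small helper. Note that the naming rule is inferred from this single
example (drop the `garm-` prefix and the hyphen to get the image name, and
treat a `-smaller` suffix as 2 CPUs and 8GB of RAM, otherwise 4 CPUs and 16GB),
so take it as an assumption rather than a documented convention; the function
name `runner_to_kcli_args` is made up.

```bash
#!/usr/bin/env bash
# Sketch only: the name-to-image rule is an assumption based on one example.

# runner_to_kcli_args RUNNER_NAME
# Prints the kcli options matching the given GitHub runner label.
runner_to_kcli_args() {
    local runner="$1"
    # Defaults match the Azure normal instances: 4 CPUs, 16GB of RAM
    local cpus=4 memory=16384

    # A "-smaller" suffix denotes the 2 CPUs / 8GB instances
    if [[ "$runner" == *-smaller ]]; then
        cpus=2
        memory=8192
        runner="${runner%-smaller}"
    fi

    # Drop the "garm-" prefix, then the hyphen: "ubuntu-2304" -> "ubuntu2304"
    local image="${runner#garm-}"
    image="${image//-/}"

    echo "-i ${image} -P disks=[60] -P numcpus=${cpus} -P memory=${memory} -P cpumodel=host-passthrough"
}
```

For `garm-ubuntu-2304-smaller` this prints the same options as the command
above, and the output can be placed between `sudo kcli create vm` and the VM
name of your choice.
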
In order to run the tests, you'll need the "kata-tarball" artifacts, which you
can either build yourself using "make kata-tarball" (see below), or simply get
from the PR where the tests failed. To download them, click on the "Summary"
button in the top left corner, and then scroll down until you see the
artifacts, as shown below.
Unfortunately GitHub doesn't give us a link that we can download those from
inside the VM, but we can download them on our local box, and then `scp` the
tarball to the newly created VM that will be used for debugging purposes.
> [!NOTE]
> Those artifacts are only available (for 15 days) when all jobs are finished.
Once you have the `kata-static.tar.xz` in your VM, you can log in to the VM with
`kcli ssh debug-nerdctl-pr8070`, and then clone your development branch:
```bash
$ git clone --branch feat_add-fc-runtime-rs https://github.com/nubificus/kata-containers
```
Add the upstream as a remote, set up your git, and rebase your branch on top of the upstream main one:
```bash
$ git remote add upstream https://github.com/kata-containers/kata-containers
$ git remote update
$ git config --global user.email "you@example.com"
$ git config --global user.name "Your Name"
$ git rebase upstream/main
```
Now copy the `kata-static.tar.xz` into your `kata-containers/kata-artifacts` directory
```bash
$ mkdir kata-artifacts
$ cp ../kata-static.tar.xz kata-artifacts/
```
> [!NOTE]
> If you downloaded the .zip from GitHub you need to uncompress it first to get `kata-static.tar.xz`
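
Both cases (the `.zip` downloaded from GitHub, or a `kata-static.tar.xz` you
already have) can be handled by a small helper like the sketch below. The
function name `stage_kata_tarball` is made up, and it assumes `unzip` is
installed in the VM and that the `.zip` contains `kata-static.tar.xz` at its
top level.

```bash
#!/usr/bin/env bash
# Sketch only: helper name and the layout of the .zip are assumptions.

# stage_kata_tarball DOWNLOAD REPO_DIR
# Places kata-static.tar.xz under REPO_DIR/kata-artifacts, extracting it first
# when DOWNLOAD is the .zip artifact from GitHub.
stage_kata_tarball() {
    local download="$1" repo_dir="$2"
    local dest="${repo_dir}/kata-artifacts"
    mkdir -p "$dest"

    case "$download" in
        *.zip)
            # GitHub wraps artifacts in a .zip; extract it into place
            unzip -o -q "$download" -d "$dest"
            ;;
        *.tar.xz)
            cp "$download" "$dest/"
            ;;
        *)
            echo "unsupported file: $download" >&2
            return 1
            ;;
    esac

    # This guide uses this exact file name, so make sure it ended up there
    if [ ! -f "${dest}/kata-static.tar.xz" ]; then
        echo "kata-static.tar.xz not found in ${dest}" >&2
        return 1
    fi
}
```

From inside the cloned repository, `stage_kata_tarball ../kata-static.tar.xz .`
is then equivalent to the `mkdir` and `cp` commands above.
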
And finally run the tests following what's in the yaml file for the test you're
debugging.
In our case, that's `run-nerdctl-tests-on-garm.yaml`.
When looking at the file you'll notice that some environment variables are set,
such as `KATA_HYPERVISOR`. Be aware that, for this particular example,
the important steps to follow are:
- Install the dependencies
- Install kata
- Run the tests
Let's now run the steps mentioned above exporting the expected environment variables
```bash
$ export KATA_HYPERVISOR=dragonball
$ bash ./tests/integration/nerdctl/gha-run.sh install-dependencies
$ bash ./tests/integration/nerdctl/gha-run.sh install-kata
$ bash tests/integration/nerdctl/gha-run.sh run
```
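
When repeating this a few times while debugging, it may be handy to wrap the
steps so that the sequence stops at the first failure. The sketch below is one
assumed way of doing it; the function name `run_gha_steps` and the `DRY_RUN`
variable are made up, while the script path and step names are the ones used
above.

```bash
#!/usr/bin/env bash
# Sketch only: wrapper name and DRY_RUN are hypothetical conveniences.

# run_gha_steps SCRIPT STEP...
# Runs "bash SCRIPT STEP" for each STEP in order and stops at the first
# failure. With DRY_RUN=1 the steps are only listed, not executed.
run_gha_steps() {
    local script="$1"
    shift
    local step
    for step in "$@"; do
        echo "==> ${script} ${step}"
        if [ "${DRY_RUN:-0}" != "1" ]; then
            if ! bash "$script" "$step"; then
                echo "step '${step}' failed" >&2
                return 1
            fi
        fi
    done
}
```

With `KATA_HYPERVISOR` exported as above, the three commands become
`run_gha_steps ./tests/integration/nerdctl/gha-run.sh install-dependencies install-kata run`.
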
And with this you should've been able to reproduce exactly the same issue found
in the CI, and from now on you can build your own code, use your own binaries,
and have fun debugging and hacking!
### Debugging a Kubernetes test
Steps for debugging the Kubernetes tests are very similar to the ones for
debugging non-Kubernetes tests, with the caveat that what you'll need, this
time, is not the `kata-static.tar.xz` tarball, but rather a payload to be used
with kata-deploy.
In order to generate your own kata-deploy image you can generate your own
`kata-static.tar.xz` and then take advantage of the following script. Be aware
that the image generated and uploaded must be accessible by the VM where you'll
be performing your tests.
In case you want to take advantage of the payload that was already generated
when you faced the CI failure, which is considerably easier, take a look at the
failed job, then click on "Deploy Kata" and expand the "Final kata-deploy.yaml
that is used in the test" section. From there you can see exactly what you'll
have to use when deploying kata-deploy in your local cluster.
> [!NOTE]
> TODO: WAINER TO FINISH THIS PART BASED ON HIS PR TO RUN A LOCAL CI
## Adding new runners
Any admin of the project is able to add or remove GitHub runners, and those are
the folks you should rely on.
If you need a new runner added, please tag @ac in the Kata Containers Slack,
and someone from that group will be able to help you.
If you're part of that group and you're looking for information on how to help
someone, this is simple, and must be done in private. Basically what you have to
do is:
- Go to the kata-containers/kata-containers repo
- Click on the Settings button, located in the top right corner
- On the left panel, under "Code and automation", click on "Actions"
- Click on "Runners"
If you want to add a new self-hosted runner:
- In the top right corner there's a green button called "New self-hosted runner"
If you want to remove a current self-hosted runner:
- For each runner there's a "..." menu, where you can just click and the
"Remove runner" option will show up
## Known limitations
As the GitHub Actions workflows are structured right now, we cannot test the addition of a
GitHub action that's not triggered by a `pull_request` event as part of the PR.
[gh-actions]: https://docs.github.com/en/actions
[monitor-ex01]: https://github.com/kata-containers/kata-containers/commit/a3fb067f1bccde0cbd3fd4d5de12dfb3d8c28b60
[monitor-ex02]: https://github.com/kata-containers/kata-containers/commit/489caf1ad0fae27cfd00ba3c9ed40e3d512fa492

View File

@@ -11,10 +11,10 @@ runtimedir=$cidir/../src/runtime
build_working_packages() {
# working packages:
device_api=$runtimedir/virtcontainers/device/api
device_config=$runtimedir/virtcontainers/device/config
device_drivers=$runtimedir/virtcontainers/device/drivers
device_manager=$runtimedir/virtcontainers/device/manager
device_api=$runtimedir/pkg/device/api
device_config=$runtimedir/pkg/device/config
device_drivers=$runtimedir/pkg/device/drivers
device_manager=$runtimedir/pkg/device/manager
rc_pkg_dir=$runtimedir/pkg/resourcecontrol/
utils_pkg_dir=$runtimedir/virtcontainers/utils

View File

@@ -7,6 +7,6 @@
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
source "${cidir}/../tests/common.bash"
run_docs_url_alive_check

182
ci/gh-util.sh Executable file

@@ -0,0 +1,182 @@
#!/bin/bash
# Copyright (c) 2020 Intel Corporation
# Copyright (c) 2024 IBM Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -o errexit
set -o errtrace
set -o nounset
set -o pipefail
[ -n "${DEBUG:-}" ] && set -o xtrace
script_name=${0##*/}
#---------------------------------------------------------------------
die()
{
echo >&2 "$*"
exit 1
}
usage()
{
cat <<EOF
Usage: $script_name [OPTIONS] [command] [arguments]
Description: Utility to expand the abilities of the GitHub CLI tool, gh.
Command descriptions:
list-issues-for-pr List issues linked to a PR.
list-labels-for-issue List labels, in json format for an issue
Commands and arguments:
list-issues-for-pr <pr>
list-labels-for-issue <issue>
Options:
-h Show this help statement.
-r <owner/repo> Optional <org/repo> specification. Default: 'kata-containers/kata-containers'
Examples:
- List issues for a Pull Request 123 in kata-containers/kata-containers repo
$ $script_name list-issues-for-pr 123
EOF
}
list_issues_for_pr()
{
local pr="${1:-}"
local repo="${2:-kata-containers/kata-containers}"
[ -z "$pr" ] && die "need PR"
local commits=$(gh pr view ${pr} --repo ${repo} --json commits --jq .commits[].messageBody)
[ -z "$commits" ] && die "cannot determine commits for PR $pr"
# Extract the issue number(s) from the commits.
#
# This needs to be careful to take account of lines like this:
#
# fixes 99
# fixes: 77
# fixes #123.
# Fixes: #1, #234, #5678.
#
# Note the exclusion of lines starting with whitespace which is
# specifically to ignore vendored git log comments, which are whitespace
# indented and in the format:
#
# "<git-commit> <git-commit-msg>"
#
local issues=$(echo "$commits" |\
grep -v -E "^( | )" |\
grep -i -E "fixes:* *(#*[0-9][0-9]*)" |\
tr ' ' '\n' |\
grep "[0-9][0-9]*" |\
sed 's/[.,\#]//g' |\
sort -nu || true)
[ -z "$issues" ] && die "cannot determine issues for PR $pr"
echo "# Issues linked to PR"
echo "#"
echo "# Fields: issue_number"
local issue
echo "$issues"|while read issue
do
printf "%s\n" "$issue"
done
}
list_labels_for_issue()
{
local issue="${1:-}"
[ -z "$issue" ] && die "need issue number"
local labels=$(gh issue view ${issue} --repo kata-containers/kata-containers --json labels)
[ -z "$labels" ] && die "cannot determine labels for issue $issue"
printf "$labels"
}
setup()
{
for cmd in gh jq
do
command -v "$cmd" &>/dev/null || die "need command: $cmd"
done
}
handle_args()
{
setup
local show_all="false"
local opt
while getopts "ahr:" opt "$@"
do
case "$opt" in
a) show_all="true" ;;
h) usage && exit 0 ;;
r) repo="${OPTARG}" ;;
esac
done
shift $(($OPTIND - 1))
local repo="${repo:-kata-containers/kata-containers}"
local cmd="${1:-}"
case "$cmd" in
list-issues-for-pr) ;;
list-labels-for-issue) ;;
"") usage && exit 0 ;;
*) die "invalid command: '$cmd'" ;;
esac
# Consume the command name
shift
local issue=""
local pr=""
case "$cmd" in
list-issues-for-pr)
pr="${1:-}"
list_issues_for_pr "$pr" "${repo}"
;;
list-labels-for-issue)
issue="${1:-}"
list_labels_for_issue "$issue"
;;
*) die "impossible situation: cmd: '$cmd'" ;;
esac
exit 0
}
main()
{
handle_args "$@"
}
main "$@"

View File

@@ -1,12 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2020 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
run_go_test

View File

@@ -1,22 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
clone_tests_repo
new_goroot=/usr/local/go
pushd "${tests_repo_dir}"
# Force overwrite the current version of golang
[ -z "${GOROOT}" ] || rm -rf "${GOROOT}"
.ci/install_go.sh -p -f -d "$(dirname ${new_goroot})"
[ -z "${GOROOT}" ] || sudo ln -sf "${new_goroot}" "${GOROOT}"
go version
popd

View File

@@ -7,12 +7,10 @@
set -o errexit
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
script_name="$(basename "${BASH_SOURCE[0]}")"
clone_tests_repo
source "${tests_repo_dir}/.ci/lib.sh"
source "${script_dir}/../tests/common.bash"
# The following variables if set on the environment will change the behavior
# of gperf and libseccomp configure scripts, that may lead this script to
@@ -23,88 +21,91 @@ arch=${ARCH:-$(uname -m)}
workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"
# Variables for libseccomp
# Currently, specify the libseccomp version directly without using `versions.yaml`
# because the current Snap workflow is incomplete.
# After solving the issue, replace this code by using the `versions.yaml`.
# libseccomp_version=$(get_version "externals.libseccomp.version")
# libseccomp_url=$(get_version "externals.libseccomp.url")
libseccomp_version="2.5.1"
libseccomp_url="https://github.com/seccomp/libseccomp"
libseccomp_version="${LIBSECCOMP_VERSION:-""}"
if [ -z "${libseccomp_version}" ]; then
libseccomp_version=$(get_from_kata_deps ".externals.libseccomp.version")
fi
libseccomp_url="${LIBSECCOMP_URL:-""}"
if [ -z "${libseccomp_url}" ]; then
libseccomp_url=$(get_from_kata_deps ".externals.libseccomp.url")
fi
libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"
libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"
cflags="-O2"
# Variables for gperf
# Currently, specify the gperf version directly without using `versions.yaml`
# because the current Snap workflow is incomplete.
# After solving the issue, replace this code by using the `versions.yaml`.
# gperf_version=$(get_version "externals.gperf.version")
# gperf_url=$(get_version "externals.gperf.url")
gperf_version="3.1"
gperf_url="https://ftp.gnu.org/gnu/gperf"
gperf_version="${GPERF_VERSION:-""}"
if [ -z "${gperf_version}" ]; then
gperf_version=$(get_from_kata_deps ".externals.gperf.version")
fi
gperf_url="${GPERF_URL:-""}"
if [ -z "${gperf_url}" ]; then
gperf_url=$(get_from_kata_deps ".externals.gperf.url")
fi
gperf_tarball="gperf-${gperf_version}.tar.gz"
gperf_tarball_url="${gperf_url}/${gperf_tarball}"
# We need to build the libseccomp library from sources to create a static library for the musl libc.
# However, ppc64le and s390x have no musl targets in Rust. Hence, we do not set cflags for the musl libc.
if ([ "${arch}" != "ppc64le" ] && [ "${arch}" != "s390x" ]); then
# Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2
cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"
# Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2
cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"
fi
die() {
msg="$*"
echo "[Error] ${msg}" >&2
exit 1
msg="$*"
echo "[Error] ${msg}" >&2
exit 1
}
finish() {
rm -rf "${workdir}"
rm -rf "${workdir}"
}
trap finish EXIT
build_and_install_gperf() {
echo "Build and install gperf version ${gperf_version}"
mkdir -p "${gperf_install_dir}"
curl -sLO "${gperf_tarball_url}"
tar -xf "${gperf_tarball}"
pushd "gperf-${gperf_version}"
# Unset $CC for configure, we will always use native for gperf
CC= ./configure --prefix="${gperf_install_dir}"
make
make install
export PATH=$PATH:"${gperf_install_dir}"/bin
popd
echo "Gperf installed successfully"
echo "Build and install gperf version ${gperf_version}"
mkdir -p "${gperf_install_dir}"
curl -sLO "${gperf_tarball_url}"
tar -xf "${gperf_tarball}"
pushd "gperf-${gperf_version}"
# Unset $CC for configure, we will always use native for gperf
CC= ./configure --prefix="${gperf_install_dir}"
make
make install
export PATH=$PATH:"${gperf_install_dir}"/bin
popd
echo "Gperf installed successfully"
}
build_and_install_libseccomp() {
echo "Build and install libseccomp version ${libseccomp_version}"
mkdir -p "${libseccomp_install_dir}"
curl -sLO "${libseccomp_tarball_url}"
tar -xf "${libseccomp_tarball}"
pushd "libseccomp-${libseccomp_version}"
./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"
make
make install
popd
echo "Libseccomp installed successfully"
echo "Build and install libseccomp version ${libseccomp_version}"
mkdir -p "${libseccomp_install_dir}"
curl -sLO "${libseccomp_tarball_url}"
tar -xf "${libseccomp_tarball}"
pushd "libseccomp-${libseccomp_version}"
[ "${arch}" == $(uname -m) ] && cc_name="" || cc_name="${arch}-linux-gnu-gcc"
CC=${cc_name} ./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"
make
make install
popd
echo "Libseccomp installed successfully"
}
main() {
local libseccomp_install_dir="${1:-}"
local gperf_install_dir="${2:-}"
local libseccomp_install_dir="${1:-}"
local gperf_install_dir="${2:-}"
if [ -z "${libseccomp_install_dir}" ] || [ -z "${gperf_install_dir}" ]; then
die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"
fi
if [ -z "${libseccomp_install_dir}" ] || [ -z "${gperf_install_dir}" ]; then
die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"
fi
pushd "$workdir"
# gperf is required for building the libseccomp.
build_and_install_gperf
build_and_install_libseccomp
popd
pushd "$workdir"
# gperf is required for building the libseccomp.
build_and_install_gperf
build_and_install_libseccomp
popd
}
main "$@"

View File

@@ -1,16 +0,0 @@
#!/usr/bin/env bash
# Copyright (c) 2019 Ant Financial
#
# SPDX-License-Identifier: Apache-2.0
#
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
clone_tests_repo
pushd ${tests_repo_dir}
.ci/install_rust.sh ${1:-}
popd

View File

@@ -1,19 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -e
cidir=$(dirname "$0")
vcdir="${cidir}/../src/runtime/virtcontainers/"
source "${cidir}/lib.sh"
export CI_JOB="${CI_JOB:-default}"
clone_tests_repo
if [ "${CI_JOB}" != "PODMAN" ]; then
echo "Install virtcontainers"
make -C "${vcdir}" && chronic sudo make -C "${vcdir}" install
fi

View File

@@ -5,6 +5,8 @@
# SPDX-License-Identifier: Apache-2.0
#
[ -n "$DEBUG" ] && set -o xtrace
# If we fail for any reason a message will be displayed
die() {
msg="$*"
@@ -12,21 +14,48 @@ die() {
exit 1
}
function verify_yq_exists() {
local yq_path=$1
local yq_version=$2
local expected="yq (https://github.com/mikefarah/yq/) version $yq_version"
if [ -x "${yq_path}" ] && [ "$($yq_path --version)"X == "$expected"X ]; then
return 0
else
return 1
fi
}
# Install the yq yaml query package from the mikefarah github repo
# Install via binary download, as we may not have golang installed at this point
function install_yq() {
local yq_pkg="github.com/mikefarah/yq"
local yq_version=3.4.1
local yq_version=v4.44.5
local precmd=""
local yq_path=""
INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}
if [ "${INSTALL_IN_GOPATH}" == "true" ];then
if [ "${INSTALL_IN_GOPATH}" == "true" ]; then
GOPATH=${GOPATH:-${HOME}/go}
mkdir -p "${GOPATH}/bin"
local yq_path="${GOPATH}/bin/yq"
yq_path="${GOPATH}/bin/yq"
else
yq_path="/usr/local/bin/yq"
fi
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return
if verify_yq_exists "$yq_path" "$yq_version"; then
echo "yq is already installed in correct version"
return
fi
if [ "${yq_path}" == "/usr/local/bin/yq" ]; then
# Check if we need sudo to install yq
if [ ! -w "/usr/local/bin" ]; then
# Check if we have sudo privileges
if ! sudo -n true 2>/dev/null; then
die "Please provide sudo privileges to install yq"
else
precmd="sudo"
fi
fi
fi
read -r -a sysInfo <<< "$(uname -sm)"
@@ -43,6 +72,19 @@ function install_yq() {
"aarch64")
goarch=arm64
;;
"arm64")
# If we're on an apple silicon machine, just assign amd64.
# The version of yq we use doesn't have a darwin arm build,
# but Rosetta can come to the rescue here.
if [ $goos == "Darwin" ]; then
goarch=amd64
else
goarch=arm64
fi
;;
"riscv64")
goarch=riscv64
;;
"ppc64le")
goarch=ppc64le
;;
@@ -64,10 +106,10 @@ function install_yq() {
fi
## NOTE: ${var,,} => gives lowercase value of var
local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos,,}_${goarch}"
curl -o "${yq_path}" -LSsf "${yq_url}"
local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos}_${goarch}"
${precmd} curl -o "${yq_path}" -LSsf "${yq_url}"
[ $? -ne 0 ] && die "Download ${yq_url} failed"
chmod +x "${yq_path}"
${precmd} chmod +x "${yq_path}"
if ! command -v "${yq_path}" >/dev/null; then
die "Cannot get ${yq_path} executable"

View File

@@ -1,55 +0,0 @@
#
# Copyright (c) 2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
set -o nounset
export tests_repo="${tests_repo:-github.com/kata-containers/tests}"
export tests_repo_dir="$GOPATH/src/$tests_repo"
export branch="${target_branch:-main}"
# Clones the tests repository and checkout to the branch pointed out by
# the global $branch variable.
# If the clone exists and `CI` is exported then it does nothing. Otherwise
# it will clone the repository or `git pull` the latest code.
#
clone_tests_repo()
{
if [ -d "$tests_repo_dir" ]; then
[ -n "${CI:-}" ] && return
pushd "${tests_repo_dir}"
git checkout "${branch}"
git pull
popd
else
git clone -q "https://${tests_repo}" "$tests_repo_dir"
pushd "${tests_repo_dir}"
git checkout "${branch}"
popd
fi
}
run_static_checks()
{
clone_tests_repo
# Make sure we have the targeting branch
git remote set-branches --add origin "${branch}"
git fetch -a
bash "$tests_repo_dir/.ci/static-checks.sh" "$@"
}
run_go_test()
{
clone_tests_repo
bash "$tests_repo_dir/.ci/go-test.sh"
}
run_docs_url_alive_check()
{
clone_tests_repo
# Make sure we have the targeting branch
git remote set-branches --add origin "${branch}"
git fetch -a
bash "$tests_repo_dir/.ci/static-checks.sh" --docs --all "github.com/kata-containers/kata-containers"
}

149
ci/openshift-ci/README.md Normal file
View File

@@ -0,0 +1,149 @@
OpenShift CI
============
This directory contains scripts used by
[the OpenShift CI](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers)
pipelines to monitor selected functional tests on OpenShift.
There are 2 pipelines; their history and logs can be accessed here:
* [main - currently supported OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-e2e-tests)
* [next - currently under development OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-next-e2e-tests)
Running openshift-tests on OCP with kata-containers manually
============================================================
To run openshift-tests (or other suites) with kata-containers one can use
the kata-webhook. To deploy everything you can mimic the CI pipeline by:
```bash
#!/bin/bash -e
# Set up your kubectl and check the cluster is accessible by
kubectl get nodes
# Deploy kata (set KATA_DEPLOY_IMAGE to override the default kata-deploy-ci:latest image)
./test.sh
# Deploy the webhook
KATA_RUNTIME=kata-qemu cluster/deploy_webhook.sh
```
This should ensure kata-containers as well as kata-webhook are installed and
working. Before running the openshift-tests it's (currently) recommended to
ignore some security features by:
```bash
#!/bin/bash -e
oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts
oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts
oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline
```
Now you should be ready to run the openshift-tests. Our CI only uses a subset
of tests, to get the current ``TEST_SKIPS`` see
[the pipeline config](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers).
The following steps require the [openshift tests](https://github.com/openshift/origin)
to be cloned and built in the current directory:
```bash
#!/bin/bash -e
# Define tests to be skipped (see the pipeline config for the current version)
TEST_SKIPS="\[sig-node\] Security Context should support seccomp runtime/default\|\[sig-node\] Variable Expansion should allow substituting values in a volume subpath\|\[k8s.io\] Probing container should be restarted with a docker exec liveness probe with timeout\|\[sig-node\] Pods Extended Pod Container lifecycle evicted pods should be terminal\|\[sig-node\] PodOSRejection \[NodeConformance\] Kubelet should reject pod when the node OS doesn't match pod's OS\|\[sig-network\].*for evicted pods\|\[sig-network\].*HAProxy router should override the route\|\[sig-network\].*HAProxy router should serve a route\|\[sig-network\].*HAProxy router should serve the correct\|\[sig-network\].*HAProxy router should run\|\[sig-network\].*when FIPS.*the HAProxy router\|\[sig-network\].*bond\|\[sig-network\].*all sysctl on whitelist\|\[sig-network\].*sysctls should not affect\|\[sig-network\] pods should successfully create sandboxes by adding pod to network"
# Get the list of tests to be executed
TESTS="$(./openshift-tests run --dry-run --provider "${TEST_PROVIDER}" "${TEST_SUITE}")"
# Store the list of tests in /tmp/tsts file
echo "${TESTS}" | grep -v "$TEST_SKIPS" > /tmp/tsts
# Remove previously-existing temporary files as well as previous results
OUT=RESULTS/tmp
rm -Rf /tmp/*test* /tmp/e2e-*
rm -Rf "$OUT"
mkdir -p "$OUT"
# Run the tests ignoring the monitor health checks
./openshift-tests run --provider azure -o "$OUT/job.log" --junit-dir "$OUT" --file /tmp/tsts --max-parallel-tests 5 --cluster-stability Disruptive --run '^\[sig-node\].*|^\[sig-network\]'
```
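To see how the ``TEST_SKIPS`` filter behaves, here is a minimal sketch using a
made-up test list. The test names and the shortened skip pattern are
illustrative only and are not taken from the pipeline config. It assumes GNU
grep, where ``\|`` separates alternatives in a basic regular expression and
``\[`` matches a literal bracket:
```bash
#!/bin/bash -e
# Hypothetical, shortened skip pattern in the same style as TEST_SKIPS
SKIPS="\[sig-network\].*bond\|\[sig-node\] Variable Expansion"
# Hypothetical list standing in for the `openshift-tests run --dry-run` output
SAMPLE_TESTS='"[sig-node] Pods should start"
"[sig-network] uses bond interface"
"[sig-node] Variable Expansion should allow substituting values"'
# Keep only the tests that do not match any of the skip alternatives
filter_tests() {
    echo "$SAMPLE_TESTS" | grep -v "$SKIPS"
}
filter_tests
```
Only the first sample test survives the filter, because the other two each
match one of the alternatives.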
[!NOTE]
Note we are ignoring the cluster stability checks because our public cloud is
not that stable and running with VMs instead of containers results in minor
stability issues. Some of the old monitor stability tests do not reflect
the ``--cluster-stability`` setting, one should simply ignore these. If you
get a message like ``invariant was violated`` or ``error: failed due to a
MonitorTest failure``, it usually indicates that only those kinds of
tests failed while the real tests passed. See
[wrapped-openshift-tests.sh](https://github.com/openshift/release/blob/master/ci-operator/config/kata-containers/kata-containers/wrapped-openshift-tests.sh)
for details on how our pipeline deals with that.
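As a rough local aid, the two messages quoted in the note above can be searched
for in the job log. This is only a sketch: the function name is hypothetical
and it does not reproduce what ``wrapped-openshift-tests.sh`` does.
```bash
#!/bin/bash -e
# Return 0 when the given log contains one of the monitor-related messages
# quoted in the note above, non-zero otherwise (hypothetical helper).
has_monitor_failure() {
    local log_file="$1"
    grep -q -e "invariant was violated" \
        -e "error: failed due to a MonitorTest failure" "$log_file"
}
# Usage: inspect the job log written by the run above, if it exists
if [ -f RESULTS/tmp/job.log ] && has_monitor_failure RESULTS/tmp/job.log; then
    echo "Monitor-related failure reported, check the junit results for real test failures"
fi
```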
[!TIP]
To compare multiple results locally one can use
[junit2html](https://github.com/inorton/junit2html) tool.
Best-effort kata-containers cleanup
===================================
If you need to clean up the cluster after testing, you can use the
``cleanup.sh`` script from the current directory. It tries to delete all
resources created by ``test.sh`` as well as ``cluster/deploy_webhook.sh``
ignoring all failures. The primary purpose of this script is to allow
soft-cleanup after deployment to test different versions without
re-provisioning everything.
[!WARNING]
Do not rely on this script in production, return codes are not checked!
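The best-effort approach can be illustrated with a small wrapper that reports
a failing step and carries on. The helper name is hypothetical; ``cleanup.sh``
itself simply does not check return codes.
```bash
#!/bin/bash
# Run a command, report a failure on stderr, but never propagate it to the
# caller (hypothetical helper, for illustration only).
best_effort() {
    "$@" || echo "Ignoring failure of: $*" >&2
    return 0
}
# A failing step does not stop the following ones
best_effort false
best_effort echo "cleanup step finished"
```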
Bisecting e2e tests failures
============================
Let's say the OCP pipeline passed running with
``quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``
but failed running with
``quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``
and you'd like to know which PR caused the regression. You can either run with
all the 60 tags between or you can utilize the [bisecter](https://github.com/ldoktor/bisecter)
to optimize the number of steps in between.
Before running the bisection you need a reproducer script. A sample one called
``sample-test-reproducer.sh`` is provided in this directory but you might
want to copy and modify it, especially:
* ``OCP_DIR`` - directory where your openshift/release is located (can be exported)
* ``E2E_TEST`` - openshift-test(s) to be executed (can be exported)
* behaviour of SETUP (returning 125 skips the current image tag, returning
>=128 interrupts the execution, everything else reports the tag as failure)
* what should be executed (perhaps running the setup is enough for you or
you might want to be looking for specific failures...)
* use ``timeout`` to interrupt execution in case you know things should be faster
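The return-code convention from the list above can be sketched as a small
classifier. The function name and the printed words are illustrative, and
treating 0 as success is an assumption that the list does not state explicitly.
```bash
#!/bin/bash
# Map a reproducer exit code to the outcome described in the list above.
# The name and the output words are hypothetical.
classify_exit_code() {
    local rc="$1"
    if [ "$rc" -eq 0 ]; then
        echo "good"      # assumption: 0 means the step passed
    elif [ "$rc" -eq 125 ]; then
        echo "skip"      # the current image tag is skipped
    elif [ "$rc" -ge 128 ]; then
        echo "abort"     # the whole execution is interrupted
    else
        echo "bad"       # everything else marks the tag as a failure
    fi
}
classify_exit_code 0
classify_exit_code 125
classify_exit_code 130
classify_exit_code 1
```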
Executing that script with the GOOD commit should pass
``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``
and fail when executed with the BAD commit
``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``.
To get the list of all tags in between those two PRs you can use the
``bisect-range.sh`` script
```bash
./bisect-range.sh d7afd31fd40e37a675b25c53618904ab57e74ccd 9f512c016e75599a4a921bd84ea47559fe610057
```
[!NOTE]
The tagged images are only built per PR, not for individual commits. See
[kata-deploy-ci](https://quay.io/kata-containers/kata-deploy-ci) to see the
available images.
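The output of ``bisect-range.sh`` is a single comma-separated list of image
references. The sketch below mimics only the final formatting step of that
script. The ``format_tags`` helper is hypothetical, and the short commit IDs
are placeholders for the full SHAs used in real tags.
```bash
#!/bin/bash -e
REPO="quay.io/kata-containers/kata-deploy-ci"
ARCH=amd64
# Turn one commit ID per line into a comma-separated list of image references
format_tags() {
    sed -e "s@^@$REPO:kata-containers-@" -e "s@\$@-$ARCH@" | paste -s -d, -
}
# Placeholder commit IDs, oldest first
printf '%s\n' aaaa bbbb | format_tags
```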
To find out which PR caused this regression, you can either manually try the
individual commits or you can simply execute:
```bash
bisecter start "$(./bisect-range.sh d7afd31fd40 9f512c016)"
OCP_DIR=/path/to/openshift/release bisecter run ./sample-test-reproducer.sh
```
[!NOTE]
If you use ``KATA_WITH_SYSTEM_QEMU=yes`` you might want to deploy once with
it and skip it for the cleanup. That way you might (in most cases) test
all images with a single MCP update instead of per-image MCP update.
[!TIP]
You can check the bisection progress during/after execution by running
``bisecter log`` from the current directory. Before starting a new
bisection you need to execute ``bisecter reset``.

27
ci/openshift-ci/bisect-range.sh Executable file
View File

@@ -0,0 +1,27 @@
#!/bin/bash
# Copyright (c) 2024 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
if [ "$#" -gt 2 ] || [ "$#" -lt 1 ] ; then
echo "Usage: $0 GOOD [BAD]"
echo "Prints list of available kata-deploy-ci tags between GOOD and BAD commits (by default BAD is the latest available tag)"
exit 255
fi
GOOD="$1"
[ -n "$2" ] && BAD="$2"
ARCH=amd64
REPO="quay.io/kata-containers/kata-deploy-ci"
TAGS=$(skopeo list-tags "docker://$REPO")
# Only amd64
TAGS=$(echo "$TAGS" | jq '.Tags' | jq "map(select(endswith(\"$ARCH\")))" | jq -r '.[]')
# Sort by git
SORTED=""
[ -n "$BAD" ] && LOG_ARGS="$GOOD~1..$BAD" || LOG_ARGS="$GOOD~1.."
for TAG in $(git log --merges --pretty=format:%H --reverse $LOG_ARGS); do
[[ "$TAGS" =~ "$TAG" ]] && SORTED+="
kata-containers-$TAG-$ARCH"
done
# Comma separated tags with repo
echo "$SORTED" | tail -n +2 | sed -e "s@^@$REPO:@" | paste -s -d, -

59
ci/openshift-ci/cleanup.sh Executable file
View File

@@ -0,0 +1,59 @@
#!/bin/bash
#
# Copyright (c) 2024 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# This script tries to remove most of the resources added by `test.sh` script
# from the cluster.
scripts_dir=$(dirname $0)
deployments_dir=${scripts_dir}/cluster/deployments
configs_dir=${scripts_dir}/configs
source ${scripts_dir}/lib.sh
# Set to 'yes' if you want to configure SELinux to permissive on the cluster
# workers.
#
SELINUX_PERMISSIVE=${SELINUX_PERMISSIVE:-no}
# Enable workaround for OCP 4.13 https://github.com/kata-containers/kata-containers/pull/9206
#
WORKAROUND_9206_CRIO=${WORKAROUND_9206_CRIO:-no}
# Ignore errors as we want best-effort-approach here
trap - ERR
# Delete webhook resources
oc delete -f "${scripts_dir}/../../tools/testing/kata-webhook/deploy"
oc delete -f "${scripts_dir}/cluster/deployments/configmap_kata-webhook.yaml.in"
# Delete potential smoke-test resources
oc delete -f "${scripts_dir}/smoke/service.yaml"
oc delete -f "${scripts_dir}/smoke/service_kubernetes.yaml"
oc delete -f "${scripts_dir}/smoke/http-server.yaml"
# Delete test.sh resources
oc delete -f "${deployments_dir}/relabel_selinux.yaml"
if [[ "$WORKAROUND_9206_CRIO" == "yes" ]]; then
oc delete -f "${deployments_dir}/workaround-9206-crio-ds.yaml"
oc delete -f "${deployments_dir}/workaround-9206-crio.yaml"
fi
[ ${SELINUX_PERMISSIVE} == "yes" ] && oc delete -f "${deployments_dir}/machineconfig_selinux.yaml.in"
# Delete kata-containers
pushd "$katacontainers_repo_dir/tools/packaging/kata-deploy"
oc delete -f kata-deploy/base/kata-deploy.yaml
oc -n kube-system wait --timeout=10m --for=delete -l name=kata-deploy pod
oc apply -f kata-cleanup/base/kata-cleanup.yaml
echo "Wait for all related pods to be gone"
( repeats=1; for i in $(seq 1 600); do
oc get pods -l name="kubelet-kata-cleanup" --no-headers=true -n kube-system 2>&1 | grep "No resources found" -q && ((repeats++)) || repeats=1
[ "$repeats" -gt 5 ] && echo kata-cleanup finished && break
sleep 1
done) || { echo "There are still some kata-cleanup related pods after 600 iterations"; oc get all -n kube-system; exit -1; }
oc delete -f kata-cleanup/base/kata-cleanup.yaml
oc delete -f kata-rbac/base/kata-rbac.yaml
oc delete -f runtimeclasses/kata-runtimeClasses.yaml

View File

@@ -0,0 +1,6 @@
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
SELINUX=permissive
SELINUXTYPE=targeted

View File

@@ -0,0 +1,36 @@
#!/bin/bash
#
# Copyright (c) 2021 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# This script builds the kata-webhook and deploys it in the test cluster.
#
# You should export the KATA_RUNTIME variable with the runtimeclass name
# configured in your cluster in case it is not the default "kata-ci".
#
set -e
set -o nounset
set -o pipefail
script_dir="$(realpath $(dirname $0))"
webhook_dir="${script_dir}/../../../tools/testing/kata-webhook"
source "${script_dir}/../lib.sh"
KATA_RUNTIME=${KATA_RUNTIME:-kata-ci}
pushd "${webhook_dir}" >/dev/null
# Build and deploy the webhook
#
info "Builds the kata-webhook"
./create-certs.sh
info "Deploys the kata-webhook"
oc apply -f deploy/
info "Override our KATA_RUNTIME ConfigMap"
RUNTIME_CLASS="${KATA_RUNTIME}" \
envsubst < "${script_dir}/deployments/configmap_kata-webhook.yaml.in" \
| oc apply -f -
# Check the webhook was deployed and is working.
RUNTIME_CLASS="${KATA_RUNTIME}" ./webhook-check.sh
popd >/dev/null

View File

@@ -0,0 +1,13 @@
# Copyright (c) 2021 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Instruct the daemonset installer to configure Kata Containers to use the
# host kernel.
#
apiVersion: v1
kind: ConfigMap
metadata:
name: ci.kata.installer.kernel
data:
host_kernel: "yes"

View File

@@ -0,0 +1,14 @@
# Copyright (c) 2021 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Instruct the daemonset installer to configure Kata Containers to use the
# system QEMU.
#
apiVersion: v1
kind: ConfigMap
metadata:
name: ci.kata.installer.qemu
data:
qemu_path: /usr/libexec/qemu-kvm
host_kernel: "yes"

View File

@@ -0,0 +1,12 @@
# Copyright (c) 2021 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Apply customizations to the kata-webhook.
#
apiVersion: v1
kind: ConfigMap
metadata:
name: kata-webhook
data:
runtime_class: ${RUNTIME_CLASS}

View File

@@ -0,0 +1,9 @@
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 50-enable-sandboxed-containers-extension
spec:
extensions:
- sandboxed-containers

View File

@@ -0,0 +1,23 @@
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Configure SELinux on worker nodes.
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 51-kata-selinux
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain;charset=utf-8;base64,${SELINUX_CONF_BASE64}
filesystem: root
mode: 0644
path: /etc/selinux/config

View File

@@ -0,0 +1,40 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: relabel-selinux-daemonset
namespace: kube-system
spec:
selector:
matchLabels:
app: restorecon
template:
metadata:
labels:
app: restorecon
spec:
serviceAccountName: kata-deploy-sa
hostPID: true
containers:
- name: relabel-selinux-container
image: alpine
securityContext:
privileged: true
command: ["/bin/sh", "-c", "
set -e;
echo Starting the relabel;
nsenter --target 1 --mount bash -xc '
command -v semanage &>/dev/null || { echo Does not look like a SELINUX cluster, skipping; exit 0; };
for ENTRY in \
\"/(.*/)?opt/kata/bin(/.*)?\" \
\"/(.*/)?opt/kata/runtime-rs/bin(/.*)?\" \
\"/(.*/)?opt/kata/share/kata-.*(/.*)?(/.*)?\" \
\"/(.*/)?opt/kata/share/ovmf(/.*)?\" \
\"/(.*/)?opt/kata/share/tdvf(/.*)?\" \
\"/(.*/)?opt/kata/libexec(/.*)?\";
do
semanage fcontext -a -t qemu_exec_t \"$ENTRY\" || semanage fcontext -m -t qemu_exec_t \"$ENTRY\" || { echo \"Error in semanage command\"; exit 1; }
done;
restorecon -v -R /opt/kata || { echo \"Error in restorecon command\"; exit 1; }
';
echo NSENTER_FINISHED_WITH: $?;
sleep infinity"]

View File

@@ -0,0 +1,28 @@
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: workaround-9206-crio-ds
spec:
selector:
matchLabels:
app: workaround-9206-crio-ds
template:
metadata:
labels:
app: workaround-9206-crio-ds
spec:
containers:
- name: workaround-9206-crio-ds
image: alpine
volumeMounts:
- name: host-dir
mountPath: /tmp/config
securityContext:
runAsUser: 0
privileged: true
command: ["/bin/sh", "-c", "while [ ! -f '/tmp/config/10-workaround-9206-crio' ]; do sleep 1; done; echo 'Config file present'; sleep infinity"]
volumes:
- name: host-dir
hostPath:
path: /etc/crio/crio.conf.d/

View File

@@ -0,0 +1,18 @@
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 10-workaround-9206-crio
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain;charset=utf-8;base64,W2NyaW9dCnN0b3JhZ2Vfb3B0aW9uID0gWwoJIm92ZXJsYXkuc2tpcF9tb3VudF9ob21lPXRydWUiLApdCg==
filesystem: root
mode: 0644
path: /etc/crio/crio.conf.d/10-workaround-9206-crio

View File

@@ -0,0 +1,245 @@
#!/bin/bash
#
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# This script installs the built kata-containers in the test cluster,
# and configures a runtime.
scripts_dir=$(dirname $0)
deployments_dir=${scripts_dir}/deployments
configs_dir=${scripts_dir}/configs
source ${scripts_dir}/../lib.sh
# Set to 'yes' if you want to configure SELinux to permissive on the cluster
# workers.
#
SELINUX_PERMISSIVE=${SELINUX_PERMISSIVE:-no}
# Set to 'yes' if you want to configure Kata Containers to use the system's
# QEMU (from the RHCOS extension).
#
KATA_WITH_SYSTEM_QEMU=${KATA_WITH_SYSTEM_QEMU:-no}
# Set to 'yes' if you want to configure Kata Containers to use the host kernel.
#
KATA_WITH_HOST_KERNEL=${KATA_WITH_HOST_KERNEL:-no}
# kata-deploy image to be used to deploy the kata (by default use CI image
# that is built for each pull request)
#
KATA_DEPLOY_IMAGE=${KATA_DEPLOY_IMAGE:-quay.io/kata-containers/kata-deploy-ci:kata-containers-latest}
# Enable workaround for OCP 4.13 https://github.com/kata-containers/kata-containers/pull/9206
#
WORKAROUND_9206_CRIO=${WORKAROUND_9206_CRIO:-no}
# Leverage kata-deploy to install Kata Containers in the cluster.
#
apply_kata_deploy() {
local deploy_file="tools/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml"
pushd "$katacontainers_repo_dir"
sed -ri "s#(\s+image:) .*#\1 ${KATA_DEPLOY_IMAGE}#" "$deploy_file"
info "Applying kata-deploy"
oc apply -f tools/packaging/kata-deploy/kata-rbac/base/kata-rbac.yaml
oc label --overwrite ns kube-system pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline
oc apply -f "$deploy_file"
oc -n kube-system wait --timeout=10m --for=condition=Ready -l name=kata-deploy pod
info "Adding the kata runtime classes"
oc apply -f tools/packaging/kata-deploy/runtimeclasses/kata-runtimeClasses.yaml
popd
}
# Wait for all worker nodes to reboot.
#
# Params:
# $1 - timeout in seconds (default to 900).
#
wait_for_reboot() {
local delta="${1:-900}"
local sleep_time=60
declare -A BOOTIDS
local workers=($(oc get nodes | \
awk '{if ($3 == "worker") { print $1 } }'))
# Get the boot ID to check whether it changes over time.
for node in ${workers[@]}; do
BOOTIDS[$node]=$(oc get -o jsonpath='{.status.nodeInfo.bootID}'\
node/$node)
echo "Wait $node reboot"
done
echo "Set timeout to $delta seconds"
timer_start=$(date +%s)
while [ ${#workers[@]} -gt 0 ]; do
sleep $sleep_time
now=$(date +%s)
if [ $(($timer_start + $delta)) -lt $now ]; then
echo "Timeout: not all workers rebooted"
return 1
fi
echo "Checking after $(($now - $timer_start)) seconds"
for i in ${!workers[@]}; do
current_id=$(oc get \
-o jsonpath='{.status.nodeInfo.bootID}' \
node/${workers[i]})
if [ "$current_id" != ${BOOTIDS[${workers[i]}]} ]; then
echo "${workers[i]} rebooted"
unset workers[i]
fi
done
done
}
wait_mcp_update() {
local delta="${1:-3600}"
local sleep_time=30
# The machineconfigpool is fine when all the workers updated and are ready,
# and none are degraded.
local ready_count=0
local degraded_count=0
local machine_count=$(oc get mcp worker -o jsonpath='{.status.machineCount}')
if [[ -z "$machine_count" || "$machine_count" -lt 1 ]]; then
warn "Unable to obtain the machine count"
return 1
fi
echo "Set timeout to $delta seconds"
local deadline=$(($(date +%s) + $delta))
# The ready count might not have changed yet, so wait a little.
while [[ "$ready_count" != "$machine_count" && \
"$degraded_count" == 0 ]]; do
# Let's check it hit the timeout (or not).
local now=$(date +%s)
if [ $deadline -lt $now ]; then
echo "Timeout: not all workers updated" >&2
return 1
fi
sleep $sleep_time
ready_count=$(oc get mcp worker \
-o jsonpath='{.status.readyMachineCount}')
degraded_count=$(oc get mcp worker \
-o jsonpath='{.status.degradedMachineCount}')
echo "check machineconfigpool - ready_count: $ready_count degraded_count: $degraded_count"
done
[ $degraded_count -eq 0 ]
}
# Enable the RHCOS extension for the Sandboxed Containers.
#
enable_sandboxedcontainers_extension() {
info "Enabling the RHCOS extension for Sandboxed Containers"
local deployment_file="${deployments_dir}/machineconfig_sandboxedcontainers_extension.yaml"
oc apply -f ${deployment_file}
oc get -f ${deployment_file} || \
die "Sandboxed Containers extension machineconfig not found"
wait_mcp_update || die "Failed to update the machineconfigpool"
}
# Print useful information for debugging.
#
# Params:
# $1 - the pod name
debug_pod() {
local pod="$1"
info "Debug pod: ${pod}"
oc describe pods "$pod"
oc logs "$pod"
}
# Wait for all pods of the app label to contain expected message
#
# Params:
# $1 - app label
# $2 - expected pods count (>=1)
# $3 - message to be present in the logs
# $4 - timeout (60)
# $5 - namespace (the current one)
wait_for_app_pods_message() {
local app="$1"
local pod_count="$2"
local message="$3"
local timeout="$4"
local namespace="$5"
[ -z "$pod_count" ] && pod_count=1
[ -z "$timeout" ] && timeout=60
[ -n "$namespace" ] && namespace=" -n $namespace "
local pod
local pods
local i
SECONDS=0
while :; do
pods=($(oc get pods -l app="$app" --no-headers=true $namespace | awk '{print $1}'))
[ "${#pods[@]}" -ge "$pod_count" ] && break
if [ "$SECONDS" -gt "$timeout" ]; then
echo "Unable to find ${pod_count} pods for '-l app=\"$app\"' in ${SECONDS}s (${pods[@]})"
return -1
fi
done
for pod in "${pods[@]}"; do
while :; do
local log=$(oc logs $namespace "$pod")
echo "$log" | grep "$message" -q && echo "Found $(echo "$log" | grep "$message") in $pod's log ($SECONDS)" && break;
if [ "$SECONDS" -gt "$timeout" ]; then
echo -n "Message '$message' not present in '${pod}' pod of the '-l app=\"$app\"' "
echo "pods after ${SECONDS}s (${pods[@]})"
echo "Pod $pod's output so far:"
echo "$log"
return -1
fi
sleep 1;
done
done
}
oc config set-context --current --namespace=default
worker_nodes=$(oc get nodes | awk '{if ($3 == "worker") { print $1 } }')
num_nodes=$(echo $worker_nodes | wc -w)
[ $num_nodes -ne 0 ] || \
die "No worker nodes detected. Something is wrong with the cluster"
if [ "${KATA_WITH_SYSTEM_QEMU}" == "yes" ]; then
# QEMU is deployed on the workers via RHCOS extension.
enable_sandboxedcontainers_extension
oc apply -f ${deployments_dir}/configmap_installer_qemu.yaml
fi
if [ "${KATA_WITH_HOST_KERNEL}" == "yes" ]; then
oc apply -f ${deployments_dir}/configmap_installer_kernel.yaml
fi
apply_kata_deploy
# Set SELinux to permissive mode
if [ ${SELINUX_PERMISSIVE} == "yes" ]; then
info "Configuring SELinux"
if [ -z "$SELINUX_CONF_BASE64" ]; then
export SELINUX_CONF_BASE64=$(echo \
$(cat $configs_dir/selinux.conf|base64) | \
sed -e 's/\s//g')
fi
envsubst < ${deployments_dir}/machineconfig_selinux.yaml.in | \
oc apply -f -
oc get machineconfig/51-kata-selinux || \
die "SELinux machineconfig not found"
# The new SELinux configuration will trigger another reboot.
wait_for_reboot
fi
if [[ "$WORKAROUND_9206_CRIO" == "yes" ]]; then
info "Applying workaround to enable skip_mount_home in crio on OCP 4.13"
oc apply -f "${deployments_dir}/workaround-9206-crio.yaml"
oc apply -f "${deployments_dir}/workaround-9206-crio-ds.yaml"
wait_for_app_pods_message workaround-9206-crio-ds "$num_nodes" "Config file present" 1200 || echo "Failed to apply the workaround, proceeding anyway..."
fi
# FIXME: Remove when https://github.com/kata-containers/kata-containers/pull/8417 is resolved
# SELinux context is currently not handled by kata-deploy
oc apply -f ${deployments_dir}/relabel_selinux.yaml
wait_for_app_pods_message restorecon "$num_nodes" "NSENTER_FINISHED_WITH:" 120 "kube-system" || echo "Failed to treat selinux, proceeding anyway..."


@@ -4,7 +4,7 @@
#
# This is the build root image for Kata Containers on OpenShift CI.
#
FROM quay.io/centos/centos:stream8
FROM quay.io/centos/centos:stream9
RUN yum -y update && \
yum -y install \

20
ci/openshift-ci/lib.sh Normal file

@@ -0,0 +1,20 @@
#!/usr/bin/env bash
#
# Copyright (c) 2023 Red Hat
#
# SPDX-License-Identifier: Apache-2.0
#
# Ensure GOPATH set
if command -v go > /dev/null; then
export GOPATH=${GOPATH:-$(go env GOPATH)}
else
# if go isn't installed, set default location for GOPATH
export GOPATH="${GOPATH:-$HOME/go}"
fi
lib_dir=$(dirname "${BASH_SOURCE[0]}")
source "$lib_dir/../../tests/common.bash"
export katacontainers_repo=${katacontainers_repo:="github.com/kata-containers/kata-containers"}
export katacontainers_repo_dir="${GOPATH}/src/${katacontainers_repo}"


@@ -0,0 +1,94 @@
#!/bin/bash
#
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Run a smoke test.
#
script_dir=$(dirname $0)
source ${script_dir}/lib.sh
pod='http-server'
# Create a pod.
#
info "Creating the ${pod} pod"
[ -z "$KATA_RUNTIME" ] && die "Please set the KATA_RUNTIME first"
envsubst < "${script_dir}/smoke/${pod}.yaml.in" | \
oc apply -f - || \
die "failed to create ${pod} pod"
# Check it eventually goes to 'running'
#
wait_time=600
sleep_time=5
cmd="oc get pod/${pod} -o jsonpath='{.status.containerStatuses[0].state}' | \
grep running > /dev/null"
info "Wait until the pod gets running"
waitForProcess $wait_time $sleep_time "$cmd" || timed_out=$?
if [ -n "$timed_out" ]; then
oc describe pod/${pod}
oc delete pod/${pod}
die "${pod} not running"
fi
info "${pod} is running"
# Add a file with the hello message
#
hello_file=/tmp/hello
hello_msg='Hello World'
oc exec ${pod} -- sh -c "echo $hello_msg > $hello_file"
info "Creating the service and route"
if oc apply -f ${script_dir}/smoke/service.yaml; then
# Likely on OCP, use service
is_ocp=1
host=$(oc get route/http-server-route -o jsonpath={.spec.host})
port=80
else
# Likely on plain kubernetes, test using another container
is_ocp=0
info "Failed to create service, likely not on OCP, trying via NodePort"
oc apply -f "${script_dir}/smoke/service_kubernetes.yaml"
# For some reason kcli's cluster lists external IP as internal IP, try both
host=$(oc get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
[ -z "$host" ] && host=$(oc get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
port=$(oc get service/http-server-service -o jsonpath='{.spec.ports[0].nodePort}')
fi
info "Wait for the HTTP server to respond"
tempfile=$(mktemp)
check_cmd="curl -vvv '${host}:${port}${hello_file}' 2>&1 | tee -a '$tempfile' | grep -q '$hello_msg'"
if waitForProcess 60 1 "${check_cmd}"; then
test_status=0
info "HTTP server is working"
else
test_status=1
echo "::error:: HTTP server not working"
echo "::group::Output of the \"curl -vvv '${host}:${port}${hello_file}'\""
cat "${tempfile}"
echo "::endgroup::"
echo "::group::Describe kube-system namespace"
oc describe -n kube-system all
echo "::endgroup::"
echo "::group::Describe current namespace"
oc describe all
echo "::endgroup::"
info "HTTP server is unreachable"
fi
rm -f "$tempfile"
# Delete the resources.
#
info "Deleting the service/route"
if [ "$is_ocp" -eq 0 ]; then
oc delete -f ${script_dir}/smoke/service_kubernetes.yaml
else
oc delete -f ${script_dir}/smoke/service.yaml
fi
info "Deleting the ${pod} pod"
oc delete pod/${pod} || test_status=$?
exit $test_status


@@ -0,0 +1,50 @@
#!/bin/bash
# Copyright (c) 2024 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# A sample script that deploys and configures kata-containers on an OCP
# cluster, runs E2E_TEST, and soft-cleans up afterwards. Primarily created
# for use with https://github.com/ldoktor/bisecter
[ "$#" -ne 1 ] && echo "Provide image as the first and only argument" && exit 255
export KATA_DEPLOY_IMAGE="$1"
OCP_DIR="${OCP_DIR:-/path/to/your/openshift/release/}"
E2E_TEST="${E2E_TEST:-'"[sig-node] Container Runtime blackbox test on terminated container should report termination message as empty when pod succeeds and TerminationMessagePolicy FallbackToLogsOnError is set [NodeConformance] [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]"'}"
KATA_CI_DIR="${KATA_CI_DIR:-$(pwd)}"
export KATA_RUNTIME="${KATA_RUNTIME:-kata-qemu}"
## SETUP
# Deploy kata
SETUP=0
pushd "$KATA_CI_DIR" || { echo "Failed to cd to '$KATA_CI_DIR'"; exit 255; }
./test.sh || SETUP=125
cluster/deploy_webhook.sh || SETUP=125
if [ $SETUP != 0 ]; then
./cleanup.sh
exit "$SETUP"
fi
popd || true
# Disable security
oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts
oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts
oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline
## TEST EXECUTION
# Run the tests
pushd "$OCP_DIR" || { echo "Failed to cd to '$OCP_DIR'"; exit 255; }
echo "$E2E_TEST" > /tmp/tsts
# Remove previously existing temporary files as well as previous results
OUT=RESULTS/tmp
rm -Rf /tmp/*test* /tmp/e2e-*
rm -Rf "$OUT"
mkdir -p $OUT
# Run the tests ignoring the monitor health checks
./openshift-tests run --provider azure -o "$OUT/job.log" --junit-dir "$OUT" --file /tmp/tsts --max-parallel-tests 5 --cluster-stability Disruptive
RET=$?
popd || true
## CLEANUP
./cleanup.sh
exit "$RET"


@@ -0,0 +1,30 @@
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Define the pod for a http server app.
---
apiVersion: v1
kind: Pod
metadata:
  name: http-server
  labels:
    app: http-server-app
spec:
  containers:
    - name: http-server
      image: docker.io/library/python:3
      ports:
        - containerPort: 8080
      command: ["python3"]
      args: [ "-m", "http.server", "8080"]
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        runAsNonRoot: true
        runAsUser: 1000
        seccompProfile:
          type: RuntimeDefault
  runtimeClassName: ${KATA_RUNTIME}


@@ -0,0 +1,28 @@
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Create the service on port 80 for the http-server app.
---
apiVersion: v1
kind: Service
metadata:
  name: http-server-service
spec:
  selector:
    app: http-server-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
# Create the route to the app's service '/'.
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: http-server-route
spec:
  path: "/"
  to:
    kind: Service
    name: http-server-service


@@ -0,0 +1,18 @@
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Create the service on port 80 for the http-server app.
---
apiVersion: v1
kind: Service
metadata:
  name: http-server-service
spec:
  selector:
    app: http-server-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: NodePort

32
ci/openshift-ci/test.sh Executable file

@@ -0,0 +1,32 @@
#!/bin/bash
#
# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# The kata shim to be used
export KATA_RUNTIME=${KATA_RUNTIME:-kata-qemu}
script_dir=$(dirname $0)
source ${script_dir}/lib.sh
suite=$1
if [ -z "$1" ]; then
suite='smoke'
fi
# Make oc and kubectl visible
export PATH=/tmp/shared:$PATH
oc version || die "Test cluster is unreachable"
info "Install and configure kata into the test cluster"
export SELINUX_PERMISSIVE="no"
${script_dir}/cluster/install_kata.sh || die "Failed to install kata-containers"
info "Run test suite: $suite"
test_status='PASS'
${script_dir}/run_${suite}_test.sh || test_status='FAIL'
info "Test suite: $suite: $test_status"
[ "$test_status" == "PASS" ]


@@ -1,21 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2019 Ant Financial
#
# SPDX-License-Identifier: Apache-2.0
#
set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
export CI_JOB="${CI_JOB:-}"
clone_tests_repo
pushd ${tests_repo_dir}
.ci/run.sh
# temporary fix, see https://github.com/kata-containers/tests/issues/3878
if [ "$(uname -m)" != "s390x" ] && [ "$CI_JOB" == "CRI_CONTAINERD_K8S_MINIMAL" ]; then
tracing/test-agent-shutdown.sh
fi
popd

Some files were not shown because too many files have changed in this diff