Commit Graph

11868 Commits

Author SHA1 Message Date
David Esparza
f7ef45b167
Merge pull request #8077 from fidencio/topic/kata-deploy-ship-the-tools
kata-deploy: build & ship the rust components from src/tools/
2023-09-28 09:59:19 -06:00
Zvonko Kaiser
7c934dc7da gpu: Fix cold-plug of VFIO devices
We need to do proper sandbox sizing when we're doing cold-plug. Introduce CDI,
the de-facto standard for enabling devices in containers. containerd
will pass through annotations for accumulated CPU, memory and now CDI
devices. With that information sandbox sizing can be derived correctly.

Fixes: #7331

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-09-28 09:49:13 +00:00
GabyCT
fcc755fc3b
Merge pull request #8068 from GabyCT/topic/limitlatency
metrics: Add latency value limits for kata CI
2023-09-27 13:28:41 -06:00
Greg Kurz
defbb64ac8
Merge pull request #8036 from rye-stripe/bugfix/overhead-metrics
runtime: fix reading cgroup stats of sandboxes
2023-09-27 19:39:55 +02:00
Archana Shinde
95455e6fe8
Merge pull request #8058 from likebreath/0925/clh_v35.0
Upgrade to Cloud Hypervisor v35.0
2023-09-27 10:39:32 -07:00
Gabriela Cervantes
8d66ef5185 metrics: Increase qemu jitter value
This PR increases qemu jitter value.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:31:07 +00:00
Gabriela Cervantes
5600e28b54 metrics: Increase jitter value for clh
This PR increases jitter value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:30:19 +00:00
Fabiano Fidêncio
a6b1f5e21b ci: Build src/tools components as part of our tests / releases
Build those as part of our CI and release workflows.

Fixes #5520 #5348

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:50:25 +02:00
Fabiano Fidêncio
501a168a81 kata-deploy: Build components from src/tools
Let's add targets and actually enable users and ourselves to build those
components in the same way we build the rest of the project.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:49:02 +02:00
Fabiano Fidêncio
6ef42db5ec static-build: Add scripts to build content from src/tools
As we'd like to ship the content from src/tools, we need to build them
in the very same way we build the other components, and the first step
is providing scripts that can build those inside a container.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:56 +02:00
Fabiano Fidêncio
4d08ec29bc packaging: Add get_tools_image_name()
This will be used for building all the (rust) components from src/tools.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:35 +02:00
Fabiano Fidêncio
98097c96de packaging: Use git abbreviated hash
This will make it easier to build images that rely on the hashes of
several directories.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 18:48:30 +02:00
Fabiano Fidêncio
8b25e90027
Merge pull request #8075 from fidencio/topic/ci-add-kata-monitor-tests
ci: Port kata-monitor tests from Jenkins to GHA
2023-09-27 15:48:46 +02:00
Fabiano Fidêncio
489caf1ad0 ci: kata-monitor: Move tests over
Let's move, adapt, and use the kata-monitor tests from the tests repo.
In this PR I'm keeping the SoB from every single contributor who
touched those tests in the past.

Fixes: #8074

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-27 11:40:31 +02:00
Fabiano Fidêncio
a3fb067f1b ci: Add placeholder for kata-monitor tests
The kata-monitor tests are currently running as part of the Jenkins CI
with the following setups:
* Container Engines: CRI-O | containerd
* VMMs: QEMU

When using containerd, we're testing it with:
* Snapshotter: overlayfs | devmapper

We will stop running those tests on devmapper / overlayfs as that would
hardly reveal a functional issue.

Also, we're restricting this to run with the LTS version of containerd,
when containerd is used.

As is known, due to our GHA limitation, this is just a placeholder and
the tests will actually be added in the next iterations.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
57cb4ce204 ci: Make install_kata aware of container engines
This will help us when running tests using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
de1eeee334 ci: Create a generic install_crio function
This will serve us quite well in the upcoming test additions, which will
also have to be executed using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
64a2000859 ci: Add install_cni_plugins helper
This will come in handy when doing tests with CRI-O, as CRI-O doesn't
install the CNI plugins for us.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
8132fe15c9 ci: Modify containerd default config
Let's ensure we have runc running with `SystemdCgroup = false`,
otherwise we'll face failures when running tests depending on runc on
Ubuntu 22.04, with LTS containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:16:12 +02:00
Gabriela Cervantes
8cb7df1bed metrics: Add checkmetrics for latency test
This PR adds the checkmetrics for latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 19:11:08 +00:00
Gabriela Cervantes
e90440ae24 metrics: Add qemu latency value limit
This PR adds the qemu latency value limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:30:09 +00:00
Gabriela Cervantes
a74a8f8a9d metrics: Add latency value limits for kata CI
This PR adds latency value limits for kata CI.

Fixes #8067

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:29:07 +00:00
Gabriela Cervantes
d7def8317a metrics: Fix general check static warnings
This PR fixes general check static warnings.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 16:30:59 +00:00
GabyCT
309103169d
Merge pull request #8056 from GabyCT/topic/fixlatencypath
metrics: Fix latency yamls path
2023-09-26 10:16:55 -06:00
Gabriela Cervantes
928553d1ba docs: Update url in kata vra document
This PR updates the url in kata vra document.

Fixes #8065

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 16:13:12 +00:00
GabyCT
5c0afaacf4
Merge pull request #8018 from GabyCT/topic/fixreadme
metrics: Fix metrics README
2023-09-26 09:51:47 -06:00
David Esparza
83326f89b3
Merge pull request #8054 from GabyCT/topic/fixcrdoc
metrics: Fix C-Ray documentation
2023-09-26 09:50:19 -06:00
James O. D. Hunt
31478b9c33
Merge pull request #7944 from jodh-intel/runtime-rs-ch-enable-tdx
runtime-rs: ch: Enable Intel TDX
2023-09-26 14:11:12 +01:00
James O. D. Hunt
b0a3293d53 runtime-rs: ch: Enable Intel TDX
Allow Cloud Hypervisor to create a confidential guest (a TD or
"Trust Domain") rather than a VM (Virtual Machine) on Intel systems
that provide TDX functionality.

> **Notes:**
>
> - At least currently, when built with the `tdx` feature, Cloud Hypervisor
>   cannot create a standard VM on a TDX capable system: it can only create
>   a TD. This implies that on TDX capable systems, the Kata Configuration
>   option `confidential_guest=` must be set to `true`. If it is not, Kata
>   will detect this and display the following error:
>
>   ```
>   TDX guest protection available and must be used with Cloud Hypervisor (set 'confidential_guest=true')
>   ```
>
> - This change expands the scope of the protection code, changing
>   Intel TDX specific booleans to more generic "available guest protection"
>   code that could be "none" or "TDX", or some other form of guest
>   protection.

Fixes: #6448.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 10:55:25 +01:00
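The configuration rule described in the first note of the commit above can be sketched as follows. This is only an illustration in Python, not the runtime-rs code: `GuestProtection` and `validate_guest_protection` are hypothetical names, and only the error string is taken from the commit message.

```python
from enum import Enum


class GuestProtection(Enum):
    """Hypothetical model of the "available guest protection" value."""
    NONE = "none"
    TDX = "tdx"


# Error text quoted from the commit message.
TDX_REQUIRED_ERROR = (
    "TDX guest protection available and must be used with Cloud Hypervisor "
    "(set 'confidential_guest=true')"
)


def validate_guest_protection(available: GuestProtection,
                              confidential_guest: bool) -> None:
    """Reject the one combination the commit message describes as an error.

    Other combinations are not addressed by the commit message and are
    accepted here without further checks (an assumption of this sketch).
    """
    if available is GuestProtection.TDX and not confidential_guest:
        raise ValueError(TDX_REQUIRED_ERROR)


if __name__ == "__main__":
    try:
        validate_guest_protection(GuestProtection.TDX, confidential_guest=False)
    except ValueError as err:
        print(err)
```

Modelling the protection as an enumeration rather than a TDX-specific boolean mirrors the second note: another form of guest protection would become one more member instead of one more flag.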
James O. D. Hunt
523399c329 runtime-rs: ch: Add more consts
Introduce a few new constants (for PCI segment count and FS queues) and
move the disk queue constants to `convert.rs` to allow them to be used
there too.

> **Note:**
>
> This change gives the `ShareFs` code its own set of values rather
> than relying on the disk queue constants.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
dea8065811 runtime-rs: ch: Remove unused function
Delete the `handle_pending_devices_after_boot()` function which is no
longer required.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
995f2c015f runtime-rs: ch: Only handle particular pending device types
Modify the Cloud Hypervisor `add_device()` method to add `ShareFs` and
`Network` devices to the list of pending devices since only these two
device types need to be cached before VM startup. Full details in the
comments.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
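A rough sketch of the selection rule in the commit above, written in Python rather than Rust. Every name here (`DeviceType`, `PENDING_TYPES`, `PendingDevices`) is hypothetical, and what the real `add_device()` does with other device types is not stated in the commit message, so the sketch only reports whether a device was cached.

```python
from enum import Enum, auto


class DeviceType(Enum):
    SHARE_FS = auto()
    NETWORK = auto()
    BLOCK = auto()  # hypothetical example of a type that is not cached


# Per the commit message, only these two types are cached before VM startup.
PENDING_TYPES = frozenset({DeviceType.SHARE_FS, DeviceType.NETWORK})


class PendingDevices:
    """Hypothetical holder for devices queued before the VM is started."""

    def __init__(self) -> None:
        self.pending = []

    def add_device(self, dev_type: DeviceType, name: str) -> bool:
        """Cache the device if its type requires it; return True if cached."""
        if dev_type in PENDING_TYPES:
            self.pending.append((dev_type, name))
            return True
        return False


if __name__ == "__main__":
    devices = PendingDevices()
    for dev_type, name in [(DeviceType.SHARE_FS, "fs0"),
                           (DeviceType.BLOCK, "blk0"),
                           (DeviceType.NETWORK, "net0")]:
        print(name, devices.add_device(dev_type, name))
```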
James O. D. Hunt
b1b96a5c49 runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check
Remove the `VIRTIO_BLK_MMIO` check which appears to have been added
erroneously in the first place.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
Gabriela Cervantes
9ac29b8d38 metrics: Add init_env function to latency test
This PR adds the init_env function to latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 22:06:00 +00:00
Bo Chen
dfd0c9fa9a runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8057

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-25 12:22:37 -07:00
Bo Chen
8f9f087e35 versions: Upgrade to Cloud Hypervisor v35.0
Details of this release can be found in our roadmap project as iteration
v35.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #8057

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-25 12:22:01 -07:00
Fabiano Fidêncio
a4daa86535
Merge pull request #8028 from fidencio/topic/ci-test-with-crio-part-2
ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI
2023-09-25 18:40:42 +02:00
Gabriela Cervantes
81c8babca9 metrics: Fix latency yamls path
This PR fixes the latency yamls path for the latency test for
kata metrics.

Fixes #8055

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:52:24 +00:00
Gabriela Cervantes
4815736820 metrics: Fix C-Ray documentation
This PR fixes the C-Ray documentation for kata metrics.

Fixes #8052

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:27:58 +00:00
Fabiano Fidêncio
ef63d67c41 ci: crio: Trim '\r' from exec_host() output
We've faced this as part of the CI, only happening with the CRI-O tests:
```
 not ok 1 Test readonly volume for pods
 # (from function `exec_host' in file tests_common.sh, line 51,
 #  in test file k8s-file-volume.bats, line 25)
 #   `exec_host "echo "$file_body" > $tmp_file"' failed with status 127
 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass
 # bash: line 1: $'\r': command not found
 #
 # Error from server (NotFound): pods "test-file-volume" not found
```

I must say I didn't dig into figuring out why this is happening, but we
may be safe enough to just trim the '\r', as long as all the tests keep
passing on containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 16:42:18 +02:00
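The `$'\r': command not found` line in the log above suggests a carriage return surviving in the captured output. The idea of dropping it can be sketched as below; this is a Python illustration only, `strip_carriage_returns` is a hypothetical helper, and the actual change was made to the `exec_host()` shell function, whose implementation is not shown here.

```python
def strip_carriage_returns(output: str) -> str:
    """Remove every carriage return from captured command output."""
    return output.replace("\r", "")


if __name__ == "__main__":
    # Hypothetical captured output with CRLF line endings.
    raw = "hello\r\nworld\r\n"
    print(repr(strip_carriage_returns(raw)))
```

Removing the character everywhere, not just at the end of the string, also covers multi-line output where each line ends in `\r\n`.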
Fabiano Fidêncio
74c12b2927 ci: crio: Enable default capabilities
We need the default capabilities to be enabled, especially `SYS_CHROOT`,
in order for tests accessing the host to pass.

A huge thanks to Greg Kurz for spotting this and suggesting the fix.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-25 14:56:15 +02:00
Fabiano Fidêncio
358dc2f569 kata-deploy: Fix CRI-O detection
Some of the "k8s distros" allow using CRI-O in a non-official way, and
if that's done we cannot simply assume they're on containerd, otherwise
kata-deploy will simply not work.

In order to avoid such an issue, let's check for `cri-o` as the container
engine first and only proceed with the checks for the "k8s distros" after
we rule out that CRI-O is being used.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
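The ordering described in the commit above, with CRI-O checked before any distro-specific match, can be sketched like this. It is a Python illustration, not the kata-deploy script: `detect_runtime`, the `DISTRO_MARKERS` substrings, the sample version strings and the `containerd` fallback are all assumptions made for the example.

```python
# Hypothetical substrings identifying "k8s distros"; not taken from kata-deploy.
DISTRO_MARKERS = ("k3s", "k0s", "microk8s")


def detect_runtime(runtime_version: str) -> str:
    """Classify a node's container runtime version string.

    CRI-O is checked first so that a distro running CRI-O in a
    non-official way is not mistaken for a containerd-based setup.
    """
    if "cri-o" in runtime_version:
        return "cri-o"
    for marker in DISTRO_MARKERS:
        if marker in runtime_version:
            return marker
    # Assumed fallback for this sketch.
    return "containerd"


if __name__ == "__main__":
    for version in ("cri-o://1.27.1-k3s1", "containerd://1.7.1-k3s1",
                    "containerd://1.7.1"):
        print(version, "->", detect_runtime(version))
```

If the distro check ran first, the first sample string would be classified by its distro marker and the CRI-O configuration would never be applied, which is the failure mode the commit message describes.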
Fabiano Fidêncio
ebaa4fa4c1 ci: crio: Pass -y to apt
That was something overlooked during my tests. :-/

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
GabyCT
11cf0e2d28
Merge pull request #8038 from GabyCT/topic/latency
metrics: Enable latency test in gha run script
2023-09-22 16:57:53 -06:00
GabyCT
3ef57b335e
Merge pull request #8045 from jepio/fix-docker-ownership
local-build: Fix .docker ownership before build-payload
2023-09-22 14:43:38 -06:00
Archana Shinde
9bb9a3e7a4
Merge pull request #7966 from amshinde/runtime-rs-network-clh
runtime-rs: Add network support for cloud-hypervisor
2023-09-22 13:08:09 -07:00
Gabriela Cervantes
97e73b2234 metrics: Fix spelling warnings
This PR fixes general spelling warnings detected by the spelling check.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:50:51 +00:00
Gabriela Cervantes
36c8cd6f1f metrics: Fix metrics README
This PR fixes the network metrics section of the README by leaving
only the tests that we currently have in our kata metrics.

Fixes #8017

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:28:58 +00:00
Fabiano Fidêncio
c5a5a0c95e
Merge pull request #8012 from arronwy/strip
osbuild: Reduce guest components binary size with strip
2023-09-22 15:45:38 +02:00
Fabiano Fidêncio
9d190f2390
Merge pull request #8042 from GabyCT/topic/pandoc
gha: Add pandoc as a dependency for static checks
2023-09-22 15:31:18 +02:00