The gperf-3.3 tarball frequently fails to download on my end with
cryptic error messages such as: "tar: This does not look like a tar
archive". This change tightens the download logic a bit: We fail at
the point in time when we're supposed to fail. This way we detect
rate limiting issues right away, and this way, the actual hashsum
and signature checks are effective, not only printouts.
This change also updates the key reference and allows for an array,
for instance, when a different signer was used for a cache vs
upstream version.
The change also makes it clear, that signature verification is only
implemented for the gperf tarball. Improvements can be made in a
subsequent change.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Custom runtimes whose base config lives under runtime-rs/ (e.g. dragonball,
cloud-hypervisor) were not found because the path was always built under
share/defaults/kata-containers/. Use get_kata_containers_original_config_path
for the handler so rust shim configs are read from .../runtime-rs/.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
On helm uninstall let's rely on a preStop hook to run kata-deploy
cleanup so each pod cleans its node before exiting.
We **must** keep RBAC (resource-policy: keep) so pods retain API access
during termination, and then can properly delete the NodeFeatureRules
and remove the labels from the nodes.
The post-delete hook Job, which runs on a single node, now is only
responsible for cleaning the kept RBAC (cluster-wide resource) after
uninstall, not leaving any resource or artefact behind.
The changes on this commit lead to a "resouerces were kept" message when
running `helm uninstall`, which document as being normal, as the
post-delete job will remove those.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When removing a node label, JSON merge patch semantics require setting
the key to null; omitting the key leaves it unchanged.
Fix label_node to send a patch with the label key set to null so the API
server actually removes katacontainers.io/kata-runtime.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Wait for SIGTERM after install and exit(0) so the container terminates
cleanly. If registering the SIGTERM handler fails, log a warning and
sleep forever instead of exiting with an error (fallback to the old
behaviour).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
After the move to Linux 6.17 and QEMU 10.2 from Kata,
k8s-sandbox-vcpus-allocation.bats started failing on TDX.
2026-02-10T16:39:39.1305813Z # pod/vcpus-less-than-one-with-no-limits created
2026-02-10T16:39:39.1306474Z # pod/vcpus-less-than-one-with-limits created
2026-02-10T16:39:39.1307090Z # pod/vcpus-more-than-one-with-limits created
2026-02-10T16:39:39.1307672Z # pod/vcpus-less-than-one-with-limits condition met
2026-02-10T16:39:39.1308373Z # timed out waiting for the condition on pods/vcpus-less-than-one-with-no-limits
2026-02-10T16:39:39.1309132Z # timed out waiting for the condition on pods/vcpus-more-than-one-with-limits
2026-02-10T16:39:39.1310370Z # Error from server (BadRequest): container "vcpus-less-than-one-with-no-limits" in pod "vcpus-less-than-one-with-no-limits" is waiting to start: ContainerCreating
A manual test without agent policies added it seems to work OK but disable
the test for now to get CI stable.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
- We don't use containerd.latest as the comment on it suggests
- We also don't have any references to `sriov-network-device`
so remove that and the plugins section.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
this will suppress yaml output only if the input is passed via
stdin. If {base64/raw}-out is passed in alongside a yaml file, the
encoded annotation or the policy data respectively will be printed
to stdout as before.
Fixes#12438
Signed-off-by: Spyros Seimenis <sse@edgeless.systems>
As s390x and ppc64 use a flat CPU topology without sockets and threads,
this commit skips the socket_id and thread_id properties for vCPU hotplug
on these architectures instead of aborting the operation.
This is the change in line with those from the Go runtime:
- isSocketIDSupported()
- isThreadIDSupported()
Fixes: #12155
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
- Trim trailing whitespace and ensure final newline in non-vendor files
- Add .editorconfig-checker.json excluding vendor dirs, *.patch, *.img,
*.dtb, *.drawio, *.svg, and pkg/cloud-hypervisor/client so CI only
checks project code
- Leave generated and binary assets unchanged (excluded from checker)
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
The runtime-rs shim was failing to load its configuration when deployed
via kata-deploy because it couldn't correctly parse the ConfigPath passed
by containerd. The previous implementation naively skipped the first 2
bytes of the options and interpreted the rest as a UTF-8 string, which
doesn't work since containerd passes a properly serialized protobuf
message of type runtimeoptions.v1.Options.
This change adds the runtimeoptions.proto definition to the protocols
crate and updates the load_config function to correctly deserialize the
protobuf message and extract the config_path field, matching how the Go
runtime handles this via typeurl.UnmarshalAny.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Update the enable_nvrc_trace() function to use the new drop-in
configuration mechanism instead of directly modifying the base
configuration file. The function now creates a 90-nvrc-trace.toml
drop-in file that properly combines existing kernel parameters
with the nvrc.log=trace setting.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When kata-deploy installs Kata Containers, the base configuration files
should not be modified directly. This change adds documentation explaining
how to use drop-in configuration files for customization, and prepends a
warning comment to all deployed configuration files reminding users to use
drop-in files instead.
The warning is added to both standard shim configurations and custom
runtime configurations. It includes a brief explanation of how drop-in
files work and points users to the documentation for more details.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add clear INFO-level messages when creating drop-in configuration
files, making it easy to understand what kata-deploy is doing during
installation:
- "Setting up runtime directory for shim: X"
- "Generating drop-in configuration files for shim: X"
- "Created drop-in file: <path>"
When DEBUG mode is enabled (via DEBUG=true environment variable),
also log the full content of each drop-in file to aid troubleshooting.
The log level is now automatically set to Debug when the DEBUG
environment variable is set, ensuring debug messages are visible.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Deduplicate the drop-in file generation logic between configure_shim_config
and install_custom_runtime_configs by extracting it into a shared
write_common_drop_ins helper function.
This ensures both standard and custom runtimes use the same code path
for generating drop-in configuration files.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add a combined drop-in file (30-kernel-params.toml) that handles all
kernel_params modifications. This approach reads the base kernel_params
from the original untouched config file and combines them with:
- Proxy settings (agent.https_proxy, agent.no_proxy)
- Debug settings (agent.log=debug, initcall_debug)
Using a single drop-in file for kernel_params avoids the TOML merge
behavior where scalar values are replaced rather than appended.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When debug mode is enabled, generate a drop-in configuration file
(20-debug.toml) with the boolean debug flags for hypervisor, runtime,
and agent sections.
Note: kernel_params for debug (agent.log=debug, initcall_debug) will
be handled by a separate combined kernel_params drop-in file to avoid
the TOML merge replacement behavior.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When the installation prefix differs from the default /opt/kata,
generate a drop-in configuration file (10-installation-prefix.toml)
with the adjusted paths instead of modifying the original config file.
This removes the need for adjust_installation_prefix and
adjust_qemu_cmdline functions which are now deleted along with
their tests.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Instead of modifying original config files directly, set up a per-shim
directory structure that uses symlinks to the original configs and
config.d/ directories for drop-in overrides.
This enables cleaner configuration management where the original files
remain untouched and all kata-deploy customizations are in separate
drop-in files that can be easily inspected and removed.
Directory structure:
{config_path}/runtimes/{shim}/
{config_path}/runtimes/{shim}/configuration-{shim}.toml -> symlink
{config_path}/runtimes/{shim}/config.d/
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
There is code to disable this at runtime when confidential_guest
is enabled anyway[^1], but it will omit a warning every time. All
the touched configuration files set confidential_guest to true,
so we already know nvdimm isn't supported.
[^1]: 16a7ed6e14/src/runtime/virtcontainers/qemu_amd64.go (L144-L148)
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
The number of workflows increased over 30 so we need to paginate them as
well as jobs. This commit extracts the existing pagination from jobs and
uses it for both jobs and workflows.
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
The previous implementation failed to correctly propagate the network
multiqueue configuration, causing the effective queue number to remain
0.
It also mixed up "queue pairs" with "queue number", so tap devices were
opened without proper multiqueue initialization which causes Clh
netconfig validation failed.
This commit fixes the configuration mapping and initializes tap devices
with the correct multiqueue semantics, ensuring Cloud Hypervisor
receives a valid netconfig.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
To make build with a configurable item of network queues, a dedicated
variable of DEFNETQUEUES is added.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This commit introduces a new annotation for users to easily set network
queues via "io.katacontainers.config.hypervisor.network_queues".
And the annotation will be mapped into `NetworkInfo.network_queues`
within the configuration.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This adds a basic configuration for editorconfig checker. The
supplied configuration checks against trailing whitespaces and
issues with newlines.
Example:
| tools/packaging/kernel/configs/fragments/x86_64/numa.conf:
| Wrong line endings or no final newline
| tools/packaging/release/generate_vendor.sh:
| 44: Trailing whitespace
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Update time to resolve CVE-2026-25727.
Note: this involved bumping the versions of slog-term and slog-json
and bumping the MSRV to 1.88.0 which time 0.3.47 requires.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>