Drop kernel build signing (KBUILD_SIGN_PIN) from CI and from all
scripts that referenced it.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Per-arch images were failing publish-multiarch-manifest with 'X is a manifest
list' because Buildx now enables attestations by default, so each arch tag
became an image index. Use 'docker buildx imagetools create' instead of
'docker manifest create' so we can merge those indexes into the final
multi-arch manifest while keeping provenance and SBOM on per-arch images.
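For illustration, with hypothetical image/tag names, the swap looks like:

  # Before: fails when a per-arch tag is itself an image index
  docker manifest create "${IMAGE}:${TAG}" \
      "${IMAGE}:${TAG}-amd64" "${IMAGE}:${TAG}-arm64"

  # After: imagetools can merge image indexes into one multi-arch manifest
  docker buildx imagetools create --tag "${IMAGE}:${TAG}" \
      "${IMAGE}:${TAG}-amd64" "${IMAGE}:${TAG}-arm64"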
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
We depend on the GPU Operator v26.3 release, which is not out yet.
Although we have been testing with it, it is not publicly available,
which would break anyone actually trying to use the GPU runtime classes.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
In containerd config v3 the CRI plugin is split into runtime and images,
and setting the snapshotter only on the runtime plugin is not enough for image
pull/prepare.
The images plugin must have runtime_platform.<runtime>.snapshotter so it
uses the correct snapshotter per runtime (e.g. nydus, erofs).
A PR on the containerd side is open so we can rely on the runtime plugin
snapshotter alone: https://github.com/containerd/containerd/pull/12836
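A sketch of the two config v3 sections involved (runtime name and
snapshotter illustrative; the runtime_platform key is the one this
change adds):

  [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.kata-qemu-nvidia-gpu]
    snapshotter = 'nydus'

  [plugins.'io.containerd.cri.v1.images'.runtime_platform.kata-qemu-nvidia-gpu]
    snapshotter = 'nydus'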
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
In CI the default GPG keyring is often read-only or missing, so
'gpg --import' of the cached keyring fails and verification cannot
succeed. Use a temporary GNUPGHOME for import and verify so cached
gperf can be verified without writing to the system keyring.
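A minimal sketch of the flow (variable names illustrative):

  export GNUPGHOME="$(mktemp -d)"
  chmod 700 "${GNUPGHOME}"
  gpg --import "${cached_keyring}"
  gpg --verify "${tarball}.sig" "${tarball}"
  rm -rf "${GNUPGHOME}"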
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The CC runtime classes kata-qemu-nvidia-gpu-snp and kata-qemu-nvidia-gpu-tdx
are mutually exclusive with kata-qemu-nvidia-gpu, as dictated by the GPU
CC mode setting. In order to properly support a cluster that has both CC and
non-CC nodes, we use a node selector so the scheduling is consistent with the
GPU mode. The GPU Operator sets a label nvidia.com/cc.ready.state=[true, false]
to indicate the GPU CC mode setting.
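A sketch of the resulting scheduling constraint on the CC runtime
classes (manifest trimmed, values illustrative):

  apiVersion: node.k8s.io/v1
  kind: RuntimeClass
  metadata:
    name: kata-qemu-nvidia-gpu-snp
  handler: kata-qemu-nvidia-gpu-snp
  scheduling:
    nodeSelector:
      nvidia.com/cc.ready.state: "true"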
Fixes: #12431
Signed-off-by: Joji Mekkattuparamban <jojim@nvidia.com>
Add push-oras-tarball-cache workflow that runs on push to main when
versions.yaml changes (and on workflow_dispatch). It populates the
ghcr.io ORAS cache with gperf and busybox tarballs from versions.yaml.
Remove the push_to_cache call from download-with-oras-cache.sh since
it was never triggered in CI. Cache population is now done solely by the
new workflow and by populate-oras-tarball-cache.sh when run manually.
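A sketch of what the cache population boils down to (registry path
illustrative):

  oras push "ghcr.io/${ORG}/cached-tarballs/gperf:3.3" gperf-3.3.tar.gz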
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add tests for the split_non_toml_header helper that strips Go template
directives before TOML parsing, and for every TOML operation (set, get,
append, remove, set_array) on files that start with {{ template "base" . }}.
Also convert the containerd version detection tests in manager.rs from
individual #[test] functions with helper wrappers to parametrized #[rstest]
cases, which are more readable and easier to extend.
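A sketch of the resulting shape (case values and the helper name
detect_config_version are illustrative):

  use rstest::rstest;

  #[rstest]
  #[case("containerd containerd.io 1.7.27", 2)]
  #[case("containerd github.com/containerd/containerd/v2 v2.1.0", 3)]
  fn test_containerd_config_version(#[case] output: &str, #[case] expected: u32) {
      assert_eq!(detect_config_version(output), expected);
  }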
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
K3s docs (https://docs.k3s.io/advanced#configuring-containerd) say that the
right way to customize containerd is to extend the base template with
{{ template "base" . }} and append your own TOML blocks, rather than copying a
prerendered config.toml into the template file.
We were copying config.toml into config.toml.tmpl / config-v3.toml.tmpl, which
meant we were replacing the K3s defaults with a snapshot that gets stale as
soon as K3s is upgraded.
Now we create the template files with just the base directive and let our
regular set_toml_value code path append the Kata runtime configuration on top.
To make that work, the TOML utils learned to handle files that start with a
Go template line ({{ ... }}): strip it before parsing, put it back when writing.
This keeps the K3s/RKE2 path identical to every other runtime -- no special
append logic needed.
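Under this scheme a config.toml.tmpl boils down to something like
(runtime entry illustrative, appended by the set_toml_value path):

  {{ template "base" . }}

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-qemu]
    runtime_type = "io.containerd.kata-qemu.v2"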
refs:
* k3s: https://docs.k3s.io/advanced#configuring-containerd
* rke2: https://docs.rke2.io/advanced#configuring-containerd
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The gperf-3.3 tarball frequently fails to download on my end with
cryptic error messages such as: "tar: This does not look like a tar
archive". This change tightens the download logic: we now fail at
the point where the failure actually occurs. This way we detect
rate-limiting issues right away, and the actual hashsum and
signature checks become effective rather than mere printouts.
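A sketch of the tightened flow (variable names illustrative):

  curl --fail --location --output "${tarball}" "${url}"  # abort on HTTP errors
  sha256sum --check "${tarball}.sha256"                  # enforce, don't just print
  gpg --verify "${tarball}.sig" "${tarball}"             # enforce the signature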
This change also updates the key reference and allows it to be an
array, for instance when a different signer was used for the cached
vs. the upstream version.
The change also makes it clear that signature verification is
currently only implemented for the gperf tarball. Improvements can
be made in a subsequent change.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Custom runtimes whose base config lives under runtime-rs/ (e.g. dragonball,
cloud-hypervisor) were not found because the path was always built under
share/defaults/kata-containers/. Use get_kata_containers_original_config_path
for the handler so Rust shim configs are read from .../runtime-rs/.
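A sketch of the fix (the helper name comes from this change, the
surrounding code is assumed):

  // Resolves to .../share/defaults/kata-containers/runtime-rs/ for
  // Rust shims, .../share/defaults/kata-containers/ otherwise.
  let config_dir = get_kata_containers_original_config_path(&shim);
  let base_config = config_dir.join(format!("configuration-{shim}.toml"));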
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
On helm uninstall let's rely on a preStop hook to run kata-deploy
cleanup so each pod cleans its node before exiting.
We **must** keep RBAC (resource-policy: keep) so pods retain API access
during termination and can then properly delete the NodeFeatureRules
and remove the labels from the nodes.
The post-delete hook Job, which runs on a single node, is now only
responsible for cleaning up the kept RBAC (a cluster-wide resource)
after uninstall, leaving no resource or artefact behind.
The changes in this commit lead to a "resources were kept" message when
running `helm uninstall`, which we document as normal, as the
post-delete Job will remove those resources.
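A sketch of the two pieces (cleanup command illustrative):

  # DaemonSet pod spec: clean the node before the pod exits
  lifecycle:
    preStop:
      exec:
        command: ["kata-deploy", "cleanup"]

  # RBAC manifests: survive uninstall so terminating pods keep API access
  metadata:
    annotations:
      helm.sh/resource-policy: keep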
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When removing a node label, JSON merge patch semantics require setting
the key to null; omitting the key leaves it unchanged.
Fix label_node to send a patch with the label key set to null so the API
server actually removes katacontainers.io/kata-runtime.
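The merge-patch body (RFC 7386) then looks like:

  {"metadata": {"labels": {"katacontainers.io/kata-runtime": null}}}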
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Wait for SIGTERM after install and exit(0) so the container terminates
cleanly. If registering the SIGTERM handler fails, log a warning and
sleep forever instead of exiting with an error (falling back to the
old behaviour).
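A sketch, assuming the signal-hook crate (logging macro illustrative):

  use signal_hook::{consts::SIGTERM, iterator::Signals};

  match Signals::new([SIGTERM]) {
      Ok(mut signals) => {
          let _ = signals.forever().next(); // block until SIGTERM
          std::process::exit(0);
      }
      Err(e) => {
          log::warn!("failed to register SIGTERM handler ({e}); sleeping forever");
          loop {
              std::thread::sleep(std::time::Duration::from_secs(3600));
          }
      }
  }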
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
- Trim trailing whitespace and ensure final newline in non-vendor files
- Add .editorconfig-checker.json excluding vendor dirs, *.patch, *.img,
*.dtb, *.drawio, *.svg, and pkg/cloud-hypervisor/client so CI only
checks project code
- Leave generated and binary assets unchanged (excluded from checker)
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
When kata-deploy installs Kata Containers, the base configuration files
should not be modified directly. This change adds documentation explaining
how to use drop-in configuration files for customization, and prepends a
warning comment to all deployed configuration files reminding users to use
drop-in files instead.
The warning is added to both standard shim configurations and custom
runtime configurations. It includes a brief explanation of how drop-in
files work and points users to the documentation for more details.
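The prepended warning could read, illustratively:

  # WARNING: this file is managed by kata-deploy and will be overwritten.
  # Do not edit it directly; drop-in files placed under the adjacent
  # config.d/ directory override values in this file. See the kata-deploy
  # documentation for details.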
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add clear INFO-level messages when creating drop-in configuration
files, making it easy to understand what kata-deploy is doing during
installation:
- "Setting up runtime directory for shim: X"
- "Generating drop-in configuration files for shim: X"
- "Created drop-in file: <path>"
When DEBUG mode is enabled (via DEBUG=true environment variable),
also log the full content of each drop-in file to aid troubleshooting.
The log level is now automatically set to Debug when the DEBUG
environment variable is set, ensuring debug messages are visible.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Deduplicate the drop-in file generation logic between configure_shim_config
and install_custom_runtime_configs by extracting it into a shared
write_common_drop_ins helper function.
This ensures both standard and custom runtimes use the same code path
for generating drop-in configuration files.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add a combined drop-in file (30-kernel-params.toml) that handles all
kernel_params modifications. This approach reads the base kernel_params
from the original untouched config file and combines them with:
- Proxy settings (agent.https_proxy, agent.no_proxy)
- Debug settings (agent.log=debug, initcall_debug)
Using a single drop-in file for kernel_params avoids the TOML merge
behavior where scalar values are replaced rather than appended.
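A sketch of the combined drop-in (section name and values illustrative):

  [hypervisor.qemu]
  kernel_params = "<base kernel_params> agent.https_proxy=<proxy> agent.no_proxy=<no-proxy> agent.log=debug initcall_debug"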
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When debug mode is enabled, generate a drop-in configuration file
(20-debug.toml) with the boolean debug flags for hypervisor, runtime,
and agent sections.
Note: kernel_params for debug (agent.log=debug, initcall_debug) will
be handled by a separate combined kernel_params drop-in file to avoid
the TOML merge replacement behavior.
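A sketch of the drop-in (hypervisor section name illustrative):

  [hypervisor.qemu]
  enable_debug = true

  [runtime]
  enable_debug = true

  [agent.kata]
  enable_debug = true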
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When the installation prefix differs from the default /opt/kata,
generate a drop-in configuration file (10-installation-prefix.toml)
with the adjusted paths instead of modifying the original config file.
This removes the need for adjust_installation_prefix and
adjust_qemu_cmdline functions which are now deleted along with
their tests.
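A sketch with a hypothetical /usr/local/kata prefix (keys trimmed to
the path-carrying ones):

  [hypervisor.qemu]
  path = "/usr/local/kata/bin/qemu-system-x86_64"
  kernel = "/usr/local/kata/share/kata-containers/vmlinux.container"
  image = "/usr/local/kata/share/kata-containers/kata-containers.img"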
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Instead of modifying original config files directly, set up a per-shim
directory structure that uses symlinks to the original configs and
config.d/ directories for drop-in overrides.
This enables cleaner configuration management where the original files
remain untouched and all kata-deploy customizations are in separate
drop-in files that can be easily inspected and removed.
Directory structure:

  {config_path}/runtimes/{shim}/
  {config_path}/runtimes/{shim}/configuration-{shim}.toml -> symlink
  {config_path}/runtimes/{shim}/config.d/
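In shell terms, the setup amounts to (variables illustrative):

  mkdir -p "${config_path}/runtimes/${shim}/config.d"
  ln -sf "${original_config}" \
      "${config_path}/runtimes/${shim}/configuration-${shim}.toml"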
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Build a single kernel for both kernel and kernel-confidential on x86_64
and s390x. The kernel is built with TEE support (-x) on those arches only.
This helps simplify the code and ease its maintenance, and having a
single kernel has been the plan all along.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Build a single kernel for both nvidia-gpu and nvidia-gpu-confidential,
simplifying and reducing code maintenance.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Remove the initramfs folder and its build steps, and use the
kernel-based dm-verity enforcement for the handlers that used the
initramfs mode. Also, remove the initramfs verity mode capability
from the shims and their configs.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
This change introduces the kernelinit dm-verity mode, allowing
initramfs-less dm-verity enforcement against the rootfs image.
For this, it adds a new variable carrying the dm-verity
information; this variable will be picked up by the shim
configurations in subsequent commits.
This allows the shims to build the kernel command line with
dm-verity information based on the existing kernel_params
configuration knob and a new kernel_verity_params configuration
knob, the latter specifically providing the relevant dm-verity
information.
This new configuration knob avoids merging the verity parameters
into the kernel_params field. By avoiding this, no cumbersome
escape logic is required, as we do not need to pass the
dm-mod.create="..." parameter directly in kernel_params, but only
the relevant dm-verity parameters in a semi-structured manner
(see above). The only place where the final command line is
assembled is in the shims. Further, this is a single line that
developers can easily comment out to disable dm-verity
enforcement (or that CI tasks can toggle).
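A sketch of the resulting split in a shim configuration (values
elided on purpose):

  kernel_params = "..."
  # Semi-structured dm-verity information; the shim assembles the final
  # dm-mod.create=... command line from it. Comment out to disable
  # dm-verity enforcement.
  kernel_verity_params = "..."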
This change produces the new kernelinit dm-verity parameters for
the NVIDIA runtime handlers, and changes the format in which these
parameters are prepared for all handlers. With this, the parameters
are currently no longer provided via the kernel_params configuration
knob for any runtime handler.
This change should thus not be used on its own, as dm-verity
information would no longer be picked up by the shims.
systemd-analyze on the coco-dev handler shows that, with the
kernelinit mode on a local machine, less time is spent in the
kernel phase, slightly speeding up pod start-up. On that machine,
the average kernel phase duration dropped from 172.5ms to 141ms
(4 measurements, each with a basic pod manifest), an improvement
of about 18 percent.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
This reverts commit 923f97bc66 in order to reinstate the logic
from commit e4a13b9a4a.
The latter commit was previously reverted due to the NVIDIA GPU TEE
handler using an initrd, not an image.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
We cannot overwrite a binary that is currently in use, which is why
elsewhere we first remove/unlink the binary (the running process
keeps its file descriptor, so this is safe) and only then copy the
new one. However, we missed doing this for the nydus-snapshotter
deployment.
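A sketch of the fix (paths illustrative):

  # The running process keeps its file descriptor, so unlinking is safe.
  rm -f "${dest_dir}/containerd-nydus-grpc"
  cp "${artifacts_dir}/containerd-nydus-grpc" "${dest_dir}/containerd-nydus-grpc"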
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Clean up existing nydus-snapshotter state to ensure a fresh start
with the new version.
This is safe across all K8s distributions (k3s, rke2, k0s, microk8s,
etc.) because we only touch the nydus data directory, not containerd's
internals.
When containerd tries to use non-existent snapshots, it will
re-pull/re-unpack.
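A sketch of the cleanup (the directory is the usual nydus-snapshotter
data root; adjust per deployment):

  rm -rf /var/lib/containerd-nydus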
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As we have moved to using the QEMU (and, already earlier, the OVMF)
shipped by kata-deploy, the custom TDX configurations and distro
checks are no longer needed.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Allow the updateStrategy to be configured for the kata-deploy Helm
chart, enabling administrators to control the aggressiveness of
updates. For a less aggressive approach, the strategy can be set to
`OnDelete`. Alternatively, the update process can be made more
aggressive by adjusting the `maxUnavailable` parameter.
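A sketch of the values (strategy contents illustrative):

  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2

or, for the least aggressive approach:

  updateStrategy:
    type: OnDelete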
Signed-off-by: Nikolaj Lindberg Lerche <nlle@ambu.com>
Delete the pause_bundle directory before running the umoci unpack
operation. This makes builds idempotent, avoiding failures such as
"create runtime bundle: config.json already exists in
.../build/pause-image/destdir/pause_bundle", and makes life better
when building locally.
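A sketch of the idempotent sequence (variables illustrative):

  rm -rf "${destdir}/pause_bundle"
  umoci unpack --image "${pause_image}:latest" "${destdir}/pause_bundle"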
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Remove the initrd function and add the image function to align
with the functions that actually exist in this file.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Update the Helm chart README to document the new shims.disableAll
option and simplify the examples that previously required listing
every shim to disable.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Simplify the example values files by using the new shims.disableAll
option instead of listing every shim to disable.
Before (try-kata-nvidia-gpu.values.yaml):

  shims:
    clh:
      enabled: false
    cloud-hypervisor:
      enabled: false
    # ... 15 more lines ...

After:

  shims:
    disableAll: true
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add a new `shims.disableAll` option that disables all standard shims
at once. This is useful when:
- Enabling only specific shims without listing every other shim
- Using custom runtimes only mode (no standard Kata shims)
Usage:
  shims:
    disableAll: true
    qemu:
      enabled: true # Only qemu is enabled
All helper templates are updated to check for this flag before
iterating over shims.
One thing that's super important to note here is that Helm recursively
merges user values with chart defaults, making a simple `disableAll`
flag problematic: if the defaults have `enabled: true`, the user's
`disableAll: true` gets merged with those defaults, resulting in all
shims still being enabled.
The workaround found is to use null (`~`) as the default for `enabled`
field. The template logic interprets null differently based on
disableAll:
| enabled value | disableAll: false | disableAll: true |
|---------------|-------------------|------------------|
| ~ (null)      | Enabled           | Disabled         |
| true          | Enabled           | Enabled          |
| false         | Disabled          | Disabled         |
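A sketch of the template logic implementing this table (idiomatic
Helm/Sprig, variable names illustrative):

  {{- /* explicit true/false always wins; null (~) follows disableAll */ -}}
  {{- $isNull := kindIs "invalid" $config.enabled -}}
  {{- $enabled := ternary (not $.Values.shims.disableAll) $config.enabled $isNull -}}
  {{- if $enabled }}
  # ... render this shim ...
  {{- end }}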
This is backward compatible:
- Default behavior unchanged: all shims enabled when disableAll: false
- Users can set `disableAll: true` to disable all, then explicitly
enable specific shims with `enabled: true`
- Explicit `enabled: false` always disables, regardless of disableAll
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add functions to install and remove custom runtime configuration files.
Each custom runtime gets an isolated directory structure:
  custom-runtimes/{handler}/
    configuration-{baseConfig}.toml   # Copied from base config
    config.d/
      50-overrides.toml               # User's drop-in overrides
The base config is copied AFTER kata-deploy has applied its modifications
(debug settings, proxy configuration, annotations), so custom runtimes
inherit these settings.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add functions to configure custom runtimes in containerd and CRI-O.
Custom runtimes use an isolated config directory under:
custom-runtimes/{handler}/
Custom runtimes automatically derive the shim binary path from the
baseConfig field using the existing is_rust_shim() logic.
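A sketch of the resulting containerd entry (plugin section per config
v3, handler name and paths illustrative):

  [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.my-runtime]
    runtime_type = "io.containerd.kata-my-runtime.v2"
    [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.my-runtime.options]
      ConfigPath = "/opt/kata/share/defaults/kata-containers/custom-runtimes/my-runtime/configuration-qemu.toml"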
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>