Move the cleanup logic from a preStop lifecycle hook (separate exec)
into the main process's SIGTERM handler. This simplifies the
architecture: the install process now handles its own teardown when
the pod is terminated.
The SIGTERM handler is registered before install begins, and
tokio::select! races install against SIGTERM so cleanup always runs
even if SIGTERM arrives mid-install (e.g. helm uninstall while the
container is restarting after a failed install attempt).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Check the rendered containerd config for the versioned drop-in dir import
(config.toml.d or config-v3.toml.d) and bail with a clear error if it is
missing.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
On containerd v3 config, disable_snapshot_annotations must be set under the
images plugin, not the runtime plugin.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Different kubernetes distributions, such as k0s, use a different kubelet
root dir location instead of the default /var/lib/kubelet, so ConfigMap
and Secret volume propagation were failing.
This adds a kubelet_root_dir config option that the go runtime uses when
matching volume paths and kata-deploy now sets it automatically for k0s
via a drop-in file.
runtime-rs does not need this option: it identifies ConfigMap/Secret,
projected, and downward-api volumes by volume-type path segment
(kubernetes.io~configmap, etc.), not by kubelet root prefix.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Use rstest parameterized tests for QEMU variants, other hypervisors,
and unknown/empty shim cases.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When copying artifacts from the container to the host, detect source
entries that are symlinks and recreate them as symlinks at the
destination instead of copying the target file.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Turn test_toml_value_types into a parameterized test with one case per type
(string, bool, int). Merge the two invalid-TOML tests (get and set) into one
rstest with two cases, and the two "not an array" tests into one rstest
with two cases.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When writing containerd drop-in or other TOML (e.g. initially empty file),
the serialized document could start with many newlines.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Replace multiple #[test] functions for snapshotter and erofs version
checks with parameterized #[rstest] #[case] tests for consistency and
easier extension.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
In containerd config v3 the CRI plugin is split into runtime and images,
and setting the snapshotter only on the runtime plugin is not enough for image
pull/prepare.
The images plugin must have runtime_platform.<runtime>.snapshotter so it
uses the correct snapshotter per runtime (e.g. nydus, erofs).
A PR on the containerd side is open so we can rely on the runtime plugin
snapshotter alone: https://github.com/containerd/containerd/pull/12836
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add tests for the split_non_toml_header helper that strips Go template
directives before TOML parsing, and for every TOML operation (set, get,
append, remove, set_array) on files that start with {{ template "base" . }}.
Also converts the containerd version detection tests in manager.rs from
individual #[test] functions with helper wrappers to parametrized #[rstest]
cases, which is more readable and easier to extend.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
K3s docs (https://docs.k3s.io/advanced#configuring-containerd) say that the
right way to customize containerd is to extend the base template with
{{ template "base" . }} and append your own TOML blocks, rather than copying a
prerendered config.toml into the template file.
We were copying config.toml into config.toml.tmpl / config-v3.toml.tmpl, which
meant we were replacing the K3s defaults with a snapshot that gets stale as
soon as K3s is upgraded.
Now we create the template files with just the base directive and let our
regular set_toml_value code path append the Kata runtime configuration on top.
To make that work, the TOML utils learned to handle files that start with a
Go template line ({{ ... }}): strip it before parsing, put it back when writing.
This keeps the K3s/RKE2 path identical to every other runtime -- no special
append logic needed.
refs:
* k3s:: https://docs.k3s.io/advanced#configuring-containerd
* rke2: https://docs.rke2.io/advanced?_highlight=conyainerd#configuring-containerd
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Custom runtimes whose base config lives under runtime-rs/ (e.g. dragonball,
cloud-hypervisor) were not found because the path was always built under
share/defaults/kata-containers/. Use get_kata_containers_original_config_path
for the handler so rust shim configs are read from .../runtime-rs/.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
On helm uninstall let's rely on a preStop hook to run kata-deploy
cleanup so each pod cleans its node before exiting.
We **must** keep RBAC (resource-policy: keep) so pods retain API access
during termination, and then can properly delete the NodeFeatureRules
and remove the labels from the nodes.
The post-delete hook Job, which runs on a single node, now is only
responsible for cleaning the kept RBAC (cluster-wide resource) after
uninstall, not leaving any resource or artefact behind.
The changes on this commit lead to a "resouerces were kept" message when
running `helm uninstall`, which document as being normal, as the
post-delete job will remove those.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When removing a node label, JSON merge patch semantics require setting
the key to null; omitting the key leaves it unchanged.
Fix label_node to send a patch with the label key set to null so the API
server actually removes katacontainers.io/kata-runtime.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Wait for SIGTERM after install and exit(0) so the container terminates
cleanly. If registering the SIGTERM handler fails, log a warning and
sleep forever instead of exiting with an error (fallback to the old
behaviour).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
- Trim trailing whitespace and ensure final newline in non-vendor files
- Add .editorconfig-checker.json excluding vendor dirs, *.patch, *.img,
*.dtb, *.drawio, *.svg, and pkg/cloud-hypervisor/client so CI only
checks project code
- Leave generated and binary assets unchanged (excluded from checker)
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
When kata-deploy installs Kata Containers, the base configuration files
should not be modified directly. This change adds documentation explaining
how to use drop-in configuration files for customization, and prepends a
warning comment to all deployed configuration files reminding users to use
drop-in files instead.
The warning is added to both standard shim configurations and custom
runtime configurations. It includes a brief explanation of how drop-in
files work and points users to the documentation for more details.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add clear INFO-level messages when creating drop-in configuration
files, making it easy to understand what kata-deploy is doing during
installation:
- "Setting up runtime directory for shim: X"
- "Generating drop-in configuration files for shim: X"
- "Created drop-in file: <path>"
When DEBUG mode is enabled (via DEBUG=true environment variable),
also log the full content of each drop-in file to aid troubleshooting.
The log level is now automatically set to Debug when the DEBUG
environment variable is set, ensuring debug messages are visible.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Deduplicate the drop-in file generation logic between configure_shim_config
and install_custom_runtime_configs by extracting it into a shared
write_common_drop_ins helper function.
This ensures both standard and custom runtimes use the same code path
for generating drop-in configuration files.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add a combined drop-in file (30-kernel-params.toml) that handles all
kernel_params modifications. This approach reads the base kernel_params
from the original untouched config file and combines them with:
- Proxy settings (agent.https_proxy, agent.no_proxy)
- Debug settings (agent.log=debug, initcall_debug)
Using a single drop-in file for kernel_params avoids the TOML merge
behavior where scalar values are replaced rather than appended.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When debug mode is enabled, generate a drop-in configuration file
(20-debug.toml) with the boolean debug flags for hypervisor, runtime,
and agent sections.
Note: kernel_params for debug (agent.log=debug, initcall_debug) will
be handled by a separate combined kernel_params drop-in file to avoid
the TOML merge replacement behavior.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When the installation prefix differs from the default /opt/kata,
generate a drop-in configuration file (10-installation-prefix.toml)
with the adjusted paths instead of modifying the original config file.
This removes the need for adjust_installation_prefix and
adjust_qemu_cmdline functions which are now deleted along with
their tests.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Instead of modifying original config files directly, set up a per-shim
directory structure that uses symlinks to the original configs and
config.d/ directories for drop-in overrides.
This enables cleaner configuration management where the original files
remain untouched and all kata-deploy customizations are in separate
drop-in files that can be easily inspected and removed.
Directory structure:
{config_path}/runtimes/{shim}/
{config_path}/runtimes/{shim}/configuration-{shim}.toml -> symlink
{config_path}/runtimes/{shim}/config.d/
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We cannot overwrtie a binary that's currently in use, and that's the
reason that elsewhere we remove / unlink the binary (the running process
keeps its file descriptor, so we're good doing that) and only then we
copy the binary. However, we missed doing this for the
nydus-snapshotter deployment.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Clean up existing nydus-snapshotter state to ensure fresh start with new
version.
This is safe across all K8s distributions (k3s, rke2, k0s, microk8s,
etc.) because we only touch the nydus data directory, not containerd's
internals.
When containerd tries to use non-existent snapshots, it will
re-pull/re-unpack.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As we have moved to use QEMU (and OVMF already earlier) from
kata-deploy, the custom tdx configurations and distro checks
are no longer needed.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Add functions to install and remove custom runtime configuration files.
Each custom runtime gets an isolated directory structure:
custom-runtimes/{handler}/
configuration-{baseConfig}.toml # Copied from base config
config.d/
50-overrides.toml # User's drop-in overrides
The base config is copied AFTER kata-deploy has applied its modifications
(debug settings, proxy configuration, annotations), so custom runtimes
inherit these settings.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add functions to configure custom runtimes in containerd and CRI-O.
Custom runtimes use an isolated config directory under:
custom-runtimes/{handler}/
Custom runtimes automatically derive the shim binary path from the
baseConfig field using the existing is_rust_shim() logic.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add support for parsing custom runtime configurations from a mounted
ConfigMap. This allows users to define their own RuntimeClasses with
custom Kata configurations.
The ConfigMap format uses a custom-runtimes.list file with entries:
handler:baseConfig:containerd_snapshotter:crio_pulltype
Drop-in files are read from dropin-{handler}.toml, if present.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's extract the common logic from configure_containerd_runtime and
configure_crio_runtime into reusable helper functions. This reduces
code duplication and prepares for adding custom runtime support.
For containerd:
- Add ContainerdRuntimeParams struct to encapsulate common parameters
- Add get_containerd_pluginid() to extract version detection logic
- Add get_containerd_output_path() to extract file path resolution
- Add write_containerd_runtime_config() to write common TOML values
For CRI-O:
- Add CrioRuntimeParams struct to encapsulate common parameters
- Add write_crio_runtime_config() to write common configuration
While here, let's also simplify pod_annotations to always use
"[\"io.katacontainers.*\"]" for all runtimes, as the NVIDIA specific
case has been removed from the shell script, but we forgot to do so
here.
No functional changes intended.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add the necessary configuration and code changes to support QEMU
on arm64 architecture in runtime-rs.
Changes:
- Set MACHINETYPE to "virt" for arm64
- Add machine accelerators "usb=off,gic-version=host" required for
proper arm64 virtualization
- Add arm64-specific kernel parameter "iommu.passthrough=0"
- Guard vIOMMU (Intel IOMMU) to skip on arm64 since it's not supported
These changes align runtime-rs with the Go runtime's arm64 QEMU support.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Remove k0s-worker and k0s-controller from
RUNTIMES_WITHOUT_CONTAINERD_DROP_IN_SUPPORT and always return true for
k0s in is_containerd_capable_of_using_drop_in_files since k0s auto-loads
from containerd.d/ directory regardless of containerd version.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add microk8s case to get_containerd_paths() method and remove microk8s
from RUNTIMES_WITHOUT_CONTAINERD_DROP_IN_SUPPORT to enable dynamic
containerd version checking.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Introduce ContainerdPaths struct and get_containerd_paths() method to
centralize the complex logic for determining containerd configuration
file paths across different Kubernetes distributions.
The new ContainerdPaths struct includes:
- config_file: File to read containerd version from and write to
- backup_file: Backup file path before modification
- imports_file: File to add/remove drop-in imports from (Option<String>)
- drop_in_file: Path to the drop-in configuration file
- use_drop_in: Whether drop-in files can be used
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The JSONPath parser was incorrectly splitting on escaped dots (\.)
causing microk8s detection to fail. Labels like "microk8s.io/cluster"
were being split into ["microk8s\", "io/cluster"] instead of being
treated as a single key.
This adds a split_jsonpath() helper that properly handles escaped dots,
allowing the automatic microk8s detection via the node label to work
correctly.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
kata-deploy shell script is not THAT bad and, to be honest, it's quite
handy for quick hacks and quick changes. However, it's been
increasingly becoming harder to maintain as it's grown its scope from a
testing tool to the proper project's front door, lacking unit tests, and
with an abundacy of complex regular expressions and bashisms to be able
to properly parse the environment variables it consumes.
Morever, the fact it is a Frankstein's monster glued together using
python packages, golang binaries, and a distro dependent container makes
the situation VERY HARD to use it from a distroless container (thus,
avoiding security issues), preventing further integration with
components that require a higher standard of security than we've been
requiring.
With everything said, with the help of Cursor (mostly on generating the
tests cases), here comes the oxidized version of the script, which runs
from a distroless container image.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>