Similar to #12075, bump-backtrace to 0.3.76 to remove the dependency
on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056
As a side effect this brought in loads of other crate changes, which I think are due
to it bumping the local dependencies that this package builds on.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Similar to #12075, bump-backtrace to remove the dependency
on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Similar to #12075, bump-backtrace to remove the dependency
on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Similar to #12075, bump-backtrace to remove the dependency
on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Similar to #12075, bump flate2 and backtrace to remove the dependency
on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Similar to #12075, bump flate2 and backtrace to remove the dependency
on adler, which is unmaintained - contributing to mitigating RUSTSEC-2025-0056
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Since the network device hotplug is an asynchronous operation,
it's possible that the hotplug operation had returned, but
the network device hasn't ready in guest, thus it's better to
retry on this operation to wait until the device ready in guest.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
This makes the user experience better, as the admin can deploy Kata
Containers without having to download / set up any additional file.
Of course, if the admin wants something more specific, examples are
provided.
Tests and documentation are updated to reflect this change.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
In commit 1f95d9401b
runtime-rs: change representation of default_vcpus from i32 to f32,
When the vCPU number is less than 1.0, directly converting an integer to
a floating-point number will automatically convert it to 0. Therefore,
it needs to be rounded up before converting it back to an integer.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Add three example values files to make it easier for users to try out
different Kata Containers configurations:
- try-kata.values.yaml: Enables all available shims
- try-kata-tee.values.yaml: Enables only TEE/confidential computing shims
- try-kata-nvidia-gpu.values.yaml: Enables only NVIDIA GPU shims
These files use the new structured configuration format and serve as
ready-to-use examples for common deployment scenarios.
Also update the README.md to document these example files and how to use them.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Update the helm_helper function in gha-run-k8s-common.sh to use the
new structured configuration format instead of the legacy env.* format.
All possible settings have been migrated to the structured format:
- HELM_DEBUG now sets root-level 'debug' boolean
- HELM_SHIMS now enables shims in structured format with automatic
architecture detection based on shim name
- HELM_DEFAULT_SHIM now sets per-architecture defaultShim mapping
- HELM_EXPERIMENTAL_SETUP_SNAPSHOTTER now sets snapshotter.setup array
- HELM_ALLOWED_HYPERVISOR_ANNOTATIONS now sets per-shim allowedHypervisorAnnotations
- HELM_SNAPSHOTTER_HANDLER_MAPPING now sets per-shim containerd.snapshotter
- HELM_AGENT_HTTPS_PROXY and HELM_AGENT_NO_PROXY now set per-shim agent proxy settings
- HELM_PULL_TYPE_MAPPING now sets per-shim forceGuestPull/guestPull settings
- HELM_EXPERIMENTAL_FORCE_GUEST_PULL now sets per-shim forceGuestPull/guestPull
The test helper automatically determines supported architectures for
each shim (e.g., qemu-se supports s390x, qemu-cca supports arm64,
qemu-snp/qemu-tdx support amd64, etc.) and applies per-shim settings
to the appropriate shims based on HELM_SHIMS.
Only HELM_HOST_OS remains in legacy env.* format as it doesn't have
a structured equivalent yet.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add comprehensive documentation for the new structured configuration
format, including:
- Migration guide from legacy env.* format
- List of deprecated fields with removal timeline (2 releases)
- Examples of the new structured format
- Explanation of key benefits
- Backward compatibility notes
The documentation makes it clear that the legacy format is deprecated
but will continue to work during the transition period.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This commit adds backward compatibility support to ensure existing
configurations using the legacy env.* format continue to work.
The helper functions now check for legacy env.* values first, and
only fall back to the new structured format if legacy values are
not set. This allows for gradual migration without breaking
existing deployments.
Backward compatibility is maintained for:
- env.shims, env.shims_* (per architecture)
- env.defaultShim, env.defaultShim_* (per architecture)
- env.allowedHypervisorAnnotations
- env.snapshotterHandlerMapping_* (per architecture)
- env.pullTypeMapping_* (per architecture)
- env.agentHttpsProxy, env.agentNoProxy
- env._experimentalSetupSnapshotter
- env._experimentalForceGuestPull_* (per architecture)
- env.debug
Legacy env vars (SHIMS, DEFAULT_SHIM, etc.) are still set in the
DaemonSet when using the old format to maintain full compatibility.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This commit introduces a new structured configuration format for
configuring Kata Containers shims in the Helm chart. The new format
provides:
- Per-shim configuration with enabled/supportedArches
- Per-shim snapshotter, guest pull, and agent proxy settings
- Architecture-aware default shim configuration
- Root-level debug and snapshotter setup configuration
All shims are disabled by default and must be explicitly enabled.
This provides better type safety and clearer organization compared
to the legacy env.* string-based format.
The templates are updated to use the new structure exclusively.
Backward compatibility will be added in a follow-up commit.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As the some of the global vars can be empty, we should actually check
their _FOR_ARCH version instead.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As we're making the values.yaml more user friendly, we actually have to
handle the https_proxy and no_proxy entries per shim, instead of having
this globally available, as this will only affect images being pulled
inside the guest (as in, when using TEE variations of the shims).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Adds a practical set of kernel config used by docker-in-docker and kind
for network bridging and filtering. It also includes the matching IPv6
support to allow tools like kind that require IPv6 network policies to
work out of the box.
This support includes:
- nftables reject and filtering support for inet/ipv4/ipv6
- Bridge filtering for container-to-container traffic
- IPv6 NAT, filtering, and packet matching rules for network policies
- VXLAN and IPsec crypto support for network tunneling
- TMPFS POSIX ACL support for filesystem permissions
The configs are organized across fragment files:
- common/fs.conf: TMPFS ACL support
- common/crypto.conf: IPsec/VXLAN crypto algorithms
- common/network.conf: VXLAN, IPsec ESP, nftables bridge/ARP/netdev
- common/netfilter.conf: IPv6 netfilter stack and nftables advanced features
Fixes: #11886
Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
Re-enable AUTO_GENERATE_POLICY for coco-dev Hosts, unless PULL_TYPE is
"experimental-force-guest-pull", or the caller specified a different
value for AUTO_GENERATE_POLICY.
Auto-generated Policy has been disabled accidentally and recently for
these Hosts, by a GHA workflow change.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Don't skip anymore parsing the pause container image when using the
recently updated AKS pause container handling - i.e. when
pause_container_id_policy == "v2".
This was the easiest CI fix for guest pull + new AKS given the *current*
tests. When adding *new* UID/GID/AdditionalGids tests in the future,
these workarounds might need additional updates.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
The update removes the deprecated adler crate from our dependencies. In
addition, we're switching to the default backend (miniz_oxide), which is
a pure Rust implementation and thus much more portable. The performance
impact is negligible, because flate2 is only used for initdata
decompression, which is limited to a couple of MiB anyway.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
The update removes the deprecated adler crate from our dependencies. In
addition, we're switching to the default backend (miniz_oxide), which is
a pure Rust implementation and thus much more portable. The performance
impact is acceptable for a developer tool.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
The github API suggestions that `Authorization: Bearer <YOUR-TOKEN>`
is the way to set the auth token, but it also mentioned that `token`
should work, so it's unclear if this will help much, but it shouldn't harm.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The formatting wasn't quite right, so the `qemu-coco-dev-runtime-rs`
hypervisor wasn't skipping this test
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Introduce a flag `DEFSTATICRESOURCEMGMT_COCO` for setting static sandbox
resource management with default true. And then set it to the item of
`static_sandbox_resource_mgmt` in configuration.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The teardown_common will print the description of the running pods, kill
them all and print the system's syslogs afterwards.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
When testing this branch, on several occasions the Delete
AKS cluster step has hung for multiple hours, so add a timeout
to prevent this.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Since it aligns with the create_container_timeout definition in
runtime-go, we need to set the value in configuration.toml in seconds,
not milliseconds. We must also convert it to milliseconds when the
configuration is loaded for request_timeout_ms.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Create non-tee runtime class for runtime-rs qemu CoCo development
without requiring TEE hardware. Based on the qemu-runtime-rs
config, but with updated guest image, kernel and shared_fs
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The new environment of Power runners for agent checks is causing two test case failures
w.r.to selinux and inode which needs further understanding and is mostly an issue
due to environemnt change and not to do with the agent.
Fall back to running agent checks on original ppc64le self hosted runners.
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
As the arm 22.04 runner isn't working at the moment, let's test the
24.04 version to see if that is better.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The fact that we were not explicitly setting the VMM was leading to us
testing with the default runtime class (qemu). :-/
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>