Commit Graph

6635 Commits

Author SHA1 Message Date
Manuel Huber
c9352ffffe runtime: plumb block discard unmap
Pass block-device discard support through the Go QEMU stack.

Block drives can now carry a DiscardUnmap request into govmm. QEMU
command-line and QMP hotplug paths set discard=unmap on the backend and
enable discard on virtio-blk frontends, while leaving SCSI frontend
arguments unchanged.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-26 21:05:51 +00:00
Manuel Huber
3332b7c4bc runtime-rs: support block-plain emptyDirs
Add runtime-rs support for the block-plain emptyDir mode.

Disk-backed Kubernetes emptyDir mounts remain bind mounts so the block
emptyDir volume handler can intercept them. The handler creates a sparse
disk.img in the kubelet emptyDir directory, attaches it as a block device,
and rewrites the container mount to the agent-visible block storage path.

The same handler now covers encrypted and plain block emptyDirs. Fresh
block emptyDirs request filesystem creation through a dedicated metadata
flag. Plain emptyDirs add discard support, while encrypted emptyDirs keep
the existing ephemeral encryption metadata.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-26 21:05:51 +00:00
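
The sparse disk.img mentioned in 3332b7c4bc can be illustrated with a minimal Python sketch. The helper name `create_sparse_image` and the 1 GiB size are assumptions for illustration only; the runtime's actual code is not shown in this log.

```python
import os
import tempfile


def create_sparse_image(path: str, size_bytes: int) -> None:
    """Create a file of the requested apparent size without writing data."""
    with open(path, "wb") as image:
        # truncate() extends the file; on filesystems with sparse-file
        # support, no data blocks are allocated until something is written.
        image.truncate(size_bytes)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as empty_dir:
        disk = os.path.join(empty_dir, "disk.img")
        create_sparse_image(disk, 1 << 30)
        st = os.stat(disk)
        print("apparent size:", st.st_size)
        print("allocated bytes:", st.st_blocks * 512)
```

The apparent size is what the guest sees as the block device capacity, while host disk usage grows only as the guest writes.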
Manuel Huber
24c51cfbbf runtime: support block-plain emptyDirs
Add Go runtime support for the block-plain emptyDir mode.

Disk-backed Kubernetes emptyDir mounts remain bind mounts so the block
emptyDir handling path can intercept them. The runtime creates a sparse
disk.img in the kubelet emptyDir directory and records direct-volume
metadata for the agent-visible block storage path.

Fresh block emptyDirs request filesystem creation through a dedicated
metadata flag. Plain emptyDirs also record discard support on the block
device. Encrypted emptyDirs keep the existing ephemeral encryption
metadata and carry the same filesystem-creation signal.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-26 21:05:51 +00:00
manuelh-dev
b05d705ea0 Merge pull request #13286 from kata-containers/mahuber/genpolicy-image-user-group-tests
genpolicy: test image user group handling
2026-06-26 13:55:44 -07:00
Hyounggyu Choi
b5aa4cef35 runtime-rs: use SE-specific overhead_memory for qemu-se config
The IBM SEL runtime requires a larger overhead_memory budget than
other TEE runtimes (SNP, TDX) because the kernel command line baked
into the SE image sets:

swiotlb=262144  (262144 × 2 KiB slots = 512 MiB)

This buffer is pre-allocated at boot from the guest's physical RAM
before any workload runs.
With static_sandbox_resource_mgmt = true the VM gets:

vm_memory = overhead_memory + container_limit

In k8s-limit-range.bats, DEFOVERHEADMEMSZ_TEE (128 MiB) resulted in
a 256 MiB VM when a container with a 128 MiB memory limit was scheduled
— far too small to even fit the swiotlb allocation, causing boot failure.
In a similar way, the failure is also observed for k8s-oom.bats.

Introduce DEFOVERHEADMEMSZ_TEE_SE := 768 MiB, sized to cover:
  - 512 MiB  swiotlb bounce buffer (fixed by sealed kernel cmdline)
  - ~128 MiB SE kernel + initrd + agent baseline
  - ~128 MiB headroom for other stuff

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-06-26 13:29:41 +02:00
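
The arithmetic in b5aa4cef35 can be checked with a short Python sketch. The constants are the values quoted in the commit message; the function names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB

# Values quoted in the commit message.
SWIOTLB_SLOTS = 262144         # swiotlb=262144 on the sealed kernel cmdline
SWIOTLB_SLOT_SIZE = 2 * KIB    # each slot is 2 KiB
DEFOVERHEADMEMSZ_TEE = 128     # MiB
DEFOVERHEADMEMSZ_TEE_SE = 768  # MiB


def swiotlb_mib(slots: int = SWIOTLB_SLOTS) -> int:
    """Size of the swiotlb bounce buffer in MiB."""
    return slots * SWIOTLB_SLOT_SIZE // MIB


def vm_memory_mib(overhead_mib: int, container_limit_mib: int) -> int:
    """vm_memory = overhead_memory + container_limit (static sizing)."""
    return overhead_mib + container_limit_mib


if __name__ == "__main__":
    limit = 128  # MiB, the container limit from the failing test case
    for name, overhead in (("TEE", DEFOVERHEADMEMSZ_TEE),
                           ("TEE_SE", DEFOVERHEADMEMSZ_TEE_SE)):
        total = vm_memory_mib(overhead, limit)
        fits = total > swiotlb_mib()
        print(f"{name}: VM={total} MiB, swiotlb={swiotlb_mib()} MiB, fits={fits}")
```

With the generic TEE overhead the VM is smaller than the bounce buffer alone, which matches the boot failure described in the commit.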
Fupan Li
3f5ffa42a0 Merge pull request #12958 from Apokleos/integrated-erofslayers-gpt-vmdk
runtime-rs: Support erofs snapshotter integrity with dmverity
2026-06-26 15:35:10 +08:00
Alex Lyn
b2d0e5b712 kata-agent: Use kata-types dmverity with optional devicemapper support
Replace the agent's inline devicemapper implementation with the libs
kata-types::dmverity module. The agent's devicemapper Cargo feature
now forwards to kata-types/devicemapper, removing the direct
libdevmapper link dependency from the agent crate. Gate all dm-verity
imports, constants, and call sites behind libdevmapper.

Add USE_DEVMAPPER Makefile variable (default no) that appends the
devicemapper feature flag and forces LIBC=gnu when enabled.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
274a904bf7 kata-agent: Mount multi-layer EROFS partitions concurrently
This commit is an enhancement with no functional changes.

Replace the sequential loop in handle_multi_layer_erofs_group with
join_all-based concurrent mounting. Base device paths and mount
directories are pre-resolved before spawning futures to avoid lock
contention. On partial failure, successfully mounted layers are
unmounted and dm-verity devices cleaned up before propagating the
error.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
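
The concurrent-mount-with-rollback pattern from 274a904bf7 can be sketched in Python with `asyncio.gather`, which plays a role similar to Rust's `join_all`. This is an illustration under assumptions: `mount_layer`, `mount_all`, the `/run/layers/` path and the `undo` callback are all hypothetical names, not the agent's API.

```python
import asyncio


async def mount_layer(layer: str, failing: frozenset) -> str:
    """Hypothetical stand-in for mounting one EROFS layer."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if layer in failing:
        raise OSError(f"mount failed for {layer}")
    return f"/run/layers/{layer}"


async def mount_all(layers, failing=frozenset(), undo=None):
    """Mount all layers concurrently; undo successful mounts on any failure."""
    results = await asyncio.gather(
        *(mount_layer(layer, failing) for layer in layers),
        return_exceptions=True,  # collect every outcome instead of stopping early
    )
    mounted = [r for r in results if not isinstance(r, BaseException)]
    errors = [r for r in results if isinstance(r, BaseException)]
    if errors:
        for path in mounted:
            if undo is not None:
                undo(path)  # stand-in for umount plus dm-verity teardown
        raise errors[0]
    return mounted


if __name__ == "__main__":
    print(asyncio.run(mount_all(["layer0", "layer1"])))
```

Collecting all results before deciding lets the cleanup step see exactly which layers were mounted, so nothing is left behind when the error is propagated.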
Alex Lyn
c74bddddaf kata-types: Add dmverity module with optional devicemapper support
Introduce a new `dmverity` module in kata-types that provides dm-verity
device creation, destruction and lifecycle management via devicemapper
ioctls. The module is conditionally compiled behind the `devicemapper`
feature flag, which also pulls in tokio for async device-node polling.

The workspace devicemapper dependency is pinned to a specific git
revision for reproducible builds.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
a08267faaf runtime-rs: Track GPT partition padding files for cleanup
When a GPT-partitioned VMDK is split into individual partition images,
padding files may be generated between partitions to maintain correct
byte offsets. These were not tracked for cleanup, leading to stale
temporary files after container removal.

Iterate over the partition layout and check for pad-{idx}.img files
alongside the head image; add any that exist to gpt_metadata_paths
so they are removed during teardown.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
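
The padding-file scan described in a08267faaf might look roughly like the following Python sketch. The `pad-{idx}.img` pattern comes from the commit message; the function name, the zero-based index range and the `head.img` file name are assumptions.

```python
import tempfile
from pathlib import Path


def collect_padding_files(head_image: Path, partition_count: int) -> list:
    """Return the pad-{idx}.img files that exist next to the head image."""
    found = []
    for idx in range(partition_count):  # index base is an assumption
        candidate = head_image.parent / f"pad-{idx}.img"
        if candidate.exists():
            found.append(candidate)
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        head = Path(workdir) / "head.img"  # hypothetical file name
        head.touch()
        (Path(workdir) / "pad-1.img").touch()
        # Only files that actually exist are queued for cleanup.
        print([p.name for p in collect_padding_files(head, 3)])
```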
Alex Lyn
51e8310ef3 kata-agent: Integrate dm-verity into multi-layer EROFS mount path
Wire the dm-verity helpers into the layer mount flow so that GPT
partitions carrying verity metadata are mounted through a verified
device-mapper target instead of the raw partition.

Refactor wait_and_mount_layer to resolve partition path and verity
device as separate steps: create a dm-verity device when
X-kata.dmverity-enabled=true is set, fall back to direct partition
mount otherwise, and return the verity device path for cleanup
tracking.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
963ba6c6cd kata-agent: Add dm-verity device cleanup for GPT-partitioned layers
Add per-container verity_devices tracking in Sandbox and wire the
teardown path: destroy_partition_dmverity_device removes the
device-mapper target via deferred-remove ioctl and deletes the mknod
node, cleanup_dmverity_devices iterates all devices in reverse order.

Wire into remove_container_resources (rpc.rs) so verity devices are
torn down after unmount, and record verity device paths in
add_storages (storage/mod.rs) for tracking.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
dce409bc35 kata-agent: Add dm-verity device creation for GPT-partitioned layers
GPT-partitioned EROFS layers can carry dm-verity hashes appended after
the filesystem data within the same partition. The host runtime passes
the root hash and parameters as X-kata.dmverity.* storage options; the
agent must set up the kernel dm-verity target before mounting so that
every read is integrity-checked against the Merkle tree.

Implement dm-verity device creation: option parsing from storage
options, device name generation, and create helper via devicemapper
ioctls with hash_start_block calculation (accounting for v1 superblock
presence).

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
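
As a rough illustration of the hash_start_block calculation mentioned in dce409bc35, the Python sketch below assumes the layout used by veritysetup: when a superblock is present it occupies the start of the hash area, and the hash tree begins at the next hash-block boundary. The 512-byte superblock size and the function name are assumptions; the agent's actual calculation may differ.

```python
VERITY_SB_SIZE = 512  # assumed size of the on-disk verity superblock, in bytes


def hash_start_block(hash_offset: int, hash_block_size: int,
                     has_superblock: bool) -> int:
    """Return the first hash-tree block, in units of hash_block_size."""
    start = hash_offset
    if has_superblock:
        start += VERITY_SB_SIZE  # skip past the superblock
    # Round up to the next hash-block boundary (ceiling division).
    return -(-start // hash_block_size)


if __name__ == "__main__":
    offset = 1 << 20  # hash area begins 1 MiB into the partition
    print(hash_start_block(offset, 4096, has_superblock=True))
    print(hash_start_block(offset, 4096, has_superblock=False))
```

Under these assumptions, a superblock shifts the hash tree by exactly one block whenever the hash offset is block-aligned.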
Alex Lyn
e900eae388 kata-agent: Add no-udev DmOptions builders and mknod device node helpers
The kata guest VM runs without udev, so device-mapper nodes under
/dev/mapper are never created automatically. Add the foundational
helpers that subsequent dm-verity integration will rely on.

It focuses on the following key points:
(1) DmOptions builders that disable all udev synchronization flags,
  with read-only and deferred-remove variants.
(2) mknod-based device node creation/removal under /dev/mapper, since
  devtmpfs nodes are not auto-created without udev.

Also add the devicemapper crate dependency (default-features = false).

Note that this commit depends on devicemapper no-udev support from the
PR: https://github.com/stratis-storage/devicemapper-rs/pull/1036

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
c471644477 runtime-rs: Add dm-verity annotation extraction to GPT+VMDK integration
Extract dm-verity metadata from containerd mount annotations and pass
them through to kata-agent as X-kata.dmverity.* storage options. This
enables the agent to create dm-verity devices for integrity-verified
EROFS partitions.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
3051b8d11a runtime-rs: Add dm-verity utility functions to gpt_disk module
When containerd creates dm-verity-protected EROFS layers, it stores
the root hash and parameters as OCI annotations — but the format
does not directly map to the kernel dm-verity table that the guest
agent needs to construct.

Bridge this gap with functions that parse containerd's dm-verity
annotation JSON, detect whether a v1 superblock is embedded at the
hash offset (to extract the salt automatically rather than relying
on containerd's hardcoded default), and produce the X-kata.dmverity.*
storage options the agent expects.

This keeps all dm-verity metadata translation on the host side, so
the agent can consume a flat list of options without understanding
the containerd annotation schema.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
Alex Lyn
499fefd972 kata-types: Extend DmVerityInfo with salt, hash_type, no_superblock fields
Add fields to DmVerityInfo needed for dm-verity device creation:
(1) salt: Optional salt value for the hash computation
(2) hash_type: dm-verity version
(3) no_superblock: whether to skip the superblock at hash offset

Uses serde defaults for backward compatibility with existing serialized
data that lacks these fields.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:51:05 +08:00
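
The backward-compatibility idea in 499fefd972 (new fields fall back to defaults when absent from older serialized data) can be sketched in Python with dataclass defaults. The class name `VerityInfoSketch`, the `root_hash` field and the concrete default values are assumptions for illustration; they are not the real `DmVerityInfo` definition.

```python
import json
from dataclasses import dataclass


@dataclass
class VerityInfoSketch:
    root_hash: str               # hypothetical pre-existing field
    salt: str = ""               # new field, default value assumed
    hash_type: int = 1           # new field, default value assumed
    no_superblock: bool = False  # new field, default value assumed


def parse(serialized: str) -> VerityInfoSketch:
    """Deserialize JSON; fields missing from old data take their defaults."""
    return VerityInfoSketch(**json.loads(serialized))


if __name__ == "__main__":
    old_payload = '{"root_hash": "abcd"}'
    print(parse(old_payload))
```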
Manuel Huber
c6ee1c70a8 genpolicy: test image user group handling
Add unit coverage for image config User values that include a
group component. For Kubernetes, containerd CRI ImageStatus exposes
only the user side before kubelet creates the container security
context, so genpolicy keeps treating those values like the user-only
form.

The fixture uses in-memory passwd and group data so the test does not
rely on private reproducer images.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-26 00:04:17 +00:00
Aurélien Bombo
b1e6b9449d runtime-rs: special case emptyDirs with peer pods
Peer pods don't support fs sharing, hence we need to be thoughtful about
removing disable_guest_empty_dir there (=false for peer pods today, missed it
in my previous PR).

So we preserve disable_guest_empty_dir=false behavior for peer pods only (i.e.
using guest-local mounts) but we detect the need for guest-local mounts directly
in code instead of using a config flag.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-06-25 16:37:34 -05:00
Aurélien Bombo
b20f974ddd runtime-rs: remove disable_guest_empty_dir config
Follow-up to #12373 which defaulted disable_guest_empty_dir=true for
runtime-go/rs.

Here we remove the config option entirely from runtime-rs to make 4.0
secure by design, as with disable_guest_empty_dir=false, a pod could starve
the host storage.

Closes: #12494

Generated-by: GitHub Copilot
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-06-25 15:31:46 -05:00
Fabiano Fidêncio
a34c74a2d4 runtime-rs: size static sandboxes with overhead values
When static sandbox sizing is enabled, keep configured defaults when
workloads do not specify CPU or memory limits. When limits are present,
size the VM as requested resources plus overhead_vcpus/overhead_memory
values derived from runtime-rs profile defaults.

Limit-driven vCPU sizing is clamped to a minimum of one vCPU so a 0.0
result never yields an unbootable VM, and sandbox setup fails early with
a clear, actionable error when the computed memory is 0 MiB (pointing at
memory limits or non-zero default/overhead memory settings).

This keeps static VM sizing predictable across runtime-rs profiles,
including NVIDIA ones.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-06-25 13:56:11 +02:00
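
The sizing rules in a34c74a2d4 can be summarized in a small Python sketch. It assumes CPU and memory are sized independently and that fractional vCPU results are rounded up; both are assumptions, as are the function names.

```python
import math


def size_vcpus(default_vcpus, overhead_vcpus, cpu_limit=None):
    """Keep the default without a limit; otherwise limit + overhead, min 1."""
    if cpu_limit is None:
        return default_vcpus
    # Rounding up is an assumption; the clamp to 1 is from the commit message.
    return max(1, math.ceil(cpu_limit + overhead_vcpus))


def size_memory_mib(default_mib, overhead_mib, mem_limit_mib=None):
    """Keep the default without a limit; fail early on a 0 MiB result."""
    memory = default_mib if mem_limit_mib is None else mem_limit_mib + overhead_mib
    if memory == 0:
        raise ValueError(
            "computed sandbox memory is 0 MiB: set a memory limit or a "
            "non-zero default/overhead memory"
        )
    return memory


if __name__ == "__main__":
    print(size_vcpus(default_vcpus=2, overhead_vcpus=0.0, cpu_limit=0.0))
    print(size_memory_mib(default_mib=2048, overhead_mib=256, mem_limit_mib=512))
```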
Fabiano Fidêncio
65a266f532 Merge pull request #13272 from cayoub-oai/codex/upstream-cgroupfs-init-subcgroup
agent: Apply init subcgroup in cgroupfs manager
2026-06-25 13:54:19 +02:00
Aurélien Bombo
1217dd1584 Merge pull request #12373 from kata-containers/disable-guest-empty-dir
runtime: Set `disable_guest_empty_dir = true` by default
2026-06-24 20:09:46 -05:00
Chris Ayoub
4e3d257dc0 agent: Apply init subcgroup in cgroupfs manager
When cgroup v2 is enabled, exec can fail with EBUSY while writing the
process to cgroup.procs if the container process has been delegated to an
init subcgroup.

PR #10845 fixed this behavior for the systemd/D-Bus cgroup manager
path, which was related to #10733. The cgroupfs manager still writes the
process directly to the container cgroup, so apply the same init
subcgroup handling there.

Also fix the cgroupfs init-subcgroup existence check for absolute OCI
cgroup paths by joining the trimmed cgroup path under the cgroup root.

Fixes: #9701

Signed-off-by: Chris Ayoub <cayoub@openai.com>

Generated-By: OpenAI Codex
2026-06-24 21:25:49 +00:00
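
The path fix at the end of 4e3d257dc0 addresses a common pitfall: joining an absolute path onto a root discards the root. A Python sketch of the behavior follows; the subcgroup name `init`, the example paths and the function name are assumptions.

```python
import posixpath

CGROUP_ROOT = "/sys/fs/cgroup"


def init_subcgroup_path(cgroup_root: str, oci_cgroup_path: str,
                        subcgroup: str = "init") -> str:
    """Join an OCI cgroup path under the cgroup root, even if it is absolute."""
    trimmed = oci_cgroup_path.lstrip("/")  # drop leading slashes before joining
    return posixpath.join(cgroup_root, trimmed, subcgroup)


if __name__ == "__main__":
    oci_path = "/kubepods/pod1/ctr1"
    # Naive join: the absolute second component replaces the root entirely.
    print(posixpath.join(CGROUP_ROOT, oci_path))
    # Trimmed join: the path stays under the cgroup root.
    print(init_subcgroup_path(CGROUP_ROOT, oci_path))
```

Rust's `Path::join` has the same replace-on-absolute behavior, which is why trimming the path first matters for the existence check.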
Aurélien Bombo
3acb618f6b genpolicy: Assume disable_guest_empty_dir = true
This option should be removed for 4.0, so we don't handle `false`.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-06-24 15:22:13 -05:00
Aurélien Bombo
e191c5b716 runtime-go/rs: Reconcile hugepage emptyDirs and disable_guest_empty_dir
This addresses an issue where the disable_guest_empty_dir=true code paths did
not take into account that hugepage-backed emptyDirs should always be recreated
in the guest (using guest hugepages).

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-06-24 15:22:13 -05:00
Aurélien Bombo
a3e91d9ed2 runtime-go/rs: Set disable_guest_empty_dir = true by default
This makes the runtime share the host Kubelet emptyDir folder with the guest
instead of the agent creating an empty folder in the container rootfs. Doing so
enables the Kubelet to track emptyDir usage and evict greedy pods.

In other words, with virtio-fs the container rootfs uses host storage whether
this is true or false, however with true, Kata uses the k8s emptyDir folder so
the sizeLimit is properly enforced by k8s.

Addresses the ephemeral storage part of #12203.

History:

 * Initially, emptyDirs are slow because they are shared from the host with 9p.
   https://github.com/kata-containers/runtime/issues/1472

 * To address above, emptyDirs are hardcoded to be created by the agent in the
   pause container's rootfs, potentially leveraging devicemapper and improving
   perf.
   https://github.com/kata-containers/runtime/pull/1485

 * The previous PR regressed an (interesting?) use case where emptyDirs were
   used to share data from the host to the guest, so the behavior was made
   configurable and `disable_guest_empty_dir = false` is introduced, defaulting
   to the behavior of the previous PR.
   https://github.com/kata-containers/kata-containers/pull/2056

 * Another resource accounting regression remains which is addressed in this PR.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-06-24 15:21:53 -05:00
Fabiano Fidêncio
6528e7a72f Merge pull request #13228 from fidencio/topic/dont-set-slots-maxmem-for-confidential-guests
runtime-rs: qemu: don't set slots/maxmem for confidential guests
2026-06-24 17:27:28 +02:00
Greg Kurz
13b3020c34 Merge pull request #13261 from c3d/bug/13260-Info-log-level
runtime: Change default log level from Warn to Info
2026-06-24 08:57:13 +02:00
Fabiano Fidêncio
392b802f61 Merge pull request #12878 from Apokleos/fix-configs
runtime-rs: Fix configs differences between runtime-rs and runtime-go
2026-06-23 13:53:16 +02:00
Steve Horsman
811914a372 Merge pull request #13246 from Apokleos/copyfile-with-gid-uid
runtime-rs: correct uid/gid for K8s secret/configmap copy_file
2026-06-23 10:43:03 +01:00
Christophe de Dinechin
631fd96715 runtime: Change default log level from Warn to Info
When the kata configuration does not set log_level to debug, the
containerd-shim-v2 defaults to WarnLevel, which suppresses important
diagnostic information logged at Info level.

Key Info-level logs that are currently hidden:
- QEMU command line (qemu.go:3566) - critical for debugging VM issues
- VM lifecycle events (creation, start, stop)
- Device hotplug operations (VFIO, network, volumes)
- Resource configuration (NUMA, memory)
- QMP socket details

Info level provides significantly better diagnostic data without
flooding logs with excessive detail (which would occur at Debug level).
This change improves troubleshooting capabilities for production
deployments where debug mode is not enabled.

Note: runtime-rs already defaults to Info level (see
src/runtime-rs/crates/shim/src/logger.rs:13,30), so this change only
affects the Go runtime.

Fixes: #13260

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2026-06-23 10:29:33 +02:00
Steve Horsman
a87d71763e Merge pull request #13255 from kata-containers/dependabot/go_modules/src/runtime/github.com/containerd/containerd-1.7.33
build(deps): bump github.com/containerd/containerd from 1.7.32 to 1.7.33 in /src/runtime
2026-06-22 11:17:54 +01:00
Steve Horsman
20bcff185f Merge pull request #13254 from kata-containers/dependabot/go_modules/src/runtime/go.mongodb.org/mongo-driver-1.17.7
build(deps): bump go.mongodb.org/mongo-driver from 1.14.0 to 1.17.7 in /src/runtime
2026-06-22 11:17:29 +01:00
Fabiano Fidêncio
f9682356ce Merge pull request #13216 from Apokleos/hotunplug-blk
runtime-rs: Add support for hot-unplugging block devices
2026-06-22 12:14:30 +02:00
Alex Lyn
9550a323ac Merge pull request #13245 from kata-containers/unify-nix-version
Unify nix version
2026-06-22 15:25:10 +08:00
Alex Lyn
8ae08e7fb0 runtime-rs: Add dan_conf to allow network devices in host netns for qemu
Network devices for VM-based containers may be placed in the host netns
to eliminate as many hops as possible, with the goal of near-native
networking performance.

This commit introduces the `dan_conf` field to the configuration file.
This allows the runtime to specify the configuration path for
Direct Attached Network (DAN) devices, enabling interfaces to remain
in the host network namespace while being utilized by the VM-based (qemu)
containers.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-22 14:16:37 +08:00
Alex Lyn
b068f73543 runtime-rs: add experimental features documentation
The experimental configuration allows enabling features not yet
stable for production. These features may break compatibility and
are prepared for major version bumps.

Add documentation with force_guest_pull example across all
runtime-rs configuration files. This feature enables guest-side
image pulling in CoCo (Confidential Computing) scenarios.
Example usage:
  experimental = ["force_guest_pull"]

Fixes inconsistent documentation across configuration files.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-22 14:14:06 +08:00
Alex Lyn
71f3f783a4 runtime-rs: Remove mem_agent configuration for kata coco dev scenarios
The memory agent is useless in kata-coco-dev scenarios, so this commit
removes its configuration items.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-22 14:14:06 +08:00
PiotrProkop
c2d737c9d7 agent: report 128+signal as exit code for signal-terminated processes
When a container process is terminated by a signal, the agent's SIGCHLD
reaper stored the raw signal number as the process exit code. As a result
a process killed by SIGKILL(9) reported exit code 9 instead of the
conventional 137 (128+9).

Apply the standard shell convention of 128+signal_number so that
signal-terminated processes report the expected exit codes, e.g.
SIGKILL(9) -> 137, SIGTERM(15) -> 143, SIGINT(2) -> 130. This mimics
runc, which encodes wait-status exit codes the same way:
https://github.com/opencontainers/runc/blob/v1.4.3/libcontainer/utils/utils.go#L19

Both runc and this new Kata behaviour follow the conventional exit code
semantics documented at https://tldp.org/LDP/abs/html/exitcodes.html.

The conversion is factored into a small helper and covered by a unit
test. The runtime and shim already pass the exit code through unchanged,
so no further changes are needed for the corrected value to surface.

Fixes: signal-terminated containers reporting raw signal numbers

Signed-off-by: PiotrProkop <pprokop@nvidia.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 16:34:17 +02:00
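
The 128+signal convention applied in c2d737c9d7 is easy to show in Python. The helper name below is hypothetical; the agent's own helper is written in Rust and is not reproduced here.

```python
import signal


def exit_code_for_signal(signum: int) -> int:
    """Map a terminating signal number to the conventional shell exit code."""
    return 128 + int(signum)


if __name__ == "__main__":
    for sig in (signal.SIGKILL, signal.SIGTERM, signal.SIGINT):
        print(f"{sig.name}({int(sig)}) -> {exit_code_for_signal(sig)}")
```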
dependabot[bot]
9c6cccb483 build(deps): bump github.com/containerd/containerd in /src/runtime
Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.7.32 to 1.7.33.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.7.32...v1.7.33)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-version: 1.7.33
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-06-21 14:32:17 +00:00
Fabiano Fidêncio
374a867774 Merge pull request #13196 from microsoft/cameronbaird/upstream/runtime-go-clh-templating
runtime: Enable VM Templating Support for CLH
2026-06-21 16:31:19 +02:00
Alex Lyn
0a63aebea9 runtime-rs: Implement remove_device for block device hot removal
Replace the "Not yet implemented" stub in QemuInner::remove_device()
with a working implementation that calls hotunplug_device() to perform
the QMP-level device removal, then cleans up the internal devices list
via retain() to remove stale coldplug entries.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-20 22:08:57 +08:00
Alex Lyn
d4212bcb74 runtime-rs: Add hotunplug_device dispatcher for device type routing
Introduce hotunplug_device() as the device-type dispatcher that routes
hot removal requests to the appropriate QMP method. Currently supports
Block and BlockModern device types, which are forwarded to
Qmp::hotunplug_block_device(). All other device types return an
explicit "unsupported" error.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-20 22:08:57 +08:00
Alex Lyn
281b6aa61a runtime-rs: Add hotunplug_block_device for block device hot removal
Implement QMP-level block device hot-unplug by issuing device_del to
remove the frontend device and blockdev_del to remove the backend
blockdev node. For virtio-blk-ccw on s390x, the CCW subchannel slot
is also released.

Since QMP device_del is asynchronous and only initiates the removal
request, introduce wait_for_device_deleted() to poll for the
DEVICE_DELETED event before tearing down the backend. This prevents
blockdev_del from failing with "Node is still in use".

If blockdev_del fails, the error is logged but CCW cleanup still
proceeds before the error is propagated, ensuring consistent
subchannel state.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-20 22:08:57 +08:00
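
The ordering described in 281b6aa61a (device_del, wait for DEVICE_DELETED, then remove the backend node) can be sketched in Python against a fake QMP connection. `FakeQmp`, its methods, the timeout values and the device IDs are all hypothetical; only the QMP command and event names are real.

```python
import time


class FakeQmp:
    """Stand-in for a QMP connection; records commands and queues events."""

    def __init__(self):
        self.calls = []
        self.events = []

    def execute(self, command, arguments):
        self.calls.append(command)
        if command == "device_del":
            # Real QEMU emits this asynchronously once the guest lets go.
            self.events.append(
                {"event": "DEVICE_DELETED", "data": {"device": arguments["id"]}}
            )

    def poll_events(self):
        events, self.events = self.events, []
        return events


def wait_for_device_deleted(qmp, device_id, timeout_s=5.0, interval_s=0.05):
    """Poll until DEVICE_DELETED arrives for device_id, or time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        for event in qmp.poll_events():
            data = event.get("data", {})
            if event.get("event") == "DEVICE_DELETED" and data.get("device") == device_id:
                return
        time.sleep(interval_s)
    raise TimeoutError(f"no DEVICE_DELETED for {device_id}")


def hotunplug_block_device(qmp, device_id, node_name):
    qmp.execute("device_del", {"id": device_id})  # only initiates removal
    wait_for_device_deleted(qmp, device_id)       # frontend is really gone
    qmp.execute("blockdev-del", {"node-name": node_name})  # backend now unused


if __name__ == "__main__":
    qmp = FakeQmp()
    hotunplug_block_device(qmp, "virtio-drive0", "drive0")
    print(qmp.calls)
```

Waiting for the event before deleting the backend is what avoids the "Node is still in use" failure mentioned in the commit.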
Alex Lyn
431720025c runtime-rs: Enhance hotplug_block_device error handling and rollback
Improve the reliability of block device hotplug by ensuring that
blockdev-add nodes are properly cleaned up when subsequent device_add
operations fail.

To address this, introduce a new device_add_with_rollback method that
performs device_add and cleans up the blockdev node when it fails.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-20 22:08:57 +08:00
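
The rollback idea in 431720025c can be sketched as follows. This Python version uses a fake connection that always fails device_add; `FailingQmp`, `QmpError` and the argument values are hypothetical, and the real method's signature is not shown in this log.

```python
class QmpError(Exception):
    """Hypothetical error type for a failed QMP command."""


class FailingQmp:
    """Fake QMP connection whose device_add always fails."""

    def __init__(self):
        self.calls = []

    def execute(self, command, arguments):
        self.calls.append(command)
        if command == "device_add":
            raise QmpError("device_add failed")


def device_add_with_rollback(qmp, node_name, device_args):
    """Run device_add; if it fails, remove the blockdev node added earlier."""
    try:
        qmp.execute("device_add", device_args)
    except QmpError:
        # Roll back the earlier blockdev-add so no orphaned node remains.
        qmp.execute("blockdev-del", {"node-name": node_name})
        raise  # propagate the original device_add error


if __name__ == "__main__":
    qmp = FailingQmp()
    try:
        device_add_with_rollback(qmp, "drive0", {"driver": "virtio-blk-pci"})
    except QmpError as err:
        print("failed:", err, "calls:", qmp.calls)
```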
dependabot[bot]
399c863cd2 build(deps): bump go.mongodb.org/mongo-driver in /src/runtime
Bumps [go.mongodb.org/mongo-driver](https://github.com/mongodb/mongo-go-driver) from 1.14.0 to 1.17.7.
- [Release notes](https://github.com/mongodb/mongo-go-driver/releases)
- [Commits](https://github.com/mongodb/mongo-go-driver/compare/v1.14.0...v1.17.7)

---
updated-dependencies:
- dependency-name: go.mongodb.org/mongo-driver
  dependency-version: 1.17.7
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-06-20 10:22:56 +00:00
Cameron Baird
730307f32c factory: Default to normal sandbox boot path when factory init not done
The behavior we had before was that a starting k8s pod
sees enable_template=true and therefore:

1. Tries NewFactory with fetchOnly=true
2. When that fails (because template.Fetch fails to find the artifacts),
   we retry with fetchOnly=false. This creates a direct factory
   which creates the template from scratch
   (hence we pay a full pod sandbox boot time here)
   and then restores from that. Hence the boot times
   are strictly worse on this path.

Now, even when enable_template=true, we don't try to force a direct factory.
Instead we just revert to the standard sandbox boot path.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2026-06-19 18:00:02 +00:00
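
The new control flow in 730307f32c can be sketched in Python. All function names and the use of `FileNotFoundError` to signal missing template artifacts are assumptions; the Go runtime's actual API is not shown in this log.

```python
def boot_sandbox(enable_template, fetch_template, boot_from_template, boot_normally):
    """Use the template only if it already exists; never build one inline."""
    if not enable_template:
        return boot_normally()
    try:
        template = fetch_template()  # fetch-only: does not create a template
    except FileNotFoundError:
        # Factory init has not been done: fall back to the standard boot
        # path instead of paying for template creation plus restore.
        return boot_normally()
    return boot_from_template(template)


if __name__ == "__main__":
    def missing_template():
        raise FileNotFoundError("template artifacts not found")

    path = boot_sandbox(
        enable_template=True,
        fetch_template=missing_template,
        boot_from_template=lambda t: f"restored from {t}",
        boot_normally=lambda: "normal boot",
    )
    print(path)
```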
Cameron Baird
65a5f272f8 ci: Introduce tests for VM template factory
Add k8s-vm-templating-test.bats which exercises pod create
with the factory initialized on the target node.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2026-06-19 18:00:02 +00:00
Cameron Baird
c0f9744225 runtime: Implement support for VM Template factory in clh
Add support for VM Template factory on the clh path.

In order to support snapshot/restore-based VM templating,
the following changes were needed:
1. For clh.go, implement SaveVM, PauseVM, restoreVM, ResumeVM
2. Remove initrd config check for VM Templating path. The
        root disk image (when using image mode) is created in memory
        and therefore captured in the VM snapshot.
3. Truncate the memory file to the size of the VM at factory VM
        create time. This allows CLH to use the memory file
        as the backing for the template VM memory, allowing O(1)
        snapshot times.
4. CLH uses memory zones as backing for its memory on the template paths
5. Update StartVM in CLH to use the restore path when template is
        configured and available

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2026-06-19 18:00:02 +00:00