This serializes CH API calls to avoid a race condition where deleting a pod would hang indefinitely and leak both the shim and CH processes. The race happened because the CRI can send multiple shutdown requests for the same pod, but the CH socket wasn't guarded against concurrent use, so HTTP responses could interleave on the shutdown path (see below), leading to an error.

This would repro in <15 iterations (sometimes 2-3) using a 2-container pod. With this commit, I haven't observed a repro in 200+ iterations.

Fixes: #12858

ORIGINAL REPRO:

```bash
while true; do
    kubectl apply -f busybox.yaml
    kubectl wait --for=condition=ready po busybox
    kubectl exec busybox -- echo foo
    kubectl delete po busybox
done
```

ORIGINAL ERROR (note the second `HTTP/1.1 200` response interleaved into the JSON body of the first):

```
Apr 17 20:15:54 kata[2297383]: Failed to stop process, process = ContainerProcess { container_id: ContainerID { container_id: "d4eb8984d630111bbf808c7ea30b7a21274c0193cdb8d501d20e4f26a0a69151" }, exec_id: "", process_type: Container }, err = failed to update_mem_resource

Caused by:
    0: resize memory
    1: get vminfo
    2: failed to serde {"config":{"cpus":{"boot_vcpus":1,"max_vcpus":32,"topology":{"threads_per_core":1,"cores_per_die":32,"dies_per_package":1,"packages":1},"kvm_hyperv":false,"max_phys_bits":46,"affinity":null,"features":{"amx":false},"nested":null},"memory":{"size":2147483648,"mergeable":false,"hotplug_method":"Acpi","hotplug_size":132024107008,"hotplugged_size":null,"shared":true,"hugepages":false,"hugepage_size":null,"prefault":false,"zones":null,"thp":true},"payload":{"firmware":null,"kernel":"/usr/share/cloud-hypervisor/vmlinux.bin","cmdline":"reboot=k panic=1 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service agent.log_vport=1025 console=ttyS0,115200n8 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 no_timer_check noreplace-smp systemd.log_target=console agent.container_pipe_size=1 agent.log=debug cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1","initramfs":null},"rate_limit_groups":null,"disks":[{"path":"/usr/share/kata-containers/kata-containers.img","readonly":true,"direct":false,"iommu":false,"num_queues":1,"queue_size":128,"vhost_user":false,"vhost_socket":null,"rate_limit_group":null,"rate_limiter_config":null,"id":"_disk0","disable_io_uring":false,"disable_aio":false,"pci_segment":0,"serial":null,"queue_affinity":null,"backing_files":false}],"net":[{"tap":null,"ip":"192.168.249.1","mask":"255.255.255.0","mac":"9e:7e:13:ee:03:5c","host_mac":null,"mtu":null,"iommu":false,"num_queues":2,"queue_size":256,"vhost_user":false,"vhost_socket":null,"vhost_mode":"Client","id":"_net1","fds":[-1],"rate_limiter_config":null,"pci_segment":0,"offload_tso":true,"offload_ufo":true,"offload_csum":true}],"rng":{"src":"/dev/urandom","iommu":false},"balloon":null,"fs":[{"tag":"kataShared","socket":"/run/kata/e1ae0a05f575a13a535aa95a9990d1fded4766a759f76be0e528c7912d3a5e39/root/virtiofsd.sock","num_queues":1,"queue_size":1024,"id":"_fs2","pci_segment":0}],"pmem":null:"/run/kata/e1ae0a05f575a13a535aa95a9990d1fded4766a759f76be0e528c7912d3a5e39/ch-vm.sock","iommu":false,"id":"_vsock3","pci_segment":0},"pvpanic":false,"iommu":false,"numa":null,"watchdog":false,"pci_segments":null,"platform":null,"tpm":null,"landlock_enabl"index":0,"base":3891789824,"size":524288,"type_":"Mmio32","prefetchable":false}}],"parent":null,"children":["_disk0"],"pci_bdf":"0000:00:01.0"},"_virtio-pci-_vsock3":{"id":"_virtio-pci-_vsock3","resources":[{"PciBar":{"index":0,"base":70367622201344,"sizee":false}}],"parent":null,"children":["_fs2"],"pci_bdf":"0000:00:04.0"},"_vsock3":{"id":"_vsock3","resources":[],"parent":"_virtio-pci-_vsock3","children":[],"pci_bdf":null},"_net1":{"id":"_net1","resources":[],"parent":"_virtio-pci-_net1","children":[],"presources":[{"PciBar":{"index":0,"base":70367623774208,"size":524288,"type_":"Mmio64","prefetchable":false}}],"parent":null,"children":["_net1"],"pci_bdf":"0000:00:02.0"},"_virtio-pci-__rng":{"id":"_virtio-pci-__rng","resources":[{"PciBar":{"index":0,"baseesources":[],"parent":null,"children":[],"pci_bdf":null}}}HTTP/1.1 200 Server: Cloud Hypervisor API Connection: keep-alive Content-Type: application/json Content-Length: 4285 {"config":{"cpus":{"boot_vcpus":1,"max_vcpus":32,"topology":{"threads_per_core":1,"cores_per_die":32,"dies_per_package":1,"packagesepage_size":null,"prefault":false,"zones":null,"thp":true},"payload":{"firmware":null,"kernel":"/usr/share/cloud-hypervisor/vmlinux.bin","cmdline":"reboot=k panic=1 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service agent.log_vport=1025 console=ttyS0,115200n8 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 no_timer_check noreplace-smp systemd.log_target=console agent.container_pipe_size=1 agent.log=debug cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1","miter_config":null,"id":"_disk0","disable_io_uring":false,"disable_aio":false,"pci_segment":0,"serial":null,"queue_affinity":null,"backing_files":false}],"net":[{"tap":null,"ip":"192.168.249.1","mask":"255.255.255.0","mac":"9e:7e:13:ee:03:5c","host_mac":nu,"serial":{"file":null,"mode":"Tty","iommu":false,"socket":null},"console":{"file":null,"mode":"Off","iommu":false,"socket":null},"debug_console":{"file":null,"mode":"Off","iobase":233},"devices":[],"user_devices":null,"vdpa":null,"vsock":{"cid":3,"socket"
    3: expected `,` or `}` at line 1 column 1924

Stack backtrace:
   0: <E as anyhow::context::ext::StdError>::ext_context
   1: anyhow::context::<impl anyhow::Context<T,E> for core::result::Result<T,E>>::with_context
   2: <hypervisor::ch::CloudHypervisor as hypervisor::Hypervisor>::resize_memory::{{closure}}
   3: resource::manager_inner::ResourceManagerInner::update_linux_resource::{{closure}}
   4: virt_container::container_manager::container::Container::stop_process::{{closure}}
   5: virt_container::container_manager::process::Process::run_io_wait::{{closure}}::{{closure}}
   6: tokio::runtime::task::core::Core<T,S>::poll
   7: tokio::runtime::task::harness::Harness<T,S>::poll
   8: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run
  10: tokio::runtime::context::scoped::Scoped<T>::set
  11: tokio::runtime::context::runtime::enter_runtime
  12: tokio::runtime::scheduler::multi_thread::worker::run
  13: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  14: tokio::runtime::task::core::Core<T,S>::poll
  15: tokio::runtime::task::harness::Harness<T,S>::poll
  16: tokio::runtime::blocking::pool::Inner::run
  17: std::sys::backtrace::__rust_begin_short_backtrace
  18: core::ops::function::FnOnce::call_once{{vtable.shim}}
  19: std::sys::thread::unix::Thread::new::thread_start
  20: <unknown>
  21: <unknown>
```

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
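The serialization idea can be sketched with a plain `std::sync::Mutex` guarding a fake client (the actual fix guards the CH socket in async code; `ApiClient`, its log, and `serialized_calls` are illustrative stand-ins, not Kata types):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the CH API client: each call writes a request
// and reads a response over one shared connection, so the pair must never
// interleave with another caller's pair.
struct ApiClient {
    log: Vec<String>,
}

impl ApiClient {
    fn call(&mut self, name: &str) {
        self.log.push(format!("request:{name}"));
        self.log.push(format!("response:{name}"));
    }
}

// Run `n` concurrent callers, serializing every call through one mutex,
// and return the combined request/response log.
fn serialized_calls(n: usize) -> Vec<String> {
    let client = Arc::new(Mutex::new(ApiClient { log: Vec::new() }));
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let client = Arc::clone(&client);
            thread::spawn(move || {
                // Holding the lock for the whole round trip is the point of
                // the fix: no two callers touch the socket at the same time.
                client.lock().unwrap().call(&format!("vm.info-{i}"));
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    Arc::try_unwrap(client).ok().unwrap().into_inner().unwrap().log
}

fn main() {
    let log = serialized_calls(4);
    // Every request is immediately followed by its matching response.
    for pair in log.chunks(2) {
        assert_eq!(pair[0].replace("request", "response"), pair[1]);
    }
    println!("no interleaving across {} calls", log.len() / 2);
}
```

Without the lock held across the full round trip, two callers could each write a request before either read its response, which is exactly the interleaved-body failure shown in the log above.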
# runtime-rs
## What is runtime-rs
runtime-rs is a core component of Kata Containers 4.0. It is a high-performance, Rust-based implementation of the containerd shim v2 runtime.
Key characteristics:
- Implementation Language: Rust, leveraging memory safety and zero-cost abstractions
- Project Maturity: Production-ready component of Kata Containers 4.0
- Architectural Design: Modular framework optimized for Kata Containers 4.0
For architecture details, see Architecture Overview.
## Architecture Overview
Key features:
- Built-in VMM (Dragonball): Deeply integrated into shim lifecycle, eliminating IPC overhead for peak performance
- Asynchronous I/O: Tokio-based async runtime for high-concurrency with reduced thread footprint
- Extensible Framework: Pluggable hypervisors, network interfaces, and storage backends
- Resource Lifecycle Management: Comprehensive sandbox and container resource management
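The extensible-framework point can be pictured with trait objects; the trait and type names below are illustrative assumptions, not the real `hypervisor` crate API (which is async and far richer):

```rust
// Illustrative sketch of pluggable hypervisors behind one trait.
trait Hypervisor {
    fn name(&self) -> &'static str;
    fn start_vm(&self) -> Result<(), String>;
}

struct Dragonball;
struct CloudHypervisor;

impl Hypervisor for Dragonball {
    fn name(&self) -> &'static str { "dragonball" }
    fn start_vm(&self) -> Result<(), String> { Ok(()) }
}

impl Hypervisor for CloudHypervisor {
    fn name(&self) -> &'static str { "cloud-hypervisor" }
    fn start_vm(&self) -> Result<(), String> { Ok(()) }
}

// The shim can then pick an implementation from configuration at runtime.
fn select(name: &str) -> Option<Box<dyn Hypervisor>> {
    match name {
        "dragonball" => Some(Box::new(Dragonball)),
        "cloud-hypervisor" => Some(Box::new(CloudHypervisor)),
        _ => None,
    }
}

fn main() {
    let hv = select("dragonball").expect("unknown hypervisor");
    hv.start_vm().unwrap();
    println!("started VM with {}", hv.name());
}
```

The same dynamic-dispatch pattern applies to the network and storage backends mentioned above.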
## Crates
| Crate | Description |
|---|---|
| `shim` | Containerd shim v2 entry point (`start`, `delete`, `run` commands) |
| `service` | Services including `TaskService` for the containerd shim protocol |
| `runtimes` | Runtime handlers: `VirtContainer` (default), `LinuxContainer` (experimental), `WasmContainer` (experimental) |
| `resource` | Resource management: network, share_fs, rootfs, volume, cgroups, cpu_mem |
| `hypervisor` | Hypervisor implementations |
| `agent` | Guest agent communication (`KataAgent`) |
| `persist` | State persistence to disk (JSON format) |
| `shim-ctl` | Development tool for testing the shim without containerd |
### shim
Entry point implementing containerd shim v2 binary protocol:
- `start`: start a new shim process
- `delete`: delete an existing shim process
- `run`: run the ttRPC service
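The three subcommands can be pictured as a simple dispatch; this is an illustrative sketch, not the real binary's argument handling (which follows containerd's shim v2 flag conventions):

```rust
use std::env;

// Hypothetical dispatch over the documented subcommand names.
fn dispatch(cmd: &str) -> Result<&'static str, String> {
    match cmd {
        "start" => Ok("spawned new shim process"),
        "delete" => Ok("cleaned up shim process"),
        "run" => Ok("serving ttRPC"),
        other => Err(format!("unknown subcommand: {other}")),
    }
}

fn main() {
    let cmd = env::args().nth(1).unwrap_or_else(|| "run".to_string());
    match dispatch(&cmd) {
        Ok(msg) => println!("{msg}"),
        Err(e) => eprintln!("{e}"),
    }
}
```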
### service
Extensible service framework. Currently implements `TaskService`, which conforms to the containerd shim protocol.
### runtimes
Runtime handlers manage sandbox and container operations:
| Handler | Feature Flag | Description |
|---|---|---|
| `VirtContainer` | `virt` (default) | Virtual machine-based containers |
| `LinuxContainer` | `linux` | Linux container runtime (experimental) |
| `WasmContainer` | `wasm` | WebAssembly runtime (experimental) |
### resource
All resources are abstracted uniformly:
- Sandbox resources: network, share-fs
- Container resources: rootfs, volume, cgroup
Sub-modules: `cpu_mem`, `cdi_devices`, `coco_data`, `network`, `share_fs`, `rootfs`, `volume`
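The uniform abstraction can be sketched as a common lifecycle trait; every name below is an assumption for illustration, not the crate's actual API:

```rust
// Illustrative: sandbox and container resources share one lifecycle.
trait Resource {
    fn name(&self) -> &'static str;
    fn setup(&mut self) -> Result<(), String>;
    fn cleanup(&mut self) -> Result<(), String>;
}

struct Rootfs { mounted: bool }
struct Network { up: bool }

impl Resource for Rootfs {
    fn name(&self) -> &'static str { "rootfs" }
    fn setup(&mut self) -> Result<(), String> { self.mounted = true; Ok(()) }
    fn cleanup(&mut self) -> Result<(), String> { self.mounted = false; Ok(()) }
}

impl Resource for Network {
    fn name(&self) -> &'static str { "network" }
    fn setup(&mut self) -> Result<(), String> { self.up = true; Ok(()) }
    fn cleanup(&mut self) -> Result<(), String> { self.up = false; Ok(()) }
}

// A manager drives heterogeneous resources through one interface,
// tearing them down in reverse order on sandbox shutdown.
fn run_lifecycle(resources: &mut Vec<Box<dyn Resource>>) -> Vec<&'static str> {
    let mut order = Vec::new();
    for r in resources.iter_mut() {
        r.setup().unwrap();
        order.push(r.name());
    }
    for r in resources.iter_mut().rev() {
        r.cleanup().unwrap();
    }
    order
}

fn main() {
    let mut rs: Vec<Box<dyn Resource>> =
        vec![Box::new(Network { up: false }), Box::new(Rootfs { mounted: false })];
    println!("set up: {:?}", run_lifecycle(&mut rs));
}
```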
### hypervisor
Supported hypervisors:
| Hypervisor | Mode | Description |
|---|---|---|
| Dragonball | Built-in | Integrated VMM for peak performance (default) |
| QEMU | External | Full-featured emulator |
| Cloud Hypervisor | External | Modern VMM (x86_64, aarch64) |
| Firecracker | External | Lightweight microVM |
| Remote | External | Remote hypervisor |
The built-in VMM mode (Dragonball) is recommended for production, offering superior performance by eliminating IPC overhead.
### agent
Communication with guest OS agent via ttRPC. Supports KataAgent for full container lifecycle management.
### persist
State serialization to disk for sandbox recovery after restart. Stores `state.json` under `/run/kata/<sandbox-id>/`.
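The persist flow can be sketched as follows; the real crate serializes with serde and a richer schema, so the `SandboxState` fields and `save` helper here are purely illustrative (and a temp dir stands in for `/run/kata` so the sketch runs unprivileged):

```rust
use std::fs;
use std::path::PathBuf;

// Illustrative snapshot of sandbox state; not the actual on-disk schema.
struct SandboxState {
    sandbox_id: String,
    vm_pid: u32,
}

// Mirrors the documented layout: <dir>/<sandbox-id>/state.json
fn state_path(dir: &str, sandbox_id: &str) -> PathBuf {
    PathBuf::from(dir).join(sandbox_id).join("state.json")
}

// Write the state as JSON so a restarted shim could read it back.
fn save(dir: &str, s: &SandboxState) -> std::io::Result<PathBuf> {
    let path = state_path(dir, &s.sandbox_id);
    fs::create_dir_all(path.parent().unwrap())?;
    let json = format!(
        "{{\"sandbox_id\":\"{}\",\"vm_pid\":{}}}",
        s.sandbox_id, s.vm_pid
    );
    fs::write(&path, json)?;
    Ok(path)
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("kata-persist-demo");
    let state = SandboxState { sandbox_id: "demo".into(), vm_pid: 4242 };
    let path = save(dir.to_str().unwrap(), &state)?;
    println!("wrote {}", path.display());
    Ok(())
}
```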
## Build from Source and Install
### Prerequisites
Download Rustup and install Rust. For the required Rust version, see `languages.rust.meta.newest-version` in `versions.yaml`.
Example for x86_64:
```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
rustup install ${RUST_VERSION}
rustup default ${RUST_VERSION}-x86_64-unknown-linux-gnu
```
### Musl Support (Optional)
For a fully static binary:
```bash
# Add musl target
rustup target add x86_64-unknown-linux-musl

# Install musl libc (example: musl 1.2.3)
curl -O https://git.musl-libc.org/cgit/musl/snapshot/musl-1.2.3.tar.gz
tar vxf musl-1.2.3.tar.gz
cd musl-1.2.3/
./configure --prefix=/usr/local/
make && sudo make install
```
### Install Kata 4.0 Rust Runtime Shim
```bash
git clone https://github.com/kata-containers/kata-containers.git
cd kata-containers/src/runtime-rs
make && sudo make install
```
After installation:
- Config file: `/usr/share/defaults/kata-containers/configuration.toml`
- Binary: `/usr/local/bin/containerd-shim-kata-v2`
### Install Without Built-in Dragonball VMM
To build without the built-in Dragonball hypervisor:
```bash
make USE_BUILTIN_DB=false
```
Specify the hypervisor during installation:
```bash
sudo make install HYPERVISOR=qemu
# or
sudo make install HYPERVISOR=cloud-hypervisor
```
## Configuration
Configuration files live in `config/`:
| Config File | Hypervisor | Notes |
|---|---|---|
| `configuration-dragonball.toml.in` | Dragonball | Built-in VMM |
| `configuration-qemu-runtime-rs.toml.in` | QEMU | Default external |
| `configuration-cloud-hypervisor.toml.in` | Cloud Hypervisor | Modern VMM |
| `configuration-rs-fc.toml.in` | Firecracker | Lightweight microVM |
| `configuration-remote.toml.in` | Remote | Remote hypervisor |
| `configuration-qemu-tdx-runtime-rs.toml.in` | QEMU + TDX | Intel TDX confidential computing |
| `configuration-qemu-snp-runtime-rs.toml.in` | QEMU + SEV-SNP | AMD SEV-SNP confidential computing |
| `configuration-qemu-se-runtime-rs.toml.in` | QEMU + SE | IBM Secure Execution confidential computing |
| `configuration-qemu-coco-dev-runtime-rs.toml.in` | QEMU + CoCo | CoCo development |
See runtime configuration for configuration options.
## Logging
See Developer Guide - Troubleshooting.
## Debugging
For development, use `shim-ctl` to test the shim without containerd dependencies.
## Limitations
See Limitations for details.