This serializes CH API calls to avoid a race condition where deleting a pod would hang indefinitely and leak both the shim and CH processes. The race happened because the CRI can send multiple shutdown requests for the same pod, however the CH socket wasn't guarded against concurrent usage, hence it was possible that HTTP responses would interleave (see below) on the shutdown path, leading to an error. This would repro in <15 iterations (sometime 2-3) using a 2-container pod. With this commit, I haven't observed a repro in 200+ iterations. Fixes: #12858 ORIGINAL REPRO: while true; do kubectl apply -f busybox.yaml kubectl wait --for=condition=ready po busybox kubectl exec busybox -- echo foo kubectl delete po busybox done ORIGINAL ERROR: Apr 17 20:15:54 kata[2297383]: Failed to stop process, process = ContainerProcess { container_id: ContainerID { container_id: "d4eb8984d630111bbf808c7ea30b7a21274c0193cdb8d501d20e4f26a0a69151" }, exec_id: "", process_type: Container }, err = failed to update_mem_resource Caused by: 0: resize memory 1: get vminfo 2: failed to serde {"config":{"cpus":{"boot_vcpus":1,"max_vcpus":32,"topology":{"threads_per_core":1,"cores_per_die":32,"dies_per_package":1,"packages":1},"kvm_hyperv":false,"max_phys_bits":46,"affinity":null,"features":{"amx":false},"nested":null},"memory":{"size":2147483648,"mergeable":false,"hotplug_method":"Acpi","hotplug_size":132024107008,"hotplugged_size":null,"shared":true,"hugepages":false,"hugepage_size":null,"prefault":false,"zones":null,"thp":true},"payload":{"firmware":null,"kernel":"/usr/share/cloud-hypervisor/vmlinux.bin","cmdline":"reboot=k panic=1 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service agent.log_vport=1025 console=ttyS0,115200n8 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 no_timer_check noreplace-smp systemd.log_target=console agent.container_pipe_size=1 agent.log=debug cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1","initramfs":null},"rate_limit_groups":null,"disks":[{"path":"/usr/share/kata-containers/kata-containers.img","readonly":true,"direct":false,"iommu":false,"num_queues":1,"queue_size":128,"vhost_user":false,"vhost_socket":null,"rate_limit_group":null,"rate_limiter_config":null,"id":"_disk0","disable_io_uring":false,"disable_aio":false,"pci_segment":0,"serial":null,"queue_affinity":null,"backing_files":false}],"net":[{"tap":null,"ip":"192.168.249.1","mask":"255.255.255.0","mac":"9e:7e:13:ee:03:5c","host_mac":null,"mtu":null,"iommu":false,"num_queues":2,"queue_size":256,"vhost_user":false,"vhost_socket":null,"vhost_mode":"Client","id":"_net1","fds":[-1],"rate_limiter_config":null,"pci_segment":0,"offload_tso":true,"offload_ufo":true,"offload_csum":true}],"rng":{"src":"/dev/urandom","iommu":false},"balloon":null,"fs":[{"tag":"kataShared","socket":"/run/kata/e1ae0a05f575a13a535aa95a9990d1fded4766a759f76be0e528c7912d3a5e39/root/virtiofsd.sock","num_queues":1,"queue_size":1024,"id":"_fs2","pci_segment":0}],"pmem":null:"/run/kata/e1ae0a05f575a13a535aa95a9990d1fded4766a759f76be0e528c7912d3a5e39/ch-vm.sock","iommu":false,"id":"_vsock3","pci_segment":0},"pvpanic":false,"iommu":false,"numa":null,"watchdog":false,"pci_segments":null,"platform":null,"tpm":null,"landlock_enabl"index":0,"base":3891789824,"size":524288,"type_":"Mmio32","prefetchable":false}}],"parent":null,"children":["_disk0"],"pci_bdf":"0000:00:01.0"},"_virtio-pci-_vsock3":{"id":"_virtio-pci-_vsock3","resources":[{"PciBar":{"index":0,"base":70367622201344,"sizee":false}}],"parent":null,"children":["_fs2"],"pci_bdf":"0000:00:04.0"},"_vsock3":{"id":"_vsock3","resources":[],"parent":"_virtio-pci-_vsock3","children":[],"pci_bdf":null},"_net1":{"id":"_net1","resources":[],"parent":"_virtio-pci-_net1","children":[],"presources":[{"PciBar":{"index":0,"base":70367623774208,"size":524288,"type_":"Mmio64","prefetchable":false}}],"parent":null,"children":["_net1"],"pci_bdf":"0000:00:02.0"},"_virtio-pci-__rng":{"id":"_virtio-pci-__rng","resources":[{"PciBar":{"index":0,"baseesources":[],"parent":null,"children":[],"pci_bdf":null}}}HTTP/1.1 200 Server: Cloud Hypervisor API Connection: keep-alive Content-Type: application/json Content-Length: 4285 {"config":{"cpus":{"boot_vcpus":1,"max_vcpus":32,"topology":{"threads_per_core":1,"cores_per_die":32,"dies_per_package":1,"packagesepage_size":null,"prefault":false,"zones":null,"thp":true},"payload":{"firmware":null,"kernel":"/usr/share/cloud-hypervisor/vmlinux.bin","cmdline":"reboot=k panic=1 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service agent.log_vport=1025 console=ttyS0,115200n8 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 no_timer_check noreplace-smp systemd.log_target=console agent.container_pipe_size=1 agent.log=debug cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1","miter_config":null,"id":"_disk0","disable_io_uring":false,"disable_aio":false,"pci_segment":0,"serial":null,"queue_affinity":null,"backing_files":false}],"net":[{"tap":null,"ip":"192.168.249.1","mask":"255.255.255.0","mac":"9e:7e:13:ee:03:5c","host_mac":nu,"serial":{"file":null,"mode":"Tty","iommu":false,"socket":null},"console":{"file":null,"mode":"Off","iommu":false,"socket":null},"debug_console":{"file":null,"mode":"Off","iobase":233},"devices":[],"user_devices":null,"vdpa":null,"vsock":{"cid":3,"socket" 3: expected `,` or `}` at line 1 column 1924 Stack backtrace: 0: <E as anyhow::context::ext::StdError>::ext_context 1: anyhow::context::<impl anyhow::Context<T,E> for core::result::Result<T,E>>::with_context 2: <hypervisor::ch::CloudHypervisor as hypervisor::Hypervisor>::resize_memory::{{closure}} 3: resource::manager_inner::ResourceManagerInner::update_linux_resource::{{closure}} 4: virt_container::container_manager::container::Container::stop_process::{{closure}} 5: virt_container::container_manager::process::Process::run_io_wait::{{closure}}::{{closure}} 6: tokio::runtime::task::core::Core<T,S>::poll 7: tokio::runtime::task::harness::Harness<T,S>::poll 8: tokio::runtime::scheduler::multi_thread::worker::Context::run_task 9: tokio::runtime::scheduler::multi_thread::worker::Context::run 10: tokio::runtime::context::scoped::Scoped<T>::set 11: tokio::runtime::context::runtime::enter_runtime 12: tokio::runtime::scheduler::multi_thread::worker::run 13: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll 14: tokio::runtime::task::core::Core<T,S>::poll 15: tokio::runtime::task::harness::Harness<T,S>::poll 16: tokio::runtime::blocking::pool::Inner::run 17: std::sys::backtrace::__rust_begin_short_backtrace 18: core::ops::function::FnOnce::call_once{{vtable.shim}} 19: std::sys::thread::unix::Thread::new::thread_start 20: <unknown> 21: <unknown> Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Kata Containers
Welcome to Kata Containers!
This repository is the home of the Kata Containers code for the 2.0 and newer releases.
If you want to learn about Kata Containers, visit the main Kata Containers website.
Introduction
Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workload isolation and security advantages of VMs.
License
The code is licensed under the Apache 2.0 license. See the license file for further details.
Platform support
Kata Containers currently runs on 64-bit systems supporting the following technologies:
| Architecture | Virtualization technology |
|---|---|
x86_64, amd64 |
Intel VT-x, AMD SVM |
aarch64 ("arm64") |
ARM Hyp |
ppc64le |
IBM Power |
s390x |
IBM Z & LinuxONE SIE |
Hardware requirements
The Kata Containers runtime provides a command to determine if your host system is capable of running and creating a Kata Container:
$ kata-runtime check
Notes:
This command runs a number of checks including connecting to the network to determine if a newer release of Kata Containers is available on GitHub. If you do not wish this to check to run, add the
--no-network-checksoption.By default, only a brief success / failure message is printed. If more details are needed, the
--verboseflag can be used to display the list of all the checks performed.If the command is run as the
rootuser additional checks are run (including checking if another incompatible hypervisor is running). When running asroot, network checks are automatically disabled.
Getting started
See the installation documentation.
Documentation
See the official documentation including:
Configuration
Kata Containers uses a single configuration file which contains a number of sections for various parts of the Kata Containers system including the runtime, the agent and the hypervisor.
Hypervisors
See the hypervisors document and the Hypervisor specific configuration details.
Community
To learn more about the project, its community and governance, see the community repository. This is the first place to go if you wish to contribute to the project.
Getting help
See the community section for ways to contact us.
Raising issues
Please raise an issue in this repository.
Note: If you are reporting a security issue, please follow the vulnerability reporting process
Developers
See the developer guide.
Components
Main components
The table below lists the core parts of the project:
| Component | Type | Description |
|---|---|---|
| runtime | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| runtime-rs | core | The Rust version runtime. |
| agent | core | Management process running inside the virtual machine / POD that sets up the container environment. |
dragonball |
core | An optional built-in VMM brings out-of-the-box Kata Containers experience with optimizations on container workloads |
| documentation | documentation | Documentation common to all components (such as design and install documentation). |
| tests | tests | Excludes unit tests which live with the main code. |
Additional components
The table below lists the remaining parts of the project:
| Component | Type | Description |
|---|---|---|
| packaging | infrastructure | Scripts and metadata for producing packaged binaries (components, hypervisors, kernel and rootfs). |
| kernel | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored here. |
| osbuilder | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| kata-debug | infrastructure | Utility tool to gather Kata Containers debug information from Kubernetes clusters. |
agent-ctl |
utility | Tool that provides low-level access for testing the agent. |
kata-ctl |
utility | Tool that provides advanced commands and debug facilities. |
trace-forwarder |
utility | Agent tracing helper. |
ci |
CI | Continuous Integration configuration files and scripts. |
ocp-ci |
CI | Continuous Integration configuration for the OpenShift pipelines. |
katacontainers.io |
Source for the katacontainers.io site. |
|
Webhook |
utility | Example of a simple admission controller webhook to annotate pods with the Kata runtime class |
Packaging and releases
Kata Containers is now available natively for most distributions.
General tests
See the tests documentation.
Metrics tests
See the metrics documentation.
Glossary of Terms
See the glossary of terms related to Kata Containers.