mirror of
https://github.com/kata-containers/kata-containers.git
synced 2026-04-30 21:03:49 +00:00
When using the multi-layer EROFS snapshotter, the destroy() method
fails to kill container processes, causing process leaks in shared
PID namespace scenarios.

Problem Background:
1. Multi-layer EROFS creates temporary mount points under the
   container's root directory:
   - /run/kata-containers/<cid>/multi-layer/upper (ext4, writable)
   - /run/kata-containers/<cid>/multi-layer/lower-0 (EROFS, read-only)
2. The original destroy() method executed in this order:
   (1) umount rootfs
   (2) fs::remove_dir_all(&self.root) <- FAILS with "Read-only file system"
   (3) cgroup cleanup and process killing <- NEVER EXECUTED
3. When remove_dir_all() encounters the read-only EROFS mount point,
   it returns an EROFS error (os error 30), causing destroy() to exit
   early without killing processes.

Why This Fix:
1. The test case k8s-kill-all-process-in-container.bats creates an
   init container with a background process (tail -f /dev/null) and
   expects it to be killed when the init container is destroyed.
2. With a shared PID namespace (shareProcessNamespace: true), the
   orphaned process continues running, causing the test to fail.

Solution:
1. Reorder the destroy() method to kill processes BEFORE attempting
   to remove the container directory:
   (1) Get PIDs from cgroup and send SIGKILL
   (2) Destroy cgroup
   (3) umount rootfs
   (4) fs::remove_dir_all(&self.root)
2. This ensures processes are always killed regardless of filesystem
   cleanup status, matching the behavior of the overlayfs snapshotter.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
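
Illustrative sketch (not the actual agent code): the reordering described
under "Solution" can be modelled as below. The ContainerOps trait, the
Recorder type and the PIDs 101/102 are hypothetical names introduced only
for this example; the real destroy() works on the container's cgroup
manager and mounts directly. The sketch assumes that a failed kill is
ignored and that directory removal is the step that fails with os error 30.

```rust
use std::io;

/// Hypothetical abstraction over the side effects of destroy().
/// This is an assumption made for illustration, not the kata-agent API.
trait ContainerOps {
    fn cgroup_pids(&mut self) -> io::Result<Vec<i32>>;
    fn kill(&mut self, pid: i32) -> io::Result<()>;
    fn destroy_cgroup(&mut self) -> io::Result<()>;
    fn umount_rootfs(&mut self) -> io::Result<()>;
    fn remove_root_dir(&mut self) -> io::Result<()>;
}

/// Reordered teardown: processes are killed before any filesystem
/// cleanup, so a failure in remove_root_dir() can no longer skip the kill.
fn destroy<O: ContainerOps>(ops: &mut O) -> io::Result<()> {
    // (1) Get PIDs from the cgroup and send SIGKILL to each of them.
    for pid in ops.cgroup_pids()? {
        // Assumption: a process that already exited is not an error.
        let _ = ops.kill(pid);
    }
    // (2) Destroy the cgroup.
    ops.destroy_cgroup()?;
    // (3) Unmount the rootfs.
    ops.umount_rootfs()?;
    // (4) Remove the container directory; this may still fail on a
    //     read-only mount point, but the processes are already gone.
    ops.remove_root_dir()
}

/// Test double that records the call order and simulates the
/// read-only EROFS mount point under the container root.
#[derive(Default)]
struct Recorder {
    steps: Vec<&'static str>,
    killed: Vec<i32>,
}

impl ContainerOps for Recorder {
    fn cgroup_pids(&mut self) -> io::Result<Vec<i32>> {
        Ok(vec![101, 102]) // made-up PIDs
    }
    fn kill(&mut self, pid: i32) -> io::Result<()> {
        self.steps.push("kill");
        self.killed.push(pid);
        Ok(())
    }
    fn destroy_cgroup(&mut self) -> io::Result<()> {
        self.steps.push("destroy_cgroup");
        Ok(())
    }
    fn umount_rootfs(&mut self) -> io::Result<()> {
        self.steps.push("umount_rootfs");
        Ok(())
    }
    fn remove_root_dir(&mut self) -> io::Result<()> {
        self.steps.push("remove_root_dir");
        // Errno 30 is EROFS ("Read-only file system") on Linux.
        Err(io::Error::from_raw_os_error(30))
    }
}
```

Calling destroy(&mut Recorder::default()) still returns the directory
removal error, but only after both recorded PIDs have been killed and
the cgroup has been destroyed.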