kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-08-24 02:31:12 +00:00

Author	SHA1	Message	Date
alex.lyn	1a06bd1f08	kata-types: Introduce annotation *_RUNTIME_CREATE_CONTAINTER_TIMEOUT It's used to indicate timeout value set for image pulling in guest during creating container. This allows users to set this timeout with annotation according to the size of image to be pulled. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	f886e82f03	runtime-rs: support setting create_container_timeout It allows users to set this create container timeout within configuration.toml according to the size of image to be pulled inside guest. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	ce524a3958	kata-types: Give a more comprehensive definition of request_timeout_ms To better understand the impact of different timeout values on system behavior, this section provides a more comprehensive explanation of the request_timeout_ms: This timeout value is used to set the maximum duration for the agent to process a CreateContainerRequest. It's also used to ensure that workloads, especially those involving large image pulls within the guest, have sufficient time to complete. Based on the explanation above, it is renamed to `create_container_timeout` and, specifically, exposed in 'configuration.toml'. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
Steve Horsman	f04bb3f34c	Merge pull request #11479 from stevenhorsman/skip-weekly-coco-stability-tests workflows: Skip weekly coco stability tests	2025-06-30 09:05:14 +01:00
Fabiano Fidêncio	b024d8737c	Merge pull request #11481 from fidencio/topic/fix-passing-image-size-alignment build: Allow passing IMAGE_SIZE_ALIGNMENT_MB as an env var	2025-06-30 09:04:39 +02:00
Alex Lyn	69d2c078d1	Merge pull request #11484 from stevenhorsman/bump-nydus-snapshotter-0.15.2 version: Bump nydus-snapshotter	2025-06-30 14:44:01 +08:00
Alex Lyn	e66baf503b	Merge pull request #11474 from Apokleos/remote-annotation runtime-rs: Add GPU annotations for remote hypervisor	2025-06-30 14:05:15 +08:00
Fabiano Fidêncio	8d4e3b47b1	Merge pull request #11470 from fidencio/topic/runtime-rs-fix-odd-memory-size-calculation runtime-rs: Fix calculation of odd memory sizes	2025-06-30 07:26:30 +02:00
Champ-Goblem	91cadb7bfe	runtime-rs: Fix calculation of odd memory sizes An odd memory size leads to the runtime breaking during its startup, as shown below: ``` Warning FailedCreatePodSandBox 34s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox "708c81910f4e67e53b4170b6615083339b220154cb9a0c521b3232cdb40d50f9": failed to create containerd task: failed to create shim task: Others("failed to handle message start sandbox in task handler\n\nCaused by:\n 0: start vm\n 1: set vm base config\n 2: set vm configuration\n 3: Failed to set vm configuration VmConfigInfo { vcpu_count: 2, max_vcpu_count: 16, cpu_pm: \"on\", cpu_topology: CpuTopology { threads_per_core: 1, cores_per_die: 1, dies_per_socket: 1, sockets: 1 }, vpmu_feature: 0, mem_type: \"shmem\", mem_file_path: \"\", mem_size_mib: 4513, serial_path: Some(\"/run/kata/708c81910f4e67e53b4170b6615083339b220154cb9a0c521b3232cdb40d50f9/console.sock\"), pci_hotplug_enabled: true }\n 4: vmm action error: MachineConfig(InvalidMemorySize(4513))\n\nStack backtrace:\n 0: anyhow::error::<impl anyhow::Error>::msg\n 1: hypervisor::dragonball::vmm_instance::VmmInstance::handle_request\n 2: hypervisor::dragonball::vmm_instance::VmmInstance::set_vm_configuration\n 3: hypervisor::dragonball::inner::DragonballInner::set_vm_base_config\n 4: <hypervisor::dragonball::Dragonball as hypervisor::Hypervisor>::start_vm::{{closure}}::{{closure}}\n 5: <hypervisor::dragonball::Dragonball as hypervisor::Hypervisor>::start_vm::{{closure}}\n 6: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::start::{{closure}}::{{closure}}\n 7: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::start::{{closure}}\n 8: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}::{{closure}}\n 9: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}\n 10: <service::task_service::TaskService as 
containerd_shim_protos::shim::shim_ttrpc_async::Task>::create::{{closure}}\n 11: <containerd_shim_protos::shim::shim_ttrpc_async::CreateMethod as ttrpc::asynchronous::utils::MethodHandler>::handler::{{closure}}\n 12: <tokio::time::timeout::Timeout<T> as core::future::future::Future>::poll\n 13: ttrpc::asynchronous::server::HandlerContext::handle_msg::{{closure}}\n 14: <core::future::poll_fn::PollFn<F> as core::future::future::Future>::poll\n 15: <ttrpc::asynchronous::server::ServerReader as ttrpc::asynchronous::connection::ReaderDelegate>::handle_msg::{{closure}}::{{closure}}\n 16: tokio::runtime::task::core::Core<T,S>::poll\n 17: tokio::runtime::task::harness::Harness<T,S>::poll\n 18: tokio::runtime::scheduler::multi_thread::worker::Context::run_task\n 19: tokio::runtime::scheduler::multi_thread::worker::Context::run\n 20: tokio::runtime::context::runtime::enter_runtime\n 21: tokio::runtime::scheduler::multi_thread::worker::run\n 22: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll\n 23: tokio::runtime::task::core::Core<T,S>::poll\n 24: tokio::runtime::task::harness::Harness<T,S>::poll\n 25: tokio::runtime::blocking::pool::Inner::run\n 26: std::sys::backtrace::__rust_begin_short_backtrace\n 27: core::ops::function::FnOnce::call_once{{vtable.shim}}\n 28: std::sys::pal::unix::thread::Thread::new::thread_start") ``` As we cannot control what the users will set, let's just round it up to the next acceptable value. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-28 14:29:18 +02:00
Fabiano Fidêncio	e2b93fff3f	build: Allow passing IMAGE_SIZE_ALIGNMENT_MB as an env var This helps considerably to avoid patching the code, and just adjusting the build environment to use a smaller alignment than the default one. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-28 00:05:20 +02:00
stevenhorsman	fe5d43b4bd	workflows: Skip weekly coco stability tests These tests are not passing or being maintained, so, as discussed at the AC meeting, we will skip them from automatically running until they can be reviewed and reworked, to avoid wasting CI cycles. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-27 16:51:53 +01:00
stevenhorsman	61b12d4e1b	version: Bump nydus-snapshotter Bump to version v0.15.2 to pick up fix to mount source in https://github.com/containerd/nydus-snapshotter/pull/636 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-27 14:04:00 +01:00
RuoqingHe	a43e06e0eb	Merge pull request #11461 from stevenhorsman/bump-guest-components-4cd62c3 versions: Bump guest-components	2025-06-27 10:45:06 +08:00
Saul Paredes	8c57beb943	Merge pull request #11471 from microsoft/saulparedes/fix_kata_monitor_dockerfile tools: kata-monitor: update go version used to build in Dockerfile	2025-06-26 08:37:08 -07:00
Chao Wu	ac928218f3	Merge pull request #11434 from hsiangkao/erofs runtime: improve EROFS snapshotter support	2025-06-26 22:40:48 +08:00
Cameron McDermott	b6cd6e6914	Merge pull request #11469 from fidencio/topic/dragonball-set-default_maxvcpus-to-zero runtime-rs: Set default_maxvcpus to 0	2025-06-26 15:20:21 +01:00
Aurélien Bombo	a1aa3e79d4	Merge pull request #11392 from kata-containers/sprt/zizmor ci: Run zizmor for GHA security analysis	2025-06-26 08:55:22 -05:00
Fupan Li	1ff54a95d2	Merge pull request #11422 from lifupan/memory_hotplug runtime-rs: Add the memory and vcpu hotplug for cloud-hypervisor	2025-06-26 17:56:49 +08:00
Aurélien Bombo	34c8cd810d	ci: Run zizmor for GHA security analysis This runs the zizmor security lint [1] on our GH Actions. The initial workflow uses [2] as a base. [1] https://docs.zizmor.sh/ [2] https://docs.zizmor.sh/usage/#use-in-github-actions Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 10:52:28 +01:00
alex.lyn	e6e4cd91b8	runtime-rs: Enable GPU annotations in remote hypervisor configuration Enable GPU annotations by adding `default_gpus` and `default_gpu_model` into the list of valid annotations `enable_annotations`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:29:36 +08:00
alex.lyn	e5f44fae30	runtime-rs: Add GPU annotations during remote hypervisor preparation Add GPU specific annotations used by remote hypervisor for instance selection during `prepare_vm`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:41 +08:00
alex.lyn	866d3facba	kata-types: Introduce two GPU annotations for remote hypervisor Two annotations, `default_gpus` and `default_gpu_model`, are introduced as GPU annotations for Kata VM configurations to improve instance selection on remote hypervisors. By adding these annotations: (1) `default_gpus`: Allows users to specify the minimum number of GPUs a VM requires. This ensures that the remote hypervisor selects an instance with at least that many GPUs, preventing resource under-provisioning. (2) `default_gpu_model`: Lets users define the specific GPU model needed for the VM. This is crucial for workloads that depend on particular GPU archs or features, ensuring compatibility and optimal performance. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:41 +08:00
alex.lyn	ed0c0b2367	kata-types: Introduce GPU related fields in RemoteInfo To provide the remote hypervisor with the necessary intelligence to select the most appropriate instance for a given GPU instance, leading to better resource allocation, two fields `default_gpus` and `default_gpu_model` are introduced in `RemoteInfo`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:28 +08:00
Alex Lyn	9a1d4fc5d6	Merge pull request #11468 from Apokleos/fix-sharefs-none runtime-rs: Support shared fs with "none" on non-tee platforms	2025-06-26 15:37:44 +08:00
Gao Xiang	9079c8e598	runtime: improve EROFS snapshotter support To better support containerd 2.1 and later versions, remove the hardcoded `layer.erofs` and instead parse `/proc/mounts` to obtain the real mount source (and `/sys/block/loopX/loop/backing_file` if needed). If the mount source doesn't end with `layer.erofs`, it should be marked as unsupported, as it may be a filesystem meta file generated by later containerd versions for the EROFS flattened filesystem feature. Also check whether the filesystem type is `overlay` or not, since the containerd mount manager [1] may change it after being introduced. [1] https://github.com/containerd/containerd/issues/11303 Fixes: `f63ec50ba3` ("runtime: Add EROFS snapshotter with block device support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-06-26 10:12:12 +08:00
Saul Paredes	d53c720ac1	tools: kata-monitor: update go version used to build in Dockerfile Current Dockerfile fails when trying to build from the root of the repo docker build -t kata-monitor -f tools/packaging/kata-monitor/Dockerfile . with "invalid go version '1.23.0': must match format 1.23" Using go 1.23 in the Dockerfile fixes the build error Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-25 15:32:41 -07:00
stevenhorsman	290fda9b97	agent-ctl: Bump image-rs version I noticed that agent-ctl is including a 9 month old version of image-rs and the libs crates haven't been updated for potentially many years, so bump all of these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-25 16:30:58 +01:00
stevenhorsman	c7da62dd1e	versions: Bump guest-components Bump to pick up the new guest-components and matching trustee which use rust 1.85.1 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-25 15:05:07 +01:00
Fabiano Fidêncio	bebe377f0d	runtime-rs: Set default_maxvcpus to 0 Otherwise we just cannot start a container that requests more than 1 vcpu. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-25 14:36:46 +02:00
Steve Horsman	9ff30c6aeb	Merge pull request #11462 from kata-containers/add-scorecard-action ci: Add scorecard action	2025-06-25 12:48:11 +01:00
Fabiano Fidêncio	69c706b570	Merge pull request #11441 from stevenhorsman/protobuf-3.7.2-bump versions: Bump protobuf to 3.7.2	2025-06-25 13:47:28 +02:00
alex.lyn	eae62ca9ac	runtime-rs: Support shared fs with "none" on non-tee platforms This commit introduces the ability to run Pods without shared fs mechanism in Kata. The default shared fs can lead to unnecessary resource consumption and security risks for certain use cases. Specifically, scenarios where files only need to be copied into the VM once at Pod creation (e.g., non-tee envs) and don't require dynamic updates make the shared fs redundant and inefficient. By explicitly disabling shared fs functionality, we reduce resource overhead and shrink the attack surface. Users will need to employ alternative methods(e.g. guest-pull) to ensure container images are shared into the guest VM for these specific scenarios. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-25 17:36:57 +08:00
Fabiano Fidêncio	4719c08184	Merge pull request #11467 from lifupan/fixblockfile runtime-rs: fix the issue return the wrong volume	2025-06-25 09:56:28 +02:00
Fupan Li	48c8e0f296	runtime-rs: fix the issue return the wrong volume The previous commit, 74eccc54e7b31cc4c9abd8b6e4007c3a4c1d4dd4, failed to return the right rootfs volume. In the is_block_rootfs fn, if the rootfs is based on a block device such as devicemapper, it should clear the volume's source and let the device_manager use the dev_id to get the device's host path. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-25 10:02:52 +08:00
Alex Lyn	648fef4f52	Merge pull request #11466 from lifupan/blockfile runtime-rs: add the blockfile based rootfs support	2025-06-25 09:46:54 +08:00
Dan Mihai	2d43b3f9fc	Merge pull request #11424 from katexochen/p/regorus-oras-cache ci/static-checks: use oras cache for regorus	2025-06-24 14:49:00 -07:00
Fupan Li	74eccc54e7	runtime-rs: add the blockfile based rootfs support containerd's Blockfile Snapshotter passes a rootfs mount with a raw file as the mount source and mount options with "loop" embedded. To support this type of rootfs, it is necessary to identify this as a blockfile rootfs through the "loop" flag, and then use the volume source of the rootfs as the source of the block device to hot-insert it into the guest. Fixes:#11464 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 22:31:54 +08:00
Paul Meyer	43739cefdf	ci/static-checks: use oras cache for regorus Instead of building it every time, we can store the regorus binary in OCI registry using oras and download it from there. This reduces the install time from ~1m40s to ~15s. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-24 13:14:18 +02:00
Fupan Li	9bdbd82690	Merge pull request #11181 from Apokleos/initdata-runtime-rs runtime-rs: Implement Initdata Spec Support in runtime-rs for CoCo	2025-06-24 18:59:34 +08:00
Fupan Li	1c59516d72	runtime-rs: add support resize_vcpu for cloud-hypervisor This commit adds support for resize_vcpu for cloud-hypervisor using its vm resize api. It supports both vcpu hotplug and hot unplug. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	a3671b7a5c	runtime-rs: Add the memory hotplug for cloud-hypervisor For cloud-hypervisor, currently only hot plugging of memory is supported, but hot unplugging of memory is not supported. In addition, by default, cloud-hypervisor uses ACPI-based memory hot-plugging instead of virtio-mem based memory hot-plugging. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	7df29605a4	runtime-rs: add the vm resize and get vminfo api for clh Add API interfaces for get vminfo and resize. get vminfo can obtain the memory size and number of vCPUs from the cloud hypervisor vmm in real time. This interface provides information for the subsequent resize memory and vCPU. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	9a51ade4e2	runtime-rs: impl the Deserialize trait for MacAddr The system's own Deserialize cannot implement parsing from string to MacAddr, so we need to implement this trait ourselves. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	ceaae3049c	runtime-rs: move the bytes_to_megs and megs_to_bytes to utils Since those two functions would be used by other hypervisors, thus move them into the utils crate. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
alex.lyn	871465f5d3	kata-agent: Allow unrecognized fields in InitData For flexibility and extensibility, this change modifies the Kata Agent's handling of `InitData` to allow for unrecognized key-value pairs. The `InitData` field now directly utilizes `HashMap<String, String>`, enabling it to carry arbitrary metadata and information that may be consumed by other components. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	afcb042c28	runtime-rs: Specify the initdata to mrconfigid correctly During sandbox preparation, initdata should be specified to TdxConfig, specifically mrconfigid, which is passed to the tdx guest report for measurement. Fixes #11180 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	d6d8497b56	runtime-rs: Add host-data property to sev-snp-guest object SEV-SNP guest configuration utilizes a different set of properties compared to the existing 'sev-guest' object. This change introduces the `host-data` property within the sev-snp-guest object. This property allows for configuring an SEV-SNP guest with host-provided data, which is crucial for data integrity verification during attestation. The `host-data` property is specifically valid for SEV-SNP guests running on a capable platform. It is configured as a base64-encoded string when using the sev-snp-guest object. the example cmdline looks like: ```shell -object sev-snp-guest,id=sev-snp0,host-data=CGNkCHoBC5CcdGXir... ``` Fixes #11180 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	4a4361393c	runtime-rs: Introduce host-data in SevSnpConfig for validation To facilitate the transfer of initdata generated during `prepare_initdata_device_config`, a new parameter has been introduced into the `prepare_protection_device_config` function. Furthermore, to specifically pass initdata to SEV-SNP Guests, a `host_data` field has been added to the `SevSnpConfig` structure. However, this field is exclusively applicable to the SEV-SNP platform. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	5c8170dbb9	runtime-rs: Handle initdata block device config during sandbox start Retrieve the Initdata string content from the security_info of the Configuration. Based on the Protection Platform type, calculate the digest of the Initdata. Write the Initdata content to the block device. Subsequently, construct the BlockConfig based on this block device information. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	6ea1494701	runtime-rs: Add InitData Resource type for block device management To correctly manage initdata as a block device, a new InitData Resource type, inherently a block device, has been introduced within the ResourceManager. As a component of the Sandbox's resources, this InitData Resource needs to be appropriately handled by the Device Manager's handler. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
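
The `runtime-rs: Fix calculation of odd memory sizes` commit (91cadb7bfe) says an unacceptable size such as 4513 MiB is rounded up to the next acceptable value. A minimal sketch of that kind of rounding follows. It is not the code from the commit: the function name `round_up_mib` is hypothetical, and the 2 MiB alignment in the usage note is an assumption inferred only from the word "odd" in the commit message.

```rust
/// Round `mem_mib` up to the next multiple of `align_mib`.
///
/// Hypothetical helper for illustration; the alignment the hypervisor
/// actually requires is an assumption and is passed in by the caller.
/// Returns `None` if `align_mib` is zero or the result would overflow.
fn round_up_mib(mem_mib: u32, align_mib: u32) -> Option<u32> {
    if align_mib == 0 {
        return None;
    }
    let rem = mem_mib % align_mib;
    if rem == 0 {
        // Already aligned, nothing to do.
        Some(mem_mib)
    } else {
        // Add the missing remainder, guarding against overflow.
        mem_mib.checked_add(align_mib - rem)
    }
}
```

Under the assumed 2 MiB alignment, `round_up_mib(4513, 2)` yields `Some(4514)`, while an already even size such as 4096 is returned unchanged.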

16313 Commits