diff --git a/docs/how-to/how-to-pull-images-in-guest-with-kata.md b/docs/how-to/how-to-pull-images-in-guest-with-kata.md index 56f057ffbd..7e1918e24e 100644 --- a/docs/how-to/how-to-pull-images-in-guest-with-kata.md +++ b/docs/how-to/how-to-pull-images-in-guest-with-kata.md @@ -137,7 +137,7 @@ snapshotter = "nydus" $ sudo systemctl restart containerd ``` -## Verification +## Run a pod in Kata Containers with image pulling in the guest To verify pulling images in a guest VM, please refer to the following commands: @@ -148,8 +148,6 @@ apiVersion: v1 kind: Pod metadata: name: busybox - annotations: - io.containerd.cri.runtime-handler: kata-qemu spec: runtimeClassName: kata-qemu containers: @@ -163,9 +161,6 @@ NAME READY STATUS RESTARTS AGE busybox 1/1 Running 0 10s ``` -> **Notes:** -> The `CRI Runtime Specific Snapshotter` is still an experimental feature. To pull images in the guest under the specific kata runtime (such as `kata-qemu`), we need to add the following annotation in metadata to each pod yaml: `io.containerd.cri.runtime-handler: kata-qemu`. By adding the annotation, we can ensure that the feature works as expected. - 2. Verify that the pod's images have been successfully downloaded in the guest. If images intended for deployment are deleted prior to deploying with `nydus snapshotter`, the root filesystems required for the pod's images (including the pause image and the container image) should not be present on the host. ```bash @@ -173,4 +168,145 @@ $ sandbox_id=$(ps -ef| grep containerd-shim-kata-v2| grep -oP '(?<=-id\s)[a-f0-9 $ rootfs_count=$(find /run/kata-containers/shared/sandboxes/$sandbox_id -name rootfs -type d| grep -o "rootfs" | wc -l) $ echo $rootfs_count 0 +``` + +## Run a pod in Kata Containers with a large image pulled in the guest + +Currently, the image pulled in the guest will be downloaded and unpacked in the `/run/kata-containers/image` directory. 
However, by default, in the rootfs-confidential image, systemd allocates 50% of the available physical RAM to the `/run` directory using a `tmpfs` filesystem. Memory is a scarce resource, especially for confidential containers. This means that if we run a kata container with the default configuration (where the default memory assigned to a VM is 2048 MiB), `/run` is allocated around 1024 MiB. Consequently, we can only pull images up to about 1024 MiB in the guest. To lift this limit, we can attach a block volume from the host and use `dm-crypt` and `dm-integrity` in the guest to encrypt and integrity-protect it, providing a secure place to store downloaded container images. + +### Create block volume with k8s + +Many CSI plugins support block volumes, such as AWS EBS, Azure Disk, and Open-Local. As an example, we use Local Persistent Volumes to expose a local disk as block storage in a k8s cluster. + +1. Create an empty disk image and attach the image to a loop device, such as `/dev/loop0` +```bash +$ loop_file="/tmp/trusted-image-storage.img" +$ sudo dd if=/dev/zero of=$loop_file bs=1M count=2500 +$ sudo losetup /dev/loop0 $loop_file +``` + +2. Create a Storage Class +```yaml +apiVersion: storage.k8s.io/v1 +kind: StorageClass +metadata: + name: local-storage +provisioner: kubernetes.io/no-provisioner +volumeBindingMode: WaitForFirstConsumer +``` + +3. Create Persistent Volume +```yaml +apiVersion: v1 +kind: PersistentVolume +metadata: + name: trusted-block-pv +spec: + capacity: + storage: 10Gi + volumeMode: Block + accessModes: + - ReadWriteOnce + persistentVolumeReclaimPolicy: Retain + storageClassName: local-storage + local: + path: /dev/loop0 + nodeAffinity: + required: + nodeSelectorTerms: + - matchExpressions: + - key: kubernetes.io/hostname + operator: In + values: + - NODE_NAME +``` + +4. 
Create Persistent Volume Claim +```yaml +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: trusted-pvc +spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 1Gi + volumeMode: Block + storageClassName: local-storage +``` + +5. Run a pod that pulls a large image in the guest + +```yaml +apiVersion: v1 +kind: Pod +metadata: + name: large-image-pod +spec: + runtimeClassName: kata-qemu + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: kubernetes.io/hostname + operator: In + values: + - NODE_NAME + volumes: + - name: trusted-storage + persistentVolumeClaim: + claimName: trusted-pvc + containers: + - name: app-container + image: quay.io/confidential-containers/test-images:largeimage + command: ["/bin/sh", "-c"] + args: + - sleep 6000 + volumeDevices: + - devicePath: /dev/trusted_store + name: trusted-storage +``` + +6. Check the Docker image size +```bash +$ docker image ls|grep "largeimage" +quay.io/confidential-containers/test-images largeimage 00bc1f6c893a 4 months ago 2.15GB +``` + +7. Check whether the device is encrypted and in use by entering the VM +```bash +$ lsblk --fs +NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +sda +└─encrypted_disk_GsLDt + 178M 87% /run/kata-containers/image + +$ cryptsetup status encrypted_disk_GsLDt +/dev/mapper/encrypted_disk_GsLDt is active and is in use. 
+ type: LUKS2 + cipher: aes-xts-plain64 + keysize: 512 bits + key location: keyring + device: /dev/sda + sector size: 4096 + offset: 32768 sectors + size: 5087232 sectors + mode: read/write + +$ mount|grep "encrypted_disk_GsLDt" +/dev/mapper/encrypted_disk_GsLDt on /run/kata-containers/image type ext4 + +$ du -h --max-depth=1 /run/kata-containers/image/ +16K /run/kata-containers/image/lost+found +2.1G /run/kata-containers/image/layers +60K /run/kata-containers/image/overlay +2.1G /run/kata-containers/image/ + +$ free -m + total used free shared buff/cache available +Mem: 1989 52 43 0 1893 1904 +Swap: 0 0 0 ``` \ No newline at end of file diff --git a/src/agent/Cargo.lock b/src/agent/Cargo.lock index fd0799f36d..7afdf286a4 100644 --- a/src/agent/Cargo.lock +++ b/src/agent/Cargo.lock @@ -398,11 +398,12 @@ checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0" [[package]] name = "attester" version = "0.1.0" -source = "git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = "git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "anyhow", "async-trait", "base64 0.21.7", + "hex", "kbs-types", "log", "serde", @@ -1156,7 +1157,7 @@ checksum = "7a81dae078cea95a014a339291cec439d2f232ebe854a9d672b796c6afafa9b7" [[package]] name = "crypto" version = "0.1.0" -source = "git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = "git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "aes-gcm", "anyhow", @@ -2462,7 +2463,7 @@ dependencies = [ [[package]] name = "image-rs" version = "0.1.0" -source = 
"git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = "git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "anyhow", "async-compression", @@ -2888,7 +2889,7 @@ dependencies = [ [[package]] name = "kbc" version = "0.1.0" -source = "git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = "git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "anyhow", "async-trait", @@ -2917,7 +2918,7 @@ dependencies = [ [[package]] name = "kbs_protocol" version = "0.1.0" -source = "git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = "git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "anyhow", "async-trait", @@ -3671,7 +3672,7 @@ dependencies = [ [[package]] name = "ocicrypt-rs" version = "0.1.0" -source = "git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = "git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "aes", "anyhow", @@ -4676,7 +4677,7 @@ dependencies = [ [[package]] name = "resource_uri" version = "0.1.0" -source = "git+https://github.com/confidential-containers/guest-components?rev=2c5ac6b01aafcb0be3875f5743c77d654a548146#2c5ac6b01aafcb0be3875f5743c77d654a548146" +source = 
"git+https://github.com/confidential-containers/guest-components?rev=51e967045296570abb4ad8bef215e92323306ed4#51e967045296570abb4ad8bef215e92323306ed4" dependencies = [ "anyhow", "serde", diff --git a/src/agent/Cargo.toml b/src/agent/Cargo.toml index efdf652cd2..b545449b89 100644 --- a/src/agent/Cargo.toml +++ b/src/agent/Cargo.toml @@ -77,7 +77,7 @@ strum = "0.26.2" strum_macros = "0.26.2" # Image pull/decrypt -image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "2c5ac6b01aafcb0be3875f5743c77d654a548146", default-features = false, optional = true } +image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "51e967045296570abb4ad8bef215e92323306ed4", default-features = false, optional = true } # Agent Policy regorus = { version = "0.1.4", default-features = false, features = [ diff --git a/src/agent/README.md b/src/agent/README.md index c5a092d17f..a62e46123d 100644 --- a/src/agent/README.md +++ b/src/agent/README.md @@ -134,6 +134,7 @@ The kata agent has the ability to configure agent options in guest kernel comman | `agent.log_vport` | Log port | Allow to specify the `vsock` port to read logs | integer | `0` | | `agent.no_proxy` | NO proxy | Allow to configure `no_proxy` in the guest | string | `""` | | `agent.passfd_listener_port` | File descriptor passthrough IO listener port | Allow to set the file descriptor passthrough IO listener port | integer | `0` | +| `agent.secure_image_storage_integrity` | Image storage integrity | Allow to use `dm-integrity` to protect the integrity of encrypted block volume | boolean | `false` | | `agent.server_addr` | Server address | Allow the ttRPC server address to be specified | string | `"vsock://-1:1024"` | | `agent.trace` | Trace mode | Allow to static tracing | boolean | `false` | | `systemd.unified_cgroup_hierarchy` | `Cgroup hierarchy` | Allow to setup v2 cgroups | boolean | `false` | diff --git a/src/agent/src/cdh.rs b/src/agent/src/cdh.rs index 
7fe529e471..1ebfac6aa6 100644 --- a/src/agent/src/cdh.rs +++ b/src/agent/src/cdh.rs @@ -10,13 +10,14 @@ use anyhow::Result; use derivative::Derivative; use protocols::{ - sealed_secret, sealed_secret_ttrpc_async, sealed_secret_ttrpc_async::SealedSecretServiceClient, + confidential_data_hub, confidential_data_hub_ttrpc_async, + confidential_data_hub_ttrpc_async::{SealedSecretServiceClient, SecureMountServiceClient}, }; use crate::CDH_SOCKET_URI; // Nanoseconds -const CDH_UNSEAL_TIMEOUT: i64 = 50 * 1000 * 1000 * 1000; +const CDH_API_TIMEOUT: i64 = 50 * 1000 * 1000 * 1000; const SEALED_SECRET_PREFIX: &str = "sealed."; #[derive(Derivative)] @@ -24,26 +25,30 @@ const SEALED_SECRET_PREFIX: &str = "sealed."; pub struct CDHClient { #[derivative(Debug = "ignore")] sealed_secret_client: SealedSecretServiceClient, + #[derivative(Debug = "ignore")] + secure_mount_client: SecureMountServiceClient, } impl CDHClient { pub fn new() -> Result<Self> { let client = ttrpc::asynchronous::Client::connect(CDH_SOCKET_URI)?; let sealed_secret_client = - sealed_secret_ttrpc_async::SealedSecretServiceClient::new(client); - + confidential_data_hub_ttrpc_async::SealedSecretServiceClient::new(client.clone()); + let secure_mount_client = + confidential_data_hub_ttrpc_async::SecureMountServiceClient::new(client); Ok(CDHClient { sealed_secret_client, + secure_mount_client, }) } pub async fn unseal_secret_async(&self, sealed_secret: &str) -> Result<Vec<u8>> { - let mut input = sealed_secret::UnsealSecretInput::new(); + let mut input = confidential_data_hub::UnsealSecretInput::new(); input.set_secret(sealed_secret.into()); let unsealed_secret = self .sealed_secret_client - .unseal_secret(ttrpc::context::with_timeout(CDH_UNSEAL_TIMEOUT), &input) + .unseal_secret(ttrpc::context::with_timeout(CDH_API_TIMEOUT), &input) .await?; Ok(unsealed_secret.plaintext) } @@ -60,6 +65,26 @@ impl CDHClient { Ok((*env.to_owned()).to_string()) } + + pub async fn secure_mount( + &self, + volume_type: &str, + options: 
&std::collections::HashMap<String, String>, + flags: Vec<String>, + mount_point: &str, + ) -> Result<()> { + let req = confidential_data_hub::SecureMountRequest { + volume_type: volume_type.to_string(), + options: options.clone(), + flags, + mount_point: mount_point.to_string(), + ..Default::default() + }; + self.secure_mount_client + .secure_mount(ttrpc::context::with_timeout(CDH_API_TIMEOUT), &req) + .await?; + Ok(()) + } } #[cfg(test)] @@ -69,7 +94,7 @@ mod tests { use crate::cdh::CDH_ADDR; use anyhow::anyhow; use async_trait::async_trait; - use protocols::{sealed_secret, sealed_secret_ttrpc_async}; + use protocols::{confidential_data_hub, confidential_data_hub_ttrpc_async}; use std::sync::Arc; use test_utils::skip_if_not_root; use tokio::signal::unix::{signal, SignalKind}; @@ -77,13 +102,13 @@ mod tests { struct TestService; #[async_trait] - impl sealed_secret_ttrpc_async::SealedSecretService for TestService { + impl confidential_data_hub_ttrpc_async::SealedSecretService for TestService { async fn unseal_secret( &self, _ctx: &::ttrpc::asynchronous::TtrpcContext, - _req: sealed_secret::UnsealSecretInput, - ) -> ttrpc::error::Result<sealed_secret::UnsealSecretOutput> { - let mut output = sealed_secret::UnsealSecretOutput::new(); + _req: confidential_data_hub::UnsealSecretInput, + ) -> ttrpc::error::Result<confidential_data_hub::UnsealSecretOutput> { + let mut output = confidential_data_hub::UnsealSecretOutput::new(); output.set_plaintext("unsealed".into()); Ok(output) } @@ -104,9 +129,9 @@ mod tests { fn start_ttrpc_server() { tokio::spawn(async move { let ss = Box::new(TestService {}) - as Box<dyn sealed_secret_ttrpc_async::SealedSecretService + Send + Sync>; + as Box<dyn confidential_data_hub_ttrpc_async::SealedSecretService + Send + Sync>; let ss = Arc::new(ss); - let ss_service = sealed_secret_ttrpc_async::create_sealed_secret_service(ss); + let ss_service = confidential_data_hub_ttrpc_async::create_sealed_secret_service(ss); remove_if_sock_exist(CDH_ADDR).unwrap(); diff --git a/src/agent/src/config.rs b/src/agent/src/config.rs index 00787f3d9a..acb07dfacc 100644 --- a/src/agent/src/config.rs +++ b/src/agent/src/config.rs @@ -31,6 +31,7 @@ const GUEST_COMPONENTS_REST_API_OPTION: &str = 
"agent.guest_components_rest_api" const GUEST_COMPONENTS_PROCS_OPTION: &str = "agent.guest_components_procs"; #[cfg(feature = "guest-pull")] const IMAGE_REGISTRY_AUTH_OPTION: &str = "agent.image_registry_auth"; +const SECURE_STORAGE_INTEGRITY_OPTION: &str = "agent.secure_storage_integrity"; // Configure the proxy settings for HTTPS requests in the guest, // to solve the problem of not being able to access the specified image in some cases. @@ -110,6 +111,7 @@ pub struct AgentConfig { pub guest_components_procs: GuestComponentsProcs, #[cfg(feature = "guest-pull")] pub image_registry_auth: String, + pub secure_storage_integrity: bool, } #[derive(Debug, Deserialize)] @@ -131,6 +133,7 @@ pub struct AgentConfigBuilder { pub guest_components_procs: Option, #[cfg(feature = "guest-pull")] pub image_registry_auth: Option, + pub secure_storage_integrity: Option, } macro_rules! config_override { @@ -198,6 +201,7 @@ impl Default for AgentConfig { guest_components_procs: GuestComponentsProcs::default(), #[cfg(feature = "guest-pull")] image_registry_auth: String::from(""), + secure_storage_integrity: false, } } } @@ -237,7 +241,7 @@ impl FromStr for AgentConfig { config_override!(agent_config_builder, agent_config, guest_components_procs); #[cfg(feature = "guest-pull")] config_override!(agent_config_builder, agent_config, image_registry_auth); - + config_override!(agent_config_builder, agent_config, secure_storage_integrity); Ok(agent_config) } } @@ -359,6 +363,12 @@ impl AgentConfig { config.image_registry_auth, get_string_value ); + parse_cmdline_param!( + param, + SECURE_STORAGE_INTEGRITY_OPTION, + config.secure_storage_integrity, + get_bool_value + ); } config.override_config_from_envs(); @@ -586,6 +596,7 @@ mod tests { guest_components_procs: GuestComponentsProcs, #[cfg(feature = "guest-pull")] image_registry_auth: &'a str, + secure_storage_integrity: bool, } impl Default for TestData<'_> { @@ -607,6 +618,7 @@ mod tests { guest_components_procs: 
GuestComponentsProcs::default(), #[cfg(feature = "guest-pull")] image_registry_auth: "", + secure_storage_integrity: false, } } } @@ -1050,6 +1062,31 @@ mod tests { image_registry_auth: "kbs:///default/credentials/test", ..Default::default() }, + TestData { + contents: "", + secure_storage_integrity: false, + ..Default::default() + }, + TestData { + contents: "agent.secure_storage_integrity=true", + secure_storage_integrity: true, + ..Default::default() + }, + TestData { + contents: "agent.secure_storage_integrity=false", + secure_storage_integrity: false, + ..Default::default() + }, + TestData { + contents: "agent.secure_storage_integrity=1", + secure_storage_integrity: true, + ..Default::default() + }, + TestData { + contents: "agent.secure_storage_integrity=0", + secure_storage_integrity: false, + ..Default::default() + }, ]; let dir = tempdir().expect("failed to create tmpdir"); @@ -1111,6 +1148,11 @@ mod tests { ); #[cfg(feature = "guest-pull")] assert_eq!(d.image_registry_auth, config.image_registry_auth, "{}", msg); + assert_eq!( + d.secure_storage_integrity, config.secure_storage_integrity, + "{}", + msg + ); for v in vars_to_unset { env::remove_var(v); diff --git a/src/agent/src/image.rs b/src/agent/src/image.rs index ce0230b2fa..0cc0d209c5 100644 --- a/src/agent/src/image.rs +++ b/src/agent/src/image.rs @@ -21,7 +21,7 @@ use tokio::sync::Mutex; use crate::rpc::CONTAINER_BASE; use crate::AGENT_CONFIG; -const KATA_IMAGE_WORK_DIR: &str = "/run/kata-containers/image/"; +pub const KATA_IMAGE_WORK_DIR: &str = "/run/kata-containers/image/"; const CONFIG_JSON: &str = "config.json"; const KATA_PAUSE_BUNDLE: &str = "/pause_bundle"; diff --git a/src/agent/src/rpc.rs b/src/agent/src/rpc.rs index f731671eca..92cb0a6381 100644 --- a/src/agent/src/rpc.rs +++ b/src/agent/src/rpc.rs @@ -59,6 +59,7 @@ use crate::device::{ add_devices, get_virtio_blk_pci_device_name, update_env_pci, wait_for_net_interface, }; use crate::features::get_build_features; +use 
crate::image::KATA_IMAGE_WORK_DIR; use crate::linux_abi::*; use crate::metrics::get_metrics; use crate::mount::baremount; @@ -106,7 +107,7 @@ use kata_types::k8s; pub const CONTAINER_BASE: &str = "/run/kata-containers"; const MODPROBE_PATH: &str = "/sbin/modprobe"; - +const TRUSTED_IMAGE_STORAGE_DEVICE: &str = "/dev/trusted_store"; /// the iptables seriers binaries could appear either in /sbin /// or /usr/sbin, we need to check both of them const USR_IPTABLES_SAVE: &str = "/usr/sbin/iptables-save"; @@ -243,6 +244,37 @@ impl AgentService { } } + let linux = oci + .linux() + .as_ref() + .ok_or_else(|| anyhow!("Spec didn't contain linux field"))?; + if let Some(devices) = linux.devices() { + for specdev in devices.iter() { + if specdev.path().as_path().to_str() == Some(TRUSTED_IMAGE_STORAGE_DEVICE) { + let dev_major_minor = format!("{}:{}", specdev.major(), specdev.minor()); + let secure_storage_integrity = + AGENT_CONFIG.secure_storage_integrity.to_string(); + info!( + sl(), + "trusted_store device major:min {}, enable data integrity {}", + dev_major_minor, + secure_storage_integrity + ); + + if let Some(cdh) = self.cdh_client.as_ref() { + let options = std::collections::HashMap::from([ + ("deviceId".to_string(), dev_major_minor), + ("encryptType".to_string(), "LUKS".to_string()), + ("dataIntegrity".to_string(), secure_storage_integrity), + ]); + cdh.secure_mount("BlockDevice", &options, vec![], KATA_IMAGE_WORK_DIR) + .await?; + break; + } + } + } + } + // Both rootfs and volumes (invoked with --volume for instance) will // be processed the same way. 
The idea is to always mount any provided // storage to the specified MountPoint, so that it will match what's diff --git a/src/libs/protocols/build.rs b/src/libs/protocols/build.rs index 3f14fcd222..a76606b07c 100644 --- a/src/libs/protocols/build.rs +++ b/src/libs/protocols/build.rs @@ -203,7 +203,7 @@ fn real_main() -> Result<(), std::io::Error> { &[ "protos/agent.proto", "protos/health.proto", - "protos/sealed_secret.proto", + "protos/confidential_data_hub.proto", ], true, )?; @@ -211,8 +211,8 @@ fn real_main() -> Result<(), std::io::Error> { fs::rename("src/agent_ttrpc.rs", "src/agent_ttrpc_async.rs")?; fs::rename("src/health_ttrpc.rs", "src/health_ttrpc_async.rs")?; fs::rename( - "src/sealed_secret_ttrpc.rs", - "src/sealed_secret_ttrpc_async.rs", + "src/confidential_data_hub_ttrpc.rs", + "src/confidential_data_hub_ttrpc_async.rs", )?; } @@ -221,7 +221,7 @@ fn real_main() -> Result<(), std::io::Error> { &[ "protos/agent.proto", "protos/health.proto", - "protos/sealed_secret.proto", + "protos/confidential_data_hub.proto", ], false, )?; diff --git a/src/libs/protocols/protos/confidential_data_hub.proto b/src/libs/protocols/protos/confidential_data_hub.proto new file mode 100644 index 0000000000..8752925a0c --- /dev/null +++ b/src/libs/protocols/protos/confidential_data_hub.proto @@ -0,0 +1,37 @@ +// +// Copyright (c) 2024 IBM +// Copyright (c) 2024 Intel Corporation +// +// SPDX-License-Identifier: Apache-2.0 +// + +syntax = "proto3"; + +package api; + +message UnsealSecretInput { + bytes secret = 1; +} + +message UnsealSecretOutput { + bytes plaintext = 1; +} + +message SecureMountRequest { + string volume_type = 1; + map<string, string> options = 2; + repeated string flags = 3; + string mount_point = 4; +} + +message SecureMountResponse { + string mount_path = 1; +} + +service SealedSecretService { + rpc UnsealSecret(UnsealSecretInput) returns (UnsealSecretOutput) {}; +} + +service SecureMountService { + rpc SecureMount(SecureMountRequest) returns (SecureMountResponse) {}; +} 
\ No newline at end of file diff --git a/src/libs/protocols/protos/sealed_secret.proto b/src/libs/protocols/protos/sealed_secret.proto deleted file mode 100644 index 4e886ab2c4..0000000000 --- a/src/libs/protocols/protos/sealed_secret.proto +++ /dev/null @@ -1,21 +0,0 @@ -// -// Copyright (c) 2024 IBM -// -// SPDX-License-Identifier: Apache-2.0 -// - -syntax = "proto3"; - -package api; - -message UnsealSecretInput { - bytes secret = 1; -} - -message UnsealSecretOutput { - bytes plaintext = 1; -} - -service SealedSecretService { - rpc UnsealSecret(UnsealSecretInput) returns (UnsealSecretOutput) {}; -} diff --git a/src/libs/protocols/src/lib.rs b/src/libs/protocols/src/lib.rs index 9f2c244123..97bbef6f0a 100644 --- a/src/libs/protocols/src/lib.rs +++ b/src/libs/protocols/src/lib.rs @@ -28,8 +28,8 @@ pub use serde_config::{ serialize_message_field, }; -pub mod sealed_secret; -pub mod sealed_secret_ttrpc; +pub mod confidential_data_hub; +pub mod confidential_data_hub_ttrpc; #[cfg(feature = "async")] -pub mod sealed_secret_ttrpc_async; +pub mod confidential_data_hub_ttrpc_async; diff --git a/tests/integration/kubernetes/confidential_common.sh b/tests/integration/kubernetes/confidential_common.sh index 1cee34ec71..bdd97c480d 100644 --- a/tests/integration/kubernetes/confidential_common.sh +++ b/tests/integration/kubernetes/confidential_common.sh @@ -82,3 +82,28 @@ function is_confidential_hardware() { return 1 } + +function create_loop_device(){ + local loop_file="${1:-/tmp/trusted-image-storage.img}" + cleanup_loop_device "$loop_file" + + sudo dd if=/dev/zero of=$loop_file bs=1M count=2500 + sudo losetup -fP $loop_file >/dev/null 2>&1 + local device=$(sudo losetup -j $loop_file | awk -F'[: ]' '{print $1}') + echo $device +} + +function cleanup_loop_device(){ + local loop_file="${1:-/tmp/trusted-image-storage.img}" + # Find all loop devices associated with $loop_file + local existed_devices=$(sudo losetup -j $loop_file | awk -F'[: ]' '{print $1}') + + if [ -n 
"$existed_devices" ]; then + # Iterate over each found loop device and detach it + for d in $existed_devices; do + sudo losetup -d "$d" >/dev/null 2>&1 + done + fi + + sudo rm -f "$loop_file" >/dev/null 2>&1 || true +} \ No newline at end of file diff --git a/tests/integration/kubernetes/k8s-guest-pull-image-encrypted.bats b/tests/integration/kubernetes/k8s-guest-pull-image-encrypted.bats index 722a4d3a30..2e7788705e 100644 --- a/tests/integration/kubernetes/k8s-guest-pull-image-encrypted.bats +++ b/tests/integration/kubernetes/k8s-guest-pull-image-encrypted.bats @@ -79,7 +79,7 @@ function create_pod_yaml_with_encrypted_image() { echo "Pod ${kata_pod_with_encrypted_image}: $(cat ${kata_pod_with_encrypted_image})" assert_pod_fail "${kata_pod_with_encrypted_image}" - assert_logs_contain "${node}" kata "${node_start_time}" 'failed to get decrypt key missing private key needed for decryption' + assert_logs_contain "${node}" kata "${node_start_time}" 'failed to get decrypt key no suitable key found for decrypting layer key' } @@ -106,7 +106,7 @@ function create_pod_yaml_with_encrypted_image() { echo "Pod ${kata_pod_with_encrypted_image}: $(cat ${kata_pod_with_encrypted_image})" assert_pod_fail "${kata_pod_with_encrypted_image}" - assert_logs_contain "${node}" kata "${node_start_time}" 'failed to get decrypt key missing private key needed for decryption' + assert_logs_contain "${node}" kata "${node_start_time}" 'failed to get decrypt key no suitable key found for decrypting layer key' } teardown() { diff --git a/tests/integration/kubernetes/k8s-guest-pull-image.bats b/tests/integration/kubernetes/k8s-guest-pull-image.bats index ba7d8da111..28c86df6ac 100644 --- a/tests/integration/kubernetes/k8s-guest-pull-image.bats +++ b/tests/integration/kubernetes/k8s-guest-pull-image.bats @@ -9,14 +9,6 @@ load "${BATS_TEST_DIRNAME}/lib.sh" load "${BATS_TEST_DIRNAME}/confidential_common.sh" setup() { - if [ "${KATA_HYPERVISOR}" = "qemu-tdx" ]; then - skip "${KATA_HYPERVISOR} is 
already running all the tests with guest-pulling, skip this specific one" - fi - - if is_confidential_hardware; then - skip "Due to issues related to pull-image integration skip tests for ${KATA_HYPERVISOR}." - fi - if ! is_confidential_runtime_class; then skip "Test not supported for ${KATA_HYPERVISOR}." fi @@ -24,24 +16,19 @@ setup() { [ "${SNAPSHOTTER:-}" = "nydus" ] || skip "None snapshotter was found but this test requires one" setup_common - unencrypted_image_1="quay.io/sjenning/nginx:1.15-alpine" - unencrypted_image_2="quay.io/prometheus/busybox:latest" - large_image="quay.io/confidential-containers/test-images:largeimage" + get_pod_config_dir + unencrypted_image="quay.io/prometheus/busybox:latest" + image_pulled_time_less_than_default_time="ghcr.io/confidential-containers/test-container:rust-1.79.0" # unpacked size: 1.41GB + large_image="quay.io/confidential-containers/test-images:largeimage" # unpacked size: 2.15GB + pod_config_template="${pod_config_dir}/pod-guest-pull-in-trusted-storage.yaml.in" + storage_config_template="${pod_config_dir}/confidential/trusted-storage.yaml.in" } @test "Test we can pull an unencrypted image outside the guest with runc and then inside the guest successfully" { - if is_confidential_hardware; then - skip "Due to issues related to pull-image integration skip tests for ${KATA_HYPERVISOR}." - fi - - if ! is_confidential_runtime_class; then - skip "Test not supported for ${KATA_HYPERVISOR}." - fi - - # 1. Create one runc pod with the $unencrypted_image_1 image + # 1. Create one runc pod with the $unencrypted_image image # We want to have one runc pod, so we pass a fake runtimeclass "runc" and then delete the runtimeClassName, # because the runtimeclass is not optional in new_pod_config function. 
- runc_pod_config="$(new_pod_config "$unencrypted_image_1" "runc")" + runc_pod_config="$(new_pod_config "$unencrypted_image" "runc")" sed -i '/runtimeClassName:/d' $runc_pod_config set_node "$runc_pod_config" "$node" set_container_command "$runc_pod_config" "0" "sleep" "30" @@ -56,8 +43,8 @@ setup() { echo "Runc pod test-e2e is running" kubectl delete -f "$runc_pod_config" - # 2. Create one kata pod with the $unencrypted_image_1 image and nydus annotation - kata_pod_with_nydus_config="$(new_pod_config "$unencrypted_image_1" "kata-${KATA_HYPERVISOR}")" + # 2. Create one kata pod with the $unencrypted_image image and nydus annotation + kata_pod_with_nydus_config="$(new_pod_config "$unencrypted_image" "kata-${KATA_HYPERVISOR}")" set_node "$kata_pod_with_nydus_config" "$node" set_container_command "$kata_pod_with_nydus_config" "0" "sleep" "30" @@ -72,178 +59,175 @@ setup() { add_allow_all_policy_to_yaml "$kata_pod_with_nydus_config" k8s_create_pod "$kata_pod_with_nydus_config" - echo "Kata pod test-e2e with nydus annotation is running" - - echo "Checking the image was pulled in the guest" - sandbox_id=$(get_node_kata_sandbox_id $node) - echo "sandbox_id is: $sandbox_id" - # With annotation for nydus, only rootfs for pause container can be found on host - assert_rootfs_count "$node" "$sandbox_id" "1" } -@test "Test we can pull a large image inside the guest" { - [[ " ${SUPPORTED_NON_TEE_HYPERVISORS} " =~ " ${KATA_HYPERVISOR} " ]] && skip "Test not supported for ${KATA_HYPERVISOR}." - skip "This test requires large memory, which the encrypted memory is typically small and valuable in TEE. \ - The test will be skiped until https://github.com/kata-containers/kata-containers/issues/8142 is addressed." 
- kata_pod_with_nydus_config="$(new_pod_config "$large_image" "kata-${KATA_HYPERVISOR}")" - set_node "$kata_pod_with_nydus_config" "$node" - set_container_command "$kata_pod_with_nydus_config" "0" "sleep" "30" +@test "Test we cannot pull an image that exceeds the memory limit inside the guest" { + # The image pulled in the guest will be downloaded and unpacked in the `/run/kata-containers/image` directory. + # However, by default, systemd allocates 50% of the available physical RAM to the `/run` directory using a `tmpfs` filesystem. + # This means that if we run a kata container with the default configuration (where the default memory assigned to a VM is 2048 MiB), + # `/run` would be allocated around 1024 MiB. Consequently, we can only pull images up to 1024 MiB in the guest. + # However, the unpacked size of image "ghcr.io/confidential-containers/test-container:rust-1.79.0" is 1.41GB. + # By default, the pod therefore fails to run because the unpacked image does not fit in the memory-backed `/run` in the guest. - # Set annotation to pull large image in guest - set_metadata_annotation "$kata_pod_with_nydus_config" \ + pod_config="$(new_pod_config "$image_pulled_time_less_than_default_time" "kata-${KATA_HYPERVISOR}")" + set_node "$pod_config" "$node" + set_container_command "$pod_config" "0" "sleep" "30" + + # Set annotation to pull image in guest + set_metadata_annotation "${pod_config}" \ "io.containerd.cri.runtime-handler" \ "kata-${KATA_HYPERVISOR}" # For debug sake - echo "Pod $kata_pod_with_nydus_config file:" - cat $kata_pod_with_nydus_config + echo "Pod $pod_config file:" + cat $pod_config - # The pod should be failed because the default timeout of CreateContainerRequest is 60s - assert_pod_fail "$kata_pod_with_nydus_config" + # The pod should fail because the unpacked image size is larger than the memory available for `/run` in the guest. 
+ assert_pod_fail "$pod_config" assert_logs_contain "$node" kata "$node_start_time" \ - 'context deadline exceeded' + 'No space left on device' +} - kubectl delete -f $kata_pod_with_nydus_config +@test "Test we can pull an image inside the guest using trusted storage" { + # The image pulled in the guest will be downloaded and unpacked in the `/run/kata-containers/image` directory. + # The tests will use `cryptsetup` to encrypt a block device and mount it at `/run/kata-containers/image`. + + if [ "${KATA_HYPERVISOR}" = "qemu-coco-dev" ]; then + skip "skip this specific one due to issue https://github.com/kata-containers/kata-containers/issues/10133" + fi + + storage_config=$(mktemp "${BATS_FILE_TMPDIR}/$(basename "${storage_config_template}").XXX") + local_device=$(create_loop_device) + LOCAL_DEVICE="$local_device" NODE_NAME="$node" envsubst < "$storage_config_template" > "$storage_config" - # Set CreateContainerRequest timeout in the annotation to pull large image in guest - create_container_timeout=300 - set_metadata_annotation "$kata_pod_with_nydus_config" \ + # For debug sake + echo "Trusted storage $storage_config file:" + cat $storage_config + + # Create persistent volume and persistent volume claim + kubectl create -f $storage_config + + pod_config=$(mktemp "${BATS_FILE_TMPDIR}/$(basename "${pod_config_template}").XXX") + IMAGE="$image_pulled_time_less_than_default_time" NODE_NAME="$node" envsubst < "$pod_config_template" > "$pod_config" + + # Enable dm-integrity in guest + set_metadata_annotation "${pod_config}" \ + "io.katacontainers.config.hypervisor.kernel_params" \ + "agent.secure_storage_integrity=true" + + # Set annotation to pull image in guest + set_metadata_annotation "${pod_config}" \ + "io.containerd.cri.runtime-handler" \ + "kata-${KATA_HYPERVISOR}" + + # For debug sake + echo "Pod $pod_config file:" + cat $pod_config + + add_allow_all_policy_to_yaml "$pod_config" + k8s_create_pod "$pod_config" +} + +@test "Test we cannot pull a large image that 
pull time exceeds createcontainer timeout inside the guest" { + + if [ "${KATA_HYPERVISOR}" = "qemu-coco-dev" ]; then + skip "skip this specific one due to issue https://github.com/kata-containers/kata-containers/issues/10133" + fi + + storage_config=$(mktemp "${BATS_FILE_TMPDIR}/$(basename "${storage_config_template}").XXX") + local_device=$(create_loop_device) + LOCAL_DEVICE="$local_device" NODE_NAME="$node" envsubst < "$storage_config_template" > "$storage_config" + + # For debug sake + echo "Trusted storage $storage_config file:" + cat $storage_config + + # Create persistent volume and persistent volume claim + kubectl create -f $storage_config + + pod_config=$(mktemp "${BATS_FILE_TMPDIR}/$(basename "${pod_config_template}").XXX") + IMAGE="$large_image" NODE_NAME="$node" envsubst < "$pod_config_template" > "$pod_config" + + # Set a short CreateContainerRequest timeout in the annotation to fail to pull image in guest + create_container_timeout=10 + set_metadata_annotation "$pod_config" \ "io.katacontainers.config.runtime.create_container_timeout" \ "${create_container_timeout}" - # For debug sake - echo "Pod $kata_pod_with_nydus_config file:" - cat $kata_pod_with_nydus_config - - add_allow_all_policy_to_yaml "$kata_pod_with_nydus_config" - k8s_create_pod "$kata_pod_with_nydus_config" -} - -@test "Test we can pull an unencrypted image inside the guest twice in a row and then outside the guest successfully" { - # 1. 
Create one kata pod with the $unencrypted_image_1 image and nydus annotation twice - kata_pod_with_nydus_config="$(new_pod_config "$unencrypted_image_1" "kata-${KATA_HYPERVISOR}")" - set_node "$kata_pod_with_nydus_config" "$node" - set_container_command "$kata_pod_with_nydus_config" "0" "sleep" "30" + # Enable dm-integrity in guest + set_metadata_annotation "${pod_config}" \ + "io.katacontainers.config.hypervisor.kernel_params" \ + "agent.secure_storage_integrity=true" # Set annotation to pull image in guest - set_metadata_annotation "$kata_pod_with_nydus_config" \ + set_metadata_annotation "${pod_config}" \ "io.containerd.cri.runtime-handler" \ "kata-${KATA_HYPERVISOR}" # For debug sake - echo "Pod $kata_pod_with_nydus_config file:" - cat $kata_pod_with_nydus_config + echo "Pod $pod_config file:" + cat $pod_config - add_allow_all_policy_to_yaml "$kata_pod_with_nydus_config" - k8s_create_pod "$kata_pod_with_nydus_config" - - echo "Kata pod test-e2e with nydus annotation is running" - echo "Checking the image was pulled in the guest" - - sandbox_id=$(get_node_kata_sandbox_id $node) - echo "sandbox_id is: $sandbox_id" - # With annotation for nydus, only rootfs for pause container can be found on host - assert_rootfs_count "$node" "$sandbox_id" "1" - - kubectl delete -f $kata_pod_with_nydus_config - - # 2. 
Create one kata pod with the $unencrypted_image_1 image and without nydus annotation - kata_pod_without_nydus_config="$(new_pod_config "$unencrypted_image_1" "kata-${KATA_HYPERVISOR}")" - set_node "$kata_pod_without_nydus_config" "$node" - set_container_command "$kata_pod_without_nydus_config" "0" "sleep" "30" - - # For debug sake - echo "Pod $kata_pod_without_nydus_config file:" - cat $kata_pod_without_nydus_config - - add_allow_all_policy_to_yaml "$kata_pod_without_nydus_config" - k8s_create_pod "$kata_pod_without_nydus_config" - - echo "Kata pod test-e2e without nydus annotation is running" - echo "Check the image was not pulled in the guest" - sandbox_id=$(get_node_kata_sandbox_id $node) - echo "sandbox_id is: $sandbox_id" - - # The assert_rootfs_count will be FAIL. - # The expect count of rootfs in host is "2" but the found count of rootfs in host is "1" - # As the the first time we pull the $unencrypted_image_1 image via nydus-snapshotter in the guest - # for all subsequent pulls still use nydus-snapshotter in the guest - # More details: https://github.com/kata-containers/kata-containers/issues/8337 - # The test case will be PASS after we use containerd 2.0 with 'image pull per runtime class' feature: - # https://github.com/containerd/containerd/issues/9377 - assert_rootfs_count "$node" "$sandbox_id" "2" + # The pod should be failed because the default timeout of CreateContainerRequest is 60s + assert_pod_fail "$pod_config" + assert_logs_contain "$node" kata "$node_start_time" \ + 'context deadline exceeded' } -@test "Test we can pull an other unencrypted image outside the guest and then inside the guest successfully" { - # 1. 
Create one kata pod with the $unencrypted_image_2 image and without nydus annotation - kata_pod_without_nydus_config="$(new_pod_config "$unencrypted_image_2" "kata-${KATA_HYPERVISOR}")" - set_node "$kata_pod_without_nydus_config" "$node" - set_container_command "$kata_pod_without_nydus_config" "0" "sleep" "30" +@test "Test we can pull a large image inside the guest with large createcontainer timeout" { + + if [ "${KATA_HYPERVISOR}" = "qemu-coco-dev" ]; then + skip "skip this specific one due to issue https://github.com/kata-containers/kata-containers/issues/10133" + fi + storage_config=$(mktemp "${BATS_FILE_TMPDIR}/$(basename "${storage_config_template}").XXX") + local_device=$(create_loop_device) + LOCAL_DEVICE="$local_device" NODE_NAME="$node" envsubst < "$storage_config_template" > "$storage_config" # For debug sake - echo "Pod $kata_pod_without_nydus_config file:" - cat $kata_pod_without_nydus_config + echo "Trusted storage $storage_config file:" + cat $storage_config + + # Create persistent volume and persistent volume claim + kubectl create -f $storage_config - add_allow_all_policy_to_yaml "$kata_pod_without_nydus_config" - k8s_create_pod "$kata_pod_without_nydus_config" + pod_config=$(mktemp "${BATS_FILE_TMPDIR}/$(basename "${pod_config_template}").XXX") + IMAGE="$large_image" NODE_NAME="$node" envsubst < "$pod_config_template" > "$pod_config" - echo "Kata pod test-e2e without nydus annotation is running" - echo "Checking the image was pulled in the host" + # Set CreateContainerRequest timeout in the annotation to pull large image in guest + create_container_timeout=120 + set_metadata_annotation "$pod_config" \ + "io.katacontainers.config.runtime.create_container_timeout" \ + "${create_container_timeout}" - sandbox_id=$(get_node_kata_sandbox_id $node) - echo "sandbox_id is: $sandbox_id" - # Without annotation for nydus, both rootfs for pause and the test container can be found on host - assert_rootfs_count "$node" "$sandbox_id" "2" - - kubectl delete -f 
$kata_pod_without_nydus_config - - # 2. Create one kata pod with the $unencrypted_image_2 image and with nydus annotation - kata_pod_with_nydus_config="$(new_pod_config "$unencrypted_image_2" "kata-${KATA_HYPERVISOR}")" - set_node "$kata_pod_with_nydus_config" "$node" - set_container_command "$kata_pod_with_nydus_config" "0" "sleep" "30" + # Enable dm-integrity in guest + set_metadata_annotation "${pod_config}" \ + "io.katacontainers.config.hypervisor.kernel_params" \ + "agent.secure_storage_integrity=true" # Set annotation to pull image in guest - set_metadata_annotation "$kata_pod_with_nydus_config" \ + set_metadata_annotation "${pod_config}" \ "io.containerd.cri.runtime-handler" \ "kata-${KATA_HYPERVISOR}" # For debug sake - echo "Pod $kata_pod_with_nydus_config file:" - cat $kata_pod_with_nydus_config + echo "Pod $pod_config file:" + cat $pod_config - add_allow_all_policy_to_yaml "$kata_pod_with_nydus_config" - k8s_create_pod "$kata_pod_with_nydus_config" - - echo "Kata pod test-e2e with nydus annotation is running" - echo "Checking the image was pulled in the guest" - sandbox_id=$(get_node_kata_sandbox_id $node) - echo "sandbox_id is: $sandbox_id" - - # The assert_rootfs_count will be FAIL. - # The expect count of rootfs in host is "1" but the found count of rootfs in host is "2" - # As the the first time we pull the $unencrypted_image_2 image via overlayfs-snapshotter in host - # for all subsequent pulls still use overlayfs-snapshotter in host. 
- # More details: https://github.com/kata-containers/kata-containers/issues/8337 - # The test case will be PASS after we use containerd 2.0 with 'image pull per runtime class' feature: - # https://github.com/containerd/containerd/issues/9377 - assert_rootfs_count "$node" "$sandbox_id" "1" + add_allow_all_policy_to_yaml "$pod_config" + k8s_create_pod "$pod_config" } teardown() { - if [ "${KATA_HYPERVISOR}" = "qemu-tdx" ]; then - skip "${KATA_HYPERVISOR} is already running all the tests with guest-pulling, skip this specific one" - fi - - if is_confidential_hardware; then - skip "Due to issues related to pull-image integration skip tests for ${KATA_HYPERVISOR}." - fi - if ! is_confidential_runtime_class; then skip "Test not supported for ${KATA_HYPERVISOR}." fi [ "${SNAPSHOTTER:-}" = "nydus" ] || skip "None snapshotter was found but this test requires one" - kubectl describe pod "$pod_name" + kubectl describe pods k8s_delete_all_pods_if_any_exists || true + kubectl delete --ignore-not-found pvc trusted-pvc + kubectl delete --ignore-not-found pv trusted-block-pv + kubectl delete --ignore-not-found storageclass local-storage + cleanup_loop_device || true } diff --git a/tests/integration/kubernetes/runtimeclass_workloads/confidential/trusted-storage.yaml.in b/tests/integration/kubernetes/runtimeclass_workloads/confidential/trusted-storage.yaml.in new file mode 100644 index 0000000000..4a513a2c0c --- /dev/null +++ b/tests/integration/kubernetes/runtimeclass_workloads/confidential/trusted-storage.yaml.in @@ -0,0 +1,48 @@ +# +# Copyright (c) 2024 Intel Corporation +# +# SPDX-License-Identifier: Apache-2.0 +# + +apiVersion: storage.k8s.io/v1 +kind: StorageClass +metadata: + name: local-storage +provisioner: kubernetes.io/no-provisioner +volumeBindingMode: WaitForFirstConsumer +--- +apiVersion: v1 +kind: PersistentVolume +metadata: + name: trusted-block-pv +spec: + capacity: + storage: 10Gi + volumeMode: Block + accessModes: + - ReadWriteOnce + persistentVolumeReclaimPolicy: 
Retain + storageClassName: local-storage + local: + path: $LOCAL_DEVICE + nodeAffinity: + required: + nodeSelectorTerms: + - matchExpressions: + - key: kubernetes.io/hostname + operator: In + values: + - $NODE_NAME +--- +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: trusted-pvc +spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 1Gi + volumeMode: Block + storageClassName: local-storage \ No newline at end of file diff --git a/tests/integration/kubernetes/runtimeclass_workloads/pod-guest-pull-in-trusted-storage.yaml.in b/tests/integration/kubernetes/runtimeclass_workloads/pod-guest-pull-in-trusted-storage.yaml.in new file mode 100644 index 0000000000..edb1fa9fff --- /dev/null +++ b/tests/integration/kubernetes/runtimeclass_workloads/pod-guest-pull-in-trusted-storage.yaml.in @@ -0,0 +1,33 @@ +# +# Copyright (c) 2024 Intel Corporation +# +# SPDX-License-Identifier: Apache-2.0 +# +apiVersion: v1 +kind: Pod +metadata: + name: large-image-pod +spec: + runtimeClassName: kata + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: kubernetes.io/hostname + operator: In + values: + - $NODE_NAME + volumes: + - name: trusted-storage + persistentVolumeClaim: + claimName: trusted-pvc + containers: + - name: app-container + image: $IMAGE + command: ["/bin/sh", "-c"] + args: + - sleep 6000 + volumeDevices: + - devicePath: /dev/trusted_store + name: trusted-storage \ No newline at end of file diff --git a/tools/osbuilder/rootfs-builder/rootfs.sh b/tools/osbuilder/rootfs-builder/rootfs.sh index 5fd77290dc..f1901673ee 100755 --- a/tools/osbuilder/rootfs-builder/rootfs.sh +++ b/tools/osbuilder/rootfs-builder/rootfs.sh @@ -750,6 +750,11 @@ EOF tar xvJpf ${COCO_GUEST_COMPONENTS_TARBALL} -C ${ROOTFS_DIR} fi + if [ "${MEASURED_ROOTFS}" == "yes" ]; then + info "Install init_trusted_storage script" + install -o root -g root -m 0500 
"${script_dir}/scripts/init_trusted_storage.sh" "${ROOTFS_DIR}/usr/local/bin/luks-encrypt-storage" + fi + # Create an empty /etc/resolv.conf, to allow agent to bind mount container resolv.conf to Kata VM dns_file="${ROOTFS_DIR}/etc/resolv.conf" if [ -L "$dns_file" ]; then diff --git a/tools/osbuilder/rootfs-builder/scripts/init_trusted_storage.sh b/tools/osbuilder/rootfs-builder/scripts/init_trusted_storage.sh new file mode 100644 index 0000000000..2e5fdf46e7 --- /dev/null +++ b/tools/osbuilder/rootfs-builder/scripts/init_trusted_storage.sh @@ -0,0 +1,145 @@ +#!/bin/bash +# +# Copyright (c) 2024 Intel Corporation +# +# SPDX-License-Identifier: Apache-2.0 +# + +set -o errexit +set -o nounset +set -o pipefail +set -o errtrace + +[ -n "${DEBUG:-}" ] && set -o xtrace + +handle_error() { + local exit_code="${?}" + local line_number="${1:-}" + echo "error:" + echo "Failed at $line_number: ${BASH_COMMAND}" + exit "${exit_code}" +} +trap 'handle_error $LINENO' ERR + +die() { + local msg="$*" + echo >&2 "ERROR: $msg" + exit 1 +} + +setup() { + local cmds=() + + cmds+=("cryptsetup" "mkfs.ext4" "mount") + + local cmd + for cmd in "${cmds[@]}"; do + command -v "$cmd" &>/dev/null || die "need command: '$cmd'" + done +} + +setup + +device_num=${1:-} +if [ -z "$device_num" ]; then + die "invalid arguments, at least one param for device num" +fi + +is_encrypted="false" +if [ -n "${2-}" ]; then + is_encrypted="$2" +fi + +mount_point="/tmp/target_path" +if [ -n "${3-}" ]; then + mount_point="$3" +fi + +storage_key_path="/run/encrypt_storage.key" +if [ -n "${4-}" ]; then + storage_key_path="$4" +fi + +data_integrity="true" +if [ -n "${5-}" ]; then + data_integrity="$5" +fi + +device_name=$(sed -e 's/DEVNAME=//g;t;d' "/sys/dev/block/${device_num}/uevent") +device_path="/dev/$device_name" + +opened_device_name=$(mktemp -u "encrypted_disk_XXXXX") + +if [[ -n "$device_name" && -b "$device_path" ]]; then + + if [ "$is_encrypted" == "false" ]; then + + if [ "$data_integrity" == "false" ]; 
then + cryptsetup --batch-mode luksFormat --type luks2 "$device_path" --sector-size 4096 \ + --cipher aes-xts-plain64 "$storage_key_path" + else + # Wiping a device is a time-consuming operation. To avoid a full wipe, integritysetup + # and cryptsetup provide a --no-wipe option. + # However, an integrity device that is not wiped will have invalid checksums. Normally + # this should not be a problem since a page must first be written to before it can be read + # (otherwise the data would be arbitrary). The act of writing would populate the checksum + # for the page. + # However, tools like mkfs.ext4 read pages before they are written; sometimes the read + # of an unwritten page happens due to kernel buffering. + # See https://gitlab.com/cryptsetup/cryptsetup/-/issues/525 for explanation and fix. + # The way to properly format the non-wiped dm-integrity device is to figure out which pages + # mkfs.ext4 will write to and then to write to those pages beforehand so that they will + # have valid integrity tags. + cryptsetup --batch-mode luksFormat --type luks2 "$device_path" --sector-size 4096 \ + --cipher aes-xts-plain64 --integrity hmac-sha256 "$storage_key_path" \ + --integrity-no-wipe + fi + fi + + cryptsetup luksOpen -d "$storage_key_path" "$device_path" $opened_device_name + rm "$storage_key_path" + + if [ "$data_integrity" == "false" ]; then + mkfs.ext4 /dev/mapper/$opened_device_name -E lazy_journal_init + else + # mkfs.ext4 doesn't perform whole sector writes and this will cause checksum failures + # with an unwiped integrity device. Therefore, first perform a dry run. 
+ output=$(mkfs.ext4 /dev/mapper/$opened_device_name -F -n) + + # The above command will produce output like + # mke2fs 1.46.5 (30-Dec-2021) + # Creating filesystem with 268435456 4k blocks and 67108864 inodes + # Filesystem UUID: 4a5ff012-91c0-47d9-b4bb-8f83e830825f + # Superblock backups stored on blocks: + # 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, + # 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, + # 102400000, 214990848 + delimiter="Superblock backups stored on blocks:" + blocks_list=$([[ $output =~ $delimiter(.*) ]] && echo "${BASH_REMATCH[1]}") + + # Find list of blocks + block_nums=$(echo "$blocks_list" | grep -Eo '[0-9]{4,}' | sort -n) + + # Add zero to list of blocks + block_nums="0 $block_nums" + + # Iterate through each block and write to it to ensure that it has valid checksum + for block_num in $block_nums; do + echo "Clearing page at $block_num" + # Zero out the page + dd if=/dev/zero bs=4k count=1 oflag=direct \ + of=/dev/mapper/$opened_device_name seek="$block_num" + done + + # Now perform the actual ext4 format. Use lazy_journal_init so that the journal is + # initialized on demand. This is safe for ephemeral storage since we don't expect + # ephemeral storage to survive a power cycle. + mkfs.ext4 /dev/mapper/$opened_device_name -E lazy_journal_init + fi + + [ ! -d "$mount_point" ] && mkdir -p $mount_point + + mount /dev/mapper/$opened_device_name $mount_point +else + die "Invalid device: '$device_path'" +fi diff --git a/versions.yaml b/versions.yaml index 4c3ea64ed8..51ab955231 100644 --- a/versions.yaml +++ b/versions.yaml @@ -231,7 +231,7 @@ externals: coco-guest-components: description: "Provides attested key unwrapping for image decryption" url: "https://github.com/confidential-containers/guest-components/" - version: "v0.9.0" + version: "d996c692207a983426ae0043952d15ed18e84f66" toolchain: "1.76.0" coco-trustee: