diff --git a/docs/README.md b/docs/README.md
index cd111721eb..e247e54aba 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -32,6 +32,7 @@ See the [how-to documentation](how-to).
 * [Intel QAT with Kata](./use-cases/using-Intel-QAT-and-kata.md)
 * [SPDK vhost-user with Kata](./use-cases/using-SPDK-vhostuser-and-kata.md)
 * [Intel SGX with Kata](./use-cases/using-Intel-SGX-and-kata.md)
+* [IBM Crypto Express passthrough with Confidential Containers](./use-cases/CEX-passthrough-and-coco.md)
 
 ## Developer Guide
 
diff --git a/docs/use-cases/CEX-passthrough-and-coco.md b/docs/use-cases/CEX-passthrough-and-coco.md
new file mode 100644
index 0000000000..f74b8053aa
--- /dev/null
+++ b/docs/use-cases/CEX-passthrough-and-coco.md
@@ -0,0 +1,96 @@
+# Using IBM Crypto Express with Confidential Containers
+
+On IBM Z (s390x), IBM Crypto Express (CEX) hardware security modules (HSMs) can be passed through to virtual guests.
+This VFIO pass-through works per domain, i.e. multiple guests can securely share one physical card.
+For the Accelerator and Enterprise PKCS #11 (EP11) modes of CEX, on IBM z16 and later, pass-through is also supported when using the IBM Secure Execution trusted execution environment.
+To maintain confidentiality when using EP11 within Secure Execution, additional steps are required.
+When using Secure Execution within Kata Containers, some of these steps are managed by the Kata agent, but preparation is required to make pass-through work.
+The Kata agent expects the required confidential information at runtime via [Confidential Data Hub](https://github.com/confidential-containers/guest-components/tree/main/confidential-data-hub) from Confidential Containers, so this guide assumes Confidential Containers components for secret provisioning.
+
+At the time of writing, devices for trusted execution environments are only supported via the `--device` option of e.g. `ctr`, `docker`, or `podman`, but **not** via Kubernetes.
+Refer to [KEP 4113](https://github.com/kubernetes/enhancements/pull/4113) for details.
+
+Using a CEX card in Accelerator mode is much simpler and does not require the steps below.
+To do so, prepare [Kata for Secure Execution](../how-to/how-to-run-kata-containers-with-SE-VMs.md), set `vfio_mode = "vfio"` and `cold_plug_vfio = "bridge-port"` in the Kata `configuration.toml` file, and use a [mediated device](../../src/runtime/virtcontainers/README.md#how-to-pass-a-device-using-vfio-ap-passthrough) as you would without Secure Execution.
+The Kata agent performs the [Secure Execution bind](https://www.ibm.com/docs/en/linux-on-systems?topic=adapters-accelerator-mode) automatically.
+
+## Prerequisites
+
+- A host kernel that supports adjunct processor (AP) pass-through with Secure Execution. [Official support](https://www.ibm.com/docs/en/linux-on-systems?topic=restrictions-required-software) exists as of Ubuntu 24.04, RHEL 8.10 and 9.4, and SLES 15 SP6.
+- An EP11 domain with a master key set up. In this process, you will need the master key verification pattern (MKVP) [1].
+- A [mediated device](../../src/runtime/virtcontainers/README.md#how-to-pass-a-device-using-vfio-ap-passthrough), created from this domain, to pass through.
+- Working [Kata Containers with Secure Execution](../how-to/how-to-run-kata-containers-with-SE-VMs.md).
+- Working access to a [key broker service (KBS) with the IBM Secure Execution verifier](https://github.com/confidential-containers/trustee/blob/main/deps/verifier/src/se/README.md) from a Kata container. The provided Secure Execution header must match the Kata guest image, and a policy that allows the appropriate secrets for this guest must be set up.
+- In Kata's `configuration.toml`, set `vfio_mode = "vfio"` and `cold_plug_vfio = "bridge-port"`.
+
+## Prepare an association secret
+
+An EP11 Secure Execution workload requires an [association secret](https://www.ibm.com/docs/en/linux-on-systems?topic=adapters-ep11-mode) to be inserted in the guest and associated with the adjunct processor (AP) queue.
+In Kata Containers, this secret must be created and made available via Trustee, whereas the Kata agent performs the actual secret insertion and association.
+On a trusted system, to create an association secret from the host key document (HKD) `z16.crt`, the guest header `hdr.bin`, the CA certificate `DigiCertCA.crt`, and the IBM signing key `ibm-z-host-key-signing-gen2.crt`, letting the command generate a random association secret named `my secret` and save it to `my_random_secret`, run:
+
+```
+[trusted]# pvsecret create -k z16.crt --hdr hdr.bin -o my_addsecreq \
+    --crt DigiCertCA.crt --crt ibm-z-host-key-signing-gen2.crt \
+    association "my secret" --output-secret my_random_secret
+```
+
+using `pvsecret` from the [s390-tools](https://github.com/ibm-s390-linux/s390-tools) suite.
+`hdr.bin` **must** be the Secure Execution header matching the Kata guest image, i.e. the one also provided to Trustee.
+This command saves the add-secret request itself to `my_addsecreq`, and information on the secret, including the secret ID, to `my_secret.yaml`.
+This secret ID must be provided alongside the secret.
+Write it to `my_addsecid`, with or without the leading `0x`, either by hand or using `yq`:
+
+```
+[trusted]# yq ".id" my_secret.yaml > my_addsecid
+```
+
+## Provision the association secret with Trustee
+
+The secret and secret ID must be provided via Trustee with respect to the MKVP.
+The paths where the Kata agent expects this information are `vfio_ap/${mkvp}/secret` and `vfio_ap/${mkvp}/secret_id`, where `${mkvp}` is the first 16 bytes (32 hexadecimal digits) of the MKVP, without the leading `0x`.
+
+For example, if your MKVPs read [1] as
+
+```
+WK CUR: valid 0xdb3c3b3c3f097dd55ec7eb0e7fdbcb933b773619640a1a75a9161cec00000000
+WK NEW: empty -
+```
+
+use `db3c3b3c3f097dd55ec7eb0e7fdbcb93` when provisioning Trustee.
+With a KBS running at `127.0.0.1:8080`, to store the secret and ID created above in the KBS with the authentication key `kbs.key` and this MKVP, run:
+
+```
+[trusted]# kbs-client --url http://127.0.0.1:8080 config \
+    --auth-private-key kbs.key set-resource \
+    --path vfio_ap/db3c3b3c3f097dd55ec7eb0e7fdbcb93/secret \
+    --resource-file my_addsecreq
+[trusted]# kbs-client --url http://127.0.0.1:8080 config \
+    --auth-private-key kbs.key set-resource \
+    --path vfio_ap/db3c3b3c3f097dd55ec7eb0e7fdbcb93/secret_id \
+    --resource-file my_addsecid
+```
+
+## Run the workload
+
+Assuming the mediated device exists at `/dev/vfio/0`, run, for example:
+
+```
+[host]# docker run --rm --runtime io.containerd.run.kata.v2 --device /dev/vfio/0 -it ubuntu
+```
+
+If you have [s390-tools](https://github.com/ibm-s390-linux/s390-tools) available in the container, you can see the available CEX domains including Secure Execution info using `lszcrypt -V`:
+
+```
+[container]# lszcrypt -V
+CARD.DOM TYPE  MODE        STATUS  REQUESTS  PENDING HWTYPE QDEPTH FUNCTIONS  DRIVER     SESTAT
+--------------------------------------------------------------------------------------------------------
+03       CEX8P EP11-Coproc online         2        0     14     08 -----XN-F- cex4card   -
+03.0041  CEX8P EP11-Coproc online         2        0     14     08 -----XN-F- cex4queue  usable
+```
+
+---
+
+[1] If you have access to the host, the MKVP can be read at `/sys/bus/ap/card${cardno}/${apqn}/mkvps`, where `${cardno}` is the two-digit hexadecimal identifier of the card, and `${apqn}` is the APQN of the domain you want to pass through, e.g. `card03/03.0041` for domain 0x41 on card 3.
+This information is only readable when card and domain are not yet masked for use with VFIO.
+If you do not have access to the host, you should receive the MKVP from your HSM domain administrator.
diff --git a/src/agent/Cargo.lock b/src/agent/Cargo.lock
index fbe871d827..67f4319dfc 100644
--- a/src/agent/Cargo.lock
+++ b/src/agent/Cargo.lock
@@ -3085,6 +3085,7 @@ dependencies = [
  "rtnetlink",
  "runtime-spec",
  "rustjail",
+ "s390_pv_core",
  "safe-path",
  "scan_fmt",
  "scopeguard",
@@ -5575,6 +5576,20 @@ version = "0.2.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "6518fc26bced4d53678a22d6e423e9d8716377def84545fe328236e3af070e7f"
 
+[[package]]
+name = "s390_pv_core"
+version = "0.11.0"
+source = "git+https://github.com/ibm-s390-linux/s390-tools?rev=4942504a9a2977d49989a5e5b7c1c8e07dc0fa41#4942504a9a2977d49989a5e5b7c1c8e07dc0fa41"
+dependencies = [
+ "byteorder",
+ "libc",
+ "log",
+ "regex",
+ "serde",
+ "thiserror 2.0.12",
+ "zerocopy 0.7.35",
+]
+
 [[package]]
 name = "safe-path"
 version = "0.1.0"
@@ -7768,6 +7783,7 @@ version = "0.7.35"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "1b9b4fd18abc82b8136838da5d50bae7bdea537c574d8dc1a34ed098d6c166f0"
 dependencies = [
+ "byteorder",
  "zerocopy-derive 0.7.35",
 ]
 
diff --git a/src/agent/Cargo.toml b/src/agent/Cargo.toml
index 544e3d235c..170668bdde 100644
--- a/src/agent/Cargo.toml
+++ b/src/agent/Cargo.toml
@@ -187,6 +187,9 @@ base64 = "0.22"
 sha2 = "0.10.8"
 async-compression = { version = "0.4.22", features = ["tokio", "gzip"] }
 
+[target.'cfg(target_arch = "s390x")'.dependencies]
+pv_core = { git = "https://github.com/ibm-s390-linux/s390-tools", rev = "4942504a9a2977d49989a5e5b7c1c8e07dc0fa41", package = "s390_pv_core" }
+
 [dev-dependencies]
 tempfile.workspace = true
 which.workspace = true
diff --git a/src/agent/src/cdh.rs b/src/agent/src/cdh.rs
index f3be4bc701..07f7347578 100644
--- a/src/agent/src/cdh.rs
+++ b/src/agent/src/cdh.rs
@@ -11,8 +11,12 @@ use crate::AGENT_CONFIG;
 use anyhow::{bail, Context, Result};
 use derivative::Derivative;
 use protocols::{
-    confidential_data_hub, confidential_data_hub_ttrpc_async,
-    confidential_data_hub_ttrpc_async::{SealedSecretServiceClient, SecureMountServiceClient},
+    confidential_data_hub,
+    confidential_data_hub::GetResourceRequest,
+    confidential_data_hub_ttrpc_async,
+    confidential_data_hub_ttrpc_async::{
+        GetResourceServiceClient, SealedSecretServiceClient, SecureMountServiceClient,
+    },
 };
 use std::fs;
 use std::os::unix::fs::symlink;
@@ -39,6 +43,8 @@ pub struct CDHClient {
     sealed_secret_client: SealedSecretServiceClient,
     #[derivative(Debug = "ignore")]
     secure_mount_client: SecureMountServiceClient,
+    #[derivative(Debug = "ignore")]
+    get_resource_client: GetResourceServiceClient,
 }
 
 impl CDHClient {
@@ -47,10 +53,13 @@ impl CDHClient {
         let sealed_secret_client =
             confidential_data_hub_ttrpc_async::SealedSecretServiceClient::new(client.clone());
         let secure_mount_client =
-            confidential_data_hub_ttrpc_async::SecureMountServiceClient::new(client);
+            confidential_data_hub_ttrpc_async::SecureMountServiceClient::new(client.clone());
+        let get_resource_client =
+            confidential_data_hub_ttrpc_async::GetResourceServiceClient::new(client);
         Ok(CDHClient {
             sealed_secret_client,
             secure_mount_client,
+            get_resource_client,
         })
     }
 
@@ -84,6 +93,18 @@ impl CDHClient {
             .await?;
         Ok(())
     }
+
+    pub async fn get_resource(&self, resource_path: &str) -> Result<Vec<u8>> {
+        let req = GetResourceRequest {
+            ResourcePath: format!("kbs://{}", resource_path),
+            ..Default::default()
+        };
+        let res = self
+            .get_resource_client
+            .get_resource(ttrpc::context::with_timeout(*CDH_API_TIMEOUT), &req)
+            .await?;
+        Ok(res.Resource)
+    }
 }
 
 pub async fn init_cdh_client(cdh_socket_uri: &str) -> Result<()> {
@@ -201,6 +222,15 @@ pub async fn secure_mount(
     Ok(())
 }
 
+#[allow(dead_code)]
+pub async fn get_cdh_resource(resource_path: &str) -> Result<Vec<u8>> {
+    let cdh_client = CDH_CLIENT
+        .get()
+        .expect("Confidential Data Hub not initialized");
+
+    cdh_client.get_resource(resource_path).await
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
diff --git a/src/agent/src/device/vfio_device_handler.rs b/src/agent/src/device/vfio_device_handler.rs
index 59f89e8cb1..e0fd4d00da 100644
--- a/src/agent/src/device/vfio_device_handler.rs
+++ b/src/agent/src/device/vfio_device_handler.rs
@@ -4,14 +4,13 @@
 // SPDX-License-Identifier: Apache-2.0
 //
 
-#[cfg(target_arch = "s390x")]
-use crate::ap;
 use crate::device::{pcipath_to_sysfs, DevUpdate, DeviceContext, DeviceHandler, SpecUpdate};
 use crate::linux_abi::*;
 use crate::pci;
 use crate::sandbox::Sandbox;
 use crate::uevent::{wait_for_uevent, Uevent, UeventMatcher};
 use anyhow::{anyhow, Context, Result};
+use cfg_if::cfg_if;
 use kata_types::device::{
     DRIVER_VFIO_AP_COLD_TYPE, DRIVER_VFIO_AP_TYPE, DRIVER_VFIO_PCI_GK_TYPE, DRIVER_VFIO_PCI_TYPE,
 };
@@ -27,6 +26,22 @@ use std::sync::Arc;
 use tokio::sync::Mutex;
 use tracing::instrument;
 
+cfg_if! {
+    if #[cfg(target_arch = "s390x")] {
+        use crate::ap;
+        use crate::cdh::get_cdh_resource;
+        use std::convert::TryFrom;
+        use pv_core::ap::{
+            Apqn,
+            apqn_info::Ep11,
+            assoc_state::AssocState,
+            bind_state::BindState,
+        };
+        use pv_core::misc::{encode_hex, pv_guest_bit_set};
+        use pv_core::uv;
+    }
+}
+
 #[derive(Debug)]
 pub struct VfioPciDeviceHandler {}
 
@@ -103,7 +118,14 @@ impl DeviceHandler for VfioApDeviceHandler {
     #[instrument]
     async fn device_handler(&self, device: &Device, ctx: &mut DeviceContext) -> Result<SpecUpdate> {
         // Force AP bus rescan
-        fs::write(AP_SCANS_PATH, "1")?;
+        let mut ap_context = String::from("Failed to rescan AP bus");
+        if pv_guest_bit_set() {
+            ap_context.push_str(
+                ". Verify your host kernel supports AP pass-through with Secure Execution",
+            );
+        }
+        fs::write(AP_SCANS_PATH, "1").context(ap_context)?;
+
         for apqn in device.options.iter() {
             let ap_address = ap::Address::from_str(apqn).context("Failed to parse AP address")?;
             match device.type_.as_str() {
@@ -111,7 +133,7 @@ impl DeviceHandler for VfioApDeviceHandler {
                     wait_for_ap_device(ctx.sandbox, ap_address).await?;
                 }
                 DRIVER_VFIO_AP_COLD_TYPE => {
-                    check_ap_device(ctx.sandbox, ap_address).await?;
+                    check_ap_device(ap_address).await?;
                 }
                 _ => return Err(anyhow!("Unsupported AP device type: {}", device.type_)),
             }
@@ -214,35 +236,70 @@ async fn wait_for_ap_device(sandbox: &Arc<Mutex<Sandbox>>, address: ap::Address)
 
 #[cfg(target_arch = "s390x")]
 #[instrument]
-async fn check_ap_device(sandbox: &Arc<Mutex<Sandbox>>, address: ap::Address) -> Result<()> {
-    let ap_path = format!(
-        "/sys/{}/card{:02x}/{}/online",
-        AP_ROOT_BUS_PATH, address.adapter_id, address
-    );
-    if !Path::new(&ap_path).is_file() {
-        return Err(anyhow!(
-            "AP device online file not found or not accessible: {}",
-            ap_path
-        ));
+async fn check_ap_device(address: ap::Address) -> Result<()> {
+    let apqn = Apqn::try_from(&address.to_string() as &str)
+        .context(anyhow!("Failed to establish AP at {address}"))?;
+    if apqn.info.is_none() {
+        return Err(anyhow!("Failed to read info for AP {address}"));
     }
-    match fs::read_to_string(&ap_path) {
-        Ok(content) => {
-            let is_online = content.trim() == "1";
-            if !is_online {
-                return Err(anyhow!("AP device {} exists but is not online", address));
-            }
-        }
-        Err(e) => {
+    if !pv_guest_bit_set() {
+        return Ok(());
+    }
+    apqn.set_bind_state(BindState::Bound)
+        .context(anyhow!("Failed to bind AP {address}"))?;
+    if let Some(Ep11(ep11_info)) = &apqn.info {
+        if ep11_info.mkvp.is_empty() {
             return Err(anyhow!(
-                "Failed to read online status for AP device {}: {}",
-                address,
-                e
+                "Master key verification pattern for AP {address} is unset"
             ));
         }
+        associate_ap_device(&apqn, &ep11_info.mkvp)
+            .await
+            .context(anyhow!("Failed to associate AP {address}"))?;
     }
     Ok(())
 }
 
+#[cfg(target_arch = "s390x")]
+async fn associate_ap_device(apqn: &Apqn, mkvp: &str) -> Result<()> {
+    let resource_path = format!("/vfio_ap/{mkvp}");
+    let secret_resource_path = format!("{resource_path}/secret");
+    let secret_id_resource_path = format!("{resource_path}/secret_id");
+
+    let uv_secret = get_cdh_resource(&secret_resource_path)
+        .await
+        .context(anyhow!(
+            "Failed to read Confidential Data Hub secret {secret_resource_path}. \
+            Provide the desired Ultravisor secret for this MKVP with an appropriate key broker service."
+        ))?;
+    let secret_id_bytes = get_cdh_resource(&secret_id_resource_path)
+        .await
+        .context(anyhow!(
+            "Failed to read Confidential Data Hub secret {secret_id_resource_path}. \
+            Provide the desired Ultravisor secret ID for this MKVP with an appropriate key broker service."
+        ))?;
+    let secret_id = std::str::from_utf8(&secret_id_bytes)?
+        .trim_start_matches("0x")
+        .trim_end();
+
+    // TODO Once initdata is stable, enable and mandate this request be signed
+    // (`pvsecret create --user-sign-key`, `pvsecret verify --user-cert`)
+    let uv = uv::UvDevice::open()?;
+    let mut add_cmd = uv::AddCmd::new(&mut uv_secret.as_slice())
+        .context("Failed to create add secret request")?;
+    uv.send_cmd(&mut add_cmd).context("Failed to add secret")?;
+    let mut list_cmd = uv::ListCmd::new();
+    uv.send_cmd(&mut list_cmd)?;
+
+    let secret_idx = uv::SecretList::try_from(list_cmd)?
+        .iter()
+        .find(|&s| encode_hex(s.id()) == secret_id)
+        .ok_or_else(|| anyhow!("Could not find secret with the ID {secret_id}. \
+            Perhaps there is a mismatch between the provided secret and secret ID."))?
+        .index();
+    Ok(apqn.set_associate_state(AssocState::Associated(secret_idx))?)
+}
+
 pub async fn wait_for_pci_device(
     sandbox: &Arc<Mutex<Sandbox>>,
     pcipath: &pci::Path,
diff --git a/src/libs/protocols/protos/confidential_data_hub.proto b/src/libs/protocols/protos/confidential_data_hub.proto
index 8752925a0c..f639c94c98 100644
--- a/src/libs/protocols/protos/confidential_data_hub.proto
+++ b/src/libs/protocols/protos/confidential_data_hub.proto
@@ -34,4 +34,16 @@ service SealedSecretService {
 
 service SecureMountService {
     rpc SecureMount(SecureMountRequest) returns (SecureMountResponse) {};
+}
+
+message GetResourceRequest {
+    string ResourcePath = 1;
+}
+
+message GetResourceResponse {
+    bytes Resource = 1;
+}
+
+service GetResourceService {
+    rpc GetResource(GetResourceRequest) returns (GetResourceResponse) {};
 }
\ No newline at end of file
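
Note on the naming convention used by this patch: the documentation derives the Trustee path from the first 32 hexadecimal digits of the MKVP, and `associate_ap_device` trims an optional `0x` prefix and trailing whitespace from the secret ID. The following is a minimal standalone sketch of those string conventions, not code from the patch. The helper names `mkvp_path_component`, `resource_paths`, and `normalize_secret_id` are hypothetical, and whether `pv_core` already returns the MKVP in truncated form is not something this sketch assumes.

```rust
/// Hypothetical helper (not part of the patch): derive the path component
/// from an MKVP as printed in sysfs, e.g. "0xdb3c...00000000".
/// Keeps the first 32 hexadecimal digits after an optional "0x" prefix.
fn mkvp_path_component(mkvp: &str) -> Option<String> {
    let hex = mkvp.trim().trim_start_matches("0x");
    // `get` returns None if fewer than 32 bytes are available.
    let prefix = hex.get(..32)?;
    if prefix.chars().all(|c| c.is_ascii_hexdigit()) {
        Some(prefix.to_ascii_lowercase())
    } else {
        None
    }
}

/// Hypothetical helper: build the two resource paths with the same format
/// strings that `associate_ap_device` uses.
fn resource_paths(mkvp: &str) -> (String, String) {
    let base = format!("/vfio_ap/{mkvp}");
    (format!("{base}/secret"), format!("{base}/secret_id"))
}

/// Hypothetical helper: normalize a secret ID the way the patch does,
/// i.e. drop a leading "0x" and any trailing whitespace or newline.
fn normalize_secret_id(raw: &[u8]) -> Result<&str, std::str::Utf8Error> {
    Ok(std::str::from_utf8(raw)?
        .trim_start_matches("0x")
        .trim_end())
}
```

Calling `mkvp_path_component` on the `WK CUR` value from the documentation and passing the result to `resource_paths` should give the same `vfio_ap/.../secret` and `vfio_ap/.../secret_id` locations used in the `kbs-client` commands, with the leading `/` that the agent prepends.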