diff --git a/docs/how-to/how-to-pull-images-in-guest-with-kata.md b/docs/how-to/how-to-pull-images-in-guest-with-kata.md
index 56f057ffbd..7e1918e24e 100644
--- a/docs/how-to/how-to-pull-images-in-guest-with-kata.md
+++ b/docs/how-to/how-to-pull-images-in-guest-with-kata.md
@@ -137,7 +137,7 @@ snapshotter = "nydus"
 $ sudo systemctl restart containerd
 ```
 
-## Verification
+## Run a pod in Kata Containers, pulling the image in the guest
 
 To verify pulling images in a guest VM, please refer to the following commands:
 
@@ -148,8 +148,6 @@ apiVersion: v1
 kind: Pod
 metadata:
   name: busybox
-  annotations:
-    io.containerd.cri.runtime-handler: kata-qemu
 spec:
   runtimeClassName: kata-qemu
   containers:
@@ -163,9 +161,6 @@ NAME                        READY   STATUS    RESTARTS   AGE
 busybox                     1/1     Running   0          10s
 ```
 
-> **Notes:**
-> The `CRI Runtime Specific Snapshotter` is still an experimental feature. To pull images in the guest under the specific kata runtime (such as `kata-qemu`), we need to add the following annotation in metadata to each pod yaml: `io.containerd.cri.runtime-handler: kata-qemu`. By adding the annotation, we can ensure that the feature works as expected.
-
 2. Verify that the pod's images have been successfully downloaded in the guest. If images intended for deployment are deleted prior to deploying with `nydus snapshotter`, the root filesystems required for the pod's images (including the pause image and the container image) should not be present on the host.
 
 ```bash
@@ -173,4 +168,145 @@ $ sandbox_id=$(ps -ef| grep containerd-shim-kata-v2| grep -oP '(?<=-id\s)[a-f0-9]+' | tail -1)
 $ rootfs_count=$(find /run/kata-containers/shared/sandboxes/$sandbox_id -name rootfs -type d| grep -o "rootfs" | wc -l)
 $ echo $rootfs_count
 0
+```
+
+## Run a pod in Kata Containers, pulling a large image in the guest
+
+Currently, an image pulled in the guest is downloaded and unpacked under the `/run/kata-containers/image` directory. By default, the confidential rootfs image mounts `/run` as a `tmpfs`, and systemd sizes it to 50% of the available physical RAM. Memory is a scarce resource, especially for confidential containers: with the default configuration (2048 MiB of memory per VM), `/run` gets roughly 1024 MiB, so only images up to about 1024 MiB can be pulled in the guest. To lift this limit, we can attach a block volume from the host and encrypt it in the guest with `dm-crypt` and `dm-integrity`, which provides a secure place to store downloaded container images.
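+
+Before switching to a block volume, you can confirm this limit from inside the guest. The snippet below is a minimal sketch, assuming the guest debug console is enabled (`debug_console_enabled = true` in the Kata `configuration.toml`) and reusing the `$sandbox_id` captured in the verification step above:
+
+```bash
+# On the host: open a debug console into the guest VM of the running sandbox
+$ sudo kata-runtime exec $sandbox_id
+# Inside the guest: /run is a tmpfs sized to about half of the VM memory,
+# so a 2048 MiB VM reports roughly 1 GiB here
+$ df -h /run
+```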
+
+### Create block volume with k8s
+
+Many CSI plugins support block volumes: AWS EBS, Azure Disk, Open-Local and so on. As an example, we use a Local Persistent Volume to expose a local disk as block storage to the k8s cluster.
+
+1. Create an empty disk image and attach it to a loop device, such as `/dev/loop0`
+```bash
+$ loop_file="/tmp/trusted-image-storage.img"
+$ sudo dd if=/dev/zero of=$loop_file bs=1M count=2500
+$ sudo losetup /dev/loop0 $loop_file
+```
+
+2. Create a Storage Class
+```yaml
+apiVersion: storage.k8s.io/v1
+kind: StorageClass
+metadata:
+  name: local-storage
+provisioner: kubernetes.io/no-provisioner
+volumeBindingMode: WaitForFirstConsumer
+```
+
+3. Create a Persistent Volume (replace `NODE_NAME` with the name of the node that owns `/dev/loop0`)
+```yaml
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: trusted-block-pv
+spec:
+  capacity:
+    storage: 10Gi
+  volumeMode: Block
+  accessModes:
+  - ReadWriteOnce
+  persistentVolumeReclaimPolicy: Retain
+  storageClassName: local-storage
+  local:
+    path: /dev/loop0
+  nodeAffinity:
+    required:
+      nodeSelectorTerms:
+      - matchExpressions:
+        - key: kubernetes.io/hostname
+          operator: In
+          values:
+          - NODE_NAME
+```
+
+4. Create a Persistent Volume Claim
+```yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: trusted-pvc
+spec:
+  accessModes:
+  - ReadWriteOnce
+  resources:
+    requests:
+      storage: 1Gi
+  volumeMode: Block
+  storageClassName: local-storage
+```
+
+5. Run a pod that pulls a large image in the guest
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: large-image-pod
+spec:
+  runtimeClassName: kata-qemu
+  affinity:
+    nodeAffinity:
+      requiredDuringSchedulingIgnoredDuringExecution:
+        nodeSelectorTerms:
+        - matchExpressions:
+          - key: kubernetes.io/hostname
+            operator: In
+            values:
+            - NODE_NAME
+  volumes:
+  - name: trusted-storage
+    persistentVolumeClaim:
+      claimName: trusted-pvc
+  containers:
+  - name: app-container
+    image: quay.io/confidential-containers/test-images:largeimage
+    command: ["/bin/sh", "-c"]
+    args:
+    - sleep 6000
+    volumeDevices:
+    - devicePath: /dev/trusted_store
+      name: trusted-storage
+```
+
+6. Check the image size on the host: at 2.15 GB it does not fit in the default `/run` `tmpfs` of the guest
+```bash
+$ docker image ls|grep "largeimage"
+quay.io/confidential-containers/test-images   largeimage   00bc1f6c893a   4 months ago   2.15GB
+```
+
+7. Enter the guest VM and check that the block device is encrypted and in use
+```bash
+$ lsblk --fs
+NAME                     FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
+sda
+└─encrypted_disk_GsLDt                           178M    87% /run/kata-containers/image
+
+$ cryptsetup status encrypted_disk_GsLDt
+/dev/mapper/encrypted_disk_GsLDt is active and is in use.
+  type:    LUKS2
+  cipher:  aes-xts-plain64
+  keysize: 512 bits
+  key location: keyring
+  device:  /dev/sda
+  sector size:  4096
+  offset:  32768 sectors
+  size:    5087232 sectors
+  mode:    read/write
+
+$ mount|grep "encrypted_disk_GsLDt"
+/dev/mapper/encrypted_disk_GsLDt on /run/kata-containers/image type ext4
+
+$ du -h --max-depth=1 /run/kata-containers/image/
+16K	/run/kata-containers/image/lost+found
+2.1G	/run/kata-containers/image/layers
+60K	/run/kata-containers/image/overlay
+2.1G	/run/kata-containers/image/
+
+$ free -m
+               total        used        free      shared  buff/cache   available
+Mem:            1989          52          43           0        1893        1904
+Swap:              0           0           0
 ```
\ No newline at end of file