diff --git a/README.md b/README.md index d5d632b7c7..965f8631d2 100644 --- a/README.md +++ b/README.md @@ -39,6 +39,7 @@ See the [howto documentation](how-to). * [SR-IOV with Kata](./use-cases/using-SRIOV-and-kata.md) * [Intel QAT with Kata](./use-cases/using-Intel-QAT-and-kata.md) * [VPP with Kata](./use-cases/using-vpp-and-kata.md) +* [SPDK vhost-user with Kata](./use-cases/using-SPDK-vhostuser-and-kata.md) ## Developer Guide diff --git a/use-cases/using-SPDK-vhostuser-and-kata.md b/use-cases/using-SPDK-vhostuser-and-kata.md new file mode 100644 index 0000000000..f6d4ddfb45 --- /dev/null +++ b/use-cases/using-SPDK-vhostuser-and-kata.md @@ -0,0 +1,221 @@ +# Setup to run SPDK vhost-user devices with Kata Containers and Docker* + +- [SPDK vhost-user target overview](#spdk-vhost-user-target-overview) +- [Install and setup SPDK vhost-user target](#install-and-setup-spdk-vhost-user-target) + - [Get source code and build SPDK](#get-source-code-and-build-spdk) + - [Run SPDK vhost-user target](#run-spdk-vhost-user-target) +- [Host setup for vhost-user devices](#host-setup-for-vhost-user-devices) +- [Launch a Kata container with SPDK vhost-user block device](#launch-a-kata-container-with-spdk-vhost-user-block-device) + +## SPDK vhost-user Target Overview + +The Storage Performance Development Kit (SPDK) provides a set of tools and +libraries for writing high performance, scalable, user-mode storage applications. + +virtio, vhost and vhost-user: +- virtio is an efficient way to transport data for virtual environments and +guests. It is most commonly used in QEMU VMs, where the VM itself exposes a +virtual PCI device and the guest OS communicates with it using a specific virtio +PCI driver. 
Its diagram is: +``` ++---------+------+--------+----------+--+ +| +------+-------------------+ | +| | +----------+ | | +| user | | | | | +| space | | guest | | | +| | | | | | +| +----+ qemu | +-+------+ | | +| | | | | virtio | | | +| | | | | driver | | | +| | | +-+---++---+ | | +| | +------+-------------------+ | +| | ^ | | +| | | | | +| v | v | ++-+------+---+------------+--+-------+--+ +| |block | +------------+ kvm.ko | | +| |device| | | | +| +------+ +--+-------+ | +| host kernel | ++---------------------------------------+ +``` + +- vhost is a protocol for devices accessible via inter-process communication. It +uses the same virtio queue layout as virtio to allow vhost devices to be mapped +directly to virtio devices. The initial vhost implementation is a part of the +Linux kernel and uses an ioctl interface to communicate with userspace +applications. Its diagram is: +``` ++---------+------+--------+----------+--+ +| +------+-------------------+ | +| | +----------+ | | +| user | | | | | +| space | | guest | | | +| | | | | | +| | qemu | +-+------+ | | +| | | | virtio | | | +| | | | driver | | | +| | +-+-----++-+ | | +| +------+-------------------+ | +| | | +| | | ++-+------+--+-------------+--+--v-------+ +| |block | |vhost-scsi.ko| | kvm.ko | +| |device| | | | | +| +---^--+ +-v---------^-+ +--v-------+ +| | | host | | | +| +-------+ kernel +-------+ | ++---------------------------------------+ +``` + +- vhost-user implements the control plane through Unix domain socket to establish +virtio queue sharing with a user space process on the same host. SPDK exposes +vhost devices via the vhost-user protocol. 
Its diagram is: +``` ++----------------+------+--+----------+-+ +| +------+-------------+ | +| user | +----------+ | | +| space | | | | | +| | | guest | | | +| +-+-------+ | qemu | +-+------+ | | +| | vhost | | | | virtio | | | +| | backend | | | | driver | | | +| +-^-^---^-+ | +-+--+-----+ | | +| | | | | | | | +| | | | +--+---+----V------+-+ | +| | | | | | | | +| | | ++--------+--+ | | | +| | | |unix sockets| | | | +| | | +------------+ | | | +| | | | | | +| | | +-------------+ | | | +| | +--|shared memory|<---+ | | ++----+----+-------------+---+--+----+---+ +| | | | +| +----------------------+ kvm.ko | +| +--+--------+ +| host kernel | ++---------------------------------------+ +``` + +SPDK vhost is a vhost-user slave server. It exposes Unix domain sockets and +allows external applications to connect. It is capable of exposing virtualized +storage devices to QEMU instances or other arbitrary processes. + +Currently, the SPDK vhost-user target can expose these types of virtualized +devices: + +- `vhost-user-blk` +- `vhost-user-scsi` +- `vhost-user-nvme` + +For more information, visit [SPDK](https://spdk.io) and [SPDK vhost-user target](https://spdk.io/doc/vhost.html). + +## Install and setup SPDK vhost-user target + +### Get source code and build SPDK + +Follow the SPDK [getting started guide](https://spdk.io/doc/getting_started.html). + +### Run SPDK vhost-user target + +First, run the SPDK `setup.sh` script to set up some hugepages for the SPDK vhost +target application. We recommend you use a minimum of 4GiB, enough for the SPDK +vhost target and the virtual machine. +This will allocate 4096MiB (4GiB) of hugepages: + +```bash +$ sudo HUGEMEM=4096 scripts/setup.sh +``` + +Then, use the directory `/var/run/kata-containers/vhost-user` as Kata's vhost-user +device directory. 
Make subdirectories for vhost-user sockets and device nodes: + +```bash +$ sudo mkdir -p /var/run/kata-containers/vhost-user/ +$ sudo mkdir -p /var/run/kata-containers/vhost-user/block/ +$ sudo mkdir -p /var/run/kata-containers/vhost-user/block/sockets/ +$ sudo mkdir -p /var/run/kata-containers/vhost-user/block/devices/ +``` + +For more details, see section [Host setup for vhost-user devices](#host-setup-for-vhost-user-devices). + +Next, start the SPDK vhost target application. The following command will start +vhost on the first CPU core with all future socket files placed in +`/var/run/kata-containers/vhost-user/block/sockets/`: + +```bash +$ sudo app/spdk_tgt/spdk_tgt -S /var/run/kata-containers/vhost-user/block/sockets/ & +``` + +To list all available vhost options, run the following command: + +```bash +$ app/spdk_tgt/spdk_tgt -h +``` + +Create an experimental, memory-backed `vhost-user-blk` device: + +- The following RPC will create a 64MB memory block device named `Malloc0` +with 4096-byte block size: + +```bash +$ sudo scripts/rpc.py bdev_malloc_create 64 4096 -b Malloc0 +``` + +- The following RPC will create a `vhost-user-blk` device exposing the `Malloc0` +block device. The device will be accessible via +`/var/run/kata-containers/vhost-user/block/sockets/vhostblk0`: + +```bash +$ sudo scripts/rpc.py vhost_create_blk_controller vhostblk0 Malloc0 +``` + +## Host setup for vhost-user devices + +Considering the OCI specification and the characteristics of vhost-user devices, +Kata has chosen to use the Linux reserved block major range `240-254` +to map each vhost-user block type to a major. Also, a specific directory is +used for vhost-user devices. The base directory is a configurable value, +with the default being `/var/run/kata-containers/vhost-user`. It can be +configured with the `vhost_user_store_path` parameter in the [Kata TOML configuration file](https://github.com/kata-containers/runtime/blob/master/README.md#configuration). 
+ +The rest of the path is fixed: `block` is used for block devices; `block/sockets` is where +we expect vhost-user sockets to live; `block/devices` is where the simulated block +device nodes for vhost-user devices live. + +For example, if using the default directory `/var/run/kata-containers/vhost-user`, +sockets for vhost-user devices are under `/var/run/kata-containers/vhost-user/block/sockets/`. +Device nodes for vhost-user devices are under `/var/run/kata-containers/vhost-user/block/devices/`. + +Currently, Kata has chosen major number 241 to map to `vhost-user-blk` devices. +For the `vhost-user-blk` device `vhostblk0`, create a block device node with major +`241` and minor `0` so that it is recognized by the Kata Containers runtime: + +```bash +$ sudo mknod /var/run/kata-containers/vhost-user/block/devices/vhostblk0 b 241 0 +``` + +## Launch a Kata container with SPDK vhost-user block device + +To use a `vhost-user-blk` device, use Docker to pass a host `vhost-user-blk` +device to the container. In Docker, `--device=HOST-DIR:CONTAINER-DIR` is used +to pass a host device to the container. + +For example: + +```bash +$ sudo docker run --runtime kata-runtime --device=/var/run/kata-containers/vhost-user/block/devices/vhostblk0:/dev/vda -it busybox sh +``` + +Example of performing I/O operations on the `vhost-user-blk` device inside +the container: + +``` +/ # ls -l /dev/vda +brw-r--r-- 1 root root 254, 0 Jan 20 03:54 /dev/vda +/ # dd if=/dev/vda of=/tmp/ddtest bs=4k count=20 +20+0 records in +20+0 records out +81920 bytes (80.0KB) copied, 0.002996 seconds, 26.1MB/s +```
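
The host-side steps from this guide (creating the `block/sockets` and `block/devices` directories, then a device node with major `241`) can be collected into one helper. The following is a minimal sketch and not part of Kata Containers or SPDK: the function name `vhost_user_blk_setup_cmds` and its argument order are hypothetical, and it only prints the commands so they can be reviewed before being run with `sudo`.

```bash
#!/bin/sh
# Hypothetical helper (an illustration, not a Kata or SPDK tool): prints, but
# does not run, the host setup commands for one vhost-user-blk device.
vhost_user_blk_setup_cmds() {
    name="$1"                                         # device name, e.g. vhostblk0
    minor="${2:-0}"                                   # minor number for the device node
    base="${3:-/var/run/kata-containers/vhost-user}"  # value of vhost_user_store_path
    major=241                                         # major this guide maps to vhost-user-blk

    # Directories for vhost-user sockets and simulated block device nodes.
    echo "mkdir -p ${base}/block/sockets"
    echo "mkdir -p ${base}/block/devices"
    # Block device node that the guide passes to the container via --device.
    echo "mknod ${base}/block/devices/${name} b ${major} ${minor}"
}

# Print the commands for the device used throughout this guide.
vhost_user_blk_setup_cmds vhostblk0 0
```

Printing instead of executing keeps the sketch safe to try without root. To apply the commands, pipe the output to `sudo sh` after checking it.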