In order to let upcall being used by Kata Container, we need to add
those patches into kernel build script.
Currently, only when experimental (-e) and hypervisor type dragonball
(-t dragonball) are both enabled, that the upcall patches will be
applied to build a 5.10 guest kernel.
example commands: sh ./build-kernel.sh -e -t dragonball -d setup
fixes: #5642
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Upcall is a direct communication tool between VMM and guest developed
upon vsock. The server side of the upcall is a driver in guest kernel
(kernel patches are needed for this feature) and it'll start to serve
the requests after the kernel starts. And the client side is in
Dragonball VMM , it'll be a thread that communicates with vsock through
uds.
We want to keep the lightweight of the VM through the implementation of
the upcall, through which we could achieve vCPU hotplug, virtio-mmio
hotplug without implementing complex and heavy virtualization features
such as ACPI virtualization.
fixes: #5642
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
As done for different components, let's also use a cached version of the
shim-v2 whenever it's possible.
Fixes: #5838
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In order to cache the shim-v2 we're considering the the cached component
can be used if:
* There were no changes in the runtime directory
* There were no changes in the golang version used
* There were no changes in the rust version used
* We don't build the rust agent, but better be prepared for the future
* There were no changes in the following files that are provided by the
rootfs builds:
* root_hash_vanilla.txt
* root_hash_tdx.txt
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As done for different components, let's also use a cached version of
the rootfs whenever it's possible.
Fixes: #5433
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
e1f075dc60 reworked the action so the
shim-v2 was split out of the matrix build. With that done I ended up
not realising I'd need to log into the quay.io as one step of the
build-asset-cc-shim-v2 job.
Fixes: #5885
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is the most complex part to cache, as the cached component can be
only used if:
* There were no changes in the agent
* There were no changes in the libs (used by the agent)
* There were no changes in the rootfs build scripts
* There is no change in the version of the following components:
* attestation-agent (part of the rootfs)
* gperf (used to build libseccomp)
* libseccomp (used to build the agent)
* pause image (part of the rootfs)
* skopeo (part of the rootfs)
* umoci (part of the rootfs)
* rust (used to build the kata-containers and attestation agents)
We're relying on the last commit merged on places related to the rootfs
generation and using that as the rootfs version and that should be good
enough for what we need.
Apart from everything already mentioned, we've also added the ability to
cache the `root_hash_vanilla.txt` and `root_hash_tdx.txt` files, as
those are needed for when building the shim-v2, in order to have
measured boot working there.
It's important to note that we've added the ability to cache *both*
files, and I've taken that path as the shim-v2 cache work (which will
come soon) relies on both files.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help us, in the future, to debug any possible issue related to
the measured rootfs arguments passed to the shim during the build time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The ability to do a measured boot has been overlooked when releasing the
payload consumed by the Confidential Containers project, and this
happened as we depend, at the shim-v2 build time, of a `root_hash_*.txt`
generated in the `tools/osbuilder/` directory, which is then used to add
a specific parameter to the `kernel_params` in the Kata Containers
configuration files.
With everything said above, the best way we can ensure this is done is
by saving those files during the rootfs build, download them during the
shim-v2 build (which *must* happen only after the rootfs builds happen),
and correctly use them there.
Fixes: #5847
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As Cloud Hypervisor and QEMU are using different rootfs images (the
former with `offline_fs_kbc` as aa_kbc, and the latter with `eaa_kbc`),
we need to differentiate the kernel parameters passed to each one of
those, as the `root_hash.txt` file used for measured boot will differ
according to the rootfs used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
By doing this we can ensure that when building different rootfs-images
we won't end up overring the `root_hash.txt` file.
Plus, this will help us later in this series to pass the correct
argument to be used with the respective image.
Nothing's been done for SEV as it uses a initrd instead of an image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If a pod of kata is deployed on a machine, after the machine restarts, the pod status of kata-deploy will be CrashLoopBackOff.
Fixes: #5868
Signed-off-by: SinghWang <wangxin_0611@126.com>
None of the host namespace paths make sense in the guest. Let's clear
them all before sending the spec to the agent.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We should test is_pid_namespace_enabled before amending the container
spec, where the pid namespace path is cleared and resulting
sandbox_pidns to always being false.
Fixes: #5881
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Strings in Rust don't have \0 at the end, but C does, which leads to `umount2`
in the libc can't get the correct path. Besides, calling `nix::mount::umount2`
to avoid using an unsafe block is a robust solution.
Fixes: #5871
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
Standalone share fs should add virtiofs device in setup_device_before_start_vm
and return the storages to mount the directory in guest. And it uses
hypervisor's jailer root directly instead of jail config.
Besides, we tweaked the parameter, so it adapts to rust version virtiofsd
now. And its cache policy which forbids caching is "never" now, instead of
"none". Hence, we change the default cache mode.
Fixes: #5655
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Adding the `root_hash.txt` to the final tarball doesn't bring any
benefit to the project, as the file dependency is for building the
shim-v2 and passing the correct measurement for the kernel command line.
It's important to mention that when building shim-v2, it doesn't look
for the file in `/opt/confidential-containers/share/kata-containers`,
bur rather in the `${repo_root_dir}/tools/osbuilder/`, as shown here:
ac3683e26e/tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh (L228-L232)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For now, rng init is too slow for kata3.0/dragonball. Enable
random_trust_cpu can speed up rng init when kernel boot.
Fixes: #5870
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
It turns out that there's more work needed to be done on the Cloud
Hypervisor side so we can fully support EAA_KBC with it.
For now, let's remove the configuration as the tests are not currently
passing when using it, and stick to the `offline_fs_kbc` and its
specific image for the Cloud Hypervisor + TDX case.
Fixes: #5862
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `qemu-tdx` configuration is tied to using `offline_fs_kbc` as the
aa_kbc, which is something we're moving away from.
With this in mind, let's rename the `qemu-tdx-eaa-kbc` to `qemu-tdx` and
decrease the amount of the way too many configurations that we ship.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Script to execute to build virtiofsd has been changed in #5426 but not in the doc. This commit update the developer guide.
Fixes: #5860
Signed-off-by: Mathias Flagey <mathiasflagey1201@gmail.com>
Cgroup manager for a container will always be created.
Thus, dropping the option for LinuxContainer.cgroup_manager
is feasible and could simplify the code.
Fixes: #5778
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
This is to differentiate an artifact name between amd64 and s390x and add a
virtiofsd target for s390x.
Fixes: #5851
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Use pidfd_open and poll on newer versions of Linux to wait
for the process to exit. For older versions use existing wait logic
Fixes: #5617
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>