Device mapper is the only supported block device driver so far,
which is limiting: Kata Containers can work well with other
block devices. It is necessary to enhance support for multiple
kinds of host block device.
Fixes: #4714
Signed-off-by: yuchen.cc <yuchen.cc@alibaba-inc.com>
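As a rough illustration of the direction above, here is a minimal sketch of how a runtime could detect that a host path is any kind of block device (device-mapper, loop, NVMe, ...) rather than assuming a device-mapper target; the helper name is hypothetical and this is not the actual Kata code:
```
package main

import (
	"fmt"
	"os"
)

// isHostBlockDevice reports whether path refers to any host block device,
// not only a device-mapper target.
func isHostBlockDevice(path string) (bool, error) {
	info, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	mode := info.Mode()
	// A block device is a device file that is not a character device.
	return mode&os.ModeDevice != 0 && mode&os.ModeCharDevice == 0, nil
}

func main() {
	ok, err := isHostBlockDevice("/dev/dm-0")
	fmt.Println(ok, err)
}
```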
This reverts commit b0157ad73a.
```
commit b0157ad73a
Refs: 3.3.0-alpha0-124-gb0157ad73
Author: Fabiano Fidêncio <fabiano.fidencio@intel.com>
AuthorDate: Fri Aug 11 14:55:11 2023 +0200
Commit: Fabiano Fidêncio <fabiano.fidencio@intel.com>
CommitDate: Fri Nov 10 12:58:20 2023 +0100
runtime: confidential: Do not set the max_vcpu to cpu
We don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
```
That commit removed a requirement that was in place previously, but due
to the SMP issue we're facing with the QEMU used for TDX (see commit
d1b54ede290e95762099fff4e0bcdad10f816126), QEMU will fail to start due
to:
```
Invalid CPU topology: product of the hierarchy must match maxcpus:
sockets (1) * dies (1) * cores (1) * threads (1) != maxcpus (240)"
```
This has no effect on the SEV / SNP workflow, and hopefully we'll be able
to re-revert this soon enough, once this gets solved on the QEMU side.
Last but not least, this is not a "clean" revert as we're using
conf.NumVCPUs() instead of conf.NumVCPUs, to ensure we're dealing with
uint32.
Fixes: #8532
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
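For reference, the constraint behind the QEMU error quoted above is simply that the product of the vCPU topology must equal maxcpus. A tiny illustrative check (not QEMU or runtime code):
```
package main

import "fmt"

// topologyMatchesMaxCPUs mirrors the invariant from the QEMU error message:
// sockets * dies * cores * threads must equal maxcpus.
func topologyMatchesMaxCPUs(sockets, dies, cores, threads, maxCPUs uint32) bool {
	return sockets*dies*cores*threads == maxCPUs
}

func main() {
	// 1 * 1 * 1 * 1 != 240 reproduces the failure described above.
	fmt.Println(topologyMatchesMaxCPUs(1, 1, 1, 1, 240)) // false
}
```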
- Update the remote hypervisor code to match the re-generated code for
the ttrpc Hypervisor Service
Fixes: #8519
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The new clean-generated-files make target allows for removing the
generated files (including the configuration.toml files).
The tools/packaging/static-build/shim-v2/build.sh script now uses that
target to always force the re-generation of those files.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
1) Create a storage for each `io.katacontainers.volume=` entry in rootFs.Options,
and then aggregate all of these storages into `containerStorages`.
2) Create storages for the other data volumes and push them into `volumeStorages`.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
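A minimal sketch of the aggregation described above, using stand-in types and a hypothetical helper name rather than the runtime's real virtcontainers code:
```
package storages

import "strings"

// storage is a placeholder for the agent's Storage type.
type storage struct {
	Source string
}

const virtualVolumePrefix = "io.katacontainers.volume="

// splitStorages creates one storage per io.katacontainers.volume= option and
// aggregates them into containerStorages, while other data-volume storages
// are pushed into volumeStorages.
func splitStorages(rootFsOptions []string, dataVolumes []storage) (containerStorages, volumeStorages []storage) {
	for _, opt := range rootFsOptions {
		if strings.HasPrefix(opt, virtualVolumePrefix) {
			containerStorages = append(containerStorages,
				storage{Source: strings.TrimPrefix(opt, virtualVolumePrefix)})
		}
	}
	volumeStorages = append(volumeStorages, dataVolumes...)
	return containerStorages, volumeStorages
}
```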
To improve the construction and management of `KataVirtualVolume` storages,
this commit expands the `sharedFile` structure to manage both
rootfs storages (`containerStorages`), including those built from `KataVirtualVolume`, and other data volume storages (`volumeStorages`).
NOTE: `volumeStorages` is intended for future extensions to support Kubernetes data volumes.
Currently, `KataVirtualVolume` is used exclusively for the container rootfs, hence only `containerStorages` is actively utilized.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
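A simplified sketch of the expanded structure described above; the field types are placeholders, not the runtime's actual definitions:
```
package storages

// storageEntry stands in for the agent Storage type referenced above.
type storageEntry struct {
	Driver string
	Source string
}

// sharedFile sketches the expanded structure: rootfs storages built from
// KataVirtualVolume land in containerStorages, while volumeStorages is
// reserved for future Kubernetes data-volume support and is currently unused.
type sharedFile struct {
	containerStorages []*storageEntry
	volumeStorages    []*storageEntry
	guestPath         string
}
```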
The snapshotter will place `KataVirtualVolume` information
into `rootfs.options`, prefixed with `io.katacontainers.volume=`.
This commit transforms the encapsulated `KataVirtualVolume` data into device information.
Fixes: #8495
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Feng Wang <feng.wang@databricks.com>
Co-authored-by: Samuel Ortiz <sameo@linux.intel.com>
Co-authored-by: Wedson Almeida Filho <walmeida@microsoft.com>
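A hedged sketch of that transformation, assuming the snapshotter encodes the `KataVirtualVolume` as base64-encoded JSON after the `io.katacontainers.volume=` prefix; the struct below is a trimmed stand-in for the real type:
```
package storages

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

const virtualVolumeOptionPrefix = "io.katacontainers.volume="

// kataVirtualVolume is a trimmed stand-in for the runtime's KataVirtualVolume.
type kataVirtualVolume struct {
	VolumeType string `json:"volume_type"`
	Source     string `json:"source"`
	FSType     string `json:"fs_type"`
}

// parseVirtualVolumeOption turns one prefixed rootfs option back into the
// volume/device information the runtime needs.
func parseVirtualVolumeOption(opt string) (*kataVirtualVolume, error) {
	if !strings.HasPrefix(opt, virtualVolumeOptionPrefix) {
		return nil, fmt.Errorf("option %q lacks prefix %q", opt, virtualVolumeOptionPrefix)
	}
	raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(opt, virtualVolumeOptionPrefix))
	if err != nil {
		return nil, err
	}
	vol := &kataVirtualVolume{}
	if err := json.Unmarshal(raw, vol); err != nil {
		return nil, err
	}
	return vol, nil
}
```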
Add the corresponding data structure to the runtime part, matching
kata-containers/kata-containers/pull/7698.
Fixes: #8472
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
In order to support different pod VM instance types via the
remote hypervisor implementation (cloud-api-adaptor),
we need to pass the machine_type, default_vcpus
and default_memory annotations to cloud-api-adaptor.
cloud-api-adaptor then uses these annotations to spin
up the appropriate cloud instance.
Reference PR for cloud-api-adaptor:
https://github.com/confidential-containers/cloud-api-adaptor/pull/1088
Fixes: #7140
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
(based on commit 004f07f076)
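To illustrate, a minimal sketch of picking up those annotations; the keys are the usual Kata hypervisor annotation keys, and the function is illustrative, not the actual cloud-api-adaptor code:
```
package main

import "fmt"

const (
	annotMachineType   = "io.katacontainers.config.hypervisor.machine_type"
	annotDefaultVCPUs  = "io.katacontainers.config.hypervisor.default_vcpus"
	annotDefaultMemory = "io.katacontainers.config.hypervisor.default_memory"
)

// selectInstanceHints picks the annotations cloud-api-adaptor can use to
// choose an appropriate pod VM instance type.
func selectInstanceHints(podAnnotations map[string]string) map[string]string {
	hints := map[string]string{}
	for _, key := range []string{annotMachineType, annotDefaultVCPUs, annotDefaultMemory} {
		if v, ok := podAnnotations[key]; ok {
			hints[key] = v
		}
	}
	return hints
}

func main() {
	fmt.Println(selectInstanceHints(map[string]string{
		annotMachineType: "t3.large", // example value only
	}))
}
```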
This patch updates the template configuration file for
the remote hypervisor to set static_sandbox_resource_mgmt
to be true. The remote hypervisor uses the peer pod config
to determine the sandbox size, so requires this to be set to
true by default.
Fixes: #6616
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit 938447803b)
Add the SELinux setting to ensure it is passed through to the remote
hypervisor
Fixes: #5936
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 3ef2fd1784)
This patch adds support for the remote hypervisor type.
The shim opens a Unix domain socket specified in the config file,
and sends ttrpc requests to an external process to control
sandbox VMs.
Fixes: #4482
Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit f9278f22c3)
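A minimal sketch of that transport, assuming the containerd ttrpc library; the socket path is illustrative, and the generated Hypervisor service client built on top of the ttrpc client is omitted:
```
package main

import (
	"log"
	"net"

	"github.com/containerd/ttrpc"
)

// dialRemoteHypervisor connects to the Unix domain socket from the config
// and wraps the connection in a ttrpc client, through which the shim sends
// requests (CreateVM, StartVM, ...) to the external process.
func dialRemoteHypervisor(socketPath string) (*ttrpc.Client, error) {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return nil, err
	}
	return ttrpc.NewClient(conn), nil
}

func main() {
	client, err := dialRemoteHypervisor("/run/peerpod/hypervisor.sock") // path is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
}
```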
This patch adds a protobuf definition of the remote hypervisor type.
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 150e8aba6d)
This PR fixes Kubernetes configmap/secret (etc.) update propagation when filesystem sharing is disabled.
The commit introduces the changes below, with some limitations:
- creates a new timestamped directory in the guest
- updates the '..data' symlink
- creates user-visible symlinks to the newly created secrets.
- Limitation: the older timestamped directory and stale user-visible symlinks remain
in the guest due to the missing DELETE API in the agent.
Fixes: #7398
Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
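A rough sketch, in the spirit of Kubernetes' AtomicWriter, of the timestamped-directory plus `..data` symlink swap described above; paths and names are illustrative, not the agent's actual implementation:
```
package main

import (
	"log"
	"os"
	"path/filepath"
	"time"
)

// publishUpdate writes the updated configmap/secret content into a new
// timestamped directory and atomically repoints the ..data symlink to it.
// Note: the previous timestamped directory is not removed here, matching
// the limitation described above.
func publishUpdate(volumeDir string, files map[string][]byte) error {
	tsDir := filepath.Join(volumeDir, time.Now().Format("..2006_01_02_15_04_05.000000000"))
	if err := os.MkdirAll(tsDir, 0o755); err != nil {
		return err
	}
	for name, data := range files {
		if err := os.WriteFile(filepath.Join(tsDir, name), data, 0o644); err != nil {
			return err
		}
	}
	// Create a temporary symlink and rename it over ..data so readers always
	// see a complete directory.
	tmpLink := filepath.Join(volumeDir, "..data_tmp")
	if err := os.Symlink(filepath.Base(tsDir), tmpLink); err != nil {
		return err
	}
	return os.Rename(tmpLink, filepath.Join(volumeDir, "..data"))
}

func main() {
	err := publishUpdate("/run/kata-containers/shared/volume", map[string][]byte{"key": []byte("value")})
	if err != nil {
		log.Fatal(err)
	}
}
```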
Improve the `CODEOWNERS` file by specifying more groups.
Since GitHub automatically checks the `CODEOWNERS` file when a PR is
created and adds all matching groups as reviewers for the PR, this may
help reduce the PR backlog since the right people will be alerted and
requested to review the PR. That should improve the quality of reviews
(and thus the quality of the landed code). It may also have a positive
effect on PR velocity.
> **Note:**
>
> This PR combines the other `CODEOWNERS` files so we have
> a single, visible, top-level file.
See: https://github.com/kata-containers/community/issues/253
Fixes: #3804.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
This commit enables the StratoVirt hypervisor to be tested in the kata GHA workflows,
including k8s, metrics, cri-containerd, nydus and so on.
It also adds some unit tests for StratoVirt to make sure it works.
Fixes: #7794
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
Add configuration-stratovirt.toml.in to generate the StratoVirt configuration,
and a parser to deliver the config to StratoVirt.
Fixes: #7794
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
Initial support for the MicroVM machine type of the StratoVirt
hypervisor in the kata Go runtime.
Fixes: #7794
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
With the change made in the last commit, instead of calculating milli
CPUs, we're actually converting the CPUs to a fractional number, a float.
Let's update the function name (and associated vars) to reflect that
change.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
First of all, this is a controversial piece, and I know that.
In this commit we're taking a less greedy approach regarding the
amount of vCPUs we allocate for the VMM, which will be advantageous
mainly when using the `static_sandbox_resource_mgmt` feature, which is
used by the confidential guests.
The current approach we have basically does:
* Gets the amount of vCPUs set in the config (an integer)
* Gets the amount of vCPUs set as limit (an integer)
* Sum those up
* Starts / Updates the VMM to use that total amount of vCPUs
The fact we're dealing with integers is logical, as we cannot request
500m vCPUs from the VMMs. However, it leads us, in several cases, to
waste one vCPU.
Let's take the example that we know the VMM requires 500m vCPUs to be
running, and the workload sets 250m vCPUs as a resource limit.
In that case, we'd do:
* Gets the amount of vCPUs set in the config: 1
* Gets the amount of vCPUs set as limit: ceil(0.25)
* 1 + ceil(0.25) = 1 + 1 = 2 vCPUs
* Starts / Updates the VMM to use 2 vCPUs
With the logic changed here, what we're doing is treating everything
as a float until just before we start / update the VMM. So, the flow
described above would be:
* Gets the amount of vCPUs set in the config: 0.5
* Gets the amount of vCPUs set as limit: 0.25
* ceil(0.5 + 0.25) = 1 vCPU
* Starts / Updates the VMM to use 1 vCPU
In the way I've written this patch we introduce zero regressions, as
the default values set are still the same, and those will only be
changed for the TEE use cases (although I can see firecracker, or any
other user of `static_sandbox_resource_mgmt=true` taking advantage of
this).
There is, though, an implicit assumption in this patch that we need to
make explicit: that default_vcpus / default_memory is the amount of
vCPUs / memory required by the VMM, and absolutely nothing else. Also,
the amount set there should be reflected in the podOverhead for the
specific runtime class.
One other possible approach, which I am not that much in favour of
taking as I think it's **less clear**, is that we could actually get the
podOverhead amount, subtract it from the default_vcpus (treating the
result as a float), then sum up what the user set as limit (as a float),
and finally ceil the result. It could work, but IMHO this is **less
clear**, and **less explicit** on what we're actually doing, and how the
default_vcpus / default_memory should be used.
Fixes: #6909
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
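A worked sketch of the two calculations contrasted above, using the 500m / 250m example; the function names are illustrative:
```
package main

import (
	"fmt"
	"math"
)

// oldVCPUCount mimics the integer-based flow: both values are rounded up
// before being summed.
func oldVCPUCount(defaultVCPUs, limitVCPUs float64) uint32 {
	return uint32(math.Ceil(defaultVCPUs)) + uint32(math.Ceil(limitVCPUs))
}

// newVCPUCount keeps everything as a float and rounds up only once, right
// before starting / updating the VMM.
func newVCPUCount(defaultVCPUs, limitVCPUs float64) uint32 {
	return uint32(math.Ceil(defaultVCPUs + limitVCPUs))
}

func main() {
	fmt.Println(oldVCPUCount(0.5, 0.25)) // 2 vCPUs
	fmt.Println(newVCPUCount(0.5, 0.25)) // 1 vCPU
}
```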
We don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a
case where the kernel modules are not loaded. However,
checkKernelModules() executes `modprobe <module>` if a module is not
found in that directory, so loading those modules needs to be denied
temporarily.
Fixes: #8390
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
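One common way to deny module loading in a unit test is to make the probe step replaceable; the hook below is a hypothetical sketch of that pattern, not the actual fix:
```
package check

import (
	"errors"
	"os/exec"
	"testing"
)

// modProbe is the hook used to load a missing kernel module; tests can
// replace it to deny module loading.
var modProbe = func(module string) error {
	return exec.Command("modprobe", module).Run()
}

func TestHostCapableWithModulesMissing(t *testing.T) {
	saved := modProbe
	modProbe = func(string) error { return errors.New("module loading denied in test") }
	defer func() { modProbe = saved }()

	// ... exercise the code that would otherwise run modprobe ...
}
```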
Since moving from network coldplug to hotplug, the only case verified
was that of veth endpoints. Network hotplug support for ipvlan and macvlan
was broken / never added. Fix it.
Fixes: #8391
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
TestCheckHostIsVMContainerCapable is failing on AMD machines:
kata-check_amd64_test.go:96 has no AMD modules, and getCPUType is
missing.
Fixes: #8384.
Signed-off-by: Beraldo Leal <bleal@redhat.com>
This update includes necessary changes due to the version bump of
containerd and its dependencies. It's part of a broader initiative to
phase out gogo protobuf, which has been deprecated, and to align with
the current supported libraries.
Fixes: #7420.
Signed-off-by: Beraldo Leal <bleal@redhat.com>
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.
Fixes: #8378
Signed-off-by: Bo Chen <chen.bo@intel.com>
We used the approach of cold-plugging the network interface for pre-shimv2
docker support. Since the hotplug approach was not required then,
we never really got around to implementing hotplug support for certain
network endpoints, ipvlan and macvlan being among them.
Since moving to the shimv2 interface as the default for the
runtime, we switched to hotplugging the network interface to support
docker and nerdctl. This was done for veth endpoints only.
Implement the hot-attach APIs for ipvlan and macvlan as well, to support
ipvlan and macvlan networks with docker and nerdctl.
Fixes: #8333
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
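A hedged sketch of the shape of the change: ipvlan and macvlan endpoints gain the same hot-attach behaviour veth endpoints already had. The interface and method below are stand-ins, not the runtime's actual Endpoint API:
```
package network

import (
	"context"
	"fmt"
)

// hypervisor is a stand-in for the runtime's hypervisor abstraction.
type hypervisor interface {
	AddNIC(ctx context.Context, ifName string) error
}

type ipvlanEndpoint struct {
	ifName string
}

// HotAttach mirrors what the veth endpoint already does: hot-plug the
// prepared host interface into the running VM.
func (e *ipvlanEndpoint) HotAttach(ctx context.Context, h hypervisor) error {
	if err := h.AddNIC(ctx, e.ifName); err != nil {
		return fmt.Errorf("hot attaching ipvlan endpoint %q: %w", e.ifName, err)
	}
	return nil
}
```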
Add the hypervisor security details to the output of the `kata-runtime
env` and `kata-ctl env` commands so the user can see, amongst other
things, the value of `confidential_guest`.
Fixes: #8313.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>