Commit Graph

1745 Commits

Author SHA1 Message Date
Hyounggyu Choi
588f639a69 Merge pull request #6755 from BbolroC/add-se-artifacts-to-main
packaging: Add IBM Z SE artifacts to main
2023-12-08 05:17:38 +01:00
Fabiano Fidêncio
d149b9f9ca Merge pull request #7231 from wainersm/measured_rootfs-improvements
Build for measured rootfs improvements
2023-12-05 22:20:33 +01:00
Hyounggyu Choi
bb1d4adaa9 config: add SE configuration
This is to add SE configuration which is used by kata runtime.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-04 21:08:49 +01:00
yuchen.cc
1cd1558a92 mount: support checking multiple kinds of block device driver
Device mapper is the only supported block device driver so far,
which seems limiting. Kata Containers can work well with other
block devices. It is necessary to enhance supporting of multiple
kinds of host block device.

Fixes #4714

Signed-off-by: yuchen.cc <yuchen.cc@alibaba-inc.com>
2023-12-01 11:59:30 +08:00
Steve Horsman
c6110284d5 Merge pull request #8520 from stevenhorsman/hypervisor-ttrpc
runtime: Update hypervisor generated code
2023-11-30 10:01:56 +00:00
Fabiano Fidêncio
f15e16b692 Revert "runtime: confidential: Do not set the max_vcpu to cpu"
This reverts commit b0157ad73a.
```
commit b0157ad73a
Refs: 3.3.0-alpha0-124-gb0157ad73
Author:     Fabiano Fidêncio <fabiano.fidencio@intel.com>
AuthorDate: Fri Aug 11 14:55:11 2023 +0200
Commit:     Fabiano Fidêncio <fabiano.fidencio@intel.com>
CommitDate: Fri Nov 10 12:58:20 2023 +0100

    runtime: confidential: Do not set the max_vcpu to cpu

    We don't have to do this since we're relying on the
    `static_sandbox_resource_mgmt` feature, which gives us the correct
    amount of memory and CPUs to be allocated.

    Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
```

This commit was removing a requirement that was made previously, but due
to the SMP issue we're facing with the QEMU used for TDX (see commit
d1b54ede290e95762099fff4e0bcdad10f816126*), QEMU will fail to start due
to:
```
Invalid CPU topology: product of the hierarchy must match maxcpus:
sockets (1) * dies (1) * cores (1) * threads (1) != maxcpus (240)"
```

This has no affect on the SEV / SNP workflow and hopefully we'll be able
to re-revet this soon enough, when this gets solved on te QEMU side.

Last but not least, this is not a "clean" revert as we're using
conf.NumVCPUs() instead of conf.NumVCPUs, to ensure we're dealing with
uint32.

Fixes: #8532

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-30 00:41:27 +01:00
stevenhorsman
47b8c3181f runtime: remote hypervisor updates to ttrpc
- Update the remote hypervisor code to match the re-genned code for
the ttrpc Hypervisor Service

Fixes: #8519
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-11-29 18:04:40 +00:00
stevenhorsman
613c75ba8c runtime: Update hypervisor generated code
Update to use ttrpc_out instead of grpc_out

Fixes: #8519
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-11-29 18:04:40 +00:00
Wainer dos Santos Moschetta
a13eecf7f3 runtime(-rs): add clean-generated-files target
The new clean-generated-files make target allows for removing the
generated files (including the configuration.toml files).

The tools/packaging/static-build/shim-v2/build.sh script now uses that
target to always force the re-generation of those files.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
James O. D. Hunt
45cc417a4e Merge pull request #8461 from jodh-intel/update-codeowners
CODEOWNERS: Expand scope
2023-11-27 15:38:39 +00:00
Fabiano Fidêncio
bb4c51a5e0 Merge pull request #8494 from ChengyuZhu6/kata_virtual_volume
runtime: Pass `KataVirtualVolume` to the guest as devices in go runtime
2023-11-27 16:02:28 +01:00
ChengyuZhu6
5318afe273 runtime: support to create VirtualVolume rootfs storages
1) Creating storage for all `io.katacontainers.volume=` messages in rootFs.Options,
and then aggregates all storages  into `containerStorages`.
2) Creating storage for other data volumes and push them into `volumeStorages`.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-23 23:22:55 +08:00
ChengyuZhu6
0b4f7c2ee7 runtime: redefine and add functions to handle VirtualVolume to storage
1) Extract function `handleBlockVolume` to create Storage only.
2) Add functions to handle KataVirtualVolume device and construct
   corresponding storages.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-23 23:07:32 +08:00
ChengyuZhu6
bd099fbda9 runtime: extend SharedFile to support mutiple storage devices
To enhance the construction and administration of `Katavirtualvolume` storages,
this commit expands the 'sharedFile' structure to manage both
rootfs storages(`containerStorages`) including `Katavirtualvolume` and other data volumes storages(`volumeStorages`).

NOTE: `volumeStorages` is intended for future extensions to support Kubernetes data volumes.
Currently, `KataVirtualVolume` is exclusively employed for container rootfs, hence only `containerStorages` is actively utilized.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-23 23:05:14 +08:00
ChengyuZhu6
e4f33ac141 runtime: add functions to create devices in KataVirtualVolume
The snapshotter will place `KataVirtualVolume` information
into 'rootfs.options' and commence with the prefix 'io.katacontainers.volume='.
The purpose of this commit is to transform the encapsulated KataVirtualVolume data into device information.

Fixes: #8495

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Feng Wang <feng.wang@databricks.com>
Co-authored-by: Samuel Ortiz <sameo@linux.intel.com>
Co-authored-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-11-23 23:05:13 +08:00
Dan Mihai
756022787c Merge pull request #8239 from Sumynwa/sumsharma/fix_configmap_update_propagation
runtime: Fix configmap/secrets updates with FS sharing disabled
2023-11-23 06:50:53 -08:00
Fabiano Fidêncio
9445a967b6 Merge pull request #8471 from ChengyuZhu6/kata-virtual-volume
runtime: Introduce `KataVirtualVolume` structure into go runtime
2023-11-20 21:58:27 +01:00
ChengyuZhu6
1353b14e6c runtime: Add KataVirtualVolume struct in runtime
Add the corresponding data structure in the runtime part according to
kata-containers/kata-containers/pull/7698.

Fixes: #8472

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-11-19 13:30:32 +08:00
Pradipta Banerjee
39e8c84269 runtime: Add support for key annotations to remote hyp
In order to support different pod VM instance type via
remote hypervisor implementation (cloud-api-adaptor),
we need to pass machine_type, default_vcpus
and default_memory annotations to cloud-api-adaptor.

The cloud-api-adaptor then uses these annotations to spin
up the appropriate cloud instance.

Reference PR for cloud-api-adaptor
https://github.com/confidential-containers/cloud-api-adaptor/pull/1088

Fixes: #7140
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
(based on commit 004f07f076)
2023-11-17 13:33:27 +00:00
Yohei Ueda
2910e333a8 runtime: Use static resource in remote hypervisor
This patch updates the template configuration file for
the remote hypervisor to set static_sandbox_resource_mgmt
to be true.  The remote hypervisor uses the peer pod config
to determine the sandbox size, so requires this to be set to
true by default.

Fixes: #6616
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit 938447803b)
2023-11-17 13:33:27 +00:00
stevenhorsman
26d56678a9 config: Add initial remote hypervisor config
- Remote hypervisor template config
- Add annotation enablement for machine_type, default_memory and
default_vcpus for flexible instance types

Fixes: #6349
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(based on commits 7c9a791d67
and 335a456425)
2023-11-17 13:33:24 +00:00
stevenhorsman
ad63439a3e runtime: Update the remote hypervisor config
Add the SELinux setting to ensure it is passed through to the remote
hypervisor

Fixes: #5936

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 3ef2fd1784)
2023-11-17 13:32:52 +00:00
Lei Li
50e0d43dad runtime: Support privileged containers in peer pod VM
This patch fixes the issue of running containers
with privileged as true.

See the discussion at this URL for the details.
https://github.com/confidential-containers/cloud-api-adaptor/issues/111

Signed-off-by: Lei Li <cdlleili@cn.ibm.com>
(based on commit c3e6b66051)
2023-11-17 13:32:52 +00:00
Yohei Ueda
57d4dd8e57 runtime: Support the remote hypervisor type
This patch adds the support of the remote hypervisor type.
Shim opens a Unix domain socket specified in the config file,
and sends TTPRC requests to a external process to control
sandbox VMs.

Fixes #4482

Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit f9278f22c3)
2023-11-17 13:32:49 +00:00
Yohei Ueda
8ac9a22097 runtime: Add hypervisor proto to support peer pod VMs
This patch adds a protobuf definiton of the remote hypervisor type.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 150e8aba6d)
2023-11-17 13:31:09 +00:00
Sumedh Alok Sharma
4aaf54bdad runtime: Fix configmap/secrets update propagation with FS sharing disabled
This PR fixes k8's configmap/secrets etc update propagation when filesystem sharing is disabled.
The commit introduces below changes with some limitations:
- creates new timestamped directory in guest
- updates the '..data' symlink
- creates user visible symlinks to newly created secrets.
- Limitation: The older timestamped directory and stale user visible symlinks exist in guest
  due to missing DELETE api in agent.

Fixes: #7398

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2023-11-17 13:01:23 +05:30
James O. D. Hunt
4a4fc9c648 CODEOWNERS: Expand scope
Improve the `CODEOWNERS` file by specifying more groups.

Since GitHub automatically checks the `CODEOWNERS` file when a PR is
created and adds all matching groups as reviewers for the PR, this may
help reduce the PR backlog since the right people will be alerted and
requested to review the PR. That should improve the quality of reviews
(and thus the quality of the landed code). It may also have a positive
effect on PR velocity.

> **Note:**
>
> This PR combines the other `CODEOWNERS` files so we have
> a single, visible, top-level file.

See: https://github.com/kata-containers/community/issues/253

Fixes: #3804.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-16 16:09:20 +00:00
Liu Wenyuan
c77e990c3e tests: Enable tests for StratoVirt hypervisor
This commit enables StratoVirt hypervisor to be tested in kata GHA,
incluing k8s, metrics, cri-containerd, nydus and so on.

Meanwhile, adding some unit tests for StratoVirt to make sure it works.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
9542211e71 configuration: add configuration for StratoVirt hypervisor.
Add configuration-stratovirt.toml.in to generate the StratoVirt configuration,
and parser to deliver config to StratoVirt.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
561c85be54 build: Makefile for StratoVirt hypervisor
Add support for building StratoVirt hypervisor, including x86_64 and
arm64.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
26966c8469 virtcontainers: Add StratoVirt as a supported hypervisor
Initial support of the MicroVM machine type of StratoVirt
hypervisor for the kata go runtime.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:24 +08:00
Fabiano Fidêncio
5e9cf75937 vc: utils: Rename CalculateMilliCPUs() to CalculateCPUsF()
With the change done in the last commit, instead of calculating milli
cpus, we're actually converting the CPUs to a fraction number, a float.

Let's update the function name (and associated vars) to represent that
change.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 18:26:01 +01:00
Fabiano Fidêncio
e477ed0e86 runtime: Improve vCPU allocation for the VMMs
First of all, this is a controversial piece, and I know that.

In this commit we're trying to make a less greedy approach regards the
amount of vCPUs we allocate for the VMM, which will be advantageous
mainly when using the `static_sandbox_resource_mgmt` feature, which is
used by the confidential guests.

The current approach we have basically does:
* Gets the amount of vCPUs set in the config (an integer)
* Gets the amount of vCPUs set as limit (an integer)
* Sum those up
* Starts / Updates the VMM to use that total amount of vCPUs

The fact we're dealing with integers is logical, as we cannot request
500m vCPUs to the VMMs.  However, it leads us to, in several cases, be
wasting one vCPU.

Let's take the example that we know the VMM requires 500m vCPUs to be
running, and the workload sets 250m vCPUs as a resource limit.

In that case, we'd do:
* Gets the amount of vCPUs set in the config: 1
* Gets the amount of vCPUs set as limit: ceil(0.25)
* 1 + ceil(0.25) = 1 + 1 = 2 vCPUs
* Starts / Updates the VMM to use 2 vCPUs

With the logic changed here, what we're doing is considering everything
as float till just before we start / update the VMM. So, the flow
describe above would be:
* Gets the amount of vCPUs set in the config: 0.5
* Gets the amount of vCPUs set as limit: 0.25
* ceil(0.5 + 0.25) = 1 vCPUs
* Starts / Updates the VMM to use 1 vCPUs

In the way I've written this patch we introduce zero regressions, as
the default values set are still the same, and those will only be
changed for the TEE use cases (although I can see firecracker, or any
other user of `static_sandbox_resource_mgmt=true` taking advantage of
this).

There's, though, an implicit assumption in this patch that we'd need to
make explicit, and that's that the default_vcpus / default_memory is the
amount of vcpus / memory required by the VMM, and absolutely nothing
else.  Also, the amount set there should be reflected in the
podOverhead for the specific runtime class.

One other possible approach, which I am not that much in favour of
taking as I think it's **less clear**, is that we could actually get the
podOverhead amount, subtract it from the default_vcpus (treating the
result as a float), then sum up what the user set as limit (as a float),
and finally ceil the result.  It could work, but IMHO this is **less
clear**, and **less explicit** on what we're actually doing, and how the
default_vcpus / default_memory should be used.

Fixes: #6909

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2023-11-10 18:25:57 +01:00
Fabiano Fidêncio
b0157ad73a runtime: confidential: Do not set the max_vcpu to cpu
We don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 12:58:20 +01:00
Archana Shinde
1611723465 Merge pull request #8379 from likebreath/1103/clh_v36.0
Upgrade to Cloud Hypervisor v36.0
2023-11-08 21:10:41 -08:00
Archana Shinde
268d4d622f Merge pull request #8389 from justxuewei/vm-capable-test
runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue
2023-11-08 12:14:04 -08:00
Archana Shinde
92a517156c Merge pull request #8367 from amshinde/add-nerdctl-ipvlan-test
network: Fix network hotplug for ipvlan and macvlan endpoints for qemu and add tests
2023-11-08 11:45:13 -08:00
Xuewei Niu
acd9057c7b runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue
TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a
case that the kernel modules are not loaded. However,
checkKernelModules() executes modprobe <module> if a module not
found in that directory. Loading those modules is required to be denied
temporarily.

Fixes: #8390

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 22:40:08 +08:00
Archana Shinde
a6272733e7 network: Fix network hotplug for ipvlan and macvlan endpoints.
Since moving from network coldplug to hotplug, the only case verified
was veth endpoints. Support for network hotplug for ipvlan and macvlan was
broken/not added. Fix it.

Fixes: #8391

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
Beraldo Leal
dd530ba8ee tests: fixes AMD errors
TestCheckHostIsVMContainerCapable is failing on AMD machines.
kata-check_amd64_test.go:96 has no AMD modules, also getCPUType is
missing.

Fixes #8384.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
7641c19f74 runtime: bump containerd for gogo deprecation
This update includes necessary changes due to the version bump of
containerd and its dependencies. It's part of a broader initiative to
phase out gogo protobuf, which has been deprecated, and to align with
the current supported libraries.

Fixes #7420.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
16fa2c39e6 protocols: replace gogo/types.Empty and Any
by Google versions.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
5d88c78a6e protocols: generating agent.pb.go
a3b003c345 modified agent but agent.pb.go
was not updated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Bo Chen
071667f1ca runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8378

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-03 10:47:06 -07:00
Fabiano Fidêncio
40cc397218 Merge pull request #8255 from cmaf/migrate-checks-fixes-links
docs: Fix broken links
2023-11-01 14:46:30 +01:00
David Esparza
2a17d3889e Merge pull request #8334 from amshinde/ipvlan-nerdctl-fix
network: Fix network attach for ipvlan and macvlan
2023-10-30 16:00:32 -06:00
Archana Shinde
f53f86884f network: Fix network attach for ipvlan and macvlan
We used the approach of cold-plugging network interface for pre-shimv2
support for docker.Since the hotplug approach was not required,
we never really got to implementing hotplug support for certain network
endpoints, ipvlan and macvlan being among them.

Since moving to shimv2 interface as the default for
runtime, we switched to hotplugging the network interface for supporting
docker and nerdctl. This was done for veth endpoints only.

Implement the hot-attach apis for ipvlan and macvlan as well to support
ipvlan and macvlan networks with docker and nerdctl.

Fixes: #8333

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-27 21:42:37 -07:00
Chelsea Mafrica
0608e20a01 docs: Fix broken links
Update broken links so that static checks pass.

Fixes #8254

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-10-26 10:17:01 -07:00
James O. D. Hunt
d707fa2c0d kata-runtime/kata-ctl: Add security details to output
Add the hypervisor security details to the output of the `kata-runtime
env` and `kata-ctl env` commands so the user can see, amongst other
things, the value of `confidential_guest`.

Fixes: #8313.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-25 16:34:42 +01:00
Fabiano Fidêncio
328ba0da99 Merge pull request #7647 from jongwu/use_pcie_virt
AArch64: runtime: use pcie root port to do pci/pcie device hotplug
2023-10-25 09:17:13 +02:00