Commit Graph

12201 Commits

Author SHA1 Message Date
Lei Li
50e0d43dad runtime: Support privileged containers in peer pod VM
This patch fixes the issue of running containers
with privileged as true.

See the discussion at this URL for the details.
https://github.com/confidential-containers/cloud-api-adaptor/issues/111

Signed-off-by: Lei Li <cdlleili@cn.ibm.com>
(based on commit c3e6b66051)
2023-11-17 13:32:52 +00:00
Yohei Ueda
57d4dd8e57 runtime: Support the remote hypervisor type
This patch adds the support of the remote hypervisor type.
Shim opens a Unix domain socket specified in the config file,
and sends TTPRC requests to a external process to control
sandbox VMs.

Fixes #4482

Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(based on commit f9278f22c3)
2023-11-17 13:32:49 +00:00
Yohei Ueda
8ac9a22097 runtime: Add hypervisor proto to support peer pod VMs
This patch adds a protobuf definiton of the remote hypervisor type.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
(based on commit 150e8aba6d)
2023-11-17 13:31:09 +00:00
Fabiano Fidêncio
f8322ffad2
Merge pull request #7796 from WenyuanLau/7794/StratoVirt_VMM_support
StratoVirt: add support for a lightweight VMM StratoVirt in Kata
2023-11-17 10:53:17 +01:00
Fabiano Fidêncio
d6d9b45007
Merge pull request #7931 from BbolroC/migrate-to-gha-s390x
tests|gha: add containerd and k8s tests for s390x
2023-11-17 10:24:14 +01:00
Sumedh Alok Sharma
4aaf54bdad runtime: Fix configmap/secrets update propagation with FS sharing disabled
This PR fixes k8's configmap/secrets etc update propagation when filesystem sharing is disabled.
The commit introduces below changes with some limitations:
- creates new timestamped directory in guest
- updates the '..data' symlink
- creates user visible symlinks to newly created secrets.
- Limitation: The older timestamped directory and stale user visible symlinks exist in guest
  due to missing DELETE api in agent.

Fixes: #7398

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2023-11-17 13:01:23 +05:30
Hyounggyu Choi
0c7aa1f307 gha: Set nightly test for s390x to 5 UTC
This is to push back the time for the s390x nightly test to 5 a.m. UTC.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-17 05:47:44 +01:00
Hyounggyu Choi
ffe1ea52cf tests|gha: add containerd and k8s tests for s390x
As part of the CI migration, this PR is to add workflows for containerd and k8s for s390x.

Fixes: #7930
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-16 18:14:26 +01:00
GabyCT
8586308dcd
Merge pull request #8453 from GabyCT/topic/udpreadme
metrics: Add iperf udp information to README
2023-11-16 10:38:56 -06:00
GabyCT
494174a98e
Merge pull request #8421 from GabyCT/topic/enablestressng
tests: Enable stressng scalability test
2023-11-16 10:25:05 -06:00
James O. D. Hunt
4a4fc9c648 CODEOWNERS: Expand scope
Improve the `CODEOWNERS` file by specifying more groups.

Since GitHub automatically checks the `CODEOWNERS` file when a PR is
created and adds all matching groups as reviewers for the PR, this may
help reduce the PR backlog since the right people will be alerted and
requested to review the PR. That should improve the quality of reviews
(and thus the quality of the landed code). It may also have a positive
effect on PR velocity.

> **Note:**
>
> This PR combines the other `CODEOWNERS` files so we have
> a single, visible, top-level file.

See: https://github.com/kata-containers/community/issues/253

Fixes: #3804.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-16 16:09:20 +00:00
Fabiano Fidêncio
10996f3bbb
Merge pull request #8460 from ldoktor/artifacts
gha: Keep kata tarballs for 15 days
2023-11-16 13:56:25 +01:00
Liu Wenyuan
c77e990c3e tests: Enable tests for StratoVirt hypervisor
This commit enables StratoVirt hypervisor to be tested in kata GHA,
incluing k8s, metrics, cri-containerd, nydus and so on.

Meanwhile, adding some unit tests for StratoVirt to make sure it works.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
14d8790d83 kata-deploy: Add StratoVirt support to deploy process
Allow kata-deploy process to pull StratoVirt from release binaries, and
add them as a part of kata release.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
9542211e71 configuration: add configuration for StratoVirt hypervisor.
Add configuration-stratovirt.toml.in to generate the StratoVirt configuration,
and parser to deliver config to StratoVirt.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
561c85be54 build: Makefile for StratoVirt hypervisor
Add support for building StratoVirt hypervisor, including x86_64 and
arm64.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Liu Wenyuan
26966c8469 virtcontainers: Add StratoVirt as a supported hypervisor
Initial support of the MicroVM machine type of StratoVirt
hypervisor for the kata go runtime.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:24 +08:00
Fabiano Fidêncio
edb791315e
Merge pull request #7987 from BbolroC/nightly-ci-s390x
tests|gha: add nightly tests for s390x
2023-11-16 11:45:32 +01:00
Lukáš Doktor
8959e3ca05
gha: Keep kata tarballs for 15 days
these tarballs are useful for debugging and re-running jobs, keep them
for 15 days.

Fixes: #8000

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2023-11-16 10:35:20 +01:00
Gabriela Cervantes
9cc6908b09 stability: Update stressng to run on the gha
This PR updates the stressng test to run on the gha for kata CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 19:34:36 +00:00
Gabriela Cervantes
9d8eb298c3 metrics: Add iperf udp information to README
This PR adds the iperf udp information to the network README
for the kata metrics CI.

Fixes #8452

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 15:22:06 +00:00
Gabriela Cervantes
4b7854b668 stability: Add missing dependencies
This PR adds missing dependencies to run stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 14:51:14 +00:00
Gabriela Cervantes
79177bb9cb tests: Enable stressng scalability test
This PR enables the stressng scalability test for kata CI.

Fixes #8420

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 14:51:14 +00:00
Xuewei Niu
f18794d880
Merge pull request #8426 from justxuewei/vhost-rm-virtio-net
dragonball: Remove vhost-net dependency on virtio-net
2023-11-15 10:39:27 +08:00
alex.lyn
ba632ba825 runitme-rs: kata with multi-containers sharing one direct volume
When multiple containers in a kata pod share one direct volume,
it's important to make sure that the corresponding block device
is only mounted once in the guest. This means that there should
be only one mount entry for the device in the mount information.

Fixes: #8328

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-15 10:37:01 +08:00
alex.lyn
d7594d830c runtime-rs: correct the path from cid to device_id.
When a direct volume is used by multiple containers in Kata,
Generating many shared paths with cids will cause IO error
as the result of one direct volume mounts more than once.
To correct it, use the device_id instead of cid which
ensures that the guest only mounts the FS once.

Fixes: #8328

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-15 10:30:39 +08:00
Fabiano Fidêncio
906f6b7380
Merge pull request #8431 from UiPath/fix-vsock-packets-drop
kernel: Fix vsock packets drop when the driver initializes
2023-11-14 18:52:53 +01:00
Fabiano Fidêncio
1699b84f13 utils: kata-manager: Remove $enable_debug from the install_kata call
This was added as part of d4d65bed38, but
install_kata has never actually used the passed enable_debug var.

With this in mind, let's just remove it.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-14 17:34:03 +01:00
Fabiano Fidêncio
38d2edd83b utils: kata-manager: Allow installing kata from a given tarball
With this change, we give the users the change to try kata-containers
with their own pre-built tarball.

This will become very useful in the CI context, as we won't be
downloading a specific version of kata-containers, but rather installing
whatever was built in previous steps of the CI pipeline.

Fixes: #8438

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-14 17:34:01 +01:00
Fabiano Fidêncio
fd9b6d6837
Merge pull request #7623 from fidencio/topic/runtime-improve-vcpu-allocation-on-host-side
runtime: Improve vCPU allocation for the VMMs
2023-11-14 14:10:54 +01:00
Alexandru Matei
bfd1ce30e1 kernel: Fix vsock packets drop when the vsock driver starts
The virtio vsock driver has a small window during initialization
where it can silently drop replies to connection requests.
Because no reply is sent, kata waits for 10 seconds and in the
end it generates a connection timeout error in HybridVSockDialer.

Fixes: #8291

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2023-11-14 11:02:52 +02:00
Xuewei Niu
49c2e6e23c dragonball: Remove vhost-net dependency on virtio-net
This patch is to remove vhost-net dependency on virtio-net for
dbs-virtio-devices crate. Then, the feature of vhost-net is able to enable
without enabling virtio-net device, error, etc.

Fixes: #8423

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-14 15:35:10 +08:00
Fabiano Fidêncio
dffc6f611c
Merge pull request #8432 from justxuewei/rm-ci-docker-and-nerdctl
gha: Remove docker and nerdctl tests from ci.yaml
2023-11-14 08:34:18 +01:00
alex.lyn
4d65c2e8a2 runtime-rs: introduce update_device in trait Hypervisor
Introduce the `update_device` trait in Hypervisor to enable
device updates for VMMs.This trait will initially be utilized
for virtiofs Mount operations.

Fixes: #7915

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-11-14 11:56:36 +08:00
Xuewei Niu
481486c6d5 gha: Remove docker and nerdctl tests from CI
Two workflows, run-nerdctl-tests-on-garm.yaml and
run-docker-tests-on-garm.yaml, are removed from commit b481d39. However,
they are referenced by CI workflow. It leads to the CI not working
properly. This patch is to remove those files from ci.yaml.

Fixes: #8433

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-14 10:44:14 +08:00
Fabiano Fidêncio
c858ea1460
Merge pull request #8174 from fidencio/topic/re-revert-8115
ci: Re-add tracing tests and move docker/nerdctl to the basic-ci-amd64.yaml file
2023-11-13 18:19:40 +01:00
James O. D. Hunt
a781ce33b0
Merge pull request #8383 from jodh-intel/kata-manager-add-list-option
utils: kata-manager: Add option to list versions
2023-11-13 16:18:36 +00:00
David Esparza
98ec34b04c
Merge pull request #8338 from dborquez/improve_metrics_init_environment
metrics: Fix function that completely stops kata containers before running a test
2023-11-13 09:35:27 -06:00
Fabiano Fidêncio
b481d396fc gha: Move docker / nerdctl content to the basic-ci-amd64 file
There's no need to keep those as separate files, and by having those in
the basic-ci-amd64.yaml file actually helps us to avoid the
undocummented GHA limitation about the number of files imported.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-13 15:34:00 +01:00
Fabiano Fidêncio
3c735c236d ci: tracing: Adapt to basic-ci-amd64.yaml
Peng Tao made this move as part of 1280f85343, and here we're
simply adjusting to the move.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-13 15:27:39 +01:00
Fabiano Fidêncio
ee17fe9d20 Revert "gha: ci: Revert tracing test PR to unbreak CI"
This reverts commit e9bd852113.
2023-11-13 15:27:39 +01:00
James O. D. Hunt
4d5b23b73a
Merge pull request #8419 from jodh-intel/2023-11-10-fix-tdx
runtime-rs: ch: Fix TDX
2023-11-13 11:58:16 +00:00
James O. D. Hunt
7f666f783d runtime-rs: ch: Fix TDX
PR #8311 inadvertently broke the runtime-rs / Cloud Hypervisor TDX
handling. It also introduced unrecoverable failure scenarios. Hence,
replace slow, fallible regex matching in logging fast path with single pass
non-failing multi-string log level matching.

Also, added a unit test for `parse_ch_log_level()`.

Fixes: #8418.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-13 08:49:47 +00:00
Xuewei Niu
0a9125e629
Merge pull request #7675 from justxuewei/vhost-net 2023-11-12 20:38:18 +08:00
Xuewei Niu
d1deaf0538 dragonball: Minor changes for a comment from Bian
- Add feature control for InsertNetworkDevice.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:14:10 +08:00
Xuewei Niu
e4f83e27c4 dragonball: vhost-net set_offload with acked features
set_offload() for tap devices depends on acked features.

Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:39 +08:00
Xuewei Niu
6cd572dbbb dragonball: Minor changes for Chao's comments
- Remove two panic statements from InsertNetworkDevice test.
- Rename `NUM_QUEUES` to `DEFAULT_NUM_QUEUES`, `QUEUE_SIZE` to
  `DEFAULT_QUEUE_SIZE` for vhost-net and virtio-net.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:39 +08:00
Xuewei Niu
dcdf3c6556 runtime-rs: Supply missing fields of NetworkConfig
`test_networkconfig_to_netconfig` from clh depends on `NetworkConfig` which
has some new fields in this PR. Therefore, this commit gives the test
missing fields.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:39 +08:00
Xuewei Niu
58e9709c1f dragonball: Changes for ZizhengBian's comments
- Dragonball's vhost-net feature not depends on virtio-net feature.
- Remove `TapError` from dbs-virtio-devices's Error, and add `VirtioNet`
  and `VhostNet` two fields.
- Downgrade visiblity of two fields of `VhostNetDeviceMgr` from
  `pub(crate)`.
- File an issue to record a todo for network rate limiter.
- Print internal errors with `{0:?}.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-12 14:10:33 +08:00
Fabiano Fidêncio
849253e55c tests: Add a simple test to check the VMM vcpu allocation
As we've done some changes in the VMM vcpu allocation, let's introduce
basic tests to make sure that we're getting the expected behaviour.

The test consists in checking 3 scenarios:
* default_vcpus = 0 | no limits set
  * this should allocate 1 vcpu
* default_vcpus = 0.75 | limits set to 0.25
  * this should allocate 1 vcpu
* default_vcpus = 0.75 | limits set to 1.2
  * this should allocate 2 vcpus

The tests are very basic, but they do ensure we're rounding things up to
what the new logic is supposed to do.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 18:26:01 +01:00