Commit Graph

13488 Commits

Author SHA1 Message Date
Wainer dos Santos Moschetta
28a63070f7 gha: fix step name in run-runk-tests
Likely copied from the tracing workflow by mistake.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta
8a606eb94d tests/runk: convert to bats
Migrated runk tests from pure shell script to bats to be consistent with
other test suites.

The install_dependencies() will install the bats tool locally.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:23 -03:00
Xuewei Niu
bb5e33b33a
Merge pull request #9100 from littlejawa/fix_5738_metrics_memory
runtime: remove kata_shim_netdev metric
2024-02-26 19:01:21 +08:00
James O. D. Hunt
0ea30f44cf
Merge pull request #9076 from jodh-intel/add-survey-link-to-release-notes
packaging: release notes: Don't show shortlist by default, and add survey link
2024-02-26 10:25:19 +00:00
Steve Horsman
483ecbadf0
Merge pull request #9142 from ChengyuZhu6/protoc
build-checks: Install protoc in the ci environments
2024-02-26 09:52:31 +00:00
Dan Mihai
f4509b806b runtime: clh: minimum 10s timeout for CreateVM + BootVM
Relax the timeout for calling CLH's CreateVM + BootVM APIs. When
hitting the older 1s timeout, killing a half-booted Guest and
retrying the same boot sequence could have been wasteful and could have
resulted in unstable CI testing on slower Hosts.

Fixes: #9152

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-24 19:15:57 +00:00
GabyCT
4f3c83cd12
Merge pull request #9115 from GabyCT/topic/adddief
scripts: Add an enhanced die function
2024-02-23 12:03:02 -06:00
Saul Paredes
9b7bd376eb genpolicy: panic when we see a volume mount subpath
Based on https://github.com/kata-containers/runtime/issues/2812

Fixes: #9145

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-02-23 09:56:51 -08:00
James O. D. Hunt
8c72abe38d packaging: Add link to survey in release notes
Add a link in the release notes to the Kata Containers survey, to
advertise it, and hopefully encourage users to take the survey.

Fixes: #9074.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-23 09:57:52 +00:00
James O. D. Hunt
0391c0de82 packaging: Add twistie to release notes shortlog
Add a "twistie" / arrow (`▶`) that the user can click on to see the full
list of commits _if they want to_.

This way, the release notes become easier to read and we can display
information below the shortlog which would (probably) normally not be
seen due to the huge long list of commits.

Fixes: #9075.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-02-23 09:57:52 +00:00
ChengyuZhu6
3cc55ff8af build-checks: Install protoc in the ci environments
To test PR #8484 for pulling image in the guest with image-rs, the compilation process for the kata-agent relies on
protoc:
https://github.com/kata-containers/kata-containers/actions/runs/8016317290/job/21898040849?pr=8484
https://github.com/kata-containers/kata-containers/actions/runs/8016534530/job/21898654435?pr=8484

Fixes: #9141

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-23 17:38:13 +08:00
Xuewei Niu
89c76d7d8d
Merge pull request #9125 from gkurz/fix-agent-cgroup-ns
agent: Run container workload in its own cgroup namespace (cgroup v2 guest only)
2024-02-23 10:40:17 +08:00
Steve Horsman
e342a9adc4
Merge pull request #9119 from ChengyuZhu6/pause-confidential
kata-deploy: Add pause image to confidential rootfs
2024-02-22 17:10:55 +00:00
Steve Horsman
531dcd2f25
Merge pull request #9132 from ChengyuZhu6/nydus-snapshotter-version
gha: bump nydus snapshotter version to v0.13.8
2024-02-22 17:10:42 +00:00
Steve Horsman
dfa6e932bb
Merge pull request #9122 from ChengyuZhu6/snapshotter-clean
gha: try to cleanup nydus snapshotter before deploying it
2024-02-22 13:30:04 +00:00
Julien Ropé
1c306fe4a6 runtime-rs: stop reporting net dev metrics for the shim
For consistency with the go runtime.
As the shim itself is not using the network (all its communication with
other processes is done with local unix sockets), there is no reason to
keep gathering and reporting shim-specific network metrics.
Actual network usage of the kata containers can be found from the existing
agent network metrics (kata_guest_netdev_stat).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-02-22 14:00:00 +01:00
Julien Ropé
9de65707ca runtime: stop reporting net dev metrics for the shim
As part of the shim network metrics, the shim is reporting network interfaces
from the host with no namespace isolation. This gives insight into interfaces
not tied to the kata containers, and causes an increase in resource usage for
kata metrics.

As the shim itself is not using the network (all its communication with
other processes is done with local unix sockets), there is no reason to
keep gathering and reporting shim-specific network metrics.
Actual network usage of the kata containers can be found from the existing
hypervisor network metrics (kata_hypervisor_netdev) and from the agent
network metrics (kata_guest_netdev_stat).

Fixes: #5738

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-02-22 14:00:00 +01:00
ChengyuZhu6
8ab3894dc5 gha: try to cleanup nydus snapshotter before deploying it
CI failed to deploy nydus snapshotter because it was not cleaned up last time.
So we can try to cleanup nydus snapshotter before deploying it.

Fixes: #9121

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-22 18:51:14 +08:00
Alex Lyn
5d3ae360ed
Merge pull request #9130 from Apokleos/bugfix-dragonball-invalidOperation
runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.
2024-02-22 17:47:09 +08:00
ChengyuZhu6
f16f709a5e kata-deploy: Add pause image to confidential rootfs
For confidential containers, the pause image needs to be installed in
the rootfs.

Fixes: #9118

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-22 15:41:16 +08:00
ChengyuZhu6
d8db3fb17f gha: bump nydus snapshotter version to v0.13.8
Bump nydus snapshotter version to v0.13.8 to fix the bug in v0.13.7: https://github.com/containerd/nydus-snapshotter/pull/582

Fixes: #9131

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-22 15:35:08 +08:00
Alex Lyn
014e0f4e46 runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.
We need to initialize pci_hotplug_enabled to true before we do GPU
passthrough with runtime-rs/dragonball. Otherwise it fails with error
`InvalidOperation`.

Fixes: #9129

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-02-22 10:22:32 +08:00
Dan Mihai
58fbb9f6ec
Merge pull request #9073 from microsoft/danmihai1/test-genpolicy3
tests: k8s: generated policy for additional tests
2024-02-21 14:11:51 -08:00
Dan Mihai
b3c3f992ab tests: k8s: common clean-up on teardown
teardown() gets executed after each test case, so there is no need to
clean-up before teardown.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
9c164698d3 tests: k8s: k8s-optional-empty-configmap policy
Auto-generate policy for k8s-optional-empty-configmap.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
74a52c6d25 tests: k8s: k8s-oom.bats auto-generated policy
Auto-generate policy for k8s-oom.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
26a77d67f4 tests: k8s: k8s-number-cpus auto-generated policy
Auto-generate policy for k8s-number-cpus.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
9cbdce15fd tests: k8s: k8s-memory.bats auto-generated policy
Auto-generate policy for k8s-memory.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
40209cc0b7 tests: k8s: k8s-limit-range auto-generated policy
Auto-generate policy for k8s-limit-range.bats.

Also, fix teardown() namespace.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
df3c0318c6 tests: k8s: add set_namespace_to_policy_settings
Add set_namespace_to_policy_settings() for changing the pod namespace
in genpolicy settings.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
6e14ce93c9 tests: k8s-kill-all-process-in-container policy
Auto-generate policy for k8s-kill-all-process-in-container.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
fad7ba0aea tests: k8s: k8s-job.bats auto-generated policy
Auto-generate policy for k8s-job.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
41c2bcbdc5 tests: k8s: k8s-file-volume auto-generated policy
Auto-generate policy for k8s-file-volume.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
d84f50db5b genpolicy: fix typo in policy logging
Improve logging, for easier debugging.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
81e641814f tests: k8s: k8s-cpu-ns auto-generated policy
Auto-generate policy for k8s-cpu-ns.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
bc6d3fc238 tests: k8s: k8s-env.bats auto-generated policy
Auto-generate policy for k8s-env.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
0a4fc071ac tests: k8s: k8s-custom-dns auto-generated policy
Auto-generate policy for k8s-custom-dns.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
f693f49e92 tests: k8s: k8s-credentials-secrets policy
Auto-generate policy for k8s-credentials-secrets.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
d3d27bbb5b tests: k8s: k8s-configmap auto-generated policy
Auto-generate policy for k8s-configmap.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
b318535536 tests: k8s: auto-generate k8s-caps.bats policy
Auto-generate policy for k8s-caps.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Greg Kurz
600b951afd agent: Run container workload in its own cgroup namespace
When cgroup v2 is in use, a container should only see its part of the
unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created
at the OS level. Similarly, `/proc/self/cgroup` inside the container
should display `0::/`, rather than a full path such as:

0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope

What is needed here is isolation from the OS. Do that by running the
container in its own cgroup namespace. This matches what runc and
other non VM based runtimes do.

Fixes #9124

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-21 13:14:13 +01:00
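The `/proc/self/cgroup` behaviour described in the commit above can be checked with a small Go sketch. It is an illustration only, not agent code (the agent is written in Rust): the function name `unifiedCgroupPath` is hypothetical, and the sketch assumes the cgroup v2 entry is the line starting with `0::`, as in the examples quoted in the commit message.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// unifiedCgroupPath scans the content of a /proc/<pid>/cgroup file and
// returns the path of the cgroup v2 (unified hierarchy) entry, which is
// the text after the "0::" prefix. ok is false when no such entry exists.
func unifiedCgroupPath(content string) (path string, ok bool) {
	for _, line := range strings.Split(content, "\n") {
		if strings.HasPrefix(line, "0::") {
			return strings.TrimPrefix(line, "0::"), true
		}
	}
	return "", false
}

func main() {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		// Not on Linux, or /proc is not mounted.
		fmt.Fprintln(os.Stderr, "cannot read /proc/self/cgroup:", err)
		return
	}
	path, ok := unifiedCgroupPath(string(data))
	switch {
	case !ok:
		fmt.Println("no cgroup v2 entry found")
	case path == "/":
		fmt.Println("process sees the root of its cgroup namespace")
	default:
		fmt.Println("process sees a full hierarchy path:", path)
	}
}
```

Run inside a container, a result of `/` is what the commit describes as the expected view once the workload has its own cgroup namespace.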
Greg Kurz
14886c7b32 agent: lint code
Run cargo-clippy to reduce noise in actual functional changes.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-21 13:14:13 +01:00
ChengyuZhu6
cddaf2ce97 kata-deploy: Remove specific kernel/initrd/image leftovers in Makefile
Remove specific kernel/initrd/image leftovers in Makefile of
local-build, which is the part of #9026.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-21 18:24:10 +08:00
Chelsea Mafrica
241a56989a
Merge pull request #9090 from GabyCT/topic/pulldockerimage
gha: docker: Pull docker image as part of the dependencies
2024-02-20 14:28:53 -08:00
GabyCT
ea78013c7e
Merge pull request #9079 from GabyCT/topic/removecilink
docs: Update CI link into the README
2024-02-20 14:11:13 -06:00
GabyCT
64c09fe6c5
Merge pull request #9088 from GabyCT/topic/fixnydus
gha: nydus: Fix indentation in gha run script
2024-02-20 14:09:54 -06:00
Gabriela Cervantes
ff8a6fa9ef scripts: Add error script
This PR adds the error script to display the error message with
much more information to help debugging.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-20 18:30:03 +00:00
Gabriela Cervantes
43a46d5a6b scripts: Add an enhanced die function
This PR adds an enhanced die function in order to dump more information
in a yaml format that will help with the debugging.

Fixes #9105

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-20 18:27:44 +00:00
Archana Shinde
6d84fe3a37
Merge pull request #8647 from amshinde/cleanup-network
Cleanup network to make sure physical interfaces are restored back to the original host driver.
2024-02-20 08:59:53 -08:00
Archana Shinde
6d38fa1530 network: Try removing as many changes as possible during network cleanup
In case an error is encountered while removing a network endpoint during
network cleanup, we currently return immediately with the error.
With this change, in case of error we simply log the error and proceed
towards removing the next endpoint. With this, we can cleanup the
network changes made by the shim as much as possible.
This is especially important when multiple interfaces are passed to the
network namespace using a network plugin like multus.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-20 06:08:05 -08:00
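The "log the error and proceed" cleanup described in the commit above can be sketched in Go as follows. This is a simplified illustration, not the runtime's implementation: the `endpoint` type, its `remove` method, and `removeAll` are hypothetical stand-ins, and returning a failure count is an assumption of this sketch rather than something the commit states.

```go
package main

import (
	"fmt"
	"log"
)

// endpoint is a hypothetical stand-in for a network endpoint.
// fail simulates an endpoint whose removal returns an error.
type endpoint struct {
	name string
	fail bool
}

// remove pretends to undo the changes made for this endpoint.
func (e endpoint) remove() error {
	if e.fail {
		return fmt.Errorf("cannot remove endpoint %s", e.name)
	}
	return nil
}

// removeAll tries to remove every endpoint. A failure is logged and the
// loop continues with the next endpoint instead of returning early, so
// as many changes as possible are cleaned up.
func removeAll(eps []endpoint) (removed []string, failed int) {
	for _, ep := range eps {
		if err := ep.remove(); err != nil {
			log.Printf("network cleanup: %v (continuing)", err)
			failed++
			continue
		}
		removed = append(removed, ep.name)
	}
	return removed, failed
}

func main() {
	eps := []endpoint{{"eth0", false}, {"eth1", true}, {"eth2", false}}
	removed, failed := removeAll(eps)
	fmt.Printf("removed=%v failed=%d\n", removed, failed)
}
```

With an early return, the failure on the second endpoint would leave the third one untouched, which is the situation the commit aims to avoid when several interfaces are passed into the network namespace.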