Commit Graph

12052 Commits

Author SHA1 Message Date
Chelsea Mafrica
83e731328f
Merge pull request #8023 from cmaf/runtime-rs-ch-pause-resume
runtime-rs: Update status for pause and resume
2023-11-08 11:34:47 -08:00
Fupan Li
100a73d2fd
Merge pull request #7531 from justxuewei/device-cgroup
agent: Restrict device access at upper node of container's cgroup
2023-11-08 22:01:48 +08:00
Chao Wu
4435c1efd7
Merge pull request #8386 from jodh-intel/runtime-rs-ch-tidy-up
runtime-rs: ch: Simplify VSOCK error handling
2023-11-08 17:31:40 +08:00
Xuewei Niu
023d8dc01e agent: Changes according to Pan's comments
- Disable device cgroup restriction while pod cgroup is not available.
- Remove balcklist-related names and change whitelist-related names to
  allowed_all.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:08 +08:00
Xuewei Niu
136fb76222 tests: Add a integrated test for device cgroup
`TestDeviceCgroup` is added to cri-containerd's integration tests. The test
launches two containers. Each container has a block device. It checks the
validity of device cgroup.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
b5f3a8cb39 agent: Fix container launching failure with systemd cgroup
FSManager of systemd cgroup manager is responsible for setting up cgroup
path. The container launching will be failed if the FSManager is in
read-only mode.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
6477825195 agent: Minor changes according to Zhou's comments
The changes include:

- Change to debug logging level for resources after processed.
- Remove a todo for pod cgroup cleanup.
- Add an anyhow context to `get_paths_and_mounts()`.
- Remove code which denys access to VMROOTFS since it won't take effect. If
  blackmode is in use, the VMROOTFS will be denyed as default. Otherwise,
  device cgroups won't be updated in whitelist mode.
- Add a unit test for `default_allowed_devices()`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
cec8044744 agent: Make devcg_info optional for LinuxContainer::new()
The runk is a standard OCI runtime that isnt' aware of concept of sandbox.
Therefore, the `devcg_info` argument of `LinuxContainer::new()` is
unneccessary to be provided.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
ef4c3844a3 agent: Restrict device access at upper node of container's cgroup
The target is to guarantee that containers couldn't escape to access extra
devices, like vm rootfs, etc.

Assume that there is a cgroup, such as `/A/B`. The `B` is container cgroup,
and the `A` is what we called pod cgroup. No matter what permissions are
set for the container (`B`), the `A`'s permission is always `a *:* rwm`. It
leads that containers could acquire permission to access to other devices
in VM that not belongs to themselves.

In order to set devices cgroup properly, the order of setting cgroups is
that the pod cgroup comes first and the container cgroup comes after.

The `Sandbox` has a new field, `devcg_info`, to save cgroup states. To
avoid setting container cgroup too early, an initialization should be done
carefully. `inited`, one of the states, is a boolean to indicate if the pod
cgroup is initialized. If no, the pod cgroup should be created firstly, and
set default permissions. After that, the pause container cgroup is created
and inherits the permissions from the pod cgroup.

If whitelist mode which allows containers to access all devices in VM is
enabled,  then device resources from OCI spec are ignored.

This feature not supports systemd cgroup and cgroup v2, since:

- Systemd cgroup implemented on Agent hasn't supported devices subsystem so
  far, see: https://github.com/kata-containers/kata-containers/issues/7506.
- Cgroup v2's device controller depends on eBPF programs, which is out of
  scope of cgroup.

Fixes: #7507

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
James O. D. Hunt
59d0d4caff runtime-rs: ch: Simplify VSOCK error handling
Remove the redundant `VmConfigError::EmptyVsockSocketPath` error from
the Cloud Hypervisor config crate since this scenario is already handled
by the `VsockConfigError::NoVsockSocketPath` error.

Fixes: #8385.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
James O. D. Hunt
bdb83f8282 runtime-rs: ch: Remove unused function
Remove the redundant `parse_mac()` function: this was never used and we
already have an implementation in `crates/resource/src/network/utils/mod.rs`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
Wainer Moschetta
949ac4d810
Merge pull request #8217 from beraldoleal/issues/8216
tests: fixes permission denied when running test
2023-11-07 12:25:23 -03:00
Wainer Moschetta
7f5d70f48b
Merge pull request #8061 from beraldoleal/gogo-removal-v3
Updating containerd to a GogoProtobuf free version
2023-11-07 12:18:50 -03:00
Beraldo Leal
dd530ba8ee tests: fixes AMD errors
TestCheckHostIsVMContainerCapable is failing on AMD machines.
kata-check_amd64_test.go:96 has no AMD modules, also getCPUType is
missing.

Fixes #8384.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
7641c19f74 runtime: bump containerd for gogo deprecation
This update includes necessary changes due to the version bump of
containerd and its dependencies. It's part of a broader initiative to
phase out gogo protobuf, which has been deprecated, and to align with
the current supported libraries.

Fixes #7420.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
16fa2c39e6 protocols: replace gogo/types.Empty and Any
by Google versions.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c61f4a8592 protocols: remove unused fieldpath option
The +fieldpath option, specific to gogoprotobuf, enabled dynamic field
access in protobuf messages, allowing nested fields to be accessed via
string paths.

This change is part of a larger effort to transition to the official Go
protobuf library for better maintainability and community support.
Upon review, no instances of dynamic field access were found in the
codebase, confirming that the feature is not in use.

By removing this unused feature, we simplify the build process and make
it easier to complete the transition away from gogoprotobuf.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c87bc60ea0 protocols: removing unused mappings
Those mappings are not used by our .proto files and there is no
difference between .pb.go files generated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c5d845b30a agent: updating Cargo.lock files
Probably previous changes missed updating Cargo.lock.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
5d88c78a6e protocols: generating agent.pb.go
a3b003c345 modified agent but agent.pb.go
was not updated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Archana Shinde
3b2fb6a604
Merge pull request #8284 from amshinde/runtime-rs-update-device-pci-info
runtime-rs: update device pci info for vfio and virtio-blk devices
2023-11-06 01:09:20 -08:00
Archana Shinde
036b7787dd runtime-rs: Use PCI path from hypervisor for vfio devices
Remove earlier functionality that tries to assign PCI path to vfio
devices from the host assuming pci slots to start from 1.
Get this from the hypervisor instead.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
c3ce6a1d15 runtime-rs: Provide PCI path to the agent for virtio-block
If PCI path for block device is not empty for a block device, use
that as identifier for agent instead of virt path which is valid only
for mmio devices.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
a2bbbad711 runtime-rs: change hypervisor add_device trait to return device copy
Block(virtio-blk) and vfio devices are currently not handled correctly
by the agent as the agent is not provided with correct PCI paths for
these devices.

The PCI paths for these devices can be inferred from the PCI information
provided by the hypervisor when the device is added.
Hence changing the add_device trait function to return a device copy
with PCI info potentially provided by the hypervisor. This can then be
provided to the agent to correctly detect devices within the VM.

This commit includes implementation for PCI info update for
cloud-hupervisor for virtio-blk devices with stubs provided for other
hypervisors.

Removing Vsock from the DeviceType enum as Vsock currently does not
implement the Device Trait, it has no attach and detach trait functions
among others. Part of the reason is because these functions require Vsock
to implement Clone trait as these functions need cloned copies to be
passed down the hypervisor.

The change introduced for returning a device copy from the add_device
hypervisor trait explicitly requires a device to implement
Copy trait. Hence removing Vsock from the DeviceType enum for now, as
its implementation is incomplete and not currently used.

Note, one of the blockers for adding the Clone trait to Vsock is that it
currently includes a file handle which cannot be cloned. For Clone and
Device Traits to be implemented for Vsock, it requires an implementation
change in the future for it to be cloneable.

Fixes: #8283

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Fabiano Fidêncio
0aac3c76ee
Merge pull request #8365 from fidencio/topic/kata-manager-restrict-containerd-versions-to-be-used
kata-manager: Accept only "lts" or "active" as containerd versions
2023-11-03 11:54:05 +01:00
Fabiano Fidêncio
8b4fc847d7 kata-manager: Accept only "lts" or "active" as containerd versions
kata-manager is a very nice tool, but we shouldn't be trying to take
care of "everything" in "all possible scenarios", and we should focus on
installing Kata Containers dependencies that are supported.

With this in mind, let's limit a little bit the scope of which versions
of containerd can be installed, limitting to "active" and "lts", which
will then install the latest version of those "flavours".  The default
value will always be "lts" as that's supposed to be the stable one.

NOTE: This is a breaking change, as it changes the behaviour of what the
script takes in its `-c` parameter.  I'm assuming here we're safe to do
so as the majority of the users should / would only be using the full
installation by default.

Fixes: #8356

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-03 10:30:37 +01:00
Fabiano Fidêncio
d395ae8198
Merge pull request #8368 from fidencio/topic/gha-stale-fixes
gha: stale: Fix typo and allow manually triggering it
2023-11-03 10:07:56 +01:00
Fabiano Fidêncio
994615ca28 gha: stale: Allow manually triggering it
This will help us to avoid waiting till the next time cron would trigger
the action to test

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-03 08:17:48 +01:00
Fabiano Fidêncio
6abcf03611 gha: stale: Fix typo action -> actions
This is causing the following error:
```
Unable to resolve action action/stale, repository not found
```

Fixes: #8347

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-03 08:15:18 +01:00
Steve Horsman
a7a14e33d8
Merge pull request #8285 from sazzy4o/patch-1
Docs: Fix Dragonball link
2023-11-02 17:54:47 +00:00
Fabiano Fidêncio
37233622da kata-manager: Ensure we run apt-get update before apt-get install
As that's an operation that can easily fail, and it's quite simple /
cheap for us to run it, let's just do it and avoid the failure.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-02 14:14:32 +01:00
Fabiano Fidêncio
d547798284
Merge pull request #7057 from brianwang12/kata-manager-fix
kata-manager: Fix deployment of containerd on architectures other than amd64.
2023-11-02 14:14:18 +01:00
Fabiano Fidêncio
8905286767
Merge pull request #8348 from fidencio/topic/gha-add-stale-action-for-PRs
gha: Add workflow to close stale PRs
2023-11-02 11:34:35 +01:00
Fabiano Fidêncio
abec287058 gha: Add workflow to close stale PRs
Our goal. as discussed in the Architecture Committee meeting held on
October 31st, 2023, is to take a more aggressive action on issues and
PRs that have been opened for a long time.

This commit is the very first step, and it's **only** targetting
**PRs**.  What this action will do is:
* Mark all the PRs that have no activity for more than 180 days,
  starting from May 1st, 2023, as stale.
  * A message will be added, letting the contributor know that they can
    simply comment on the PR in order to make it "not stale".
* If there's no activity on the PR for 7 days, the PR will be
  automatically closed.

Fixes: #8347

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-02 09:19:44 +01:00
briwan.wang
437db15916 kata-manager: Fix Mulit-Arch deployment for containerd
Fix: Kata-Manager fails to retrieve the correct Containerd string name
for architectures other than amd64.

Update the 'github_get_release_file_url()' function to make it compatible
with different architecture expressions. eg. aarch64/arm64, or x86_64/amd64,
allowing it to acquire the correct URL addresses

Fixes: #7071

Signed-off-by: briwan.wang <briwan.wang@arm.com>
2023-11-02 06:12:04 +00:00
Archana Shinde
004646162e
Merge pull request #8308 from gkurz/fully-drop-hub
release: Fully migrate from hub to gh
2023-11-01 22:46:44 -07:00
Peng Tao
b3dbd4f1c7
Merge pull request #8351 from amshinde/update-agent-cargo-lock
cargo: Agent cargo.lock updated
2023-11-02 11:31:24 +08:00
Archana Shinde
58b4d1a264 cargo: Agent cargo.lock updated
The Cargo.lock for agent needs to be updated to include
"safe-path" dependency.

Fixes: #8350

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-01 11:54:33 -07:00
Fabiano Fidêncio
40cc397218
Merge pull request #8255 from cmaf/migrate-checks-fixes-links
docs: Fix broken links
2023-11-01 14:46:30 +01:00
Greg Kurz
d20b7381f0 release: Drop obsolete comment in workflow file
This comment belongs to the hub tool that got sunset by 710eb8ab9d.
Just drop it.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 16:03:12 +01:00
Greg Kurz
6236fa4617 release: Drop build_hub helper
Not used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:28:57 +01:00
Greg Kurz
bc4c66caaf release: Migrate tag_repos.sh to GitHub CLI
The hub tool is deprecated. Convert this script to use the
official GitHub CLI gh instead of hub.

A typical gh setup is able to access repos using HTTPS along with
GitHub credentials. It is only needed to patch the remote url when
using SSH.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:11:28 +01:00
Greg Kurz
e331102ba3 release: Migrate update-repository-version.sh to GitHub CLI
The hub tool is deprecated. Convert this script to use the
official GitHub CLI gh instead of hub.

A couple of adjustments had to be made :
- the notes.md temporary file is moved to ${tmp_dir} in order to silent gh,
  otherwise it complains about an untracked file,
- title of a PR no longer goes to the notes.md file since gh requires the
  title to be passed with a dedicated --title option.

Fixes #8303

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:10:50 +01:00
Greg Kurz
b83a7149ee release: Introduce helper to get GitHub CLI
If gh isn't installed already, download it from GitHub.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 15:09:24 +01:00
Fabiano Fidêncio
53cda12a71
Merge pull request #8311 from TimePrinciple/log-system-enhancement
runtime-rs: Log system enhancement
2023-10-31 10:14:41 +01:00
Greg Kurz
ceeabe3714 release: Allow to test release scripts with an alternate repo
We don't want to mess with the official repo when testing a change
in the release scripts. Adapt `update-repository-version.sh` to
be able to use an alternate repo just like `tag_repos.sh` already
does.

This means that the following command :

$ OWNER="$SOME_ORG" ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH"

will only create a PR in this repo :

http://github.com/$SOME_ORG/kata-containers.git

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-31 09:49:27 +01:00
Archana Shinde
148c565b2f
Merge pull request #8289 from BbolroC/skip-create-tmpfs-s390x
agent: Skip flaky create_tmpfs on s390x
2023-10-30 22:26:28 -07:00
Ruoqing He
4ad2cfe0c2 runtime-rs: Log system enhancement
By modifying RuntimeLevelFilter drain to improve logging control,
enabling isolation of change effect of the loggers between components,
tuning clh logs to be logged according to their log levels
given by cloud-hypervisor.

Fixes: #8310

Signed-off-by: Ruoqing He <linuxwatcher@outlook.com>
2023-10-31 04:57:46 +00:00
David Esparza
2a17d3889e
Merge pull request #8334 from amshinde/ipvlan-nerdctl-fix
network: Fix network attach for ipvlan and macvlan
2023-10-30 16:00:32 -06:00
David Esparza
5573705800
Merge pull request #8202 from dborquez/enable_fio_checkmetrics
Enable fio checkmetrics
2023-10-30 15:55:37 -06:00