Compare commits

..

117 Commits
2.5.1 ... 2.4.3

Author SHA1 Message Date
Archana Shinde
6330386ab6 Merge pull request #4593 from fidencio/2.4.3-branch-bump
# Kata Containers 2.4.3
2022-07-05 15:04:34 -07:00
Fabiano Fidêncio
847003187c release: Kata Containers 2.4.3
- stable-2.4 | shim: set a non-zero return code if the wait process call failed.
- stable-2.4 | rootfs: Fix chronyd.service failing on boot

396fed42c release: Adapt kata-deploy for 2.4.3
025e3ea6a shim: set a non-zero return code if the wait process call failed.
f32a14663 snap: Fix debug cli option
0718b9b55 rootfs: Fix chronyd.service failing on boot

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2022-07-05 22:26:52 +02:00
Fabiano Fidêncio
396fed42c1 release: Adapt kata-deploy for 2.4.3
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2022-07-05 22:26:52 +02:00
GabyCT
ca7bb9dceb Merge pull request #4571 from fidencio/topic/stable-2.4-set-status-if-wait-process-failed
stable-2.4 | shim: set a non-zero return code if the wait process call failed.
2022-07-01 11:28:35 -05:00
liubin
025e3ea6ab shim: set a non-zero return code if the wait process call failed.
Return code is an int32 type, so if an error occurred, the default value
may be zero, this value will be created as a normal exit code.

Set return code to 255 will let the caller(for example Kubernetes) know
that there are some problems with the pod/container.

Fixes: #4419

Signed-off-by: liubin <liubin0329@gmail.com>
(cherry picked from commit ab5f1c9564)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-06-30 21:55:13 +02:00
GabyCT
5d50fc3908 Merge pull request #4501 from fidencio/topic/stable-2.4-backport-chronyd-fix
stable-2.4 | rootfs: Fix chronyd.service failing on boot
2022-06-22 12:32:21 -05:00
James O. D. Hunt
f32a146637 snap: Fix debug cli option
`snap`/`snapcraft` seems to have changed recently. Since `snap`
auto-updates all `snap` packages and since we use the `snapcraft` `snap`
for building snaps, this is impacting all our CI jobs which now show:

```
Installing Snapcraft for Linux…
snapcraft 7.0.4 from Canonical* installed

Run snapcraft -d snap --destructive-mode
Usage: snapcraft [options] command [args]...
Try 'snapcraft pack -h' for help.
Error: unrecognized arguments: -d
Error: Process completed with exit code 1.
```

Move the debug option to make it a sub-command (long) option to resolve
this issue.

Fixes: #4457.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 90a7763ac6)
2022-06-21 16:35:53 +02:00
Champ-Goblem
0718b9b55f rootfs: Fix chronyd.service failing on boot
In at least kata versions 2.3.3 and 2.4.0 it was noticed that the guest
operating system's clock would drift out of sync slowly over time
whilst the pod was running.

This had previously been raised and fixed in the old reposity via [1].
In essence kvm_ptp and chrony were paired together in order to
keep the system clock up to date with the host.

In the recent versions of kata metioned above,
the chronyd.service fails upon boot with status `266/NAMESPACE`
which seems to be due to the fact that the `/var/lib/chrony`
directory no longer exists.

This change sets the `/var/lib/chrony` directory for the `ReadWritePaths`
to be ignored when the directory does not exist, as per [2].

[1] https://github.com/kata-containers/runtime/issues/1279
[2] https://www.freedesktop.org/software/systemd
/man/systemd.exec.html#ReadWritePaths=

Fixes: #4167
Signed-off-by: Champ-Goblem <cameron_mcdermott@yahoo.co.uk>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 1b7fd19acb)
2022-06-21 15:49:56 +02:00
snir911
6d93875ead Merge pull request #4385 from snir911/2.4.2-branch-bump
# Kata Containers 2.4.2
2022-06-08 19:38:17 +03:00
Snir Sheriber
7fd22d77d0 release: Kata Containers 2.4.2
- My 2.4 pr backport -- fix shim leak caused by ESRCH in agent destroy
- backport-2.4 | workflows: add workflow_dispatch triggering to test-kata-deploy
- stable-2.4: runtime: Adding the correct detection of mediated PCIe devices
- stable-2.4: backport agent fixes
- stable-2.4 | clh: Update to the v24.0 release
- stable-2.4 | Backport fixes for direct-volume stats
- stable-2.4 | tools: Add QEMU patches for SGX numa support
- stable-2.4 | versions: Upgrade to Cloud Hypervisor v23.1

607a8a9c2 release: Adapt kata-deploy for 2.4.2
e5568a31a agent: ignore ESRCH error when destroying containers
322839ac7 runtime: force stop container after the container process exits
b75d5cee7 docs: update release process github token instructions
e938ce443 docs: update release process with latest workflow triggering
046ba4df7 workflows: add workflow_dispatch triggering to test-kata-deploy
14ce4b01b runtime: Adding the correct detection of mediated PCIe devices
f54d5cf16 agent: Fix is_signal_handled failing parsing str to u64
80d5f9e14 agent: move assert_result macro to test_utils file
50a74dfee agent: add tests for is_signal_handled function
560247f8d agent: add tests for update_container_namespaces
47d4e79c1 agent: add tests for do_write_stream function
e3ce8aff9 agent: add tests for get_memory_info function
ebe9fc2ca clh: Update to the v24.0 release
29c9391da agent: fix direct-assigned volume stats
d1848523d runtime: direct-volume stats use correct name
338c9f2b0 runtime: direct-volume stats update to use GET parameter
f528bc010 runtime: fix incorrect Action function for direct-volume stats
3413c8588 tools: Add QEMU patches for SGX numa support
db6d4f7e1 versions: Upgrade to Cloud Hypervisor v23.1

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-08 11:54:59 +03:00
Snir Sheriber
607a8a9c2d release: Adapt kata-deploy for 2.4.2
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-08 11:54:59 +03:00
Feng Wang
562e968d19 Merge pull request #4389 from fengwang666/my_2.4_pr_backport
My 2.4 pr backport -- fix shim leak caused by ESRCH in agent destroy
2022-06-02 14:28:40 -07:00
Feng Wang
e5568a31a7 agent: ignore ESRCH error when destroying containers
destroy() method should ignore the ESRCH error from signal::kill
and continue the operation as ESRCH is often considered harmless.

Fixes: #4359

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-06-02 12:50:54 -07:00
Feng Wang
322839ac75 runtime: force stop container after the container process exits
Set thestop container force flag to true so that the container state is always set to
“StateStopped” after the container wait goroutine is finished. This is necessary for
the following delete container step to succeed.

Fixes: #4359

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-06-02 12:50:40 -07:00
snir911
4be3aebd15 Merge pull request #4352 from snir911/fix-workflow-stable-2.4
backport-2.4 | workflows: add workflow_dispatch triggering to test-kata-deploy
2022-06-02 13:19:19 +03:00
Snir Sheriber
b75d5cee74 docs: update release process github token instructions
and fix the gpg generating key url

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-01 19:12:45 +03:00
Snir Sheriber
e938ce443c docs: update release process with latest workflow triggering
instructions

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-01 19:12:37 +03:00
Snir Sheriber
046ba4df7f workflows: add workflow_dispatch triggering to test-kata-deploy
This will allow to trigger the test-kata-deploy workflow manually from
any branch instead of using always the one that is defined on main

See: https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/

Fixes: #4349
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-06-01 16:25:25 +03:00
Fabiano Fidêncio
a1d2049bee Merge pull request #4337 from snir911/backports-stable-2.4
stable-2.4: runtime: Adding the correct detection of mediated PCIe devices
2022-05-30 22:35:26 +02:00
James O. D. Hunt
8dcf6c354f Merge pull request #4274 from egernst/backport-agent-fixes
stable-2.4: backport agent fixes
2022-05-30 16:57:07 +01:00
Zvonko Kaiser
14ce4b01ba runtime: Adding the correct detection of mediated PCIe devices
Fixes #4212

Backport-of: https://github.com/kata-containers/kata-containers/pull/4213
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-05-30 12:32:41 +03:00
snir911
4018fdc9b2 Merge pull request #4319 from fidencio/topic/stable-2.4-update-clh-to-v24.0
stable-2.4 | clh: Update to the v24.0 release
2022-05-29 11:46:58 +03:00
Champ-Goblem
f54d5cf165 agent: Fix is_signal_handled failing parsing str to u64
In the is_signal_handled function, when parsing the hex string returned
from `/proc/<pid>/status` the space/tab character after the colon
is not removed.

This patch trims the result of SigCgt so that
all whitespace characters are removed. It also extends the existing
test cases to check for this scenario.

Fixes: #4250
Signed-off-by: Champ-Goblem <cameron@northflank.com>
2022-05-26 15:44:56 -07:00
Braden Rayhorn
80d5f9e145 agent: move assert_result macro to test_utils file
Move the assert_result macro to the shared test_utils file
so that it is not duplicated in individual files.

Fixes: #4093

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-05-26 15:44:56 -07:00
Braden Rayhorn
50a74dfeee agent: add tests for is_signal_handled function
Add test coverage for is_signal_handled function in rpc.rs. Includes
refactors to make the function testable and handle additional cases.

Fixes #3939

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-05-26 15:43:45 -07:00
Braden Rayhorn
560247f8da agent: add tests for update_container_namespaces
Add test coverage for update_container_namespaces function
in src/rpc.rs. Includes minor refactor to make function easier
to test.

Fixes #4034

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-05-26 15:43:45 -07:00
Braden Rayhorn
47d4e79c15 agent: add tests for do_write_stream function
Add test coverage for do_write_stream function of AgentService
in src/rpc.rs. Includes minor refactoring to make function more
easily testable.

Fixes #3984

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-05-26 15:42:45 -07:00
Braden Rayhorn
e3ce8aff99 agent: add tests for get_memory_info function
Add test coverage for get_memory_info function in src/rpc.rs. Includes
some minor refactoring of the function.

Fixes #3837

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-05-26 15:42:45 -07:00
Yibo Zhuang
fc2c933a88 Merge pull request #4305 from yibozhuang/stable-2.4
stable-2.4 | Backport fixes for direct-volume stats
2022-05-26 13:52:19 -07:00
Fabiano Fidêncio
ebe9fc2cad clh: Update to the v24.0 release
This release has been tracked through the v24.0 project.

virtio-iommu specification describes how a device can be attached by default
to a bypass domain. This feature is particularly helpful for booting a VM with
guest software which doesn't support virtio-iommu but still need to access
the device. Now that Cloud Hypervisor supports this feature, it can boot a VM
with Rust Hypervisor Firmware or OVMF even if the virtio-block device exposing
the disk image is placed behind a virtual IOMMU.

Multiple checks have been added to the code to prevent devices with identical
identifiers from being created, and therefore avoid unexpected behaviors at boot
or whenever a device was hot plugged into the VM.

Sparse mmap support has been added to both VFIO and vfio-user devices. This
allows the device regions that are not fully mappable to be partially mapped.
And the more a device region can be mapped into the guest address space, the
fewer VM exits will be generated when this device is accessed. This directly
impacts the performance related to this device.

A new serial_number option has been added to --platform, allowing a user to
set a specific serial number for the platform. This number is exposed to the
guest through the SMBIOS.

* Fix loading RAW firmware (#4072)
* Reject compressed QCOW images (#4055)
* Reject virtio-mem resize if device is not activated (#4003)
* Fix potential mmap leaks from VFIO/vfio-user MMIO regions (#4069)
* Fix algorithm finding HOB memory resources (#3983)

* Refactor interrupt handling (#4083)
* Load kernel asynchronously (#4022)
* Only create ACPI memory manager DSDT when resizable (#4013)

Deprecated features will be removed in a subsequent release and users should
plan to use alternatives

* The mergeable option from the virtio-pmem support has been deprecated
(#3968)
* The dax option from the virtio-fs support has been deprecated (#3889)

Fixes: #4317

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-26 08:58:27 +00:00
Yibo Zhuang
29c9391da1 agent: fix direct-assigned volume stats
The current implementation of walking the
disks to match with the requested volume path
in agent doesn't work because the volume path
provided by the shim to the agent is the mount
path within the guest and not the device name.
The current logic is trying to match the
device name to the volume path which will never
match.

This change will simplify the
get_volume_capacity_stats and
get_volume_inode_stats to just call statfs and
get the bytes and inodes usage of the volume
path directly.

Fixes: #4297

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-23 16:40:35 -07:00
Yibo Zhuang
d1848523d3 runtime: direct-volume stats use correct name
Today the shim does a translation when doing
direct-volume stats where it takes the source and
returns the mount path within the guest.

The source for a direct-assigned volume is actually
the device path on the host and not the publish
volume path.

This change will perform a lookup of the mount info
during direct-volume stats to ensure that the
device path is provided to the shim for querying
the volume stats.

Fixes: #4297

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-23 16:34:50 -07:00
Yibo Zhuang
338c9f2b0b runtime: direct-volume stats update to use GET parameter
The go default http mux AFAIK doesn’t support pattern
routing so right now client is padding the url
for direct-volume stats with a subpath of the volume
path and this will always result in 404 not found returned
by the shim.

This change will update the shim to take the volume
path as a GET query parameter instead of a subpath.
If the parameter is missing or empty, then return
400 BadRequest to the client.

Fixes: #4297

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-23 16:34:41 -07:00
Yibo Zhuang
f528bc0103 runtime: fix incorrect Action function for direct-volume stats
The action function expects a function that returns error
but the current direct-volume stats Action returns
(string, error) which is invalid.

This change fixes the format and print out the stats from
the command instead.

Fixes: #4293

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-23 16:34:29 -07:00
Chelsea Mafrica
f821ecbdc6 Merge pull request #4268 from cmaf/tools-patch-qemu-sgx-numa-2.4
stable-2.4 | tools: Add QEMU patches for SGX numa support
2022-05-16 14:31:48 -07:00
Chelsea Mafrica
3413c8588d tools: Add QEMU patches for SGX numa support
There are a few patches for SGX numa support in QEMU added after the
6.2.0 release. Add them for SGX support in Kata.

Fixes #4254

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
(cherry picked from commit b4b9068cb7)
2022-05-16 11:48:22 -07:00
Fabiano Fidêncio
b93a0b1012 Merge pull request #4229 from likebreath/0510/backport_clh_v23.1
stable-2.4 | versions: Upgrade to Cloud Hypervisor v23.1
2022-05-12 21:55:20 +02:00
Bo Chen
db6d4f7e16 versions: Upgrade to Cloud Hypervisor v23.1
The following issues have been addressed from the latest bug fix release
v23.1 of Cloud Hypervisor: 1) Add some missing seccomp rules; 2) Remove
virtio-fs filesystem entries from config on removal; 3) Do not delete
API socket on API server start; 4) Reject virtio-mem resize if the guest
doesn't activate the device; 5) Fix OpenAPI naming of I/O throttling
knobs;

Fixes: #4222

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 82ea018281)
2022-05-10 11:26:57 -07:00
Fabiano Fidêncio
67d67ab66d Merge pull request #4204 from fidencio/2.4.1-branch-bump
# Kata Containers 2.4.1
2022-05-04 19:11:37 +02:00
Fabiano Fidêncio
99c6726cf6 release: Kata Containers 2.4.1
- stable-2.4 | Second round of backports for the 2.4.1 release
- stable-2.4 | First round of backports for the 2.4.1 release
- stable-2.4 | versions: Upgrade to Cloud Hypervisor v23.0
- stable-2.4 | runtime: Base64 encode the direct volume mountInfo path
- stable-2.4 | agent: Avoid agent panic when reading empty stats

8e076c87 release: Adapt kata-deploy for 2.4.1
b50b091c agent: watchers: ensure uid/gid is preserved on copy/mkdir
03bc89ab clh: Rely on Cloud Hypervisor for generating the device ID
6b2c641f tools: fix typo in clh directory name
81e10fe3 packaging: Fix clh build from source fall-back
8b21c5f7 agent: modify the type of swappiness to u64
3f5c6e71 runtime: Allock mockfs storage to be placed in any directory
0bd1abac runtime: Let MockFSInit create a mock fs driver at any path
3e74243f runtime: Move mockfs control global into mockfs.go
aed4fe6a runtime: Export StoragePathSuffix
e1c4f57c runtime: Don't abuse MockStorageRootPath() for factory tests
c49084f3 runtime: Make bind mount tests better clean up after themselves
4e350f7d runtime: Clean up mock hook logs in tests
415420f6 runtime: Make SetupOCIConfigFile clean up after itself
688b9abd runtime: Don't use fixed /tmp/mountPoint path
dc1288de kata-monitor: add a README file
78edf827 kata-monitor: add some links when generating pages for browsers
eff74fab agent: fsGroup support for direct-assigned volume
01cd5809 proto: fsGroup support for direct-assigned volume
97ad1d55 runtime: fsGroup support for direct-assigned volume
b62cced7 runtime: no need to write virtiofsd error to log
8242cfd2 kata-monitor: update the hrefs in the debug/pprof index page
a37d4e53 agent: best-effort removing mount point
d1197ee8 tools/packaging: Fix error path in 'kata-deploy-binaries.sh -s'
c9c77511 tools/packaging: Fix usage of kata-deploy-binaries.sh
1e622316 tools/packaging/kata-deploy: Copy install_yq.sh in a dedicated script
8fa64e01 packaging: Eliminate TTY_OPT and NO_TTY variables in kata-deploy
8f67f9e3 tools/packaging/kata-deploy/local-build: Add build to gitignore
3049b776 versions: Bump firecracker to v0.23.4
aedfef29 runtime/virtcontainers: Pass the hugepages resources to agent
c9e1f727 agent: Verify that we allocated as many hugepages as we need
ba858e8c agent: Don't attempt to create directories for hugepage configuration
bc32eff7 virtcontainers: clh: Re-generate the client code
984ef538 versions: Upgrade to Cloud Hypervisor v23.0
adf6493b runtime: Base64 encode the direct volume mountInfo path
6b417540 agent: Avoid agent panic when reading empty stats

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-04 16:18:45 +02:00
Fabiano Fidêncio
8e076c8701 release: Adapt kata-deploy for 2.4.1
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-04 16:18:45 +02:00
Fabiano Fidêncio
b11f7df5ab Merge pull request #4202 from fidencio/topic/second-round-of-backports-for-2.4.1
stable-2.4 | Second round of backports for the 2.4.1 release
2022-05-04 14:30:23 +02:00
Yibo Zhuang
b50b091c87 agent: watchers: ensure uid/gid is preserved on copy/mkdir
Today in agent watchers, when we copy files/symlinks
or create directories, the ownership of the source path
is not preserved which can lead to permission issues.

In copy, ensure that we do a chown of the source path
uid/gid to the destination file/symlink after copy to
ensure that ownership matches the source ownership.
fs::copy() takes care of setting the permissions.

For directory creation, ensure that we set the
permissions of the created directory to the source
directory permissions and also perform a chown of the
source path uid/gid to ensure directory ownership
and permissions matches to the source.

Fixes: #4188

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
(cherry picked from commit 70eda2fa6c)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-04 12:39:44 +02:00
Fabiano Fidêncio
03bc89ab0b clh: Rely on Cloud Hypervisor for generating the device ID
We're currently hitting a race condition on the Cloud Hypervisor's
driver code when quickly removing and adding a block device.

This happens because the device removal is an asynchronous operation,
and we currently do *not* monitor events coming from Cloud Hypervisor to
know when the device was actually removed.  Together with this, the
sandbox code doesn't know about that and when a new device is attached
it'll quickly assign what may be the very same ID to the new device,
leading to the Cloud Hypervisor's driver trying to hotplug a device with
the very same ID of the device that was not yet removed.

This is, in a nutshell, why the tests with Cloud Hypervisor and
devmapper have been failing every now and then.

The workaround taken to solve the issue is basically *not* passing down
the device ID to Cloud Hypervisor and simply letting Cloud Hypervisor
itself generate those, as Cloud Hypervisor does it in a manner that
avoids such conflicts.  With this addition we have then to keep a map of
the device ID and the Cloud Hypervisor's generated ID, so we can
properly remove the device.

This workaround will probably stay for a while, at least till someone
has enough cycles to implement a way to watch the device removal event
and then properly act on that.  Spoiler alert, this will be a complex
change that may not even be worth it considering the race can be avoided
with this commit.

Fixes: #4196

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 33a8b70558)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-04 12:39:39 +02:00
Fabiano Fidêncio
d4dccb4900 Merge pull request #4153 from fidencio/wip/first-round-of-backports-for-2.4.1
stable-2.4 | First round of backports for the 2.4.1 release
2022-04-27 11:23:35 +02:00
Greg Kurz
6b2c641f0b tools: fix typo in clh directory name
This allows to get released binaries again.

Fixes: #4151

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit b658dccc5f)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:11:05 +02:00
Greg Kurz
81e10fe34f packaging: Fix clh build from source fall-back
If we fail to download the clh binary, we fall-back to build from source.
Unfortunately, `pull_clh_released_binary()` leaves a `cloud_hypervisor`
directory behind, which causes `build_clh_from_source()` not to clone
the git repo:

    [ -d "${repo_dir}" ] || git clone "${cloud_hypervisor_repo}"

When building from a kata-containers git repo, the subsequent calls
to `git` in this function thus apply to the kata-containers repo and
eventually fail, e.g.:

+ git checkout v23.0
error: pathspec 'v23.0' did not match any file(s) known to git

It doesn't quite make sense actually to keep an existing directory the
content of which is arbitrary when we want to it to contain a specific
version of clh. Just remove it instead.

Fixes: #4151

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit afbd60da27)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:11:01 +02:00
holyfei
8b21c5f78d agent: modify the type of swappiness to u64
The type of MemorySwappiness in runtime is uint64, and the type of swappiness in agent is int64,
if we set max uint64 in runtime and pass it to agent, the value will be equal to -1. We should
modify the type of swappiness to u64

Fixes: #4123

Signed-off-by: holyfei <yangfeiyu20092010@163.com>
(cherry picked from commit 0239502781)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
3f5c6e7182 runtime: Allock mockfs storage to be placed in any directory
Currently EnableMockTesting() takes no arguments and will always place the
mock storage in the fixed location /tmp/vc/mockfs.  This means that one
test run can interfere with the next one if anything isn't cleaned up
(and there are other bugs which means that happens).  If if those were
fixed this would allow developers testing on the same machine to interfere
with each other.

So, allow the mockfs to be placed at an arbitrary place given as a
parameter to EnableMockTesting().  In TestMain() we place it under our
existing temporary directory, so we don't need any additional cleanup just
for the mockfs.

fixes #4140

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 1b931f4203)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
0bd1abac3e runtime: Let MockFSInit create a mock fs driver at any path
Currently MockFSInit always creates the mockfs at the fixed path
/tmp/vc/mockfs.  This change allows it to be initialized at any path
given as a parameter.  This allows the tests in fs_test.go to be
simplified, because the by using a temporary directory from
t.TempDir(), which is automatically cleaned up, we don't need to
manually trigger initTestDir() (which is misnamed, it's actually a
cleanup function).

For now we still use the fixed path when auto-creating the mockfs in
MockAutoInit(), but we'll change that later.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit ef6d54a781)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
3e74243fbe runtime: Move mockfs control global into mockfs.go
virtcontainers/persist/fs/mockfs.go defines a mock filesystem type for
testing.  A global variable in virtcontainers/persist/manager.go is used to
force use of the mock fs rather than a normal one.

This patch moves the global, and the EnableMockTesting() function which
sets it into mockfs.go.  This is slightly cleaner to begin with, and will
allow some further enhancements.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 5d8438e939)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
aed4fe6a2e runtime: Export StoragePathSuffix
storagePathSuffix defines the file path suffix - "vc" - used for
Kata's persistent storage information, as a private constant.  We
duplicate this information in fc.go which also needs it.

Export it from fs.go instead, so it can be used in fc.go.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 963d03ea8a)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
e1c4f57c35 runtime: Don't abuse MockStorageRootPath() for factory tests
A number of unit tests under virtcontainers/factory use
MockStorageRootPath() as a general purpose temporary directory.  This
doesn't make sense: the mockfs driver isn't even in use here since we only
call EnableMockTesting for the pase virtcontainers package, not the
subpackages.

Instead use t.TempDir() which is for exactly this purpose.  As a bonus it
also handles the cleanup, so we don't need MockStorageDestroy any more.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 1719a8b491)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
c49084f303 runtime: Make bind mount tests better clean up after themselves
There are several tests in mount_test.go which perform a sample bind
mount.  These need a corresponding unmount to clean up afterwards or
attempting to delete the temporary files will fail due to the existing
mountpoint.  Most of them had such an unmount, but
TestBindMountInvalidPgtypes was missing one.

In addition, the existing unmounts where done inconsistently - one was
simply inline (so wouldn't be executed if the test fails too early) and one
is a defer.  Change them all to use the t.Cleanup mechanism.

For the dummy mountpoint files, rather than cleaning them up after the
test, the tests were removing them at the beginning of the test.  That
stops the test being messed up by a previous run, but messily.  Since
these are created in a private temporary directory anyway, if there's
something already there, that indicates a problem we shouldn't ignore.
In fact we don't need to explicitly remove these at all - they'll be
removed along with the rest of the private temporary directory.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit bec59f9e39)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
4e350f7d53 runtime: Clean up mock hook logs in tests
The tests in hook_test.go run a mock hook binary, which does some debug
logging to /tmp/mock_hook.log.  Currently we don't clean up those logs
when the tests are done.  Use a test cleanup function to do this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit f7ba21c86f)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
415420f689 runtime: Make SetupOCIConfigFile clean up after itself
SetupOCIConfigFile creates a temporary directory with os.MkDirTemp().  This
means the callers need to register a deferred function to remove it again.
At least one of them was commented out meaning that a /temp/katatest-
directory was leftover after the unit tests ran.

Change to using t.TempDir() which as well as better matching other parts of
the tests means the testing framework will handle cleaning it up.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 90b2f5b776)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
David Gibson
688b9abd35 runtime: Don't use fixed /tmp/mountPoint path
Several tests in kata_agent_test.go create /tmp/mountPoint as a dummy
directory to mount.  This is not cleaned up after the test.  Although it
is in /tmp, that's still a little messy and can be confusing to a user.
In addition, because it uses the same name every time, it allows for one
run of the test to interfere with the next.

Use the built in t.TempDir() to use an automatically named and deleted
temporary directory instead.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 2eeb5dc223)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
Francesco Giudici
dc1288de8d kata-monitor: add a README file
Fixes: #3704

Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit 7b2ff02647)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
bin
78edf827df kata-monitor: add some links when generating pages for browsers
Add some links to rendered webpages for better user experience,
let users can jump to pages only by clicking links in browsers.

Fixes: #4061

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit f8cc5d1ad8)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:10:21 +02:00
Yibo Zhuang
eff74fab0e agent: fsGroup support for direct-assigned volume
Adding two functions set_ownership and
recursive_ownership_change to support changing group id
ownership for a mounted volume.

The set_ownership will be called in common_storage_handler
after mount_storage performs the mount for the volume.
set_ownership will be a noop if the FSGroup field in the
Storage struct is not set which indicates no chown will be
performed. If FSGroup field is specified, then it will
perform the recursive walk of the mounted volume path to
change ownership of all files and directories to the
desired group id. It will also configure the SetGid bit
so that files created the directory will have group
following parent directory group.

If the fsGroupChangePolicy is on root mismatch,
then the group ownership will be skipped if the root
directory group id alreasy matches the desired group
id and if the SetGid bit is also set on the root directory.

This is the same behavior as what
Kubelet does today when performing the recursive walk
to change ownership.

Fixes #4018

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
(cherry picked from commit 92c00c7e84)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:09:22 +02:00
Yibo Zhuang
01cd58094e proto: fsGroup support for direct-assigned volume
This change adds two fields to the Storage pb

FSGroup which is a group id that the runtime
specifies to indicate to the agent to perform a
chown of the mounted volume to the specified
group id after mounting is complete in the guest.

FSGroupChangePolicy which is a policy to indicate
whether to always perform the group id ownership
change or only if the root directory group id
does not match with the desired group id.

These two fields will allow CSI plugins to indicate
to Kata that after the block device is mounted in
the guest, group id ownership change should be performed
on that volume.

Fixes #4018

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
(cherry picked from commit 6a47b82c81)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Yibo Zhuang
97ad1d55ff runtime: fsGroup support for direct-assigned volume
The fsGroup will be specified by the fsGroup key in
the direct-assign mountinfo metadate field.
This will be set when invoking the kata-runtime
binary and providing the key, value pair in the metadata
field. Similarly, the fsGroupChangePolicy will also
be provided in the mountinfo metadate field.

Adding an extra fields FsGroup and FSGroupChangePolicy
in the Mount construct for container mount which will
be populated when creating block devices by parsing
out the mountInfo.json.

And in handleDeviceBlockVolume of the kata-agent client,
it checks if the mount FSGroup is not nil, which
indicates that fsGroup change is required in the guest,
and will provide the FSGroup field in the protobuf to
pass the value to the agent.

Fixes #4018

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
(cherry picked from commit 532d53977e)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Zhuoyu Tie
b62cced7f4 runtime: no need to write virtiofsd error to log
The scanner reads nothing from viriofsd stderr pipe, because param
'--syslog' rediercts stderr to syslog. So there is no need to write
scanner.Text() to kata log

Fixes: #4063

Signed-off-by: Zhuoyu Tie <tiezhuoyu@outlook.com>
(cherry picked from commit 6e79042aa0)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Francesco Giudici
8242cfd2be kata-monitor: update the hrefs in the debug/pprof index page
kata-monitor allows to get data profiles from the kata shim
instances running on the same node by acting as a proxy
(e.g., http://$NODE_ADDRESS:8090/debug/pprof/?sandbox=$MYSANDBOXID).
In order to proxy the requests and the responses to the right shim,
kata-monitor requires to pass the sandbox id via a query string in the
url.

The profiling index page proxied by kata-monitor contains the link to all
the data profiles available. All the links anyway do not contain the
sandbox id included in the request: the links result then broken when
accessed through kata-monitor.
This happens because the profiling index page comes from the kata shim,
which will not include the query string provided in the http request.

Let's add on-the-fly the sandbox id in each href tag returned by the kata
shim index page before providing the proxied page.

Fixes: #4054

Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit 86977ff780)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Feng Wang
a37d4e538f agent: best-effort removing mount point
During container exit, the agent tries to remove all the mount point directories,
which can fail if it's a readonly filesytem (e.g. device mapper). This commit ignores
the removal failure and logs a warning message.

Fixes: #4043

Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit aabcebbf58)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Greg Kurz
d1197ee8e5 tools/packaging: Fix error path in 'kata-deploy-binaries.sh -s'
`make kata-tarball` relies on `kata-deploy-binaries.sh -s` which
silently ignores errors, and you may end up with an incomplete
tarball without noticing it because `make`'s exit status is 0.

`kata-deploy-binaries.sh` does set the `errexit` option and all the
code in the script seems to assume that since it doesn't do error
checking. Unfortunately, bash automatically disables `errexit` when
calling a function from a conditional pipeline, like done in the `-s`
case:

	if [ "${silent}" == true ]; then
		if ! handle_build "${t}" &>"$log_file"; then
                ^^^^^^
           this disables `errexit`

and `handle_build` ends with a `tar tvf` that always succeeds.

Adding error checking all over the place isn't really an option
as it would seriously obfuscate the code. Drop the conditional
pipeline instead and print the final error message from a `trap`
handler on the special ERR signal. This requires the `errtrace`
option as `trap`s aren't propagated to functions by default.

Since all outputs of `handle_build` are redirected to the build
log file, some file descriptor duplication magic is needed for
the handler to be able to write to the orignal stdout and stderr.

Fixes #3757

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit a779e19bee)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Greg Kurz
c9c7751184 tools/packaging: Fix usage of kata-deploy-binaries.sh
Add missing documentation for -s .

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0baebd2b37)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Greg Kurz
1e62231610 tools/packaging/kata-deploy: Copy install_yq.sh in a dedicated script
'make kata-tarball' sometimes fails early with:

cp: cannot create regular file '[...]/tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh': File exists

This happens because all assets are built in parallel using the same
`kata-deploy-binaries-in-docker.sh` script, and thus all try to copy
the `install_yq.sh` script to the same location with the `cp` command.
This is a well known race condition that cannot be avoided without
serialization of `cp` invocations.

Move the copying of `install_yq.sh` to a separate script and ensure
it is called *before* parallel builds. Make the presence of the copy
a prerequisite for each sub-build so that they still can be triggered
individually. Update the GH release workflow to also call this script
before calling `kata-deploy-binaries-in-docker.sh`.

Fixes #3756

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 154c8b03d3)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
David Gibson
8fa64e011d packaging: Eliminate TTY_OPT and NO_TTY variables in kata-deploy
NO_TTY configured whether to add the -t option to docker run.  It makes no
sense for the caller to configure this, since whether you need it depends
on the commands you're running.  Since the point here is to run
non-interactive build scripts, we don't need -t, or -i either.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1ed7da8fc7)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
David Gibson
8f67f9e384 tools/packaging/kata-deploy/local-build: Add build to gitignore
This directory consists entirely of files built during a make kata-tarball,
so it should not be committed to the tree. A symbolic link to this directory
might be created during 'make tarball', ignore it as well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[greg: - rearranged the subject to make the subsystem checker happy
       - also ignore the symbolic link created by
         `kata-deploy-binaries-in-docker.sh`]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit bad859d2f8)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Greg Kurz
3049b7760a versions: Bump firecracker to v0.23.4
This release changes Docker images repository from DockerHub to Amazon
ECR. This resolves the `You have reached your pull rate limit` error
when building the firecracker tarball.

Fixes #4001

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0d5f80b803)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Miao Xia
aedfef29a3 runtime/virtcontainers: Pass the hugepages resources to agent
The hugepages resources claimed by containers should be limited
by cgroup in the guest OS.

Fixes: #3695

Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>
(cherry picked from commit a2f5c1768e)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
David Gibson
c9e1f72785 agent: Verify that we allocated as many hugepages as we need
allocate_hugepages() writes to the kernel sysfs file to allocate hugepages
in the Kata VM.  However, even if the write succeeds, it's not certain that
the kernel will actually be able to allocate as many hugepages as we
requested.

This patch reads back the file after writing it to check if we were able to
allocate all the required hugepages.

fixes #3816

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 42e35505b0)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
David Gibson
ba858e8cd9 agent: Don't attempt to create directories for hugepage configuration
allocate_hugepages() constructs the path for the sysfs directory containing
hugepage configuration, then attempts to create this directory if it does
not exist.

This doesn't make sense: sysfs is a view into kernel configuration, if the
kernel has support for the hugepage size, the directory will already be
there, if it doesn't, trying to create it won't help.

For the same reason, attempting to create the "nr_hugepages" file
itself is pointless, so there's no reason to call
OpenOptions::create(true).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 608e003abc)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 09:05:29 +02:00
Fabiano Fidêncio
b784763685 Merge pull request #4120 from likebreath/0420/backport_clh_v23.0
stable-2.4 | versions: Upgrade to Cloud Hypervisor v23.0
2022-04-21 14:33:37 +02:00
Fabiano Fidêncio
df2d57e9b8 Merge pull request #4098 from fengwang666/stable-2.4_backport
stable-2.4 | runtime: Base64 encode the direct volume mountInfo path
2022-04-21 12:54:03 +02:00
Bo Chen
bc32eff7b4 virtcontainers: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v23.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 29e569aa92)
2022-04-20 15:57:50 -07:00
Bo Chen
984ef5389e versions: Upgrade to Cloud Hypervisor v23.0
Highlights from the Cloud Hypervisor release v23.0: 1) vDPA Support; 2)
Updated OS Support list (Jammy 22.04 added with EOLed versions removed);
3) AArch64 Memory Map Improvements; 4) AMX Support; 5) Bug Fixes;

Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v23.0

Fixes: #4101

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 6012c19707)
2022-04-20 15:57:50 -07:00
Feng Wang
adf6493b89 runtime: Base64 encode the direct volume mountInfo path
This is to avoid accidentally deleting multiple volumes.

Fixes #4020

Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit 354cd3b9b6)
2022-04-13 22:30:53 -07:00
Greg Kurz
10bab3c96a Merge pull request #4081 from fidencio/wip/stable-2.4-agent-avoid-panic-when-getting-empty-stats
stable-2.4 | agent: Avoid agent panic when reading empty stats
2022-04-13 14:13:13 +02:00
Fabiano Fidêncio
6b41754018 agent: Avoid agent panic when reading empty stats
This was seen in an issue report, where we'd try to unwrap a None value,
leading to a panic.

Fixes: #4077
Related: #4043

Full backtrace:
```
"thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', rustjail/src/cgroups/fs/mod.rs:593:31"
"stack backtrace:"
"   0:     0x7f0390edcc3a - std::backtrace_rs::backtrace::libunwind::trace::hd5eff4de16dbdd15"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5"
"   1:     0x7f0390edcc3a - std::backtrace_rs::backtrace::trace_unsynchronized::h04a775b4c6ab90d6"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5"
"   2:     0x7f0390edcc3a - std::sys_common::backtrace::_print_fmt::h3253c3db9f17d826"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:67:5"
"   3:     0x7f0390edcc3a - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h02bfc712fc868664"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:46:22"
"   4:     0x7f0390a91fbc - core::fmt::write::hfd5090d1132106d8"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/fmt/mod.rs:1149:17"
"   5:     0x7f0390edb804 - std::io::Write::write_fmt::h34acb699c6d6f5a9"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/io/mod.rs:1697:15"
"   6:     0x7f0390edbee0 - std::sys_common::backtrace::_print::hfca761479e3d91ed"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:49:5"
"   7:     0x7f0390edbee0 - std::sys_common::backtrace::print::hf666af0b87d2b5ba"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:36:9"
"   8:     0x7f0390edbee0 - std::panicking::default_hook::{{closure}}::hb4617bd1d4a09097"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:211:50"
"   9:     0x7f0390edb2da - std::panicking::default_hook::h84f684d9eff1eede"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:228:9"
"  10:     0x7f0390edb2da - std::panicking::rust_panic_with_hook::h8e784f5c39f46346"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:606:17"
"  11:     0x7f0390f0c416 - std::panicking::begin_panic_handler::{{closure}}::hef496869aa926670"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:500:13"
"  12:     0x7f0390f0c3b6 - std::sys_common::backtrace::__rust_end_short_backtrace::h8e9b039b8ed3e70f"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys_common/backtrace.rs:139:18"
"  13:     0x7f0390f0c372 - rust_begin_unwind"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:498:5"
"  14:     0x7f03909062c0 - core::panicking::panic_fmt::h568976b83a33ae59"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:107:14"
"  15:     0x7f039090641c - core::panicking::panic::he2e71cfa6548cc2c"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:48:5"
"  16:     0x7f0390eb443f - <rustjail::cgroups::fs::Manager as rustjail::cgroups::Manager>::get_stats::h85031fc1c59c53d9"
"  17:     0x7f03909c0138 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hfa6e6cd7516f8d11"
"  18:     0x7f0390d697e5 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hffbaa534cfa97d44"
"  19:     0x7f039099c0b3 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hae3ab083a06d0b4b"
"  20:     0x7f0390af9e1e - std::panic::catch_unwind::h1fdd25c8ebba32e1"
"  21:     0x7f0390b7c4e6 - tokio::runtime::task::raw::poll::hd3ebbd0717dac808"
"  22:     0x7f0390f49f3f - tokio::runtime::thread_pool::worker::Context::run_task::hfdd63cd1e0b17abf"
"  23:     0x7f0390f3a599 - tokio::runtime::task::raw::poll::h62954f6369b1d210"
"  24:     0x7f0390f37863 - std::sys_common::backtrace::__rust_begin_short_backtrace::h1c58f232c078bfe9"
"  25:     0x7f0390f4f3dd - core::ops::function::FnOnce::call_once{{vtable.shim}}::h2d329a84c0feed57"
"  26:     0x7f0390f0e535 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h137e5243c6233a3b"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/alloc/src/boxed.rs:1694:9"
"  27:     0x7f0390f0e535 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h7331c46863d912b7"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/alloc/src/boxed.rs:1694:9"
"  28:     0x7f0390f0e535 - std::sys::unix::thread::Thread::new::thread_start::h1fb20b966cb927ab"
"                               at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/sys/unix/thread.rs:106:17"
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 78f30c33c6)
2022-04-12 18:59:02 +02:00
Fabiano Fidêncio
0ad6f05dee Merge pull request #4024 from bergwolf/2.4.0-branch-bump
# Kata Containers 2.4.0
2022-04-01 13:46:35 +02:00
Peng Tao
4c9c01a124 release: Kata Containers 2.4.0
- stable-2.4 | agent: fix container stop error with signal SIGRTMIN+3
- stable-2.4 | kata-monitor: fix duplicated output when printing usage
- stable-2.4 | runtime: Stop getting OOM events from agent for "ttrpc closed" error
- kata-deploy: fix version bump from -rc to stable
- stable-2.4: release: Include all the rust vendored code into the vendored tarball
- stable-2.4 | tools: release: Do not consider release candidates as stable releases
- agent: Signal the whole process group
- stable-2.4 | docs: Update k8s documentation
- backport main commits to stable 2.4
- stable-2.4: Bump QEMU to 6.2 (bringing then SGX support in)
- runtime: Properly handle ESRCH error when signaling container
- stable-2.4 | versions: Upgrade to Cloud Hypervisor v22.1

f2319d69 release: Adapt kata-deploy for 2.4.0
cae48e9c agent: fix container stop error with signal SIGRTMIN+3
342aa95c kata-monitor: fix duplicated output when printing usage
9f75e226 runtime: add logs around sandbox monitor
363fbed8 runtime: stop getting OOM events when ttrpc: closed error
f840de5a workflows,release: Ship *all* the rust vendored code
952cea5f tools: Add a generate_vendor.sh script
cc965fa0 kata-deploy: fix version bump from -rc to stable
f41cc184 tools: release: Do not consider release candidates as stable releases
e059b50f runtime: Add more debug logs for container io stream copy
71ce6f53 agent: Kill the all the container processes of the same cgroup
30fc2c86 docs: Update k8s documentation
24028969 virtcontainers: Run mock hook from build tree rather than system bin dir
4e54aa5a doc: fix filename typo
d815393c manager: Add options to change self test behaviour
4111e1a3 manager: Add option to enable component debug
2918be18 manager: Create containerd link
6b31b068 kernel: fix cve-2022-0847
5589b246 doc: update Intel SGX use cases document
1da88dca tools: update QEMU to 6.2
3e2f9223 runtime: Properly handle ESRCH error when signaling container
4c21cb3e versions: Upgrade to Cloud Hypervisor v22.1

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-04-01 06:20:20 +00:00
Peng Tao
f2319d693d release: Adapt kata-deploy for 2.4.0
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-04-01 06:20:20 +00:00
Bin Liu
98ccf8f6a1 Merge pull request #4008 from wxx213/stable-2.4
stable-2.4 | agent: fix container stop error with signal SIGRTMIN+3
2022-04-01 11:29:18 +08:00
Wang Xingxing
cae48e9c9b agent: fix container stop error with signal SIGRTMIN+3
The nix::sys::signal::Signal package api cannot deal with SIGRTMIN+3,
directly use libc function to send the signal.

Fixes: #3990

Signed-off-by: Wang Xingxing <stellarwxx@163.com>
(cherry picked from commit 0d765bd082)
Signed-off-by: Wang Xingxing <stellarwxx@163.com>
2022-03-31 16:49:06 +08:00
snir911
a36103c759 Merge pull request #4003 from fgiudici/kata-monitor_fix_help_backport
stable-2.4 | kata-monitor: fix duplicated output when printing usage
2022-03-30 18:57:17 +03:00
Fabiano Fidêncio
6abbcc551c Merge pull request #3997 from liubin/backport-2.4
stable-2.4 | runtime: Stop getting OOM events from agent for "ttrpc closed" error
2022-03-30 14:08:55 +02:00
Francesco Giudici
342aa95cc8 kata-monitor: fix duplicated output when printing usage
(default: "/run/containerd/containerd.sock") is duplicated when
printing kata-monitor usage:

[root@kubernetes ~]# kata-monitor --help
Usage of kata-monitor:
  -listen-address string
        The address to listen on for HTTP requests. (default ":8090")
  -log-level string
        Log level of logrus(trace/debug/info/warn/error/fatal/panic). (default "info")
  -runtime-endpoint string
        Endpoint of CRI container runtime service. (default: "/run/containerd/containerd.sock") (default "/run/containerd/containerd.sock")

the golang flag package takes care of adding the defaults when printing
usage. Remove the explicit print of the value so that it would not be
printed on screen twice.

Fixes: #3998

Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit a63bbf9793)
2022-03-30 14:02:54 +02:00
bin
9f75e226f1 runtime: add logs around sandbox monitor
For debugging purposes, add some logs.

Fixes: #3815

Signed-off-by: bin <bin@hyper.sh>
2022-03-30 17:11:40 +08:00
bin
363fbed804 runtime: stop getting OOM events when ttrpc: closed error
getOOMEvents is a long-waiting call, it will retry when failed.
For cases of agent shutdown, the retry should stop.

When the agent hasn't detected agent has died, we can also check
whether the error is "ttrpc: closed".

Fixes: #3815

Signed-off-by: bin <bin@hyper.sh>
2022-03-30 17:11:35 +08:00
Fabiano Fidêncio
54a638317a Merge pull request #3988 from bergwolf/github/kata-deploy
kata-deploy: fix version bump from -rc to stable
2022-03-30 11:01:45 +02:00
Peng Tao
8ce6b12b41 Merge pull request #3993 from fidencio/wip/stable-2.4-release-include-all-rust-vendored-code-to-the-vendored-tarball
stable-2.4: release: Include all the rust vendored code into the vendored tarball
2022-03-30 16:10:47 +08:00
Fabiano Fidêncio
f840de5acb workflows,release: Ship *all* the rust vendored code
Instead of only vendoring the code needed by the agent, let's ensure we
vendor all the needed rust code, and let's do it using the newly
introduced enerate_vendor.sh script.

Fixes: #3973

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3606923ac8)
2022-03-29 23:27:43 +02:00
Fabiano Fidêncio
952cea5f5d tools: Add a generate_vendor.sh script
This script is responsible for generating a tarball with all the rust
vendored code that is needed for fully building kata-containers on a
disconnected environment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2eb07455d0)
2022-03-29 23:27:29 +02:00
Peng Tao
cc965fa0cb kata-deploy: fix version bump from -rc to stable
In such case, we should bump from "latest" tag rather than from
current_version.

Fixes: #3986
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-03-29 03:45:27 +00:00
GabyCT
44b1473d0c Merge pull request #3977 from fidencio/wip/backport-fix-for-3847
stable-2.4 | tools: release: Do not consider release candidates as stable releases
2022-03-28 10:38:47 -06:00
Fupan Li
565efd1bf2 Merge pull request #3975 from bergwolf/github/backport-stable-2.4
agent: Signal the whole process group
2022-03-28 18:26:12 +08:00
Fabiano Fidêncio
f41cc18427 tools: release: Do not consider release candidates as stable releases
During the release of 2.4.0-rc0 @egernst noticed an incositency in the
way we handle release tags, as release candidates are being taken as
"stable" releases, while both the kata-deploy tests and the release
action consider this as "latest".

Ideally we should have our own tag for "release candidate", but that's
something that could and should be discussed more extensively outside of
the scope of this quick fix.

For now, let's align the code generating the PR for bumping the release
with what we already do as part of the release action and kata-deploy
test, and tag "-rc"  as latest, regardless of which branch it's coming
from.

Fixes: #3847

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4adf93ef2c)
2022-03-28 11:01:58 +02:00
Feng Wang
e059b50f5c runtime: Add more debug logs for container io stream copy
This can help debugging container lifecycle issues

Fixes: #3913

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-03-28 16:22:22 +08:00
Feng Wang
71ce6f537f agent: Kill the all the container processes of the same cgroup
Otherwise the container process might leak and cause an unclean exit

Fixes: #3913

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-03-28 16:21:51 +08:00
Bin Liu
a2b73b60bd Merge pull request #3960 from cmaf/update-k8s-docs-1-stable-2.4
stable-2.4 | docs: Update k8s documentation
2022-03-25 15:25:25 +08:00
Bin Liu
2ce9ce7b8f Merge pull request #3954 from bergwolf/github/backport-stable-2.4
backport main commits to stable 2.4
2022-03-25 14:45:17 +08:00
Chelsea Mafrica
30fc2c863d docs: Update k8s documentation
Update documentation with missing step to untaint node to enable
scheduling and update the example to run a pod using the kata runtime
class instead of untrusted workloads, which applies to versions of CRI-O
prior to v1.12.

Fixes #3863

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
(cherry picked from commit 5c434270d1)
2022-03-24 11:22:18 -07:00
David Gibson
24028969c2 virtcontainers: Run mock hook from build tree rather than system bin dir
Running unit tests should generally have minimal dependencies on
things outside the build tree.  It *definitely* shouldn't modify
system wide things outside the build tree.  Currently the runtime
"make test" target does so, though.

Several of the tests in src/runtime/pkg/katautils/hook_test.go require a
sample hook binary.  They expect this hook in
/usr/bin/virtcontainers/bin/test/hook, so the makefile, as root, installs
the test binary to that location.

Go tests automatically run within the package's directory though, so
there's no need to use a system wide path.  We can use a relative path to
the binary build within the tree just as easily.

fixes #3941

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-03-24 12:02:00 +08:00
Garrett Mahin
4e54aa5a7b doc: fix filename typo
Corrects a filename typo in cleanup cluster part
of kata-deploy README.md

Fixes: #3869
Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>
2022-03-24 12:00:17 +08:00
James O. D. Hunt
d815393c3e manager: Add options to change self test behaviour
Added new `kata-manager` options to control the self-test behaviour. By
default, after installation the manager will run a test to ensure a Kata
Containers container can be created. New options allow:

- The self test to be disabled.
- Only the self test to be run (no installation).

These features allow changes to be made to the installed system before
the self test is run.

Fixes: #3851.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-24 11:59:48 +08:00
James O. D. Hunt
4111e1a3de manager: Add option to enable component debug
Added a `-d` option to `kata-manager` to enable Kata Containers
and containerd debug.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-24 11:59:33 +08:00
James O. D. Hunt
2918be180f manager: Create containerd link
Make the `kata-manager` create a `containerd` link to ensure the
downloaded containerd systemd service file can find the daemon when
using the GitHub packaged version of containerd.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-24 11:59:26 +08:00
Julio Montes
6b31b06832 kernel: fix cve-2022-0847
bump guest kernel version to fix cve-2022-0847 "Dirty Pipe"

fixes #3852

Signed-off-by: Julio Montes <julio.montes@intel.com>
2022-03-24 11:58:43 +08:00
Fabiano Fidêncio
53a9cf7dc4 Merge pull request #3927 from fidencio/stable-2.4/qemu-bump
stable-2.4: Bump QEMU to 6.2 (bringing then SGX support in)
2022-03-23 07:20:35 +01:00
Julio Montes
5589b246d7 doc: update Intel SGX use cases document
Installation section is not longer needed because of the latest
default kata kernel supports Intel SGX.
Include QEMU to the list of supported hypervisors.

fixes #3911

Signed-off-by: Julio Montes <julio.montes@intel.com>
(cherry picked from commit 24b29310b2)
2022-03-22 08:36:04 +01:00
Julio Montes
1da88dca4b tools: update QEMU to 6.2
bring Intel SGX support

Changes tha may impact in Kata Containers
Arm:
The 'virt' machine now supports an emulated ITS
The 'virt' machine now supports more than 123 CPUs in TCG emulation mode
The pl031 real-time clock device now supports sending RTC_CHANGE QMP events

PowerPC:
Improved POWER10 support for the 'powernv' machine
Initial support for POWER10 DD2.0 CPU added
Added support for FORM2 PAPR NUMA descriptions in the "pseries" machine
 type

s390x:
Improved storage key emulation (e.g. fixed address handling, lazy
 storage key enablement for TCG, ...)
New gen16 CPU features are now enabled automatically in the latest
 machine type

KVM:
Support for SGX in the virtual machine, using the /dev/sgx_vepc device
 on the host and the "memory-backend-epc" backend in QEMU.
New "hv-apicv" CPU property (aliased to "hv-avic") sets the
 HV_DEPRECATING_AEOI_RECOMMENDED bit in CPUID[0x40000004].EAX.

virtio-mem:
QEMU now fully supports guest memory dumps with virtio-mem.
QEMU now cleanly supports precopy migration, postcopy migration and
 background snapshots with virtio-mem.

fixes #3902

Signed-off-by: Julio Montes <julio.montes@intel.com>
(cherry picked from commit 18d4d7fb1d)
2022-03-22 08:35:45 +01:00
Peng Tao
8cc2231818 Merge pull request #3892 from fengwang666/my_2.4_pr_backport
runtime: Properly handle ESRCH error when signaling container
2022-03-15 10:11:25 +08:00
GabyCT
63c1498f05 Merge pull request #3891 from likebreath/stable-2.4
stable-2.4 | versions: Upgrade to Cloud Hypervisor v22.1
2022-03-14 17:44:09 -06:00
Feng Wang
3e2f9223b0 runtime: Properly handle ESRCH error when signaling container
Currently kata shim v2 doesn't translate ESRCH signal, causing container
fail to stop and shim leak.

Fixes: #3874

Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit aa5ae6b17c)
2022-03-14 13:15:54 -07:00
Bo Chen
4c21cb3eb1 versions: Upgrade to Cloud Hypervisor v22.1
This is a bug fix release. The following issues have been addressed:
1) VFIO ioctl reordering to fix MSI on AMD platforms; 2) Fix virtio-net
control queue.

Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v22.1

Fixes: #3872

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 7a18e32fa7)
2022-03-14 12:34:31 -07:00
1411 changed files with 21036 additions and 356124 deletions

View File

@@ -1,40 +0,0 @@
# Copyright (c) 2022 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
name: Add PR sizing label
on:
pull_request_target:
types:
- opened
- reopened
- synchronize
jobs:
add-pr-size-label:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v1
- name: Install PR sizing label script
run: |
# Clone into a temporary directory to avoid overwriting
# any existing github directory.
pushd $(mktemp -d) &>/dev/null
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
sudo install pr-add-size-label.sh /usr/local/bin
popd &>/dev/null
- name: Add PR sizing label
env:
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_PR_SIZE_TOKEN }}
run: |
pr=${{ github.event.number }}
# Removing man-db, workflow kept failing, fixes: #4480
sudo apt -y remove --purge man-db
sudo apt -y install diffstat patchutils
pr-add-size-label.sh -p "$pr"

View File

@@ -10,7 +10,7 @@ env:
error_msg: |+
See the document below for help on formatting commits for the project.
https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md#patch-format
https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-format
jobs:
commit-message-check:
@@ -63,8 +63,7 @@ jobs:
# the entire commit message.
#
# - Body lines *can* be longer than the maximum if they start
# with a non-alphabetic character or if there is no whitespace in
# the line.
# with a non-alphabetic character.
#
# This allows stack traces, log files snippets, emails, long URLs,
# etc to be specified. Some of these naturally "work" as they start
@@ -75,8 +74,8 @@ jobs:
#
# - A SoB comment can be any length (as it is unreasonable to penalise
# people with long names/email addresses :)
pattern: '^.+(\n([a-zA-Z].{0,150}|[^a-zA-Z\n].*|[^\s\n]*|Signed-off-by:.*|))+$'
error: 'Body line too long (max 150)'
pattern: '^.+(\n([a-zA-Z].{0,149}|[^a-zA-Z\n].*|Signed-off-by:.*|))+$'
error: 'Body line too long (max 72)'
post_error: ${{ env.error_msg }}
- name: Check Fixes

View File

@@ -1,44 +0,0 @@
on:
schedule:
- cron: '0 23 * * 0'
name: Docs URL Alive Check
jobs:
test:
strategy:
matrix:
go-version: [1.17.x]
os: [ubuntu-20.04]
runs-on: ${{ matrix.os }}
env:
target_branch: ${{ github.base_ref }}
steps:
- name: Install Go
if: github.repository_owner == 'kata-containers'
uses: actions/setup-go@v2
with:
go-version: ${{ matrix.go-version }}
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Set env
if: github.repository_owner == 'kata-containers'
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
- name: Checkout code
if: github.repository_owner == 'kata-containers'
uses: actions/checkout@v2
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Setup
if: github.repository_owner == 'kata-containers'
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
env:
GOPATH: ${{ runner.workspace }}/kata-containers
# docs url alive check
- name: Docs URL Alive Check
if: github.repository_owner == 'kata-containers'
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make docs-url-alive-check

View File

@@ -24,7 +24,6 @@ jobs:
- firecracker
- rootfs-image
- rootfs-initrd
- virtiofsd
steps:
- uses: actions/checkout@v2
- name: Install docker

View File

@@ -48,7 +48,6 @@ jobs:
- rootfs-image
- rootfs-initrd
- shim-v2
- virtiofsd
steps:
- name: get-PR-ref
id: get-PR-ref

View File

@@ -17,7 +17,6 @@ jobs:
- rootfs-image
- rootfs-initrd
- shim-v2
- virtiofsd
steps:
- uses: actions/checkout@v2
- name: Install docker

View File

@@ -19,8 +19,6 @@ jobs:
- name: Build snap
run: |
# Removing man-db, workflow kept failing, fixes: #4480
sudo apt -y remove --purge man-db
sudo apt-get install -y git git-extras
kata_url="https://github.com/kata-containers/kata-containers"
latest_version=$(git ls-remote --tags ${kata_url} | egrep -o "refs.*" | egrep -v "\-alpha|\-rc|{}" | egrep -o "[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+" | sort -V -r | head -1)

1
.gitignore vendored
View File

@@ -10,5 +10,4 @@ src/agent/kata-agent.service
src/agent/protocols/src/*.rs
!src/agent/protocols/src/lib.rs
build
src/tools/log-parser/kata-log-parser

View File

@@ -14,8 +14,6 @@ TOOLS =
TOOLS += agent-ctl
TOOLS += trace-forwarder
TOOLS += runk
TOOLS += log-parser
STANDARD_TARGETS = build check clean install test vendor
@@ -41,16 +39,10 @@ generate-protocols:
static-checks: build
bash ci/static-checks.sh
docs-url-alive-check:
bash ci/docs-url-alive-check.sh
.PHONY: \
all \
binary-tarball \
default \
install-binary-tarball \
logging-crate-tests \
static-checks \
docs-url-alive-check
static-checks

View File

@@ -118,7 +118,6 @@ The table below lists the core parts of the project:
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |
| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |
| [libraries](src/libs) | core | Library crates shared by multiple Kata Container components or published to [`crates.io`](https://crates.io/index.html) |
| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |
### Additional components
@@ -132,7 +131,6 @@ The table below lists the remaining parts of the project:
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
@@ -140,7 +138,7 @@ The table below lists the remaining parts of the project:
Kata Containers is now
[available natively for most distributions](docs/install/README.md#packaged-installation-methods).
However, packaging scripts and metadata are still used to generate [snap](snap/local) and GitHub releases. See
However, packaging scripts and metadata are still used to generate snap and GitHub releases. See
the [components](#components) section for further details.
## Glossary of Terms

View File

@@ -1 +1 @@
2.5.1
2.4.3

View File

@@ -11,10 +11,10 @@ runtimedir=$cidir/../src/runtime
build_working_packages() {
# working packages:
device_api=$runtimedir/pkg/device/api
device_config=$runtimedir/pkg/device/config
device_drivers=$runtimedir/pkg/device/drivers
device_manager=$runtimedir/pkg/device/manager
device_api=$runtimedir/virtcontainers/device/api
device_config=$runtimedir/virtcontainers/device/config
device_drivers=$runtimedir/virtcontainers/device/drivers
device_manager=$runtimedir/virtcontainers/device/manager
rc_pkg_dir=$runtimedir/pkg/resourcecontrol/
utils_pkg_dir=$runtimedir/virtcontainers/utils

View File

@@ -1,6 +1,6 @@
#!/bin/bash
#!/usr/bin/env bash
#
# Copyright (c) 2021 Easystack Inc.
# Copyright (c) 2020 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
@@ -9,4 +9,4 @@ set -e
cidir=$(dirname "$0")
source "${cidir}/lib.sh"
run_docs_url_alive_check
run_go_test

View File

@@ -19,7 +19,7 @@ source "${tests_repo_dir}/.ci/lib.sh"
# fail. So let's ensure they are unset here.
unset PREFIX DESTDIR
arch=${ARCH:-$(uname -m)}
arch=$(uname -m)
workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"
# Variables for libseccomp
@@ -70,8 +70,7 @@ build_and_install_gperf() {
curl -sLO "${gperf_tarball_url}"
tar -xf "${gperf_tarball}"
pushd "gperf-${gperf_version}"
# Unset $CC for configure, we will always use native for gperf
CC= ./configure --prefix="${gperf_install_dir}"
./configure --prefix="${gperf_install_dir}"
make
make install
export PATH=$PATH:"${gperf_install_dir}"/bin
@@ -85,7 +84,7 @@ build_and_install_libseccomp() {
curl -sLO "${libseccomp_tarball_url}"
tar -xf "${libseccomp_tarball}"
pushd "libseccomp-${libseccomp_version}"
./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"
./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static
make
make install
popd

24
ci/install_musl.sh Executable file
View File

@@ -0,0 +1,24 @@
#!/usr/bin/env bash
# Copyright (c) 2020 Ant Group
#
# SPDX-License-Identifier: Apache-2.0
#
set -e
install_aarch64_musl() {
local arch=$(uname -m)
if [ "${arch}" == "aarch64" ]; then
local musl_tar="${arch}-linux-musl-native.tgz"
local musl_dir="${arch}-linux-musl-native"
pushd /tmp
if curl -sLO --fail https://musl.cc/${musl_tar}; then
tar -zxf ${musl_tar}
mkdir -p /usr/local/musl/
cp -r ${musl_dir}/* /usr/local/musl/
fi
popd
fi
}
install_aarch64_musl

View File

@@ -18,13 +18,6 @@ clone_tests_repo()
{
if [ -d "$tests_repo_dir" ]; then
[ -n "${CI:-}" ] && return
# git config --global --add safe.directory will always append
# the target to .gitconfig without checking the existence of
# the target, so it's better to check it before adding the target repo.
local sd="$(git config --global --get safe.directory ${tests_repo_dir} || true)"
if [ -z "${sd}" ]; then
git config --global --add safe.directory ${tests_repo_dir}
fi
pushd "${tests_repo_dir}"
git checkout "${branch}"
git pull
@@ -46,11 +39,8 @@ run_static_checks()
bash "$tests_repo_dir/.ci/static-checks.sh" "$@"
}
run_docs_url_alive_check()
run_go_test()
{
clone_tests_repo
# Make sure we have the targeting branch
git remote set-branches --add origin "${branch}"
git fetch -a
bash "$tests_repo_dir/.ci/static-checks.sh" --docs --all "github.com/kata-containers/kata-containers"
bash "$tests_repo_dir/.ci/go-test.sh"
}

View File

@@ -116,7 +116,7 @@ detailed below.
The Kata logs appear in the `containerd` log files, along with logs from `containerd` itself.
For more information about `containerd` debug, please see the
[`containerd` documentation](https://github.com/containerd/containerd/blob/main/docs/getting-started.md).
[`containerd` documentation](https://github.com/containerd/containerd/blob/master/docs/getting-started.md).
#### Enabling full `containerd` debug
@@ -465,7 +465,7 @@ script and paste its output directly into a
> [runtime](../src/runtime) repository.
To perform analysis on Kata logs, use the
[`kata-log-parser`](../src/tools/log-parser)
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
tool, which can convert the logs into formats (e.g. JSON, TOML, XML, and YAML).
See [Set up a debug console](#set-up-a-debug-console).
@@ -700,11 +700,11 @@ options to have the kernel boot messages logged into the system journal.
For generic information on enabling debug in the configuration file, see the
[Enable full debug](#enable-full-debug) section.
The kernel boot messages will appear in the `kata` logs (and in the `containerd` or `CRI-O` log appropriately).
The kernel boot messages will appear in the `containerd` or `CRI-O` log appropriately,
such as:
```bash
$ sudo journalctl -t kata
$ sudo journalctl -t containerd
-- Logs begin at Thu 2020-02-13 16:20:40 UTC, end at Thu 2020-02-13 16:30:23 UTC. --
...
time="2020-09-15T14:56:23.095113803+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.395399] brd: module loaded"
@@ -714,4 +714,3 @@ time="2020-09-15T14:56:23.105268162+08:00" level=debug msg="reading guest consol
time="2020-09-15T14:56:23.121121598+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.421324] memmap_init_zone_device initialised 32768 pages in 12ms"
...
```
Refer to the [kata-log-parser documentation](../src/tools/log-parser/README.md) which is useful to fetch these.

View File

@@ -46,7 +46,7 @@ The following link shows the latest list of limitations:
# Contributing
If you would like to work on resolving a limitation, please refer to the
[contributors guide](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md).
[contributors guide](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md).
If you wish to raise an issue for a new limitation, either
[raise an issue directly on the runtime](https://github.com/kata-containers/kata-containers/issues/new)
or see the

View File

@@ -30,6 +30,7 @@ See the [how-to documentation](how-to).
* [GPU Passthrough with Kata](./use-cases/GPU-passthrough-and-Kata.md)
* [SR-IOV with Kata](./use-cases/using-SRIOV-and-kata.md)
* [Intel QAT with Kata](./use-cases/using-Intel-QAT-and-kata.md)
* [VPP with Kata](./use-cases/using-vpp-and-kata.md)
* [SPDK vhost-user with Kata](./use-cases/using-SPDK-vhostuser-and-kata.md)
* [Intel SGX with Kata](./use-cases/using-Intel-SGX-and-kata.md)

View File

@@ -277,9 +277,7 @@ mod tests {
## Temporary files
Use `t.TempDir()` to create temporary directory. The directory created by
`t.TempDir()` is automatically removed when the test and all its subtests
complete.
Always delete temporary files on success.
### Golang temporary files
@@ -288,7 +286,11 @@ func TestSomething(t *testing.T) {
assert := assert.New(t)
// Create a temporary directory
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
// Delete it at the end of the test
defer os.RemoveAll(tmpdir)
// Add test logic that will use the tmpdir here...
}
@@ -320,7 +322,7 @@ mod tests {
## Test user
[Unit tests are run *twice*](../src/runtime/go-test.sh):
[Unit tests are run *twice*](https://github.com/kata-containers/tests/blob/main/.ci/go-test.sh):
- as the current user
- as the `root` user (if different to the current user)

View File

@@ -79,7 +79,7 @@ a "`BUG: feature X not implemented see {bug-url}`" type error.
- Don't use multiple log calls when a single log call could be used.
- Use structured logging where possible to allow
[standard tooling](../src/tools/log-parser)
[standard tooling](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
be able to extract the log fields.
### Names

View File

@@ -11,8 +11,7 @@ Kata Containers design documents:
- [`Inotify` support](inotify.md)
- [Metrics(Kata 2.0)](kata-2-0-metrics.md)
- [Design for Kata Containers `Lazyload` ability with `nydus`](kata-nydus-design.md)
- [Design for direct-assigned volume](direct-blk-device-assignment.md)
- [Design for core-scheduling](core-scheduling.md)
---
- [Design proposals](proposals)

View File

@@ -67,15 +67,22 @@ Using a proxy for multiplexing the connections between the VM and the host uses
4.5MB per [POD][2]. In a high density deployment this could add up to GBs of
memory that could have been used to host more PODs. When we talk about density
each kilobyte matters and it might be the decisive factor between run another
POD or not. Before making the decision not to use VSOCKs, you should ask
POD or not. For example if you have 500 PODs running in a server, the same
amount of [`kata-proxy`][3] processes will be running and consuming for around
2250MB of RAM. Before making the decision not to use VSOCKs, you should ask
yourself, how many more containers can run with the memory RAM consumed by the
Kata proxies?
### Reliability
[`kata-proxy`][3] is in charge of multiplexing the connections between virtual
machine and host processes, if it dies all connections get broken. For example
if you have a [POD][2] with 10 containers running, if `kata-proxy` dies it would
be impossible to contact your containers, though they would still be running.
Since communication via VSOCKs is direct, the only way to lose communication
with the containers is if the VM itself or the `containerd-shim-kata-v2` dies, if this happens
the containers are removed automatically.
[1]: https://wiki.qemu.org/Features/VirtioVsock
[2]: ./vcpu-handling.md#virtual-cpus-and-kubernetes-pods
[3]: https://github.com/kata-containers/proxy

View File

@@ -17,7 +17,7 @@ Kubelet instance is responsible for managing the lifecycle of pods
within the nodes and eventually relies on a container runtime to
handle execution. The Kubelet architecture decouples lifecycle
management from container execution through a dedicated gRPC based
[Container Runtime Interface (CRI)](https://github.com/kubernetes/design-proposals-archive/blob/main/node/container-runtime-interface-v1.md).
[Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/container-runtime-interface-v1.md).
In other words, a Kubelet is a CRI client and expects a CRI
implementation to handle the server side of the interface.

View File

@@ -1,17 +1,5 @@
# Storage
## Limits
Kata Containers is [compatible](README.md#compatibility) with existing
standards and runtime. From the perspective of storage, this means no
limits are placed on the amount of storage a container
[workload](README.md#workload) may use.
Since cgroups are not able to set limits on storage allocation, if you
wish to constrain the amount of storage a container uses, consider
using an existing facility such as `quota(1)` limits or
[device mapper](#devicemapper) limits.
## virtio SCSI
If a block-based graph driver is [configured](README.md#configuration),
@@ -32,7 +20,7 @@ For virtio-fs, the [runtime](README.md#runtime) starts one `virtiofsd` daemon
## Devicemapper
The
[devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/main/snapshots/devmapper)
[devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/master/snapshots/devmapper)
is a special case. The `snapshotter` uses dedicated block devices
rather than formatted filesystems, and operates at the block level
rather than the file level. This knowledge is used to directly use the

View File

@@ -1,12 +0,0 @@
# Core scheduling
Core scheduling is a Linux kernel feature that allows only trusted tasks to run concurrently on
CPUs sharing compute resources (for example, hyper-threads on a core).
Containerd versions >= 1.6.4 leverage this to treat all of the processes associated with a
given pod or container to be a single group of trusted tasks. To indicate this should be carried
out, containerd sets the `SCHED_CORE` environment variable for each shim it spawns. When this is
set, the Kata Containers shim implementation uses the `prctl` syscall to create a new core scheduling
domain for the shim process itself as well as future VMM processes it will start.
For more details on the core scheduling feature, see the [Linux documentation](https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html).

View File

@@ -1,253 +0,0 @@
# Motivation
Today, there exist a few gaps between Container Storage Interface (CSI) and virtual machine (VM) based runtimes such as Kata Containers
that prevent them from working together smoothly.
First, its cumbersome to use a persistent volume (PV) with Kata Containers. Today, for a PV with Filesystem volume mode, Virtio-fs
is the only way to surface it inside a Kata Container guest VM. But often mounting the filesystem (FS) within the guest operating system (OS) is
desired due to performance benefits, availability of native FS features and security benefits over the Virtio-fs mechanism.
Second, its difficult if not impossible to resize a PV online with Kata Containers. While a PV can be expanded on the host OS,
the updated metadata needs to be propagated to the guest OS in order for the application container to use the expanded volume.
Currently, there is not a way to propagate the PV metadata from the host OS to the guest OS without restarting the Pod sandbox.
# Proposed Solution
Because of the OS boundary, these features cannot be implemented in the CSI node driver plugin running on the host OS
as is normally done in the runc container. Instead, they can be done by the Kata Containers agent inside the guest OS,
but it requires the CSI driver to pass the relevant information to the Kata Containers runtime.
An ideal long term solution would be to have the `kubelet` coordinating the communication between the CSI driver and
the container runtime, as described in [KEP-2857](https://github.com/kubernetes/enhancements/pull/2893/files).
However, as the KEP is still under review, we would like to propose a short/medium term solution to unblock our use case.
The proposed solution is built on top of a previous [proposal](https://github.com/egernst/kata-containers/blob/da-proposal/docs/design/direct-assign-volume.md)
described by Eric Ernst. The previous proposal has two gaps:
1. Writing a `csiPlugin.json` file to the volume root path introduced a security risk. A malicious user can gain unauthorized
access to a block device by writing their own `csiPlugin.json` to the above location through an ephemeral CSI plugin.
2. The proposal didn't describe how to establish a mapping between a volume and a kata sandbox, which is needed for
implementing CSI volume resize and volume stat collection APIs.
This document particularly focuses on how to address these two gaps.
## Assumptions and Limitations
1. The proposal assumes that a block device volume will only be used by one Pod on a node at a time, which we believe
is the most common pattern in Kata Containers use cases. Its also unsafe to have the same block device attached to more than
one Kata pod. In the context of Kubernetes, the `PersistentVolumeClaim` (PVC) needs to have the `accessMode` as `ReadWriteOncePod`.
2. More advanced Kubernetes volume features such as, `fsGroup`, `fsGroupChangePolicy`, and `subPath` are not supported.
## End User Interface
1. The user specifies a PV as a direct-assigned volume. How a PV is specified as a direct-assigned volume is left for each CSI implementation to decide.
There are a few options for reference:
1. A storage class parameter specifies whether it's a direct-assigned volume. This avoids any lookups of PVC
or Pod information from the CSI plugin (as external provisioner takes care of these). However, all PVs in the storage class with the parameter set
will have host mounts skipped.
2. Use a PVC annotation. This approach requires the CSI plugins have `--extra-create-metadata` [set](https://kubernetes-csi.github.io/docs/external-provisioner.html#persistentvolumeclaim-and-persistentvolume-parameters)
to be able to perform a lookup of the PVC annotations from the API server. Pro: API server lookup of annotations only required during creation of PV.
Con: The CSI plugin will always skip host mounting of the PV.
3. The CSI plugin can also lookup pod `runtimeclass` during `NodePublish`. This approach can be found in the [ALIBABA CSI plugin](https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/master/pkg/disk/nodeserver.go#L248).
2. The CSI node driver delegates the direct assigned volume to the Kata Containers runtime. The CSI node driver APIs need to
be modified to pass the volume mount information and collect volume information to/from the Kata Containers runtime by invoking `kata-runtime` command line commands.
* **NodePublishVolume** -- It invokes `kata-runtime direct-volume add --volume-path [volumePath] --mount-info [mountInfo]`
to propagate the volume mount information to the Kata Containers runtime for it to carry out the filesystem mount operation.
The `volumePath` is the [target_path](https://github.com/container-storage-interface/spec/blob/master/csi.proto#L1364) in the CSI `NodePublishVolumeRequest`.
The `mountInfo` is a serialized JSON string.
* **NodeGetVolumeStats** -- It invokes `kata-runtime direct-volume stats --volume-path [volumePath]` to retrieve the filesystem stats of direct-assigned volume.
* **NodeExpandVolume** -- It invokes `kata-runtime direct-volume resize --volume-path [volumePath] --size [size]` to send a resize request to the Kata Containers runtime to
resize the direct-assigned volume.
* **NodeStageVolume/NodeUnStageVolume** -- It invokes `kata-runtime direct-volume remove --volume-path [volumePath]` to remove the persisted metadata of a direct-assigned volume.
The `mountInfo` object is defined as follows:
```Golang
type MountInfo struct {
// The type of the volume (ie. block)
VolumeType string `json:"volume-type"`
// The device backing the volume.
Device string `json:"device"`
// The filesystem type to be mounted on the volume.
FsType string `json:"fstype"`
// Additional metadata to pass to the agent regarding this volume.
Metadata map[string]string `json:"metadata,omitempty"`
// Additional mount options.
Options []string `json:"options,omitempty"`
}
```
Notes: given that the `mountInfo` is persisted to the disk by the Kata runtime, it shouldn't container any secrets (such as SMB mount password).
## Implementation Details
### Kata runtime
Instead of the CSI node driver writing the mount info into a `csiPlugin.json` file under the volume root,
as described in the original proposal, here we propose that the CSI node driver passes the mount information to
the Kata Containers runtime through a new `kata-runtime` commandline command. The `kata-runtime` then writes the mount
information to a `mount-info.json` file in a predefined location (`/run/kata-containers/shared/direct-volumes/[volume_path]/`).
When the Kata Containers runtime starts a container, it verifies whether a volume mount is a direct-assigned volume by checking
whether there is a `mountInfo` file under the computed Kata `direct-volumes` directory. If it is, the runtime parses the `mountInfo` file,
updates the mount spec with the data in `mountInfo`. The updated mount spec is then passed to the Kata agent in the guest VM together
with other mounts. The Kata Containers runtime also creates a file named by the sandbox id under the `direct-volumes/[volume_path]/`
directory. The reason for adding a sandbox id file is to establish a mapping between the volume and the sandbox using it.
Later, when the Kata Containers runtime handles the `get-stats` and `resize` commands, it uses the sandbox id to identify
the endpoint of the corresponding `containerd-shim-kata-v2`.
### containerd-shim-kata-v2 changes
`containerd-shim-kata-v2` provides an API for sandbox management through a Unix domain socket. Two new handlers are proposed: `/direct-volume/stats` and `/direct-volume/resize`:
Example:
```bash
$ curl --unix-socket "$shim_socket_path" -I -X GET 'http://localhost/direct-volume/stats/[urlSafeVolumePath]'
$ curl --unix-socket "$shim_socket_path" -I -X POST 'http://localhost/direct-volume/resize' -d '{ "volumePath"": [volumePath], "Size": "123123" }'
```
The shim then forwards the corresponding request to the `kata-agent` to carry out the operations inside the guest VM. For `resize` operation,
the Kata runtime also needs to notify the hypervisor to resize the block device (e.g. call `block_resize` in QEMU).
### Kata agent changes
The mount spec of a direct-assigned volume is passed to `kata-agent` through the existing `Storage` GRPC object.
Two new APIs and three new GRPC objects are added to GRPC protocol between the shim and agent for resizing and getting volume stats:
```protobuf
rpc GetVolumeStats(VolumeStatsRequest) returns (VolumeStatsResponse);
rpc ResizeVolume(ResizeVolumeRequest) returns (google.protobuf.Empty);
message VolumeStatsRequest {
// The volume path on the guest outside the container
string volume_guest_path = 1;
}
message ResizeVolumeRequest {
// Full VM guest path of the volume (outside the container)
string volume_guest_path = 1;
uint64 size = 2;
}
// This should be kept in sync with CSI NodeGetVolumeStatsResponse (https://github.com/container-storage-interface/spec/blob/v1.5.0/csi.proto)
message VolumeStatsResponse {
// This field is OPTIONAL.
repeated VolumeUsage usage = 1;
// Information about the current condition of the volume.
// This field is OPTIONAL.
// This field MUST be specified if the VOLUME_CONDITION node
// capability is supported.
VolumeCondition volume_condition = 2;
}
message VolumeUsage {
enum Unit {
UNKNOWN = 0;
BYTES = 1;
INODES = 2;
}
// The available capacity in specified Unit. This field is OPTIONAL.
// The value of this field MUST NOT be negative.
uint64 available = 1;
// The total capacity in specified Unit. This field is REQUIRED.
// The value of this field MUST NOT be negative.
uint64 total = 2;
// The used capacity in specified Unit. This field is OPTIONAL.
// The value of this field MUST NOT be negative.
uint64 used = 3;
// Units by which values are measured. This field is REQUIRED.
Unit unit = 4;
}
// VolumeCondition represents the current condition of a volume.
message VolumeCondition {
// Normal volumes are available for use and operating optimally.
// An abnormal volume does not meet these criteria.
// This field is REQUIRED.
bool abnormal = 1;
// The message describing the condition of the volume.
// This field is REQUIRED.
string message = 2;
}
```
### Step by step walk-through
Given the following definition:
```YAML
---
apiVersion: v1
kind: Pod
metadata:
name: app
spec:
runtime-class: kata-qemu
containers:
- name: app
image: centos
command: ["/bin/sh"]
args: ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 5; done"]
volumeMounts:
- name: persistent-storage
mountPath: /data
volumes:
- name: persistent-storage
persistentVolumeClaim:
claimName: ebs-claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
skip-hostmount: "true"
name: ebs-claim
spec:
accessModes:
- ReadWriteOncePod
volumeMode: Filesystem
storageClassName: ebs-sc
resources:
requests:
storage: 4Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: ebs-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
csi.storage.k8s.io/fstype: ext4
```
Lets assume that changes have been made in the `aws-ebs-csi-driver` node driver.
**Node publish volume**
1. In the node CSI driver, the `NodePublishVolume` API invokes: `kata-runtime direct-volume add --volume-path "/kubelet/a/b/c/d/sdf" --mount-info "{\"Device\": \"/dev/sdf\", \"fstype\": \"ext4\"}"`.
2. The `Kata-runtime` writes the mount-info JSON to a file called `mountInfo.json` under `/run/kata-containers/shared/direct-volumes/kubelet/a/b/c/d/sdf`.
**Node unstage volume**
1. In the node CSI driver, the `NodeUnstageVolume` API invokes: `kata-runtime direct-volume remove --volume-path "/kubelet/a/b/c/d/sdf"`.
2. Kata-runtime deletes the directory `/run/kata-containers/shared/direct-volumes/kubelet/a/b/c/d/sdf`.
**Use the volume in sandbox**
1. Upon the request to start a container, the `containerd-shim-kata-v2` examines the container spec,
and iterates through the mounts. For each mount, if there is a `mountInfo.json` file under `/run/kata-containers/shared/direct-volumes/[mount source path]`,
it generates a `storage` GRPC object after overwriting the mount spec with the information in `mountInfo.json`.
2. The shim sends the storage objects to kata-agent through TTRPC.
3. The shim writes a file with the sandbox id as the name under `/run/kata-containers/shared/direct-volumes/[mount source path]`.
4. The kata-agent mounts the storage objects for the container.
**Node expand volume**
1. In the node CSI driver, the `NodeExpandVolume` API invokes: `kata-runtime direct-volume resize -volume-path "/kubelet/a/b/c/d/sdf" -size 8Gi`.
2. The Kata runtime checks whether there is a sandbox id file under the directory `/run/kata-containers/shared/direct-volumes/kubelet/a/b/c/d/sdf`.
3. The Kata runtime identifies the shim instance through the sandbox id, and sends a GRPC request to resize the volume.
4. The shim handles the request, asks the hypervisor to resize the block device and sends a GRPC request to Kata agent to resize the filesystem.
5. Kata agent receives the request and resizes the filesystem.
**Node get volume stats**
1. In the node CSI driver, the `NodeGetVolumeStats` API invokes: `kata-runtime direct-volume stats -volume-path "/kubelet/a/b/c/d/sdf"`.
2. The Kata runtime checks whether there is a sandbox id file under the directory `/run/kata-containers/shared/direct-volumes/kubelet/a/b/c/d/sdf`.
3. The Kata runtime identifies the shim instance through the sandbox id, and sends a GRPC request to get the volume stats.
4. The shim handles the request and forwards it to the Kata agent.
5. Kata agent receives the request and returns the filesystem stats.

View File

@@ -2,15 +2,24 @@
## Default number of virtual CPUs
Before starting a container, the [runtime][4] reads the `default_vcpus` option
from the [configuration file][5] to determine the number of virtual CPUs
Before starting a container, the [runtime][6] reads the `default_vcpus` option
from the [configuration file][7] to determine the number of virtual CPUs
(vCPUs) needed to start the virtual machine. By default, `default_vcpus` is
equal to 1 for fast boot time and a small memory footprint per virtual machine.
Be aware that increasing this value negatively impacts the virtual machine's
boot time and memory footprint.
In general, we recommend that you do not edit this variable, unless you know
what are you doing. If your container needs more than one vCPU, use
[Kubernetes `cpu` limits][1] to assign more resources.
[docker `--cpus`][1], [docker update][4], or [Kubernetes `cpu` limits][2] to
assign more resources.
*Docker*
```sh
$ docker run --name foo -ti --cpus 2 debian bash
$ docker update --cpus 4 foo
```
*Kubernetes*
@@ -40,7 +49,7 @@ $ sudo -E kubectl create -f ~/cpu-demo.yaml
## Virtual CPUs and Kubernetes pods
A Kubernetes pod is a group of one or more containers, with shared storage and
network, and a specification for how to run the containers [[specification][2]].
network, and a specification for how to run the containers [[specification][3]].
In Kata Containers this group of containers, which is called a sandbox, runs inside
the same virtual machine. If you do not specify a CPU constraint, the runtime does
not add more vCPUs and the container is not placed inside a CPU cgroup.
@@ -64,7 +73,13 @@ constraints with each container trying to consume 100% of vCPU, the resources
divide in two parts, 50% of vCPU for each container because your virtual
machine does not have enough resources to satisfy containers needs. If you want
to give access to a greater or lesser portion of vCPUs to a specific container,
use [Kubernetes `cpu` requests][1].
use [`docker --cpu-shares`][1] or [Kubernetes `cpu` requests][2].
*Docker*
```sh
$ docker run -ti --cpus-shares=512 debian bash
```
*Kubernetes*
@@ -94,9 +109,10 @@ $ sudo -E kubectl create -f ~/cpu-demo.yaml
Before running containers without CPU constraint, consider that your containers
are not running alone. Since your containers run inside a virtual machine other
processes use the vCPUs as well (e.g. `systemd` and the Kata Containers
[agent][3]). In general, we recommend setting `default_vcpus` equal to 1 to
[agent][5]). In general, we recommend setting `default_vcpus` equal to 1 to
allow non-container processes to run on this vCPU and to specify a CPU
constraint for each container.
constraint for each container. If your container is already running and needs
more vCPUs, you can add more using [docker update][4].
## Container with CPU constraint
@@ -105,7 +121,7 @@ constraints using the following formula: `vCPUs = ceiling( quota / period )`, wh
`quota` specifies the number of microseconds per CPU Period that the container is
guaranteed CPU access and `period` specifies the CPU CFS scheduler period of time
in microseconds. The result determines the number of vCPU to hot plug into the
virtual machine. Once the vCPUs have been added, the [agent][3] places the
virtual machine. Once the vCPUs have been added, the [agent][5] places the
container inside a CPU cgroup. This placement allows the container to use only
its assigned resources.
@@ -122,6 +138,25 @@ the virtual machine starts with 8 vCPUs and 1 vCPUs is added and assigned
to the container. Non-container processes might be able to use 8 vCPUs but they
use a maximum 1 vCPU, hence 7 vCPUs might not be used.
*Container without CPU constraint*
```sh
$ docker run -ti debian bash -c "nproc; cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_*"
1 # number of vCPUs
100000 # cfs period
-1 # cfs quota
```
*Container with CPU constraint*
```sh
docker run --cpus 4 -ti debian bash -c "nproc; cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_*"
5 # number of vCPUs
100000 # cfs period
400000 # cfs quota
```
## Virtual CPU handling without hotplug
In some cases, the hardware and/or software architecture being utilized does not support
@@ -148,8 +183,11 @@ the container's `spec` will provide the sizing information directly. If these ar
calculate the number of CPUs required for the workload and augment this by `default_vcpus`
configuration option, and use this for the virtual machine size.
[1]: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource
[2]: https://kubernetes.io/docs/concepts/workloads/pods/pod/
[3]: ../../src/agent
[4]: ../../src/runtime
[5]: ../../src/runtime/README.md#configuration
[1]: https://docs.docker.com/config/containers/resource_constraints/#cpu
[2]: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource
[3]: https://kubernetes.io/docs/concepts/workloads/pods/pod/
[4]: https://docs.docker.com/engine/reference/commandline/update/
[5]: ../../src/agent
[6]: ../../src/runtime
[7]: ../../src/runtime/README.md#configuration

View File

@@ -39,7 +39,7 @@ Details of each solution and a summary are provided below.
Kata Containers with QEMU has complete compatibility with Kubernetes.
Depending on the host architecture, Kata Containers supports various machine types,
for example `q35` on x86 systems, `virt` on ARM systems and `pseries` on IBM Power systems. The default Kata Containers
for example `pc` and `q35` on x86 systems, `virt` on ARM systems and `pseries` on IBM Power systems. The default Kata Containers
machine type is `q35`. The machine type and its [`Machine accelerators`](#machine-accelerators) can
be changed by editing the runtime [`configuration`](architecture/README.md#configuration) file.
@@ -60,8 +60,9 @@ Machine accelerators are architecture specific and can be used to improve the pe
and enable specific features of the machine types. The following machine accelerators
are used in Kata Containers:
- NVDIMM: This machine accelerator is x86 specific and only supported by `q35` machine types.
`nvdimm` is used to provide the root filesystem as a persistent memory device to the Virtual Machine.
- NVDIMM: This machine accelerator is x86 specific and only supported by `pc` and
`q35` machine types. `nvdimm` is used to provide the root filesystem as a persistent
memory device to the Virtual Machine.
#### Hotplug devices

View File

@@ -15,11 +15,6 @@
- `qemu`
- `cloud-hypervisor`
- `firecracker`
In the case of `firecracker` the use of a block device `snapshotter` is needed
for the VM rootfs. Refer to the following guide for additional configuration
steps:
- [Setup Kata containers with `firecracker`](how-to-use-kata-containers-with-firecracker.md)
- `ACRN`
While `qemu` , `cloud-hypervisor` and `firecracker` work out of the box with installation of Kata,

View File

@@ -72,6 +72,7 @@ $ command -v containerd
### Install CNI plugins
> **Note:** You do not need to install CNI plugins if you do not want to use containerd with Kubernetes.
> If you have installed Kubernetes with `kubeadm`, you might have already installed the CNI plugins.
You can manually install CNI plugins as follows:
@@ -93,8 +94,8 @@ $ popd
You can install the `cri-tools` from source code:
```bash
$ go get github.com/kubernetes-sigs/cri-tools
$ pushd $GOPATH/src/github.com/kubernetes-sigs/cri-tools
$ go get github.com/kubernetes-incubator/cri-tools
$ pushd $GOPATH/src/github.com/kubernetes-incubator/cri-tools
$ make
$ sudo -E make install
$ popd
@@ -130,42 +131,74 @@ For
The `RuntimeClass` is suggested.
The following configuration includes two runtime classes:
The following configuration includes three runtime classes:
- `plugins.cri.containerd.runtimes.runc`: the runc, and it is the default runtime.
- `plugins.cri.containerd.runtimes.kata`: The function in containerd (reference [the document here](https://github.com/containerd/containerd/tree/master/runtime/v2#binary-naming))
where the dot-connected string `io.containerd.kata.v2` is translated to `containerd-shim-kata-v2` (i.e. the
binary name of the Kata implementation of [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2)).
- `plugins.cri.containerd.runtimes.katacli`: the `containerd-shim-runc-v1` calls `kata-runtime`, which is the legacy process.
```toml
[plugins.cri.containerd]
no_pivot = false
[plugins.cri.containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
privileged_without_host_devices = false
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
[plugins.cri.containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v1"
[plugins.cri.containerd.runtimes.runc.options]
NoPivotRoot = false
NoNewKeyring = false
ShimCgroup = ""
IoUid = 0
IoGid = 0
BinaryName = "runc"
Root = ""
CriuPath = ""
SystemdCgroup = false
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
privileged_without_host_devices = true
pod_annotations = ["io.katacontainers.*"]
container_annotations = ["io.katacontainers.*"]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata.options]
ConfigPath = "/opt/kata/share/defaults/kata-containers/configuration.toml"
[plugins.cri.containerd.runtimes.katacli]
runtime_type = "io.containerd.runc.v1"
[plugins.cri.containerd.runtimes.katacli.options]
NoPivotRoot = false
NoNewKeyring = false
ShimCgroup = ""
IoUid = 0
IoGid = 0
BinaryName = "/usr/bin/kata-runtime"
Root = ""
CriuPath = ""
SystemdCgroup = false
```
From Containerd v1.2.4 and Kata v1.6.0, there is a new runtime option supported, which allows you to specify a specific Kata configuration file as follows:
```toml
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
privileged_without_host_devices = true
[plugins.cri.containerd.runtimes.kata.options]
ConfigPath = "/etc/kata-containers/config.toml"
```
`privileged_without_host_devices` tells containerd that a privileged Kata container should not have direct access to all host devices. If unset, containerd will pass all host devices to Kata container, which may cause security issues.
`pod_annotations` is the list of pod annotations passed to both the pod sandbox as well as container through the OCI config.
`container_annotations` is the list of container annotations passed through to the OCI config of the containers.
This `ConfigPath` option is optional. If you do not specify it, shimv2 first tries to get the configuration file from the environment variable `KATA_CONF_FILE`. If neither are set, shimv2 will use the default Kata configuration file paths (`/etc/kata-containers/configuration.toml` and `/usr/share/defaults/kata-containers/configuration.toml`).
If you use Containerd older than v1.2.4 or a version of Kata older than v1.6.0 and also want to specify a configuration file, you can use the following workaround, since the shimv2 accepts an environment variable, `KATA_CONF_FILE` for the configuration file path. Then, you can create a
shell script with the following:
```bash
#!/usr/bin/env bash
KATA_CONF_FILE=/etc/kata-containers/firecracker.toml containerd-shim-kata-v2 $@
```
Name it as `/usr/local/bin/containerd-shim-katafc-v2` and reference it in the configuration of containerd:
```toml
[plugins.cri.containerd.runtimes.kata-firecracker]
runtime_type = "io.containerd.katafc.v2"
```
#### Kata Containers as the runtime for untrusted workload
For cases without `RuntimeClass` support, we can use the legacy annotation method to support using Kata Containers
@@ -185,8 +218,28 @@ and then, run an untrusted workload with Kata Containers:
runtime_type = "io.containerd.kata.v2"
```
For the earlier versions of Kata Containers and containerd that do not support Runtime V2 (Shim API), you can use the following alternative configuration:
```toml
[plugins.cri.containerd]
# "plugins.cri.containerd.default_runtime" is the runtime to use in containerd.
[plugins.cri.containerd.default_runtime]
# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
runtime_type = "io.containerd.runtime.v1.linux"
# "plugins.cri.containerd.untrusted_workload_runtime" is a runtime to run untrusted workloads on it.
[plugins.cri.containerd.untrusted_workload_runtime]
# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
runtime_type = "io.containerd.runtime.v1.linux"
# runtime_engine is the name of the runtime engine used by containerd.
runtime_engine = "/usr/bin/kata-runtime"
```
You can find more information on the [Containerd config documentation](https://github.com/containerd/cri/blob/master/docs/config.md)
#### Kata Containers as the default runtime
If you want to set Kata Containers as the only runtime in the deployment, you can simply configure as follows:
@@ -197,6 +250,15 @@ If you want to set Kata Containers as the only runtime in the deployment, you ca
runtime_type = "io.containerd.kata.v2"
```
Alternatively, for the earlier versions of Kata Containers and containerd that do not support Runtime V2 (Shim API), you can use the following alternative configuration:
```toml
[plugins.cri.containerd]
[plugins.cri.containerd.default_runtime]
runtime_type = "io.containerd.runtime.v1.linux"
runtime_engine = "/usr/bin/kata-runtime"
```
### Configuration for `cri-tools`
> **Note:** If you skipped the [Install `cri-tools`](#install-cri-tools) section, you can skip this section too.
@@ -250,12 +312,10 @@ To run a container with Kata Containers through the containerd command line, you
```bash
$ sudo ctr image pull docker.io/library/busybox:latest
$ sudo ctr run --cni --runtime io.containerd.run.kata.v2 -t --rm docker.io/library/busybox:latest hello sh
$ sudo ctr run --runtime io.containerd.run.kata.v2 -t --rm docker.io/library/busybox:latest hello sh
```
This launches a BusyBox container named `hello`, and it will be removed by `--rm` after it quits.
The `--cni` flag enables CNI networking for the container. Without this flag, a container with just a
loopback interface is created.
### Launch Pods with `crictl` command line

View File

@@ -68,7 +68,7 @@ the Kata logs import to the EFK stack.
> stack they are able to utilise in order to modify and test as necessary.
Minikube by default
[configures](https://github.com/kubernetes/minikube/blob/master/deploy/iso/minikube-iso/board/minikube/x86_64/rootfs-overlay/etc/systemd/journald.conf)
[configures](https://github.com/kubernetes/minikube/blob/master/deploy/iso/minikube-iso/board/coreos/minikube/rootfs-overlay/etc/systemd/journald.conf)
the `systemd-journald` with the
[`Storage=volatile`](https://www.freedesktop.org/software/systemd/man/journald.conf.html) option,
which results in the journal being stored in `/run/log/journal`. Unfortunately, the Minikube EFK
@@ -163,7 +163,7 @@ sub-filter on, for instance, the `SYSLOG_IDENTIFIER` to differentiate the Kata c
on the `PRIORITY` to filter out critical issues etc.
Kata generates a significant amount of Kata specific information, which can be seen as
[`logfmt`](../../src/tools/log-parser/README.md#logfile-requirements).
[`logfmt`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser#logfile-requirements).
data contained in the `MESSAGE` field. Imported as-is, there is no easy way to filter on that data
in Kibana:

View File

@@ -48,9 +48,9 @@ Running Docker containers Kata Containers requires care because `VOLUME`s specif
kataShared on / type virtiofs (rw,relatime,dax)
```
`kataShared` mount types are powered by [`virtio-fs`](https://virtio-fs.gitlab.io/), a marked improvement over `virtio-9p`, thanks to [PR #1016](https://github.com/kata-containers/runtime/pull/1016). While `virtio-fs` is normally an excellent choice, in the case of DinD workloads `virtio-fs` causes an issue -- [it *cannot* be used as a "upper layer" of `overlayfs` without a custom patch](http://lists.katacontainers.io/pipermail/kata-dev/2020-January/001216.html).
`kataShared` mount types are powered by [`virtio-fs`][virtio-fs], a marked improvement over `virtio-9p`, thanks to [PR #1016](https://github.com/kata-containers/runtime/pull/1016). While `virtio-fs` is normally an excellent choice, in the case of DinD workloads `virtio-fs` causes an issue -- [it *cannot* be used as a "upper layer" of `overlayfs` without a custom patch](http://lists.katacontainers.io/pipermail/kata-dev/2020-January/001216.html).
As `/var/lib/docker` is a `VOLUME` specified by DinD (i.e. the `docker` images tagged `*-dind`/`*-dind-rootless`), `docker` will fail to start (or even worse, silently pick a worse storage driver like `vfs`) when started in a Kata Container. Special measures must be taken when running DinD-powered workloads in Kata Containers.
As `/var/lib/docker` is a `VOLUME` specified by DinD (i.e. the `docker` images tagged `*-dind`/`*-dind-rootless`), `docker` fill fail to start (or even worse, silently pick a worse storage driver like `vfs`) when started in a Kata Container. Special measures must be taken when running DinD-powered workloads in Kata Containers.
## Workarounds/Solutions
@@ -58,7 +58,7 @@ Thanks to various community contributions (see [issue references below](#referen
### Use a memory backed volume
For small workloads (small container images, without much generated filesystem load), a memory-backed volume is sufficient. Kubernetes supports a variant of [the `EmptyDir` volume](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir), which allows for memdisk-backed storage -- the the `medium: Memory`. An example of a `Pod` using such a setup [was contributed](https://github.com/kata-containers/runtime/issues/1429#issuecomment-477385283), and is reproduced below:
For small workloads (small container images, without much generated filesystem load), a memory-backed volume is sufficient. Kubernetes supports a variant of [the `EmptyDir` volume][k8s-emptydir], which allows for memdisk-backed storage -- the [the `medium: Memory` ][k8s-memory-volume-type]. An example of a `Pod` using such a setup [was contributed](https://github.com/kata-containers/runtime/issues/1429#issuecomment-477385283), and is reproduced below:
```yaml
apiVersion: v1

View File

@@ -91,7 +91,6 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.virtio_fs_daemon` | string | virtio-fs `vhost-user` daemon path |
| `io.katacontainers.config.hypervisor.virtio_fs_extra_args` | string | extra options passed to `virtiofs` daemon |
| `io.katacontainers.config.hypervisor.enable_guest_swap` | `boolean` | enable swap in the guest |
| `io.katacontainers.config.hypervisor.use_legacy_serial` | `boolean` | uses legacy serial device for guest's console (QEMU) |
## Container Options
| Key | Value Type | Comments |
@@ -173,7 +172,7 @@ kind: Pod
metadata:
name: pod2
annotations:
io.katacontainers.config.runtime.disable_guest_seccomp: "false"
io.katacontainers.config.runtime.disable_guest_seccomp: false
spec:
runtimeClassName: kata
containers:

View File

@@ -101,7 +101,7 @@ Start an ACRN based Kata Container,
$ sudo docker run -ti --runtime=kata-runtime busybox sh
```
You will see ACRN(`acrn-dm`) is now running on your system, as well as a `kata-shim`. You should obtain an interactive shell prompt. Verify that all the Kata processes terminate once you exit the container.
You will see ACRN(`acrn-dm`) is now running on your system, as well as a `kata-shim`, `kata-proxy`. You should obtain an interactive shell prompt. Verify that all the Kata processes terminate once you exit the container.
```bash
$ ps -ef | grep -E "kata|acrn"

View File

@@ -1,254 +0,0 @@
# Configure Kata Containers to use Firecracker
This document provides an overview on how to run Kata Containers with the AWS Firecracker hypervisor.
## Introduction
AWS Firecracker is an open source virtualization technology that is purpose-built for creating and managing secure, multi-tenant container and function-based services that provide serverless operational models. AWS Firecracker runs workloads in lightweight virtual machines, called `microVMs`, which combine the security and isolation properties provided by hardware virtualization technology with the speed and flexibility of Containers.
Please refer to AWS Firecracker [documentation](https://github.com/firecracker-microvm/firecracker/blob/main/docs/getting-started.md) for more details.
## Pre-requisites
This document requires the presence of Kata Containers on your system. Install using the instructions available through the following links:
- Kata Containers [automated installation](../install/README.md)
- Kata Containers manual installation: Automated installation does not seem to be supported for Clear Linux, so please use [manual installation](../Developer-Guide.md) steps.
> **Note:** Create rootfs image and not initrd image.
## Install AWS Firecracker
Kata Containers only support AWS Firecracker v0.23.4 ([yet](https://github.com/kata-containers/kata-containers/pull/1519)).
To install Firecracker we need to get the `firecracker` and `jailer` binaries:
```bash
$ release_url="https://github.com/firecracker-microvm/firecracker/releases"
$ version="v0.23.1"
$ arch=`uname -m`
$ curl ${release_url}/download/${version}/firecracker-${version}-${arch} -o firecracker
$ curl ${release_url}/download/${version}/jailer-${version}-${arch} -o jailer
$ chmod +x jailer firecracker
```
To make the binaries available from the default system `PATH` it is recommended to move them to `/usr/local/bin` or add a symbolic link:
```bash
$ sudo ln -s $(pwd)/firecracker /usr/local/bin
$ sudo ln -s $(pwd)/jailer /usr/local/bin
```
More details can be found in [AWS Firecracker docs](https://github.com/firecracker-microvm/firecracker/blob/main/docs/getting-started.md)
In order to run Kata with AWS Firecracker a block device as the backing store for a VM is required. To interact with `containerd` and Kata we use the `devmapper` `snapshotter`.
## Configure `devmapper`
To check support for your `containerd` installation, you can run:
```
$ ctr plugins ls |grep devmapper
```
if the output of the above command is:
```
io.containerd.snapshotter.v1 devmapper linux/amd64 ok
```
then you can skip this section and move on to `Configure Kata Containers with AWS Firecracker`
If the output of the above command is:
```
io.containerd.snapshotter.v1 devmapper linux/amd64 error
```
then we need to setup `devmapper` `snapshotter`. Based on a [very useful
guide](https://docs.docker.com/storage/storagedriver/device-mapper-driver/)
from docker, we can set it up using the following scripts:
> **Note:** The following scripts assume a 100G sparse file for storing container images, a 10G sparse file for the thin-provisioning pool and 10G base image files for any sandboxed container created. This means that we will need at least 10GB free space.
```
#!/bin/bash
set -ex
DATA_DIR=/var/lib/containerd/devmapper
POOL_NAME=devpool
mkdir -p ${DATA_DIR}
# Create data file
sudo touch "${DATA_DIR}/data"
sudo truncate -s 100G "${DATA_DIR}/data"
# Create metadata file
sudo touch "${DATA_DIR}/meta"
sudo truncate -s 10G "${DATA_DIR}/meta"
# Allocate loop devices
DATA_DEV=$(sudo losetup --find --show "${DATA_DIR}/data")
META_DEV=$(sudo losetup --find --show "${DATA_DIR}/meta")
# Define thin-pool parameters.
# See https://www.kernel.org/doc/Documentation/device-mapper/thin-provisioning.txt for details.
SECTOR_SIZE=512
DATA_SIZE="$(sudo blockdev --getsize64 -q ${DATA_DEV})"
LENGTH_IN_SECTORS=$(bc <<< "${DATA_SIZE}/${SECTOR_SIZE}")
DATA_BLOCK_SIZE=128
LOW_WATER_MARK=32768
# Create a thin-pool device
sudo dmsetup create "${POOL_NAME}" \
--table "0 ${LENGTH_IN_SECTORS} thin-pool ${META_DEV} ${DATA_DEV} ${DATA_BLOCK_SIZE} ${LOW_WATER_MARK}"
cat << EOF
#
# Add this to your config.toml configuration file and restart `containerd` daemon
#
[plugins]
[plugins.devmapper]
pool_name = "${POOL_NAME}"
root_path = "${DATA_DIR}"
base_image_size = "10GB"
discard_blocks = true
EOF
```
Make it executable and run it:
```bash
$ sudo chmod +x ~/scripts/devmapper/create.sh
$ cd ~/scripts/devmapper/
$ sudo ./create.sh
```
Now, we can add the `devmapper` configuration provided from the script to `/etc/containerd/config.toml`.
> **Note:** If you are using the default `containerd` configuration (`containerd config default >> /etc/containerd/config.toml`), you may need to edit the existing `[plugins."io.containerd.snapshotter.v1.devmapper"]`configuration.
Save and restart `containerd`:
```bash
$ sudo systemctl restart containerd
```
We can use `dmsetup` to verify that the thin-pool was created successfully.
```bash
$ sudo dmsetup ls
```
We should also check that `devmapper` is registered and running:
```bash
$ sudo ctr plugins ls | grep devmapper
```
This script needs to be run only once, while setting up the `devmapper` `snapshotter` for `containerd`. Afterwards, make sure that on each reboot, the thin-pool is initialized from the same data directory. Otherwise, all the fetched containers (or the ones that you have created) will be re-initialized. A simple script that re-creates the thin-pool from the same data directory is shown below:
```
#!/bin/bash
set -ex
DATA_DIR=/var/lib/containerd/devmapper
POOL_NAME=devpool
# Allocate loop devices
DATA_DEV=$(sudo losetup --find --show "${DATA_DIR}/data")
META_DEV=$(sudo losetup --find --show "${DATA_DIR}/meta")
# Define thin-pool parameters.
# See https://www.kernel.org/doc/Documentation/device-mapper/thin-provisioning.txt for details.
SECTOR_SIZE=512
DATA_SIZE="$(sudo blockdev --getsize64 -q ${DATA_DEV})"
LENGTH_IN_SECTORS=$(bc <<< "${DATA_SIZE}/${SECTOR_SIZE}")
DATA_BLOCK_SIZE=128
LOW_WATER_MARK=32768
# Create a thin-pool device
sudo dmsetup create "${POOL_NAME}" \
--table "0 ${LENGTH_IN_SECTORS} thin-pool ${META_DEV} ${DATA_DEV} ${DATA_BLOCK_SIZE} ${LOW_WATER_MARK}"
```
We can create a systemd service to run the above script on each reboot:
```bash
$ sudo nano /lib/systemd/system/devmapper_reload.service
```
The service file:
```
[Unit]
Description=Devmapper reload script
[Service]
ExecStart=/path/to/script/reload.sh
[Install]
WantedBy=multi-user.target
```
Enable the newly created service:
```bash
$ sudo systemctl daemon-reload
$ sudo systemctl enable devmapper_reload.service
$ sudo systemctl start devmapper_reload.service
```
## Configure Kata Containers with AWS Firecracker
To configure Kata Containers with AWS Firecracker, copy the generated `configuration-fc.toml` file when building the `kata-runtime` to either `/etc/kata-containers/configuration-fc.toml` or `/usr/share/defaults/kata-containers/configuration-fc.toml`.
The following command shows full paths to the `configuration.toml` files that the runtime loads. It will use the first path that exists. (Please make sure the kernel and image paths are set correctly in the `configuration.toml` file)
```bash
$ sudo kata-runtime --show-default-config-paths
```
## Configure `containerd`
Next, we need to configure containerd. Add a file in your path (e.g. `/usr/local/bin/containerd-shim-kata-fc-v2`) with the following contents:
```
#!/bin/bash
KATA_CONF_FILE=/etc/containers/configuration-fc.toml /usr/local/bin/containerd-shim-kata-v2 $@
```
> **Note:** You may need to edit the paths of the configuration file and the `containerd-shim-kata-v2` to correspond to your setup.
Make it executable:
```bash
$ sudo chmod +x /usr/local/bin/containerd-shim-kata-fc-v2
```
Add the relevant section in `containerd`s `config.toml` file (`/etc/containerd/config.toml`):
```
[plugins.cri.containerd.runtimes]
[plugins.cri.containerd.runtimes.kata-fc]
runtime_type = "io.containerd.kata-fc.v2"
```
> **Note:** If you are using the default `containerd` configuration (`containerd config default >> /etc/containerd/config.toml`),
> the configuration should change to :
```
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-fc]
runtime_type = "io.containerd.kata-fc.v2"
```
Restart `containerd`:
```bash
$ sudo systemctl restart containerd
```
## Verify the installation
We are now ready to launch a container using Kata with Firecracker to verify that everything worked:
```bash
$ sudo ctr images pull --snapshotter devmapper docker.io/library/ubuntu:latest
$ sudo ctr run --snapshotter devmapper --runtime io.containerd.run.kata-fc.v2 -t --rm docker.io/library/ubuntu
```

View File

@@ -31,7 +31,7 @@ See below example config:
[plugins.cri]
[plugins.cri.containerd]
[plugins.cri.containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
runtime_type = "io.containerd.runc.v1"
privileged_without_host_devices = false
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"

View File

@@ -99,18 +99,7 @@ $ sudo systemctl restart kubelet
$ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /var/run/crio/crio.sock --pod-network-cidr=10.244.0.0/16
# If using containerd
$ cat <<EOF | tee kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
criSocket: "/run/containerd/containerd.sock"
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: cgroupfs
podCIDR: "10.244.0.0/16"
EOF
$ sudo kubeadm init --ignore-preflight-errors=all --config kubeadm-config.yaml
$ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /run/containerd/containerd.sock --pod-network-cidr=10.244.0.0/16
$ export KUBECONFIG=/etc/kubernetes/admin.conf
```

View File

@@ -81,7 +81,7 @@
- Download the standard `systemd(1)` service file and install to
`/etc/systemd/system/`:
- https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
- https://raw.githubusercontent.com/containerd/containerd/master/containerd.service
> **Notes:**
>

View File

@@ -3,4 +3,4 @@
Kata Containers supports passing certain GPUs from the host into the container. Select the GPU vendor for detailed information:
- [Intel](Intel-GPU-passthrough-and-Kata.md)
- [NVIDIA](NVIDIA-GPU-passthrough-and-Kata.md)
- [Nvidia](Nvidia-GPU-passthrough-and-Kata.md)

View File

@@ -1,592 +0,0 @@
# Using NVIDIA GPU device with Kata Containers
An NVIDIA GPU device can be passed to a Kata Containers container using GPU
passthrough (NVIDIA GPU pass-through mode) as well as GPU mediated passthrough
(NVIDIA `vGPU` mode).
NVIDIA GPU pass-through mode, an entire physical GPU is directly assigned to one
VM, bypassing the NVIDIA Virtual GPU Manager. In this mode of operation, the GPU
is accessed exclusively by the NVIDIA driver running in the VM to which it is
assigned. The GPU is not shared among VMs.
NVIDIA Virtual GPU (`vGPU`) enables multiple virtual machines (VMs) to have
simultaneous, direct access to a single physical GPU, using the same NVIDIA
graphics drivers that are deployed on non-virtualized operating systems. By
doing this, NVIDIA `vGPU` provides VMs with unparalleled graphics performance,
compute performance, and application compatibility, together with the
cost-effectiveness and scalability brought about by sharing a GPU among multiple
workloads. A `vGPU` can be either time-sliced or Multi-Instance GPU (MIG)-backed
with [MIG-slices](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/).
| Technology | Description | Behavior | Detail |
| --- | --- | --- | --- |
| NVIDIA GPU pass-through mode | GPU passthrough | Physical GPU assigned to a single VM | Direct GPU assignment to VM without limitation |
| NVIDIA vGPU time-sliced | GPU time-sliced | Physical GPU time-sliced for multiple VMs | Mediated passthrough |
| NVIDIA vGPU MIG-backed | GPU with MIG-slices | Physical GPU MIG-sliced for multiple VMs | Mediated passthrough |
## Hardware Requirements
NVIDIA GPUs Recommended for Virtualization:
- NVIDIA Tesla (T4, M10, P6, V100 or newer)
- NVIDIA Quadro RTX 6000/8000
## Host BIOS Requirements
Some hardware requires a larger PCI BARs window, for example, NVIDIA Tesla P100,
K40m
```sh
$ lspci -s d0:00.0 -vv | grep Region
Region 0: Memory at e7000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 222800000000 (64-bit, prefetchable) [size=32G] # Above 4G
Region 3: Memory at 223810000000 (64-bit, prefetchable) [size=32M]
```
For large BARs devices, MMIO mapping above 4G address space should be `enabled`
in the PCI configuration of the BIOS.
Some hardware vendors use a different name in BIOS, such as:
- Above 4G Decoding
- Memory Hole for PCI MMIO
- Memory Mapped I/O above 4GB
If one is using a GPU based on the Ampere architecture and later additionally
SR-IOV needs to be enabled for the `vGPU` use-case.
The following steps outline the workflow for using an NVIDIA GPU with Kata.
## Host Kernel Requirements
The following configurations need to be enabled on your host kernel:
- `CONFIG_VFIO`
- `CONFIG_VFIO_IOMMU_TYPE1`
- `CONFIG_VFIO_MDEV`
- `CONFIG_VFIO_MDEV_DEVICE`
- `CONFIG_VFIO_PCI`
Your host kernel needs to be booted with `intel_iommu=on` on the kernel command
line.
## Install and configure Kata Containers
To use non-large BARs devices (for example, NVIDIA Tesla T4), you need Kata
version 1.3.0 or above. Follow the [Kata Containers setup
instructions](../install/README.md) to install the latest version of Kata.
To use large BARs devices (for example, NVIDIA Tesla P100), you need Kata
version 1.11.0 or above.
The following configuration in the Kata `configuration.toml` file as shown below
can work:
Hotplug for PCI devices with small BARs by `acpi_pcihp` (Linux's ACPI PCI
Hotplug driver):
```sh
machine_type = "q35"
hotplug_vfio_on_root_bus = false
```
Hotplug for PCIe devices with large BARs by `pciehp` (Linux's PCIe Hotplug
driver):
```sh
machine_type = "q35"
hotplug_vfio_on_root_bus = true
pcie_root_port = 1
```
## Build Kata Containers kernel with GPU support
The default guest kernel installed with Kata Containers does not provide GPU
support. To use an NVIDIA GPU with Kata Containers, you need to build a kernel
with the necessary GPU support.
The following kernel config options need to be enabled:
```sh
# Support PCI/PCIe device hotplug (Required for large BARs device)
CONFIG_HOTPLUG_PCI_PCIE=y
# Support for loading modules (Required for load NVIDIA drivers)
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# Enable the MMIO access method for PCIe devices (Required for large BARs device)
CONFIG_PCI_MMCONFIG=y
```
The following kernel config options need to be disabled:
```sh
# Disable Open Source NVIDIA driver nouveau
# It conflicts with NVIDIA official driver
CONFIG_DRM_NOUVEAU=n
```
> **Note**: `CONFIG_DRM_NOUVEAU` is normally disabled by default.
It is worth checking that it is not enabled in your kernel configuration to
prevent any conflicts.
Build the Kata Containers kernel with the previous config options, using the
instructions described in [Building Kata Containers
kernel](../../tools/packaging/kernel). For further details on building and
installing guest kernels, see [the developer
guide](../Developer-Guide.md#install-guest-kernel-images).
There is an easy way to build a guest kernel that supports NVIDIA GPU:
```sh
## Build guest kernel with ../../tools/packaging/kernel
# Prepare (download guest kernel source, generate .config)
$ ./build-kernel.sh -v 5.15.23 -g nvidia -f setup
# Build guest kernel
$ ./build-kernel.sh -v 5.15.23 -g nvidia build
# Install guest kernel
$ sudo -E ./build-kernel.sh -v 5.15.23 -g nvidia install
```
To build NVIDIA Driver in Kata container, `linux-headers` are required.
This is a way to generate deb packages for `linux-headers`:
> **Note**:
> Run `make rpm-pkg` to build the rpm package.
> Run `make deb-pkg` to build the deb package.
>
```sh
$ cd kata-linux-5.15.23-89
$ make deb-pkg
```
Before using the new guest kernel, please update the `kernel` parameters in
`configuration.toml`.
```sh
kernel = "/usr/share/kata-containers/vmlinuz-nvidia-gpu.container"
```
## NVIDIA GPU pass-through mode with Kata Containers
Use the following steps to pass an NVIDIA GPU device in pass-through mode with Kata:
1. Find the Bus-Device-Function (BDF) for the GPU device on the host:
```sh
$ sudo lspci -nn -D | grep -i nvidia
0000:d0:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:20b9] (rev a1)
```
> PCI address `0000:d0:00.0` is assigned to the hardware GPU device.
> `10de:20b9` is the device ID of the hardware GPU device.
2. Find the IOMMU group for the GPU device:
```sh
$ BDF="0000:d0:00.0"
$ readlink -e /sys/bus/pci/devices/$BDF/iommu_group
```
The previous output shows that the GPU belongs to IOMMU group 192. The next
step is to bind the GPU to the VFIO-PCI driver.
```sh
$ BDF="0000:d0:00.0"
$ DEV="/sys/bus/pci/devices/$BDF"
$ echo "vfio-pci" > $DEV/driver_override
$ echo $BDF > $DEV/driver/unbind
$ echo $BDF > /sys/bus/pci/drivers_probe
# To return the device to the standard driver, we simply clear the
# driver_override and reprobe the device, ex:
$ echo > $DEV/preferred_driver
$ echo $BDF > $DEV/driver/unbind
$ echo $BDF > /sys/bus/pci/drivers_probe
```
3. Check the IOMMU group number under `/dev/vfio`:
```sh
$ ls -l /dev/vfio
total 0
crw------- 1 zvonkok zvonkok 243, 0 Mar 18 03:06 192
crw-rw-rw- 1 root root 10, 196 Mar 18 02:27 vfio
```
4. Start a Kata container with the GPU device:
```sh
# You may need to `modprobe vhost-vsock` if you get
# host system doesn't support vsock: stat /dev/vhost-vsock
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch uname -r
```
5. Run `lspci` within the container to verify the GPU device is seen in the list
of the PCI devices. Note the vendor-device id of the GPU (`10de:20b9`) in the `lspci` output.
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch sh -c "lspci -nn | grep '10de:20b9'"
```
6. Additionally, you can check the PCI BARs space of the NVIDIA GPU device in the container:
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch sh -c "lspci -s 02:00.0 -vv | grep Region"
```
> **Note**: If you see a message similar to the above, the BAR space of the NVIDIA
> GPU has been successfully allocated.
## NVIDIA vGPU mode with Kata Containers
NVIDIA vGPU is a licensed product on all supported GPU boards. A software license
is required to enable all vGPU features within the guest VM. NVIDIA vGPU manager
needs to be installed on the host to configure GPUs in vGPU mode. See [NVIDIA Virtual GPU Software Documentation v14.0 through 14.1](https://docs.nvidia.com/grid/14.0/) for more details.
### NVIDIA vGPU time-sliced
In the time-sliced mode, the GPU is not partitioned and the workload uses the
whole GPU and shares access to the GPU engines. Processes are scheduled in
series. The best effort scheduler is the default one and can be exchanged by
other scheduling policies see the documentation above how to do that.
Beware if you had `MIG` enabled before to disable `MIG` on the GPU if you want
to use `time-sliced` `vGPU`.
```sh
$ sudo nvidia-smi -mig 0
```
Enable the virtual functions for the physical GPU in the `sysfs` file system.
```sh
$ sudo /usr/lib/nvidia/sriov-manage -e 0000:41:00.0
```
Get the `BDF` of the available virtual function on the GPU, and choose one for the
following steps.
```sh
$ cd /sys/bus/pci/devices/0000:41:00.0/
$ ls -l | grep virtfn
```
#### List all available vGPU instances
The following shell snippet will walk the `sysfs` and only print instances
that are available, that can be created.
```sh
# The 00.0 is often the PF of the device the VFs will have the funciont in the
# BDF incremented by some values so e.g. the very first VF is 0000:41:00.4
cd /sys/bus/pci/devices/0000:41:00.0/
for vf in $(ls -d virtfn*)
do
BDF=$(basename $(readlink -f $vf))
for md in $(ls -d $vf/mdev_supported_types/*)
do
AVAIL=$(cat $md/available_instances)
NAME=$(cat $md/name)
DIR=$(basename $md)
if [ $AVAIL -gt 0 ]; then
echo "| BDF | INSTANCES | NAME | DIR |"
echo "+--------------+-----------+----------------+------------+"
printf "| %12s |%10d |%15s | %10s |\n\n" "$BDF" "$AVAIL" "$NAME" "$DIR"
fi
done
done
```
If there are available instances you get something like this (for the first VF),
beware that the output is highly dependent on the GPU you have, if there is no
output check again if `MIG` is really disabled.
```sh
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-4C | nvidia-692 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-8C | nvidia-693 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-10C | nvidia-694 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-16C | nvidia-695 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-20C | nvidia-696 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-40C | nvidia-697 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-80C | nvidia-698 |
```
Change to the `mdev_supported_types` directory for the virtual function on which
you want to create the `vGPU`. Taking the first output as an example:
```sh
$ cd virtfn0/mdev_supported_types/nvidia-692
$ UUIDGEN=$(uuidgen)
$ sudo bash -c "echo $UUIDGEN > create"
```
Confirm that the `vGPU` was created. You should see the `UUID` pointing to a
subdirectory of the `sysfs` space.
```sh
$ ls -l /sys/bus/mdev/devices/
```
Get the `IOMMU` group number and verify there is a `VFIO` device created to use
with Kata.
```sh
$ ls -l /sys/bus/mdev/devices/*/
$ ls -l /dev/vfio
```
Use the `VFIO` device created in the same way as in the pass-through use-case.
Beware that the guest needs the NVIDIA guest drivers, so one would need to build
a new guest `OS` image.
### NVIDIA vGPU MIG-backed
We're not going into detail what `MIG` is but briefly it is a technology to
partition the hardware into independent instances with guaranteed quality of
service. For more details see [NVIDIA Multi-Instance GPU User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/).
First enable `MIG` mode for a GPU, depending on the platform you're running
a reboot would be necessary. Some platforms support GPU reset.
```sh
$ sudo nvidia-smi -mig 1
```
If the platform supports a GPU reset one can run, otherwise you will get a
warning to reboot the server.
```sh
$ sudo nvidia-smi --gpu-reset
```
The driver per default provides a number of profiles that users can opt-in when
configuring the MIG feature.
```sh
$ sudo nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 7/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
...
```
Create the GPU instances that correspond to the `vGPU` types of the `MIG-backed`
`vGPUs` that you will create [NVIDIA A100 PCIe 80GB Virtual GPU Types](https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html#vgpu-types-nvidia-a100-pcie-80gb).
```sh
# MIG 1g.10gb --> vGPU A100D-1-10C
$ sudo nvidia-smi mig -cgi 19
```
List the GPU instances and get the GPU instance id to create the compute
instance.
```sh
$ sudo nvidia-smi mig -lgi # list the created GPU instances
$ sudo nvidia-smi mig -cci -gi 9 # each GPU instance can have several compute
# instances. Instance -> Workload
```
Verify that the compute instances were created within the GPU instance
```sh
$ nvidia-smi
... snip ...
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 9 0 0 | 0MiB / 9728MiB | 14 0 | 1 0 0 0 0 |
| | 0MiB / 4095MiB | | |
+------------------+----------------------+-----------+-----------------------+
... snip ...
```
We can use the [snippet](#list-all-available-vgpu-instances) from before to list
the available `vGPU` instances, this time `MIG-backed`.
```sh
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 |GRID A100D-1-10C | nvidia-699 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.5 | 1 |GRID A100D-1-10C | nvidia-699 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:01.6 | 1 |GRID A100D-1-10C | nvidia-699 |
... snip ...
```
Repeat the steps after the [snippet](#list-all-available-vgpu-instances) listing
to create the corresponding `mdev` device and use the guest `OS` created in the
previous section with `time-sliced` `vGPUs`.
## Install NVIDIA Driver + Toolkit in Kata Containers Guest OS
Consult the [Developer-Guide](https://github.com/kata-containers/kata-containers/blob/main/docs/Developer-Guide.md#create-a-rootfs-image) on how to create a
rootfs base image for a distribution of your choice. This is going to be used as
a base for a NVIDIA enabled guest OS. Use the `EXTRA_PKGS` variable to install
all the needed packages to compile the drivers. Also copy the kernel development
packages from the previous `make deb-pkg` into `$ROOTFS_DIR`.
```sh
export EXTRA_PKGS="gcc make curl gnupg"
```
Having the `$ROOTFS_DIR` exported in the previous step we can now install all the
needed parts in the guest OS. In this case, we have an Ubuntu based rootfs.
First off all mount the special filesystems into the rootfs
```sh
$ sudo mount -t sysfs -o ro none ${ROOTFS_DIR}/sys
$ sudo mount -t proc -o ro none ${ROOTFS_DIR}/proc
$ sudo mount -t tmpfs none ${ROOTFS_DIR}/tmp
$ sudo mount -o bind,ro /dev ${ROOTFS_DIR}/dev
$ sudo mount -t devpts none ${ROOTFS_DIR}/dev/pts
```
Now we can enter `chroot`
```sh
$ sudo chroot ${ROOTFS_DIR}
```
Inside the rootfs one is going to install the drivers and toolkit to enable the
easy creation of GPU containers with Kata. We can also use this rootfs for any
other container not specifically only for GPUs.
As a prerequisite install the copied kernel development packages
```sh
$ sudo dpkg -i *.deb
```
Get the driver run file, since we need to build the driver against a kernel that
is not running on the host we need the ability to specify the exact version we
want the driver to build against. Take the kernel version one used for building
the NVIDIA kernel (`5.15.23-nvidia-gpu`).
```sh
$ wget https://us.download.nvidia.com/XFree86/Linux-x86_64/510.54/NVIDIA-Linux-x86_64-510.54.run
$ chmod +x NVIDIA-Linux-x86_64-510.54.run
# Extract the source files so we can run the installer with arguments
$ ./NVIDIA-Linux-x86_64-510.54.run -x
$ cd NVIDIA-Linux-x86_64-510.54
$ ./nvidia-installer -k 5.15.23-nvidia-gpu
```
Having the drivers installed we need to install the toolkit which will take care
of providing the right bits into the container.
```sh
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
$ curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$ apt update
$ apt install nvidia-container-toolkit
```
Create the hook execution file for Kata:
```
# Content of $ROOTFS_DIR/usr/share/oci/hooks/prestart/nvidia-container-toolkit.sh
#!/bin/bash -x
/usr/bin/nvidia-container-toolkit -debug $@
```
As the last step one can do some cleanup of files or package caches. Build the
rootfs and configure it for use with Kata according to the development guide.
Enable the `guest_hook_path` in Kata's `configuration.toml`
```sh
guest_hook_path = "/usr/share/oci/hooks"
```
One has built a NVIDIA rootfs, kernel and now we can run any GPU container
without installing the drivers into the container. Check NVIDIA device status
with `nvidia-smi`
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/nvidia/cuda:11.6.0-base-ubuntu20.04" cuda nvidia-smi
Fri Mar 18 10:36:59 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.54 Driver Version: 510.54 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A30X Off | 00000000:02:00.0 Off | 0 |
| N/A 38C P0 67W / 230W | 0MiB / 24576MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
```
As the last step one can remove the additional packages and files that were added
to the `$ROOTFS_DIR` to keep it as small as possible.
## References
- [Configuring a VM for GPU Pass-Through by Using the QEMU Command Line](https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#using-gpu-pass-through-red-hat-el-qemu-cli)
- https://gitlab.com/nvidia/container-images/driver/-/tree/master
- https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers

View File

@@ -0,0 +1,293 @@
# Using Nvidia GPU device with Kata Containers
An Nvidia GPU device can be passed to a Kata Containers container using GPU passthrough
(Nvidia GPU pass-through mode) as well as GPU mediated passthrough (Nvidia vGPU mode). 
Nvidia GPU pass-through mode, an entire physical GPU is directly assigned to one VM,
bypassing the Nvidia Virtual GPU Manager. In this mode of operation, the GPU is accessed
exclusively by the Nvidia driver running in the VM to which it is assigned.
The GPU is not shared among VMs.
Nvidia Virtual GPU (vGPU) enables multiple virtual machines (VMs) to have simultaneous,
direct access to a single physical GPU, using the same Nvidia graphics drivers that are
deployed on non-virtualized operating systems. By doing this, Nvidia vGPU provides VMs
with unparalleled graphics performance, compute performance, and application compatibility,
together with the cost-effectiveness and scalability brought about by sharing a GPU
among multiple workloads.
| Technology | Description | Behaviour | Detail |
| --- | --- | --- | --- |
| Nvidia GPU pass-through mode | GPU passthrough | Physical GPU assigned to a single VM | Direct GPU assignment to VM without limitation |
| Nvidia vGPU mode | GPU sharing | Physical GPU shared by multiple VMs | Mediated passthrough |
## Hardware Requirements
Nvidia GPUs Recommended for Virtualization:
- Nvidia Tesla (T4, M10, P6, V100 or newer)
- Nvidia Quadro RTX 6000/8000
## Host BIOS Requirements
Some hardware requires a larger PCI BARs window, for example, Nvidia Tesla P100, K40m
```
$ lspci -s 04:00.0 -vv | grep Region
Region 0: Memory at c6000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 383800000000 (64-bit, prefetchable) [size=16G] #above 4G
Region 3: Memory at 383c00000000 (64-bit, prefetchable) [size=32M]
```
For large BARs devices, MMIO mapping above 4G address space should be `enabled`
in the PCI configuration of the BIOS.
Some hardware vendors use different name in BIOS, such as:
- Above 4G Decoding
- Memory Hole for PCI MMIO
- Memory Mapped I/O above 4GB
The following steps outline the workflow for using an Nvidia GPU with Kata.
## Host Kernel Requirements
The following configurations need to be enabled on your host kernel:
- `CONFIG_VFIO`
- `CONFIG_VFIO_IOMMU_TYPE1`
- `CONFIG_VFIO_MDEV`
- `CONFIG_VFIO_MDEV_DEVICE`
- `CONFIG_VFIO_PCI`
Your host kernel needs to be booted with `intel_iommu=on` on the kernel command line.
## Install and configure Kata Containers
To use non-large BARs devices (for example, Nvidia Tesla T4), you need Kata version 1.3.0 or above.
Follow the [Kata Containers setup instructions](../install/README.md)
to install the latest version of Kata.
To use large BARs devices (for example, Nvidia Tesla P100), you need Kata version 1.11.0 or above.
The following configuration in the Kata `configuration.toml` file as shown below can work:
Hotplug for PCI devices by `acpi_pcihp` (Linux's ACPI PCI Hotplug driver):
```
machine_type = "q35"
hotplug_vfio_on_root_bus = false
```
Hotplug for PCIe devices by `pciehp` (Linux's PCIe Hotplug driver):
```
machine_type = "q35"
hotplug_vfio_on_root_bus = true
pcie_root_port = 1
```
## Build Kata Containers kernel with GPU support
The default guest kernel installed with Kata Containers does not provide GPU support.
To use an Nvidia GPU with Kata Containers, you need to build a kernel with the
necessary GPU support.
The following kernel config options need to be enabled:
```
# Support PCI/PCIe device hotplug (Required for large BARs device)
CONFIG_HOTPLUG_PCI_PCIE=y
# Support for loading modules (Required for load Nvidia drivers)
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# Enable the MMIO access method for PCIe devices (Required for large BARs device)
CONFIG_PCI_MMCONFIG=y
```
The following kernel config options need to be disabled:
```
# Disable Open Source Nvidia driver nouveau
# It conflicts with Nvidia official driver
CONFIG_DRM_NOUVEAU=n
```
> **Note**: `CONFIG_DRM_NOUVEAU` is normally disabled by default.
It is worth checking that it is not enabled in your kernel configuration to prevent any conflicts.
Build the Kata Containers kernel with the previous config options,
using the instructions described in [Building Kata Containers kernel](../../tools/packaging/kernel).
For further details on building and installing guest kernels,
see [the developer guide](../Developer-Guide.md#install-guest-kernel-images).
There is an easy way to build a guest kernel that supports Nvidia GPU:
```
## Build guest kernel with ../../tools/packaging/kernel
# Prepare (download guest kernel source, generate .config)
$ ./build-kernel.sh -v 4.19.86 -g nvidia -f setup
# Build guest kernel
$ ./build-kernel.sh -v 4.19.86 -g nvidia build
# Install guest kernel
$ sudo -E ./build-kernel.sh -v 4.19.86 -g nvidia install
/usr/share/kata-containers/vmlinux-nvidia-gpu.container -> vmlinux-4.19.86-70-nvidia-gpu
/usr/share/kata-containers/vmlinuz-nvidia-gpu.container -> vmlinuz-4.19.86-70-nvidia-gpu
```
To build Nvidia Driver in Kata container, `kernel-devel` is required.
This is a way to generate rpm packages for `kernel-devel`:
```
$ cd kata-linux-4.19.86-68
$ make rpm-pkg
Output RPMs:
~/rpmbuild/RPMS/x86_64/kernel-devel-4.19.86_nvidia_gpu-1.x86_64.rpm
```
> **Note**:
> - `kernel-devel` should be installed in Kata container before run Nvidia driver installer.
> - Run `make deb-pkg` to build the deb package.
Before using the new guest kernel, please update the `kernel` parameters in `configuration.toml`.
```
kernel = "/usr/share/kata-containers/vmlinuz-nvidia-gpu.container"
```
## Nvidia GPU pass-through mode with Kata Containers
Use the following steps to pass an Nvidia GPU device in pass-through mode with Kata:
1. Find the Bus-Device-Function (BDF) for GPU device on host:
```
$ sudo lspci -nn -D | grep -i nvidia
0000:04:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
0000:84:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f8] (rev a1)
```
> PCI address `0000:04:00.0` is assigned to the hardware GPU device.
> `10de:15f8` is the device ID of the hardware GPU device.
2. Find the IOMMU group for the GPU device:
```
$ BDF="0000:04:00.0"
$ readlink -e /sys/bus/pci/devices/$BDF/iommu_group
/sys/kernel/iommu_groups/45
```
The previous output shows that the GPU belongs to IOMMU group 45.
3. Check the IOMMU group number under `/dev/vfio`:
```
$ ls -l /dev/vfio
total 0
crw------- 1 root root 248, 0 Feb 28 09:57 45
crw------- 1 root root 248, 1 Feb 28 09:57 54
crw-rw-rw- 1 root root 10, 196 Feb 28 09:57 vfio
```
4. Start a Kata container with GPU device:
```
$ sudo docker run -it --runtime=kata-runtime --cap-add=ALL --device /dev/vfio/45 centos /bin/bash
```
5. Run `lspci` within the container to verify the GPU device is seen in the list
of the PCI devices. Note the vendor-device id of the GPU (`10de:15f8`) in the `lspci` output.
```
$ lspci -nn -D | grep '10de:15f8'
0000:01:01.0 3D controller [0302]: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] [10de:15f8] (rev a1)
```
6. Additionally, you can check the PCI BARs space of the Nvidia GPU device in the container:
```
$ lspci -s 01:01.0 -vv | grep Region
Region 0: Memory at c0000000 (32-bit, non-prefetchable) [disabled] [size=16M]
Region 1: Memory at 4400000000 (64-bit, prefetchable) [disabled] [size=16G]
Region 3: Memory at 4800000000 (64-bit, prefetchable) [disabled] [size=32M]
```
> **Note**: If you see a message similar to the above, the BAR space of the Nvidia
> GPU has been successfully allocated.
## Nvidia vGPU mode with Kata Containers
Nvidia vGPU is a licensed product on all supported GPU boards. A software license
is required to enable all vGPU features within the guest VM.
> **Note**: There is no suitable test environment, so it is not written here.
## Install Nvidia Driver in Kata Containers
Download the official Nvidia driver from
[https://www.nvidia.com/Download/index.aspx](https://www.nvidia.com/Download/index.aspx),
for example `NVIDIA-Linux-x86_64-418.87.01.run`.
Install the `kernel-devel`(generated in the previous steps) for guest kernel:
```
$ sudo rpm -ivh kernel-devel-4.19.86_gpu-1.x86_64.rpm
```
Here is an example to extract, compile and install Nvidia driver:
```
## Extract
$ sh ./NVIDIA-Linux-x86_64-418.87.01.run -x
## Compile and install (It will take some time)
$ cd NVIDIA-Linux-x86_64-418.87.01
$ sudo ./nvidia-installer -a -q --ui=none \
--no-cc-version-check \
--no-opengl-files --no-install-libglvnd \
--kernel-source-path=/usr/src/kernels/`uname -r`
```
Or just run one command line:
```
$ sudo sh ./NVIDIA-Linux-x86_64-418.87.01.run -a -q --ui=none \
--no-cc-version-check \
--no-opengl-files --no-install-libglvnd \
--kernel-source-path=/usr/src/kernels/`uname -r`
```
To view detailed logs of the installer:
```
$ tail -f /var/log/nvidia-installer.log
```
Load Nvidia driver module manually
```
# Optionalgenerate modules.dep and map files for Nvidia driver
$ sudo depmod
# Load module
$ sudo modprobe nvidia-drm
# Check module
$ lsmod | grep nvidia
nvidia_drm 45056 0
nvidia_modeset 1093632 1 nvidia_drm
nvidia 18202624 1 nvidia_modeset
drm_kms_helper 159744 1 nvidia_drm
drm 364544 3 nvidia_drm,drm_kms_helper
i2c_core 65536 3 nvidia,drm_kms_helper,drm
ipmi_msghandler 49152 1 nvidia
```
Check Nvidia device status with `nvidia-smi`
```
$ nvidia-smi
Tue Mar 3 00:03:49 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01 Driver Version: 418.87.01 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... Off | 00000000:01:01.0 Off | 0 |
| N/A 27C P0 25W / 250W | 0MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
```
## References
- [Configuring a VM for GPU Pass-Through by Using the QEMU Command Line](https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#using-gpu-pass-through-red-hat-el-qemu-cli)
- https://gitlab.com/nvidia/container-images/driver/-/tree/master
- https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers

View File

@@ -312,7 +312,7 @@ working properly with the Kata Containers VM.
### Build OpenSSL Intel® QAT engine container
Use the OpenSSL Intel® QAT [Dockerfile](https://github.com/intel/intel-device-plugins-for-kubernetes/tree/main/demo/openssl-qat-engine)
Use the OpenSSL Intel® QAT [Dockerfile](https://github.com/intel/intel-device-plugins-for-kubernetes/tree/master/demo/openssl-qat-engine)
to build a container image with an optimized OpenSSL engine for
Intel® QAT. Using `docker build` with the Kata Containers runtime can sometimes
have issues. Therefore, make sure that `runc` is the default Docker container
@@ -444,7 +444,7 @@ $ sudo docker save -o openssl-qat-engine.tar openssl-qat-engine:latest
$ sudo ctr -n=k8s.io images import openssl-qat-engine.tar
```
The [Intel® QAT Plugin](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/qat_plugin/README.md)
The [Intel® QAT Plugin](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/master/cmd/qat_plugin/README.md)
needs to be started so that the virtual functions can be discovered and
used by Kubernetes.

View File

@@ -0,0 +1,76 @@
# Setup to run VPP
The Data Plane Development Kit (DPDK) is a set of libraries and drivers for
fast packet processing. Vector Packet Processing (VPP) is a platform
extensible framework that provides out-of-the-box production quality
switch and router functionality. VPP is a high performance packet-processing
stack that can run on commodity CPUs. Enabling VPP with DPDK support can
yield significant performance improvements over a Linux\* bridge providing a
switch with DPDK VHOST-USER ports.
For more information about VPP visit their [wiki](https://wiki.fd.io/view/VPP).
## Install and configure Kata Containers
Follow the [Kata Containers setup instructions](../Developer-Guide.md).
In order to make use of VHOST-USER based interfaces, the container needs to be backed
by huge pages. `HugePages` support is required for the large memory pool allocation used for
DPDK packet buffers. This is a feature which must be configured within the Linux Kernel. See
[the DPDK documentation](https://doc.dpdk.org/guides/linux_gsg/sys_reqs.html#use-of-hugepages-in-the-linux-environment)
for details on how to enable for the host. After enabling huge pages support on the host system,
update the Kata configuration to enable huge page support in the guest kernel:
```
$ sudo sed -i -e 's/^# *\(enable_hugepages\).*=.*$/\1 = true/g' /usr/share/defaults/kata-containers/configuration.toml
```
## Install VPP
Follow the [VPP installation instructions](https://wiki.fd.io/view/VPP/Installing_VPP_binaries_from_packages).
After a successful installation, your host system is ready to start
connecting Kata Containers with VPP bridges.
### Install the VPP Docker\* plugin
To create a Docker network and connect Kata Containers easily to that network through
Docker, install a VPP Docker plugin.
To install the plugin, follow the [plugin installation instructions](https://github.com/clearcontainers/vpp).
This VPP plugin allows the creation of a VPP network. Every container added
to this network is connected through an L2 bridge-domain provided by VPP.
## Example: Launch two Kata Containers using VPP
To use VPP, use Docker to create a network that makes use of VPP.
For example:
```
$ sudo docker network create -d=vpp --ipam-driver=vpp --subnet=192.168.1.0/24 --gateway=192.168.1.1 vpp_net
```
Test connectivity by launching two containers:
```
$ sudo docker run --runtime=kata-runtime --net=vpp_net --ip=192.168.1.2 --mac-address=CA:FE:CA:FE:01:02 -it busybox bash -c "ip a; ip route; sleep 300"
$ sudo docker run --runtime=kata-runtime --net=vpp_net --ip=192.168.1.3 --mac-address=CA:FE:CA:FE:01:03 -it busybox bash -c "ip a; ip route; ping 192.168.1.2"
```
These commands setup two Kata Containers connected via a VPP L2 bridge
domain. The first of the two VMs displays the networking details and then
sleeps providing a period of time for it to be pinged. The second
VM displays its networking details and then pings the first VM, verifying
connectivity between them.
After verifying connectivity, cleanup with the following commands:
```
$ sudo docker kill $(sudo docker ps --no-trunc -aq)
$ sudo docker rm $(sudo docker ps --no-trunc -aq)
$ sudo docker network rm vpp_net
$ sudo service vpp stop
```

View File

@@ -22,35 +22,21 @@ $ sudo snap install kata-containers --classic
## Build and install snap image
Run the command below which will use the packaging Makefile to build the snap image:
Run next command at the root directory of the packaging repository.
```sh
$ make -C tools/packaging snap
$ make snap
```
> **Warning:**
>
> By default, `snapcraft` will create a clean virtual machine
> environment to build the snap in using the `multipass` tool.
>
> However, `multipass` is silently disabled when `--destructive-mode` is
> used.
>
> Since building the Kata Containers package currently requires
> `--destructive-mode`, the snap will be built using the host
> environment. To avoid parts of the build auto-detecting additional
> features to enable (for example for QEMU), we recommend that you
> only run the snap build in a minimal host environment.
To install the resulting snap image, snap must be put in [classic mode][3] and the
security confinement must be disabled (`--classic`). Also since the resulting snap
has not been signed the verification of signature must be omitted (`--dangerous`).
security confinement must be disabled (*--classic*). Also since the resulting snap
has not been signed the verification of signature must be omitted (*--dangerous*).
```sh
$ sudo snap install --classic --dangerous "kata-containers_${version}_${arch}.snap"
$ sudo snap install --classic --dangerous kata-containers_[VERSION]_[ARCH].snap
```
Replace `${version}` with the current version of Kata Containers and `${arch}` with
Replace `VERSION` with the current version of Kata Containers and `ARCH` with
the system architecture.
## Configure Kata Containers
@@ -90,12 +76,12 @@ then a new configuration file can be [created](#configure-kata-containers)
and [configured][7].
[1]: https://docs.snapcraft.io/snaps/intro
[2]: ../../docs/design/architecture/README.md#root-filesystem-image
[2]: ../docs/design/architecture/README.md#root-filesystem-image
[3]: https://docs.snapcraft.io/reference/confinement#classic
[4]: https://github.com/kata-containers/kata-containers/tree/main/src/runtime#configuration
[4]: https://github.com/kata-containers/runtime#configuration
[5]: https://docs.docker.com/engine/reference/commandline/dockerd
[6]: ../../docs/install/docker/ubuntu-docker-install.md
[7]: ../../docs/Developer-Guide.md#configure-to-use-initrd-or-rootfs-image
[6]: ../docs/install/docker/ubuntu-docker-install.md
[7]: ../docs/Developer-Guide.md#configure-to-use-initrd-or-rootfs-image
[8]: https://snapcraft.io/kata-containers
[9]: ../../docs/Developer-Guide.md#run-kata-containers-with-docker
[10]: ../../docs/Developer-Guide.md#run-kata-containers-with-kubernetes
[9]: ../docs/Developer-Guide.md#run-kata-containers-with-docker
[10]: ../docs/Developer-Guide.md#run-kata-containers-with-kubernetes

View File

@@ -1,114 +0,0 @@
#!/usr/bin/env bash
#
# Copyright (c) 2022 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
# Description: Idempotent script to be sourced by all parts in a
# snapcraft config file.
set -o errexit
set -o nounset
set -o pipefail
# XXX: Bash-specific code. zsh doesn't support this option and that *does*
# matter if this script is run sourced... since it'll be using zsh! ;)
[ -n "$BASH_VERSION" ] && set -o errtrace
[ -n "${DEBUG:-}" ] && set -o xtrace
die()
{
echo >&2 "ERROR: $0: $*"
}
[ -n "${SNAPCRAFT_STAGE:-}" ] ||\
die "must be sourced from a snapcraft config file"
snap_yq_version=3.4.1
snap_common_install_yq()
{
export yq="${SNAPCRAFT_STAGE}/bin/yq"
local yq_pkg
yq_pkg="github.com/mikefarah/yq"
local yq_url
yq_url="https://${yq_pkg}/releases/download/${snap_yq_version}/yq_${goos}_${goarch}"
curl -o "${yq}" -L "${yq_url}"
chmod +x "${yq}"
}
# Function that should be called for each snap "part" in
# snapcraft.yaml.
snap_common_main()
{
# Architecture
arch="$(uname -m)"
case "${arch}" in
aarch64)
goarch="arm64"
qemu_arch="${arch}"
;;
ppc64le)
goarch="ppc64le"
qemu_arch="ppc64"
;;
s390x)
goarch="${arch}"
qemu_arch="${arch}"
;;
x86_64)
goarch="amd64"
qemu_arch="${arch}"
;;
*) die "unsupported architecture: ${arch}" ;;
esac
dpkg_arch=$(dpkg --print-architecture)
# golang
#
# We need the O/S name in golang format, but since we don't
# know if the godeps part has run, we don't know if golang is
# available yet, hence fall back to a standard system command.
goos="$(go env GOOS &>/dev/null || true)"
[ -z "$goos" ] && goos=$(uname -s|tr '[A-Z]' '[a-z]')
export GOROOT="${SNAPCRAFT_STAGE}"
export GOPATH="${GOROOT}/gopath"
export GO111MODULE="auto"
mkdir -p "${GOPATH}/bin"
export PATH="${GOPATH}/bin:${PATH}"
# Proxy
export http_proxy="${http_proxy:-}"
export https_proxy="${https_proxy:-}"
# Binaries
mkdir -p "${SNAPCRAFT_STAGE}/bin"
export PATH="$PATH:${SNAPCRAFT_STAGE}/bin"
# YAML query tool
export yq="${SNAPCRAFT_STAGE}/bin/yq"
# Kata paths
export kata_dir=$(printf "%s/src/github.com/%s/%s" \
"${GOPATH}" \
"${SNAPCRAFT_PROJECT_NAME}" \
"${SNAPCRAFT_PROJECT_NAME}")
export versions_file="${kata_dir}/versions.yaml"
[ -n "${yq:-}" ] && [ -x "${yq:-}" ] || snap_common_install_yq
}
snap_common_main

View File

@@ -1,5 +1,4 @@
name: kata-containers
website: https://github.com/kata-containers/kata-containers
summary: Build lightweight VMs that seamlessly plug into the containers ecosystem
description: |
Kata Containers is an open source project and community working to build a
@@ -19,18 +18,20 @@ parts:
- git
- git-extras
override-pull: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
version="9999"
kata_url="https://github.com/kata-containers/kata-containers"
if echo "${GITHUB_REF:-}" | grep -q -E "^refs/tags"; then
version=$(echo ${GITHUB_REF:-} | cut -d/ -f3)
if echo "${GITHUB_REF}" | grep -q -E "^refs/tags"; then
version=$(echo ${GITHUB_REF} | cut -d/ -f3)
git checkout ${version}
fi
snapcraftctl set-grade "stable"
snapcraftctl set-version "${version}"
# setup GOPATH - this repo dir should be there
export GOPATH=${SNAPCRAFT_STAGE}/gopath
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
mkdir -p $(dirname ${kata_dir})
ln -sf $(realpath "${SNAPCRAFT_STAGE}/..") ${kata_dir}
@@ -42,46 +43,31 @@ parts:
build-packages:
- curl
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
# put everything in stage
cd "${SNAPCRAFT_STAGE}"
cd ${SNAPCRAFT_STAGE}
version="$(${yq} r ${kata_dir}/versions.yaml languages.golang.meta.newest-version)"
yq_path="./yq"
yq_pkg="github.com/mikefarah/yq"
goos="linux"
case "$(uname -m)" in
aarch64) goarch="arm64";;
ppc64le) goarch="ppc64le";;
x86_64) goarch="amd64";;
s390x) goarch="s390x";;
*) echo "unsupported architecture: $(uname -m)"; exit 1;;
esac
yq_version=3.4.1
yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos}_${goarch}"
curl -o "${yq_path}" -L "${yq_url}"
chmod +x "${yq_path}"
kata_dir=gopath/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
version="$(${yq_path} r ${kata_dir}/versions.yaml languages.golang.meta.newest-version)"
tarfile="go${version}.${goos}-${goarch}.tar.gz"
curl -LO https://golang.org/dl/${tarfile}
tar -xf ${tarfile} --strip-components=1
rustdeps:
after: [metadata]
plugin: nil
prime:
- -*
build-packages:
- curl
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
# put everything in stage
cd "${SNAPCRAFT_STAGE}"
version="$(${yq} r ${kata_dir}/versions.yaml languages.rust.meta.newest-version)"
if ! command -v rustup > /dev/null; then
curl https://sh.rustup.rs -sSf | sh -s -- -y --default-toolchain ${version}
fi
export PATH=${PATH}:${HOME}/.cargo/bin
rustup toolchain install ${version}
rustup default ${version}
if [ "${arch}" == "ppc64le" ] || [ "${arch}" == "s390x" ] ; then
[ "${arch}" == "ppc64le" ] && arch="powerpc64le"
rustup target add ${arch}-unknown-linux-gnu
else
rustup target add ${arch}-unknown-linux-musl
$([ "$(whoami)" != "root" ] && echo sudo) ln -sf /usr/bin/g++ /bin/musl-g++
fi
rustup component add rustfmt
image:
after: [godeps, qemu, kernel]
plugin: nil
@@ -94,17 +80,28 @@ parts:
- uidmap
- gnupg2
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
[ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "s390x" ] && sudo apt-get --no-install-recommends install -y protobuf-compiler
[ "${arch}" = "ppc64le" ] || [ "${arch}" = "s390x" ] && sudo apt-get --no-install-recommends install -y protobuf-compiler
yq=${SNAPCRAFT_STAGE}/yq
# set GOPATH
export GOPATH=${SNAPCRAFT_STAGE}/gopath
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
export GOROOT=${SNAPCRAFT_STAGE}
export PATH="${GOROOT}/bin:${PATH}"
export GO111MODULE="auto"
http_proxy=${http_proxy:-""}
https_proxy=${https_proxy:-""}
if [ -n "$http_proxy" ]; then
echo "Setting proxy $http_proxy"
sudo -E systemctl set-environment http_proxy="$http_proxy" || true
sudo -E systemctl set-environment https_proxy="$https_proxy" || true
sudo -E systemctl set-environment http_proxy=$http_proxy || true
sudo -E systemctl set-environment https_proxy=$https_proxy || true
fi
# Copy yq binary. It's used in the container
mkdir -p "${GOPATH}/bin/"
cp -a "${yq}" "${GOPATH}/bin/"
echo "Unmasking docker service"
@@ -115,54 +112,63 @@ parts:
echo "Starting docker"
sudo -E systemctl start docker || true
cd "${kata_dir}/tools/osbuilder"
cd ${kata_dir}/tools/osbuilder
# build image
export AGENT_INIT=yes
export USE_DOCKER=1
export DEBUG=1
arch="$(uname -m)"
initrd_distro=$(${yq} r -X ${kata_dir}/versions.yaml assets.initrd.architecture.${arch}.name)
image_distro=$(${yq} r -X ${kata_dir}/versions.yaml assets.image.architecture.${arch}.name)
case "$arch" in
x86_64)
# In some build systems it's impossible to build a rootfs image, try with the initrd image
sudo -E PATH=$PATH make image DISTRO="${image_distro}" || sudo -E PATH="$PATH" make initrd DISTRO="${initrd_distro}"
sudo -E PATH=$PATH make image DISTRO=${image_distro} || sudo -E PATH=$PATH make initrd DISTRO=${initrd_distro}
;;
aarch64|ppc64le|s390x)
sudo -E PATH="$PATH" make initrd DISTRO="${initrd_distro}"
sudo -E PATH=$PATH make initrd DISTRO=${initrd_distro}
;;
*) die "unsupported architecture: ${arch}" ;;
*) echo "unsupported architecture: $(uname -m)"; exit 1;;
esac
# Install image
kata_image_dir="${SNAPCRAFT_PART_INSTALL}/usr/share/kata-containers"
mkdir -p "${kata_image_dir}"
cp kata-containers*.img "${kata_image_dir}"
kata_image_dir=${SNAPCRAFT_PART_INSTALL}/usr/share/kata-containers
mkdir -p ${kata_image_dir}
cp kata-containers*.img ${kata_image_dir}
runtime:
after: [godeps, image, cloud-hypervisor]
plugin: nil
build-attributes: [no-patchelf]
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
# set GOPATH
export GOPATH=${SNAPCRAFT_STAGE}/gopath
export GOROOT=${SNAPCRAFT_STAGE}
export PATH="${GOROOT}/bin:${PATH}"
export GO111MODULE="auto"
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
cd "${kata_dir}/src/runtime"
cd ${kata_dir}/src/runtime
qemu_cmd="qemu-system-${qemu_arch}"
# setup arch
arch=$(uname -m)
if [ ${arch} = "ppc64le" ]; then
arch="ppc64"
fi
# build and install runtime
make \
PREFIX="/snap/${SNAPCRAFT_PROJECT_NAME}/current/usr" \
PREFIX=/snap/${SNAPCRAFT_PROJECT_NAME}/current/usr \
SKIP_GO_VERSION_CHECK=1 \
QEMUCMD="${qemu_cmd}"
QEMUCMD=qemu-system-$arch
make install \
PREFIX=/usr \
DESTDIR="${SNAPCRAFT_PART_INSTALL}" \
DESTDIR=${SNAPCRAFT_PART_INSTALL} \
SKIP_GO_VERSION_CHECK=1 \
QEMUCMD="${qemu_cmd}"
QEMUCMD=qemu-system-$arch
if [ ! -f ${SNAPCRAFT_PART_INSTALL}/../../image/install/usr/share/kata-containers/kata-containers.img ]; then
sed -i -e "s|^image =.*|initrd = \"/snap/${SNAPCRAFT_PROJECT_NAME}/current/usr/share/kata-containers/kata-containers-initrd.img\"|" \
@@ -179,37 +185,44 @@ parts:
- bison
- flex
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
yq=${SNAPCRAFT_STAGE}/yq
export PATH="${PATH}:${SNAPCRAFT_STAGE}"
export GOPATH=${SNAPCRAFT_STAGE}/gopath
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
versions_file="${kata_dir}/versions.yaml"
kernel_version="$(${yq} r $versions_file assets.kernel.version)"
#Remove extra 'v'
kernel_version="${kernel_version#v}"
kernel_version=${kernel_version#v}
[ "${arch}" = "s390x" ] && sudo apt-get --no-install-recommends install -y libssl-dev
[ "$(uname -m)" = "s390x" ] && sudo apt-get --no-install-recommends install -y libssl-dev
cd "${kata_dir}/tools/packaging/kernel"
export GOPATH=${SNAPCRAFT_STAGE}/gopath
export GO111MODULE="auto"
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
cd ${kata_dir}/tools/packaging/kernel
kernel_dir_prefix="kata-linux-"
# Setup and build kernel
./build-kernel.sh -v "${kernel_version}" -d setup
./build-kernel.sh -v ${kernel_version} -d setup
cd ${kernel_dir_prefix}*
make -j $(($(nproc)-1)) EXTRAVERSION=".container"
kernel_suffix="${kernel_version}.container"
kata_kernel_dir="${SNAPCRAFT_PART_INSTALL}/usr/share/kata-containers"
mkdir -p "${kata_kernel_dir}"
kernel_suffix=${kernel_version}.container
kata_kernel_dir=${SNAPCRAFT_PART_INSTALL}/usr/share/kata-containers
mkdir -p ${kata_kernel_dir}
# Install bz kernel
make install INSTALL_PATH="${kata_kernel_dir}" EXTRAVERSION=".container" || true
vmlinuz_name="vmlinuz-${kernel_suffix}"
ln -sf "${vmlinuz_name}" "${kata_kernel_dir}/vmlinuz.container"
make install INSTALL_PATH=${kata_kernel_dir} EXTRAVERSION=".container" || true
vmlinuz_name=vmlinuz-${kernel_suffix}
ln -sf ${vmlinuz_name} ${kata_kernel_dir}/vmlinuz.container
# Install raw kernel
vmlinux_path="vmlinux"
[ "${arch}" = "s390x" ] && vmlinux_path="arch/s390/boot/compressed/vmlinux"
vmlinux_name="vmlinux-${kernel_suffix}"
cp "${vmlinux_path}" "${kata_kernel_dir}/${vmlinux_name}"
ln -sf "${vmlinux_name}" "${kata_kernel_dir}/vmlinux.container"
vmlinux_path=vmlinux
[ "$(uname -m)" = "s390x" ] && vmlinux_path=arch/s390/boot/compressed/vmlinux
vmlinux_name=vmlinux-${kernel_suffix}
cp ${vmlinux_path} ${kata_kernel_dir}/${vmlinux_name}
ln -sf ${vmlinux_name} ${kata_kernel_dir}/vmlinux.container
qemu:
plugin: make
@@ -236,8 +249,12 @@ parts:
- libselinux1-dev
- ninja-build
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
yq=${SNAPCRAFT_STAGE}/yq
export GOPATH=${SNAPCRAFT_STAGE}/gopath
export GO111MODULE="auto"
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
versions_file="${kata_dir}/versions.yaml"
branch="$(${yq} r ${versions_file} assets.hypervisor.qemu.version)"
url="$(${yq} r ${versions_file} assets.hypervisor.qemu.url)"
commit=""
@@ -245,11 +262,11 @@ parts:
patches_version_dir="${kata_dir}/tools/packaging/qemu/patches/tag_patches/${branch}"
# download source
qemu_dir="${SNAPCRAFT_STAGE}/qemu"
qemu_dir=${SNAPCRAFT_STAGE}/qemu
rm -rf "${qemu_dir}"
git clone --depth 1 --branch ${branch} --single-branch ${url} "${qemu_dir}"
cd "${qemu_dir}"
[ -z "${commit}" ] || git checkout "${commit}"
cd ${qemu_dir}
[ -z "${commit}" ] || git checkout ${commit}
[ -n "$(ls -A ui/keycodemapdb)" ] || git clone --depth 1 https://github.com/qemu/keycodemapdb ui/keycodemapdb/
[ -n "$(ls -A capstone)" ] || git clone --depth 1 https://github.com/qemu/capstone capstone
@@ -260,10 +277,10 @@ parts:
${kata_dir}/tools/packaging/scripts/apply_patches.sh "${patches_version_dir}"
# Only x86_64 supports libpmem
[ "${arch}" = "x86_64" ] && sudo apt-get --no-install-recommends install -y apt-utils ca-certificates libpmem-dev
[ "$(uname -m)" = "x86_64" ] && sudo apt-get --no-install-recommends install -y apt-utils ca-certificates libpmem-dev
configure_hypervisor="${kata_dir}/tools/packaging/scripts/configure-hypervisor.sh"
chmod +x "${configure_hypervisor}"
configure_hypervisor=${kata_dir}/tools/packaging/scripts/configure-hypervisor.sh
chmod +x ${configure_hypervisor}
# static build. The --prefix, --libdir, --libexecdir, --datadir arguments are
# based on PREFIX and set by configure-hypervisor.sh
echo "$(PREFIX=/snap/${SNAPCRAFT_PROJECT_NAME}/current/usr ${configure_hypervisor} -s kata-qemu) \
@@ -273,17 +290,17 @@ parts:
# Copy QEMU configurations (Kconfigs)
case "${branch}" in
"v5.1.0")
cp -a "${kata_dir}"/tools/packaging/qemu/default-configs/* default-configs
cp -a ${kata_dir}/tools/packaging/qemu/default-configs/* default-configs
;;
*)
cp -a "${kata_dir}"/tools/packaging/qemu/default-configs/* configs/devices/
cp -a ${kata_dir}/tools/packaging/qemu/default-configs/* configs/devices/
;;
esac
# build and install
make -j $(($(nproc)-1))
make install DESTDIR="${SNAPCRAFT_PART_INSTALL}"
make install DESTDIR=${SNAPCRAFT_PART_INSTALL}
prime:
- -snap/
- -usr/bin/qemu-ga
@@ -299,67 +316,26 @@ parts:
# Hack: move qemu to /
"snap/kata-containers/current/": "./"
virtiofsd:
plugin: nil
after: [godeps, rustdeps]
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
# Currently, powerpc makes use of the QEMU's C implementation.
# The other platforms make use of the new rust virtiofsd.
#
# See "tools/packaging/scripts/configure-hypervisor.sh".
if [ "${arch}" == "ppc64le" ]
then
echo "INFO: Building QEMU's C version of virtiofsd"
# Handled by the 'qemu' part, so nothing more to do here.
exit 0
else
echo "INFO: Building rust version of virtiofsd"
fi
cd "${kata_dir}"
export PATH=${PATH}:${HOME}/.cargo/bin
# Download the rust implementation of virtiofsd
tools/packaging/static-build/virtiofsd/build-static-virtiofsd.sh
sudo install \
--owner='root' \
--group='root' \
--mode=0755 \
-D \
--target-directory="${SNAPCRAFT_PART_INSTALL}/usr/libexec/" \
virtiofsd/virtiofsd
cloud-hypervisor:
plugin: nil
after: [godeps]
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
if [ "${arch}" == "aarch64" ] || [ "${arch}" == "x86_64" ]; then
arch=$(uname -m)
if [ "{$arch}" == "aarch64" ] || [ "${arch}" == "x64_64" ]; then
sudo apt-get -y update
sudo apt-get -y install ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg |\
sudo gpg --batch --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
distro_codename=$(lsb_release -cs)
echo "deb [arch=${dpkg_arch} signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu ${distro_codename} stable" |\
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --batch --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get -y update
sudo apt-get -y install docker-ce docker-ce-cli containerd.io
sudo systemctl start docker.socket
cd "${SNAPCRAFT_PROJECT_DIR}"
export GOPATH=${SNAPCRAFT_STAGE}/gopath
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
cd ${kata_dir}
sudo -E NO_TTY=true make cloud-hypervisor-tarball
tarfile="${SNAPCRAFT_PROJECT_DIR}/tools/packaging/kata-deploy/local-build/build/kata-static-cloud-hypervisor.tar.xz"
tmpdir=$(mktemp -d)
tar -xvJpf "${tarfile}" -C "${tmpdir}"
install -D "${tmpdir}/opt/kata/bin/cloud-hypervisor" "${SNAPCRAFT_PART_INSTALL}/usr/bin/cloud-hypervisor"
rm -rf "${tmpdir}"
tar xvJpf build/kata-static-cloud-hypervisor.tar.xz -C /tmp/
install -D /tmp/opt/kata/bin/cloud-hypervisor ${SNAPCRAFT_PART_INSTALL}/usr/bin/cloud-hypervisor
fi
apps:

660
src/agent/Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -12,13 +12,13 @@ lazy_static = "1.3.0"
ttrpc = { version = "0.5.0", features = ["async", "protobuf-codec"], default-features = false }
protobuf = "=2.14.0"
libc = "0.2.58"
nix = "0.23"
nix = "0.23.0"
capctl = "0.2.0"
serde_json = "1.0.39"
scan_fmt = "0.2.3"
scopeguard = "1.0.0"
thiserror = "1.0.26"
regex = "1.5.6"
regex = "1.5.4"
serial_test = "0.5.1"
sysinfo = "0.23.0"
@@ -76,8 +76,3 @@ lto = true
[features]
seccomp = ["rustjail/seccomp"]
standard-oci-runtime = ["rustjail/standard-oci-runtime"]
[[bin]]
name = "kata-agent"
path = "src/main.rs"

View File

@@ -14,6 +14,10 @@ PROJECT_COMPONENT = kata-agent
TARGET = $(PROJECT_COMPONENT)
SOURCES := \
$(shell find . 2>&1 | grep -E '.*\.rs$$') \
Cargo.toml
VERSION_FILE := ./VERSION
VERSION := $(shell grep -v ^\# $(VERSION_FILE))
COMMIT_NO := $(shell git rev-parse HEAD 2>/dev/null || true)
@@ -33,16 +37,8 @@ ifeq ($(SECCOMP),yes)
override EXTRA_RUSTFEATURES += seccomp
endif
##VAR STANDARD_OCI_RUNTIME=yes|no define if agent enables standard oci runtime feature
STANDARD_OCI_RUNTIME := no
# Enable standard oci runtime feature of rust build
ifeq ($(STANDARD_OCI_RUNTIME),yes)
override EXTRA_RUSTFEATURES += standard-oci-runtime
endif
ifneq ($(EXTRA_RUSTFEATURES),)
override EXTRA_RUSTFEATURES := --features "$(EXTRA_RUSTFEATURES)"
override EXTRA_RUSTFEATURES := --features $(EXTRA_RUSTFEATURES)
endif
include ../../utils.mk
@@ -112,15 +108,15 @@ $(TARGET): $(GENERATED_CODE) logging-crate-tests $(TARGET_PATH)
logging-crate-tests:
make -C $(CWD)/../libs/logging
$(TARGET_PATH): show-summary
@RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES)
$(TARGET_PATH): $(SOURCES) | show-summary
@RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) --$(BUILD_TYPE) $(EXTRA_RUSTFEATURES)
$(GENERATED_FILES): %: %.in
@sed $(foreach r,$(GENERATED_REPLACEMENTS),-e 's|@$r@|$($r)|g') "$<" > "$@"
##TARGET optimize: optimized build
optimize: show-summary show-header
@RUSTFLAGS="-C link-arg=-s $(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) $(if $(findstring release,$(BUILD_TYPE)),--release) $(EXTRA_RUSTFEATURES)
optimize: $(SOURCES) | show-summary show-header
@RUSTFLAGS="-C link-arg=-s $(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) --$(BUILD_TYPE) $(EXTRA_RUSTFEATURES)
##TARGET install: install agent
install: install-services

View File

@@ -20,18 +20,17 @@ protobuf = "=2.14.0"
slog = "2.5.2"
slog-scope = "4.1.2"
scan_fmt = "0.2.6"
regex = "1.5.6"
regex = "1.5.4"
path-absolutize = "1.2.0"
anyhow = "1.0.32"
cgroups = { package = "cgroups-rs", version = "0.2.8" }
rlimit = "0.5.3"
cfg-if = "0.1.0"
tokio = { version = "1.2.0", features = ["sync", "io-util", "process", "time", "macros"] }
futures = "0.3.17"
async-trait = "0.1.31"
inotify = "0.9.2"
libseccomp = { version = "0.2.3", optional = true }
libseccomp = { version = "0.1.3", optional = true }
[dev-dependencies]
serial_test = "0.5.0"
@@ -39,4 +38,3 @@ tempfile = "3.1.0"
[features]
seccomp = ["libseccomp"]
standard-oci-runtime = []

View File

@@ -151,12 +151,12 @@ async fn register_memory_event(
let eventfd = eventfd(0, EfdFlags::EFD_CLOEXEC)?;
let event_control_path = Path::new(&cg_dir).join("cgroup.event_control");
let data = if arg.is_empty() {
format!("{} {}", eventfd, event_file.as_raw_fd())
let data;
if arg.is_empty() {
data = format!("{} {}", eventfd, event_file.as_raw_fd());
} else {
format!("{} {} {}", eventfd, event_file.as_raw_fd(), arg)
};
data = format!("{} {} {}", eventfd, event_file.as_raw_fd(), arg);
}
fs::write(&event_control_path, data)?;

View File

@@ -1,79 +0,0 @@
// SPDX-License-Identifier: Apache-2.0
//
// Copyright 2021 Sony Group Corporation
//
use anyhow::{anyhow, Result};
use nix::errno::Errno;
use nix::pty;
use nix::sys::{socket, uio};
use nix::unistd::{self, dup2};
use std::os::unix::io::{AsRawFd, RawFd};
use std::path::Path;
pub fn setup_console_socket(csocket_path: &str) -> Result<Option<RawFd>> {
if csocket_path.is_empty() {
return Ok(None);
}
let socket_fd = socket::socket(
socket::AddressFamily::Unix,
socket::SockType::Stream,
socket::SockFlag::empty(),
None,
)?;
match socket::connect(
socket_fd,
&socket::SockAddr::Unix(socket::UnixAddr::new(Path::new(csocket_path))?),
) {
Ok(()) => Ok(Some(socket_fd)),
Err(errno) => Err(anyhow!("failed to open console fd: {}", errno)),
}
}
pub fn setup_master_console(socket_fd: RawFd) -> Result<()> {
let pseudo = pty::openpty(None, None)?;
let pty_name: &[u8] = b"/dev/ptmx";
let iov = [uio::IoVec::from_slice(pty_name)];
let fds = [pseudo.master];
let cmsg = socket::ControlMessage::ScmRights(&fds);
socket::sendmsg(socket_fd, &iov, &[cmsg], socket::MsgFlags::empty(), None)?;
unistd::setsid()?;
let ret = unsafe { libc::ioctl(pseudo.slave, libc::TIOCSCTTY) };
Errno::result(ret).map_err(|e| anyhow!(e).context("ioctl TIOCSCTTY"))?;
dup2(pseudo.slave, std::io::stdin().as_raw_fd())?;
dup2(pseudo.slave, std::io::stdout().as_raw_fd())?;
dup2(pseudo.slave, std::io::stderr().as_raw_fd())?;
unistd::close(socket_fd)?;
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use std::os::unix::net::UnixListener;
use tempfile::{self, tempdir};
const CONSOLE_SOCKET: &str = "console-socket";
#[test]
fn test_setup_console_socket() {
let dir = tempdir()
.map_err(|e| anyhow!(e).context("tempdir failed"))
.unwrap();
let socket_path = dir.path().join(CONSOLE_SOCKET);
let _listener = UnixListener::bind(&socket_path).unwrap();
let ret = setup_console_socket(socket_path.to_str().unwrap());
assert!(ret.is_ok());
}
}

View File

@@ -23,8 +23,6 @@ use crate::cgroups::fs::Manager as FsManager;
#[cfg(test)]
use crate::cgroups::mock::Manager as FsManager;
use crate::cgroups::Manager;
#[cfg(feature = "standard-oci-runtime")]
use crate::console;
use crate::log_child;
use crate::process::Process;
#[cfg(feature = "seccomp")]
@@ -42,7 +40,7 @@ use nix::pty;
use nix::sched::{self, CloneFlags};
use nix::sys::signal::{self, Signal};
use nix::sys::stat::{self, Mode};
use nix::unistd::{self, fork, ForkResult, Gid, Pid, Uid, User};
use nix::unistd::{self, fork, ForkResult, Gid, Pid, Uid};
use std::os::unix::fs::MetadataExt;
use std::os::unix::io::AsRawFd;
@@ -64,7 +62,9 @@ use rlimit::{setrlimit, Resource, Rlim};
use tokio::io::AsyncBufReadExt;
use tokio::sync::Mutex;
pub const EXEC_FIFO_FILENAME: &str = "exec.fifo";
use crate::utils;
const EXEC_FIFO_FILENAME: &str = "exec.fifo";
const INIT: &str = "INIT";
const NO_PIVOT: &str = "NO_PIVOT";
@@ -74,7 +74,6 @@ const CLOG_FD: &str = "CLOG_FD";
const FIFO_FD: &str = "FIFO_FD";
const HOME_ENV_KEY: &str = "HOME";
const PIDNS_FD: &str = "PIDNS_FD";
const CONSOLE_SOCKET_FD: &str = "CONSOLE_SOCKET_FD";
#[derive(Debug)]
pub struct ContainerStatus {
@@ -83,7 +82,7 @@ pub struct ContainerStatus {
}
impl ContainerStatus {
pub fn new() -> Self {
fn new() -> Self {
ContainerStatus {
pre_status: ContainerState::Created,
cur_status: ContainerState::Created,
@@ -100,12 +99,6 @@ impl ContainerStatus {
}
}
impl Default for ContainerStatus {
fn default() -> Self {
Self::new()
}
}
pub type Config = CreateOpts;
type NamespaceType = String;
@@ -113,7 +106,7 @@ lazy_static! {
// This locker ensures the child exit signal will be received by the right receiver.
pub static ref WAIT_PID_LOCKER: Arc<Mutex<bool>> = Arc::new(Mutex::new(false));
pub static ref NAMESPACES: HashMap<&'static str, CloneFlags> = {
static ref NAMESPACES: HashMap<&'static str, CloneFlags> = {
let mut m = HashMap::new();
m.insert("user", CloneFlags::CLONE_NEWUSER);
m.insert("ipc", CloneFlags::CLONE_NEWIPC);
@@ -126,7 +119,7 @@ lazy_static! {
};
// type to name hashmap, better to be in NAMESPACES
pub static ref TYPETONAME: HashMap<&'static str, &'static str> = {
static ref TYPETONAME: HashMap<&'static str, &'static str> = {
let mut m = HashMap::new();
m.insert("ipc", "ipc");
m.insert("user", "user");
@@ -222,7 +215,7 @@ pub trait BaseContainer {
async fn start(&mut self, p: Process) -> Result<()>;
async fn run(&mut self, p: Process) -> Result<()>;
async fn destroy(&mut self) -> Result<()>;
async fn exec(&mut self) -> Result<()>;
fn exec(&mut self) -> Result<()>;
}
// LinuxContainer protected by Mutex
@@ -243,8 +236,6 @@ pub struct LinuxContainer {
pub status: ContainerStatus,
pub created: SystemTime,
pub logger: Logger,
#[cfg(feature = "standard-oci-runtime")]
pub console_socket: PathBuf,
}
#[derive(Serialize, Deserialize, Debug)]
@@ -368,6 +359,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
)));
}
}
log_child!(cfd_log, "child process start run");
let buf = read_sync(crfd)?;
let spec_str = std::str::from_utf8(&buf)?;
@@ -387,9 +379,6 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
let cm: FsManager = serde_json::from_str(cm_str)?;
#[cfg(feature = "standard-oci-runtime")]
let csocket_fd = console::setup_console_socket(&std::env::var(CONSOLE_SOCKET_FD)?)?;
let p = if spec.process.is_some() {
spec.process.as_ref().unwrap()
} else {
@@ -587,20 +576,14 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
// only change stdio devices owner when user
// isn't root.
if !uid.is_root() {
set_stdio_permissions(uid)?;
if guser.uid != 0 {
set_stdio_permissions(guser.uid)?;
}
setid(uid, gid)?;
if !guser.additional_gids.is_empty() {
let gids: Vec<Gid> = guser
.additional_gids
.iter()
.map(|gid| Gid::from_raw(*gid))
.collect();
unistd::setgroups(&gids).map_err(|e| {
setgroups(guser.additional_gids.as_slice()).map_err(|e| {
let _ = write_sync(
cwfd,
SYNC_FAILED,
@@ -640,6 +623,11 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
capabilities::drop_privileges(cfd_log, c)?;
}
if init {
// notify parent to run poststart hooks
write_sync(cwfd, SYNC_SUCCESS, "")?;
}
let args = oci_process.args.to_vec();
let env = oci_process.env.to_vec();
@@ -661,17 +649,12 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
}
}
// set the "HOME" env getting from "/etc/passwd", if
// there's no uid entry in /etc/passwd, set "/" as the
// home env.
if env::var_os(HOME_ENV_KEY).is_none() {
// try to set "HOME" env by uid
if let Ok(Some(user)) = User::from_uid(Uid::from_raw(guser.uid)) {
if let Ok(user_home_dir) = user.dir.into_os_string().into_string() {
env::set_var(HOME_ENV_KEY, user_home_dir);
}
}
// set default home dir as "/" if "HOME" env is still empty
if env::var_os(HOME_ENV_KEY).is_none() {
env::set_var(HOME_ENV_KEY, String::from("/"));
}
let home_dir = utils::home_dir(guser.uid).unwrap_or_else(|_| String::from("/"));
env::set_var(HOME_ENV_KEY, home_dir);
}
let exec_file = Path::new(&args[0]);
@@ -687,19 +670,10 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
let _ = unistd::close(crfd);
let _ = unistd::close(cwfd);
unistd::setsid().context("create a new session")?;
if oci_process.terminal {
cfg_if::cfg_if! {
if #[cfg(feature = "standard-oci-runtime")] {
if let Some(csocket_fd) = csocket_fd {
console::setup_master_console(csocket_fd)?;
} else {
return Err(anyhow!("failed to get console master socket fd"));
}
}
else {
unistd::setsid().context("create a new session")?;
unsafe { libc::ioctl(0, libc::TIOCSCTTY) };
}
unsafe {
libc::ioctl(0, libc::TIOCSCTTY);
}
}
@@ -731,7 +705,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
// within the container to the specified user.
// The ownership needs to match because it is created outside of
// the container and needs to be localized.
fn set_stdio_permissions(uid: Uid) -> Result<()> {
fn set_stdio_permissions(uid: libc::uid_t) -> Result<()> {
let meta = fs::metadata("/dev/null")?;
let fds = [
std::io::stdin().as_raw_fd(),
@@ -746,13 +720,19 @@ fn set_stdio_permissions(uid: Uid) -> Result<()> {
continue;
}
// According to the POSIX specification, -1 is used to indicate that owner and group
// are not to be changed. Since uid_t and gid_t are unsigned types, we have to wrap
// around to get -1.
let gid = 0u32.wrapping_sub(1);
// We only change the uid owner (as it is possible for the mount to
// prefer a different gid, and there's no reason for us to change it).
// The reason why we don't just leave the default uid=X mount setup is
// that users expect to be able to actually use their console. Without
// this code, you couldn't effectively run as a non-root user inside a
// container and also have a console set up.
unistd::fchown(*fd, Some(uid), None).with_context(|| "set stdio permissions failed")?;
let res = unsafe { libc::fchown(*fd, uid, gid) };
Errno::result(res).map_err(|e| anyhow!(e).context("set stdio permissions failed"))?;
}
Ok(())
@@ -948,14 +928,6 @@ impl BaseContainer for LinuxContainer {
let exec_path = std::env::current_exe()?;
let mut child = std::process::Command::new(exec_path);
#[allow(unused_mut)]
let mut console_name = PathBuf::from("");
#[cfg(feature = "standard-oci-runtime")]
if !self.console_socket.as_os_str().is_empty() {
console_name = self.console_socket.clone();
}
let mut child = child
.arg("init")
.stdin(child_stdin)
@@ -965,8 +937,7 @@ impl BaseContainer for LinuxContainer {
.env(NO_PIVOT, format!("{}", self.config.no_pivot_root))
.env(CRFD_FD, format!("{}", crfd))
.env(CWFD_FD, format!("{}", cwfd))
.env(CLOG_FD, format!("{}", cfd_log))
.env(CONSOLE_SOCKET_FD, console_name);
.env(CLOG_FD, format!("{}", cfd_log));
if p.init {
child = child.env(FIFO_FD, format!("{}", fifofd));
@@ -1049,7 +1020,7 @@ impl BaseContainer for LinuxContainer {
self.start(p).await?;
if init {
self.exec().await?;
self.exec()?;
self.status.transition(ContainerState::Running);
}
@@ -1092,22 +1063,12 @@ impl BaseContainer for LinuxContainer {
fs::remove_dir_all(&self.root)?;
if let Some(cgm) = self.cgroup_manager.as_mut() {
// Kill all of the processes created in this container to prevent
// the leak of some daemon process when this container shared pidns
// with the sandbox.
let pids = cgm.get_pids().context("get cgroup pids")?;
for i in pids {
if let Err(e) = signal::kill(Pid::from_raw(i), Signal::SIGKILL) {
warn!(self.logger, "kill the process {} error: {:?}", i, e);
}
}
cgm.destroy().context("destroy cgroups")?;
}
Ok(())
}
async fn exec(&mut self) -> Result<()> {
fn exec(&mut self) -> Result<()> {
let fifo = format!("{}/{}", &self.root, EXEC_FIFO_FILENAME);
let fd = fcntl::open(fifo.as_str(), OFlag::O_WRONLY, Mode::from_bits_truncate(0))?;
let data: &[u8] = &[0];
@@ -1119,26 +1080,6 @@ impl BaseContainer for LinuxContainer {
.as_secs();
self.status.transition(ContainerState::Running);
let spec = self
.config
.spec
.as_ref()
.ok_or_else(|| anyhow!("OCI spec was not found"))?;
let st = self.oci_state()?;
// run poststart hook
if spec.hooks.is_some() {
info!(self.logger, "poststart hook");
let hooks = spec
.hooks
.as_ref()
.ok_or_else(|| anyhow!("OCI hooks were not found"))?;
for h in hooks.poststart.iter() {
execute_hook(&self.logger, h, &st).await?;
}
}
unistd::close(fd)?;
Ok(())
@@ -1360,6 +1301,20 @@ async fn join_namespaces(
// notify child run prestart hooks completed
info!(logger, "notify child run prestart hook completed!");
write_async(pipe_w, SYNC_SUCCESS, "").await?;
info!(logger, "notify child parent ready to run poststart hook!");
// wait to run poststart hook
read_async(pipe_r).await?;
info!(logger, "get ready to run poststart hook!");
// run poststart hook
if spec.hooks.is_some() {
info!(logger, "poststart hook");
let hooks = spec.hooks.as_ref().unwrap();
for h in hooks.poststart.iter() {
execute_hook(&logger, h, st).await?;
}
}
}
info!(logger, "wait for child process ready to run exec");
@@ -1476,16 +1431,14 @@ impl LinuxContainer {
.unwrap()
.as_secs(),
logger: logger.new(o!("module" => "rustjail", "subsystem" => "container", "cid" => id)),
#[cfg(feature = "standard-oci-runtime")]
console_socket: Path::new("").to_path_buf(),
})
}
}
#[cfg(feature = "standard-oci-runtime")]
pub fn set_console_socket(&mut self, console_socket: &Path) -> Result<()> {
self.console_socket = console_socket.to_path_buf();
Ok(())
}
fn setgroups(grps: &[libc::gid_t]) -> Result<()> {
let ret = unsafe { libc::setgroups(grps.len(), grps.as_ptr() as *const libc::gid_t) };
Errno::result(ret).map(drop)?;
Ok(())
}
use std::fs::OpenOptions;
@@ -1519,7 +1472,7 @@ use std::process::Stdio;
use std::time::Duration;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
pub async fn execute_hook(logger: &Logger, h: &Hook, st: &OCIState) -> Result<()> {
async fn execute_hook(logger: &Logger, h: &Hook, st: &OCIState) -> Result<()> {
let logger = logger.new(o!("action" => "execute-hook"));
let binary = PathBuf::from(h.path.as_str());
@@ -1657,7 +1610,6 @@ mod tests {
use super::*;
use crate::process::Process;
use crate::skip_if_not_root;
use nix::unistd::Uid;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::os::unix::io::AsRawFd;
@@ -1803,7 +1755,7 @@ mod tests {
let old_uid = meta.uid();
let uid = 1000;
set_stdio_permissions(Uid::from_raw(uid)).unwrap();
set_stdio_permissions(uid).unwrap();
let meta = fs::metadata("/dev/stdin").unwrap();
assert_eq!(meta.uid(), uid);
@@ -1815,7 +1767,7 @@ mod tests {
assert_eq!(meta.uid(), uid);
// restore the uid
set_stdio_permissions(Uid::from_raw(old_uid)).unwrap();
set_stdio_permissions(old_uid).unwrap();
}
#[test]
@@ -2096,10 +2048,9 @@ mod tests {
assert!(ret.is_ok(), "Expecting Ok, Got {:?}", ret);
}
#[tokio::test]
async fn test_linuxcontainer_exec() {
let (c, _dir) = new_linux_container();
let ret = c.unwrap().exec().await;
#[test]
fn test_linuxcontainer_exec() {
let ret = new_linux_container_and_then(|mut c: LinuxContainer| c.exec());
assert!(ret.is_err(), "Expecting Err, Got {:?}", ret);
}

View File

@@ -30,8 +30,6 @@ extern crate regex;
pub mod capabilities;
pub mod cgroups;
#[cfg(feature = "standard-oci-runtime")]
pub mod console;
pub mod container;
pub mod mount;
pub mod pipestream;
@@ -41,6 +39,7 @@ pub mod seccomp;
pub mod specconv;
pub mod sync;
pub mod sync_with_async;
pub mod utils;
pub mod validator;
use std::collections::HashMap;
@@ -352,12 +351,13 @@ fn seccomp_grpc_to_oci(sec: &grpc::LinuxSeccomp) -> oci::LinuxSeccomp {
for sys in sec.Syscalls.iter() {
let mut args = Vec::new();
let errno_ret: u32;
let errno_ret: u32 = if sys.has_errnoret() {
sys.get_errnoret()
if sys.has_errnoret() {
errno_ret = sys.get_errnoret();
} else {
libc::EPERM as u32
};
errno_ret = libc::EPERM as u32;
}
for arg in sys.Args.iter() {
args.push(oci::LinuxSeccompArg {
@@ -513,7 +513,6 @@ pub fn grpc_to_oci(grpc: &grpc::Spec) -> oci::Spec {
#[cfg(test)]
mod tests {
use super::*;
#[macro_export]
macro_rules! skip_if_not_root {
() => {
@@ -523,595 +522,4 @@ mod tests {
}
};
}
// Parameters:
//
// 1: expected Result
// 2: actual Result
// 3: string used to identify the test on error
#[macro_export]
macro_rules! assert_result {
($expected_result:expr, $actual_result:expr, $msg:expr) => {
if $expected_result.is_ok() {
let expected_value = $expected_result.as_ref().unwrap();
let actual_value = $actual_result.unwrap();
assert!(*expected_value == actual_value, "{}", $msg);
} else {
assert!($actual_result.is_err(), "{}", $msg);
let expected_error = $expected_result.as_ref().unwrap_err();
let expected_error_msg = format!("{:?}", expected_error);
let actual_error_msg = format!("{:?}", $actual_result.unwrap_err());
assert!(expected_error_msg == actual_error_msg, "{}", $msg);
}
};
}
#[test]
fn test_process_grpc_to_oci() {
#[derive(Debug)]
struct TestData {
grpcproc: grpc::Process,
result: oci::Process,
}
let tests = &[
TestData {
// All fields specified
grpcproc: grpc::Process {
Terminal: true,
ConsoleSize: protobuf::SingularPtrField::<grpc::Box>::some(grpc::Box {
Height: 123,
Width: 456,
..Default::default()
}),
User: protobuf::SingularPtrField::<grpc::User>::some(grpc::User {
UID: 1234,
GID: 5678,
AdditionalGids: Vec::from([910, 1112]),
Username: String::from("username"),
..Default::default()
}),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([String::from("env")])),
Cwd: String::from("cwd"),
Capabilities: protobuf::SingularPtrField::some(grpc::LinuxCapabilities {
Bounding: protobuf::RepeatedField::from(Vec::from([String::from("bnd")])),
Effective: protobuf::RepeatedField::from(Vec::from([String::from("eff")])),
Inheritable: protobuf::RepeatedField::from(Vec::from([String::from(
"inher",
)])),
Permitted: protobuf::RepeatedField::from(Vec::from([String::from("perm")])),
Ambient: protobuf::RepeatedField::from(Vec::from([String::from("amb")])),
..Default::default()
}),
Rlimits: protobuf::RepeatedField::from(Vec::from([
grpc::POSIXRlimit {
Type: String::from("r#type"),
Hard: 123,
Soft: 456,
..Default::default()
},
grpc::POSIXRlimit {
Type: String::from("r#type2"),
Hard: 789,
Soft: 1011,
..Default::default()
},
])),
NoNewPrivileges: true,
ApparmorProfile: String::from("apparmor profile"),
OOMScoreAdj: 123456,
SelinuxLabel: String::from("Selinux Label"),
..Default::default()
},
result: oci::Process {
terminal: true,
console_size: Some(oci::Box {
height: 123,
width: 456,
}),
user: oci::User {
uid: 1234,
gid: 5678,
additional_gids: Vec::from([910, 1112]),
username: String::from("username"),
},
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env")]),
cwd: String::from("cwd"),
capabilities: Some(oci::LinuxCapabilities {
bounding: Vec::from([String::from("bnd")]),
effective: Vec::from([String::from("eff")]),
inheritable: Vec::from([String::from("inher")]),
permitted: Vec::from([String::from("perm")]),
ambient: Vec::from([String::from("amb")]),
}),
rlimits: Vec::from([
oci::PosixRlimit {
r#type: String::from("r#type"),
hard: 123,
soft: 456,
},
oci::PosixRlimit {
r#type: String::from("r#type2"),
hard: 789,
soft: 1011,
},
]),
no_new_privileges: true,
apparmor_profile: String::from("apparmor profile"),
oom_score_adj: Some(123456),
selinux_label: String::from("Selinux Label"),
},
},
TestData {
// None ConsoleSize
grpcproc: grpc::Process {
ConsoleSize: protobuf::SingularPtrField::<grpc::Box>::none(),
OOMScoreAdj: 0,
..Default::default()
},
result: oci::Process {
console_size: None,
oom_score_adj: Some(0),
..Default::default()
},
},
TestData {
// None User
grpcproc: grpc::Process {
User: protobuf::SingularPtrField::<grpc::User>::none(),
OOMScoreAdj: 0,
..Default::default()
},
result: oci::Process {
user: oci::User {
uid: 0,
gid: 0,
additional_gids: vec![],
username: String::from(""),
},
oom_score_adj: Some(0),
..Default::default()
},
},
TestData {
// None Capabilities
grpcproc: grpc::Process {
Capabilities: protobuf::SingularPtrField::none(),
OOMScoreAdj: 0,
..Default::default()
},
result: oci::Process {
capabilities: None,
oom_score_adj: Some(0),
..Default::default()
},
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = process_grpc_to_oci(&d.grpcproc);
let msg = format!("{}, result: {:?}", msg, result);
assert_eq!(d.result, result, "{}", msg);
}
}
#[test]
fn test_root_grpc_to_oci() {
#[derive(Debug)]
struct TestData {
grpcroot: grpc::Root,
result: oci::Root,
}
let tests = &[
TestData {
// Default fields
grpcroot: grpc::Root {
..Default::default()
},
result: oci::Root {
..Default::default()
},
},
TestData {
// Specified fields, readonly false
grpcroot: grpc::Root {
Path: String::from("path"),
Readonly: false,
..Default::default()
},
result: oci::Root {
path: String::from("path"),
readonly: false,
..Default::default()
},
},
TestData {
// Specified fields, readonly true
grpcroot: grpc::Root {
Path: String::from("path"),
Readonly: true,
..Default::default()
},
result: oci::Root {
path: String::from("path"),
readonly: true,
..Default::default()
},
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = root_grpc_to_oci(&d.grpcroot);
let msg = format!("{}, result: {:?}", msg, result);
assert_eq!(d.result, result, "{}", msg);
}
}
#[test]
fn test_hooks_grpc_to_oci() {
#[derive(Debug)]
struct TestData {
grpchooks: grpc::Hooks,
result: oci::Hooks,
}
let tests = &[
TestData {
// Default fields
grpchooks: grpc::Hooks {
..Default::default()
},
result: oci::Hooks {
..Default::default()
},
},
TestData {
// All specified
grpchooks: grpc::Hooks {
Prestart: protobuf::RepeatedField::from(Vec::from([
grpc::Hook {
Path: String::from("prestartpath"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env1"),
String::from("env2"),
])),
Timeout: 10,
..Default::default()
},
grpc::Hook {
Path: String::from("prestartpath2"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg3"),
String::from("arg4"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env3"),
String::from("env4"),
])),
Timeout: 25,
..Default::default()
},
])),
Poststart: protobuf::RepeatedField::from(Vec::from([grpc::Hook {
Path: String::from("poststartpath"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env1"),
String::from("env2"),
])),
Timeout: 10,
..Default::default()
}])),
Poststop: protobuf::RepeatedField::from(Vec::from([grpc::Hook {
Path: String::from("poststoppath"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env1"),
String::from("env2"),
])),
Timeout: 10,
..Default::default()
}])),
..Default::default()
},
result: oci::Hooks {
prestart: Vec::from([
oci::Hook {
path: String::from("prestartpath"),
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env1"), String::from("env2")]),
timeout: Some(10),
},
oci::Hook {
path: String::from("prestartpath2"),
args: Vec::from([String::from("arg3"), String::from("arg4")]),
env: Vec::from([String::from("env3"), String::from("env4")]),
timeout: Some(25),
},
]),
poststart: Vec::from([oci::Hook {
path: String::from("poststartpath"),
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env1"), String::from("env2")]),
timeout: Some(10),
}]),
poststop: Vec::from([oci::Hook {
path: String::from("poststoppath"),
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env1"), String::from("env2")]),
timeout: Some(10),
}]),
},
},
TestData {
// Prestart empty
grpchooks: grpc::Hooks {
Prestart: protobuf::RepeatedField::from(Vec::from([])),
Poststart: protobuf::RepeatedField::from(Vec::from([grpc::Hook {
Path: String::from("poststartpath"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env1"),
String::from("env2"),
])),
Timeout: 10,
..Default::default()
}])),
Poststop: protobuf::RepeatedField::from(Vec::from([grpc::Hook {
Path: String::from("poststoppath"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env1"),
String::from("env2"),
])),
Timeout: 10,
..Default::default()
}])),
..Default::default()
},
result: oci::Hooks {
prestart: Vec::from([]),
poststart: Vec::from([oci::Hook {
path: String::from("poststartpath"),
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env1"), String::from("env2")]),
timeout: Some(10),
}]),
poststop: Vec::from([oci::Hook {
path: String::from("poststoppath"),
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env1"), String::from("env2")]),
timeout: Some(10),
}]),
},
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = hooks_grpc_to_oci(&d.grpchooks);
let msg = format!("{}, result: {:?}", msg, result);
assert_eq!(d.result, result, "{}", msg);
}
}
#[test]
fn test_mount_grpc_to_oci() {
#[derive(Debug)]
struct TestData {
grpcmount: grpc::Mount,
result: oci::Mount,
}
let tests = &[
TestData {
// Default fields
grpcmount: grpc::Mount {
..Default::default()
},
result: oci::Mount {
..Default::default()
},
},
TestData {
grpcmount: grpc::Mount {
destination: String::from("destination"),
source: String::from("source"),
field_type: String::from("fieldtype"),
options: protobuf::RepeatedField::from(Vec::from([
String::from("option1"),
String::from("option2"),
])),
..Default::default()
},
result: oci::Mount {
destination: String::from("destination"),
source: String::from("source"),
r#type: String::from("fieldtype"),
options: Vec::from([String::from("option1"), String::from("option2")]),
},
},
TestData {
grpcmount: grpc::Mount {
destination: String::from("destination"),
source: String::from("source"),
field_type: String::from("fieldtype"),
options: protobuf::RepeatedField::from(Vec::new()),
..Default::default()
},
result: oci::Mount {
destination: String::from("destination"),
source: String::from("source"),
r#type: String::from("fieldtype"),
options: Vec::new(),
},
},
TestData {
grpcmount: grpc::Mount {
destination: String::new(),
source: String::from("source"),
field_type: String::from("fieldtype"),
options: protobuf::RepeatedField::from(Vec::from([String::from("option1")])),
..Default::default()
},
result: oci::Mount {
destination: String::new(),
source: String::from("source"),
r#type: String::from("fieldtype"),
options: Vec::from([String::from("option1")]),
},
},
TestData {
grpcmount: grpc::Mount {
destination: String::from("destination"),
source: String::from("source"),
field_type: String::new(),
options: protobuf::RepeatedField::from(Vec::from([String::from("option1")])),
..Default::default()
},
result: oci::Mount {
destination: String::from("destination"),
source: String::from("source"),
r#type: String::new(),
options: Vec::from([String::from("option1")]),
},
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = mount_grpc_to_oci(&d.grpcmount);
let msg = format!("{}, result: {:?}", msg, result);
assert_eq!(d.result, result, "{}", msg);
}
}
#[test]
fn test_hook_grpc_to_oci<'a>() {
#[derive(Debug)]
struct TestData<'a> {
grpchook: &'a [grpc::Hook],
result: Vec<oci::Hook>,
}
let tests = &[
TestData {
// Default fields
grpchook: &[
grpc::Hook {
Timeout: 0,
..Default::default()
},
grpc::Hook {
Timeout: 0,
..Default::default()
},
],
result: vec![
oci::Hook {
timeout: Some(0),
..Default::default()
},
oci::Hook {
timeout: Some(0),
..Default::default()
},
],
},
TestData {
// Specified fields
grpchook: &[
grpc::Hook {
Path: String::from("path"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg1"),
String::from("arg2"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env1"),
String::from("env2"),
])),
Timeout: 10,
..Default::default()
},
grpc::Hook {
Path: String::from("path2"),
Args: protobuf::RepeatedField::from(Vec::from([
String::from("arg3"),
String::from("arg4"),
])),
Env: protobuf::RepeatedField::from(Vec::from([
String::from("env3"),
String::from("env4"),
])),
Timeout: 20,
..Default::default()
},
],
result: vec![
oci::Hook {
path: String::from("path"),
args: Vec::from([String::from("arg1"), String::from("arg2")]),
env: Vec::from([String::from("env1"), String::from("env2")]),
timeout: Some(10),
},
oci::Hook {
path: String::from("path2"),
args: Vec::from([String::from("arg3"), String::from("arg4")]),
env: Vec::from([String::from("env3"), String::from("env4")]),
timeout: Some(20),
},
],
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = hook_grpc_to_oci(d.grpchook);
let msg = format!("{}, result: {:?}", msg, result);
assert_eq!(d.result, result, "{}", msg);
}
}
}

View File

@@ -32,21 +32,16 @@ use crate::log_child;
// Info reveals information about a particular mounted filesystem. This
// struct is populated from the content in the /proc/<pid>/mountinfo file.
#[derive(std::fmt::Debug, PartialEq)]
#[derive(std::fmt::Debug)]
pub struct Info {
mount_point: String,
optional: String,
fstype: String,
}
const MOUNTINFO_FORMAT: &str = "{d} {d} {d}:{d} {} {} {} {}";
const MOUNTINFO_PATH: &str = "/proc/self/mountinfo";
const MOUNTINFOFORMAT: &str = "{d} {d} {d}:{d} {} {} {} {}";
const PROC_PATH: &str = "/proc";
const ERR_FAILED_PARSE_MOUNTINFO: &str = "failed to parse mountinfo file";
const ERR_FAILED_PARSE_MOUNTINFO_FINAL_FIELDS: &str =
"failed to parse final fields in mountinfo file";
// since libc didn't defined this const for musl, thus redefined it here.
#[cfg(all(target_os = "linux", target_env = "gnu", not(target_arch = "s390x")))]
const PROC_SUPER_MAGIC: libc::c_long = 0x00009fa0;
@@ -523,7 +518,7 @@ pub fn pivot_rootfs<P: ?Sized + NixPath + std::fmt::Debug>(path: &P) -> Result<(
}
fn rootfs_parent_mount_private(path: &str) -> Result<()> {
let mount_infos = parse_mount_table(MOUNTINFO_PATH)?;
let mount_infos = parse_mount_table()?;
let mut max_len = 0;
let mut mount_point = String::from("");
@@ -551,8 +546,8 @@ fn rootfs_parent_mount_private(path: &str) -> Result<()> {
// Parse /proc/self/mountinfo because comparing Dev and ino does not work from
// bind mounts
fn parse_mount_table(mountinfo_path: &str) -> Result<Vec<Info>> {
let file = File::open(mountinfo_path)?;
fn parse_mount_table() -> Result<Vec<Info>> {
let file = File::open("/proc/self/mountinfo")?;
let reader = BufReader::new(file);
let mut infos = Vec::new();
@@ -574,7 +569,7 @@ fn parse_mount_table(mountinfo_path: &str) -> Result<Vec<Info>> {
let (_id, _parent, _major, _minor, _root, mount_point, _opts, optional) = scan_fmt!(
&line,
MOUNTINFO_FORMAT,
MOUNTINFOFORMAT,
i32,
i32,
i32,
@@ -583,17 +578,12 @@ fn parse_mount_table(mountinfo_path: &str) -> Result<Vec<Info>> {
String,
String,
String
)
.map_err(|_| anyhow!(ERR_FAILED_PARSE_MOUNTINFO))?;
)?;
let fields: Vec<&str> = line.split(" - ").collect();
if fields.len() == 2 {
let final_fields: Vec<&str> = fields[1].split_whitespace().collect();
if final_fields.len() != 3 {
return Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO_FINAL_FIELDS));
}
let fstype = final_fields[0].to_string();
let (fstype, _source, _vfs_opts) =
scan_fmt!(fields[1], "{} {} {}", String, String, String)?;
let mut optional_new = String::new();
if optional != "-" {
@@ -608,7 +598,7 @@ fn parse_mount_table(mountinfo_path: &str) -> Result<Vec<Info>> {
infos.push(info);
} else {
return Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO));
return Err(anyhow!("failed to parse mount info file".to_string()));
}
}
@@ -629,7 +619,7 @@ fn chroot<P: ?Sized + NixPath>(_path: &P) -> Result<(), nix::Error> {
pub fn ms_move_root(rootfs: &str) -> Result<bool> {
unistd::chdir(rootfs)?;
let mount_infos = parse_mount_table(MOUNTINFO_PATH)?;
let mount_infos = parse_mount_table()?;
let root_path = Path::new(rootfs);
let abs_root_buf = root_path.absolutize()?;
@@ -1056,12 +1046,10 @@ fn readonly_path(path: &str) -> Result<()> {
#[cfg(test)]
mod tests {
use super::*;
use crate::assert_result;
use crate::skip_if_not_root;
use std::fs::create_dir;
use std::fs::create_dir_all;
use std::fs::remove_dir_all;
use std::io;
use std::os::unix::fs;
use std::os::unix::io::AsRawFd;
use tempfile::tempdir;
@@ -1298,113 +1286,6 @@ mod tests {
let ret = stat::stat(path);
assert!(ret.is_ok(), "Should pass. Got: {:?}", ret);
}
#[test]
fn test_mount_from() {
#[derive(Debug)]
struct TestData<'a> {
source: &'a str,
destination: &'a str,
r#type: &'a str,
flags: MsFlags,
error_contains: &'a str,
// if true, a directory will be created at path in source
make_source_directory: bool,
// if true, a file will be created at path in source
make_source_file: bool,
}
impl Default for TestData<'_> {
fn default() -> Self {
TestData {
source: "tmp",
destination: "dest",
r#type: "tmpfs",
flags: MsFlags::empty(),
error_contains: "",
make_source_directory: true,
make_source_file: false,
}
}
}
let tests = &[
TestData {
..Default::default()
},
TestData {
flags: MsFlags::MS_BIND,
..Default::default()
},
TestData {
r#type: "bind",
..Default::default()
},
TestData {
r#type: "cgroup2",
..Default::default()
},
TestData {
r#type: "bind",
make_source_directory: false,
error_contains: &format!("{}", std::io::Error::from_raw_os_error(libc::ENOENT)),
..Default::default()
},
TestData {
r#type: "bind",
make_source_directory: false,
make_source_file: true,
..Default::default()
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let tempdir = tempdir().unwrap();
let (rfd, wfd) = unistd::pipe2(OFlag::O_CLOEXEC).unwrap();
defer!({
unistd::close(rfd).unwrap();
unistd::close(wfd).unwrap();
});
let source_path = tempdir.path().join(d.source).to_str().unwrap().to_string();
if d.make_source_directory {
std::fs::create_dir_all(&source_path).unwrap();
} else if d.make_source_file {
std::fs::write(&source_path, []).unwrap();
}
let mount = Mount {
source: source_path,
destination: d.destination.to_string(),
r#type: d.r#type.to_string(),
options: vec![],
};
let result = mount_from(
wfd,
&mount,
tempdir.path().to_str().unwrap(),
d.flags,
"",
"",
);
let msg = format!("{}: result: {:?}", msg, result);
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
} else {
assert!(result.is_err(), "{}", msg);
let error_msg = format!("{}", result.unwrap_err());
assert!(error_msg.contains(d.error_contains), "{}", msg);
}
}
}
#[test]
fn test_check_proc_mount() {
let mount = oci::Mount {
@@ -1520,121 +1401,6 @@ mod tests {
}
}
#[test]
fn test_parse_mount_table() {
#[derive(Debug)]
struct TestData<'a> {
mountinfo_data: Option<&'a str>,
result: Result<Vec<Info>>,
}
let tests = &[
TestData {
mountinfo_data: Some(
"22 933 0:20 / /sys rw,nodev shared:2 - sysfs sysfs rw,noexec",
),
result: Ok(vec![Info {
mount_point: "/sys".to_string(),
optional: "shared:2".to_string(),
fstype: "sysfs".to_string(),
}]),
},
TestData {
mountinfo_data: Some(
r#"22 933 0:20 / /sys rw,nodev - sysfs sysfs rw,noexec
81 13 1:2 / /tmp/dir rw shared:2 - tmpfs tmpfs rw"#,
),
result: Ok(vec![
Info {
mount_point: "/sys".to_string(),
optional: "".to_string(),
fstype: "sysfs".to_string(),
},
Info {
mount_point: "/tmp/dir".to_string(),
optional: "shared:2".to_string(),
fstype: "tmpfs".to_string(),
},
]),
},
TestData {
mountinfo_data: Some(
"22 933 0:20 /foo\040-\040bar /sys rw,nodev shared:2 - sysfs sysfs rw,noexec",
),
result: Ok(vec![Info {
mount_point: "/sys".to_string(),
optional: "shared:2".to_string(),
fstype: "sysfs".to_string(),
}]),
},
TestData {
mountinfo_data: Some(""),
result: Ok(vec![]),
},
TestData {
mountinfo_data: Some("invalid line data - sysfs sysfs rw"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: Some("22 96 0:21 / /sys rw,noexec - sysfs"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO_FINAL_FIELDS)),
},
TestData {
mountinfo_data: Some("22 96 0:21 / /sys rw,noexec - sysfs sysfs rw rw"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO_FINAL_FIELDS)),
},
TestData {
mountinfo_data: Some("22 96 0:21 / /sys rw,noexec shared:2 - x - x"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: Some("-"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: Some("--"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: Some("- -"),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: Some(" - "),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: Some(
r#"22 933 0:20 / /sys rw,nodev - sysfs sysfs rw,noexec
invalid line
81 13 1:2 / /tmp/dir rw shared:2 - tmpfs tmpfs rw"#,
),
result: Err(anyhow!(ERR_FAILED_PARSE_MOUNTINFO)),
},
TestData {
mountinfo_data: None,
result: Err(anyhow!(io::Error::from_raw_os_error(libc::ENOENT))),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let tempdir = tempdir().unwrap();
let mountinfo_path = tempdir.path().join("mountinfo");
if let Some(mountinfo_data) = d.mountinfo_data {
std::fs::write(&mountinfo_path, mountinfo_data).unwrap();
}
let result = parse_mount_table(mountinfo_path.to_str().unwrap());
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
}
#[test]
fn test_dev_rel_path() {
// Valid device paths

View File

@@ -5,7 +5,7 @@
use libc::pid_t;
use std::fs::File;
use std::os::unix::io::{AsRawFd, RawFd};
use std::os::unix::io::RawFd;
use tokio::sync::mpsc::Sender;
use nix::errno::Errno;
@@ -137,25 +137,19 @@ impl Process {
info!(logger, "before create console socket!");
if !p.tty {
if cfg!(feature = "standard-oci-runtime") {
p.stdin = Some(std::io::stdin().as_raw_fd());
p.stdout = Some(std::io::stdout().as_raw_fd());
p.stderr = Some(std::io::stderr().as_raw_fd());
} else {
info!(logger, "created console socket!");
info!(logger, "created console socket!");
let (stdin, pstdin) = unistd::pipe2(OFlag::O_CLOEXEC)?;
p.parent_stdin = Some(pstdin);
p.stdin = Some(stdin);
let (stdin, pstdin) = unistd::pipe2(OFlag::O_CLOEXEC)?;
p.parent_stdin = Some(pstdin);
p.stdin = Some(stdin);
let (pstdout, stdout) = create_extended_pipe(OFlag::O_CLOEXEC, pipe_size)?;
p.parent_stdout = Some(pstdout);
p.stdout = Some(stdout);
let (pstdout, stdout) = create_extended_pipe(OFlag::O_CLOEXEC, pipe_size)?;
p.parent_stdout = Some(pstdout);
p.stdout = Some(stdout);
let (pstderr, stderr) = create_extended_pipe(OFlag::O_CLOEXEC, pipe_size)?;
p.parent_stderr = Some(pstderr);
p.stderr = Some(stderr);
}
let (pstderr, stderr) = create_extended_pipe(OFlag::O_CLOEXEC, pipe_size)?;
p.parent_stderr = Some(pstderr);
p.stderr = Some(stderr);
}
Ok(p)
}
@@ -290,11 +284,5 @@ mod tests {
// group of the calling process.
process.pid = 0;
assert!(process.signal(libc::SIGCONT).is_ok());
if cfg!(feature = "standard-oci-runtime") {
assert_eq!(process.stdin.unwrap(), std::io::stdin().as_raw_fd());
assert_eq!(process.stdout.unwrap(), std::io::stdout().as_raw_fd());
assert_eq!(process.stderr.unwrap(), std::io::stderr().as_raw_fd());
}
}
}

View File

@@ -26,15 +26,12 @@ fn get_rule_conditions(args: &[LinuxSeccompArg]) -> Result<Vec<ScmpArgCompare>>
return Err(anyhow!("seccomp opreator is required"));
}
let mut op = ScmpCompareOp::from_str(&arg.op)?;
let mut value = arg.value;
// For SCMP_CMP_MASKED_EQ, arg.value is the mask and arg.value_two is the value
if op == ScmpCompareOp::MaskedEqual(u64::default()) {
op = ScmpCompareOp::MaskedEqual(arg.value);
value = arg.value_two;
}
let cond = ScmpArgCompare::new(arg.index, op, value);
let cond = ScmpArgCompare::new(
arg.index,
ScmpCompareOp::from_str(&arg.op)?,
arg.value,
Some(arg.value_two),
);
conditions.push(cond);
}
@@ -47,7 +44,7 @@ pub fn get_unknown_syscalls(scmp: &LinuxSeccomp) -> Option<Vec<String>> {
for syscall in &scmp.syscalls {
for name in &syscall.names {
if ScmpSyscall::from_name(name).is_err() {
if get_syscall_from_name(name, None).is_err() {
unknown_syscalls.push(name.to_string());
}
}
@@ -63,7 +60,7 @@ pub fn get_unknown_syscalls(scmp: &LinuxSeccomp) -> Option<Vec<String>> {
// init_seccomp creates a seccomp filter and loads it for the current process
// including all the child processes.
pub fn init_seccomp(scmp: &LinuxSeccomp) -> Result<()> {
let def_action = ScmpAction::from_str(scmp.default_action.as_str(), Some(libc::EPERM as i32))?;
let def_action = ScmpAction::from_str(scmp.default_action.as_str(), Some(libc::EPERM as u32))?;
// Create a new filter context
let mut filter = ScmpFilterContext::new_filter(def_action)?;
@@ -75,7 +72,7 @@ pub fn init_seccomp(scmp: &LinuxSeccomp) -> Result<()> {
}
// Unset no new privileges bit
filter.set_ctl_nnp(false)?;
filter.set_no_new_privs_bit(false)?;
// Add a rule for each system call
for syscall in &scmp.syscalls {
@@ -83,13 +80,13 @@ pub fn init_seccomp(scmp: &LinuxSeccomp) -> Result<()> {
return Err(anyhow!("syscall name is required"));
}
let action = ScmpAction::from_str(&syscall.action, Some(syscall.errno_ret as i32))?;
let action = ScmpAction::from_str(&syscall.action, Some(syscall.errno_ret))?;
if action == def_action {
continue;
}
for name in &syscall.names {
let syscall_num = match ScmpSyscall::from_name(name) {
let syscall_num = match get_syscall_from_name(name, None) {
Ok(num) => num,
Err(_) => {
// If we cannot resolve the given system call, we assume it is not supported
@@ -99,10 +96,10 @@ pub fn init_seccomp(scmp: &LinuxSeccomp) -> Result<()> {
};
if syscall.args.is_empty() {
filter.add_rule(action, syscall_num)?;
filter.add_rule(action, syscall_num, None)?;
} else {
let conditions = get_rule_conditions(&syscall.args)?;
filter.add_rule_conditional(action, syscall_num, &conditions)?;
filter.add_rule(action, syscall_num, Some(&conditions))?;
}
}
}

View File

@@ -5,7 +5,7 @@
use oci::Spec;
#[derive(Serialize, Deserialize, Debug, Default, Clone)]
#[derive(Debug)]
pub struct CreateOpts {
pub cgroup_name: String,
pub use_systemd_cgroup: bool,

View File

@@ -0,0 +1,119 @@
// Copyright (c) 2021 Ant Group
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Context, Result};
use libc::gid_t;
use libc::uid_t;
use std::fs::File;
use std::io::{BufRead, BufReader};
const PASSWD_FILE: &str = "/etc/passwd";
// An entry from /etc/passwd
#[derive(Debug, PartialEq, PartialOrd)]
pub struct PasswdEntry {
// username
pub name: String,
// user password
pub passwd: String,
// user id
pub uid: uid_t,
// group id
pub gid: gid_t,
// user Information
pub gecos: String,
// home directory
pub dir: String,
// User's Shell
pub shell: String,
}
// get an entry for a given `uid` from `/etc/passwd`
fn get_entry_by_uid(uid: uid_t, path: &str) -> Result<PasswdEntry> {
let file = File::open(path).with_context(|| format!("open file {}", path))?;
let mut reader = BufReader::new(file);
let mut line = String::new();
loop {
line.clear();
match reader.read_line(&mut line) {
Ok(0) => return Err(anyhow!(format!("file {} is empty", path))),
Ok(_) => (),
Err(e) => {
return Err(anyhow!(format!(
"failed to read file {} with {:?}",
path, e
)))
}
}
if line.starts_with('#') {
continue;
}
let parts: Vec<&str> = line.split(':').map(|part| part.trim()).collect();
if parts.len() != 7 {
continue;
}
match parts[2].parse() {
Err(_e) => continue,
Ok(new_uid) => {
if uid != new_uid {
continue;
}
let entry = PasswdEntry {
name: parts[0].to_string(),
passwd: parts[1].to_string(),
uid: new_uid,
gid: parts[3].parse().unwrap_or(0),
gecos: parts[4].to_string(),
dir: parts[5].to_string(),
shell: parts[6].to_string(),
};
return Ok(entry);
}
}
}
}
pub fn home_dir(uid: uid_t) -> Result<String> {
get_entry_by_uid(uid, PASSWD_FILE).map(|entry| entry.dir)
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write;
use tempfile::Builder;
#[test]
fn test_get_entry_by_uid() {
let tmpdir = Builder::new().tempdir().unwrap();
let tmpdir_path = tmpdir.path().to_str().unwrap();
let temp_passwd = format!("{}/passwd", tmpdir_path);
let mut tempf = File::create(temp_passwd.as_str()).unwrap();
writeln!(tempf, "root:x:0:0:root:/root0:/bin/bash").unwrap();
writeln!(tempf, "root:x:1:0:root:/root1:/bin/bash").unwrap();
writeln!(tempf, "#root:x:1:0:root:/rootx:/bin/bash").unwrap();
writeln!(tempf, "root:x:2:0:root:/root2:/bin/bash").unwrap();
writeln!(tempf, "root:x:3:0:root:/root3").unwrap();
writeln!(tempf, "root:x:3:0:root:/root3:/bin/bash").unwrap();
let entry = get_entry_by_uid(0, temp_passwd.as_str()).unwrap();
assert_eq!(entry.dir.as_str(), "/root0");
let entry = get_entry_by_uid(1, temp_passwd.as_str()).unwrap();
assert_eq!(entry.dir.as_str(), "/root1");
let entry = get_entry_by_uid(2, temp_passwd.as_str()).unwrap();
assert_eq!(entry.dir.as_str(), "/root2");
let entry = get_entry_by_uid(3, temp_passwd.as_str()).unwrap();
assert_eq!(entry.dir.as_str(), "/root3");
}
}

View File

@@ -27,6 +27,7 @@ use nix::unistd::{self, dup, Pid};
use std::env;
use std::ffi::OsStr;
use std::fs::{self, File};
use std::os::unix::ffi::OsStrExt;
use std::os::unix::fs as unixfs;
use std::os::unix::io::AsRawFd;
use std::path::Path;
@@ -124,7 +125,9 @@ fn announce(logger: &Logger, config: &AgentConfig) {
// output to the vsock port specified, or stdout.
async fn create_logger_task(rfd: RawFd, vsock_port: u32, shutdown: Receiver<bool>) -> Result<()> {
let mut reader = PipeStream::from_fd(rfd);
let mut writer: Box<dyn AsyncWrite + Unpin + Send> = if vsock_port > 0 {
let mut writer: Box<dyn AsyncWrite + Unpin + Send>;
if vsock_port > 0 {
let listenfd = socket::socket(
AddressFamily::Vsock,
SockType::Stream,
@@ -136,10 +139,10 @@ async fn create_logger_task(rfd: RawFd, vsock_port: u32, shutdown: Receiver<bool
socket::bind(listenfd, &addr)?;
socket::listen(listenfd, 1)?;
Box::new(util::get_vsock_stream(listenfd).await?)
writer = Box::new(util::get_vsock_stream(listenfd).await?);
} else {
Box::new(tokio::io::stdout())
};
writer = Box::new(tokio::io::stdout());
}
let _ = util::interruptable_io_copier(&mut reader, &mut writer, shutdown).await;
@@ -381,13 +384,27 @@ fn init_agent_as_init(logger: &Logger, unified_cgroup_hierarchy: bool) -> Result
let contents_array: Vec<&str> = contents.split(' ').collect();
let hostname = contents_array[0].trim();
if unistd::sethostname(OsStr::new(hostname)).is_err() {
if sethostname(OsStr::new(hostname)).is_err() {
warn!(logger, "failed to set hostname");
}
Ok(())
}
#[instrument]
fn sethostname(hostname: &OsStr) -> Result<()> {
let size = hostname.len() as usize;
let result =
unsafe { libc::sethostname(hostname.as_bytes().as_ptr() as *const libc::c_char, size) };
if result != 0 {
Err(anyhow!("failed to set hostname"))
} else {
Ok(())
}
}
// The Rust standard library had suppressed the default SIGPIPE behavior,
// see https://github.com/rust-lang/rust/pull/13158.
// Since the parent's signal handler would be inherited by it's child process,
@@ -401,59 +418,3 @@ fn reset_sigpipe() {
use crate::config::AgentConfig;
use std::os::unix::io::{FromRawFd, RawFd};
#[cfg(test)]
mod tests {
use super::*;
use crate::test_utils::test_utils::TestUserType;
#[tokio::test]
async fn test_create_logger_task() {
#[derive(Debug)]
struct TestData {
vsock_port: u32,
test_user: TestUserType,
result: Result<()>,
}
let tests = &[
TestData {
// non-root user cannot use privileged vsock port
vsock_port: 1,
test_user: TestUserType::NonRootOnly,
result: Err(anyhow!(nix::errno::Errno::from_i32(libc::EACCES))),
},
TestData {
// passing vsock_port 0 causes logger task to write to stdout
vsock_port: 0,
test_user: TestUserType::Any,
result: Ok(()),
},
];
for (i, d) in tests.iter().enumerate() {
if d.test_user == TestUserType::RootOnly {
skip_if_not_root!();
} else if d.test_user == TestUserType::NonRootOnly {
skip_if_root!();
}
let msg = format!("test[{}]: {:?}", i, d);
let (rfd, wfd) = unistd::pipe2(OFlag::O_CLOEXEC).unwrap();
defer!({
// rfd is closed by the use of PipeStream in the crate_logger_task function,
// but we will attempt to close in case of a failure
let _ = unistd::close(rfd);
unistd::close(wfd).unwrap();
});
let (shutdown_tx, shutdown_rx) = channel(true);
shutdown_tx.send(true).unwrap();
let result = create_logger_task(rfd, d.vsock_port, shutdown_rx).await;
let msg = format!("{}, result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
}
}

View File

@@ -344,25 +344,25 @@ fn set_gauge_vec_meminfo(gv: &prometheus::GaugeVec, meminfo: &procfs::Meminfo) {
#[instrument]
fn set_gauge_vec_cpu_time(gv: &prometheus::GaugeVec, cpu: &str, cpu_time: &procfs::CpuTime) {
gv.with_label_values(&[cpu, "user"])
.set(cpu_time.user_ms() as f64);
.set(cpu_time.user as f64);
gv.with_label_values(&[cpu, "nice"])
.set(cpu_time.nice_ms() as f64);
.set(cpu_time.nice as f64);
gv.with_label_values(&[cpu, "system"])
.set(cpu_time.system_ms() as f64);
.set(cpu_time.system as f64);
gv.with_label_values(&[cpu, "idle"])
.set(cpu_time.idle_ms() as f64);
.set(cpu_time.idle as f64);
gv.with_label_values(&[cpu, "iowait"])
.set(cpu_time.iowait_ms().unwrap_or(0) as f64);
.set(cpu_time.iowait.unwrap_or(0) as f64);
gv.with_label_values(&[cpu, "irq"])
.set(cpu_time.irq_ms().unwrap_or(0) as f64);
.set(cpu_time.irq.unwrap_or(0) as f64);
gv.with_label_values(&[cpu, "softirq"])
.set(cpu_time.softirq_ms().unwrap_or(0) as f64);
.set(cpu_time.softirq.unwrap_or(0) as f64);
gv.with_label_values(&[cpu, "steal"])
.set(cpu_time.steal_ms().unwrap_or(0) as f64);
.set(cpu_time.steal.unwrap_or(0) as f64);
gv.with_label_values(&[cpu, "guest"])
.set(cpu_time.guest_ms().unwrap_or(0) as f64);
.set(cpu_time.guest.unwrap_or(0) as f64);
gv.with_label_values(&[cpu, "guest_nice"])
.set(cpu_time.guest_nice_ms().unwrap_or(0) as f64);
.set(cpu_time.guest_nice.unwrap_or(0) as f64);
}
#[instrument]

View File

@@ -91,11 +91,11 @@ lazy_static! {
}
#[derive(Debug, PartialEq)]
pub struct InitMount<'a> {
fstype: &'a str,
src: &'a str,
dest: &'a str,
options: Vec<&'a str>,
pub struct InitMount {
fstype: &'static str,
src: &'static str,
dest: &'static str,
options: Vec<&'static str>,
}
#[rustfmt::skip]
@@ -121,7 +121,7 @@ lazy_static!{
#[rustfmt::skip]
lazy_static! {
pub static ref INIT_ROOTFS_MOUNTS: Vec<InitMount<'static>> = vec![
pub static ref INIT_ROOTFS_MOUNTS: Vec<InitMount> = vec![
InitMount{fstype: "proc", src: "proc", dest: "/proc", options: vec!["nosuid", "nodev", "noexec"]},
InitMount{fstype: "sysfs", src: "sysfs", dest: "/sys", options: vec!["nosuid", "nodev", "noexec"]},
InitMount{fstype: "devtmpfs", src: "dev", dest: "/dev", options: vec!["nosuid"]},
@@ -840,13 +840,15 @@ pub fn get_mount_fs_type_from_file(mount_file: &str, mount_point: &str) -> Resul
return Err(anyhow!("Invalid mount point {}", mount_point));
}
let content = fs::read_to_string(mount_file)?;
let file = File::open(mount_file)?;
let reader = BufReader::new(file);
let re = Regex::new(format!("device .+ mounted on {} with fstype (.+)", mount_point).as_str())?;
// Read the file line by line using the lines() iterator from std::io::BufRead.
for (_index, line) in content.lines().enumerate() {
let capes = match re.captures(line) {
for (_index, line) in reader.lines().enumerate() {
let line = line?;
let capes = match re.captures(line.as_str()) {
Some(c) => c,
None => continue,
};
@@ -857,9 +859,8 @@ pub fn get_mount_fs_type_from_file(mount_file: &str, mount_point: &str) -> Resul
}
Err(anyhow!(
"failed to find FS type for mount point {}, mount file content: {:?}",
mount_point,
content
"failed to find FS type for mount point {}",
mount_point
))
}
@@ -868,7 +869,7 @@ pub fn get_cgroup_mounts(
logger: &Logger,
cg_path: &str,
unified_cgroup_hierarchy: bool,
) -> Result<Vec<InitMount<'static>>> {
) -> Result<Vec<InitMount>> {
// cgroup v2
// https://github.com/kata-containers/agent/blob/8c9bbadcd448c9a67690fbe11a860aaacc69813c/agent.go#L1249
if unified_cgroup_hierarchy {
@@ -1016,8 +1017,7 @@ fn parse_options(option_list: Vec<String>) -> HashMap<String, String> {
#[cfg(test)]
mod tests {
use super::*;
use crate::test_utils::test_utils::TestUserType;
use crate::{skip_if_not_root, skip_loop_by_user, skip_loop_if_not_root, skip_loop_if_root};
use crate::{skip_if_not_root, skip_loop_if_not_root, skip_loop_if_root};
use protobuf::RepeatedField;
use protocols::agent::FSGroup;
use std::fs::File;
@@ -1026,6 +1026,13 @@ mod tests {
use std::path::PathBuf;
use tempfile::tempdir;
#[derive(Debug, PartialEq)]
enum TestUserType {
RootOnly,
NonRootOnly,
Any,
}
#[test]
fn test_mount() {
#[derive(Debug)]
@@ -1111,7 +1118,11 @@ mod tests {
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
skip_loop_by_user!(msg, d.test_user);
if d.test_user == TestUserType::RootOnly {
skip_loop_if_not_root!(msg);
} else if d.test_user == TestUserType::NonRootOnly {
skip_loop_if_root!(msg);
}
let src: PathBuf;
let dest: PathBuf;
@@ -1120,7 +1131,7 @@ mod tests {
let dest_filename: String;
if !d.src.is_empty() {
src = dir.path().join(d.src);
src = dir.path().join(d.src.to_string());
src_filename = src
.to_str()
.expect("failed to convert src to filename")
@@ -1130,7 +1141,7 @@ mod tests {
}
if !d.dest.is_empty() {
dest = dir.path().join(d.dest);
dest = dir.path().join(d.dest.to_string());
dest_filename = dest
.to_str()
.expect("failed to convert dest to filename")
@@ -1581,226 +1592,6 @@ mod tests {
assert!(testfile.is_file());
}
#[test]
fn test_mount_storage() {
#[derive(Debug)]
struct TestData<'a> {
test_user: TestUserType,
storage: Storage,
error_contains: &'a str,
make_source_dir: bool,
make_mount_dir: bool,
deny_mount_permission: bool,
}
impl Default for TestData<'_> {
fn default() -> Self {
TestData {
test_user: TestUserType::Any,
storage: Storage {
mount_point: "mnt".to_string(),
source: "src".to_string(),
fstype: "tmpfs".to_string(),
..Default::default()
},
make_source_dir: true,
make_mount_dir: false,
deny_mount_permission: false,
error_contains: "",
}
}
}
let tests = &[
TestData {
test_user: TestUserType::NonRootOnly,
error_contains: "EPERM: Operation not permitted",
..Default::default()
},
TestData {
test_user: TestUserType::RootOnly,
..Default::default()
},
TestData {
storage: Storage {
mount_point: "mnt".to_string(),
source: "src".to_string(),
fstype: "bind".to_string(),
..Default::default()
},
make_source_dir: false,
make_mount_dir: true,
error_contains: "Could not create mountpoint",
..Default::default()
},
TestData {
test_user: TestUserType::NonRootOnly,
deny_mount_permission: true,
error_contains: "Could not create mountpoint",
..Default::default()
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
skip_loop_by_user!(msg, d.test_user);
let drain = slog::Discard;
let logger = slog::Logger::root(drain, o!());
let tempdir = tempdir().unwrap();
let source = tempdir.path().join(&d.storage.source);
let mount_point = tempdir.path().join(&d.storage.mount_point);
let storage = Storage {
source: source.to_str().unwrap().to_string(),
mount_point: mount_point.to_str().unwrap().to_string(),
..d.storage.clone()
};
if d.make_source_dir {
fs::create_dir_all(&storage.source).unwrap();
}
if d.make_mount_dir {
fs::create_dir_all(&storage.mount_point).unwrap();
}
if d.deny_mount_permission {
fs::set_permissions(
mount_point.parent().unwrap(),
fs::Permissions::from_mode(0o000),
)
.unwrap();
}
let result = mount_storage(&logger, &storage);
// restore permissions so tempdir can be cleaned up
if d.deny_mount_permission {
fs::set_permissions(
mount_point.parent().unwrap(),
fs::Permissions::from_mode(0o755),
)
.unwrap();
}
if result.is_ok() {
nix::mount::umount(&mount_point).unwrap();
}
let msg = format!("{}: result: {:?}", msg, result);
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
} else {
assert!(result.is_err(), "{}", msg);
let error_msg = format!("{}", result.unwrap_err());
assert!(error_msg.contains(d.error_contains), "{}", msg);
}
}
}
#[test]
fn test_mount_to_rootfs() {
#[derive(Debug)]
struct TestData<'a> {
test_user: TestUserType,
src: &'a str,
options: Vec<&'a str>,
error_contains: &'a str,
deny_mount_dir_permission: bool,
// if true src will be prepended with a temporary directory
mask_src: bool,
}
impl Default for TestData<'_> {
fn default() -> Self {
TestData {
test_user: TestUserType::Any,
src: "src",
options: vec![],
error_contains: "",
deny_mount_dir_permission: false,
mask_src: true,
}
}
}
let tests = &[
TestData {
test_user: TestUserType::NonRootOnly,
error_contains: "EPERM: Operation not permitted",
..Default::default()
},
TestData {
test_user: TestUserType::NonRootOnly,
src: "dev",
mask_src: false,
..Default::default()
},
TestData {
test_user: TestUserType::RootOnly,
..Default::default()
},
TestData {
test_user: TestUserType::NonRootOnly,
deny_mount_dir_permission: true,
error_contains: "could not create directory",
..Default::default()
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
skip_loop_by_user!(msg, d.test_user);
let drain = slog::Discard;
let logger = slog::Logger::root(drain, o!());
let tempdir = tempdir().unwrap();
let src = if d.mask_src {
tempdir.path().join(&d.src)
} else {
Path::new(d.src).to_path_buf()
};
let dest = tempdir.path().join("mnt");
let init_mount = InitMount {
fstype: "tmpfs",
src: src.to_str().unwrap(),
dest: dest.to_str().unwrap(),
options: d.options.clone(),
};
if d.deny_mount_dir_permission {
fs::set_permissions(dest.parent().unwrap(), fs::Permissions::from_mode(0o000))
.unwrap();
}
let result = mount_to_rootfs(&logger, &init_mount);
// restore permissions so tempdir can be cleaned up
if d.deny_mount_dir_permission {
fs::set_permissions(dest.parent().unwrap(), fs::Permissions::from_mode(0o755))
.unwrap();
}
if result.is_ok() && d.mask_src {
nix::mount::umount(&dest).unwrap();
}
let msg = format!("{}: result: {:?}", msg, result);
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
} else {
assert!(result.is_err(), "{}", msg);
let error_msg = format!("{}", result.unwrap_err());
assert!(error_msg.contains(d.error_contains), "{}", msg);
}
}
}
#[test]
fn test_get_pagesize_and_size_from_option() {
let expected_pagesize = 2048;
@@ -1857,57 +1648,6 @@ mod tests {
}
}
#[test]
fn test_parse_mount_flags_and_options() {
#[derive(Debug)]
struct TestData<'a> {
options_vec: Vec<&'a str>,
result: (MsFlags, &'a str),
}
let tests = &[
TestData {
options_vec: vec![],
result: (MsFlags::empty(), ""),
},
TestData {
options_vec: vec!["ro"],
result: (MsFlags::MS_RDONLY, ""),
},
TestData {
options_vec: vec!["rw"],
result: (MsFlags::empty(), ""),
},
TestData {
options_vec: vec!["ro", "rw"],
result: (MsFlags::empty(), ""),
},
TestData {
options_vec: vec!["ro", "nodev"],
result: (MsFlags::MS_RDONLY | MsFlags::MS_NODEV, ""),
},
TestData {
options_vec: vec!["option1", "nodev", "option2"],
result: (MsFlags::MS_NODEV, "option1,option2"),
},
TestData {
options_vec: vec!["rbind", "", "ro"],
result: (MsFlags::MS_BIND | MsFlags::MS_REC | MsFlags::MS_RDONLY, ""),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = parse_mount_flags_and_options(d.options_vec.clone());
let msg = format!("{}: result: {:?}", msg, result);
let expected_result = (d.result.0, d.result.1.to_owned());
assert_eq!(expected_result, result, "{}", msg);
}
}
#[test]
fn test_set_ownership() {
skip_if_not_root!();

View File

@@ -523,7 +523,7 @@ impl Handle {
.as_ref()
.map(|to| to.address.as_str()) // Extract address field
.and_then(|addr| if addr.is_empty() { None } else { Some(addr) }) // Make sure it's not empty
.ok_or_else(|| anyhow!(nix::Error::EINVAL))?;
.ok_or(anyhow!(nix::Error::EINVAL))?;
let ip = IpAddr::from_str(ip_address)
.map_err(|e| anyhow!("Failed to parse IP {}: {:?}", ip_address, e))?;
@@ -612,7 +612,7 @@ fn parse_mac_address(addr: &str) -> Result<[u8; 6]> {
// Parse single Mac address block
let mut parse_next = || -> Result<u8> {
let v = u8::from_str_radix(split.next().ok_or_else(|| anyhow!(nix::Error::EINVAL))?, 16)?;
let v = u8::from_str_radix(split.next().ok_or(anyhow!(nix::Error::EINVAL))?, 16)?;
Ok(v)
};

View File

@@ -3,7 +3,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{ensure, Result};
use anyhow::Result;
use nix::errno::Errno;
use nix::fcntl::{self, OFlag};
use nix::sys::stat::Mode;
@@ -13,7 +13,7 @@ use tracing::instrument;
pub const RNGDEV: &str = "/dev/random";
pub const RNDADDTOENTCNT: libc::c_int = 0x40045201;
pub const RNDRESEEDCRNG: libc::c_int = 0x5207;
pub const RNDRESEEDRNG: libc::c_int = 0x5207;
// Handle the differing ioctl(2) request types for different targets
#[cfg(target_env = "musl")]
@@ -24,9 +24,6 @@ type IoctlRequestType = libc::c_ulong;
#[instrument]
pub fn reseed_rng(data: &[u8]) -> Result<()> {
let len = data.len() as libc::c_long;
ensure!(len > 0, "missing entropy data");
fs::write(RNGDEV, data)?;
let f = {
@@ -44,52 +41,8 @@ pub fn reseed_rng(data: &[u8]) -> Result<()> {
};
Errno::result(ret).map(drop)?;
let ret = unsafe { libc::ioctl(f.as_raw_fd(), RNDRESEEDCRNG as IoctlRequestType, 0) };
let ret = unsafe { libc::ioctl(f.as_raw_fd(), RNDRESEEDRNG as IoctlRequestType, 0) };
Errno::result(ret).map(drop)?;
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use crate::skip_if_not_root;
use std::fs::File;
use std::io::prelude::*;
#[test]
fn test_reseed_rng() {
skip_if_not_root!();
const POOL_SIZE: usize = 512;
let mut f = File::open("/dev/urandom").unwrap();
let mut seed = [0; POOL_SIZE];
let n = f.read(&mut seed).unwrap();
// Ensure the buffer was filled.
assert!(n == POOL_SIZE);
let ret = reseed_rng(&seed);
assert!(ret.is_ok());
}
#[test]
fn test_reseed_rng_not_root() {
const POOL_SIZE: usize = 512;
let mut f = File::open("/dev/urandom").unwrap();
let mut seed = [0; POOL_SIZE];
let n = f.read(&mut seed).unwrap();
// Ensure the buffer was filled.
assert!(n == POOL_SIZE);
let ret = reseed_rng(&seed);
if nix::unistd::Uid::effective().is_root() {
assert!(ret.is_ok());
} else {
assert!(ret.is_err());
}
}
#[test]
fn test_reseed_rng_zero_data() {
let seed = [];
let ret = reseed_rng(&seed);
assert!(ret.is_err());
}
}

View File

@@ -23,9 +23,8 @@ use cgroups::freezer::FreezerState;
use oci::{LinuxNamespace, Root, Spec};
use protobuf::{Message, RepeatedField, SingularPtrField};
use protocols::agent::{
AddSwapRequest, AgentDetails, CopyFileRequest, GetIPTablesRequest, GetIPTablesResponse,
GuestDetailsResponse, Interfaces, Metrics, OOMEvent, ReadStreamResponse, Routes,
SetIPTablesRequest, SetIPTablesResponse, StatsContainerResponse, VolumeStatsRequest,
AddSwapRequest, AgentDetails, CopyFileRequest, GuestDetailsResponse, Interfaces, Metrics,
OOMEvent, ReadStreamResponse, Routes, StatsContainerResponse, VolumeStatsRequest,
WaitProcessResponse, WriteStreamResponse,
};
use protocols::csi::{VolumeCondition, VolumeStatsResponse, VolumeUsage, VolumeUsage_Unit};
@@ -76,29 +75,18 @@ use std::time::Duration;
use nix::unistd::{Gid, Uid};
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader, Write};
use std::io::{BufRead, BufReader};
use std::os::unix::fs::FileExt;
use std::path::PathBuf;
const CONTAINER_BASE: &str = "/run/kata-containers";
const MODPROBE_PATH: &str = "/sbin/modprobe";
const IPTABLES_SAVE: &str = "/sbin/iptables-save";
const IPTABLES_RESTORE: &str = "/sbin/iptables-restore";
const IP6TABLES_SAVE: &str = "/sbin/ip6tables-save";
const IP6TABLES_RESTORE: &str = "/sbin/ip6tables-restore";
const ERR_CANNOT_GET_WRITER: &str = "Cannot get writer";
const ERR_INVALID_BLOCK_SIZE: &str = "Invalid block size";
const ERR_NO_LINUX_FIELD: &str = "Spec does not contain linux field";
const ERR_NO_SANDBOX_PIDNS: &str = "Sandbox does not have sandbox_pidns";
// IPTABLES_RESTORE_WAIT_SEC is the timeout value provided to iptables-restore --wait. Since we
// don't expect other writers to iptables, we don't expect contention for grabbing the iptables
// filesystem lock. Based on this, 5 seconds seems a resonable timeout period in case the lock is
// not available.
const IPTABLES_RESTORE_WAIT_SEC: u64 = 5;
// Convenience macro to obtain the scope logger
macro_rules! sl {
() => {
@@ -249,20 +237,7 @@ impl AgentService {
info!(sl!(), "no process configurations!");
return Err(anyhow!(nix::Error::EINVAL));
};
// if starting container failed, we will do some rollback work
// to ensure no resources are leaked.
if let Err(err) = ctr.start(p).await {
error!(sl!(), "failed to start container: {:?}", err);
if let Err(e) = ctr.destroy().await {
error!(sl!(), "failed to destroy container: {:?}", e);
}
if let Err(e) = remove_container_resources(&mut s, &cid) {
error!(sl!(), "failed to remove container resources: {:?}", e);
}
return Err(err);
}
ctr.start(p).await?;
s.update_shared_pidns(&ctr)?;
s.add_container(ctr);
info!(sl!(), "created container!");
@@ -282,7 +257,7 @@ impl AgentService {
.get_container(&cid)
.ok_or_else(|| anyhow!("Invalid container id"))?;
ctr.exec().await?;
ctr.exec()?;
if sid == cid {
return Ok(());
@@ -308,6 +283,27 @@ impl AgentService {
req: protocols::agent::RemoveContainerRequest,
) -> Result<()> {
let cid = req.container_id.clone();
let mut cmounts: Vec<String> = vec![];
let mut remove_container_resources = |sandbox: &mut Sandbox| -> Result<()> {
// Find the sandbox storage used by this container
let mounts = sandbox.container_mounts.get(&cid);
if let Some(mounts) = mounts {
for m in mounts.iter() {
if sandbox.storages.get(m).is_some() {
cmounts.push(m.to_string());
}
}
}
for m in cmounts.iter() {
sandbox.unset_and_remove_sandbox_storage(m)?;
}
sandbox.container_mounts.remove(cid.as_str());
sandbox.containers.remove(cid.as_str());
Ok(())
};
if req.timeout == 0 {
let s = Arc::clone(&self.sandbox);
@@ -321,7 +317,7 @@ impl AgentService {
.destroy()
.await?;
remove_container_resources(&mut sandbox, &cid)?;
remove_container_resources(&mut sandbox)?;
return Ok(());
}
@@ -353,7 +349,8 @@ impl AgentService {
let s = self.sandbox.clone();
let mut sandbox = s.lock().await;
remove_container_resources(&mut sandbox, &cid)?;
remove_container_resources(&mut sandbox)?;
Ok(())
}
@@ -997,140 +994,6 @@ impl protocols::agent_ttrpc::AgentService for AgentService {
})
}
async fn get_ip_tables(
&self,
ctx: &TtrpcContext,
req: GetIPTablesRequest,
) -> ttrpc::Result<GetIPTablesResponse> {
trace_rpc_call!(ctx, "get_iptables", req);
is_allowed!(req);
info!(sl!(), "get_ip_tables: request received");
let cmd = if req.is_ipv6 {
IP6TABLES_SAVE
} else {
IPTABLES_SAVE
}
.to_string();
match Command::new(cmd.clone()).output() {
Ok(output) => Ok(GetIPTablesResponse {
data: output.stdout,
..Default::default()
}),
Err(e) => {
warn!(sl!(), "failed to run {}: {:?}", cmd, e.kind());
return Err(ttrpc_error!(ttrpc::Code::INTERNAL, e));
}
}
}
async fn set_ip_tables(
&self,
ctx: &TtrpcContext,
req: SetIPTablesRequest,
) -> ttrpc::Result<SetIPTablesResponse> {
trace_rpc_call!(ctx, "set_iptables", req);
is_allowed!(req);
info!(sl!(), "set_ip_tables request received");
let cmd = if req.is_ipv6 {
IP6TABLES_RESTORE
} else {
IPTABLES_RESTORE
}
.to_string();
let mut child = match Command::new(cmd.clone())
.arg("--wait")
.arg(IPTABLES_RESTORE_WAIT_SEC.to_string())
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
{
Ok(child) => child,
Err(e) => {
warn!(sl!(), "failure to spawn {}: {:?}", cmd, e.kind());
return Err(ttrpc_error!(ttrpc::Code::INTERNAL, e));
}
};
let mut stdin = match child.stdin.take() {
Some(si) => si,
None => {
println!("failed to get stdin from child");
return Err(ttrpc_error!(
ttrpc::Code::INTERNAL,
"failed to take stdin from child".to_string()
));
}
};
let (tx, rx) = tokio::sync::oneshot::channel::<i32>();
let handle = tokio::spawn(async move {
let _ = match stdin.write_all(&req.data) {
Ok(o) => o,
Err(e) => {
warn!(sl!(), "error writing stdin: {:?}", e.kind());
return;
}
};
if tx.send(1).is_err() {
warn!(sl!(), "stdin writer thread receiver dropped");
};
});
if tokio::time::timeout(Duration::from_secs(IPTABLES_RESTORE_WAIT_SEC), rx)
.await
.is_err()
{
return Err(ttrpc_error!(
ttrpc::Code::INTERNAL,
"timeout waiting for stdin writer to complete".to_string()
));
}
if handle.await.is_err() {
return Err(ttrpc_error!(
ttrpc::Code::INTERNAL,
"stdin writer thread failure".to_string()
));
}
let output = match child.wait_with_output() {
Ok(o) => o,
Err(e) => {
warn!(
sl!(),
"failure waiting for spawned {} to complete: {:?}",
cmd,
e.kind()
);
return Err(ttrpc_error!(ttrpc::Code::INTERNAL, e));
}
};
if !output.status.success() {
warn!(sl!(), "{} failed: {:?}", cmd, output.stderr);
return Err(ttrpc_error!(
ttrpc::Code::INTERNAL,
format!(
"{} failed: {:?}",
cmd,
String::from_utf8_lossy(&output.stderr)
)
));
}
Ok(SetIPTablesResponse {
data: output.stdout,
..Default::default()
})
}
async fn list_interfaces(
&self,
ctx: &TtrpcContext,
@@ -1743,35 +1606,6 @@ fn update_container_namespaces(
Ok(())
}
fn remove_container_resources(sandbox: &mut Sandbox, cid: &str) -> Result<()> {
let mut cmounts: Vec<String> = vec![];
// Find the sandbox storage used by this container
let mounts = sandbox.container_mounts.get(cid);
if let Some(mounts) = mounts {
for m in mounts.iter() {
if sandbox.storages.get(m).is_some() {
cmounts.push(m.to_string());
}
}
}
for m in cmounts.iter() {
if let Err(err) = sandbox.unset_and_remove_sandbox_storage(m) {
error!(
sl!(),
"failed to unset_and_remove_sandbox_storage for container {}, error: {:?}",
cid,
err
);
}
}
sandbox.container_mounts.remove(cid);
sandbox.containers.remove(cid);
Ok(())
}
fn append_guest_hooks(s: &Sandbox, oci: &mut Spec) -> Result<()> {
if let Some(ref guest_hooks) = s.hooks {
let mut hooks = oci.hooks.take().unwrap_or_default();
@@ -1813,25 +1647,34 @@ fn is_signal_handled(proc_status_file: &str, signum: u32) -> bool {
let sig_mask: u64 = 1 << shift_count;
let reader = BufReader::new(file);
// read lines start with SigBlk/SigIgn/SigCgt and check any match the signal mask
reader
.lines()
.flatten()
.filter(|line| {
line.starts_with("SigBlk:")
|| line.starts_with("SigIgn:")
|| line.starts_with("SigCgt:")
})
.any(|line| {
let mask_vec: Vec<&str> = line.split(':').collect();
if mask_vec.len() == 2 {
let sig_str = mask_vec[1].trim();
if let Ok(sig) = u64::from_str_radix(sig_str, 16) {
return sig & sig_mask == sig_mask;
}
// Read the file line by line using the lines() iterator from std::io::BufRead.
for (_index, line) in reader.lines().enumerate() {
let line = match line {
Ok(l) => l,
Err(_) => {
warn!(sl!(), "failed to read file {}", proc_status_file);
return false;
}
false
})
};
if line.starts_with("SigCgt:") {
let mask_vec: Vec<&str> = line.split(':').collect();
if mask_vec.len() != 2 {
warn!(sl!(), "parse the SigCgt field failed");
return false;
}
let sig_cgt_str = mask_vec[1].trim();
let sig_cgt_mask = match u64::from_str_radix(sig_cgt_str, 16) {
Ok(h) => h,
Err(_) => {
warn!(sl!(), "failed to parse the str {} to hex", sig_cgt_str);
return false;
}
};
return (sig_cgt_mask & sig_mask) == sig_mask;
}
}
false
}
fn do_mem_hotplug_by_probe(addrs: &[u64]) -> Result<()> {
@@ -1939,7 +1782,7 @@ async fn do_add_swap(sandbox: &Arc<Mutex<Sandbox>>, req: &AddSwapRequest) -> Res
// - config.json at /<CONTAINER_BASE>/<cid>/config.json
// - container rootfs bind mounted at /<CONTAINER_BASE>/<cid>/rootfs
// - modify container spec root to point to /<CONTAINER_BASE>/<cid>/rootfs
pub fn setup_bundle(cid: &str, spec: &mut Spec) -> Result<PathBuf> {
fn setup_bundle(cid: &str, spec: &mut Spec) -> Result<PathBuf> {
let spec_root = if let Some(sr) = &spec.root {
sr
} else {
@@ -2036,7 +1879,6 @@ mod tests {
skip_if_not_root,
};
use nix::mount;
use nix::sched::{unshare, CloneFlags};
use oci::{Hook, Hooks, Linux, LinuxNamespace};
use tempfile::{tempdir, TempDir};
use ttrpc::{r#async::TtrpcContext, MessageHeader};
@@ -2653,12 +2495,7 @@ OtherField:other
TestData {
status_file_data: Some("SigBlk:0000000000000001"),
signum: 1,
result: true,
},
TestData {
status_file_data: Some("SigIgn:0000000000000001"),
signum: 1,
result: true,
result: false,
},
TestData {
status_file_data: None,
@@ -2978,162 +2815,4 @@ OtherField:other
assert_eq!(stats.used, 3);
assert_eq!(stats.available, available - 2);
}
#[tokio::test]
async fn test_ip_tables() {
skip_if_not_root!();
let logger = slog::Logger::root(slog::Discard, o!());
let sandbox = Sandbox::new(&logger).unwrap();
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
});
let ctx = mk_ttrpc_context();
// Move to a new netns in order to ensure we don't trash the hosts' iptables
unshare(CloneFlags::CLONE_NEWNET).unwrap();
// Get initial iptables, we expect to be empty:
let result = agent_service
.get_ip_tables(
&ctx,
GetIPTablesRequest {
is_ipv6: false,
..Default::default()
},
)
.await;
assert!(result.is_ok(), "get ip tables should succeed");
assert_eq!(
result.unwrap().data.len(),
0,
"ip tables should be empty initially"
);
// Initial ip6 ip tables should also be empty:
let result = agent_service
.get_ip_tables(
&ctx,
GetIPTablesRequest {
is_ipv6: true,
..Default::default()
},
)
.await;
assert!(result.is_ok(), "get ip6 tables should succeed");
assert_eq!(
result.unwrap().data.len(),
0,
"ip tables should be empty initially"
);
// Verify that attempting to write 'empty' iptables results in no error:
let empty_rules = "";
let result = agent_service
.set_ip_tables(
&ctx,
SetIPTablesRequest {
is_ipv6: false,
data: empty_rules.as_bytes().to_vec(),
..Default::default()
},
)
.await;
assert!(result.is_ok(), "set ip tables with no data should succeed");
// Verify that attempting to write "garbage" iptables results in an error:
let garbage_rules = r#"
this
is
just garbage
"#;
let result = agent_service
.set_ip_tables(
&ctx,
SetIPTablesRequest {
is_ipv6: false,
data: garbage_rules.as_bytes().to_vec(),
..Default::default()
},
)
.await;
assert!(result.is_err(), "set iptables with garbage should fail");
// Verify setup of valid iptables:Setup valid set of iptables:
let valid_rules = r#"
*nat
-A PREROUTING -d 192.168.103.153/32 -j DNAT --to-destination 192.168.188.153
COMMIT
"#;
let result = agent_service
.set_ip_tables(
&ctx,
SetIPTablesRequest {
is_ipv6: false,
data: valid_rules.as_bytes().to_vec(),
..Default::default()
},
)
.await;
assert!(result.is_ok(), "set ip tables should succeed");
let result = agent_service
.get_ip_tables(
&ctx,
GetIPTablesRequest {
is_ipv6: false,
..Default::default()
},
)
.await
.unwrap();
assert!(!result.data.is_empty(), "we should have non-zero output:");
assert!(
std::str::from_utf8(&*result.data).unwrap().contains(
"PREROUTING -d 192.168.103.153/32 -j DNAT --to-destination 192.168.188.153"
),
"We should see the resulting rule"
);
// Verify setup of valid ip6tables:
let valid_ipv6_rules = r#"
*filter
-A INPUT -s 2001:db8:100::1/128 -i sit+ -p tcp -m tcp --sport 512:65535
COMMIT
"#;
let result = agent_service
.set_ip_tables(
&ctx,
SetIPTablesRequest {
is_ipv6: true,
data: valid_ipv6_rules.as_bytes().to_vec(),
..Default::default()
},
)
.await;
assert!(result.is_ok(), "set ip6 tables should succeed");
let result = agent_service
.get_ip_tables(
&ctx,
GetIPTablesRequest {
is_ipv6: true,
..Default::default()
},
)
.await
.unwrap();
assert!(!result.data.is_empty(), "we should have non-zero output:");
assert!(
std::str::from_utf8(&*result.data)
.unwrap()
.contains("INPUT -s 2001:db8:100::1/128 -i sit+ -p tcp -m tcp --sport 512:65535"),
"We should see the resulting rule"
);
}
}

View File

@@ -470,7 +470,7 @@ fn online_memory(logger: &Logger) -> Result<()> {
#[cfg(test)]
mod tests {
use super::*;
use super::Sandbox;
use crate::{mount::baremount, skip_if_not_root};
use anyhow::{anyhow, Error};
use nix::mount::MsFlags;
@@ -480,7 +480,6 @@ mod tests {
use rustjail::specconv::CreateOpts;
use slog::Logger;
use std::fs::{self, File};
use std::io::prelude::*;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use tempfile::{tempdir, Builder, TempDir};
@@ -848,259 +847,4 @@ mod tests {
let p = s.find_container_process("not-exist-cid", "");
assert!(p.is_err(), "Expecting Error, Got {:?}", p);
}
#[tokio::test]
async fn test_find_process() {
let logger = slog::Logger::root(slog::Discard, o!());
let test_pids = [std::i32::MIN, -1, 0, 1, std::i32::MAX];
for test_pid in test_pids {
let mut s = Sandbox::new(&logger).unwrap();
let (mut linux_container, _root) = create_linuxcontainer();
let mut test_process = Process::new(
&logger,
&oci::Process::default(),
"this_is_a_test_process",
true,
1,
)
.unwrap();
// processes interally only have pids when manually set
test_process.pid = test_pid;
linux_container.processes.insert(test_pid, test_process);
s.add_container(linux_container);
let find_result = s.find_process(test_pid);
// test first if it finds anything
assert!(find_result.is_some(), "Should be able to find a process");
let found_process = find_result.unwrap();
// then test if it founds the correct process
assert_eq!(
found_process.pid, test_pid,
"Should be able to find correct process"
);
}
// to test for nonexistent pids, any pid that isn't the one set
// above should work, as linuxcontainer starts with no processes
let mut s = Sandbox::new(&logger).unwrap();
let nonexistent_test_pid = 1234;
let find_result = s.find_process(nonexistent_test_pid);
assert!(
find_result.is_none(),
"Shouldn't find a process for non existent pid"
);
}
#[tokio::test]
async fn test_online_resources() {
#[derive(Debug, Default)]
struct TestFile {
name: String,
content: String,
}
#[derive(Debug, Default)]
struct TestDirectory<'a> {
name: String,
files: &'a [TestFile],
}
#[derive(Debug)]
struct TestData<'a> {
directory_autogen_name: String,
number_autogen_directories: u32,
extra_directories: &'a [TestDirectory<'a>],
pattern: String,
to_enable: i32,
result: Result<i32>,
}
impl Default for TestData<'_> {
fn default() -> Self {
TestData {
directory_autogen_name: Default::default(),
number_autogen_directories: Default::default(),
extra_directories: Default::default(),
pattern: Default::default(),
to_enable: Default::default(),
result: Ok(Default::default()),
}
}
}
let tests = &[
// 4 well formed directories, request enabled 4,
// correct result 4 enabled, should pass
TestData {
directory_autogen_name: String::from("cpu"),
number_autogen_directories: 4,
pattern: String::from(r"cpu[0-9]+"),
to_enable: 4,
result: Ok(4),
..Default::default()
},
// 0 well formed directories, request enabled 4,
// correct result 0 enabled, should pass
TestData {
number_autogen_directories: 0,
to_enable: 4,
result: Ok(0),
..Default::default()
},
// 10 well formed directories, request enabled 4,
// correct result 4 enabled, should pass
TestData {
directory_autogen_name: String::from("cpu"),
number_autogen_directories: 10,
pattern: String::from(r"cpu[0-9]+"),
to_enable: 4,
result: Ok(4),
..Default::default()
},
// 0 well formed directories, request enabled 0,
// correct result 0 enabled, should pass
TestData {
number_autogen_directories: 0,
pattern: String::from(r"cpu[0-9]+"),
to_enable: 0,
result: Ok(0),
..Default::default()
},
// 4 well formed directories, 1 malformed (no online file),
// request enable 5, correct result 4
TestData {
directory_autogen_name: String::from("cpu"),
number_autogen_directories: 4,
pattern: String::from(r"cpu[0-9]+"),
extra_directories: &[TestDirectory {
name: String::from("cpu4"),
files: &[],
}],
to_enable: 5,
result: Ok(4),
},
// 3 malformed directories (no online files),
// request enable 3, correct result 0
TestData {
pattern: String::from(r"cpu[0-9]+"),
extra_directories: &[
TestDirectory {
name: String::from("cpu0"),
files: &[],
},
TestDirectory {
name: String::from("cpu1"),
files: &[],
},
TestDirectory {
name: String::from("cpu2"),
files: &[],
},
],
to_enable: 3,
result: Ok(0),
..Default::default()
},
// 1 malformed directories (online file with content "1"),
// request enable 1, correct result 0
TestData {
pattern: String::from(r"cpu[0-9]+"),
extra_directories: &[TestDirectory {
name: String::from("cpu0"),
files: &[TestFile {
name: SYSFS_ONLINE_FILE.to_string(),
content: String::from("1"),
}],
}],
to_enable: 1,
result: Ok(0),
..Default::default()
},
// 2 well formed directories, 1 malformed (online file with content "1"),
// request enable 3, correct result 2
TestData {
directory_autogen_name: String::from("cpu"),
number_autogen_directories: 2,
pattern: String::from(r"cpu[0-9]+"),
extra_directories: &[TestDirectory {
name: String::from("cpu2"),
files: &[TestFile {
name: SYSFS_ONLINE_FILE.to_string(),
content: String::from("1"),
}],
}],
to_enable: 3,
result: Ok(2),
},
];
let logger = slog::Logger::root(slog::Discard, o!());
let tmpdir = Builder::new().tempdir().unwrap();
let tmpdir_path = tmpdir.path().to_str().unwrap();
for (i, d) in tests.iter().enumerate() {
let current_test_dir_path = format!("{}/test_{}", tmpdir_path, i);
fs::create_dir(&current_test_dir_path).unwrap();
// create numbered directories and fill using root name
for j in 0..d.number_autogen_directories {
let subdir_path = format!(
"{}/{}{}",
current_test_dir_path, d.directory_autogen_name, j
);
let subfile_path = format!("{}/{}", subdir_path, SYSFS_ONLINE_FILE);
fs::create_dir(&subdir_path).unwrap();
let mut subfile = File::create(subfile_path).unwrap();
subfile.write_all(b"0").unwrap();
}
// create extra directories and fill to specification
for j in d.extra_directories {
let subdir_path = format!("{}/{}", current_test_dir_path, j.name);
fs::create_dir(&subdir_path).unwrap();
for file in j.files {
let subfile_path = format!("{}/{}", subdir_path, file.name);
let mut subfile = File::create(&subfile_path).unwrap();
subfile.write_all(file.content.as_bytes()).unwrap();
}
}
// run created directory structure against online_resources
let result = online_resources(&logger, &current_test_dir_path, &d.pattern, d.to_enable);
let mut msg = format!(
"test[{}]: {:?}, expected {}, actual {}",
i,
d,
d.result.is_ok(),
result.is_ok()
);
assert_eq!(result.is_ok(), d.result.is_ok(), "{}", msg);
if d.result.is_ok() {
let test_result_val = *d.result.as_ref().ok().unwrap();
let result_val = result.ok().unwrap();
msg = format!(
"test[{}]: {:?}, expected {}, actual {}",
i, d, test_result_val, result_val
);
assert_eq!(test_result_val, result_val, "{}", msg);
}
}
}
}

View File

@@ -58,16 +58,17 @@ async fn handle_sigchild(logger: Logger, sandbox: Arc<Mutex<Sandbox>>) -> Result
}
let mut p = process.unwrap();
let ret: i32;
let ret: i32 = match wait_status {
WaitStatus::Exited(_, c) => c,
WaitStatus::Signaled(_, sig, _) => sig as i32,
match wait_status {
WaitStatus::Exited(_, c) => ret = c,
WaitStatus::Signaled(_, sig, _) => ret = sig as i32,
_ => {
info!(logger, "got wrong status for process";
"child-status" => format!("{:?}", wait_status));
continue;
}
};
}
p.exit_code = ret;
let _ = p.exit_tx.take();

View File

@@ -5,14 +5,7 @@
#![allow(clippy::module_inception)]
#[cfg(test)]
pub mod test_utils {
#[derive(Debug, PartialEq)]
pub enum TestUserType {
RootOnly,
NonRootOnly,
Any,
}
mod test_utils {
#[macro_export]
macro_rules! skip_if_root {
() => {
@@ -85,15 +78,4 @@ pub mod test_utils {
}
};
}
#[macro_export]
macro_rules! skip_loop_by_user {
($msg:expr, $user:expr) => {
if $user == TestUserType::RootOnly {
skip_loop_if_not_root!($msg);
} else if $user == TestUserType::NonRootOnly {
skip_loop_if_root!($msg);
}
};
}
}

View File

@@ -237,6 +237,8 @@ mod tests {
JoinError,
>;
let result: std::result::Result<u64, std::io::Error>;
select! {
res = handle => spawn_result = res,
_ = &mut timeout => panic!("timed out"),
@@ -244,7 +246,7 @@ mod tests {
assert!(spawn_result.is_ok());
let result: std::result::Result<u64, std::io::Error> = spawn_result.unwrap();
result = spawn_result.unwrap();
assert!(result.is_ok());
@@ -276,6 +278,8 @@ mod tests {
let spawn_result: std::result::Result<std::result::Result<u64, std::io::Error>, JoinError>;
let result: std::result::Result<u64, std::io::Error>;
select! {
res = handle => spawn_result = res,
_ = &mut timeout => panic!("timed out"),
@@ -283,7 +287,7 @@ mod tests {
assert!(spawn_result.is_ok());
let result: std::result::Result<u64, std::io::Error> = spawn_result.unwrap();
result = spawn_result.unwrap();
assert!(result.is_ok());
@@ -316,6 +320,8 @@ mod tests {
let spawn_result: std::result::Result<std::result::Result<u64, std::io::Error>, JoinError>;
let result: std::result::Result<u64, std::io::Error>;
select! {
res = handle => spawn_result = res,
_ = &mut timeout => panic!("timed out"),
@@ -323,7 +329,7 @@ mod tests {
assert!(spawn_result.is_ok());
let result: std::result::Result<u64, std::io::Error> = spawn_result.unwrap();
result = spawn_result.unwrap();
assert!(result.is_ok());

View File

@@ -178,11 +178,13 @@ impl Builder {
pub fn init(self) -> Exporter {
let Builder { port, cid, logger } = self;
let cid_str: String = if self.cid == libc::VMADDR_CID_ANY {
ANY_CID.to_string()
let cid_str: String;
if self.cid == libc::VMADDR_CID_ANY {
cid_str = ANY_CID.to_string();
} else {
format!("{}", self.cid)
};
cid_str = format!("{}", self.cid);
}
Exporter {
port,

897
src/libs/Cargo.lock generated
View File

@@ -1,897 +0,0 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "anyhow"
version = "1.0.57"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "08f9b8508dccb7687a1d6c4ce66b2b0ecef467c94667de27d8d7fe1f8d2a9cdc"
[[package]]
name = "arc-swap"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c5d78ce20460b82d3fa150275ed9d55e21064fc7951177baacf86a145c4a4b1f"
[[package]]
name = "async-trait"
version = "0.1.53"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed6aa3524a2dfcf9fe180c51eae2b58738348d819517ceadf95789c51fff7600"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "autocfg"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cdb031dd78e28731d87d56cc8ffef4a8f36ca26c38fe2de700543e627f8a464a"
[[package]]
name = "bitflags"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cf1de2fe8c75bc145a2f577add951f8134889b4795d47466a54a5c846d691693"
[[package]]
name = "byteorder"
version = "1.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "14c189c53d098945499cdfa7ecc63567cf3886b3332b312a5b4585d8d3a6a610"
[[package]]
name = "bytes"
version = "0.4.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "206fdffcfa2df7cbe15601ef46c813fce0965eb3286db6b56c583b814b51c81c"
dependencies = [
"byteorder",
"iovec",
]
[[package]]
name = "bytes"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4872d67bab6358e59559027aa3b9157c53d9358c51423c17554809a8858e0f8"
[[package]]
name = "cc"
version = "1.0.73"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2fff2a6927b3bb87f9595d67196a70493f627687a71d87a0d692242c33f58c11"
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "chrono"
version = "0.4.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "670ad68c9088c2a963aaa298cb369688cf3f9465ce5e2d4ca10e6e0098a1ce73"
dependencies = [
"libc",
"num-integer",
"num-traits",
"time",
"winapi",
]
[[package]]
name = "crossbeam-channel"
version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e54ea8bc3fb1ee042f5aace6e3c6e025d3874866da222930f70ce62aceba0bfa"
dependencies = [
"cfg-if",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0bf124c720b7686e3c2663cf54062ab0f68a88af2fb6a030e87e30bf721fcb38"
dependencies = [
"cfg-if",
"lazy_static",
]
[[package]]
name = "derive-new"
version = "0.5.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3418329ca0ad70234b9735dc4ceed10af4df60eff9c8e7b06cb5e520d92c3535"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "either"
version = "1.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e78d4f1cc4ae33bbfc157ed5d5a5ef3bc29227303d595861deb238fcec4e9457"
[[package]]
name = "fastrand"
version = "1.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "779d043b6a0b90cc4c0ed7ee380a6504394cee7efd7db050e3774eee387324b2"
dependencies = [
"instant",
]
[[package]]
name = "fixedbitset"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37ab347416e802de484e4d03c7316c48f1ecb56574dfd4a46a80f173ce1de04d"
[[package]]
name = "futures"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f73fe65f54d1e12b726f517d3e2135ca3125a437b6d998caf1962961f7172d9e"
dependencies = [
"futures-channel",
"futures-core",
"futures-executor",
"futures-io",
"futures-sink",
"futures-task",
"futures-util",
]
[[package]]
name = "futures-channel"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3083ce4b914124575708913bca19bfe887522d6e2e6d0952943f5eac4a74010"
dependencies = [
"futures-core",
"futures-sink",
]
[[package]]
name = "futures-core"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0c09fd04b7e4073ac7156a9539b57a484a8ea920f79c7c675d05d289ab6110d3"
[[package]]
name = "futures-executor"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9420b90cfa29e327d0429f19be13e7ddb68fa1cccb09d65e5706b8c7a749b8a6"
dependencies = [
"futures-core",
"futures-task",
"futures-util",
]
[[package]]
name = "futures-io"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fc4045962a5a5e935ee2fdedaa4e08284547402885ab326734432bed5d12966b"
[[package]]
name = "futures-macro"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33c1e13800337f4d4d7a316bf45a567dbcb6ffe087f16424852d97e97a91f512"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "futures-sink"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "21163e139fa306126e6eedaf49ecdb4588f939600f0b1e770f4205ee4b7fa868"
[[package]]
name = "futures-task"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "57c66a976bf5909d801bbef33416c41372779507e7a6b3a5e25e4749c58f776a"
[[package]]
name = "futures-util"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d8b7abd5d659d9b90c8cba917f6ec750a74e2dc23902ef9cd4cc8c8b22e6036a"
dependencies = [
"futures-channel",
"futures-core",
"futures-io",
"futures-macro",
"futures-sink",
"futures-task",
"memchr",
"pin-project-lite",
"pin-utils",
"slab",
]
[[package]]
name = "hashbrown"
version = "0.11.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ab5ef0d4909ef3724cc8cce6ccc8572c5c817592e9285f5464f8e86f8bd3726e"
[[package]]
name = "heck"
version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6d621efb26863f0e9924c6ac577e8275e5e6b77455db64ffa6c65c904e9e132c"
dependencies = [
"unicode-segmentation",
]
[[package]]
name = "indexmap"
version = "1.8.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0f647032dfaa1f8b6dc29bd3edb7bbef4861b8b8007ebb118d6db284fd59f6ee"
dependencies = [
"autocfg",
"hashbrown",
]
[[package]]
name = "instant"
version = "0.1.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a5bbe824c507c5da5956355e86a746d82e0e1464f65d862cc5e71da70e94b2c"
dependencies = [
"cfg-if",
]
[[package]]
name = "iovec"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b2b3ea6ff95e175473f8ffe6a7eb7c00d054240321b84c57051175fe3c1e075e"
dependencies = [
"libc",
]
[[package]]
name = "itertools"
version = "0.10.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a9a9d19fa1e79b6215ff29b9d6880b706147f16e9b1dbb1e4e5947b5b02bc5e3"
dependencies = [
"either",
]
[[package]]
name = "itoa"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1aab8fc367588b89dcee83ab0fd66b72b50b72fa1904d7095045ace2b0c81c35"
[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.124"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "21a41fed9d98f27ab1c6d161da622a4fa35e8a54a8adc24bbf3ddd0ef70b0e50"
[[package]]
name = "log"
version = "0.4.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6389c490849ff5bc16be905ae24bc913a9c8892e19b2341dbc175e14c341c2b8"
dependencies = [
"cfg-if",
]
[[package]]
name = "logging"
version = "0.1.0"
dependencies = [
"serde_json",
"slog",
"slog-async",
"slog-json",
"slog-scope",
"tempfile",
]
[[package]]
name = "memchr"
version = "2.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "308cc39be01b73d0d18f82a0e7b2a3df85245f84af96fdddc5d202d27e47b86a"
[[package]]
name = "memoffset"
version = "0.6.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5aa361d4faea93603064a027415f07bd8e1d5c88c9fbf68bf56a285428fd79ce"
dependencies = [
"autocfg",
]
[[package]]
name = "mio"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52da4364ffb0e4fe33a9841a98a3f3014fb964045ce4f7a45a398243c8d6b0c9"
dependencies = [
"libc",
"log",
"miow",
"ntapi",
"wasi 0.11.0+wasi-snapshot-preview1",
"winapi",
]
[[package]]
name = "miow"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b9f1c5b025cda876f66ef43a113f91ebc9f4ccef34843000e0adf6ebbab84e21"
dependencies = [
"winapi",
]
[[package]]
name = "multimap"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e5ce46fe64a9d73be07dcbe690a38ce1b293be448fd8ce1e6c1b8062c9f72c6a"
[[package]]
name = "nix"
version = "0.20.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f5e06129fb611568ef4e868c14b326274959aa70ff7776e9d55323531c374945"
dependencies = [
"bitflags",
"cc",
"cfg-if",
"libc",
"memoffset",
]
[[package]]
name = "nix"
version = "0.23.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f866317acbd3a240710c63f065ffb1e4fd466259045ccb504130b7f668f35c6"
dependencies = [
"bitflags",
"cc",
"cfg-if",
"libc",
"memoffset",
]
[[package]]
name = "ntapi"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c28774a7fd2fbb4f0babd8237ce554b73af68021b5f695a3cebd6c59bac0980f"
dependencies = [
"winapi",
]
[[package]]
name = "num-integer"
version = "0.1.44"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d2cc698a63b549a70bc047073d2949cce27cd1c7b0a4a862d08a8031bc2801db"
dependencies = [
"autocfg",
"num-traits",
]
[[package]]
name = "num-traits"
version = "0.2.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a64b1ec5cda2586e284722486d802acf1f7dbdc623e2bfc57e65ca1cd099290"
dependencies = [
"autocfg",
]
[[package]]
name = "once_cell"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da32515d9f6e6e489d7bc9d84c71b060db7247dc035bbe44eac88cf87486d8d5"
[[package]]
name = "petgraph"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "467d164a6de56270bd7c4d070df81d07beace25012d5103ced4e9ff08d6afdb7"
dependencies = [
"fixedbitset",
"indexmap",
]
[[package]]
name = "pin-project-lite"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e280fbe77cc62c91527259e9442153f4688736748d24660126286329742b4c6c"
[[package]]
name = "pin-utils"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8b870d8c151b6f2fb93e84a13146138f05d02ed11c7e7c54f8826aaaf7c9f184"
[[package]]
name = "proc-macro2"
version = "1.0.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ec757218438d5fda206afc041538b2f6d889286160d649a86a24d37e1235afd1"
dependencies = [
"unicode-xid",
]
[[package]]
name = "prost"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "de5e2533f59d08fcf364fd374ebda0692a70bd6d7e66ef97f306f45c6c5d8020"
dependencies = [
"bytes 1.1.0",
"prost-derive",
]
[[package]]
name = "prost-build"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "355f634b43cdd80724ee7848f95770e7e70eefa6dcf14fea676216573b8fd603"
dependencies = [
"bytes 1.1.0",
"heck",
"itertools",
"log",
"multimap",
"petgraph",
"prost",
"prost-types",
"tempfile",
"which",
]
[[package]]
name = "prost-derive"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "600d2f334aa05acb02a755e217ef1ab6dea4d51b58b7846588b747edec04efba"
dependencies = [
"anyhow",
"itertools",
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "prost-types"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "603bbd6394701d13f3f25aada59c7de9d35a6a5887cfc156181234a44002771b"
dependencies = [
"bytes 1.1.0",
"prost",
]
[[package]]
name = "protobuf"
version = "2.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8e86d370532557ae7573551a1ec8235a0f8d6cb276c7c9e6aa490b511c447485"
dependencies = [
"serde",
"serde_derive",
]
[[package]]
name = "protobuf-codegen"
version = "2.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "de113bba758ccf2c1ef816b127c958001b7831136c9bc3f8e9ec695ac4e82b0c"
dependencies = [
"protobuf",
]
[[package]]
name = "protobuf-codegen-pure"
version = "2.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2d1a4febc73bf0cada1d77c459a0c8e5973179f1cfd5b0f1ab789d45b17b6440"
dependencies = [
"protobuf",
"protobuf-codegen",
]
[[package]]
name = "protocols"
version = "0.1.0"
dependencies = [
"async-trait",
"protobuf",
"serde",
"serde_json",
"ttrpc",
"ttrpc-codegen",
]
[[package]]
name = "quote"
version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a1feb54ed693b93a84e14094943b84b7c4eae204c512b7ccb95ab0c66d278ad1"
dependencies = [
"proc-macro2",
]
[[package]]
name = "redox_syscall"
version = "0.2.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8383f39639269cde97d255a32bdb68c047337295414940c68bdd30c2e13203ff"
dependencies = [
"bitflags",
]
[[package]]
name = "remove_dir_all"
version = "0.5.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3acd125665422973a33ac9d3dd2df85edad0f4ae9b00dafb1a05e43a9f5ef8e7"
dependencies = [
"winapi",
]
[[package]]
name = "ryu"
version = "1.0.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "73b4b750c782965c211b42f022f59af1fbceabdd026623714f104152f1ec149f"
[[package]]
name = "safe-path"
version = "0.1.0"
dependencies = [
"libc",
"tempfile",
]
[[package]]
name = "serde"
version = "1.0.133"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97565067517b60e2d1ea8b268e59ce036de907ac523ad83a0475da04e818989a"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.133"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed201699328568d8d08208fdd080e3ff594e6c422e438b6705905da01005d537"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "serde_json"
version = "1.0.75"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c059c05b48c5c0067d4b4b2b4f0732dd65feb52daf7e0ea09cd87e7dadc1af79"
dependencies = [
"itoa",
"ryu",
"serde",
]
[[package]]
name = "slab"
version = "0.4.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eb703cfe953bccee95685111adeedb76fabe4e97549a58d16f03ea7b9367bb32"
[[package]]
name = "slog"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8347046d4ebd943127157b94d63abb990fcf729dc4e9978927fdf4ac3c998d06"
[[package]]
name = "slog-async"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "766c59b252e62a34651412870ff55d8c4e6d04df19b43eecb2703e417b097ffe"
dependencies = [
"crossbeam-channel",
"slog",
"take_mut",
"thread_local",
]
[[package]]
name = "slog-json"
version = "2.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52e9b96fb6b5e80e371423b4aca6656eb537661ce8f82c2697e619f8ca85d043"
dependencies = [
"chrono",
"serde",
"serde_json",
"slog",
]
[[package]]
name = "slog-scope"
version = "4.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2f95a4b4c3274cd2869549da82b57ccc930859bdbf5bcea0424bc5f140b3c786"
dependencies = [
"arc-swap",
"lazy_static",
"slog",
]
[[package]]
name = "socket2"
version = "0.4.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "66d72b759436ae32898a2af0a14218dbf55efde3feeb170eb623637db85ee1e0"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "syn"
version = "1.0.91"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b683b2b825c8eef438b77c36a06dc262294da3d5a5813fac20da149241dcd44d"
dependencies = [
"proc-macro2",
"quote",
"unicode-xid",
]
[[package]]
name = "take_mut"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f764005d11ee5f36500a149ace24e00e3da98b0158b3e2d53a7495660d3f4d60"
[[package]]
name = "tempfile"
version = "3.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5cdb1ef4eaeeaddc8fbd371e5017057064af0911902ef36b39801f67cc6d79e4"
dependencies = [
"cfg-if",
"fastrand",
"libc",
"redox_syscall",
"remove_dir_all",
"winapi",
]
[[package]]
name = "thiserror"
version = "1.0.30"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "854babe52e4df1653706b98fcfc05843010039b406875930a70e4d9644e5c417"
dependencies = [
"thiserror-impl",
]
[[package]]
name = "thiserror-impl"
version = "1.0.30"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "aa32fd3f627f367fe16f893e2597ae3c05020f8bba2666a4e6ea73d377e5714b"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "thread_local"
version = "1.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8018d24e04c95ac8790716a5987d0fec4f8b27249ffa0f7d33f1369bdfb88cbd"
dependencies = [
"once_cell",
]
[[package]]
name = "time"
version = "0.1.44"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6db9e6914ab8b1ae1c260a4ae7a49b6c5611b40328a735b21862567685e73255"
dependencies = [
"libc",
"wasi 0.10.0+wasi-snapshot-preview1",
"winapi",
]
[[package]]
name = "tokio"
version = "1.17.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2af73ac49756f3f7c01172e34a23e5d0216f6c32333757c2c61feb2bbff5a5ee"
dependencies = [
"bytes 1.1.0",
"libc",
"memchr",
"mio",
"pin-project-lite",
"socket2",
"tokio-macros",
"winapi",
]
[[package]]
name = "tokio-macros"
version = "1.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b557f72f448c511a979e2564e55d74e6c4432fc96ff4f6241bc6bded342643b7"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "tokio-vsock"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9e0723fc001950a3b018947b05eeb45014fd2b7c6e8f292502193ab74486bdb6"
dependencies = [
"bytes 0.4.12",
"futures",
"libc",
"tokio",
"vsock",
]
[[package]]
name = "ttrpc"
version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "66a973ce6d5eaa20c173635b29ffb660dafbc7ef109172c0015ba44e47a23711"
dependencies = [
"async-trait",
"byteorder",
"futures",
"libc",
"log",
"nix 0.20.2",
"protobuf",
"protobuf-codegen-pure",
"thiserror",
"tokio",
"tokio-vsock",
]
[[package]]
name = "ttrpc-codegen"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "809eda4e459820237104e4b61d6b41bbe6c9e1ce6adf4057955e6e6722a90408"
dependencies = [
"protobuf",
"protobuf-codegen",
"protobuf-codegen-pure",
"ttrpc-compiler",
]
[[package]]
name = "ttrpc-compiler"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2978ed3fa047d8fd55cbeb4d4a61d461fb3021a90c9618519c73ce7e5bb66c15"
dependencies = [
"derive-new",
"prost",
"prost-build",
"prost-types",
"protobuf",
"protobuf-codegen",
"tempfile",
]
[[package]]
name = "unicode-segmentation"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7e8820f5d777f6224dc4be3632222971ac30164d4a258d595640799554ebfd99"
[[package]]
name = "unicode-xid"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8ccb82d61f80a663efe1f787a51b16b5a51e3314d6ac365b08639f52387b33f3"
[[package]]
name = "vsock"
version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e32675ee2b3ce5df274c0ab52d19b28789632406277ca26bffee79a8e27dc133"
dependencies = [
"libc",
"nix 0.23.1",
]
[[package]]
name = "wasi"
version = "0.10.0+wasi-snapshot-preview1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1a143597ca7c7793eff794def352d41792a93c481eb1042423ff7ff72ba2c31f"
[[package]]
name = "wasi"
version = "0.11.0+wasi-snapshot-preview1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9c8d87e72b64a3b4db28d11ce29237c246188f4f51057d65a7eab63b7987e423"
[[package]]
name = "which"
version = "4.2.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c4fb54e6113b6a8772ee41c3404fb0301ac79604489467e0a9ce1f3e97c24ae"
dependencies = [
"either",
"lazy_static",
"libc",
]
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"

View File

@@ -1,7 +0,0 @@
[workspace]
members = [
"logging",
"safe-path",
"protocols",
]
resolver = "2"

View File

@@ -1,10 +0,0 @@
The `src/libs` directory hosts library crates which may be shared by multiple Kata Containers components
or published to [`crates.io`](https://crates.io/index.html).
### Library Crates
Currently it provides following library crates:
| Library | Description |
|-|-|-|
| [logging](logging/) | Facilities to setup logging subsystem based slog. |
| [safe-path](safe-path/) | Utilities to safely resolve filesystem paths. |

321
src/libs/logging/Cargo.lock generated Normal file
View File

@@ -0,0 +1,321 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "arc-swap"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c5d78ce20460b82d3fa150275ed9d55e21064fc7951177baacf86a145c4a4b1f"
[[package]]
name = "autocfg"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cdb031dd78e28731d87d56cc8ffef4a8f36ca26c38fe2de700543e627f8a464a"
[[package]]
name = "bitflags"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bef38d45163c2f1dde094a7dfd33ccf595c92905c8f8f4fdc18d06fb1037718a"
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "chrono"
version = "0.4.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "670ad68c9088c2a963aaa298cb369688cf3f9465ce5e2d4ca10e6e0098a1ce73"
dependencies = [
"libc",
"num-integer",
"num-traits",
"time",
"winapi",
]
[[package]]
name = "crossbeam-channel"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "06ed27e177f16d65f0f0c22a213e17c696ace5dd64b14258b52f9417ccb52db4"
dependencies = [
"cfg-if",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d82cfc11ce7f2c3faef78d8a684447b40d503d9681acebed6cb728d45940c4db"
dependencies = [
"cfg-if",
"lazy_static",
]
[[package]]
name = "getrandom"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7fcd999463524c52659517fe2cea98493cfe485d10565e7b0fb07dbba7ad2753"
dependencies = [
"cfg-if",
"libc",
"wasi",
]
[[package]]
name = "itoa"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1aab8fc367588b89dcee83ab0fd66b72b50b72fa1904d7095045ace2b0c81c35"
[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.112"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b03d17f364a3a042d5e5d46b053bbbf82c92c9430c592dd4c064dc6ee997125"
[[package]]
name = "logging"
version = "0.1.0"
dependencies = [
"serde_json",
"slog",
"slog-async",
"slog-json",
"slog-scope",
"tempfile",
]
[[package]]
name = "num-integer"
version = "0.1.44"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d2cc698a63b549a70bc047073d2949cce27cd1c7b0a4a862d08a8031bc2801db"
dependencies = [
"autocfg",
"num-traits",
]
[[package]]
name = "num-traits"
version = "0.2.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a64b1ec5cda2586e284722486d802acf1f7dbdc623e2bfc57e65ca1cd099290"
dependencies = [
"autocfg",
]
[[package]]
name = "once_cell"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da32515d9f6e6e489d7bc9d84c71b060db7247dc035bbe44eac88cf87486d8d5"
[[package]]
name = "ppv-lite86"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed0cfbc8191465bed66e1718596ee0b0b35d5ee1f41c5df2189d0fe8bde535ba"
[[package]]
name = "rand"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2e7573632e6454cf6b99d7aac4ccca54be06da05aca2ef7423d22d27d4d4bcd8"
dependencies = [
"libc",
"rand_chacha",
"rand_core",
"rand_hc",
]
[[package]]
name = "rand_chacha"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6c10a63a0fa32252be49d21e7709d4d4baf8d231c2dbce1eaa8141b9b127d88"
dependencies = [
"ppv-lite86",
"rand_core",
]
[[package]]
name = "rand_core"
version = "0.6.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d34f1408f55294453790c48b2f1ebbb1c5b4b7563eb1f418bcfcfdbb06ebb4e7"
dependencies = [
"getrandom",
]
[[package]]
name = "rand_hc"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d51e9f596de227fda2ea6c84607f5558e196eeaf43c986b724ba4fb8fdf497e7"
dependencies = [
"rand_core",
]
[[package]]
name = "redox_syscall"
version = "0.2.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8383f39639269cde97d255a32bdb68c047337295414940c68bdd30c2e13203ff"
dependencies = [
"bitflags",
]
[[package]]
name = "remove_dir_all"
version = "0.5.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3acd125665422973a33ac9d3dd2df85edad0f4ae9b00dafb1a05e43a9f5ef8e7"
dependencies = [
"winapi",
]
[[package]]
name = "ryu"
version = "1.0.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "73b4b750c782965c211b42f022f59af1fbceabdd026623714f104152f1ec149f"
[[package]]
name = "serde"
version = "1.0.131"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4ad69dfbd3e45369132cc64e6748c2d65cdfb001a2b1c232d128b4ad60561c1"
[[package]]
name = "serde_json"
version = "1.0.73"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bcbd0344bc6533bc7ec56df11d42fb70f1b912351c0825ccb7211b59d8af7cf5"
dependencies = [
"itoa",
"ryu",
"serde",
]
[[package]]
name = "slog"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8347046d4ebd943127157b94d63abb990fcf729dc4e9978927fdf4ac3c998d06"
[[package]]
name = "slog-async"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "766c59b252e62a34651412870ff55d8c4e6d04df19b43eecb2703e417b097ffe"
dependencies = [
"crossbeam-channel",
"slog",
"take_mut",
"thread_local",
]
[[package]]
name = "slog-json"
version = "2.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52e9b96fb6b5e80e371423b4aca6656eb537661ce8f82c2697e619f8ca85d043"
dependencies = [
"chrono",
"serde",
"serde_json",
"slog",
]
[[package]]
name = "slog-scope"
version = "4.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2f95a4b4c3274cd2869549da82b57ccc930859bdbf5bcea0424bc5f140b3c786"
dependencies = [
"arc-swap",
"lazy_static",
"slog",
]
[[package]]
name = "take_mut"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f764005d11ee5f36500a149ace24e00e3da98b0158b3e2d53a7495660d3f4d60"
[[package]]
name = "tempfile"
version = "3.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dac1c663cfc93810f88aed9b8941d48cabf856a1b111c29a40439018d870eb22"
dependencies = [
"cfg-if",
"libc",
"rand",
"redox_syscall",
"remove_dir_all",
"winapi",
]
[[package]]
name = "thread_local"
version = "1.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8018d24e04c95ac8790716a5987d0fec4f8b27249ffa0f7d33f1369bdfb88cbd"
dependencies = [
"once_cell",
]
[[package]]
name = "time"
version = "0.1.43"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ca8a50ef2360fbd1eeb0ecd46795a87a19024eb4b53c5dc916ca1fd95fe62438"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "wasi"
version = "0.10.2+wasi-snapshot-preview1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fd6fbd9a79829dd1ad0cc20627bf1ed606756a7f77edff7b66b7064f9cb327c6"
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"

View File

@@ -43,7 +43,7 @@ pub struct Spec {
pub process: Option<Process>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub root: Option<Root>,
#[serde(default, skip_serializing_if = "String::is_empty")]
#[serde(default, skip_serializing_if = "String:: is_empty")]
pub hostname: String,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub mounts: Vec<Mount>,

View File

@@ -1,7 +1,6 @@
Cargo.lock
src/agent.rs
src/agent_ttrpc.rs
src/csi.rs
src/empty.rs
src/health.rs
src/health_ttrpc.rs

View File

@@ -7,14 +7,14 @@
# //
die() {
cat <<EOF >&2
cat <<EOT >&2
====================================================================
==== compile protocols failed ====
$1
====================================================================
EOF
EOT
exit 1
}

View File

@@ -51,8 +51,6 @@ service AgentService {
rpc ListInterfaces(ListInterfacesRequest) returns(Interfaces);
rpc ListRoutes(ListRoutesRequest) returns (Routes);
rpc AddARPNeighbors(AddARPNeighborsRequest) returns (google.protobuf.Empty);
rpc GetIPTables(GetIPTablesRequest) returns (GetIPTablesResponse);
rpc SetIPTables(SetIPTablesRequest) returns (SetIPTablesResponse);
// observability
rpc GetMetrics(GetMetricsRequest) returns (Metrics);
@@ -330,28 +328,6 @@ message AddARPNeighborsRequest {
ARPNeighbors neighbors = 1;
}
message GetIPTablesRequest {
bool is_ipv6 = 1;
}
message GetIPTablesResponse{
// raw stdout from iptables-save or ip6tables-save
bytes data = 1;
}
message SetIPTablesRequest {
bool is_ipv6 = 1;
// iptables, in raw format expected to be passed to stdin
// of iptables-save or ip6tables-save
bytes data = 2;
}
message SetIPTablesResponse{
// raw stdout from iptables-restore or ip6tables-restore
bytes data = 1;
}
message OnlineCPUMemRequest {
// Wait specifies if the caller waits for the agent to online all resources.
// If true the agent returns once all resources have been connected, otherwise all

View File

@@ -1,18 +0,0 @@
[package]
name = "safe-path"
version = "0.1.0"
description = "A library to safely handle file system paths for container runtimes"
keywords = ["kata", "container", "path", "securejoin"]
categories = ["parser-implementations", "filesystem"]
authors = ["The Kata Containers community <kata-dev@lists.katacontainers.io>"]
repository = "https://github.com/kata-containers/kata-containers.git"
homepage = "https://katacontainers.io/"
readme = "README.md"
license = "Apache-2.0"
edition = "2018"
[dependencies]
libc = "0.2.100"
[dev-dependencies]
tempfile = "3.2.0"

View File

@@ -1,21 +0,0 @@
Safe Path
====================
[![CI](https://github.com/magiclen/path-absolutize/actions/workflows/ci.yml/badge.svg)](https://github.com/magiclen/path-absolutize/actions/workflows/ci.yml)
A library to safely handle filesystem paths, typically for container runtimes.
There are often path related attacks, such as symlink based attacks, TOCTTOU attacks. The `safe-path` crate
provides several functions and utility structures to protect against path resolution related attacks.
## Support
**Operating Systems**:
- Linux
## Reference
- [`filepath-securejoin`](https://github.com/cyphar/filepath-securejoin): secure_join() written in Go.
- [CVE-2021-30465](https://github.com/advisories/GHSA-c3xm-pvg7-gh7r): symlink related TOCTOU flaw in `runC`.
## License
This code is licensed under [Apache-2.0](../../../LICENSE).

View File

@@ -1,65 +0,0 @@
// Copyright (c) 2022 Alibaba Cloud
//
// SPDX-License-Identifier: Apache-2.0
//
//! A library to safely handle filesystem paths, typically for container runtimes.
//!
//! Linux [mount namespace](https://man7.org/linux/man-pages/man7/mount_namespaces.7.html)
//! provides isolation of the list of mounts seen by the processes in each
//! [namespace](https://man7.org/linux/man-pages/man7/namespaces.7.html) instance.
//! Thus, the processes in each of the mount namespace instances will see distinct single-directory
//! hierarchies.
//!
//! Containers are used to isolate workloads from the host system. Container on Linux systems
//! depends on the mount namespace to build an isolated root filesystem for each container,
//! thus protect the host and containers from each other. When creating containers, the container
//! runtime needs to setup filesystem mounts for container rootfs/volumes. Configuration for
//! mounts/paths may be indirectly controlled by end users through:
//! - container images
//! - Kubernetes pod specifications
//! - hook command line arguments
//!
//! These volume configuration information may be controlled by end users/malicious attackers,
//! so it must not be trusted by container runtimes. When the container runtime is preparing mount
//! namespace for a container, it must be very careful to validate user input configuration
//! information and ensure data out of the container rootfs directory won't be affected
//! by the container. There are several types of attacks related to container mount namespace:
//! - symlink based attack
//! - Time of check to time of use (TOCTTOU)
//!
//! This crate provides several mechanisms for container runtimes to safely handle filesystem paths
//! when preparing mount namespace for containers.
//! - [scoped_join()](crate::scoped_join()): safely join `unsafe_path` to `root`, and ensure
//! `unsafe_path` is scoped under `root`.
//! - [scoped_resolve()](crate::scoped_resolve()): resolve `unsafe_path` to a relative path,
//! rooted at and constrained by `root`.
//! - [struct PinnedPathBuf](crate::PinnedPathBuf): safe version of `PathBuf` to protect from
//! TOCTTOU style of attacks, which ensures:
//! - the value of [`PinnedPathBuf::as_path()`] never changes.
//! - the path returned by [`PinnedPathBuf::as_path()`] is always a symlink.
//! - the filesystem object referenced by the symlink [`PinnedPathBuf::as_path()`] never changes.
//! - the value of [`PinnedPathBuf::target()`] never changes.
//! - [struct ScopedDirBuilder](crate::ScopedDirBuilder): safe version of `DirBuilder` to protect
//! from symlink race and TOCTTOU style of attacks, which enhances security by:
//! - ensuring the new directories are created under a specified `root` directory.
//! - avoiding symlink race attacks during making directories.
//! - returning a [PinnedPathBuf] for the last level of directory, so it could be used for other
//! operations safely.
//!
//! The work is inspired by:
//! - [`filepath-securejoin`](https://github.com/cyphar/filepath-securejoin): secure_join() written
//! in Go.
//! - [CVE-2021-30465](https://github.com/advisories/GHSA-c3xm-pvg7-gh7r): symlink related TOCTOU
//! flaw in `runC`.
#![deny(missing_docs)]
mod pinned_path_buf;
pub use pinned_path_buf::PinnedPathBuf;
mod scoped_dir_builder;
pub use scoped_dir_builder::ScopedDirBuilder;
mod scoped_path_resolver;
pub use scoped_path_resolver::{scoped_join, scoped_resolve};

View File

@@ -1,445 +0,0 @@
// Copyright (c) 2022 Alibaba Cloud
//
// SPDX-License-Identifier: Apache-2.0
//
use std::ffi::{CString, OsStr};
use std::fs::{self, File, Metadata, OpenOptions};
use std::io::{Error, ErrorKind, Result};
use std::ops::Deref;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::fs::OpenOptionsExt;
use std::os::unix::io::{AsRawFd, FromRawFd, RawFd};
use std::path::{Component, Path, PathBuf};
use crate::scoped_join;
/// A safe version of [`PathBuf`] pinned to an underlying filesystem object to protect from
/// `TOCTTOU` style of attacks.
///
/// A [`PinnedPathBuf`] is a resolved path buffer pinned to an underlying filesystem object, which
/// guarantees:
/// - the value of [`PinnedPathBuf::as_path()`] never changes.
/// - the path returned by [`PinnedPathBuf::as_path()`] is always a symlink.
/// - the filesystem object referenced by the symlink [`PinnedPathBuf::as_path()`] never changes.
/// - the value of [`PinnedPathBuf::target()`] never changes.
///
/// Note:
/// - Though the filesystem object referenced by the symlink [`PinnedPathBuf::as_path()`] never
/// changes, the value of `fs::read_link(PinnedPathBuf::as_path())` may change due to filesystem
/// operations.
/// - The value of [`PinnedPathBuf::target()`] is a cached version of
/// `fs::read_link(PinnedPathBuf::as_path())` generated when creating the `PinnedPathBuf` object.
/// - It's a sign of possible attacks if `[PinnedPathBuf::target()]` doesn't match
/// `fs::read_link(PinnedPathBuf::as_path())`.
/// - Once the [`PinnedPathBuf`] object gets dropped, the [`Path`] returned by
/// [`PinnedPathBuf::as_path()`] becomes invalid.
///
/// With normal [`PathBuf`], there's a race window for attackers between time to validate a path and
/// time to use the path. An attacker may maliciously change filesystem object referenced by the
/// path by using symlinks to compose an attack.
///
/// The [`PinnedPathBuf`] is introduced to protect from such attacks, by using the
/// `/proc/self/fd/xxx` files on Linux. The `/proc/self/fd/xxx` file on Linux is a symlink to the
/// real target corresponding to the process's file descriptor `xxx`. And the target filesystem
/// object referenced by the symlink will be kept stable until the file descriptor has been closed.
/// Combined with `O_PATH`, a safe version of `PathBuf` could be built by:
/// - Generate a safe path from `root` and `path` by using [`crate::scoped_join()`].
/// - Open the safe path with O_PATH | O_CLOEXEC flags, say the fd number is `fd_num`.
/// - Read the symlink target of `/proc/self/fd/fd_num`.
/// - Compare the symlink target with the safe path, it's safe if these two paths equal.
/// - Use the proc file path as a safe version of [`PathBuf`].
/// - Close the `fd_num` when dropping the [`PinnedPathBuf`] object.
#[derive(Debug)]
pub struct PinnedPathBuf {
handle: File,
path: PathBuf,
target: PathBuf,
}
impl PinnedPathBuf {
/// Create a [`PinnedPathBuf`] object from `root` and `path`.
///
/// The `path` must be a subdirectory of `root`, otherwise error will be returned.
pub fn new<R: AsRef<Path>, U: AsRef<Path>>(root: R, path: U) -> Result<Self> {
let path = scoped_join(root, path)?;
Self::from_path(path)
}
/// Create a `PinnedPathBuf` from `path`.
///
/// If the resolved value of `path` doesn't equal to `path`, an error will be returned.
pub fn from_path<P: AsRef<Path>>(orig_path: P) -> Result<Self> {
let orig_path = orig_path.as_ref();
let handle = Self::open_by_path(orig_path)?;
Self::new_from_file(handle, orig_path)
}
/// Try to clone the [`PinnedPathBuf`] object.
pub fn try_clone(&self) -> Result<Self> {
let fd = unsafe { libc::dup(self.path_fd()) };
if fd < 0 {
Err(Error::last_os_error())
} else {
Ok(Self {
handle: unsafe { File::from_raw_fd(fd) },
path: Self::get_proc_path(fd),
target: self.target.clone(),
})
}
}
/// Return the underlying file descriptor representing the pinned path.
///
/// Following operations are supported by the returned `RawFd`:
/// - fchdir
/// - fstat/fstatfs
/// - openat/linkat/fchownat/fstatat/readlinkat/mkdirat/*at
/// - fcntl(F_GETFD, F_SETFD, F_GETFL)
pub fn path_fd(&self) -> RawFd {
self.handle.as_raw_fd()
}
/// Get the symlink path referring the target filesystem object.
pub fn as_path(&self) -> &Path {
self.path.as_path()
}
/// Get the cached real path of the target filesystem object.
///
/// The target path is cached version of `fs::read_link(PinnedPathBuf::as_path())` generated
/// when creating the `PinnedPathBuf` object. On the other hand, the value of
/// `fs::read_link(PinnedPathBuf::as_path())` may change due to underlying filesystem operations.
/// So it's a sign of possible attacks if `PinnedPathBuf::target()` does not match
/// `fs::read_link(PinnedPathBuf::as_path())`.
pub fn target(&self) -> &Path {
&self.target
}
/// Get [`Metadata`] about the path handle.
pub fn metadata(&self) -> Result<Metadata> {
self.handle.metadata()
}
/// Open a direct child of the filesystem objected referenced by the `PinnedPathBuf` object.
pub fn open_child(&self, path_comp: &OsStr) -> Result<Self> {
let name = Self::prepare_path_component(path_comp)?;
let oflags = libc::O_PATH | libc::O_CLOEXEC;
let res = unsafe { libc::openat(self.path_fd(), name.as_ptr(), oflags, 0) };
if res < 0 {
Err(Error::last_os_error())
} else {
let handle = unsafe { File::from_raw_fd(res) };
Self::new_from_file(handle, self.target.join(path_comp))
}
}
/// Create or open a child directory if current object is a directory.
pub fn mkdir(&self, path_comp: &OsStr, mode: libc::mode_t) -> Result<Self> {
let path_name = Self::prepare_path_component(path_comp)?;
let res = unsafe { libc::mkdirat(self.handle.as_raw_fd(), path_name.as_ptr(), mode) };
if res < 0 {
Err(Error::last_os_error())
} else {
self.open_child(path_comp)
}
}
/// Open a directory/file by path.
///
/// Obtain a file descriptor that can be used for two purposes:
/// - indicate a location in the filesystem tree
/// - perform operations that act purely at the file descriptor level
fn open_by_path<P: AsRef<Path>>(path: P) -> Result<File> {
// When O_PATH is specified in flags, flag bits other than O_CLOEXEC, O_DIRECTORY, and
// O_NOFOLLOW are ignored.
let o_flags = libc::O_PATH | libc::O_CLOEXEC;
OpenOptions::new()
.read(true)
.custom_flags(o_flags)
.open(path.as_ref())
}
fn get_proc_path<F: AsRawFd>(file: F) -> PathBuf {
PathBuf::from(format!("/proc/self/fd/{}", file.as_raw_fd()))
}
fn new_from_file<P: AsRef<Path>>(handle: File, orig_path: P) -> Result<Self> {
let path = Self::get_proc_path(handle.as_raw_fd());
let link_path = fs::read_link(path.as_path())?;
if link_path != orig_path.as_ref() {
Err(Error::new(
ErrorKind::Other,
format!(
"Path changed from {} to {} on open, possible attack",
orig_path.as_ref().display(),
link_path.display()
),
))
} else {
Ok(PinnedPathBuf {
handle,
path,
target: link_path,
})
}
}
#[inline]
fn prepare_path_component(path_comp: &OsStr) -> Result<CString> {
let path = Path::new(path_comp);
let mut comps = path.components();
let name = comps.next();
if !matches!(name, Some(Component::Normal(_))) || comps.next().is_some() {
return Err(Error::new(
ErrorKind::Other,
format!("Path component {} is invalid", path_comp.to_string_lossy()),
));
}
let name = name.unwrap();
if name.as_os_str() != path_comp {
return Err(Error::new(
ErrorKind::Other,
format!("Path component {} is invalid", path_comp.to_string_lossy()),
));
}
CString::new(path_comp.as_bytes()).map_err(|_e| {
Error::new(
ErrorKind::Other,
format!("Path component {} is invalid", path_comp.to_string_lossy()),
)
})
}
}
impl Deref for PinnedPathBuf {
type Target = PathBuf;
fn deref(&self) -> &Self::Target {
&self.path
}
}
impl AsRef<Path> for PinnedPathBuf {
fn as_ref(&self) -> &Path {
self.path.as_path()
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::ffi::OsString;
use std::fs::DirBuilder;
use std::io::Write;
use std::os::unix::fs::{symlink, MetadataExt};
use std::sync::{Arc, Barrier};
use std::thread;
#[test]
fn test_pinned_path_buf() {
// Create a root directory, which itself contains symlinks.
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
DirBuilder::new()
.create(rootfs_dir.path().join("b"))
.unwrap();
symlink(rootfs_dir.path().join("b"), rootfs_dir.path().join("a")).unwrap();
let rootfs_path = &rootfs_dir.path().join("a");
// Create a file and a symlink to it.
fs::create_dir(rootfs_path.join("symlink_dir")).unwrap();
symlink("/endpoint", rootfs_path.join("symlink_dir/endpoint")).unwrap();
fs::write(rootfs_path.join("endpoint"), "test").unwrap();
// Pin the target and validate the path/content.
let path = PinnedPathBuf::new(rootfs_path, "symlink_dir/endpoint").unwrap();
assert!(!path.is_dir());
let path_ref = path.deref();
let target = fs::read_link(path_ref).unwrap();
assert_eq!(target, rootfs_path.join("endpoint").canonicalize().unwrap());
let content = fs::read_to_string(&path).unwrap();
assert_eq!(&content, "test");
// Remove the target file and validate that we could still read data from the pinned path.
fs::remove_file(&target).unwrap();
fs::read_to_string(&target).unwrap_err();
let content = fs::read_to_string(&path).unwrap();
assert_eq!(&content, "test");
}
#[test]
fn test_pinned_path_buf_race() {
let root_dir = tempfile::tempdir().expect("failed to create tmpdir");
let root_path = root_dir.path();
let barrier = Arc::new(Barrier::new(2));
fs::write(root_path.join("a"), b"a").unwrap();
fs::write(root_path.join("b"), b"b").unwrap();
fs::write(root_path.join("c"), b"c").unwrap();
symlink("a", root_path.join("s")).unwrap();
let root_path2 = root_path.to_path_buf();
let barrier2 = barrier.clone();
let thread = thread::spawn(move || {
// step 1
barrier2.wait();
fs::remove_file(root_path2.join("a")).unwrap();
symlink("b", root_path2.join("a")).unwrap();
barrier2.wait();
// step 2
barrier2.wait();
fs::remove_file(root_path2.join("b")).unwrap();
symlink("c", root_path2.join("b")).unwrap();
barrier2.wait();
});
let path = scoped_join(&root_path, "s").unwrap();
let data = fs::read_to_string(&path).unwrap();
assert_eq!(&data, "a");
assert!(path.is_file());
barrier.wait();
barrier.wait();
// Verify the target has been redirected.
let data = fs::read_to_string(&path).unwrap();
assert_eq!(&data, "b");
PinnedPathBuf::from_path(&path).unwrap_err();
let pinned_path = PinnedPathBuf::new(&root_path, "s").unwrap();
let data = fs::read_to_string(&pinned_path).unwrap();
assert_eq!(&data, "b");
// step2
barrier.wait();
barrier.wait();
// Verify it still points to the old target.
let data = fs::read_to_string(&pinned_path).unwrap();
assert_eq!(&data, "b");
thread.join().unwrap();
}
#[test]
fn test_new_pinned_path_buf() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
let path = PinnedPathBuf::from_path(rootfs_path).unwrap();
let _ = OpenOptions::new().read(true).open(&path).unwrap();
}
#[test]
fn test_pinned_path_try_clone() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
let path = PinnedPathBuf::from_path(rootfs_path).unwrap();
let path2 = path.try_clone().unwrap();
assert_ne!(path.as_path(), path2.as_path());
}
#[test]
fn test_new_pinned_path_buf_from_nonexist_file() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
PinnedPathBuf::new(rootfs_path, "does_not_exist").unwrap_err();
}
#[allow(clippy::zero_prefixed_literal)]
#[test]
fn test_new_pinned_path_buf_without_read_perm() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
let path = rootfs_path.join("write_only_file");
let mut file = OpenOptions::new()
.read(false)
.write(true)
.create(true)
.mode(0o200)
.open(&path)
.unwrap();
file.write_all(&[0xa5u8]).unwrap();
let md = fs::metadata(&path).unwrap();
let umask = unsafe { libc::umask(0022) };
unsafe { libc::umask(umask) };
assert_eq!(md.mode() & 0o700, 0o200 & !umask);
PinnedPathBuf::from_path(&path).unwrap();
}
#[test]
fn test_pinned_path_buf_path_fd() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
let path = rootfs_path.join("write_only_file");
let mut file = OpenOptions::new()
.read(false)
.write(true)
.create(true)
.mode(0o200)
.open(&path)
.unwrap();
file.write_all(&[0xa5u8]).unwrap();
let handle = PinnedPathBuf::from_path(&path).unwrap();
// Check that `fstat()` etc works with the fd returned by `path_fd()`.
let fd = handle.path_fd();
let mut stat: libc::stat = unsafe { std::mem::zeroed() };
let res = unsafe { libc::fstat(fd, &mut stat as *mut _) };
assert_eq!(res, 0);
}
#[test]
fn test_pinned_path_buf_open_child() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
let path = PinnedPathBuf::from_path(rootfs_path).unwrap();
fs::write(path.join("child"), "test").unwrap();
let path = path.open_child(OsStr::new("child")).unwrap();
let content = fs::read_to_string(&path).unwrap();
assert_eq!(&content, "test");
path.open_child(&OsString::from("__does_not_exist__"))
.unwrap_err();
path.open_child(&OsString::from("test/a")).unwrap_err();
}
#[test]
fn test_prepare_path_component() {
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from(".")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("..")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("/")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("//")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/b")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("./b")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/.")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/..")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/./")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/../")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/./a")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a/../a")).is_err());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a")).is_ok());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a.b")).is_ok());
assert!(PinnedPathBuf::prepare_path_component(&OsString::from("a..b")).is_ok());
}
#[test]
fn test_target_fs_object_changed() {
let rootfs_dir = tempfile::tempdir().expect("failed to create tmpdir");
let rootfs_path = rootfs_dir.path();
let file = rootfs_path.join("child");
fs::write(&file, "test").unwrap();
let path = PinnedPathBuf::from_path(&file).unwrap();
let path3 = fs::read_link(path.as_path()).unwrap();
assert_eq!(&path3, path.target());
fs::rename(file, rootfs_path.join("child2")).unwrap();
let path4 = fs::read_link(path.as_path()).unwrap();
assert_ne!(&path4, path.target());
fs::remove_file(rootfs_path.join("child2")).unwrap();
let path5 = fs::read_link(path.as_path()).unwrap();
assert_ne!(&path4, &path5);
}
}

View File

@@ -1,295 +0,0 @@
// Copyright (c) 2022 Alibaba Cloud
//
// SPDX-License-Identifier: Apache-2.0
//
use std::io::{Error, ErrorKind, Result};
use std::path::Path;
use crate::{scoped_join, scoped_resolve, PinnedPathBuf};
const DIRECTORY_MODE_DEFAULT: u32 = 0o777;
const DIRECTORY_MODE_MASK: u32 = 0o777;
/// Safe version of `DirBuilder` to protect from TOCTOU style of attacks.
///
/// The `ScopedDirBuilder` is a counterpart for `DirBuilder`, with safety enhancements of:
/// - ensuring the new directories are created under a specified `root` directory.
/// - ensuring all created directories are still scoped under `root` even under symlink based
/// attacks.
/// - returning a [PinnedPathBuf] for the last level of directory, so it could be used for other
/// operations safely.
#[derive(Debug)]
pub struct ScopedDirBuilder {
root: PinnedPathBuf,
mode: u32,
recursive: bool,
}
impl ScopedDirBuilder {
/// Create a new instance of `ScopedDirBuilder` with with default mode/security settings.
pub fn new<P: AsRef<Path>>(root: P) -> Result<Self> {
let root = root.as_ref().canonicalize()?;
let root = PinnedPathBuf::from_path(root)?;
if !root.metadata()?.is_dir() {
return Err(Error::new(
ErrorKind::Other,
format!("Invalid root path: {}", root.display()),
));
}
Ok(ScopedDirBuilder {
root,
mode: DIRECTORY_MODE_DEFAULT,
recursive: false,
})
}
/// Indicates that directories should be created recursively, creating all parent directories.
///
/// Parents that do not exist are created with the same security and permissions settings.
pub fn recursive(&mut self, recursive: bool) -> &mut Self {
self.recursive = recursive;
self
}
/// Sets the mode to create new directories with. This option defaults to 0o755.
pub fn mode(&mut self, mode: u32) -> &mut Self {
self.mode = mode & DIRECTORY_MODE_MASK;
self
}
/// Creates the specified directory with the options configured in this builder.
///
/// This is a helper to create subdirectory with an absolute path, without stripping off
/// `self.root`. So error will be returned if path does start with `self.root`.
/// It is considered an error if the directory already exists unless recursive mode is enabled.
pub fn create_with_unscoped_path<P: AsRef<Path>>(&self, path: P) -> Result<PinnedPathBuf> {
if !path.as_ref().is_absolute() {
return Err(Error::new(
ErrorKind::Other,
format!(
"Expected absolute directory path: {}",
path.as_ref().display()
),
));
}
// Partially canonicalize `path` so we can strip the `root` part.
let scoped_path = scoped_join("/", path)?;
let stripped_path = scoped_path.strip_prefix(self.root.target()).map_err(|_| {
Error::new(
ErrorKind::Other,
format!(
"Path {} is not under {}",
scoped_path.display(),
self.root.target().display()
),
)
})?;
self.do_mkdir(stripped_path)
}
/// Creates sub-directory with the options configured in this builder.
///
/// It is considered an error if the directory already exists unless recursive mode is enabled.
pub fn create<P: AsRef<Path>>(&self, path: P) -> Result<PinnedPathBuf> {
let path = scoped_resolve(&self.root, path)?;
self.do_mkdir(&path)
}
fn do_mkdir(&self, path: &Path) -> Result<PinnedPathBuf> {
assert!(path.is_relative());
if path.file_name().is_none() {
if !self.recursive {
return Err(Error::new(
ErrorKind::AlreadyExists,
"directory already exists",
));
} else {
return self.root.try_clone();
}
}
// Safe because `path` have at least one level.
let levels = path.iter().count() - 1;
let mut dir = self.root.try_clone()?;
for (idx, comp) in path.iter().enumerate() {
match dir.open_child(comp) {
Ok(v) => {
if !v.metadata()?.is_dir() {
return Err(Error::new(
ErrorKind::Other,
format!("Path {} is not a directory", v.display()),
));
} else if !self.recursive && idx == levels {
return Err(Error::new(
ErrorKind::AlreadyExists,
"directory already exists",
));
}
dir = v;
}
Err(_e) => {
if !self.recursive && idx != levels {
return Err(Error::new(
ErrorKind::NotFound,
"parent directory does not exist".to_string(),
));
}
dir = dir.mkdir(comp, self.mode)?;
}
}
}
Ok(dir)
}
}
#[allow(clippy::zero_prefixed_literal)]
#[cfg(test)]
mod tests {
use super::*;
use std::fs;
use std::fs::DirBuilder;
use std::os::unix::fs::{symlink, MetadataExt};
use tempfile::tempdir;
#[test]
fn test_scoped_dir_builder() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
DirBuilder::new()
.create(rootfs_dir.path().join("b"))
.unwrap();
symlink(rootfs_dir.path().join("b"), rootfs_dir.path().join("a")).unwrap();
let rootfs_path = &rootfs_dir.path().join("a");
// root directory doesn't exist
ScopedDirBuilder::new(rootfs_path.join("__does_not_exist__")).unwrap_err();
ScopedDirBuilder::new("__does_not_exist__").unwrap_err();
// root is a file
fs::write(rootfs_path.join("txt"), "test").unwrap();
ScopedDirBuilder::new(rootfs_path.join("txt")).unwrap_err();
let mut builder = ScopedDirBuilder::new(&rootfs_path).unwrap();
// file with the same name already exists.
builder
.create_with_unscoped_path(rootfs_path.join("txt"))
.unwrap_err();
// parent is a file
builder.create("/txt/a").unwrap_err();
// Not starting with root
builder.create_with_unscoped_path("/txt/a").unwrap_err();
// creating "." without recursive mode should fail
builder
.create_with_unscoped_path(rootfs_path.join("."))
.unwrap_err();
// parent doesn't exist
builder
.create_with_unscoped_path(rootfs_path.join("a/b"))
.unwrap_err();
builder.create("a/b/c").unwrap_err();
let path = builder.create("a").unwrap();
assert!(rootfs_path.join("a").is_dir());
assert_eq!(path.target(), rootfs_path.join("a").canonicalize().unwrap());
// Creating an existing directory without recursive mode should fail.
builder
.create_with_unscoped_path(rootfs_path.join("a"))
.unwrap_err();
// Creating an existing directory with recursive mode should succeed.
builder.recursive(true);
let path = builder
.create_with_unscoped_path(rootfs_path.join("a"))
.unwrap();
assert_eq!(path.target(), rootfs_path.join("a").canonicalize().unwrap());
let path = builder.create(".").unwrap();
assert_eq!(path.target(), rootfs_path.canonicalize().unwrap());
let umask = unsafe { libc::umask(0022) };
unsafe { libc::umask(umask) };
builder.mode(0o740);
let path = builder.create("a/b/c/d").unwrap();
assert_eq!(
path.target(),
rootfs_path.join("a/b/c/d").canonicalize().unwrap()
);
assert!(rootfs_path.join("a/b/c/d").is_dir());
assert_eq!(
rootfs_path.join("a").metadata().unwrap().mode() & 0o777,
DIRECTORY_MODE_DEFAULT & !umask,
);
assert_eq!(
rootfs_path.join("a/b").metadata().unwrap().mode() & 0o777,
0o740 & !umask
);
assert_eq!(
rootfs_path.join("a/b/c").metadata().unwrap().mode() & 0o777,
0o740 & !umask
);
assert_eq!(
rootfs_path.join("a/b/c/d").metadata().unwrap().mode() & 0o777,
0o740 & !umask
);
// Creating should fail if some components are not directory.
builder.create("txt/e/f").unwrap_err();
fs::write(rootfs_path.join("a/b/txt"), "test").unwrap();
builder.create("a/b/txt/h/i").unwrap_err();
}
#[test]
fn test_create_root() {
let mut builder = ScopedDirBuilder::new("/").unwrap();
builder.recursive(true);
builder.create("/").unwrap();
builder.create(".").unwrap();
builder.create("..").unwrap();
builder.create("../../.").unwrap();
builder.create("").unwrap();
builder.create_with_unscoped_path("/").unwrap();
builder.create_with_unscoped_path("/..").unwrap();
builder.create_with_unscoped_path("/../.").unwrap();
}
#[test]
fn test_create_with_absolute_path() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
DirBuilder::new()
.create(rootfs_dir.path().join("b"))
.unwrap();
symlink(rootfs_dir.path().join("b"), rootfs_dir.path().join("a")).unwrap();
let rootfs_path = &rootfs_dir.path().join("a");
let mut builder = ScopedDirBuilder::new(&rootfs_path).unwrap();
builder.create_with_unscoped_path("/").unwrap_err();
builder
.create_with_unscoped_path(rootfs_path.join("../__xxxx___xxx__"))
.unwrap_err();
builder
.create_with_unscoped_path(rootfs_path.join("c/d"))
.unwrap_err();
// Return `AlreadyExist` when recursive is false
builder.create_with_unscoped_path(&rootfs_path).unwrap_err();
builder
.create_with_unscoped_path(rootfs_path.join("."))
.unwrap_err();
builder.recursive(true);
builder.create_with_unscoped_path(&rootfs_path).unwrap();
builder
.create_with_unscoped_path(rootfs_path.join("."))
.unwrap();
builder
.create_with_unscoped_path(rootfs_path.join("c/d"))
.unwrap();
}
}

View File

@@ -1,415 +0,0 @@
// Copyright (c) 2022 Alibaba Cloud
//
// SPDX-License-Identifier: Apache-2.0
//
use std::io::{Error, ErrorKind, Result};
use std::path::{Component, Path, PathBuf};
// Follow the same configuration as
// [secure_join](https://github.com/cyphar/filepath-securejoin/blob/master/join.go#L51)
const MAX_SYMLINK_DEPTH: u32 = 255;
fn do_scoped_resolve<R: AsRef<Path>, U: AsRef<Path>>(
root: R,
unsafe_path: U,
) -> Result<(PathBuf, PathBuf)> {
let root = root.as_ref().canonicalize()?;
let mut nlinks = 0u32;
let mut curr_path = unsafe_path.as_ref().to_path_buf();
'restart: loop {
let mut subpath = PathBuf::new();
let mut iter = curr_path.components();
'next_comp: while let Some(comp) = iter.next() {
match comp {
// Linux paths don't have prefixes.
Component::Prefix(_) => {
return Err(Error::new(
ErrorKind::Other,
format!("Invalid path prefix in: {}", unsafe_path.as_ref().display()),
));
}
// `RootDir` should always be the first component, and Path::components() ensures
// that.
Component::RootDir | Component::CurDir => {
continue 'next_comp;
}
Component::ParentDir => {
subpath.pop();
}
Component::Normal(n) => {
let path = root.join(&subpath).join(n);
if let Ok(v) = path.read_link() {
nlinks += 1;
if nlinks > MAX_SYMLINK_DEPTH {
return Err(Error::new(
ErrorKind::Other,
format!(
"Too many levels of symlinks: {}",
unsafe_path.as_ref().display()
),
));
}
curr_path = if v.is_absolute() {
v.join(iter.as_path())
} else {
subpath.join(v).join(iter.as_path())
};
continue 'restart;
} else {
subpath.push(n);
}
}
}
}
return Ok((root, subpath));
}
}
/// Resolve `unsafe_path` to a relative path, rooted at and constrained by `root`.
///
/// The `scoped_resolve()` function assumes `root` exists and is an absolute path. It processes
/// each path component in `unsafe_path` as below:
/// - assume it's not a symlink and output if the component doesn't exist yet.
/// - ignore if it's "/" or ".".
/// - go to parent directory but constrained by `root` if it's "..".
/// - recursively resolve to the real path if it's a symlink. All symlink resolutions will be
/// constrained by `root`.
/// - otherwise output the path component.
///
/// # Arguments
/// - `root`: the absolute path to constrain the symlink resolution.
/// - `unsafe_path`: the path to resolve.
///
/// Note that the guarantees provided by this function only apply if the path components in the
/// returned PathBuf are not modified (in other words are not replaced with symlinks on the
/// filesystem) after this function has returned. You may use [crate::PinnedPathBuf] to protect
/// from such TOCTOU attacks.
pub fn scoped_resolve<R: AsRef<Path>, U: AsRef<Path>>(root: R, unsafe_path: U) -> Result<PathBuf> {
do_scoped_resolve(root, unsafe_path).map(|(_root, path)| path)
}
/// Safely join `unsafe_path` to `root`, and ensure `unsafe_path` is scoped under `root`.
///
/// The `scoped_join()` function assumes `root` exists and is an absolute path. It safely joins the
/// two given paths and ensures:
/// - The returned path is guaranteed to be scoped inside `root`.
/// - Any symbolic links in the path are evaluated with the given `root` treated as the root of the
/// filesystem, similar to a chroot.
///
/// It's modelled after [secure_join](https://github.com/cyphar/filepath-securejoin), but only
/// for Linux systems.
///
/// # Arguments
/// - `root`: the absolute path to scope the symlink evaluation.
/// - `unsafe_path`: the path to evaluated and joint with `root`. It is unsafe since it may try to
/// escape from the `root` by using "../" or symlinks.
///
/// # Security
/// On success return, the `scoped_join()` function guarantees that:
/// - The resulting PathBuf must be a child path of `root` and will not contain any symlink path
/// components (they will all get expanded).
/// - When expanding symlinks, all symlink path components must be resolved relative to the provided
/// `root`. In particular, this can be considered a userspace implementation of how chroot(2)
/// operates on file paths.
/// - Non-existent path components are unaffected.
///
/// Note that the guarantees provided by this function only apply if the path components in the
/// returned string are not modified (in other words are not replaced with symlinks on the
/// filesystem) after this function has returned. You may use [crate::PinnedPathBuf] to protect
/// from such TOCTTOU attacks.
pub fn scoped_join<R: AsRef<Path>, U: AsRef<Path>>(root: R, unsafe_path: U) -> Result<PathBuf> {
do_scoped_resolve(root, unsafe_path).map(|(root, path)| root.join(path))
}
#[cfg(test)]
mod tests {
use super::*;
use std::fs::DirBuilder;
use std::os::unix::fs;
use tempfile::tempdir;
#[allow(dead_code)]
#[derive(Debug)]
struct TestData<'a> {
name: &'a str,
rootfs: &'a Path,
unsafe_path: &'a str,
result: &'a str,
}
fn exec_tests(tests: &[TestData]) {
for (i, t) in tests.iter().enumerate() {
// Create a string containing details of the test
let msg = format!("test[{}]: {:?}", i, t);
let result = scoped_resolve(t.rootfs, t.unsafe_path).unwrap();
let msg = format!("{}, result: {:?}", msg, result);
// Perform the checks
assert_eq!(&result, Path::new(t.result), "{}", msg);
}
}
#[test]
fn test_scoped_resolve() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
DirBuilder::new()
.create(rootfs_dir.path().join("b"))
.unwrap();
fs::symlink(rootfs_dir.path().join("b"), rootfs_dir.path().join("a")).unwrap();
let rootfs_path = &rootfs_dir.path().join("a");
let tests = [
TestData {
name: "normal path",
rootfs: rootfs_path,
unsafe_path: "a/b/c",
result: "a/b/c",
},
TestData {
name: "path with .. at beginning",
rootfs: rootfs_path,
unsafe_path: "../../../a/b/c",
result: "a/b/c",
},
TestData {
name: "path with complex .. pattern",
rootfs: rootfs_path,
unsafe_path: "../../../a/../../b/../../c",
result: "c",
},
TestData {
name: "path with .. in middle",
rootfs: rootfs_path,
unsafe_path: "/usr/bin/../../bin/ls",
result: "bin/ls",
},
TestData {
name: "path with . and ..",
rootfs: rootfs_path,
unsafe_path: "/usr/./bin/../../bin/./ls",
result: "bin/ls",
},
TestData {
name: "path with . at end",
rootfs: rootfs_path,
unsafe_path: "/usr/./bin/../../bin/./ls/.",
result: "bin/ls",
},
TestData {
name: "path try to escape by ..",
rootfs: rootfs_path,
unsafe_path: "/usr/./bin/../../../../bin/./ls/../ls",
result: "bin/ls",
},
TestData {
name: "path with .. at the end",
rootfs: rootfs_path,
unsafe_path: "/usr/./bin/../../bin/./ls/..",
result: "bin",
},
TestData {
name: "path ..",
rootfs: rootfs_path,
unsafe_path: "..",
result: "",
},
TestData {
name: "path .",
rootfs: rootfs_path,
unsafe_path: ".",
result: "",
},
TestData {
name: "path /",
rootfs: rootfs_path,
unsafe_path: "/",
result: "",
},
TestData {
name: "empty path",
rootfs: rootfs_path,
unsafe_path: "",
result: "",
},
];
exec_tests(&tests);
}
#[test]
fn test_scoped_resolve_invalid() {
scoped_resolve("./root_is_not_absolute_path", ".").unwrap_err();
scoped_resolve("C:", ".").unwrap_err();
scoped_resolve(r#"\\server\test"#, ".").unwrap_err();
scoped_resolve(r#"http://localhost/test"#, ".").unwrap_err();
// Chinese Unicode characters
scoped_resolve(r#"您好"#, ".").unwrap_err();
}
#[test]
fn test_scoped_resolve_symlink() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
let rootfs_path = &rootfs_dir.path();
std::fs::create_dir(rootfs_path.join("symlink_dir")).unwrap();
fs::symlink("../../../", rootfs_path.join("1")).unwrap();
let tests = [TestData {
name: "relative symlink beyond root",
rootfs: rootfs_path,
unsafe_path: "1",
result: "",
}];
exec_tests(&tests);
fs::symlink("/dddd", rootfs_path.join("2")).unwrap();
let tests = [TestData {
name: "abs symlink pointing to non-exist directory",
rootfs: rootfs_path,
unsafe_path: "2",
result: "dddd",
}];
exec_tests(&tests);
fs::symlink("/", rootfs_path.join("3")).unwrap();
let tests = [TestData {
name: "abs symlink pointing to /",
rootfs: rootfs_path,
unsafe_path: "3",
result: "",
}];
exec_tests(&tests);
fs::symlink("usr/bin/../bin/ls", rootfs_path.join("4")).unwrap();
let tests = [TestData {
name: "symlink with one ..",
rootfs: rootfs_path,
unsafe_path: "4",
result: "usr/bin/ls",
}];
exec_tests(&tests);
fs::symlink("usr/bin/../../bin/ls", rootfs_path.join("5")).unwrap();
let tests = [TestData {
name: "symlink with two ..",
rootfs: rootfs_path,
unsafe_path: "5",
result: "bin/ls",
}];
exec_tests(&tests);
fs::symlink(
"../usr/bin/../../../bin/ls",
rootfs_path.join("symlink_dir/6"),
)
.unwrap();
let tests = [TestData {
name: "symlink try to escape",
rootfs: rootfs_path,
unsafe_path: "symlink_dir/6",
result: "bin/ls",
}];
exec_tests(&tests);
// Detect symlink loop.
fs::symlink("/endpoint_b", rootfs_path.join("endpoint_a")).unwrap();
fs::symlink("/endpoint_a", rootfs_path.join("endpoint_b")).unwrap();
scoped_resolve(rootfs_path, "endpoint_a").unwrap_err();
}
#[test]
fn test_scoped_join() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
let rootfs_path = &rootfs_dir.path();
assert_eq!(
scoped_join(&rootfs_path, "a").unwrap(),
rootfs_path.join("a")
);
assert_eq!(
scoped_join(&rootfs_path, "./a").unwrap(),
rootfs_path.join("a")
);
assert_eq!(
scoped_join(&rootfs_path, "././a").unwrap(),
rootfs_path.join("a")
);
assert_eq!(
scoped_join(&rootfs_path, "c/d/../../a").unwrap(),
rootfs_path.join("a")
);
assert_eq!(
scoped_join(&rootfs_path, "c/d/../../../.././a").unwrap(),
rootfs_path.join("a")
);
assert_eq!(
scoped_join(&rootfs_path, "../../a").unwrap(),
rootfs_path.join("a")
);
assert_eq!(
scoped_join(&rootfs_path, "./../a").unwrap(),
rootfs_path.join("a")
);
}
#[test]
fn test_scoped_join_symlink() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
let rootfs_path = &rootfs_dir.path();
DirBuilder::new()
.recursive(true)
.create(rootfs_dir.path().join("b/c"))
.unwrap();
fs::symlink("b/c", rootfs_dir.path().join("a")).unwrap();
let target = rootfs_path.join("b/c");
assert_eq!(scoped_join(&rootfs_path, "a").unwrap(), target);
assert_eq!(scoped_join(&rootfs_path, "./a").unwrap(), target);
assert_eq!(scoped_join(&rootfs_path, "././a").unwrap(), target);
assert_eq!(scoped_join(&rootfs_path, "b/c/../../a").unwrap(), target);
assert_eq!(
scoped_join(&rootfs_path, "b/c/../../../.././a").unwrap(),
target
);
assert_eq!(scoped_join(&rootfs_path, "../../a").unwrap(), target);
assert_eq!(scoped_join(&rootfs_path, "./../a").unwrap(), target);
assert_eq!(scoped_join(&rootfs_path, "a/../../../a").unwrap(), target);
assert_eq!(scoped_join(&rootfs_path, "a/../../../b/c").unwrap(), target);
}
#[test]
fn test_scoped_join_symlink_loop() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
let rootfs_path = &rootfs_dir.path();
fs::symlink("/endpoint_b", rootfs_path.join("endpoint_a")).unwrap();
fs::symlink("/endpoint_a", rootfs_path.join("endpoint_b")).unwrap();
scoped_join(rootfs_path, "endpoint_a").unwrap_err();
}
#[test]
fn test_scoped_join_unicode_character() {
// create temporary directory to emulate container rootfs with symlink
let rootfs_dir = tempdir().expect("failed to create tmpdir");
let rootfs_path = &rootfs_dir.path().canonicalize().unwrap();
let path = scoped_join(rootfs_path, "您好").unwrap();
assert_eq!(path, rootfs_path.join("您好"));
let path = scoped_join(rootfs_path, "../../../您好").unwrap();
assert_eq!(path, rootfs_path.join("您好"));
let path = scoped_join(rootfs_path, "。。/您好").unwrap();
assert_eq!(path, rootfs_path.join("。。/您好"));
let path = scoped_join(rootfs_path, "您好/../../test").unwrap();
assert_eq!(path, rootfs_path.join("test"));
}
}

View File

@@ -2,7 +2,6 @@
*.patch
*.swp
coverage.txt
coverage.txt.tmp
coverage.html
.git-commit
.git-commit.tmp

View File

@@ -158,22 +158,15 @@ DEFMEMSZ := 2048
# - vm template memory
# - hugepage memory
DEFMEMSLOTS := 10
# Default maximum memory in MiB
DEFMAXMEMSZ := 0
#Default number of bridges
DEFBRIDGES := 1
DEFENABLEANNOTATIONS := [\"enable_iommu\"]
DEFENABLEANNOTATIONS := []
DEFDISABLEGUESTSECCOMP := true
DEFDISABLEGUESTEMPTYDIR := false
#Default experimental features enabled
DEFAULTEXPFEATURES := []
DEFDISABLESELINUX := false
#Default SeccomSandbox param
#The same default policy is used by libvirt
#More explanation on https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg03348.html
# Note: "elevateprivileges=deny" doesn't work with daemonize option, so it's removed from the seccomp sandbox
DEFSECCOMPSANDBOXPARAM := on,obsolete=deny,spawn=deny,resourcecontrol=deny
#Default entropy source
DEFENTROPYSOURCE := /dev/urandom
@@ -182,10 +175,7 @@ DEFVALIDENTROPYSOURCES := [\"/dev/urandom\",\"/dev/random\",\"\"]
DEFDISABLEBLOCK := false
DEFSHAREDFS_CLH_VIRTIOFS := virtio-fs
DEFSHAREDFS_QEMU_VIRTIOFS := virtio-fs
DEFVIRTIOFSDAEMON := $(LIBEXECDIR)/virtiofsd
ifeq ($(ARCH),ppc64le)
DEFVIRTIOFSDAEMON := $(LIBEXECDIR)/kata-qemu/virtiofsd
endif
DEFVALIDVIRTIOFSDAEMONPATHS := [\"$(DEFVIRTIOFSDAEMON)\"]
# Default DAX mapping cache size in MiB
#if value is 0, DAX is not enabled
@@ -444,7 +434,6 @@ USER_VARS += DEFMAXVCPUS
USER_VARS += DEFMAXVCPUS_ACRN
USER_VARS += DEFMEMSZ
USER_VARS += DEFMEMSLOTS
USER_VARS += DEFMAXMEMSZ
USER_VARS += DEFBRIDGES
USER_VARS += DEFNETWORKMODEL_ACRN
USER_VARS += DEFNETWORKMODEL_CLH
@@ -467,7 +456,6 @@ USER_VARS += DEFVIRTIOFSCACHE
USER_VARS += DEFVIRTIOFSEXTRAARGS
USER_VARS += DEFENABLEANNOTATIONS
USER_VARS += DEFENABLEIOTHREADS
USER_VARS += DEFSECCOMPSANDBOXPARAM
USER_VARS += DEFENABLEVHOSTUSERSTORE
USER_VARS += DEFVHOSTUSERSTOREPATH
USER_VARS += DEFVALIDVHOSTUSERSTOREPATHS
@@ -604,11 +592,11 @@ generate-config: $(CONFIGS)
test: hook go-test
hook:
make -C pkg/katautils/mockhook
make -C virtcontainers hook
go-test: $(GENERATED_FILES)
go clean -testcache
$(QUIET_TEST)./go-test.sh
$(QUIET_TEST)../../ci/go-test.sh
fast-test: $(GENERATED_FILES)
go clean -testcache

View File

@@ -87,27 +87,6 @@ following locations (in order):
> **Note:** For both binaries, the first path that exists will be used.
#### Drop-in configuration file fragments
To enable changing configuration without changing the configuration file
itself, drop-in configuration file fragments are supported. Once a
configuration file is parsed, if there is a subdirectory called `config.d` in
the same directory as the configuration file its contents will be loaded
in alphabetical order and each item will be parsed as a config file. Settings
loaded from these configuration file fragments override settings loaded from
the main configuration file and earlier fragments. Users are encouraged to use
familiar naming conventions to order the fragments (e.g. `config.d/10-this`,
`config.d/20-that` etc.).
Non-existent or empty `config.d` directory is not an error (in other words, not
using configuration file fragments is fine). On the other hand, if fragments
are used, they must be valid - any errors while parsing fragments (unreadable
fragment files, contents not valid TOML) are treated the same as errors
while parsing the main configuration file. A `config.d` subdirectory affects
only the `configuration.toml` _in the same directory_. For fragments in
`config.d` to be parsed, there has to be a valid main configuration file _in
that location_ (it can be empty though).
### Hypervisor specific configuration
Kata Containers supports multiple hypervisors so your `configuration.toml`
@@ -146,7 +125,7 @@ $ kata-runtime env
For detailed information and analysis on obtaining logs for other system
components, see the documentation for the
[`kata-log-parser`](../tools/log-parser)
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
tool.
### Kata containerd shimv2
@@ -180,7 +159,7 @@ See [the community repository](https://github.com/kata-containers/community).
### Contact
See [how to reach the community](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md#contact).
See [how to reach the community](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#contact).
## Further information

View File

@@ -15,14 +15,6 @@ Available metrics include:
All the provided metrics are in Prometheus format. While `kata-monitor` can be used as a standalone daemon on any host running Kata Containers workloads and can be used for retrieving profiling data from the running Kata runtimes, its main expected usage is to be deployed as a DaemonSet on a Kubernetes cluster: there Prometheus should scrape the metrics from the kata-monitor endpoints.
For more information on the Kata Containers metrics architecture and a detailed list of the available metrics provided by Kata monitor check the [Kata 2.0 Metrics Design](../../../../docs/design/kata-2-0-metrics.md) document.
## Local network considerations
The `kata-monitor` daemon is not run unless explicitly
[started](#kata-monitor-arguments). However, when it is running it
will accept connections on the `localhost` network interface (by
default) and provide metrics to any client process that connects to
it, whether they are privileged or not.
## Usage
Each `kata-monitor` instance detects and monitors the Kata Container workloads running on the same node.

View File

@@ -8,6 +8,7 @@ package main
import (
"context"
"flag"
"os"
"testing"
"github.com/stretchr/testify/assert"
@@ -42,9 +43,11 @@ func TestFactoryCLIFunctionNoRuntimeConfig(t *testing.T) {
func TestFactoryCLIFunctionInit(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
runtimeConfig, err := newTestRuntimeConfig(tmpdir, true)
runtimeConfig, err := newTestRuntimeConfig(tmpdir, testConsole, true)
assert.NoError(err)
set := flag.NewFlagSet("", 0)
@@ -89,9 +92,11 @@ func TestFactoryCLIFunctionInit(t *testing.T) {
func TestFactoryCLIFunctionDestroy(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
runtimeConfig, err := newTestRuntimeConfig(tmpdir, true)
runtimeConfig, err := newTestRuntimeConfig(tmpdir, testConsole, true)
assert.NoError(err)
set := flag.NewFlagSet("", 0)
@@ -121,9 +126,11 @@ func TestFactoryCLIFunctionDestroy(t *testing.T) {
func TestFactoryCLIFunctionStatus(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
runtimeConfig, err := newTestRuntimeConfig(tmpdir, true)
runtimeConfig, err := newTestRuntimeConfig(tmpdir, testConsole, true)
assert.NoError(err)
set := flag.NewFlagSet("", 0)

View File

@@ -71,7 +71,11 @@ func TestCCCheckCLIFunction(t *testing.T) {
func TestCheckCheckKernelModulesNoNesting(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedSysModuleDir := sysModuleDir
savedProcCPUInfo := procCPUInfo
@@ -87,7 +91,7 @@ func TestCheckCheckKernelModulesNoNesting(t *testing.T) {
procCPUInfo = savedProcCPUInfo
}()
err := os.MkdirAll(sysModuleDir, testDirMode)
err = os.MkdirAll(sysModuleDir, testDirMode)
if err != nil {
t.Fatal(err)
}
@@ -152,7 +156,11 @@ func TestCheckCheckKernelModulesNoNesting(t *testing.T) {
func TestCheckCheckKernelModulesNoUnrestrictedGuest(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedSysModuleDir := sysModuleDir
savedProcCPUInfo := procCPUInfo
@@ -168,7 +176,7 @@ func TestCheckCheckKernelModulesNoUnrestrictedGuest(t *testing.T) {
procCPUInfo = savedProcCPUInfo
}()
err := os.MkdirAll(sysModuleDir, testDirMode)
err = os.MkdirAll(sysModuleDir, testDirMode)
if err != nil {
t.Fatal(err)
}
@@ -247,7 +255,11 @@ func TestCheckHostIsVMContainerCapable(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedSysModuleDir := sysModuleDir
savedProcCPUInfo := procCPUInfo
@@ -263,7 +275,7 @@ func TestCheckHostIsVMContainerCapable(t *testing.T) {
procCPUInfo = savedProcCPUInfo
}()
err := os.MkdirAll(sysModuleDir, testDirMode)
err = os.MkdirAll(sysModuleDir, testDirMode)
if err != nil {
t.Fatal(err)
}
@@ -393,7 +405,11 @@ func TestArchKernelParamHandler(t *testing.T) {
func TestKvmIsUsable(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedKvmDevice := kvmDevice
fakeKVMDevice := filepath.Join(dir, "kvm")
@@ -403,7 +419,7 @@ func TestKvmIsUsable(t *testing.T) {
kvmDevice = savedKvmDevice
}()
err := kvmIsUsable()
err = kvmIsUsable()
assert.Error(err)
err = createEmptyFile(fakeKVMDevice)
@@ -441,7 +457,9 @@ foo : bar
func TestSetCPUtype(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
savedArchRequiredCPUFlags := archRequiredCPUFlags
savedArchRequiredCPUAttribs := archRequiredCPUAttribs

View File

@@ -67,7 +67,11 @@ foo : bar
{validContents, validNormalizeVendorName, validNormalizeModelName, false},
}
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
savedProcCPUInfo := procCPUInfo
@@ -80,7 +84,7 @@ foo : bar
procCPUInfo = savedProcCPUInfo
}()
_, _, err := getCPUDetails()
_, _, err = getCPUDetails()
// ENOENT
assert.Error(t, err)
assert.True(t, os.IsNotExist(err))

View File

@@ -9,6 +9,7 @@
package main
import (
"os"
"testing"
"github.com/stretchr/testify/assert"
@@ -17,7 +18,9 @@ import (
func testSetCPUTypeGeneric(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
savedArchRequiredCPUFlags := archRequiredCPUFlags
savedArchRequiredCPUAttribs := archRequiredCPUAttribs

View File

@@ -7,6 +7,7 @@ package main
import (
"fmt"
"os"
"path/filepath"
"testing"
@@ -117,7 +118,11 @@ func TestArchKernelParamHandler(t *testing.T) {
func TestKvmIsUsable(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedKvmDevice := kvmDevice
fakeKVMDevice := filepath.Join(dir, "kvm")
@@ -127,7 +132,7 @@ func TestKvmIsUsable(t *testing.T) {
kvmDevice = savedKvmDevice
}()
err := kvmIsUsable()
err = kvmIsUsable()
assert.Error(err)
err = createEmptyFile(fakeKVMDevice)

View File

@@ -7,6 +7,7 @@ package main
import (
"fmt"
"os"
"path/filepath"
"testing"
@@ -116,7 +117,11 @@ func TestArchKernelParamHandler(t *testing.T) {
func TestKvmIsUsable(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedKvmDevice := kvmDevice
fakeKVMDevice := filepath.Join(dir, "kvm")
@@ -126,7 +131,7 @@ func TestKvmIsUsable(t *testing.T) {
kvmDevice = savedKvmDevice
}()
err := kvmIsUsable()
err = kvmIsUsable()
assert.Error(err)
err = createEmptyFile(fakeKVMDevice)

View File

@@ -155,7 +155,11 @@ func makeCPUInfoFile(path, vendorID, flags string) error {
// nolint: unused, deadcode
func genericTestGetCPUDetails(t *testing.T, validVendor string, validModel string, validContents string, data []testCPUDetail) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
savedProcCPUInfo := procCPUInfo
@@ -168,7 +172,7 @@ func genericTestGetCPUDetails(t *testing.T, validVendor string, validModel strin
procCPUInfo = savedProcCPUInfo
}()
_, _, err := getCPUDetails()
_, _, err = getCPUDetails()
// ENOENT
assert.Error(t, err)
assert.True(t, os.IsNotExist(err))
@@ -193,7 +197,11 @@ func genericTestGetCPUDetails(t *testing.T, validVendor string, validModel strin
func genericCheckCLIFunction(t *testing.T, cpuData []testCPUData, moduleData []testModuleData) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
_, config, err := makeRuntimeConfig(dir)
assert.NoError(err)
@@ -299,11 +307,15 @@ func TestCheckGetCPUInfo(t *testing.T) {
{"foo\n\nbar\nbaz\n\n", "foo", false},
}
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
file := filepath.Join(dir, "cpuinfo")
// file doesn't exist
_, err := getCPUInfo(file)
_, err = getCPUInfo(file)
assert.Error(err)
for _, d := range data {
@@ -515,7 +527,11 @@ func TestCheckHaveKernelModule(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedModProbeCmd := modProbeCmd
savedSysModuleDir := sysModuleDir
@@ -529,7 +545,7 @@ func TestCheckHaveKernelModule(t *testing.T) {
sysModuleDir = savedSysModuleDir
}()
err := os.MkdirAll(sysModuleDir, testDirMode)
err = os.MkdirAll(sysModuleDir, testDirMode)
if err != nil {
t.Fatal(err)
}
@@ -561,7 +577,11 @@ func TestCheckHaveKernelModule(t *testing.T) {
func TestCheckCheckKernelModules(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedModProbeCmd := modProbeCmd
savedSysModuleDir := sysModuleDir
@@ -575,7 +595,7 @@ func TestCheckCheckKernelModules(t *testing.T) {
sysModuleDir = savedSysModuleDir
}()
err := os.MkdirAll(sysModuleDir, testDirMode)
err = os.MkdirAll(sysModuleDir, testDirMode)
if err != nil {
t.Fatal(err)
}
@@ -642,7 +662,11 @@ func TestCheckCheckKernelModulesUnreadableFile(t *testing.T) {
t.Skip(ktu.TestDisabledNeedNonRoot)
}
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
testData := map[string]kernelModule{
"foo": {
@@ -667,7 +691,7 @@ func TestCheckCheckKernelModulesUnreadableFile(t *testing.T) {
}()
modPath := filepath.Join(sysModuleDir, "foo/parameters")
err := os.MkdirAll(modPath, testDirMode)
err = os.MkdirAll(modPath, testDirMode)
assert.NoError(err)
modParamFile := filepath.Join(modPath, "param1")
@@ -686,7 +710,11 @@ func TestCheckCheckKernelModulesUnreadableFile(t *testing.T) {
func TestCheckCheckKernelModulesInvalidFileContents(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
testData := map[string]kernelModule{
"foo": {
@@ -711,7 +739,7 @@ func TestCheckCheckKernelModulesInvalidFileContents(t *testing.T) {
}()
modPath := filepath.Join(sysModuleDir, "foo/parameters")
err := os.MkdirAll(modPath, testDirMode)
err = os.MkdirAll(modPath, testDirMode)
assert.NoError(err)
modParamFile := filepath.Join(modPath, "param1")
@@ -727,7 +755,11 @@ func TestCheckCheckKernelModulesInvalidFileContents(t *testing.T) {
func TestCheckCLIFunctionFail(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
_, config, err := makeRuntimeConfig(dir)
assert.NoError(err)
@@ -756,7 +788,11 @@ func TestCheckCLIFunctionFail(t *testing.T) {
func TestCheckKernelParamHandler(t *testing.T) {
assert := assert.New(t)
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedModProbeCmd := modProbeCmd
savedSysModuleDir := sysModuleDir
@@ -834,7 +870,9 @@ func TestCheckKernelParamHandler(t *testing.T) {
func TestArchRequiredKernelModules(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
_, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(err)
@@ -847,7 +885,11 @@ func TestArchRequiredKernelModules(t *testing.T) {
return
}
dir := t.TempDir()
dir, err := os.MkdirTemp("", "")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(dir)
savedModProbeCmd := modProbeCmd
savedSysModuleDir := sysModuleDir

View File

@@ -6,6 +6,7 @@
package main
import (
"os"
"testing"
"github.com/stretchr/testify/assert"
@@ -21,7 +22,9 @@ func getExpectedHostDetails(tmpdir string) (HostInfo, error) {
func TestEnvGetEnvInfoSetsCPUType(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
savedArchRequiredCPUFlags := archRequiredCPUFlags
savedArchRequiredCPUAttribs := archRequiredCPUAttribs

View File

@@ -9,6 +9,7 @@
package main
import (
"os"
"testing"
"github.com/stretchr/testify/assert"
@@ -17,7 +18,9 @@ import (
func testEnvGetEnvInfoSetsCPUTypeGeneric(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
savedArchRequiredCPUFlags := archRequiredCPUFlags
savedArchRequiredCPUAttribs := archRequiredCPUAttribs

View File

@@ -364,7 +364,11 @@ func TestEnvGetMetaInfo(t *testing.T) {
}
func TestEnvGetHostInfo(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
expectedHostDetails, err := getExpectedHostDetails(tmpdir)
assert.NoError(t, err)
@@ -385,9 +389,13 @@ func TestEnvGetHostInfo(t *testing.T) {
}
func TestEnvGetHostInfoNoProcCPUInfo(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
_, err := getExpectedHostDetails(tmpdir)
_, err = getExpectedHostDetails(tmpdir)
assert.NoError(t, err)
err = os.Remove(procCPUInfo)
@@ -398,9 +406,13 @@ func TestEnvGetHostInfoNoProcCPUInfo(t *testing.T) {
}
func TestEnvGetHostInfoNoOSRelease(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
_, err := getExpectedHostDetails(tmpdir)
_, err = getExpectedHostDetails(tmpdir)
assert.NoError(t, err)
err = os.Remove(osRelease)
@@ -411,9 +423,13 @@ func TestEnvGetHostInfoNoOSRelease(t *testing.T) {
}
func TestEnvGetHostInfoNoProcVersion(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
_, err := getExpectedHostDetails(tmpdir)
_, err = getExpectedHostDetails(tmpdir)
assert.NoError(t, err)
err = os.Remove(procVersion)
@@ -424,7 +440,11 @@ func TestEnvGetHostInfoNoProcVersion(t *testing.T) {
}
func TestEnvGetEnvInfo(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
// Run test twice to ensure the individual component debug+trace
// options are tested.
@@ -454,7 +474,9 @@ func TestEnvGetEnvInfo(t *testing.T) {
func TestEnvGetEnvInfoNoHypervisorVersion(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(err)
@@ -479,14 +501,20 @@ func TestEnvGetEnvInfoNoHypervisorVersion(t *testing.T) {
func TestEnvGetEnvInfoAgentError(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
_, _, err := makeRuntimeConfig(tmpdir)
_, _, err = makeRuntimeConfig(tmpdir)
assert.NoError(err)
}
func TestEnvGetEnvInfoNoOSRelease(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -502,7 +530,11 @@ func TestEnvGetEnvInfoNoOSRelease(t *testing.T) {
}
func TestEnvGetEnvInfoNoProcCPUInfo(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -518,7 +550,11 @@ func TestEnvGetEnvInfoNoProcCPUInfo(t *testing.T) {
}
func TestEnvGetEnvInfoNoProcVersion(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -534,7 +570,11 @@ func TestEnvGetEnvInfoNoProcVersion(t *testing.T) {
}
func TestEnvGetRuntimeInfo(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -547,7 +587,11 @@ func TestEnvGetRuntimeInfo(t *testing.T) {
}
func TestEnvGetAgentInfo(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
_, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -682,7 +726,11 @@ func testEnvShowJSONSettings(t *testing.T, tmpdir string, tmpfile *os.File) erro
}
func TestEnvShowSettings(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
tmpfile, err := os.CreateTemp("", "envShowSettings-")
assert.NoError(t, err)
@@ -699,7 +747,11 @@ func TestEnvShowSettings(t *testing.T) {
}
func TestEnvShowSettingsInvalidFile(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
tmpfile, err := os.CreateTemp("", "envShowSettings-")
assert.NoError(t, err)
@@ -719,7 +771,11 @@ func TestEnvShowSettingsInvalidFile(t *testing.T) {
}
func TestEnvHandleSettings(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -749,7 +805,9 @@ func TestEnvHandleSettings(t *testing.T) {
func TestEnvHandleSettingsInvalidParams(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
configFile, _, err := makeRuntimeConfig(tmpdir)
assert.NoError(err)
@@ -801,7 +859,11 @@ func TestEnvHandleSettingsInvalidRuntimeConfigType(t *testing.T) {
}
func TestEnvCLIFunction(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -842,7 +904,11 @@ func TestEnvCLIFunction(t *testing.T) {
}
func TestEnvCLIFunctionFail(t *testing.T) {
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpdir)
configFile, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(t, err)
@@ -874,7 +940,9 @@ func TestEnvCLIFunctionFail(t *testing.T) {
func TestGetHypervisorInfo(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
_, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(err)
@@ -894,7 +962,9 @@ func TestGetHypervisorInfo(t *testing.T) {
func TestGetHypervisorInfoSocket(t *testing.T) {
assert := assert.New(t)
tmpdir := t.TempDir()
tmpdir, err := os.MkdirTemp("", "")
assert.NoError(err)
defer os.RemoveAll(tmpdir)
_, config, err := makeRuntimeConfig(tmpdir)
assert.NoError(err)

View File

@@ -1,122 +0,0 @@
// Copyright (c) 2022 Apple Inc.
//
// SPDX-License-Identifier: Apache-2.0
//
package main
import (
"fmt"
"io/ioutil"
containerdshim "github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2"
"github.com/kata-containers/kata-containers/src/runtime/pkg/katautils"
"github.com/kata-containers/kata-containers/src/runtime/pkg/utils/shimclient"
"github.com/urfave/cli"
)
var (
sandboxID string
isIPv6 bool
)
var iptablesSubCmds = []cli.Command{
getIPTablesCommand,
setIPTablesCommand,
}
var kataIPTablesCommand = cli.Command{
Name: "iptables",
Usage: "get or set iptables within the Kata Containers guest",
Subcommands: iptablesSubCmds,
Action: func(context *cli.Context) {
cli.ShowSubcommandHelp(context)
},
}
var getIPTablesCommand = cli.Command{
Name: "get",
Usage: "get iptables from the Kata Containers guest",
Flags: []cli.Flag{
cli.StringFlag{
Name: "sandbox-id",
Usage: "the target sandbox for getting the iptables",
Required: true,
Destination: &sandboxID,
},
cli.BoolFlag{
Name: "v6",
Usage: "indicate we're requesting ipv6 iptables",
Destination: &isIPv6,
},
},
Action: func(c *cli.Context) error {
// verify sandbox exists:
if err := katautils.VerifyContainerID(sandboxID); err != nil {
return err
}
url := containerdshim.IPTablesUrl
if isIPv6 {
url = containerdshim.IP6TablesUrl
}
body, err := shimclient.DoGet(sandboxID, defaultTimeout, url)
if err != nil {
return err
}
fmt.Println(string(body))
return nil
},
}
var setIPTablesCommand = cli.Command{
Name: "set",
Usage: "set iptables in a specifc Kata Containers guest based on file",
Flags: []cli.Flag{
cli.StringFlag{
Name: "sandbox-id",
Usage: "the target sandbox for setting the iptables",
Required: true,
Destination: &sandboxID,
},
cli.BoolFlag{
Name: "v6",
Usage: "indicate we're requesting ipv6 iptables",
Destination: &isIPv6,
},
},
Action: func(c *cli.Context) error {
iptablesFile := c.Args().Get(0)
// verify sandbox exists:
if err := katautils.VerifyContainerID(sandboxID); err != nil {
return err
}
// verify iptables were provided:
if iptablesFile == "" {
return fmt.Errorf("iptables file not provided")
}
if !katautils.FileExists(iptablesFile) {
return fmt.Errorf("iptables file does not exist: %s", iptablesFile)
}
// Read file into buffer, and make request to the appropriate shim
buf, err := ioutil.ReadFile(iptablesFile)
if err != nil {
return err
}
url := containerdshim.IPTablesUrl
if isIPv6 {
url = containerdshim.IP6TablesUrl
}
if err = shimclient.DoPut(sandboxID, defaultTimeout, url, "application/octet-stream", buf); err != nil {
return fmt.Errorf("Error observed when making iptables-set request(%s): %s", iptablesFile, err)
}
return nil
},
}

Some files were not shown because too many files have changed in this diff Show More