Compare commits


328 Commits

Author SHA1 Message Date
Zvonko Kaiser
2777b13db7 Merge pull request #10742 from zvonkok/3.13.0-release
release: Bump version to 3.13.0
2025-01-16 10:05:48 -05:00
Aurélien Bombo
0d93f59f5b Merge pull request #10738 from microsoft/danmihai1/empty-pty-lines
runtime: skip empty Guest console output lines
2025-01-15 10:33:24 -06:00
Zvonko Kaiser
0b04f43ac6 release: Bump version to 3.13.0
Bump VERSION and helm-chart versions

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-01-15 16:13:22 +00:00
Zvonko Kaiser
365def9b4a Merge pull request #10735 from BbolroC/kubectl-create-retry-trusted-storage
tests: Introduce retry_kubectl_apply() for trusted storage
2025-01-14 21:59:45 -05:00
Dan Mihai
2e21f51375 runtime: skip empty Guest console output lines
Skip logging empty lines of text from the Guest console output, if
there are any such lines.

Without this change, the Guest console log from CLH + /dev/pts/0 has
twice as many lines of text. Half of these lines are empty.

Fixes: #10737

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-15 00:28:26 +00:00
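The change above boils down to filtering blank lines before they reach the logger. The real fix is in the Go runtime; the following is only a minimal shell stand-in for the same filter:

```bash
# Stand-in for the runtime change: forward console lines to the log
# only when they are non-empty.
log_console_output() {
    while IFS= read -r line; do
        # /dev/pts output may end lines with a carriage return; strip it
        # so a "\r\n"-terminated blank line is also treated as empty.
        line=${line%$'\r'}
        if [ -n "$line" ]; then
            printf '%s\n' "$line"
        fi
    done
}
```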
Hyounggyu Choi
f7816e9206 tests: Introduce retry_kubectl_apply() for trusted storage
On s390x, some tests for trusted storage occasionally failed due to:

```bash
etcdserver: request timed out
```

or

```bash
Internal error occurred: resource quota evaluation timed out
```

These timeouts were not observed previously on k3s but occur
sporadically on kubeadm. Importantly, they appear to be transient,
which means they can be ignored in most cases.

To address this, we introduced a new wrapper function, `retry_kubectl_apply()`,
for `kubectl create`. This function retries applying a given manifest up to 5
times if it fails due to a timeout. However, it will still catch and handle
any other errors during pod creation.

Fixes: #10651

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-14 21:15:44 +01:00
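The retry behavior described above can be sketched in shell. This is an illustrative paraphrase, not the actual test-library code; `kubectl_create` is a hypothetical wrapper standing in for `kubectl create -f`:

```bash
# Hypothetical wrapper so the retry logic below stays testable;
# in CI this would run: kubectl create -f "$1"
kubectl_create() {
    kubectl create -f "$1"
}

# Retry a manifest apply up to 5 times, but only for transient timeouts
# such as "etcdserver: request timed out" or "resource quota evaluation
# timed out"; any other error fails immediately.
retry_kubectl_apply() {
    local manifest="$1" attempt output
    for attempt in 1 2 3 4 5; do
        if output=$(kubectl_create "$manifest" 2>&1); then
            printf '%s\n' "$output"
            return 0
        fi
        case "$output" in
        *"timed out"*)
            echo "attempt ${attempt} hit a transient timeout, retrying" >&2
            sleep 1
            ;;
        *)
            printf '%s\n' "$output" >&2  # real failure: do not retry
            return 1
            ;;
        esac
    done
    return 1
}
```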
Fabiano Fidêncio
121ac0c5c0 Merge pull request #10727 from microsoft/danmihai1/mariner3-guest
image: bump mariner guest version to 3.0
2025-01-14 19:06:28 +01:00
Fabiano Fidêncio
3658ea2320 Merge pull request #10731 from microsoft/danmihai1/quiet-rootfs-build
rootfs: reduced console output by default
2025-01-14 19:02:42 +01:00
Chengyu Zhu
7d34ca4420 Merge pull request #10674 from bpradipt/fix-10398
agent: alternative implementation for sealed_secret as volume
2025-01-14 18:55:45 +08:00
Fabiano Fidêncio
4578969c5d Merge pull request #10730 from BbolroC/bump-coco-trustee
versions: Bump trustee to latest
2025-01-14 08:56:11 +01:00
Dan Mihai
0f522c09d9 rootfs: reduced console output by default
Use "set -x" only when the user specified DEBUG=1.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-13 19:34:05 +00:00
Pradipta Banerjee
36580bb642 tests: Update sealed secret CI value to base64url
The existing encoding was base64 and it fails due to
874948638a

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2025-01-13 09:37:05 -05:00
Hyounggyu Choi
2cdb549a75 versions: Bump trustee to latest
This update addresses an issue with token verification for SE and SNP
introduced in the last update by #10541.
Bumping the project to the latest commit resolves the issue.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-13 15:07:33 +01:00
Pradipta Banerjee
5218345e34 agent: alternative implementation for sealed_secret as volume
The earlier implementation relied on a specific mount-path prefix, `/sealed`,
to determine that the referenced secret is a sealed secret.
However, that was restrictive for certain use cases, as it forced
the user to always follow a specific mount-path naming convention.

This commit introduces an alternative implementation that relaxes the
restriction: a sealed secret can be mounted at any mount-path.
However, it comes with a potential performance penalty. The
implementation loops through all volume mounts and reads each file
to determine whether it is a sealed secret.

Fixes: #10398

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2025-01-11 12:36:44 -05:00
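A rough shell analogue of the new detection (the real implementation is in the Rust agent; the `sealed.` content marker assumed here is the prefix used by CoCo sealed secrets, and the function name is illustrative):

```bash
# Decide whether a mounted secret file is a sealed secret by reading
# its content instead of relying on a "/sealed" mount-path prefix.
is_sealed_secret_file() {
    [ "$(head -c 7 "$1" 2>/dev/null)" = "sealed." ]
}
```

This also makes the trade-off the commit mentions concrete: every volume mount is now opened and read, instead of a cheap path-prefix check.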
Dan Mihai
4707883b40 image: bump mariner guest version to 3.0
Use Mariner 3.0 (a.k.a., Azure Linux 3.0) as the Guest CI image.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-11 17:36:19 +00:00
Fabiano Fidêncio
2d9baf899a Merge pull request #10719 from msanft/msanft/runtime/fix-boolean-opts
runtime: use actual booleans for QMP `device_add` boolean options
2025-01-11 16:38:06 +01:00
Zvonko Kaiser
f08a9eac11 Merge pull request #10721 from stevenhorsman/more-metrics-latency-minimum-range-fixes
metrics: Increase latency test range
2025-01-10 21:59:39 -05:00
Moritz Sanft
e5735b221c runtime: use actual booleans for QMP device_add boolean options
Since
be93fd5372,
which is included in QEMU since version 9.2.0, the options for the
`device_add` QMP command need to be typed correctly.

This makes it so that instead of `"on"`, the value is set to `true`,
matching QEMU's expectations.

This has been tested on QEMU 9.2.0 and QEMU 9.1.2, so before and after
the change.

Compatibility with incorrectly typed options for the `device_add`
command has been deprecated since version 6.2.0 [^1].

[^1]:  https://qemu-project.gitlab.io/qemu/about/deprecated.html#incorrectly-typed-device-add-arguments-since-6-2

Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
2025-01-10 11:53:56 +01:00
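The difference is easiest to see in the raw QMP payload. A sketch; note that `share-rw` is just an example boolean device property here, not necessarily the option the patch touched:

```bash
# Before: a string flag, deprecated since QEMU 6.2 and rejected from 9.2.0.
old_args='{"execute":"device_add","arguments":{"driver":"virtio-blk-pci","share-rw":"on"}}'

# After: a real JSON boolean, matching QEMU's typed expectations.
new_args='{"execute":"device_add","arguments":{"driver":"virtio-blk-pci","share-rw":true}}'
```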
Wainer Moschetta
5fae2a9f91 Merge pull request #9871 from wainersm/fix-print_cluster_name
tests/gha-run-k8s-common: shorten AKS cluster name
2025-01-09 14:35:02 -03:00
stevenhorsman
aaae5b6d0f metrics: clh: Increase network-iperf3 range
We hit a failure with:
```
time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]"
```
The range is already very wide, but in the last 3 test runs I reviewed we got
a minimum value of 0.015s and a maximum of 0.052s, a possible spread of
roughly 350%, so I think we need a wide range to make this stable.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-09 11:25:57 +00:00
stevenhorsman
e946d9d5d3 metrics: qemu: Increase latency test range
After the kernel version bump, in the latest nightly run
https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400
the sequential read throughput result was 79.7% of the expected value (a failure)
and the sequential write was 84% of the expected value, fairly close to failing,
so increase their minimum ranges to make them more robust.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-09 11:25:50 +00:00
Wainer dos Santos Moschetta
badc208e9a tests/gha-run-k8s-common: shorten AKS cluster name
The az client restricts the name to fewer than 64 characters, and in
some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name
exceeds the limit. This changes the function to shorten the name:

* a SHA1 computed from the metadata is appended to the cluster's name
* the metadata is passed as plain text via --tags

Fixes: #9850
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-01-08 16:39:07 -03:00
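The renaming scheme above can be sketched as follows; the `kata-aks-` prefix and function name are illustrative, not the exact ones in the test scripts:

```bash
# Build a cluster name that always fits az's < 64-character limit:
# hash the descriptive metadata and append the digest to a short prefix.
# The human-readable metadata itself is passed separately via --tags.
short_cluster_name() {
    local metadata="$1" sha
    sha=$(printf '%s' "$metadata" | sha1sum | cut -d' ' -f1)
    printf 'kata-aks-%s\n' "$sha"   # 9 + 40 = 49 characters, always < 64
}
```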
Fabiano Fidêncio
8f8988fcd1 Merge pull request #10714 from fidencio/topic/update-virtiofsd
virtiofsd: Update to its v1.13.0 ( + one patch) release :-)
2025-01-08 17:59:29 +01:00
Fabiano Fidêncio
7e5e109255 Merge pull request #10541 from fitzthum/bump-trustee-010
Update Trustee and Guest Components
2025-01-08 17:44:13 +01:00
Fabiano Fidêncio
eb3fe0d27c Merge pull request #10717 from fidencio/topic/re-enable-oom-test-for-mariner
tests: Re-enable oom tests for mariner
2025-01-08 17:43:56 +01:00
Fabiano Fidêncio
65e267294b Merge pull request #10718 from stevenhorsman/metrics-blogbench-latency-minimal-range-increase
metrics: Increase latency minimum range
2025-01-08 17:09:36 +01:00
stevenhorsman
dc069d83b5 metrics: Increase latency test range
The bump to kernel 6.12 seems to have reduced the latency in
the metrics test, so increase the ranges for the minimal value,
to account for this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-08 15:11:49 +00:00
Fabiano Fidêncio
967d5afb42 Revert "tests: k8s: Skip one of the empty-dir tests"
This reverts commit 9aea7456fb.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Fabiano Fidêncio
7ae2ca4c31 virtiofsd: Update to its v1.13.0 + one patch release
Together with the bump, let's also bump the rust version needed to build
the package, with the caveat that virtiofsd doesn't actually use a
pinned version as part of their CI, so we're bumping to whatever is the
version on `alpine:rust` (which is used in their CI).

It's important to note that we're using a version which brings in one
extra patch apart from the release, as the next virtiofsd release will
happen at the end of February, 2025.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Fabiano Fidêncio
0af3536328 packaging: virtiofsd: Allow building a specific commit
So far we have only been building virtiofsd releases, but we'll need
to pin a specific commit until v1.14.0 is out, so let's add the
machinery needed to do so.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-08 14:07:34 +01:00
Tobin Feldman-Fitzthum
41c7f076fa packaging: updating guest components build script
The guest-components directory has been re-arranged slightly. Adjust the
installation path of the LUKS helper script to account for this.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2025-01-07 16:59:10 -06:00
Tobin Feldman-Fitzthum
cafc7d6819 versions: update trustee and guest components
Trustee has some new features, including a plugin backend, support for
PKCS11 resources, improvements to token verification, adjustments to
logging, and more.

Also update guest-components to pick up improvements and keep the KBS
protocol in sync.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2025-01-07 16:59:10 -06:00
Fabiano Fidêncio
53ac0f00c5 tests: Re-enable oom tests for mariner
Since we bumped to the 6.12.x LTS kernel, we've also adjusted the
aggressiveness of the OOM test, which may be enough to allow us to
re-enable it for mariner.

Fixes: #8821

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-07 18:33:17 +01:00
Fabiano Fidêncio
f4a39e8c40 Merge pull request #10468 from fidencio/topic/early-tests-on-next-lts-kernel
versions: Move kernel to the latest 6.12 release (the current LTS)
2025-01-07 18:02:04 +01:00
Fupan Li
bd56891f84 Merge pull request #10702 from lifupan/fix_containerdname
CI: change the containerd tarball name from cri-containerd-cni to containerd
2025-01-07 18:56:15 +08:00
Fupan Li
b19db40343 CI: change the containerd tarball name to containerd
Since https://github.com/containerd/containerd/pull/9096,
containerd no longer publishes cri-containerd-*.tar.gz release bundles,
so we'd better change the tarball name to "containerd".

Note that the containerd tarball contains the following files:

bin/
bin/containerd-shim
bin/ctr
bin/containerd-shim-runc-v1
bin/containerd-stress
bin/containerd
bin/containerd-shim-runc-v2

so we should untar it into the /usr/local directory instead of "/"
to stay aligned with the cri-containerd layout.

In addition, there is no containerd.service file, runc binary, or CNI
plugins included, so we should add a specific containerd.service file and
install the runc binary and CNI plugins separately.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-01-07 17:39:05 +08:00
Fabiano Fidêncio
9aea7456fb tests: k8s: Skip one of the empty-dir tests
An issue has been created for this, and we should fix the issue before
the next release.  However, for now, let's unblock the kernel bump and
have the test skipped.

Reference: https://github.com/kata-containers/kata-containers/issues/10706

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-06 21:48:20 +01:00
Fabiano Fidêncio
44ff602c64 tests: k8s: Be more aggressive to get OOM
Let's increase the number of bytes allocated per VM worker, so we can
hit the OOM sooner.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-01-06 21:48:20 +01:00
Fabiano Fidêncio
f563f0d3fc versions: Update kernel to v6.12.8
Lots of configs have been removed from the latest kernel. Update them
here to ease the next kernel upgrade.

Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1]
Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2]
Remove CONFIG_NET_SCH_CBQ [3]
Remove CONFIG_AUTOFS4_FS [4]
Remove CONFIG_EMBEDDED [5]
Remove CONFIG_ARCH_RANDOM & CONFIG_RANDOM_TRUST_CPU [6]

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a
[6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2&id=b9b01a5625b5a9e9d96d14d4a813a54e8a124f4b

Apart from the removals, CONFIG_CPU_MITIGATIONS is now a dependency for
CONFIG_RETPOLINE (which has been renamed to CONFIG_MITIGATION_RETPOLINE)
and CONFIG_PAGE_TABLE_ISOLATION (which has been renamed to
CONFIG_MITIGATION_PAGE_TABLE_ISOLATION).  I've added that to the
whitelist because we still build older versions of the kernel that
do not have that dependency.

Fixes: #8408
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-01-06 21:48:20 +01:00
Xuewei Niu
71b14d40f2 Merge pull request #10696 from teawater/kt
kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH
2025-01-02 14:04:37 +08:00
Hui Zhu
d15a7baedd kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH
We hit the following issue:

```bash
kata-ctl direct-volume add /kubelet/kata-direct-vol-002/directvol002
"{\"device\": \"/home/t4/teawater/coco/t.img\", \"volume-type\":
\"directvol\", \"fstype\": \"\", \"metadata\":"{}", \"options\": []}"
subsystem: kata-ctl_main
 Dec 30 09:43:41.150 ERRO Os {
    code: 2,
    kind: NotFound,
    message: "No such file or directory",
}
```

The reason is that KATA_DIRECT_VOLUME_ROOT_PATH does not exist.

This commit calls create_dir_all on KATA_DIRECT_VOLUME_ROOT_PATH before
join_path to handle this issue.

Fixes: #10695

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-30 17:55:49 +08:00
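In shell terms the fix above is just a `mkdir -p` before the path is used. A sketch, assuming the default root path documented for `kata-ctl direct-volume`; the function name is illustrative:

```bash
# create_dir_all + join_path, shell edition: make sure the root
# directory exists before composing paths underneath it.
add_direct_volume_path() {
    local root="${KATA_DIRECT_VOLUME_ROOT_PATH:-/run/kata-containers/shared/direct-volumes}"
    mkdir -p "$root"               # create_dir_all: no-op if it already exists
    printf '%s/%s\n' "$root" "$1"  # join_path
}
```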
Xuewei Niu
6400295940 Merge pull request #10683 from justxuewei/nxw/remove-mut
2024-12-29 00:49:38 +08:00
Fupan Li
2068801b80 Merge pull request #10626 from teawater/ma
Add mem-agent to kata
2024-12-24 14:11:36 +08:00
Steve Horsman
2322f6df94 Merge pull request #10686 from stevenhorsman/ppc64le-all-prepare-steps-timeout
workflows: Add more ppc64le timeouts
2024-12-20 19:08:48 +00:00
stevenhorsman
9b6fce9e96 workflows: Add more ppc64le timeouts
Unsurprisingly, now that we've gotten past the containerd test
hangs on ppc64le, we are hitting others in the "Prepare the
self-hosted runner" stage, so add timeouts to all of them
to avoid CI blockages.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 17:31:24 +00:00
Steve Horsman
162e2af4f5 Merge pull request #10685 from stevenhorsman/ppc64le-containerd-test-timeout
workflows: Add timeout to some ppc64le steps
2024-12-20 16:55:40 +00:00
stevenhorsman
d9d8d53bea workflows: Add timeout to some ppc64le steps
In some runs, e.g. https://github.com/kata-containers/kata-containers/actions/runs/12426384186/job/34697095588
and https://github.com/kata-containers/kata-containers/actions/runs/12422958889/job/34697016842,
we've seen the "Prepare the self-hosted runner"
and "Install dependencies" steps get stuck for 5+ hours.
When they are working, these steps take only a few minutes,
so let's add timeouts and not hold up the whole CI when they are stuck.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 16:37:36 +00:00
Steve Horsman
99f239bc44 Merge pull request #10380 from stevenhorsman/required-tests-guidance
doc: Add required jobs info
2024-12-20 16:24:42 +00:00
stevenhorsman
d1d4bc43a4 static-checks: Add words to dictionary
devmapper and snapshotters are being marked as spelling
errors, so add them to the kata dictionary

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 14:16:52 +00:00
stevenhorsman
7612839640 doc: Add required jobs info
Add information about what required jobs are and
our initial guidelines for how jobs are eligible for being
made required, or non-required

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 14:12:13 +00:00
Xuewei Niu
ecf98e4db8 runtime-rs: Remove unneeded mut from new_hypervisor()
`set_hypervisor_config()` and `set_passfd_listener_port()` acquire inner
lock, so that `mut` for `hypervisor` is unneeded.

Fixes: #10682

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-20 17:08:10 +08:00
Steve Horsman
2c6126d3ab Merge pull request #10676 from stevenhorsman/fix-qemu-coco-dev-skip
tests: Fix qemu-coco-dev skip
2024-12-20 08:56:54 +00:00
Xuewei Niu
ea60613be9 Merge pull request #9387 from deagon/fix-broken-usage
packaging: fix the broken usage help
2024-12-20 15:20:37 +08:00
Guoqiang Ding
75baf75726 packaging: fix the broken usage help
Using the plain usage text instead of the bad variable reference.

Fixes: #9386
Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>
2024-12-20 13:58:40 +08:00
stevenhorsman
dd02b6699e tests: Fix qemu-coco-dev skip
Fix the logic so that the test is skipped on qemu-coco-dev
rather than the opposite, and update the syntax to make it
clearer, as it was incorrectly written and reviewed by three
different people in its prior form.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-19 19:50:46 +00:00
Steve Horsman
79495379e2 Merge pull request #10668 from stevenhorsman/update-release-process-post-3.12
doc: Update the release process
2024-12-19 14:16:30 +00:00
Steve Horsman
99b9ef4e5a Merge pull request #10675 from stevenhorsman/release-repeat-abort
release: Abort if release version exists
2024-12-19 11:55:44 +00:00
stevenhorsman
c3f13265e4 doc: Update the release process
Add a step to wait for the payload publish to complete
before running the release action.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-19 09:52:39 +00:00
Zvonko Kaiser
f2d72874a1 Merge pull request #10620 from kata-containers/topic/fix-remove-artifact-ordering
workflows: Remove potential timing issues with artifacts
2024-12-18 13:22:12 -05:00
Zvonko Kaiser
fc2c77f3b6 Merge pull request #10669 from zvonkok/qemu-aarch64-fix
qemu: Fix aarch64 build
2024-12-18 08:26:55 -05:00
stevenhorsman
e2669d4acc release: Abort if release version exists
To make sure we don't accidentally overwrite release artifacts,
we should add a check for whether the release name already exists
and bail out if it does.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-18 11:04:19 +00:00
Zvonko Kaiser
07d2b00863 qemu: Fix aarch64 build
Building static binaries for aarch64 requires disabling PIE:
otherwise we get a GOT overflow, since the OS libraries are built only
with -fpic and not with -fPIC, which would enable unlimited-size GOT tables.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-18 03:26:14 +00:00
Zvonko Kaiser
39bf10875b Merge pull request #10663 from zvonkok/3.12.0-relase
release: Bump version to 3.12.0
2024-12-17 10:00:42 -05:00
Zvonko Kaiser
28b57627bd release: Bump version to 3.12.0
Bump VERSION and helm-chart versions

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-16 18:41:51 +00:00
Xuewei Niu
02b5fa15ac Merge pull request #10655 from liubogithub/patch-1
kata-ctl: fix outdated comments
2024-12-16 13:11:25 +08:00
Hyounggyu Choi
cfbc425041 Merge pull request #10660 from BbolroC/fix-leading-zero-issue-for-vfio-ap
vfio-ap: Assign default string "0" for empty APID and APQI
2024-12-13 17:40:29 +01:00
Hyounggyu Choi
341e5ca58e vfio-ap: Assign default string "0" for empty APID and APQI
The current script logic assigns an empty string to APID and APQI
when an APQN consists entirely of zeros (e.g., "00.0000").
However, this behavior is incorrect, as "00" and "0000" are valid
values and should be represented as "0".
This commit ensures that the script assigns the default string "0"
to APID and APQI if their computed values are empty.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-12-13 14:39:03 +01:00
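In shell form, the fix above is a fallback applied after zero-stripping. This is a paraphrase of the script logic, not the script itself:

```bash
# Strip leading zeros from one half of an APQN such as "00.0000".
strip_leading_zeros() {
    printf '%s\n' "$1" | sed 's/^0*//'
}

parse_apqn() {
    local apqn="$1" apid apqi
    apid=$(strip_leading_zeros "${apqn%%.*}")
    apqi=$(strip_leading_zeros "${apqn##*.}")
    # The fix: all-zero fields that stripped down to "" become "0".
    printf '%s %s\n' "${apid:-0}" "${apqi:-0}"
}
```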
Liu Bo
95fc585103 kata-ctl: fix outdated comments
MgmnClient can also tolerate short sandbox id.

Signed-off-by: Liu Bo <liub.liubo@gmail.com>
2024-12-12 21:59:54 -08:00
stevenhorsman
cf8b82794a workflows: Only remove artifacts in release builds
The agent-api tests require the agent to be deployed in the CI from
the tarball, so in the short term let's only remove the artifacts at the
release stage; that way kata-manager works with the release and the
agent-api tests work with the other CI builds.

In the longer term we need to re-evaluate what is in our tarballs
(issue #10619), but we want to unblock the tests in the short term.

Fixes: #10630
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-12 17:38:27 +00:00
stevenhorsman
e1f6aca9de workflows: Remove potential timing issues with artifacts
With the code I originally wrote, I think there is potentially
a case where we can get a failure due to the timing of steps.
Before this change, the `build-asset-shim-v2`
job could start the `get-artifacts` step while, concurrently,
`remove-rootfs-binary-artifacts` ran and deleted the artifact
during the download, resulting in the error. In this commit, I
try to resolve this by making sure that the shim build waits
for the artifact deletes to complete before starting.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-12 16:52:54 +00:00
Fabiano Fidêncio
7b0c1d0a8c Merge pull request #10492 from zvonkok/upgrade-qemu-9.1.0
qemu: Upgrade qemu 9.1.2
2024-12-12 08:15:39 +01:00
Fupan Li
07fe7325c2 Merge pull request #10643 from justxuewei/fix-bind-vol
runtime-rs & agent: Fix the issues with bind volumes
2024-12-12 11:34:52 +08:00
Fupan Li
372346baed Merge pull request #10641 from justxuewei/fix-build-type
runtime-rs: Ignore BUILD_TYPE if it is not release
2024-12-12 11:32:49 +08:00
Xuewei Niu
5f1b1d8932 Merge pull request #10638 from justxuewei/fix-stderr-fifo
runtime-rs: Fix the issues with stderr fifo
2024-12-12 10:03:46 +08:00
Fabiano Fidêncio
a5c863a907 Merge pull request #10581 from ryansavino/snp-enable-skipped
Revert "ci: Skip the failing tests in SNP"
2024-12-11 18:22:17 +01:00
Zvonko Kaiser
cc9ecedaea qemu: Bump version, new options, add no_patches
We want to have the latest QEMU version available,
which as of this writing is v9.1.2.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

qemu: Add new options for 9.1.2

We need to fence specific options depending on the version
and disable ones that are not needed anymore

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

qemu: Add no_patches.txt

Since we do not have any patches for this version
let's create the appropriate files.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:32:39 +00:00
Zvonko Kaiser
69ed4bc3b7 qemu: Add dependency
The new QEMU build needs python-tomli; now that we bumped Ubuntu
we can include the needed tomli package.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:32:20 +00:00
Zvonko Kaiser
c82db45eaa qemu: Disable pmem
We're disabling pmem support; it is heavily broken with
Ubuntu's static build of QEMU and not needed.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:32:19 +00:00
Zvonko Kaiser
a88174e977 qemu: Replace from source build with package
In jammy we have the liburing package available, hence
remove the source build and include the package.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
c15f77737a qemu: Bump Ubuntu version in Dockerfile
We need jammy for a new package that is not available in focal

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
eef2795226 qemu: Use proper QEMU builder
Do not use hardcoded abs path. Use the deduced rel path.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
e604e51b3d qemu: Build as user
We moved all other artifacts to be built as a user;
QEMU should not be the exception.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Zvonko Kaiser
1d56fd0308 qemu: Remove abs path
We want to stick with the other build scripts and
only use relative paths.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-11 16:22:54 +00:00
Ryan Savino
7d45382f54 Revert "ci: Skip the failing tests in SNP"
This reverts commit 2242aee099.
2024-12-10 16:20:31 -06:00
Xuewei Niu
3fb91dd631 agent: Fix the issues with bind volumes
The mount type should be considered as empty if the value is
`Some("none")`.

Fixes: #10642

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-11 00:51:32 +08:00
Xuewei Niu
59ed19e8b2 runtime-rs: Fix the issues with bind volumes
This patch fixes the logic for determining the type of a volume: when the
type of the OCI mount is Some("none") and the options include "bind" or
"rbind", the type is considered to be "bind".

Fixes: #10642

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-11 00:50:36 +08:00
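The corrected decision can be paraphrased in shell (the real change is in the Rust runtime; the function name is illustrative):

```bash
# An OCI mount whose type is "none" (or empty) but whose options
# include "bind" or "rbind" is really a bind mount.
resolve_mount_type() {
    local mount_type="$1" options="$2"
    if [ "$mount_type" = "none" ] || [ -z "$mount_type" ]; then
        case ",$options," in
        *,bind,*|*,rbind,*) mount_type="bind" ;;
        esac
    fi
    printf '%s\n' "$mount_type"
}
```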
Xuewei Niu
2424c1a562 runtime-rs: Ignore BUILD_TYPE if it is not release
This patch fixes that by adding `--release` only if `BUILD_TYPE=release`.

Fixes: #10640

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-11 00:27:28 +08:00
Xuewei Niu
b4695f6303 runtime-rs: Fix the issues with stderr fifo
When tty is enabled, stderr fifo should never be opened.

Fixes: #10637

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-12-10 21:48:52 +08:00
Aurélien Bombo
037281d699 Merge pull request #10593 from microsoft/saulparedes/improve_namespace_validation
policy: improve pod namespace validation
2024-12-09 11:55:09 -06:00
Steve Horsman
9b7fb31ce6 Merge pull request #10631 from stevenhorsman/action-lint-workflow
Action lint workflow
2024-12-09 09:33:07 +00:00
Fabiano Fidêncio
bec1de7bd7 Merge pull request #10548 from Sumynwa/sumsharma/clh_tweak_vm_configs
runtime: Set memory config shared=false when shared_fs=None in CLH.
2024-12-06 23:15:29 +01:00
Sumedh Alok Sharma
ac4f986e3e runtime: Set memory config shared=false when shared_fs=None in CLH.
This commit sets the memory config `shared` to false in Cloud Hypervisor
when creating a VM with shared_fs=None && hugePages=false.

Currently in runtime/virtcontainers/clh.go, the memory config `shared` is set to true by default.
As per the CLH memory documentation:
(a) shared=true is needed in cases like virtio_fs, since the virtiofs daemon runs as a separate process from CLH.
(b) for shared_fs=none + hugepages=false, shared=false can be set to use private anonymous memory for the guest (with no file backing).
(c) another memory config, thp (use transparent huge pages), is always enabled by default.
As per the documentation, (b) + (c) can be used in combination.
However, with the current CLH implementation, this combination cannot be used since shared=true is always set.

Fixes #10547

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-12-06 21:22:51 +05:30
stevenhorsman
b4b3471bcb workflows: linting: Fix shellcheck SC1001
> This \/ will be a regular '/' in this context

Remove ignored escape

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
491210ed22 workflows: linting: Fix shellcheck SC2006
> Use $(...) notation instead of legacy backticks `...`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
5d7c5bdfa4 workflows: linting: Fix shellcheck SC2015
> A && B || C is not if-then-else. C may run when A is true

Refactor the echo so that we can't get into a situation where
the retry of workspace delete happens if the original one was
successful

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
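SC2015 warns that `A && B || C` is not if-then-else: if `B` fails, `C` runs even though `A` succeeded. A sketch of the shape of this fix, with illustrative names rather than the actual workflow commands:

```bash
# Buggy shape flagged by SC2015: if the echo ever failed, the retry
# would run even after a successful delete.
#   delete_workspace && echo "workspace deleted" || delete_workspace

# Fixed shape: an explicit if-then-else, so the retry only happens
# when the first attempt actually failed.
cleanup_workspace() {
    if delete_workspace; then
        echo "workspace deleted"
    else
        echo "first delete failed, retrying" >&2
        delete_workspace
    fi
}
```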
stevenhorsman
c2ba15c111 workflows: linting: Fix shellcheck SC2206
>  Quote to prevent word splitting/globbing

Double quote variables expanded in an array

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
007514154c workflows: linting: Fix shellcheck SC2068
> Double quote array expansions to avoid re-splitting elements

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
4ef05c6176 workflows: linting: Fix shellcheck SC2116
> Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo'

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
f02d540799 workflows: Bump outdated action versions
Bump some actions that are significantly out-of-date
and out of sync with the versions used in other workflows

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
935327b5aa workflows: linting: Fix shellcheck SC2046
> Quote this to prevent word splitting.

Quote around subshell

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
e93ed6c20e workflows: linting: Add tdx labels
The tdx runners got split into two different
runners, so we need to update the known self-hosted
runner labels

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
d4bd314d52 workflows: linting: Fix incorrect properties
These properties are currently invalid, so either fix
or remove them.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
9113606d45 workflows: linting: Fix shellcheck SC2086
> Double quote to prevent globbing and word splitting.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
42cd2ce6e4 workflows: Add actionlint workflows
On PRs that update anything in the workflows directory,
add an actionlint run to validate our workflow files for errors
and hopefully catch issues earlier.

Fixes: #9646

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 11:36:08 +00:00
Fabiano Fidêncio
a93ff57c7d Merge pull request #10627 from kata-containers/topic/release-helm-charm-tarball
release: helm: Add the chart as part of the release
2024-12-06 11:22:43 +01:00
Fabiano Fidêncio
300a827d03 release: helm: Add the chart as part of the release
So users can simply download the chart and use it accordingly without
the need to download the full repo.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-06 11:19:34 +01:00
Fabiano Fidêncio
652662ae09 Merge pull request #10551 from fidencio/topic/kata-deploy-allow-multi-deployment
kata-deploy: Add support to multi-installation
2024-12-06 11:16:20 +01:00
Hui Zhu
d3a6bcdaa5 runtime-rs: configuration-dragonball.toml.in: Add config for mem-agent
Add config for mem-agent to configuration-dragonball.toml.in.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:28 +08:00
Hui Zhu
2b6caf26e0 agent-ctl: Add mem-agent API support
Add the sub-commands MemAgentMemcgSet and MemAgentCompactSet to agent-ctl to
configure the mem-agent inside the running kata-containers.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:24 +08:00
Hui Zhu
cb86d700a6 config: Add config of mem-agent
Add config of mem-agent, used to configure the mem-agent.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:20 +08:00
Hui Zhu
692ded8f96 agent: add support for MemAgentMemcgSet and MemAgentCompactSet
Add MemAgentMemcgSet and MemAgentCompactSet to agent API to set the config of
mem-agent memcg and compact.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:16 +08:00
Hui Zhu
f84ad54d97 agent: Start mem-agent in start_sandbox
mem-agent will run with kata-agent.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:13 +08:00
Hui Zhu
74a17f96f4 protocols/protos/agent.proto: Add mem-agent support
Add MemAgentMemcgConfig and MemAgentCompactConfig to AgentService.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:09 +08:00
Hui Zhu
ffc8390a60 agent: Add mem-agent to Cargo.toml
Add mem-agent to Cargo.toml of agent.
mem-agent will be integrated into kata-agent.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:05 +08:00
Hui Zhu
4407f6e098 mem-agent: Add to src
mem-agent is a component designed for managing memory in Linux
environments.
Sub-feature memcg: Utilizes the MgLRU feature to monitor each cgroup's
memory usage and periodically reclaim cold memory.
Sub-feature compact: Periodically compacts memory to facilitate the
kernel's free page reporting feature, enabling the release of more idle
memory from guests.
During memory reclamation and compaction, mem-agent monitors system
pressure using Pressure Stall Information (PSI). If the system pressure
becomes too high, memory reclamation or compaction will automatically
stop.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 10:00:02 +08:00
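The PSI back-off described in this commit can be sketched in shell (hedged: mem-agent itself is written in Rust; the sample line, the `avg10` field choice, and the threshold value below are illustrative, not the real implementation):

```shell
# Illustrative only: parse the "some" avg10 value from a
# /proc/pressure/memory style line and decide whether memory
# reclaim/compaction should pause under pressure.
psi_line='some avg10=12.50 avg60=3.20 avg300=0.80 total=123456'
avg10=$(printf '%s' "$psi_line" | sed -n 's/.*avg10=\([0-9.]*\).*/\1/p')
threshold=10   # hypothetical pressure threshold
should_stop=$(awk -v a="$avg10" -v t="$threshold" \
  'BEGIN { if (a > t) print "yes"; else print "no" }')
echo "$should_stop"   # yes
```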
Hui Zhu
f9c63d20a4 kernel/configs: Add mglru, debugfs and psi to dragonball-experimental
Add mglru, debugfs and psi to dragonball-experimental/mem_agent.conf to
support mem_agent function.

Fixes: #10625

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-12-06 09:59:59 +08:00
Fabiano Fidêncio
111082db07 kata-deploy: Add support to multi-installation
This is super useful for development / debugging scenarios, mainly when
dealing with limited hardware availability, as this change allows
multiple people to develop on a single machine, while still using
kata-deploy.

Fixes: #10546

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-05 17:42:53 +01:00
Fabiano Fidêncio
0033a0c23a kata-deploy: Adjust paths for qemu-coco-dev as well
I missed that when working on the INSTALL_PREFIX feature, so adding it
now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-05 17:42:53 +01:00
Fabiano Fidêncio
62b3a07e2f kata-deploy: helm: Add overlooked INSTALLATION_PREFIX env var
At the same time that INSTALLATION_PREFIX was added, I was working on
the helm changes to properly do the cleanup / deletion when it's
removed.  However, I missed adding the INSTALLATION_PREFIX env var
there, which I'm doing now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-05 17:42:53 +01:00
Steve Horsman
5d96734831 Merge pull request #10572 from ldoktor/gk-stalled-results
ci.gatekeeper: Update existing results
2024-12-04 19:02:14 +00:00
Wainer Moschetta
a94982d8b8 Merge pull request #10617 from stevenhorsman/skip-k8s-job-test-on-non-tee
tests: Skip k8s job test on qemu-coco-dev
2024-12-04 15:47:33 -03:00
Saul Paredes
84a411dac4 policy: improve pod namespace validation
- Remove default_namespace from settings
- Ensure container namespaces in a pod match each other in case no namespace is specified in the YAML

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-04 10:17:54 -08:00
Steve Horsman
c86f76d324 Merge pull request #10588 from stevenhorsman/metrics-clh-min-range-relaxation
metrics: Increase minval range for failing tests
2024-12-04 16:10:26 +00:00
stevenhorsman
a8ccd9a2ac tests: Skip k8s job test on qemu-coco-dev
The test is unstable on this platform, so skip it for now to prevent
the regular known failures from covering up other issues. See #10616

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-04 16:00:05 +00:00
Steve Horsman
9e609dd34f Merge pull request #10615 from kata-containers/topic/update-remove-artifact-filter
workflows: Fix remove artifact name filter
2024-12-04 15:02:35 +00:00
Fabiano Fidêncio
531a29137e Merge pull request #10607 from microsoft/danmihai1/less-logging
runtime: skip logging some of the dial errors
2024-12-04 15:01:45 +01:00
stevenhorsman
14a3adf4d6 workflows: Fix remove artifact name filter
- Fix copy-paste errors in artifact filters for arm64 and ppc64le
- Remove the trailing wildcard filter that falsely ends up removing agent-ctl
and replace it with the tarball-suffix, which should exactly match the artifacts

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-04 13:34:42 +00:00
Alex Lyn
5f9cc86b5a Merge pull request #10604 from 3u13r/euler/fix/genpolicy-rego-state-getter
genpolicy: align state path getter and setter
2024-12-04 13:57:34 +08:00
Alex Lyn
c7064027f4 Merge pull request #10574 from BbolroC/add-ccw-subchannel-qemu-runtime-rs
Add subchannel support to qemu-runtime-rs for s390x
2024-12-04 09:17:45 +08:00
Aurélien Bombo
57d893b5dc Merge pull request #10563 from sprt/csi-deploy
coco: ci: Fully implement compilation of CSI driver and require it for CoCo tests [2/x]
2024-12-03 18:58:14 -06:00
Aurélien Bombo
4aa7d4e358 ci: Require CSI driver for CoCo tests
With the building/publishing step for the CSI driver validated, we can
set that as a requirement for the CoCo tests.

Depends on: #10561

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
fe55b29ef0 csi-kata-directvolume: Remove go version check
The driver build recipe has a script to check the current Go version against
the go.mod version.  However, the script is broken ($expected is unbound) and I
don't believe we do this for other components. On top of this, Go should be
backward-compatible. Let's keep things simple for now and we can evaluate
restoring this script in the future if need be.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
fb87bf221f ci: Implement build step for CSI driver
This fully implements the compilation step for csi-kata-directvolume.
This component can now be built by the CI running:

 $ cd tools/packaging/kata-deploy/local-build
 $ make csi-kata-directvolume-tarball

A couple notes:

 * When installing the binary, we rename it from directvolplugin to
   csi-kata-directvolume on the fly to make it more readable.
 * We add go to the tools builder Dockerfile to support building this
   tool.
 * I've noticed the file install_libseccomp.sh gets created by the build
   process so I've added it to a .gitignore.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
0f6113a743 Merge pull request #10612 from kata-containers/sprt/fix-csi-publish2
ci: Fix Docker publishing for CSI driver, 2nd try
2024-12-03 14:43:28 -06:00
Aurélien Bombo
a23ceac913 ci: Fix Docker publishing for CSI driver, 2nd try
Follow-up to #10609 as it seems GHA doesn't allow hard links:

https://github.com/kata-containers/kata-containers/actions/runs/12144941404/job/33868901896?pr=10563#step:6:8

Note that I also updated the `needs` directive as we don't need the Kata
payload container, just the tarball artifact.

Part of: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 13:04:46 -06:00
Dan Mihai
2a67038836 Merge pull request #10608 from microsoft/saulparedes/policy_metadatata_uid
policy: ignore optional metadata uid field
2024-12-03 10:19:12 -08:00
Dan Mihai
25e6f4b2a5 Merge pull request #10592 from microsoft/saulparedes/add_constants_to_rules
policy: add constants to rules.rego
2024-12-03 10:17:10 -08:00
Aurélien Bombo
5e1fc5a63f Merge pull request #10609 from kata-containers/sprt/fix-publish-csi
ci: Fix Docker publishing for CSI driver
2024-12-03 11:21:55 -06:00
Hyounggyu Choi
8b998e5f0c runtime-rs: Introduce get_devno_ccw() for deduplication
The devno assignment logic is repeated in 5 different places
during device addition.
To improve code maintainability and readability, this commit
introduces a standalone function, `get_devno_ccw()`,
to handle the deduplication.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-12-03 15:35:03 +01:00
Leonard Cohnen
9b614a4615 genpolicy: align state path getter and setter
Before this patch, there was a mismatch between the JSON path under which
the state of the rule evaluation is set and the JSON path under which
it is retrieved.

This resulted in the behavior that each time the policy was evaluated,
it thought it was the _first_ time the policy was evaluated.
This also means that the consistency check for the `sandbox_name`
was ineffective.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-12-03 13:25:24 +01:00
Aurélien Bombo
85d3bcd713 ci: Fix Docker publishing for CSI driver
The compilation succeeds; however, Docker can't find the binary because
we specify an absolute path. In the Docker world, such a path is
resolved relative to the Docker build context (here:
src/tools/csi-kata-directvolume).

To fix this, we link the binary into the build context, where the
Dockerfile expects it.

Failure mode:
https://github.com/kata-containers/kata-containers/actions/runs/12068202642/job/33693101962?pr=10563#step:8:213

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-02 15:50:01 -06:00
Saul Paredes
711d12e5db policy: support optional metadata uid field
This prevents a deserialization error when uid is specified

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-02 11:24:58 -08:00
Dan Mihai
efd492d562 runtime: skip logging some of the dial errors
With full debug logging enabled there might be around 1,500 redials,
so log just ~15 of them to avoid flooding the log.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-12-02 19:11:32 +00:00
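The sampling arithmetic behind this commit can be illustrated in shell (hedged: the runtime code is Go; the modulus of 100 is an illustrative choice, not necessarily the one used): logging only every 100th redial turns ~1,500 attempts into ~15 log lines.

```shell
logged=0
for attempt in $(seq 1 1500); do
  # Log only every 100th redial attempt (1, 101, 201, ...).
  if [ $((attempt % 100)) -eq 1 ]; then
    logged=$((logged + 1))
  fi
done
echo "$logged"   # 15
```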
Hyounggyu Choi
9c19d7674a Merge pull request #10590 from zvonkok/fix-ci
ci: Fix variant for confidential targets
2024-12-02 18:39:52 +01:00
Saul Paredes
9105c1fa0c policy: add constants to rules.rego
Reuse constants where applicable

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-12-02 08:28:58 -08:00
Hyounggyu Choi
6f4f94a9f0 Merge pull request #10595 from BbolroC/add-zvsi-devmapper-to-gatekeeper-required-jobs
gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs
2024-12-02 15:28:14 +01:00
Zvonko Kaiser
20442c0eae ci: Fix variant for confidential targets
The default initrd confidential target will have a
variant=confidential; we need to accommodate this
and make sure we also accommodate aaa-xxx-confidential targets.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-12-02 14:21:03 +00:00
stevenhorsman
b87b4b6756 metrics: Increase ranges for failing qemu tests
We've also seen the qemu metrics tests failing due to the results
being slightly outside the max range for the network-iperf3 parallel test and
below the minimum for the network-iperf3 jitter test, on PRs that have no code
changes, so we've increased the bounds to avoid false negatives.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-29 10:52:16 +00:00
stevenhorsman
4011071526 metrics: Increase minval range for failing tests
We've seen a couple of instances recently where the metrics
tests are failing due to the results being below the minimum
value by ~2%.
For tests like latency I'm not sure why values being too low would
be an issue, but I've updated the minpercent range of the failing tests
to try and get them passing.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-29 10:50:02 +00:00
Hyounggyu Choi
de3452f8e1 gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs
As the following CI job has been marked as required:

- kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (devmapper, qemu, kubeadm)

we need to add it to the gatekeeper's required job list.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-28 12:46:47 +01:00
Fabiano Fidêncio
bdf10e651a Merge pull request #10597 from kata-containers/topic/unbreak-ci-3rd-time-s-a-charm
Unbreak the CI, 3rd attempt
2024-11-28 12:36:09 +01:00
Fabiano Fidêncio
92b8091f62 Revert "ci: unbreak: Reallow no-op builds"
This reverts commit 559018554b.

As we've noticed that this is causing issues with initrd builds in the
CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-28 12:02:40 +01:00
Fabiano Fidêncio
ca2098f828 build: Allow dummy builds (for when adding a new target)
This will help us to simply allow a new dummy build whenever a new
component is added.

As long as the format `$(call DUMMY,$@)` is followed, we should be good
to go without taking the risk of breaking the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-28 11:13:24 +01:00
Fabiano Fidêncio
f9930971a2 Merge pull request #10594 from sprt/sprt/unbreak-ci-noop-build
ci: unbreak: Reallow no-op builds
2024-11-28 07:38:25 +01:00
Aurélien Bombo
559018554b ci: unbreak: Reallow no-op builds
#9838 previously modified the static build so as not to repeatedly
copy the same assets on each matrix iteration:

https://github.com/kata-containers/kata-containers/pull/9838#issuecomment-2169299202

However, that implementation breaks specifying no-op/WIP build targets
such as done in e43c59a. Such no-op builds have been a historical
requirement of the project because of a GHA limitation. The breakage is due to
no-op builds not generating a tar file corresponding to the asset:

https://github.com/kata-containers/kata-containers/actions/runs/12059743390/job/33628926474?pr=10592

To address this breakage, we revert to the `cp -r` implementation and
add the `--no-clobber` flag to still preserve the current behavior. Note
that `-r` will also create the destination directory if it doesn't
exist.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-27 18:40:29 -06:00
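The `cp -r --no-clobber` behavior this commit relies on can be sketched as follows (paths and file names are illustrative; this is not the actual build script):

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
echo "new asset" > "$src/asset.tar"
echo "other" > "$src/other.txt"
mkdir -p "$dst/build"
echo "existing asset" > "$dst/build/asset.tar"
# -r copies recursively (creating the destination dir if missing);
# --no-clobber leaves already-present files untouched, so repeated
# matrix iterations do not overwrite earlier assets.
# Note: newer GNU coreutils may return nonzero when -n skips a file.
cp -r --no-clobber "$src/." "$dst/build/" || true
cat "$dst/build/asset.tar"   # still "existing asset"
```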
Fabiano Fidêncio
9699c7ed06 Merge pull request #10589 from kata-containers/sprt/fix-csi-publish
gha: Unbreak CI and work around workflow limit
2024-11-27 23:52:55 +01:00
Aurélien Bombo
eac197d3b7 Merge pull request #10564 from microsoft/danmihai1/clh-endpoint-type
runtime: clh: addNet() logging clean-up
2024-11-27 14:44:14 -06:00
Aurélien Bombo
7f659f3d63 gha: Unbreak CI and work around workflow limit
#10561 inadvertently broke the CI by going over the limit of
20 reusable workflows:

https://github.com/kata-containers/kata-containers/actions/runs/12054648658/workflow

This commit fixes that by inlining the job.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-27 12:23:15 -06:00
Aurélien Bombo
16a91fccbe Merge pull request #10561 from sprt/csi-driver-ci
coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]
2024-11-27 10:26:45 -06:00
Fabiano Fidêncio
175fe8bc66 Merge pull request #10585 from fidencio/topic/kata-deploy-use-drop-in-containerd-config-whenever-it-is-possible
kata-deploy: Use drop-in files whenever it's possible
2024-11-27 16:36:18 +01:00
Steve Horsman
6bb00d9a1d Merge pull request #10583 from squarti/agent-startup-cdh-client
agent: fix startup when guest_components_procs is set to none
2024-11-27 11:43:07 +00:00
Fabiano Fidêncio
500508a592 kata-deploy: Use drop-in files whenever it's possible
This will make our lives considerably easier when it comes to cleaning
up the added content, and it also lays the groundwork needed for having
multiple installations running in parallel.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-27 12:27:08 +01:00
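As an illustration, a containerd drop-in file for a Kata runtime class might look like this (the path, runtime name, and values are hypothetical examples, not the exact files kata-deploy writes):

```toml
# Hypothetical example: /etc/containerd/config.d/kata-deploy.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-qemu]
  runtime_type = "io.containerd.kata-qemu.v2"
  privileged_without_host_devices = true
```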
Steve Horsman
3240f8a4b8 Merge pull request #10586 from stevenhorsman/delete-rootfs-binary-assets-after-rootfs-build
workflows: Remove rootfs binary artifacts
2024-11-27 10:03:20 +00:00
Fabiano Fidêncio
c472fe1924 Merge pull request #10584 from fidencio/topic/kata-deploy-prepare-for-containerd-config-version-3
kata-deploy: Support containerd configuration version 3
2024-11-26 18:44:56 +01:00
stevenhorsman
3e5d360185 workflows: Remove rootfs binary artifacts
We need to publish certain artifacts for the rootfs,
like the agent, guest-components, pause bundle etc.,
as they are consumed in the `build-asset-rootfs` step.
However, after this point they aren't needed and probably
shouldn't be included in the overall kata tarball, so delete
them once they are no longer needed to avoid them
being included.

Fixes: #10575
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-26 15:24:20 +00:00
Fabiano Fidêncio
6f70ab9169 kata-deploy: Adapt how the containerd version is checked for k0s
Let's actually mount the whole /etc/k0s as /etc/containerd, so we can
easily access the containerd configuration file which has the version in
it, allowing us to parse it instead of just making a guess based on
the Kubernetes distro being used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-26 16:15:11 +01:00
Silenio Quarti
1230bc77f2 agent: fix startup when guest_components_procs is set to none
This PR ensures that the OCICRYPT_CONFIG_PATH file is initialized only
when the CDH socket exists. This prevents a startup error if the
attestation binaries are not installed in the PodVM.

Fixes: https://github.com/kata-containers/kata-containers/issues/10568

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-11-26 09:57:04 -05:00
Fabiano Fidêncio
f5a9aaa100 kata-deploy: Support containerd config version 3
On Ubuntu 24.04, with the distro default containerd, we're already
getting:
```
$ containerd config default | grep "version = "
version = 3
```

With that in mind, let's make sure that we're ready to support this from
the next release.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-26 14:01:50 +01:00
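For context (hedged: the exact section names should be checked against the containerd 2.x documentation), config version 3 also renames the CRI plugin sections, so a runtime entry moves roughly like this:

```toml
# version = 2 (illustrative)
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
  runtime_type = "io.containerd.kata.v2"

# version = 3 (illustrative)
[plugins."io.containerd.cri.v1.runtime".containerd.runtimes.kata]
  runtime_type = "io.containerd.kata.v2"
```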
Fupan Li
28166c8a32 Merge pull request #10577 from Apokleos/fix-vfiodev-name
runtime-rs: fix vfio device name combination issue
2024-11-26 09:35:45 +08:00
Dan Mihai
d93900c128 Merge pull request #10543 from microsoft/danmihai1/regorus-warning
genpolicy: avoid regorus warning
2024-11-25 16:47:33 -08:00
Zvonko Kaiser
1b10e82559 Merge pull request #10516 from zvonkok/kata-agent-cdi
ci: Fix error on self-hosted machines
2024-11-25 18:49:37 -05:00
Ryan Savino
e46d24184a Merge pull request #10386 from kimullaa/fix-build-error-when-using-sev-snp
docs: Fix several build failures  when I tried the procedures in "Kata Containers with AMD SEV-SNP VMs"
2024-11-25 16:58:52 -06:00
Dan Mihai
f340b31c41 genpolicy: avoid regorus warning
Avoid adding to the Guest console warnings about "agent_policy:10:8".

"import input" is unnecessary.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-11-25 21:19:01 +00:00
Zvonko Kaiser
c3d1b3c5e3 Merge pull request #10464 from zvonkok/nvidia-gpu-rootfs
gpu: NVIDIA GPU initrd/image build
2024-11-25 16:16:42 -05:00
Fabiano Fidêncio
8763a9bc90 Merge pull request #10520 from fidencio/topic/drop-clear-linux-rootfs
osbuilder: Drop Clear Linux
2024-11-25 21:16:03 +01:00
Dan Mihai
78cbf33f1d runtime: clh: addNet() logging clean-up
Avoid logging the same endpoint fields twice from addNet().

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-11-25 19:58:54 +00:00
alex.lyn
5dba680afb runtime-rs: fix vfio device name combination issue
Fixes #10576

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-11-25 14:01:43 +08:00
Hyounggyu Choi
48e2df53f7 runtime-rs: Add devno to DeviceVirtioScsi
A new attribute named `devno` is added to DeviceVirtioScsi.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
2cc48f7822 runtime-rs: Add devno to DeviceVhostUserFs
A new attribute named `devno` is added to DeviceVhostUserFs.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
920484918c runtime-rs: Add devno to VhostVsock
A new attribute named `devno` is added to VhostVsock.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
9486790089 runtime-rs: Add devno to DeviceVirtioSerial
A new attribute named `devno` is added to DeviceVirtioSerial.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
516daecc50 runtime-rs: Add devno to DeviceVirtioBlk
A new attribute named `devno` is added to DeviceVirtioBlk.
It will be used to specify a device number for a CCW bus type.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Hyounggyu Choi
30a64092a7 runtime-rs: Add CcwSubChannel to provide devno for CCW devices
To explicitly specify a device number on the QEMU command line
for the following devices using the CCW transport on s390x:

- SerialDevice
- BlockDevice
- VhostUserDevice
- SCSIController
- VSOCKDevice

this commit introduces a new structure CcwSubChannel and implements
the following methods:

- add_device()
- remove_device()
- address_format_ccw()
- set_addr()

You can see the detailed explanation for each method in the comment.

This resolves the 1st part of #10573.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-11-23 13:45:36 +01:00
Steve Horsman
322073bea1 Merge pull request #10447 from ldoktor/required-jobs
ci: Required jobs
2024-11-22 09:15:11 +00:00
Lukáš Doktor
e69635b376 ci.gatekeeper: Remove unused variable
this is a left-over from the previous way of iterating over jobs.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-22 09:27:11 +01:00
Lukáš Doktor
fa7bca4179 ci.gatekeeper: Print the older job id
let's also print the existing result's id when printing the
information about ignoring an older result id, to simplify debugging.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-22 09:27:11 +01:00
Lukáš Doktor
6c19a067a0 ci.gatekeeper: Update existing results
the matching run_id means we're dealing with the same job but with
updated results and not with an older job. Update the results in that
case.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-22 09:27:09 +01:00
Aurélien Bombo
5e4990bcf5 coco: ci: Add no-op steps to deploy CSI driver
This adds no-op steps that'll be used to deploy and clean up the CSI driver
used for testing.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:08:06 -06:00
Aurélien Bombo
893f6a4ca0 ci: Introduce job to publish CSI driver image
This adds a new job to build and publish the CSI driver Docker image.

Of course this job will fail after we merge this PR because the CSI driver
compilation job hasn't been implemented yet. However that will be implemented
directly after in #10561.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:07:59 -06:00
Aurélien Bombo
e43c59a2c6 ci: Add no-op step to compile CSI driver
This adds a no-op build step to compile the CSI driver. The actual compilation
will be implemented in a later PR, so as to ensure we don't break the CI.

Addresses: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:06:55 -06:00
Zvonko Kaiser
0debf77770 gpu: NVIDIA gpu initrd/image build
With each release, make sure we ship a GPU-enabled rootfs/initrd.

Fixes: #6554

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-21 18:57:23 +00:00
Steve Horsman
b4da4b5e3b Merge pull request #10377 from coolljt0725/fix_build
osbuilder: Fix build dependency of ubuntu rootfs with Docker
2024-11-21 08:45:59 +00:00
Jitang Lei
ed4c727c12 osbuilder: Fix build dependency of ubuntu rootfs with Docker
Building the Ubuntu rootfs with Docker failed with the error:
`Unable to find libclang`

Fix this error by adding libclang-dev to the dependencies.

Signed-off-by: Jitang Lei <leijitang@outlook.com>
2024-11-21 10:49:27 +08:00
Zvonko Kaiser
e9f36f8187 ci: Fixing simple typo
change evn to env

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-20 18:40:14 +00:00
Zvonko Kaiser
a5733877a4 ci: Fix error on self-hosted machines
We need to clean up any created files/dirs, otherwise
we cause problems on self-hosted runners. Use a tempdir, which
will be removed automatically.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-20 18:40:13 +00:00
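The clean-up pattern described here is the standard mktemp/trap idiom; a hedged sketch (not the actual CI script):

```shell
workdir=$(mktemp -d)
# Remove the scratch directory automatically when the script exits,
# so nothing is left behind on self-hosted runners.
trap 'rm -rf "$workdir"' EXIT
echo "scratch" > "$workdir/file"
echo "working in $workdir"
```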
Lukáš Doktor
62e8815a5a ci: Add documentation to cover mapping format
to help people with adding new entries.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-20 17:25:59 +01:00
Lukáš Doktor
64306dc888 ci: Set required-tests according to GH required tests
this should record the current list of required tests from GH.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-20 17:25:57 +01:00
Steve Horsman
358ebf5134 Merge pull request #10558 from AdithyaKrishnan/main
ci: Re-enable SNP CI
2024-11-20 10:27:41 +00:00
Steve Horsman
30bad4ee43 Merge pull request #10562 from stevenhorsman/remove-release-artifactor-skips
workflows: Remove skipping of artifact uploads
2024-11-20 08:45:37 +00:00
Adithya Krishnan Kannan
2242aee099 ci: Skip the failing tests in SNP
Per [Issue#10549](https://github.com/kata-containers/kata-containers/issues/10549),
the following tests are failing on SNP.
1. k8s-guest-pull-image-encrypted.bats
2. k8s-guest-pull-image-authenticated.bats
3. k8s-guest-pull-image-signature.bats
4. k8s-confidential-attestation.bats

Per @fidencio 's comment on
[PR#10558](https://github.com/kata-containers/kata-containers/pull/10558),
I am skipping the same.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-19 10:41:43 -06:00
stevenhorsman
da5f6b77c7 workflows: Remove skipping of artifact uploads
Now that we are downloading artifacts to create the rootfs,
we need to ensure they are always uploaded,
even on releases.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-19 13:28:02 +00:00
Steve Horsman
817438d1f6 Merge pull request #10552 from stevenhorsman/3.11.0-release
release: Bump version to 3.11.0
2024-11-19 09:44:35 +00:00
Saul Paredes
eab48c9884 Merge pull request #10545 from microsoft/cameronbaird/sync-clh-logging
runtime: fix comment to accurately reflect clh behavior
2024-11-18 11:25:58 -08:00
Adithya Krishnan Kannan
ef367d81f2 ci: Re-enable SNP CI
We've debugged the SNP Node and we
wish to test the fixes on GHA.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-18 11:11:27 -06:00
stevenhorsman
7a8ba14959 release: Bump version to 3.11.0
Bump `VERSION` and helm-chart versions

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-18 11:13:15 +00:00
Steve Horsman
0ce3f5fc6f Merge pull request #10514 from squarti/pause_command
agent: overwrite OCI process spec when overwriting pause image
2024-11-15 18:03:58 +00:00
Fabiano Fidêncio
92f7526550 Merge pull request #10542 from Crypt0s/topic/enable-CONFIG_KEYS
kernel: add CONFIG_KEYS=y to enable kernel keyring
2024-11-15 12:15:25 +01:00
Crypt0s
563a6887e2 kernel: add CONFIG_KEYS=y to enable kernel keyring
KinD checks for the presence of this (and other) kernel configuration
via scripts like
https://blog.hypriot.com/post/verify-kernel-container-compatibility/ or
attempts to directly use /proc/sys/kernel/keys/ without checking to see
if it exists, causing an exit when it does not see it.

Docker and its consumers apparently expect to be able to use the kernel
keyring and its associated syscalls from/for containers.

There aren't any known downsides to enabling this except that it would
by definition enable additional syscalls defined in
https://man7.org/linux/man-pages/man7/keyrings.7.html which are
reachable from userspace. This slightly increases the attack surface of
the Kata kernel, but that surface remains small (especially since
the kernel is most likely being executed by some kind of hypervisor) and
highly restricted, which is outweighed by the utility of enabling this
feature for further containerization compatibility.

Signed-off-by: Crypt0s <BryanHalf@gmail.com>
2024-11-15 09:30:06 +01:00
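The kind of probe described above can be sketched as follows (hedged: the config content is a sample string; real compatibility scripts read `/proc/config.gz` or `/boot/config-$(uname -r)`):

```shell
# Sample kernel config content, stood in for the real config file.
kernel_config='CONFIG_KEYS=y
CONFIG_NET=y'
if printf '%s\n' "$kernel_config" | grep -q '^CONFIG_KEYS=y$'; then
  keyring=enabled
else
  keyring=missing
fi
echo "$keyring"   # enabled
```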
Shunsuke Kimura
706e8bce89 docs: change from OVMF.fd to AmdSev.fd
Change the build method to generate OVMF for AmdSev.
This commit adds the `ovmf_build=sev` env parameter.
<638c2c4164>

Fixes #10378

Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>
2024-11-15 11:24:45 +09:00
Shunsuke Kimura
d7f6fabe65 docs: fix build-kernel.sh option
`build-kernel.sh` no longer takes an argument for the -x option.
<6c3338271b>

Fixes #10378

Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>
2024-11-15 11:24:45 +09:00
Cameron Baird
65881ceb8a runtime: fix comment to accurately reflect clh behavior
Fix the CLH log levels description

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2024-11-14 23:16:11 +00:00
Silenio Quarti
42b6203493 agent: overwrite OCI process spec when overwriting pause image
The PR replaces the OCI process spec of the pause container with the spec of
the guest provided pause bundle.

Fixes: https://github.com/kata-containers/kata-containers/issues/10537

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-11-14 13:05:16 -05:00
Fabiano Fidêncio
6a9266124b Merge pull request #10501 from kata-containers/topic/ci-split-tests
ci: tdx: Split jobs to run in 2 different machines
2024-11-14 17:24:50 +01:00
Fabiano Fidêncio
9b3fe0c747 ci: tdx: Adjust workflows to use different machines
This will be helpful in order to increase the OS coverage (we'll be
using both Ubuntu 24.04 and CentOS 9 Stream), while also reducing the
amount of time spent on the tests (as one machine will only run attestation
related tests, and the other the tests that do *not* require
attestation).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-14 15:52:00 +01:00
Fabiano Fidêncio
9b1a5f2ac2 tests: Add a way to run only tests which rely on attestation
We're doing this as, at Intel, we have two different kinds of machines we
can plug into our CI.  Without going much into details, only one of
those two kinds of machines will work for the attestation tests we
perform with ITA, thus in order to speed up the CI and improve test
coverage (OS wise), we're going to run different tests in different
machines.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-14 15:51:57 +01:00
Steve Horsman
915695f5ef Merge pull request #9407 from mrIncompetent/root-fs-clang
rootfs: Install missing clang in Ubuntu docker image
2024-11-14 10:35:06 +00:00
Henrik Schmidt
57a4dbedeb rootfs: Install missing libclang-dev in Ubuntu docker image
Fixes #9444

Signed-off-by: Henrik Schmidt <mrIncompetent@users.noreply.github.com>
2024-11-14 08:48:24 +00:00
Hyounggyu Choi
5869046d04 Merge pull request #9195 from UiPath/fix/vcpus-for-static-mgmt
runtime: Set maxvcpus equal to vcpus for the static resources case
2024-11-14 09:38:20 +01:00
Dan Mihai
d9977b3e75 Merge pull request #10431 from microsoft/saulparedes/add-policy-state
genpolicy: add state to policy
2024-11-13 11:48:46 -08:00
Aurélien Bombo
7bc2fe90f9 Merge pull request #10521 from ncppd/osbuilder-cleanup
osbuilder: remove redundant env variable
2024-11-13 12:17:09 -06:00
Steve Horsman
a947d2bc40 Merge pull request #10539 from AdithyaKrishnan/main
ci: Temporarily skip SNP CI
2024-11-13 17:58:32 +00:00
Adithya Krishnan Kannan
439a1336b5 ci: Temporarily skip SNP CI
As discussed in the CI working group,
we are temporarily skipping the SNP CI
to unblock the remaining workflow.
Will revert after fixing the SNP runner.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-13 11:44:16 -06:00
Fabiano Fidêncio
02d4c3efbf Merge pull request #10519 from fidencio/topic/relax-restriction-for-qemu-tdx
Reapply "runtime: confidential: Do not set the max_vcpu to cpu"
2024-11-13 16:09:06 +01:00
Saul Paredes
c207312260 genpolicy: validate container sandbox names
Make sure all container sandbox names match the sandbox name of the first container.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-12 15:17:01 -08:00
Saul Paredes
52d1aea1f7 genpolicy: Add state
Use regorous engine's add_data method to add state to the policy.
This data can later be accessed inside rego context through the data namespace.

Support state modifications (json-patches) that may be returned as a result from policy evaluation.

Also initialize a policy engine data slice "pstate" dedicated for storing state.

Fixes #10087

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-12 15:16:53 -08:00
Alexandru Matei
e83f8f8a04 runtime: Set maxvcpus equal to vcpus for the static resources case
Fixes: #9194

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-11-12 16:36:42 +02:00
GabyCT
06fe459e52 Merge pull request #10508 from GabyCT/topic/installartsta
gha: Get artifacts when installing kata tools in stability workflow
2024-11-11 15:59:06 -06:00
Nikos Ch. Papadopoulos
ab80cf8f48 osbuilder: remove redundant env variable
Remove the second declaration of GO_HOME in the rootfs-build ubuntu script.

Signed-off-by: Nikos Ch. Papadopoulos <ncpapad@cslab.ece.ntua.gr>
2024-11-11 19:49:28 +02:00
Fabiano Fidêncio
780b36f477 osbuilder: Drop Clear Linux
The Clear Linux rootfs is not being tested anywhere, and it seems Intel
doesn't have the capacity to review the PRs related to this (combined
with the lack of interest from the rest of the community in reviewing
PRs that are specific to this untested rootfs).

With this in mind, I'm suggesting we drop Clear Linux support and focus
on what we can actually maintain.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-11 15:22:55 +01:00
Fabiano Fidêncio
5618180e63 Merge pull request #10515 from kata-containers/sprt/ubuntu-latest-fix
gha: Hardcode ubuntu-22.04 instead of latest
2024-11-10 09:54:39 +01:00
Fabiano Fidêncio
2281342fb8 Merge pull request #10513 from fidencio/topic/ci-adjust-proxy-nightmare-for-tdx
ci: tdx: kbs: Ensure https_proxy is taken in consideration
2024-11-10 00:17:10 +01:00
Fabiano Fidêncio
0d8c4ce251 Merge pull request #10517 from microsoft/saulparedes/remove_manifest_v1_test
tests: remove manifest v1 test
2024-11-09 23:40:51 +01:00
Fabiano Fidêncio
56812c852f Reapply "runtime: confidential: Do not set the max_vcpu to cpu"
This reverts commit f15e16b692, as we
don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-09 23:20:17 +01:00
Saul Paredes
461efc0dd5 tests: remove manifest v1 test
This test was meant to show support for pulling images with v1 manifest schema versions.

The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it:

$ docker pull ymqytw/nginxhttps:1.5
Error response from daemon: missing signature key

We may remove this test since schema version 1 manifests are deprecated per
https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 :
"These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more
current images". This schema version was used by old docker versions. Further, the OCI spec
https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-11-08 13:38:51 -08:00
Aurélien Bombo
19e972151f gha: Hardcode ubuntu-22.04 instead of latest
GHA is migrating ubuntu-latest to Ubuntu 24 so
let's hardcode the current 22.04 LTS.

https://github.blog/changelog/2024-11-05-notice-of-breaking-changes-for-github-actions/

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-08 11:00:15 -06:00
Greg Kurz
2bd8fde44a Merge pull request #10511 from ldoktor/fedora-python
ci.ocp: Use the official python:3 container for sanity
2024-11-08 16:31:40 +01:00
Fabiano Fidêncio
baf88bb72d ci: tdx: kbs: Ensure https_proxy is taken in consideration
Trustee's deployment must set the correct https_proxy as env var on the
container that will talk to the ITA / ITTS server, otherwise the kbs
service won't be able to start, causing then issues in our CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Krzysztof Sandowicz <krzysztof.sandowicz@intel.com>
2024-11-08 16:06:16 +01:00
Steve Horsman
1f728eb906 Merge pull request #10498 from stevenhorsman/update-create-container-timeout-log
tests: k8s: Update image pull timeout error
2024-11-08 10:47:39 +00:00
Steve Horsman
6112bf85c3 Merge pull request #10506 from stevenhorsman/skip-runk-ci
workflow: Remove/skip runk CI
2024-11-08 09:54:06 +00:00
Steve Horsman
a5acbc9e80 Merge pull request #10505 from stevenhorsman/remove-stratovirt-metrics-tests
metrics: Skip metrics on stratovirt
2024-11-08 08:53:05 +00:00
Lukáš Doktor
2f7d34417a ci.ocp: Use the official python:3 container for sanity
Fedora F40 removed python3 from the base container; to avoid such issues
let's rely on the latest and greatest official python container.

Fixes: #10497

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-11-08 07:16:30 +01:00
Zvonko Kaiser
183bd2aeed Merge pull request #9584 from zvonkok/kata-agent-cdi
kata-agent: Add CDI support
2024-11-07 14:18:32 -05:00
Zvonko Kaiser
aa2e1a57bd agent: Added test-case for handle_cdi_devices
We are generating a simple CDI spec with device and
global containerEdits to test the CDI crate.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-07 17:03:18 +00:00
Gabriela Cervantes
4274198664 gha: Get artifacts when installing kata tools in stability workflow
This PR adds the get-artifacts step, which is needed when installing kata
tools in the stability workflow, to avoid failures saying that artifacts
are missing.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-07 16:20:41 +00:00
stevenhorsman
a5f1a5a0ee workflow: Remove/skip runk CI
As discussed in the AC meeting, we don't have a maintainer
(or users?) of runk, and the CI is unstable, so given we can't
support it, we shouldn't waste CI cycles on it.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-07 14:16:30 +00:00
stevenhorsman
0efe9f4e76 metrics: Skip metrics on stratovirt
As discussed on the AC call, we are lacking maintainers for the
metrics tests. As a starting point for potentially phasing them
out, we discussed starting with removing the test for stratovirt
as a non-core hypervisor and a job that is problematic in leaving
behind resources that need cleaning up.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-07 14:06:57 +00:00
Fabiano Fidêncio
c332e953f9 Merge pull request #10500 from squarti/fix-10499
runtime: Files are not synced between host and guest VMs
2024-11-07 08:28:53 +01:00
Silenio Quarti
be3ea2675c runtime: Files are not synced between host and guest VMs
This PR makes the root dir absolute after resolving the
default root dir symlink. 

Fixes: https://github.com/kata-containers/kata-containers/issues/10499

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-11-06 17:31:12 -05:00
GabyCT
47cea6f3c6 Merge pull request #10493 from GabyCT/topic/katatoolsta
gha: Add install kata tools as part of the stability workflow
2024-11-06 14:16:48 -06:00
Gabriela Cervantes
13e27331ef gha: Add install kata tools as part of the stability workflow
This PR adds the install kata tools step as part of the k8s stability workflow,
to avoid failures saying that certain kata components are not installed.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-06 20:07:06 +00:00
Fabiano Fidêncio
71c4c2a514 Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev
workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
2024-11-06 21:04:45 +01:00
Zvonko Kaiser
3995fe71f9 kata-agent: Add CDI support
For proper device handling add CDI support

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-06 17:50:20 +00:00
stevenhorsman
85554257f8 tests: k8s: Update image pull timeout error
Currently the error we are checking for is
`CreateContainerRequest timed out`, but this message
doesn't always seem to be printed to our pod log.
Try using a more general message that should be present
more reliably.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-06 17:00:26 +00:00
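The change above can be sketched as a shell check (hedged illustration; the log text and the message being matched are placeholders, not the exact strings used in the kata CI scripts):

```shell
# Grep a captured pod log for a more general timeout message instead of
# a specific one. Both strings here are stand-ins.
pod_log="failed to pull image: context deadline exceeded"
if printf '%s\n' "$pod_log" | grep -q "deadline exceeded"; then
    echo "image pull timeout detected"
fi
```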
Fabiano Fidêncio
a3c72e59b1 Merge pull request #10495 from littlejawa/ci/skip_nginx_connectivity_for_crio
ci: skip nginx connectivity test with qemu/crio
2024-11-06 13:43:19 +01:00
Julien Ropé
da5e0c3f53 ci: skip nginx connectivity test with crio
We have an error with service name resolution with this test when using crio.
This error could not be reproduced outside of the CI for now.
Skipping it to keep the CI job running until we find a solution.

See: #10414

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 12:07:02 +01:00
Greg Kurz
5af614b1a4 Merge pull request #10496 from littlejawa/ci/expose_container_runtime
ci: export CONTAINER_RUNTIME to the test scripts
2024-11-06 12:05:36 +01:00
Julien Ropé
6d0cb1e9a8 ci: export CONTAINER_RUNTIME to the test scripts
This variable will allow tests to adapt their behaviour to the runtime (containerd/crio).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 11:29:11 +01:00
Fabiano Fidêncio
72979d7f30 workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
By also testing it with qemu-coco-dev, it becomes easier for a
developer without access to a TEE to test it locally.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-06 10:47:08 +01:00
Fabiano Fidêncio
7d3f2f7200 runtime: Match TEEs for the static_sandbox_resource_mgmt option
The qemu-coco-dev runtime class should be as close as possible to what
the TEEs runtime classes are doing, and this was one of the options that
ended up overlooked till now.

Shout out to Dan Mihai for noticing that!

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-06 10:47:08 +01:00
Fabiano Fidêncio
ea8114833c Merge pull request #10491 from fidencio/topic/fix-typo-in-the-ephemeral-handler
agent: fix typo on getting EphemeralHandler size option
2024-11-06 10:31:48 +01:00
Fabiano Fidêncio
7e6779f3ad Merge pull request #10488 from fidencio/topic/teach-our-machinery-to-deal-with-rc-kernels
build: kernel: Teach our machinery to deal with -rc kernels
2024-11-05 16:19:57 +01:00
Zvonko Kaiser
a4725034b2 Merge pull request #9480 from zvonkok/build-image-suffix
image: Add suffix to image or initrd depending on the NVIDIA driver version
2024-11-05 09:43:56 -05:00
Fabiano Fidêncio
77c87a0990 agent: fix typo on getting EphemeralHandler size option
Most likely this was overlooked during the development / review, but
we're actually interested in the size rather than the pagesize of the
hugepages.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 15:15:17 +01:00
Fabiano Fidêncio
2b16160ff1 versions: kernel-dragonball: Fix URL
SSIA

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:55:34 +01:00
Fabiano Fidêncio
f7b31ccd6c kernel: bump kata_config_version
Due to the changes done in the previous commits.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:57 +01:00
Fabiano Fidêncio
a52ea32b05 build: kernel: Learn how to deal with release candidates
So far we were not prepared to deal with release candidates as those:
* Do not have a sha256sum in the sha256sums provided by the kernel cdn
* Come from a different URL (directly from Linus)
* Have a different suffix (.tar.gz, instead of .tar.xz)

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
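The -rc handling described above can be sketched like this (a hedged illustration; the naming helper and tarball names are assumptions, not the actual kata build scripts):

```shell
# Release candidates come directly from Linus and use .tar.gz, while
# regular releases from the kernel CDN use .tar.xz.
kernel_tarball_name() {
    version="$1"
    case "$version" in
        *-rc*) echo "linux-${version}.tar.gz" ;;
        *)     echo "linux-${version}.tar.xz" ;;
    esac
}

kernel_tarball_name "6.12-rc3"   # → linux-6.12-rc3.tar.gz
kernel_tarball_name "6.11.5"     # → linux-6.11.5.tar.xz
```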
Fabiano Fidêncio
9f2d4b2956 build: kernel: Always pass the url to the builder
This doesn't change much about how we're doing things today, but it
greatly simplifies cases that may be added later on (and will be), like
building -rc kernels.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
ee1a17cffc build: kernel: Take kernel_url into consideration
Let's make sure the kernel_url is actually used whenever it's passed to
the function.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
9a0b501042 build: kernel: Remove tee specific function
As, thankfully, we're relying on upstream kernels for TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
cc4006297a build: kernel: Pass the yaml base path instead of the version path
By doing this we can ensure this can be re-used, if needed (and it'll be
needed), for also getting the URL.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
7057ff1cd5 build: kernel: Always pass -f to the kernel builder
-f forces the (re)generation of the config when doing the setup, which
helps a lot on local development whilst not causing any harm in the CI
builds.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 12:26:02 +01:00
Fabiano Fidêncio
910defc4cf Merge pull request #10490 from fidencio/topic/fix-ovmf-build
builds: ovmf: Workaround Zeex repo becoming private
2024-11-05 12:25:00 +01:00
Fabiano Fidêncio
aff3d98ddd builds: ovmf: Workaround Zeex repo becoming private
Let's just do a simple `sed` and **not** use the repo that became
private.

This is not a backport of https://github.com/tianocore/edk2/pull/6402,
but it's a similar approach that allows us to proceed without the need
to pick up a newer version of edk2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-05 11:25:54 +01:00
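The `sed` workaround mentioned above might look like this (hedged sketch; the mirror URL is an illustrative placeholder, not the one used in the actual edk2 build):

```shell
# Rewrite the submodule URL in place instead of cloning the now-private
# repository. The replacement mirror is a made-up example.
gitmodules='url = https://github.com/Zeex/subhook.git'
printf '%s\n' "$gitmodules" | sed 's|github.com/Zeex/subhook|github.com/example/subhook-mirror|'
# → url = https://github.com/example/subhook-mirror.git
```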
Dan Mihai
03bf4433d7 Merge pull request #10459 from stevenhorsman/update-bats
tests: k8s: Update bats
2024-11-04 12:26:58 -08:00
Aurélien Bombo
f639d3e87c Merge pull request #10395 from Sumynwa/sumsharma/create_container
agent-ctl: Add support to test kata-agent's container creation APIs.
2024-11-04 14:09:12 -06:00
GabyCT
7f066be04e Merge pull request #10485 from GabyCT/topic/fixghast
gha: Fix source for gha stability run script
2024-11-04 12:09:28 -06:00
Steve Horsman
a2b9527be3 Merge pull request #10481 from mkulke/mkulke/init-cdh-client-on-gcprocs-none
agent: perform attestation init w/o process launch
2024-11-04 17:27:45 +00:00
Gabriela Cervantes
fd4d0dd1ce gha: Fix source for gha stability run script
This PR fixes the sourcing to avoid duplication, especially in the common.sh
script, and to avoid failures saying that a certain script is not in the directory.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-04 16:16:13 +00:00
Magnus Kulke
bf769851f8 agent: perform attestation init w/o process launch
This change is motivated by a problem in peerpod's podvms. In this setup
the lifecycle of guest components is managed by systemd. The current code
skips over init steps like setting the ocicrypt-rs env and initialization
of a CDH client in this case.

To address this, the launch of the processes has been isolated into its
own fn.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-11-04 13:31:07 +01:00
Steve Horsman
4fd9df84e4 Merge pull request #10482 from GabyCT/topic/fixvirtdoc
docs: Update virtualization document
2024-11-04 11:51:09 +00:00
stevenhorsman
175ebfec7c Revert "k8s:kbs: Add trap statement to clean up tmp files"
This reverts commit 973b8a1d8f.

As @danmihai1 points out https://github.com/bats-core/bats-core/issues/364
states that using traps in bats is error prone, so this could be the cause
of the confidential test instability we've been seeing, like it was
in the static checks, so let's try and revert this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:37 +00:00
stevenhorsman
75cb1f46b8 tests/k8s: Add skip if setup_common fails
At @danmihai1's suggestion, add a die message in case
the call to setup_common fails, so we can see it in the test
output.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
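The pattern described above can be sketched as follows (a minimal illustration; `die` and `setup_common` are stand-ins for the helpers in the kata test library, and the failure is simulated):

```shell
# Abort with a visible message when common setup fails, so the failure
# shows up in the test output instead of silently cascading.
die() { echo "FATAL: $*" >&2; exit 1; }
setup_common() { return 1; }   # simulate a setup failure

# Run in a subshell so this sketch itself keeps executing.
( setup_common || die "setup_common failed" ) 2>&1 || true
# → FATAL: setup_common failed
```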
stevenhorsman
3f5bf9828b tests: k8s: Update bats
We've seen some issues with tests not being run in
some of the CoCo CI jobs (Issue #10451), and in the
environments that are more stable we noticed that
they had a newer version of bats installed.

Try updating the version to 1.10+ and print out
the version for debug purposes.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
Steve Horsman
06d2cc7239 Merge pull request #10453 from bpradipt/remote-annotation
runtime: Add GPU annotations for remote hypervisor
2024-11-04 09:10:06 +00:00
Zvonko Kaiser
3781526c94 gpu: Add VARIANT to the initrd and image build
We need to know if we're building an NVIDIA initrd or image,
and additionally whether we build a regular or confidential VARIANT.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-01 18:34:13 +00:00
Zvonko Kaiser
95b69c5732 build: initrd make it coherent to the image build
Add -f for moving the initrd to the correct file path

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-01 18:34:13 +00:00
Zvonko Kaiser
3c29c1707d image: Add suffix to image or initrd depending on the NVIDIA driver version
Fixes: #9478

We want to keep track of the driver versions built during the initrd/image build, so update the artifact_name after the fact.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-11-01 18:34:13 +00:00
Sumedh Alok Sharma
4b7aba5c57 agent-ctl: Add support to test kata-agent's container creation APIs.
This commit introduces changes to enable testing kata-agent's container
APIs of CreateContainer/StartContainer/RemoveContainer. The changeset
includes:
- using confidential-containers image-rs crate to pull/unpack/mount a
container image. Currently supports only un-authenticated registry pulls
- re-factor api handlers to reduce cmdline complexity and handle
request generation logic in tool
- introduce an OCI config template for container creation
- add test case

Fixes #9707

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-11-01 22:18:54 +05:30
Fabiano Fidêncio
2efcb442f4 Merge pull request #10442 from Sumynwa/sumsharma/tools_use_ubuntu_static_build
ci: Use ubuntu for static building of kata tools.
2024-11-01 16:04:31 +01:00
Gabriela Cervantes
1ca83f9d41 docs: Update virtualization document
This PR updates the virtualization document by removing a URL link
which is no longer valid.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-31 17:28:02 +00:00
GabyCT
a3d594d526 Merge pull request #10480 from GabyCT/topic/fixstabilityrun
gha: Add missing steps in Kata stability workflow
2024-10-31 09:57:33 -06:00
Fabiano Fidêncio
e058b92350 Merge pull request #10425 from burgerdev/darwin
genpolicy: support darwin target
2024-10-31 12:16:44 +01:00
Markus Rudy
df5e6e65b5 protocols: only build RLimit impls on Linux
The current version of the oci-spec crate compiles RLimit structs only
for Linux and Solaris. Until this is fixed upstream, add compilation
conditions to the type converters for the affected structs.

Fixes: #10071

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-10-31 09:50:36 +01:00
Markus Rudy
091a410b96 kata-sys-util: move json parsing to protocols crate
The parse_json_string function is specific to parsing capability strings
out of ttRPC proto definitions and does not benefit from being available
to other crates. Moving it into the protocols crate allows removing
kata-sys-util as a dependency, which in turn enables compiling the
library on darwin.

Fixes: #10071

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-10-31 09:41:07 +01:00
Markus Rudy
8ab4bd2bfc kata-sys-util: remove obsolete cgroups dependency
The cgroups.rs source file was removed in
234d7bca04. With cgroups support handled
in runtime-rs, the cgroups dependency on kata-sys-util can be removed.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-10-31 09:41:07 +01:00
Sumedh Alok Sharma
0adf7a66c3 ci: Use ubuntu for static building of kata tools.
This commit introduces changes to use ubuntu for statically
building kata tools. In the existing CI setup, the tools are
currently built only for the x86_64 architecture.

It also fixes the build error seen for agent-ctl PR#10395.

Fixes #10441

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-10-31 13:19:18 +05:30
Gabriela Cervantes
c4089df9d2 gha: Add missing steps in Kata stability workflow
This PR adds missing steps in the gha run script for the kata stability
workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-30 19:13:15 +00:00
Xuewei Niu
1a216fecdf Merge pull request #10225 from Chasing1020/main
runtime-rs: Add basic boilerplate for remote hypervisor
2024-10-30 17:02:50 +08:00
Hyounggyu Choi
dca69296ae Merge pull request #10476 from BbolroC/switch-to-kubeadm-s390x
gha: Switch KUBERNETES from k3s to kubeadm on s390x
2024-10-30 09:52:06 +01:00
GabyCT
9293931414 Merge pull request #10474 from GabyCT/topic/removeunvarb
packaging: Remove kernel config repo variable as it is unused
2024-10-29 12:52:07 -06:00
Gabriela Cervantes
69ee287e50 packaging: Remove kernel config repo variable as it is unused
This PR removes the kernel config repo variable at the build kernel
script as it is not used.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-29 17:09:52 +00:00
GabyCT
8539cd361a Merge pull request #10462 from GabyCT/topic/increstress
tests: Increase time to run stressng k8s tests
2024-10-29 11:08:47 -06:00
Chasing1020
425f6ad4e6 runtime-rs: add oci spec for prepare_vm method
The cloud-api-adaptor needs to support different types of pod VM
instance.
We need to pass some annotations like machine_type, default_vcpus and
default_memory to prepare the VMs.

Signed-off-by: Chasing1020 <643601464@qq.com>
2024-10-30 01:01:28 +08:00
Chasing1020
f1167645f3 runtime-rs: support for remote hypervisors type
This patch adds the support of the remote hypervisor type for runtime-rs.
The cloud-api-adaptor needs the annotations and network namespace path
to create the VMs.
The remote hypervisor opens a UNIX domain socket specified in the config
file, and sends ttrpc requests to an external process to control sandbox
VMs.

Fixes: #10350

Signed-off-by: Chasing1020 <643601464@qq.com>
2024-10-30 00:54:17 +08:00
Pradipta Banerjee
6f1ba007ed runtime: Add GPU annotations for remote hypervisor
Add GPU annotations for remote hypervisor to help
with the right instance selection based on number of GPUs
and model

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2024-10-29 10:28:21 -04:00
Steve Horsman
68225b53ca Merge pull request #10475 from stevenhorsman/revert-10452
Revert "tests: Add trap statement in kata doc script"
2024-10-29 13:58:00 +00:00
Hyounggyu Choi
aeef28eec2 gha: Switch to kubeadm for run-k8s-tests-on-zvsi
Last November, SUSE discontinued support for s390x, leaving k3s
on this platform stuck at k8s version 1.28, while upstream k8s
has since reached 1.31. Fortunately, kubeadm allows us to create
a 1.30 Kubernetes cluster on s390x.
This commit switches the KUBERNETES option from k3s to kubeadm
for s390x and removes a dedicated cluster creation step.
Now, cluster setup and teardown occur in ACTIONS_RUNNER_HOOK_JOB_{STARTED,COMPLETED}.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-29 14:27:32 +01:00
Hyounggyu Choi
238f67005f tests: Add kubeadm option for KUBERNETES in gha-run.sh
When creating a k8s cluster via kubeadm, the devmapper setup
for containerd requires a different configuration.
This commit introduces a new `kubeadm` option for the KUBERNETES
variable and adjusts the path to the containerd config file for
devmapper setup.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-29 14:19:42 +01:00
stevenhorsman
b1cffb4b09 Revert "tests: Add trap statement in kata doc script"
This reverts commit 093a6fd542.
as it is breaking the static checks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-29 09:57:18 +00:00
Aurélien Bombo
eb04caaf8f Merge pull request #10074 from koct9i/log-vm-start-error
runtime: log vm start error before cleanup
2024-10-28 14:39:00 -05:00
Fabiano Fidêncio
e675e233be Merge pull request #10473 from fidencio/topic/build-cache-fix-shim-v2-root_hash.txt-location
build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}"
2024-10-28 16:53:06 +01:00
Fabiano Fidêncio
f19c8cbd02 build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}"
All the oras push logic happens from inside `${workdir}`, while the
root_hash.txt extraction and renaming was not taking this into
consideration.

This was not caught during the manually triggered runs as those do not
perform the oras push.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 15:17:16 +01:00
Steve Horsman
51bc71b8d9 Merge pull request #10466 from kata-containers/topic/ensure-shim-v2-sets-the-measured-rootfs-parameters-to-the-config
re-enable measured rootfs build & tests
2024-10-28 13:11:50 +00:00
Fabiano Fidêncio
b70d7c1aac tests: Enable measured rootfs tests for qemu-coco-dev
Then it's on par with what's being tested with TEEs using a rootfs
image.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:54 +01:00
Fabiano Fidêncio
d23d057ac7 runtime: Enable measured rootfs for qemu-coco-dev
Let's make sure we are prepared to test this with non-TEE environments
as well.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
7d202fc173 tests: Re-enable measured_rootfs test for TDX
As we're now building everything needed to test TDX with measured rootfs
support, let's bring this test back in (for TDX only, at least for now).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
d537932e66 build: shim-v2: Ensure MEASURED_ROOTFS is exported
The approach taken for now is to export MEASURED_ROOTFS=yes on the
workflow files for the architectures using confidential stuff, and leave
the "normal" build without having it set (to avoid any change of
expectation on the current behaviour).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
9c8b20b2bf build: shim-v2: Rebuild if root_hashes do not match
Let's make sure we take the root_hashes into consideration to decide
whether the shim-v2 should or should not be used from the cached
artefacts.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
9c84998de9 build: cache: Cache root_hash.txt used by the shim-v2
Let's cache the root_hash.txt from the confidential image so we can use
them later on to decide whether there was a rootfs change that would
require shim-v2 to be rebuilt.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
d2d9792720 build: Don't leave cached component behind if it can't be used
Let's ensure we remove the component and any extra tarball provided by
ORAS in case the cached component cannot be used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
ef29824db9 runtime: Don't do measured rootfs for "vanilla" kernel
We may decide to add this later on, but for now this is only targetting
TEEs and the confidential image / initrd.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
a65946bcb0 workflows: build: Ensure rootfs is present for shim-v2 build
Let's ensure that we get the already built rootfs tarball from previous
steps of the action at the time we're building the shim-v2.

The reason we do that is because the rootfs binary tarball has a
root_hash.txt file that contains the information needed by the shim-v2
build scripts to add the measured rootfs arguments to the shim-v2
configuration files.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
6ea0369878 workflows: build: Ensure rootfs is built before shim-v2
As the rootfs will have what we need to add as part of the shim-v2
configuration files for measured rootfs, we **must** ensure this is
built **before** shim-v2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
13ea082531 workflows: Build rootfs after its deps are built
By doing this we can just re-use the dependencies already built, saving
us a reasonable amount of time.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
eb07a809ce tests: Add a helper script to use prebuild components
This is a helper script that does basically what's already being done by
the s390x CI, which is:
* Move a folder with the components that were stored / downloaded
  during the GHA execution to the expected `build` location
* Get rid of the dependencies for a specific asset, as the dependencies
  are already pulled in from previous GHA steps

For now this script is only being added but not yet executed anywhere,
and that will come as the next step in this series.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:52 +01:00
Fabiano Fidêncio
c2b18f9660 workflows: Store rootfs dependencies
So far we haven't been storing the rootfs dependencies as part of our
workflows, but we had better do so to re-use them as part of the rootfs
build.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:52 +01:00
Konstantin Khlebnikov
ee50582848 runtime: log vm start error before cleanup
Returning a proper error to the initiator is not guaranteed:
the StopVM method could kill the shim process together with VM pieces.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
2024-10-28 11:21:21 +01:00
Gabriela Cervantes
a3ef8c0a16 tests: Increase time to run stressng k8s tests
This PR increases the time to run the stressng k8s tests for the
CoCo stability CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-24 16:34:17 +00:00
263 changed files with 15522 additions and 1812 deletions


@@ -21,4 +21,5 @@ self-hosted-runner:
- sev-snp
- s390x
- s390x-large
- tdx
- tdx-no-attestation
- tdx-attestation

.github/workflows/actionlint.yaml vendored Normal file

@@ -0,0 +1,33 @@
name: Lint GHA workflows
on:
workflow_dispatch:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
paths:
- '.github/workflows/**'
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
run-actionlint:
env:
GH_TOKEN: ${{ github.token }}
runs-on: ubuntu-24.04
steps:
- name: Checkout the code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install actionlint gh extension
run: gh extension install https://github.com/cschleiden/gh-actionlint
- name: Run actionlint
run: gh actionlint


@@ -33,7 +33,7 @@ jobs:
run: |
# Clone into a temporary directory to avoid overwriting
# any existing github directory.
pushd $(mktemp -d) &>/dev/null
pushd "$(mktemp -d)" &>/dev/null
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
sudo install hub-util.sh /usr/local/bin
popd &>/dev/null


@@ -36,7 +36,7 @@ jobs:
run: |
# Clone into a temporary directory to avoid overwriting
# any existing github directory.
pushd $(mktemp -d) &>/dev/null
pushd "$(mktemp -d)" &>/dev/null
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
sudo install pr-add-size-label.sh /usr/local/bin
popd &>/dev/null


@@ -138,6 +138,8 @@ jobs:
run: bash tests/integration/nydus/gha-run.sh run
run-runk:
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts


@@ -19,7 +19,6 @@ jobs:
- runtime-rs
- agent-ctl
- kata-ctl
- runk
- trace-forwarder
- genpolicy
command:
@@ -40,22 +39,18 @@ jobs:
component-path: src/tools/agent-ctl
- component: kata-ctl
component-path: src/tools/kata-ctl
- component: runk
component-path: src/tools/runk
- component: trace-forwarder
component-path: src/tools/trace-forwarder
- install-libseccomp: no
- component: agent
install-libseccomp: yes
- component: runk
install-libseccomp: yes
- component: genpolicy
component-path: src/tools/genpolicy
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE $HOME
sudo rm -rf $GITHUB_WORKSPACE/* && echo "GITHUB_WORKSPACE removed" || { sleep 10 && sudo rm -rf $GITHUB_WORKSPACE/*; }
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE" "$HOME"
sudo rm -rf "$GITHUB_WORKSPACE"/* || { sleep 10 && sudo rm -rf "$GITHUB_WORKSPACE"/*; }
sudo rm -f /tmp/kata_hybrid* # Sometime we got leftover from test_setup_hvsock_failed()
- name: Checkout the code
@@ -72,12 +67,12 @@ jobs:
if: ${{ matrix.component == 'runtime' }}
run: |
./tests/install_go.sh -f -p
echo "/usr/local/go/bin" >> $GITHUB_PATH
echo "/usr/local/go/bin" >> "$GITHUB_PATH"
- name: Install rust
if: ${{ matrix.component != 'runtime' }}
run: |
./tests/install_rust.sh
echo "${HOME}/.cargo/bin" >> $GITHUB_PATH
echo "${HOME}/.cargo/bin" >> "$GITHUB_PATH"
- name: Install musl-tools
if: ${{ matrix.component != 'runtime' }}
run: sudo apt-get -y install musl-tools
@@ -91,10 +86,10 @@ jobs:
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
echo "LIBSECCOMP_LINK_TYPE=static" >> "$GITHUB_ENV"
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> "$GITHUB_ENV"
- name: Install protobuf-compiler
if: ${{ matrix.command != 'make vendor' && (matrix.component == 'agent' || matrix.component == 'runk' || matrix.component == 'genpolicy' || matrix.component == 'agent-ctl') }}
if: ${{ matrix.command != 'make vendor' && (matrix.component == 'agent' || matrix.component == 'genpolicy' || matrix.component == 'agent-ctl') }}
run: sudo apt-get -y install protobuf-compiler
- name: Install clang
if: ${{ matrix.command == 'make check' && (matrix.component == 'agent' || matrix.component == 'agent-ctl') }}
@@ -102,8 +97,8 @@ jobs:
- name: Setup XDG_RUNTIME_DIR for the `runtime` tests
if: ${{ matrix.command != 'make vendor' && matrix.command != 'make check' && matrix.component == 'runtime' }}
run: |
XDG_RUNTIME_DIR=$(mktemp -d /tmp/kata-tests-$USER.XXX | tee >(xargs chmod 0700))
echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> $GITHUB_ENV
XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-$USER.XXX" | tee >(xargs chmod 0700))
echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"
- name: Running `${{ matrix.command }}` for ${{ matrix.component }}
run: |
cd ${{ matrix.component-path }}
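The `>> "$GITHUB_ENV"` / `>> "$GITHUB_PATH"` quoting changes in this file follow the same rule: the redirection target is a path the runner controls, so it should be treated as potentially containing whitespace. A sketch with a hypothetical stand-in file whose path includes a space:

```shell
# Hypothetical stand-in for $GITHUB_ENV; unquoted, `>> $env_file`
# would word-split the path and the redirection would fail.
workdir=$(mktemp -d "/tmp/env demo.XXXX")
env_file="$workdir/github env"
: > "$env_file"

echo "XDG_RUNTIME_DIR=/run/user/1000"  >> "$env_file"
echo "LIBSECCOMP_LINK_TYPE=static"     >> "$env_file"

grep -c '=' "$env_file"
```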

View File

@@ -37,6 +37,7 @@ jobs:
- cloud-hypervisor
- cloud-hypervisor-glibc
- coco-guest-components
- csi-kata-directvolume
- firecracker
- genpolicy
- kata-ctl
@@ -53,12 +54,6 @@ jobs:
- qemu
- qemu-snp-experimental
- stratovirt
- rootfs-image
- rootfs-image-confidential
- rootfs-image-mariner
- rootfs-initrd
- rootfs-initrd-confidential
- runk
- trace-forwarder
- virtiofsd
stage:
@@ -94,7 +89,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
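The repeated `cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.*` fix is worth noting: the variables are quoted, but the glob characters are deliberately left outside the quotes so pathname expansion still runs. A sketch under assumed artifact names:

```shell
# Work in a throwaway directory.
cd "$(mktemp -d)"
build_dir=$(mktemp -d)
KATA_ASSET="shim-v2"

# Assumed artifact name, mirroring kata-static-<asset>.tar.xz.
touch "$build_dir/kata-static-shim-v2.tar.xz"
mkdir -p kata-build

# Variables quoted, glob left unquoted so it still expands.
cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
ls kata-build
```

Quoting the whole pattern (including `*.tar.*`) would suppress expansion and make the `cp` fail, so the mixed quoting here is intentional, not an oversight.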
@@ -134,7 +129,6 @@ jobs:
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
if: ${{ matrix.stage != 'release' || (matrix.asset != 'agent' && matrix.asset != 'coco-guest-components' && matrix.asset != 'pause-image') }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
@@ -142,9 +136,17 @@ jobs:
retention-days: 15
if-no-files-found: error
build-asset-shim-v2:
build-asset-rootfs:
runs-on: ubuntu-22.04
needs: build-asset
strategy:
matrix:
asset:
- rootfs-image
- rootfs-image-confidential
- rootfs-image-mariner
- rootfs-initrd
- rootfs-initrd-confidential
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -165,13 +167,93 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
# We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: build-asset-rootfs
strategy:
matrix:
asset:
- agent
- coco-guest-components
- pause-image
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-amd64-${{ matrix.asset}}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: ubuntu-22.04
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
@@ -181,6 +263,7 @@ jobs:
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
MEASURED_ROOTFS: yes
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
@@ -192,7 +275,7 @@ jobs:
create-kata-tarball:
runs-on: ubuntu-22.04
needs: [build-asset, build-asset-shim-v2]
needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
steps:
- uses: actions/checkout@v4
with:

View File
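Pulling the `needs:` edits in the file above together, the amd64 job graph after this change is roughly the following (a sketch distilled from the hunks, omitting inputs and steps):

```yaml
build-asset:                      # hypervisors, kernel, agent, tools, ...
build-asset-rootfs:               # rootfs-image*, rootfs-initrd*
  needs: build-asset
remove-rootfs-binary-artifacts:   # on release, drop agent/coco-guest-components/pause-image
  needs: build-asset-rootfs
build-asset-shim-v2:
  needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
create-kata-tarball:
  needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
```

The arm64, ppc64le, and s390x files in this series apply the same split, with per-arch asset lists (and `build-asset-boot-image-se` added on s390x).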

@@ -35,8 +35,6 @@ jobs:
- nydus
- qemu
- stratovirt
- rootfs-image
- rootfs-initrd
- virtiofsd
steps:
- name: Login to Kata Containers quay.io
@@ -63,7 +61,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -75,7 +73,6 @@ jobs:
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || matrix.asset != 'agent' }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
@@ -83,9 +80,14 @@ jobs:
retention-days: 15
if-no-files-found: error
build-asset-shim-v2:
build-asset-rootfs:
runs-on: arm64-builder
needs: build-asset
strategy:
matrix:
asset:
- rootfs-image
- rootfs-initrd
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -106,12 +108,89 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
# We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: build-asset-rootfs
strategy:
matrix:
asset:
- agent
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-arm64-${{ matrix.asset}}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: arm64-builder
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
@@ -132,11 +211,11 @@ jobs:
create-kata-tarball:
runs-on: arm64-builder
needs: [build-asset, build-asset-shim-v2]
needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
with:

View File

@@ -30,15 +30,15 @@ jobs:
- agent
- kernel
- qemu
- rootfs-initrd
- virtiofsd
stage:
- ${{ inputs.stage }}
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
${HOME}/scripts/prepare_runner.sh
sudo rm -rf $GITHUB_WORKSPACE/*
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -64,7 +64,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -76,7 +76,6 @@ jobs:
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || matrix.asset != 'agent' }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}
@@ -84,14 +83,21 @@ jobs:
retention-days: 1
if-no-files-found: error
build-asset-shim-v2:
build-asset-rootfs:
runs-on: ppc64le
needs: build-asset
strategy:
matrix:
asset:
- rootfs-initrd
stage:
- ${{ inputs.stage }}
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
${HOME}/scripts/prepare_runner.sh
sudo rm -rf $GITHUB_WORKSPACE/*
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -112,12 +118,95 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 1
if-no-files-found: error
# We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: build-asset-rootfs
strategy:
matrix:
asset:
- agent
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-ppc64le-${{ matrix.asset}}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: ppc64le
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
@@ -138,11 +227,11 @@ jobs:
create-kata-tarball:
runs-on: ppc64le
needs: [build-asset, build-asset-shim-v2]
needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
with:

View File

@@ -38,10 +38,6 @@ jobs:
- kernel-confidential
- pause-image
- qemu
- rootfs-image
- rootfs-image-confidential
- rootfs-initrd
- rootfs-initrd-confidential
- virtiofsd
env:
PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}
@@ -71,7 +67,7 @@ jobs:
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
@@ -106,7 +102,69 @@ jobs:
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || (matrix.asset != 'agent' && matrix.asset != 'coco-guest-components' && matrix.asset != 'pause-image') }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
retention-days: 15
if-no-files-found: error
build-asset-rootfs:
runs-on: s390x
needs: build-asset
strategy:
matrix:
asset:
- rootfs-image
- rootfs-image-confidential
- rootfs-initrd
- rootfs-initrd-confidential
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build ${{ matrix.asset }}
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact ${{ matrix.asset }}
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}
@@ -116,7 +174,7 @@ jobs:
build-asset-boot-image-se:
runs-on: s390x
needs: build-asset
needs: [build-asset, build-asset-rootfs]
steps:
- uses: actions/checkout@v4
@@ -142,15 +200,11 @@ jobs:
- name: Build boot-image-se
run: |
base_dir=tools/packaging/kata-deploy/local-build/
cp -r kata-artifacts ${base_dir}/build
# Skip building dependant artifacts of boot-image-se-tarball
# because we already have them from the previous build
sed -i 's/\(^boot-image-se-tarball:\).*/\1/g' ${base_dir}/Makefile
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "boot-image-se"
make boot-image-se-tarball
build_dir=$(readlink -f build)
sudo cp -r "${build_dir}" "kata-build"
sudo chown -R $(id -u):$(id -g) "kata-build"
sudo chown -R "$(id -u)":"$(id -g)" "kata-build"
env:
HKD_PATH: "host-key-document"
@@ -162,9 +216,25 @@ jobs:
retention-days: 1
if-no-files-found: error
# We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs
remove-rootfs-binary-artifacts:
runs-on: ubuntu-22.04
needs: [build-asset-rootfs, build-asset-boot-image-se]
strategy:
matrix:
asset:
- agent
- coco-guest-components
- pause-image
steps:
- uses: geekyeggo/delete-artifact@v5
if: ${{ inputs.stage == 'release' }}
with:
name: kata-artifacts-s390x-${{ matrix.asset}}${{ inputs.tarball-suffix }}
build-asset-shim-v2:
runs-on: s390x
needs: build-asset
needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -185,13 +255,21 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-artifacts
uses: actions/download-artifact@v4
with:
pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}
path: kata-artifacts
merge-multiple: true
- name: Build shim-v2
id: build
run: |
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
@@ -201,6 +279,7 @@ jobs:
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
MEASURED_ROOTFS: yes
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
@@ -212,7 +291,11 @@ jobs:
create-kata-tarball:
runs-on: s390x
needs: [build-asset, build-asset-boot-image-se, build-asset-shim-v2]
needs:
- build-asset
- build-asset-rootfs
- build-asset-boot-image-se
- build-asset-shim-v2
steps:
- uses: actions/checkout@v4
with:

View File

@@ -24,7 +24,7 @@ jobs:
run: bash cargo-deny-generator.sh
working-directory: ./.github/cargo-deny-composite-action/
env:
GOPATH: ${{ runner.workspace }}/kata-containers
GOPATH: ${{ github.workspace }}/kata-containers
- name: Run Action
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: ./.github/cargo-deny-composite-action

View File

@@ -16,6 +16,6 @@ jobs:
- name: Fetch a test result for {{ matrix.test_title }}
run: |
file_name="${TEST_TITLE}-$(date +%Y-%m-%d).log"
/home/${USER}/script/handle_test_log.sh download $file_name
"/home/${USER}/script/handle_test_log.sh" download "$file_name"
env:
TEST_TITLE: ${{ matrix.test_title }}

View File

@@ -37,7 +37,7 @@ jobs:
secrets: inherit
build-and-publish-tee-confidential-unencrypted-image:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -83,4 +83,5 @@ jobs:
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
tarball-suffix: -${{ inputs.tag }}
secrets: inherit

View File

@@ -135,6 +135,56 @@ jobs:
platforms: linux/amd64, linux/s390x
file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile
publish-csi-driver-amd64:
needs: build-kata-static-tarball-amd64
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: get-kata-tarball
uses: actions/download-artifact@v4
with:
name: kata-static-tarball-amd64-${{ inputs.tag }}
path: kata-artifacts
- name: Install tools
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Copy binary into Docker context
run: |
# Copy to the location where the Dockerfile expects the binary.
mkdir -p src/tools/csi-kata-directvolume/bin/
cp /opt/kata/bin/csi-kata-directvolume src/tools/csi-kata-directvolume/bin/directvolplugin
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Kata Containers ghcr.io
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Docker build and push
uses: docker/build-push-action@v5
with:
tags: ghcr.io/kata-containers/csi-kata-directvolume:${{ inputs.pr-number }}
push: true
context: src/tools/csi-kata-directvolume/
platforms: linux/amd64
file: src/tools/csi-kata-directvolume/Dockerfile
run-kata-monitor-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
@@ -173,9 +223,13 @@ jobs:
run-kata-coco-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]
needs:
- publish-kata-deploy-payload-amd64
- build-and-publish-tee-confidential-unencrypted-image
- publish-csi-driver-amd64
uses: ./.github/workflows/run-kata-coco-tests.yaml
with:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64

View File

@@ -16,7 +16,7 @@ jobs:
runs-on: macos-latest
steps:
- name: Install Go
uses: actions/setup-go@v2
uses: actions/setup-go@v5
with:
go-version: 1.22.2
- name: Checkout code

View File

@@ -12,15 +12,15 @@ jobs:
target_branch: ${{ github.base_ref }}
steps:
- name: Install Go
uses: actions/setup-go@v2
uses: actions/setup-go@v5
with:
go-version: 1.22.2
env:
GOPATH: ${{ runner.workspace }}/kata-containers
GOPATH: ${{ github.workspace }}/kata-containers
- name: Set env
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
echo "GOPATH=${{ github.workspace }}" >> "$GITHUB_ENV"
echo "${{ github.workspace }}/bin" >> "$GITHUB_PATH"
- name: Checkout code
uses: actions/checkout@v4
with:
@@ -29,4 +29,4 @@ jobs:
# docs url alive check
- name: Docs URL Alive Check
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make docs-url-alive-check
cd "${GOPATH}/src/github.com/${{ github.repository }}" && make docs-url-alive-check

View File

@@ -34,7 +34,7 @@ on:
jobs:
skipper:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
outputs:
skip_build: ${{ steps.skipper.outputs.skip_build }}
skip_test: ${{ steps.skipper.outputs.skip_test }}

View File

@@ -18,7 +18,7 @@ concurrency:
jobs:
gatekeeper:
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:

View File

@@ -20,9 +20,9 @@ jobs:
run: |
pushd tools/packaging/kata-deploy/runtimeclasses/
echo "::group::Combine runtime classes"
for runtimeClass in `find . -type f \( -name "*.yaml" -and -not -name "kata-runtimeClasses.yaml" \) | sort`; do
for runtimeClass in $(find . -type f \( -name "*.yaml" -and -not -name "kata-runtimeClasses.yaml" \) | sort); do
echo "Adding ${runtimeClass} to the resultingRuntimeClasses.yaml"
cat ${runtimeClass} >> resultingRuntimeClasses.yaml;
cat "${runtimeClass}" >> resultingRuntimeClasses.yaml;
done
echo "::endgroup::"
echo "::group::Displaying the content of resultingRuntimeClasses.yaml"

View File

@@ -31,7 +31,7 @@ jobs:
run: |
# Clone into a temporary directory to avoid overwriting
# any existing github directory.
pushd $(mktemp -d) &>/dev/null
pushd "$(mktemp -d)" &>/dev/null
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
sudo install hub-util.sh /usr/local/bin
popd &>/dev/null
@@ -72,9 +72,9 @@ jobs:
project_type="org"
project_column="In progress"
for issue_url in $(echo "$linked_issue_urls")
for issue_url in $linked_issue_urls
do
issue=$(echo "$issue_url"| awk -F\/ '{print $NF}' || true)
issue=$(echo "$issue_url"| awk -F/ '{print $NF}' || true)
[ -z "$issue" ] && {
echo "::error::Cannot determine issue number from $issue_url for PR $pr"

View File

@@ -62,5 +62,5 @@ jobs:
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

View File

@@ -28,7 +28,7 @@ jobs:
steps:
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
with:
@@ -66,6 +66,5 @@ jobs:
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

View File

@@ -27,13 +27,14 @@ jobs:
runs-on: ppc64le
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
${HOME}/scripts/prepare_runner.sh
sudo rm -rf $GITHUB_WORKSPACE/*
"${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Adjust a permission for repo
run: |
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
with:
@@ -71,5 +72,5 @@ jobs:
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

View File

@@ -62,5 +62,5 @@ jobs:
id: build-and-push-kata-payload
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz \
"$(pwd)"/kata-static.tar.xz \
${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

View File

@@ -42,18 +42,18 @@ jobs:
run: |
# We need to do such trick here as the format of the $GITHUB_REF
# is "refs/tags/<tag>"
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
tags=(${tag} "latest")
tags=("${tag}" "latest")
else
tags=(${tag})
tags=("${tag}")
fi
for tag in ${tags[@]}; do
for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done
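The array-quoting changes in this hunk (`tags=("${tag}" "latest")`, `for tag in "${tags[@]}"`) keep each tag as a single word even if it ever contained whitespace, and `cut -d/ -f3-` keeps everything after `refs/tags/`, so tags containing `/` survive too. A sketch with a hypothetical tag ref:

```shell
# Hypothetical ref as delivered by GitHub for a tag push.
GITHUB_REF="refs/tags/3.13.0"
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)   # -f3- preserves tags containing "/"

if [ "${tag}" = "main" ]; then
    tags=("${tag}" "latest")
else
    tags=("${tag}")
fi

# Quoted expansion: exactly one loop iteration per array element.
for t in "${tags[@]}"; do
    echo "payload tag: ${t}-amd64"
done
```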

View File

@@ -42,18 +42,18 @@ jobs:
run: |
# We need to do such trick here as the format of the $GITHUB_REF
# is "refs/tags/<tag>"
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
tags=(${tag} "latest")
tags=("${tag}" "latest")
else
tags=(${tag})
tags=("${tag}")
fi
for tag in ${tags[@]}; do
for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
$(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done

View File

@@ -19,9 +19,10 @@ jobs:
runs-on: ppc64le
steps:
- name: Prepare the self-hosted runner
timeout-minutes: 15
run: |
bash ${HOME}/scripts/prepare_runner.sh
sudo rm -rf $GITHUB_WORKSPACE/*
bash "${HOME}/scripts/prepare_runner.sh"
sudo rm -rf "$GITHUB_WORKSPACE"/*
- name: Login to Kata Containers docker.io
uses: docker/login-action@v3
@@ -47,18 +48,18 @@ jobs:
run: |
# We need to do such trick here as the format of the $GITHUB_REF
# is "refs/tags/<tag>"
-tag=$(echo $GITHUB_REF | cut -d/ -f3-)
+tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
-tags=(${tag} "latest")
+tags=("${tag}" "latest")
else
-tags=(${tag})
+tags=("${tag}")
fi
-for tag in ${tags[@]}; do
+for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
-$(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
+"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
-$(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
+"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done


@@ -42,18 +42,18 @@ jobs:
run: |
# We need to do such trick here as the format of the $GITHUB_REF
# is "refs/tags/<tag>"
-tag=$(echo $GITHUB_REF | cut -d/ -f3-)
+tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)
if [ "${tag}" = "main" ]; then
tag=$(./tools/packaging/release/release.sh release-version)
-tags=(${tag} "latest")
+tags=("${tag}" "latest")
else
-tags=(${tag})
+tags=("${tag}")
fi
-for tag in ${tags[@]}; do
+for tag in "${tags[@]}"; do
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
-$(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
+"$(pwd)"/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
-$(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
+"$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \
"${tag}-${{ inputs.target-arch }}"
done


@@ -175,6 +175,23 @@ jobs:
env:
GH_TOKEN: ${{ github.token }}
+upload-helm-chart-tarball:
+needs: release
+runs-on: ubuntu-22.04
+steps:
+- name: Checkout repository
+uses: actions/checkout@v4
+- name: Install helm
+uses: azure/setup-helm@v4.2.0
+id: install
+- name: Generate and upload helm chart tarball
+run: |
+./tools/packaging/release/release.sh upload-helm-chart-tarball
+env:
+GH_TOKEN: ${{ github.token }}
publish-release:
needs: [ build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le, publish-multi-arch-images, upload-multi-arch-static-tarball, upload-versions-yaml, upload-cargo-vendored-tarball, upload-libseccomp-tarball ]
runs-on: ubuntu-22.04


@@ -30,12 +30,13 @@ jobs:
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- name: Adjust a permission for repo
-run: sudo chown -R $USER:$USER $GITHUB_WORKSPACE
+run: sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"
- name: Prepare the self-hosted runner
+timeout-minutes: 15
run: |
-bash ${HOME}/scripts/prepare_runner.sh cri-containerd
-sudo rm -rf $GITHUB_WORKSPACE/*
+bash "${HOME}/scripts/prepare_runner.sh" cri-containerd
+sudo rm -rf "$GITHUB_WORKSPACE"/*
- uses: actions/checkout@v4
with:
@@ -49,6 +50,7 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
+timeout-minutes: 15
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
- name: get-kata-tarball
@@ -62,6 +64,6 @@ jobs:
- name: Run cri-containerd tests
run: bash tests/integration/cri-containerd/gha-run.sh run
- name: Cleanup actions for the self hosted runner
-run: ${HOME}/scripts/cleanup_runner.sh
+run: bash "${HOME}/scripts/cleanup_runner.sh"


@@ -56,6 +56,7 @@ jobs:
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
K8S_TEST_HOST_TYPE: all
+CONTAINER_RUNTIME: ${{ matrix.container_runtime }}
steps:
- uses: actions/checkout@v4
with:
@@ -85,11 +86,11 @@ jobs:
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Collect artifacts ${{ matrix.vmm }}
if: always()
run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts
@@ -98,7 +99,7 @@ jobs:
- name: Archive artifacts ${{ matrix.vmm }}
uses: actions/upload-artifact@v4
with:
-name: k8s-tests-${{ matrix.vmm }}-${{ matrix.snapshotter }}-${{ matrix.k8s }}-${{ matrix.instance }}-${{ inputs.tag }}
+name: k8s-tests-${{ matrix.vmm }}-${{ matrix.snapshotter }}-${{ matrix.k8s }}-${{ inputs.tag }}
path: /tmp/artifacts
retention-days: 1


@@ -44,9 +44,10 @@ jobs:
TARGET_ARCH: "ppc64le"
steps:
- name: Prepare the self-hosted runner
-run: |
-bash ${HOME}/scripts/prepare_runner.sh kubernetes
-sudo rm -rf $GITHUB_WORKSPACE/*
+timeout-minutes: 15
+run: |
+bash "${HOME}/scripts/prepare_runner.sh" kubernetes
+sudo rm -rf "$GITHUB_WORKSPACE"/*
- uses: actions/checkout@v4
with:
@@ -62,13 +63,13 @@ jobs:
- name: Install golang
run: |
./tests/install_go.sh -f -p
-echo "/usr/local/go/bin" >> $GITHUB_PATH
+echo "/usr/local/go/bin" >> "$GITHUB_PATH"
- name: Prepare the runner for k8s cluster creation
-run: bash ${HOME}/scripts/k8s_cluster_cleanup.sh
+run: bash "${HOME}/scripts/k8s_cluster_cleanup.sh"
- name: Create k8s cluster using kubeadm
-run: bash ${HOME}/scripts/k8s_cluster_create.sh
+run: bash "${HOME}/scripts/k8s_cluster_create.sh"
- name: Deploy Kata
timeout-minutes: 10
@@ -79,4 +80,4 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete cluster and post cleanup actions
-run: bash ${HOME}/scripts/k8s_cluster_cleanup.sh
+run: bash "${HOME}/scripts/k8s_cluster_cleanup.sh"


@@ -36,7 +36,7 @@ jobs:
- qemu-runtime-rs
- qemu-coco-dev
k8s:
-- k3s
+- kubeadm
include:
- snapshotter: devmapper
pull-type: default
@@ -88,18 +88,15 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Set SNAPSHOTTER to empty if overlayfs
-run: echo "SNAPSHOTTER=" >> $GITHUB_ENV
+run: echo "SNAPSHOTTER=" >> "$GITHUB_ENV"
if: ${{ matrix.snapshotter == 'overlayfs' }}
- name: Set KBS and KBS_INGRESS if qemu-coco-dev
run: |
-echo "KBS=true" >> $GITHUB_ENV
-echo "KBS_INGRESS=nodeport" >> $GITHUB_ENV
+echo "KBS=true" >> "$GITHUB_ENV"
+echo "KBS_INGRESS=nodeport" >> "$GITHUB_ENV"
if: ${{ matrix.vmm == 'qemu-coco-dev' }}
- name: Deploy ${{ matrix.k8s }}
run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s
-# qemu-runtime-rs only works with overlayfs
-# See: https://github.com/kata-containers/kata-containers/issues/10066
- name: Configure the ${{ matrix.snapshotter }} snapshotter
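The `"$GITHUB_ENV"` quoting in the hunk above matters because the variable holds the path of a file that GitHub Actions reads back as `KEY=value` pairs for later steps; in bash, an unquoted redirection target containing whitespace fails with an "ambiguous redirect" error. A sketch with a temporary file standing in for the real one:

```shell
# Illustrative only: mktemp stands in for the runner-provided env file.
GITHUB_ENV=$(mktemp)
echo "KBS=true" >> "$GITHUB_ENV"
echo "KBS_INGRESS=nodeport" >> "$GITHUB_ENV"
# Later steps would see these as environment variables; here we just
# show the file contents that the runner would parse.
cat "$GITHUB_ENV"
```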


@@ -21,6 +21,9 @@ on:
required: false
type: string
default: ""
+tarball-suffix:
+required: false
+type: string
jobs:
# Generate jobs for testing CoCo on non-TEE environments
@@ -34,13 +37,12 @@ jobs:
- nydus
pull-type:
- guest-pull
-runs-on: ubuntu-latest
+runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
-KATA_HOST_OS: ${{ matrix.host_os }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
# Some tests rely on that variable to run (or not)
KBS: "true"
@@ -64,6 +66,15 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
+- name: get-kata-tarball
+uses: actions/download-artifact@v4
+with:
+name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
+path: kata-artifacts
+- name: Install kata
+run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli


@@ -2,6 +2,9 @@ name: CI | Run kata coco tests
on:
workflow_call:
inputs:
+tarball-suffix:
+required: false
+type: string
registry:
required: true
type: string
@@ -33,7 +36,15 @@ jobs:
- nydus
pull-type:
- guest-pull
-runs-on: tdx
+k8s-test-host-type:
+- baremetal-attestation
+- baremetal-no-attestation
+include:
+- k8s-test-host-type: baremetal-attestation
+machine: tdx-attestation
+- k8s-test-host-type: baremetal-no-attestation
+machine: tdx-no-attestation
+runs-on: ${{ matrix.machine }}
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
@@ -43,7 +54,7 @@ jobs:
KUBERNETES: "vanilla"
USING_NFD: "true"
KBS: "true"
-K8S_TEST_HOST_TYPE: "baremetal"
+K8S_TEST_HOST_TYPE: ${{ matrix.k8s-test-host-type }}
KBS_INGRESS: "nodeport"
SNAPSHOTTER: ${{ matrix.snapshotter }}
PULL_TYPE: ${{ matrix.pull-type }}
@@ -72,17 +83,24 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx
- name: Uninstall previous `kbs-client`
+if: ${{ matrix.machine != 'tdx-no-attestation' }}
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client
- name: Deploy CoCo KBS
+if: ${{ matrix.machine != 'tdx-no-attestation' }}
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
- name: Install `kbs-client`
+if: ${{ matrix.machine != 'tdx-no-attestation' }}
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
+- name: Deploy CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
timeout-minutes: 100
run: bash tests/integration/kubernetes/gha-run.sh run-tests
@@ -96,9 +114,13 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter
- name: Delete CoCo KBS
-if: always()
+if: ${{ always() && matrix.machine != 'tdx-no-attestation' }}
run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
+- name: Delete CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver
run-k8s-tests-on-sev:
strategy:
fail-fast: false
@@ -145,10 +167,18 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-sev
+- name: Deploy CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
timeout-minutes: 50
run: bash tests/integration/kubernetes/gha-run.sh run-tests
+- name: Delete CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver
- name: Delete kata-deploy
if: always()
run: bash tests/integration/kubernetes/gha-run.sh cleanup-sev
@@ -217,6 +247,10 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
+- name: Deploy CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
timeout-minutes: 50
run: bash tests/integration/kubernetes/gha-run.sh run-tests
@@ -233,6 +267,10 @@ jobs:
if: always()
run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
+- name: Delete CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver
# Generate jobs for testing CoCo on non-TEE environments
run-k8s-tests-coco-nontee:
strategy:
@@ -250,7 +288,6 @@ jobs:
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
-KATA_HOST_OS: ${{ matrix.host_os }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
# Some tests rely on that variable to run (or not)
KBS: "true"
@@ -262,6 +299,7 @@ jobs:
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
SNAPSHOTTER: ${{ matrix.snapshotter }}
USING_NFD: "false"
+AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:
@@ -274,6 +312,15 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
+- name: get-kata-tarball
+uses: actions/download-artifact@v4
+with:
+name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
+path: kata-artifacts
+- name: Install kata
+run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
@@ -314,8 +361,12 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
+- name: Deploy CSI driver
+timeout-minutes: 5
+run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver
- name: Run tests
-timeout-minutes: 60
+timeout-minutes: 80
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete AKS cluster


@@ -48,7 +48,7 @@ jobs:
# all the tests due to a single flaky instance.
fail-fast: false
matrix:
-vmm: ['clh', 'qemu', 'stratovirt']
+vmm: ['clh', 'qemu']
max-parallel: 1
runs-on: metrics
env:


@@ -15,6 +15,8 @@ on:
jobs:
run-runk:
+# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
+if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts


@@ -31,8 +31,8 @@ jobs:
run: |
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"
-modified_files=$(git diff --name-only origin/$GITHUB_BASE_REF..HEAD)
-if git diff --name-only origin/$GITHUB_BASE_REF..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then
+modified_files=$(git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD)
+if git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then
echo "Kernel directory has changed, checking if $kernel_version_file has been updated"
if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then
echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)
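The version-bump gate in the hunk above can be exercised in isolation. A sketch with a hypothetical list of modified files (the paths below are illustrative, not from a real PR), showing the "please bump" branch firing when the kernel directory changes but `kata_config_version` does not:

```shell
# Standalone sketch of the kernel config-version gate; not the workflow itself.
kernel_dir="tools/packaging/kernel/"
kernel_version_file="${kernel_dir}kata_config_version"
# Hypothetical `git diff --name-only` output: one kernel change, one doc change.
modified_files="tools/packaging/kernel/configs/example.conf
docs/README.md"

result="ok"
# A non-README change under the kernel dir requires the version file to move too.
if echo "$modified_files" | grep -v "README.md" | grep -q "${kernel_dir}"; then
  if echo "$modified_files" | grep -q "$kernel_version_file"; then
    result="ok"
  else
    result="needs version bump"
  fi
fi
echo "$result"   # needs version bump
```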
@@ -107,19 +107,19 @@ jobs:
path: ./src/github.com/${{ github.repository }}
- name: Install yq
run: |
-cd ${GOPATH}/src/github.com/${{ github.repository }}
+cd "${GOPATH}/src/github.com/${{ github.repository }}"
./ci/install_yq.sh
env:
INSTALL_IN_GOPATH: false
- name: Install golang
run: |
-cd ${GOPATH}/src/github.com/${{ github.repository }}
+cd "${GOPATH}/src/github.com/${{ github.repository }}"
./tests/install_go.sh -f -p
-echo "/usr/local/go/bin" >> $GITHUB_PATH
+echo "/usr/local/go/bin" >> "$GITHUB_PATH"
- name: Install system dependencies
run: |
sudo apt-get -y install moreutils hunspell hunspell-en-gb hunspell-en-us pandoc
- name: Run check
run: |
-export PATH=${PATH}:${GOPATH}/bin
-cd ${GOPATH}/src/github.com/${{ github.repository }} && ${{ matrix.cmd }}
+export PATH="${PATH}:${GOPATH}/bin"
+cd "${GOPATH}/src/github.com/${{ github.repository }}" && ${{ matrix.cmd }}


@@ -1 +1 @@
-3.10.1
+3.13.0


@@ -41,7 +41,7 @@ responsible for ensuring that:
### Jobs that require a maintainer's approval to run
-These are the required tests, and our so-called "CI". These require a
+There are some tests, and our so-called "CI". These require a
maintainer's approval to run as parts of those jobs will be running on "paid
runners", which are currently using Azure infrastructure.
@@ -77,11 +77,11 @@ them to merely debug issues.
In the previous section we've mentioned using different runners, now in this section we'll go through each type of runner used.
-- Cost free runners: Those are the runners provided by Github itself, and
-those are fairly small machines with virtualization capabilities enabled.
+- Cost free runners: Those are the runners provided by GitHub itself, and
+those are fairly small machines with virtualization capabilities enabled.
- Azure small instances: Those are runners which have virtualization
capabilities enabled, 2 CPUs, and 8GB of RAM. These runners have a "-smaller"
-suffix to their name.
+suffix to their name.
- Azure normal instances: Those are runners which have virtualization
capabilities enabled, 4 CPUs, and 16GB of RAM. These runners are usually
`garm` ones with no "-smaller" suffix.
@@ -91,7 +91,7 @@ In the previous section we've mentioned using different runners, now in this sec
runners which will be actually performing the tests must have virtualization
capabilities and a reasonable amount for CPU and RAM available (at least
matching the Azure normal instances).
## Adding new tests
Before someone decides to add a new test, we strongly recommend them to go
@@ -138,6 +138,63 @@ Following those examples, the community advice during the review, and even
asking the community directly on Slack are the best ways to get your test
accepted.
+## Required tests
+In our CI we have two categories of jobs - required and non-required:
+- Required jobs need to all pass for a PR to be merged normally and
+should cover all the core features on Kata Containers that we want to
+ensure don't have regressions.
+- The non-required jobs are for unstable tests, or for features that
+are experimental and not-fully supported. We'd like those tests to also
+pass on all PRs ideally, but don't block merging if they don't as it's
+not necessarily an indication of the PR code causing regressions.
+### Transitioning between required and non-required status
+Required jobs that fail block merging of PRs, so we want to ensure that
+jobs are stable and maintained before we make them required.
+The [Kata Containers CI Dashboard](https://kata-containers.github.io/)
+is a useful resource to check when collecting evidence of job stability.
+At time of writing it reports the last ten days of Kata CI nightly test
+results for each job. This isn't perfect as it doesn't currently capture
+results on PRs, but is a good guideline for stability.
+> [!NOTE]
+> Below are general guidelines about jobs being marked as
+> required/non-required, but they are subject to change and the Kata
+> Architecture Committee may overrule these guidelines at their
+> discretion.
+#### Initial marking as required
+For new jobs, or jobs that haven't been marked as required recently,
+the criteria to be initially marked as required is ten days
+of passing tests, with no relevant PR failures reported in that time.
+Required jobs also need one or more nominated maintainers that are
+responsible for the stability of their jobs.
+> [!NOTE]
+> We don't currently have a good place to record the job maintainers, but
+> once we have this, the intention is to show it on the CI Dashboard so
+> people can find the contact easily.
+#### Expectation of required job maintainers
+Due to the nature of the Kata Containers community having contributors
+spread around the world, required jobs being blocked due to infrastructure,
+or test issues can have a big impact on work. As such, the expectation is
+that when a problem with a required job is noticed/reported, the maintainers
+have one working day to acknowledge the issue, perform an initial
+investigation and then either fix it, or get it marked as non-required
+whilst the investigation and/or fix it done.
+### Re-marking of required status
+Once a job has been removed from the required list, it requires two
+consecutive successful nightly test runs before being made required
+again.
## Running tests
### Running the tests as part of the CI
@@ -247,7 +304,7 @@ $ git remote add upstream https://github.com/kata-containers/kata-containers
$ git remote update
$ git config --global user.email "you@example.com"
$ git config --global user.name "Your Name"
-$ git rebase upstream/main
+$ git rebase upstream/main
```
Now copy the `kata-static.tar.xz` into your `kata-containers/kata-artifacts` directory
@@ -261,7 +318,7 @@ $ cp ../kata-static.tar.xz kata-artifacts/
> If you downloaded the .zip from GitHub you need to uncompress first to see `kata-static.tar.xz`
And finally run the tests following what's in the yaml file for the test you're
-debugging.
+debugging.
In our case, the `run-nerdctl-tests-on-garm.yaml`.
@@ -284,7 +341,7 @@ $ bash tests/integration/nerdctl/gha-run.sh run
And with this you should've been able to reproduce exactly the same issue found
in the CI, and from now on you can build your own code, use your own binaries,
-and have fun debugging and hacking!
+and have fun debugging and hacking!
### Debugging a Kubernetes test
@@ -332,7 +389,7 @@ If you want to remove a current self-hosted runner:
- For each runner there's a "..." menu, where you can just click and the
"Remove runner" option will show up
## Known limitations
As the GitHub actions are structured right now we cannot: Test the addition of a


@@ -13,7 +13,7 @@ metadata:
spec:
containers:
- name: http-server
-image: registry.fedoraproject.org/fedora
+image: docker.io/library/python:3
ports:
- containerPort: 8080
command: ["python3"]


@@ -627,7 +627,7 @@ the following steps (using rootfs or initrd image).
>
> Look for `INIT_PROCESS=systemd` in the `config.sh` osbuilder rootfs config file
> to verify an osbuilder distro supports systemd for the distro you want to build rootfs for.
-> For an example, see the [Clear Linux config.sh file](../tools/osbuilder/rootfs-builder/clearlinux/config.sh).
+> For an example, see the [Ubuntu config.sh file](../tools/osbuilder/rootfs-builder/ubuntu/config.sh).
>
> For a non-systemd-based distro, create an equivalent system
> service using that distros init system syntax. Alternatively, you can build a distro


@@ -45,6 +45,14 @@ branch to read only whilst the release action runs.
> [!NOTE]
> Admin permission is needed to complete this task.
+### Wait for the `VERSION` bump PR payload publish to complete
+To reduce the chance of need to re-run the release workflow, check the
+[CI | Publish Kata Containers payload](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml)
+once the `VERSION` PR bump has merged to check that the assets build correctly
+and are cached, so that the release process can just download these artifacts
+rather than needing to build them all, which takes time and can reveal errors in infra.
### Check GitHub Actions
We make use of [GitHub actions](https://github.com/features/actions) in the


@@ -135,7 +135,7 @@ See also the [process overview](README.md#process-overview).
| Image type | Default distro | Init daemon | Reason | Notes |
|-|-|-|-|-|
-| [image](background.md#root-filesystem-image) | [Clear Linux](https://clearlinux.org) (for x86_64 systems)| systemd | Minimal and highly optimized | systemd offers flexibility |
+| [image](background.md#root-filesystem-image) | [Ubuntu](https://ubuntu.com) (for x86_64 systems) | systemd | Fully tested in our CI | systemd offers flexibility |
| [initrd](#initrd-image) | [Alpine Linux](https://alpinelinux.org) | Kata [agent](README.md#agent) (as no systemd support) | Security hardened and tiny C library |
See also:


@@ -98,8 +98,7 @@ of Kata Containers, the Cloud Hypervisor configuration supports both CPU
and memory resize, device hotplug (disk and VFIO), file-system sharing through virtio-fs,
block-based volumes, booting from VM images backed by pmem device, and
fine-grained seccomp filters for each VMM threads (e.g. all virtio
-device worker threads). Please check [this GitHub Project](https://github.com/orgs/kata-containers/projects/21)
-for details of ongoing integration efforts.
+device worker threads).
Devices and features used:
- virtio VSOCK or virtio serial


@@ -29,16 +29,16 @@ __SNP-specific steps:__
- Build the SNP-specific kernel as shown below (see this [guide](../../tools/packaging/kernel/README.md#build-kata-containers-kernel) for more information)
```bash
$ pushd kata-containers/tools/packaging/
-$ ./kernel/build-kernel.sh -a x86_64 -x snp setup
-$ ./kernel/build-kernel.sh -a x86_64 -x snp build
-$ sudo -E PATH="${PATH}" ./kernel/build-kernel.sh -x snp install
+$ ./kernel/build-kernel.sh -a x86_64 -x setup
+$ ./kernel/build-kernel.sh -a x86_64 -x build
+$ sudo -E PATH="${PATH}" ./kernel/build-kernel.sh -x install
$ popd
```
- Build a current OVMF capable of SEV-SNP:
```bash
$ pushd kata-containers/tools/packaging/static-build/ovmf
-$ ./build.sh
-$ tar -xvf edk2-x86_64.tar.gz
+$ ovmf_build=sev ./build.sh
+$ tar -xvf edk2-sev.tar.gz
$ popd
```
- Build a custom QEMU
@@ -106,7 +106,7 @@ sev_snp_guest = true
```
- Configure an OVMF (add path)
```toml
-firmware = "/path/to/kata-containers/tools/packaging/static-build/ovmf/opt/kata/share/ovmf/OVMF.fd"
+firmware = "/path/to/kata-containers/tools/packaging/static-build/ovmf/opt/kata/share/ovmf/AMDSEV.fd"
```
- SNP attestation (add cert-chain to default path or add the path with cert-chain)
```toml


@@ -94,6 +94,8 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.virtio_fs_extra_args` | string | extra options passed to `virtiofs` daemon |
| `io.katacontainers.config.hypervisor.enable_guest_swap` | `boolean` | enable swap in the guest |
| `io.katacontainers.config.hypervisor.use_legacy_serial` | `boolean` | uses legacy serial device for guest's console (QEMU) |
+| `io.katacontainers.config.hypervisor.default_gpus` | uint32 | the minimum number of GPUs required for the VM. Only used by remote hypervisor to help with instance selection |
+| `io.katacontainers.config.hypervisor.default_gpu_model` | string | the GPU model required for the VM. Only used by remote hypervisor to help with instance selection |
## Container Options
| Key | Value Type | Comments |

src/agent/Cargo.lock (generated; 391 changed lines)

@@ -64,6 +64,20 @@ dependencies = [
"version_check",
]
+[[package]]
+name = "ahash"
+version = "0.8.11"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "e89da841a80418a9b391ebaea17f5c112ffaaa96f621d2c285b5174da76b9011"
+dependencies = [
+"cfg-if 1.0.0",
+"getrandom",
+"once_cell",
+"serde",
+"version_check",
+"zerocopy",
+]
[[package]]
name = "aho-corasick"
version = "1.1.3"
@@ -97,6 +111,55 @@ dependencies = [
"winapi",
]
+[[package]]
+name = "anstream"
+version = "0.6.15"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "64e15c1ab1f89faffbf04a634d5e1962e9074f2741eef6d97f3c4e322426d526"
+dependencies = [
+"anstyle",
+"anstyle-parse",
+"anstyle-query",
+"anstyle-wincon",
+"colorchoice",
+"is_terminal_polyfill",
+"utf8parse",
+]
+[[package]]
+name = "anstyle"
+version = "1.0.8"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1bec1de6f59aedf83baf9ff929c98f2ad654b97c9510f4e70cf6f661d49fd5b1"
+[[package]]
+name = "anstyle-parse"
+version = "0.2.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "eb47de1e80c2b463c735db5b217a0ddc39d612e7ac9e2e96a5aed1f57616c1cb"
+dependencies = [
+"utf8parse",
+]
+[[package]]
+name = "anstyle-query"
+version = "1.1.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "6d36fc52c7f6c869915e99412912f22093507da8d9e942ceaf66fe4b7c14422a"
+dependencies = [
+"windows-sys 0.52.0",
+]
+[[package]]
+name = "anstyle-wincon"
+version = "3.0.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5bf74e1b6e971609db8ca7a9ce79fd5768ab6ae46441c572e46cf596f59e57f8"
+dependencies = [
+"anstyle",
+"windows-sys 0.52.0",
+]
[[package]]
name = "anyhow"
version = "1.0.86"
@@ -712,6 +775,12 @@ dependencies = [
"syn 1.0.109",
]
+[[package]]
+name = "bytecount"
+version = "0.6.8"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5ce89b21cab1437276d2650d57e971f9d548a2d9037cc231abdc0562b97498ce"
[[package]]
name = "byteorder"
version = "1.5.0"
@@ -814,6 +883,30 @@ dependencies = [
"libc",
]
+[[package]]
+name = "cdi"
+version = "0.1.0"
+source = "git+https://github.com/cncf-tags/container-device-interface-rs?rev=fba5677a8e7cc962fc6e495fcec98d7d765e332a#fba5677a8e7cc962fc6e495fcec98d7d765e332a"
+dependencies = [
+"anyhow",
+"clap 4.5.13",
+"const_format",
+"jsonschema",
+"lazy_static",
+"libc",
+"nix 0.24.3",
+"notify",
+"oci-spec",
+"once_cell",
+"path-clean",
+"regex",
+"semver",
+"serde",
+"serde_derive",
+"serde_json",
+"serde_yaml",
+]
[[package]]
name = "cesu8"
version = "1.1.0"
@@ -914,8 +1007,8 @@ checksum = "4ea181bf566f71cb9a5d17a59e1871af638180a18fb0035c92ae62b705207123"
dependencies = [
"atty",
"bitflags 1.3.2",
"clap_derive",
"clap_lex",
"clap_derive 3.2.25",
"clap_lex 0.2.4",
"indexmap 1.9.3",
"once_cell",
"strsim 0.10.0",
@@ -923,6 +1016,28 @@ dependencies = [
"textwrap",
]
+[[package]]
+name = "clap"
+version = "4.5.13"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0fbb260a053428790f3de475e304ff84cdbc4face759ea7a3e64c1edd938a7fc"
+dependencies = [
+"clap_builder",
+"clap_derive 4.5.13",
+]
+[[package]]
+name = "clap_builder"
+version = "4.5.13"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "64b17d7ea74e9f833c7dbf2cbe4fb12ff26783eda4782a8975b72f895c9b4d99"
+dependencies = [
+"anstream",
+"anstyle",
+"clap_lex 0.7.2",
+"strsim 0.11.1",
+]
[[package]]
name = "clap_derive"
version = "3.2.25"
@@ -936,6 +1051,18 @@ dependencies = [
"syn 1.0.109",
]
+[[package]]
+name = "clap_derive"
+version = "4.5.13"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "501d359d5f3dcaf6ecdeee48833ae73ec6e42723a1e52419c79abf9507eec0a0"
+dependencies = [
+"heck 0.5.0",
+"proc-macro2",
+"quote",
+"syn 2.0.71",
+]
[[package]]
name = "clap_lex"
version = "0.2.4"
@@ -945,6 +1072,12 @@ dependencies = [
"os_str_bytes",
]
+[[package]]
+name = "clap_lex"
+version = "0.7.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1462739cb27611015575c0c11df5df7601141071f07518d56fcc1be504cbec97"
[[package]]
name = "cmac"
version = "0.7.2"
@@ -967,6 +1100,12 @@ dependencies = [
"wasm-bindgen",
]
+[[package]]
+name = "colorchoice"
+version = "1.0.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "d3fd119d74b830634cea2a0f58bbd0d54540518a14397557951e79340abc28c0"
[[package]]
name = "combine"
version = "4.6.7"
@@ -1741,6 +1880,17 @@ dependencies = [
"rand",
]
+[[package]]
+name = "fancy-regex"
+version = "0.13.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "531e46835a22af56d1e3b66f04844bed63158bc094a628bec1d321d9b4c44bf2"
+dependencies = [
+"bit-set",
+"regex-automata 0.4.7",
+"regex-syntax 0.8.4",
+]
[[package]]
name = "fastrand"
version = "1.9.0"
@@ -1812,6 +1962,15 @@ dependencies = [
"miniz_oxide",
]
+[[package]]
+name = "fluent-uri"
+version = "0.1.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "17c704e9dbe1ddd863da1e6ff3567795087b1eb201ce80d8fa81162e1516500d"
+dependencies = [
+"bitflags 1.3.2",
+]
[[package]]
name = "fnv"
version = "1.0.7"
@@ -1827,6 +1986,25 @@ dependencies = [
"percent-encoding",
]
+[[package]]
+name = "fraction"
+version = "0.15.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0f158e3ff0a1b334408dc9fb811cd99b446986f4d8b741bb08f9df1604085ae7"
+dependencies = [
+"lazy_static",
+"num",
+]
+[[package]]
+name = "fsevent-sys"
+version = "4.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "76ee7a02da4d231650c7cea31349b889be2f45ddb3ef3032d2ec8185f6313fd2"
+dependencies = [
+"libc",
+]
[[package]]
name = "funty"
version = "2.0.0"
@@ -2052,7 +2230,7 @@ version = "0.12.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a9ee70c43aaf417c914396645a0fa852624801b24ebb7ae78fe8272889ac888"
dependencies = [
"ahash",
"ahash 0.7.8",
]
[[package]]
@@ -2605,6 +2783,21 @@ dependencies = [
"windows-sys 0.52.0",
]
+[[package]]
+name = "is_terminal_polyfill"
+version = "1.70.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "7943c866cc5cd64cbc25b2e01621d07fa8eb2a1a23160ee81ce38704e97b8ecf"
+[[package]]
+name = "iso8601"
+version = "0.6.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "924e5d73ea28f59011fec52a0d12185d496a9b075d360657aed2a5707f701153"
+dependencies = [
+"nom",
+]
[[package]]
name = "itertools"
version = "0.10.5"
@@ -2623,15 +2816,6 @@ dependencies = [
"either",
]
-[[package]]
-name = "itertools"
-version = "0.12.1"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "ba291022dbbd398a455acf126c1e341954079855bc60dfdda641363bd6922569"
-dependencies = [
-"either",
-]
[[package]]
name = "itoa"
version = "1.0.11"
@@ -2690,6 +2874,18 @@ dependencies = [
"smallvec",
]
+[[package]]
+name = "json-patch"
+version = "2.0.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5b1fb8864823fad91877e6caea0baca82e49e8db50f8e5c9f9a453e27d3330fc"
+dependencies = [
+"jsonptr",
+"serde",
+"serde_json",
+"thiserror",
+]
[[package]]
name = "json-syntax"
version = "0.12.5"
@@ -2709,6 +2905,47 @@ dependencies = [
"utf8-decode",
]
+[[package]]
+name = "jsonptr"
+version = "0.4.7"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1c6e529149475ca0b2820835d3dce8fcc41c6b943ca608d32f35b449255e4627"
+dependencies = [
+"fluent-uri",
+"serde",
+"serde_json",
+]
+[[package]]
+name = "jsonschema"
+version = "0.18.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ec0afd06142c9bcb03f4a8787c77897a87b6be9c4918f1946c33caa714c27578"
+dependencies = [
+"ahash 0.8.11",
+"anyhow",
+"base64 0.22.1",
+"bytecount",
+"clap 4.5.13",
+"fancy-regex",
+"fraction",
+"getrandom",
+"iso8601",
+"itoa",
+"memchr",
+"num-cmp",
+"once_cell",
+"parking_lot 0.12.3",
+"percent-encoding",
+"regex",
+"reqwest",
+"serde",
+"serde_json",
+"time",
+"url",
+"uuid",
+]
[[package]]
name = "jwt"
version = "0.16.0"
@@ -2773,20 +3010,23 @@ dependencies = [
"async-std",
"async-trait",
"capctl",
"cdi",
"cfg-if 1.0.0",
"cgroups-rs",
"clap",
"clap 3.2.25",
"const_format",
"derivative",
"futures",
"image-rs",
"ipnetwork",
"json-patch",
"kata-sys-util",
"kata-types",
"lazy_static",
"libc",
"log",
"logging",
"mem-agent",
"netlink-packet-utils",
"netlink-sys",
"nix 0.24.3",
@@ -2835,7 +3075,6 @@ version = "0.1.0"
dependencies = [
"anyhow",
"byteorder",
"cgroups-rs",
"chrono",
"common-path",
"fail",
@@ -2934,6 +3173,26 @@ dependencies = [
"zeroize",
]
[[package]]
name = "kqueue"
version = "1.0.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7447f1ca1b7b563588a205fe93dea8df60fd981423a768bc1c0ded35ed147d0c"
dependencies = [
"kqueue-sys",
"libc",
]
[[package]]
name = "kqueue-sys"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed9625ffda8729b85e45cf04090035ac368927b8cebc34898e7c120f52e4838b"
dependencies = [
"bitflags 1.3.2",
"libc",
]
[[package]]
name = "krata-tokio-tar"
version = "0.4.2"
@@ -3225,6 +3484,21 @@ dependencies = [
"digest",
]
[[package]]
name = "mem-agent"
version = "0.1.0"
dependencies = [
"anyhow",
"async-trait",
"chrono",
"lazy_static",
"nix 0.23.2",
"page_size",
"slog",
"slog-scope",
"tokio",
]
[[package]]
name = "memchr"
version = "2.7.4"
@@ -3285,6 +3559,18 @@ dependencies = [
"adler",
]
[[package]]
name = "mio"
version = "0.8.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a4a650543ca06a924e8b371db273b2756685faae30f8487da1b56505a8f78b0c"
dependencies = [
"libc",
"log",
"wasi",
"windows-sys 0.48.0",
]
[[package]]
name = "mio"
version = "1.0.2"
@@ -3465,6 +3751,25 @@ dependencies = [
"minimal-lexical",
]
[[package]]
name = "notify"
version = "6.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6205bd8bb1e454ad2e27422015fb5e4f2bcc7e08fa8f27058670d208324a4d2d"
dependencies = [
"bitflags 2.6.0",
"crossbeam-channel",
"filetime",
"fsevent-sys",
"inotify",
"kqueue",
"libc",
"log",
"mio 0.8.11",
"walkdir",
"windows-sys 0.48.0",
]
[[package]]
name = "ntapi"
version = "0.4.1"
@@ -3515,6 +3820,12 @@ dependencies = [
"zeroize",
]
[[package]]
name = "num-cmp"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "63335b2e2c34fae2fb0aa2cecfd9f0832a1e24b3b32ecec612c3426d46dc8aaa"
[[package]]
name = "num-complex"
version = "0.4.6"
@@ -3813,6 +4124,16 @@ dependencies = [
"sha2",
]
[[package]]
name = "page_size"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "30d5b2194ed13191c1999ae0704b7839fb18384fa22e49b57eeaa97d79ce40da"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "parking"
version = "2.2.0"
@@ -3894,6 +4215,12 @@ dependencies = [
"slash-formatter",
]
[[package]]
name = "path-clean"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "17359afc20d7ab31fdb42bb844c8b3bb1dabd7dcf7e68428492da7f16966fcef"
[[package]]
name = "path-dedot"
version = "1.2.4"
@@ -4376,7 +4703,6 @@ name = "protocols"
version = "0.1.0"
dependencies = [
"async-trait",
"kata-sys-util",
"oci-spec",
"protobuf 3.5.1",
"serde",
@@ -4626,14 +4952,12 @@ checksum = "7a66a03ae7c801facd77a29370b4faec201768915ac14a721ba36f20bc9c209b"
[[package]]
name = "regorus"
version = "0.1.5"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77dd872918e5c172bd42ac49716f89a15e35be513bba3d902e355a531529a87f"
checksum = "843c3d97f07e3b5ac0955d53ad0af4c91fe4a4f8525843ece5bf014f27829b73"
dependencies = [
"anyhow",
"itertools 0.12.1",
"lazy_static",
"num",
"rand",
"regex",
"scientific",
@@ -4666,6 +4990,7 @@ dependencies = [
"bytes 1.6.1",
"cookie",
"cookie_store",
"futures-channel",
"futures-core",
"futures-util",
"http",
@@ -5919,7 +6244,7 @@ dependencies = [
"backtrace",
"bytes 1.6.1",
"libc",
"mio",
"mio 1.0.2",
"parking_lot 0.12.3",
"pin-project-lite",
"signal-hook-registry",
@@ -6346,6 +6671,12 @@ version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6c140620e7ffbb22c2dee59cafe6084a59b5ffc27a8859a5f0d494b5d52b6be"
[[package]]
name = "utf8parse"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
[[package]]
name = "uuid"
version = "1.10.0"
@@ -7021,6 +7352,26 @@ dependencies = [
"zvariant",
]
[[package]]
name = "zerocopy"
version = "0.7.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b9b4fd18abc82b8136838da5d50bae7bdea537c574d8dc1a34ed098d6c166f0"
dependencies = [
"zerocopy-derive",
]
[[package]]
name = "zerocopy-derive"
version = "0.7.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fa4f8080344d4671fb4e831a13ad1e68092748387dfc4f55e356242fae12ce3e"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.71",
]
[[package]]
name = "zerofrom"
version = "0.1.4"


@@ -7,6 +7,7 @@ license = "Apache-2.0"
[dependencies]
runtime-spec = { path = "../libs/runtime-spec" }
mem-agent = { path = "../mem-agent" }
oci-spec = { version = "0.6.8", features = ["runtime"] }
rustjail = { path = "rustjail" }
protocols = { path = "../libs/protocols", features = ["async", "with-serde"] }
@@ -80,10 +81,13 @@ strum_macros = "0.26.2"
image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "v0.10.0", default-features = false, optional = true }
# Agent Policy
regorus = { version = "0.1.4", default-features = false, features = [
regorus = { version = "0.2.6", default-features = false, features = [
"arc",
"regex",
"std",
], optional = true }
cdi = { git = "https://github.com/cncf-tags/container-device-interface-rs", rev = "fba5677a8e7cc962fc6e495fcec98d7d765e332a" }
json-patch = "2.0.0"
[dev-dependencies]
tempfile = "3.1.0"


@@ -233,7 +233,7 @@ pub fn init_rootfs(
// bind may be only specified in the oci spec options -> flags update r#type
let m = &{
let mut mbind = m.clone();
if mbind.typ().is_none() && flags & MsFlags::MS_BIND == MsFlags::MS_BIND {
if is_none_mount_type(mbind.typ()) && flags & MsFlags::MS_BIND == MsFlags::MS_BIND {
mbind.set_typ(Some("bind".to_string()));
}
mbind
@@ -397,6 +397,13 @@ fn mount_cgroups_v2(cfd_log: RawFd, m: &Mount, rootfs: &str, flags: MsFlags) ->
Ok(())
}
fn is_none_mount_type(typ: &Option<String>) -> bool {
match typ {
Some(t) => t == "none",
None => true,
}
}
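
The new `is_none_mount_type` helper treats both a missing mount type and the literal string `"none"` as "unspecified", which is why the bind-mount branch above can then set the type to `"bind"`. A standalone sketch of that contract (reproduced outside the crate's module layout, for illustration only):

```rust
// Standalone reproduction of the helper's behavior: a mount type of
// None or the literal "none" both mean the type was not specified.
fn is_none_mount_type(typ: &Option<String>) -> bool {
    match typ {
        Some(t) => t == "none",
        None => true,
    }
}

fn main() {
    assert!(is_none_mount_type(&None));
    assert!(is_none_mount_type(&Some("none".to_string())));
    assert!(!is_none_mount_type(&Some("bind".to_string())));
}
```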
fn mount_cgroups(
cfd_log: RawFd,
m: &Mount,


@@ -45,6 +45,27 @@ const IMAGE_POLICY_FILE: &str = "agent.image_policy_file";
const HTTPS_PROXY: &str = "agent.https_proxy";
const NO_PROXY: &str = "agent.no_proxy";
const MEM_AGENT_ENABLE: &str = "agent.mem_agent_enable";
const MEM_AGENT_MEMCG_DISABLE: &str = "agent.mem_agent_memcg_disable";
const MEM_AGENT_MEMCG_SWAP: &str = "agent.mem_agent_memcg_swap";
const MEM_AGENT_MEMCG_SWAPPINESS_MAX: &str = "agent.mem_agent_memcg_swappiness_max";
const MEM_AGENT_MEMCG_PERIOD_SECS: &str = "agent.mem_agent_memcg_period_secs";
const MEM_AGENT_MEMCG_PERIOD_PSI_PERCENT_LIMIT: &str =
"agent.mem_agent_memcg_period_psi_percent_limit";
const MEM_AGENT_MEMCG_EVICTION_PSI_PERCENT_LIMIT: &str =
"agent.mem_agent_memcg_eviction_psi_percent_limit";
const MEM_AGENT_MEMCG_EVICTION_RUN_AGING_COUNT_MIN: &str =
"agent.mem_agent_memcg_eviction_run_aging_count_min";
const MEM_AGENT_COMPACT_DISABLE: &str = "agent.mem_agent_compact_disable";
const MEM_AGENT_COMPACT_PERIOD_SECS: &str = "agent.mem_agent_compact_period_secs";
const MEM_AGENT_COMPACT_PERIOD_PSI_PERCENT_LIMIT: &str =
"agent.mem_agent_compact_period_psi_percent_limit";
const MEM_AGENT_COMPACT_PSI_PERCENT_LIMIT: &str = "agent.mem_agent_compact_psi_percent_limit";
const MEM_AGENT_COMPACT_SEC_MAX: &str = "agent.mem_agent_compact_sec_max";
const MEM_AGENT_COMPACT_ORDER: &str = "agent.mem_agent_compact_order";
const MEM_AGENT_COMPACT_THRESHOLD: &str = "agent.mem_agent_compact_threshold";
const MEM_AGENT_COMPACT_FORCE_TIMES: &str = "agent.mem_agent_compact_force_times";
const DEFAULT_LOG_LEVEL: slog::Level = slog::Level::Info;
const DEFAULT_HOTPLUG_TIMEOUT: time::Duration = time::Duration::from_secs(3);
const DEFAULT_CDH_API_TIMEOUT: time::Duration = time::Duration::from_secs(50);
@@ -131,6 +152,13 @@ pub struct AgentConfig {
pub image_policy_file: String,
#[cfg(feature = "agent-policy")]
pub policy_file: String,
pub mem_agent: Option<MemAgentConfig>,
}
#[derive(Debug, Default, PartialEq)]
pub struct MemAgentConfig {
pub memcg_config: mem_agent::memcg::Config,
pub compact_config: mem_agent::compact::Config,
}
#[derive(Debug, Deserialize)]
@@ -160,6 +188,22 @@ pub struct AgentConfigBuilder {
pub image_policy_file: Option<String>,
#[cfg(feature = "agent-policy")]
pub policy_file: Option<String>,
pub mem_agent_enable: Option<bool>,
pub mem_agent_memcg_disable: Option<bool>,
pub mem_agent_memcg_swap: Option<bool>,
pub mem_agent_memcg_swappiness_max: Option<u8>,
pub mem_agent_memcg_period_secs: Option<u64>,
pub mem_agent_memcg_period_psi_percent_limit: Option<u8>,
pub mem_agent_memcg_eviction_psi_percent_limit: Option<u8>,
pub mem_agent_memcg_eviction_run_aging_count_min: Option<u64>,
pub mem_agent_compact_disable: Option<bool>,
pub mem_agent_compact_period_secs: Option<u64>,
pub mem_agent_compact_period_psi_percent_limit: Option<u8>,
pub mem_agent_compact_psi_percent_limit: Option<u8>,
pub mem_agent_compact_sec_max: Option<i64>,
pub mem_agent_compact_order: Option<u8>,
pub mem_agent_compact_threshold: Option<u64>,
pub mem_agent_compact_force_times: Option<u64>,
}
macro_rules! config_override {
@@ -176,6 +220,14 @@ macro_rules! config_override {
};
}
macro_rules! mem_agent_config_override {
($builder_v:expr, $mac_v:expr) => {
if let Some(v) = $builder_v {
$mac_v = v;
}
};
}
// parse_cmdline_param parse commandline parameters.
macro_rules! parse_cmdline_param {
// commandline flags, without func to parse the option values
@@ -235,6 +287,7 @@ impl Default for AgentConfig {
image_policy_file: String::from(""),
#[cfg(feature = "agent-policy")]
policy_file: String::from(""),
mem_agent: None,
}
}
}
@@ -287,6 +340,75 @@ impl FromStr for AgentConfig {
#[cfg(feature = "agent-policy")]
config_override!(agent_config_builder, agent_config, policy_file);
if agent_config_builder.mem_agent_enable.unwrap_or(false) {
let mut mac = MemAgentConfig::default();
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_disable,
mac.memcg_config.disabled
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_swap,
mac.memcg_config.swap
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_swappiness_max,
mac.memcg_config.swappiness_max
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_period_secs,
mac.memcg_config.period_secs
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_period_psi_percent_limit,
mac.memcg_config.period_psi_percent_limit
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_eviction_psi_percent_limit,
mac.memcg_config.eviction_psi_percent_limit
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_eviction_run_aging_count_min,
mac.memcg_config.eviction_run_aging_count_min
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_disable,
mac.compact_config.disabled
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_period_secs,
mac.compact_config.period_secs
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_period_psi_percent_limit,
mac.compact_config.period_psi_percent_limit
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_psi_percent_limit,
mac.compact_config.compact_psi_percent_limit
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_sec_max,
mac.compact_config.compact_sec_max
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_order,
mac.compact_config.compact_order
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_threshold,
mac.compact_config.compact_threshold
);
mem_agent_config_override!(
agent_config_builder.mem_agent_compact_force_times,
mac.compact_config.compact_force_times
);
agent_config.mem_agent = Some(mac);
}
Ok(agent_config)
}
}
@@ -311,6 +433,8 @@ impl AgentConfig {
let mut config: AgentConfig = Default::default();
let cmdline = fs::read_to_string(file)?;
let params: Vec<&str> = cmdline.split_ascii_whitespace().collect();
let mut mem_agent_enable = false;
let mut mac = MemAgentConfig::default();
for param in params.iter() {
// If we get a configuration file path from the command line, we
// generate our config from it.
@@ -367,21 +491,21 @@ impl AgentConfig {
param,
DEBUG_CONSOLE_VPORT_OPTION,
config.debug_console_vport,
get_vsock_port,
get_number_value,
|port| port > 0
);
parse_cmdline_param!(
param,
LOG_VPORT_OPTION,
config.log_vport,
get_vsock_port,
get_number_value,
|port| port > 0
);
parse_cmdline_param!(
param,
PASSFD_LISTENER_PORT,
config.passfd_listener_port,
get_vsock_port,
get_number_value,
|port| port > 0
);
parse_cmdline_param!(
@@ -437,6 +561,105 @@ impl AgentConfig {
config.secure_storage_integrity,
get_bool_value
);
parse_cmdline_param!(param, MEM_AGENT_ENABLE, mem_agent_enable, get_bool_value);
if mem_agent_enable {
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_DISABLE,
mac.memcg_config.disabled,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_SWAP,
mac.memcg_config.swap,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_SWAPPINESS_MAX,
mac.memcg_config.swappiness_max,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_PERIOD_SECS,
mac.memcg_config.period_secs,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_PERIOD_PSI_PERCENT_LIMIT,
mac.memcg_config.period_psi_percent_limit,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_EVICTION_PSI_PERCENT_LIMIT,
mac.memcg_config.eviction_psi_percent_limit,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_EVICTION_RUN_AGING_COUNT_MIN,
mac.memcg_config.eviction_run_aging_count_min,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_DISABLE,
mac.compact_config.disabled,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_PERIOD_SECS,
mac.compact_config.period_secs,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_PERIOD_PSI_PERCENT_LIMIT,
mac.compact_config.period_psi_percent_limit,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_PSI_PERCENT_LIMIT,
mac.compact_config.compact_psi_percent_limit,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_SEC_MAX,
mac.compact_config.compact_sec_max,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_ORDER,
mac.compact_config.compact_order,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_THRESHOLD,
mac.compact_config.compact_threshold,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_COMPACT_FORCE_TIMES,
mac.compact_config.compact_force_times,
get_number_value
);
}
}
if mem_agent_enable {
config.mem_agent = Some(mac);
}
config.override_config_from_envs();
@@ -477,11 +700,19 @@ impl AgentConfig {
}
#[instrument]
fn get_vsock_port(p: &str) -> Result<i32> {
fn get_number_value<T>(p: &str) -> Result<T>
where
T: std::str::FromStr,
<T as std::str::FromStr>::Err: std::fmt::Debug,
{
let fields: Vec<&str> = p.split('=').collect();
ensure!(fields.len() == 2, "invalid port parameter");
if fields.len() != 2 {
return Err(anyhow!("format of {} is invalid", p));
}
Ok(fields[1].parse::<i32>()?)
fields[1]
.parse::<T>()
.map_err(|e| anyhow!("parse from {} failed: {:?}", &fields[1], e))
}
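
The diff above replaces the vsock-specific `get_vsock_port` with a generic `get_number_value<T>` that parses any `key=value` kernel-cmdline parameter whose value type implements `FromStr`. A self-contained sketch of that shape (using a plain `String` error in place of `anyhow`, for illustration):

```rust
use std::str::FromStr;

// Sketch of the generic "key=value" parser: split on '=', then parse
// the value with any FromStr type (i32 ports, bools, u64 periods, ...).
fn get_number_value<T>(p: &str) -> Result<T, String>
where
    T: FromStr,
    <T as FromStr>::Err: std::fmt::Debug,
{
    let fields: Vec<&str> = p.split('=').collect();
    if fields.len() != 2 {
        return Err(format!("format of {} is invalid", p));
    }
    fields[1]
        .parse::<T>()
        .map_err(|e| format!("parse from {} failed: {:?}", fields[1], e))
}

fn main() {
    let port: i32 = get_number_value("agent.log_vport=1025").unwrap();
    assert_eq!(port, 1025);
    let swap: bool = get_number_value("agent.mem_agent_memcg_swap=true").unwrap();
    assert!(swap);
    assert!(get_number_value::<u64>("no_equals_sign").is_err());
}
```

One type-parameterized helper covers all the `parse_cmdline_param!` call sites, which is why the three vport options above could switch from `get_vsock_port` to `get_number_value` unchanged apart from the function name.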
// Map logrus (https://godoc.org/github.com/sirupsen/logrus)
@@ -682,6 +913,7 @@ mod tests {
image_policy_file: &'a str,
#[cfg(feature = "agent-policy")]
policy_file: &'a str,
mem_agent: Option<MemAgentConfig>,
}
impl Default for TestData<'_> {
@@ -710,6 +942,7 @@ mod tests {
image_policy_file: "",
#[cfg(feature = "agent-policy")]
policy_file: "",
mem_agent: None,
}
}
}
@@ -1204,6 +1437,40 @@ mod tests {
policy_file: "/tmp/policy.rego",
..Default::default()
},
TestData {
contents: "",
..Default::default()
},
TestData {
contents: "agent.mem_agent_enable=1",
mem_agent: Some(MemAgentConfig::default()),
..Default::default()
},
TestData {
contents: "agent.mem_agent_enable=1\nagent.mem_agent_memcg_period_secs=300",
mem_agent: Some(MemAgentConfig {
memcg_config: mem_agent::memcg::Config {
period_secs: 300,
..Default::default()
},
..Default::default()
}),
..Default::default()
},
TestData {
contents: "agent.mem_agent_enable=1\nagent.mem_agent_memcg_period_secs=300\nagent.mem_agent_compact_order=6",
mem_agent: Some(MemAgentConfig {
memcg_config: mem_agent::memcg::Config {
period_secs: 300,
..Default::default()
},
compact_config: mem_agent::compact::Config {
compact_order: 6,
..Default::default()
},
}),
..Default::default()
},
];
let dir = tempdir().expect("failed to create tmpdir");
@@ -1281,6 +1548,8 @@ mod tests {
#[cfg(feature = "agent-policy")]
assert_eq!(d.policy_file, config.policy_file, "{}", msg);
assert_eq!(d.mem_agent, config.mem_agent, "{}", msg);
for v in vars_to_unset {
env::remove_var(v);
}
@@ -1525,6 +1794,7 @@ Caused by:
server_addr = 'vsock://8:2048'
guest_components_procs = "api-server-rest"
guest_components_rest_api = "all"
mem_agent_enable = true
"#,
)
.unwrap();
@@ -1543,5 +1813,7 @@ Caused by:
// Verify that the default values are valid
assert_eq!(config.hotplug_timeout, DEFAULT_HOTPLUG_TIMEOUT);
assert_eq!(config.mem_agent, Some(MemAgentConfig::default()),);
}
}


@@ -11,6 +11,9 @@ use self::vfio_device_handler::{VfioApDeviceHandler, VfioPciDeviceHandler};
use crate::pci;
use crate::sandbox::Sandbox;
use anyhow::{anyhow, Context, Result};
use cdi::annotations::parse_annotations;
use cdi::cache::{new_cache, with_auto_refresh, CdiOption};
use cdi::spec_dirs::with_spec_dirs;
use kata_types::device::DeviceHandlerManager;
use nix::sys::stat;
use oci::{LinuxDeviceCgroup, Spec};
@@ -25,6 +28,8 @@ use std::path::PathBuf;
use std::str::FromStr;
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::time;
use tokio::time::Duration;
use tracing::instrument;
pub mod block_device_handler;
@@ -238,6 +243,69 @@ pub async fn add_devices(
update_spec_devices(logger, spec, dev_updates)
}
#[instrument]
pub async fn handle_cdi_devices(
logger: &Logger,
spec: &mut Spec,
spec_dir: &str,
cdi_timeout: u64,
) -> Result<()> {
if let Some(container_type) = spec
.annotations()
.as_ref()
.and_then(|a| a.get("io.katacontainers.pkg.oci.container_type"))
{
if container_type == "pod_sandbox" {
return Ok(());
}
}
let (_, devices) = parse_annotations(spec.annotations().as_ref().unwrap())?;
if devices.is_empty() {
info!(logger, "no CDI annotations, no devices to inject");
return Ok(());
}
// Explicitly set the cache options to disable auto-refresh and
// to use the single spec dir "/var/run/cdi"; for tests it can be overridden
let options: Vec<CdiOption> = vec![with_auto_refresh(false), with_spec_dirs(&[spec_dir])];
let cache: Arc<std::sync::Mutex<cdi::cache::Cache>> = new_cache(options);
for _ in 0..=cdi_timeout {
let inject_result = {
// Lock cache within this scope, std::sync::Mutex has no Send
// and await will not work with time::sleep
let mut cache = cache.lock().unwrap();
match cache.refresh() {
Ok(_) => {}
Err(e) => {
return Err(anyhow!("error refreshing cache: {:?}", e));
}
}
cache.inject_devices(Some(spec), devices.clone())
};
match inject_result {
Ok(_) => {
info!(
logger,
"all devices injected successfully, modified CDI container spec: {:?}", &spec
);
return Ok(());
}
Err(e) => {
info!(logger, "error injecting devices: {:?}", e);
println!("error injecting devices: {:?}", e);
}
}
time::sleep(Duration::from_millis(1000)).await;
}
Err(anyhow!(
"failed to inject devices after CDI timeout of {} seconds",
cdi_timeout
))
}
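
The injection loop above retries once per second until the CDI spec becomes injectable or `cdi_timeout` seconds have elapsed. The same retry-until-timeout shape can be sketched generically; the `retry_for` name and synchronous `thread::sleep` here are illustrative stand-ins for the async `time::sleep` loop in the agent:

```rust
use std::time::Duration;

// Generic sketch of the retry pattern used in handle_cdi_devices:
// attempt an operation once per second until it succeeds or the
// timeout (in seconds) is exhausted. Note `0..=timeout` gives at
// least one attempt even with a timeout of 0, matching the source.
fn retry_for<T, E: std::fmt::Debug>(
    timeout_secs: u64,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, String> {
    for _ in 0..=timeout_secs {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) => eprintln!("attempt failed: {:?}", e),
        }
        std::thread::sleep(Duration::from_millis(1000));
    }
    Err(format!("failed after timeout of {} seconds", timeout_secs))
}

fn main() {
    let mut calls = 0;
    let res = retry_for(3, || {
        calls += 1;
        if calls < 2 { Err("not ready") } else { Ok(calls) }
    });
    assert_eq!(res.unwrap(), 2);
}
```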
#[instrument]
async fn validate_device(
logger: &Logger,
@@ -1110,4 +1178,94 @@ mod tests {
assert!(name.is_ok(), "{}", name.unwrap_err());
assert_eq!(name.unwrap(), devname);
}
#[tokio::test]
async fn test_handle_cdi_devices() {
let logger = slog::Logger::root(slog::Discard, o!());
let mut spec = Spec::default();
let mut annotations = HashMap::new();
// cdi.k8s.io/vendor1_devices: vendor1.com/device=foo
annotations.insert(
"cdi.k8s.io/vfio17".to_string(),
"kata.com/gpu=0".to_string(),
);
spec.set_annotations(Some(annotations));
let temp_dir = tempdir().expect("Failed to create temporary directory");
let cdi_file = temp_dir.path().join("kata.json");
let cdi_version = "0.6.0";
let kind = "kata.com/gpu";
let device_name = "0";
let annotation_whatever = "false";
let annotation_whenever = "true";
let inner_env = "TEST_INNER_ENV=TEST_INNER_ENV_VALUE";
let outer_env = "TEST_OUTER_ENV=TEST_OUTER_ENV_VALUE";
let inner_device = "/dev/zero";
let outer_device = "/dev/null";
let cdi_content = format!(
r#"{{
"cdiVersion": "{cdi_version}",
"kind": "{kind}",
"devices": [
{{
"name": "{device_name}",
"annotations": {{
"whatever": "{annotation_whatever}",
"whenever": "{annotation_whenever}"
}},
"containerEdits": {{
"env": [
"{inner_env}"
],
"deviceNodes": [
{{
"path": "{inner_device}"
}}
]
}}
}}
],
"containerEdits": {{
"env": [
"{outer_env}"
],
"deviceNodes": [
{{
"path": "{outer_device}"
}}
]
}}
}}"#
);
fs::write(&cdi_file, cdi_content).expect("Failed to write CDI file");
let res =
handle_cdi_devices(&logger, &mut spec, temp_dir.path().to_str().unwrap(), 0).await;
println!("modified spec {:?}", spec);
assert!(res.is_ok(), "{}", res.err().unwrap());
let linux = spec.linux().as_ref().unwrap();
let devices = linux
.resources()
.as_ref()
.unwrap()
.devices()
.as_ref()
.unwrap();
assert_eq!(devices.len(), 2);
let env = spec.process().as_ref().unwrap().env().as_ref().unwrap();
// find string TEST_OUTER_ENV in env
let outer_env = env.iter().find(|e| e.starts_with("TEST_OUTER_ENV"));
assert!(outer_env.is_some(), "TEST_OUTER_ENV not found in env");
// find TEST_INNER_ENV in env
let inner_env = env.iter().find(|e| e.starts_with("TEST_INNER_ENV"));
assert!(inner_env.is_some(), "TEST_INNER_ENV not found in env");
}
}


@@ -21,6 +21,9 @@ use tokio::sync::Mutex;
use crate::rpc::CONTAINER_BASE;
use crate::AGENT_CONFIG;
use kata_types::mount::KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL;
use protocols::agent::Storage;
pub const KATA_IMAGE_WORK_DIR: &str = "/run/kata-containers/image/";
const CONFIG_JSON: &str = "config.json";
const KATA_PAUSE_BUNDLE: &str = "/pause_bundle";
@@ -81,6 +84,28 @@ impl ImageService {
Self { image_client }
}
/// get guest pause image process specification
fn get_pause_image_process() -> Result<oci::Process> {
let guest_pause_bundle = Path::new(KATA_PAUSE_BUNDLE);
if !guest_pause_bundle.exists() {
bail!("Pause image not present in rootfs");
}
let guest_pause_config = scoped_join(guest_pause_bundle, CONFIG_JSON)?;
let image_oci = oci::Spec::load(guest_pause_config.to_str().ok_or_else(|| {
anyhow!(
"Failed to load the guest pause image config from {:?}",
guest_pause_config
)
})?)
.context("load image config file")?;
let image_oci_process = image_oci.process().as_ref().ok_or_else(|| {
anyhow!("The guest pause image config does not contain a process specification. Please check the pause image.")
})?;
Ok(image_oci_process.clone())
}
/// pause image is packaged in rootfs
fn unpack_pause_image(cid: &str) -> Result<String> {
verify_id(cid).context("The guest pause image cid contains invalid characters.")?;
@@ -132,6 +157,20 @@ impl ImageService {
Ok(pause_rootfs.display().to_string())
}
/// check whether the image is for sandbox or for container.
fn is_sandbox(image_metadata: &HashMap<String, String>) -> bool {
let mut is_sandbox = false;
for key in K8S_CONTAINER_TYPE_KEYS.iter() {
if let Some(value) = image_metadata.get(key as &str) {
if value == "sandbox" {
is_sandbox = true;
break;
}
}
}
is_sandbox
}
/// pull_image is used for call image-rs to pull image in the guest.
/// # Parameters
/// - `image`: Image name (exp: quay.io/prometheus/busybox:latest)
@@ -147,18 +186,7 @@ impl ImageService {
) -> Result<String> {
info!(sl(), "image metadata: {image_metadata:?}");
//Check whether the image is for sandbox or for container.
let mut is_sandbox = false;
for key in K8S_CONTAINER_TYPE_KEYS.iter() {
if let Some(value) = image_metadata.get(key as &str) {
if value == "sandbox" {
is_sandbox = true;
break;
}
}
}
if is_sandbox {
if Self::is_sandbox(image_metadata) {
let mount_path = Self::unpack_pause_image(cid)?;
return Ok(mount_path);
}
@@ -194,6 +222,32 @@ impl ImageService {
}
}
/// get_process overrides the OCI process spec with pause image process spec if needed
pub fn get_process(
ocip: &oci::Process,
oci: &oci::Spec,
storages: Vec<Storage>,
) -> Result<oci::Process> {
let mut guest_pull = false;
for storage in storages {
if storage.driver == KATA_VIRTUAL_VOLUME_IMAGE_GUEST_PULL {
guest_pull = true;
break;
}
}
if guest_pull {
match oci.annotations() {
Some(a) => {
if ImageService::is_sandbox(a) {
return ImageService::get_pause_image_process();
}
}
None => {}
}
}
Ok(ocip.clone())
}
/// Set proxy environment from AGENT_CONFIG
pub async fn set_proxy_env_vars() {
if env::var("HTTPS_PROXY").is_err() {


@@ -21,7 +21,7 @@ extern crate slog;
use anyhow::{anyhow, Context, Result};
use cfg_if::cfg_if;
use clap::{AppSettings, Parser};
use const_format::concatcp;
use const_format::{concatcp, formatcp};
use nix::fcntl::OFlag;
use nix::sys::reboot::{reboot, RebootMode};
use nix::sys::socket::{self, AddressFamily, SockFlag, SockType, VsockAddr};
@@ -29,7 +29,7 @@ use nix::unistd::{self, dup, sync, Pid};
use std::env;
use std::ffi::OsStr;
use std::fs::{self, File};
use std::os::unix::fs as unixfs;
use std::os::unix::fs::{self as unixfs, FileTypeExt};
use std::os::unix::io::AsRawFd;
use std::path::Path;
use std::process::exit;
@@ -109,7 +109,18 @@ const CDH_SOCKET_URI: &str = concatcp!(UNIX_SOCKET_PREFIX, CDH_SOCKET);
const API_SERVER_PATH: &str = "/usr/local/bin/api-server-rest";
/// Path of ocicrypt config file. This is used by image-rs when decrypting image.
const OCICRYPT_CONFIG_PATH: &str = "/tmp/ocicrypt_config.json";
const OCICRYPT_CONFIG_PATH: &str = "/run/confidential-containers/ocicrypt_config.json";
const OCICRYPT_CONFIG: &str = formatcp!(
r#"{{
"key-providers": {{
"attestation-agent": {{
"ttrpc": "{}"
}}
}}
}}"#,
CDH_SOCKET_URI
);
const DEFAULT_LAUNCH_PROCESS_TIMEOUT: i32 = 6;
@@ -408,19 +419,32 @@ async fn start_sandbox(
sandbox.lock().await.sender = Some(tx);
let gc_procs = config.guest_components_procs;
if gc_procs != GuestComponentsProcs::None {
if !attestation_binaries_available(logger, &gc_procs) {
warn!(
logger,
"attestation binaries requested for launch not available"
);
} else {
init_attestation_components(logger, config).await?;
}
if !attestation_binaries_available(logger, &gc_procs) {
warn!(
logger,
"attestation binaries requested for launch not available"
);
} else {
init_attestation_components(logger, config).await?;
}
let mut oma = None;
let mut _ort = None;
if let Some(c) = &config.mem_agent {
let (ma, rt) =
mem_agent::agent::MemAgent::new(c.memcg_config.clone(), c.compact_config.clone())
.map_err(|e| {
error!(logger, "MemAgent::new fail: {}", e);
e
})
.context("start mem-agent")?;
oma = Some(ma);
_ort = Some(rt);
}
// vsock:///dev/vsock, port
let mut server = rpc::start(sandbox.clone(), config.server_addr.as_str(), init_mode).await?;
let mut server =
rpc::start(sandbox.clone(), config.server_addr.as_str(), init_mode, oma).await?;
server.start().await?;
@@ -447,12 +471,7 @@ fn attestation_binaries_available(logger: &Logger, procs: &GuestComponentsProcs)
true
}
// Start-up attestation-agent, CDH and api-server-rest if they are packaged in the rootfs
// and the corresponding procs are enabled in the agent configuration. the process will be
// launched in the background and the function will return immediately.
// If the CDH is started, a CDH client will be instantiated and returned.
async fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> Result<()> {
// skip launch of any guest-component
async fn launch_guest_component_procs(logger: &Logger, config: &AgentConfig) -> Result<()> {
if config.guest_components_procs == GuestComponentsProcs::None {
return Ok(());
}
@@ -472,17 +491,6 @@ async fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> R
return Ok(());
}
let ocicrypt_config = serde_json::json!({
"key-providers": {
"attestation-agent":{
"ttrpc":CDH_SOCKET_URI
}
}
});
fs::write(OCICRYPT_CONFIG_PATH, ocicrypt_config.to_string().as_bytes())?;
env::set_var("OCICRYPT_KEYPROVIDER_CONFIG", OCICRYPT_CONFIG_PATH);
debug!(
logger,
"spawning confidential-data-hub process {}", CDH_PATH
@@ -497,9 +505,6 @@ async fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> R
)
.map_err(|e| anyhow!("launch_process {} failed: {:?}", CDH_PATH, e))?;
// initialize cdh client
cdh::init_cdh_client(CDH_SOCKET_URI).await?;
// skip launch of api-server-rest
if config.guest_components_procs == GuestComponentsProcs::ConfidentialDataHub {
return Ok(());
@@ -522,6 +527,33 @@ async fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> R
Ok(())
}
// Start-up attestation-agent, CDH and api-server-rest if they are packaged in the rootfs
// and the corresponding procs are enabled in the agent configuration. the process will be
// launched in the background and the function will return immediately.
// If the CDH is started, a CDH client will be instantiated and returned.
async fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> Result<()> {
launch_guest_component_procs(logger, config).await?;
// If a CDH socket exists, initialize the CDH client and enable ocicrypt
match tokio::fs::metadata(CDH_SOCKET).await {
Ok(md) => {
if md.file_type().is_socket() {
cdh::init_cdh_client(CDH_SOCKET_URI).await?;
fs::write(OCICRYPT_CONFIG_PATH, OCICRYPT_CONFIG.as_bytes())?;
env::set_var("OCICRYPT_KEYPROVIDER_CONFIG", OCICRYPT_CONFIG_PATH);
} else {
debug!(logger, "File {} is not a socket", CDH_SOCKET);
}
}
Err(err) => warn!(
logger,
"Failed to probe CDH socket file {}: {:?}", CDH_SOCKET, err
),
}
Ok(())
}
fn wait_for_path_to_exist(logger: &Logger, path: &str, timeout_secs: i32) -> Result<()> {
let p = Path::new(path);
let mut attempts = 0;


@@ -3,7 +3,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use anyhow::{bail, Result};
use protobuf::MessageDyn;
use tokio::io::AsyncWriteExt;
@@ -68,6 +68,12 @@ pub struct AgentPolicy {
engine: regorus::Engine,
}
#[derive(serde::Deserialize, Debug)]
struct MetadataResponse {
allowed: bool,
ops: Option<json_patch::Patch>,
}
impl AgentPolicy {
/// Create AgentPolicy object.
pub fn new() -> Self {
@@ -82,6 +88,17 @@ impl AgentPolicy {
let mut engine = regorus::Engine::new();
engine.set_strict_builtin_errors(false);
engine.set_gather_prints(true);
// assign a slice of the engine data "pstate" to be used as policy state
engine
.add_data(
regorus::Value::from_json_str(
r#"{
"pstate": {}
}"#,
)
.unwrap(),
)
.unwrap();
engine
}
@@ -112,6 +129,23 @@ impl AgentPolicy {
Ok(())
}
async fn apply_patch_to_state(&mut self, patch: json_patch::Patch) -> Result<()> {
// Convert the current engine data to a JSON value
let mut state = serde_json::to_value(self.engine.get_data())?;
// Apply the patch to the state
json_patch::patch(&mut state, &patch)?;
// Clear the existing data in the engine
self.engine.clear_data();
// Add the patched state back to the engine
self.engine
.add_data(regorus::Value::from_json_str(&state.to_string())?)?;
Ok(())
}
/// Ask regorus if an API call should be allowed or not.
async fn allow_request(&mut self, ep: &str, ep_input: &str) -> Result<(bool, String)> {
debug!(sl!(), "policy check: {ep}");
@@ -120,13 +154,56 @@ impl AgentPolicy {
let query = format!("data.agent_policy.{ep}");
self.engine.set_input_json(ep_input)?;
let mut allow = match self.engine.eval_bool_query(query, false) {
Ok(a) => a,
Err(e) => {
if !self.allow_failures {
return Err(e);
let results = self.engine.eval_query(query, false)?;
let prints = match self.engine.take_prints() {
Ok(p) => p.join(" "),
Err(e) => format!("Failed to get policy log: {e}"),
};
if results.result.len() != 1 {
// Results are empty when AllowRequestsFailingPolicy is used to allow a Request that hasn't been defined in the policy
if self.allow_failures {
return Ok((true, prints));
}
bail!(
"policy check: unexpected eval_query result len {:?}",
results
);
}
if results.result[0].expressions.len() != 1 {
bail!(
"policy check: unexpected eval_query result expressions {:?}",
results
);
}
let mut allow = match &results.result[0].expressions[0].value {
regorus::Value::Bool(b) => *b,
// Match against a specific variant that could be interpreted as MetadataResponse
regorus::Value::Object(obj) => {
let json_str = serde_json::to_string(obj)?;
self.log_eval_input(ep, &json_str).await;
let metadata_response: MetadataResponse = serde_json::from_str(&json_str)?;
if metadata_response.allowed {
if let Some(ops) = metadata_response.ops {
self.apply_patch_to_state(ops).await?;
}
}
false
metadata_response.allowed
}
_ => {
error!(sl!(), "allow_request: unexpected eval_query result type");
bail!(
"policy check: unexpected eval_query result type {:?}",
results
);
}
};
@@ -135,11 +212,6 @@ impl AgentPolicy {
allow = true;
}
let prints = match self.engine.take_prints() {
Ok(p) => p.join(" "),
Err(e) => format!("Failed to get policy log: {e}"),
};
Ok((allow, prints))
}


@@ -58,7 +58,7 @@ use rustjail::process::ProcessOperations;
use crate::cdh;
use crate::device::block_device_handler::get_virtio_blk_pci_device_name;
use crate::device::network_device_handler::wait_for_net_interface;
use crate::device::{add_devices, update_env_pci};
use crate::device::{add_devices, handle_cdi_devices, update_env_pci};
use crate::features::get_build_features;
use crate::image::KATA_IMAGE_WORK_DIR;
use crate::linux_abi::*;
@@ -130,6 +130,8 @@ const ERR_NO_SANDBOX_PIDNS: &str = "Sandbox does not have sandbox_pidns";
// not available.
const IPTABLES_RESTORE_WAIT_SEC: u64 = 5;
const CDI_TIMEOUT_LIMIT: u64 = 100;
// Convenience function to obtain the scope logger.
fn sl() -> slog::Logger {
slog_scope::logger()
@@ -179,6 +181,7 @@ impl<T> OptionToTtrpcResult<T> for Option<T> {
pub struct AgentService {
sandbox: Arc<Mutex<Sandbox>>,
init_mode: bool,
oma: Option<mem_agent::agent::MemAgent>,
}
impl AgentService {
@@ -224,6 +227,15 @@ impl AgentService {
// cannot predict everything from the caller.
add_devices(&sl(), &req.devices, &mut oci, &self.sandbox).await?;
// In guest-kernel mode some devices need extra handling. Taking the
// GPU as an example, the shim will inject CDI annotations that the
// kata-agent uses to apply containerEdits according to the CDI spec,
// coming from a registry created on the fly by UDEV or other entities
// for a specific device.
// In Kata we only consider the "/var/run/cdi" directory, since "/etc"
// may be read-only.
handle_cdi_devices(&sl(), &mut oci, "/var/run/cdi", CDI_TIMEOUT_LIMIT).await?;
cdh_handler(&mut oci).await?;
// Both rootfs and volumes (invoked with --volume for instance) will
@@ -233,7 +245,13 @@ impl AgentService {
// After all those storages have been processed, no matter the order
// here, the agent will rely on rustjail (using the oci.Mounts
// list) to bind mount all of them inside the container.
let m = add_storages(sl(), req.storages, &self.sandbox, Some(req.container_id)).await?;
let m = add_storages(
sl(),
req.storages.clone(),
&self.sandbox,
Some(req.container_id),
)
.await?;
let mut s = self.sandbox.lock().await;
s.container_mounts.insert(cid.clone(), m);
@@ -288,6 +306,13 @@ impl AgentService {
let pipe_size = AGENT_CONFIG.container_pipe_size;
let p = if let Some(p) = oci.process() {
#[cfg(feature = "guest-pull")]
{
let new_p = image::get_process(p, &oci, req.storages.clone())?;
Process::new(&sl(), &new_p, cid.as_str(), true, pipe_size, proc_io)?
}
#[cfg(not(feature = "guest-pull"))]
Process::new(&sl(), p, cid.as_str(), true, pipe_size, proc_io)?
} else {
info!(sl(), "no process configurations!");
@@ -674,6 +699,37 @@ impl AgentService {
}
}
fn mem_agent_memcgconfig_to_memcg_optionconfig(
mc: &protocols::agent::MemAgentMemcgConfig,
) -> mem_agent::memcg::OptionConfig {
mem_agent::memcg::OptionConfig {
disabled: mc.disabled,
swap: mc.swap,
swappiness_max: mc.swappiness_max.map(|x| x as u8),
period_secs: mc.period_secs,
period_psi_percent_limit: mc.period_psi_percent_limit.map(|x| x as u8),
eviction_psi_percent_limit: mc.eviction_psi_percent_limit.map(|x| x as u8),
eviction_run_aging_count_min: mc.eviction_run_aging_count_min,
..Default::default()
}
}
fn mem_agent_compactconfig_to_compact_optionconfig(
cc: &protocols::agent::MemAgentCompactConfig,
) -> mem_agent::compact::OptionConfig {
mem_agent::compact::OptionConfig {
disabled: cc.disabled,
period_secs: cc.period_secs,
period_psi_percent_limit: cc.period_psi_percent_limit.map(|x| x as u8),
compact_psi_percent_limit: cc.compact_psi_percent_limit.map(|x| x as u8),
compact_sec_max: cc.compact_sec_max,
compact_order: cc.compact_order.map(|x| x as u8),
compact_threshold: cc.compact_threshold,
compact_force_times: cc.compact_force_times,
..Default::default()
}
}
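Both conversion helpers above narrow optional wire-format fields with `.map(|x| x as u8)`, because the protobuf messages carry `uint32` while the mem-agent option structs take `u8`. A self-contained illustration of that pattern (the `narrow` name is ours, not from the agent):

```rust
/// Sketch of the `.map(|x| x as u8)` narrowing used by the config
/// conversion helpers: `as u8` truncates silently, so values above
/// 255 wrap modulo 256 instead of failing.
fn narrow(v: Option<u32>) -> Option<u8> {
    v.map(|x| x as u8)
}

fn main() {
    assert_eq!(narrow(Some(80)), Some(80));
    assert_eq!(narrow(None), None);
    // 300 does not fit in u8; `as` wraps it to 300 - 256 = 44.
    assert_eq!(narrow(Some(300)), Some(44));
}
```

In practice the narrowed fields are percentages and swappiness values, which fit in `u8`, but a checked `try_into()` would surface out-of-range configs instead of wrapping them.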
#[async_trait]
impl agent_ttrpc::AgentService for AgentService {
async fn create_container(
@@ -1489,6 +1545,54 @@ impl agent_ttrpc::AgentService for AgentService {
Ok(Empty::new())
}
async fn mem_agent_memcg_set(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
config: protocols::agent::MemAgentMemcgConfig,
) -> ::ttrpc::Result<Empty> {
if let Some(ma) = &self.oma {
ma.memcg_set_config_async(mem_agent_memcgconfig_to_memcg_optionconfig(&config))
.await
.map_err(|e| {
let estr = format!("ma.memcg_set_config_async fail: {}", e);
error!(sl(), "{}", estr);
ttrpc::Error::RpcStatus(ttrpc::get_status(ttrpc::Code::INTERNAL, estr))
})?;
} else {
let estr = "mem-agent is disabled";
error!(sl(), "{}", estr);
return Err(ttrpc::Error::RpcStatus(ttrpc::get_status(
ttrpc::Code::INTERNAL,
estr,
)));
}
Ok(Empty::new())
}
async fn mem_agent_compact_set(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
config: protocols::agent::MemAgentCompactConfig,
) -> ::ttrpc::Result<Empty> {
if let Some(ma) = &self.oma {
ma.compact_set_config_async(mem_agent_compactconfig_to_compact_optionconfig(&config))
.await
.map_err(|e| {
let estr = format!("ma.compact_set_config_async fail: {}", e);
error!(sl(), "{}", estr);
ttrpc::Error::RpcStatus(ttrpc::get_status(ttrpc::Code::INTERNAL, estr))
})?;
} else {
let estr = "mem-agent is disabled";
error!(sl(), "{}", estr);
return Err(ttrpc::Error::RpcStatus(ttrpc::get_status(
ttrpc::Code::INTERNAL,
estr,
)));
}
Ok(Empty::new())
}
}
#[derive(Clone)]
@@ -1632,10 +1736,12 @@ pub async fn start(
s: Arc<Mutex<Sandbox>>,
server_address: &str,
init_mode: bool,
oma: Option<mem_agent::agent::MemAgent>,
) -> Result<TtrpcServer> {
let agent_service = Box::new(AgentService {
sandbox: s,
init_mode,
oma,
}) as Box<dyn agent_ttrpc::AgentService + Send + Sync>;
let aservice = agent_ttrpc::create_agent_service(Arc::new(agent_service));
@@ -2052,6 +2158,25 @@ fn load_kernel_module(module: &protocols::agent::KernelModule) -> Result<()> {
}
}
fn is_sealed_secret_path(source_path: &str) -> bool {
// Base path to check
let base_path = "/run/kata-containers/shared/containers";
// Paths to exclude
let excluded_suffixes = [
"resolv.conf",
"termination-log",
"hostname",
"hosts",
"serviceaccount",
];
// Ensure the path starts with the base path and does not end with any excluded suffix
source_path.starts_with(base_path)
&& !excluded_suffixes
.iter()
.any(|suffix| source_path.ends_with(suffix))
}
async fn cdh_handler(oci: &mut Spec) -> Result<()> {
if !cdh::is_cdh_client_initialized().await {
return Ok(());
@@ -2077,17 +2202,34 @@ async fn cdh_handler(oci: &mut Spec) -> Result<()> {
.ok_or_else(|| anyhow!("Spec didn't contain mounts field"))?;
for m in mounts.iter_mut() {
if m.destination().starts_with("/sealed") {
info!(
let Some(source_path) = m.source().as_ref().and_then(|p| p.to_str()) else {
warn!(sl(), "Mount source is None or invalid");
continue;
};
// Check if source_path starts with "/run/kata-containers/shared/containers"
// For a volume mount path /mydir,
// the secret file path will be like this under the /run/kata-containers/shared/containers dir
// a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-mydir
// We can ignore a few paths such as: resolv.conf, termination-log, hostname, hosts, serviceaccount
if is_sealed_secret_path(source_path) {
debug!(
sl(),
"sealed mount destination: {:?} source: {:?}",
m.destination(),
m.source()
"Calling unseal_file for - source: {:?} destination: {:?}",
source_path,
m.destination()
);
if let Some(source_str) = m.source().as_ref().and_then(|p| p.to_str()) {
cdh::unseal_file(source_str).await?;
} else {
warn!(sl(), "Failed to unseal: Mount source is None or invalid");
// Call unseal_file. This function checks the files under the source_path
// for the sealed secret header and unseals them if the header is present.
// This is suboptimal as we are going through every file under the source_path,
// but currently there is no quick way to determine which volume mount refers
// to a sealed secret without reading the file, and relying on a file-naming
// heuristic is inflexible. So we are going with this approach.
if let Err(e) = cdh::unseal_file(source_path).await {
warn!(
sl(),
"Failed to unseal file: {:?}, Error: {:?}", source_path, e
);
}
}
}
@@ -2270,6 +2412,7 @@ mod tests {
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
oma: None,
});
let req = protocols::agent::UpdateInterfaceRequest::default();
@@ -2287,6 +2430,7 @@ mod tests {
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
oma: None,
});
let req = protocols::agent::UpdateRoutesRequest::default();
@@ -2304,6 +2448,7 @@ mod tests {
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
oma: None,
});
let req = protocols::agent::AddARPNeighborsRequest::default();
@@ -2442,6 +2587,7 @@ mod tests {
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
oma: None,
});
let result = agent_service
@@ -2932,6 +3078,7 @@ OtherField:other
let agent_service = Box::new(AgentService {
sandbox: Arc::new(Mutex::new(sandbox)),
init_mode: true,
oma: None,
});
let ctx = mk_ttrpc_context();
@@ -3081,4 +3228,54 @@ COMMIT
"We should see the resulting rule"
);
}
#[tokio::test]
async fn test_is_sealed_secret_path() {
#[derive(Debug)]
struct TestData<'a> {
source_path: &'a str,
result: bool,
}
let tests = &[
TestData {
source_path: "/run/kata-containers/shared/containers/somefile",
result: true,
},
TestData {
source_path: "/run/kata-containers/shared/containers/a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-resolv.conf",
result: false,
},
TestData {
source_path: "/run/kata-containers/shared/containers/a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-termination-log",
result: false,
},
TestData {
source_path: "/run/kata-containers/shared/containers/a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-hostname",
result: false,
},
TestData {
source_path: "/run/kata-containers/shared/containers/a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-hosts",
result: false,
},
TestData {
source_path: "/run/kata-containers/shared/containers/a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-serviceaccount",
result: false,
},
TestData {
source_path: "/run/kata-containers/shared/containers/a128482812bad768f404e063f225decd425fc94a673aec4add45a9caa1122ccb-75490e32e51da3ff-mysecret",
result: true,
},
TestData {
source_path: "/some/other/path",
result: false,
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = is_sealed_secret_path(d.source_path);
assert_eq!(d.result, result, "{}", msg);
}
}
}


@@ -170,7 +170,7 @@ impl EphemeralHandler {
let size = size_str
.unwrap()
.parse::<u64>()
.context(format!("parse size: {:?}", &pagesize_str))?;
.context(format!("parse size: {:?}", &size_str))?;
Ok((pagesize, size))
}

src/libs/Cargo.lock generated

@@ -240,19 +240,6 @@ version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "cgroups-rs"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5b098e7c3a70d03c288fa0a96ccf13e770eb3d78c4cc0e1549b3c13215d5f965"
dependencies = [
"libc",
"log",
"nix 0.25.1",
"regex",
"thiserror",
]
[[package]]
name = "chrono"
version = "0.4.20"
@@ -814,7 +801,6 @@ version = "0.1.0"
dependencies = [
"anyhow",
"byteorder",
"cgroups-rs",
"chrono",
"common-path",
"fail",
@@ -975,18 +961,6 @@ dependencies = [
"memoffset 0.6.5",
]
[[package]]
name = "nix"
version = "0.25.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f346ff70e7dbfd675fe90590b92d59ef2de15a8779ae305ebcbfd3f0caf59be4"
dependencies = [
"autocfg",
"bitflags",
"cfg-if",
"libc",
]
[[package]]
name = "nix"
version = "0.26.4"
@@ -1316,7 +1290,6 @@ name = "protocols"
version = "0.1.0"
dependencies = [
"async-trait",
"kata-sys-util",
"oci-spec",
"protobuf 3.2.0",
"serde",


@@ -13,7 +13,6 @@ edition = "2018"
[dependencies]
anyhow = "1.0.31"
byteorder = "1.4.3"
cgroups = { package = "cgroups-rs", version = "0.3.2" }
chrono = "0.4.0"
common-path = "=1.0.0"
fail = "0.5.0"


@@ -1,10 +1,9 @@
# kata-sys-util
# `kata-sys-util`
This crate is a collection of utilities and helpers for
[Kata Containers](https://github.com/kata-containers/kata-containers/) components to access system services.
It provides safe wrappers over system services, such as:
- cgroups
- file systems
- mount
- NUMA


@@ -53,6 +53,7 @@ use std::time::Instant;
use lazy_static::lazy_static;
use nix::mount::{mount, MntFlags, MsFlags};
use nix::{unistd, NixPath};
use oci_spec::runtime as oci;
use crate::fs::is_symlink;
use crate::sl;
@@ -799,8 +800,20 @@ pub fn get_mount_options(options: &Option<Vec<String>>) -> Vec<String> {
}
}
pub fn get_mount_type(typ: &Option<String>) -> String {
typ.clone().unwrap_or("bind".to_string())
pub fn get_mount_type(m: &oci::Mount) -> String {
m.typ()
.clone()
.map(|typ| {
if typ.as_str() == "none" {
if let Some(opts) = m.options() {
if opts.iter().any(|opt| opt == "bind" || opt == "rbind") {
return "bind".to_string();
}
}
}
typ
})
.unwrap_or("bind".to_string())
}
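The rewritten `get_mount_type` treats a mount typed `"none"` as a bind mount whenever a `bind` or `rbind` option is present, and otherwise keeps the declared type (defaulting to `"bind"` when no type is set). A std-only sketch of that mapping, with plain `Option`/slice parameters standing in for the `oci::Mount` getters:

```rust
/// Illustrative stand-in for the `get_mount_type` change above; the
/// parameter shapes are simplifications, not the real oci-spec API.
fn mount_type(typ: Option<&str>, options: &[&str]) -> String {
    match typ {
        // "none" plus a (r)bind option is really a bind mount.
        Some("none") if options.iter().any(|o| *o == "bind" || *o == "rbind") => {
            "bind".to_string()
        }
        // Any other declared type (including a bare "none") is kept.
        Some(t) => t.to_string(),
        // No type at all defaults to "bind", as before this change.
        None => "bind".to_string(),
    }
}

fn main() {
    assert_eq!(mount_type(Some("none"), &["rbind", "ro"]), "bind");
    assert_eq!(mount_type(Some("none"), &["ro"]), "none");
    assert_eq!(mount_type(Some("tmpfs"), &[]), "tmpfs");
    assert_eq!(mount_type(None, &[]), "bind");
}
```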
#[cfg(test)]


@@ -97,11 +97,3 @@ pub fn load_oci_spec() -> Result<oci::Spec, OciSpecError> {
oci::Spec::load(spec_file.to_str().unwrap_or_default())
}
/// handle string parsing for input possibly be JSON string.
pub fn parse_json_string(input: &str) -> &str {
let json_str: &str = serde_json::from_str(input).unwrap_or(input);
let stripped_str = json_str.strip_prefix("CAP_").unwrap_or(json_str);
stripped_str
}


@@ -11,7 +11,12 @@ pub const CONTAINER_NAME_LABEL_KEY: &str = "io.kubernetes.cri.container-name";
pub const SANDBOX: &str = "sandbox";
pub const CONTAINER: &str = "container";
// SandboxID is the sandbox ID annotation
pub const SANDBOX_ID_LABEL_KEY: &str = "io.kubernetes.cri.sandbox-id";
// SandboxName is the name of the sandbox (pod)
pub const SANDBOX_NAME_LABEL_KEY: &str = "io.kubernetes.cri.sandbox-name";
// SandboxNamespace is the name of the namespace of the sandbox (pod)
pub const SANDBOX_NAMESPACE_LABEL_KEY: &str = "io.kubernetes.cri.sandbox-namespace";
// Ref: https://pkg.go.dev/github.com/containerd/containerd@v1.6.7/pkg/cri/annotations
// SandboxCPU annotations are based on the initial CPU configuration for the sandbox. This is calculated as the


@@ -18,6 +18,44 @@ use crate::eother;
/// agent name of Kata agent.
pub const AGENT_NAME_KATA: &str = "kata";
#[derive(Default, Debug, Deserialize, Serialize, Clone)]
pub struct MemAgent {
#[serde(default, alias = "mem_agent_enable")]
pub enable: bool,
#[serde(default)]
pub memcg_disable: Option<bool>,
#[serde(default)]
pub memcg_swap: Option<bool>,
#[serde(default)]
pub memcg_swappiness_max: Option<u8>,
#[serde(default)]
pub memcg_period_secs: Option<u64>,
#[serde(default)]
pub memcg_period_psi_percent_limit: Option<u8>,
#[serde(default)]
pub memcg_eviction_psi_percent_limit: Option<u8>,
#[serde(default)]
pub memcg_eviction_run_aging_count_min: Option<u64>,
#[serde(default)]
pub compact_disable: Option<bool>,
#[serde(default)]
pub compact_period_secs: Option<u64>,
#[serde(default)]
pub compact_period_psi_percent_limit: Option<u8>,
#[serde(default)]
pub compact_psi_percent_limit: Option<u8>,
#[serde(default)]
pub compact_sec_max: Option<i64>,
#[serde(default)]
pub compact_order: Option<u8>,
#[serde(default)]
pub compact_threshold: Option<u64>,
#[serde(default)]
pub compact_force_times: Option<u64>,
}
/// Kata agent configuration information.
#[derive(Debug, Deserialize, Serialize, Clone)]
pub struct Agent {
@@ -98,6 +136,10 @@ pub struct Agent {
/// container pipe size
#[serde(default)]
pub container_pipe_size: u32,
/// Memory agent configuration
#[serde(default)]
pub mem_agent: MemAgent,
}
impl std::default::Default for Agent {
@@ -116,6 +158,7 @@ impl std::default::Default for Agent {
health_check_request_timeout_ms: 90_000,
kernel_modules: Default::default(),
container_pipe_size: 0,
mem_agent: MemAgent::default(),
}
}
}


@@ -98,3 +98,11 @@ pub const DEFAULT_FIRECRACKER_GUEST_KERNEL_IMAGE: &str = "vmlinux";
pub const DEFAULT_FIRECRACKER_GUEST_KERNEL_PARAMS: &str = "";
pub const MAX_FIRECRACKER_VCPUS: u32 = 32;
pub const MIN_FIRECRACKER_MEMORY_SIZE_MB: u32 = 128;
// Default configuration for remote
pub const DEFAULT_REMOTE_HYPERVISOR_SOCKET: &str = "/run/peerpod/hypervisor.sock";
pub const DEFAULT_REMOTE_HYPERVISOR_TIMEOUT: i32 = 600; // 600 Seconds
pub const MAX_REMOTE_VCPUS: u32 = 32;
pub const MIN_REMOTE_MEMORY_SIZE_MB: u32 = 64;
pub const DEFAULT_REMOTE_MEMORY_SIZE_MB: u32 = 128;
pub const DEFAULT_REMOTE_MEMORY_SLOTS: u32 = 128;


@@ -44,6 +44,9 @@ pub use self::qemu::{QemuConfig, HYPERVISOR_NAME_QEMU};
mod ch;
pub use self::ch::{CloudHypervisorConfig, HYPERVISOR_NAME_CH};
mod remote;
pub use self::remote::{RemoteConfig, HYPERVISOR_NAME_REMOTE};
/// Virtual PCI block device driver.
pub const VIRTIO_BLK_PCI: &str = "virtio-blk-pci";
@@ -540,6 +543,7 @@ impl TopologyConfigInfo {
HYPERVISOR_NAME_CH,
HYPERVISOR_NAME_DRAGONBALL,
HYPERVISOR_NAME_FIRECRACKER,
HYPERVISOR_NAME_REMOTE,
];
let hypervisor_name = toml_config.runtime.hypervisor_name.as_str();
if !hypervisor_names.contains(&hypervisor_name) {
@@ -1040,6 +1044,18 @@ impl SharedFsInfo {
}
}
/// Configuration information for remote hypervisor type.
#[derive(Clone, Debug, Default, Deserialize, Serialize)]
pub struct RemoteInfo {
/// Remote hypervisor socket path
#[serde(default)]
pub hypervisor_socket: String,
/// Remote hypervisor creation timeout (in seconds)
#[serde(default)]
pub hypervisor_timeout: i32,
}
/// Common configuration information for hypervisors.
#[derive(Clone, Debug, Default, Deserialize, Serialize)]
pub struct Hypervisor {
@@ -1123,6 +1139,10 @@ pub struct Hypervisor {
#[serde(default, flatten)]
pub shared_fs: SharedFsInfo,
/// Remote hypervisor configuration information.
#[serde(default, flatten)]
pub remote_info: RemoteInfo,
/// A sandbox annotation used to specify prefetch_files.list host path container image
/// being used, and runtime will pass it to Hypervisor to search for corresponding
/// prefetch list file:
@@ -1164,6 +1184,10 @@ impl ConfigOps for Hypervisor {
fn adjust_config(conf: &mut TomlConfig) -> Result<()> {
HypervisorVendor::adjust_config(conf)?;
let hypervisors: Vec<String> = conf.hypervisor.keys().cloned().collect();
info!(
sl!(),
"Adjusting hypervisor configuration {:?}", hypervisors
);
for hypervisor in hypervisors.iter() {
if let Some(plugin) = get_hypervisor_plugin(hypervisor) {
plugin.adjust_config(conf)?;


@@ -0,0 +1,116 @@
// Copyright 2024 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use byte_unit::{Byte, Unit};
use std::io::Result;
use std::path::Path;
use std::sync::Arc;
use sysinfo::System;
use crate::{
config::{
default::{self, MAX_REMOTE_VCPUS, MIN_REMOTE_MEMORY_SIZE_MB},
ConfigPlugin,
}, device::DRIVER_NVDIMM_TYPE, eother, resolve_path
};
use super::register_hypervisor_plugin;
/// Hypervisor name for remote, used to index `TomlConfig::hypervisor`.
pub const HYPERVISOR_NAME_REMOTE: &str = "remote";
/// Configuration information for remote.
#[derive(Default, Debug)]
pub struct RemoteConfig {}
impl RemoteConfig {
/// Create a new instance of `RemoteConfig`
pub fn new() -> Self {
RemoteConfig {}
}
/// Register the remote plugin.
pub fn register(self) {
let plugin = Arc::new(self);
register_hypervisor_plugin(HYPERVISOR_NAME_REMOTE, plugin);
}
}
impl ConfigPlugin for RemoteConfig {
fn name(&self) -> &str {
HYPERVISOR_NAME_REMOTE
}
/// Adjust the configuration information after loading from configuration file.
fn adjust_config(&self, conf: &mut crate::config::TomlConfig) -> Result<()> {
if let Some(remote) = conf.hypervisor.get_mut(HYPERVISOR_NAME_REMOTE) {
if remote.remote_info.hypervisor_socket.is_empty() {
remote.remote_info.hypervisor_socket =
default::DEFAULT_REMOTE_HYPERVISOR_SOCKET.to_string();
}
resolve_path!(
remote.remote_info.hypervisor_socket,
"Remote hypervisor socket `{}` is invalid: {}"
)?;
if remote.remote_info.hypervisor_timeout == 0 {
remote.remote_info.hypervisor_timeout = default::DEFAULT_REMOTE_HYPERVISOR_TIMEOUT;
}
if remote.memory_info.default_memory == 0 {
remote.memory_info.default_memory = default::MIN_REMOTE_MEMORY_SIZE_MB;
}
if remote.memory_info.memory_slots == 0 {
remote.memory_info.memory_slots = default::DEFAULT_REMOTE_MEMORY_SLOTS
}
}
Ok(())
}
/// Validate the configuration information.
fn validate(&self, conf: &crate::config::TomlConfig) -> Result<()> {
if let Some(remote) = conf.hypervisor.get(HYPERVISOR_NAME_REMOTE) {
let s = System::new_all();
let total_memory = Byte::from_u64(s.total_memory())
.get_adjusted_unit(Unit::MiB)
.get_value() as u32;
if remote.memory_info.default_maxmemory != total_memory {
return Err(eother!(
"Remote hypervisor does not support memory hotplug, default_maxmemory must be equal to the total system memory",
));
}
let cpus = num_cpus::get() as u32;
if remote.cpu_info.default_maxvcpus != cpus {
return Err(eother!(
"Remote hypervisor does not support CPU hotplug, default_maxvcpus must be equal to the total system CPUs",
));
}
if !remote.boot_info.initrd.is_empty() {
return Err(eother!("Remote hypervisor does not support initrd"));
}
if !remote.boot_info.rootfs_type.is_empty() {
return Err(eother!("Remote hypervisor does not support rootfs_type"));
}
if remote.blockdev_info.block_device_driver.as_str() == DRIVER_NVDIMM_TYPE {
return Err(eother!("Remote hypervisor does not support nvdimm"));
}
if remote.memory_info.default_memory < MIN_REMOTE_MEMORY_SIZE_MB {
return Err(eother!(
"Remote hypervisor has minimal memory limitation {}",
MIN_REMOTE_MEMORY_SIZE_MB
));
}
}
Ok(())
}
fn get_min_memory(&self) -> u32 {
MIN_REMOTE_MEMORY_SIZE_MB
}
fn get_max_cpus(&self) -> u32 {
MAX_REMOTE_VCPUS
}
}


@@ -26,7 +26,7 @@ pub use self::agent::Agent;
use self::default::DEFAULT_AGENT_DBG_CONSOLE_PORT;
pub use self::hypervisor::{
BootInfo, CloudHypervisorConfig, DragonballConfig, FirecrackerConfig, Hypervisor, QemuConfig,
HYPERVISOR_NAME_DRAGONBALL, HYPERVISOR_NAME_FIRECRACKER, HYPERVISOR_NAME_QEMU,
RemoteConfig, HYPERVISOR_NAME_DRAGONBALL, HYPERVISOR_NAME_FIRECRACKER, HYPERVISOR_NAME_QEMU,
};
mod runtime;
@@ -115,6 +115,14 @@ pub struct TomlConfig {
pub runtime: Runtime,
}
macro_rules! mem_agent_kv_insert {
($ma_cfg:expr, $key:expr, $map:expr) => {
if let Some(n) = $ma_cfg {
$map.insert($key.to_string(), n.to_string());
}
};
}
impl TomlConfig {
/// Load Kata configuration information from configuration files.
///
@@ -204,6 +212,83 @@ impl TomlConfig {
DEFAULT_AGENT_DBG_CONSOLE_PORT.to_string(),
);
}
if cfg.mem_agent.enable {
kv.insert("psi".to_string(), "1".to_string());
kv.insert("agent.mem_agent_enable".to_string(), "1".to_string());
mem_agent_kv_insert!(
cfg.mem_agent.memcg_disable,
"agent.mem_agent_memcg_disable",
kv
);
mem_agent_kv_insert!(cfg.mem_agent.memcg_swap, "agent.mem_agent_memcg_swap", kv);
mem_agent_kv_insert!(
cfg.mem_agent.memcg_swappiness_max,
"agent.mem_agent_memcg_swappiness_max",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.memcg_period_secs,
"agent.mem_agent_memcg_period_secs",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.memcg_period_psi_percent_limit,
"agent.mem_agent_memcg_period_psi_percent_limit",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.memcg_eviction_psi_percent_limit,
"agent.mem_agent_memcg_eviction_psi_percent_limit",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.memcg_eviction_run_aging_count_min,
"agent.mem_agent_memcg_eviction_run_aging_count_min",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_disable,
"agent.mem_agent_compact_disable",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_period_secs,
"agent.mem_agent_compact_period_secs",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_period_psi_percent_limit,
"agent.mem_agent_compact_period_psi_percent_limit",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_psi_percent_limit,
"agent.mem_agent_compact_psi_percent_limit",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_sec_max,
"agent.mem_agent_compact_sec_max",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_order,
"agent.mem_agent_compact_order",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_threshold,
"agent.mem_agent_compact_threshold",
kv
);
mem_agent_kv_insert!(
cfg.mem_agent.compact_force_times,
"agent.mem_agent_compact_force_times",
kv
);
}
}
Ok(kv)
}


@@ -7,19 +7,17 @@ license = "Apache-2.0"
[features]
default = []
with-serde = [ "serde", "serde_json" ]
with-serde = []
async = ["ttrpc/async", "async-trait"]
[dependencies]
ttrpc = "0.8"
async-trait = { version = "0.1.42", optional = true }
protobuf = { version = "3.2.0" }
serde = { version = "1.0.130", features = ["derive"], optional = true }
serde_json = { version = "1.0.68", optional = true }
serde = { version = "1.0.130", features = ["derive"] }
serde_json = "1.0.68"
oci-spec = { version = "0.6.8", features = ["runtime"] }
kata-sys-util = { path = "../kata-sys-util" }
[build-dependencies]
ttrpc-codegen = "0.4.2"
protobuf = { version = "3.2.0" }


@@ -204,6 +204,7 @@ fn real_main() -> Result<(), std::io::Error> {
"protos/agent.proto",
"protos/health.proto",
"protos/confidential_data_hub.proto",
"protos/remote.proto",
],
true,
)?;
@@ -214,6 +215,7 @@ fn real_main() -> Result<(), std::io::Error> {
"src/confidential_data_hub_ttrpc.rs",
"src/confidential_data_hub_ttrpc_async.rs",
)?;
fs::rename("src/remote_ttrpc.rs", "src/remote_ttrpc_async.rs")?;
}
codegen(
@@ -222,6 +224,7 @@ fn real_main() -> Result<(), std::io::Error> {
"protos/agent.proto",
"protos/health.proto",
"protos/confidential_data_hub.proto",
"protos/remote.proto",
],
false,
)?;


@@ -59,6 +59,10 @@ service AgentService {
// observability
rpc GetMetrics(GetMetricsRequest) returns (Metrics);
// mem-agent
rpc MemAgentMemcgSet(MemAgentMemcgConfig) returns (google.protobuf.Empty);
rpc MemAgentCompactSet(MemAgentCompactConfig) returns (google.protobuf.Empty);
// misc (TODO: some rpcs can be replaced by hyperstart-exec)
rpc CreateSandbox(CreateSandboxRequest) returns (google.protobuf.Empty);
rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);
@@ -611,3 +615,24 @@ message ResizeVolumeRequest {
message SetPolicyRequest {
string policy = 1;
}
message MemAgentMemcgConfig {
optional bool disabled = 1;
optional bool swap = 2;
optional uint32 swappiness_max = 3;
optional uint64 period_secs = 4;
optional uint32 period_psi_percent_limit = 5;
optional uint32 eviction_psi_percent_limit = 6;
optional uint64 eviction_run_aging_count_min = 7;
}
message MemAgentCompactConfig {
optional bool disabled = 1;
optional uint64 period_secs = 2;
optional uint32 period_psi_percent_limit = 3;
optional uint32 compact_psi_percent_limit = 4;
optional int64 compact_sec_max = 5;
optional uint32 compact_order = 6;
optional uint64 compact_threshold = 7;
optional uint64 compact_force_times = 8;
}


@@ -0,0 +1,47 @@
// Copyright 2024 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
syntax = "proto3";
package remote;
service Hypervisor {
rpc CreateVM(CreateVMRequest) returns (CreateVMResponse) {}
rpc StartVM(StartVMRequest) returns (StartVMResponse) {}
rpc StopVM(StopVMRequest) returns (StopVMResponse) {}
rpc Version(VersionRequest) returns (VersionResponse) {}
}
message VersionRequest {
string version = 1;
}
message VersionResponse {
string version = 1;
}
message CreateVMRequest {
string id = 1;
map<string, string> annotations = 2;
string networkNamespacePath = 3;
}
message CreateVMResponse {
string agentSocketPath = 1;
}
message StartVMRequest {
string id = 1;
}
message StartVMResponse {
}
message StopVMRequest {
string id = 1;
}
message StopVMResponse {
}


@@ -21,6 +21,10 @@ pub mod oci;
mod serde_config;
pub mod trans;
pub mod types;
pub mod remote;
pub mod remote_ttrpc;
#[cfg(feature = "async")]
pub mod remote_ttrpc_async;
#[cfg(feature = "with-serde")]
pub use serde_config::{


@@ -10,7 +10,6 @@ use std::convert::TryFrom;
use std::path::PathBuf;
use crate::oci as grpc;
use kata_sys_util::spec::parse_json_string;
use oci_spec::runtime as oci;
// translate from interface to ttprc tools
@@ -41,8 +40,9 @@ fn cap_hashset2vec(hash_set: &Option<HashSet<oci::Capability>>) -> Vec<String> {
fn cap_vec2hashset(caps: Vec<String>) -> HashSet<oci::Capability> {
caps.iter()
.map(|cap: &String| {
let cap_str = parse_json_string(cap);
cap_str
// cap might be JSON-encoded
let decoded: &str = serde_json::from_str(cap).unwrap_or(cap);
decoded.strip_prefix("CAP_").unwrap_or(decoded)
.parse::<oci::Capability>()
.unwrap_or_else(|_| panic!("Failed to parse {:?} to Enum Capability", cap))
})
@@ -97,6 +97,8 @@ impl From<oci::LinuxCapabilities> for grpc::LinuxCapabilities {
}
}
// TODO(burgerdev): remove condition here and below after upgrading to oci_spec > 0.7.
#[cfg(target_os = "linux")]
impl From<oci::PosixRlimit> for grpc::POSIXRlimit {
fn from(from: oci::PosixRlimit) -> Self {
grpc::POSIXRlimit {
@@ -118,6 +120,7 @@ impl From<oci::Process> for grpc::Process {
Env: option_vec_to_vec(from.env()),
Cwd: from.cwd().display().to_string(),
Capabilities: from_option(from.capabilities().clone()),
#[cfg(target_os = "linux")]
Rlimits: from_option_vec(from.rlimits().clone()),
NoNewPrivileges: from.no_new_privileges().unwrap_or_default(),
ApparmorProfile: from
@@ -993,6 +996,7 @@ impl From<grpc::Linux> for oci::Linux {
}
}
#[cfg(target_os = "linux")]
impl From<grpc::POSIXRlimit> for oci::PosixRlimit {
fn from(proto: grpc::POSIXRlimit) -> Self {
oci::PosixRlimitBuilder::default()
@@ -1078,6 +1082,8 @@ impl From<grpc::Process> for oci::Process {
} else {
process.set_capabilities(None);
}
#[cfg(target_os = "linux")]
if !from.Rlimits().is_empty() {
process.set_rlimits(Some(
from.Rlimits().iter().cloned().map(|r| r.into()).collect(),
@@ -1238,6 +1244,11 @@ impl From<grpc::LinuxIntelRdt> for oci::LinuxIntelRdt {
#[cfg(test)]
mod tests {
use std::collections::HashSet;
use super::cap_vec2hashset;
use super::oci;
fn from_vec<F: Sized, T: From<F>>(from: Vec<F>) -> Vec<T> {
let mut to: Vec<T> = vec![];
for data in from {
@@ -1289,4 +1300,26 @@ mod tests {
assert_eq!(from.len(), to.len());
assert_eq!(from[0].from, to[0].to);
}
#[test]
fn test_cap_vec2hashset_good() {
let expected: HashSet<oci::Capability> =
vec![oci::Capability::NetAdmin, oci::Capability::Mknod]
.into_iter()
.collect();
let actual = cap_vec2hashset(vec![
"CAP_NET_ADMIN".to_string(),
"\"CAP_MKNOD\"".to_string(),
]);
assert_eq!(expected, actual);
}
#[test]
#[should_panic]
fn test_cap_vec2hashset_bad() {
cap_vec2hashset(vec![
"CAP_DOES_NOT_EXIST".to_string(),
]);
}
}
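
The `cap_vec2hashset` change above decodes capability names that may arrive JSON-encoded (e.g. `"\"CAP_MKNOD\""`) before stripping the `CAP_` prefix and parsing them into the enum. A minimal standalone sketch of that normalization step, using simple quote-trimming in place of `serde_json::from_str` and a hypothetical `normalize_cap` helper that is not part of the agent code:

```rust
// Sketch of the capability-string normalization performed by the hunk above.
// Quote-trimming stands in for serde_json::from_str (which the real code uses
// to undo JSON string encoding); normalize_cap is a hypothetical helper.
fn normalize_cap(cap: &str) -> String {
    let decoded = cap.trim_matches('"'); // "\"CAP_MKNOD\"" -> "CAP_MKNOD"
    decoded.strip_prefix("CAP_").unwrap_or(decoded).to_string()
}

fn main() {
    assert_eq!(normalize_cap("CAP_NET_ADMIN"), "NET_ADMIN");
    assert_eq!(normalize_cap("\"CAP_MKNOD\""), "MKNOD");
    println!("normalized ok");
}
```

The real code then calls `.parse::<oci::Capability>()` on the normalized string, panicking on unknown names, as the `test_cap_vec2hashset_bad` test exercises.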

src/mem-agent/.gitignore vendored Normal file

@@ -0,0 +1,5 @@
/target
/example/target
/.vscode
.vscode-ctags

src/mem-agent/Cargo.lock generated Normal file

@@ -0,0 +1,943 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "addr2line"
version = "0.21.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a30b2e23b9e17a9f90641c7ab1549cd9b44f296d3ccbf309d2863cfe398a0cb"
dependencies = [
"gimli",
]
[[package]]
name = "adler"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
[[package]]
name = "android-tzdata"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e999941b234f3131b00bc13c22d06e8c5ff726d1b6318ac7eb276997bbb4fef0"
[[package]]
name = "android_system_properties"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "819e7219dbd41043ac279b19830f2efc897156490d7fd6ea916720117ee66311"
dependencies = [
"libc",
]
[[package]]
name = "anyhow"
version = "1.0.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0952808a6c2afd1aa8947271f3a60f1a6763c7b912d210184c5149b5cf147247"
[[package]]
name = "arc-swap"
version = "1.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7b3d0060af21e8d11a926981cc00c6c1541aa91dd64b9f881985c3da1094425f"
[[package]]
name = "async-trait"
version = "0.1.77"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c980ee35e870bd1a4d2c8294d4c04d0499e67bca1e4b5cefcc693c2fa00caea9"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "autocfg"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "backtrace"
version = "0.3.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2089b7e3f35b9dd2d0ed921ead4f6d318c27680d4a5bd167b3ee120edb105837"
dependencies = [
"addr2line",
"cc",
"cfg-if",
"libc",
"miniz_oxide",
"object",
"rustc-demangle",
]
[[package]]
name = "bitflags"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bef38d45163c2f1dde094a7dfd33ccf595c92905c8f8f4fdc18d06fb1037718a"
[[package]]
name = "bitflags"
version = "2.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b048fb63fd8b5923fc5aa7b340d8e156aec7ec02f0c78fa8a6ddc2613f6f71de"
[[package]]
name = "bumpalo"
version = "3.15.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7ff69b9dd49fd426c69a0db9fc04dd934cdb6645ff000864d98f7e2af8830eaa"
[[package]]
name = "bytes"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a2bd12c1caf447e69cd4528f47f94d203fd2582878ecb9e9465484c4148a8223"
[[package]]
name = "cc"
version = "1.0.90"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8cd6604a82acf3039f1144f54b8eb34e91ffba622051189e71b781822d5ee1f5"
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "chrono"
version = "0.4.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8eaf5903dcbc0a39312feb77df2ff4c76387d591b9fc7b04a238dcf8bb62639a"
dependencies = [
"android-tzdata",
"iana-time-zone",
"js-sys",
"num-traits",
"wasm-bindgen",
"windows-targets 0.52.4",
]
[[package]]
name = "core-foundation-sys"
version = "0.8.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "06ea2b9bc92be3c2baa9334a323ebca2d6f074ff852cd1d7b11064035cd3868f"
[[package]]
name = "crossbeam-channel"
version = "0.5.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33480d6946193aa8033910124896ca395333cae7e2d1113d1fef6c3272217df2"
dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "22ec99545bb0ed0ea7bb9b8e1e9122ea386ff8a48c0922e43f36d45ab09e0e80"
[[package]]
name = "deranged"
version = "0.3.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b42b6fa04a440b495c8b04d0e71b707c585f83cb9cb28cf8cd0d976c315e31b4"
dependencies = [
"powerfmt",
]
[[package]]
name = "dirs-next"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b98cf8ebf19c3d1b223e151f99a4f9f0690dca41414773390fc824184ac833e1"
dependencies = [
"cfg-if",
"dirs-sys-next",
]
[[package]]
name = "dirs-sys-next"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4ebda144c4fe02d1f7ea1a7d9641b6fc6b580adcfa024ae48797ecdeb6825b4d"
dependencies = [
"libc",
"redox_users",
"winapi",
]
[[package]]
name = "getrandom"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4567c8db10ae91089c99af84c68c38da3ec2f087c3f82960bcdbf3656b6f4d7"
dependencies = [
"cfg-if",
"libc",
"wasi",
]
[[package]]
name = "gimli"
version = "0.28.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4271d37baee1b8c7e4b708028c57d816cf9d2434acb33a549475f78c181f6253"
[[package]]
name = "hermit-abi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d231dfb89cfffdbc30e7fc41579ed6066ad03abda9e567ccafae602b97ec5024"
[[package]]
name = "hermit-abi"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fbf6a919d6cf397374f7dfeeea91d974c7c0a7221d0d0f4f20d859d329e53fcc"
[[package]]
name = "iana-time-zone"
version = "0.1.60"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e7ffbb5a1b541ea2561f8c41c087286cc091e21e556a4f09a8f6cbf17b69b141"
dependencies = [
"android_system_properties",
"core-foundation-sys",
"iana-time-zone-haiku",
"js-sys",
"wasm-bindgen",
"windows-core",
]
[[package]]
name = "iana-time-zone-haiku"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f31827a206f56af32e590ba56d5d2d085f558508192593743f16b2306495269f"
dependencies = [
"cc",
]
[[package]]
name = "is-terminal"
version = "0.4.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "261f68e344040fbd0edea105bef17c66edf46f984ddb1115b775ce31be948f4b"
dependencies = [
"hermit-abi 0.4.0",
"libc",
"windows-sys 0.52.0",
]
[[package]]
name = "itoa"
version = "1.0.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d75a2a4b1b190afb6f5425f10f6a8f959d2ea0b9c2b1d79553551850539e4674"
[[package]]
name = "js-sys"
version = "0.3.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "29c15563dc2726973df627357ce0c9ddddbea194836909d655df6a75d2cf296d"
dependencies = [
"wasm-bindgen",
]
[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.167"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09d6582e104315a817dff97f75133544b2e094ee22447d2acf4a74e189ba06fc"
[[package]]
name = "libredox"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c0ff37bd590ca25063e35af745c343cb7a0271906fb7b37e4813e8f79f00268d"
dependencies = [
"bitflags 2.6.0",
"libc",
]
[[package]]
name = "lock_api"
version = "0.4.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3c168f8615b12bc01f9c17e2eb0cc07dcae1940121185446edc3744920e8ef45"
dependencies = [
"autocfg",
"scopeguard",
]
[[package]]
name = "log"
version = "0.4.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "90ed8c1e510134f979dbc4f070f87d4313098b704861a105fe34231c70a3901c"
[[package]]
name = "maplit"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3e2e65a1a2e43cfcb47a895c4c8b10d1f4a61097f9f254f183aee60cad9c651d"
[[package]]
name = "mem-agent"
version = "0.1.0"
dependencies = [
"anyhow",
"async-trait",
"chrono",
"lazy_static",
"maplit",
"nix",
"once_cell",
"page_size",
"slog",
"slog-async",
"slog-scope",
"slog-term",
"tokio",
]
[[package]]
name = "memchr"
version = "2.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "523dc4f511e55ab87b694dc30d0f820d60906ef06413f93d4d7a1385599cc149"
[[package]]
name = "memoffset"
version = "0.6.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5aa361d4faea93603064a027415f07bd8e1d5c88c9fbf68bf56a285428fd79ce"
dependencies = [
"autocfg",
]
[[package]]
name = "miniz_oxide"
version = "0.7.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d811f3e15f28568be3407c8e7fdb6514c1cda3cb30683f15b6a1a1dc4ea14a7"
dependencies = [
"adler",
]
[[package]]
name = "mio"
version = "0.8.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a4a650543ca06a924e8b371db273b2756685faae30f8487da1b56505a8f78b0c"
dependencies = [
"libc",
"wasi",
"windows-sys 0.48.0",
]
[[package]]
name = "nix"
version = "0.23.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f3790c00a0150112de0f4cd161e3d7fc4b2d8a5542ffc35f099a2562aecb35c"
dependencies = [
"bitflags 1.3.2",
"cc",
"cfg-if",
"libc",
"memoffset",
]
[[package]]
name = "num-conv"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "51d515d32fb182ee37cda2ccdcb92950d6a3c2893aa280e540671c2cd0f3b1d9"
[[package]]
name = "num-traits"
version = "0.2.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da0df0e5185db44f69b44f26786fe401b6c293d1907744beaa7fa62b2e5a517a"
dependencies = [
"autocfg",
]
[[package]]
name = "num_cpus"
version = "1.16.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4161fcb6d602d4d2081af7c3a45852d875a03dd337a6bfdd6e06407b61342a43"
dependencies = [
"hermit-abi 0.3.9",
"libc",
]
[[package]]
name = "object"
version = "0.32.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a6a622008b6e321afc04970976f62ee297fdbaa6f95318ca343e3eebb9648441"
dependencies = [
"memchr",
]
[[package]]
name = "once_cell"
version = "1.19.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92"
[[package]]
name = "page_size"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "30d5b2194ed13191c1999ae0704b7839fb18384fa22e49b57eeaa97d79ce40da"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "parking_lot"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3742b2c103b9f06bc9fff0a37ff4912935851bee6d36f3c02bcc755bcfec228f"
dependencies = [
"lock_api",
"parking_lot_core",
]
[[package]]
name = "parking_lot_core"
version = "0.9.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c42a9226546d68acdd9c0a280d17ce19bfe27a46bf68784e4066115788d008e"
dependencies = [
"cfg-if",
"libc",
"redox_syscall",
"smallvec",
"windows-targets 0.48.5",
]
[[package]]
name = "pin-project-lite"
version = "0.2.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8afb450f006bf6385ca15ef45d71d2288452bc3683ce2e2cacc0d18e4be60b58"
[[package]]
name = "powerfmt"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "439ee305def115ba05938db6eb1644ff94165c5ab5e9420d1c1bcedbba909391"
[[package]]
name = "proc-macro2"
version = "1.0.79"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e835ff2298f5721608eb1a980ecaee1aef2c132bf95ecc026a11b7bf3c01c02e"
dependencies = [
"unicode-ident",
]
[[package]]
name = "quote"
version = "1.0.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "291ec9ab5efd934aaf503a6466c5d5251535d108ee747472c3977cc5acc868ef"
dependencies = [
"proc-macro2",
]
[[package]]
name = "redox_syscall"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4722d768eff46b75989dd134e5c353f0d6296e5aaa3132e776cbdb56be7731aa"
dependencies = [
"bitflags 1.3.2",
]
[[package]]
name = "redox_users"
version = "0.4.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ba009ff324d1fc1b900bd1fdb31564febe58a8ccc8a6fdbb93b543d33b13ca43"
dependencies = [
"getrandom",
"libredox",
"thiserror",
]
[[package]]
name = "rustc-demangle"
version = "0.1.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d626bb9dae77e28219937af045c257c28bfd3f69333c512553507f5f9798cb76"
[[package]]
name = "rustversion"
version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0e819f2bc632f285be6d7cd36e25940d45b2391dd6d9b939e79de557f7014248"
[[package]]
name = "scopeguard"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94143f37725109f92c262ed2cf5e59bce7498c01bcc1502d7b9afe439a4e9f49"
[[package]]
name = "serde"
version = "1.0.210"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8e3592472072e6e22e0a54d5904d9febf8508f65fb8552499a1abc7d1078c3a"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.210"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "243902eda00fad750862fc144cea25caca5e20d615af0a81bee94ca738f1df1f"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "signal-hook-registry"
version = "1.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d8229b473baa5980ac72ef434c4415e70c4b5e71b423043adb4ba059f89c99a1"
dependencies = [
"libc",
]
[[package]]
name = "slog"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8347046d4ebd943127157b94d63abb990fcf729dc4e9978927fdf4ac3c998d06"
[[package]]
name = "slog-async"
version = "2.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72c8038f898a2c79507940990f05386455b3a317d8f18d4caea7cbc3d5096b84"
dependencies = [
"crossbeam-channel",
"slog",
"take_mut",
"thread_local",
]
[[package]]
name = "slog-scope"
version = "4.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2f95a4b4c3274cd2869549da82b57ccc930859bdbf5bcea0424bc5f140b3c786"
dependencies = [
"arc-swap",
"lazy_static",
"slog",
]
[[package]]
name = "slog-term"
version = "2.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6e022d0b998abfe5c3782c1f03551a596269450ccd677ea51c56f8b214610e8"
dependencies = [
"is-terminal",
"slog",
"term",
"thread_local",
"time",
]
[[package]]
name = "smallvec"
version = "1.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6ecd384b10a64542d77071bd64bd7b231f4ed5940fba55e98c3de13824cf3d7"
[[package]]
name = "socket2"
version = "0.5.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "05ffd9c0a93b7543e062e759284fcf5f5e3b098501104bfbdde4d404db792871"
dependencies = [
"libc",
"windows-sys 0.52.0",
]
[[package]]
name = "syn"
version = "2.0.52"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b699d15b36d1f02c3e7c69f8ffef53de37aefae075d8488d4ba1a7788d574a07"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "take_mut"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f764005d11ee5f36500a149ace24e00e3da98b0158b3e2d53a7495660d3f4d60"
[[package]]
name = "term"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c59df8ac95d96ff9bede18eb7300b0fda5e5d8d90960e76f8e14ae765eedbf1f"
dependencies = [
"dirs-next",
"rustversion",
"winapi",
]
[[package]]
name = "thiserror"
version = "1.0.65"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d11abd9594d9b38965ef50805c5e469ca9cc6f197f883f717e0269a3057b3d5"
dependencies = [
"thiserror-impl",
]
[[package]]
name = "thiserror-impl"
version = "1.0.65"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ae71770322cbd277e69d762a16c444af02aa0575ac0d174f0b9562d3b37f8602"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "thread_local"
version = "1.1.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8b9ef9bad013ada3808854ceac7b46812a6465ba368859a37e2100283d2d719c"
dependencies = [
"cfg-if",
"once_cell",
]
[[package]]
name = "time"
version = "0.3.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "35e7868883861bd0e56d9ac6efcaaca0d6d5d82a2a7ec8209ff492c07cf37b21"
dependencies = [
"deranged",
"itoa",
"num-conv",
"powerfmt",
"serde",
"time-core",
"time-macros",
]
[[package]]
name = "time-core"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ef927ca75afb808a4d64dd374f00a2adf8d0fcff8e7b184af886c3c87ec4a3f3"
[[package]]
name = "time-macros"
version = "0.2.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2834e6017e3e5e4b9834939793b282bc03b37a3336245fa820e35e233e2a85de"
dependencies = [
"num-conv",
"time-core",
]
[[package]]
name = "tokio"
version = "1.36.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "61285f6515fa018fb2d1e46eb21223fff441ee8db5d0f1435e8ab4f5cdb80931"
dependencies = [
"backtrace",
"bytes",
"libc",
"mio",
"num_cpus",
"parking_lot",
"pin-project-lite",
"signal-hook-registry",
"socket2",
"tokio-macros",
"windows-sys 0.48.0",
]
[[package]]
name = "tokio-macros"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5b8a1e28f2deaa14e508979454cb3a223b10b938b45af148bc0986de36f1923b"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "unicode-ident"
version = "1.0.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b"
[[package]]
name = "wasi"
version = "0.11.0+wasi-snapshot-preview1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9c8d87e72b64a3b4db28d11ce29237c246188f4f51057d65a7eab63b7987e423"
[[package]]
name = "wasm-bindgen"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4be2531df63900aeb2bca0daaaddec08491ee64ceecbee5076636a3b026795a8"
dependencies = [
"cfg-if",
"wasm-bindgen-macro",
]
[[package]]
name = "wasm-bindgen-backend"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "614d787b966d3989fa7bb98a654e369c762374fd3213d212cfc0251257e747da"
dependencies = [
"bumpalo",
"log",
"once_cell",
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-macro"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a1f8823de937b71b9460c0c34e25f3da88250760bec0ebac694b49997550d726"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
]
[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e94f17b526d0a461a191c78ea52bbce64071ed5c04c9ffe424dcb38f74171bb7"
dependencies = [
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-backend",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-shared"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "af190c94f2773fdb3729c55b007a722abb5384da03bc0986df4c289bf5567e96"
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
[[package]]
name = "windows-core"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33ab640c8d7e35bf8ba19b884ba838ceb4fba93a4e8c65a9059d08afcfc683d9"
dependencies = [
"windows-targets 0.52.4",
]
[[package]]
name = "windows-sys"
version = "0.48.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "677d2418bec65e3338edb076e806bc1ec15693c5d0104683f2efe857f61056a9"
dependencies = [
"windows-targets 0.48.5",
]
[[package]]
name = "windows-sys"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
dependencies = [
"windows-targets 0.52.4",
]
[[package]]
name = "windows-targets"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a2fa6e2155d7247be68c096456083145c183cbbbc2764150dda45a87197940c"
dependencies = [
"windows_aarch64_gnullvm 0.48.5",
"windows_aarch64_msvc 0.48.5",
"windows_i686_gnu 0.48.5",
"windows_i686_msvc 0.48.5",
"windows_x86_64_gnu 0.48.5",
"windows_x86_64_gnullvm 0.48.5",
"windows_x86_64_msvc 0.48.5",
]
[[package]]
name = "windows-targets"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7dd37b7e5ab9018759f893a1952c9420d060016fc19a472b4bb20d1bdd694d1b"
dependencies = [
"windows_aarch64_gnullvm 0.52.4",
"windows_aarch64_msvc 0.52.4",
"windows_i686_gnu 0.52.4",
"windows_i686_msvc 0.52.4",
"windows_x86_64_gnu 0.52.4",
"windows_x86_64_gnullvm 0.52.4",
"windows_x86_64_msvc 0.52.4",
]
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2b38e32f0abccf9987a4e3079dfb67dcd799fb61361e53e2882c3cbaf0d905d8"
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bcf46cf4c365c6f2d1cc93ce535f2c8b244591df96ceee75d8e83deb70a9cac9"
[[package]]
name = "windows_aarch64_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc35310971f3b2dbbf3f0690a219f40e2d9afcf64f9ab7cc1be722937c26b4bc"
[[package]]
name = "windows_aarch64_msvc"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da9f259dd3bcf6990b55bffd094c4f7235817ba4ceebde8e6d11cd0c5633b675"
[[package]]
name = "windows_i686_gnu"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a75915e7def60c94dcef72200b9a8e58e5091744960da64ec734a6c6e9b3743e"
[[package]]
name = "windows_i686_gnu"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b474d8268f99e0995f25b9f095bc7434632601028cf86590aea5c8a5cb7801d3"
[[package]]
name = "windows_i686_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f55c233f70c4b27f66c523580f78f1004e8b5a8b659e05a4eb49d4166cca406"
[[package]]
name = "windows_i686_msvc"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1515e9a29e5bed743cb4415a9ecf5dfca648ce85ee42e15873c3cd8610ff8e02"
[[package]]
name = "windows_x86_64_gnu"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "53d40abd2583d23e4718fddf1ebec84dbff8381c07cae67ff7768bbf19c6718e"
[[package]]
name = "windows_x86_64_gnu"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5eee091590e89cc02ad514ffe3ead9eb6b660aedca2183455434b93546371a03"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b7b52767868a23d5bab768e390dc5f5c55825b6d30b86c844ff2dc7414044cc"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77ca79f2451b49fa9e2af39f0747fe999fcda4f5e241b2898624dca97a1f2177"
[[package]]
name = "windows_x86_64_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed94fce61571a4006852b7389a063ab983c02eb1bb37b47f8272ce92d06d9538"
[[package]]
name = "windows_x86_64_msvc"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32b752e52a2da0ddfbdbcc6fceadfeede4c939ed16d13e648833a61dfb611ed8"

src/mem-agent/Cargo.toml Normal file

@@ -0,0 +1,22 @@
[package]
name = "mem-agent"
version = "0.1.0"
edition = "2018"
[dependencies]
slog = "2.5.2"
slog-scope = "4.1.2"
anyhow = "1.0"
page_size = "0.6"
chrono = "0.4"
tokio = { version = "1.33", features = ["full"] }
async-trait = "0.1"
lazy_static = "1.4"
nix = "0.23.2"
[dev-dependencies]
maplit = "1.0"
slog-term = "2.9.0"
slog-async = "2.7"
once_cell = "1.9.0"

src/mem-agent/Makefile Normal file

@@ -0,0 +1,6 @@
# Copyright (C) 2024 Ant group. All rights reserved.
#
# SPDX-License-Identifier: Apache-2.0
default:
cd example; cargo build --examples --target x86_64-unknown-linux-musl

src/mem-agent/example/Cargo.lock generated Normal file

File diff suppressed because it is too large

@@ -0,0 +1,36 @@
[package]
name = "mem-agent-bin"
version = "0.1.0"
edition = "2018"
[dependencies]
slog = "2.5.2"
slog-scope = "4.1.2"
slog-term = "2.9.0"
slog-async = "2.7"
structopt = "0.3"
anyhow = "1.0"
libc = "0.2"
page_size = "0.6"
chrono = "0.4"
maplit = "1.0"
ttrpc = { version = "0.8", features = ["async"] }
tokio = { version = "1.33", features = ["full"] }
async-trait = "0.1"
byteorder = "1.5"
protobuf = "3.1"
lazy_static = "1.4"
# Rust 1.68 doesn't support 0.5.9
home = "=0.5.5"
mem-agent = { path = "../" }
[[example]]
name = "mem-agent-srv"
path = "./srv.rs"
[[example]]
name = "mem-agent-ctl"
path = "./ctl.rs"
[build-dependencies]
ttrpc-codegen = "0.4"


@@ -0,0 +1,29 @@
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use ttrpc_codegen::{Codegen, Customize, ProtobufCustomize};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let protos = vec![
"protocols/protos/mem-agent.proto",
"protocols/protos/google/protobuf/empty.proto",
"protocols/protos/google/protobuf/timestamp.proto",
];
let protobuf_customized = ProtobufCustomize::default().gen_mod_rs(false);
Codegen::new()
.out_dir("protocols/")
.inputs(&protos)
.include("protocols/protos/")
.rust_protobuf()
.customize(Customize {
async_all: true,
..Default::default()
})
.rust_protobuf_customize(protobuf_customized.clone())
.run()?;
Ok(())
}


@@ -0,0 +1,79 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
mod protocols;
mod share;
use anyhow::{anyhow, Result};
use protocols::empty;
use protocols::mem_agent_ttrpc;
use share::option::{CompactSetOption, MemcgSetOption};
use structopt::StructOpt;
use ttrpc::r#async::Client;
#[derive(Debug, StructOpt)]
enum Command {
#[structopt(name = "memcgstatus", about = "get memory cgroup status")]
MemcgStatus,
#[structopt(name = "memcgset", about = "set memory cgroup")]
MemcgSet(MemcgSetOption),
#[structopt(name = "compactset", about = "set compact")]
CompactSet(CompactSetOption),
}
#[derive(StructOpt, Debug)]
#[structopt(name = "mem-agent-ctl", about = "Memory agent controller")]
struct Opt {
#[structopt(long, default_value = "unix:///var/run/mem-agent.sock")]
addr: String,
#[structopt(subcommand)]
command: Command,
}
#[tokio::main]
async fn main() -> Result<()> {
let opt = Opt::from_args();
// setup client
let c = Client::connect(&opt.addr).unwrap();
let client = mem_agent_ttrpc::ControlClient::new(c.clone());
match opt.command {
Command::MemcgStatus => {
let mss = client
.memcg_status(ttrpc::context::with_timeout(0), &empty::Empty::new())
.await
.map_err(|e| anyhow!("client.memcg_status fail: {}", e))?;
for mcg in mss.mem_cgroups {
println!("{:?}", mcg);
for (numa_id, n) in mcg.numa {
if let Some(t) = n.last_inc_time.into_option() {
println!("{} {:?}", numa_id, share::misc::timestamp_to_datetime(t)?);
}
}
}
}
Command::MemcgSet(c) => {
let config = c.to_rpc_memcg_config();
client
.memcg_set(ttrpc::context::with_timeout(0), &config)
.await
.map_err(|e| anyhow!("client.memcg_set fail: {}", e))?;
}
Command::CompactSet(c) => {
let config = c.to_rpc_compact_config();
client
.compact_set(ttrpc::context::with_timeout(0), &config)
.await
.map_err(|e| anyhow!("client.compact_set fail: {}", e))?;
}
}
Ok(())
}


@@ -0,0 +1,8 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
pub mod empty;
pub mod mem_agent;
pub mod mem_agent_ttrpc;
pub mod timestamp;


@@ -0,0 +1,52 @@
// Protocol Buffers - Google's data interchange format
// Copyright 2008 Google Inc. All rights reserved.
// https://developers.google.com/protocol-buffers/
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
syntax = "proto3";
package google.protobuf;
option csharp_namespace = "Google.Protobuf.WellKnownTypes";
option go_package = "types";
option java_package = "com.google.protobuf";
option java_outer_classname = "EmptyProto";
option java_multiple_files = true;
option objc_class_prefix = "GPB";
option cc_enable_arenas = true;
// A generic empty message that you can re-use to avoid defining duplicated
// empty messages in your APIs. A typical example is to use it as the request
// or the response type of an API method. For instance:
//
// service Foo {
// rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
// }
//
// The JSON representation for `Empty` is empty JSON object `{}`.
message Empty {}


@@ -0,0 +1,138 @@
// Protocol Buffers - Google's data interchange format
// Copyright 2008 Google Inc. All rights reserved.
// https://developers.google.com/protocol-buffers/
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
syntax = "proto3";
package google.protobuf;
option csharp_namespace = "Google.Protobuf.WellKnownTypes";
option cc_enable_arenas = true;
option go_package = "github.com/golang/protobuf/ptypes/timestamp";
option java_package = "com.google.protobuf";
option java_outer_classname = "TimestampProto";
option java_multiple_files = true;
option objc_class_prefix = "GPB";
// A Timestamp represents a point in time independent of any time zone or local
// calendar, encoded as a count of seconds and fractions of seconds at
// nanosecond resolution. The count is relative to an epoch at UTC midnight on
// January 1, 1970, in the proleptic Gregorian calendar which extends the
// Gregorian calendar backwards to year one.
//
// All minutes are 60 seconds long. Leap seconds are "smeared" so that no leap
// second table is needed for interpretation, using a [24-hour linear
// smear](https://developers.google.com/time/smear).
//
// The range is from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z. By
// restricting to that range, we ensure that we can convert to and from [RFC
// 3339](https://www.ietf.org/rfc/rfc3339.txt) date strings.
//
// # Examples
//
// Example 1: Compute Timestamp from POSIX `time()`.
//
// Timestamp timestamp;
// timestamp.set_seconds(time(NULL));
// timestamp.set_nanos(0);
//
// Example 2: Compute Timestamp from POSIX `gettimeofday()`.
//
// struct timeval tv;
// gettimeofday(&tv, NULL);
//
// Timestamp timestamp;
// timestamp.set_seconds(tv.tv_sec);
// timestamp.set_nanos(tv.tv_usec * 1000);
//
// Example 3: Compute Timestamp from Win32 `GetSystemTimeAsFileTime()`.
//
// FILETIME ft;
// GetSystemTimeAsFileTime(&ft);
// UINT64 ticks = (((UINT64)ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
//
// // A Windows tick is 100 nanoseconds. Windows epoch 1601-01-01T00:00:00Z
// // is 11644473600 seconds before Unix epoch 1970-01-01T00:00:00Z.
// Timestamp timestamp;
// timestamp.set_seconds((INT64) ((ticks / 10000000) - 11644473600LL));
// timestamp.set_nanos((INT32) ((ticks % 10000000) * 100));
//
// Example 4: Compute Timestamp from Java `System.currentTimeMillis()`.
//
// long millis = System.currentTimeMillis();
//
// Timestamp timestamp = Timestamp.newBuilder().setSeconds(millis / 1000)
// .setNanos((int) ((millis % 1000) * 1000000)).build();
//
//
// Example 5: Compute Timestamp from current time in Python.
//
// timestamp = Timestamp()
// timestamp.GetCurrentTime()
//
// # JSON Mapping
//
// In JSON format, the Timestamp type is encoded as a string in the
// [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format. That is, the
// format is "{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z"
// where {year} is always expressed using four digits while {month}, {day},
// {hour}, {min}, and {sec} are zero-padded to two digits each. The fractional
// seconds, which can go up to 9 digits (i.e. up to 1 nanosecond resolution),
// are optional. The "Z" suffix indicates the timezone ("UTC"); the timezone
// is required. A proto3 JSON serializer should always use UTC (as indicated by
// "Z") when printing the Timestamp type and a proto3 JSON parser should be
// able to accept both UTC and other timezones (as indicated by an offset).
//
// For example, "2017-01-15T01:30:15.01Z" encodes 15.01 seconds past
// 01:30 UTC on January 15, 2017.
//
// In JavaScript, one can convert a Date object to this format using the
// standard
// [toISOString()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString)
// method. In Python, a standard `datetime.datetime` object can be converted
// to this format using
// [`strftime`](https://docs.python.org/2/library/time.html#time.strftime) with
// the time format spec '%Y-%m-%dT%H:%M:%S.%fZ'. Likewise, in Java, one can use
// the Joda Time's [`ISODateTimeFormat.dateTime()`](
// http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTime%2D%2D
// ) to obtain a formatter capable of generating timestamps in this format.
//
//
message Timestamp {
// Represents seconds of UTC time since Unix epoch
// 1970-01-01T00:00:00Z. Must be from 0001-01-01T00:00:00Z to
// 9999-12-31T23:59:59Z inclusive.
int64 seconds = 1;
// Non-negative fractions of a second at nanosecond resolution. Negative
// second values with fractions must still have non-negative nanos values
// that count forward in time. Must be from 0 to 999,999,999
// inclusive.
int32 nanos = 2;
}
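The millisecond-splitting arithmetic in Example 4 can be sketched in Rust; `millis_to_timestamp` is an illustrative helper, not part of any protobuf library. Euclidean division keeps `nanos` non-negative even for pre-epoch inputs, as the `nanos` field comment requires.

```rust
/// Split a millisecond count since the Unix epoch into the
/// (seconds, nanos) pair used by google.protobuf.Timestamp.
/// Euclidean division keeps `nanos` in 0..=999_999_999 even for
/// negative (pre-1970) inputs.
fn millis_to_timestamp(millis: i64) -> (i64, i32) {
    let seconds = millis.div_euclid(1000);
    let nanos = (millis.rem_euclid(1000) * 1_000_000) as i32;
    (seconds, nanos)
}
```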


@@ -0,0 +1,66 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
syntax = "proto3";
package MemAgent;
import "google/protobuf/empty.proto";
import "google/protobuf/timestamp.proto";
service Control {
rpc MemcgStatus(google.protobuf.Empty) returns (MemcgStatusReply);
rpc MemcgSet(MemcgConfig) returns (google.protobuf.Empty);
rpc CompactSet(CompactConfig) returns (google.protobuf.Empty);
}
message EvictionCount {
uint64 page = 1;
uint64 no_min_lru_file = 2;
uint64 min_lru_inc = 3;
uint64 other_error = 4;
uint64 error = 5;
uint64 psi_exceeds_limit = 6;
}
message StatusNuma {
google.protobuf.Timestamp last_inc_time = 1;
uint64 max_seq = 2;
uint64 min_seq = 3;
uint64 run_aging_count = 4;
EvictionCount eviction_count = 5;
}
message MemCgroup {
uint32 id = 1;
uint64 ino = 2;
string path = 3;
uint64 sleep_psi_exceeds_limit = 4;
map<uint32, StatusNuma> numa = 5;
}
message MemcgStatusReply {
repeated MemCgroup mem_cgroups = 1;
}
message MemcgConfig {
optional bool disabled = 1;
optional bool swap = 2;
optional uint32 swappiness_max = 3;
optional uint64 period_secs = 4;
optional uint32 period_psi_percent_limit = 5;
optional uint32 eviction_psi_percent_limit = 6;
optional uint64 eviction_run_aging_count_min = 7;
}
message CompactConfig {
optional bool disabled = 1;
optional uint64 period_secs = 2;
optional uint32 period_psi_percent_limit = 3;
optional uint32 compact_psi_percent_limit = 4;
optional int64 compact_sec_max = 5;
optional uint32 compact_order = 6;
optional uint64 compact_threshold = 7;
optional uint64 compact_force_times = 8;
}
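Every field in `MemcgConfig` and `CompactConfig` is marked `optional`, so the receiver can distinguish "not set" from a zero value and update only the fields the caller supplied. A minimal sketch of that presence-based merge pattern, with illustrative field names rather than the full config:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Config {
    disabled: bool,
    period_secs: u64,
}

#[derive(Default)]
struct OptionConfig {
    disabled: Option<bool>,
    period_secs: Option<u64>,
}

impl Config {
    /// Overwrite only the fields the caller explicitly set;
    /// unset (None) fields keep their current values.
    fn apply(&mut self, opt: &OptionConfig) {
        if let Some(d) = opt.disabled {
            self.disabled = d;
        }
        if let Some(p) = opt.period_secs {
            self.period_secs = p;
        }
    }
}
```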


@@ -0,0 +1,29 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use anyhow::{anyhow, Result};
use chrono::{DateTime, LocalResult, TimeZone, Utc};
use protobuf::well_known_types::timestamp::Timestamp;
pub fn datatime_to_timestamp(dt: DateTime<Utc>) -> Timestamp {
let seconds = dt.timestamp();
let nanos = dt.timestamp_subsec_nanos();
Timestamp {
seconds,
nanos: nanos as i32,
..Default::default()
}
}
#[allow(dead_code)]
pub fn timestamp_to_datetime(timestamp: Timestamp) -> Result<DateTime<Utc>> {
let seconds = timestamp.seconds;
let nanos = timestamp.nanos;
match Utc.timestamp_opt(seconds, nanos as u32) {
LocalResult::Single(t) => Ok(t),
_ => Err(anyhow!("Utc.timestamp_opt {} fail", timestamp)),
}
}
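The conversion above splits a `chrono` `DateTime` into whole seconds plus sub-second nanoseconds. The same split can be shown with only the standard library (`SystemTime` stands in for `DateTime<Utc>`; these helper names are illustrative):

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Split a SystemTime into the (seconds, nanos) pair that
/// a protobuf Timestamp carries. Pre-epoch times collapse to zero
/// here for simplicity.
fn system_time_to_parts(t: SystemTime) -> (u64, u32) {
    let d = t.duration_since(UNIX_EPOCH).unwrap_or(Duration::ZERO);
    (d.as_secs(), d.subsec_nanos())
}

/// Inverse direction, mirroring `timestamp_to_datetime`.
fn parts_to_system_time(secs: u64, nanos: u32) -> SystemTime {
    UNIX_EPOCH + Duration::new(secs, nanos)
}
```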


@@ -0,0 +1,7 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
pub mod misc;
pub mod option;
pub mod rpc;


@@ -0,0 +1,146 @@
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::protocols::mem_agent as rpc;
use structopt::StructOpt;
#[derive(Debug, StructOpt)]
pub struct MemcgSetOption {
#[structopt(long)]
memcg_disabled: Option<bool>,
#[structopt(long)]
memcg_swap: Option<bool>,
#[structopt(long)]
memcg_swappiness_max: Option<u8>,
#[structopt(long)]
memcg_period_secs: Option<u64>,
#[structopt(long)]
memcg_period_psi_percent_limit: Option<u8>,
#[structopt(long)]
memcg_eviction_psi_percent_limit: Option<u8>,
#[structopt(long)]
memcg_eviction_run_aging_count_min: Option<u64>,
}
impl MemcgSetOption {
#[allow(dead_code)]
pub fn to_rpc_memcg_config(&self) -> rpc::MemcgConfig {
let config = rpc::MemcgConfig {
disabled: self.memcg_disabled,
swap: self.memcg_swap,
swappiness_max: self.memcg_swappiness_max.map(|v| v as u32),
period_secs: self.memcg_period_secs,
period_psi_percent_limit: self.memcg_period_psi_percent_limit.map(|v| v as u32),
eviction_psi_percent_limit: self.memcg_eviction_psi_percent_limit.map(|v| v as u32),
eviction_run_aging_count_min: self.memcg_eviction_run_aging_count_min,
..Default::default()
};
config
}
#[allow(dead_code)]
pub fn to_mem_agent_memcg_config(&self) -> mem_agent::memcg::Config {
let mut config = mem_agent::memcg::Config::default();
if let Some(v) = self.memcg_disabled {
config.disabled = v;
}
if let Some(v) = self.memcg_swap {
config.swap = v;
}
if let Some(v) = self.memcg_swappiness_max {
config.swappiness_max = v;
}
if let Some(v) = self.memcg_period_secs {
config.period_secs = v;
}
if let Some(v) = self.memcg_period_psi_percent_limit {
config.period_psi_percent_limit = v;
}
if let Some(v) = self.memcg_eviction_psi_percent_limit {
config.eviction_psi_percent_limit = v;
}
if let Some(v) = self.memcg_eviction_run_aging_count_min {
config.eviction_run_aging_count_min = v;
}
config
}
}
#[derive(Debug, StructOpt)]
pub struct CompactSetOption {
#[structopt(long)]
compact_disabled: Option<bool>,
#[structopt(long)]
compact_period_secs: Option<u64>,
#[structopt(long)]
compact_period_psi_percent_limit: Option<u8>,
#[structopt(long)]
compact_psi_percent_limit: Option<u8>,
#[structopt(long)]
compact_sec_max: Option<i64>,
#[structopt(long)]
compact_order: Option<u8>,
#[structopt(long)]
compact_threshold: Option<u64>,
#[structopt(long)]
compact_force_times: Option<u64>,
}
impl CompactSetOption {
#[allow(dead_code)]
pub fn to_rpc_compact_config(&self) -> rpc::CompactConfig {
let config = rpc::CompactConfig {
disabled: self.compact_disabled,
period_secs: self.compact_period_secs,
period_psi_percent_limit: self.compact_period_psi_percent_limit.map(|v| v as u32),
compact_psi_percent_limit: self.compact_psi_percent_limit.map(|v| v as u32),
compact_sec_max: self.compact_sec_max,
compact_order: self.compact_order.map(|v| v as u32),
compact_threshold: self.compact_threshold,
compact_force_times: self.compact_force_times,
..Default::default()
};
config
}
#[allow(dead_code)]
pub fn to_mem_agent_compact_config(&self) -> mem_agent::compact::Config {
let mut config = mem_agent::compact::Config::default();
if let Some(v) = self.compact_disabled {
config.disabled = v;
}
if let Some(v) = self.compact_period_secs {
config.period_secs = v;
}
if let Some(v) = self.compact_period_psi_percent_limit {
config.period_psi_percent_limit = v;
}
if let Some(v) = self.compact_psi_percent_limit {
config.compact_psi_percent_limit = v;
}
if let Some(v) = self.compact_sec_max {
config.compact_sec_max = v;
}
if let Some(v) = self.compact_order {
config.compact_order = v;
}
if let Some(v) = self.compact_threshold {
config.compact_threshold = v;
}
if let Some(v) = self.compact_force_times {
config.compact_force_times = v;
}
config
}
}


@@ -0,0 +1,221 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::protocols::mem_agent as rpc_mem_agent;
use crate::protocols::{empty, mem_agent_ttrpc};
use anyhow::{anyhow, Result};
use async_trait::async_trait;
use mem_agent::{agent, compact, memcg};
use slog_scope::{error, info};
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::sync::Arc;
use tokio::signal::unix::{signal, SignalKind};
use ttrpc::asynchronous::Server;
use ttrpc::error::Error;
use ttrpc::proto::Code;
#[derive(Debug)]
pub struct MyControl {
agent: agent::MemAgent,
}
impl MyControl {
#[allow(dead_code)]
pub fn new(agent: agent::MemAgent) -> Self {
Self { agent }
}
}
fn mem_cgroup_to_mem_cgroup_rpc(mcg: &memcg::MemCgroup) -> rpc_mem_agent::MemCgroup {
rpc_mem_agent::MemCgroup {
id: mcg.id as u32,
ino: mcg.ino as u64,
path: mcg.path.clone(),
sleep_psi_exceeds_limit: mcg.sleep_psi_exceeds_limit,
numa: mcg
.numa
.iter()
.map(|(numa_id, n)| {
(
*numa_id,
rpc_mem_agent::StatusNuma {
last_inc_time: protobuf::MessageField::some(
crate::share::misc::datatime_to_timestamp(n.last_inc_time),
),
max_seq: n.max_seq,
min_seq: n.min_seq,
run_aging_count: n.run_aging_count,
eviction_count: protobuf::MessageField::some(
rpc_mem_agent::EvictionCount {
page: n.eviction_count.page,
no_min_lru_file: n.eviction_count.no_min_lru_file,
min_lru_inc: n.eviction_count.min_lru_inc,
other_error: n.eviction_count.other_error,
error: n.eviction_count.error,
psi_exceeds_limit: n.eviction_count.psi_exceeds_limit,
..Default::default()
},
),
..Default::default()
},
)
})
.collect(),
..Default::default()
}
}
fn mem_cgroups_to_memcg_status_reply(
mgs: Vec<memcg::MemCgroup>,
) -> rpc_mem_agent::MemcgStatusReply {
let mem_cgroups: Vec<rpc_mem_agent::MemCgroup> =
mgs.iter().map(mem_cgroup_to_mem_cgroup_rpc).collect();
rpc_mem_agent::MemcgStatusReply {
mem_cgroups,
..Default::default()
}
}
fn memcgconfig_to_memcg_optionconfig(mc: &rpc_mem_agent::MemcgConfig) -> memcg::OptionConfig {
let moc = memcg::OptionConfig {
disabled: mc.disabled,
swap: mc.swap,
swappiness_max: mc.swappiness_max.map(|val| val as u8),
period_secs: mc.period_secs,
period_psi_percent_limit: mc.period_psi_percent_limit.map(|val| val as u8),
eviction_psi_percent_limit: mc.eviction_psi_percent_limit.map(|val| val as u8),
eviction_run_aging_count_min: mc.eviction_run_aging_count_min,
..Default::default()
};
moc
}
fn compactconfig_to_compact_optionconfig(
cc: &rpc_mem_agent::CompactConfig,
) -> compact::OptionConfig {
let coc = compact::OptionConfig {
disabled: cc.disabled,
period_secs: cc.period_secs,
period_psi_percent_limit: cc.period_psi_percent_limit.map(|val| val as u8),
compact_psi_percent_limit: cc.compact_psi_percent_limit.map(|val| val as u8),
compact_sec_max: cc.compact_sec_max,
compact_order: cc.compact_order.map(|val| val as u8),
compact_threshold: cc.compact_threshold,
compact_force_times: cc.compact_force_times,
..Default::default()
};
coc
}
#[async_trait]
impl mem_agent_ttrpc::Control for MyControl {
async fn memcg_status(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
_: empty::Empty,
) -> ::ttrpc::Result<rpc_mem_agent::MemcgStatusReply> {
Ok(mem_cgroups_to_memcg_status_reply(
self.agent.memcg_status_async().await.map_err(|e| {
let estr = format!("agent.memcg_status_async fail: {}", e);
error!("{}", estr);
Error::RpcStatus(ttrpc::get_status(Code::INTERNAL, estr))
})?,
))
}
async fn memcg_set(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
mc: rpc_mem_agent::MemcgConfig,
) -> ::ttrpc::Result<empty::Empty> {
self.agent
.memcg_set_config_async(memcgconfig_to_memcg_optionconfig(&mc))
.await
.map_err(|e| {
let estr = format!("agent.memcg_set_config_async fail: {}", e);
error!("{}", estr);
Error::RpcStatus(ttrpc::get_status(Code::INTERNAL, estr))
})?;
Ok(empty::Empty::new())
}
async fn compact_set(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
cc: rpc_mem_agent::CompactConfig,
) -> ::ttrpc::Result<empty::Empty> {
self.agent
.compact_set_config_async(compactconfig_to_compact_optionconfig(&cc))
.await
.map_err(|e| {
let estr = format!("agent.compact_set_config_async fail: {}", e);
error!("{}", estr);
Error::RpcStatus(ttrpc::get_status(Code::INTERNAL, estr))
})?;
Ok(empty::Empty::new())
}
}
#[allow(dead_code)]
#[tokio::main]
pub async fn rpc_loop(agent: agent::MemAgent, addr: String) -> Result<()> {
let path = addr
.strip_prefix("unix://")
.ok_or_else(|| anyhow!("invalid addr {}: expected a unix:// prefix", addr))?;
if std::path::Path::new(path).exists() {
return Err(anyhow!("addr {} already exists", addr));
}
let control = MyControl::new(agent);
let c = Box::new(control) as Box<dyn mem_agent_ttrpc::Control + Send + Sync>;
let c = Arc::new(c);
let service = mem_agent_ttrpc::create_control(c);
let mut server = Server::new()
.bind(&addr)
.map_err(|e| anyhow!("server bind {} fail: {}", addr, e))?
.register_service(service);
let metadata = fs::metadata(path).map_err(|e| anyhow!("fs::metadata {} fail: {}", path, e))?;
let mut permissions = metadata.permissions();
permissions.set_mode(0o600);
fs::set_permissions(path, permissions)
.map_err(|e| anyhow!("fs::set_permissions {} fail: {}", path, e))?;
let mut interrupt = signal(SignalKind::interrupt())
.map_err(|e| anyhow!("signal(SignalKind::interrupt()) fail: {}", e))?;
let mut quit = signal(SignalKind::quit())
.map_err(|e| anyhow!("signal(SignalKind::quit()) fail: {}", e))?;
let mut terminate = signal(SignalKind::terminate())
.map_err(|e| anyhow!("signal(SignalKind::terminate()) fail: {}", e))?;
server
.start()
.await
.map_err(|e| anyhow!("server.start() fail: {}", e))?;
tokio::select! {
_ = interrupt.recv() => {
info!("mem-agent: interrupt shutdown");
}
_ = quit.recv() => {
info!("mem-agent: quit shutdown");
}
_ = terminate.recv() => {
info!("mem-agent: terminate shutdown");
}
};
server
.shutdown()
.await
.map_err(|e| anyhow!("server.shutdown() fail: {}", e))?;
fs::remove_file(&path).map_err(|e| anyhow!("fs::remove_file {} fail: {}", path, e))?;
Ok(())
}
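rpc_loop binds the socket first and tightens its permissions afterwards, so there is a brief window in which the socket carries the default mode. The same bind-then-chmod sequence with only the standard library (the helper name and path are illustrative; unlike rpc_loop, this sketch removes a stale socket instead of refusing to start):

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;

/// Bind a unix socket and restrict it to the owner (0600),
/// mirroring the permission tightening done in rpc_loop.
fn bind_private_socket(path: &str) -> std::io::Result<UnixListener> {
    if std::path::Path::new(path).exists() {
        fs::remove_file(path)?; // rpc_loop instead refuses to start
    }
    let listener = UnixListener::bind(path)?;
    let mut perms = fs::metadata(path)?.permissions();
    perms.set_mode(0o600);
    fs::set_permissions(path, perms)?;
    Ok(listener)
}
```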


@@ -0,0 +1,95 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use anyhow::{anyhow, Result};
use share::option::{CompactSetOption, MemcgSetOption};
use slog::{Drain, Level, Logger};
use slog_async;
use slog_scope::set_global_logger;
use slog_scope::{error, info};
use slog_term;
use std::fs::OpenOptions;
use std::io::BufWriter;
use structopt::StructOpt;
mod protocols;
mod share;
#[derive(StructOpt, Debug)]
#[structopt(name = "mem-agent", about = "Memory agent")]
struct Opt {
#[structopt(long, default_value = "unix:///var/run/mem-agent.sock")]
addr: String,
#[structopt(long)]
log_file: Option<String>,
#[structopt(long, default_value = "trace", parse(try_from_str = parse_slog_level))]
log_level: Level,
#[structopt(flatten)]
memcg: MemcgSetOption,
#[structopt(flatten)]
compact: CompactSetOption,
}
fn parse_slog_level(src: &str) -> Result<Level, String> {
match src.to_lowercase().as_str() {
"trace" => Ok(Level::Trace),
"debug" => Ok(Level::Debug),
"info" => Ok(Level::Info),
"warning" => Ok(Level::Warning),
"warn" => Ok(Level::Warning),
"error" => Ok(Level::Error),
_ => Err(format!("Invalid log level: {}", src)),
}
}
fn setup_logging(opt: &Opt) -> Result<slog_scope::GlobalLoggerGuard> {
let drain = if let Some(f) = &opt.log_file {
let log_file = OpenOptions::new()
.create(true)
.write(true)
.append(true)
.open(f)
.map_err(|e| anyhow!("Open log file {} fail: {}", f, e))?;
let buffered = BufWriter::new(log_file);
let decorator = slog_term::PlainDecorator::new(buffered);
let drain = slog_term::CompactFormat::new(decorator)
.build()
.filter_level(opt.log_level)
.fuse();
slog_async::Async::new(drain).build().fuse()
} else {
let decorator = slog_term::TermDecorator::new().stderr().build();
let drain = slog_term::CompactFormat::new(decorator)
.build()
.filter_level(opt.log_level)
.fuse();
slog_async::Async::new(drain).build().fuse()
};
let logger = Logger::root(drain, slog::o!());
Ok(set_global_logger(logger.clone()))
}
fn main() -> Result<()> {
// Check opt
let opt = Opt::from_args();
let _ = setup_logging(&opt).map_err(|e| anyhow!("setup_logging fail: {}", e))?;
let memcg_config = opt.memcg.to_mem_agent_memcg_config();
let compact_config = opt.compact.to_mem_agent_compact_config();
let (ma, _rt) = mem_agent::agent::MemAgent::new(memcg_config, compact_config)
.map_err(|e| anyhow!("MemAgent::new fail: {}", e))?;
info!("MemAgent started");
share::rpc::rpc_loop(ma, opt.addr).map_err(|e| {
let estr = format!("rpc::rpc_loop fail: {}", e);
error!("{}", estr);
anyhow!("{}", estr)
})?;
Ok(())
}

src/mem-agent/src/agent.rs (new file, 378 lines)

@@ -0,0 +1,378 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::compact;
use crate::memcg::{self, MemCgroup};
use crate::{error, info};
use anyhow::{anyhow, Result};
use std::thread;
use tokio::runtime::{Builder, Runtime};
use tokio::select;
use tokio::sync::mpsc;
use tokio::sync::oneshot;
use tokio::time::{sleep, Duration, Instant};
const AGENT_WORK_ERROR_SLEEP_SECS: u64 = 5 * 60;
#[derive(Debug)]
enum AgentCmd {
MemcgStatus,
MemcgSet(memcg::OptionConfig),
CompactSet(compact::OptionConfig),
}
#[allow(dead_code)]
#[derive(Debug)]
enum AgentReturn {
Ok,
Err(anyhow::Error),
MemcgStatus(Vec<memcg::MemCgroup>),
}
async fn handle_agent_cmd(
cmd: AgentCmd,
ret_tx: oneshot::Sender<AgentReturn>,
memcg: &mut memcg::MemCG,
comp: &mut compact::Compact,
) -> Result<bool> {
#[allow(unused_assignments)]
let mut ret_msg = AgentReturn::Ok;
let need_reset_mas = match cmd {
AgentCmd::MemcgStatus => {
ret_msg = AgentReturn::MemcgStatus(memcg.get_status().await);
false
}
AgentCmd::MemcgSet(opt) => memcg.set_config(opt).await,
AgentCmd::CompactSet(opt) => comp.set_config(opt).await,
};
ret_tx
.send(ret_msg)
.map_err(|e| anyhow!("ret_tx.send failed: {:?}", e))?;
Ok(need_reset_mas)
}
fn get_remaining_tokio_duration(memcg: &memcg::MemCG, comp: &compact::Compact) -> Duration {
let memcg_d = memcg.get_remaining_tokio_duration();
let comp_d = comp.get_remaining_tokio_duration();
memcg_d.min(comp_d)
}
async fn async_get_remaining_tokio_duration(
memcg: &memcg::MemCG,
comp: &compact::Compact,
) -> Duration {
let memcg_f = memcg.async_get_remaining_tokio_duration();
let comp_f = comp.async_get_remaining_tokio_duration();
let memcg_d = memcg_f.await;
let comp_d = comp_f.await;
memcg_d.min(comp_d)
}
fn agent_work(mut memcg: memcg::MemCG, mut comp: compact::Compact) -> Result<Duration> {
let memcg_need_reset = if memcg.need_work() {
info!("memcg.work start");
memcg
.work()
.map_err(|e| anyhow!("memcg.work failed: {}", e))?;
info!("memcg.work stop");
true
} else {
false
};
let compact_need_reset = if comp.need_work() {
info!("compact.work start");
comp.work()
.map_err(|e| anyhow!("comp.work failed: {}", e))?;
info!("compact.work stop");
true
} else {
false
};
if memcg_need_reset {
memcg.reset_timer();
}
if compact_need_reset {
comp.reset_timer();
}
Ok(get_remaining_tokio_duration(&memcg, &comp))
}
struct MemAgentSleep {
duration: Duration,
start_wait_time: Instant,
timeout: bool,
}
impl MemAgentSleep {
fn new() -> Self {
Self {
duration: Duration::MAX,
start_wait_time: Instant::now(),
timeout: true,
}
}
fn set_timeout(&mut self) {
self.duration = Duration::MAX;
self.timeout = true;
}
fn set_sleep(&mut self, d: Duration) {
self.duration = d;
self.start_wait_time = Instant::now();
}
/* Return true if timeout */
fn refresh(&mut self) -> bool {
if self.duration != Duration::MAX {
let elapsed = self.start_wait_time.elapsed();
if self.duration > elapsed {
self.duration -= elapsed;
} else {
/* timeout */
self.set_timeout();
return true;
}
}
false
}
}
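MemAgentSleep::refresh measures elapsed wall time internally, which makes it awkward to test. A pure sketch of the same countdown logic, with the elapsed time passed in explicitly (an illustrative refactoring, not the crate's API):

```rust
use std::time::Duration;

/// Pure version of MemAgentSleep::refresh: subtract the elapsed
/// time from the remaining budget; return true (timeout) when the
/// budget is exhausted. Duration::MAX means "no deadline".
fn refresh(remaining: Duration, elapsed: Duration) -> (Duration, bool) {
    if remaining == Duration::MAX {
        return (remaining, false);
    }
    if remaining > elapsed {
        (remaining - elapsed, false)
    } else {
        (Duration::MAX, true) // timed out: back to the idle state
    }
}
```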
async fn mem_agent_loop(
mut cmd_rx: mpsc::Receiver<(AgentCmd, oneshot::Sender<AgentReturn>)>,
mut memcg: memcg::MemCG,
mut comp: compact::Compact,
) -> Result<()> {
let (work_ret_tx, mut work_ret_rx) = mpsc::channel(2);
// Tracks how long to sleep before the next work cycle.
let mut mas = MemAgentSleep::new();
loop {
if mas.timeout {
let thread_memcg = memcg.clone();
let thread_comp = comp.clone();
let thread_work_ret_tx = work_ret_tx.clone();
thread::spawn(move || {
info!("agent work thread start");
let d = agent_work(thread_memcg, thread_comp).unwrap_or_else(|err| {
error!("agent work thread fail {}", err);
Duration::from_secs(AGENT_WORK_ERROR_SLEEP_SECS)
});
if let Err(e) = thread_work_ret_tx.blocking_send(d) {
error!("work_ret_tx.blocking_send failed: {}", e);
}
});
mas.timeout = false;
} else if mas.refresh() {
continue;
}
info!("mem_agent_loop wait timeout {:?}", mas.duration);
select! {
Some((cmd, ret_tx)) = cmd_rx.recv() => {
if handle_agent_cmd(cmd, ret_tx, &mut memcg, &mut comp).await.map_err(|e| anyhow!("handle_agent_cmd failed: {}", e))? && !mas.timeout{
mas.set_sleep(async_get_remaining_tokio_duration(&memcg, &comp).await);
}
}
d = work_ret_rx.recv() => {
info!("agent work thread stop");
mas.set_sleep(d.unwrap_or(Duration::from_secs(AGENT_WORK_ERROR_SLEEP_SECS)));
}
_ = async {
sleep(mas.duration).await;
} => {
mas.set_timeout();
}
}
}
}
#[derive(Clone, Debug)]
pub struct MemAgent {
cmd_tx: mpsc::Sender<(AgentCmd, oneshot::Sender<AgentReturn>)>,
}
impl MemAgent {
pub fn new(
memcg_config: memcg::Config,
compact_config: compact::Config,
) -> Result<(Self, Runtime)> {
let mg = memcg::MemCG::new(memcg_config)
.map_err(|e| anyhow!("memcg::MemCG::new fail: {}", e))?;
let comp = compact::Compact::new(compact_config)
.map_err(|e| anyhow!("compact::Compact::new fail: {}", e))?;
let (cmd_tx, cmd_rx) = mpsc::channel(10);
let runtime = Builder::new_multi_thread()
.worker_threads(1)
.enable_all()
.build()
.map_err(|e| anyhow!("Builder::new_multi_thread failed: {}", e))?;
runtime.spawn(async move {
info!("mem-agent start");
match mem_agent_loop(cmd_rx, mg, comp).await {
Err(e) => error!("mem-agent error {}", e),
Ok(()) => info!("mem-agent stop"),
}
});
Ok((Self { cmd_tx }, runtime))
}
async fn send_cmd_async(&self, cmd: AgentCmd) -> Result<AgentReturn> {
let (ret_tx, ret_rx) = oneshot::channel();
self.cmd_tx
.send((cmd, ret_tx))
.await
.map_err(|e| anyhow!("cmd_tx.send cmd failed: {}", e))?;
let ret = ret_rx
.await
.map_err(|e| anyhow!("ret_rx.recv failed: {}", e))?;
Ok(ret)
}
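send_cmd_async is a request/reply handshake: every command is paired with its own single-use reply channel, so responses can never be mismatched. The same pattern with std threads and channels (illustrative types, not the crate's API):

```rust
use std::sync::mpsc;
use std::thread;

enum Cmd {
    Add(u64),
    Total,
}

/// Spawn a worker that owns the state; callers send a command
/// paired with a dedicated reply channel, mirroring the
/// (AgentCmd, oneshot::Sender<AgentReturn>) tuples MemAgent uses.
fn spawn_worker() -> mpsc::Sender<(Cmd, mpsc::Sender<u64>)> {
    let (tx, rx) = mpsc::channel::<(Cmd, mpsc::Sender<u64>)>();
    thread::spawn(move || {
        let mut total = 0u64;
        for (cmd, reply) in rx {
            match cmd {
                Cmd::Add(n) => {
                    total += n;
                    let _ = reply.send(total);
                }
                Cmd::Total => {
                    let _ = reply.send(total);
                }
            }
        }
    });
    tx
}
```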
pub async fn memcg_set_config_async(&self, opt: memcg::OptionConfig) -> Result<()> {
let ret = self
.send_cmd_async(AgentCmd::MemcgSet(opt))
.await
.map_err(|e| anyhow!("send_cmd failed: {}", e))?;
match ret {
AgentReturn::Err(e) => Err(anyhow!(
"mem_agent thread memcg_set_config_async failed: {}",
e
)),
AgentReturn::Ok => Ok(()),
_ => Err(anyhow!(
"mem_agent thread memcg_set_config_async return wrong value"
)),
}
}
pub async fn compact_set_config_async(&self, opt: compact::OptionConfig) -> Result<()> {
let ret = self
.send_cmd_async(AgentCmd::CompactSet(opt))
.await
.map_err(|e| anyhow!("send_cmd failed: {}", e))?;
match ret {
AgentReturn::Err(e) => Err(anyhow!(
"mem_agent thread compact_set_config_async failed: {}",
e
)),
AgentReturn::Ok => Ok(()),
_ => Err(anyhow!(
"mem_agent thread compact_set_config_async return wrong value"
)),
}
}
pub async fn memcg_status_async(&self) -> Result<Vec<MemCgroup>> {
let ret = self
.send_cmd_async(AgentCmd::MemcgStatus)
.await
.map_err(|e| anyhow!("send_cmd failed: {}", e))?;
let status = match ret {
AgentReturn::Err(e) => {
return Err(anyhow!("mem_agent thread memcg_status_async failed: {}", e))
}
AgentReturn::Ok => {
return Err(anyhow!(
"mem_agent thread memcg_status_async return wrong value"
))
}
AgentReturn::MemcgStatus(s) => s,
};
Ok(status)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_agent() {
let memcg_config = memcg::Config {
disabled: true,
..Default::default()
};
let compact_config = compact::Config {
disabled: true,
..Default::default()
};
let (ma, _rt) = MemAgent::new(memcg_config, compact_config).unwrap();
tokio::runtime::Runtime::new()
.unwrap()
.block_on({
let memcg_config = memcg::OptionConfig {
period_secs: Some(120),
..Default::default()
};
ma.memcg_set_config_async(memcg_config)
})
.unwrap();
tokio::runtime::Runtime::new()
.unwrap()
.block_on({
let compact_config = compact::OptionConfig {
period_secs: Some(280),
..Default::default()
};
ma.compact_set_config_async(compact_config)
})
.unwrap();
}
#[test]
fn test_agent_memcg_status() {
let memcg_config = memcg::Config {
disabled: true,
..Default::default()
};
let compact_config = compact::Config {
disabled: true,
..Default::default()
};
let (ma, _rt) = MemAgent::new(memcg_config, compact_config).unwrap();
tokio::runtime::Runtime::new()
.unwrap()
.block_on(ma.memcg_status_async())
.unwrap();
}
}


@@ -0,0 +1,446 @@
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::proc;
use crate::psi;
use crate::timer::Timeout;
use crate::{debug, error, info, trace};
use anyhow::{anyhow, Result};
use nix::sched::sched_yield;
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::PathBuf;
use std::process::Command;
use std::sync::Arc;
use std::thread;
use std::time::Duration;
use tokio::sync::RwLock;
use tokio::time::Duration as TokioDuration;
const PAGE_REPORTING_MIN_ORDER: u8 = 9;
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
pub disabled: bool,
pub psi_path: PathBuf,
pub period_secs: u64,
pub period_psi_percent_limit: u8,
pub compact_psi_percent_limit: u8,
pub compact_sec_max: i64,
// The page order that compaction should make available.
pub compact_order: u8,
// compact_threshold is a number of pages.
// When examining /proc/pagetypeinfo, if the number of free movable
// pages in orders smaller than compact_order has increased by more
// than compact_threshold pages since the previous compaction, or if
// the amount of free memory has decreased by compact_threshold pages
// since the previous compaction, then the system should initiate
// another round of memory compaction.
pub compact_threshold: u64,
// After one compaction, if no further compaction has occurred within
// the next compact_force_times periods, a compaction is forced
// regardless of the system's memory situation.
// If compact_force_times is 0, a compaction is forced every period.
// If compact_force_times is u64::MAX, compaction is never forced.
pub compact_force_times: u64,
}
impl Default for Config {
fn default() -> Self {
Self {
disabled: false,
psi_path: PathBuf::from(""),
period_secs: 10 * 60,
period_psi_percent_limit: 1,
compact_psi_percent_limit: 5,
compact_sec_max: 30 * 60,
compact_order: PAGE_REPORTING_MIN_ORDER,
compact_threshold: 2 << PAGE_REPORTING_MIN_ORDER,
compact_force_times: std::u64::MAX,
}
}
}
#[derive(Debug, Clone, Default)]
pub struct OptionConfig {
pub disabled: Option<bool>,
pub psi_path: Option<PathBuf>,
pub period_secs: Option<u64>,
pub period_psi_percent_limit: Option<u8>,
pub compact_psi_percent_limit: Option<u8>,
pub compact_sec_max: Option<i64>,
pub compact_order: Option<u8>,
pub compact_threshold: Option<u64>,
pub compact_force_times: Option<u64>,
}
#[derive(Debug, Clone)]
struct CompactCore {
timeout: Timeout,
config: Config,
psi: psi::Period,
force_counter: u64,
prev_free_movable_pages_after_compact: u64,
prev_memfree_kb: u64,
}
impl CompactCore {
fn new(config: Config) -> Self {
Self {
timeout: Timeout::new(config.period_secs),
psi: psi::Period::new(&config.psi_path, true),
force_counter: 0,
prev_free_movable_pages_after_compact: 0,
prev_memfree_kb: 0,
config,
}
}
fn psi_ok(&mut self) -> bool {
if crate::misc::is_test_environment() {
return false;
}
let percent = match self.psi.get_percent() {
Ok(v) => v,
Err(e) => {
debug!("psi.get_percent failed: {}", e);
return false;
}
};
if percent > self.config.period_psi_percent_limit as u64 {
info!(
"compact will not work because period psi {}% exceeds limit",
percent
);
false
} else {
true
}
}
fn need_force_compact(&self) -> bool {
if self.config.compact_force_times == std::u64::MAX {
return false;
}
self.force_counter >= self.config.compact_force_times
}
fn check_compact_threshold(&self, memfree_kb: u64, free_movable_pages: u64) -> bool {
if self.prev_memfree_kb > memfree_kb + (self.config.compact_threshold << 2) {
return true;
}
let threshold = self.config.compact_threshold + self.prev_free_movable_pages_after_compact;
if free_movable_pages > threshold {
true
} else {
info!(
"compact will not work because free movable pages {} less than threshold {} and prev_free {}kB current_free {}kB",
free_movable_pages, threshold, self.prev_memfree_kb, memfree_kb
);
false
}
}
fn get_special_psi(&self) -> psi::Period {
psi::Period::new(&self.config.psi_path, true)
}
fn set_prev(&mut self, memfree_kb: u64, free_movable_pages: u64) {
self.prev_memfree_kb = memfree_kb;
self.prev_free_movable_pages_after_compact = free_movable_pages;
}
fn set_disabled(&mut self, disabled: bool) {
if !disabled {
self.timeout.reset();
}
self.config.disabled = disabled;
}
// Returns whether the mem-agent sleep duration needs to be reset.
fn set_config(&mut self, new_config: OptionConfig) -> bool {
let mut need_reset_mas = false;
if let Some(d) = new_config.disabled {
if self.config.disabled != d {
self.set_disabled(d);
need_reset_mas = true;
}
}
if let Some(p) = new_config.psi_path {
self.config.psi_path = p.clone();
}
if let Some(p) = new_config.period_psi_percent_limit {
self.config.period_psi_percent_limit = p;
}
if let Some(p) = new_config.compact_psi_percent_limit {
self.config.compact_psi_percent_limit = p;
}
if let Some(p) = new_config.compact_sec_max {
self.config.compact_sec_max = p;
}
if let Some(p) = new_config.compact_order {
self.config.compact_order = p;
}
if let Some(p) = new_config.compact_threshold {
self.config.compact_threshold = p;
}
if let Some(p) = new_config.compact_force_times {
self.config.compact_force_times = p;
}
if let Some(p) = new_config.period_secs {
self.config.period_secs = p;
self.timeout.set_sleep_duration(p);
if !self.config.disabled {
need_reset_mas = true;
}
}
info!("new compact config: {:#?}", self.config);
if need_reset_mas {
info!("need reset mem-agent sleep");
}
need_reset_mas
}
fn need_work(&self) -> bool {
if self.config.disabled {
return false;
}
self.timeout.is_timeout()
}
pub fn get_remaining_tokio_duration(&self) -> TokioDuration {
if self.config.disabled {
return TokioDuration::MAX;
}
self.timeout.remaining_tokio_duration()
}
}
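The two triggers described in the Config comments and implemented by check_compact_threshold() can be restated in isolation. A minimal sketch, not part of the patch, assuming 4 KiB pages (which is why a page count shifted left by 2 becomes a kB count):

```rust
// Standalone restatement of the two compaction triggers:
// 1) free memory has dropped by more than `threshold_pages` since the last
//    compaction (compared in kB; with 4 KiB pages, pages << 2 == kB), or
// 2) free movable pages have grown past `prev_free_movable + threshold_pages`.
fn should_compact(
    threshold_pages: u64,
    prev_memfree_kb: u64,
    memfree_kb: u64,
    prev_free_movable: u64,
    free_movable: u64,
) -> bool {
    if prev_memfree_kb > memfree_kb + (threshold_pages << 2) {
        return true;
    }
    free_movable > prev_free_movable + threshold_pages
}

fn main() {
    // Free memory fell by 1024 kB while the threshold is 128 pages (512 kB).
    assert!(should_compact(128, 10_000, 8_976, 0, 0));
    // Movable pages grew exactly to prev + threshold: not past it, no compact.
    assert!(!should_compact(128, 10_000, 10_000, 1_000, 1_128));
}
```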
#[derive(Debug, Clone)]
pub struct Compact {
core: Arc<RwLock<CompactCore>>,
}
impl Compact {
pub fn new(mut config: Config) -> Result<Self> {
config.psi_path =
psi::check(&config.psi_path).map_err(|e| anyhow!("psi::check failed: {}", e))?;
let c = Self {
core: Arc::new(RwLock::new(CompactCore::new(config))),
};
Ok(c)
}
fn calculate_free_movable_pages(&self) -> Result<u64> {
let file = File::open("/proc/pagetypeinfo")?;
let reader = BufReader::new(file);
let order_limit = self.core.blocking_read().config.compact_order as usize;
let mut total_free_movable_pages = 0;
for line in reader.lines() {
let line = line?;
if line.contains("Movable") {
let parts: Vec<&str> = line.split_whitespace().collect();
if let Some(index) = parts.iter().position(|&element| element == "Movable") {
for (order, &count_str) in parts[(index + 1)..].iter().enumerate() {
if order < order_limit {
if let Ok(count) = count_str.parse::<u64>() {
total_free_movable_pages += count << order;
}
}
}
}
}
}
Ok(total_free_movable_pages)
}
fn check_compact_threshold(&self) -> bool {
let memfree_kb = match proc::get_memfree_kb() {
Ok(v) => v,
Err(e) => {
error!("get_memfree_kb failed: {}", e);
return false;
}
};
let free_movable_pages = match self.calculate_free_movable_pages() {
Ok(v) => v,
Err(e) => {
error!("calculate_free_movable_pages failed: {}", e);
return false;
}
};
self.core
.blocking_read()
.check_compact_threshold(memfree_kb, free_movable_pages)
}
fn set_prev(&mut self) -> Result<()> {
let memfree_kb =
proc::get_memfree_kb().map_err(|e| anyhow!("get_memfree_kb failed: {}", e))?;
let free_movable_pages = self
.calculate_free_movable_pages()
.map_err(|e| anyhow!("calculate_free_movable_pages failed: {}", e))?;
self.core
.blocking_write()
.set_prev(memfree_kb, free_movable_pages);
Ok(())
}
fn do_compact(&self) -> Result<()> {
let compact_psi_percent_limit = self.core.blocking_read().config.compact_psi_percent_limit;
let mut compact_psi = self.core.blocking_read().get_special_psi();
let mut rest_sec = self.core.blocking_read().config.compact_sec_max;
if let Err(e) = sched_yield() {
error!("sched_yield failed: {:?}", e);
}
info!("compact start");
let mut child = Command::new("sh")
.arg("-c")
.arg("echo 1 > /proc/sys/vm/compact_memory")
.spawn()
.map_err(|e| anyhow!("Command::new failed: {}", e))?;
debug!("compact pid {}", child.id());
let mut killed = false;
loop {
match child.try_wait() {
Ok(Some(status)) => {
debug!("compact done with status {}", status);
break;
}
Ok(None) => {
if killed {
if rest_sec <= 0 {
error!("compact killed but not quit");
break;
} else {
debug!("compact killed and keep wait");
}
} else {
if rest_sec <= 0 {
debug!("compact timeout");
child
.kill()
.map_err(|e| anyhow!("child.kill failed: {}", e))?;
killed = true;
}
}
let percent = compact_psi
.get_percent()
.map_err(|e| anyhow!("compact_psi.get_percent failed: {}", e))?;
if percent > compact_psi_percent_limit as u64 {
info!(
"compaction stopped because compact psi {}% exceeds limit",
percent
);
child
.kill()
.map_err(|e| anyhow!("child.kill failed: {}", e))?;
killed = true;
}
}
Err(e) => {
// try_wait can fail with errno 10 (ECHILD) if another task has already
// reaped the compact child before try_wait runs.
debug!("compact try_wait fail: {:?}", e);
break;
}
}
thread::sleep(Duration::from_secs(1));
rest_sec -= 1;
}
info!("compact stop");
Ok(())
}
pub fn need_work(&self) -> bool {
self.core.blocking_read().need_work()
}
pub fn reset_timer(&mut self) {
self.core.blocking_write().timeout.reset();
}
pub fn get_remaining_tokio_duration(&self) -> TokioDuration {
self.core.blocking_read().get_remaining_tokio_duration()
}
pub async fn async_get_remaining_tokio_duration(&self) -> TokioDuration {
self.core.read().await.get_remaining_tokio_duration()
}
pub fn work(&mut self) -> Result<()> {
let mut can_work = self.core.blocking_write().psi_ok();
if can_work {
if !self.core.blocking_read().need_force_compact() {
if !self.check_compact_threshold() {
trace!("not enough free movable pages");
can_work = false;
}
} else {
trace!("force compact");
}
}
if can_work {
self.do_compact()
.map_err(|e| anyhow!("do_compact failed: {}", e))?;
self.set_prev()?;
self.core.blocking_write().force_counter = 0;
} else {
self.core.blocking_write().force_counter += 1;
}
Ok(())
}
pub async fn set_config(&mut self, new_config: OptionConfig) -> bool {
self.core.write().await.set_config(new_config)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_compact() {
let mut c = Compact::new(Config::default()).unwrap();
assert!(c.work().is_ok());
}
}
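calculate_free_movable_pages() above sums the "Movable" rows of /proc/pagetypeinfo, where each column after the word "Movable" holds the number of free blocks of one order and a block of order N spans 2^N base pages. A self-contained sketch of the same counting, with a fabricated sample line:

```rust
// Sum free movable base pages for orders below `order_limit`,
// mirroring the /proc/pagetypeinfo walk above.
fn free_movable_pages(pagetypeinfo: &str, order_limit: usize) -> u64 {
    let mut total = 0u64;
    for line in pagetypeinfo.lines() {
        if !line.contains("Movable") {
            continue;
        }
        let parts: Vec<&str> = line.split_whitespace().collect();
        if let Some(index) = parts.iter().position(|&w| w == "Movable") {
            for (order, count_str) in parts[(index + 1)..].iter().enumerate() {
                if order < order_limit {
                    if let Ok(count) = count_str.parse::<u64>() {
                        total += count << order; // `count` blocks of 2^order pages
                    }
                }
            }
        }
    }
    total
}

fn main() {
    // Fabricated row: 4 free order-0 blocks, 2 order-1, 1 order-2, 0 order-3.
    let sample = "Node 0, zone DMA32, type Movable 4 2 1 0";
    // order_limit = 2 counts orders 0 and 1: 4 * 1 + 2 * 2 = 8 pages.
    assert_eq!(free_movable_pages(sample, 2), 8);
}
```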

src/mem-agent/src/lib.rs Normal file

@@ -0,0 +1,12 @@
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
pub mod agent;
pub mod compact;
pub mod memcg;
mod mglru;
mod misc;
mod proc;
mod psi;
mod timer;

src/mem-agent/src/memcg.rs Normal file

@@ -0,0 +1,843 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::mglru::{self, MGenLRU};
use crate::timer::Timeout;
use crate::{debug, error, info, trace};
use crate::{proc, psi};
use anyhow::{anyhow, Context, Result};
use chrono::{DateTime, Utc};
use nix::sched::sched_yield;
use page_size;
use std::collections::HashMap;
use std::collections::HashSet;
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::RwLock;
use tokio::time::Duration as TokioDuration;
/* Skip idle_fresh for a memcg if fewer than IDLE_FRESH_IGNORE_SECS seconds
have elapsed since its last_inc_time. */
const IDLE_FRESH_IGNORE_SECS: i64 = 60;
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
pub disabled: bool,
pub swap: bool,
pub swappiness_max: u8,
pub psi_path: PathBuf,
pub period_secs: u64,
pub period_psi_percent_limit: u8,
pub eviction_psi_percent_limit: u8,
pub eviction_run_aging_count_min: u64,
}
impl Default for Config {
fn default() -> Self {
Self {
disabled: false,
swap: false,
swappiness_max: 50,
psi_path: PathBuf::from(""),
period_secs: 10 * 60,
period_psi_percent_limit: 1,
eviction_psi_percent_limit: 1,
eviction_run_aging_count_min: 3,
}
}
}
#[derive(Debug, Clone, Default)]
pub struct OptionConfig {
pub disabled: Option<bool>,
pub swap: Option<bool>,
pub swappiness_max: Option<u8>,
pub psi_path: Option<PathBuf>,
pub period_secs: Option<u64>,
pub period_psi_percent_limit: Option<u8>,
pub eviction_psi_percent_limit: Option<u8>,
pub eviction_run_aging_count_min: Option<u64>,
}
#[derive(Debug, Clone)]
pub struct EvictionCount {
pub page: u64,
pub no_min_lru_file: u64,
pub min_lru_inc: u64,
pub other_error: u64,
pub error: u64,
pub psi_exceeds_limit: u64,
}
#[derive(Debug, Clone)]
pub struct Numa {
pub max_seq: u64,
pub min_seq: u64,
pub last_inc_time: DateTime<Utc>,
pub min_lru_file: u64,
pub min_lru_anon: u64,
pub run_aging_count: u64,
pub eviction_count: EvictionCount,
}
impl Numa {
fn new(mglru: &MGenLRU) -> Self {
Self {
max_seq: mglru.max_seq,
min_seq: mglru.min_seq,
last_inc_time: mglru.last_birth,
min_lru_file: mglru.lru[mglru.min_lru_index].file,
min_lru_anon: mglru.lru[mglru.min_lru_index].anon,
run_aging_count: 0,
eviction_count: EvictionCount {
page: 0,
no_min_lru_file: 0,
min_lru_inc: 0,
other_error: 0,
error: 0,
psi_exceeds_limit: 0,
},
}
}
fn update(&mut self, mglru: &MGenLRU) {
self.max_seq = mglru.max_seq;
self.min_seq = mglru.min_seq;
self.last_inc_time = mglru.last_birth;
self.min_lru_file = mglru.lru[mglru.min_lru_index].file;
self.min_lru_anon = mglru.lru[mglru.min_lru_index].anon;
}
}
#[derive(Debug, Clone)]
pub struct MemCgroup {
/* Taken from the Linux kernel's static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg) */
pub id: u16,
pub ino: usize,
pub path: String,
pub numa: HashMap<u32, Numa>,
psi: psi::Period,
pub sleep_psi_exceeds_limit: u64,
}
impl MemCgroup {
fn new(
id: &usize,
ino: &usize,
path: &String,
hmg: &HashMap<usize, MGenLRU>,
psi_path: &PathBuf,
) -> Self {
let s = Self {
id: *id as u16,
ino: *ino,
path: path.to_string(),
numa: hmg
.iter()
.map(|(numa_id, mglru)| (*numa_id as u32, Numa::new(mglru)))
.collect(),
psi: psi::Period::new(&psi_path.join(path.trim_start_matches('/')), false),
sleep_psi_exceeds_limit: 0,
};
info!("MemCgroup::new {:?}", s);
s
}
fn update_from_hostmemcg(&mut self, hmg: &HashMap<usize, MGenLRU>) {
self.numa
.retain(|numa_id, _| hmg.contains_key(&(*numa_id as usize)));
for (numa_id, mglru) in hmg {
self.numa
.entry(*numa_id as u32)
.and_modify(|e| e.update(mglru))
.or_insert(Numa::new(mglru));
}
}
}
#[derive(Debug, Clone)]
enum EvictionStopReason {
None,
NoMinLru,
MinLruInc,
GetError,
PsiExceedsLimit,
}
#[derive(Debug, Clone)]
struct EvictionInfo {
psi: psi::Period,
// min_lru_file and min_lru_anon as observed before the previous mglru::run_eviction call
last_min_lru_file: u64,
last_min_lru_anon: u64,
// the evicted page count
file_page_count: u64,
anon_page_count: u64,
only_swap_mode: bool,
anon_eviction_max: u64,
stop_reason: EvictionStopReason,
}
#[derive(Debug, Clone)]
struct Info {
memcg_id: usize,
numa_id: usize,
path: String,
max_seq: u64,
min_seq: u64,
last_inc_time: DateTime<Utc>,
min_lru_file: u64,
min_lru_anon: u64,
eviction: Option<EvictionInfo>,
}
impl Info {
fn new(mcg: &MemCgroup, numa_id: usize, numa: &Numa) -> Self {
Self {
memcg_id: mcg.id as usize,
numa_id: numa_id,
path: mcg.path.clone(),
min_seq: numa.min_seq,
max_seq: numa.max_seq,
last_inc_time: numa.last_inc_time,
min_lru_file: numa.min_lru_file,
min_lru_anon: numa.min_lru_anon,
eviction: None,
}
}
fn update(&mut self, numa: &Numa) {
self.min_seq = numa.min_seq;
self.max_seq = numa.max_seq;
self.last_inc_time = numa.last_inc_time;
self.min_lru_file = numa.min_lru_file;
self.min_lru_anon = numa.min_lru_anon;
}
}
#[derive(Debug)]
struct MemCgroups {
timeout: Timeout,
config: Config,
id_map: HashMap<u16, MemCgroup>,
ino2id: HashMap<usize, u16>,
path2id: HashMap<String, u16>,
}
impl MemCgroups {
fn new(config: Config) -> Self {
Self {
timeout: Timeout::new(config.period_secs),
config,
id_map: HashMap::new(),
ino2id: HashMap::new(),
path2id: HashMap::new(),
}
}
/* Remove memcgroups that no longer exist in the host or whose id or ino has changed. */
fn remove_changed(
&mut self,
mg_hash: &HashMap<String, (usize, usize, HashMap<usize, MGenLRU>)>,
) {
let mut remove_target = Vec::new();
for (_, mg) in &self.id_map {
let mut should_remove = true;
if let Some((id, ino, _)) = mg_hash.get(&mg.path) {
if mg.id as usize == *id && mg.ino == *ino {
should_remove = false;
}
}
if should_remove {
remove_target.push((mg.id, mg.ino, mg.path.clone()));
}
}
for (id, ino, path) in remove_target {
self.id_map.remove(&id);
self.ino2id.remove(&ino);
self.path2id.remove(&path);
info!("Remove memcg {} {} {} because host changed.", id, ino, path)
}
}
fn update_and_add(
&mut self,
mg_hash: &HashMap<String, (usize, usize, HashMap<usize, MGenLRU>)>,
) {
for (path, (id, ino, hmg)) in mg_hash {
if *id == 0 {
info!(
"Not adding {} {} {} because it is disabled.",
*id,
*ino,
path.to_string()
);
continue;
}
if let Some(mg) = self.id_map.get_mut(&(*id as u16)) {
mg.update_from_hostmemcg(&hmg);
} else {
self.id_map.insert(
*id as u16,
MemCgroup::new(id, ino, path, hmg, &self.config.psi_path),
);
self.ino2id.insert(*ino, *id as u16);
self.path2id.insert(path.to_string(), *id as u16);
}
}
}
fn check_psi_get_info(&mut self) -> Vec<Info> {
let mut info_ret = Vec::new();
for (_, mcg) in self.id_map.iter_mut() {
let percent = match mcg.psi.get_percent() {
Ok(p) => p,
Err(e) => {
debug!("mcg.psi.get_percent {} failed: {}", mcg.path, e);
continue;
}
};
if percent > self.config.period_psi_percent_limit as u64 {
mcg.sleep_psi_exceeds_limit += 1;
info!("{} period psi {}% exceeds limit", mcg.path, percent);
continue;
}
for (numa_id, numa) in &mcg.numa {
info_ret.push(Info::new(&mcg, *numa_id as usize, numa));
}
}
info_ret
}
fn update_info(&self, infov: &mut Vec<Info>) {
let mut i = 0;
while i < infov.len() {
if let Some(mg) = self.id_map.get(&(infov[i].memcg_id as u16)) {
if let Some(numa) = mg.numa.get(&(infov[i].numa_id as u32)) {
infov[i].update(numa);
i += 1;
continue;
}
}
infov.remove(i);
}
}
fn inc_run_aging_count(&mut self, infov: &mut Vec<Info>) {
let mut i = 0;
while i < infov.len() {
if let Some(mg) = self.id_map.get_mut(&(infov[i].memcg_id as u16)) {
if let Some(numa) = mg.numa.get_mut(&(infov[i].numa_id as u32)) {
numa.run_aging_count += 1;
if numa.run_aging_count >= self.config.eviction_run_aging_count_min {
i += 1;
continue;
}
}
}
infov.remove(i);
}
}
fn record_eviction(&mut self, infov: &Vec<Info>) {
for info in infov {
if let Some(mg) = self.id_map.get_mut(&(info.memcg_id as u16)) {
if let Some(numa) = mg.numa.get_mut(&(info.numa_id as u32)) {
if let Some(ei) = &info.eviction {
numa.eviction_count.page += ei.file_page_count + ei.anon_page_count;
match ei.stop_reason {
EvictionStopReason::None => numa.eviction_count.other_error += 1,
EvictionStopReason::NoMinLru => {
numa.eviction_count.no_min_lru_file += 1
}
EvictionStopReason::MinLruInc => numa.eviction_count.min_lru_inc += 1,
EvictionStopReason::GetError => numa.eviction_count.error += 1,
EvictionStopReason::PsiExceedsLimit => {
numa.eviction_count.psi_exceeds_limit += 1
}
}
}
}
}
}
}
fn set_disabled(&mut self, disabled: bool) {
if !disabled {
self.timeout.reset();
}
self.config.disabled = disabled;
}
// Returns whether the mem-agent sleep duration needs to be reset.
fn set_config(&mut self, new_config: OptionConfig) -> bool {
let mut need_reset_mas = false;
if let Some(d) = new_config.disabled {
if self.config.disabled != d {
self.set_disabled(d);
need_reset_mas = true;
}
}
if let Some(s) = new_config.swap {
self.config.swap = s;
}
if let Some(s) = new_config.swappiness_max {
self.config.swappiness_max = s;
}
if let Some(p) = new_config.psi_path {
self.config.psi_path = p.clone();
}
if let Some(p) = new_config.period_psi_percent_limit {
self.config.period_psi_percent_limit = p;
}
if let Some(p) = new_config.eviction_psi_percent_limit {
self.config.eviction_psi_percent_limit = p;
}
if let Some(p) = new_config.eviction_run_aging_count_min {
self.config.eviction_run_aging_count_min = p;
}
if let Some(p) = new_config.period_secs {
self.config.period_secs = p;
self.timeout.set_sleep_duration(p);
if !self.config.disabled {
need_reset_mas = true;
}
}
info!("new memcg config: {:#?}", self.config);
if need_reset_mas {
info!("need reset mem-agent sleep");
}
need_reset_mas
}
fn need_work(&self) -> bool {
if self.config.disabled {
return false;
}
self.timeout.is_timeout()
}
pub fn get_remaining_tokio_duration(&self) -> TokioDuration {
if self.config.disabled {
return TokioDuration::MAX;
}
self.timeout.remaining_tokio_duration()
}
}
#[derive(Debug, Clone)]
pub struct MemCG {
memcgs: Arc<RwLock<MemCgroups>>,
}
impl MemCG {
pub fn new(mut config: Config) -> Result<Self> {
mglru::check().map_err(|e| anyhow!("mglru::check failed: {}", e))?;
config.psi_path =
psi::check(&config.psi_path).map_err(|e| anyhow!("psi::check failed: {}", e))?;
let memcg = Self {
memcgs: Arc::new(RwLock::new(MemCgroups::new(config))),
};
Ok(memcg)
}
/*
* If target_paths is empty, remove from MemCgroups any cgroup that has
* changed or no longer exists in the host.
* If target_paths is non-empty, skip that cleanup.
*/
fn refresh(&mut self, target_paths: &HashSet<String>) -> Result<()> {
let mg_hash = mglru::host_memcgs_get(target_paths, true)
.map_err(|e| anyhow!("mglru::host_memcgs_get failed: {}", e))?;
let mut mgs = self.memcgs.blocking_write();
if target_paths.len() == 0 {
mgs.remove_changed(&mg_hash);
}
mgs.update_and_add(&mg_hash);
Ok(())
}
fn run_aging(&mut self, infov: &mut Vec<Info>, swap: bool) {
infov.retain(|info| {
let now = Utc::now();
if now.signed_duration_since(info.last_inc_time).num_seconds() < IDLE_FRESH_IGNORE_SECS
{
info!(
"{} not run aging because last_inc_time {}",
info.path, info.last_inc_time,
);
true
} else {
let res = if let Err(e) =
mglru::run_aging(info.memcg_id, info.numa_id, info.max_seq, swap, true)
{
error!(
"mglru::run_aging {} {} {} failed: {}",
info.path, info.memcg_id, info.numa_id, e
);
false
} else {
true
};
if let Err(e) = sched_yield() {
error!("sched_yield failed: {:?}", e);
}
res
}
});
self.memcgs.blocking_write().inc_run_aging_count(infov);
}
fn swap_not_available(&self) -> Result<bool> {
let freeswap_kb = proc::get_freeswap_kb().context("proc::get_freeswap_kb")?;
if freeswap_kb > (256 * page_size::get() as u64 / 1024) {
Ok(false)
} else {
Ok(true)
}
}
fn get_swappiness(&self, anon_count: u64, file_count: u64) -> u8 {
assert!(
anon_count != 0 && file_count != 0,
"anon and file must be non-zero"
);
let total = anon_count + file_count;
let c = 200 * anon_count / total;
c as u8
}
fn run_eviction(&mut self, infov: &mut Vec<Info>, mut swap: bool) -> Result<()> {
if self
.swap_not_available()
.context("self.swap_not_available")?
{
swap = false;
}
let psi_path = self.memcgs.blocking_read().config.psi_path.clone();
for info in infov.into_iter() {
info.eviction = Some(EvictionInfo {
psi: psi::Period::new(&psi_path.join(info.path.trim_start_matches('/')), false),
last_min_lru_file: 0,
last_min_lru_anon: 0,
file_page_count: 0,
anon_page_count: 0,
only_swap_mode: false,
anon_eviction_max: 0,
stop_reason: EvictionStopReason::None,
});
}
let mut removed_infov = Vec::new();
let mut ret = Ok(());
let eviction_psi_percent_limit = self
.memcgs
.blocking_read()
.config
.eviction_psi_percent_limit as u64;
let swappiness_max = self.memcgs.blocking_read().config.swappiness_max;
'main_loop: while infov.len() != 0 {
let mut i = 0;
while i < infov.len() {
let path_set: HashSet<String> =
infov.iter().map(|info| info.path.clone()).collect();
match self.refresh(&path_set) {
Ok(_) => {}
Err(e) => {
ret = Err(anyhow!("refresh failed: {}", e));
break 'main_loop;
}
};
self.update_info(infov);
let ci = infov[i].clone();
trace!("{} {} run_eviction single loop start", ci.path, ci.numa_id);
if let Some(ref mut ei) = infov[i].eviction {
if ei.last_min_lru_file == 0 && ei.last_min_lru_anon == 0 {
// First loop
trace!("{} {} run_eviction begin", ci.path, ci.numa_id,);
if ci.min_lru_file == 0 {
if !swap || ci.min_lru_anon == 0 {
info!(
"{} {} run_eviction stop because min_lru_file is 0 or min_lru_anon is 0, release {} {} pages",
ci.path, ci.numa_id, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::NoMinLru;
removed_infov.push(infov.remove(i));
continue;
} else {
ei.only_swap_mode = true;
ei.anon_eviction_max =
ci.min_lru_anon * swappiness_max as u64 / 200;
trace!(
"{} {} run_eviction only swap mode anon_eviction_max {}",
ci.path,
ci.numa_id,
ei.anon_eviction_max
);
}
}
ei.last_min_lru_file = ci.min_lru_file;
ei.last_min_lru_anon = ci.min_lru_anon;
} else {
if (!ei.only_swap_mode && ci.min_lru_file >= ei.last_min_lru_file)
|| (ei.only_swap_mode && ci.min_lru_file > 0)
|| (swap && ci.min_lru_anon > ei.last_min_lru_anon)
{
info!(
"{} {} run_eviction stop because min_lru_file {} last_min_lru_file {} min_lru_anon {} last_min_lru_anon {}, release {} {} pages",
ci.path, ci.numa_id, ci.min_lru_file, ei.last_min_lru_file, ci.min_lru_anon, ei.last_min_lru_anon, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::MinLruInc;
removed_infov.push(infov.remove(i));
continue;
}
let released = ei.last_min_lru_anon - ci.min_lru_anon;
trace!(
"{} {} run_eviction anon {} pages",
ci.path,
ci.numa_id,
released
);
ei.anon_page_count += released;
let released = ei.last_min_lru_file - ci.min_lru_file;
trace!(
"{} {} run_eviction file {} pages",
ci.path,
ci.numa_id,
released
);
ei.file_page_count += released;
if !ei.only_swap_mode {
if ci.min_lru_file == 0 {
info!(
"{} {} run_eviction stop because min_lru_file is 0, release {} {} pages",
ci.path, ci.numa_id, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::NoMinLru;
removed_infov.push(infov.remove(i));
continue;
}
} else {
if ei.anon_page_count >= ei.anon_eviction_max {
info!(
"{} {} run_eviction stop because anon_page_count is bigger than anon_eviction_max {}, release {} {} pages",
ci.path, ci.numa_id, ei.anon_eviction_max, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::NoMinLru;
removed_infov.push(infov.remove(i));
continue;
}
}
let percent = match ei.psi.get_percent() {
Ok(p) => p,
Err(e) => {
debug!(
"{} {} ei.psi.get_percent failed: {}, release {} {} pages",
ci.path, ci.numa_id, e, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::GetError;
removed_infov.push(infov.remove(i));
continue;
}
};
if percent > eviction_psi_percent_limit {
info!(
"{} {} run_eviction stop because period psi {}% exceeds limit, release {} {} pages",
ci.path, ci.numa_id, percent, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::PsiExceedsLimit;
removed_infov.push(infov.remove(i));
continue;
}
ei.last_min_lru_file = ci.min_lru_file;
ei.last_min_lru_anon = ci.min_lru_anon;
}
// Get the swappiness to use for this eviction.
let swappiness = if ei.only_swap_mode {
200
} else if !swap
|| ci.min_lru_anon == 0
|| match self.swap_not_available() {
Ok(b) => b,
Err(e) => {
ret = Err(anyhow!("swap_not_available failed: {:?}", e));
break 'main_loop;
}
}
{
0
} else {
let s = self.get_swappiness(ci.min_lru_anon, ci.min_lru_file);
if s > swappiness_max {
swappiness_max
} else {
s
}
};
trace!(
"{} {} run_eviction min_seq {} swappiness {}",
ci.path,
ci.numa_id,
ci.min_seq,
swappiness
);
match mglru::run_eviction(ci.memcg_id, ci.numa_id, ci.min_seq, swappiness, 1) {
Ok(_) => {}
Err(e) => {
error!(
"{} {} mglru::run_eviction failed: {}, release {} {} pages",
ci.path, ci.numa_id, e, ei.anon_page_count, ei.file_page_count,
);
ei.stop_reason = EvictionStopReason::GetError;
removed_infov.push(infov.remove(i));
continue;
}
}
if let Err(e) = sched_yield() {
error!("sched_yield failed: {:?}", e);
}
} else {
unreachable!();
}
i += 1;
}
}
let mut mgs = self.memcgs.blocking_write();
mgs.record_eviction(&infov);
mgs.record_eviction(&removed_infov);
ret
}
fn check_psi_get_info(&mut self) -> Vec<Info> {
self.memcgs.blocking_write().check_psi_get_info()
}
fn update_info(&self, infov: &mut Vec<Info>) {
self.memcgs.blocking_read().update_info(infov);
}
pub fn need_work(&self) -> bool {
self.memcgs.blocking_read().need_work()
}
pub fn reset_timer(&mut self) {
self.memcgs.blocking_write().timeout.reset();
}
pub fn get_remaining_tokio_duration(&self) -> TokioDuration {
self.memcgs.blocking_read().get_remaining_tokio_duration()
}
pub async fn async_get_remaining_tokio_duration(&self) -> TokioDuration {
self.memcgs.read().await.get_remaining_tokio_duration()
}
pub fn work(&mut self) -> Result<()> {
/* Refresh memcgroups info from host and store it to infov. */
self.refresh(&HashSet::new())
.map_err(|e| anyhow!("first refresh failed: {}", e))?;
let mut infov = self.check_psi_get_info();
let swap = self.memcgs.blocking_read().config.swap;
/* Run aging with infov. */
self.run_aging(&mut infov, swap);
self.run_eviction(&mut infov, swap)
.map_err(|e| anyhow!("run_eviction failed: {}", e))?;
Ok(())
}
pub async fn set_config(&mut self, new_config: OptionConfig) -> bool {
self.memcgs.write().await.set_config(new_config)
}
pub async fn get_status(&self) -> Vec<MemCgroup> {
let mut mcgs = Vec::new();
let mgs = self.memcgs.read().await;
for (_, m) in mgs.id_map.iter() {
mcgs.push((*m).clone());
}
mcgs
}
}
#[cfg(test)]
mod tests {
#[allow(unused_imports)]
use super::*;
#[test]
fn test_memcg_swap_not_available() {
let m = MemCG::new(Config::default()).unwrap();
assert!(m.swap_not_available().is_ok());
}
#[test]
fn test_memcg_get_swappiness() {
let m = MemCG::new(Config::default()).unwrap();
assert_eq!(m.get_swappiness(100, 50), 133);
}
#[test]
fn test_memcg_need_work() {
let m = MemCG::new(Config::default()).unwrap();
assert_eq!(m.need_work(), true);
}
}
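get_swappiness() maps the anonymous share of the minimum LRU onto the kernel's 0-200 swappiness scale as 200 * anon / (anon + file), capped elsewhere by swappiness_max. A quick standalone check of the arithmetic:

```rust
// 200 * anon / (anon + file): approaches 200 as the working set becomes
// mostly anonymous, and 0 as it becomes mostly file-backed.
fn swappiness(anon_count: u64, file_count: u64) -> u8 {
    assert!(
        anon_count != 0 && file_count != 0,
        "anon and file must be non-zero"
    );
    (200 * anon_count / (anon_count + file_count)) as u8
}

fn main() {
    assert_eq!(swappiness(100, 50), 133); // same values as test_memcg_get_swappiness
    assert_eq!(swappiness(50, 50), 100); // even split lands on the midpoint
}
```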

src/mem-agent/src/mglru.rs Normal file

@@ -0,0 +1,924 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::debug;
use crate::warn;
use anyhow::{anyhow, Result};
use chrono::{DateTime, Duration, Utc};
use std::collections::HashMap;
use std::collections::HashSet;
use std::fs::{self, File, OpenOptions};
use std::io::{BufRead, BufReader};
use std::os::unix::fs::MetadataExt;
use std::path::PathBuf;
const WORKINGSET_ANON: usize = 0;
const WORKINGSET_FILE: usize = 1;
const LRU_GEN_ENABLED_PATH: &str = "/sys/kernel/mm/lru_gen/enabled";
const LRU_GEN_PATH: &str = "/sys/kernel/debug/lru_gen";
const MEMCGS_PATH: &str = "/sys/fs/cgroup/memory";
fn lru_gen_head_parse(line: &str) -> Result<(usize, String)> {
let words: Vec<&str> = line.split_whitespace().map(|word| word.trim()).collect();
if words.len() != 3 || words[0] != "memcg" {
return Err(anyhow!("line {} format is not right", line));
}
let id = usize::from_str_radix(words[1], 10)
.map_err(|e| anyhow!("parse line {} failed: {}", line, e))?;
Ok((id, words[2].to_string()))
}
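lru_gen_head_parse() above expects the memcg header lines of /sys/kernel/debug/lru_gen, of the form `memcg <id> <path>`. A standalone sketch of the same parse, with fabricated sample lines:

```rust
// Parse a "memcg <id> <path>" header line; any other line yields None.
fn parse_head(line: &str) -> Option<(usize, String)> {
    let words: Vec<&str> = line.split_whitespace().collect();
    if words.len() != 3 || words[0] != "memcg" {
        return None;
    }
    let id = words[1].parse::<usize>().ok()?;
    Some((id, words[2].to_string()))
}

fn main() {
    assert_eq!(
        parse_head("memcg    36 /user.slice"),
        Some((36, "/user.slice".to_string()))
    );
    assert_eq!(parse_head(" node     0"), None); // node lines are handled elsewhere
}
```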
#[derive(Debug, PartialEq)]
pub struct GenLRU {
pub seq: u64,
pub anon: u64,
pub file: u64,
pub birth: DateTime<Utc>,
}
impl GenLRU {
fn new() -> Self {
Self {
seq: 0,
anon: 0,
file: 0,
birth: Utc::now(),
}
}
}
#[derive(Debug, PartialEq)]
pub struct MGenLRU {
pub min_seq: u64,
pub max_seq: u64,
pub last_birth: DateTime<Utc>,
pub min_lru_index: usize,
pub lru: Vec<GenLRU>,
}
impl MGenLRU {
fn new() -> Self {
Self {
min_seq: 0,
max_seq: 0,
last_birth: Utc::now(),
min_lru_index: 0,
lru: Vec::new(),
}
}
}
//result:
// last_line, HashMap<node_id, MGenLRU>
fn lru_gen_lines_parse(reader: &mut BufReader<File>) -> Result<(String, HashMap<usize, MGenLRU>)> {
let mut line = String::new();
let mut ret_hash = HashMap::new();
while line.len() > 0
|| reader
.read_line(&mut line)
.map_err(|e| anyhow!("read file {} failed: {}", LRU_GEN_PATH, e))?
> 0
{
let words: Vec<&str> = line.split_whitespace().map(|word| word.trim()).collect();
if words.len() == 2 && words[0] == "node" {
// Got a new node
let node_id = usize::from_str_radix(words[1], 10)
.map_err(|e| anyhow!("parse line {} failed: {}", line, e))?;
let (ret_line, node_size) = lru_gen_seq_lines_parse(reader)
.map_err(|e| anyhow!("lru_gen_seq_lines_parse failed: {}", e))?;
if let Some(size) = node_size {
ret_hash.insert(node_id, size);
}
line = ret_line;
} else {
// Not a node line; return it and let the caller handle it.
break;
}
}
Ok((line, ret_hash))
}
fn str_to_u64(str: &str) -> Result<u64> {
if str.starts_with("-") {
warn!("{} format {} is not right", LRU_GEN_PATH, str);
return Ok(0);
}
Ok(u64::from_str_radix(str, 10)?)
}
//result:
// last_line, Option<MGenLRU>
fn lru_gen_seq_lines_parse(reader: &mut BufReader<File>) -> Result<(String, Option<MGenLRU>)> {
let mut line = String::new();
let mut ret = MGenLRU::new();
let mut got = false;
while reader
.read_line(&mut line)
.map_err(|e| anyhow!("read file {} failed: {}", LRU_GEN_PATH, e))?
> 0
{
let words: Vec<&str> = line.split_whitespace().map(|word| word.trim()).collect();
if words.len() != 4 {
//line is not format of seq line
break;
}
let msecs = i64::from_str_radix(words[1], 10)
.map_err(|e| anyhow!("parse line {} failed: {}", line, e))?;
// Use Duration::milliseconds() because try_milliseconds() causes a build error.
#[allow(deprecated)]
let birth = Utc::now() - Duration::milliseconds(msecs);
let mut gen = GenLRU::new();
gen.birth = birth;
gen.seq = u64::from_str_radix(words[0], 10)
.map_err(|e| anyhow!("parse line {} failed: {}", line, e))?;
gen.anon = str_to_u64(&words[2 + WORKINGSET_ANON])
.map_err(|e| anyhow!("parse line {} failed: {}", line, e))?;
gen.file = str_to_u64(&words[2 + WORKINGSET_FILE])
.map_err(|e| anyhow!("parse line {} failed: {}", line, e))?;
if !got {
ret.min_seq = gen.seq;
ret.max_seq = gen.seq;
ret.last_birth = birth;
got = true;
} else {
ret.min_seq = std::cmp::min(ret.min_seq, gen.seq);
ret.max_seq = std::cmp::max(ret.max_seq, gen.seq);
if ret.last_birth < birth {
ret.last_birth = birth;
}
}
if gen.seq == ret.min_seq {
ret.min_lru_index = ret.lru.len();
}
ret.lru.push(gen);
line.clear();
}
Ok((line, if got { Some(ret) } else { None }))
}
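Each generation line consumed by lru_gen_seq_lines_parse() carries four whitespace-separated fields: sequence number, age in milliseconds, anon page count, and file page count; negative counts are clamped to 0 just as str_to_u64() does. A standalone sketch of the per-line parse, with fabricated input:

```rust
// Parse one MGLRU generation line: "<seq> <age_msecs> <anon> <file>".
// Returns None for lines that do not have exactly four fields.
fn parse_seq_line(line: &str) -> Option<(u64, i64, u64, u64)> {
    let w: Vec<&str> = line.split_whitespace().collect();
    if w.len() != 4 {
        return None; // not a generation line; caller handles it
    }
    // Clamp negative counters to 0, mirroring str_to_u64().
    let count = |s: &str| -> Option<u64> {
        if s.starts_with('-') {
            Some(0)
        } else {
            s.parse().ok()
        }
    };
    Some((
        w[0].parse::<u64>().ok()?,
        w[1].parse::<i64>().ok()?,
        count(w[2])?,
        count(w[3])?,
    ))
}

fn main() {
    assert_eq!(parse_seq_line(" 4 1500 123 456"), Some((4, 1500, 123, 456)));
    assert_eq!(parse_seq_line(" 4 1500 -1 456"), Some((4, 1500, 0, 456)));
    assert_eq!(parse_seq_line("node 0"), None);
}
```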
// Only handle the paths in target_patchs; if target_patchs is empty, handle all paths.
// If parse_line is false, each HashMap<node_id, MGenLRU> will be empty.
//result:
// HashMap<path, (id, HashMap<node_id, MGenLRU>)>
fn lru_gen_file_parse(
mut reader: &mut BufReader<File>,
target_patchs: &HashSet<String>,
parse_line: bool,
) -> Result<HashMap<String, (usize, HashMap<usize, MGenLRU>)>> {
let mut line = String::new();
let mut ret_hash = HashMap::new();
while line.len() > 0
|| reader
.read_line(&mut line)
.map_err(|e| anyhow!("read file {} failed: {}", LRU_GEN_PATH, e))?
> 0
{
let mut clear_line = true;
// Ignore errors from lru_gen_head_parse because every line of the file is checked.
if let Ok((id, path)) = lru_gen_head_parse(&line) {
if target_patchs.len() == 0 || target_patchs.contains(&path) {
let seq_data = if parse_line {
let (ret_line, data) = lru_gen_lines_parse(&mut reader).map_err(|e| {
anyhow!(
"lru_gen_seq_lines_parse file {} failed: {}",
LRU_GEN_PATH,
e
)
})?;
line = ret_line;
clear_line = false;
data
} else {
HashMap::new()
};
/*trace!(
"lru_gen_file_parse path {} id {} seq_data {:#?}",
path,
id,
seq_data
);*/
ret_hash.insert(path.clone(), (id, seq_data));
}
}
if clear_line {
line.clear();
}
}
Ok(ret_hash)
}
fn file_parse(
target_patchs: &HashSet<String>,
parse_line: bool,
) -> Result<HashMap<String, (usize, HashMap<usize, MGenLRU>)>> {
let file = File::open(LRU_GEN_PATH)
.map_err(|e| anyhow!("open file {} failed: {}", LRU_GEN_PATH, e))?;
let mut reader = BufReader::new(file);
lru_gen_file_parse(&mut reader, target_patchs, parse_line)
}
// Result:
//   HashMap<path, (id, ino, HashMap<node_id, MGenLRU>)>
pub fn host_memcgs_get(
target_patchs: &HashSet<String>,
parse_line: bool,
) -> Result<HashMap<String, (usize, usize, HashMap<usize, MGenLRU>)>> {
let mgs = file_parse(target_patchs, parse_line)
.map_err(|e| anyhow!("mglru file_parse failed: {}", e))?;
let mut host_mgs = HashMap::new();
for (path, (id, mglru)) in mgs {
let host_path = PathBuf::from(MEMCGS_PATH).join(path.trim_start_matches('/'));
let metadata = match fs::metadata(host_path.clone()) {
Err(e) => {
if id != 0 {
debug!("fs::metadata {:?} fail: {}", host_path, e);
}
continue;
}
Ok(m) => m,
};
host_mgs.insert(path, (id, metadata.ino() as usize, mglru));
}
Ok(host_mgs)
}
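`host_memcgs_get` above joins each memcg path onto the cgroup mount point and records the directory's inode, since the kernel identifies memory cgroups by cgroup inode. A small Unix-only sketch of that inode lookup, using the current directory in place of the real `/sys/fs/cgroup` path:

```rust
use std::fs;
use std::os::unix::fs::MetadataExt;

// Sketch of the path-to-inode step in host_memcgs_get: stat a directory
// and read its inode number, as the real code does for each memcg dir.
// "." stands in for the MEMCGS_PATH-joined cgroup directory.
fn main() {
    let meta = fs::metadata(".").expect("stat current dir");
    let ino = meta.ino() as usize;
    assert!(ino > 0);
    println!("inode: {}", ino);
}
```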
pub fn check() -> Result<()> {
if crate::misc::is_test_environment() {
return Ok(());
}
let content = fs::read_to_string(LRU_GEN_ENABLED_PATH)
.map_err(|e| anyhow!("open file {} failed: {}", LRU_GEN_ENABLED_PATH, e))?;
let content = content.trim();
let r = if content.starts_with("0x") {
u32::from_str_radix(&content[2..], 16)
} else {
content.parse()
};
let enabled = r.map_err(|e| anyhow!("parse file {} failed: {}", LRU_GEN_ENABLED_PATH, e))?;
if enabled != 7 {
fs::write(LRU_GEN_ENABLED_PATH, "7")
.map_err(|e| anyhow!("write file {} failed: {}", LRU_GEN_ENABLED_PATH, e))?;
}
let _ = OpenOptions::new()
.read(true)
.write(true)
.open(LRU_GEN_PATH)
.map_err(|e| anyhow!("open file {} failed: {}", LRU_GEN_PATH, e))?;
Ok(())
}
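`check()` above accepts the `lru_gen/enabled` value in either hex (`0x0007`) or decimal form and treats `7` as fully enabled. A self-contained sketch of just that value handling, with a hypothetical helper name:

```rust
// Sketch of the value parsing in check(): the kernel may report
// /sys/kernel/mm/lru_gen/enabled as hex ("0x0007") or decimal ("7").
fn parse_lru_gen_enabled(content: &str) -> Result<u32, std::num::ParseIntError> {
    let content = content.trim();
    if let Some(hex) = content.strip_prefix("0x") {
        u32::from_str_radix(hex, 16)
    } else {
        content.parse()
    }
}

fn main() {
    assert_eq!(parse_lru_gen_enabled("0x0007\n"), Ok(7));
    assert_eq!(parse_lru_gen_enabled("7"), Ok(7));
    assert_eq!(parse_lru_gen_enabled("0x0"), Ok(0));
}
```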
pub fn run_aging(
memcg_id: usize,
numa_id: usize,
max_seq: u64,
can_swap: bool,
force_scan: bool,
) -> Result<()> {
let cmd = format!(
"+ {} {} {} {} {}",
memcg_id, numa_id, max_seq, can_swap as i32, force_scan as i32
);
//trace!("send cmd {} to {}", cmd, LRU_GEN_PATH);
fs::write(LRU_GEN_PATH, &cmd)
.map_err(|e| anyhow!("write file {} cmd {} failed: {}", LRU_GEN_PATH, cmd, e))?;
Ok(())
}
pub fn run_eviction(
memcg_id: usize,
numa_id: usize,
min_seq: u64,
swappiness: u8,
nr_to_reclaim: usize,
) -> Result<()> {
let cmd = format!(
"- {} {} {} {} {}",
memcg_id, numa_id, min_seq, swappiness, nr_to_reclaim
);
//trace!("send cmd {} to {}", cmd, LRU_GEN_PATH);
fs::write(LRU_GEN_PATH, &cmd)
.map_err(|e| anyhow!("write file {} cmd {} failed: {}", LRU_GEN_PATH, cmd, e))?;
Ok(())
}
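`run_aging` and `run_eviction` above drive MGLRU by writing single-line commands to `/sys/kernel/mm/lru_gen`: `+` requests aging up to `max_seq`, `-` requests eviction of generations up to `min_seq`. This sketch reproduces the exact command strings those functions build, without touching the sysfs file:

```rust
// Build the "+ memcg_id node_id max_seq can_swap force_scan" and
// "- memcg_id node_id min_seq swappiness nr_to_reclaim" command lines
// written to /sys/kernel/mm/lru_gen; bools are encoded as 0/1.
fn aging_cmd(memcg_id: usize, numa_id: usize, max_seq: u64, can_swap: bool, force_scan: bool) -> String {
    format!("+ {} {} {} {} {}", memcg_id, numa_id, max_seq, can_swap as i32, force_scan as i32)
}

fn eviction_cmd(memcg_id: usize, numa_id: usize, min_seq: u64, swappiness: u8, nr_to_reclaim: usize) -> String {
    format!("- {} {} {} {} {}", memcg_id, numa_id, min_seq, swappiness, nr_to_reclaim)
}

fn main() {
    assert_eq!(aging_cmd(16, 0, 4, true, false), "+ 16 0 4 1 0");
    assert_eq!(eviction_cmd(16, 0, 2, 60, 1024), "- 16 0 2 60 1024");
}
```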
#[cfg(test)]
mod tests {
use super::*;
use maplit::hashmap;
use once_cell::sync::OnceCell;
use slog::{Drain, Level, Logger};
use slog_async;
use slog_scope::set_global_logger;
use slog_term;
use std::collections::HashMap;
use std::fs;
use std::fs::File;
use std::io::BufReader;
use std::io::Write;
use std::sync::Mutex;
lazy_static::lazy_static! {
static ref TEST_MUTEX: Mutex<()> = Mutex::new(());
}
static LOGGER: OnceCell<slog_scope::GlobalLoggerGuard> = OnceCell::new();
impl GenLRU {
pub fn new_from_data(seq: u64, anon: u64, file: u64, birth: DateTime<Utc>) -> Self {
Self {
seq,
anon,
file,
birth,
}
}
}
pub fn init_logger() -> &'static slog_scope::GlobalLoggerGuard {
LOGGER.get_or_init(|| {
let decorator = slog_term::TermDecorator::new().stderr().build();
let drain = slog_term::CompactFormat::new(decorator)
.build()
.filter_level(Level::Trace)
.fuse();
let drain = slog_async::Async::new(drain).build().fuse();
let logger = Logger::root(drain, slog::o!());
set_global_logger(logger.clone())
})
}
#[test]
fn test_lru_gen_file_parse_single_no_parse() {
let _lock = TEST_MUTEX.lock().unwrap_or_else(|e| e.into_inner());
let _logger = init_logger();
let mut reader = setup_test_file();
let paths = ["/justto.slice/boot.mount".to_string()]
.iter()
.cloned()
.collect();
let ret = lru_gen_file_parse(&mut reader, &paths, false).unwrap();
assert_eq!(ret.len(), 1);
assert_eq!(
ret.get("/justto.slice/boot.mount"),
Some(&(16, HashMap::new()))
);
remove_test_file();
}
#[test]
fn test_lru_gen_file_parse_multi_no_parse() {
let _lock = TEST_MUTEX.lock().unwrap_or_else(|e| e.into_inner());
let _logger = init_logger();
let mut reader = setup_test_file();
let paths = [
"/aabbc/tea-loglogl".to_string(),
"/aabbc/staraabbc".to_string(),
"/aabbc/TEAE-iaabbc".to_string(),
"/justto.slice/cpupower.service".to_string(),
]
.iter()
.cloned()
.collect();
let ret = lru_gen_file_parse(&mut reader, &paths, false).unwrap();
assert_eq!(ret.len(), 4);
assert_eq!(ret.get("/justto.slice/boot.mount"), None);
assert_eq!(ret.get("/aabbc/tea-loglogl"), Some(&(30, hashmap![])));
assert_eq!(ret.get("/aabbc/staraabbc"), Some(&(22, hashmap![])));
assert_eq!(ret.get("/aabbc/TEAE-iaabbc"), Some(&(21, hashmap![])));
assert_eq!(
ret.get("/justto.slice/cpupower.service"),
Some(&(0, hashmap![]))
);
remove_test_file();
}
#[test]
fn test_lru_gen_file_parse_multi_parse() {
let _lock = TEST_MUTEX.lock().unwrap_or_else(|e| e.into_inner());
let _logger = init_logger();
let mut reader = setup_test_file();
let paths = [
"/aabbc/tea-loglogl".to_string(),
"/aabbc/staraabbc".to_string(),
"/aabbc/TEAE-iaabbc".to_string(),
"/justto.slice/cpupower.service".to_string(),
]
.iter()
.cloned()
.collect();
let ret = lru_gen_file_parse(&mut reader, &paths, true).unwrap();
assert_eq!(ret.len(), 4);
assert_eq!(ret.get("/justto.slice/boot.mount"), None);
let birth_vec: Vec<DateTime<Utc>> = ret["/aabbc/tea-loglogl"].1[&0]
.lru
.iter()
.map(|g| g.birth)
.collect();
assert_eq!(
ret.get("/aabbc/tea-loglogl"),
Some(&(
30,
hashmap![0 => MGenLRU{min_seq: 0, max_seq: 3, last_birth: birth_vec[3], min_lru_index: 0,
lru: vec![GenLRU::new_from_data(0, 20, 23, birth_vec[0]),
GenLRU::new_from_data(1, 9, 23, birth_vec[1]),
GenLRU::new_from_data(2, 20, 19, birth_vec[2]),
GenLRU::new_from_data(3, 3, 8, birth_vec[3])]}]
))
);
let birth_vec: Vec<DateTime<Utc>> = ret["/aabbc/staraabbc"].1[&1]
.lru
.iter()
.map(|g| g.birth)
.collect();
assert_eq!(
ret.get("/aabbc/staraabbc"),
Some(&(
22,
hashmap![1 => MGenLRU{min_seq: 2, max_seq: 5, last_birth: birth_vec[3], min_lru_index: 0,
lru: vec![GenLRU::new_from_data(2, 0, 86201, birth_vec[0]),
GenLRU::new_from_data(3, 253, 0, birth_vec[1]),
GenLRU::new_from_data(4, 0, 0, birth_vec[2]),
GenLRU::new_from_data(5, 2976, 41252, birth_vec[3])]}]
))
);
let birth_vec: Vec<DateTime<Utc>> = ret["/aabbc/TEAE-iaabbc"].1[&0]
.lru
.iter()
.map(|g| g.birth)
.collect();
let birth1_vec: Vec<DateTime<Utc>> = ret["/aabbc/TEAE-iaabbc"].1[&1]
.lru
.iter()
.map(|g| g.birth)
.collect();
assert_eq!(
ret.get("/aabbc/TEAE-iaabbc"),
Some(&(
21,
hashmap![0 => MGenLRU{min_seq: 0, max_seq: 3, last_birth: birth_vec[3], min_lru_index: 0, lru: vec![GenLRU::new_from_data(0, 0, 1, birth_vec[0]),
GenLRU::new_from_data(1, 2, 3, birth_vec[1]),
GenLRU::new_from_data(2, 6, 7, birth_vec[2]),
GenLRU::new_from_data(3, 8, 9, birth_vec[3])]},
1 => MGenLRU{min_seq: 3, max_seq: 6, last_birth: birth1_vec[3], min_lru_index: 0, lru: vec![GenLRU::new_from_data(3, 10, 11, birth1_vec[0]),
GenLRU::new_from_data(4, 12, 16, birth1_vec[1]),
GenLRU::new_from_data(5, 17, 18, birth1_vec[2]),
GenLRU::new_from_data(6, 19, 20, birth1_vec[3])]}]
))
);
let birth_vec: Vec<DateTime<Utc>> = ret["/justto.slice/cpupower.service"].1[&0]
.lru
.iter()
.map(|g| g.birth)
.collect();
assert_eq!(
ret.get("/justto.slice/cpupower.service"),
Some(&(
0,
hashmap![0 => MGenLRU{min_seq: 0, max_seq: 3, last_birth: birth_vec[3], min_lru_index: 0, lru: vec![GenLRU::new_from_data(0, 0, 33, birth_vec[0]),
GenLRU::new_from_data(1, 0, 0, birth_vec[1]),
GenLRU::new_from_data(2, 0, 0, birth_vec[2]),
GenLRU::new_from_data(3, 0, 115, birth_vec[3])]}]
))
);
remove_test_file();
}
#[test]
fn test_lru_gen_file_parse_no_target_no_parse() {
let _lock = TEST_MUTEX.lock().unwrap_or_else(|e| e.into_inner());
let _logger = init_logger();
let mut reader = setup_test_file();
let paths = [].iter().cloned().collect();
let ret = lru_gen_file_parse(&mut reader, &paths, false).unwrap();
assert_eq!(ret.len(), 55);
assert_eq!(ret.get("/justto.slice/boot.mount"), Some(&(16, hashmap![])));
assert_eq!(ret.get("/aabbc/tea-loglogl"), Some(&(30, hashmap![])));
assert_eq!(ret.get("/aabbc/staraabbc"), Some(&(22, hashmap![])));
assert_eq!(ret.get("/aabbc/TEAE-iaabbc"), Some(&(21, hashmap![])));
assert_eq!(
ret.get("/justto.slice/cpupower.service"),
Some(&(0, hashmap![]))
);
remove_test_file();
}
#[test]
fn test_lru_gen_file_parse_no_target_parse() {
let _lock = TEST_MUTEX.lock().unwrap_or_else(|e| e.into_inner());
let _logger = init_logger();
let mut reader = setup_test_file();
let paths = [].iter().cloned().collect();
let ret = lru_gen_file_parse(&mut reader, &paths, true).unwrap();
assert_eq!(ret.len(), 55);
let birth_vec: Vec<DateTime<Utc>> = ret["/aabbc/tea-loglogl"].1[&0]
.lru
.iter()
.map(|g| g.birth)
.collect();
assert_eq!(
ret.get("/aabbc/tea-loglogl"),
Some(&(
30,
hashmap![0 => MGenLRU{min_seq: 0, max_seq: 3, last_birth: birth_vec[3], min_lru_index: 0,
lru: vec![GenLRU::new_from_data(0, 20, 23, birth_vec[0]),
GenLRU::new_from_data(1, 9, 23, birth_vec[1]),
GenLRU::new_from_data(2, 20, 19, birth_vec[2]),
GenLRU::new_from_data(3, 3, 8, birth_vec[3])]}]
))
);
let birth_vec: Vec<DateTime<Utc>> = ret["/aabbc/staraabbc"].1[&1]
.lru
.iter()
.map(|g| g.birth)
.collect();
assert_eq!(
ret.get("/aabbc/staraabbc"),
Some(&(
22,
hashmap![1 => MGenLRU{min_seq: 2, max_seq: 5, last_birth: birth_vec[3], min_lru_index: 0,
lru: vec![GenLRU::new_from_data(2, 0, 86201, birth_vec[0]),
GenLRU::new_from_data(3, 253, 0, birth_vec[1]),
GenLRU::new_from_data(4, 0, 0, birth_vec[2]),
GenLRU::new_from_data(5, 2976, 41252, birth_vec[3])]}]
))
);
remove_test_file();
}
fn setup_test_file() -> BufReader<File> {
let data = r#"
memcg 1 /
node 0
0 589359037 0 -0
1 589359037 12 0
2 589359037 0 0
3 589359037 1265 2471
memcg 2 /justto.slice
node 0
0 589334424 0 0
1 589334424 0 0
2 589334424 0 0
3 589334424 0 0
memcg 3 /justto.slice/justtod-teawated.service
node 0
0 589334423 0 217
1 589334423 8 0
2 589334423 0 0
3 589334423 225 40293
memcg 0 /justto.slice/justtod-readahead-replay.service
node 0
0 589334411 0 266694
1 589334411 0 0
2 589334411 0 0
3 589334411 1 21
memcg 0 /justto.slice/tea0-domainname.service
node 0
0 589334410 0 6
1 589334410 0 0
2 589334410 0 0
3 589334410 0 198
memcg 6 /justto.slice/justto-serial\x2dgetty.slice
node 0
0 589334408 0 1
1 589334408 1 0
2 589334408 0 0
3 589334408 32 0
memcg 7 /justto.slice/justto-getty.slice
node 0
0 589334408 0 0
1 589334408 1 0
2 589334408 0 0
3 589334408 31 0
memcg 8 /justto.slice/sys-kernel-debug.mount
node 0
0 589334407 0 6
1 589334408 0 0
2 589334408 0 0
3 589334408 0 7
memcg 10 /justto.slice/dev-hugepages.mount
node 0
0 589334406 0 1
1 589334406 0 0
2 589334406 0 0
3 589334406 0 0
memcg 0 /justto.slice/justtod-readahead-collect.service
node 0
0 589334405 0 96
1 589334405 0 0
2 589334405 0 0
3 589334405 0 0
memcg 12 /justto.slice/justto-justtod\x2dfsck.slice
node 0
0 589334403 0 25
1 589334403 0 0
2 589334403 0 0
3 589334403 0 239
memcg 13 /justto.slice/justto-selinux\x2dpolicy\x2dmigrate\x2dlocal\x2dchanges.slice
node 0
0 589334403 0 0
1 589334403 0 0
2 589334403 0 0
3 589334403 0 0
memcg 14 /justto.slice/dev-mqueue.mount
node 0
0 589334402 0 0
1 589334402 0 0
2 589334402 0 0
3 589334402 0 0
memcg 0 /justto.slice/tea2-monitor.service
node 0
0 589334401 0 9
1 589334401 0 0
2 589334401 0 0
3 589334401 0 582
memcg 0 /justto.slice/kmod-static-nodes.service
node 0
0 589334399 0 4
1 589334399 1 0
2 589334399 0 0
3 589334399 0 33
memcg 0 /justto.slice/plymouth-start.service
node 0
0 589334397 0 1
1 589334397 0 0
2 589334397 0 0
3 589334397 0 0
memcg 18 /justto.slice/sys-kernel-config.mount
node 0
0 589334396 0 0
1 589334396 0 0
2 589334396 0 0
3 589334396 0 0
memcg 5 /justto.slice/tea2-teaetad.service
node 0
0 589334383 0 4
1 589334383 2 0
2 589334383 0 0
3 589334383 587 14
memcg 0 /justto.slice/justtod-remount-fs.service
node 0
0 589334381 0 4
1 589334381 0 0
2 589334381 0 0
3 589334381 0 11
memcg 0 /justto.slice/justtod-tmpfiles-setup-dev.service
node 0
0 589334380 0 28
1 589334380 0 0
2 589334380 0 0
3 589334380 0 32
memcg 0 /justto.slice/justtod-sysctl.service
node 0
0 589334378 0 42
1 589334378 0 0
2 589334378 0 0
3 589334378 0 13
memcg 0 /justto.slice/justtod-teawate-flush.service
node 0
0 589334367 0 8
1 589334367 0 0
2 589334367 0 0
3 589334367 0 141
memcg 0 /justto.slice/justtod-udev-trigger.service
node 0
0 589334365 0 5
1 589334365 0 0
2 589334365 0 0
3 589334365 0 103
memcg 0 /justto.slice/tea0-readonly.service
node 0
0 589334364 0 163
1 589334364 0 0
2 589334364 0 0
3 589334364 0 35
memcg 0 /justto.slice/justtod-random-seed.service
node 0
0 589334363 0 38
1 589334363 0 0
2 589334363 0 0
3 589334363 0 9
memcg 25 /justto.slice/justtod-udevd.service
node 0
0 589334362 0 12553
1 589334362 249 0
2 589334362 0 0
3 589334362 124 1415
memcg 15 /justto.slice/dev-disk-by\x2dlabel-SWAP.swap
node 0
0 589334085 0 5
1 589334085 0 0
2 589334085 0 0
3 589334085 0 10
memcg 16 /justto.slice/boot.mount
node 0
0 589334035 0 26
1 589334035 0 0
2 589334035 0 0
3 589334035 0 0
memcg 0 /justto.slice/plymouth-read-write.service
node 0
0 589334011 0 9
1 589334011 0 0
2 589334011 0 0
3 589334011 0 27
memcg 0 /justto.slice/tea0-import-state.service
node 0
0 589334008 0 5
1 589334008 0 0
2 589334008 0 0
3 589334008 0 45
memcg 0 /justto.slice/justtod-tmpfiles-setup.service
node 0
0 589333868 0 8
1 589333868 0 0
2 589333868 0 0
3 589333868 0 0
memcg 0 /justto.slice/justtod-update-utmp.service
node 0
0 589333772 0 5
1 589333772 1 0
2 589333772 0 0
3 589333772 0 74
memcg 0 /justto.slice/network.service
node 0
0 589333758 0 2480
1 589333758 0 0
2 589333758 0 0
3 589333758 0 542
memcg 0 /justto.slice/tea0-dmesg.service
node 0
0 589333757 0 35
1 589333757 0 0
2 589333757 0 0
3 589333757 0 37
memcg 0 /justto.slice/cpupower.service
node 0
0 589333755 0 33
1 589333755 0 0
2 589333755 0 0
3 589333755 0 115
memcg 0 /justto.slice/justtod-user-sessions.service
node 0
0 589333749 0 4
1 589333749 0 0
2 589333749 0 0
3 589333749 0 8
memcg 0 /justto.slice/sysstat.service
node 0
0 589333747 0 17
1 589333747 0 0
2 589333747 0 0
3 589333747 0 41
memcg 26 /justto.slice/mcelog.service
node 0
0 589333745 0 41
1 589333745 2 0
2 589333745 0 0
3 589333745 554 37
memcg 27 /justto.slice/dbus.service
node 0
0 589333743 0 82
1 589333743 1 0
2 589333743 0 0
3 589333743 119 216
memcg 20 /justto.slice/syslog-ng.service
node 0
0 589333722 0 5889
1 589333722 2 0
2 589333722 0 0
3 589333722 418 6488
memcg 0 /justto.slice/cpunoturbo.service
node 0
0 589333596 0 1
1 589333596 0 0
2 589333596 0 0
3 589333596 0 0
memcg 4 /justto.slice/staraabbcctl.service
node 0
0 589327556 0 7
1 589327556 2 0
2 589327556 0 0
3 589327556 69 11
memcg 19 /justto.slice/sshd.service
node 0
0 589327547 0 9670
1 589327547 11 0
2 589327547 0 0
3 589327547 2304 421
memcg 0 /justto.slice/vmcore-collect.service
node 0
0 589327544 0 3
1 589327544 0 0
2 589327544 0 0
3 589327544 0 0
memcg 0 /justto.slice/kdump.service
node 0
0 589327543 0 417259
1 589327543 2 0
2 589327543 0 0
3 589327543 0 0
memcg 31 /justto.slice/proc-sys-fs-binfmt_misc.mount
node 0
0 589311768 0 0
1 589311768 0 0
2 589311768 0 0
3 589311768 0 0
memcg 32 /justto.slice/ntpd.service
node 0
0 589297199 0 14
1 589297199 2 0
2 589297199 0 0
3 589297199 120 198
memcg 29 /justto.slice/crond.service
node 0
0 589297184 0 115157
1 589297184 2 0
2 589297184 0 0
3 589297184 195 324
memcg 0 /justto.slice/justtod-tmpfiles-clean.service
node 0
0 588459896 0 8
1 588459896 0 0
2 588459896 0 0
3 588459896 0 0
memcg 9 /docker.slice
node 0
0 589334407 0 13919
1 589334407 7 0
2 589334407 0 0
3 589334407 7254 146884
memcg 17 /aabbc
node 0
0 589327431 0 0
1 589327431 0 0
2 589327431 0 0
3 589327431 0 0
memcg 22 /aabbc/staraabbc
node 1
2 589327430 0 86201
3 589327430 253 0
4 589327430 0 0
5 589327430 2976 41252
memcg 21 /aabbc/TEAE-iaabbc
node 0
0 589324388 0 1
1 589324388 2 3
2 589324388 6 7
3 589324388 8 9
node 1
3 589324388 10 11
4 589324388 12 16
5 589324388 17 18
6 589324388 19 20
memcg 28 /aabbc/teawa_tea
node 0
0 589324387 0 69337
1 589324387 2 0
2 589324387 0 0
3 589324387 1892 6103
memcg 30 /aabbc/tea-loglogl
node 0
0 589324385 20 23
1 589324385 9 23
2 589324385 20 19
3 589324380 3 8
"#;
let mut file = File::create("test_lru_gen").unwrap();
file.write_all(data.as_bytes()).unwrap();
let file = File::open("test_lru_gen").unwrap();
BufReader::new(file)
}
fn remove_test_file() {
fs::remove_file("test_lru_gen").unwrap();
}
}

src/mem-agent/src/misc.rs (new file, 71 lines)
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
pub fn sl() -> slog::Logger {
slog_scope::logger().new(slog::o!("subsystem" => "mem-agent"))
}
#[macro_export]
macro_rules! error {
    ($($arg:tt)*) => {
        slog::error!(crate::misc::sl(), "{}", format_args!($($arg)*))
    }
}
#[macro_export]
macro_rules! warn {
    ($($arg:tt)*) => {
        slog::warn!(crate::misc::sl(), "{}", format_args!($($arg)*))
    }
}
#[macro_export]
macro_rules! info {
    ($($arg:tt)*) => {
        slog::info!(crate::misc::sl(), "{}", format_args!($($arg)*))
    }
}
#[macro_export]
macro_rules! trace {
    ($($arg:tt)*) => {
        slog::trace!(crate::misc::sl(), "{}", format_args!($($arg)*))
    }
}
#[macro_export]
macro_rules! debug {
    ($($arg:tt)*) => {
        slog::debug!(crate::misc::sl(), "{}", format_args!($($arg)*))
    }
}
#[cfg(test)]
pub fn is_test_environment() -> bool {
true
}
#[cfg(not(test))]
pub fn is_test_environment() -> bool {
false
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_environment_check() {
assert!(is_test_environment());
}
#[test]
fn test_log_macro() {
error!("error");
warn!("warn");
info!("info");
trace!("trace");
debug!("debug");
}
}

src/mem-agent/src/proc.rs (new file, 44 lines)
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use anyhow::{anyhow, Result};
use std::fs::File;
use std::io::{BufRead, BufReader};
fn get_meminfo(opt: &str) -> Result<u64> {
let file = File::open("/proc/meminfo")?;
let reader = BufReader::new(file);
for line in reader.lines() {
let line = line?;
if line.starts_with(opt) {
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 2 {
let kb = parts[1].parse::<u64>()?;
return Ok(kb);
}
}
}
Err(anyhow!("no {} found", opt))
}
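`get_meminfo` above scans `/proc/meminfo` for a line beginning with the given key and takes the second whitespace-separated field as a kB value. The core line handling can be sketched in isolation, with a hypothetical `meminfo_kb` helper applied to a line already in hand instead of the live file:

```rust
// Sketch of the per-line handling in get_meminfo: match the key prefix,
// then parse the second field, e.g. "MemFree:   123456 kB" -> 123456.
fn meminfo_kb(line: &str, opt: &str) -> Option<u64> {
    if !line.starts_with(opt) {
        return None;
    }
    line.split_whitespace().nth(1)?.parse().ok()
}

fn main() {
    assert_eq!(meminfo_kb("MemFree:        123456 kB", "MemFree:"), Some(123456));
    assert_eq!(meminfo_kb("SwapFree:       0 kB", "MemFree:"), None);
}
```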
pub fn get_memfree_kb() -> Result<u64> {
get_meminfo("MemFree:")
}
pub fn get_freeswap_kb() -> Result<u64> {
get_meminfo("SwapFree:")
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_get_memfree_kb() {
let memfree_kb = get_memfree_kb().unwrap();
assert!(memfree_kb > 0);
}
}

src/mem-agent/src/psi.rs (new file, 260 lines)
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::info;
use anyhow::{anyhow, Result};
use chrono::{DateTime, Utc};
use std::fs;
use std::fs::File;
use std::fs::OpenOptions;
use std::io::{BufRead, BufReader};
use std::path::PathBuf;
const CGROUP_PATH: &str = "/sys/fs/cgroup/";
const MEM_PSI: &str = "memory.pressure";
const IO_PSI: &str = "io.pressure";
fn find_psi_subdirs() -> Result<PathBuf> {
if PathBuf::from(CGROUP_PATH).is_dir() {
for entry in fs::read_dir(CGROUP_PATH)? {
let entry = entry?;
let path = entry.path();
if path.is_dir() && path.join(MEM_PSI).is_file() && path.join(IO_PSI).is_file() {
    return Ok(path);
}
}
Err(anyhow!(
    "cannot find a cgroup dir with {} and {} in {}",
    MEM_PSI,
    IO_PSI,
    CGROUP_PATH
))
} else {
Err(anyhow!("{} is not a directory", CGROUP_PATH))
}
}
pub fn check(psi_path: &PathBuf) -> Result<PathBuf> {
if crate::misc::is_test_environment() {
return Ok(psi_path.clone());
}
let p = if psi_path.as_os_str().is_empty() {
find_psi_subdirs().map_err(|e| anyhow!("find_psi_subdirs failed: {}", e))?
} else {
psi_path.clone()
};
let mem_psi_path = p.join(MEM_PSI);
let _ = OpenOptions::new()
.read(true)
.write(true)
.open(mem_psi_path.clone())
.map_err(|e| anyhow!("open file {:?} failed: {}", mem_psi_path, e))?;
info!("psi is available at {:?}", p);
Ok(p)
}
fn read_pressure_some_total(file_path: PathBuf) -> Result<u64> {
let file = File::open(file_path).map_err(|e| anyhow!("File::open failed: {}", e))?;
let mut reader = BufReader::new(file);
let mut first_line = String::new();
if reader
    .read_line(&mut first_line)
    .map_err(|e| anyhow!("reader.read_line failed: {}", e))?
    == 0
{
return Err(anyhow!("File is empty"));
}
let parts: Vec<&str> = first_line.split_whitespace().collect();
let total_str = parts.get(4).ok_or_else(|| anyhow!("format is not right"))?;
let val = total_str
.split('=')
.nth(1)
.ok_or_else(|| anyhow!("format is not right"))?;
let total_value = val
.parse::<u64>()
.map_err(|e| anyhow!("parse {} failed: {}", total_str, e))?;
Ok(total_value)
}
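The function above reads only the first (`some`) line of a cgroup pressure file, e.g. `some avg10=0.00 avg60=0.00 avg300=0.00 total=37820`, and extracts the cumulative `total=` stall time in microseconds. A std-only sketch of that field extraction, with a hypothetical helper name:

```rust
// Sketch of the parsing in read_pressure_some_total: the 5th whitespace
// field of the "some" line is "total=<us>"; take the part after '='.
fn some_total(first_line: &str) -> Option<u64> {
    let total = first_line.split_whitespace().nth(4)?; // e.g. "total=37820"
    total.split('=').nth(1)?.parse().ok()
}

fn main() {
    assert_eq!(
        some_total("some avg10=0.00 avg60=0.00 avg300=0.00 total=37820"),
        Some(37820)
    );
    assert_eq!(some_total("garbage"), None);
}
```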
#[derive(Debug, Clone)]
pub struct Period {
path: PathBuf,
last_psi: u64,
last_update_time: DateTime<Utc>,
include_child: bool,
}
impl Period {
pub fn new(path: &PathBuf, include_child: bool) -> Self {
Self {
path: path.to_owned(),
last_psi: 0,
last_update_time: Utc::now(),
include_child,
}
}
fn get_path_pressure_us(&self, psi_name: &str) -> Result<u64> {
let cur_path = self.path.join(psi_name);
let mut parent_val = read_pressure_some_total(cur_path.clone())
.map_err(|e| anyhow!("read_pressure_some_total {:?} failed: {}", cur_path, e))?;
if !self.include_child {
let mut child_val = 0;
let entries = fs::read_dir(self.path.clone())
.map_err(|e| anyhow!("fs::read_dir failed: {}", e))?;
for entry in entries {
let entry = entry.map_err(|e| anyhow!("get path failed: {}", e))?;
let epath = entry.path();
if epath.is_dir() {
let full_path = self.path.join(entry.file_name()).join(psi_name);
child_val += read_pressure_some_total(full_path.clone()).map_err(|e| {
anyhow!("read_pressure_some_total {:?} failed: {}", full_path, e)
})?;
}
}
if parent_val < child_val {
parent_val = 0;
} else {
parent_val -= child_val;
}
}
Ok(parent_val)
}
pub fn get_percent(&mut self) -> Result<u64> {
let now = Utc::now();
let mut psi = self
.get_path_pressure_us(MEM_PSI)
.map_err(|e| anyhow!("get_path_pressure_us MEM_PSI {:?} failed: {}", self.path, e))?;
psi += self
.get_path_pressure_us(IO_PSI)
.map_err(|e| anyhow!("get_path_pressure_us IO_PSI {:?} failed: {}", self.path, e))?;
let mut percent = 0;
if self.last_psi != 0 && self.last_psi < psi && self.last_update_time < now {
let us = (now - self.last_update_time).num_milliseconds() as u64 * 1000;
if us != 0 {
percent = (psi - self.last_psi) * 100 / us;
}
}
self.last_psi = psi;
self.last_update_time = now;
Ok(percent)
}
}
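Since the PSI totals are cumulative stall microseconds, `Period::get_percent` above derives a percentage by dividing the stall delta by the elapsed wall-clock microseconds between samples. The arithmetic can be sketched with plain integers; `stall_percent` is an illustrative name:

```rust
// Sketch of the stall-percentage arithmetic in Period::get_percent:
// percent = (psi_delta_us * 100) / elapsed_us, with the same guards
// (no previous sample, non-increasing counter, zero elapsed time -> 0).
fn stall_percent(last_psi_us: u64, psi_us: u64, elapsed_us: u64) -> u64 {
    if last_psi_us == 0 || psi_us <= last_psi_us || elapsed_us == 0 {
        return 0;
    }
    (psi_us - last_psi_us) * 100 / elapsed_us
}

fn main() {
    // 50_000 us of stall over a 1_000_000 us window => 5%.
    assert_eq!(stall_percent(100_000, 150_000, 1_000_000), 5);
    // A first sample (last == 0) yields 0, matching get_percent.
    assert_eq!(stall_percent(0, 150_000, 1_000_000), 0);
}
```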
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write;
#[test]
fn test_read_pressure_some_total() {
remove_fake_file();
let val = read_pressure_some_total(PathBuf::from(setup_fake_file())).unwrap();
assert_eq!(val, 37820);
remove_fake_file();
}
#[test]
fn test_period() {
remove_fake_cgroup_dir();
let dir = setup_fake_cgroup_dir();
let period = Period::new(&dir, true);
let us = period.get_path_pressure_us(MEM_PSI).unwrap();
assert_eq!(us, 37820);
let us = period.get_path_pressure_us(IO_PSI).unwrap();
assert_eq!(us, 82345);
let period = Period::new(&dir, false);
let us = period.get_path_pressure_us(MEM_PSI).unwrap();
assert_eq!(us, 26688);
let us = period.get_path_pressure_us(IO_PSI).unwrap();
assert_eq!(us, 66879);
remove_fake_cgroup_dir();
}
fn write_fake_file(path: &PathBuf, data: &str) {
let mut file = File::create(path).unwrap();
file.write_all(data.as_bytes()).unwrap();
}
fn setup_fake_file() -> String {
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=37820
full avg10=0.00 avg60=0.00 avg300=0.00 total=28881
"#;
write_fake_file(&PathBuf::from("test_psi"), data);
"test_psi".to_string()
}
fn remove_fake_file() {
let _ = fs::remove_file("test_psi");
}
fn setup_fake_cgroup_dir() -> PathBuf {
let dir = PathBuf::from("fake_cgroup");
fs::create_dir(&dir).unwrap();
let mem_psi = dir.join(MEM_PSI);
let io_psi = dir.join(IO_PSI);
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=37820
full avg10=0.00 avg60=0.00 avg300=0.00 total=28881
"#;
write_fake_file(&mem_psi, data);
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=82345
full avg10=0.00 avg60=0.00 avg300=0.00 total=67890
"#;
write_fake_file(&io_psi, data);
let child_dir = dir.join("c1");
fs::create_dir(&child_dir).unwrap();
let child_mem_psi = child_dir.join(MEM_PSI);
let child_io_psi = child_dir.join(IO_PSI);
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=3344
full avg10=0.00 avg60=0.00 avg300=0.00 total=1234
"#;
write_fake_file(&child_mem_psi, data);
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=5566
full avg10=0.00 avg60=0.00 avg300=0.00 total=5678
"#;
write_fake_file(&child_io_psi, data);
let child_dir = dir.join("c2");
fs::create_dir(&child_dir).unwrap();
let child_mem_psi = child_dir.join(MEM_PSI);
let child_io_psi = child_dir.join(IO_PSI);
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=7788
full avg10=0.00 avg60=0.00 avg300=0.00 total=4321
"#;
write_fake_file(&child_mem_psi, data);
let data = r#"some avg10=0.00 avg60=0.00 avg300=0.00 total=9900
full avg10=0.00 avg60=0.00 avg300=0.00 total=8765
"#;
write_fake_file(&child_io_psi, data);
dir
}
fn remove_fake_cgroup_dir() {
let _ = fs::remove_dir_all("fake_cgroup");
}
}

(new file, 88 lines; filename not shown in this diff)
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use chrono::Duration as ChronoDuration;
use chrono::{DateTime, Utc};
use tokio::time::Duration as TokioDuration;
fn chrono_to_tokio_duration(chrono_duration: ChronoDuration) -> TokioDuration {
if chrono_duration.num_nanoseconds().unwrap_or(0) >= 0 {
TokioDuration::new(
chrono_duration.num_seconds() as u64,
(chrono_duration.num_nanoseconds().unwrap_or(0) % 1_000_000_000) as u32,
)
} else {
TokioDuration::new(0, 0)
}
}
#[derive(Debug, Clone)]
pub struct Timeout {
sleep_duration: ChronoDuration,
start_wait_time: DateTime<Utc>,
}
impl Timeout {
pub fn new(secs: u64) -> Self {
Self {
sleep_duration: ChronoDuration::microseconds(secs as i64 * 1000000),
// Backdate start_wait_time so the first is_timeout() call returns true.
start_wait_time: Utc::now() - ChronoDuration::microseconds(secs as i64 * 1000000) * 2,
}
}
pub fn is_timeout(&self) -> bool {
let now = Utc::now();
now >= self.start_wait_time + self.sleep_duration
}
pub fn reset(&mut self) {
self.start_wait_time = Utc::now();
}
pub fn remaining_tokio_duration(&self) -> TokioDuration {
let now = Utc::now();
if now >= self.start_wait_time + self.sleep_duration {
return TokioDuration::ZERO;
}
chrono_to_tokio_duration(self.start_wait_time + self.sleep_duration - now)
}
pub fn set_sleep_duration(&mut self, secs: u64) {
self.sleep_duration = ChronoDuration::microseconds(secs as i64 * 1000000);
}
}
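The `Timeout` type above backdates its start time so the very first `is_timeout()` check fires, then behaves as a normal resettable deadline. The same trick can be sketched with only `std::time` (no chrono/tokio); `StdTimeout` is an illustrative analog, not the real type:

```rust
use std::time::{Duration, Instant};

// A std-only analog of Timeout, illustrating the "start in the past so
// the first check fires" trick from Timeout::new.
struct StdTimeout {
    sleep_duration: Duration,
    start: Instant,
}

impl StdTimeout {
    fn new(secs: u64) -> Self {
        let sleep_duration = Duration::from_secs(secs);
        Self {
            // checked_sub guards against Instant underflow very early after
            // boot; the fallback loses the backdating in that rare case.
            start: Instant::now()
                .checked_sub(sleep_duration * 2)
                .unwrap_or_else(Instant::now),
            sleep_duration,
        }
    }
    fn is_timeout(&self) -> bool {
        self.start.elapsed() >= self.sleep_duration
    }
    fn reset(&mut self) {
        self.start = Instant::now();
    }
}

fn main() {
    let mut t = StdTimeout::new(5);
    assert!(t.is_timeout()); // fires immediately, like Timeout::new
    t.reset();
    assert!(!t.is_timeout());
}
```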
#[cfg(test)]
mod tests {
use super::*;
use std::thread;
use std::time::Duration;
#[test]
fn test_timeout() {
let mut timeout = Timeout::new(1);
// A newly created Timeout must report timed out immediately.
assert!(timeout.is_timeout());
timeout.reset();
assert!(!timeout.is_timeout());
thread::sleep(Duration::from_secs(2));
assert!(timeout.is_timeout());
timeout.set_sleep_duration(2);
timeout.reset();
assert!(!timeout.is_timeout());
thread::sleep(Duration::from_secs(1));
assert!(!timeout.is_timeout());
thread::sleep(Duration::from_secs(1));
assert!(timeout.is_timeout());
}
}
Some files were not shown because too many files have changed in this diff.