Compare commits

242 Commits

Author SHA1 Message Date
Zvonko Kaiser
b9faafefb6 Create SECURITY.md, SECURITY_CONTACTS
Add an explicit SECURITY.md that reflects Kata’s rolling-release model (monthly cadence, no long-term branches) and sets clear expectations for reporters and downstream users.
With SECURITY.md in place, we also need SECURITY_CONTACTS.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-08-13 14:24:58 +00:00
Dan Mihai
9379a18c8a Merge pull request #11565 from Sumynwa/sumsharma/agent_ctl_vm_boot_support
agent-ctl: Add option "--vm" to boot pod VM for testing.
2025-08-11 09:36:23 -07:00
Sumedh Alok Sharma
c7c811071a agent-ctl: Add option --vm to boot pod VM for testing.
This change introduces a new command line option `--vm`
to boot up a pod VM for testing. The tool connects with
kata agent running inside the VM to send the test commands.
The tool uses `hypervisor` crates from runtime-rs for VM
lifecycle management. Current implementation supports
Qemu & Cloud Hypervisor as VMMs.

In summary:
- tool parses the VMM specific runtime-rs kata config file in
/opt/kata/share/defaults/kata-containers/runtime-rs/*
- prepares and starts a VM using runtime-rs::hypervisor vm APIs
- retrieves the agent's server address to set up the connection
- runs the requested test commands and shuts down the VM

Fixes #11566

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2025-08-11 11:03:18 +00:00
Hyounggyu Choi
407252a863 Merge pull request #11641 from Apokleos/kata-log
runtime-rs: Label system journal log with kata
2025-08-11 08:44:31 +02:00
Alex Lyn
196d7d674d runtime-rs: Label system journal log with kata
Route kata-shim logs directly to systemd-journald under 'kata' identifier.

This refactoring enables `kata-shim` logs to be properly attributed to
'kata' in systemd-journald, instead of inheriting the 'containerd'
identifier.

Previously, `kata-shim` logs were challenging to filter and debug as
they appeared under the `containerd.service` unit.

This commit resolves this by:
1.  Introducing a `LogDestination` enum to explicitly define logging
    targets (File or Journal).
2.  Modifying logger creation to set `SYSLOG_IDENTIFIER=kata` when
    logging to Journald.
3.  Ensuring type safety and correct ownership handling for different
    logging backends.
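
A minimal sketch of the idea in the list above, not the actual runtime-rs code: only the `LogDestination` name, its two targets, and `SYSLOG_IDENTIFIER=kata` come from this commit message. The `PathBuf` payload and the helpers `extra_fields` and `target` are hypothetical.

```rust
use std::path::PathBuf;

/// Logging targets named in the commit message; the PathBuf payload is an assumption.
#[derive(Debug)]
enum LogDestination {
    File(PathBuf),
    Journal,
}

/// Hypothetical helper: extra key/value fields attached to every record.
/// Only the journal target gets the `kata` identifier.
fn extra_fields(dest: &LogDestination) -> Vec<(&'static str, &'static str)> {
    match dest {
        LogDestination::File(_) => Vec::new(),
        LogDestination::Journal => vec![("SYSLOG_IDENTIFIER", "kata")],
    }
}

/// Hypothetical helper: human-readable description of where logs go.
fn target(dest: &LogDestination) -> String {
    match dest {
        LogDestination::File(path) => format!("file:{}", path.display()),
        LogDestination::Journal => String::from("journald"),
    }
}

fn main() {
    let dests = [
        LogDestination::File(PathBuf::from("/tmp/kata-shim.log")),
        LogDestination::Journal,
    ];
    for dest in &dests {
        println!("{} -> {:?}", target(dest), extra_fields(dest));
    }
}
```

Making the destination an explicit enum means the identifier can only be attached on the journal path, which is the attribution change described above.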

This significantly enhances the observability and debuggability of Kata
Containers, making it easier to monitor and troubleshoot Kata-specific
events.

Fixes: #11590

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-10 16:00:36 +08:00
Aurélien Bombo
be148c7f72 Merge pull request #11666 from kata-containers/sprt/static-check-exclude-security-md
ci: static-checks: add SECURITY.md to exclude list
2025-08-08 12:50:29 -05:00
Fabiano Fidêncio
dcbdf56281 Merge pull request #11660 from zvonkok/remove-stable
ci: Remove stable
2025-08-08 14:18:25 +02:00
Xuewei Niu
1d2f2d6350 Merge pull request #11219 from fidencio/topic/version-qemu-bump-to-10.0.0
version: Bump QEMU to v10.0.0
2025-08-08 19:04:45 +08:00
RuoqingHe
aaf8de3dbf Merge pull request #11669 from kevinzs2048/add-timeout
ci: cri-containerd: add 5s timeout for creating sandbox with crictl
2025-08-08 18:25:58 +08:00
Alex Lyn
9816ffdac7 Merge pull request #11653 from Apokleos/align-initdata-annoation
Align initdata annotation with kata-runtime
2025-08-08 16:24:09 +08:00
Kevin Zhao
1aa65167d7 CI: cri-containerd: add 5s timeout for creating sandbox with crictl
After moving the Arm64 CI nodes to new ones, we hit a timeout
when running `crictl runp`. The error is usually:
code = DeadlineExceeded

Fixes: #11662

Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
2025-08-08 15:41:39 +08:00
Fupan Li
b50777a174 Merge pull request #10580 from pmores/make-vcpu-allocation-more-accurate
runtime-rs: make vcpu allocation more accurate
2025-08-08 14:14:40 +08:00
Xuewei Niu
beea0c34c5 Merge pull request #11060 from kata-containers/sprt/vfsd-metadata
runtime: virtio-fs: Support "metadata" cache mode
2025-08-08 11:13:57 +08:00
Fabiano Fidêncio
f9e16431c1 version: Bump QEMU to v10.0.3
As the new release of QEMU is out, let's switch to it and take advantage
of bug fixes and improvements.

QEMU changelog: https://wiki.qemu.org/ChangeLog/10.0

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-08-07 22:31:30 +02:00
Greg Kurz
f9a6359674 Merge pull request #11667 from c3d/bug/11633-qmp
qemu: Respect the JSON schema for hot plug
2025-08-07 16:04:12 +02:00
Aurélien Bombo
6d96875d04 runtime: virtio-fs: Support "metadata" cache mode
The Rust virtiofsd supports a "metadata" cache mode [1] that wasn't
present in the C version [2], so this PR adds support for that.

 [1] https://gitlab.com/virtio-fs/virtiofsd
 [2] https://qemu.weilnetz.de/doc/5.1/tools/virtiofsd.html#cmdoption-virtiofsd-cache

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-07 21:24:40 +08:00
Pavel Mores
69f21692ed runtime-rs: enable vcpu allocation tests in CI
This series should make runtime-rs's vcpu allocation behaviour match the
behaviour of runtime-go so we can now enable pertinent tests which were
skipped so far due to the difference between both shims.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Pavel Mores
00bfa3fa02 runtime-rs: re-adjust config after modifying it with annotations
Configuration information is adjusted after loading from file but so
far, there has been no similar check for configuration coming from
annotations.  This commit introduces re-adjusting config after
annotations have been processed.

A small refactor was necessary as a prerequisite which introduces
function TomlConfig::adjust_config() to make it easier to invoke
the adjustment for a whole TomlConfig instance.  This function is
analogous to the existing validate() function.

The immediate motivation for this change is to make sure that 0
in "default_vcpus" annotation will be properly adjusted to 1 as
is the case if 0 is loaded from a config file.  This is required
to match the golang runtime behaviour.
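
The ordering described above can be sketched as follows. This is an illustration under assumptions, not the runtime-rs implementation: `Config`, `apply_annotation`, and `load` are hypothetical stand-ins, and only the `adjust_config()` name and the "0 becomes 1" rule come from the commit message.

```rust
/// Hypothetical, stripped-down stand-in for the real configuration type.
#[derive(Debug)]
struct Config {
    default_vcpus: f32,
}

impl Config {
    /// The one adjustment named in the commit message: 0 vCPUs becomes 1.
    fn adjust_config(&mut self) {
        if self.default_vcpus == 0.0 {
            self.default_vcpus = 1.0;
        }
    }

    /// Hypothetical annotation handler; unknown keys are ignored.
    fn apply_annotation(&mut self, key: &str, value: &str) -> Result<(), String> {
        if key == "io.katacontainers.config.hypervisor.default_vcpus" {
            self.default_vcpus = value.parse::<f32>().map_err(|e| e.to_string())?;
        }
        Ok(())
    }
}

/// Load from the "file" value, adjust, apply annotations, then adjust again.
fn load(file_value: f32, annotations: &[(&str, &str)]) -> Result<Config, String> {
    let mut cfg = Config { default_vcpus: file_value };
    cfg.adjust_config(); // existing adjustment after loading from file
    for &(key, value) in annotations {
        cfg.apply_annotation(key, value)?;
    }
    cfg.adjust_config(); // the re-adjustment this commit introduces
    Ok(cfg)
}

fn main() {
    let annotations = [("io.katacontainers.config.hypervisor.default_vcpus", "0")];
    let cfg = load(2.0, &annotations).expect("annotation value should parse");
    println!("default_vcpus after annotations: {}", cfg.default_vcpus);
}
```

Without the second `adjust_config()` call, the annotation value 0 would survive unadjusted, which is the mismatch with the golang runtime that the commit removes.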

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Pavel Mores
e2156721fd runtime-rs: add tests to exercise floating-point 'default_vcpus'
Also included (as commented out) is a test that does not pass although
it should.  See source code comment for explanation why fixing this seems
beyond the scope of this PR.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Pavel Mores
1f95d9401b runtime-rs: change representation of default_vcpus from i32 to f32
This commit focuses purely on the formal change of type.  If any subsequent
changes in semantics are needed they are purposely avoided here so that the
commit can be reviewed as a 100% formal and 0% semantic change.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Pavel Mores
cdc0eab8e4 runtime-rs: make sandbox vcpu allocation more accurate
This commit addresses a part of the same problem as PR #7623 did for the
golang runtime.  So far we've been rounding up individual containers'
vCPU requests and then summing them up which can lead to allocation of
excess vCPUs as described in the mentioned PR's cover letter.  We address
this by reversing the order of operations: we sum the (possibly fractional)
container requests and only then round up the total.

We also align runtime-rs's behaviour with runtime-go in that we now
include the default vcpu request from the config file ('default_vcpu')
in the total.

We diverge from PR #7623 in that `default_vcpu` is still treated as an
integer (this will be a topic of a separate commit), and that this
implementation avoids relying on 32-bit floating point arithmetic as there
are some potential problems with using f32.  For instance, some numbers
commonly used in decimal, notably all single-decimal-digit fractions
0.1, 0.2 .. 0.9 except 0.5, are periodic in binary and thus fundamentally
not representable exactly.  Arithmetic performed on such numbers can lead
to surprising results, e.g. adding 0.1 ten times gives 1.0000001, not 1,
and taking a ceil() results in 2, clearly a wrong answer in vcpu
allocation.

So instead, we take advantage of the fact that container requests happen
to be expressed as a quota/period fraction so we can sum up quotas,
fundamentally integral numbers (possibly fractional only due to the need
to rewrite them with a common denominator) with much less danger of
precision loss.
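
The three strategies discussed above can be compared in a small sketch. It is not the runtime-rs code: `CpuRequest` and the function names are hypothetical, and the exact-fraction bookkeeping with a gcd is just one assumed way to "sum up quotas" over a common denominator.

```rust
/// One container's CPU request as a quota/period pair (hypothetical type).
#[derive(Clone, Copy)]
struct CpuRequest {
    quota: u64,
    period: u64,
}

fn gcd(a: u128, b: u128) -> u128 {
    if b == 0 { a } else { gcd(b, a % b) }
}

/// Old behaviour: round every container up, then sum.
fn round_then_sum(reqs: &[CpuRequest]) -> u64 {
    reqs.iter()
        .filter(|r| r.period > 0)
        .map(|r| (r.quota + r.period - 1) / r.period)
        .sum()
}

/// Sum in f32, then round up: exposed to binary rounding error.
fn sum_f32_then_ceil(reqs: &[CpuRequest]) -> u64 {
    let mut total = 0.0f32;
    for r in reqs.iter().filter(|r| r.period > 0) {
        total += r.quota as f32 / r.period as f32;
    }
    total.ceil() as u64
}

/// Sum as an exact fraction num/den, then round up once at the end.
fn sum_exact_then_ceil(reqs: &[CpuRequest]) -> u64 {
    let (mut num, mut den) = (0u128, 1u128);
    for r in reqs.iter().filter(|r| r.period > 0) {
        // num/den + quota/period, rewritten over a common denominator.
        num = num * r.period as u128 + r.quota as u128 * den;
        den *= r.period as u128;
        let g = gcd(num, den);
        num /= g;
        den /= g;
    }
    ((num + den - 1) / den) as u64
}

fn main() {
    // Ten containers, each asking for 0.1 vCPU.
    let reqs = vec![CpuRequest { quota: 10_000, period: 100_000 }; 10];
    println!("round, then sum:      {}", round_then_sum(&reqs));
    println!("sum in f32, then ceil: {}", sum_f32_then_ceil(&reqs));
    println!("exact sum, then ceil:  {}", sum_exact_then_ceil(&reqs));
}
```

For ten containers each requesting 0.1 vCPU, the three functions return 10, 2 and 1 respectively. Only the last matches the 1 vCPU actually requested.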

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Christophe de Dinechin
ec480dc438 qemu: Respect the JSON schema for hot plug
When hot-plugging CPUs on QEMU, we send a QMP command with JSON
arguments. QEMU 9.2 recently became more strict[1] enforcing the
JSON schema for QMP parameters. As a result, running Kata Containers
with QEMU 9.2 results in a message complaining that the core-id
parameter is expected to be an integer:

```
qmp hotplug cpu, cpuID=cpu-0 socketID=1, error:
QMP command failed:
Invalid parameter type for 'core-id', expected: integer
```

Fix that by changing the core-id, socket-id and thread-id to be
integer values.
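
To illustrate the difference, here is a hedged sketch that builds a `device_add` QMP command by hand. The actual fix lives in the Go runtime; this Rust helper, its name, and the example driver string are assumptions for illustration, and no string escaping is attempted.

```rust
/// Hypothetical helper: render a QMP `device_add` command for CPU hotplug.
/// The topology ids are written as JSON integers (no quotes), which is what
/// a strict schema check expects. `driver` and `id` are assumed to need no escaping.
fn cpu_hotplug_command(driver: &str, id: &str, socket_id: u32, core_id: u32, thread_id: u32) -> String {
    format!(
        r#"{{"execute":"device_add","arguments":{{"driver":"{driver}","id":"{id}","socket-id":{socket_id},"core-id":{core_id},"thread-id":{thread_id}}}}}"#
    )
}

fn main() {
    // Driver name chosen for illustration only.
    let cmd = cpu_hotplug_command("host-x86_64-cpu", "cpu-0", 1, 0, 0);
    println!("{cmd}");
}
```

Sending `"core-id":"0"` (a JSON string) instead of `"core-id":0` is the kind of mismatch that produces the "expected: integer" error quoted above.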

[1]: be93fd5372

Fixes: #11633

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2025-08-07 09:13:57 +02:00
Alex Lyn
37685c41c7 runtime-rs: Correct the corresponding initdata annotation const
As we have changed the initdata annotation definition, we also need to
correct its const definition, KATA_ANNO_CFG_RUNTIME_INIT_DATA, accordingly.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-07 10:45:28 +08:00
Alex Lyn
163f04a918 Merge pull request #11651 from microsoft/danmihai1/debug-kubectl-logs
tests: k8s-sandbox-vcpus-allocation debug info
2025-08-07 10:27:29 +08:00
Aurélien Bombo
e3b4d87b6d ci: static-checks: add SECURITY.md to exclude list
This adds SECURITY.md to the list of GH-native files that should be excluded by
the reference checker.

Today this is useful for downstreams who already have a SECURITY.md file for
compliance reasons. When Kata onboards that file, this commit will also be
required.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-06 11:24:52 -05:00
Zvonko Kaiser
1b1b3af9ab ci: Remove trigger for stable branch
We do not support stable branches anymore,
remove the trigger for it.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-08-06 09:22:24 +08:00
Hyounggyu Choi
af01434226 Merge pull request #11646 from kata-containers/sprt/param-static-checks
ci: static-checks: Auto-detect repo by default
2025-08-05 22:13:20 +02:00
Alex Lyn
ede773db17 kata-types: Align the initdata annotation with kata-runtime's definition
To make it work within CI, align the annotation with kata-runtime's
definition, "io.katacontainers.config.runtime.cc_init_data".

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-03 22:51:39 +08:00
Dan Mihai
05eca5ca25 tests: k8s-sandbox-vcpus-allocation debug info
Print more details about the behavior of "kubectl logs", trying to understand
errors like:

https://github.com/kata-containers/kata-containers/actions/runs/16662887973/job/47164791712

not ok 1 Check the number vcpus are correctly allocated to the sandbox
 (in test file k8s-sandbox-vcpus-allocation.bats, line 37)
   `[ `kubectl logs ${pods[$i]}` -eq ${expected_vcpus[$i]} ]' failed with status 2
 No resources found in kata-containers-k8s-tests namespace.
...
 k8s-sandbox-vcpus-allocation.bats: line 37: [: -eq: unary operator expected

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-08-01 20:09:17 +00:00
Aurélien Bombo
c47bff6d6a Merge pull request #11637 from kata-containers/sprt/remove-install-az-cli
gha: Remove unnecessary install-azure-cli step
2025-08-01 09:34:46 -05:00
Fabiano Fidêncio
82f141a02e Merge pull request #11632 from burgerdev/codegen
runtime: reproducible generation of Golang proto bindings
2025-07-31 23:49:18 +02:00
Fabiano Fidêncio
7198c8789e Merge pull request #11639 from zvonkok/gpu_guest_components
gpu: guest components
2025-07-31 21:42:31 +02:00
Aurélien Bombo
9585e608e5 ci: static-checks: Auto-detect repo by default
This auto-detects the repo by default (instead of having to specify
KATA_DEV_MODE=true) so that forked repos can leverage the static-checks.yaml CI
check without modification.

An alternative would have been to pass the repo in static-checks.yaml. However,
because of the matrix, this would've changed the check name, which is a pain to
handle in either the gatekeeper/GH UI.

Example fork failure:
https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142421739#step:8:75

I've tested this change to work in a fork.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-07-31 14:33:24 -05:00
Zvonko Kaiser
8422411d91 gpu: Add coco guest components
The second stage needs to consider the coco guest components

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-31 17:11:21 +00:00
Markus Rudy
3fd354b991 ci: add codegen to static-checks
Signed-off-by: Markus Rudy <mr@edgeless.systems>

Fixes: #11631

Co-authored-by: Steve Horsman <steven@uk.ibm.com>
2025-07-31 17:58:25 +01:00
Markus Rudy
9e38fd2562 tools: add image for Go proto bindings
In order to have a reproducible code generation process, we need to pin
the versions of the tools used. This is accomplished easiest by
generating inside a container.

This commit adds a container image definition with fixed dependencies
for Golang proto/ttrpc code generation, and changes the agent Makefile
to invoke the update-generated-proto.sh script from within that
container.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2025-07-31 17:58:25 +01:00
Markus Rudy
f7a36df290 runtime: generate proto files
The generated Go bindings for the agent are out of date. This commit
was produced by running
src/agent/src/libs/protocols/hack/update-generated-proto.sh with
protobuf compiler versions matching those of the last run, according to
the generated code comments.

Since there are new RPC methods, those needed to be added to the
HybridVSockTTRPCMockImp.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2025-07-31 17:58:25 +01:00
Fabiano Fidêncio
d077ed4c1e Merge pull request #11645 from kata-containers/topic/fix-kbuild-sign-pin-issue
build: nvidia: Fix KBUILD_SIGN_PIN breakage
2025-07-31 18:31:34 +02:00
Fabiano Fidêncio
8d30b84abd build: nvidia: Fix KBUILD_SIGN_PIN breakage
We only need KBUILD_SIGN_PIN exported when building nvidia related
artefacts.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-31 16:39:20 +02:00
Fabiano Fidêncio
20bef41347 Merge pull request #11236 from kata-containers/amd64-nvidia-gpu-cicd
gpu: AMD64 NVIDIA GPU CI/CD
2025-07-31 14:52:01 +02:00
Aurélien Bombo
96f1d95de5 gha: Remove unnecessary install-azure-cli step
az cli is already installed by the azure/login action.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-07-30 10:42:56 -05:00
Zvonko Kaiser
fbb0e7f2f2 gpu: Add secrets passthrough to the workflow
We need to pass-through the secrets in all the needed workflows
ci, ci-on-push, ci-nightly, ci-devel

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:51:01 +00:00
Zvonko Kaiser
30778594d0 gpu: Add arm64-nvidia-a100 to actionlint.yaml
Make zizmor happy about our custom runner label

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
8768e08258 gpu: Add embedding service
For a simple RAG pipeline, add an embedding service

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
254dbd9b45 gpu: Add Pod spec for NIM llama
Pod spec for the NIM inferencing service

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
568b13400a gpu: Add NIM bats test
We're running a simple NIM container to test if the GPUs
are working properly

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
6188b7f79f gpu: Add run_kubernetes_nv_tests.sh
Replicate what we have for run_tests and run .bats files

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
9a829107ba gpu: Add selector for k8s tests
We want to reuse the current run_tests with GPUs, introduce a var
that will define what to run.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
7669f1fbd1 gpu: Add NVIDIA GPU test block for amd64
Once we have the amd64 artifacts we can run some arm64 k8s tests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:59 +00:00
Zvonko Kaiser
97d7575d41 gpu: Disable metrics tests
We are not running the metrics tests anyway for now, so
let's make room to run the GPU tests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-30 13:45:58 +00:00
Anastassios Nanos
00e0db99a3 Merge pull request #11627 from itsmohitnarayan/FirecrackerVersionUpdate
2025-07-30 13:59:55 +03:00
Kumar Mohit
5cccbb9f41 versions: Upgrade Firecracker Version to 1.12.1
Updated versions.yaml to use Firecracker v1.12.1.
Replaced firecracker and jailer binaries under /opt/kata/bin.

Tested with kata-fc runtime on Kubernetes:
- Deployed pods using gitpod/openvscode-server
- Verified microVM startup, container access, and Firecracker usage
- Confirmed Firecracker and jailer versions via CLI

Signed-off-by: Kumar Mohit <68772712+itsmohitnarayan@users.noreply.github.com>
2025-07-30 12:51:08 +05:30
Saul Paredes
1aaaef2134 Merge pull request #11553 from microsoft/danmihai1/genpolicy-cleanup
genpolicy: reduce complexity
2025-07-28 14:32:59 -07:00
Dan Mihai
c11c972465 genpolicy: config layer logging clean-up
Use a simple debug!() for logging the config_layer string, instead of
transcoding, etc.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-07-28 18:30:13 +00:00
Dan Mihai
30bfa2dfcc genpolicy: use CoCo settings by default
- "confidential_emptyDir" becomes "emptyDir" in the settings file.
- "confidential_configMap" becomes "configMap" in settings.
- "mount_source_cpath" becomes "cpath".
- The new "root_path" gets used instead of the old "cpath" to point to
  the container root path.
- "confidential_guest" is no longer used. By default it gets replaced
  by "enable_configmap_secret_storages"=false, because CoCo is using
  CopyFileRequest instead of the Storage data structures for ConfigMap
  and/or Secret volume mounts during CreateContainerRequest.
- The value of "guest_pull" becomes true by default.
- "image_layer_verification" is no longer used - just CoCo's guest pull
  is supported.
- The Request input files from unit tests are changing to reflect the
  new default settings values described above.
- tests/integration/kubernetes/tests_common.sh adjusts the settings for
  platforms that are not set-up for CoCo during CI (i.e., platforms
  other than SNP, TDX, and CoCo Dev).

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-07-28 18:30:13 +00:00
Dan Mihai
94995d7102 genpolicy: skip pulling layers for guest-pull
Skip pulling container image layers when guest-pull=true. The contents
of these layers were ignored due to:
- #11162, and
- tarfs snapshotter support having been removed from genpolicy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-07-28 18:30:13 +00:00
Dan Mihai
f6016f4f36 genpolicy: remove tarfs snapshotter support
AKS Confidential Containers are using the tarfs snapshotter. CoCo
upstream doesn't use this snapshotter, so remove this Policy complexity
from upstream.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-07-28 18:30:10 +00:00
Steve Horsman
077c59dd1f Merge pull request #11385 from wainersm/ci_make_coco_nontee_required
ci/gatekeeper: make run-k8s-tests-coco-nontee job required
2025-07-28 14:16:23 +01:00
Steve Horsman
74fba9c736 Merge pull request #11619 from kata-containers/install-dependencies-gh-cli
ci: Try passing api token into github api call
2025-07-28 13:35:12 +01:00
Xuewei Niu
2a3c8b04df Merge pull request #11613 from RuoqingHe/clippy-fix-for-libs-20250721
mem-agent: Ignore Cargo.lock
2025-07-28 17:45:29 +08:00
RuoqingHe
3f46347dc5 Merge pull request #11618 from RuoqingHe/fix-dragonball-default-build
dragonball: Fix warnings in default build
2025-07-28 11:24:46 +08:00
Xuewei Niu
e5d5768c75 Merge pull request #11626 from RuoqingHe/bump-cloud-hypervisor-v47
versions: Upgrade to Cloud Hypervisor v47.0
2025-07-28 10:34:45 +08:00
Ruoqing He
4ca6c2d917 mem-agent: Ignore Cargo.lock
`mem-agent` is now a library and does not contain examples; ignore
Cargo.lock to get rid of untracked file noise produced by `cargo run` or
`cargo test`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-28 10:32:46 +08:00
Ruoqing He
3ec10b3721 runtime: clh: Re-generate client code against v47.0
Re-generates the client code against Cloud Hypervisor v47.0.

Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 20:44:14 +02:00
Ruoqing He
14e9d2c815 versions: Upgrade to Cloud Hypervisor v47.0
Details of v47.0 release can be found in our roadmap project as
iteration v47.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 20:42:24 +02:00
Xuewei Niu
6f6d64604f Merge pull request #11598 from justxuewei/cgroups
2025-07-25 17:53:03 +08:00
Hyounggyu Choi
860779c4d9 Merge pull request #11621 from Apokleos/enhance-copyfile
runtime-rs: Some extra work to enhance copyfile with sharedfs disabled
2025-07-25 11:27:03 +02:00
Ruoqing He
639273366a dragonball: Gate MmapRegion behind virtio-fs
`MmapRegion` is only used when `virtio-fs` is enabled while testing
dragonball; gate the import behind the `virtio-fs` feature.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 09:09:35 +00:00
Ruoqing He
2e81ac463a dragonball: Allow unused to suppress warnings
Some variables go unused if certain features are not enabled; use
`#[allow(unused)]` to suppress those warnings for the time being.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 09:07:19 +00:00
Ruoqing He
5f7da1ccaa dragonball: Silence never read fields
Some fields in structures used for testing purposes are never read;
rename them to make that explicit.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 09:07:19 +00:00
Ruoqing He
225e6fffbc dragonball: Gate VcpuManagerError behind host-device
`VcpuManagerError` is only needed when the `host-device` feature is enabled;
gate the import behind that feature.
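
As a generic illustration of this kind of gating (not the dragonball code), the sketch below compiles an import and a function only when a Cargo feature is enabled. The feature names are reused from these commits, but `device_table` and `enabled_features` are hypothetical.

```rust
// The import exists only when the feature is on, so a default build
// does not warn about an unused import.
#[cfg(feature = "host-device")]
use std::collections::HashMap;

/// Hypothetical function that needs the gated import.
#[cfg(feature = "host-device")]
#[allow(unused)]
fn device_table() -> HashMap<String, u32> {
    HashMap::new()
}

/// Report which of the two example features were compiled in.
fn enabled_features() -> Vec<&'static str> {
    // `mut` is unused when no feature is enabled; suppress that warning.
    #[allow(unused_mut)]
    let mut features = Vec::new();
    #[cfg(feature = "host-device")]
    features.push("host-device");
    #[cfg(feature = "virtio-fs")]
    features.push("virtio-fs");
    features
}

fn main() {
    println!("compiled-in features: {:?}", enabled_features());
}
```

Gating the import removes the warning at its source, whereas `#[allow(...)]` only silences it, which is presumably why the neighbouring commit describes the latter as a stopgap.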

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 09:07:19 +00:00
Ruoqing He
0502b05718 dragonball: Remove with-serde feature assertion
Code inside the `test_mac_addr_serialization_and_deserialization` test does
not actually require the `with-serde` feature; remove the
assertion here to enable this test.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-25 09:05:55 +00:00
Xuewei Niu
60e3679eb7 runtime-rs: Add full cgroups support on host
Add full cgroups support on host. Cgroups are managed by `FsManager` and
`SystemdManager`. As the names imply, the `FsManager` manages cgroups
through cgroupfs, while the `SystemdManager` manages cgroups through
systemd. Both managers support cgroup v1 and cgroup v2.

Two types of cgroups path are supported:

1. For colon paths, for example "foo.slice:bar:baz", the runtime manages
cgroups by `SystemdManager`;
2. For relative/absolute paths, the runtime manages cgroups by
`FsManager`.
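
The two path types above could be told apart roughly as follows. This is a sketch under an assumption: a "colon path" is taken to mean exactly three colon-separated fields, as in the example. The enum and function names are hypothetical; only `SystemdManager` and `FsManager` come from the commit message.

```rust
/// Which manager handles a given cgroups path (hypothetical enum).
#[derive(Debug, PartialEq)]
enum ManagerKind {
    /// Corresponds to `SystemdManager` in the commit message.
    Systemd,
    /// Corresponds to `FsManager` in the commit message.
    Fs,
}

/// Assumed rule: "a:b:c" (three colon-separated fields) is a systemd-style
/// path; every other relative or absolute path goes through cgroupfs.
fn manager_for_path(path: &str) -> ManagerKind {
    if path.split(':').count() == 3 {
        ManagerKind::Systemd
    } else {
        ManagerKind::Fs
    }
}

fn main() {
    for path in ["foo.slice:bar:baz", "/kubepods/pod1234", "kata/sandbox"] {
        println!("{path} -> {:?}", manager_for_path(path));
    }
}
```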

With cgroup v1 + cgroupfs, vCPU threads are added to the sandbox cgroups.
In all other cases (cgroup v1 + systemd, cgroup v2 + cgroupfs, cgroup v2 +
systemd), the VMM process is added to the cgroups.

systemd doesn't provide a way to add a thread to a unit, so `add_thread()`
in `SystemdManager` is equivalent to `add_process()`.

Cgroup v2 supports threaded mode. However, we should enable threaded mode
from leaf node to the root node (`/`) iteratively [1]. This means the
runtime needs to modify the cgroups created by container runtime (e.g.
containerd). Considering cgroupfs + cgroup v2 is not a common combination,
its behavior is aligned with systemd + cgroup v2, which cannot manage
processes at the thread level.
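
The resulting decision table is small enough to write down directly. The sketch below only restates the rule from this commit message; all type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum CgroupVersion {
    V1,
    V2,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Driver {
    Cgroupfs,
    Systemd,
}

/// What gets placed into the sandbox cgroups.
#[derive(Debug, PartialEq)]
enum Attach {
    VcpuThreads,
    VmmProcess,
}

/// Thread-level placement only for cgroup v1 + cgroupfs; every other
/// combination falls back to adding the whole VMM process.
fn attach_granularity(version: CgroupVersion, driver: Driver) -> Attach {
    match (version, driver) {
        (CgroupVersion::V1, Driver::Cgroupfs) => Attach::VcpuThreads,
        _ => Attach::VmmProcess,
    }
}

fn main() {
    for version in [CgroupVersion::V1, CgroupVersion::V2] {
        for driver in [Driver::Cgroupfs, Driver::Systemd] {
            println!("{version:?} + {driver:?} -> {:?}", attach_granularity(version, driver));
        }
    }
}
```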

1: https://www.kernel.org/doc/html/v4.18/admin-guide/cgroup-v2.html#threads

Fixes: #11356

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-25 14:52:55 +08:00
alex.lyn
613dba6f1f runtime-rs: Some extra work to enhance copyfile with sharedfs disabled
For several reasons, copyfile handling should first be aligned with
runtime-go; this commit does that work.

Fixes #11543

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-25 11:39:20 +08:00
Xuewei Niu
6aa3517393 tests: Prevent the shim from being killed in k8s-oom test
The actual memory usage on the host is equal to the hypervisor memory usage
plus the user memory usage. The OOM killer might kill the shim when the
memory limit on the host is the same as that of the container and the
container consumes all available memory. In this case, containerd never
receives an OOM event, but gets a "task exit" event instead. That makes the
`k8s-oom.bats` test fail.

The fix is to add a new container to increase the sandbox memory limit.
When the container "oom-test" is killed by OOM killer, there is still
available memory for the shim, so it will not be killed.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-24 23:44:21 +08:00
Steve Horsman
c762a3dd4f Merge pull request #11372 from kata-containers/dependabot/cargo/src/dragonball/openssl-af8515b6e0
build(deps): bump the openssl group across 4 directories with 1 update
2025-07-24 13:27:24 +01:00
Fupan Li
fdbe549368 Merge pull request #11547 from Apokleos/virtio-scsi
runtime-rs: support block device driver virtio-scsi within qemu-rs
2025-07-24 18:02:11 +08:00
Xuewei Niu
635272f3e8 runtime-rs: Ignore SIGTERM signal in shim
When enabling systemd cgroup driver and sandbox cgroup only, the shim is
under a systemd unit. When the unit is stopping, systemd sends SIGTERM to
the shim. The shim can't exit immediately, as there are some cleanups to
do. Therefore, ignoring SIGTERM is required here. The shim should complete
the work within a period (Kata sets it to 300s by default). Once a timeout
occurs, systemd will send SIGKILL.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-24 17:15:15 +08:00
Xuewei Niu
79f29bc523 runtime-rs: QEMU get_thread_ids() returns real vCPU's tids
The information is obtained through QMP query_cpus_fast.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-24 17:15:15 +08:00
stevenhorsman
475baf95ad ci: Try passing api token into github api call
Our CI keeps on getting
```
jq: error (at <stdin>:1): Cannot index string with string "tag_name"
```
during the install dependencies phase, which I suspect
might be due to github rate limits being reduced, so try
to pass through the `GH_TOKEN` env and use it in the auth header.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-24 08:49:32 +01:00
alex.lyn
b40d65bc1b runtime-rs: support block device driver virtio-scsi within qemu-rs
It is important that we continue to support VirtIO-SCSI. While
VirtIO-BLK is a common choice, virtio-scsi offers significant
performance advantages in specific scenarios, particularly when
utilizing iothreads and with NVMe Fabrics.

Supporting both virtio-blk and virtio-scsi gives users the flexibility to
choose the optimal storage interface for their specific workload
requirements and hardware configurations.

The virtio-scsi controller is already created when the QEMU VM starts with
the block device driver set to `virtio-scsi`. This commit uses QMP to
blockdev_add the backend block device and device_add the frontend
virtio-scsi device.

Fixes #11516

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 14:00:02 +08:00
alex.lyn
e683a7fd37 runtime-rs: Change the device_id with block device index
The block device index is a unique id of a block device and can identify
it just as device_id does. Because the index is required to calculate the
SCSI LUN, and to avoid redundant arguments when reusing
`hotplug_block_device`, replace device_id with the block device index.
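
The commit message does not say how the LUN is derived from the index. As a purely illustrative sketch, the code below assumes a convention of 256 LUNs per SCSI id, so the index splits into `index / 256` and `index % 256`. The function name, the limit, and the "id:lun" string format are all assumptions, not taken from this commit.

```rust
/// Assumed upper bound on the block device index for this sketch.
const MAX_SCSI_INDEX: u32 = 65_535;

/// Hypothetical mapping from a block device index to a "scsi_id:lun" string,
/// assuming 256 LUNs per SCSI id. Returns None when the index is out of range.
fn scsi_addr_from_index(index: u32) -> Option<String> {
    if index > MAX_SCSI_INDEX {
        return None;
    }
    let scsi_id = index / 256;
    let lun = index % 256;
    Some(format!("{scsi_id}:{lun}"))
}

fn main() {
    for index in [0u32, 1, 255, 256, 257, 70_000] {
        match scsi_addr_from_index(index) {
            Some(addr) => println!("index {index} -> {addr}"),
            None => println!("index {index} -> out of range"),
        }
    }
}
```

Whatever the exact formula, deriving the address from the index alone is what lets the index replace a separate device_id argument.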

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
4521cae0c0 runtime-rs: Support AIO for hotplugging block device within qemu
This commit adds block device AIO support to hotplug_block_device for
qemu via qmp, with "iouring" set as the default.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
b4d276bc2b runtime-rs: Handle virtio-scsi within device manager
When the driver_option is virtio-scsi, create_block_device in the device
manager should handle it correctly.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
fbd84fd3f4 runtime-rs: Support virtio-scsi device within handle_block_volume
It supports handling scsi device when block device driver is `scsi`.
And it will ensure a correct storage source with LUN.

Fixes #11516

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
57645c0786 runtime-rs: Add support for block device AIO
In this commit, three block device aio modes are introduced and
"iouring" is set as the default.
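
A minimal sketch of how three modes with a default could be modelled. The mode spellings `threads`, `native` and `io_uring` follow the option list in the related configuration commit in this series. The enum, its name, and the helper are hypothetical and are not the runtime-rs types.

```rust
use std::str::FromStr;

/// Hypothetical representation of the three AIO modes; io_uring is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum BlockDeviceAio {
    Threads,
    Native,
    #[default]
    IoUring,
}

impl FromStr for BlockDeviceAio {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "threads" => Ok(Self::Threads),
            "native" => Ok(Self::Native),
            "io_uring" => Ok(Self::IoUring),
            other => Err(format!("unsupported block device aio mode: {other}")),
        }
    }
}

/// Use the default when the option is absent, otherwise parse it strictly.
fn aio_from_config(value: Option<&str>) -> Result<BlockDeviceAio, String> {
    match value {
        None => Ok(BlockDeviceAio::default()),
        Some(s) => s.parse(),
    }
}

fn main() {
    println!("{:?}", aio_from_config(None));
    println!("{:?}", aio_from_config(Some("native")));
    println!("{:?}", aio_from_config(Some("bogus")));
}
```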

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
40e6aacc34 runtime-rs: Introduce scsi_addr within BlockConfig for SCSI devices
It's used to help discover scsi devices inside guest and also add a
new const value `KATA_SCSI_DEV_TYPE` to help pass information.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
125383e53c runtime-rs: Add support for configurable block device aio
AIO is the I/O mechanism used by qemu with options:
- threads
  Pthread based disk I/O.
- native
  Native Linux I/O.
- io_uring (default mode)
  Linux io_uring API. This provides the fastest I/O operations on

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:56:52 +08:00
dependabot[bot]
ef9d960763 build(deps): bump the openssl group across 4 directories with 1 update
Bumps the openssl group with 1 update in the /src/dragonball directory: [openssl](https://github.com/sfackler/rust-openssl).
Bumps the openssl group with 1 update in the /src/runtime-rs directory: [openssl](https://github.com/sfackler/rust-openssl).
Bumps the openssl group with 1 update in the /src/tools/genpolicy directory: [openssl](https://github.com/sfackler/rust-openssl).
Bumps the openssl group with 1 update in the /src/tools/kata-ctl directory: [openssl](https://github.com/sfackler/rust-openssl).


Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

---
updated-dependencies:
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: openssl
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: openssl
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: openssl
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: openssl
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-23 15:17:12 +00:00
Fabiano Fidêncio
58925714d2 Merge pull request #11579 from Apokleos/fix-hotplug-blk
runtime-rs: Support hotplugging host block devices within qemu-rs
2025-07-23 11:10:04 +02:00
alex.lyn
a12ae58431 runtime-rs: Support hotplugging host block devices within qemu-rs
Although the previous implementation of block device hotplug via QMP
can hot-plug regular file-backed block devices, it fails when the
backend is a host block device such as /dev/xxx (e.g. /dev/loop0),
because it lacks the ability to hotplug host block devices.

This commit will fill the gap, and make it work well for host block
devices.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-22 15:40:03 +08:00
Fabiano Fidêncio
acae4480ac Merge pull request #11604 from fidencio/release/3.19.1
release: Bump version to 3.19.1
2025-07-22 09:00:15 +02:00
Fabiano Fidêncio
0220b4d661 release: Bump version to 3.19.1
A few moderate-severity security vulnerability fixes were missed in
the 3.19.0 release.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-21 20:09:21 +02:00
Steve Horsman
09efcfbd86 Merge pull request #11606 from kata-containers/dependabot/cargo/src/tools/genpolicy/zerocopy-0.6.6
build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy
2025-07-21 18:58:56 +01:00
Steve Horsman
9f04d8e121 Merge pull request #11605 from kata-containers/dependabot/cargo/src/tools/kata-ctl/unsafe-libyaml-0.2.11
build(deps): bump unsafe-libyaml from 0.2.9 to 0.2.11 in /src/tools/kata-ctl
2025-07-21 18:50:01 +01:00
dependabot[bot]
a9c8377073 build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy
---
updated-dependencies:
- dependency-name: zerocopy
  dependency-version: 0.6.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-21 12:50:38 +00:00
dependabot[bot]
0b4c434ece build(deps): bump unsafe-libyaml in /src/tools/kata-ctl
Bumps [unsafe-libyaml](https://github.com/dtolnay/unsafe-libyaml) from 0.2.9 to 0.2.11.
- [Release notes](https://github.com/dtolnay/unsafe-libyaml/releases)
- [Commits](https://github.com/dtolnay/unsafe-libyaml/compare/0.2.9...0.2.11)

---
updated-dependencies:
- dependency-name: unsafe-libyaml
  dependency-version: 0.2.11
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-21 12:46:27 +00:00
Fabiano Fidêncio
35629d0690 Merge pull request #11603 from stevenhorsman/security-updates-21-jul
dependencies: More crate bumps to resolve security issues
2025-07-21 14:33:07 +02:00
stevenhorsman
162ba19b85 agent-ctl: Bump rustls
Bump rustls to >=0.23.18 to remediate RUSTSEC-2024-0399

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:41:59 +01:00
stevenhorsman
42339e9cdf dragonball: Update url crate
Update url to 2.5.4 to bump idna to 1.0.3 and remediate
RUSTSEC-2024-0421

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:35:05 +01:00
stevenhorsman
1795361589 runk: Update rustjail
Update the rustjail crate to pull in the latest security fixes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:31:18 +01:00
stevenhorsman
28929f5b3e runtime: Bump prometheus
Bump this crate to remove the old version of protobuf
and remediate RUSTSEC-2024-0437

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:29:57 +01:00
stevenhorsman
e66aa1ef8c runtime: Bump prometheus and ttrpc-codegen
Bump these crates to remove the old version of protobuf
and remediate RUSTSEC-2024-0437

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:29:39 +01:00
Fabiano Fidêncio
d60513ece9 Merge pull request #11597 from kata-containers/topic/fix-release-static-tarball-content
release: Copy the VERSION file to the tarball
2025-07-20 21:06:40 +02:00
Fabiano Fidêncio
55aae75ed7 shellcheck: Fix issues on kata-deploy-merge-builds.sh
As we're already touching the file, let's get those fixed.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-20 09:33:50 +02:00
Fabiano Fidêncio
aaeb3b3221 release: Copy the VERSION file to the tarball
For the release itself, let's simply copy the VERSION file to the
tarball.

To do so, we had to change the logic that merges the build, as at that
point the tag is not yet pushed to the repo.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-20 00:06:14 +02:00
Fabiano Fidêncio
21ccaf4a80 Merge pull request #11596 from fidencio/release/v3.19.0
release: Bump version to 3.19.0
2025-07-19 18:27:36 +02:00
Fabiano Fidêncio
60f312b4ae release: Bump version to 3.19.0
Bump VERSION and helm-chart versions

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-19 09:11:30 +02:00
Fabiano Fidêncio
1351ccb2de Merge pull request #11576 from Tim-Zhang/update-protobuf-to-fix-CVE-2025-53605
chore: Update protobuf to fix CVE-2025-53605
2025-07-19 07:43:13 +02:00
Fabiano Fidêncio
7f5f032aca runtime-rs: Update containerd-shim / containerd-shim-protos
Let's bump those to their 0.10.0 releases, which contain fixes for the
CVE-2025-53605.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-19 00:18:01 +02:00
Fabiano Fidêncio
6dc4c0faae Merge pull request #11589 from fidencio/topic/fix-tdx-qemu-path-for-non-gpu
qemu: tdx: Fix binary path for non-gpu TDX
2025-07-18 17:24:00 +02:00
Tim Zhang
2fe9df16cc agent-ctl: update Cargo.lock to fix CVE-2025-53605
Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/392
Fixes: #11570

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 16:13:25 +02:00
Tim Zhang
45b44742de genpolicy: update Cargo.lock to fix CVE-2025-53605
Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/394
Fixes: #11570

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 16:10:52 +02:00
Tim Zhang
fa9ff1b299 kata-ctl: update prometheus/protobuf to fix CVE-2025-53605
Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/395
Fixes: #11570

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 16:05:13 +02:00
Tim Zhang
d0e7a51f7b dragonball: update prometheus/protobuf to fix CVE-2025-53605
Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/396
Fixes: #11570

Signed-off-by: Tim Zhang <tim@hyper.sh>
2025-07-18 16:02:29 +02:00
Tim Zhang
222393375a agent: update ttrpc-codegen to remove dependency on protobuf v2
To fix CVE-2025-53605.

Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/397
Fixes: #11570

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 16:02:07 +02:00
Fabiano Fidêncio
60c3d89767 Merge pull request #11558 from gmintoco/feature/helm-nodeSelector
helm: add nodeSelector support to kata-deploy chart
2025-07-18 15:52:19 +02:00
Fabiano Fidêncio
3143787f69 qemu: tdx: Fix binary path for non-gpu TDX
On commit 90bc749a19, we've changed the
QEMUTDXPATH in order to get it to work with GPUs, but the change broke
the non-GPU TDX use-case, which depends on the distro binary.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 15:26:27 +02:00
Fabiano Fidêncio
497a3620c2 tests: Remove references to qemu-sev
As it's been removed from our codebase.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 12:49:54 +02:00
Fabiano Fidêncio
17ce44083c runtime: Remove reference to sev package
Otherwise it'll just break static checks.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-18 12:49:54 +02:00
Gus Minto-Cowcher
3b5cd2aad6 helm: remove qemu-sev references
qemu-sev support has been removed, but those bits were left behind by
mistake.

Signed-off-by: Gus Minto-Cowcher <gus@basecamp-research.com>
2025-07-18 12:49:54 +02:00
Gus Minto-Cowcher
41d41d51f7 helm: add nodeSelector support to kata-deploy chart
- Add nodeSelector configuration to values.yaml with empty default
- Update DaemonSet template to conditionally include nodeSelector
- Add documentation and examples for nodeSelector usage in README
- Allows users to restrict kata-containers deployment to specific nodes by labeling them

Signed-off-by: Gus Minto-Cowcher <gus@basecamp-research.com>
2025-07-18 12:49:54 +02:00
Fabiano Fidêncio
7d709a0759 Merge pull request #11493 from stevenhorsman/agent-ctl-tag-cache
ci: cache: Tag agent-ctl cache
2025-07-18 12:12:46 +02:00
Fabiano Fidêncio
4a6c718f23 Merge pull request #11584 from zvonkok/fix-kernel-debug-enabled
kernel: fix enable kernel debug
2025-07-18 11:38:36 +02:00
Sumedh Alok Sharma
47184e82f5 Merge pull request #11313 from Ankita13-code/ankitapareek/exec-id-agent-fix
agent: update the processes hashmap to use exec_id as primary key
2025-07-18 14:07:15 +05:30
Fabiano Fidêncio
d9daddce28 Merge pull request #11578 from justxuewei/vsock-async
runtime-rs: Fix the issue of blocking socket with Tokio
2025-07-18 10:13:03 +02:00
Xuewei Niu
629c942d4b runtime-rs: Fix the issue of blocking socket with Tokio
According to the issue [1], Tokio panics when a blocking socket is
passed to its `from_std()` method. The panic message is as follows:

```
A panic occurred at crates/agent/src/sock/vsock.rs:59: Registering a
blocking socket with the tokio runtime is unsupported. If you wish to do
anyways, please add `--cfg tokio_allow_from_blocking_fd` to your RUSTFLAGS.
```

A workaround is to set the socket to non-blocking.

1: https://github.com/tokio-rs/tokio/issues/7172
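
The workaround can be sketched with the standard library alone. This is a
minimal illustration, not the code from this commit: it uses a
`UnixStream` pair as a stand-in for the vsock socket, and the helper names
`prepare_for_async` and `is_nonblocking` are hypothetical.

```rust
use std::io::{ErrorKind, Read};
use std::os::unix::net::UnixStream;

/// Hypothetical helper: switch a std socket to non-blocking mode so that it
/// can afterwards be handed to an async runtime.
fn prepare_for_async(stream: UnixStream) -> std::io::Result<UnixStream> {
    stream.set_nonblocking(true)?;
    Ok(stream)
}

/// Hypothetical helper: a read on an idle non-blocking socket fails with
/// `WouldBlock` instead of waiting for data.
fn is_nonblocking(stream: &mut UnixStream) -> bool {
    let mut buf = [0u8; 1];
    matches!(stream.read(&mut buf), Err(ref e) if e.kind() == ErrorKind::WouldBlock)
}

fn main() -> std::io::Result<()> {
    // Keep the peer alive; otherwise the read would report EOF instead.
    let (local, _peer) = UnixStream::pair()?;
    let mut local = prepare_for_async(local)?;
    // Only after this point would the stream be passed on to Tokio, e.g.
    // via `tokio::net::UnixStream::from_std(local)` (not called here, to
    // keep the sketch free of external crates).
    println!("non-blocking: {}", is_nonblocking(&mut local));
    Ok(())
}
```

The point of the sketch is the ordering: `set_nonblocking(true)` is called
on the std socket before any conversion into a Tokio type.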

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-18 10:55:48 +08:00
Xuewei Niu
1508e6f0f5 agent: Bump Tokio to v1.46.1
Tokio now has a newer version, let us bump it.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-18 10:55:48 +08:00
Xuewei Niu
5a4050660a runtime-rs: Bump Tokio to v1.46.1
Tokio now has a newer version, let us bump it.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-18 10:55:48 +08:00
Zvonko Kaiser
a786dc48b0 kernel: fix enable kernel debug
The KERNEL_DEBUG_ENABLED was missing in the outer shell script
so overrides via make were not possible.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-18 02:24:19 +00:00
Fabiano Fidêncio
eb2bfbf7ac Merge pull request #11572 from stevenhorsman/RUSTSEC-2024-0384-remediate
More crate bumps for security remediations
2025-07-17 22:35:05 +02:00
Zvonko Kaiser
cef9485634 Merge pull request #11450 from kata-containers/dependabot/cargo/src/agent/nix-0.27.1
build(deps): bump nix to 0.26.4 in agent, libs, runtime-rs
2025-07-17 14:22:40 -04:00
stevenhorsman
41a608e5ce tools: Bump borsh, liboci-cli and oci-spec
Bump these crates to remove the unmaintained dependency
proc-macro-error and remediate RUSTSEC-2024-0370

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 18:23:19 +01:00
stevenhorsman
e56f493191 deps: Bump zbus, serial_test & async-std
Bump these crates across various components to remove the
dependency on unmaintained instant crate and remediate
RUSTSEC-2024-0384

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 18:23:19 +01:00
stevenhorsman
bb820714cb agent-ctl: Update borsh
- Update borsh to remove the unmaintained dependency
proc-macro-error and remediate RUSTSEC-2024-0370

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 18:23:19 +01:00
Steve Horsman
549fd2a196 Merge pull request #11581 from stevenhorsman/osv-scanner-action-permissions-fix
workflow: Fix osv-scanner action
2025-07-17 18:18:16 +01:00
stevenhorsman
a7e27b9b68 workflow: Fix osv-scanner action
- The GitHub-generated template had an old version which
isn't valid for the pr-scan, so update to the latest
- The action also needs `actions: read` and `contents: read` to run in kata-containers

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 17:29:35 +01:00
Steve Horsman
8741f2ab3d Merge pull request #11580 from kata-containers/osv-scanner-action
workflow: Add osv-scanner action
2025-07-17 17:00:34 +01:00
stevenhorsman
1a75c12651 workflow: Add osv-scanner action
Add action to check for vulnerabilities in the project and
on each PR

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 16:41:56 +01:00
stevenhorsman
4c776167e5 trace-forwarder: Add nix features
Some of the nix apis we are using are now enabled by features,
so add these to resolve the compilation issues

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 15:09:21 +01:00
dependabot[bot]
cd79108c77 build(deps): bump nix in /src/tools/trace-forwarder
Bumps [nix](https://github.com/nix-rust/nix) from 0.23.1 to 0.30.1.
- [Changelog](https://github.com/nix-rust/nix/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nix-rust/nix/compare/v0.23.1...v0.30.1)

---
updated-dependencies:
- dependency-name: nix
  dependency-version: 0.30.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-17 15:09:06 +01:00
stevenhorsman
9185ef1a67 runtime-rs: Bump nix to matching version
runtime-rs needs the same version as libs,
so sync this up as well.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 15:08:46 +01:00
dependabot[bot]
219ad505c2 build(deps): bump nix from 0.24.3 to 0.26.4 in /src/agent
Nix needs to be in sync between libs and agent, so bump
the agent to the libs version

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 15:01:06 +01:00
dependabot[bot]
a4d22fe330 build(deps): bump nix from 0.24.2 to 0.26.4 in /src/libs
---
updated-dependencies:
- dependency-name: nix
  dependency-version: 0.26.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-17 15:01:06 +01:00
Fabiano Fidêncio
6dabb3683f Merge pull request #10961 from zvonkok/shellcheck-zero
shellcheck: fix kernel/build.sh
2025-07-17 12:59:00 +02:00
Steve Horsman
405f5283f0 Merge pull request #11573 from arvindskumar99/versions_comment
OVMF: Making comment in versions.yaml for SEV-SNP
2025-07-17 10:11:58 +01:00
Fabiano Fidêncio
32d40849fa Merge pull request #11577 from Xynnn007/bump-gc
deps(chore): bump guest-components to candidate v0.14.0
2025-07-17 11:08:36 +02:00
Zvonko Kaiser
ca4f96ed00 shellcheck: fix kernel/build.sh
Refactor code to make shellcheck happy

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-17 10:15:41 +02:00
Xynnn007
82b890349d deps(chore): bump guest-components to candidate v0.14.0
This new version of gc fixes s390x attestation and also introduces
registry configuration setting directly via initdata.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2025-07-17 10:19:02 +08:00
stevenhorsman
51f41b1669 ci: cache: Tag agent-ctl cache
The peer pods project is using the agent-ctl tool in some
tests, so tagging our cache will let them more easily identify
development versions of kata for testing between releases.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-16 11:32:33 +01:00
Fupan Li
75d23b8884 Merge pull request #11504 from lifupan/fix_fd_leak
agent: fix the issue of parent writer pipe fd leak
2025-07-16 18:29:24 +08:00
Fupan Li
83f54eec52 agent: fix the issue of parent writer pipe fd leak
Sometimes, containers or execs do not use stdin, so there is no chance
to add parent stdin to the process's writer hashmap, resulting in the
parent stdin's fd not being closed when the process is cleaned up later.

Therefore, when creating a process, first explicitly add parent stdin to
the writer hashmap, so that the parent stdin's fd can be closed
when the process is cleaned up later.
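
The idea can be sketched as follows. This is a simplified illustration
under stated assumptions, not the agent's actual code: the `Process` and
`StreamType` types here are hypothetical stand-ins, and a `UnixStream`
pair replaces the real stdin pipe.

```rust
use std::collections::HashMap;
use std::io::Read;
use std::os::unix::net::UnixStream;

/// Hypothetical key type for the writer map.
#[derive(Hash, PartialEq, Eq)]
enum StreamType {
    ParentStdin,
}

/// Hypothetical process record that owns its writer endpoints.
struct Process {
    writers: HashMap<StreamType, UnixStream>,
}

impl Process {
    /// Register the parent stdin at creation time, whether or not the
    /// container or exec ever uses stdin.
    fn new(parent_stdin: UnixStream) -> Self {
        let mut writers = HashMap::new();
        writers.insert(StreamType::ParentStdin, parent_stdin);
        Process { writers }
    }

    /// Dropping the map entries closes the underlying file descriptors.
    fn cleanup(&mut self) {
        self.writers.clear();
    }
}

fn main() -> std::io::Result<()> {
    let (parent_stdin, mut child_end) = UnixStream::pair()?;
    let mut process = Process::new(parent_stdin);
    process.cleanup();

    // With the writer side closed, the reader sees end-of-file.
    let mut buf = [0u8; 1];
    let n = child_end.read(&mut buf)?;
    println!("bytes read after cleanup: {n}");
    Ok(())
}
```

Because the map owns the endpoint from the start, cleanup does not depend
on stdin having been used at any point in the process lifetime.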

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-16 16:15:31 +08:00
Fupan Li
752c8b611e Merge pull request #11575 from Tim-Zhang/fix-runk-build
runk: Fix build errors
2025-07-16 15:23:58 +08:00
Arvind Kumar
2a52351822 OVMF: Making comment in versions.yaml for SEV-SNP
Adding comment to versions.yaml to indicate that the ovmf-sev is also
used by AMD SEV-SNP, as per the discussion in
https://github.com/kata-containers/kata-containers/pull/11561.

Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-16 06:35:21 +02:00
Tim Zhang
c8183a2c14 runk: rename imported crate from users to uzers
To adapt to the new crate name and fix build errors
introduced in the commit 39f51b4c6d

Fixes: #11574

Signed-off-by: Tim Zhang <tim@hyper.sh>
2025-07-16 11:35:39 +08:00
Fabiano Fidêncio
9cebbab29d Merge pull request #11335 from zvonkok/fix-kata-deploy.sh
gpu: Fix kata deploy.sh
2025-07-15 19:50:44 +02:00
Fabiano Fidêncio
c8b7a51d72 Merge pull request #11082 from zvonkok/debug-kernel
kernel: debug config
2025-07-15 19:04:15 +02:00
Zvonko Kaiser
c56c896fc6 qemu: remove the experimental suffix for qemu-snp
We switched to vanilla QEMU for the CPU SNP use-case.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-15 16:49:58 +02:00
Zvonko Kaiser
a282fa6865 gpu: Add TDX related runtime adjustments
We have the QEMU adjustments for SNP but are missing those for TDX

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-15 16:49:56 +02:00
Zvonko Kaiser
0d2993dcfd kernel: bump kernel version
Obligatory kernel version bump

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-15 16:48:23 +02:00
Zvonko Kaiser
a4597672c0 kernel: Add KERNEL_DEBUG_ENABLED to build scripts
We want to be able to build a debug version of the kernel for various
use-cases like debugging, tracing and others.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-15 16:48:03 +02:00
Fabiano Fidêncio
b7af7f344b Merge pull request #11569 from Xynnn007/bump-coco
deps(chore): update guest-components and trustee
2025-07-15 16:34:23 +02:00
Fabiano Fidêncio
aac555eeff Merge pull request #11571 from fidencio/topic/fix-nvidia-gpu-initrd-cache
build: Fix cache for nvidia-gpu-initrd builds
2025-07-15 16:28:03 +02:00
Fabiano Fidêncio
4415a47fff Merge pull request #11557 from Apokleos/fix-initdata
runtime-rs: Fix initdata length field missing when create block
2025-07-15 16:22:45 +02:00
Fabiano Fidêncio
11c744c5c3 Merge pull request #11567 from zvonkok/remove-gpu-admin-tools
Remove gpu admin tools
2025-07-15 15:11:56 +02:00
Fabiano Fidêncio
fa7598f6ec Merge pull request #11568 from zvonkok/tdx-qemu-path
gpu: Add proper TDX config path
2025-07-15 14:54:13 +02:00
Fabiano Fidêncio
3e86f3a95c build: Rename rootfs-nvidia-* to fix cache issues
The convention for rootfs-* names is:
* rootfs-${image_type}-${special_build}

If this is not followed, cache will never work as expected, leading to
building the initrd / image on every single build, which is especially
costly when building the nvidia specific targets.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-15 14:48:45 +02:00
alex.lyn
56c0c172fa runtime-rs: Fix initdata length field missing when create block
The init data could not be read properly within kata-agent because the
data length field was omitted, a consequence of a mismatch in the data
write format.

Fixes #11556

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-15 19:22:17 +08:00
Fabiano Fidêncio
b76efa2a25 Merge pull request #11564 from BbolroC/make-qemu-coco-dev-s390x-required
ci: Make qemu-coco-dev for s390x (zVSI) required again
2025-07-15 12:04:18 +02:00
Xynnn007
4da31bf2f9 agent: deliver initdata toml to attestation agent
AA now supports receiving the initdata toml in plaintext and delivering
it during attestation. This patch creates a file under
'/run/confidential-containers/initdata'
to store the initdata toml and give it to AA process.

When we have a separate component to handle initdata, we will move the
logic to that component.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2025-07-15 17:10:56 +08:00
Steve Horsman
d219fc20e1 Merge pull request #11555 from stevenhorsman/rust-advisory-fixes-pre-3.19.0
Rust advisory fixes pre 3.19.0
2025-07-15 09:11:33 +01:00
Hui Zhu
3577e4bb43 Merge pull request #11480 from teawater/update_ma
mem-agent: Update to https://github.com/teawater/mem-agent/tree/kata-20250627
2025-07-15 15:22:10 +08:00
Xynnn007
19001af1e2 deps(chore): update guest-components and trustee
to the version of pre v0.14.0

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2025-07-15 09:12:47 +08:00
teawater
028f25ac84 mem-agent: Update to kata-20250627
Update to https://github.com/teawater/mem-agent/tree/kata-20250627.

The commit list:
3854b3a Update nix version from 0.23.2 to 0.30.1
d9a4ced Update tokio version from 1.33 to 1.45.1
9115c4d run_eviction_single_config: Simplify check evicted pages after
	eviction
68b48d2 get_swappiness: Use a rounding method to obtain the swappiness
	value
14c4508 run_eviction_single_config: Add max_seq and min_seq check with
	each info
8a3a642 run_eviction_single_config: Move infov update to main loop
b6d30cf memcg.rs: run_aging_single_config: Fix error of last_inc_time
	check
54fce7e memcg.rs: Update anon eviction code
41c31bf cgroup.rs: Fix build issue with musl
0d6aa77 Remove lazy_static from dependencies
a66711d memcg.rs: update_and_add: Fix memcg not work after set memcg
	issue
cb932b1 Add logs and change some level of some logs
93c7ad8 Add per-cgroup and per-numa config support
092a75b Remove all Cargo.lock to support different versions of rust
540bf04 Update mem-agent-srv, mem-agent-ctl and mem-agent-lib to
	v0.2.0
81f39b2 compact.rs: Change default value of compact_sec_max to 300
c455d47 compact.rs: Fix psi_path error with cgroup v2 issue
6016e86 misc.rs: Fix log error
ded90e9 Set mem-agent-srv and mem-agent-ctl as bin

Fixes: #11478

Signed-off-by: teawater <zhuhui@kylinos.cn>
2025-07-15 08:57:41 +08:00
Zvonko Kaiser
90bc749a19 gpu: Add proper TDX config path
This was missed during the GPU TDX experimental enablement

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-14 23:26:28 +00:00
Zvonko Kaiser
da17b06d28 gpu: Pin toolkit version
New versions have incompatibilities, so pin the toolkit to a working
version

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-14 22:07:21 +00:00
Zvonko Kaiser
97a4a1574e gpu: Remove gpu-admin-tools
NVRC got a new feature reading the CC mode directly from register

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-14 21:59:31 +00:00
stevenhorsman
18597588c0 agent: Bump cdi version
Bump cdi version to pick up the fixes for:
- RUSTSEC-2025-0024
- RUSTSEC-2025-0023
- RUSTSEC-2024-0370

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-14 16:54:30 +01:00
stevenhorsman
661d88b11f versions: Bump oci-spec
Try bumping oci-spec to 0.8.1 as it included fixes for vulnerabilities
including RUSTSEC-2024-0370

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-14 16:54:30 +01:00
Fabiano Fidêncio
579d373623 Merge pull request #11521 from stevenhorsman/idna-1.0.4-bump
versions: Bump idna crate to >= 1.0.3
2025-07-14 17:39:30 +02:00
Fabiano Fidêncio
f5decea13e Merge pull request #11550 from stevenhorsman/runtime-rs-bump-chrono-0.4.41
runtime-rs | trace-forwarder: Bump chrono crate version
2025-07-14 16:45:58 +02:00
Steve Horsman
0fa2cd8202 Merge pull request #11519 from wainersm/tests_teardown_common
tests/k8s: instrument some tests for debugging
2025-07-14 13:20:01 +01:00
Hyounggyu Choi
a224b4f9e4 ci: Make qemu-coco-dev for s390x (zVSI) required again
As the following job has passed 10 days in a row for the nightly test:

```
kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (nydus, qemu-coco-dev, kubeadm)
```

this commit makes the job required again.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-07-14 11:03:54 +02:00
Wainer dos Santos Moschetta
f0f1974e14 tests/k8s: call teardown_common in k8s-parallel.bats
The teardown_common will print the description of the running pods, kill
them all and print the system's syslogs afterwards.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta
8dfeed77cd tests/k8s: add handler for Job in set_node()
Set the node in the spec template of a Job manifest, allowing
set_node() to be used in tests like k8s-parallel.bats

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta
806d63d1d8 tests/k8s: call teardown_common in k8s-credentials-secrets.bats
The teardown_common will print the description of the running pods, kill
them all and print the system's syslogs afterwards.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta
c8f40fe12c tests/k8s: call teardown_common in k8s-sandbox-vcpus-allocation.bats
The teardown_common will print the description of the running pods, kill
them all and print the system's syslogs afterwards.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-12 10:13:51 +01:00
Fabiano Fidêncio
4a79c2520d Merge pull request #11491 from Apokleos/default-blk-driver
runtime-rs: Change default block device driver from virtio-scsi to virtio-blk-*
2025-07-11 23:14:13 +02:00
alex.lyn
9cc14e4908 runtime-rs: Update block device driver docs within configuration
The previous description for the `block_device_driver` was inaccurate or
outdated. This commit updates the documentation to provide a more
precise explanation of its function.

Fixes #11488

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-11 17:40:58 +02:00
alex.lyn
92160c82ff runtime-rs: Change block device driver default to virtio-blk-*
When we run a kata pod with runtime-rs/qemu and the default
configuration toml, it fails with the error "unsupported driver type
virtio-scsi".
As virtio-scsi is not widely used within runtime-rs, set the default
block device driver to `virtio-blk-*`.

Fixes #11488

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-11 17:40:58 +02:00
Ankita Pareek
5f08cc75b3 agent: update the processes hashmap to use exec_id as primary key
This patch changes the container process HashMap to use exec_id as the primary
key instead of PID, preventing exec_id collisions that could be exploited in
Confidential Computing scenarios where the host is less trusted than the guest.

Key changes:
- Changed `processes: HashMap<pid_t, Process>` to `HashMap<String, Process>`
- Added exec_id collision detection in `start()` method
- Updated process lookup operations to use exec_id directly
- Simplified `get_process()` with direct HashMap access

This prevents multiple exec operations from reusing the same exec_id, which
could be problematic in CoCo use cases where process isolation and unique
identification are critical for security.
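
A minimal sketch of the keying and collision check described above. The
`Container` and `Process` types and the error string are simplified,
hypothetical stand-ins, not the agent's real definitions.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down process record.
#[derive(Debug)]
struct Process {
    exec_id: String,
    pid: i32,
}

/// Hypothetical container that tracks processes by exec_id, not PID.
#[derive(Default)]
struct Container {
    processes: HashMap<String, Process>,
}

impl Container {
    /// Refuse to start a process whose exec_id is already registered.
    fn start(&mut self, process: Process) -> Result<(), String> {
        if self.processes.contains_key(&process.exec_id) {
            return Err(format!("exec_id {} already exists", process.exec_id));
        }
        self.processes.insert(process.exec_id.clone(), process);
        Ok(())
    }

    /// Direct lookup by exec_id, with no scan over PIDs.
    fn get_process(&self, exec_id: &str) -> Option<&Process> {
        self.processes.get(exec_id)
    }
}

fn main() {
    let mut container = Container::default();
    let first = Process { exec_id: "exec-1".to_string(), pid: 100 };
    let reused = Process { exec_id: "exec-1".to_string(), pid: 101 };

    assert!(container.start(first).is_ok());
    // Reusing the exec_id is rejected even though the PID differs.
    assert!(container.start(reused).is_err());

    let pid = container.get_process("exec-1").map(|p| p.pid);
    println!("pid registered for exec-1: {:?}", pid);
}
```

Keying the map by `exec_id` makes the uniqueness check a single
`contains_key` call at start time, and lookups no longer need to go
through the PID.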

Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>
2025-07-11 10:10:23 +00:00
Steve Horsman
878e50f978 Merge pull request #11554 from fidencio/topic/fix-version-file-on-release
gh: Fix released VERSION file
2025-07-11 09:20:06 +01:00
Fabiano Fidêncio
fb22e873cd gh: Fix released VERSION file
The `/opt/kata/VERSION` file, which is created using `git describe
--tags`, requires the newly released tag to be updated in order to be
accurate.

To do so, let's add a `fetch-tags: true` to the checkout action used
during the `create-kata-tarball` job.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-11 09:47:11 +02:00
Alex Lyn
87e41e2a09 Merge pull request #11549 from stevenhorsman/bump-remove_dir_all
runtime-rs: Switch tempdir to tempfile
2025-07-11 13:46:12 +08:00
Alex Lyn
f22272b8f7 Merge pull request #11540 from Apokleos/coldplug-vfio-clh
runtime-rs: Add vfio support with coldplug for cloud-hypervisor
2025-07-11 10:33:59 +08:00
RuoqingHe
7cd4e3278a Merge pull request #11545 from RuoqingHe/remove-lockfile-for-libs
libs: Remove lockfile for libs
2025-07-10 21:56:10 +08:00
stevenhorsman
c740896b1c trace-forwarder: Bump chrono crate version
Bump chrono version to drop time@0.1.43 and remediate
vulnerability CVE-2020-26235

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-10 14:55:20 +01:00
stevenhorsman
3916507553 runtime-rs: Bump chrono crate version
Bump chrono version to drop time@0.1.45 and remediate
vulnerability CVE-2020-26235

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-10 13:47:05 +01:00
Wainer dos Santos Moschetta
3ab6a8462d ci/gatekeeper: make run-k8s-tests-coco-nontee job required
The CoCo non-TEE job (run-k8s-tests-coco-nontee) used to be required but
we had to withdraw it to fix a problem (#11156). Now the job is back
running and stable, so time to make it required again.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-10 12:19:19 +01:00
stevenhorsman
c5ceae887b runtime-rs: Switch tempdir to tempfile
tempdir hasn't been updated for seven years and pulls in
remove_dir_all@0.5.3 which has security advisory
GHSA-mc8h-8q98-g5hr, so replace this with using tempfile,
which the crate got merged into and we use elsewhere in the
project

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-10 12:16:35 +01:00
Ruoqing He
4039506740 libs: Ignore Cargo.lock in libs workspace
Ignore Cargo.lock in `libs` to prevent developers from accidentally
tracking lock files in the `libs` workspace.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-10 09:31:45 +00:00
alex.lyn
3fbe493edc runtime-rs: Convert host devices within VmConfig for cloud-hypervisor
This PR adds support for adding a network device before starting the
cloud-hypervisor VM.
This commit gets the host devices from NamedHypervisorConfig and
assigns them to VmConfig's devices, which holds the vfio devices when
clh starts launching.
With this, the conversion of vfio devices from a generic Hypervisor
config to a clh-specific VmConfig is complete.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-10 16:33:43 +08:00
alex.lyn
0b5b8f549d runtime-rs: Introduce a field host_devices within NamedHypervisorConfig
This commit introduces `host_devices` to help convert vfio devices from
a generic hypervisor config to a cloud-hypervisor specific VmConfig.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-10 16:33:41 +08:00
alex.lyn
d37183d754 runtime-rs: Add vfio support with coldplug for cloud-hypervisor
This PR adds support for adding a vfio device before starting the
cloud-hypervisor VM (or cold-plug vfio device).

This commit changes "pending_devices" for the clh implementation by adding
DeviceType::Vfio() to pending_devices. It then gets the shared host devices
after correctly handling the vfio devices (especially the primary device).

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-10 16:32:21 +08:00
Ruoqing He
ffa3a5a15e libs: Remove Cargo.lock
Crates in the `libs` workspace do not ship binaries; they are just libraries
for other workspaces to reference, so the `Cargo.lock` file would not
take effect. Remove Cargo.lock from the `libs` workspace.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-07-10 03:14:55 +00:00
Fabiano Fidêncio
c68eb58f3f Merge pull request #11529 from fidencio/topic/only-use-fixed-version-of-k0s-for-crio
tests: k0s: Always use latest version, apart from CRI-O tests
2025-07-09 18:47:18 +02:00
Hyounggyu Choi
09297b7955 Merge pull request #11537 from BbolroC/set-sharedfs-to-none-for-ibm-sel
runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file
2025-07-09 18:30:08 +02:00
Hyounggyu Choi
bca31d5a4d runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file
In line with configuration for other TEEs, shared_fs should
be set to none for IBM SEL. This commit updates the value for
runtime/runtime-rs.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-07-09 14:22:28 +02:00
Fabiano Fidêncio
5f17e61d11 tests: kata-deploy: Remove --wait from helm uninstall
As we're using a `kubectl wait --timeout ...` to check whether the
kata-deploy pod's been deleted or not, let's remove the `--wait` from
the `helm uninstall ...` call as k0s tests were failing because the
`kubectl wait --timeout...` was starting after the pod was deleted,
making the test fail.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-09 14:01:30 +02:00
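The ordering described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual test code: the release name `kata-deploy`, the namespace `kube-system`, the label selector `name=kata-deploy`, and the function name `uninstall_kata_deploy` are all assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical names: release "kata-deploy", namespace "kube-system",
# label selector "name=<release>". Adjust to the real deployment.

uninstall_kata_deploy() {
    local release="${1:-kata-deploy}"
    local namespace="${2:-kube-system}"
    local timeout="${3:-10m}"

    # No --wait here: helm returns once the deletion has been requested,
    # so the pod is expected to still exist when kubectl starts waiting.
    helm uninstall "${release}" --namespace "${namespace}"

    # The only place where the caller blocks on the pod going away.
    kubectl wait pod \
        --namespace "${namespace}" \
        --selector "name=${release}" \
        --for=delete \
        --timeout="${timeout}"
}
```

Keeping the blocking wait in a single command avoids the case the commit message describes, where `kubectl wait` only starts after the pod is already gone and then fails.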
Fabiano Fidêncio
842e17b756 tests: k0s: Always use latest version, apart from CRI-O tests
We've been pinning a specific version of k0s for CRI-O tests, which may
make sense for CRI-O, but doesn't make sense at all when it comes to
testing that we can install kata-deploy on latest k0s (and currently our
test for that is broken).

Let's bump to the latest, and from this point we start debugging,
instead of debugging on an ancient version of the project.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-09 13:27:18 +02:00
Steve Horsman
7bc25b0259 Merge pull request #11494 from katexochen/p/opa-1.6
versions: bump opa 1.5.1 -> 1.6.0
2025-07-09 11:45:54 +01:00
Steve Horsman
967f66f677 Merge pull request #11380 from arvindskumar99/sev-deprecation
Sev deprecation
2025-07-09 11:38:13 +01:00
stevenhorsman
f96b8fb690 kata-ctl: Update expected test failure message
Update expected error after url crate bump

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-09 11:34:27 +01:00
stevenhorsman
b7bf46fdfa versions: Bump idna crate to >= 1.0.4
Bump the url, reqwest and idna crates in order to move away from
idna <1.0.3 and remediate CVE-2024-12224.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-09 11:34:27 +01:00
Xuewei Niu
b8838140d0 Merge pull request #11527 from StevenFryto/fix-runtime-rootless-bugs
runtime: Fix rootlessDir not correctly set in rootless VMM mode
2025-07-09 16:40:11 +08:00
Steve Horsman
990c4e68ee Merge pull request #11523 from wainersm/ci_setup_kubectl
workflows: adopting azure/setup-kubectl
2025-07-09 09:09:38 +01:00
stevenfryto
3c7a670129 runtime: Fix rootlessDir not correctly set in rootless VMM mode
Previously, the rootlessDir variable in `src/runtime/virtcontainers/pkg/rootless.go` was initialized at
package load time using `os.Getenv("XDG_RUNTIME_DIR")`. However, in rootless
VMM mode, the correct value of $XDG_RUNTIME_DIR is set later during runtime
using os.Setenv(), so rootlessDir remained empty.

This patch defers the initialization of rootlessDir until the first call
to `GetRootlessDir()`, ensuring it always reflects the current environment
value of $XDG_RUNTIME_DIR.

Fixes: #11526

Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>
2025-07-09 09:51:48 +08:00
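The actual fix is in Go; the shell sketch below is only an analogy for the same pitfall, namely reading an environment variable at load time versus on first use. The variable and function names (`eager_rootless_dir`, `get_rootless_dir`) and the path `/run/user/1000` are made up for the illustration, and caching after the first call is an assumption about the intended behaviour.

```bash
#!/usr/bin/env bash
# Shell analogy only; the real change is in the Go runtime code.

# Simulate process start, before the runtime has set the variable.
unset XDG_RUNTIME_DIR

# Eager: evaluated once, when this file is loaded (the old behaviour).
eager_rootless_dir="${XDG_RUNTIME_DIR:-}"

# Lazy: resolved on the first call, then cached in a global.
rootless_dir=""
rootless_dir_resolved=0
get_rootless_dir() {
    if (( rootless_dir_resolved == 0 )); then
        rootless_dir="${XDG_RUNTIME_DIR:-}"
        rootless_dir_resolved=1
    fi
}

# Later, the runtime sets the variable (os.Setenv() in the Go code).
export XDG_RUNTIME_DIR="/run/user/1000"

get_rootless_dir
echo "eager: '${eager_rootless_dir}'"
echo "lazy:  '${rootless_dir}'"
```

The eager value stays empty because it was captured before the variable was set, while the lazy lookup sees the value that was set afterwards.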
Wainer dos Santos Moschetta
e4da3b84a3 workflows: adopting azure/setup-kubectl
There are workflows that rely on `az aks install-cli` to get kubectl
installed. There is a well-known problem with install-cli, related to
the API usage rate limit, that has recently caused the command to fail
quite often.

This replaces install-cli with the azure/setup-kubectl GitHub
action, which has no such rate limit problem.

While here, remove the install_cli() function from gha-run-k8s-common.sh
to keep developers from using it by mistake in the future.

Fixes #11463
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-07-08 15:15:54 -03:00
Alex Lyn
294b2c1c10 Merge pull request #11528 from Apokleos/remote-initdata
runtime-rs: add initdata annotation for remote hypervisor
2025-07-08 09:13:13 +08:00
Arvind Kumar
afedad0965 kernel: Removing SEV kernel packages
Removing kernel config files relating
to SEV as part of the SEV deprecation
efforts.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:21:11 -05:00
Arvind Kumar
ecac3d2d28 runtime: Removing runtime logic for SEV
Removing runtime SEV functionality,
such as the kbs, ovmf, VMSA handling,
and SEV configs as part of deprecating
SEV from kata.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:17:32 -05:00
Arvind Kumar
8eebcef8fb tests: Removing testing framework for SEV
Removing files pertaining to SEV from
the CI framework.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:17:32 -05:00
Arvind Kumar
675ea86aba kata-deploy: Removing SEV from kata-deploy
Removing files related to SEV, responsible for
installing and configuring Kata containers.

Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
Signed-off-by: Arvind Kumar <arvinkum@amd.com>
2025-07-07 11:17:32 -05:00
Paul Meyer
ff7ac58579 versions: bump opa 1.5.1 -> 1.6.0
Bumping opa to latest release.

Signed-off-by: Paul Meyer <katexochen0@gmail.com>
2025-07-07 14:19:08 +02:00
alex.lyn
fcaade24f4 runtime-rs: add initdata annotation for remote hypervisor
Add the initdata annotation when preparing the remote hypervisor annotations
during VM preparation, so that it can be passed in the CreateVMRequest.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-07 12:46:05 +01:00
Fabiano Fidêncio
110f68a0f1 Merge pull request #11530 from fidencio/topic/tests-fix-runtime-class-check
tests: runtimeclasses: Adjust gpu runtimeclasses
2025-07-07 13:42:46 +02:00
Fabiano Fidêncio
2c2995b7b0 tests: runtimeclasses: Adjust gpu runtimeclasses
679cc9d47c was merged and bumped the
podoverhead for the gpu related runtimeclasses. However, the bump in
`kata-runtimeClasses.yaml` was overlooked, making our tests fail due to
that discrepancy.

Let's just adjust the values here and move on.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-07 11:43:40 +02:00
Fabiano Fidêncio
ef545eed86 Merge pull request #11513 from lifupan/dragonball_6.12.x
tools: port the dragonball kernel patch to 6.12.x
2025-07-07 10:31:49 +02:00
Steve Horsman
d291e9bda0 Merge pull request #11336 from zvonkok/fix-podoverhead
gpu: Update runtimeClasses for correct podoverhead
2025-07-07 09:20:07 +01:00
Fabiano Fidêncio
a2faf93211 kernel: Bump to v6.12.36
As that's the latest released LTS.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-06 23:48:20 +02:00
Fupan Li
fd21c9df59 tools: port the dragonball kernel patch to 6.12.x
Backport the dragonball's kernel patches to
6.12.x kernel version.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-07-06 23:48:20 +02:00
Zvonko Kaiser
679cc9d47c gpu: Update runtimeClasses for correct podoverhead
We cannot rely only on default_cpu and default_memory in the
config. The defaults are 1 and 2Gi, but we need some overhead for QEMU and
the other related binaries running as the pod overhead. Especially
when QEMU is hot-plugging GPUs, CPUs, and memory, it can consume more
memory.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-04 12:20:15 -04:00
Steve Horsman
1c718dbcdd Merge pull request #11506 from stevenhorsman/remove-atty-dependency
Remove atty dependency
2025-07-04 10:46:28 +01:00
Alex Lyn
362ea54763 Merge pull request #11517 from zvonkok/fix-nvrc-build
gpu: NVRC static build
2025-07-04 13:51:03 +08:00
Alex Lyn
2e35a8067d Merge pull request #11482 from Apokleos/fix-force-guestpull
runtime-rs:  refactor and fix the implementation of guest-pull
2025-07-04 11:29:33 +08:00
stevenhorsman
6f23608e96 ci: Remove atty group
atty is unmaintained, with the last release almost 3 years
ago, so we don't need to check for updates, but instead will
remove it from our dependency tree.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-04 09:43:34 +08:00
stevenhorsman
7ffbdf7b3a mem-agent: Remove structopts crate
structopt's features were integrated into clap v3, so structopt is no
longer actively updated, and it pulls in the atty crate, which has a
security advisory. Update clap, remove structopt, and update the code
that used it to drop the outdated dependencies.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-04 09:43:34 +08:00
stevenhorsman
7845129bdc versions: Bump slog-term to 2.9.1
slog-term 2.9.0 included atty, which is unmaintained
and has a security advisory GHSA-g98v-hv3f-hcfr,
so bump the version across our components to remove
this dependency.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-04 09:43:34 +08:00
Zvonko Kaiser
c3b2d69452 gpu: NVRC static build
We had the proper config.toml configuration for static builds
but were building the glibc target and not the musl target.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-07-03 15:31:00 +00:00
alex.lyn
2b95facc6f kata-type: Relax Mandatory source Field Check in Guest-Pull Mode
Previously, the source field was subject to mandatory checks. However,
in guest-pull mode, this field doesn't consistently provide useful
information. Our practical experience has shown that relying on this
field for critical data isn't always necessary.

In addition, not all cases need a mandatory check of KataVirtualVolume.
Based on this, it is better for from_base64 to do only one thing, so the
validate() call is removed from it. The previous behavior is kept for callers
that need it, under the clearer name
from_base64_and_validate.

This commit relaxes the mandatory checks on the KataVirtualVolume specifically
for guest-pull mode, acknowledging its diminished utility in this context.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-03 17:07:20 +08:00
alex.lyn
8f8b196705 runtime-rs: refactor merging metadata within image_pull
refactor implementation for merging metadata.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-03 17:07:08 +08:00
alex.lyn
7a59d7f937 runtime-rs: Import the public const value from libs
Introduce the const value `KATA_VIRTUAL_VOLUME_PREFIX`, defined in libs/kata-types,
and import it from there instead of defining it locally.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-03 09:42:17 +08:00
280 changed files with 11765 additions and 12581 deletions

View File

@@ -19,8 +19,8 @@ self-hosted-runner:
- metrics
- ppc64le
- riscv-builder
- sev
- sev-snp
- s390x
- s390x-large
- tdx
- amd64-nvidia-a100

View File

@@ -36,9 +36,6 @@ updates:
# create groups for common dependencies, so they can all go in a single PR
# We can extend this as we see more frequent groups
groups:
atty:
patterns:
- atty
bit-vec:
patterns:
- bit-vec

View File

@@ -49,6 +49,8 @@ jobs:
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
@@ -89,6 +91,8 @@ jobs:
- name: Install dependencies
run: bash tests/stability/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
@@ -132,6 +136,8 @@ jobs:
- name: Install dependencies
run: bash tests/integration/nydus/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
@@ -209,6 +215,8 @@ jobs:
- name: Install dependencies
run: bash tests/functional/tracing/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
@@ -253,6 +261,8 @@ jobs:
- name: Install dependencies
run: bash tests/functional/vfio/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
@@ -294,6 +304,8 @@ jobs:
- name: Install dependencies
run: bash tests/integration/docker/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
@@ -339,6 +351,7 @@ jobs:
- name: Install dependencies
env:
GITHUB_API_TOKEN: ${{ github.token }}
GH_TOKEN: ${{ github.token }}
run: bash tests/integration/nerdctl/gha-run.sh install-dependencies
- name: get-kata-tarball
@@ -383,6 +396,8 @@ jobs:
- name: Install dependencies
run: bash tests/functional/kata-agent-apis/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

View File

@@ -48,7 +48,9 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
run: bash tests/integration/cri-containerd/gha-run.sh
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

View File

@@ -23,6 +23,8 @@ on:
secrets:
QUAY_DEPLOYER_PASSWORD:
required: false
KBUILD_SIGN_PIN:
required: true
permissions:
contents: read
@@ -95,6 +97,7 @@ jobs:
- name: Build ${{ matrix.asset }}
id: build
run: |
[[ "${KATA_ASSET}" == *"nvidia"* ]] && echo "KBUILD_SIGN_PIN=${{ secrets.KBUILD_SIGN_PIN }}" >> "${GITHUB_ENV}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
@@ -168,8 +171,8 @@ jobs:
- rootfs-image-mariner
- rootfs-initrd
- rootfs-initrd-confidential
- rootfs-nvidia-gpu-initrd
- rootfs-nvidia-gpu-confidential-initrd
- rootfs-initrd-nvidia-gpu
- rootfs-initrd-nvidia-gpu-confidential
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -201,6 +204,7 @@ jobs:
- name: Build ${{ matrix.asset }}
id: build
run: |
[[ "${KATA_ASSET}" == *"nvidia"* ]] && echo "KBUILD_SIGN_PIN=${{ secrets.KBUILD_SIGN_PIN }}" >> "${GITHUB_ENV}"
./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
@@ -327,6 +331,7 @@ jobs:
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
fetch-tags: true
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
@@ -342,6 +347,8 @@ jobs:
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
env:
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifacts
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:

View File

@@ -145,7 +145,7 @@ jobs:
asset:
- rootfs-image
- rootfs-initrd
- rootfs-nvidia-gpu-initrd
- rootfs-initrd-nvidia-gpu
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -297,6 +297,7 @@ jobs:
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
fetch-tags: true
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
@@ -312,6 +313,8 @@ jobs:
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
env:
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifacts
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:

View File

@@ -240,6 +240,7 @@ jobs:
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
fetch-tags: true
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
@@ -255,6 +256,8 @@ jobs:
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
env:
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifacts
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:

View File

@@ -326,6 +326,7 @@ jobs:
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
fetch-tags: true
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
@@ -341,6 +342,8 @@ jobs:
- name: merge-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml
env:
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifacts
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:

View File

@@ -31,3 +31,4 @@ jobs:
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

View File

@@ -27,6 +27,8 @@ jobs:
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
ITA_KEY: ${{ secrets.ITA_KEY }}
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
build-checks:
uses: ./.github/workflows/build-checks.yaml

View File

@@ -31,3 +31,5 @@ jobs:
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
ITA_KEY: ${{ secrets.ITA_KEY }}
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

View File

@@ -3,7 +3,6 @@ on:
pull_request_target:
branches:
- 'main'
- 'stable-*'
types:
# Adding 'labeled' to the list of activity types that trigger this event
# (default: opened, synchronize, reopened) so that we can run this
@@ -52,3 +51,5 @@ jobs:
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
ITA_KEY: ${{ secrets.ITA_KEY }}
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

View File

@@ -27,6 +27,8 @@ on:
required: true
QUAY_DEPLOYER_PASSWORD:
required: true
KBUILD_SIGN_PIN:
required: true
permissions:
contents: read
@@ -43,6 +45,8 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets:
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
publish-kata-deploy-payload-amd64:
needs: build-kata-static-tarball-amd64

View File

@@ -35,6 +35,10 @@ on:
required: true
QUAY_DEPLOYER_PASSWORD:
required: true
NGC_API_KEY:
required: true
KBUILD_SIGN_PIN:
required: true
permissions:
contents: read
@@ -52,6 +56,8 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
commit-hash: ${{ inputs.commit-hash }}
target-branch: ${{ inputs.target-branch }}
secrets:
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
publish-kata-deploy-payload-amd64:
needs: build-kata-static-tarball-amd64
@@ -323,6 +329,21 @@ jobs:
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
run-k8s-tests-on-nvidia-gpu:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-nvidia-gpu.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
secrets:
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
run-kata-coco-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs:
@@ -383,20 +404,6 @@ jobs:
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
run-metrics-tests:
# Skip metrics tests whilst runner is broken
if: false
# if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-metrics.yaml
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
run-basic-amd64-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64

View File

@@ -11,6 +11,7 @@ permissions:
jobs:
cleanup-resources:
runs-on: ubuntu-22.04
environment: ci
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:

.github/workflows/osv-scanner.yaml vendored Normal file
View File

@@ -0,0 +1,41 @@
# A sample workflow which sets up periodic OSV-Scanner scanning for vulnerabilities,
# in addition to a PR check which fails if new vulnerabilities are introduced.
#
# For more examples and options, including how to ignore specific vulnerabilities,
# see https://google.github.io/osv-scanner/github-action/
name: OSV-Scanner
on:
workflow_dispatch:
pull_request:
branches: [ "main" ]
schedule:
- cron: '0 1 * * 0'
push:
branches: [ "main" ]
jobs:
scan-scheduled:
permissions:
actions: read # Required to upload SARIF file to CodeQL
contents: read # Read commit contents
security-events: write # Require writing security events to upload SARIF file to security tab
if: ${{ github.event_name == 'push' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}
uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@b00f71e051ddddc6e46a193c31c8c0bf283bf9e6" # v2.1.0
with:
scan-args: |-
-r
./
scan-pr:
permissions:
actions: read # Required to upload SARIF file to CodeQL
contents: read # Read commit contents
security-events: write # Require writing security events to upload SARIF file to security tab
if: ${{ github.event_name == 'pull_request' }}
uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable-pr.yml@b00f71e051ddddc6e46a193c31c8c0bf283bf9e6" # v2.1.0
with:
# Example of specifying custom arguments
scan-args: |-
-r
./

View File

@@ -25,6 +25,7 @@ jobs:
target-branch: ${{ github.ref_name }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
build-assets-arm64:
permissions:

View File

@@ -8,6 +8,8 @@ on:
secrets:
QUAY_DEPLOYER_PASSWORD:
required: true
KBUILD_SIGN_PIN:
required: true
permissions:
contents: read
@@ -20,6 +22,7 @@ jobs:
stage: release
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
permissions:
contents: read
packages: write

View File

@@ -35,6 +35,7 @@ jobs:
target-arch: amd64
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}
build-and-push-assets-arm64:
needs: release

View File

@@ -59,6 +59,8 @@ jobs:
- name: Install dependencies
timeout-minutes: 15
run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball for ${{ inputs.arch }}
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

View File

@@ -71,6 +71,7 @@ jobs:
instance-type: normal
auto-generate-policy: yes
runs-on: ubuntu-22.04
environment: ci
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
@@ -106,7 +107,9 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1
with:
version: 'latest'
- name: Log into the Azure account
uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0
@@ -128,7 +131,9 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/integration/kubernetes/gha-run.sh install-kubectl
uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1
with:
version: 'latest'
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

View File

@@ -79,6 +79,8 @@ jobs:
- name: Deploy ${{ matrix.k8s }}
run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s
env:
CONTAINER_RUNTIME: ${{ matrix.container_runtime }}
- name: Configure the ${{ matrix.snapshotter }} snapshotter
if: matrix.snapshotter != ''

View File

@@ -0,0 +1,89 @@
name: CI | Run NVIDIA GPU kubernetes tests on amd64
on:
workflow_call:
inputs:
registry:
required: true
type: string
repo:
required: true
type: string
tag:
required: true
type: string
pr-number:
required: true
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
secrets:
NGC_API_KEY:
required: true
permissions: {}
jobs:
run-nvidia-gpu-tests-on-amd64:
strategy:
fail-fast: false
matrix:
vmm:
- qemu-nvidia-gpu
k8s:
- kubeadm
runs-on: amd64-nvidia-a100
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
DOCKER_TAG: ${{ inputs.tag }}
GH_PR_NUMBER: ${{ inputs.pr-number }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
KUBERNETES: ${{ matrix.k8s }}
USING_NFD: "false"
K8S_TEST_HOST_TYPE: all
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Deploy Kata
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh deploy-kata
- name: Install `bats`
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Run tests
timeout-minutes: 30
run: bash tests/integration/kubernetes/gha-run.sh run-nv-tests
env:
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
- name: Collect artifacts ${{ matrix.vmm }}
if: always()
run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts
continue-on-error: true
- name: Archive artifacts ${{ matrix.vmm }}
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: k8s-tests-${{ matrix.vmm }}-${{ matrix.k8s }}-${{ inputs.tag }}
path: /tmp/artifacts
retention-days: 1
- name: Delete kata-deploy
if: always()
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh cleanup

View File

@@ -52,6 +52,7 @@ jobs:
pull-type:
- guest-pull
runs-on: ubuntu-22.04
environment: ci
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
@@ -90,9 +91,6 @@ jobs:
- name: Install kata
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
- name: Log into the Azure account
uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0
with:
@@ -113,7 +111,9 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/integration/kubernetes/gha-run.sh install-kubectl
uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1
with:
version: 'latest'
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

View File

@@ -126,7 +126,6 @@ jobs:
timeout-minutes: 5
run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver
# AMD has deprecated SEV support on Kata and henceforth SNP will be the only feature supported for Kata Containers.
run-k8s-tests-sev-snp:
strategy:
fail-fast: false
@@ -224,6 +223,7 @@ jobs:
pull-type:
- guest-pull
runs-on: ubuntu-22.04
environment: ci
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
@@ -268,9 +268,6 @@ jobs:
- name: Install kata
run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts
- name: Download Azure CLI
run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli
- name: Log into the Azure account
uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0
with:
@@ -291,7 +288,9 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/integration/kubernetes/gha-run.sh install-kubectl
uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1
with:
version: 'latest'
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

View File

@@ -49,6 +49,7 @@ jobs:
- host_os: cbl-mariner
vmm: clh
runs-on: ubuntu-22.04
environment: ci
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
DOCKER_REPO: ${{ inputs.repo }}
@@ -71,9 +72,6 @@ jobs:
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Download Azure CLI
run: bash tests/functional/kata-deploy/gha-run.sh install-azure-cli
- name: Log into the Azure account
uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0
with:
@@ -94,7 +92,9 @@ jobs:
run: bash tests/functional/kata-deploy/gha-run.sh install-bats
- name: Install `kubectl`
run: bash tests/functional/kata-deploy/gha-run.sh install-kubectl
uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1
with:
version: 'latest'
- name: Download credentials for the Kubernetes CLI to use them
run: bash tests/functional/kata-deploy/gha-run.sh get-cluster-credentials

View File

@@ -54,6 +54,8 @@ jobs:
- name: Install dependencies
run: bash tests/functional/kata-monitor/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

View File

@@ -38,6 +38,8 @@ jobs:
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

View File

@@ -150,3 +150,36 @@ jobs:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
uses: ./.github/workflows/govulncheck.yaml
codegen:
runs-on: ubuntu-22.04
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
permissions:
contents: read # for checkout
steps:
- name: Checkout code
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
fetch-depth: 0
persist-credentials: false
- name: generate
run: make -C src/agent generate-protocols
- name: check for diff
run: |
diff=$(git diff)
if [[ -z "${diff}" ]]; then
echo "No diff detected."
exit 0
fi
cat << EOF >> "${GITHUB_STEP_SUMMARY}"
Run \`make -C src/agent generate-protocols\` to update protobuf bindings.
\`\`\`diff
${diff}
\`\`\`
EOF
echo "::error::Golang protobuf bindings need to be regenerated (see Github step summary for diff)."
exit 1

View File

@@ -42,7 +42,7 @@ generate-protocols:
# Some static checks rely on generated source files of components.
static-checks: static-checks-build
bash tests/static-checks.sh github.com/kata-containers/kata-containers
bash tests/static-checks.sh
docs-url-alive-check:
bash ci/docs-url-alive-check.sh

View File

@@ -105,7 +105,7 @@ Please raise an issue
[in this repository](https://github.com/kata-containers/kata-containers/issues).
> **Note:**
> If you are reporting a security issue, please follow the [vulnerability reporting process](https://github.com/kata-containers/community#vulnerability-handling)
> If you are reporting a security issue, please follow the [vulnerability reporting process](SECURITY.md)
## Developers

SECURITY.md Normal file
View File

@@ -0,0 +1,79 @@
# Security Policy
Kata Containers is a **rolling-release** project: every monthly release replaces the previous one, and only the _current_ release series receives security fixes. There are **no long-term-support branches**.
---
## Reporting a Vulnerability
### How to report
- **Keep it private first.**
Please **do not** open a public GitHub issue or pull request for security problems.
- **Use GitHub's built-in security advisory workflow.**
See GitHub's official guide:
[Creating a repository security advisory](https://docs.github.com/en/code-security/security-advisories/working-with-repository-security-advisories/creating-a-repository-security-advisory#creating-a-security-advisory)
### What happens after you submit
We follow the OpenSSF vulnerability-handling guidelines.
The table below shows the target timelines we hold ourselves to once we receive your report.
| Stage | Target time | Notes |
|-------|-------------|-------|
| **Initial acknowledgement** | ≤ 14 calendar days | Maintainers confirm receipt and start triage. |
| **Triage & CVSS-v3.1 scoring** | ≤ 30 days | We assign severity and plan remediation. |
| **Fix availability** | Next scheduled monthly release<br />(or an out-of-band patch for Critical/High issues) | We may cut a `vX.Y.Z` patch if waiting a month poses undue risk. |
---
## Supported Versions
| Release | First published | Security-fix window |
|---------|-----------------|---------------------|
| **Latest monthly release** | see `git tag --sort=-creatordate \| head -n 1` | Actively maintained |
| Any prior release | — | **Unsupported** please upgrade |
> **Why no backports?**
> Kata's architecture evolves quickly; back-porting patches would re-introduce the very maintenance burden we avoid by using a rolling model.
---
## Disclosure Process & Fix Delivery
1. We develop the fix on a private branch.
2. Once validated, we coordinate embargo dates with downstream consumers when appropriate.
3. The fix ships in **either**:
* Common: The next regular monthly release (e.g., `v3.19`) when impact is moderate and waiting does not materially increase risk, **or**
* Exception: A point release (e.g., `v3.18.1`) if the vulnerability affects only the current series.
4. After the fix is public, we request a CVE ID (if not already issued) and publish details.
---
## Security Advisories & Release Notes
* Each patch or monthly release includes a **Security Bulletin** section in its GitHub *Release Notes* summarizing:
* affected components & versions,
* CVE identifiers (if assigned),
* severity / CVSS score,
* mitigation steps,
* upgrade instructions.
* We do **not** publish separate “stable-branch” advisories because unsupported branches receive no fixes.
---
## Frequently Asked Questions
**Q: I run `v3.16`. Will you patch it?**
A: No. Upgrade to the latest monthly release.
**Q: Can I get early access to embargoed fixes?**
A: Only project members under the disclosure agreement (see [SECURITY_CONTACTS](SECURITY_CONTACTS)) receive advance patches.
**Q: Where can I discuss the vulnerability once it is public?**
A: Open/continue a GitHub issue **after** the advisory is published, or use `#kata-containers` on Slack with a link to the advisory.
---
*Last updated:* 2025-06-27

13
SECURITY_CONTACTS Normal file

@@ -0,0 +1,13 @@
# Copyright (c) 2025 Kata Containers Authors
#
# SPDX-License-Identifier: Apache-2.0
#
# Defined below are the security contacts for this repo.
#
# They are the contact point for the Product Security Committee to reach out
# to for triaging and handling of incoming issues.
#
# DO NOT REPORT SECURITY VULNERABILITIES DIRECTLY TO THESE NAMES, FOLLOW THE
# INSTRUCTIONS AT [SECURITY.md](SECURITY.md)
@kata-containers/architecture-committee


@@ -1 +1 @@
3.18.0
3.19.1

595
src/agent/Cargo.lock generated

File diff suppressed because it is too large


@@ -8,18 +8,19 @@ license = "Apache-2.0"
rust-version = "1.85.1"
[workspace.dependencies]
oci-spec = { version = "0.6.8", features = ["runtime"] }
oci-spec = { version = "0.8.1", features = ["runtime"] }
lazy_static = "1.3.0"
ttrpc = { version = "0.8.4", features = ["async"], default-features = false }
protobuf = "3.7.2"
libc = "0.2.94"
nix = "0.24.2"
# Notes: nix needs to stay in sync with libs
nix = "0.26.4"
capctl = "0.2.0"
scan_fmt = "0.2.6"
scopeguard = "1.0.0"
thiserror = "1.0.26"
regex = "1.10.5"
serial_test = "0.5.1"
serial_test = "0.10.0"
url = "2.5.0"
derivative = "2.2.0"
const_format = "0.2.30"
@@ -30,7 +31,7 @@ async-recursion = "0.3.2"
futures = "0.3.30"
# Async runtime
tokio = { version = "1.44.2", features = ["full"] }
tokio = { version = "1.46.1", features = ["full"] }
tokio-vsock = "0.3.4"
netlink-sys = { version = "0.7.0", features = ["tokio_socket"] }
@@ -49,7 +50,7 @@ slog-stdlog = "4.0.0"
log = "0.4.11"
cfg-if = "1.0.0"
prometheus = { version = "0.13.0", features = ["process"] }
prometheus = { version = "0.14.0", features = ["process"] }
procfs = "0.12.0"
anyhow = "1"
@@ -80,7 +81,7 @@ kata-agent-policy = { path = "policy" }
rustjail = { path = "rustjail" }
vsock-exporter = { path = "vsock-exporter" }
mem-agent = { path = "../mem-agent" }
mem-agent = { path = "../mem-agent", package = "mem-agent-lib" }
kata-sys-util = { path = "../libs/kata-sys-util" }
kata-types = { path = "../libs/kata-types" }
@@ -163,7 +164,7 @@ strum.workspace = true
strum_macros.workspace = true
# Agent Policy
cdi = { git = "https://github.com/cncf-tags/container-device-interface-rs", rev = "fba5677a8e7cc962fc6e495fcec98d7d765e332a" }
cdi = { git = "https://github.com/cncf-tags/container-device-interface-rs", rev = "3b1e83dda5efcc83c7a4f134466ec006b37109c9" }
# Local dependencies
kata-agent-policy = { workspace = true, optional = true }


@@ -217,4 +217,11 @@ codecov-html: check_tarpaulin
##TARGET generate-protocols: generate/update grpc agent protocols
generate-protocols:
image=$$(docker build -q \
--build-arg GO_VERSION=$$(yq '.languages.golang.version' $(CURDIR)/../../versions.yaml) \
--build-arg PROTOC_VERSION=$$(yq '.externals.protoc.version' $(CURDIR)/../../versions.yaml | grep -oE "[0-9.]+") \
--build-arg PROTOC_GEN_GO_VERSION=$$(yq '.externals.protoc-gen-go.version' $(CURDIR)/../../versions.yaml) \
--build-arg TTRPC_VERSION=$$(yq '.externals.ttrpc.version' $(CURDIR)/../../versions.yaml) \
$(CURDIR)/../../tools/packaging/static-build/codegen) && \
docker run --rm --workdir /kata/src/agent -v $(CURDIR)/../..:/kata --user $(shell id -u) $$image \
../libs/protocols/hack/update-generated-proto.sh all


@@ -32,6 +32,7 @@ use crate::cgroups::{DevicesCgroupInfo, Manager};
use crate::console;
use crate::log_child;
use crate::process::Process;
use crate::process::ProcessOperations;
#[cfg(feature = "seccomp")]
use crate::seccomp;
use crate::selinux;
@@ -261,7 +262,7 @@ pub struct LinuxContainer {
pub init_process_start_time: u64,
pub uid_map_path: String,
pub gid_map_path: String,
pub processes: HashMap<pid_t, Process>,
pub processes: HashMap<String, Process>,
pub status: ContainerStatus,
pub created: SystemTime,
pub logger: Logger,
@@ -933,17 +934,13 @@ impl BaseContainer for LinuxContainer {
}
fn processes(&self) -> Result<Vec<i32>> {
Ok(self.processes.keys().cloned().collect())
Ok(self.processes.values().map(|p| p.pid).collect())
}
fn get_process(&mut self, eid: &str) -> Result<&mut Process> {
for (_, v) in self.processes.iter_mut() {
if eid == v.exec_id.as_str() {
return Ok(v);
}
}
Err(anyhow!("invalid eid {}", eid))
self.processes
.get_mut(eid)
.ok_or_else(|| anyhow!("invalid eid {}", eid))
}
fn stats(&self) -> Result<StatsContainerResponse> {
@@ -967,6 +964,12 @@ impl BaseContainer for LinuxContainer {
async fn start(&mut self, mut p: Process) -> Result<()> {
let logger = self.logger.new(o!("eid" => p.exec_id.clone()));
// Check if exec_id is already in use to prevent collisions
if self.processes.contains_key(p.exec_id.as_str()) {
return Err(anyhow!("exec_id '{}' already exists", p.exec_id));
}
let tty = p.tty;
let fifo_file = format!("{}/{}", &self.root, EXEC_FIFO_FILENAME);
info!(logger, "enter container.start!");
@@ -1235,7 +1238,7 @@ impl BaseContainer for LinuxContainer {
let spec = self.config.spec.as_mut().unwrap();
update_namespaces(&self.logger, spec, p.pid)?;
}
self.processes.insert(p.pid, p);
self.processes.insert(p.exec_id.clone(), p);
info!(logger, "wait on child log handler");
let _ = log_handler
@@ -1261,13 +1264,13 @@ impl BaseContainer for LinuxContainer {
let spec = self.config.spec.as_ref().unwrap();
let st = self.oci_state()?;
for pid in self.processes.keys() {
match signal::kill(Pid::from_raw(*pid), Some(Signal::SIGKILL)) {
for process in self.processes.values() {
match signal::kill(process.pid(), Some(Signal::SIGKILL)) {
Err(Errno::ESRCH) => {
info!(
self.logger,
"kill encounters ESRCH, pid: {}, container: {}",
pid,
process.pid(),
self.id.clone()
);
continue;
@@ -2081,13 +2084,14 @@ mod tests {
});
}
#[test]
fn test_linuxcontainer_get_process() {
#[tokio::test]
async fn test_linuxcontainer_get_process() {
let _ = new_linux_container_and_then(|mut c: LinuxContainer| {
c.processes.insert(
1,
Process::new(&sl(), &oci::Process::default(), "123", true, 1, None).unwrap(),
);
let process =
Process::new(&sl(), &oci::Process::default(), "123", true, 1, None).unwrap();
let exec_id = process.exec_id.clone();
c.processes.insert(exec_id, process);
let p = c.get_process("123");
assert!(p.is_ok(), "Expecting Ok, Got {:?}", p);
Ok(())
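
The hunks above re-key `processes` from the pid to the exec ID, reject duplicate exec IDs in `start`, and fall back to scanning the values when only a pid is known. A minimal sketch of that pattern, assuming a trimmed-down `Process` and a hypothetical `Container` type with `String` errors (the agent itself uses `anyhow` and a much larger struct), might look like this:

```rust
use std::collections::HashMap;

// Hypothetical, trimmed-down stand-in for the agent's `Process` type.
#[derive(Debug)]
struct Process {
    exec_id: String,
    pid: i32,
}

// Hypothetical container holding only the process table.
#[derive(Default)]
struct Container {
    processes: HashMap<String, Process>,
}

impl Container {
    // Register a process, refusing to overwrite an existing exec ID.
    fn start(&mut self, p: Process) -> Result<(), String> {
        if self.processes.contains_key(p.exec_id.as_str()) {
            return Err(format!("exec_id '{}' already exists", p.exec_id));
        }
        self.processes.insert(p.exec_id.clone(), p);
        Ok(())
    }

    // Direct lookup by exec ID replaces the earlier linear scan.
    fn get_process(&mut self, eid: &str) -> Result<&mut Process, String> {
        self.processes
            .get_mut(eid)
            .ok_or_else(|| format!("invalid eid {}", eid))
    }

    // Pids are no longer the keys, so they are read from the values.
    fn pids(&self) -> Vec<i32> {
        self.processes.values().map(|p| p.pid).collect()
    }

    // Lookup by pid now needs a scan over the values.
    fn find_by_pid(&mut self, pid: i32) -> Option<&mut Process> {
        self.processes.values_mut().find(|p| p.pid == pid)
    }
}
```

The trade-off in this sketch is that exec-ID lookups are a single hash probe while pid lookups become linear in the number of processes, which is usually small per container.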


@@ -179,6 +179,11 @@ impl Process {
p.parent_stdin = Some(pstdin);
p.stdin = Some(stdin);
// Make sure the parent stdin writer is inserted into the
// p.writes hashmap, so that cleanup_process_stream can
// clean up and close the parent stdin fd.
let _ = p.get_writer(StreamType::ParentStdin);
// These pipes are necessary as the stdout/stderr of the child process
// cannot be a socket. Otherwise, some images relying on the /dev/stdout(stderr)
// and /proc/self/fd/1(2) will fail to boot as opening an existing socket
@@ -308,8 +313,8 @@ mod tests {
assert_eq!(max_size, actual_size);
}
#[test]
fn test_process() {
#[tokio::test]
async fn test_process() {
let id = "abc123rgb";
let init = true;
let process = Process::new(


@@ -323,31 +323,31 @@ impl FromStr for AgentConfig {
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_disable,
mac.memcg_config.disabled
mac.memcg_config.default.disabled
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_swap,
mac.memcg_config.swap
mac.memcg_config.default.swap
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_swappiness_max,
mac.memcg_config.swappiness_max
mac.memcg_config.default.swappiness_max
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_period_secs,
mac.memcg_config.period_secs
mac.memcg_config.default.period_secs
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_period_psi_percent_limit,
mac.memcg_config.period_psi_percent_limit
mac.memcg_config.default.period_psi_percent_limit
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_eviction_psi_percent_limit,
mac.memcg_config.eviction_psi_percent_limit
mac.memcg_config.default.eviction_psi_percent_limit
);
mem_agent_config_override!(
agent_config_builder.mem_agent_memcg_eviction_run_aging_count_min,
mac.memcg_config.eviction_run_aging_count_min
mac.memcg_config.default.eviction_run_aging_count_min
);
mem_agent_config_override!(
@@ -549,43 +549,43 @@ impl AgentConfig {
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_DISABLE,
mac.memcg_config.disabled,
mac.memcg_config.default.disabled,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_SWAP,
mac.memcg_config.swap,
mac.memcg_config.default.swap,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_SWAPPINESS_MAX,
mac.memcg_config.swappiness_max,
mac.memcg_config.default.swappiness_max,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_PERIOD_SECS,
mac.memcg_config.period_secs,
mac.memcg_config.default.period_secs,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_PERIOD_PSI_PERCENT_LIMIT,
mac.memcg_config.period_psi_percent_limit,
mac.memcg_config.default.period_psi_percent_limit,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_EVICTION_PSI_PERCENT_LIMIT,
mac.memcg_config.eviction_psi_percent_limit,
mac.memcg_config.default.eviction_psi_percent_limit,
get_number_value
);
parse_cmdline_param!(
param,
MEM_AGENT_MEMCG_EVICTION_RUN_AGING_COUNT_MIN,
mac.memcg_config.eviction_run_aging_count_min,
mac.memcg_config.default.eviction_run_aging_count_min,
get_number_value
);
parse_cmdline_param!(
@@ -1408,7 +1408,10 @@ mod tests {
contents: "agent.mem_agent_enable=1\nagent.mem_agent_memcg_period_secs=300",
mem_agent: Some(MemAgentConfig {
memcg_config: mem_agent::memcg::Config {
period_secs: 300,
default: mem_agent::memcg::SingleConfig {
period_secs: 300,
..Default::default()
},
..Default::default()
},
..Default::default()
@@ -1419,7 +1422,10 @@ mod tests {
contents: "agent.mem_agent_enable=1\nagent.mem_agent_memcg_period_secs=300\nagent.mem_agent_compact_order=6",
mem_agent: Some(MemAgentConfig {
memcg_config: mem_agent::memcg::Config {
period_secs: 300,
default: mem_agent::memcg::SingleConfig {
period_secs: 300,
..Default::default()
},
..Default::default()
},
compact_config: mem_agent::compact::Config {
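
The updated tests build the memcg configuration with the per-cgroup settings moved under a nested `default` field, filling everything else through struct update syntax. A small sketch of that shape, assuming hypothetical `SingleConfig` and `Config` structs with derived defaults (the real `mem_agent::memcg` types have more fields and their own default values), could be:

```rust
// Hypothetical stand-in for `mem_agent::memcg::SingleConfig`.
#[derive(Debug, Default, PartialEq)]
struct SingleConfig {
    disabled: bool,
    swap: bool,
    period_secs: u64,
}

// Hypothetical stand-in for `mem_agent::memcg::Config`.
#[derive(Debug, Default, PartialEq)]
struct Config {
    // Settings applied when nothing more specific is configured.
    default: SingleConfig,
    // Placeholder for whatever else the real struct holds (assumption).
    extra: Vec<String>,
}

// Override one nested field and keep the defaults for all the others.
fn config_with_period(period_secs: u64) -> Config {
    Config {
        default: SingleConfig {
            period_secs,
            ..Default::default()
        },
        ..Default::default()
    }
}
```

Both `..Default::default()` expressions are needed: the inner one fills the remaining `SingleConfig` fields and the outer one fills the remaining `Config` fields.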


@@ -27,6 +27,9 @@ const AA_CONFIG_KEY: &str = "aa.toml";
const CDH_CONFIG_KEY: &str = "cdh.toml";
const POLICY_KEY: &str = "policy.rego";
/// The path of initdata toml
pub const INITDATA_TOML_PATH: &str = concatcp!(INITDATA_PATH, "/initdata.toml");
/// The path of AA's config file
pub const AA_CONFIG_PATH: &str = concatcp!(INITDATA_PATH, "/aa.toml");
@@ -95,7 +98,7 @@ pub async fn read_initdata(device_path: &str) -> Result<Vec<u8>> {
}
pub struct InitdataReturnValue {
pub digest: Vec<u8>,
pub _digest: Vec<u8>,
pub _policy: Option<String>,
}
@@ -122,7 +125,11 @@ pub async fn initialize_initdata(logger: &Logger) -> Result<Option<InitdataRetur
info!(logger, "Initdata version: {}", initdata.version());
initdata.validate()?;
let digest = match initdata.algorithm() {
tokio::fs::write(INITDATA_TOML_PATH, &initdata_content)
.await
.context("write initdata toml failed")?;
let _digest = match initdata.algorithm() {
"sha256" => Sha256::digest(&initdata_content).to_vec(),
"sha384" => Sha384::digest(&initdata_content).to_vec(),
"sha512" => Sha512::digest(&initdata_content).to_vec(),
@@ -143,10 +150,10 @@ pub async fn initialize_initdata(logger: &Logger) -> Result<Option<InitdataRetur
info!(logger, "write CDH config from initdata");
}
debug!(logger, "Initdata digest: {}", STANDARD.encode(&digest));
debug!(logger, "Initdata digest: {}", STANDARD.encode(&_digest));
let res = InitdataReturnValue {
digest,
_digest,
_policy: initdata.get_coco_data(POLICY_KEY).cloned(),
};
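
With the initdata content written to `INITDATA_TOML_PATH`, the attestation-agent can be pointed at the file rather than handed a base64 digest. A rough sketch of how the argument list might be assembled, where both constant values and the helper name `build_aa_args` are placeholders chosen for illustration (only the flag names come from this diff), is:

```rust
// Placeholder value; the real attestation socket URI is defined in the agent.
const AA_ATTESTATION_URI: &str = "unix:///run/example/attestation-agent.sock";
// Placeholder value; the real path is `concatcp!(INITDATA_PATH, "/initdata.toml")`.
const INITDATA_TOML_PATH: &str = "/run/example/initdata.toml";

// Hypothetical helper: build the attestation-agent arguments.
fn build_aa_args(has_initdata: bool) -> Vec<&'static str> {
    let mut args = vec!["--attestation_sock", AA_ATTESTATION_URI];
    if has_initdata {
        // Pass the file path; no digest is computed or encoded by the caller.
        args.push("--initdata-toml");
        args.push(INITDATA_TOML_PATH);
    }
    args
}
```

Because the path is a `&'static str`, no temporary `String` has to outlive the argument vector, which is why the `initdata_parameter` binding can go away.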


@@ -19,7 +19,6 @@ extern crate scopeguard;
extern crate slog;
use anyhow::{anyhow, bail, Context, Result};
use base64::Engine;
use cfg_if::cfg_if;
use clap::Parser;
use const_format::concatcp;
@@ -485,12 +484,9 @@ async fn launch_guest_component_procs(
debug!(logger, "spawning attestation-agent process {}", AA_PATH);
let mut aa_args = vec!["--attestation_sock", AA_ATTESTATION_URI];
let initdata_parameter;
if let Some(initdata_return_value) = initdata_return_value {
initdata_parameter =
base64::engine::general_purpose::STANDARD.encode(&initdata_return_value.digest);
aa_args.push("--initdata");
aa_args.push(&initdata_parameter);
if initdata_return_value.is_some() {
aa_args.push("--initdata-toml");
aa_args.push(initdata::INITDATA_TOML_PATH);
}
launch_process(


@@ -554,7 +554,7 @@ impl AgentService {
req: protocols::agent::WaitProcessRequest,
) -> Result<protocols::agent::WaitProcessResponse> {
let cid = req.container_id;
let eid = req.exec_id;
let mut eid = req.exec_id;
let mut resp = WaitProcessResponse::new();
info!(
@@ -587,7 +587,7 @@ impl AgentService {
.get_container(&cid)
.ok_or_else(|| anyhow!("Invalid container id"))?;
let p = match ctr.processes.get_mut(&pid) {
let p = match ctr.processes.values_mut().find(|p| p.pid == pid) {
Some(p) => p,
None => {
// Lost race, pick up exit code from channel
@@ -600,6 +600,8 @@ impl AgentService {
}
};
eid = p.exec_id.clone();
// need to close all fd
// ignore errors for some fd might be closed by stream
p.cleanup_process_stream();
@@ -611,7 +613,7 @@ impl AgentService {
let _ = s.send(p.exit_code).await;
}
ctr.processes.remove(&pid);
ctr.processes.remove(&eid);
Ok(resp)
}
@@ -708,13 +710,15 @@ fn mem_agent_memcgconfig_to_memcg_optionconfig(
mc: &protocols::agent::MemAgentMemcgConfig,
) -> mem_agent::memcg::OptionConfig {
mem_agent::memcg::OptionConfig {
disabled: mc.disabled,
swap: mc.swap,
swappiness_max: mc.swappiness_max.map(|x| x as u8),
period_secs: mc.period_secs,
period_psi_percent_limit: mc.period_psi_percent_limit.map(|x| x as u8),
eviction_psi_percent_limit: mc.eviction_psi_percent_limit.map(|x| x as u8),
eviction_run_aging_count_min: mc.eviction_run_aging_count_min,
default: mem_agent::memcg::SingleOptionConfig {
disabled: mc.disabled,
swap: mc.swap,
swappiness_max: mc.swappiness_max.map(|x| x as u8),
period_secs: mc.period_secs,
period_psi_percent_limit: mc.period_psi_percent_limit.map(|x| x as u8),
eviction_psi_percent_limit: mc.eviction_psi_percent_limit.map(|x| x as u8),
eviction_run_aging_count_min: mc.eviction_run_aging_count_min,
},
..Default::default()
}
}
@@ -2621,11 +2625,6 @@ mod tests {
}),
..Default::default()
},
TestData {
has_fd: false,
result: Err(anyhow!(ERR_CANNOT_GET_WRITER)),
..Default::default()
},
];
for (i, d) in tests.iter().enumerate() {
@@ -2673,7 +2672,7 @@ mod tests {
}
linux_container
.processes
.insert(exec_process_id, exec_process);
.insert(exec_process.exec_id.clone(), exec_process);
sandbox.add_container(linux_container);
}


@@ -272,8 +272,10 @@ impl Sandbox {
pub fn find_process(&mut self, pid: pid_t) -> Option<&mut Process> {
for (_, c) in self.containers.iter_mut() {
if let Some(p) = c.processes.get_mut(&pid) {
return Some(p);
for p in c.processes.values_mut() {
if p.pid == pid {
return Some(p);
}
}
}
@@ -286,9 +288,11 @@ impl Sandbox {
.ok_or_else(|| anyhow!(ERR_INVALID_CONTAINER_ID))?;
if eid.is_empty() {
let init_pid = ctr.init_process_pid;
return ctr
.processes
.get_mut(&ctr.init_process_pid)
.values_mut()
.find(|p| p.pid == init_pid)
.ok_or_else(|| anyhow!("cannot find init process!"));
}
@@ -1014,23 +1018,26 @@ mod tests {
linux_container.init_process_pid = 1;
linux_container.id = cid.to_string();
// add init process
linux_container.processes.insert(
1,
Process::new(&logger, &oci::Process::default(), "1", true, 1, None).unwrap(),
);
let mut init_process =
Process::new(&logger, &oci::Process::default(), "1", true, 1, None).unwrap();
init_process.pid = 1;
linux_container
.processes
.insert("1".to_string(), init_process);
// add exec process
linux_container.processes.insert(
123,
Process::new(
&logger,
&oci::Process::default(),
"exec-123",
false,
1,
None,
)
.unwrap(),
);
let mut exec_process = Process::new(
&logger,
&oci::Process::default(),
"exec-123",
false,
1,
None,
)
.unwrap();
exec_process.pid = 123;
linux_container
.processes
.insert("exec-123".to_string(), exec_process);
s.add_container(linux_container);
@@ -1081,8 +1088,8 @@ mod tests {
.unwrap();
// processes internally only have pids when manually set
test_process.pid = test_pid;
linux_container.processes.insert(test_pid, test_process);
let test_exec_id = test_process.exec_id.clone();
linux_container.processes.insert(test_exec_id, test_process);
s.add_container(linux_container);

File diff suppressed because it is too large


@@ -87,7 +87,7 @@ linux-loader = {workspace = true}
log = "0.4.14"
nix = "0.24.2"
procfs = "0.12.0"
prometheus = { version = "0.13.0", features = ["process"] }
prometheus = { version = "0.14.0", features = ["process"] }
seccompiler = {workspace = true}
serde = "1.0.27"
serde_derive = "1.0.27"


@@ -146,7 +146,6 @@ mod tests {
assert!(MacAddr::from_bytes(&src3[..]).is_err());
}
#[cfg(feature = "with-serde")]
#[test]
fn test_mac_addr_serialization_and_deserialization() {
let mac: MacAddr =


@@ -313,8 +313,8 @@ mod tests {
pub struct TestContext {
pub cid: u64,
pub mem: GuestMemoryMmap,
pub mem_size: usize,
pub epoll_manager: EpollManager,
pub _mem_size: usize,
pub _epoll_manager: EpollManager,
pub device: Vsock<Arc<GuestMemoryMmap>, TestMuxer>,
}
@@ -327,8 +327,8 @@ mod tests {
Self {
cid: CID,
mem,
mem_size: MEM_SIZE,
epoll_manager: epoll_manager.clone(),
_mem_size: MEM_SIZE,
_epoll_manager: epoll_manager.clone(),
device: Vsock::new_with_muxer(
CID,
Arc::new(defs::QUEUE_SIZES.to_vec()),
@@ -394,7 +394,7 @@ mod tests {
EventHandlerContext {
guest_rxvq,
guest_txvq,
guest_evvq,
_guest_evvq: guest_evvq,
queues,
epoll_handler: None,
device: Vsock::new_with_muxer(
@@ -422,7 +422,7 @@ mod tests {
pub queues: Vec<VirtioQueueConfig<QueueSync>>,
pub guest_rxvq: GuestQ<'a>,
pub guest_txvq: GuestQ<'a>,
pub guest_evvq: GuestQ<'a>,
pub _guest_evvq: GuestQ<'a>,
pub mem: Arc<GuestMemoryMmap>,
}


@@ -17,7 +17,6 @@ use tracing::instrument;
use crate::error::{Result, StartMicroVmError, StopMicrovmError};
use crate::event_manager::EventManager;
use crate::tracer::{DragonballTracer, TraceError, TraceInfo};
use crate::vcpu::VcpuManagerError;
use crate::vm::{CpuTopology, KernelConfigInfo, VmConfigInfo};
use crate::vmm::Vmm;
@@ -55,6 +54,8 @@ pub use crate::device_manager::virtio_net_dev_mgr::{
};
#[cfg(feature = "virtio-vsock")]
pub use crate::device_manager::vsock_dev_mgr::{VsockDeviceConfigInfo, VsockDeviceError};
#[cfg(feature = "host-device")]
use crate::vcpu::VcpuManagerError;
#[cfg(feature = "hotplug")]
pub use crate::vcpu::{VcpuResizeError, VcpuResizeInfo};


@@ -879,7 +879,7 @@ impl DeviceManager {
/// Start all registered devices when booting the associated virtual machine.
pub fn start_devices(
&mut self,
vm_as: &GuestAddressSpaceImpl,
#[allow(unused)] vm_as: &GuestAddressSpaceImpl,
) -> std::result::Result<(), StartMicroVmError> {
// It is safe because we don't expect poison lock.
#[cfg(feature = "host-device")]
@@ -899,6 +899,7 @@ impl DeviceManager {
address_space: Option<&AddressSpace>,
) -> Result<()> {
// create context for removing devices
#[allow(unused)]
let mut ctx = DeviceOpContext::new(
Some(epoll_mgr),
self,
@@ -1275,7 +1276,9 @@ mod tests {
use dbs_address_space::{AddressSpaceLayout, AddressSpaceRegion, AddressSpaceRegionType};
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use vm_memory::{GuestAddress, GuestUsize, MmapRegion};
#[cfg(feature = "virtio-fs")]
use vm_memory::MmapRegion;
use vm_memory::{GuestAddress, GuestUsize};
use super::*;
#[cfg(target_arch = "x86_64")]

1
src/libs/.gitignore vendored Normal file

@@ -0,0 +1 @@
Cargo.lock

2826
src/libs/Cargo.lock generated

File diff suppressed because it is too large


@@ -18,7 +18,7 @@ common-path = "=1.0.0"
fail = "0.5.0"
lazy_static = "1.4.0"
libc = "0.2.100"
nix = "0.24.2"
nix = "0.26.4"
once_cell = "1.9.0"
serde = { version = "1.0.138", features = ["derive"] }
serde_json = "1.0.73"
@@ -32,7 +32,7 @@ pci-ids = "0.2.5"
mockall = "0.13.1"
kata-types = { path = "../kata-types" }
oci-spec = { version = "0.6.8", features = ["runtime"] }
oci-spec = { version = "0.8.1", features = ["runtime"] }
runtime-spec = { path = "../runtime-spec" }
safe-path = { path = "../safe-path" }


@@ -31,13 +31,13 @@ sha2 = "0.10.8"
flate2 = { version = "1.0", features = ["zlib"] }
hex = "0.4"
oci-spec = { version = "0.6.8", features = ["runtime"] }
oci-spec = { version = "0.8.1", features = ["runtime"] }
safe-path = { path = "../safe-path" }
[dev-dependencies]
tempfile = "3.19.1"
test-utils = { path = "../test-utils" }
nix = "0.24.2"
nix = "0.26.4"
[features]
default = []


@@ -273,8 +273,7 @@ pub const KATA_ANNO_CFG_HYPERVISOR_VIRTIO_FS_EXTRA_ARGS: &str =
/// A sandbox annotation to specify as the msize for 9p shares.
pub const KATA_ANNO_CFG_HYPERVISOR_MSIZE_9P: &str = "io.katacontainers.config.hypervisor.msize_9p";
/// The initdata annotation passed in when the CVM launches
pub const KATA_ANNO_CFG_HYPERVISOR_INIT_DATA: &str =
"io.katacontainers.config.hypervisor.cc_init_data";
pub const KATA_ANNO_CFG_RUNTIME_INIT_DATA: &str = "io.katacontainers.config.runtime.cc_init_data";
/// GPU specific annotations for remote hypervisor to help with instance selection
/// It's for minimum number of GPUs required for the VM.
@@ -635,13 +634,13 @@ impl Annotation {
KATA_ANNO_CFG_HYPERVISOR_CPU_FEATURES => {
hv.cpu_info.cpu_features = value.to_string();
}
KATA_ANNO_CFG_HYPERVISOR_DEFAULT_VCPUS => match self.get_value::<i32>(key) {
KATA_ANNO_CFG_HYPERVISOR_DEFAULT_VCPUS => match self.get_value::<f32>(key) {
Ok(num_cpus) => {
let num_cpus = num_cpus.unwrap_or_default();
if num_cpus
> get_hypervisor_plugin(hypervisor_name)
.unwrap()
.get_max_cpus() as i32
.get_max_cpus() as f32
{
return Err(io::Error::new(
io::ErrorKind::InvalidData,
@@ -895,7 +894,7 @@ impl Annotation {
hv.security_info.validate_path(value)?;
hv.security_info.guest_hook_path = value.to_string();
}
KATA_ANNO_CFG_HYPERVISOR_INIT_DATA => {
KATA_ANNO_CFG_RUNTIME_INIT_DATA => {
hv.security_info.initdata =
add_hypervisor_initdata_overrides(value).unwrap();
}
@@ -1079,6 +1078,9 @@ impl Annotation {
}
}
}
config.adjust_config()?;
Ok(())
}
}


@@ -37,6 +37,9 @@ pub const DEFAULT_INTERNETWORKING_MODEL: &str = "tcfilter";
pub const DEFAULT_BLOCK_DEVICE_TYPE: &str = "virtio-blk-pci";
pub const DEFAULT_VHOST_USER_STORE_PATH: &str = "/var/run/vhost-user";
pub const DEFAULT_BLOCK_NVDIMM_MEM_OFFSET: u64 = 0;
pub const DEFAULT_BLOCK_DEVICE_AIO_THREADS: &str = "threads";
pub const DEFAULT_BLOCK_DEVICE_AIO_NATIVE: &str = "native";
pub const DEFAULT_BLOCK_DEVICE_AIO: &str = "io_uring";
pub const DEFAULT_SHARED_FS_TYPE: &str = "virtio-fs";
pub const DEFAULT_VIRTIO_FS_CACHE_MODE: &str = "never";


@@ -369,7 +369,7 @@ mod drop_in_directory_handling {
config.hypervisor["qemu"].path,
"/usr/bin/qemu-kvm".to_string()
);
assert_eq!(config.hypervisor["qemu"].cpu_info.default_vcpus, 2);
assert_eq!(config.hypervisor["qemu"].cpu_info.default_vcpus, 2.0);
assert_eq!(config.hypervisor["qemu"].device_info.default_bridges, 4);
assert_eq!(
config.hypervisor["qemu"].shared_fs.shared_fs.as_deref(),


@@ -109,7 +109,7 @@ impl ConfigPlugin for CloudHypervisorConfig {
return Err(eother!("Both guest boot image and initrd for CH are empty"));
}
if (ch.cpu_info.default_vcpus > 0
if (ch.cpu_info.default_vcpus > 0.0
&& ch.cpu_info.default_vcpus as u32 > default::MAX_CH_VCPUS)
|| ch.cpu_info.default_maxvcpus > default::MAX_CH_VCPUS
{


@@ -66,7 +66,7 @@ impl ConfigPlugin for DragonballConfig {
}
if db.cpu_info.default_vcpus as u32 > db.cpu_info.default_maxvcpus {
db.cpu_info.default_vcpus = db.cpu_info.default_maxvcpus as i32;
db.cpu_info.default_vcpus = db.cpu_info.default_maxvcpus as f32;
}
if db.machine_info.entropy_source.is_empty() {
@@ -135,7 +135,7 @@ impl ConfigPlugin for DragonballConfig {
));
}
if (db.cpu_info.default_vcpus > 0
if (db.cpu_info.default_vcpus > 0.0
&& db.cpu_info.default_vcpus as u32 > default::MAX_DRAGONBALL_VCPUS)
|| db.cpu_info.default_maxvcpus > default::MAX_DRAGONBALL_VCPUS
{


@@ -93,7 +93,7 @@ impl ConfigPlugin for FirecrackerConfig {
));
}
if (firecracker.cpu_info.default_vcpus > 0
if (firecracker.cpu_info.default_vcpus > 0.0
&& firecracker.cpu_info.default_vcpus as u32 > default::MAX_FIRECRACKER_VCPUS)
|| firecracker.cpu_info.default_maxvcpus > default::MAX_FIRECRACKER_VCPUS
{


@@ -107,6 +107,21 @@ pub struct BlockDeviceInfo {
#[serde(default)]
pub block_device_driver: String,
/// Block device AIO is the I/O mechanism specially for Qemu
/// Options:
///
/// - threads
/// Pthread based disk I/O.
///
/// - native
/// Native Linux I/O.
///
/// - io_uring
/// Linux io_uring API. This provides the fastest I/O operations on Linux, requires kernel > 5.1 and
/// qemu >= 5.0.
#[serde(default)]
pub block_device_aio: String,
/// Specifies whether cache-related options will be set on block devices.
#[serde(default)]
pub block_device_cache_set: bool,
@@ -168,6 +183,21 @@ impl BlockDeviceInfo {
if self.block_device_driver.is_empty() {
self.block_device_driver = default::DEFAULT_BLOCK_DEVICE_TYPE.to_string();
}
if self.block_device_aio.is_empty() {
self.block_device_aio = default::DEFAULT_BLOCK_DEVICE_AIO.to_string();
} else {
const VALID_BLOCK_DEVICE_AIO: &[&str] = &[
default::DEFAULT_BLOCK_DEVICE_AIO,
default::DEFAULT_BLOCK_DEVICE_AIO_NATIVE,
default::DEFAULT_BLOCK_DEVICE_AIO_THREADS,
];
if !VALID_BLOCK_DEVICE_AIO.contains(&self.block_device_aio.as_str()) {
return Err(eother!(
"{} is unsupported block device AIO mode.",
self.block_device_aio
));
}
}
if self.memory_offset == 0 {
self.memory_offset = default::DEFAULT_BLOCK_NVDIMM_MEM_OFFSET;
}
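
The hunk above defaults an empty `block_device_aio` to `io_uring` and rejects anything outside the three known modes. A self-contained sketch of that rule, assuming a free function named `adjust_block_device_aio` and `String` errors (the real code is a method on `BlockDeviceInfo` and uses `eother!`), might be:

```rust
// Values mirror the constants added to the defaults module in this diff.
const AIO_THREADS: &str = "threads";
const AIO_NATIVE: &str = "native";
const AIO_IO_URING: &str = "io_uring";

// Hypothetical free-function version of the adjustment step.
fn adjust_block_device_aio(aio: &mut String) -> Result<(), String> {
    if aio.is_empty() {
        // Unset means "use the default mode".
        *aio = AIO_IO_URING.to_string();
        return Ok(());
    }
    const VALID: &[&str] = &[AIO_IO_URING, AIO_NATIVE, AIO_THREADS];
    if !VALID.contains(&aio.as_str()) {
        return Err(format!("{} is unsupported block device AIO mode.", aio));
    }
    Ok(())
}
```

Note that the comparison in this sketch is exact and case-sensitive, so a value such as `IO_URING` would be rejected rather than normalised.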
@@ -318,7 +348,7 @@ pub struct CpuInfo {
/// > 0 <= number of physical cores --> will be set to the specified number
/// > number of physical cores --> will be set to the actual number of physical cores
#[serde(default)]
pub default_vcpus: i32,
pub default_vcpus: f32,
/// Default maximum number of vCPUs per SB/VM:
/// - unspecified or == 0 --> will be set to the actual number of physical cores or
@@ -350,22 +380,22 @@ impl CpuInfo {
let features: Vec<&str> = self.cpu_features.split(',').map(|v| v.trim()).collect();
self.cpu_features = features.join(",");
let cpus = num_cpus::get() as u32;
let cpus = num_cpus::get() as f32;
// adjust default_maxvcpus
if self.default_maxvcpus == 0 || self.default_maxvcpus > cpus {
self.default_maxvcpus = cpus;
if self.default_maxvcpus == 0 || self.default_maxvcpus as f32 > cpus {
self.default_maxvcpus = cpus as u32;
}
// adjust default_vcpus
if self.default_vcpus < 0 || self.default_vcpus as u32 > cpus {
self.default_vcpus = cpus as i32;
} else if self.default_vcpus == 0 {
self.default_vcpus = default::DEFAULT_GUEST_VCPUS as i32;
if self.default_vcpus < 0.0 || self.default_vcpus > cpus {
self.default_vcpus = cpus;
} else if self.default_vcpus == 0.0 {
self.default_vcpus = default::DEFAULT_GUEST_VCPUS as f32;
}
if self.default_vcpus > self.default_maxvcpus as i32 {
self.default_vcpus = self.default_maxvcpus as i32;
if self.default_vcpus > self.default_maxvcpus as f32 {
self.default_vcpus = self.default_maxvcpus as f32;
}
Ok(())
@@ -373,7 +403,7 @@ impl CpuInfo {
/// Validate the configuration information.
pub fn validate(&self) -> Result<()> {
if self.default_vcpus > self.default_maxvcpus as i32 {
if self.default_vcpus > self.default_maxvcpus as f32 {
return Err(eother!(
"The default_vcpus({}) is greater than default_maxvcpus({})",
self.default_vcpus,
@@ -1383,8 +1413,8 @@ mod tests {
#[test]
fn test_cpu_info_adjust_config() {
// get CPU cores of the test node
let node_cpus = num_cpus::get() as u32;
let default_vcpus = default::DEFAULT_GUEST_VCPUS as i32;
let node_cpus = num_cpus::get() as f32;
let default_vcpus = default::DEFAULT_GUEST_VCPUS as f32;
struct TestData<'a> {
desc: &'a str,
@@ -1397,38 +1427,38 @@ mod tests {
desc: "all with default values",
input: &mut CpuInfo {
cpu_features: "".to_string(),
default_vcpus: 0,
default_vcpus: 0.0,
default_maxvcpus: 0,
},
output: CpuInfo {
cpu_features: "".to_string(),
default_vcpus,
default_maxvcpus: node_cpus,
default_maxvcpus: node_cpus as u32,
},
},
TestData {
desc: "all with big values",
input: &mut CpuInfo {
cpu_features: "a,b,c".to_string(),
default_vcpus: 9999999,
default_vcpus: 9999999.0,
default_maxvcpus: 9999999,
},
output: CpuInfo {
cpu_features: "a,b,c".to_string(),
default_vcpus: node_cpus as i32,
default_maxvcpus: node_cpus,
default_vcpus: node_cpus,
default_maxvcpus: node_cpus as u32,
},
},
TestData {
desc: "default_vcpus larger than default_maxvcpus",
input: &mut CpuInfo {
cpu_features: "a, b ,c".to_string(),
default_vcpus: -1,
default_vcpus: -1.0,
default_maxvcpus: 1,
},
output: CpuInfo {
cpu_features: "a,b,c".to_string(),
default_vcpus: 1,
default_vcpus: 1.0,
default_maxvcpus: 1,
},
},
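
Switching `default_vcpus` to `f32` keeps the clamping order unchanged: first cap `default_maxvcpus` at the host CPU count, then normalise `default_vcpus`, then cap it at `default_maxvcpus`. A standalone sketch of that sequence, assuming the host CPU count is passed in explicitly and using a placeholder value for `DEFAULT_GUEST_VCPUS` (its real value is not shown in this diff), could read:

```rust
// Placeholder for `default::DEFAULT_GUEST_VCPUS`; the value here is an assumption.
const DEFAULT_GUEST_VCPUS: u32 = 1;

// Hypothetical helper returning the adjusted `(default_vcpus, default_maxvcpus)`.
fn adjust_vcpus(mut default_vcpus: f32, mut default_maxvcpus: u32, host_cpus: u32) -> (f32, u32) {
    let cpus = host_cpus as f32;

    // Unset or oversized maximum falls back to the host CPU count.
    if default_maxvcpus == 0 || default_maxvcpus as f32 > cpus {
        default_maxvcpus = cpus as u32;
    }

    // Negative or oversized requests use all host CPUs; zero uses the default.
    if default_vcpus < 0.0 || default_vcpus > cpus {
        default_vcpus = cpus;
    } else if default_vcpus == 0.0 {
        default_vcpus = DEFAULT_GUEST_VCPUS as f32;
    }

    // The default can never exceed the maximum.
    if default_vcpus > default_maxvcpus as f32 {
        default_vcpus = default_maxvcpus as f32;
    }

    (default_vcpus, default_maxvcpus)
}
```

In this sketch a fractional request such as `0.5` passes through unchanged, which is the case the integer type could not represent.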


@@ -128,7 +128,7 @@ impl ConfigPlugin for QemuConfig {
}
}
if (qemu.cpu_info.default_vcpus > 0
if (qemu.cpu_info.default_vcpus > 0.0
&& qemu.cpu_info.default_vcpus as u32 > default::MAX_QEMU_VCPUS)
|| qemu.cpu_info.default_maxvcpus > default::MAX_QEMU_VCPUS
{


@@ -131,9 +131,7 @@ impl TomlConfig {
pub fn load_from_file<P: AsRef<Path>>(config_file: P) -> Result<(TomlConfig, PathBuf)> {
let mut result = Self::load_raw_from_file(config_file);
if let Ok((ref mut config, _)) = result {
Hypervisor::adjust_config(config)?;
Runtime::adjust_config(config)?;
Agent::adjust_config(config)?;
config.adjust_config()?;
info!(sl!(), "get kata config: {:?}", config);
}
@@ -175,13 +173,20 @@ impl TomlConfig {
/// drop-in config file fragments in config.d/.
pub fn load(content: &str) -> Result<TomlConfig> {
let mut config: TomlConfig = toml::from_str(content)?;
Hypervisor::adjust_config(&mut config)?;
Runtime::adjust_config(&mut config)?;
Agent::adjust_config(&mut config)?;
config.adjust_config()?;
info!(sl!(), "get kata config: {:?}", config);
Ok(config)
}
/// Adjust Kata configuration information.
pub fn adjust_config(&mut self) -> Result<()> {
Hypervisor::adjust_config(self)?;
Runtime::adjust_config(self)?;
Agent::adjust_config(self)?;
Ok(())
}
/// Validate Kata configuration information.
pub fn validate(&self) -> Result<()> {
Hypervisor::validate(self)?;


@@ -47,6 +47,9 @@ pub const SANDBOX_BIND_MOUNTS_RO: &str = ":ro";
/// SANDBOX_BIND_MOUNTS_RW is for sandbox bindmounts with readwrite
pub const SANDBOX_BIND_MOUNTS_RW: &str = ":rw";
/// KATA_VIRTUAL_VOLUME_PREFIX is for container image guest pull
pub const KATA_VIRTUAL_VOLUME_PREFIX: &str = "io.katacontainers.volume=";
/// Directly assign a block volume to vm and mount it inside guest.
pub const KATA_VIRTUAL_VOLUME_DIRECT_BLOCK: &str = "direct_block";
/// Present a container image as a generic block device.
@@ -384,7 +387,15 @@ impl KataVirtualVolume {
pub fn from_base64(value: &str) -> Result<Self> {
let json = base64::decode(value)?;
let volume: KataVirtualVolume = serde_json::from_slice(&json)?;
Ok(volume)
}
/// Decode and deserialize a virtual volume object from base64 encoded json string and validate it.
pub fn from_base64_and_validate(value: &str) -> Result<Self> {
let volume = Self::from_base64(value)?;
volume.validate()?;
Ok(volume)
}
}
@@ -532,7 +543,7 @@ pub fn adjust_rootfs_mounts() -> Result<Vec<Mount>> {
// Create a new Vec<Mount> with a single Mount entry.
// This Mount's options will contain the base64-encoded virtual volume.
Ok(vec![Mount {
options: vec![format!("{}={}", "io.katacontainers.volume", b64_vol)],
options: vec![format!("{}{}", KATA_VIRTUAL_VOLUME_PREFIX, b64_vol)],
..Default::default() // Use default values for other Mount fields
}])
}
@@ -647,7 +658,8 @@ mod tests {
volume.direct_volume = Some(DirectAssignedVolume { metadata });
let value = volume.to_base64().unwrap();
let volume2: KataVirtualVolume = KataVirtualVolume::from_base64(value.as_str()).unwrap();
let volume2: KataVirtualVolume =
KataVirtualVolume::from_base64_and_validate(value.as_str()).unwrap();
assert_eq!(volume.volume_type, volume2.volume_type);
assert_eq!(volume.source, volume2.source);
assert_eq!(volume.fs_type, volume2.fs_type);
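
Note on the hunks above: two things change here, a shared `KATA_VIRTUAL_VOLUME_PREFIX` constant (which already ends in `=`) and a decode-then-validate helper. A minimal sketch of both ideas follows. The `Volume` type, its `type:source` text encoding, and the functions `to_mount_option` and `from_mount_option` are assumptions made for the example; the real code uses base64-encoded JSON, which is left out to keep the sketch free of external crates.

```rust
/// Same value as the constant introduced in the hunk; note the trailing `=`.
const KATA_VIRTUAL_VOLUME_PREFIX: &str = "io.katacontainers.volume=";

/// Hypothetical stand-in for `KataVirtualVolume`.
#[derive(Debug, PartialEq)]
struct Volume {
    volume_type: String,
    source: String,
}

impl Volume {
    /// Parse only; "type:source" is an invented encoding for this sketch.
    fn parse(value: &str) -> Result<Self, String> {
        let (volume_type, source) = value
            .split_once(':')
            .ok_or_else(|| format!("malformed volume: {}", value))?;
        Ok(Volume { volume_type: volume_type.to_string(), source: source.to_string() })
    }

    /// Semantic checks kept separate from parsing.
    fn validate(&self) -> Result<(), String> {
        if self.volume_type.is_empty() || self.source.is_empty() {
            return Err("volume type and source must not be empty".to_string());
        }
        Ok(())
    }

    /// Convenience wrapper in the spirit of `from_base64_and_validate`.
    fn parse_and_validate(value: &str) -> Result<Self, String> {
        let volume = Self::parse(value)?;
        volume.validate()?;
        Ok(volume)
    }
}

/// Build a mount option; the prefix carries the `=`, so the format string has no separator.
fn to_mount_option(payload: &str) -> String {
    format!("{}{}", KATA_VIRTUAL_VOLUME_PREFIX, payload)
}

/// Reverse direction: strip the same constant, then decode and validate.
fn from_mount_option(option: &str) -> Result<Volume, String> {
    let payload = option
        .strip_prefix(KATA_VIRTUAL_VOLUME_PREFIX)
        .ok_or_else(|| format!("not a virtual volume option: {}", option))?;
    Volume::parse_and_validate(payload)
}

fn main() {
    let option = to_mount_option("direct_block:/dev/vda");
    println!("{}", option);
    println!("{:?}", from_mount_option(&option));
}
```

Keeping `parse` and `parse_and_validate` separate lets callers that only need the raw fields skip the semantic checks, which appears to be the intent of leaving `from_base64` in place.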

@@ -186,7 +186,7 @@ mod tests {
"./test_hypervisor_hook_path"
);
assert!(!hv.memory_info.enable_mem_prealloc);
assert_eq!(hv.cpu_info.default_vcpus, 12);
assert_eq!(hv.cpu_info.default_vcpus, 12.0);
assert!(!hv.memory_info.enable_guest_swap);
assert_eq!(hv.memory_info.default_memory, 100);
assert!(!hv.enable_iothreads);

@@ -13,11 +13,16 @@ serde_json = "1.0.73"
# - Dynamic keys required to allow HashMap keys to be slog::Serialized.
# - The 'max_*' features allow changing the log level at runtime
# (by stopping the compiler from removing log calls).
slog = { version = "2.5.2", features = ["dynamic-keys", "max_level_trace", "release_max_level_debug"] }
slog = { version = "2.5.2", features = [
"dynamic-keys",
"max_level_trace",
"release_max_level_debug",
] }
slog-json = "2.4.0"
slog-term = "2.9.0"
slog-term = "2.9.1"
slog-async = "2.7.0"
slog-scope = "4.4.0"
slog-journald = "2.2.0"
lazy_static = "1.3.0"
arc-swap = "1.5.0"

@@ -81,6 +81,11 @@ pub fn create_term_logger(level: slog::Level) -> (slog::Logger, slog_async::Asyn
(logger, guard)
}
pub enum LogDestination {
File(Box<dyn Write + Send + Sync>),
Journal,
}
// Creates a logger which prints output as JSON
// XXX: 'writer' param used to make testing possible.
pub fn create_logger<W>(
@@ -92,13 +97,43 @@ pub fn create_logger<W>(
where
W: Write + Send + Sync + 'static,
{
let json_drain = slog_json::Json::new(writer)
.add_default_keys()
.build()
.fuse();
create_logger_with_destination(name, source, level, LogDestination::File(Box::new(writer)))
}
// Creates a logger which prints output as JSON or to systemd journal
pub fn create_logger_with_destination(
name: &str,
source: &str,
level: slog::Level,
destination: LogDestination,
) -> (slog::Logger, slog_async::AsyncGuard) {
// Check the destination type before consuming it.
// The `matches` macro performs a non-consuming check (it borrows).
let is_journal_destination = matches!(destination, LogDestination::Journal);
// The target type for boxed drain. Note that Err = slog::Never.
// Both `.fuse()` and `.ignore_res()` convert potential errors into a non-returning path
// (panic or ignore), so they never return an Err.
let drain: Box<dyn Drain<Ok = (), Err = slog::Never> + Send> = match destination {
LogDestination::File(writer) => {
// `destination` is `File`.
let json_drain = slog_json::Json::new(writer)
.add_default_keys()
.build()
.fuse();
Box::new(json_drain)
}
LogDestination::Journal => {
// `destination` is `Journal`.
let journal_drain = slog_journald::JournaldDrain.ignore_res();
Box::new(journal_drain)
}
};
// Ensure only a unique set of key/value fields is logged
let unique_drain = UniqueDrain::new(json_drain).fuse();
let unique_drain = UniqueDrain::new(drain).fuse();
// Adjust the level which will be applied to the log-system
// Info is the default level, but if Debug flag is set, the overall log level will be changed to Debug here
@@ -119,16 +154,28 @@ where
.thread_name("slog-async-logger".into())
.build_with_guard();
// Add some "standard" fields
let logger = slog::Logger::root(
// Create a base logger with common fields.
let base_logger = slog::Logger::root(
async_drain.fuse(),
o!("version" => env!("CARGO_PKG_VERSION"),
o!(
"version" => env!("CARGO_PKG_VERSION"),
"subsystem" => DEFAULT_SUBSYSTEM,
"pid" => process::id().to_string(),
"name" => name.to_string(),
"source" => source.to_string()),
"source" => source.to_string()
),
);
// If not journal destination, the logger remains the base_logger.
let logger = if is_journal_destination {
// Use the .new() method to build a child logger which inherits all existing
// key-value pairs from its parent and supplements them with additional ones.
// This is the idiomatic way.
base_logger.new(o!("SYSLOG_IDENTIFIER" => "kata"))
} else {
base_logger
};
(logger, guard)
}
@@ -502,7 +549,12 @@ mod tests {
let record_key = "record-key-1";
let record_value = "record-key-2";
let (logger, guard) = create_logger(name, source, level, writer);
let (logger, guard) = create_logger_with_destination(
name,
source,
level,
LogDestination::File(Box::new(writer)),
);
let msg = "foo, bar, baz";
@@ -661,7 +713,12 @@ mod tests {
.reopen()
.unwrap_or_else(|_| panic!("{:?}: failed to clone tempfile", msg));
let (logger, logger_guard) = create_logger(name, source, d.slog_level, writer);
let (logger, logger_guard) = create_logger_with_destination(
name,
source,
d.slog_level,
LogDestination::File(Box::new(writer)),
);
// Call the logger (which calls the drain)
(d.closure)(&logger, d.msg.to_owned());
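
Note on the hunks above: the logger now picks its drain from a `LogDestination` enum, boxes it behind a common trait object, and adds a `SYSLOG_IDENTIFIER` field only for the journal case. The control flow can be sketched without slog as below. The `Drain` trait, `JsonDrain`, `JournalDrain`, `SharedBuf`, `build` and the `"root"` subsystem value are simplified stand-ins invented for this sketch, not the slog or slog-journald APIs.

```rust
use std::io::{self, Write};
use std::sync::{Arc, Mutex};

enum LogDestination {
    File(Box<dyn Write + Send>),
    Journal,
}

/// Minimal stand-in for a log drain.
trait Drain: Send {
    fn log(&mut self, msg: &str);
}

/// Writes one JSON-like line per record (no escaping; illustration only).
struct JsonDrain(Box<dyn Write + Send>);

impl Drain for JsonDrain {
    fn log(&mut self, msg: &str) {
        // Errors are ignored here, loosely analogous to `.ignore_res()`.
        let _ = writeln!(self.0, "{{\"msg\":\"{}\"}}", msg);
    }
}

/// Placeholder for a journald-backed drain; a real one would forward the record.
struct JournalDrain;

impl Drain for JournalDrain {
    fn log(&mut self, _msg: &str) {}
}

/// Selects a drain and the extra key/value fields for the chosen destination.
fn build(destination: LogDestination) -> (Box<dyn Drain>, Vec<(&'static str, &'static str)>) {
    // `matches!` only borrows, so the check must come before the consuming `match`.
    let is_journal = matches!(destination, LogDestination::Journal);
    let drain: Box<dyn Drain> = match destination {
        LogDestination::File(writer) => Box::new(JsonDrain(writer)),
        LogDestination::Journal => Box::new(JournalDrain),
    };
    let mut fields = vec![("subsystem", "root")];
    if is_journal {
        fields.push(("SYSLOG_IDENTIFIER", "kata"));
    }
    (drain, fields)
}

/// Cloneable in-memory writer so the output can be inspected afterwards.
#[derive(Clone, Default)]
struct SharedBuf(Arc<Mutex<Vec<u8>>>);

impl Write for SharedBuf {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    let buf = SharedBuf::default();
    let (mut drain, fields) = build(LogDestination::File(Box::new(buf.clone())));
    drain.log("hello");
    print!("{}", String::from_utf8_lossy(&buf.0.lock().unwrap()));
    println!("{:?}", fields);
    println!("{:?}", build(LogDestination::Journal).1);
}
```

Boxing both branches behind one trait object is what lets the rest of the pipeline stay identical for either destination, at the cost of one dynamic dispatch per record.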

@@ -16,8 +16,8 @@ async-trait = { version = "0.1.42", optional = true }
protobuf = { version = "3.7.2" }
serde = { version = "1.0.130", features = ["derive"] }
serde_json = "1.0.68"
oci-spec = { version = "0.6.8", features = ["runtime"] }
oci-spec = { version = "0.8.1", features = ["runtime"] }
[build-dependencies]
ttrpc-codegen = "0.5.0"
ttrpc-codegen = "0.6.0"
protobuf = { version = "3.7.2" }

@@ -13,7 +13,7 @@ edition = "2018"
[dependencies]
anyhow = "^1.0"
nix = "0.24.0"
nix = "0.26.4"
tokio = { version = "1.44.2", features = ["rt-multi-thread"] }
hyper = { version = "0.14.20", features = ["stream", "server", "http1"] }
hyperlocal = "0.8"

@@ -12,4 +12,4 @@ license = "Apache-2.0"
edition = "2018"
[dependencies]
nix = "0.24.2"
nix = "0.26.4"

@@ -1,5 +1 @@
/target
/example/target
/.vscode
.vscode-ctags
Cargo.lock

917
src/mem-agent/Cargo.lock generated
View File

@@ -1,917 +0,0 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 4
[[package]]
name = "addr2line"
version = "0.21.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a30b2e23b9e17a9f90641c7ab1549cd9b44f296d3ccbf309d2863cfe398a0cb"
dependencies = [
"gimli",
]
[[package]]
name = "adler"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
[[package]]
name = "android-tzdata"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e999941b234f3131b00bc13c22d06e8c5ff726d1b6318ac7eb276997bbb4fef0"
[[package]]
name = "android_system_properties"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "819e7219dbd41043ac279b19830f2efc897156490d7fd6ea916720117ee66311"
dependencies = [
"libc",
]
[[package]]
name = "anyhow"
version = "1.0.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0952808a6c2afd1aa8947271f3a60f1a6763c7b912d210184c5149b5cf147247"
[[package]]
name = "arc-swap"
version = "1.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7b3d0060af21e8d11a926981cc00c6c1541aa91dd64b9f881985c3da1094425f"
[[package]]
name = "async-trait"
version = "0.1.77"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c980ee35e870bd1a4d2c8294d4c04d0499e67bca1e4b5cefcc693c2fa00caea9"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "autocfg"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "backtrace"
version = "0.3.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2089b7e3f35b9dd2d0ed921ead4f6d318c27680d4a5bd167b3ee120edb105837"
dependencies = [
"addr2line",
"cc",
"cfg-if",
"libc",
"miniz_oxide",
"object",
"rustc-demangle",
]
[[package]]
name = "bitflags"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bef38d45163c2f1dde094a7dfd33ccf595c92905c8f8f4fdc18d06fb1037718a"
[[package]]
name = "bitflags"
version = "2.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b048fb63fd8b5923fc5aa7b340d8e156aec7ec02f0c78fa8a6ddc2613f6f71de"
[[package]]
name = "bumpalo"
version = "3.15.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7ff69b9dd49fd426c69a0db9fc04dd934cdb6645ff000864d98f7e2af8830eaa"
[[package]]
name = "bytes"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a2bd12c1caf447e69cd4528f47f94d203fd2582878ecb9e9465484c4148a8223"
[[package]]
name = "cc"
version = "1.0.90"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8cd6604a82acf3039f1144f54b8eb34e91ffba622051189e71b781822d5ee1f5"
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "chrono"
version = "0.4.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8eaf5903dcbc0a39312feb77df2ff4c76387d591b9fc7b04a238dcf8bb62639a"
dependencies = [
"android-tzdata",
"iana-time-zone",
"js-sys",
"num-traits",
"wasm-bindgen",
"windows-targets 0.52.4",
]
[[package]]
name = "core-foundation-sys"
version = "0.8.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "06ea2b9bc92be3c2baa9334a323ebca2d6f074ff852cd1d7b11064035cd3868f"
[[package]]
name = "crossbeam-channel"
version = "0.5.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "82b8f8f868b36967f9606790d1903570de9ceaf870a7bf9fbbd3016d636a2cb2"
dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "22ec99545bb0ed0ea7bb9b8e1e9122ea386ff8a48c0922e43f36d45ab09e0e80"
[[package]]
name = "deranged"
version = "0.3.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b42b6fa04a440b495c8b04d0e71b707c585f83cb9cb28cf8cd0d976c315e31b4"
dependencies = [
"powerfmt",
]
[[package]]
name = "dirs-next"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b98cf8ebf19c3d1b223e151f99a4f9f0690dca41414773390fc824184ac833e1"
dependencies = [
"cfg-if",
"dirs-sys-next",
]
[[package]]
name = "dirs-sys-next"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4ebda144c4fe02d1f7ea1a7d9641b6fc6b580adcfa024ae48797ecdeb6825b4d"
dependencies = [
"libc",
"redox_users",
"winapi",
]
[[package]]
name = "getrandom"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4567c8db10ae91089c99af84c68c38da3ec2f087c3f82960bcdbf3656b6f4d7"
dependencies = [
"cfg-if",
"libc",
"wasi",
]
[[package]]
name = "gimli"
version = "0.28.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4271d37baee1b8c7e4b708028c57d816cf9d2434acb33a549475f78c181f6253"
[[package]]
name = "hermit-abi"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fbf6a919d6cf397374f7dfeeea91d974c7c0a7221d0d0f4f20d859d329e53fcc"
[[package]]
name = "iana-time-zone"
version = "0.1.60"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e7ffbb5a1b541ea2561f8c41c087286cc091e21e556a4f09a8f6cbf17b69b141"
dependencies = [
"android_system_properties",
"core-foundation-sys",
"iana-time-zone-haiku",
"js-sys",
"wasm-bindgen",
"windows-core",
]
[[package]]
name = "iana-time-zone-haiku"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f31827a206f56af32e590ba56d5d2d085f558508192593743f16b2306495269f"
dependencies = [
"cc",
]
[[package]]
name = "is-terminal"
version = "0.4.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "261f68e344040fbd0edea105bef17c66edf46f984ddb1115b775ce31be948f4b"
dependencies = [
"hermit-abi",
"libc",
"windows-sys",
]
[[package]]
name = "itoa"
version = "1.0.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d75a2a4b1b190afb6f5425f10f6a8f959d2ea0b9c2b1d79553551850539e4674"
[[package]]
name = "js-sys"
version = "0.3.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "29c15563dc2726973df627357ce0c9ddddbea194836909d655df6a75d2cf296d"
dependencies = [
"wasm-bindgen",
]
[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.172"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d750af042f7ef4f724306de029d18836c26c1765a54a6a3f094cbd23a7267ffa"
[[package]]
name = "libredox"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c0ff37bd590ca25063e35af745c343cb7a0271906fb7b37e4813e8f79f00268d"
dependencies = [
"bitflags 2.6.0",
"libc",
]
[[package]]
name = "lock_api"
version = "0.4.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3c168f8615b12bc01f9c17e2eb0cc07dcae1940121185446edc3744920e8ef45"
dependencies = [
"autocfg",
"scopeguard",
]
[[package]]
name = "log"
version = "0.4.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "90ed8c1e510134f979dbc4f070f87d4313098b704861a105fe34231c70a3901c"
[[package]]
name = "maplit"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3e2e65a1a2e43cfcb47a895c4c8b10d1f4a61097f9f254f183aee60cad9c651d"
[[package]]
name = "mem-agent"
version = "0.1.0"
dependencies = [
"anyhow",
"async-trait",
"chrono",
"lazy_static",
"maplit",
"nix",
"once_cell",
"page_size",
"slog",
"slog-async",
"slog-scope",
"slog-term",
"tokio",
]
[[package]]
name = "memchr"
version = "2.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "523dc4f511e55ab87b694dc30d0f820d60906ef06413f93d4d7a1385599cc149"
[[package]]
name = "memoffset"
version = "0.6.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5aa361d4faea93603064a027415f07bd8e1d5c88c9fbf68bf56a285428fd79ce"
dependencies = [
"autocfg",
]
[[package]]
name = "miniz_oxide"
version = "0.7.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d811f3e15f28568be3407c8e7fdb6514c1cda3cb30683f15b6a1a1dc4ea14a7"
dependencies = [
"adler",
]
[[package]]
name = "mio"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2886843bf800fba2e3377cff24abf6379b4c4d5c6681eaf9ea5b0d15090450bd"
dependencies = [
"libc",
"wasi",
"windows-sys",
]
[[package]]
name = "nix"
version = "0.23.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f3790c00a0150112de0f4cd161e3d7fc4b2d8a5542ffc35f099a2562aecb35c"
dependencies = [
"bitflags 1.3.2",
"cc",
"cfg-if",
"libc",
"memoffset",
]
[[package]]
name = "num-conv"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "51d515d32fb182ee37cda2ccdcb92950d6a3c2893aa280e540671c2cd0f3b1d9"
[[package]]
name = "num-traits"
version = "0.2.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da0df0e5185db44f69b44f26786fe401b6c293d1907744beaa7fa62b2e5a517a"
dependencies = [
"autocfg",
]
[[package]]
name = "object"
version = "0.32.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a6a622008b6e321afc04970976f62ee297fdbaa6f95318ca343e3eebb9648441"
dependencies = [
"memchr",
]
[[package]]
name = "once_cell"
version = "1.19.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92"
[[package]]
name = "page_size"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "30d5b2194ed13191c1999ae0704b7839fb18384fa22e49b57eeaa97d79ce40da"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "parking_lot"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3742b2c103b9f06bc9fff0a37ff4912935851bee6d36f3c02bcc755bcfec228f"
dependencies = [
"lock_api",
"parking_lot_core",
]
[[package]]
name = "parking_lot_core"
version = "0.9.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c42a9226546d68acdd9c0a280d17ce19bfe27a46bf68784e4066115788d008e"
dependencies = [
"cfg-if",
"libc",
"redox_syscall",
"smallvec",
"windows-targets 0.48.5",
]
[[package]]
name = "pin-project-lite"
version = "0.2.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8afb450f006bf6385ca15ef45d71d2288452bc3683ce2e2cacc0d18e4be60b58"
[[package]]
name = "powerfmt"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "439ee305def115ba05938db6eb1644ff94165c5ab5e9420d1c1bcedbba909391"
[[package]]
name = "proc-macro2"
version = "1.0.79"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e835ff2298f5721608eb1a980ecaee1aef2c132bf95ecc026a11b7bf3c01c02e"
dependencies = [
"unicode-ident",
]
[[package]]
name = "quote"
version = "1.0.35"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "291ec9ab5efd934aaf503a6466c5d5251535d108ee747472c3977cc5acc868ef"
dependencies = [
"proc-macro2",
]
[[package]]
name = "redox_syscall"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4722d768eff46b75989dd134e5c353f0d6296e5aaa3132e776cbdb56be7731aa"
dependencies = [
"bitflags 1.3.2",
]
[[package]]
name = "redox_users"
version = "0.4.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ba009ff324d1fc1b900bd1fdb31564febe58a8ccc8a6fdbb93b543d33b13ca43"
dependencies = [
"getrandom",
"libredox",
"thiserror",
]
[[package]]
name = "rustc-demangle"
version = "0.1.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d626bb9dae77e28219937af045c257c28bfd3f69333c512553507f5f9798cb76"
[[package]]
name = "rustversion"
version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0e819f2bc632f285be6d7cd36e25940d45b2391dd6d9b939e79de557f7014248"
[[package]]
name = "scopeguard"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94143f37725109f92c262ed2cf5e59bce7498c01bcc1502d7b9afe439a4e9f49"
[[package]]
name = "serde"
version = "1.0.210"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8e3592472072e6e22e0a54d5904d9febf8508f65fb8552499a1abc7d1078c3a"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.210"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "243902eda00fad750862fc144cea25caca5e20d615af0a81bee94ca738f1df1f"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "signal-hook-registry"
version = "1.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d8229b473baa5980ac72ef434c4415e70c4b5e71b423043adb4ba059f89c99a1"
dependencies = [
"libc",
]
[[package]]
name = "slog"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8347046d4ebd943127157b94d63abb990fcf729dc4e9978927fdf4ac3c998d06"
[[package]]
name = "slog-async"
version = "2.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72c8038f898a2c79507940990f05386455b3a317d8f18d4caea7cbc3d5096b84"
dependencies = [
"crossbeam-channel",
"slog",
"take_mut",
"thread_local",
]
[[package]]
name = "slog-scope"
version = "4.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2f95a4b4c3274cd2869549da82b57ccc930859bdbf5bcea0424bc5f140b3c786"
dependencies = [
"arc-swap",
"lazy_static",
"slog",
]
[[package]]
name = "slog-term"
version = "2.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6e022d0b998abfe5c3782c1f03551a596269450ccd677ea51c56f8b214610e8"
dependencies = [
"is-terminal",
"slog",
"term",
"thread_local",
"time",
]
[[package]]
name = "smallvec"
version = "1.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6ecd384b10a64542d77071bd64bd7b231f4ed5940fba55e98c3de13824cf3d7"
[[package]]
name = "socket2"
version = "0.5.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "05ffd9c0a93b7543e062e759284fcf5f5e3b098501104bfbdde4d404db792871"
dependencies = [
"libc",
"windows-sys",
]
[[package]]
name = "syn"
version = "2.0.52"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b699d15b36d1f02c3e7c69f8ffef53de37aefae075d8488d4ba1a7788d574a07"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "take_mut"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f764005d11ee5f36500a149ace24e00e3da98b0158b3e2d53a7495660d3f4d60"
[[package]]
name = "term"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c59df8ac95d96ff9bede18eb7300b0fda5e5d8d90960e76f8e14ae765eedbf1f"
dependencies = [
"dirs-next",
"rustversion",
"winapi",
]
[[package]]
name = "thiserror"
version = "1.0.65"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d11abd9594d9b38965ef50805c5e469ca9cc6f197f883f717e0269a3057b3d5"
dependencies = [
"thiserror-impl",
]
[[package]]
name = "thiserror-impl"
version = "1.0.65"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ae71770322cbd277e69d762a16c444af02aa0575ac0d174f0b9562d3b37f8602"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "thread_local"
version = "1.1.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8b9ef9bad013ada3808854ceac7b46812a6465ba368859a37e2100283d2d719c"
dependencies = [
"cfg-if",
"once_cell",
]
[[package]]
name = "time"
version = "0.3.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "35e7868883861bd0e56d9ac6efcaaca0d6d5d82a2a7ec8209ff492c07cf37b21"
dependencies = [
"deranged",
"itoa",
"num-conv",
"powerfmt",
"serde",
"time-core",
"time-macros",
]
[[package]]
name = "time-core"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ef927ca75afb808a4d64dd374f00a2adf8d0fcff8e7b184af886c3c87ec4a3f3"
[[package]]
name = "time-macros"
version = "0.2.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2834e6017e3e5e4b9834939793b282bc03b37a3336245fa820e35e233e2a85de"
dependencies = [
"num-conv",
"time-core",
]
[[package]]
name = "tokio"
version = "1.44.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6b88822cbe49de4185e3a4cbf8321dd487cf5fe0c5c65695fef6346371e9c48"
dependencies = [
"backtrace",
"bytes",
"libc",
"mio",
"parking_lot",
"pin-project-lite",
"signal-hook-registry",
"socket2",
"tokio-macros",
"windows-sys",
]
[[package]]
name = "tokio-macros"
version = "2.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6e06d43f1345a3bcd39f6a56dbb7dcab2ba47e68e8ac134855e7e2bdbaf8cab8"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "unicode-ident"
version = "1.0.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b"
[[package]]
name = "wasi"
version = "0.11.0+wasi-snapshot-preview1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9c8d87e72b64a3b4db28d11ce29237c246188f4f51057d65a7eab63b7987e423"
[[package]]
name = "wasm-bindgen"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4be2531df63900aeb2bca0daaaddec08491ee64ceecbee5076636a3b026795a8"
dependencies = [
"cfg-if",
"wasm-bindgen-macro",
]
[[package]]
name = "wasm-bindgen-backend"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "614d787b966d3989fa7bb98a654e369c762374fd3213d212cfc0251257e747da"
dependencies = [
"bumpalo",
"log",
"once_cell",
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-macro"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a1f8823de937b71b9460c0c34e25f3da88250760bec0ebac694b49997550d726"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
]
[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e94f17b526d0a461a191c78ea52bbce64071ed5c04c9ffe424dcb38f74171bb7"
dependencies = [
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-backend",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-shared"
version = "0.2.92"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "af190c94f2773fdb3729c55b007a722abb5384da03bc0986df4c289bf5567e96"
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
[[package]]
name = "windows-core"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33ab640c8d7e35bf8ba19b884ba838ceb4fba93a4e8c65a9059d08afcfc683d9"
dependencies = [
"windows-targets 0.52.4",
]
[[package]]
name = "windows-sys"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
dependencies = [
"windows-targets 0.52.4",
]
[[package]]
name = "windows-targets"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a2fa6e2155d7247be68c096456083145c183cbbbc2764150dda45a87197940c"
dependencies = [
"windows_aarch64_gnullvm 0.48.5",
"windows_aarch64_msvc 0.48.5",
"windows_i686_gnu 0.48.5",
"windows_i686_msvc 0.48.5",
"windows_x86_64_gnu 0.48.5",
"windows_x86_64_gnullvm 0.48.5",
"windows_x86_64_msvc 0.48.5",
]
[[package]]
name = "windows-targets"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7dd37b7e5ab9018759f893a1952c9420d060016fc19a472b4bb20d1bdd694d1b"
dependencies = [
"windows_aarch64_gnullvm 0.52.4",
"windows_aarch64_msvc 0.52.4",
"windows_i686_gnu 0.52.4",
"windows_i686_msvc 0.52.4",
"windows_x86_64_gnu 0.52.4",
"windows_x86_64_gnullvm 0.52.4",
"windows_x86_64_msvc 0.52.4",
]
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2b38e32f0abccf9987a4e3079dfb67dcd799fb61361e53e2882c3cbaf0d905d8"
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bcf46cf4c365c6f2d1cc93ce535f2c8b244591df96ceee75d8e83deb70a9cac9"
[[package]]
name = "windows_aarch64_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc35310971f3b2dbbf3f0690a219f40e2d9afcf64f9ab7cc1be722937c26b4bc"
[[package]]
name = "windows_aarch64_msvc"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da9f259dd3bcf6990b55bffd094c4f7235817ba4ceebde8e6d11cd0c5633b675"
[[package]]
name = "windows_i686_gnu"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a75915e7def60c94dcef72200b9a8e58e5091744960da64ec734a6c6e9b3743e"
[[package]]
name = "windows_i686_gnu"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b474d8268f99e0995f25b9f095bc7434632601028cf86590aea5c8a5cb7801d3"
[[package]]
name = "windows_i686_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f55c233f70c4b27f66c523580f78f1004e8b5a8b659e05a4eb49d4166cca406"
[[package]]
name = "windows_i686_msvc"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1515e9a29e5bed743cb4415a9ecf5dfca648ce85ee42e15873c3cd8610ff8e02"
[[package]]
name = "windows_x86_64_gnu"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "53d40abd2583d23e4718fddf1ebec84dbff8381c07cae67ff7768bbf19c6718e"
[[package]]
name = "windows_x86_64_gnu"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5eee091590e89cc02ad514ffe3ead9eb6b660aedca2183455434b93546371a03"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b7b52767868a23d5bab768e390dc5f5c55825b6d30b86c844ff2dc7414044cc"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77ca79f2451b49fa9e2af39f0747fe999fcda4f5e241b2898624dca97a1f2177"
[[package]]
name = "windows_x86_64_msvc"
version = "0.48.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed94fce61571a4006852b7389a063ab983c02eb1bb37b47f8272ce92d06d9538"
[[package]]
name = "windows_x86_64_msvc"
version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32b752e52a2da0ddfbdbcc6fceadfeede4c939ed16d13e648833a61dfb611ed8"

@@ -1,7 +1,7 @@
[package]
name = "mem-agent"
version = "0.1.0"
edition = "2018"
name = "mem-agent-lib"
version = "0.2.0"
edition = "2021"
[dependencies]
slog = "2.5.2"
@@ -9,13 +9,14 @@ slog-scope = "4.1.2"
anyhow = "1.0"
page_size = "0.6"
chrono = "0.4"
-tokio = { version = "1.44.2", features = ["full"] }
+tokio = { version = "1.45.1", features = ["full"] }
async-trait = "0.1"
lazy_static = "1.4"
-nix = "0.23.2"
-maplit = "1.0"
+nix = { version = "0.30.1", features = ["fs", "sched"] }
[dev-dependencies]
+maplit = "1.0"
slog-term = "2.9.0"
slog-async = "2.7"
once_cell = "1.9.0"
+lazy_static = "1.4"


@@ -1,6 +0,0 @@
# Copyright (C) 2024 Ant group. All rights reserved.
#
# SPDX-License-Identifier: Apache-2.0
default:
cd example; cargo build --examples --target x86_64-unknown-linux-musl

File diff suppressed because it is too large


@@ -1,36 +0,0 @@
[package]
name = "mem-agent-bin"
version = "0.1.0"
edition = "2018"
[dependencies]
slog = "2.5.2"
slog-scope = "4.1.2"
slog-term = "2.9.0"
slog-async = "2.7"
structopt = "0.3"
anyhow = "1.0"
libc = "0.2"
page_size = "0.6"
chrono = "0.4"
maplit = "1.0"
ttrpc = { version = "0.8", features = ["async"] }
tokio = { version = "1.44.2", features = ["full"] }
async-trait = "0.1"
byteorder = "1.5"
protobuf = "3.7.2"
lazy_static = "1.4"
# Rust 1.68 doesn't support 0.5.9
home = "=0.5.5"
mem-agent = { path = "../" }
[[example]]
name = "mem-agent-srv"
path = "./srv.rs"
[[example]]
name = "mem-agent-ctl"
path = "./ctl.rs"
[build-dependencies]
ttrpc-codegen = "0.4"


@@ -1,29 +0,0 @@
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use ttrpc_codegen::{Codegen, Customize, ProtobufCustomize};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let protos = vec![
"protocols/protos/mem-agent.proto",
"protocols/protos/google/protobuf/empty.proto",
"protocols/protos/google/protobuf/timestamp.proto",
];
let protobuf_customized = ProtobufCustomize::default().gen_mod_rs(false);
Codegen::new()
.out_dir("protocols/")
.inputs(&protos)
.include("protocols/protos/")
.rust_protobuf()
.customize(Customize {
async_all: true,
..Default::default()
})
.rust_protobuf_customize(protobuf_customized.clone())
.run()?;
Ok(())
}


@@ -1,79 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
mod protocols;
mod share;
use anyhow::{anyhow, Result};
use protocols::empty;
use protocols::mem_agent_ttrpc;
use share::option::{CompactSetOption, MemcgSetOption};
use structopt::StructOpt;
use ttrpc::r#async::Client;
#[derive(Debug, StructOpt)]
enum Command {
#[structopt(name = "memcgstatus", about = "get memory cgroup status")]
MemcgStatus,
#[structopt(name = "memcgset", about = "set memory cgroup")]
MemcgSet(MemcgSetOption),
#[structopt(name = "compactset", about = "set compact")]
CompactSet(CompactSetOption),
}
#[derive(StructOpt, Debug)]
#[structopt(name = "mem-agent-ctl", about = "Memory agent controler")]
struct Opt {
#[structopt(long, default_value = "unix:///var/run/mem-agent.sock")]
addr: String,
#[structopt(subcommand)]
command: Command,
}
#[tokio::main]
async fn main() -> Result<()> {
let opt = Opt::from_args();
// setup client
let c = Client::connect(&opt.addr).unwrap();
let client = mem_agent_ttrpc::ControlClient::new(c.clone());
match opt.command {
Command::MemcgStatus => {
let mss = client
.memcg_status(ttrpc::context::with_timeout(0), &empty::Empty::new())
.await
.map_err(|e| anyhow!("client.memcg_status fail: {}", e))?;
for mcg in mss.mem_cgroups {
println!("{:?}", mcg);
for (numa_id, n) in mcg.numa {
if let Some(t) = n.last_inc_time.into_option() {
println!("{} {:?}", numa_id, share::misc::timestamp_to_datetime(t)?);
}
}
}
}
Command::MemcgSet(c) => {
let config = c.to_rpc_memcg_config();
client
.memcg_set(ttrpc::context::with_timeout(0), &config)
.await
.map_err(|e| anyhow!("client.memcg_status fail: {}", e))?;
}
Command::CompactSet(c) => {
let config = c.to_rpc_compact_config();
client
.compact_set(ttrpc::context::with_timeout(0), &config)
.await
.map_err(|e| anyhow!("client.memcg_status fail: {}", e))?;
}
}
Ok(())
}


@@ -1,8 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
pub mod empty;
pub mod mem_agent;
pub mod mem_agent_ttrpc;
pub mod timestamp;


@@ -1,52 +0,0 @@
// Protocol Buffers - Google's data interchange format
// Copyright 2008 Google Inc. All rights reserved.
// https://developers.google.com/protocol-buffers/
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
syntax = "proto3";
package google.protobuf;
option csharp_namespace = "Google.Protobuf.WellKnownTypes";
option go_package = "types";
option java_package = "com.google.protobuf";
option java_outer_classname = "EmptyProto";
option java_multiple_files = true;
option objc_class_prefix = "GPB";
option cc_enable_arenas = true;
// A generic empty message that you can re-use to avoid defining duplicated
// empty messages in your APIs. A typical example is to use it as the request
// or the response type of an API method. For instance:
//
// service Foo {
// rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
// }
//
// The JSON representation for `Empty` is empty JSON object `{}`.
message Empty {}


@@ -1,138 +0,0 @@
// Protocol Buffers - Google's data interchange format
// Copyright 2008 Google Inc. All rights reserved.
// https://developers.google.com/protocol-buffers/
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
syntax = "proto3";
package google.protobuf;
option csharp_namespace = "Google.Protobuf.WellKnownTypes";
option cc_enable_arenas = true;
option go_package = "github.com/golang/protobuf/ptypes/timestamp";
option java_package = "com.google.protobuf";
option java_outer_classname = "TimestampProto";
option java_multiple_files = true;
option objc_class_prefix = "GPB";
// A Timestamp represents a point in time independent of any time zone or local
// calendar, encoded as a count of seconds and fractions of seconds at
// nanosecond resolution. The count is relative to an epoch at UTC midnight on
// January 1, 1970, in the proleptic Gregorian calendar which extends the
// Gregorian calendar backwards to year one.
//
// All minutes are 60 seconds long. Leap seconds are "smeared" so that no leap
// second table is needed for interpretation, using a [24-hour linear
// smear](https://developers.google.com/time/smear).
//
// The range is from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z. By
// restricting to that range, we ensure that we can convert to and from [RFC
// 3339](https://www.ietf.org/rfc/rfc3339.txt) date strings.
//
// # Examples
//
// Example 1: Compute Timestamp from POSIX `time()`.
//
// Timestamp timestamp;
// timestamp.set_seconds(time(NULL));
// timestamp.set_nanos(0);
//
// Example 2: Compute Timestamp from POSIX `gettimeofday()`.
//
// struct timeval tv;
// gettimeofday(&tv, NULL);
//
// Timestamp timestamp;
// timestamp.set_seconds(tv.tv_sec);
// timestamp.set_nanos(tv.tv_usec * 1000);
//
// Example 3: Compute Timestamp from Win32 `GetSystemTimeAsFileTime()`.
//
// FILETIME ft;
// GetSystemTimeAsFileTime(&ft);
// UINT64 ticks = (((UINT64)ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
//
// // A Windows tick is 100 nanoseconds. Windows epoch 1601-01-01T00:00:00Z
// // is 11644473600 seconds before Unix epoch 1970-01-01T00:00:00Z.
// Timestamp timestamp;
// timestamp.set_seconds((INT64) ((ticks / 10000000) - 11644473600LL));
// timestamp.set_nanos((INT32) ((ticks % 10000000) * 100));
//
// Example 4: Compute Timestamp from Java `System.currentTimeMillis()`.
//
// long millis = System.currentTimeMillis();
//
// Timestamp timestamp = Timestamp.newBuilder().setSeconds(millis / 1000)
// .setNanos((int) ((millis % 1000) * 1000000)).build();
//
//
// Example 5: Compute Timestamp from current time in Python.
//
// timestamp = Timestamp()
// timestamp.GetCurrentTime()
//
// # JSON Mapping
//
// In JSON format, the Timestamp type is encoded as a string in the
// [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format. That is, the
// format is "{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z"
// where {year} is always expressed using four digits while {month}, {day},
// {hour}, {min}, and {sec} are zero-padded to two digits each. The fractional
// seconds, which can go up to 9 digits (i.e. up to 1 nanosecond resolution),
// are optional. The "Z" suffix indicates the timezone ("UTC"); the timezone
// is required. A proto3 JSON serializer should always use UTC (as indicated by
// "Z") when printing the Timestamp type and a proto3 JSON parser should be
// able to accept both UTC and other timezones (as indicated by an offset).
//
// For example, "2017-01-15T01:30:15.01Z" encodes 15.01 seconds past
// 01:30 UTC on January 15, 2017.
//
// In JavaScript, one can convert a Date object to this format using the
// standard
// [toISOString()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString)
// method. In Python, a standard `datetime.datetime` object can be converted
// to this format using
// [`strftime`](https://docs.python.org/2/library/time.html#time.strftime) with
// the time format spec '%Y-%m-%dT%H:%M:%S.%fZ'. Likewise, in Java, one can use
// the Joda Time's [`ISODateTimeFormat.dateTime()`](
// http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTime%2D%2D
// ) to obtain a formatter capable of generating timestamps in this format.
//
//
message Timestamp {
// Represents seconds of UTC time since Unix epoch
// 1970-01-01T00:00:00Z. Must be from 0001-01-01T00:00:00Z to
// 9999-12-31T23:59:59Z inclusive.
int64 seconds = 1;
// Non-negative fractions of a second at nanosecond resolution. Negative
// second values with fractions must still have non-negative nanos values
// that count forward in time. Must be from 0 to 999,999,999
// inclusive.
int32 nanos = 2;
}
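
The Timestamp comment above requires `nanos` to stay between 0 and 999,999,999 even when `seconds` is negative. A minimal Rust sketch of that rule, assuming a millisecond input as in Example 4; the function name `millis_to_seconds_nanos` is illustrative and does not exist in this repository:

```rust
/// Split milliseconds since the Unix epoch into a `(seconds, nanos)` pair.
///
/// Euclidean division keeps `nanos` non-negative, so a negative input
/// rounds `seconds` down and counts `nanos` forward in time.
fn millis_to_seconds_nanos(millis: i64) -> (i64, i32) {
    let seconds = millis.div_euclid(1000);
    // rem_euclid(1000) is in 0..=999, so the product fits in an i32.
    let nanos = (millis.rem_euclid(1000) * 1_000_000) as i32;
    (seconds, nanos)
}
```

With plain `/` and `%`, an input of -1500 would give -1 second and -500,000,000 nanos, which the message definition does not allow.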


@@ -1,66 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
syntax = "proto3";
package MemAgent;
import "google/protobuf/empty.proto";
import "google/protobuf/timestamp.proto";
service Control {
rpc MemcgStatus(google.protobuf.Empty) returns (MemcgStatusReply);
rpc MemcgSet(MemcgConfig) returns (google.protobuf.Empty);
rpc CompactSet(CompactConfig) returns (google.protobuf.Empty);
}
message EvictionCount {
uint64 page = 1;
uint64 no_min_lru_file = 2;
uint64 min_lru_inc = 3;
uint64 other_error = 4;
uint64 error = 5;
uint64 psi_exceeds_limit = 6;
}
message StatusNuma {
google.protobuf.Timestamp last_inc_time = 1;
uint64 max_seq = 2;
uint64 min_seq = 3;
uint64 run_aging_count = 4;
EvictionCount eviction_count = 5;
}
message MemCgroup {
uint32 id = 1;
uint64 ino = 2;
string path = 3;
uint64 sleep_psi_exceeds_limit = 4;
map<uint32, StatusNuma> numa = 5;
}
message MemcgStatusReply {
repeated MemCgroup mem_cgroups = 1;
}
message MemcgConfig {
optional bool disabled = 1;
optional bool swap = 2;
optional uint32 swappiness_max = 3;
optional uint64 period_secs = 4;
optional uint32 period_psi_percent_limit = 5;
optional uint32 eviction_psi_percent_limit = 6;
optional uint64 eviction_run_aging_count_min = 7;
}
message CompactConfig {
optional bool disabled = 1;
optional uint64 period_secs = 2;
optional uint32 period_psi_percent_limit = 3;
optional uint32 compact_psi_percent_limit = 4;
optional int64 compact_sec_max = 5;
optional uint32 compact_order = 6;
optional uint64 compact_threshold = 7;
optional uint64 compact_force_times = 8;
}


@@ -1,29 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use anyhow::{anyhow, Result};
use chrono::{DateTime, LocalResult, TimeZone, Utc};
use protobuf::well_known_types::timestamp::Timestamp;
pub fn datatime_to_timestamp(dt: DateTime<Utc>) -> Timestamp {
let seconds = dt.timestamp();
let nanos = dt.timestamp_subsec_nanos();
Timestamp {
seconds,
nanos: nanos as i32,
..Default::default()
}
}
#[allow(dead_code)]
pub fn timestamp_to_datetime(timestamp: Timestamp) -> Result<DateTime<Utc>> {
let seconds = timestamp.seconds;
let nanos = timestamp.nanos;
match Utc.timestamp_opt(seconds, nanos as u32) {
LocalResult::Single(t) => Ok(t),
_ => Err(anyhow!("Utc.timestamp_opt {} fail", timestamp)),
}
}


@@ -1,7 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
pub mod misc;
pub mod option;
pub mod rpc;


@@ -1,146 +0,0 @@
// Copyright (C) 2024 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::protocols::mem_agent as rpc;
use structopt::StructOpt;
#[derive(Debug, StructOpt)]
pub struct MemcgSetOption {
#[structopt(long)]
memcg_disabled: Option<bool>,
#[structopt(long)]
memcg_swap: Option<bool>,
#[structopt(long)]
memcg_swappiness_max: Option<u8>,
#[structopt(long)]
memcg_period_secs: Option<u64>,
#[structopt(long)]
memcg_period_psi_percent_limit: Option<u8>,
#[structopt(long)]
memcg_eviction_psi_percent_limit: Option<u8>,
#[structopt(long)]
memcg_eviction_run_aging_count_min: Option<u64>,
}
impl MemcgSetOption {
#[allow(dead_code)]
pub fn to_rpc_memcg_config(&self) -> rpc::MemcgConfig {
let config = rpc::MemcgConfig {
disabled: self.memcg_disabled,
swap: self.memcg_swap,
swappiness_max: self.memcg_swappiness_max.map(|v| v as u32),
period_secs: self.memcg_period_secs,
period_psi_percent_limit: self.memcg_period_psi_percent_limit.map(|v| v as u32),
eviction_psi_percent_limit: self.memcg_eviction_psi_percent_limit.map(|v| v as u32),
eviction_run_aging_count_min: self.memcg_eviction_run_aging_count_min,
..Default::default()
};
config
}
#[allow(dead_code)]
pub fn to_mem_agent_memcg_config(&self) -> mem_agent::memcg::Config {
let mut config = mem_agent::memcg::Config {
..Default::default()
};
if let Some(v) = self.memcg_disabled {
config.disabled = v;
}
if let Some(v) = self.memcg_swap {
config.swap = v;
}
if let Some(v) = self.memcg_swappiness_max {
config.swappiness_max = v;
}
if let Some(v) = self.memcg_period_secs {
config.period_secs = v;
}
if let Some(v) = self.memcg_period_psi_percent_limit {
config.period_psi_percent_limit = v;
}
if let Some(v) = self.memcg_eviction_psi_percent_limit {
config.eviction_psi_percent_limit = v;
}
if let Some(v) = self.memcg_eviction_run_aging_count_min {
config.eviction_run_aging_count_min = v;
}
config
}
}
#[derive(Debug, StructOpt)]
pub struct CompactSetOption {
#[structopt(long)]
compact_disabled: Option<bool>,
#[structopt(long)]
compact_period_secs: Option<u64>,
#[structopt(long)]
compact_period_psi_percent_limit: Option<u8>,
#[structopt(long)]
compact_psi_percent_limit: Option<u8>,
#[structopt(long)]
compact_sec_max: Option<i64>,
#[structopt(long)]
compact_order: Option<u8>,
#[structopt(long)]
compact_threshold: Option<u64>,
#[structopt(long)]
compact_force_times: Option<u64>,
}
impl CompactSetOption {
#[allow(dead_code)]
pub fn to_rpc_compact_config(&self) -> rpc::CompactConfig {
let config = rpc::CompactConfig {
disabled: self.compact_disabled,
period_secs: self.compact_period_secs,
period_psi_percent_limit: self.compact_period_psi_percent_limit.map(|v| v as u32),
compact_psi_percent_limit: self.compact_psi_percent_limit.map(|v| v as u32),
compact_sec_max: self.compact_sec_max,
compact_order: self.compact_order.map(|v| v as u32),
compact_threshold: self.compact_threshold,
compact_force_times: self.compact_force_times,
..Default::default()
};
config
}
#[allow(dead_code)]
pub fn to_mem_agent_compact_config(&self) -> mem_agent::compact::Config {
let mut config = mem_agent::compact::Config {
..Default::default()
};
if let Some(v) = self.compact_disabled {
config.disabled = v;
}
if let Some(v) = self.compact_period_secs {
config.period_secs = v;
}
if let Some(v) = self.compact_period_psi_percent_limit {
config.period_psi_percent_limit = v;
}
if let Some(v) = self.compact_psi_percent_limit {
config.compact_psi_percent_limit = v;
}
if let Some(v) = self.compact_sec_max {
config.compact_sec_max = v;
}
if let Some(v) = self.compact_order {
config.compact_order = v;
}
if let Some(v) = self.compact_threshold {
config.compact_threshold = v;
}
if let Some(v) = self.compact_force_times {
config.compact_force_times = v;
}
config
}
}
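
The `to_mem_agent_*_config` methods above start from the default config and overwrite only the fields given on the command line. A compact sketch of the same overlay pattern using `Option::unwrap_or`; the `Config` and `Overrides` types here are hypothetical stand-ins, not the crate's real structs:

```rust
/// Hypothetical effective configuration (stand-in for a real config struct).
#[derive(Debug, Clone, PartialEq)]
struct Config {
    disabled: bool,
    period_secs: u64,
    swappiness_max: u8,
}

/// Hypothetical optional overrides, one `Option` per field.
#[derive(Debug, Default)]
struct Overrides {
    disabled: Option<bool>,
    period_secs: Option<u64>,
    swappiness_max: Option<u8>,
}

/// Return `base` with every field that is `Some` in `o` replaced.
fn apply_overrides(base: Config, o: &Overrides) -> Config {
    Config {
        disabled: o.disabled.unwrap_or(base.disabled),
        period_secs: o.period_secs.unwrap_or(base.period_secs),
        swappiness_max: o.swappiness_max.unwrap_or(base.swappiness_max),
    }
}
```

Fields left as `None` keep the base value, which matches the behavior of the chained `if let Some(v)` blocks.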


@@ -1,221 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use crate::protocols::mem_agent as rpc_mem_agent;
use crate::protocols::{empty, mem_agent_ttrpc};
use anyhow::{anyhow, Result};
use async_trait::async_trait;
use mem_agent::{agent, compact, memcg};
use slog_scope::{error, info};
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::sync::Arc;
use tokio::signal::unix::{signal, SignalKind};
use ttrpc::asynchronous::Server;
use ttrpc::error::Error;
use ttrpc::proto::Code;
#[derive(Debug)]
pub struct MyControl {
agent: agent::MemAgent,
}
impl MyControl {
#[allow(dead_code)]
pub fn new(agent: agent::MemAgent) -> Self {
Self { agent }
}
}
fn mem_cgroup_to_mem_cgroup_rpc(mcg: &memcg::MemCgroup) -> rpc_mem_agent::MemCgroup {
rpc_mem_agent::MemCgroup {
id: mcg.id as u32,
ino: mcg.ino as u64,
path: mcg.path.clone(),
sleep_psi_exceeds_limit: mcg.sleep_psi_exceeds_limit,
numa: mcg
.numa
.iter()
.map(|(numa_id, n)| {
(
*numa_id,
rpc_mem_agent::StatusNuma {
last_inc_time: protobuf::MessageField::some(
crate::share::misc::datatime_to_timestamp(n.last_inc_time),
),
max_seq: n.max_seq,
min_seq: n.min_seq,
run_aging_count: n.run_aging_count,
eviction_count: protobuf::MessageField::some(
rpc_mem_agent::EvictionCount {
page: n.eviction_count.page,
no_min_lru_file: n.eviction_count.no_min_lru_file,
min_lru_inc: n.eviction_count.min_lru_inc,
other_error: n.eviction_count.other_error,
error: n.eviction_count.error,
psi_exceeds_limit: n.eviction_count.psi_exceeds_limit,
..Default::default()
},
),
..Default::default()
},
)
})
.collect(),
..Default::default()
}
}
fn mem_cgroups_to_memcg_status_reply(
mgs: Vec<memcg::MemCgroup>,
) -> rpc_mem_agent::MemcgStatusReply {
let mem_cgroups: Vec<rpc_mem_agent::MemCgroup> = mgs
.iter()
.map(|x| mem_cgroup_to_mem_cgroup_rpc(&x))
.collect();
rpc_mem_agent::MemcgStatusReply {
mem_cgroups,
..Default::default()
}
}
fn memcgconfig_to_memcg_optionconfig(mc: &rpc_mem_agent::MemcgConfig) -> memcg::OptionConfig {
let moc = memcg::OptionConfig {
disabled: mc.disabled,
swap: mc.swap,
swappiness_max: mc.swappiness_max.map(|val| val as u8),
period_secs: mc.period_secs,
period_psi_percent_limit: mc.period_psi_percent_limit.map(|val| val as u8),
eviction_psi_percent_limit: mc.eviction_psi_percent_limit.map(|val| val as u8),
eviction_run_aging_count_min: mc.eviction_run_aging_count_min,
..Default::default()
};
moc
}
fn compactconfig_to_compact_optionconfig(
cc: &rpc_mem_agent::CompactConfig,
) -> compact::OptionConfig {
let coc = compact::OptionConfig {
disabled: cc.disabled,
period_secs: cc.period_secs,
period_psi_percent_limit: cc.period_psi_percent_limit.map(|val| val as u8),
compact_psi_percent_limit: cc.compact_psi_percent_limit.map(|val| val as u8),
compact_sec_max: cc.compact_sec_max,
compact_order: cc.compact_order.map(|val| val as u8),
compact_threshold: cc.compact_threshold,
compact_force_times: cc.compact_force_times,
..Default::default()
};
coc
}
#[async_trait]
impl mem_agent_ttrpc::Control for MyControl {
async fn memcg_status(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
_: empty::Empty,
) -> ::ttrpc::Result<rpc_mem_agent::MemcgStatusReply> {
Ok(mem_cgroups_to_memcg_status_reply(
self.agent.memcg_status_async().await.map_err(|e| {
let estr = format!("agent.memcg_status_async fail: {}", e);
error!("{}", estr);
Error::RpcStatus(ttrpc::get_status(Code::INTERNAL, estr))
})?,
))
}
async fn memcg_set(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
mc: rpc_mem_agent::MemcgConfig,
) -> ::ttrpc::Result<empty::Empty> {
self.agent
.memcg_set_config_async(memcgconfig_to_memcg_optionconfig(&mc))
.await
.map_err(|e| {
let estr = format!("agent.memcg_set_config_async fail: {}", e);
error!("{}", estr);
Error::RpcStatus(ttrpc::get_status(Code::INTERNAL, estr))
})?;
Ok(empty::Empty::new())
}
async fn compact_set(
&self,
_ctx: &::ttrpc::r#async::TtrpcContext,
cc: rpc_mem_agent::CompactConfig,
) -> ::ttrpc::Result<empty::Empty> {
self.agent
.compact_set_config_async(compactconfig_to_compact_optionconfig(&cc))
.await
.map_err(|e| {
let estr = format!("agent.compact_set_config_async fail: {}", e);
error!("{}", estr);
Error::RpcStatus(ttrpc::get_status(Code::INTERNAL, estr))
})?;
Ok(empty::Empty::new())
}
}
#[allow(dead_code)]
#[tokio::main]
pub async fn rpc_loop(agent: agent::MemAgent, addr: String) -> Result<()> {
let path = addr
.strip_prefix("unix://")
.ok_or(anyhow!("format of addr {} is not right", addr))?;
if std::path::Path::new(path).exists() {
return Err(anyhow!("addr {} is exist", addr));
}
let control = MyControl::new(agent);
let c = Box::new(control) as Box<dyn mem_agent_ttrpc::Control + Send + Sync>;
let c = Arc::new(c);
let service = mem_agent_ttrpc::create_control(c);
let mut server = Server::new().bind(&addr).unwrap().register_service(service);
let metadata = fs::metadata(path).map_err(|e| anyhow!("fs::metadata {} fail: {}", path, e))?;
let mut permissions = metadata.permissions();
permissions.set_mode(0o600);
fs::set_permissions(path, permissions)
.map_err(|e| anyhow!("fs::set_permissions {} fail: {}", path, e))?;
let mut interrupt = signal(SignalKind::interrupt())
.map_err(|e| anyhow!("signal(SignalKind::interrupt()) fail: {}", e))?;
let mut quit = signal(SignalKind::quit())
.map_err(|e| anyhow!("signal(SignalKind::quit()) fail: {}", e))?;
let mut terminate = signal(SignalKind::terminate())
.map_err(|e| anyhow!("signal(SignalKind::terminate()) fail: {}", e))?;
server
.start()
.await
.map_err(|e| anyhow!("server.start() fail: {}", e))?;
tokio::select! {
_ = interrupt.recv() => {
info!("mem-agent: interrupt shutdown");
}
_ = quit.recv() => {
info!("mem-agent: quit shutdown");
}
_ = terminate.recv() => {
info!("mem-agent: terminate shutdown");
}
};
server
.shutdown()
.await
.map_err(|e| anyhow!("server.shutdown() fail: {}", e))?;
fs::remove_file(&path).map_err(|e| anyhow!("fs::remove_file {} fail: {}", path, e))?;
Ok(())
}
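
The conversion helpers in this file narrow `u32` RPC fields to `u8` with `as`, which wraps out-of-range values silently (300 becomes 44). A hedged sketch of a checked alternative; `narrow_to_u8` is a hypothetical helper and is not part of the deleted code:

```rust
use std::convert::TryFrom;

/// Narrow an optional `u32` to an optional `u8`, returning an error
/// instead of wrapping when the value does not fit.
fn narrow_to_u8(v: Option<u32>) -> Result<Option<u8>, String> {
    v.map(|x| u8::try_from(x).map_err(|_| format!("value {} does not fit in u8", x)))
        .transpose()
}
```

A caller could map the `Err` case to an RPC status the same way the handlers above map agent errors.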


@@ -1,95 +0,0 @@
// Copyright (C) 2023 Ant group. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use anyhow::{anyhow, Result};
use share::option::{CompactSetOption, MemcgSetOption};
use slog::{Drain, Level, Logger};
use slog_async;
use slog_scope::set_global_logger;
use slog_scope::{error, info};
use slog_term;
use std::fs::OpenOptions;
use std::io::BufWriter;
use structopt::StructOpt;
mod protocols;
mod share;
#[derive(StructOpt, Debug)]
#[structopt(name = "mem-agent", about = "Memory agent")]
struct Opt {
#[structopt(long, default_value = "unix:///var/run/mem-agent.sock")]
addr: String,
#[structopt(long)]
log_file: Option<String>,
#[structopt(long, default_value = "trace", parse(try_from_str = parse_slog_level))]
log_level: Level,
#[structopt(flatten)]
memcg: MemcgSetOption,
#[structopt(flatten)]
compact: CompactSetOption,
}
fn parse_slog_level(src: &str) -> Result<Level, String> {
match src.to_lowercase().as_str() {
"trace" => Ok(Level::Trace),
"debug" => Ok(Level::Debug),
"info" => Ok(Level::Info),
"warning" => Ok(Level::Warning),
"warn" => Ok(Level::Warning),
"error" => Ok(Level::Error),
_ => Err(format!("Invalid log level: {}", src)),
}
}
fn setup_logging(opt: &Opt) -> Result<slog_scope::GlobalLoggerGuard> {
let drain = if let Some(f) = &opt.log_file {
let log_file = OpenOptions::new()
.create(true)
.write(true)
.append(true)
.open(f)
.map_err(|e| anyhow!("Open log file {} fail: {}", f, e))?;
let buffered = BufWriter::new(log_file);
let decorator = slog_term::PlainDecorator::new(buffered);
let drain = slog_term::CompactFormat::new(decorator)
.build()
.filter_level(opt.log_level)
.fuse();
slog_async::Async::new(drain).build().fuse()
} else {
let decorator = slog_term::TermDecorator::new().stderr().build();
let drain = slog_term::CompactFormat::new(decorator)
.build()
.filter_level(opt.log_level)
.fuse();
slog_async::Async::new(drain).build().fuse()
};
let logger = Logger::root(drain, slog::o!());
Ok(set_global_logger(logger.clone()))
}
fn main() -> Result<()> {
// Check opt
let opt = Opt::from_args();
let _ = setup_logging(&opt).map_err(|e| anyhow!("setup_logging fail: {}", e))?;
let memcg_config = opt.memcg.to_mem_agent_memcg_config();
let compact_config = opt.compact.to_mem_agent_compact_config();
let (ma, _rt) = mem_agent::agent::MemAgent::new(memcg_config, compact_config)
.map_err(|e| anyhow!("MemAgent::new fail: {}", e))?;
info!("MemAgent started");
share::rpc::rpc_loop(ma, opt.addr).map_err(|e| {
let estr = format!("rpc::rpc_loop fail: {}", e);
error!("{}", estr);
anyhow!("{}", estr)
})?;
Ok(())
}


@@ -4,8 +4,9 @@
use crate::compact;
use crate::memcg::{self, MemCgroup};
-use crate::{error, info};
+use crate::{debug, error, info};
use anyhow::{anyhow, Result};
+use std::collections::HashMap;
use std::thread;
use tokio::runtime::{Builder, Runtime};
use tokio::select;
@@ -27,7 +28,7 @@ enum AgentCmd {
enum AgentReturn {
Ok,
Err(anyhow::Error),
-MemcgStatus(Vec<memcg::MemCgroup>),
+MemcgStatus(HashMap<String, memcg::MemCgroup>),
}
async fn handle_agent_cmd(
@@ -44,7 +45,16 @@ async fn handle_agent_cmd(
ret_msg = AgentReturn::MemcgStatus(memcg.get_status().await);
false
}
-AgentCmd::MemcgSet(opt) => memcg.set_config(opt).await,
+AgentCmd::MemcgSet(opt) => match memcg.set_config(opt).await {
+Ok(reset) => {
+ret_msg = AgentReturn::Ok;
+reset
+}
+Err(e) => {
+ret_msg = AgentReturn::Err(e);
+false
+}
+},
AgentCmd::CompactSet(opt) => comp.set_config(opt).await,
};
@@ -59,6 +69,11 @@ fn get_remaining_tokio_duration(memcg: &memcg::MemCG, comp: &compact::Compact) -
let memcg_d = memcg.get_remaining_tokio_duration();
let comp_d = comp.get_remaining_tokio_duration();
+debug!(
+"get_remaining_tokio_duration: memcg_d={:?}, comp_d={:?}",
+memcg_d, comp_d
+);
if memcg_d > comp_d {
comp_d
} else {
@@ -76,6 +91,11 @@ async fn async_get_remaining_tokio_duration(
let memcg_d = memcg_f.await;
let comp_d = comp_f.await;
+debug!(
+"async_get_remaining_tokio_duration: memcg_d={:?}, comp_d={:?}",
+memcg_d, comp_d
+);
if memcg_d > comp_d {
comp_d
} else {
@@ -84,16 +104,14 @@ async fn async_get_remaining_tokio_duration(
}
fn agent_work(mut memcg: memcg::MemCG, mut comp: compact::Compact) -> Result<Duration> {
-let memcg_need_reset = if memcg.need_work() {
+let memcg_work_list = memcg.get_timeout_list();
+if memcg_work_list.len() > 0 {
info!("memcg.work start");
memcg
-.work()
+.work(&memcg_work_list)
.map_err(|e| anyhow!("memcg.work failed: {}", e))?;
info!("memcg.work stop");
-true
-} else {
-false
-};
+}
let compact_need_reset = if comp.need_work() {
info!("compact.work start");
@@ -105,9 +123,8 @@ fn agent_work(mut memcg: memcg::MemCG, mut comp: compact::Compact) -> Result<Dur
false
};
-if memcg_need_reset {
-memcg.reset_timer();
-}
+memcg.reset_timers(&memcg_work_list);
if compact_need_reset {
comp.reset_timer();
}
@@ -136,6 +153,8 @@ impl MemAgentSleep {
}
fn set_sleep(&mut self, d: Duration) {
+info!("MemAgentSleep::set_sleep: {:?}", d);
self.duration = d;
self.start_wait_time = Instant::now();
}
@@ -219,10 +238,17 @@ impl MemAgent {
memcg_config: memcg::Config,
compact_config: compact::Config,
) -> Result<(Self, Runtime)> {
-let mg = memcg::MemCG::new(memcg_config)
+let is_cg_v2 = crate::cgroup::is_cgroup_v2()?;
+if is_cg_v2 {
+info!("current host use cgroup v2");
+} else {
+info!("current host use cgroup v1");
+}
+let mg = memcg::MemCG::new(is_cg_v2, memcg_config)
.map_err(|e| anyhow!("memcg::MemCG::new fail: {}", e))?;
-let comp = compact::Compact::new(compact_config)
+let comp = compact::Compact::new(is_cg_v2, compact_config)
.map_err(|e| anyhow!("compact::Compact::new fail: {}", e))?;
let (cmd_tx, cmd_rx) = mpsc::channel(10);
@@ -295,7 +321,7 @@ impl MemAgent {
}
}
-pub async fn memcg_status_async(&self) -> Result<Vec<MemCgroup>> {
+pub async fn memcg_status_async(&self) -> Result<HashMap<String, MemCgroup>> {
let ret = self
.send_cmd_async(AgentCmd::MemcgStatus)
.await
@@ -323,10 +349,8 @@ mod tests {
#[test]
fn test_agent() {
-let memcg_config = memcg::Config {
-disabled: true,
-..Default::default()
-};
+let mut memcg_config = memcg::Config::default();
+memcg_config.default.disabled = true;
let compact_config = compact::Config {
disabled: true,
..Default::default()
@@ -337,10 +361,8 @@ mod tests {
tokio::runtime::Runtime::new()
.unwrap()
.block_on({
-let memcg_config = memcg::OptionConfig {
-period_secs: Some(120),
-..Default::default()
-};
+let mut memcg_config = memcg::OptionConfig::default();
+memcg_config.default.period_secs = Some(120);
ma.memcg_set_config_async(memcg_config)
})
.unwrap();
@@ -359,10 +381,8 @@ mod tests {
#[test]
fn test_agent_memcg_status() {
-let memcg_config = memcg::Config {
-disabled: true,
-..Default::default()
-};
+let mut memcg_config = memcg::Config::default();
+memcg_config.default.disabled = true;
let compact_config = compact::Config {
disabled: true,
..Default::default()
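
The `agent_work` change above replaces a single `need_work()` / `reset_timer()` pair with a list of timed-out entries that is processed and then reset. A small sketch of how such a per-entry timer table could behave, under the assumption that each tracked path has its own deadline; the `Timers` type and its methods are hypothetical and only mirror the call shape seen in the diff:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-path timer table (not the crate's actual type).
struct Timers {
    period: Duration,
    deadlines: HashMap<String, Instant>,
}

impl Timers {
    fn new(period: Duration) -> Self {
        Timers { period, deadlines: HashMap::new() }
    }

    /// Start tracking `path`, with its first deadline one period after `now`.
    fn add(&mut self, path: &str, now: Instant) {
        self.deadlines.insert(path.to_string(), now + self.period);
    }

    /// Paths whose deadline has passed at `now`, sorted for stable output.
    fn timeout_list(&self, now: Instant) -> Vec<String> {
        let mut due: Vec<String> = self
            .deadlines
            .iter()
            .filter(|(_, deadline)| **deadline <= now)
            .map(|(path, _)| path.clone())
            .collect();
        due.sort();
        due
    }

    /// Re-arm only the listed paths; other deadlines are left untouched.
    fn reset_timers(&mut self, paths: &[String], now: Instant) {
        for path in paths {
            if let Some(deadline) = self.deadlines.get_mut(path) {
                *deadline = now + self.period;
            }
        }
    }
}
```

Passing `now` explicitly keeps the sketch deterministic; an empty list makes the reset a no-op, which is why the unconditional `reset_timers` call in the new code needs no surrounding `if`.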


@@ -0,0 +1,23 @@
// Copyright (C) 2025 Kylin Soft. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
use anyhow::{anyhow, Result};
use nix::sys::statfs::statfs;
use std::path::Path;
#[cfg(target_env = "musl")]
const CGROUP2_SUPER_MAGIC: nix::sys::statfs::FsType = nix::sys::statfs::FsType(0x63677270);
#[cfg(not(target_env = "musl"))]
use nix::sys::statfs::CGROUP2_SUPER_MAGIC;
pub const CGROUP_PATH: &str = "/sys/fs/cgroup/";
pub const MEMCGS_V1_PATH: &str = "/sys/fs/cgroup/memory";
pub fn is_cgroup_v2() -> Result<bool> {
let cgroup_path = Path::new("/sys/fs/cgroup");
let stat =
statfs(cgroup_path).map_err(|e| anyhow!("statfs {:?} failed: {}", cgroup_path, e))?;
Ok(stat.filesystem_type() == CGROUP2_SUPER_MAGIC)
}
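
Note on the new cgroup module above: `is_cgroup_v2()` decides by comparing the `statfs` filesystem type of `/sys/fs/cgroup` with `CGROUP2_SUPER_MAGIC`. For quick experiments without the `nix` crate, a rough std-only heuristic is sketched below. It is a different technique from the one in this diff: it assumes that the unified (v2) hierarchy exposes a `cgroup.controllers` file at its root, which a v1 or hybrid layout typically does not have at that path. The name `looks_like_cgroup_v2` is hypothetical and does not appear in the change.

```rust
use std::path::Path;

/// Heuristic stand-in for the statfs()-based check in the diff (an assumption,
/// not the author's method): report `true` when `root` contains a regular
/// file named `cgroup.controllers`, as the cgroup v2 root normally does.
pub fn looks_like_cgroup_v2(root: &Path) -> bool {
    root.join("cgroup.controllers").is_file()
}
```

Calling `looks_like_cgroup_v2(Path::new("/sys/fs/cgroup"))` would be the closest analogue of the function in the diff. Taking the root as a parameter keeps the sketch testable against a scratch directory.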

View File

@@ -2,6 +2,7 @@
//
// SPDX-License-Identifier: Apache-2.0
use crate::cgroup::CGROUP_PATH;
use crate::proc;
use crate::psi;
use crate::timer::Timeout;
@@ -59,7 +60,7 @@ impl Default for Config {
period_secs: 10 * 60,
period_psi_percent_limit: 1,
compact_psi_percent_limit: 5,
compact_sec_max: 30 * 60,
compact_sec_max: 5 * 60,
compact_order: PAGE_REPORTING_MIN_ORDER,
compact_threshold: 2 << PAGE_REPORTING_MIN_ORDER,
compact_force_times: std::u64::MAX,
@@ -238,7 +239,11 @@ pub struct Compact {
}
impl Compact {
pub fn new(mut config: Config) -> Result<Self> {
pub fn new(is_cg_v2: bool, mut config: Config) -> Result<Self> {
if is_cg_v2 {
config.psi_path = PathBuf::from(CGROUP_PATH);
}
config.psi_path =
psi::check(&config.psi_path).map_err(|e| anyhow!("psi::check failed: {}", e))?;
@@ -440,7 +445,8 @@ mod tests {
#[test]
fn test_compact() {
let mut c = Compact::new(Config::default()).unwrap();
let is_cg_v2 = crate::cgroup::is_cgroup_v2().unwrap();
let mut c = Compact::new(is_cg_v2, Config::default()).unwrap();
assert!(c.work().is_ok());
}
}

View File

@@ -3,6 +3,7 @@
// SPDX-License-Identifier: Apache-2.0
pub mod agent;
mod cgroup;
pub mod compact;
pub mod memcg;
mod mglru;

File diff suppressed because it is too large

View File

@@ -2,8 +2,9 @@
//
// SPDX-License-Identifier: Apache-2.0
use crate::debug;
use crate::warn;
use crate::cgroup::CGROUP_PATH;
use crate::cgroup::MEMCGS_V1_PATH;
use crate::{debug, trace, warn};
use anyhow::{anyhow, Result};
use chrono::{DateTime, Duration, Utc};
use std::collections::HashMap;
@@ -17,7 +18,7 @@ const WORKINGSET_ANON: usize = 0;
const WORKINGSET_FILE: usize = 1;
const LRU_GEN_ENABLED_PATH: &str = "/sys/kernel/mm/lru_gen/enabled";
const LRU_GEN_PATH: &str = "/sys/kernel/debug/lru_gen";
const MEMCGS_PATH: &str = "/sys/fs/cgroup/memory";
pub const MAX_NR_GENS: u64 = 4;
fn lru_gen_head_parse(line: &str) -> Result<(usize, String)> {
let words: Vec<&str> = line.split_whitespace().map(|word| word.trim()).collect();
@@ -238,13 +239,18 @@ fn file_parse(
pub fn host_memcgs_get(
target_patchs: &HashSet<String>,
parse_line: bool,
is_cg_v2: bool,
) -> Result<HashMap<String, (usize, usize, HashMap<usize, MGenLRU>)>> {
let mgs = file_parse(target_patchs, parse_line)
.map_err(|e| anyhow!("mglru file_parse failed: {}", e))?;
let mut host_mgs = HashMap::new();
for (path, (id, mglru)) in mgs {
let host_path = PathBuf::from(MEMCGS_PATH).join(path.trim_start_matches('/'));
let host_path = if is_cg_v2 {
PathBuf::from(CGROUP_PATH).join(path.trim_start_matches('/'))
} else {
PathBuf::from(MEMCGS_V1_PATH).join(path.trim_start_matches('/'))
};
let metadata = match fs::metadata(host_path.clone()) {
Err(e) => {
@@ -301,7 +307,7 @@ pub fn run_aging(
"+ {} {} {} {} {}",
memcg_id, numa_id, max_seq, can_swap as i32, force_scan as i32
);
//trace!("send cmd {} to {}", cmd, LRU_GEN_PATH);
trace!("send cmd {} to {}", cmd, LRU_GEN_PATH);
fs::write(LRU_GEN_PATH, &cmd)
.map_err(|e| anyhow!("write file {} cmd {} failed: {}", LRU_GEN_PATH, cmd, e))?;
Ok(())
@@ -318,7 +324,7 @@ pub fn run_eviction(
"- {} {} {} {} {}",
memcg_id, numa_id, min_seq, swappiness, nr_to_reclaim
);
//trace!("send cmd {} to {}", cmd, LRU_GEN_PATH);
trace!("send cmd {} to {}", cmd, LRU_GEN_PATH);
fs::write(LRU_GEN_PATH, &cmd)
.map_err(|e| anyhow!("write file {} cmd {} failed: {}", LRU_GEN_PATH, cmd, e))?;
Ok(())
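
Note on the `host_memcgs_get` hunk above: the memcg path reported by `lru_gen` is now joined onto a root that depends on the cgroup version. A minimal, std-only sketch of that selection follows. The constant values are copied from the new cgroup module in this diff, while the helper name `memcg_host_path` is hypothetical and only illustrates the branch.

```rust
use std::path::PathBuf;

// Values mirror the constants introduced by the cgroup module in this diff.
const CGROUP_PATH: &str = "/sys/fs/cgroup/";
const MEMCGS_V1_PATH: &str = "/sys/fs/cgroup/memory";

/// Hypothetical helper: map a memcg path such as "/kubepods/pod1" to the
/// host directory for the active cgroup version.
pub fn memcg_host_path(is_cg_v2: bool, memcg_path: &str) -> PathBuf {
    let root = if is_cg_v2 { CGROUP_PATH } else { MEMCGS_V1_PATH };
    // Strip leading '/' so that join() appends to the root instead of
    // replacing it (joining an absolute path discards the base).
    PathBuf::from(root).join(memcg_path.trim_start_matches('/'))
}
```

With this sketch, `memcg_host_path(true, "/kubepods/pod1")` resolves under `/sys/fs/cgroup`, and the same call with `false` resolves under `/sys/fs/cgroup/memory`.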

View File

@@ -9,14 +9,14 @@ pub fn sl() -> slog::Logger {
#[macro_export]
macro_rules! error {
($($arg:tt)*) => {
slog::info!(crate::misc::sl(), "{}", format_args!($($arg)*))
slog::error!(crate::misc::sl(), "{}", format_args!($($arg)*))
}
}
#[macro_export]
macro_rules! warn {
($($arg:tt)*) => {
slog::info!(crate::misc::sl(), "{}", format_args!($($arg)*))
slog::warn!(crate::misc::sl(), "{}", format_args!($($arg)*))
}
}
@@ -30,14 +30,14 @@ macro_rules! info {
#[macro_export]
macro_rules! trace {
($($arg:tt)*) => {
slog::info!(crate::misc::sl(), "{}", format_args!($($arg)*))
slog::trace!(crate::misc::sl(), "{}", format_args!($($arg)*))
}
}
#[macro_export]
macro_rules! debug {
($($arg:tt)*) => {
slog::info!(crate::misc::sl(), "{}", format_args!($($arg)*))
slog::debug!(crate::misc::sl(), "{}", format_args!($($arg)*))
}
}

View File

@@ -2,6 +2,7 @@
//
// SPDX-License-Identifier: Apache-2.0
use crate::cgroup::CGROUP_PATH;
use crate::info;
use anyhow::{anyhow, Result};
use chrono::{DateTime, Utc};
@@ -11,7 +12,6 @@ use std::fs::OpenOptions;
use std::io::{BufRead, BufReader};
use std::path::PathBuf;
const CGROUP_PATH: &str = "/sys/fs/cgroup/";
const MEM_PSI: &str = "memory.pressure";
const IO_PSI: &str = "io.pressure";

1464
src/runtime-rs/Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -49,8 +49,8 @@ dbs-utils = { path = "../dragonball/dbs_utils" }
actix-rt = "2.7.0"
anyhow = "1.0"
async-trait = "0.1.48"
containerd-shim = { version = "0.6.0", features = ["async"] }
containerd-shim-protos = { version = "0.6.0", features = ["async"] }
containerd-shim = { version = "0.10.0", features = ["async"] }
containerd-shim-protos = { version = "0.10.0", features = ["async"] }
go-flag = "0.1.0"
hyper = "0.14.20"
hyperlocal = "0.8.0"
@@ -58,8 +58,9 @@ lazy_static = "1.4"
libc = "0.2"
log = "0.4.14"
netns-rs = "0.1.0"
nix = "0.24.2"
oci-spec = { version = "0.6.8", features = ["runtime"] }
# Note: nix needs to stay sync'd with libs versions
nix = "0.26.4"
oci-spec = { version = "0.8.1", features = ["runtime"] }
protobuf = "3.7.2"
rand = "0.8.4"
serde = { version = "1.0.145", features = ["derive"] }
@@ -69,8 +70,8 @@ slog-scope = "4.4.0"
strum = { version = "0.24.0", features = ["derive"] }
tempfile = "3.19.1"
thiserror = "1.0"
tokio = "1.38.2"
tokio = "1.46.1"
tracing = "0.1.41"
tracing-opentelemetry = "0.18.0"
ttrpc = "0.8.4"
url = "2.3.1"
url = "2.5.4"

View File

@@ -311,13 +311,15 @@ ifneq (,$(QEMUCMD))
DEFSANDBOXCGROUPONLY_QEMU := false
ifeq ($(ARCH), s390x)
VMROOTFSDRIVER_QEMU := virtio-blk-ccw
DEFBLOCKSTORAGEDRIVER_QEMU := virtio-blk-ccw
else
VMROOTFSDRIVER_QEMU := virtio-pmem
DEFBLOCKSTORAGEDRIVER_QEMU := virtio-blk-pci
endif
DEFVCPUS_QEMU := 1
DEFMAXVCPUS_QEMU := 0
DEFSHAREDFS_QEMU_VIRTIOFS := virtio-fs
DEFBLOCKSTORAGEDRIVER_QEMU := virtio-scsi
DEFSHAREDFS_QEMU_SEL_VIRTIOFS := none
DEFBLOCKDEVICEAIO_QEMU := io_uring
DEFNETWORKMODEL_QEMU := tcfilter
DEFDISABLEGUESTSELINUX := true
@@ -472,6 +474,7 @@ USER_VARS += DEFBLOCKDEVICEAIO_QEMU
USER_VARS += DEFBLOCKSTORAGEDRIVER_FC
USER_VARS += DEFSHAREDFS_CLH_VIRTIOFS
USER_VARS += DEFSHAREDFS_QEMU_VIRTIOFS
USER_VARS += DEFSHAREDFS_QEMU_SEL_VIRTIOFS
USER_VARS += DEFVIRTIOFSDAEMON
USER_VARS += DEFVALIDVIRTIOFSDAEMONPATHS
USER_VARS += DEFVIRTIOFSCACHESIZE

View File

@@ -168,8 +168,9 @@ default_bridges = @DEFBRIDGES@
# Default false
#reclaim_guest_freed_memory = true
# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device.
# Block device driver to be used by the hypervisor when a container's storage
# is backed by a block device or a file. This driver facilitates attaching the
# storage directly to the guest VM.
block_device_driver = "virtio-blk-pci"
# Specifies cache-related options for block devices.

View File

@@ -118,9 +118,11 @@ default_memory = @DEFMEMSZ@
# > amount of physical RAM --> will be set to the actual amount of physical RAM
default_maxmemory = @DEFMAXMEMSZ@
# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. DB only supports virtio-blk.
# Block device driver to be used by the hypervisor when a container's storage
# is backed by a block device or a file. This driver facilitates attaching the
# storage directly to the guest VM.
#
# DB only supports virtio-blk-mmio.
block_device_driver = "@DEFBLOCKSTORAGEDRIVER_DB@"
# This option changes the default hypervisor and kernel parameters

Some files were not shown because too many files have changed in this diff