Compare commits


186 Commits

Steve Horsman
b5f503b0b5 Merge pull request #10471 from fidencio/topic/possibly-fix-release-workflow
workflows: Possibly fix the release workflow
2024-10-28 11:38:33 +00:00
Fabiano Fidêncio
a8fad6893a workflows: Possibly fix the release workflow
The only reason we had this one passing for amd64 is because the check
was done using the wrong variable (`matrix.stage`, while in the other
workflows the variable used is `inputs.stage`).

The commit that broke the release process is 67a8665f51, which
blindly copy & pasted the logic from the matrix assets.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 11:15:53 +01:00
Steve Horsman
ad5749fd6b Merge pull request #10467 from stevenhorsman/release-3.10.1
release: Bump version to 3.10.1
2024-10-25 20:19:23 +01:00
stevenhorsman
b22d4429fb release: Bump version to 3.10.1
Fix release to pick up #10463

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-25 17:16:09 +01:00
Steve Horsman
19ac0b24f1 Merge pull request #10463 from skaegi/rustjail_filemode_perm_fix
agent: Correct rustjail device filemode permission typo
2024-10-25 14:27:50 +01:00
Fabiano Fidêncio
cc815957c0 Merge pull request #10461 from kata-containers/topic/workflows-follow-up-on-manually-triggered-job
workflows: devel: Follow-up on the manually triggered jobs
2024-10-25 08:31:14 +02:00
Simon Kaegi
322846b36f agent: Correct rustjail device filemode permission typo
Corrects device filemode permissions typo/regression in rustjail to `666` instead of `066`.
`666` is the standard and expected value for these devices in containers.

Fixes: #10454

Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
2024-10-24 16:46:40 -04:00
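The one-digit difference above is easy to misread in octal. A minimal Python sketch (the agent itself is Rust; `describe_mode` is an illustrative helper, not kata code) of what the two values mean:

```python
import stat

def describe_mode(mode: int) -> str:
    """Render an octal permission value as an rwx string (owner, group, other)."""
    # S_IFCHR makes filemode() treat the value as a character device node.
    return stat.filemode(stat.S_IFCHR | mode)[1:]

# The regression swapped a digit: `066` strips all owner permissions,
# while the intended `666` grants read/write to owner, group and other.
print(describe_mode(0o066))  # typo'd value: owner gets no access at all
print(describe_mode(0o666))  # expected value for container devices
```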
GabyCT
a9af46ccd2 Merge pull request #10452 from GabyCT/topic/katadoctemp
tests: Add trap statement in kata doc script
2024-10-24 13:21:11 -06:00
Fabiano Fidêncio
475ad3e06b workflows: devel: Allow running more than one at once
More than one developer can and should be able to run this workflow at
the same time, without cancelling the job started by another developer.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-24 15:38:35 +02:00
Fabiano Fidêncio
8f634ceb6b workflows: devel: Adjust the pr-number
Let's use "dev" instead of "manually-triggered" as it avoids the name
being too long, which results in failures to create AKS clusters.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-24 15:38:31 +02:00
GabyCT
41d1178e4a Merge pull request #10438 from GabyCT/topic/fixspellreadme
docs: Fix misspelling in CI documentation
2024-10-23 13:34:52 -06:00
Steve Horsman
c5c389f473 Merge pull request #10449 from kata-containers/topic/add-workflows-specifically-for-testing
Add a specific workflow for testing the CI, without messing with the "nightly" weather
2024-10-23 19:03:49 +01:00
Gabriela Cervantes
093a6fd542 tests: Add trap statement in kata doc script
This PR adds a trap statement to the kata doc
script to properly clean up the temporary files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-23 15:56:58 +00:00
Gabriela Cervantes
701891312e docs: Fix misspelling in CI documentation
This PR fixes a misspelling in the CI documentation README.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-23 15:42:08 +00:00
Fabiano Fidêncio
829415dfda workflows: Remove the possibility to manually trigger the nightly CI
As a new workflow was added for the cases where developers want to test
their changes to the workflow itself, let's make sure we stop allowing
manual triggers on this workflow, which can lead to polluted /
misleading CI weather.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-23 13:19:45 +02:00
Fabiano Fidêncio
cc093cdfdb workflows: Add a manually triggered "devel" workflow for the CI
This workflow is intended to replace the `workflow_dispatch` trigger
currently present as part of the `ci-nightly.yaml`.

The reasoning behind doing it this way is GHA's good old behaviour for
`pull_request_target`, which requires a PR to be merged in order to
check the changes in the workflow itself. This leads to:
* when a change in a workflow is made, developers (should) do:
  * push their branch to the kata-containers repo
  * manually trigger the "nightly" CI in order to ensure the changes
    don't break anything
    * this can result in the "nightly" CI weather being polluted
      * we no longer have any guarantee / assurance about the last n
        nightly runs

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-23 13:14:50 +02:00
Greg Kurz
378f454fb9 Merge pull request #10208 from wtootw/main
runtime: Failed to clean up resources when QEMU is terminated
2024-10-23 12:11:57 +02:00
Fabiano Fidêncio
ca416d8837 Merge pull request #10446 from kata-containers/topic/re-work-shim-v2-build-as-part-of-the-ci-and-release
workflows: Ensure shim-v2 is built as the last asset
2024-10-23 09:27:29 +02:00
Fabiano Fidêncio
c082b99652 Merge pull request #10439 from microsoft/mahuber/azl-cfg-var
tools: Change PACKAGES var for cbl-mariner
2024-10-23 08:39:49 +02:00
Manuel Huber
a730cef9cf tools: Change PACKAGES var for cbl-mariner
Change the PACKAGES variable for the cbl-mariner rootfs-builder
to use the kata-packages-uvm meta package from
packages.microsoft.com to define the set of packages contained
in the UVM.
This aligns the UVM build for the Azure Linux distribution
with the UVM build done for the Kata Containers offering on
Azure Kubernetes Service (AKS).

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-10-22 23:11:42 +00:00
Fabiano Fidêncio
67a8665f51 workflows: Ensure shim-v2 is built as the last asset
By doing this we can ensure that whenever the rootfs changes, we'll be
able to get the new root_hash.txt and use it.

This is the very first step to bring the measured rootfs tests back.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-22 14:56:37 +02:00
Greg Kurz
3de6d09a86 Merge pull request #10443 from gkurz/release-3.10.0
release: Bump VERSION to 3.10.0
2024-10-22 14:46:30 +02:00
Greg Kurz
3037303e09 release: Bump VERSION to 3.10.0
Let's start the 3.10.0 release.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-10-22 11:28:15 +02:00
wangyaqi54
cf4b81344d runtime: Failed to clean up resources when QEMU is terminated by signal 15
When QEMU is terminated by signal 15, it deletes the PidFile.
Upon detecting that QEMU has exited, the shim executes the stopVM function.
If the PidFile is not found, the PID is set to 0.
Subsequently, the shim executes `kill -9 0`, which terminates the current process group.
This prevents any further logic from being executed, resulting in resources not being cleaned up.

Signed-off-by: wangyaqi54 <wangyaqi54@jd.com>
2024-10-22 17:04:46 +08:00
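The failure path can be sketched in Python (helper names like `pid_from_pidfile` are illustrative, not the shim's actual Go code): the danger is that `kill(2)` with pid 0 signals the caller's entire process group.

```python
def pid_from_pidfile(contents: str) -> int:
    """Parse a PidFile; a missing or empty file degrades to 0, which is
    exactly the value that triggered the bug described above."""
    try:
        return int(contents.strip())
    except ValueError:
        return 0

def may_kill(pid: int) -> bool:
    """Guard before calling kill(2): pid 0 addresses the *current process
    group* and negative pids address other groups, so only a real,
    positive QEMU pid should ever be signalled during stopVM."""
    return pid > 0
```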
Fabiano Fidêncio
4c34cfb0ab Merge pull request #10420 from pmores/add-support-for-virtio-scsi
runtime-rs: support virtio-scsi device in qemu-rs
2024-10-22 11:00:33 +02:00
Pavel Mores
8cdd968092 runtime-rs: support virtio-scsi device in qemu-rs
Semantics are lifted straight out of the go runtime for compatibility.
We introduce DeviceVirtioScsi to represent a virtio-scsi device and
instantiate it if the block device driver in the configuration file is
set to virtio-scsi.  We also introduce ObjectIoThread, which is
instantiated if the configuration file additionally enables iothreads.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-22 08:55:54 +02:00
Greg Kurz
91b874f18c Merge pull request #10421 from Apokleos/hostname-bugfix
kata-agent: Fix bug of being unable to set the hostname correctly
2024-10-22 00:26:51 +02:00
alex.lyn
b25538f670 ci: Introduce CI to validate pod hostname
Fixes #10422

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-10-21 16:32:56 +01:00
alex.lyn
3dabe0f5f0 kata-agent: Fix bug of being unable to set the hostname correctly
When update_container_namespaces updates namespaces, setting
all UTS (and IPC) namespace paths to None resulted in hostnames
set prior to the update becoming ineffective. This was primarily
due to an error made while aligning with the OCI spec: in an attempt
to match empty strings with None values in oci-spec-rs, all paths
were incorrectly set to None.

Fixes #10325

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-10-21 16:32:56 +01:00
Steve Horsman
98886a7571 Merge pull request #10437 from mkulke/mkulke/dont-parse-oci-image-for-cached-artifacts
ci: don't parse oci image for cached artifacts
2024-10-21 16:31:23 +01:00
Magnus Kulke
e27d70d47e ci: don't parse oci image for cached artifacts
Moved the parsing of the oci image marker into its own step, since we
only need to perform that for attestation purposes and some cached
images might not have that file in the tarball.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-10-21 14:50:00 +02:00
Magnus Kulke
9a33a3413b Merge pull request #10433 from mkulke/mkulke/add-provenance-attestation-for-agent-builds
ci: add provenance attestation for agent artifact
2024-10-18 15:00:18 +02:00
Anastassios Nanos
68d539f5c5 Merge pull request #10435 from nubificus/fix_fc_machineconfig
runtime-rs: Use vCPU and memory values from config
2024-10-18 13:41:20 +01:00
Magnus Kulke
b93f5390ce ci: add provenance attestation for agent artifact
This adds provenance attestation logic for agent binaries that are
published to an oci registry via ORAS.

As a downstream consumer of the kata-agent binary the Peerpod project
needs to verify that the artifact has been built on kata's CI.

To create an attestation we need to know the exact digest of the oci
artifact, at the point when the artifact was pushed.

Therefore we record the full oci image as returned by oras push.

The pushing and tagging logic has been slightly reworked to make this
task less repetitive.

The oras cli accepts multiple tags separated by comma on pushes, so a
push can be performed atomically instead of iterating through tags and
pushing each individually. This removes the risk of partially successful
push operations (think: rate limits on the oci registry).

So far the provenance creation has only been enabled for agent builds
on amd64 and s390x.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-10-18 10:24:00 +02:00
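The atomic multi-tag push can be sketched as building a single `oras push` invocation (hypothetical helper; the argument layout is simplified — only the comma-separated tag syntax is taken from the commit message above):

```python
def oras_push_command(repo: str, tags: list[str], artifact: str) -> list[str]:
    """Build one `oras push` covering all tags at once; a partially
    tagged artifact (e.g. after hitting a registry rate limit mid-loop)
    can then no longer occur."""
    if not tags:
        raise ValueError("at least one tag is required")
    return ["oras", "push", f"{repo}:{','.join(tags)}", artifact]
```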
Anastassios Nanos
23f5786cca runtime-rs: Use vCPU and memory values from config
Use values from the config for the setup of the microVM.

Fixes: #10434

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-10-17 23:17:02 +01:00
GabyCT
4ae9317675 Merge pull request #10430 from GabyCT/topic/ciaz
docs: Update CI documentation
2024-10-17 15:09:24 -06:00
GabyCT
b00203ba9b Merge pull request #10428 from GabyCT/topic/archk8sc
gha: Use an arch_to_golang variable to have uniformity
2024-10-17 11:00:59 -06:00
Chengyu Zhu
cca77f0911 Merge pull request #10412 from stevenhorsman/agent-config-rstest
agent: config: Use rstest for unit tests
2024-10-17 23:01:21 +08:00
Gabriela Cervantes
e3efad8ed2 docs: Update CI documentation
This PR updates the CI documentation, describing the various tests and
the kind of instances they run on.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-16 19:23:19 +00:00
stevenhorsman
4adb454ed0 agent: config: Use rstest for unit tests
Use rstest for unit tests rather than TestData arrays where
possible, to make the code more compact and easier to read,
and to open the possibility of enhancing test cases with a
description more easily.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-16 16:55:44 +01:00
Gabriela Cervantes
f0e0c74fd4 gha: Use an arch_to_golang variable to have uniformity
This PR replaces uses of `uname -m` with the arch_to_golang
variable to achieve better uniformity across the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-15 20:03:09 +00:00
Dan Mihai
69509eff33 Merge pull request #10417 from microsoft/danmihai1/k8s-inotify.bats
tests: k8s-inotify.bats improvements
2024-10-15 11:22:53 -07:00
Dan Mihai
ece0f9690e tests: k8s-inotify: longer pod termination timeout
inotify-configmap-pod.yaml is using: "inotifywait --timeout 120",
so wait for up to 180 seconds for the pod termination to be
reported.

Hopefully, some of the sporadic errors from #10413 will be avoided
this way:

not ok 1 configmap update works, and preserves symlinks
waitForProcess "${wait_time}" "$sleep_time" "${command}" failed

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-15 16:01:25 +00:00
Dan Mihai
ccfb7faa1b tests: k8s-inotify.bats: don't leak configmap
Delete the configmap if the test failed, not just on the successful
path.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-15 16:01:25 +00:00
Aurélien Bombo
f13d13c8fa Merge pull request #10416 from microsoft/danmihai1/mariner_static_sandbox_resource_mgmt
ci: static_sandbox_resource_mgmt for cbl-mariner
2024-10-15 10:40:17 -05:00
Aurélien Bombo
c371b4e1ce Merge pull request #10426 from 3u13r/fix/genpolicy/handle-config-map-binary-data
genpolicy: read binaryData value as String
2024-10-14 21:31:23 -05:00
Leonard Cohnen
c06bf2e3bb genpolicy: read binaryData value as String
While Kubernetes defines `binaryData` as `[]byte`,
when defined in a YAML file the raw bytes are
base64 encoded. Therefore, we need to read the YAML
value as `String` and not as `Vec<u8>`.

Fixes: #10410

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-10-14 20:03:11 +02:00
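The same parsing rule, sketched in Python for illustration (genpolicy itself is Rust): the YAML value must be read as a string and base64-decoded, not deserialized as raw bytes.

```python
import base64

def read_binary_data(yaml_value: str) -> bytes:
    """A ConfigMap's `binaryData` values arrive base64-encoded in the
    manifest, so parse the YAML scalar as a string and decode it."""
    return base64.b64decode(yaml_value, validate=True)
```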
Aurélien Bombo
f9b7a8a23c Merge pull request #10402 from Sumynwa/sumsharma/agent-ctl-dependencies
ci: Install build dependencies for building agent-ctl with image pull.
2024-10-14 10:28:32 -05:00
Sumedh Alok Sharma
bc195d758a ci: Install build dependencies for building agent-ctl with image pull.
Adds the 'clang' and 'protobuf' dependencies to be installed on runners
when building agent-ctl sources with image pull support.

Fixes #10400

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-10-14 10:36:04 +05:30
Aurélien Bombo
614e21ccfb Merge pull request #10415 from GabyCT/topic/egreptim
tools/osbuilder/tests: Remove egrep in test images script
2024-10-11 13:47:30 -05:00
Gabriela Cervantes
aae654be80 tools/osbuilder/tests: Remove egrep in test images script
This PR removes the egrep command, as it has been deprecated, and
replaces it with grep in the test images script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-11 17:23:35 +00:00
Dan Mihai
3622b5e8b4 ci: static_sandbox_resource_mgmt for cbl-mariner
Use the configuration used by AKS (static_sandbox_resource_mgmt=true)
for CI testing on Mariner hosts.

Hopefully pod startup will become more predictable on these hosts -
e.g., by avoiding the occasional hotplug timeouts described by #10413.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-10 22:17:39 +00:00
Fabiano Fidêncio
02f5fd94bd Merge pull request #10409 from fidencio/topic/ci-add-ita_image-and-ita_image_tag
kbs: ita: Ensure the proper image / image_tag is used for ITA
2024-10-10 11:46:26 +02:00
Fabiano Fidêncio
cf5d3ed0d4 kbs: ita: Ensure the proper image / image_tag is used for ITA
When dealing with a specific release, it was easier to just make some
adjustments to the image that has to be used for ITA without actually
adding a new entry in the versions.yaml.

However, it's proven to be more complicated than that when it comes
to dealing with staged images, and we'd better explicitly add (and
update) those versions altogether to avoid CI issues.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-10 10:01:33 +02:00
Steve Horsman
0c4a7c8771 Merge pull request #10406 from ChengyuZhu6/fix-unit
agent:cdh: fix unit tests about sealed secret
2024-10-10 08:57:28 +01:00
Fabiano Fidêncio
3f7ce1d620 Merge pull request #10401 from stevenhorsman/kbs-deploy-overlays-update
Kbs deploy overlays update
2024-10-10 09:50:19 +02:00
Fabiano Fidêncio
036b04094e Merge pull request #10397 from fidencio/topic/build-remove-initrd-mariner-target
build: mariner: Remove the ability to build the mariner initrd
2024-10-10 09:44:36 +02:00
ChengyuZhu6
65ecac5777 agent:cdh: fix unit tests about sealed secret
The root cause is that the CDH client is a global variable, and the unit tests `test_unseal_env` and `test_unseal_file`
share this lock-free global variable, leading to resource contention and destruction.
Merging the two unit tests into a single `test_sealed_secret` resolves this issue.

Fixes: #10403

Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>
2024-10-10 08:38:06 +08:00
ChengyuZhu6
a992feb7f3 Revert "Revert "agent:cdh: unittest for sealed secret as file""
This reverts commit b5142c94b9.

Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>
2024-10-10 08:37:06 +08:00
GabyCT
0cda92c6d8 Merge pull request #10407 from GabyCT/topic/fixbuildk
packaging: Remove unused variable in build kernel script
2024-10-09 16:53:45 -06:00
Gabriela Cervantes
616eb8b19b packaging: Remove unused variable in build kernel script
This PR removes an unused variable in the build kernel script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-09 20:02:56 +00:00
Fabiano Fidêncio
652ba30d4a build: mariner: Remove the ability to build the mariner initrd
As mariner has switched to using an image instead of an initrd, let's
just drop the ability to build the initrd and avoid keeping something in
the tree that won't be used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 21:42:55 +02:00
Fabiano Fidêncio
59e3ab07e4 Merge pull request #10396 from fidencio/topic/ci-mariner-test-using-mariner-image-instead-of-initrd
ci: mariner: Use the image instead of the initrd
2024-10-09 21:39:44 +02:00
stevenhorsman
b2fb19f8f8 versions: Bump KBS version
Bump to the commit that had the overlays changes we want
to adapt to.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-09 17:49:21 +01:00
Fabiano Fidêncio
01a957f7e1 ci: mariner: Stop building mariner initrd
As the mariner image is already in place, and the tests were modified to
use them (as part of this series), let's just stop building it as part
of the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:35 +02:00
Fabiano Fidêncio
091ad2a1b2 ci: mariner: Ensure kernel_params can be set
The reason we're doing this is that the mariner image uses, by default,
cgroups default-hierarchy as `unified` (aka cgroupsv2).

In order to keep the same initrd behaviour for mariner, let's enforce
that `SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1
systemd.legacy_systemd_cgroup_controller=yes
systemd.unified_cgroup_hierarchy=0` is passed to the kernel cmdline, at
least for now.

Other tests that set `kernel_params` are not running on mariner, so
we're safe taking this path as it's done as part of this PR.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:35 +02:00
Fabiano Fidêncio
3bbf3c81c2 ci: mariner: Use the image instead of the initrd
As an image has been added for mariner as part of commit 63c1f81c2,
let's start using it in the CI instead of the initrd.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:32 +02:00
Fabiano Fidêncio
9c0c159b25 Merge pull request #10404 from fidencio/topic/rever-sealed-secrets-tests
Revert "agent:cdh: unittest for sealed secret as file"
2024-10-09 18:09:09 +02:00
GabyCT
2035d638df Merge pull request #10388 from GabyCT/topic/testimtemp
tools/osbuilder/tests: Add trap statement in test images script
2024-10-09 09:49:45 -06:00
Fabiano Fidêncio
b5142c94b9 Revert "agent:cdh: unittest for sealed secret as file"
This reverts commit 31e09058af, as it's
breaking the agent unit tests CI.

This is a stopgap until Chengyu Zhu finds the time to properly address
the issue, avoiding the CI being blocked for now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 16:06:09 +02:00
stevenhorsman
8763880e93 tests/k8s: kbs: Update overlays logic
In https://github.com/confidential-containers/trustee/pull/521
the overlays logic was modified to add non-SE
s390x support and simplify non-ibm-se platforms.
We need to update the logic in `kbs_k8s_deploy`
to match, and can now remove the dummying of
`IBM_SE_CREDS_DIR` for non-SE.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-10-09 09:39:41 +01:00
Gabriela Cervantes
e08749ce58 tools/osbuilder/tests: Add trap statement in test images script
This PR adds the trap statement in the test images script to clean up
tmp files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-08 19:54:23 +00:00
Fabiano Fidêncio
80196c06ad Merge pull request #10390 from microsoft/danmihai1/new-rootfs-image-mariner
local-build: add ability to build rootfs-image-mariner
2024-10-08 21:40:43 +02:00
Fabiano Fidêncio
083b2f24d8 Merge pull request #10363 from ChengyuZhu6/secret-as-volume
Support Confidential Sealed Secrets (as volume)
2024-10-08 19:23:40 +02:00
Dan Mihai
63c1f81c23 local-build: add rootfs-image-mariner
Kata CI will start testing the new rootfs-image-mariner instead of the
older rootfs-initrd-mariner image.

The "official" AKS images are moving from a rootfs-initrd-mariner
format to the rootfs-image-mariner format. Making the same change in
Kata CI is useful to keep this testing in sync with the AKS settings.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-08 17:15:56 +00:00
GabyCT
7a38cce73c Merge pull request #10383 from kata-containers/topic/imagevar
image-builder: Remove unused variable
2024-10-08 10:27:03 -06:00
Aurélien Bombo
e56af7a370 Merge pull request #10389 from emanuellima1/fix-agent-policy
build: Fix RPM build fail due to AGENT_POLICY
2024-10-08 09:59:21 -05:00
ChengyuZhu6
a94024aedc tests: add test for sealed file secrets
add a test for sealed file secrets.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-10-08 16:01:48 +08:00
ChengyuZhu6
fe307303c8 agent:rpc: Refactor CDH-related operations
Refactor CDH-related operations into the cdh_handler function to make the `create_container` code clearer.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-10-08 16:01:48 +08:00
ChengyuZhu6
31e09058af agent:cdh: unittest for sealed secret as file
add unittest for sealed secret as file.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-10-08 16:01:48 +08:00
ChengyuZhu6
974d6b0736 agent:cdh: initialize cdhclient with the input cdh socket uri
Refactor cdh code to initialize cdhclient with the input cdh socket uri.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-10-08 14:58:07 +08:00
ChengyuZhu6
1f33fd4cd4 agent:rpc: handle the sealed secret in createcontainer
Users must set the mount path to `/sealed/<path>` for the kata agent to detect the sealed secret mount
and handle it in the createcontainer stage.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-10-08 14:58:07 +08:00
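The detection is purely a path convention; a sketch (illustrative helper, not the agent's Rust code):

```python
SEALED_PREFIX = "/sealed/"

def is_sealed_mount(destination: str) -> bool:
    """The agent recognises a sealed-secret mount by its destination
    living under /sealed/, per the convention described above."""
    return destination.startswith(SEALED_PREFIX) and destination != SEALED_PREFIX
```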
ChengyuZhu6
da281b4444 agent:cdh: support to unseal secret as file
Introduced the `unseal_file` function to unseal secrets as files:
- Implemented logic to handle symlinks and regular files within the sealed secret directory.
- For each entry, CDH is called to unseal the secrets; the unsealed contents are written to a new file, and a symlink is created to replace the sealed symlink.

Fixes: #8123

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-10-08 14:58:07 +08:00
Fabiano Fidêncio
71d0c46e0a Merge pull request #10384 from microsoft/danmihai1/virtio-fs-policy
tests: k8s: AUTO_GENERATE_POLICY=yes for local testing
2024-10-07 21:25:52 +02:00
Emanuel Lima
e989e7ee4e build: Fix RPM build fail due to AGENT_POLICY
By checking for AGENT_POLICY, we ensure we only try to read
allow-all.rego if AGENT_POLICY is set to "yes".

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-10-07 15:43:23 -03:00
Dan Mihai
6d5fc898b8 tests: k8s: AUTO_GENERATE_POLICY=yes for local testing
The behavior of Kata CI doesn't change.

For local testing using kubernetes/gha-run.sh and AUTO_GENERATE_POLICY=yes:

1. Before these changes users were forced to use:
- SEV, SNP, or TDX guests, or
- KATA_HOST_OS=cbl-mariner

2. After these changes users can also use other platforms that are
configured with "shared_fs = virtio-fs" - e.g.,
- KATA_HOST_OS=ubuntu + KATA_HYPERVISOR=qemu

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-04 18:26:00 +00:00
Dan Mihai
5aaef8e6eb Merge pull request #10376 from microsoft/danmihai1/auto-generate-just-for-ci
gha: enable AUTO_GENERATE_POLICY where needed
2024-10-04 10:52:31 -07:00
Gabriela Cervantes
4cd737d9fd image-builder: Remove unused variable
This PR removes an unused variable in the image builder script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-04 15:56:28 +00:00
Greg Kurz
77c5db6267 Merge pull request #9637 from ldoktor/selective-ci
CI: Select jobs by touched code
2024-10-04 11:29:05 +02:00
GabyCT
2d089d9695 Merge pull request #10381 from GabyCT/topic/archrootfs
osbuilder: Remove duplicated arch variable definition
2024-10-03 14:48:08 -06:00
Wainer Moschetta
b9025462fb Merge pull request #10134 from ldoktor/ci-sort-range
ci.ocp: Sort images according to git
2024-10-03 15:08:41 -03:00
Chelsea Mafrica
9138f55757 Merge pull request #10375 from GabyCT/topic/mktempkbs
k8s:kbs: Add trap statement to clean up tmp files
2024-10-03 12:32:30 -04:00
Gabriela Cervantes
d7c2b7d13c osbuilder: Remove duplicated arch variable definition
This PR removes a duplicated arch variable definition in the rootfs
script, as this variable and its value are already defined at the top
of the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-03 16:22:27 +00:00
Greg Kurz
96336d141b Merge pull request #10165 from pmores/add-network-device-hotplugging
runtime-rs: add network device hotplugging to qemu-rs
2024-10-03 17:44:50 +02:00
Pavel Mores
23927d8a94 runtime-rs: plug in netdev hotplugging functionality and actually call it
add_device() now checks if QEMU is running already by checking if we have
a QMP connection.  If we do a new function hotplug_device() is called
which hotplugs the device if it's a network one.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:23:10 +02:00
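The decision point can be sketched as follows (class and field names are illustrative, not the runtime-rs API):

```python
class QemuSketch:
    """add_device() routes to coldplug or hotplug depending on whether a
    QMP connection exists, i.e. whether QEMU is already running."""
    def __init__(self):
        self.qmp = None            # set once QEMU is up and QMP connects
        self.cmdline_devices = []  # coldplugged via the command line
        self.hotplugged = []       # hotplugged via QMP

    def add_device(self, dev):
        if self.qmp is None:
            self.cmdline_devices.append(dev)
        else:
            self.hotplugged.append(dev)
```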
Pavel Mores
ac393f6316 runtime-rs: implement netdev hotplugging for qemu-rs
With the helpers from previous commit, the actual hotplugging
implementation, though lengthy, is mostly just assembling a QMP command
to hotplug the network device backend and then doing the same for the
corresponding frontend.

Note that hotplug_network_device() takes the cmdline_generator types
Netdev and DeviceVirtioNet.  This is intentional and aims to take
advantage of the similarity between the parameter sets needed to
coldplug and hotplug devices, to reuse and simplify our code.  To
enable using the types from qmp, accessors were added as needed.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:20:02 +02:00
Pavel Mores
4eb7e2966c runtime-rs: add netdev hotplugging helpers to qemu-rs
Before adding network device hotplugging functionality itself we add
a couple of helpers in a separate commit since their functionality is
non-trivial.

To hotplug a device we need a free PCI slot.  We add find_free_slot()
which can be called to obtain one.  It looks for PCI bridges connected
to the root bridge and looks for an unoccupied slot on each of them.  The
first found is returned to the caller.  The algorithm explicitly doesn't
support any more complex bridge hierarchies since those are never produced
when coldplugging PCI bridges.

Sending netdev queue and vhost file descriptors to QEMU is slightly
involved and implemented in pass_fd().  The actual socket has to be passed
in an SCM_RIGHTS socket control message (also called ancillary data, see
man 3 cmsg) so we have to use the msghdr structure and sendmsg() call
(see man 2 sendmsg) to send the message.  Since qapi-rs doesn't support
sending messages with ancillary data we have to do the sending sort of
"under it", manually, by retrieving qapi-rs's socket and using it directly.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:15:31 +02:00
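The slot-search helper described above can be sketched as a first-fit scan over the bridges hanging off the root bridge (the data layout is illustrative):

```python
def find_free_slot(bridges: dict[str, list[bool]]):
    """Return the first (bridge_id, slot) whose slot is unoccupied,
    or None if every slot is taken.  Deeper bridge hierarchies are
    deliberately unsupported, since coldplugging never produces them."""
    for bridge_id, slots in bridges.items():
        for slot, occupied in enumerate(slots):
            if not occupied:
                return bridge_id, slot
    return None
```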
Pavel Mores
3f46dfcf2f runtime-rs: don't treat NetworkConfig::index as unique in qemu-rs
NetworkConfig::index has been used to generate an id for a network device
backend.  However, it turns out that it's not unique (it's always zero
as confirmed by a comment at its definition) so it's not suitable to
generate an id that needs to be unique.

Use the host device name instead.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:12:37 +02:00
Pavel Mores
cda04fa539 runtime-rs: factor setup of network device out of QemuCmdLine
Network device hotplugging will use the same infrastructure (Netdev,
DeviceVirtioNet) as coldplugging, i.e. QemuCmdLine.  To make the code
of network device setup visible outside of QemuCmdLine we factor it out
to a non-member function `get_network_device()` and make QemuCmdLine just
delegate to it.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:03:32 +02:00
Pavel Mores
efc8e93bfe runtime-rs: factor bus_type() out of QemuCmdLine
The function takes a whole QemuCmdLine but only actually uses
HypervisorConfig.  We make the function more reusable by limiting
its interface to what it needs.  This will come in handy shortly.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:03:32 +02:00
Pavel Mores
720265c2d8 runtime-rs: support adding PCI bridges to qemu VM
At least one PCI bridge is necessary to hotplug PCI devices.  We only
support PCI (at this point at least) since that's what the go runtime
does (note that looking at the code in virtcontainers it might seem that
other bus types are supported, however when the bridge objects are passed
to govmm, all but PCI bridges are actually ignored).  The entire logic of
bridge setup is lifted from runtime-go for compatibility's sake.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-10-03 11:03:32 +02:00
Lukáš Doktor
63b6e8a215 ci: Ensure we check the latest workflow run in gatekeeper
with multiple iterations/reruns we need to use the latest run of each
workflow. For that we can use the "run_id" and only update results of
the same or newer run_ids.

To do that we need to store the "run_id". To avoid adding individual
attributes, this commit stores the full job object, which contains the
status, conclusion, and other attributes of the individual jobs. This
might come in handy in the future, in exchange for a slightly bigger
memory overhead (still, we only store the latest run of required jobs).

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:10:45 +02:00
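The run_id-guarded update can be sketched as follows (the helper is illustrative; the job fields loosely follow the GitHub jobs API):

```python
def update_results(results: dict, job: dict) -> None:
    """Keep the full job object, but only accept it when it belongs to
    the same or a newer run_id, so a rerun's outcome is never shadowed
    by stale data from an earlier workflow run."""
    current = results.get(job["name"])
    if current is None or job["run_id"] >= current["run_id"]:
        results[job["name"]] = job
```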
Lukáš Doktor
2ae090b44b ci: Add extra gatekeeper debug output to stderr
which might be useful to assess the number of queries.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Lukáš Doktor
2440a39c50 ci: Check required lables before checking tests in gatekeeper
some tests require certain labels before they are executed. When our PR
is not labeled appropriately, the gatekeeper detects skipped required
tests and reports a failure. With this change we add "required-labels"
to the tests mapping and check the expected labels first, informing the
user about the missing labels before even checking the test statuses.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
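The label precheck amounts to the following (hypothetical helper; the real mapping lives in skips.py):

```python
def missing_labels(required: list[str], pr_labels: set[str]) -> list[str]:
    """Report which of a test's required labels the PR lacks, so the
    user is told about missing labels before test statuses are even
    examined."""
    return [label for label in required if label not in pr_labels]
```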
Lukáš Doktor
dd2878a9c8 ci: Unify character for separating items
the test names are using `;` while the regexps were designed to use `,`,
but during development the expressions were simply joined by `|`. This
should work but might be confusing, so let's go with the semicolon
separator everywhere.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta
fdcfac0641 workflows/gatekeeper: export COMMIT_HASH variable
The GitHub SHA of the triggering PR should be exported in the
environment so that gatekeeper can fetch the right workflows/jobs.

Note: by default GitHub exports GITHUB_SHA in the job's environment,
but that value cannot be used if gatekeeper was triggered from a
pull_request_target event, because the SHA corresponds to the push
branch.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta
4abfc11b4f workflows/gatekeeper: configure concurrency properly
This allows the gatekeeper jobs to be cancelled in progress.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Lukáš Doktor
5c1cea1601 ci: Select jobs by touched code
to allow selective testing, as well as a selective list of required
tests, let's add a mapping of required jobs/tests in "skips.py" and a
"gatekeeper" workflow that ensures the expected required jobs were
successful. Then we can mark only the "gatekeeper" as the required job
and modify the logic to suit our needs.

Fixes: #9237

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:33 +02:00
Dan Mihai
1a4928e710 gha: enable AUTO_GENERATE_POLICY where needed
The behavior of Kata CI doesn't change.

For local testing using kubernetes/gha-run.sh:

1. Before these changes:
- AUTO_GENERATE_POLICY=yes was always used by the users of SEV, SNP,
  TDX, or KATA_HOST_OS=cbl-mariner.

2. After these changes:
- Users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner must specify
  AUTO_GENERATE_POLICY=yes if they want to auto-generate policy.
- These users have the option to test just using hard-coded policies
  (e.g., using the default policy built into the Guest rootfs) by
  using AUTO_GENERATE_POLICY=no. AUTO_GENERATE_POLICY=no is the default
  value of this env variable.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-02 23:20:33 +00:00
Gabriela Cervantes
973b8a1d8f k8s:kbs: Add trap statement to clean up tmp files
This PR adds a trap statement to the confidential kbs script
to clean up temporary files and ensure we are not leaving them behind.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-02 19:59:08 +00:00
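The trap-based cleanup can be sketched like this; the helper names are hypothetical, not the actual contents of the kbs script:

```shell
# Register temporary files as they are created; the EXIT trap removes
# them on every exit path, including early failures.
tmp_files=()

cleanup_tmp_files() {
    [ "${#tmp_files[@]}" -gt 0 ] && rm -f "${tmp_files[@]}"
    return 0
}
trap cleanup_tmp_files EXIT

register_tmp_file() {
    tmp_files+=("$1")
}
```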
Steve Horsman
8412c09143 Merge pull request #10371 from fidencio/topic/k8s-tdx-re-enable-empty-dir-tests
k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev
2024-10-02 18:41:19 +01:00
Dan Mihai
9a8341f431 Merge pull request #10370 from microsoft/danmihai1/k8s-policy-rc
tests: k8s-policy-rc: remove default UID from YAML
2024-10-02 09:32:17 -07:00
GabyCT
a1d380305c Merge pull request #10369 from GabyCT/topic/egrepfastf
metrics: Update fast footprint script to use grep
2024-10-02 10:10:12 -06:00
Fabiano Fidêncio
b3ed7830e4 k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev
The test is disabled for qemu-coco-dev / qemu-tdx, but it doesn't seem
to actually be failing on those.  Plus, it's passing on SEV / SNP, which
means that we most likely missed re-enabling this one in the past.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-01 20:51:01 +02:00
Hyounggyu Choi
b179598fed Merge pull request #10374 from BbolroC/skip-block-volume-qemu-runtime-rs
tests: Skip k8s-block-volume.bats for qemu-runtime-rs
2024-10-01 19:45:10 +02:00
Lukáš Doktor
820e000f1c ci.ocp: Sort images according to git
The quay.io registry returns the tags sorted alphabetically and doesn't
seem to provide a way to sort them by age. Let's use "git log" to get all
changes between the commits and print all tags that were actually
pushed.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-01 16:08:00 +02:00
Hyounggyu Choi
4ccf1f29f9 tests: Skip k8s-block-volume.bats for qemu-runtime-rs
Currently, `qemu-runtime-rs` does not support `virtio-scsi`,
which causes the `k8s-block-volume.bats` test to fail.
We should skip this test until `virtio-scsi` is supported by the runtime.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-01 09:09:47 +02:00
Dan Mihai
3b24219310 tests: k8s-policy-rc: remove default UID from YAML
The nginx container seems to error out when using UID=123.

Depending on the timing between container initialization and "kubectl
wait", the test might have gotten lucky and found the pod briefly in
Ready state before nginx errored out. But on some of the nodes, the pod
never got reported as Ready.

Also, don't block in "kubectl wait --for=condition=Ready" when wrapping
that command in a waitForProcess call, because waitForProcess is
designed for short-lived commands.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-01 00:10:30 +00:00
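The pattern the commit argues for — short-lived probes instead of wrapping a long-blocking `kubectl wait` in waitForProcess — can be sketched as below. `KUBECTL` is overridable only to keep the sketch testable; the real tests call kubectl directly, and the helper name is an assumption:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Poll readiness with quick individual calls so the loop stays responsive.
wait_pod_ready() {
    local pod="$1" timeout="${2:-120}" elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # each probe finishes quickly, unlike a blocking "kubectl wait"
        if [ "$($KUBECTL get pod "$pod" -o \
              jsonpath='{.status.conditions[?(@.type=="Ready")].status}')" = "True" ]; then
            return 0
        fi
        sleep 5
        elapsed=$((elapsed + 5))
    done
    return 1
}
```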
Saul Paredes
94bc54f4d2 Merge pull request #10340 from microsoft/saulparedes/validate_create_sandbox_storages
genpolicy: validate create sandbox storages
2024-09-30 14:24:56 -07:00
Aurélien Bombo
b49800633d Merge pull request #7165 from sprt/k8s-block-volume-test
tests: Add `k8s-block-volume` test to GHA CI
2024-09-30 13:26:18 -07:00
Dan Mihai
7fe44d3a3d genpolicy: validate create sandbox storages
Reject any unexpected values from the CreateSandboxRequest storages
field.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-09-30 11:31:12 -07:00
Gabriela Cervantes
52ef092489 metrics: Update fast footprint script to use grep
This PR updates the fast footprint script to remove the use
of egrep, as this command has been deprecated, and changes it
to use the grep command.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-30 17:43:08 +00:00
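The mechanical part of such a change: egrep is a deprecated alias for `grep -E`, so patterns move over unchanged. The pattern and helper below are illustrative, not taken from the footprint script:

```shell
# before: egrep 'MemFree|MemAvailable' "$1"
grep_mem() {
    grep -E 'MemFree|MemAvailable' "$1"
}
```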
Aurélien Bombo
c037ac0e82 tests: Add k8s-block-volume test
This imports the k8s-block-volume test from the tests repo and modifies
it slightly to set up the host volume on the AKS host.

This is a follow-up to #7132.

Fixes: #7164

Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-09-30 10:58:30 -05:00
Alex Lyn
dfd0ca9bfe Merge pull request #10312 from sidneychang/configurable-build-dragonball
runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs
2024-09-29 22:33:54 +08:00
GabyCT
6a9e3ccddf Merge pull request #10305 from GabyCT/topic/ita
ci:tdx: Use an ITA key for TDX
2024-09-27 16:44:53 -06:00
Fabiano Fidêncio
66bcfe7369 k8s: kbs: Properly delete ita kustomization
The ita kustomization for Trustee, as well as previously used one
(DCAP), doesn't have a $(uname -m) directory after the deployment
directory name.

Let's follow the same logic used for the deploy-kbs script and clean
those up accordingly.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-27 21:47:29 +02:00
Gabriela Cervantes
bafa527be0 ci: tdx: Test attestation with ITTS
Intel Tiber Trust Services (formerly known as Intel Trust Authority) is
Intel's own attestation service, and we want to take advantage of the
TDX CI in order to ensure ITTS works as expected.

In order to do so, let's replace the formerly used method (DCAP) with
ITTS.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-27 21:47:25 +02:00
GabyCT
36750b56f1 Merge pull request #10342 from GabyCT/topic/updevguide
docs: Remove qemu information no longer valid
2024-09-27 11:15:11 -06:00
Fabiano Fidêncio
86b8c53d27 Merge pull request #10357 from fidencio/topic/add-ita-secret
gha: Add ita_key as a github secret
2024-09-27 17:40:41 +02:00
Gabriela Cervantes
d91979d7fa gha: Add ita_key as a github secret
This PR adds ita_key as a github secret at the kata coco tests yaml workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-27 17:15:22 +02:00
Xuewei Niu
ad0f2b2a55 Merge pull request #10219 from sidneychang/decouple-runtime-rs-from-dragonball
runtime-rs: Port TAP implementation from dragonball
2024-09-27 11:17:55 +08:00
Xuewei Niu
11b1a72442 Merge pull request #10349 from lifupan/main_nsandboxapi
sandbox: refactor the sandbox init process
2024-09-27 11:10:45 +08:00
Xuewei Niu
3911bd3108 Merge pull request #10351 from lifupan/main_agent
agent: fix the issue of setup sandbox pidns
2024-09-27 10:49:47 +08:00
Fupan Li
f7bc627a86 sandbox: refactor the sandbox init process
In order to support the sandbox API, introduce the sandbox_config
struct and split the sandbox start stage from the init process.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-26 23:50:24 +08:00
Hyounggyu Choi
b1275bed1b Merge pull request #10346 from BbolroC/minor-improvement-k8s-tests
tests: Minor improvement k8s tests
2024-09-26 17:01:32 +02:00
Hyounggyu Choi
01d460ac63 tests: Add teardown_common() to tests_common.sh
There are many similar or duplicated code patterns in `teardown()`.
This commit consolidates them into a new function, `teardown_common()`,
which is now called within `teardown()`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-26 13:56:36 +02:00
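The shape of the consolidation might look like this; the individual cleanup steps below stand in for the duplicated `teardown()` bodies and are not the actual contents of `tests_common.sh`:

```shell
# Shared cleanup: dump debug state first so failures can be diagnosed
# from the logs, then remove whatever the test created.
teardown_common() {
    kubectl describe pods >&2 || true
    kubectl delete pod --all --ignore-not-found || true
}

# Each test's teardown() now just delegates to the common helper.
teardown() {
    teardown_common
}
```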
Hyounggyu Choi
e8d1feb25f tests: Validate node name for exec_host()
The current `exec_host()` accepts a given node name and
creates a node debugger pod, even if the name is invalid.
This could result in the creation of an unnecessary pending
pod (since we are using nodeAffinity; if the given name
does not match any actual node names, the pod won’t be scheduled),
which wastes resources.

This commit introduces validation for the node name to
prevent this situation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-26 13:20:50 +02:00
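The validation step can be sketched as below: reject a node name that matches no actual node before creating the debugger pod, since nodeAffinity would otherwise leave it pending forever. `KUBECTL` is overridable only to keep the sketch testable, and the helper name is an assumption:

```shell
KUBECTL="${KUBECTL:-kubectl}"

validate_node_name() {
    local node="$1"
    # a nonexistent node would produce a forever-pending debugger pod
    if ! $KUBECTL get node "$node" >/dev/null 2>&1; then
        echo "invalid node name: ${node}" >&2
        return 1
    fi
    return 0
}
```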
Xuewei Niu
3a7f9595b6 Merge pull request #10318 from lsc2001/ci-add-docker
ci: Enable basic docker tests for runtime-rs
2024-09-26 17:41:09 +08:00
Xuewei Niu
cb5a2b30e9 Merge pull request #10293 from lsc2001/solve-docker-compatibility
runtime-rs: Notify containerd when process exits
2024-09-26 14:51:20 +08:00
Sicheng Liu
e4733748aa ci: Enable basic docker tests for runtime-rs
This commit enables basic amd64 tests of docker for runtime-rs by adding
vmm types "dragonball" and "cloud-hypervisor".

Signed-off-by: Sicheng Liu <lsc2001@outlook.com>
2024-09-26 06:27:05 +00:00
Sicheng Liu
08eb5fc7ff runtime-rs: Notify containerd when process exits
Docker cannot exit normally after the container process exits when
used with runtime-rs, since it doesn't receive the exit event. This
commit enables runtime-rs to send TaskExit to containerd after the
process exits.

Also, it moves "system_time_into" and "option_system_time_into" from
crates/runtimes/common/src/types/trans_into_shim.rs to a new utility
mod.

Signed-off-by: Sicheng Liu <lsc2001@outlook.com>
2024-09-26 02:52:50 +00:00
Fupan Li
71afeccdf1 agent: fix the issue of setup sandbox pidns
When the sandbox API is enabled, the pause container
isn't created, so the shared sandbox pidns
should fall back to the first container's init process
instead of returning an error here.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-26 10:21:25 +08:00
Xuewei Niu
857222af02 Merge pull request #10330 from lifupan/main_sandboxapi
Some prepared work for sandbox api support
2024-09-26 09:47:47 +08:00
Hyounggyu Choi
caf3b19505 Merge pull request #10348 from BbolroC/delete-node-debugger-by-trap
tests: Delete custom node debugger pod on EXIT
2024-09-25 23:39:43 +02:00
Hyounggyu Choi
57e8cbff6f tests: Delete custom node debugger pod on EXIT
It was observed that the custom node debugger pod is not
cleaned up when a test times out.
This commit ensures the pod is cleaned up by triggering
the cleanup on EXIT, preventing any debugger pods from
being left behind.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-25 20:36:05 +02:00
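The EXIT-trap pattern the commit describes is sketched below; the pod name is an assumption for illustration:

```shell
delete_node_debugger() {
    kubectl delete pod custom-node-debugger --ignore-not-found || true
}
# EXIT fires on normal completion and on errors alike, so the debugger
# pod is removed on every one of those paths instead of being left behind
trap delete_node_debugger EXIT
```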
Fabiano Fidêncio
edf4ca4738 Merge pull request #10345 from ldoktor/kata-webhook
ci: Reorder webhook deployment
2024-09-25 18:16:46 +02:00
Fabiano Fidêncio
09ed9c5c50 Merge pull request #10328 from BbolroC/improve-negative-tests
tests: Improve k8s negative tests
2024-09-25 18:16:28 +02:00
Xuewei Niu
e1825c2ef3 Merge pull request #9977 from l8huang/dan-2-vfio
runtime: add DAN support for VFIO network device in Go kata-runtime
2024-09-25 10:11:38 +08:00
Lei Huang
39b0e9aa8f runtime: add DAN support for VFIO network device in Go kata-runtime
When using network adapters that support SR-IOV, a VFIO device can be
plugged into a guest VM and claimed as a network interface. This can
significantly enhance network performance.

Fixes: #9758

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-09-24 09:53:28 -07:00
Hyounggyu Choi
c70588fafe tests: Use custom-node-debugger pod
With #10232 merged, we now have a persistent node debugger pod throughout the test.
As a result, there’s no need to spawn another debugger pod using `kubectl debug`,
which could lead to false negatives due to premature pod termination, as reported
in #10081.

This commit removes the `print_node_journal()` call that uses `kubectl debug` and
instead uses `exec_host()` to capture the host journal. The `exec_host()` function
is relocated to `tests/integration/kubernetes/lib.sh` to prevent cyclical dependencies
between `tests_common.sh` and `lib.sh`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-24 17:25:24 +02:00
Lukáš Doktor
8355eee9f5 ci: Reorder webhook deployment
in b9d88f74ed the `runtime_class` CM was
added, which overrides the one we previously set. Let's reorder our logic
to first deploy the webhook and then override the default CM, in order to
use the one we really want.

Since we need to change directories, we also have to use realpath to
ensure the files are located correctly.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-09-24 17:01:28 +02:00
Hyounggyu Choi
2c2941122c tests: Fail fast in assert_pod_fail()
`assert_pod_fail()` currently calls `k8s_create_pod()` to ensure that a pod
does not become ready within the default 120s. However, this delays the test's
completion even if an error message is detected earlier in the journal.

This commit removes the use of `k8s_create_pod()` and modifies `assert_pod_fail()`
to fail as soon as the pod enters a failed state.

All failing pods end up in one of the following states:

- CrashLoopBackOff
- ImagePullBackOff

The function now polls the pod's state every 5 seconds to check for these conditions.
If the pod enters a failed state, the function immediately returns 0. If the pod
does not reach a failed state within 120 seconds, it returns 1.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-24 16:09:20 +02:00
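The fail-fast loop described above can be sketched as follows; `KUBECTL` is overridable only to keep the sketch testable, the helper name is an assumption, and the 5s/120s values mirror the commit:

```shell
KUBECTL="${KUBECTL:-kubectl}"

assert_pod_fail_fast() {
    local pod="$1" timeout="${2:-120}" elapsed=0 reason
    while [ "$elapsed" -lt "$timeout" ]; do
        reason=$($KUBECTL get pod "$pod" -o \
            jsonpath='{.status.containerStatuses[0].state.waiting.reason}')
        case "$reason" in
            CrashLoopBackOff|ImagePullBackOff)
                return 0    # the pod failed as expected: stop waiting
                ;;
        esac
        sleep 5
        elapsed=$((elapsed + 5))
    done
    return 1    # the pod never reached a failed state
}
```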
Gabriela Cervantes
6a8b137965 docs: Remove qemu information no longer valid
This PR removes some qemu information which is no longer valid, as
it refers to the tests repository and to kata 1.x.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-23 16:58:24 +00:00
Aurélien Bombo
e738054ddb Merge pull request #10311 from pawelpros/pproskur/fixyq
ci: don't require sudo for yq if already installed
2024-09-23 08:57:11 -07:00
Alex Lyn
6b94cc47a8 Merge pull request #10146 from Apokleos/intro-cdi
Introduce cdi in runtime-rs
2024-09-23 21:45:42 +08:00
Alex Lyn
b8ba346e98 runtime-rs: Add test for container devices with CDI.
Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-23 17:20:22 +08:00
Steve Horsman
0e0cb24387 Merge pull request #10329 from Bickor/webhook-check
tools.kata-webhook: Specify runtime class using configMap
2024-09-23 09:59:12 +01:00
Steve Horsman
6f0b3eb2f9 Merge pull request #10337 from stevenhorsman/update-release-process-post-3.9.0
doc: Update the release process
2024-09-23 09:55:57 +01:00
Hyounggyu Choi
8a893cd4ee Merge pull request #10232 from BbolroC/fix-loop-device-for-exec_host
tests: Fix loop device handling for exec_host()
2024-09-23 08:15:03 +02:00
Fupan Li
f1f5bef9ef Merge pull request #10339 from lifupan/main_fix
runtime-rs: fix the issue of using block_on
2024-09-23 09:28:40 +08:00
Fupan Li
52397ca2c1 sandbox: rename the task_service to service
rename the task_service to service, in order to
incopperate with the following added sandbox
services.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:44:19 +08:00
Fupan Li
20b4be0225 runtime-rs: rename the Request/Response to TaskRequest/TaskResponse
To distinguish them from the sandbox request/response, this commit
renames the task request/response to TaskRequest/TaskResponse.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:44:11 +08:00
Fupan Li
ba94eed891 sandbox: fix the issue of hypervisor's wait_vm
Since wait_vm is called before stop_vm and takes the reader
lock, it blocks stop_vm from getting the writer lock, which
triggers a deadlock.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:44:03 +08:00
Fupan Li
fb27de3561 runtime-rs: fix the issue of using block_on
Since block_on blocks the current thread, it prevents other
async tasks from running on this worker thread; change it to
use an async task instead.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-09-22 14:40:44 +08:00
Aurélien Bombo
79a3b4e2e5 Merge pull request #10335 from kata-containers/sprt/fix-kata-deploy-docs
kata-deploy: clean up and fix docs for k0s
2024-09-20 13:33:14 -07:00
stevenhorsman
4f745f77cb doc: Update the release process
- Reflect the need to update the versions in the Helm Chart
- Add the lock branch instruction
- Add clarity about the permissions needed to complete tasks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-20 19:04:33 +01:00
Aurélien Bombo
78c63c7951 kata-deploy: clean up and fix docs for k0s
* Clarifies instructions for k0s.
* Adds kata-deploy step for each cluster type.
* Removes the old kata-deploy-stable step for vanilla k8s.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-09-20 11:59:40 -05:00
sidney chang
456e13db98 runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs
rename DEFAULT_HYPERVISOR to HYPERVISOR in Makefile
Fixes #10310

Signed-off-by: sidney chang <2190206983@qq.com>
2024-09-20 05:41:34 -07:00
sidneychang
b85a886694 runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs
This PR introduces support for selectively compiling Dragonball in
runtime-rs. By default, Dragonball will continue to be compiled into
the containerd-shim-kata-v2 executable, but users now have the option
to disable Dragonball compilation.

Fixes #10310

Signed-off-by: sidney chang <2190206983@qq.com>
2024-09-20 05:38:59 -07:00
Hyounggyu Choi
2d6ac3d85d tests: Re-enable guest-pull-image tests for qemu-coco-dev
Now that the issue with handling loop devices has been resolved,
this commit re-enables the guest-pull-image tests for `qemu-coco-dev`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
c6b86e88e4 tests: Increase timeouts for qemu-coco-dev in trusted image storage tests
Timeouts occur (e.g. `create_container_timeout` and `wait_time`)
when using qemu-coco-dev.
This commit increases these timeouts for the trusted image storage
test cases

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
9cff9271bc tests: Run all commands in *_loop_device() using exec_host()
If the host running the tests is different from the host where the cluster is running,
the *_loop_device() functions do not work as expected because the device is created
on the test host, while the cluster expects the device to be local.

This commit ensures that all commands for the relevant functions are executed via exec_host()
so that a device should be handled on a cluster node.

Additionally, it modifies exec_host() to return the exit code of the last executed command
because the existing logic with `kubectl debug` sometimes includes unexpected characters
that are difficult to handle. `kubectl exec` appears to properly return the exit code for
a given command to it.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
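A minimal sketch of the exit-code behaviour the commit relies on: `kubectl exec` propagates the remote command's exit status, which the previous `kubectl debug` based approach did not do reliably. `KUBECTL` is overridable only to keep the sketch testable; the pod naming and the chroot invocation are assumptions:

```shell
KUBECTL="${KUBECTL:-kubectl}"

exec_host() {
    local node="$1"; shift
    # the function's exit status is the remote command's exit status
    $KUBECTL exec "custom-node-debugger-${node}" -- chroot /host sh -c "$*"
}
```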
Hyounggyu Choi
374b8d2534 tests: Create and delete node debugger pod only once
Creating and deleting a node debugger pod for every `exec_host()`
call is inefficient.
This commit changes the test suite to create and delete the pod
only once, globally.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Hyounggyu Choi
aedf14b244 tests: Mimic node debugger with full privileges
This commit addresses an issue with handling loop devices
via a node debugger due to restricted privileges.
It runs a pod with full privileges, allowing it to mount
the host root to `/host`, similar to the node debugger.
This change enables us to run tests for trusted image storage
using the `qemu-coco-dev` runtime class.

Fixes: #10133

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-20 14:37:43 +02:00
Alex Lyn
63b25e8cb0 runtime-rs: Introduce cdi devices in container creation
Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-20 09:28:51 +08:00
Alex Lyn
03735d78ec runtime-rs: add cdi devices definition and related methods
Add CDI devices, including the ContainerDevice definition and the
annotation_container_device method, to annotate a VFIO device in the
OCI Spec annotations; the device is inserted into the guest with its
mapping of vendor-class to guest PCI path.

Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-20 09:28:51 +08:00
Alex Lyn
020e3da9b9 runtime-rs: extend DeviceVendor with device class
We need the VFIO device's device, vendor and class properties,
but we can currently only get device and vendor;
just extending it with class is enough.

Fixes #10145

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-20 09:28:51 +08:00
Fabiano Fidêncio
77c844da12 Merge pull request #10239 from fidencio/topic/remove-acrn
acrn: Drop support
2024-09-19 23:10:29 +02:00
GabyCT
6eef58dc3e Merge pull request #10336 from GabyCT/topic/extendtimeout
gha: Increase timeout to run k8s tests on TDX
2024-09-19 13:12:55 -06:00
Martin
b9d88f74ed tools.kata-webhook: Specify runtime class using configMap
The kata webhook requires a configmap to define what runtime class it
should set for the newly created pods. Additionally, the configmap
allows others to modify the default runtime class name we wish to set
(in case the handler is kata but the name of the runtimeclass is
different).

Finally, this PR changes the webhook-check to compare the runtime of the
newly created pod against the specific runtime class in the configmap;
if said configmap doesn't exist, it will default to "kata".

Signed-off-by: Martin <mheberling@microsoft.com>
2024-09-19 11:51:38 -07:00
Fabiano Fidêncio
51dade3382 docs: Fix spell checker
tokio is not a valid word, it seems, so let's use `tokio`.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 20:25:21 +02:00
Gabriela Cervantes
49b3a0faa3 gha: Increase timeout to run k8s tests on TDX
This PR increases the timeout to run k8s tests for Kata CoCo TDX
to avoid the random failures of timeout.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-19 17:15:47 +00:00
Fabiano Fidêncio
31438dba79 docs: Fix qemu link
Otherwise static checks will fail, as we woke up the dogs with changes
on the same file.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 16:05:43 +02:00
Fabiano Fidêncio
fefcf7cfa4 acrn: Drop support
As we don't have any CI, nor maintainer to keep ACRN code around, we
better have it removed than give users the expectation that it should or
would work at some point.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 16:05:43 +02:00
Pawel Proskurnicki
b63d49b34a ci: don't require sudo for yq if already installed
The yq installation shouldn't force the use of sudo when yq is already installed in the correct version.

Signed-off-by: Pawel Proskurnicki <pawel.proskurnicki@intel.com>
2024-09-18 11:01:07 +02:00
sidney chang
5a7d0ed3ad runtime-rs: introduce tap in hypervisor by extracting it from dragonball
It's a prerequisite PR for making the compilation options of the
built-in VMM Dragonball configurable.

Extract TAP device-related code from dragonball's dbs_utils into a
separate library within the runtime-rs hypervisor module.
To enhance functionality and reduce dependencies, the extracted code
has been reimplemented using the libc crate and the ifreq structure.

Fixes #10182

Signed-off-by: sidney chang <2190206983@qq.com>
2024-09-13 07:32:14 -07:00
179 changed files with 4226 additions and 4440 deletions

View File

@@ -260,6 +260,8 @@ jobs:
vmm:
- clh
- qemu
- dragonball
- cloud-hypervisor
runs-on: ubuntu-22.04
env:
KATA_HYPERVISOR: ${{ matrix.vmm }}

View File

@@ -94,10 +94,10 @@ jobs:
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
- name: Install protobuf-compiler
if: ${{ matrix.command != 'make vendor' && (matrix.component == 'agent' || matrix.component == 'runk' || matrix.component == 'genpolicy') }}
if: ${{ matrix.command != 'make vendor' && (matrix.component == 'agent' || matrix.component == 'runk' || matrix.component == 'genpolicy' || matrix.component == 'agent-ctl') }}
run: sudo apt-get -y install protobuf-compiler
- name: Install clang
if: ${{ matrix.command == 'make check' && matrix.component == 'agent' }}
if: ${{ matrix.command == 'make check' && (matrix.component == 'agent' || matrix.component == 'agent-ctl') }}
run: sudo apt-get -y install clang
- name: Setup XDG_RUNTIME_DIR for the `runtime` tests
if: ${{ matrix.command != 'make vendor' && matrix.command != 'make check' && matrix.component == 'runtime' }}

View File

@@ -24,6 +24,11 @@ on:
jobs:
build-asset:
runs-on: ubuntu-22.04
permissions:
contents: read
packages: write
id-token: write
attestations: write
strategy:
matrix:
asset:
@@ -50,11 +55,10 @@ jobs:
- stratovirt
- rootfs-image
- rootfs-image-confidential
- rootfs-image-mariner
- rootfs-initrd
- rootfs-initrd-confidential
- rootfs-initrd-mariner
- runk
- shim-v2
- trace-forwarder
- virtiofsd
stage:
@@ -62,6 +66,8 @@ jobs:
exclude:
- asset: cloud-hypervisor-glibc
stage: release
env:
PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -83,6 +89,7 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
id: build
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
@@ -98,6 +105,34 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: Parse OCI image name and digest
id: parse-oci-segments
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
run: |
oci_image="$(<"build/${{ matrix.asset }}-oci-image")"
echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"
echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"
- uses: oras-project/setup-oras@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
version: "1.2.0"
# for pushing attestations to the registry
- uses: docker/login-action@v3
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/attest-build-provenance@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}
subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
if: ${{ matrix.stage != 'release' || (matrix.asset != 'agent' && matrix.asset != 'coco-guest-components' && matrix.asset != 'pause-image') }}
uses: actions/upload-artifact@v4
@@ -107,9 +142,57 @@ jobs:
retention-days: 15
if-no-files-found: error
create-kata-tarball:
build-asset-shim-v2:
runs-on: ubuntu-22.04
needs: build-asset
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
id: build
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-amd64-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 15
if-no-files-found: error
create-kata-tarball:
runs-on: ubuntu-22.04
needs: [build-asset, build-asset-shim-v2]
steps:
- uses: actions/checkout@v4
with:

View File

@@ -37,7 +37,6 @@ jobs:
- stratovirt
- rootfs-image
- rootfs-initrd
- shim-v2
- virtiofsd
steps:
- name: Login to Kata Containers quay.io
@@ -84,9 +83,56 @@ jobs:
retention-days: 15
if-no-files-found: error
create-kata-tarball:
build-asset-shim-v2:
runs-on: arm64-builder
needs: build-asset
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-arm64-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 15
if-no-files-found: error
create-kata-tarball:
runs-on: arm64-builder
needs: [build-asset, build-asset-shim-v2]
steps:
- name: Adjust a permission for repo
run: |

View File

@@ -31,7 +31,6 @@ jobs:
- kernel
- qemu
- rootfs-initrd
- shim-v2
- virtiofsd
stage:
- ${{ inputs.stage }}
@@ -85,9 +84,61 @@ jobs:
retention-days: 1
if-no-files-found: error
create-kata-tarball:
build-asset-shim-v2:
runs-on: ppc64le
needs: build-asset
steps:
- name: Prepare the self-hosted runner
run: |
${HOME}/scripts/prepare_runner.sh
sudo rm -rf $GITHUB_WORKSPACE/*
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-ppc64le-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 1
if-no-files-found: error
create-kata-tarball:
runs-on: ppc64le
needs: [build-asset, build-asset-shim-v2]
steps:
- name: Adjust a permission for repo
run: |

View File

@@ -24,6 +24,11 @@ on:
jobs:
build-asset:
runs-on: s390x
permissions:
contents: read
packages: write
id-token: write
attestations: write
strategy:
matrix:
asset:
@@ -37,8 +42,9 @@ jobs:
- rootfs-image-confidential
- rootfs-initrd
- rootfs-initrd-confidential
- shim-v2
- virtiofsd
env:
PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
@@ -60,6 +66,7 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build ${{ matrix.asset }}
id: build
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
@@ -75,6 +82,29 @@ jobs:
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: Parse OCI image name and digest
id: parse-oci-segments
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
run: |
oci_image="$(<"build/${{ matrix.asset }}-oci-image")"
echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"
echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"
# for pushing attestations to the registry
- uses: docker/login-action@v3
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/attest-build-provenance@v1
if: ${{ env.PERFORM_ATTESTATION == 'yes' }}
with:
subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}
subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}
push-to-registry: true
- name: store-artifact ${{ matrix.asset }}
if: ${{ inputs.stage != 'release' || (matrix.asset != 'agent' && matrix.asset != 'coco-guest-components' && matrix.asset != 'pause-image') }}
uses: actions/upload-artifact@v4
@@ -132,9 +162,57 @@ jobs:
retention-days: 1
if-no-files-found: error
build-asset-shim-v2:
runs-on: s390x
needs: build-asset
steps:
- name: Login to Kata Containers quay.io
if: ${{ inputs.push-to-registry == 'yes' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0 # This is needed in order to keep the commit ids history
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Build shim-v2
id: build
run: |
make "${KATA_ASSET}-tarball"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
mkdir -p kata-build && cp "${build_dir}"/kata-static-${KATA_ASSET}*.tar.* kata-build/.
env:
KATA_ASSET: shim-v2
TAR_OUTPUT: shim-v2.tar.gz
PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}
ARTEFACT_REGISTRY: ghcr.io
ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}
ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
TARGET_BRANCH: ${{ inputs.target-branch }}
RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}
- name: store-artifact shim-v2
uses: actions/upload-artifact@v4
with:
name: kata-artifacts-s390x-shim-v2${{ inputs.tarball-suffix }}
path: kata-build/kata-static-shim-v2.tar.xz
retention-days: 15
if-no-files-found: error
create-kata-tarball:
runs-on: s390x
needs: [build-asset, build-asset-boot-image-se]
needs: [build-asset, build-asset-boot-image-se, build-asset-shim-v2]
steps:
- uses: actions/checkout@v4
with:

.github/workflows/ci-devel.yaml vendored Normal file
View File

@@ -0,0 +1,13 @@
name: Kata Containers CI (manually triggered)
on:
workflow_dispatch:
jobs:
kata-containers-ci-on-push:
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.sha }}
pr-number: "dev"
tag: ${{ github.sha }}-dev
target-branch: ${{ github.ref_name }}
secrets: inherit

View File

@@ -2,7 +2,6 @@ name: Kata Containers Nightly CI
on:
schedule:
- cron: '0 0 * * *'
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

View File

@@ -19,12 +19,21 @@ concurrency:
cancel-in-progress: true
jobs:
kata-containers-ci-on-push:
skipper:
if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}
uses: ./.github/workflows/gatekeeper-skipper.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
kata-containers-ci-on-push:
needs: skipper
if: ${{ needs.skipper.outputs.skip_build != 'yes' }}
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
pr-number: ${{ github.event.pull_request.number }}
tag: ${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
skip-test: ${{ needs.skipper.outputs.skip_test }}
secrets: inherit

View File

@@ -15,6 +15,10 @@ on:
required: false
type: string
default: ""
skip-test:
required: false
type: string
default: no
jobs:
build-kata-static-tarball-amd64:
@@ -132,6 +136,7 @@ jobs:
file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile
run-kata-monitor-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-kata-monitor-tests.yaml
with:
@@ -140,6 +145,7 @@ jobs:
target-branch: ${{ inputs.target-branch }}
run-k8s-tests-on-aks:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-aks.yaml
with:
@@ -153,6 +159,7 @@ jobs:
secrets: inherit
run-k8s-tests-on-amd64:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-amd64
uses: ./.github/workflows/run-k8s-tests-on-amd64.yaml
with:
@@ -165,6 +172,7 @@ jobs:
secrets: inherit
run-kata-coco-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-kata-coco-tests.yaml
with:
@@ -177,6 +185,7 @@ jobs:
secrets: inherit
run-k8s-tests-on-zvsi:
if: ${{ inputs.skip-test != 'yes' }}
needs: [publish-kata-deploy-payload-s390x, build-and-publish-tee-confidential-unencrypted-image]
uses: ./.github/workflows/run-k8s-tests-on-zvsi.yaml
with:
@@ -189,6 +198,7 @@ jobs:
secrets: inherit
run-k8s-tests-on-ppc64le:
if: ${{ inputs.skip-test != 'yes' }}
needs: publish-kata-deploy-payload-ppc64le
uses: ./.github/workflows/run-k8s-tests-on-ppc64le.yaml
with:
@@ -200,6 +210,7 @@ jobs:
target-branch: ${{ inputs.target-branch }}
run-metrics-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/run-metrics.yaml
with:
@@ -208,6 +219,7 @@ jobs:
target-branch: ${{ inputs.target-branch }}
run-basic-amd64-tests:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-amd64
uses: ./.github/workflows/basic-ci-amd64.yaml
with:
@@ -216,6 +228,7 @@ jobs:
target-branch: ${{ inputs.target-branch }}
run-cri-containerd-tests-s390x:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-s390x
uses: ./.github/workflows/run-cri-containerd-tests-s390x.yaml
with:
@@ -224,6 +237,7 @@ jobs:
target-branch: ${{ inputs.target-branch }}
run-cri-containerd-tests-ppc64le:
if: ${{ inputs.skip-test != 'yes' }}
needs: build-kata-static-tarball-ppc64le
uses: ./.github/workflows/run-cri-containerd-tests-ppc64le.yaml
with:

View File

@@ -0,0 +1,52 @@
name: Skipper
# This workflow sets various "skip_*" output values that can be used to
# determine what workflows/jobs are expected to be executed. Sample usage:
#
# skipper:
# uses: ./.github/workflows/gatekeeper-skipper.yaml
# with:
# commit-hash: ${{ github.event.pull_request.head.sha }}
# target-branch: ${{ github.event.pull_request.base.ref }}
#
# your-workflow:
# needs: skipper
# if: ${{ needs.skipper.outputs.skip_build != 'yes' }}
on:
workflow_call:
inputs:
commit-hash:
required: true
type: string
target-branch:
required: false
type: string
default: ""
outputs:
skip_build:
value: ${{ jobs.skipper.outputs.skip_build }}
skip_test:
value: ${{ jobs.skipper.outputs.skip_test }}
skip_static:
value: ${{ jobs.skipper.outputs.skip_static }}
jobs:
skipper:
runs-on: ubuntu-latest
outputs:
skip_build: ${{ steps.skipper.outputs.skip_build }}
skip_test: ${{ steps.skipper.outputs.skip_test }}
skip_static: ${{ steps.skipper.outputs.skip_static }}
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
- id: skipper
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
run: |
python3 tools/testing/gatekeeper/skips.py | tee -a "$GITHUB_OUTPUT"
shell: /usr/bin/bash -x {0}
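The `tee -a "$GITHUB_OUTPUT"` pattern above works because the runner parses `key=value` lines appended to that file into step outputs. A minimal sketch of the mechanism, using a temporary file as a stand-in for the runner-provided one (the `skip_*` values here are illustrative, not the real output of `skips.py`):

```shell
# Stand-in for the runner-provided $GITHUB_OUTPUT file.
GITHUB_OUTPUT=$(mktemp)
# What a helper like skips.py would emit: one "key=value" line per output.
printf 'skip_build=no\nskip_test=yes\nskip_static=no\n' | tee -a "$GITHUB_OUTPUT"
# Everything appended here becomes steps.<id>.outputs.<key> in later jobs.
grep '^skip_test=' "$GITHUB_OUTPUT"
```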

.github/workflows/gatekeeper.yaml vendored Normal file
View File

@@ -0,0 +1,44 @@
name: Gatekeeper
# Gatekeeper uses the "skips.py" to determine which job names/regexps are
# required for given PR and waits for them to either complete or fail
# reporting the status.
on:
pull_request_target:
types:
- opened
- synchronize
- reopened
- labeled
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
gatekeeper:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- id: gatekeeper
env:
TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
COMMIT_HASH: ${{ github.event.pull_request.head.sha }}
PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
#!/usr/bin/env bash -x
mapfile -t lines < <(python3 tools/testing/gatekeeper/skips.py -t)
export REQUIRED_JOBS="${lines[0]}"
export REQUIRED_REGEXPS="${lines[1]}"
export REQUIRED_LABELS="${lines[2]}"
echo "REQUIRED_JOBS: $REQUIRED_JOBS"
echo "REQUIRED_REGEXPS: $REQUIRED_REGEXPS"
echo "REQUIRED_LABELS: $REQUIRED_LABELS"
python3 tools/testing/gatekeeper/jobs.py
exit $?
shell: /usr/bin/bash -x {0}
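The `mapfile` call above turns each stdout line of `skips.py -t` into one array element, which is how the three `REQUIRED_*` variables get populated in order. A small self-contained sketch of that pattern (the three values are hypothetical stand-ins for the real script's output):

```shell
# Each line of the process substitution becomes one array element.
mapfile -t lines < <(printf '%s\n' 'job-a;job-b' 'test-.*' 'ok-to-test')
echo "REQUIRED_JOBS: ${lines[0]}"      # REQUIRED_JOBS: job-a;job-b
echo "REQUIRED_REGEXPS: ${lines[1]}"   # REQUIRED_REGEXPS: test-.*
echo "REQUIRED_LABELS: ${lines[2]}"    # REQUIRED_LABELS: ok-to-test
```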

View File

@@ -47,13 +47,16 @@ jobs:
vmm: clh
instance-type: small
genpolicy-pull-method: oci-distribution
auto-generate-policy: yes
- host_os: cbl-mariner
vmm: clh
instance-type: small
genpolicy-pull-method: containerd
auto-generate-policy: yes
- host_os: cbl-mariner
vmm: clh
instance-type: normal
auto-generate-policy: yes
runs-on: ubuntu-22.04
env:
DOCKER_REGISTRY: ${{ inputs.registry }}
@@ -66,6 +69,7 @@ jobs:
USING_NFD: "false"
K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}
GENPOLICY_PULL_METHOD: ${{ matrix.genpolicy-pull-method }}
AUTO_GENERATE_POLICY: ${{ matrix.auto-generate-policy }}
steps:
- uses: actions/checkout@v4
with:

View File

@@ -49,6 +49,8 @@ jobs:
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
ITA_KEY: ${{ secrets.ITA_KEY }}
AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:
@@ -82,7 +84,7 @@ jobs:
run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client
- name: Run tests
timeout-minutes: 50
timeout-minutes: 100
run: bash tests/integration/kubernetes/gha-run.sh run-tests
- name: Delete kata-deploy
@@ -122,6 +124,7 @@ jobs:
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:
@@ -181,6 +184,7 @@ jobs:
PULL_TYPE: ${{ matrix.pull-type }}
AUTHENTICATED_IMAGE_USER: ${{ secrets.AUTHENTICATED_IMAGE_USER }}
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
AUTO_GENERATE_POLICY: "yes"
steps:
- uses: actions/checkout@v4
with:

View File

@@ -12,8 +12,16 @@ concurrency:
name: Static checks self-hosted
jobs:
build-checks:
skipper:
if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}
uses: ./.github/workflows/gatekeeper-skipper.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
build-checks:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
strategy:
fail-fast: false
matrix:

View File

@@ -12,7 +12,15 @@ concurrency:
name: Static checks
jobs:
skipper:
uses: ./.github/workflows/gatekeeper-skipper.yaml
with:
commit-hash: ${{ github.event.pull_request.head.sha }}
target-branch: ${{ github.event.pull_request.base.ref }}
check-kernel-config-version:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
runs-on: ubuntu-22.04
steps:
- name: Checkout the code
@@ -35,12 +43,16 @@ jobs:
fi
build-checks:
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
uses: ./.github/workflows/build-checks.yaml
with:
instance: ubuntu-22.04
build-checks-depending-on-kvm:
runs-on: ubuntu-22.04
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
strategy:
fail-fast: false
matrix:
@@ -78,6 +90,8 @@ jobs:
static-checks:
runs-on: ubuntu-22.04
needs: skipper
if: ${{ needs.skipper.outputs.skip_static != 'yes' }}
strategy:
fail-fast: false
matrix:

View File

@@ -1 +1 @@
3.9.0
3.10.1

View File

@@ -55,14 +55,14 @@ of a PR review), the following tests will be executed:
- Run the following tests:
- Tests depending on the generated tarball
- Metrics (runs on bare-metal)
- `docker` (runs on Azure small instances)
- `nerdctl` (runs on Azure small instances)
- `kata-monitor` (runs on Azure small instances)
- `cri-containerd` (runs on Azure small instances)
- `nydus` (runs on Azure small instances)
- `vfio` (runs on Azure normal instances)
- `docker` (runs on cost free runners)
- `nerdctl` (runs on cost free runners)
- `kata-monitor` (runs on cost free runners)
- `cri-containerd` (runs on cost free runners)
- `nydus` (runs on cost free runners)
- `vfio` (runs on cost free runners)
- Tests depending on the generated kata-deploy payload
- kata-deploy (runs on Azure small instances)
- kata-deploy (runs on cost free runners)
- Tests are performed using different "Kubernetes flavors", such as k0s, k3s, rke2, and Azure Kubernetes Service (AKS).
- Kubernetes (runs in Azure small and medium instances depending on what's required by each test, and on TEE bare-metal machines)
- Tests are performed with different runtime engines, such as CRI-O and containerd.
@@ -77,8 +77,8 @@ them to merely debug issues.
In the previous section we've mentioned using different runners, now in this section we'll go through each type of runner used.
- Cost free runners: Those are the runners provided by GIthub itself, and
those are fairly small machines with no virtualization capabilities enabled -
- Cost free runners: Those are the runners provided by Github itself, and
those are fairly small machines with virtualization capabilities enabled.
- Azure small instances: Those are runners which have virtualization
capabilities enabled, 2 CPUs, and 8GB of RAM. These runners have a "-smaller"
suffix to their name.

View File

@@ -14,20 +14,38 @@ die() {
exit 1
}
function verify_yq_exists() {
local yq_path=$1
local yq_version=$2
local expected="yq (https://github.com/mikefarah/yq/) version $yq_version"
if [ -x "${yq_path}" ] && [ "$($yq_path --version)"X == "$expected"X ]; then
return 0
else
return 1
fi
}
# Install the yq yaml query package from the mikefarah github repo
# Install via binary download, as we may not have golang installed at this point
function install_yq() {
local yq_pkg="github.com/mikefarah/yq"
local yq_version=v4.40.7
local precmd=""
local yq_path=""
INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}
if [ "${INSTALL_IN_GOPATH}" == "true" ];then
if [ "${INSTALL_IN_GOPATH}" == "true" ]; then
GOPATH=${GOPATH:-${HOME}/go}
mkdir -p "${GOPATH}/bin"
local yq_path="${GOPATH}/bin/yq"
yq_path="${GOPATH}/bin/yq"
else
yq_path="/usr/local/bin/yq"
fi
if verify_yq_exists "$yq_path" "$yq_version"; then
echo "yq is already installed in correct version"
return
fi
if [ "${yq_path}" == "/usr/local/bin/yq" ]; then
# Check if we need sudo to install yq
if [ ! -w "/usr/local/bin" ]; then
# Check if we have sudo privileges
@@ -38,7 +56,6 @@ function install_yq() {
fi
fi
fi
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq (https://github.com/mikefarah/yq/) version ${yq_version}"X ] && return
read -r -a sysInfo <<< "$(uname -sm)"
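The `"..."X == "..."X` comparison used by `verify_yq_exists` is a defensive idiom: the trailing literal `X` keeps `test`/`[` from ever seeing an empty operand when an expansion comes back empty. A quick illustration (the version string mirrors the one in the function, but the values are made up):

```shell
# Appending a literal character guards the comparison against empty expansions.
actual=""   # e.g. yq is not installed, so --version produced nothing
expected="yq (https://github.com/mikefarah/yq/) version v4.40.7"
if [ "${actual}"X == "${expected}"X ]; then
    echo "already installed"
else
    echo "needs install"
fi
```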

View File

@@ -16,9 +16,12 @@ REPO="quay.io/kata-containers/kata-deploy-ci"
TAGS=$(skopeo list-tags "docker://$REPO")
# Only amd64
TAGS=$(echo "$TAGS" | jq '.Tags' | jq "map(select(endswith(\"$ARCH\")))" | jq -r '.[]')
# Tags since $GOOD
TAGS=$(echo "$TAGS" | sed -n -e "/$GOOD/,$$p")
# Tags up to $BAD
[ -n "$BAD" ] && TAGS=$(echo "$TAGS" | sed "/$BAD/q")
# Sort by git
SORTED=""
[ -n "$BAD" ] && LOG_ARGS="$GOOD~1..$BAD" || LOG_ARGS="$GOOD~1.."
for TAG in $(git log --merges --pretty=format:%H --reverse $LOG_ARGS); do
[[ "$TAGS" =~ "$TAG" ]] && SORTED+="
kata-containers-$TAG-$ARCH"
done
# Comma separated tags with repo
echo "$TAGS" | sed -e "s@^@$REPO:@" | paste -s -d, -
echo "$SORTED" | tail -n +2 | sed -e "s@^@$REPO:@" | paste -s -d, -
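The two `sed` filters above implement the GOOD..BAD windowing: `/$GOOD/,$p` prints from the first matching tag through the end of input, and `/$BAD/q` quits right after printing the first matching bad tag. A self-contained sketch with placeholder tag names:

```shell
TAGS=$(printf '%s\n' tag-1 tag-2 tag-3 tag-4 tag-5)
GOOD=tag-2
BAD=tag-4
# Keep tags since $GOOD...
TAGS=$(echo "$TAGS" | sed -n -e "/$GOOD/,\$p")
# ...and up to $BAD (inclusive).
TAGS=$(echo "$TAGS" | sed "/$BAD/q")
echo "$TAGS"   # tag-2, tag-3, tag-4 on separate lines
```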

View File

@@ -13,16 +13,11 @@ set -e
set -o nounset
set -o pipefail
script_dir="$(dirname $0)"
script_dir="$(realpath $(dirname $0))"
webhook_dir="${script_dir}/../../../tools/testing/kata-webhook"
source "${script_dir}/../lib.sh"
KATA_RUNTIME=${KATA_RUNTIME:-kata-ci}
info "Creates the kata-webhook ConfigMap"
RUNTIME_CLASS="${KATA_RUNTIME}" \
envsubst < "${script_dir}/deployments/configmap_kata-webhook.yaml.in" \
| oc apply -f -
pushd "${webhook_dir}" >/dev/null
# Build and deploy the webhook
#
@@ -30,6 +25,12 @@ info "Builds the kata-webhook"
./create-certs.sh
info "Deploys the kata-webhook"
oc apply -f deploy/
info "Override our KATA_RUNTIME ConfigMap"
RUNTIME_CLASS="${KATA_RUNTIME}" \
envsubst < "${script_dir}/deployments/configmap_kata-webhook.yaml.in" \
| oc apply -f -
# Check the webhook was deployed and is working.
RUNTIME_CLASS="${KATA_RUNTIME}" ./webhook-check.sh
popd >/dev/null

View File

@@ -499,19 +499,6 @@ If you do not want to install the respective QEMU version, the configuration fil
See the [static-build script for QEMU](../tools/packaging/static-build/qemu/build-static-qemu.sh) for a reference on how to get, setup, configure and build QEMU for Kata.
### Build a custom QEMU for aarch64/arm64 - REQUIRED
> **Note:**
>
> - You should only do this step if you are on aarch64/arm64.
> - You should include [Eric Auger's latest PCDIMM/NVDIMM patches](https://patchwork.kernel.org/cover/10647305/) which are
> under upstream review for supporting NVDIMM on aarch64.
>
You could build the custom `qemu-system-aarch64` as required with the following command:
```bash
$ git clone https://github.com/kata-containers/tests.git
$ script -fec 'sudo -E tests/.ci/install_qemu.sh'
```
## Build `virtiofsd`
When using the file system type virtio-fs (default), `virtiofsd` is required

View File

@@ -28,10 +28,22 @@ Bug fixes are released as part of `MINOR` or `MAJOR` releases only. `PATCH` is a
## Release Process
### Bump the `VERSION` file
### Bump the `VERSION` and `Chart.yaml` file
When the `kata-containers/kata-containers` repository is ready for a new release,
first create a PR to set the release in the `VERSION` file and have it merged.
first create a PR to set the release in the [`VERSION`](./../VERSION) file and update the
`version` and `appVersion` in the
[`Chart.yaml`](./../tools/packaging/kata-deploy/helm-chart/kata-deploy/Chart.yaml) file and
have it merged.
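A sketch of what that bump could look like (the `sed` expressions are illustrative; `yq` or a manual edit works equally well), demonstrated here against throwaway copies of the two files rather than the real repository paths:

```shell
NEW_VERSION=3.10.1
workdir=$(mktemp -d)
# Throwaway stand-ins for VERSION and the helm Chart.yaml.
printf '3.9.0\n' > "$workdir/VERSION"
printf 'version: 3.9.0\nappVersion: 3.9.0\n' > "$workdir/Chart.yaml"
# Bump the release version in both files.
echo "$NEW_VERSION" > "$workdir/VERSION"
sed -i -e "s/^version: .*/version: $NEW_VERSION/" \
       -e "s/^appVersion: .*/appVersion: $NEW_VERSION/" "$workdir/Chart.yaml"
cat "$workdir/VERSION" "$workdir/Chart.yaml"
```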
### Lock the `main` branch
To prevent any PRs from getting merged during the release process (which would slow the release
down by invalidating the payload caches), we have recently trialled setting the `main`
branch to read-only whilst the release action runs.
> [!NOTE]
> Admin permission is needed to complete this task.
### Check GitHub Actions
@@ -40,6 +52,9 @@ We make use of [GitHub actions](https://github.com/features/actions) in the
file from the `kata-containers/kata-containers` repository to build and upload
release artifacts.
> [!NOTE]
> Write permission is needed to trigger the action.
The action is manually triggered and is responsible for generating a new
release (including a new tag), pushing those to the
`kata-containers/kata-containers` repository. The new release is initially
@@ -59,6 +74,11 @@ If for some reason you need to cancel the workflow or re-run it entirely, go fir
to the [Release page](https://github.com/kata-containers/kata-containers/releases) and
delete the draft release from the previous run.
### Unlock the `main` branch
After the release process has concluded, either unlock the `main` branch, or ask
an admin to do it.
### Improve the release notes
Release notes are auto-generated by the GitHub CLI tool used as part of our

View File

@@ -50,7 +50,7 @@ We provide `Dragonball` Sandbox to enable built-in VMM by integrating VMM's func
#### How To Support Async
The kata-runtime is controlled by TOKIO_RUNTIME_WORKER_THREADS to run the OS thread, which is 2 threads by default. For TTRPC and container-related threads run in the `tokio` thread in a unified manner, and related dependencies need to be switched to Async, such as Timer, File, Netlink, etc. With the help of Async, we can easily support no-block I/O and timer. Currently, we only utilize Async for kata-runtime. The built-in VMM keeps the OS thread because it can ensure that the threads are controllable.
**For N tokio worker threads and M containers**
**For N `tokio` worker threads and M containers**
- Sync runtime(both OS thread and `tokio` task are OS thread but without `tokio` worker thread) OS thread number: 4 + 12*M
- Async runtime(only OS thread is OS thread) OS thread number: 2 + N
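Plugging numbers into the two formulas makes the saving concrete; for example, with the default N=2 worker threads and M=8 containers:

```shell
N=2
M=8
echo "sync runtime:  $((4 + 12 * M)) OS threads"   # sync runtime:  100 OS threads
echo "async runtime: $((2 + N)) OS threads"        # async runtime: 4 OS threads
```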
@@ -103,7 +103,6 @@ In our case, there will be a variety of resources, and every resource has severa
| `Cgroup V2` | | Stage 2 | 🚧 |
| Hypervisor | `Dragonball` | Stage 1 | 🚧 |
| | QEMU | Stage 2 | 🚫 |
| | ACRN | Stage 3 | 🚫 |
| | Cloud Hypervisor | Stage 3 | 🚫 |
| | Firecracker | Stage 3 | 🚫 |
@@ -166,4 +165,4 @@ In our case, there will be a variety of resources, and every resource has severa
- What is the security boundary for the monolithic / "Built-in VMM" case?
It has the security boundary of virtualization. More details will be provided in next stage.
It has the security boundary of virtualization. More details will be provided in next stage.

View File

@@ -20,12 +20,6 @@
for the VM rootfs. Refer to the following guide for additional configuration
steps:
- [Setup Kata containers with `firecracker`](how-to-use-kata-containers-with-firecracker.md)
- `ACRN`
While `qemu` , `cloud-hypervisor` and `firecracker` work out of the box with installation of Kata,
some additional configuration is needed in case of `ACRN`.
Refer to the following guides for additional configuration steps:
- [Kata Containers with ACRN Hypervisor](how-to-use-kata-containers-with-acrn.md)
## Confidential Containers Policy
@@ -52,4 +46,4 @@
- [How to use EROFS to build rootfs in Kata Containers](how-to-use-erofs-build-rootfs.md)
- [How to run Kata Containers with kinds of Block Volumes](how-to-run-kata-containers-with-kinds-of-Block-Volumes.md)
- [How to use the Kata Agent Policy](how-to-use-the-kata-agent-policy.md)
- [How to pull images in the guest](how-to-pull-images-in-guest-with-kata.md)
- [How to pull images in the guest](how-to-pull-images-in-guest-with-kata.md)

View File

@@ -46,7 +46,6 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.block_device_cache_set` | `boolean` | cache-related options will be set to block devices or not |
| `io.katacontainers.config.hypervisor.block_device_driver` | string | the driver to be used for block device, valid values are `virtio-blk`, `virtio-scsi`, `nvdimm`|
| `io.katacontainers.config.hypervisor.cpu_features` | `string` | Comma-separated list of CPU features to pass to the CPU (QEMU) |
| `io.katacontainers.config.hypervisor.ctlpath` (R) | `string` | Path to the `acrnctl` binary for the ACRN hypervisor |
| `io.katacontainers.config.hypervisor.default_max_vcpus` | uint32| the maximum number of vCPUs allocated for the VM by the hypervisor |
| `io.katacontainers.config.hypervisor.default_memory` | uint32| the memory assigned for a VM by the hypervisor in `MiB` |
| `io.katacontainers.config.hypervisor.default_vcpus` | float32| the default vCPUs assigned for a VM by the hypervisor |
@@ -209,7 +208,6 @@ the configuration entry:
| Key | Config file entry | Comments |
|-------| ----- | ----- |
| `ctlpath` | `valid_ctlpaths` | Valid paths for `acrnctl` binary |
| `entropy_source` | `valid_entropy_sources` | Valid entropy sources, e.g. `/dev/random` |
| `file_mem_backend` | `valid_file_mem_backends` | Valid locations for the file-based memory backend root directory |
| `jailer_path` | `valid_jailer_paths`| Valid paths for the jailer constraining the container VM (Firecracker) |

View File

@@ -1,125 +0,0 @@
# Kata Containers with ACRN
This document provides an overview on how to run Kata containers with ACRN hypervisor and device model.
## Introduction
ACRN is a flexible, lightweight Type-1 reference hypervisor built with real-time and safety-criticality in mind. ACRN uses an open source platform making it optimized to streamline embedded development.
Some of the key features being:
- Small footprint - Approx. 25K lines of code (LOC).
- Real Time - Low latency, faster boot time, improves overall responsiveness with hardware.
- Adaptability - Multi-OS support for guest operating systems like Linux, Android, RTOSes.
- Rich I/O mediators - Allows sharing of various I/O devices across VMs.
- Optimized for a variety of IoT (Internet of Things) and embedded device solutions.
Please refer to ACRN [documentation](https://projectacrn.github.io/latest/index.html) for more details on ACRN hypervisor and device model.
## Pre-requisites
This document requires the presence of the ACRN hypervisor and Kata Containers on your system. Install using the instructions available through the following links:
- ACRN supported [Hardware](https://projectacrn.github.io/latest/hardware.html#supported-hardware).
> **Note:** Please make sure to have a minimum of 4 logical processors (HT) or cores.
- ACRN [software](https://projectacrn.github.io/latest/tutorials/run_kata_containers.html) setup.
- For networking, ACRN supports either MACVTAP or TAP. If MACVTAP is not enabled in the Service OS, please follow the below steps to update the kernel:
```sh
$ git clone https://github.com/projectacrn/acrn-kernel.git
$ cd acrn-kernel
$ cp kernel_config_sos .config
$ sed -i "s/# CONFIG_MACVLAN is not set/CONFIG_MACVLAN=y/" .config
$ sed -i '$ i CONFIG_MACVTAP=y' .config
$ make clean && make olddefconfig && make && sudo make modules_install INSTALL_MOD_PATH=out/
```
Login into Service OS and update the kernel with MACVTAP support:
```sh
$ sudo mount /dev/sda1 /mnt
$ sudo scp -r <user name>@<host address>:<your workspace>/acrn-kernel/arch/x86/boot/bzImage /mnt/EFI/org.clearlinux/
$ sudo scp -r <user name>@<host address>:<your workspace>/acrn-kernel/out/lib/modules/* /lib/modules/
$ conf_file=$(sed -n '$ s/default //p' /mnt/loader/loader.conf).conf
$ kernel_img=$(sed -n 2p /mnt/loader/entries/$conf_file | cut -d'/' -f4)
$ sudo sed -i "s/$kernel_img/bzImage/g" /mnt/loader/entries/$conf_file
$ sync && sudo umount /mnt && sudo reboot
```
- Kata Containers installation: Automated installation does not seem to be supported for Clear Linux, so please use [manual installation](../Developer-Guide.md) steps.
> **Note:** Create rootfs image and not initrd image.
In order to run Kata with ACRN, your container stack must provide block-based storage, such as device-mapper.
> **Note:** Currently, by design you can only launch one VM from Kata Containers using ACRN hypervisor (SDC scenario). Based on feedback from community we can increase number of VMs.
## Configure Docker
To configure Docker for device-mapper and Kata,
1. Stop Docker daemon if it is already running.
```bash
$ sudo systemctl stop docker
```
2. Set `/etc/docker/daemon.json` with the following contents.
```
{
"storage-driver": "devicemapper"
}
```
3. Restart docker.
```bash
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
```
4. Configure [Docker](../Developer-Guide.md#update-the-docker-systemd-unit-file) to use `kata-runtime`.
## Configure Kata Containers with ACRN
To configure Kata Containers with ACRN, copy the generated `configuration-acrn.toml` file when building the `kata-runtime` to either `/etc/kata-containers/configuration.toml` or `/usr/share/defaults/kata-containers/configuration.toml`.
The following command shows full paths to the `configuration.toml` files that the runtime loads. It will use the first path that exists. (Please make sure the kernel and image paths are set correctly in the `configuration.toml` file)
```bash
$ sudo kata-runtime --show-default-config-paths
```
>**Warning:** Please offline CPUs using [this](offline_cpu.sh) script, else VM launches will fail.
```bash
$ sudo ./offline_cpu.sh
```
Start an ACRN based Kata Container,
```bash
$ sudo docker run -ti --runtime=kata-runtime busybox sh
```
You will see ACRN(`acrn-dm`) is now running on your system, as well as a `kata-shim`. You should obtain an interactive shell prompt. Verify that all the Kata processes terminate once you exit the container.
```bash
$ ps -ef | grep -E "kata|acrn"
```
Validate ACRN hypervisor by using `kata-runtime kata-env`,
```sh
$ kata-runtime kata-env | awk -v RS= '/\[Hypervisor\]/'
[Hypervisor]
MachineType = ""
Version = "DM version is: 1.2-unstable-254577a6-dirty (daily tag:acrn-2019w27.4-140000p)
Path = "/usr/bin/acrn-dm"
BlockDeviceDriver = "virtio-blk"
EntropySource = "/dev/urandom"
Msize9p = 0
MemorySlots = 10
Debug = false
UseVSock = false
SharedFS = ""
```

View File

@@ -18,7 +18,6 @@ for i in $(ls -d /sys/devices/system/cpu/cpu[1-9]*); do
echo 0 > $i/online
online=`cat $i/online`
done
echo $idx > /sys/class/vhm/acrn_vhm/offline_cpu
fi
done

View File

@@ -18,7 +18,6 @@ which hypervisors you may wish to investigate further.
| Hypervisor | Written in | Architectures | Type |
|-|-|-|-|
|[ACRN] | C | `x86_64` | Type 1 (bare metal) |
|[Cloud Hypervisor] | rust | `aarch64`, `x86_64` | Type 2 ([KVM]) |
|[Firecracker] | rust | `aarch64`, `x86_64` | Type 2 ([KVM]) |
|[QEMU] | C | all | Type 2 ([KVM]) | `configuration-qemu.toml` |
@@ -38,7 +37,6 @@ the hypervisors:
| Hypervisor | Summary | Features | Limitations | Container Creation speed | Memory density | Use cases | Comment |
|-|-|-|-|-|-|-|-|
|[ACRN] | Safety critical and real-time workloads | | | excellent | excellent | Embedded and IOT systems | For advanced users |
|[Cloud Hypervisor] | Low latency, small memory footprint, small attack surface | Minimal | | excellent | excellent | High performance modern cloud workloads | |
|[Firecracker] | Very slimline | Extremely minimal | Doesn't support all device types | excellent | excellent | Serverless / FaaS | |
|[QEMU] | Lots of features | Lots | | good | good | Good option for most users | |
@@ -57,7 +55,6 @@ are available, their default values and how each setting can be used.
| Hypervisor | Golang runtime config file | golang runtime short name | golang runtime default | rust runtime config file | rust runtime short name | rust runtime default |
|-|-|-|-|-|-|-|
| [ACRN] | [`configuration-acrn.toml`](../src/runtime/config/configuration-acrn.toml.in) | `acrn` | | | | |
| [Cloud Hypervisor] | [`configuration-clh.toml`](../src/runtime/config/configuration-clh.toml.in) | `clh` | | [`configuration-cloud-hypervisor.toml`](../src/runtime-rs/config/configuration-cloud-hypervisor.toml.in) | `cloud-hypervisor` | |
| [Firecracker] | [`configuration-fc.toml`](../src/runtime/config/configuration-fc.toml.in) | `fc` | | | | |
| [QEMU] | [`configuration-qemu.toml`](../src/runtime/config/configuration-qemu.toml.in) | `qemu` | yes | [`configuration-qemu.toml`](../src/runtime-rs/config/configuration-qemu-runtime-rs.toml.in) | `qemu` | |
@@ -93,10 +90,9 @@ are available, their default values and how each setting can be used.
To switch the configured hypervisor, you only need to run a single command.
See [the `kata-manager` documentation](../utils/README.md#choose-a-hypervisor) for further details.
[ACRN]: https://projectacrn.org
[Cloud Hypervisor]: https://github.com/cloud-hypervisor/cloud-hypervisor
[Firecracker]: https://github.com/firecracker-microvm/firecracker
[KVM]: https://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine
[QEMU]: http://www.qemu-project.org
[QEMU]: http://www.qemu.org
[`Dragonball`]: https://github.com/kata-containers/kata-containers/blob/main/src/dragonball
[StratoVirt]: https://gitee.com/openeuler/stratovirt


@@ -83,6 +83,23 @@ $ make && sudo make install
```
After running the command above, the default config file `configuration.toml` will be installed under `/usr/share/defaults/kata-containers/`, and the binary `containerd-shim-kata-v2` will be installed under `/usr/local/bin/`.
### Install Shim Without Builtin Dragonball VMM
By default, runtime-rs includes the `Dragonball` VMM. To build without the built-in `Dragonball` hypervisor, use `make USE_BUILDIN_DB=false`:
```bash
$ cd kata-containers/src/runtime-rs
$ make USE_BUILDIN_DB=false
```
After building, specify the desired hypervisor during installation using the `HYPERVISOR` variable. For example, to use `qemu` or `cloud-hypervisor`:
```bash
sudo make install HYPERVISOR=qemu
```
or
```bash
sudo make install HYPERVISOR=cloud-hypervisor
```
### Build Kata Containers Kernel
Follow the [Kernel installation guide](/tools/packaging/kernel/README.md).


@@ -158,7 +158,7 @@ lazy_static! {
.typ(oci::LinuxDeviceType::C)
.major(1)
.minor(3)
.file_mode(0o066_u32)
.file_mode(0o666_u32)
.uid(0xffffffff_u32)
.gid(0xffffffff_u32)
.build()
@@ -168,7 +168,7 @@ lazy_static! {
.typ(oci::LinuxDeviceType::C)
.major(1)
.minor(5)
.file_mode(0o066_u32)
.file_mode(0o666_u32)
.uid(0xffffffff_u32)
.gid(0xffffffff_u32)
.build()
@@ -178,7 +178,7 @@ lazy_static! {
.typ(oci::LinuxDeviceType::C)
.major(1)
.minor(7)
.file_mode(0o066_u32)
.file_mode(0o666_u32)
.uid(0xffffffff_u32)
.gid(0xffffffff_u32)
.build()
@@ -188,7 +188,7 @@ lazy_static! {
.typ(oci::LinuxDeviceType::C)
.major(5)
.minor(0)
.file_mode(0o066_u32)
.file_mode(0o666_u32)
.uid(0xffffffff_u32)
.gid(0xffffffff_u32)
.build()
@@ -198,7 +198,7 @@ lazy_static! {
.typ(oci::LinuxDeviceType::C)
.major(1)
.minor(9)
.file_mode(0o066_u32)
.file_mode(0o666_u32)
.uid(0xffffffff_u32)
.gid(0xffffffff_u32)
.build()
@@ -208,7 +208,7 @@ lazy_static! {
.typ(oci::LinuxDeviceType::C)
.major(1)
.minor(8)
.file_mode(0o066_u32)
.file_mode(0o666_u32)
.uid(0xffffffff_u32)
.gid(0xffffffff_u32)
.build()
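For context on the fix above, a minimal sketch of how octal mode bits decompose: the `0o066` typo dropped all owner permissions on devices such as `/dev/null`, while the corrected `0o666` grants read+write to owner, group, and other (the `perms` helper below is illustrative, not part of rustjail):

```rust
// Decompose an octal mode into (owner, group, other) permission triples.
// read = 4, write = 2, execute = 1.
fn perms(mode: u32) -> (u32, u32, u32) {
    ((mode >> 6) & 0o7, (mode >> 3) & 0o7, mode & 0o7)
}

fn main() {
    // Typo value: owner has no access (0); group/other have rw (6).
    assert_eq!(perms(0o066), (0, 6, 6));
    // Corrected value: rw (4 + 2) for owner, group, and other,
    // the standard mode for these container devices.
    assert_eq!(perms(0o666), (6, 6, 6));
    println!("ok");
}
```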


@@ -8,13 +8,15 @@
// https://github.com/confidential-containers/guest-components/tree/main/confidential-data-hub
use crate::AGENT_CONFIG;
use crate::CDH_SOCKET_URI;
use anyhow::{Context, Result};
use anyhow::{bail, Context, Result};
use derivative::Derivative;
use protocols::{
confidential_data_hub, confidential_data_hub_ttrpc_async,
confidential_data_hub_ttrpc_async::{SealedSecretServiceClient, SecureMountServiceClient},
};
use std::fs;
use std::os::unix::fs::symlink;
use std::path::Path;
use tokio::sync::OnceCell;
// Nanoseconds
@@ -22,8 +24,14 @@ lazy_static! {
static ref CDH_API_TIMEOUT: i64 = AGENT_CONFIG.cdh_api_timeout.as_nanos() as i64;
pub static ref CDH_CLIENT: OnceCell<CDHClient> = OnceCell::new();
}
const SEALED_SECRET_PREFIX: &str = "sealed.";
// Convenience function to obtain the scope logger.
fn sl() -> slog::Logger {
slog_scope::logger().new(o!("subsystem" => "cdh"))
}
#[derive(Derivative)]
#[derivative(Clone, Debug)]
pub struct CDHClient {
@@ -34,8 +42,8 @@ pub struct CDHClient {
}
impl CDHClient {
pub fn new() -> Result<Self> {
let client = ttrpc::asynchronous::Client::connect(CDH_SOCKET_URI)?;
pub fn new(cdh_socket_uri: &str) -> Result<Self> {
let client = ttrpc::asynchronous::Client::connect(cdh_socket_uri)?;
let sealed_secret_client =
confidential_data_hub_ttrpc_async::SealedSecretServiceClient::new(client.clone());
let secure_mount_client =
@@ -78,9 +86,11 @@ impl CDHClient {
}
}
pub async fn init_cdh_client() -> Result<()> {
pub async fn init_cdh_client(cdh_socket_uri: &str) -> Result<()> {
CDH_CLIENT
.get_or_try_init(|| async { CDHClient::new().context("Failed to create CDH Client") })
.get_or_try_init(|| async {
CDHClient::new(cdh_socket_uri).context("Failed to create CDH Client")
})
.await?;
Ok(())
}
@@ -106,6 +116,74 @@ pub async fn unseal_env(env: &str) -> Result<String> {
Ok((*env.to_owned()).to_string())
}
pub async fn unseal_file(path: &str) -> Result<()> {
let cdh_client = CDH_CLIENT
.get()
.expect("Confidential Data Hub not initialized");
if !Path::new(path).exists() {
bail!("sealed secret file {:?} does not exist", path);
}
// Iterate over all entries to handle the sealed secret file.
// For example, the directory is as follows:
// The secret directory in the guest: /run/kata-containers/shared/containers/21bbf0d932b70263d65d7052ecfd72ee46de03f766650cb378e93852ddb30a54-5063be11b6800f96-sealed-secret-target/:
// - ..2024_09_30_02_55_58.2237819815
// - ..data -> ..2024_09_30_02_55_58.2237819815
// - secret -> ..2024_09_30_02_55_58.2237819815/secret
//
// The directory "..2024_09_30_02_55_58.2237819815":
// - secret
for entry in fs::read_dir(path)? {
let entry = entry?;
let entry_type = entry.file_type()?;
if !entry_type.is_symlink() && !entry_type.is_file() {
debug!(
sl(),
"skipping sealed source entry {:?} because its file type is {:?}",
entry,
entry_type
);
continue;
}
let target_path = fs::canonicalize(&entry.path())?;
info!(sl(), "sealed source entry target path: {:?}", target_path);
// Skip if the target path is not a file (e.g., it's a symlink pointing to the secret file).
if !target_path.is_file() {
debug!(sl(), "sealed source is not a file: {:?}", target_path);
continue;
}
let secret_name = entry.file_name();
let contents = fs::read_to_string(&target_path)?;
if contents.starts_with(SEALED_SECRET_PREFIX) {
// Get the directory name of the sealed secret file
let dir_name = target_path
.parent()
.and_then(|p| p.file_name())
.map(|name| name.to_string_lossy().to_string())
.unwrap_or_default();
// Create the unsealed file name in the same directory; the unsealed data will be written to it.
let unsealed_filename = format!("{}.unsealed", target_path.to_string_lossy());
// Create the unsealed file symlink, which is used for reading the unsealed data in the container.
let unsealed_filename_symlink =
format!("{}/{}.unsealed", dir_name, secret_name.to_string_lossy());
// Unseal the secret and write it to the unsealed file
let unsealed_value = cdh_client.unseal_secret_async(&contents).await?;
fs::write(&unsealed_filename, unsealed_value)?;
// Remove the original sealed symlink and create a symlink to the unsealed file
fs::remove_file(&entry.path())?;
symlink(unsealed_filename_symlink, &entry.path())?;
}
}
Ok(())
}
pub async fn secure_mount(
volume_type: &str,
options: &std::collections::HashMap<String, String>,
@@ -123,17 +201,15 @@ pub async fn secure_mount(
}
#[cfg(test)]
#[cfg(feature = "sealed-secret")]
mod tests {
use crate::cdh::CDHClient;
use crate::cdh::CDH_ADDR;
use anyhow::anyhow;
use super::*;
use async_trait::async_trait;
use protocols::{confidential_data_hub, confidential_data_hub_ttrpc_async};
use std::fs::File;
use std::io::{Read, Write};
use std::sync::Arc;
use tempfile::tempdir;
use test_utils::skip_if_not_root;
use tokio::signal::unix::{signal, SignalKind};
struct TestService;
#[async_trait]
@@ -161,17 +237,17 @@ mod tests {
Ok(())
}
fn start_ttrpc_server() {
fn start_ttrpc_server(cdh_socket_uri: String) {
tokio::spawn(async move {
let ss = Box::new(TestService {})
as Box<dyn confidential_data_hub_ttrpc_async::SealedSecretService + Send + Sync>;
let ss = Arc::new(ss);
let ss_service = confidential_data_hub_ttrpc_async::create_sealed_secret_service(ss);
remove_if_sock_exist(CDH_ADDR).unwrap();
remove_if_sock_exist(&cdh_socket_uri).unwrap();
let mut server = ttrpc::asynchronous::Server::new()
.bind(CDH_ADDR)
.bind(&cdh_socket_uri)
.unwrap()
.register_service(ss_service);
@@ -187,23 +263,58 @@ mod tests {
}
#[tokio::test]
async fn test_unseal_env() {
async fn test_sealed_secret() {
skip_if_not_root!();
let test_dir = tempdir().expect("failed to create tmpdir");
let test_dir_path = test_dir.path();
let cdh_sock_uri = &format!(
"unix://{}",
test_dir_path.join("cdh.sock").to_str().unwrap()
);
let rt = tokio::runtime::Runtime::new().unwrap();
let _guard = rt.enter();
start_ttrpc_server();
start_ttrpc_server(cdh_sock_uri.to_string());
std::thread::sleep(std::time::Duration::from_secs(2));
init_cdh_client(cdh_sock_uri).await.unwrap();
let cc = Some(CDHClient::new().unwrap());
let cdh_client = cc.as_ref().ok_or(anyhow!("get cdh_client failed")).unwrap();
// Test sealed secret as env vars
let sealed_env = String::from("key=sealed.testdata");
let unsealed_env = cdh_client.unseal_env(&sealed_env).await.unwrap();
let unsealed_env = unseal_env(&sealed_env).await.unwrap();
assert_eq!(unsealed_env, String::from("key=unsealed"));
let normal_env = String::from("key=testdata");
let unchanged_env = cdh_client.unseal_env(&normal_env).await.unwrap();
let unchanged_env = unseal_env(&normal_env).await.unwrap();
assert_eq!(unchanged_env, String::from("key=testdata"));
// Test sealed secret as files
let sealed_dir = test_dir_path.join("..test");
fs::create_dir(&sealed_dir).unwrap();
let sealed_filename = sealed_dir.join("secret");
let mut sealed_file = File::create(sealed_filename.clone()).unwrap();
sealed_file.write_all(b"sealed.testdata").unwrap();
let secret_symlink = test_dir_path.join("secret");
symlink(&sealed_filename, &secret_symlink).unwrap();
unseal_file(test_dir_path.to_str().unwrap()).await.unwrap();
let unsealed_filename = test_dir_path.join("secret");
let mut unsealed_file = File::open(unsealed_filename.clone()).unwrap();
let mut contents = String::new();
unsealed_file.read_to_string(&mut contents).unwrap();
assert_eq!(contents, String::from("unsealed"));
fs::remove_file(sealed_filename).unwrap();
fs::remove_file(unsealed_filename).unwrap();
let normal_filename = test_dir_path.join("secret");
let mut normal_file = File::create(normal_filename.clone()).unwrap();
normal_file.write_all(b"testdata").unwrap();
unseal_file(test_dir_path.to_str().unwrap()).await.unwrap();
let mut contents = String::new();
let mut normal_file = File::open(normal_filename.clone()).unwrap();
normal_file.read_to_string(&mut contents).unwrap();
assert_eq!(contents, String::from("testdata"));
fs::remove_file(normal_filename).unwrap();
rt.shutdown_background();
std::thread::sleep(std::time::Duration::from_secs(2));
}


@@ -89,7 +89,7 @@ pub enum GuestComponentsFeatures {
Resource,
}
#[derive(Clone, Copy, Debug, Default, Display, Deserialize, EnumString, PartialEq)]
#[derive(Clone, Copy, Debug, Default, Display, Deserialize, EnumString, PartialEq, Eq)]
/// Attestation-related processes that we want to spawn as children of the agent
#[strum(serialize_all = "kebab-case")]
#[serde(rename_all = "kebab-case")]
@@ -630,6 +630,7 @@ mod tests {
use super::*;
use anyhow::anyhow;
use rstest::*;
use serial_test::serial;
use std::fs::File;
use std::io::Write;
@@ -1327,562 +1328,193 @@ mod tests {
assert_eq!(expected.tracing, config.tracing);
}
#[test]
fn test_logrus_to_slog_level() {
#[derive(Debug)]
struct TestData<'a> {
logrus_level: &'a str,
result: Result<slog::Level>,
}
let tests = &[
TestData {
logrus_level: "",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL)),
},
TestData {
logrus_level: "foo",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL)),
},
TestData {
logrus_level: "debugging",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL)),
},
TestData {
logrus_level: "xdebug",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL)),
},
TestData {
logrus_level: "trace",
result: Ok(slog::Level::Trace),
},
TestData {
logrus_level: "debug",
result: Ok(slog::Level::Debug),
},
TestData {
logrus_level: "info",
result: Ok(slog::Level::Info),
},
TestData {
logrus_level: "warn",
result: Ok(slog::Level::Warning),
},
TestData {
logrus_level: "warning",
result: Ok(slog::Level::Warning),
},
TestData {
logrus_level: "error",
result: Ok(slog::Level::Error),
},
TestData {
logrus_level: "critical",
result: Ok(slog::Level::Critical),
},
TestData {
logrus_level: "fatal",
result: Ok(slog::Level::Critical),
},
TestData {
logrus_level: "panic",
result: Ok(slog::Level::Critical),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = logrus_to_slog_level(d.logrus_level);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
#[rstest]
#[case("", Err(anyhow!(ERR_INVALID_LOG_LEVEL)))]
#[case("foo", Err(anyhow!(ERR_INVALID_LOG_LEVEL)))]
#[case("debugging", Err(anyhow!(ERR_INVALID_LOG_LEVEL)))]
#[case("xdebug", Err(anyhow!(ERR_INVALID_LOG_LEVEL)))]
#[case("trace", Ok(slog::Level::Trace))]
#[case("debug", Ok(slog::Level::Debug))]
#[case("info", Ok(slog::Level::Info))]
#[case("warn", Ok(slog::Level::Warning))]
#[case("warning", Ok(slog::Level::Warning))]
#[case("error", Ok(slog::Level::Error))]
#[case("critical", Ok(slog::Level::Critical))]
#[case("fatal", Ok(slog::Level::Critical))]
#[case("panic", Ok(slog::Level::Critical))]
fn test_logrus_to_slog_level(#[case] input: &str, #[case] expected: Result<slog::Level>) {
let result = logrus_to_slog_level(input);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]
fn test_get_log_level() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<slog::Level>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_PARAM)),
},
TestData {
param: "=",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)),
},
TestData {
param: "x=",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)),
},
TestData {
param: "=y",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)),
},
TestData {
param: "==",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_PARAM)),
},
TestData {
param: "= =",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_PARAM)),
},
TestData {
param: "x=y",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)),
},
TestData {
param: "agent=debug",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)),
},
TestData {
param: "agent.logg=debug",
result: Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)),
},
TestData {
param: "agent.log=trace",
result: Ok(slog::Level::Trace),
},
TestData {
param: "agent.log=debug",
result: Ok(slog::Level::Debug),
},
TestData {
param: "agent.log=info",
result: Ok(slog::Level::Info),
},
TestData {
param: "agent.log=warn",
result: Ok(slog::Level::Warning),
},
TestData {
param: "agent.log=warning",
result: Ok(slog::Level::Warning),
},
TestData {
param: "agent.log=error",
result: Ok(slog::Level::Error),
},
TestData {
param: "agent.log=critical",
result: Ok(slog::Level::Critical),
},
TestData {
param: "agent.log=fatal",
result: Ok(slog::Level::Critical),
},
TestData {
param: "agent.log=panic",
result: Ok(slog::Level::Critical),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_log_level(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
#[rstest]
#[case("",Err(anyhow!(ERR_INVALID_LOG_LEVEL_PARAM)))]
#[case("=",Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)))]
#[case("x=",Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)))]
#[case("=y",Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)))]
#[case("==",Err(anyhow!(ERR_INVALID_LOG_LEVEL_PARAM)))]
#[case("= =",Err(anyhow!(ERR_INVALID_LOG_LEVEL_PARAM)))]
#[case("x=y",Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)))]
#[case("agent=debug",Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)))]
#[case("agent.logg=debug",Err(anyhow!(ERR_INVALID_LOG_LEVEL_KEY)))]
#[case("agent.log=trace", Ok(slog::Level::Trace))]
#[case("agent.log=debug", Ok(slog::Level::Debug))]
#[case("agent.log=info", Ok(slog::Level::Info))]
#[case("agent.log=warn", Ok(slog::Level::Warning))]
#[case("agent.log=warning", Ok(slog::Level::Warning))]
#[case("agent.log=error", Ok(slog::Level::Error))]
#[case("agent.log=critical", Ok(slog::Level::Critical))]
#[case("agent.log=fatal", Ok(slog::Level::Critical))]
#[case("agent.log=panic", Ok(slog::Level::Critical))]
fn test_get_log_level(#[case] input: &str, #[case] expected: Result<slog::Level>) {
let result = get_log_level(input);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]
fn test_get_timeout() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<time::Duration>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_TIMEOUT)),
},
TestData {
param: "agent.hotplug_timeout",
result: Err(anyhow!(ERR_INVALID_TIMEOUT)),
},
TestData {
param: "foo=bar",
result: Err(anyhow!(ERR_INVALID_TIMEOUT_KEY)),
},
TestData {
param: "agent.hotplug_timeot=1",
result: Err(anyhow!(ERR_INVALID_TIMEOUT_KEY)),
},
TestData {
param: "agent.chd_api_timeout=1",
result: Err(anyhow!(ERR_INVALID_TIMEOUT_KEY)),
},
TestData {
param: "agent.hotplug_timeout=1",
result: Ok(time::Duration::from_secs(1)),
},
TestData {
param: "agent.hotplug_timeout=3",
result: Ok(time::Duration::from_secs(3)),
},
TestData {
param: "agent.hotplug_timeout=3600",
result: Ok(time::Duration::from_secs(3600)),
},
TestData {
param: "agent.cdh_api_timeout=600",
result: Ok(time::Duration::from_secs(600)),
},
TestData {
param: "agent.hotplug_timeout=0",
result: Ok(time::Duration::from_secs(0)),
},
TestData {
param: "agent.hotplug_timeout=-1",
result: Err(anyhow!(
"unable to parse timeout
#[rstest]
#[case("", Err(anyhow!(ERR_INVALID_TIMEOUT)))]
#[case("agent.hotplug_timeout", Err(anyhow!(ERR_INVALID_TIMEOUT)))]
#[case("foo=bar", Err(anyhow!(ERR_INVALID_TIMEOUT_KEY)))]
#[case("agent.hotplug_timeot=1", Err(anyhow!(ERR_INVALID_TIMEOUT_KEY)))]
#[case("agent.hotplug_timeout=1", Ok(time::Duration::from_secs(1)))]
#[case("agent.hotplug_timeout=3", Ok(time::Duration::from_secs(3)))]
#[case("agent.hotplug_timeout=3600", Ok(time::Duration::from_secs(3600)))]
#[case("agent.hotplug_timeout=0", Ok(time::Duration::from_secs(0)))]
#[case("agent.hotplug_timeout=-1", Err(anyhow!(
"unable to parse timeout
Caused by:
invalid digit found in string"
)),
},
TestData {
param: "agent.hotplug_timeout=4jbsdja",
result: Err(anyhow!(
"unable to parse timeout
)))]
#[case("agent.hotplug_timeout=4jbsdja", Err(anyhow!(
"unable to parse timeout
Caused by:
invalid digit found in string"
)),
},
TestData {
param: "agent.hotplug_timeout=foo",
result: Err(anyhow!(
"unable to parse timeout
)))]
#[case("agent.hotplug_timeout=foo", Err(anyhow!(
"unable to parse timeout
Caused by:
invalid digit found in string"
)),
},
TestData {
param: "agent.hotplug_timeout=j",
result: Err(anyhow!(
"unable to parse timeout
)))]
#[case("agent.hotplug_timeout=j", Err(anyhow!(
"unable to parse timeout
Caused by:
invalid digit found in string"
)),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_timeout(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
)))]
#[case("agent.chd_api_timeout=1", Err(anyhow!(ERR_INVALID_TIMEOUT_KEY)))]
#[case("agent.cdh_api_timeout=600", Ok(time::Duration::from_secs(600)))]
fn test_timeout(#[case] param: &str, #[case] expected: Result<time::Duration>) {
let result = get_timeout(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]
fn test_get_container_pipe_size() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<i32>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE)),
},
TestData {
param: "agent.container_pipe_size",
result: Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE)),
},
TestData {
param: "foo=bar",
result: Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE_KEY)),
},
TestData {
param: "agent.container_pip_siz=1",
result: Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE_KEY)),
},
TestData {
param: "agent.container_pipe_size=1",
result: Ok(1),
},
TestData {
param: "agent.container_pipe_size=3",
result: Ok(3),
},
TestData {
param: "agent.container_pipe_size=2097152",
result: Ok(2097152),
},
TestData {
param: "agent.container_pipe_size=0",
result: Ok(0),
},
TestData {
param: "agent.container_pipe_size=-1",
result: Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_NEGATIVE)),
},
TestData {
param: "agent.container_pipe_size=foobar",
result: Err(anyhow!(
"unable to parse container pipe size
#[rstest]
#[case("", Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE)))]
#[case("agent.container_pipe_size", Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE)))]
#[case("foo=bar", Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE_KEY)))]
#[case("agent.container_pip_siz=1", Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_SIZE_KEY)))]
#[case("agent.container_pipe_size=1", Ok(1))]
#[case("agent.container_pipe_size=3", Ok(3))]
#[case("agent.container_pipe_size=2097152", Ok(2097152))]
#[case("agent.container_pipe_size=0", Ok(0))]
#[case("agent.container_pipe_size=-1", Err(anyhow!(ERR_INVALID_CONTAINER_PIPE_NEGATIVE)))]
#[case("agent.container_pipe_size=foobar", Err(anyhow!(
"unable to parse container pipe size
Caused by:
invalid digit found in string"
)),
},
TestData {
param: "agent.container_pipe_size=j",
result: Err(anyhow!(
"unable to parse container pipe size
)))]
#[case("agent.container_pipe_size=j", Err(anyhow!(
"unable to parse container pipe size
Caused by:
invalid digit found in string",
)),
},
TestData {
param: "agent.container_pipe_size=4jbsdja",
result: Err(anyhow!(
"unable to parse container pipe size
)))]
#[case("agent.container_pipe_size=4jbsdja", Err(anyhow!(
"unable to parse container pipe size
Caused by:
invalid digit found in string"
)),
},
TestData {
param: "agent.container_pipe_size=4294967296",
result: Err(anyhow!(
"unable to parse container pipe size
)))]
#[case("agent.container_pipe_size=4294967296", Err(anyhow!(
"unable to parse container pipe size
Caused by:
number too large to fit in target type"
)),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_container_pipe_size(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
)))]
fn test_get_container_pipe_size(#[case] param: &str, #[case] expected: Result<i32>) {
let result = get_container_pipe_size(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]
fn test_get_string_value() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<String>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)),
},
TestData {
param: "=",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "==",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "x=",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_VALUE)),
},
TestData {
param: "x==",
result: Ok("=".into()),
},
TestData {
param: "x===",
result: Ok("==".into()),
},
TestData {
param: "x==x",
result: Ok("=x".into()),
},
TestData {
param: "x=x",
result: Ok("x".into()),
},
TestData {
param: "x=x=",
result: Ok("x=".into()),
},
TestData {
param: "x=x=x",
result: Ok("x=x".into()),
},
TestData {
param: "foo=bar",
result: Ok("bar".into()),
},
TestData {
param: "x= =",
result: Ok(" =".into()),
},
TestData {
param: "x= =",
result: Ok(" =".into()),
},
TestData {
param: "x= = ",
result: Ok(" = ".into()),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_string_value(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
#[rstest]
#[case("", Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)))]
#[case("=", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)))]
#[case("==", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)))]
#[case("x=", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_VALUE)))]
#[case("x==", Ok("=".into()))]
#[case("x===", Ok("==".into()))]
#[case("x==x", Ok("=x".into()))]
#[case("x=x", Ok("x".into()))]
#[case("x=x=", Ok("x=".into()))]
#[case("x=x=x", Ok("x=x".into()))]
#[case("foo=bar", Ok("bar".into()))]
#[case("x= =", Ok(" =".into()))]
#[case("x= =", Ok(" =".into()))]
#[case("x= = ", Ok(" = ".into()))]
fn test_get_string_value(#[case] param: &str, #[case] expected: Result<String>) {
let result = get_string_value(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]
fn test_get_guest_components_features_value() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<GuestComponentsFeatures>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)),
},
TestData {
param: "=",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "==",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "x=all",
result: Ok(GuestComponentsFeatures::All),
},
TestData {
param: "x=attestation",
result: Ok(GuestComponentsFeatures::Attestation),
},
TestData {
param: "x=resource",
result: Ok(GuestComponentsFeatures::Resource),
},
TestData {
param: "x===",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE)),
},
TestData {
param: "x==x",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE)),
},
TestData {
param: "x=x",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE)),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_guest_components_features_value(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
#[rstest]
#[case("", Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)))]
#[case("=", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)))]
#[case("==", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)))]
#[case("x=all", Ok(GuestComponentsFeatures::All))]
#[case("x=attestation", Ok(GuestComponentsFeatures::Attestation))]
#[case("x=resource", Ok(GuestComponentsFeatures::Resource))]
#[case("x===", Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE)))]
#[case("x==x", Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE)))]
#[case("x=x", Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_REST_API_VALUE)))]
fn test_get_guest_components_features_value(
#[case] input: &str,
#[case] expected: Result<GuestComponentsFeatures>,
) {
let result = get_guest_components_features_value(input);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]
fn test_get_guest_components_procs_value() {
#[derive(Debug)]
struct TestData<'a> {
param: &'a str,
result: Result<GuestComponentsProcs>,
}
let tests = &[
TestData {
param: "",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)),
},
TestData {
param: "=",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "==",
result: Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)),
},
TestData {
param: "x=attestation-agent",
result: Ok(GuestComponentsProcs::AttestationAgent),
},
TestData {
param: "x=confidential-data-hub",
result: Ok(GuestComponentsProcs::ConfidentialDataHub),
},
TestData {
param: "x=none",
result: Ok(GuestComponentsProcs::None),
},
TestData {
param: "x=api-server-rest",
result: Ok(GuestComponentsProcs::ApiServerRest),
},
TestData {
param: "x===",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)),
},
TestData {
param: "x==x",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)),
},
TestData {
param: "x=x",
result: Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)),
},
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let result = get_guest_components_procs_value(d.param);
let msg = format!("{}: result: {:?}", msg, result);
assert_result!(d.result, result, msg);
}
#[rstest]
#[case("", Err(anyhow!(ERR_INVALID_GET_VALUE_PARAM)))]
#[case("=", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)))]
#[case("==", Err(anyhow!(ERR_INVALID_GET_VALUE_NO_NAME)))]
#[case("x=attestation-agent", Ok(GuestComponentsProcs::AttestationAgent))]
#[case(
"x=confidential-data-hub",
Ok(GuestComponentsProcs::ConfidentialDataHub)
)]
#[case("x=none", Ok(GuestComponentsProcs::None))]
#[case("x=api-server-rest", Ok(GuestComponentsProcs::ApiServerRest))]
#[case("x===", Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)))]
#[case("x==x", Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)))]
#[case("x=x", Err(anyhow!(ERR_INVALID_GUEST_COMPONENTS_PROCS_VALUE)))]
fn test_get_guest_components_procs_value(
#[case] param: &str,
#[case] expected: Result<GuestComponentsProcs>,
) {
let result = get_guest_components_procs_value(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
assert_result!(expected, result, msg);
}
#[test]


@@ -498,7 +498,7 @@ async fn init_attestation_components(logger: &Logger, config: &AgentConfig) -> R
.map_err(|e| anyhow!("launch_process {} failed: {:?}", CDH_PATH, e))?;
// initialize cdh client
cdh::init_cdh_client().await?;
cdh::init_cdh_client(CDH_SOCKET_URI).await?;
// skip launch of api-server-rest
if config.guest_components_procs == GuestComponentsProcs::ConfidentialDataHub {


@@ -224,54 +224,7 @@ impl AgentService {
// cannot predict everything from the caller.
add_devices(&sl(), &req.devices, &mut oci, &self.sandbox).await?;
let process = oci
.process_mut()
.as_mut()
.ok_or_else(|| anyhow!("Spec didn't contain process field"))?;
if cdh::is_cdh_client_initialized().await {
if let Some(envs) = process.env_mut().as_mut() {
for env in envs.iter_mut() {
match cdh::unseal_env(env).await {
Ok(unsealed_env) => *env = unsealed_env.to_string(),
Err(e) => {
warn!(sl(), "Failed to unseal secret: {}", e)
}
}
}
}
}
let linux = oci
.linux()
.as_ref()
.ok_or_else(|| anyhow!("Spec didn't contain linux field"))?;
if cdh::is_cdh_client_initialized().await {
if let Some(devices) = linux.devices() {
for specdev in devices.iter() {
if specdev.path().as_path().to_str() == Some(TRUSTED_IMAGE_STORAGE_DEVICE) {
let dev_major_minor = format!("{}:{}", specdev.major(), specdev.minor());
let secure_storage_integrity =
AGENT_CONFIG.secure_storage_integrity.to_string();
info!(
sl(),
"trusted_store device major:min {}, enable data integrity {}",
dev_major_minor,
secure_storage_integrity
);
let options = std::collections::HashMap::from([
("deviceId".to_string(), dev_major_minor),
("encryptType".to_string(), "LUKS".to_string()),
("dataIntegrity".to_string(), secure_storage_integrity),
]);
cdh::secure_mount("BlockDevice", &options, vec![], KATA_IMAGE_WORK_DIR)
.await?;
break;
}
}
}
}
cdh_handler(&mut oci).await?;
// Both rootfs and volumes (invoked with --volume for instance) will
// be processed the same way. The idea is to always mount any provided
@@ -1726,13 +1679,19 @@ fn update_container_namespaces(
if let Some(namespaces) = linux.namespaces_mut() {
for namespace in namespaces.iter_mut() {
if namespace.typ().to_string() == NSTYPEIPC {
namespace.set_path(Some(PathBuf::from(&sandbox.shared_ipcns.path.clone())));
namespace.set_path(None);
namespace.set_path(if !sandbox.shared_ipcns.path.is_empty() {
Some(PathBuf::from(&sandbox.shared_ipcns.path))
} else {
None
});
continue;
}
if namespace.typ().to_string() == NSTYPEUTS {
namespace.set_path(Some(PathBuf::from(&sandbox.shared_utsns.path.clone())));
namespace.set_path(None);
namespace.set_path(if !sandbox.shared_utsns.path.is_empty() {
Some(PathBuf::from(&sandbox.shared_utsns.path))
} else {
None
});
continue;
}
}
@@ -1750,7 +1709,7 @@ fn update_container_namespaces(
if !pidns.path.is_empty() {
pid_ns.set_path(Some(PathBuf::from(&pidns.path)));
}
} else {
} else if !sandbox.containers.is_empty() {
return Err(anyhow!(ERR_NO_SANDBOX_PIDNS));
}
}
@@ -2093,6 +2052,76 @@ fn load_kernel_module(module: &protocols::agent::KernelModule) -> Result<()> {
}
}
async fn cdh_handler(oci: &mut Spec) -> Result<()> {
if !cdh::is_cdh_client_initialized().await {
return Ok(());
}
let process = oci
.process_mut()
.as_mut()
.ok_or_else(|| anyhow!("Spec didn't contain process field"))?;
if let Some(envs) = process.env_mut().as_mut() {
for env in envs.iter_mut() {
match cdh::unseal_env(env).await {
Ok(unsealed_env) => *env = unsealed_env.to_string(),
Err(e) => {
warn!(sl(), "Failed to unseal secret: {}", e)
}
}
}
}
let mounts = oci
.mounts_mut()
.as_mut()
.ok_or_else(|| anyhow!("Spec didn't contain mounts field"))?;
for m in mounts.iter_mut() {
if m.destination().starts_with("/sealed") {
info!(
sl(),
"sealed mount destination: {:?} source: {:?}",
m.destination(),
m.source()
);
if let Some(source_str) = m.source().as_ref().and_then(|p| p.to_str()) {
cdh::unseal_file(source_str).await?;
} else {
warn!(sl(), "Failed to unseal: Mount source is None or invalid");
}
}
}
let linux = oci
.linux()
.as_ref()
.ok_or_else(|| anyhow!("Spec didn't contain linux field"))?;
if let Some(devices) = linux.devices() {
for specdev in devices.iter() {
if specdev.path().as_path().to_str() == Some(TRUSTED_IMAGE_STORAGE_DEVICE) {
let dev_major_minor = format!("{}:{}", specdev.major(), specdev.minor());
let secure_storage_integrity = AGENT_CONFIG.secure_storage_integrity.to_string();
info!(
sl(),
"trusted_store device major:min {}, enable data integrity {}",
dev_major_minor,
secure_storage_integrity
);
let options = std::collections::HashMap::from([
("deviceId".to_string(), dev_major_minor),
("encryptType".to_string(), "LUKS".to_string()),
("dataIntegrity".to_string(), secure_storage_integrity),
]);
cdh::secure_mount("BlockDevice", &options, vec![], KATA_IMAGE_WORK_DIR).await?;
break;
}
}
}
Ok(())
}
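The `cdh_handler` above assembles a fixed option map (`deviceId`, `encryptType`, `dataIntegrity`) before calling `cdh::secure_mount`. A minimal standalone sketch of that map construction, using hypothetical major/minor numbers (8:16):

```rust
use std::collections::HashMap;

// Sketch: build the secure-mount option map the way cdh_handler does.
// The major/minor values and the helper name are illustrative, not from the diff.
fn build_secure_mount_options(major: i64, minor: i64, integrity: bool) -> HashMap<String, String> {
    let dev_major_minor = format!("{}:{}", major, minor);
    HashMap::from([
        ("deviceId".to_string(), dev_major_minor),
        ("encryptType".to_string(), "LUKS".to_string()),
        ("dataIntegrity".to_string(), integrity.to_string()),
    ])
}

fn main() {
    let opts = build_secure_mount_options(8, 16, true);
    assert_eq!(opts["deviceId"], "8:16");
    assert_eq!(opts["dataIntegrity"], "true");
}
```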
#[cfg(test)]
#[allow(dead_code)]
mod tests {
@@ -2527,14 +2556,6 @@ mod tests {
.unwrap()],
..Default::default()
},
TestData {
namespaces: vec![],
sandbox_pidns_path: None,
use_sandbox_pidns: true,
result: Err(anyhow!(ERR_NO_SANDBOX_PIDNS)),
expected_namespaces: vec![],
..Default::default()
},
TestData {
has_linux_in_spec: false,
result: Err(anyhow!(ERR_NO_LINUX_FIELD)),

View File

@@ -88,12 +88,6 @@ pub const KATA_ANNO_CFG_HYPERVISOR_PREFIX: &str = "io.katacontainers.config.hype
pub const KATA_ANNO_CFG_HYPERVISOR_PATH: &str = "io.katacontainers.config.hypervisor.path";
/// A sandbox annotation for passing a container hypervisor binary SHA-512 hash value.
pub const KATA_ANNO_CFG_HYPERVISOR_HASH: &str = "io.katacontainers.config.hypervisor.path_hash";
/// A sandbox annotation for passing a per container path pointing at the hypervisor control binary
/// that will run the container VM.
pub const KATA_ANNO_CFG_HYPERVISOR_CTLPATH: &str = "io.katacontainers.config.hypervisor.ctlpath";
/// A sandbox annotation for passing a container hypervisor control binary SHA-512 hash value.
pub const KATA_ANNO_CFG_HYPERVISOR_CTLHASH: &str =
"io.katacontainers.config.hypervisor.hypervisorctl_hash";
/// A sandbox annotation for passing a per container path pointing at the jailer that will constrain
/// the container VM.
pub const KATA_ANNO_CFG_HYPERVISOR_JAILER_PATH: &str =
@@ -506,10 +500,6 @@ impl Annotation {
hv.validate_hypervisor_path(value)?;
hv.path = value.to_string();
}
KATA_ANNO_CFG_HYPERVISOR_CTLPATH => {
hv.validate_hypervisor_ctlpath(value)?;
hv.ctlpath = value.to_string();
}
KATA_ANNO_CFG_HYPERVISOR_JAILER_PATH => {
hv.validate_jailer_path(value)?;

View File

@@ -83,16 +83,17 @@ LOCALSTATEDIR := /var
CONFIG_FILE = configuration.toml
RUNTIMENAME := virt_container
HYPERVISOR_DB = dragonball
HYPERVISOR_ACRN = acrn
HYPERVISOR_FC = firecracker
HYPERVISOR_QEMU = qemu
HYPERVISOR_CLH = cloud-hypervisor
# When set to true, builds the built-in Dragonball hypervisor
USE_BUILDIN_DB := true
DEFAULT_HYPERVISOR ?= $(HYPERVISOR_DB)
HYPERVISOR ?= $(HYPERVISOR_DB)
##VAR HYPERVISOR=<hypervisor_name> List of hypervisors this build system can generate configuration for.
HYPERVISORS := $(HYPERVISOR_DB) $(HYPERVISOR_ACRN) $(HYPERVISOR_FC) $(HYPERVISOR_QEMU) $(HYPERVISOR_CLH)
HYPERVISORS := $(HYPERVISOR_DB) $(HYPERVISOR_FC) $(HYPERVISOR_QEMU) $(HYPERVISOR_CLH)
CLHPATH := $(CLHBINDIR)/$(CLHCMD)
CLHVALIDHYPERVISORPATHS := [\"$(CLHPATH)\"]
@@ -187,8 +188,6 @@ CONFIG_PATHS =
SYSCONFIG_PATHS =
# List of hypervisors known for the current architecture
KNOWN_HYPERVISORS =
# List of hypervisors known for the current architecture
KNOWN_HYPERVISORS =
CONFDIR := $(DEFAULTSDIR)/$(PROJECT_DIR)/runtime-rs
SYSCONFDIR := $(SYSCONFDIR)/$(PROJECT_DIR)
@@ -318,14 +317,14 @@ ifneq (,$(FCCMD))
DEFSTATICRESOURCEMGMT_FC := true
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_DB))
ifeq ($(HYPERVISOR),$(HYPERVISOR_DB))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_DB)
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_QEMU))
ifeq ($(HYPERVISOR),$(HYPERVISOR_QEMU))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_QEMU)
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_FC))
ifeq ($(HYPERVISOR),$(HYPERVISOR_FC))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_FC)
endif
# list of variables the user may wish to override
@@ -336,7 +335,8 @@ USER_VARS += CONFIG_FC_IN
USER_VARS += CONFIG_PATH
USER_VARS += CONFIG_QEMU_IN
USER_VARS += DESTDIR
USER_VARS += DEFAULT_HYPERVISOR
USER_VARS += HYPERVISOR
USER_VARS += USE_BUILDIN_DB
USER_VARS += DBCMD
USER_VARS += DBCTLCMD
USER_VARS += FCCTLCMD
@@ -399,7 +399,6 @@ USER_VARS += SYSCONFDIR
USER_VARS += DEFVCPUS
USER_VARS += DEFVCPUS_QEMU
USER_VARS += DEFMAXVCPUS
USER_VARS += DEFMAXVCPUS_ACRN
USER_VARS += DEFMAXVCPUS_DB
USER_VARS += DEFMAXVCPUS_QEMU
USER_VARS += DEFMEMSZ
@@ -475,6 +474,11 @@ COMMIT_MSG = $(if $(COMMIT),$(COMMIT),unknown)
EXTRA_RUSTFEATURES :=
# if use dragonball hypervisor, add the feature to build dragonball in runtime
ifeq ($(USE_BUILDIN_DB),true)
EXTRA_RUSTFEATURES += dragonball
endif
ifneq ($(EXTRA_RUSTFEATURES),)
override EXTRA_RUSTFEATURES := --features $(EXTRA_RUSTFEATURES)
endif
@@ -614,7 +618,7 @@ show-summary: show-header
@printf " %s\n" "$(call get_toolchain_version)"
@printf "\n"
@printf "• Hypervisors:\n"
@printf "\tDefault: $(DEFAULT_HYPERVISOR)\n"
@printf "\tDefault: $(HYPERVISOR)\n"
@printf "\tKnown: $(sort $(HYPERVISORS))\n"
@printf "\tAvailable for this architecture: $(sort $(KNOWN_HYPERVISORS))\n"
@printf "\n"
@@ -634,7 +638,7 @@ show-summary: show-header
@printf "\talternate config paths (SYSCONFIG_PATHS) : %s\n"
@printf \
"$(foreach c,$(sort $(SYSCONFIG_PATHS)),$(shell printf "\\t - $(c)\\\n"))"
@printf "\tdefault install path for $(DEFAULT_HYPERVISOR) (CONFIG_PATH) : %s\n" $(abspath $(CONFIG_PATH))
@printf "\tdefault install path for $(HYPERVISOR) (CONFIG_PATH) : %s\n" $(abspath $(CONFIG_PATH))
@printf "\tdefault alternate config path (SYSCONFIG) : %s\n" $(abspath $(SYSCONFIG))
ifneq (,$(findstring $(HYPERVISOR_QEMU),$(KNOWN_HYPERVISORS)))
@printf "\t$(HYPERVISOR_QEMU) hypervisor path (QEMUPATH) : %s\n" $(abspath $(QEMUPATH))
@@ -647,9 +651,6 @@ ifneq (,$(findstring $(HYPERVISOR_CLH),$(KNOWN_HYPERVISORS)))
endif
ifneq (,$(findstring $(HYPERVISOR_FC),$(KNOWN_HYPERVISORS)))
@printf "\t$(HYPERVISOR_FC) hypervisor path (FCPATH) : %s\n" $(abspath $(FCPATH))
endif
ifneq (,$(findstring $(HYPERVISOR_ACRN),$(KNOWN_HYPERVISORS)))
@printf "\t$(HYPERVISOR_ACRN) hypervisor path (ACRNPATH) : %s\n" $(abspath $(ACRNPATH))
endif
@printf "\tassets path (PKGDATADIR) : %s\n" $(abspath $(PKGDATADIR))
@printf "\tshim path (PKGLIBEXECDIR) : %s\n" $(abspath $(PKGLIBEXECDIR))

View File

@@ -42,7 +42,7 @@ pub struct StringUser {
pub additional_gids: Vec<String>,
}
#[derive(PartialEq, Clone, Default)]
#[derive(PartialEq, Clone, Debug, Default)]
pub struct Device {
pub id: String,
pub field_type: String,

View File

@@ -29,7 +29,6 @@ path-clean = "1.0.1"
lazy_static = "1.4"
tracing = "0.1.36"
dbs-utils = { path = "../../../dragonball/src/dbs_utils" }
kata-sys-util = { path = "../../../libs/kata-sys-util" }
kata-types = { path = "../../../libs/kata-types" }
logging = { path = "../../../libs/logging" }
@@ -48,7 +47,7 @@ qapi-spec = "0.3.1"
qapi-qmp = "0.14.0"
[target.'cfg(not(target_arch = "s390x"))'.dependencies]
dragonball = { path = "../../../dragonball", features = ["atomic-guest-memory", "virtio-vsock", "hotplug", "virtio-blk", "virtio-net", "virtio-fs", "vhost-net", "dbs-upcall", "virtio-mem", "virtio-balloon", "vhost-user-net", "host-device"] }
dragonball = { path = "../../../dragonball", features = ["atomic-guest-memory", "virtio-vsock", "hotplug", "virtio-blk", "virtio-net", "virtio-fs", "vhost-net", "dbs-upcall", "virtio-mem", "virtio-balloon", "vhost-user-net", "host-device"], optional = true }
dbs-utils = { path = "../../../dragonball/src/dbs_utils" }
hyperlocal = "0.8.0"
hyper = {version = "0.14.18", features = ["client"]}
@@ -56,6 +55,7 @@ hyper = {version = "0.14.18", features = ["client"]}
[features]
default = []
dragonball = ["dep:dragonball"]
# Feature is not yet complete, so not enabled by default.
# See https://github.com/kata-containers/kata-containers/issues/6264.
cloud-hypervisor = ["ch-config"]

View File

@@ -16,7 +16,6 @@ use persist::sandbox_persist::Persist;
use std::collections::HashMap;
use std::os::unix::net::UnixStream;
use tokio::sync::watch::{channel, Receiver, Sender};
use tokio::sync::Mutex;
use tokio::task::JoinHandle;
use tokio::{process::Child, sync::mpsc};
@@ -79,13 +78,12 @@ pub struct CloudHypervisorInner {
pub(crate) _guest_memory_block_size_mb: u32,
pub(crate) exit_notify: Option<mpsc::Sender<i32>>,
pub(crate) exit_waiter: Mutex<(mpsc::Receiver<i32>, i32)>,
}
const CH_DEFAULT_TIMEOUT_SECS: u32 = 10;
impl CloudHypervisorInner {
pub fn new() -> Self {
pub fn new(exit_notify: Option<mpsc::Sender<i32>>) -> Self {
let mut capabilities = Capabilities::new();
capabilities.set(
CapabilityBits::BlockDeviceSupport
@@ -95,7 +93,6 @@ impl CloudHypervisorInner {
);
let (tx, rx) = channel(true);
let (exit_notify, exit_waiter) = mpsc::channel(1);
Self {
api_socket: None,
@@ -122,8 +119,7 @@ impl CloudHypervisorInner {
ch_features: None,
_guest_memory_block_size_mb: 0,
exit_notify: Some(exit_notify),
exit_waiter: Mutex::new((exit_waiter, 0)),
exit_notify,
}
}
@@ -138,14 +134,14 @@ impl CloudHypervisorInner {
impl Default for CloudHypervisorInner {
fn default() -> Self {
Self::new()
Self::new(None)
}
}
#[async_trait]
impl Persist for CloudHypervisorInner {
type State = HypervisorState;
type ConstructorArgs = ();
type ConstructorArgs = mpsc::Sender<i32>;
// Return a state object that will be saved by the caller.
async fn save(&self) -> Result<Self::State> {
@@ -166,11 +162,10 @@ impl Persist for CloudHypervisorInner {
// Set the hypervisor state to the specified state
async fn restore(
_hypervisor_args: Self::ConstructorArgs,
exit_notify: mpsc::Sender<i32>,
hypervisor_state: Self::State,
) -> Result<Self> {
let (tx, rx) = channel(true);
let (exit_notify, exit_waiter) = mpsc::channel(1);
let mut ch = Self {
config: Some(hypervisor_state.config),
@@ -190,7 +185,6 @@ impl Persist for CloudHypervisorInner {
jailer_root: String::default(),
ch_features: None,
exit_notify: Some(exit_notify),
exit_waiter: Mutex::new((exit_waiter, 0)),
..Default::default()
};
@@ -207,7 +201,9 @@ mod tests {
#[actix_rt::test]
async fn test_save_clh() {
let mut clh = CloudHypervisorInner::new();
let (exit_notify, _exit_waiter) = mpsc::channel(1);
let mut clh = CloudHypervisorInner::new(Some(exit_notify.clone()));
clh.id = String::from("123456");
clh.netns = Some(String::from("/var/run/netns/testnet"));
clh.vm_path = String::from("/opt/kata/bin/cloud-hypervisor");
@@ -229,7 +225,7 @@ mod tests {
assert!(!state.jailed);
assert_eq!(state.hypervisor_type, HYPERVISOR_NAME_CH.to_string());
let clh = CloudHypervisorInner::restore((), state.clone())
let clh = CloudHypervisorInner::restore(exit_notify, state.clone())
.await
.unwrap();
assert_eq!(clh.id, state.id);

View File

@@ -19,7 +19,6 @@ use ch_config::ch_api::{
};
use ch_config::{guest_protection_is_tdx, NamedHypervisorConfig, VmConfig};
use core::future::poll_fn;
use futures::executor::block_on;
use futures::future::join_all;
use kata_sys_util::protection::{available_guest_protection, GuestProtection};
use kata_types::capabilities::{Capabilities, CapabilityBits};
@@ -640,7 +639,7 @@ impl CloudHypervisorInner {
Ok(())
}
pub(crate) fn stop_vm(&mut self) -> Result<()> {
pub(crate) async fn stop_vm(&mut self) -> Result<()> {
// If the container workload exits, this method gets called. However,
// the container manager always makes a ShutdownContainer request,
// which results in this method being called potentially a second
@@ -652,19 +651,14 @@ impl CloudHypervisorInner {
self.state = VmmState::NotReady;
block_on(self.cloud_hypervisor_shutdown()).map_err(|e| anyhow!(e))?;
self.cloud_hypervisor_shutdown().await?;
Ok(())
}
#[allow(dead_code)]
pub(crate) async fn wait_vm(&self) -> Result<i32> {
debug!(sl!(), "Waiting CH vmm");
let mut waiter = self.exit_waiter.lock().await;
if let Some(exitcode) = waiter.0.recv().await {
waiter.1 = exitcode;
}
Ok(waiter.1)
Ok(0)
}
pub(crate) fn pause_vm(&self) -> Result<()> {

View File

@@ -12,7 +12,7 @@ use kata_types::capabilities::{Capabilities, CapabilityBits};
use kata_types::config::hypervisor::Hypervisor as HypervisorConfig;
use persist::sandbox_persist::Persist;
use std::sync::Arc;
use tokio::sync::RwLock;
use tokio::sync::{mpsc, Mutex, RwLock};
// Convenience macro to obtain the scope logger
#[macro_export]
@@ -29,15 +29,19 @@ mod utils;
use inner::CloudHypervisorInner;
#[derive(Debug, Default, Clone)]
#[derive(Debug)]
pub struct CloudHypervisor {
inner: Arc<RwLock<CloudHypervisorInner>>,
exit_waiter: Mutex<(mpsc::Receiver<i32>, i32)>,
}
impl CloudHypervisor {
pub fn new() -> Self {
let (exit_notify, exit_waiter) = mpsc::channel(1);
Self {
inner: Arc::new(RwLock::new(CloudHypervisorInner::new())),
inner: Arc::new(RwLock::new(CloudHypervisorInner::new(Some(exit_notify)))),
exit_waiter: Mutex::new((exit_waiter, 0)),
}
}
@@ -47,6 +51,12 @@ impl CloudHypervisor {
}
}
impl Default for CloudHypervisor {
fn default() -> Self {
Self::new()
}
}
#[async_trait]
impl Hypervisor for CloudHypervisor {
async fn prepare_vm(&self, id: &str, netns: Option<String>) -> Result<()> {
@@ -61,12 +71,17 @@ impl Hypervisor for CloudHypervisor {
async fn stop_vm(&self) -> Result<()> {
let mut inner = self.inner.write().await;
inner.stop_vm()
inner.stop_vm().await
}
async fn wait_vm(&self) -> Result<i32> {
let inner = self.inner.read().await;
inner.wait_vm().await
debug!(sl!(), "Waiting CH vmm");
let mut waiter = self.exit_waiter.lock().await;
if let Some(exitcode) = waiter.0.recv().await {
waiter.1 = exitcode;
}
Ok(waiter.1)
}
async fn pause_vm(&self) -> Result<()> {
@@ -204,12 +219,15 @@ impl Persist for CloudHypervisor {
}
async fn restore(
hypervisor_args: Self::ConstructorArgs,
_hypervisor_args: Self::ConstructorArgs,
hypervisor_state: Self::State,
) -> Result<Self> {
let inner = CloudHypervisorInner::restore(hypervisor_args, hypervisor_state).await?;
let (exit_notify, exit_waiter) = mpsc::channel(1);
let inner = CloudHypervisorInner::restore(exit_notify, hypervisor_state).await?;
Ok(Self {
inner: Arc::new(RwLock::new(inner)),
exit_waiter: Mutex::new((exit_waiter, 0)),
})
}
}

View File
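The Cloud Hypervisor changes above move the `exit_waiter` out of the inner type: the inner keeps the `Sender`, a monitor task signals the VMM's exit code, and `wait_vm()` blocks on the `Receiver` and caches the result. The diff uses `tokio::sync::mpsc`; the sketch below substitutes `std::sync::mpsc` and a thread so it is self-contained, and the type names are illustrative:

```rust
use std::sync::mpsc;
use std::thread;

// Sketch of the exit-notify pattern: the receiver half lives in the outer
// hypervisor struct, and the cached exit code is returned on repeat calls.
struct ExitWaiter {
    rx: mpsc::Receiver<i32>,
    cached: i32,
}

impl ExitWaiter {
    fn wait_vm(&mut self) -> i32 {
        // Block until the monitor task reports the VMM exit code,
        // then remember it for subsequent callers.
        if let Ok(code) = self.rx.recv() {
            self.cached = code;
        }
        self.cached
    }
}

fn main() {
    let (exit_notify, exit_waiter) = mpsc::channel();
    // Simulate a monitor task observing the VMM exit with code 0.
    thread::spawn(move || {
        exit_notify.send(0).unwrap();
    });
    let mut waiter = ExitWaiter { rx: exit_waiter, cached: -1 };
    assert_eq!(waiter.wait_vm(), 0);
}
```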

@@ -115,12 +115,12 @@ pub enum VfioDeviceType {
Mediated,
}
// DeviceVendor represents a PCI device's device id and vendor id
// DeviceVendor: (device, vendor)
// DeviceVendorClass represents a PCI device's deviceID, vendorID and classID
// DeviceVendorClass: (device, vendor, class)
#[derive(Clone, Debug)]
pub struct DeviceVendor(String, String);
pub struct DeviceVendorClass(String, String, String);
impl DeviceVendor {
impl DeviceVendorClass {
pub fn get_device_vendor(&self) -> Result<(u32, u32)> {
// default value is 0 when vendor_id or device_id is empty
if self.0.is_empty() || self.1.is_empty() {
@@ -142,6 +142,10 @@ impl DeviceVendor {
Ok((device, vendor))
}
pub fn get_vendor_class_id(&self) -> Result<(&str, &str)> {
Ok((&self.1, &self.2))
}
pub fn get_device_vendor_id(&self) -> Result<u32> {
let (device, vendor) = self
.get_device_vendor()
@@ -163,8 +167,8 @@ pub struct HostDevice {
/// PCI device information (BDF): "bus:slot:function"
pub bus_slot_func: String,
/// device_vendor: device id and vendor id
pub device_vendor: Option<DeviceVendor>,
/// device_vendor_class: (device, vendor, class)
pub device_vendor_class: Option<DeviceVendorClass>,
/// type of vfio device
pub vfio_type: VfioDeviceType,
@@ -336,13 +340,14 @@ impl VfioDevice {
}
// read vendor, device and class from /sys/bus/pci/devices/BDF/X
fn get_vfio_device_vendor(&self, bdf: &str) -> Result<DeviceVendor> {
fn get_vfio_device_vendor_class(&self, bdf: &str) -> Result<DeviceVendorClass> {
let device =
get_device_property(bdf, "device").context("get device from syspath failed")?;
let vendor =
get_device_property(bdf, "vendor").context("get vendor from syspath failed")?;
let class = get_device_property(bdf, "class").context("get class from syspath failed")?;
Ok(DeviceVendor(device, vendor))
Ok(DeviceVendorClass(device, vendor, class))
}
fn set_vfio_config(
@@ -356,13 +361,13 @@ impl VfioDevice {
// It's safe as BDF really exists.
let dev_bdf = vfio_dev_details.0.unwrap();
let dev_vendor = self
.get_vfio_device_vendor(&dev_bdf)
let dev_vendor_class = self
.get_vfio_device_vendor_class(&dev_bdf)
.context("get property device and vendor failed")?;
let vfio_dev = HostDevice {
bus_slot_func: dev_bdf.clone(),
device_vendor: Some(dev_vendor),
device_vendor_class: Some(dev_vendor_class),
sysfs_path: vfio_dev_details.1,
vfio_type: vfio_dev_details.2,
..Default::default()

View File
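The `DeviceVendorClass` tuple above stores the raw sysfs strings; `get_device_vendor` parses them into numeric IDs. Sysfs exposes these attributes as hex strings such as `0x10de`, so a sketch of that parsing step (the helper name and sample values are hypothetical, not from the diff):

```rust
// Sketch: parse a sysfs PCI attribute string ("0x10de") into a u32,
// mirroring what DeviceVendorClass::get_device_vendor has to do.
fn parse_sysfs_id(raw: &str) -> Result<u32, std::num::ParseIntError> {
    u32::from_str_radix(raw.trim().trim_start_matches("0x"), 16)
}

fn main() {
    // Sample NVIDIA vendor/device IDs, for illustration only.
    let vendor = parse_sysfs_id("0x10de").unwrap();
    let device = parse_sysfs_id("0x1db6").unwrap();
    assert_eq!(vendor, 0x10de);
    assert_eq!(device, 0x1db6);
}
```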

@@ -20,6 +20,8 @@ use self::topology::PCIeTopology;
pub mod device_manager;
pub mod driver;
pub mod pci_path;
mod tap;
pub use self::tap::{Error as TapError, Tap};
pub mod topology;
pub mod util;

View File

@@ -0,0 +1,264 @@
// Copyright 2024 Kata Contributors
// Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Portions Copyright 2017 The Chromium OS Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the THIRD-PARTY file.
use std::ffi::CStr;
use std::fs::File;
use std::io::{Error as IoError, Read, Result as IoResult, Write};
use std::net::UdpSocket;
use std::os::raw::*;
use std::os::unix::io::{AsRawFd, FromRawFd, RawFd};
use libc::ifreq;
use vmm_sys_util::ioctl::{ioctl_with_mut_ref, ioctl_with_ref, ioctl_with_val};
use vmm_sys_util::{ioctl_ioc_nr, ioctl_iow_nr};
// As defined in the Linux UAPI:
// https://elixir.bootlin.com/linux/v4.17/source/include/uapi/linux/if.h#L33
pub(crate) const IFACE_NAME_MAX_LEN: usize = 16;
/// List of errors the tap implementation can throw.
#[derive(Debug, thiserror::Error)]
pub enum Error {
/// Failed to create a socket.
#[error("cannot create socket. {0}")]
CreateSocket(#[source] IoError),
/// Unable to create tap interface.
#[error("cannot create tap device. {0}")]
CreateTap(IoError),
/// Invalid interface name.
#[error("invalid network interface name")]
InvalidIfname,
/// ioctl failed.
#[error("failure while issue Tap ioctl command. {0}")]
IoctlError(#[source] IoError),
/// Couldn't open /dev/net/tun.
#[error("cannot open tap device. {0}")]
OpenTun(#[source] IoError),
}
pub type Result<T> = ::std::result::Result<T, Error>;
const TUNTAP: ::std::os::raw::c_uint = 84;
ioctl_iow_nr!(TUNSETIFF, TUNTAP, 202, ::std::os::raw::c_int);
ioctl_iow_nr!(TUNSETOFFLOAD, TUNTAP, 208, ::std::os::raw::c_uint);
ioctl_iow_nr!(TUNSETVNETHDRSZ, TUNTAP, 216, ::std::os::raw::c_int);
/// Handle for a network tap interface.
///
/// For now, this simply wraps the file descriptor for the tap device so methods
/// can run ioctls on the interface. The tap interface fd will be closed when
/// Tap goes out of scope, and the kernel will clean up the interface automatically.
#[derive(Debug)]
pub struct Tap {
/// tap device file handle
pub tap_file: File,
pub(crate) if_name: [std::os::raw::c_char; IFACE_NAME_MAX_LEN],
pub(crate) if_flags: std::os::raw::c_short,
}
impl PartialEq for Tap {
fn eq(&self, other: &Tap) -> bool {
self.if_name == other.if_name
}
}
fn create_socket() -> Result<UdpSocket> {
// This is safe since we check the return value.
let sock = unsafe { libc::socket(libc::AF_INET, libc::SOCK_DGRAM, 0) };
if sock < 0 {
return Err(Error::CreateSocket(IoError::last_os_error()));
}
// This is safe; nothing else will use or hold onto the raw sock fd.
Ok(unsafe { UdpSocket::from_raw_fd(sock) })
}
// Returns an array representing the contents of a null-terminated C string
// containing if_name.
pub fn build_terminated_if_name(if_name: &str) -> Result<[c_char; IFACE_NAME_MAX_LEN]> {
// Convert the string slice to bytes, and shadow the variable,
// since we no longer need the &str version.
let if_name_bytes = if_name.as_bytes();
if if_name_bytes.len() >= IFACE_NAME_MAX_LEN {
return Err(Error::InvalidIfname);
}
let mut terminated_if_name = [0 as c_char; IFACE_NAME_MAX_LEN];
for (i, &byte) in if_name_bytes.iter().enumerate() {
terminated_if_name[i] = byte as c_char;
}
// 0 is the null terminator for c_char type
terminated_if_name[if_name_bytes.len()] = 0 as c_char;
Ok(terminated_if_name)
}
impl Tap {
/// Create a TUN/TAP device given the interface name.
/// # Arguments
///
/// * `if_name` - the name of the interface.
pub fn open_named(if_name: &str, multi_vq: bool) -> Result<Tap> {
let terminated_if_name = build_terminated_if_name(if_name)?;
// Initialize an `ifreq` structure with the given interface name
// and configure its flags for setting up a network interface.
let mut ifr = ifreq {
ifr_name: terminated_if_name,
ifr_ifru: libc::__c_anonymous_ifr_ifru {
ifru_flags: (libc::IFF_TAP
| libc::IFF_NO_PI
| libc::IFF_VNET_HDR
| if multi_vq { libc::IFF_MULTI_QUEUE } else { 0 })
as c_short,
},
};
Tap::create_tap_with_ifreq(&mut ifr)
}
fn create_tap_with_ifreq(ifr: &mut ifreq) -> Result<Tap> {
let fd = unsafe {
let dev_net_tun = CStr::from_bytes_with_nul(b"/dev/net/tun\0").unwrap_or_else(|_| {
unreachable!("The string is guaranteed to be null-terminated and valid.")
});
// Open calls are safe because we use a CStr, which guarantees a
// constant null-terminated string.
libc::open(
dev_net_tun.as_ptr(),
libc::O_RDWR | libc::O_NONBLOCK | libc::O_CLOEXEC,
)
};
if fd < 0 {
return Err(Error::OpenTun(IoError::last_os_error()));
}
// We just checked that the fd is valid.
let tuntap = unsafe { File::from_raw_fd(fd) };
// ioctl is safe since we call it with a valid tap fd and check the return
// value.
let ret = unsafe { ioctl_with_mut_ref(&tuntap, TUNSETIFF(), ifr) };
if ret < 0 {
return Err(Error::CreateTap(IoError::last_os_error()));
}
Ok(Tap {
tap_file: tuntap,
if_name: ifr.ifr_name,
// This is safe since ifru_flags was correctly initialized earlier.
if_flags: unsafe { ifr.ifr_ifru.ifru_flags },
})
}
/// Convert the original tap into multiqueue taps.
pub fn into_mq_taps(self, vq_pairs: usize) -> Result<Vec<Tap>> {
let mut taps = Vec::with_capacity(vq_pairs);
if vq_pairs == 1 {
// vq_pairs cannot be less than 1, so only handle the case where it equals 1.
taps.push(self);
return Ok(taps);
}
// Add other sockets to the original tap interface.
for _ in 0..vq_pairs - 1 {
let mut ifr: ifreq = self.get_ifreq();
let tap = Tap::create_tap_with_ifreq(&mut ifr)?;
tap.enable()?;
taps.push(tap);
}
taps.insert(0, self);
Ok(taps)
}
/// Set the offload flags for the tap interface.
pub fn set_offload(&self, flags: c_uint) -> Result<()> {
// ioctl is safe. Called with a valid tap fd, and we check the return.
let ret = unsafe { ioctl_with_val(&self.tap_file, TUNSETOFFLOAD(), c_ulong::from(flags)) };
if ret < 0 {
return Err(Error::IoctlError(IoError::last_os_error()));
}
Ok(())
}
/// Enable the tap interface.
pub fn enable(&self) -> Result<()> {
let sock = create_socket()?;
let mut ifr = self.get_ifreq();
ifr.ifr_ifru.ifru_flags = (libc::IFF_UP | libc::IFF_RUNNING) as i16;
// ioctl is safe. Called with a valid sock fd, and we check the return.
let ret = unsafe { ioctl_with_ref(&sock, c_ulong::from(libc::SIOCSIFFLAGS), &ifr) };
if ret < 0 {
return Err(Error::IoctlError(IoError::last_os_error()));
}
Ok(())
}
/// Set the size of the vnet hdr.
pub fn set_vnet_hdr_size(&self, size: c_int) -> Result<()> {
// ioctl is safe. Called with a valid tap fd, and we check the return.
let ret = unsafe { ioctl_with_ref(&self.tap_file, TUNSETVNETHDRSZ(), &size) };
if ret < 0 {
return Err(Error::IoctlError(IoError::last_os_error()));
}
Ok(())
}
fn get_ifreq(&self) -> ifreq {
let mut ifr_name = [0 as c_char; libc::IFNAMSIZ];
ifr_name[..self.if_name.len()].copy_from_slice(&self.if_name);
// Return an `ifreq` structure with the interface name and flags.
ifreq {
ifr_name,
ifr_ifru: libc::__c_anonymous_ifr_ifru {
ifru_flags: self.if_flags,
},
}
}
/// Get the original flags from when the interface was created.
pub fn if_flags(&self) -> u32 {
self.if_flags as u32
}
}
impl Read for Tap {
fn read(&mut self, buf: &mut [u8]) -> IoResult<usize> {
self.tap_file.read(buf)
}
}
impl Write for Tap {
fn write(&mut self, buf: &[u8]) -> IoResult<usize> {
self.tap_file.write(buf)
}
fn flush(&mut self) -> IoResult<()> {
Ok(())
}
}
impl AsRawFd for Tap {
fn as_raw_fd(&self) -> RawFd {
self.tap_file.as_raw_fd()
}
}

View File
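The new `tap.rs` above enforces the kernel's `IFNAMSIZ` limit: an interface name must fit in 16 bytes including the null terminator, so 15 usable characters. A self-contained sketch of that name-buffer construction (using `u8` instead of `c_char` so it needs no libc):

```rust
// Sketch of build_terminated_if_name from tap.rs: copy the name into a
// fixed 16-byte buffer, rejecting names that leave no room for the
// null terminator. Returns None instead of the Error type for brevity.
const IFACE_NAME_MAX_LEN: usize = 16;

fn build_terminated_if_name(if_name: &str) -> Option<[u8; IFACE_NAME_MAX_LEN]> {
    let bytes = if_name.as_bytes();
    if bytes.len() >= IFACE_NAME_MAX_LEN {
        return None; // too long to fit with the null terminator
    }
    let mut out = [0u8; IFACE_NAME_MAX_LEN]; // zero-filled, so already terminated
    out[..bytes.len()].copy_from_slice(bytes);
    Some(out)
}

fn main() {
    let name = build_terminated_if_name("tap0").unwrap();
    assert_eq!(&name[..4], b"tap0");
    assert_eq!(name[4], 0); // null terminator follows the name
    assert!(build_terminated_if_name("0123456789abcdef").is_none()); // 16 chars: rejected
}
```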

@@ -147,8 +147,8 @@ impl DragonballInner {
// And the first one is the Primary device.
// safe here, devices is not empty.
let primary_device = device.devices.first_mut().unwrap();
let vendor_device_id = if let Some(vd) = primary_device.device_vendor.as_ref() {
vd.get_device_vendor_id()?
let vendor_device_id = if let Some(vdc) = primary_device.device_vendor_class.as_ref() {
vdc.get_device_vendor_id()?
} else {
0
};

View File

@@ -107,6 +107,11 @@ impl FcInner {
.get_resource(&self.config.boot_info.image, FC_ROOT_FS)
.context("get resource ROOTFS")?;
let body_config: String = json!({
"mem_size_mib": self.config.memory_info.default_memory,
"vcpu_count": self.config.cpu_info.default_vcpus,
})
.to_string();
let body_kernel: String = json!({
"kernel_image_path": kernel,
"boot_args": parameters,
@@ -124,6 +129,8 @@ impl FcInner {
info!(sl(), "Before first request");
self.request_with_retry(Method::PUT, "/boot-source", body_kernel)
.await?;
self.request_with_retry(Method::PUT, "/machine-config", body_config)
.await?;
self.request_with_retry(Method::PUT, "/drives/rootfs", body_rootfs)
.await?;

View File

@@ -46,14 +46,12 @@ pub struct FcInner {
pub(crate) capabilities: Capabilities,
pub(crate) fc_process: Mutex<Option<Child>>,
pub(crate) exit_notify: Option<mpsc::Sender<()>>,
pub(crate) exit_waiter: Mutex<(mpsc::Receiver<()>, i32)>,
}
impl FcInner {
pub fn new() -> FcInner {
pub fn new(exit_notify: mpsc::Sender<()>) -> FcInner {
let mut capabilities = Capabilities::new();
capabilities.set(CapabilityBits::BlockDeviceSupport);
let (exit_notify, exit_waiter) = mpsc::channel(1);
FcInner {
id: String::default(),
@@ -71,7 +69,6 @@ impl FcInner {
capabilities,
fc_process: Mutex::new(None),
exit_notify: Some(exit_notify),
exit_waiter: Mutex::new((exit_waiter, 0)),
}
}
@@ -124,11 +121,10 @@ impl FcInner {
let mut child = cmd.stderr(Stdio::piped()).spawn()?;
let stderr = child.stderr.take().unwrap();
let exit_notify: mpsc::Sender<()> = self
let exit_notify = self
.exit_notify
.take()
.ok_or_else(|| anyhow!("no exit notify"))?;
tokio::spawn(log_fc_stderr(stderr, exit_notify));
match child.id() {
@@ -216,7 +212,7 @@ async fn log_fc_stderr(stderr: ChildStderr, exit_notify: mpsc::Sender<()>) -> Re
#[async_trait]
impl Persist for FcInner {
type State = HypervisorState;
type ConstructorArgs = ();
type ConstructorArgs = mpsc::Sender<()>;
async fn save(&self) -> Result<Self::State> {
Ok(HypervisorState {
@@ -231,12 +227,7 @@ impl Persist for FcInner {
..Default::default()
})
}
async fn restore(
_hypervisor_args: Self::ConstructorArgs,
hypervisor_state: Self::State,
) -> Result<Self> {
let (exit_notify, exit_waiter) = mpsc::channel(1);
async fn restore(exit_notify: mpsc::Sender<()>, hypervisor_state: Self::State) -> Result<Self> {
Ok(FcInner {
id: hypervisor_state.id,
asock_path: String::default(),
@@ -253,7 +244,6 @@ impl Persist for FcInner {
capabilities: Capabilities::new(),
fc_process: Mutex::new(None),
exit_notify: Some(exit_notify),
exit_waiter: Mutex::new((exit_waiter, 0)),
})
}
}

View File

@@ -102,21 +102,14 @@ impl FcInner {
}
pub(crate) async fn wait_vm(&self) -> Result<i32> {
debug!(sl(), "Wait fc sandbox");
let mut waiter = self.exit_waiter.lock().await;
// wait until the fc process has exited.
waiter.0.recv().await;
let mut fc_process = self.fc_process.lock().await;
if let Some(mut fc_process) = fc_process.take() {
if let Ok(status) = fc_process.wait().await {
waiter.1 = status.code().unwrap_or(0);
}
let status = fc_process.wait().await?;
Ok(status.code().unwrap_or(0))
} else {
Err(anyhow!("the process has been reaped"))
}
Ok(waiter.1)
}
pub(crate) fn pause_vm(&self) -> Result<()> {

View File

@@ -19,11 +19,14 @@ use kata_types::capabilities::Capabilities;
use kata_types::capabilities::CapabilityBits;
use persist::sandbox_persist::Persist;
use std::sync::Arc;
use tokio::sync::mpsc;
use tokio::sync::Mutex;
use tokio::sync::RwLock;
#[derive(Debug)]
pub struct Firecracker {
inner: Arc<RwLock<FcInner>>,
exit_waiter: Mutex<(mpsc::Receiver<()>, i32)>,
}
// Convenience function to set the scope.
@@ -39,8 +42,11 @@ impl Default for Firecracker {
impl Firecracker {
pub fn new() -> Self {
let (exit_notify, exit_waiter) = mpsc::channel(1);
Self {
inner: Arc::new(RwLock::new(FcInner::new())),
inner: Arc::new(RwLock::new(FcInner::new(exit_notify))),
exit_waiter: Mutex::new((exit_waiter, 0)),
}
}
@@ -68,8 +74,18 @@ impl Hypervisor for Firecracker {
}
async fn wait_vm(&self) -> Result<i32> {
debug!(sl(), "Wait fc sandbox");
let mut waiter = self.exit_waiter.lock().await;
// wait until the fc process has exited.
waiter.0.recv().await;
let inner = self.inner.read().await;
inner.wait_vm().await
if let Ok(exit_code) = inner.wait_vm().await {
waiter.1 = exit_code;
}
Ok(waiter.1)
}
async fn pause_vm(&self) -> Result<()> {
@@ -209,12 +225,15 @@ impl Persist for Firecracker {
}
/// Restore a component from a specified state.
async fn restore(
hypervisor_args: Self::ConstructorArgs,
_hypervisor_args: Self::ConstructorArgs,
hypervisor_state: Self::State,
) -> Result<Self> {
let inner = FcInner::restore(hypervisor_args, hypervisor_state).await?;
let (exit_notify, exit_waiter) = mpsc::channel(1);
let inner = FcInner::restore(exit_notify, hypervisor_state).await?;
Ok(Self {
inner: Arc::new(RwLock::new(inner)),
exit_waiter: Mutex::new((exit_waiter, 0)),
})
}
}

View File

@@ -10,7 +10,7 @@ use serde::{Deserialize, Serialize};
use std::collections::HashSet;
#[derive(Serialize, Deserialize, Default, Clone, Debug)]
pub struct HypervisorState {
// Type of hypervisor, E.g. dragonball/qemu/firecracker/acrn.
// Type of hypervisor, E.g. dragonball/qemu/firecracker.
pub hypervisor_type: String,
pub pid: Option<i32>,
pub uuid: String,

View File

@@ -13,7 +13,7 @@ pub mod device;
pub mod hypervisor_persist;
pub use device::driver::*;
use device::DeviceType;
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
pub mod dragonball;
#[cfg(not(target_arch = "s390x"))]
pub mod firecracker;
@@ -53,12 +53,14 @@ const VM_ROOTFS_FILESYSTEM_EROFS: &str = "erofs";
// /dev/hugepages will be the mount point
// mkdir -p /dev/hugepages
// mount -t hugetlbfs none /dev/hugepages
#[cfg(not(target_arch = "s390x"))]
const DEV_HUGEPAGES: &str = "/dev/hugepages";
pub const HUGETLBFS: &str = "hugetlbfs";
#[cfg(not(target_arch = "s390x"))]
// Constants required for Dragonball VMM when enabled and not on s390x.
// Not needed when the built-in VMM is not used.
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
const DEV_HUGEPAGES: &str = "/dev/hugepages";
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
const SHMEM: &str = "shmem";
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
const HUGE_SHMEM: &str = "hugeshmem";
pub const HYPERVISOR_DRAGONBALL: &str = "dragonball";

View File

@@ -8,6 +8,7 @@ use crate::{kernel_param::KernelParams, Address, HypervisorConfig};
use anyhow::{anyhow, Context, Result};
use async_trait::async_trait;
use kata_types::config::hypervisor::VIRTIO_SCSI;
use std::collections::HashMap;
use std::fmt::Display;
use std::fs::{read_to_string, File};
@@ -49,7 +50,7 @@ trait ToQemuParams: Send + Sync {
async fn qemu_params(&self) -> Result<Vec<String>>;
}
#[derive(Debug, PartialEq)]
#[derive(Debug, PartialEq, Clone, Copy)]
enum VirtioBusType {
Pci,
Ccw,
@@ -70,6 +71,14 @@ impl Display for VirtioBusType {
}
}
fn bus_type(config: &HypervisorConfig) -> VirtioBusType {
if config.machine_info.machine_type.contains("-ccw-") {
VirtioBusType::Ccw
} else {
VirtioBusType::Pci
}
}
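The `bus_type()` helper above keys off the machine type string. As a standalone sketch (the `VirtioBusType` enum and the `-ccw-` check are taken from the diff; the free-function name is illustrative):

```rust
// Minimal sketch of the machine-type to virtio bus mapping: s390x machine
// types (e.g. "s390-ccw-virtio") contain "-ccw-", everything else
// (e.g. "q35", "virt") falls back to PCI.
#[derive(Debug, PartialEq, Clone, Copy)]
enum VirtioBusType {
    Pci,
    Ccw,
}

fn bus_type_for_machine(machine_type: &str) -> VirtioBusType {
    if machine_type.contains("-ccw-") {
        VirtioBusType::Ccw
    } else {
        VirtioBusType::Pci
    }
}

fn main() {
    assert_eq!(bus_type_for_machine("s390-ccw-virtio"), VirtioBusType::Ccw);
    assert_eq!(bus_type_for_machine("q35"), VirtioBusType::Pci);
    assert_eq!(bus_type_for_machine("virt"), VirtioBusType::Pci);
}
```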
// Conventions used in qemu command line generation
// ================================================
//
@@ -975,7 +984,7 @@ fn format_fds(files: &[File]) -> String {
}
#[derive(Debug)]
struct Netdev {
pub struct Netdev {
id: String,
// File descriptors for vhost multi-queue support.
@@ -1013,6 +1022,18 @@ impl Netdev {
self.disable_vhost_net = disable_vhost_net;
self
}
pub fn get_id(&self) -> &String {
&self.id
}
pub fn get_fds(&self) -> &Vec<File> {
&self.fds["fds"]
}
pub fn get_vhostfds(&self) -> &Vec<File> {
&self.fds["vhostfds"]
}
}
#[async_trait]
@@ -1081,6 +1102,26 @@ impl DeviceVirtioNet {
self.iommu_platform = iommu_platform;
self
}
pub fn get_netdev_id(&self) -> &String {
&self.netdev_id
}
pub fn get_device_driver(&self) -> &String {
&self.device_driver
}
pub fn get_mac_addr(&self) -> String {
format!("{:?}", self.mac_address)
}
pub fn get_num_queues(&self) -> u32 {
self.num_queues
}
pub fn get_disable_modern(&self) -> bool {
self.disable_modern
}
}
#[async_trait]
@@ -1290,6 +1331,74 @@ impl ToQemuParams for DeviceIntelIommu {
}
}
#[derive(Debug)]
struct DevicePciBridge {
driver: String,
bus: String,
id: String,
chassis_nr: u32,
shpc: bool,
addr: u32,
io_reserve: String,
mem_reserve: String,
pref64_reserve: String,
}
impl DevicePciBridge {
fn new(config: &HypervisorConfig, bridge_idx: u32) -> DevicePciBridge {
DevicePciBridge {
// The go runtime doesn't support bridges other than PCI although
// PCIe should also be available. Stick with the legacy behaviour
// of ignoring PCIe since it's not clear to me how to decide
// between the two.
driver: "pci-bridge".to_owned(),
bus: match config.machine_info.machine_type.as_str() {
"q35" | "virt" => "pcie.0",
_ => "pci.0",
}
.to_owned(),
id: format!("pci-bridge-{}", bridge_idx),
// Each bridge is required to be assigned a unique chassis id > 0.
chassis_nr: bridge_idx + 1,
shpc: false,
// 2 is documented by the go runtime as the first slot available
// for a bridge (on x86_64)
// (https://github.com/kata-containers/kata-containers/blob/99730256a2899c82d111400024621519d17ea15d/src/runtime/virtcontainers/qemu_arch_base.go#L212)
addr: 2 + bridge_idx,
// Values taken from the go runtime implementation which comments
// the choices as follows:
// Certain guest BIOS versions think !SHPC means no hotplug, and
// won't reserve the IO and memory windows that will be needed for
// devices added underneath this bridge. This will only break for
// certain combinations of exact qemu, BIOS and guest kernel
// versions, but for consistency, just hint the usual default
// windows for a bridge (as the BIOS would use with SHPC) so that
// we can do ACPI hotplug.
// (https://github.com/kata-containers/kata-containers/blob/99730256a2899c82d111400024621519d17ea15d/src/runtime/virtcontainers/qemu.go#L2474)
io_reserve: "4k".to_owned(),
mem_reserve: "1m".to_owned(),
pref64_reserve: "1m".to_owned(),
}
}
}
#[async_trait]
impl ToQemuParams for DevicePciBridge {
async fn qemu_params(&self) -> Result<Vec<String>> {
let mut params = Vec::new();
params.push(self.driver.clone());
params.push(format!("bus={}", self.bus));
params.push(format!("id={}", self.id));
params.push(format!("chassis_nr={}", self.chassis_nr));
params.push(format!("shpc={}", if self.shpc { "on" } else { "off" }));
params.push(format!("addr={}", self.addr));
params.push(format!("io-reserve={}", self.io_reserve));
params.push(format!("mem-reserve={}", self.mem_reserve));
params.push(format!("pref64-reserve={}", self.pref64_reserve));
Ok(vec!["-device".to_owned(), params.join(",")])
}
}
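The `qemu_params()` implementation above joins `key=value` pairs into a single `-device` argument. A minimal sketch of the same joining scheme, with the field values the `DevicePciBridge` constructor would pick for bridge index 0 on a q35 machine (the helper function is illustrative, not part of the diff):

```rust
// Sketch: build the comma-joined "-device" argument for a PCI bridge,
// mirroring DevicePciBridge::qemu_params() with the constructor's defaults
// (chassis_nr = idx + 1, addr = idx + 2, shpc off, fixed reserve windows).
fn pci_bridge_params(bridge_idx: u32, bus: &str) -> Vec<String> {
    let params = vec![
        "pci-bridge".to_owned(),
        format!("bus={}", bus),
        format!("id=pci-bridge-{}", bridge_idx),
        format!("chassis_nr={}", bridge_idx + 1),
        "shpc=off".to_owned(),
        format!("addr={}", 2 + bridge_idx),
        "io-reserve=4k".to_owned(),
        "mem-reserve=1m".to_owned(),
        "pref64-reserve=1m".to_owned(),
    ];
    vec!["-device".to_owned(), params.join(",")]
}

fn main() {
    let args = pci_bridge_params(0, "pcie.0");
    assert_eq!(args[0], "-device");
    assert_eq!(
        args[1],
        "pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=off,\
         addr=2,io-reserve=4k,mem-reserve=1m,pref64-reserve=1m"
    );
}
```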
// Qemu provides methods and types for managing QEMU instances.
// To manage a qemu instance after it has been launched you need
// to pass the -qmp option during launch requesting the qemu instance
@@ -1405,6 +1514,66 @@ impl ToQemuParams for QmpSocket {
}
}
#[derive(Debug)]
struct DeviceVirtioScsi {
bus_type: VirtioBusType,
id: String,
disable_modern: bool,
iothread: String,
}
impl DeviceVirtioScsi {
fn new(id: &str, disable_modern: bool, bus_type: VirtioBusType) -> Self {
DeviceVirtioScsi {
bus_type,
id: id.to_owned(),
disable_modern,
iothread: "".to_owned(),
}
}
fn set_iothread(&mut self, iothread: &str) {
self.iothread = iothread.to_owned();
}
}
#[async_trait]
impl ToQemuParams for DeviceVirtioScsi {
async fn qemu_params(&self) -> Result<Vec<String>> {
let mut params = Vec::new();
params.push(format!("virtio-scsi-{}", self.bus_type));
params.push(format!("id={}", self.id));
if self.disable_modern {
params.push("disable-modern=true".to_owned());
}
if !self.iothread.is_empty() {
params.push(format!("iothread={}", self.iothread));
}
Ok(vec!["-device".to_owned(), params.join(",")])
}
}
#[derive(Debug)]
struct ObjectIoThread {
id: String,
}
impl ObjectIoThread {
fn new(id: &str) -> Self {
ObjectIoThread { id: id.to_owned() }
}
}
#[async_trait]
impl ToQemuParams for ObjectIoThread {
async fn qemu_params(&self) -> Result<Vec<String>> {
let mut params = Vec::new();
params.push("iothread".to_owned());
params.push(format!("id={}", self.id));
Ok(vec!["-object".to_owned(), params.join(",")])
}
}
fn is_running_in_vm() -> Result<bool> {
let res = read_to_string("/proc/cpuinfo")?
.lines()
@@ -1480,10 +1649,18 @@ impl<'a> QemuCmdLine<'a> {
qemu_cmd_line.add_rtc();
if qemu_cmd_line.bus_type() != VirtioBusType::Ccw {
if bus_type(config) != VirtioBusType::Ccw {
qemu_cmd_line.add_rng();
}
if bus_type(config) != VirtioBusType::Ccw && config.device_info.default_bridges > 0 {
qemu_cmd_line.add_bridges(config.device_info.default_bridges);
}
if config.blockdev_info.block_device_driver == VIRTIO_SCSI {
qemu_cmd_line.add_scsi_controller();
}
Ok(qemu_cmd_line)
}
@@ -1507,14 +1684,6 @@ impl<'a> QemuCmdLine<'a> {
self.devices.push(Box::new(rng_device));
}
fn bus_type(&self) -> VirtioBusType {
if self.config.machine_info.machine_type.contains("-ccw-") {
VirtioBusType::Ccw
} else {
VirtioBusType::Pci
}
}
fn add_iommu(&mut self) {
let dev_iommu = DeviceIntelIommu::new();
self.devices.push(Box::new(dev_iommu));
@@ -1526,6 +1695,25 @@ impl<'a> QemuCmdLine<'a> {
self.machine.set_kernel_irqchip("split");
}
fn add_bridges(&mut self, count: u32) {
for idx in 0..count {
let bridge = DevicePciBridge::new(self.config, idx);
self.devices.push(Box::new(bridge));
}
}
fn add_scsi_controller(&mut self) {
let mut virtio_scsi =
DeviceVirtioScsi::new("scsi0", should_disable_modern(), bus_type(self.config));
if self.config.enable_iothreads {
let iothread_id = "scsi-io-thread";
let iothread = ObjectIoThread::new(iothread_id);
virtio_scsi.set_iothread(iothread_id);
self.devices.push(Box::new(iothread));
}
self.devices.push(Box::new(virtio_scsi));
}
pub fn add_virtiofs_share(
&mut self,
virtiofsd_socket_path: &str,
@@ -1542,9 +1730,11 @@ impl<'a> QemuCmdLine<'a> {
self.devices.push(Box::new(virtiofsd_socket_chardev));
let mut virtiofs_device = DeviceVhostUserFs::new(chardev_name, mount_tag, self.bus_type());
let bus_type = bus_type(self.config);
let mut virtiofs_device = DeviceVhostUserFs::new(chardev_name, mount_tag, bus_type);
virtiofs_device.set_queue_size(queue_size);
if self.config.device_info.enable_iommu_platform && self.bus_type() == VirtioBusType::Ccw {
if self.config.device_info.enable_iommu_platform && bus_type == VirtioBusType::Ccw {
virtiofs_device.set_iommu_platform(true);
}
self.devices.push(Box::new(virtiofs_device));
@@ -1558,7 +1748,7 @@ impl<'a> QemuCmdLine<'a> {
//self.devices.push(Box::new(mem_file));
self.memory.set_memory_backend_file(&mem_file);
match self.bus_type() {
match bus_type {
VirtioBusType::Pci => {
self.machine.set_nvdimm(true);
self.devices.push(Box::new(NumaNode::new(&mem_file.id)));
@@ -1572,7 +1762,7 @@ impl<'a> QemuCmdLine<'a> {
pub fn add_vsock(&mut self, vhostfd: tokio::fs::File, guest_cid: u32) -> Result<()> {
clear_cloexec(vhostfd.as_raw_fd()).context("clearing O_CLOEXEC failed on vsock fd")?;
let mut vhost_vsock_pci = VhostVsock::new(vhostfd, guest_cid, self.bus_type());
let mut vhost_vsock_pci = VhostVsock::new(vhostfd, guest_cid, bus_type(self.config));
if !self.config.disable_nesting_checks && should_disable_modern() {
vhost_vsock_pci.set_disable_modern(true);
@@ -1619,8 +1809,10 @@ impl<'a> QemuCmdLine<'a> {
pub fn add_block_device(&mut self, device_id: &str, path: &str) -> Result<()> {
self.devices
.push(Box::new(BlockBackend::new(device_id, path)));
self.devices
.push(Box::new(DeviceVirtioBlk::new(device_id, self.bus_type())));
self.devices.push(Box::new(DeviceVirtioBlk::new(
device_id,
bus_type(self.config),
)));
Ok(())
}
@@ -1634,32 +1826,9 @@ impl<'a> QemuCmdLine<'a> {
));
}
pub fn add_network_device(
&mut self,
dev_index: u64,
host_dev_name: &str,
guest_mac: Address,
) -> Result<()> {
let mut netdev = Netdev::new(
&format!("network-{}", dev_index),
host_dev_name,
self.config.network_info.network_queues,
)?;
if self.config.network_info.disable_vhost_net {
netdev.set_disable_vhost_net(true);
}
let mut virtio_net_device = DeviceVirtioNet::new(&netdev.id, guest_mac);
if should_disable_modern() {
virtio_net_device.set_disable_modern(true);
}
if self.config.device_info.enable_iommu_platform && self.bus_type() == VirtioBusType::Ccw {
virtio_net_device.set_iommu_platform(true);
}
if self.config.network_info.network_queues > 1 {
virtio_net_device.set_num_queues(self.config.network_info.network_queues);
}
pub fn add_network_device(&mut self, host_dev_name: &str, guest_mac: Address) -> Result<()> {
let (netdev, virtio_net_device) =
get_network_device(self.config, host_dev_name, guest_mac)?;
self.devices.push(Box::new(netdev));
self.devices.push(Box::new(virtio_net_device));
@@ -1667,8 +1836,10 @@ impl<'a> QemuCmdLine<'a> {
}
pub fn add_console(&mut self, console_socket_path: &str) {
let mut serial_dev = DeviceVirtioSerial::new("serial0", self.bus_type());
if self.config.device_info.enable_iommu_platform && self.bus_type() == VirtioBusType::Ccw {
let mut serial_dev = DeviceVirtioSerial::new("serial0", bus_type(self.config));
if self.config.device_info.enable_iommu_platform
&& bus_type(self.config) == VirtioBusType::Ccw
{
serial_dev.set_iommu_platform(true);
}
self.devices.push(Box::new(serial_dev));
@@ -1709,3 +1880,32 @@ impl<'a> QemuCmdLine<'a> {
Ok(result)
}
}
pub fn get_network_device(
config: &HypervisorConfig,
host_dev_name: &str,
guest_mac: Address,
) -> Result<(Netdev, DeviceVirtioNet)> {
let mut netdev = Netdev::new(
&format!("network-{}", host_dev_name),
host_dev_name,
config.network_info.network_queues,
)?;
if config.network_info.disable_vhost_net {
netdev.set_disable_vhost_net(true);
}
let mut virtio_net_device = DeviceVirtioNet::new(&netdev.id, guest_mac);
if should_disable_modern() {
virtio_net_device.set_disable_modern(true);
}
if config.device_info.enable_iommu_platform && bus_type(config) == VirtioBusType::Ccw {
virtio_net_device.set_iommu_platform(true);
}
if config.network_info.network_queues > 1 {
virtio_net_device.set_num_queues(config.network_info.network_queues);
}
Ok((netdev, virtio_net_device))
}


@@ -3,7 +3,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use super::cmdline_generator::{QemuCmdLine, QMP_SOCKET_FILE};
use super::cmdline_generator::{get_network_device, QemuCmdLine, QMP_SOCKET_FILE};
use super::qmp::Qmp;
use crate::{
hypervisor_persist::HypervisorState, utils::enter_netns, HypervisorConfig, MemoryConfig,
@@ -43,13 +43,10 @@ pub struct QemuInner {
netns: Option<String>,
exit_notify: Option<mpsc::Sender<()>>,
exit_waiter: Mutex<(mpsc::Receiver<()>, i32)>,
}
impl QemuInner {
pub fn new() -> QemuInner {
let (exit_notify, exit_waiter) = mpsc::channel(1);
pub fn new(exit_notify: mpsc::Sender<()>) -> QemuInner {
QemuInner {
id: "".to_string(),
qemu_process: Mutex::new(None),
@@ -59,7 +56,6 @@ impl QemuInner {
netns: None,
exit_notify: Some(exit_notify),
exit_waiter: Mutex::new((exit_waiter, 0)),
}
}
@@ -124,7 +120,6 @@ impl QemuInner {
let _netns_guard = NetnsGuard::new(&netns).context("new netns guard")?;
cmdline.add_network_device(
network.config.index,
&network.config.host_dev_name,
network.config.guest_mac.clone().unwrap(),
)?;
@@ -202,22 +197,14 @@ impl QemuInner {
}
pub(crate) async fn wait_vm(&self) -> Result<i32> {
info!(sl!(), "Wait QEMU VM");
let mut waiter = self.exit_waiter.lock().await;
// wait until the qemu process has exited.
waiter.0.recv().await;
let mut qemu_process = self.qemu_process.lock().await;
if let Some(mut qemu_process) = qemu_process.take() {
if let Ok(status) = qemu_process.wait().await {
waiter.1 = status.code().unwrap_or(0);
}
let status = qemu_process.wait().await?;
Ok(status.code().unwrap_or(0))
} else {
Err(anyhow!("the process has been reaped"))
}
Ok(waiter.1)
}
pub(crate) fn pause_vm(&self) -> Result<()> {
@@ -552,9 +539,16 @@ use crate::device::DeviceType;
// device manager part of Hypervisor
impl QemuInner {
pub(crate) async fn add_device(&mut self, device: DeviceType) -> Result<DeviceType> {
pub(crate) async fn add_device(&mut self, mut device: DeviceType) -> Result<DeviceType> {
info!(sl!(), "QemuInner::add_device() {}", device);
self.devices.push(device.clone());
let is_qemu_ready_to_hotplug = self.qmp.is_some();
if is_qemu_ready_to_hotplug {
// hypervisor is running already
device = self.hotplug_device(device)?;
} else {
// store the device to coldplug it later, on hypervisor launch
self.devices.push(device.clone());
}
Ok(device)
}
@@ -565,6 +559,26 @@ impl QemuInner {
device
))
}
fn hotplug_device(&mut self, device: DeviceType) -> Result<DeviceType> {
let qmp = match self.qmp {
Some(ref mut qmp) => qmp,
None => return Err(anyhow!("QMP not initialized")),
};
match device {
DeviceType::Network(ref network_device) => {
let (netdev, virtio_net_device) = get_network_device(
&self.config,
&network_device.config.host_dev_name,
network_device.config.guest_mac.clone().unwrap(),
)?;
qmp.hotplug_network_device(&netdev, &virtio_net_device)?
}
_ => info!(sl!(), "hotplugging of {:#?} is unsupported", device),
}
Ok(device)
}
}
// private helpers
@@ -589,7 +603,7 @@ impl QemuInner {
#[async_trait]
impl Persist for QemuInner {
type State = HypervisorState;
type ConstructorArgs = ();
type ConstructorArgs = mpsc::Sender<()>;
/// Save a state of hypervisor
async fn save(&self) -> Result<Self::State> {
@@ -602,12 +616,7 @@ impl Persist for QemuInner {
}
/// Restore hypervisor
async fn restore(
_hypervisor_args: Self::ConstructorArgs,
hypervisor_state: Self::State,
) -> Result<Self> {
let (exit_notify, exit_waiter) = mpsc::channel(1);
async fn restore(exit_notify: mpsc::Sender<()>, hypervisor_state: Self::State) -> Result<Self> {
Ok(QemuInner {
id: hypervisor_state.id,
qemu_process: Mutex::new(None),
@@ -617,7 +626,6 @@ impl Persist for QemuInner {
netns: None,
exit_notify: Some(exit_notify),
exit_waiter: Mutex::new((exit_waiter, 0)),
})
}
}


@@ -20,10 +20,12 @@ use async_trait::async_trait;
use std::sync::Arc;
use tokio::sync::RwLock;
use tokio::sync::{mpsc, Mutex};
#[derive(Debug)]
pub struct Qemu {
inner: Arc<RwLock<QemuInner>>,
exit_waiter: Mutex<(mpsc::Receiver<()>, i32)>,
}
impl Default for Qemu {
@@ -34,8 +36,11 @@ impl Default for Qemu {
impl Qemu {
pub fn new() -> Self {
let (exit_notify, exit_waiter) = mpsc::channel(1);
Self {
inner: Arc::new(RwLock::new(QemuInner::new())),
inner: Arc::new(RwLock::new(QemuInner::new(exit_notify))),
exit_waiter: Mutex::new((exit_waiter, 0)),
}
}
@@ -63,8 +68,19 @@ impl Hypervisor for Qemu {
}
async fn wait_vm(&self) -> Result<i32> {
info!(sl!(), "Wait QEMU VM");
let mut waiter = self.exit_waiter.lock().await;
// wait until the qemu process has exited.
waiter.0.recv().await;
let inner = self.inner.read().await;
inner.wait_vm().await
if let Ok(exit_code) = inner.wait_vm().await {
waiter.1 = exit_code;
}
Ok(waiter.1)
}
async fn pause_vm(&self) -> Result<()> {
@@ -204,12 +220,15 @@ impl Persist for Qemu {
/// Restore a component from a specified state.
async fn restore(
hypervisor_args: Self::ConstructorArgs,
_hypervisor_args: Self::ConstructorArgs,
hypervisor_state: Self::State,
) -> Result<Self> {
let inner = QemuInner::restore(hypervisor_args, hypervisor_state).await?;
let (exit_notify, exit_waiter) = mpsc::channel(1);
let inner = QemuInner::restore(exit_notify, hypervisor_state).await?;
Ok(Self {
inner: Arc::new(RwLock::new(inner)),
exit_waiter: Mutex::new((exit_waiter, 0)),
})
}
}


@@ -3,9 +3,13 @@
// SPDX-License-Identifier: Apache-2.0
//
use crate::qemu::cmdline_generator::{DeviceVirtioNet, Netdev};
use anyhow::{anyhow, Result};
use nix::sys::socket::{sendmsg, ControlMessage, MsgFlags};
use std::fmt::{Debug, Error, Formatter};
use std::io::BufReader;
use std::os::fd::{AsRawFd, RawFd};
use std::os::unix::net::UnixStream;
use std::time::Duration;
@@ -291,6 +295,178 @@ impl Qmp {
}
Ok(())
}
fn find_free_slot(&mut self) -> Result<(String, i64)> {
let pci = self.qmp.execute(&qapi_qmp::query_pci {})?;
for pci_info in &pci {
for pci_dev in &pci_info.devices {
let pci_bridge = match &pci_dev.pci_bridge {
Some(bridge) => bridge,
None => continue,
};
info!(sl!(), "found PCI bridge: {}", pci_dev.qdev_id);
if let Some(bridge_devices) = &pci_bridge.devices {
let occupied_slots = bridge_devices
.iter()
.map(|pci_dev| pci_dev.slot)
.collect::<Vec<_>>();
info!(
sl!(),
"already occupied slots on bridge {}: {:#?}",
pci_dev.qdev_id,
occupied_slots
);
// from virtcontainers' bridges.go
let pci_bridge_max_capacity = 30;
for slot in 0..pci_bridge_max_capacity {
if !occupied_slots.iter().any(|elem| *elem == slot) {
info!(
sl!(),
"found free slot on bridge {}: {}", pci_dev.qdev_id, slot
);
return Ok((pci_dev.qdev_id.clone(), slot));
}
}
}
}
}
Err(anyhow!("no free slots on PCI bridges"))
}
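The slot scan inside `find_free_slot()` can be isolated as a pure function over the occupied-slot list (a sketch; the capacity constant of 30 is the one the diff takes from virtcontainers' bridges.go, the helper name is illustrative):

```rust
// Sketch of the free-slot scan in find_free_slot(): given the slots already
// occupied on a bridge, return the first free slot below the bridge
// capacity (30), or None when the bridge is full.
fn first_free_slot(occupied: &[i64]) -> Option<i64> {
    const PCI_BRIDGE_MAX_CAPACITY: i64 = 30;
    (0..PCI_BRIDGE_MAX_CAPACITY).find(|slot| !occupied.contains(slot))
}

fn main() {
    assert_eq!(first_free_slot(&[0, 1, 2]), Some(3));
    // Holes are filled first.
    assert_eq!(first_free_slot(&[1, 2]), Some(0));
    // A fully populated bridge yields no slot.
    let full: Vec<i64> = (0..30).collect();
    assert_eq!(first_free_slot(&full), None);
}
```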
fn pass_fd(&mut self, fd: RawFd, fdname: &str) -> Result<()> {
info!(sl!(), "passing fd {:?} as {}", fd, fdname);
// Put the QMP 'getfd' command itself into the message payload.
let getfd_cmd = format!(
"{{ \"execute\": \"getfd\", \"arguments\": {{ \"fdname\": \"{}\" }} }}",
fdname
);
let buf = getfd_cmd.as_bytes();
let bufs = &mut [std::io::IoSlice::new(buf)][..];
debug!(sl!(), "bufs: {:?}", bufs);
let fds = [fd];
let cmsg = [ControlMessage::ScmRights(&fds)];
let result = sendmsg::<()>(
self.qmp.inner_mut().get_mut_write().as_raw_fd(),
bufs,
&cmsg,
MsgFlags::empty(),
None,
);
info!(sl!(), "sendmsg() result: {:#?}", result);
let result = self.qmp.read_response::<&qmp::getfd>();
match result {
Ok(_) => {
info!(sl!(), "successfully passed {} ({})", fdname, fd);
Ok(())
}
Err(err) => Err(anyhow!("failed to pass {} ({}): {}", fdname, fd, err)),
}
}
pub fn hotplug_network_device(
&mut self,
netdev: &Netdev,
virtio_net_device: &DeviceVirtioNet,
) -> Result<()> {
debug!(
sl!(),
"hotplug_network_device(): PCI before {}: {:#?}",
virtio_net_device.get_netdev_id(),
self.qmp.execute(&qapi_qmp::query_pci {})?
);
let (bus, slot) = self.find_free_slot()?;
let mut fd_names = vec![];
for (idx, fd) in netdev.get_fds().iter().enumerate() {
let fdname = format!("fd{}", idx);
self.pass_fd(fd.as_raw_fd(), fdname.as_ref())?;
fd_names.push(fdname);
}
let mut vhostfd_names = vec![];
for (idx, fd) in netdev.get_vhostfds().iter().enumerate() {
let vhostfdname = format!("vhostfd{}", idx);
self.pass_fd(fd.as_raw_fd(), vhostfdname.as_ref())?;
vhostfd_names.push(vhostfdname);
}
self.qmp
.execute(&qapi_qmp::netdev_add(qapi_qmp::Netdev::tap {
id: netdev.get_id().clone(),
tap: qapi_qmp::NetdevTapOptions {
br: None,
downscript: None,
fd: None,
// Logic in cmdline_generator::Netdev::new() seems to
// guarantee that there will always be at least one fd.
fds: Some(fd_names.join(",")),
helper: None,
ifname: None,
poll_us: None,
queues: None,
script: None,
sndbuf: None,
vhost: if vhostfd_names.is_empty() {
None
} else {
Some(true)
},
vhostfd: None,
vhostfds: if vhostfd_names.is_empty() {
None
} else {
Some(vhostfd_names.join(","))
},
vhostforce: None,
vnet_hdr: None,
},
}))?;
let mut netdev_frontend_args = Dictionary::new();
netdev_frontend_args.insert(
"netdev".to_owned(),
virtio_net_device.get_netdev_id().clone().into(),
);
netdev_frontend_args.insert("addr".to_owned(), format!("{:02}", slot).into());
netdev_frontend_args.insert("mac".to_owned(), virtio_net_device.get_mac_addr().into());
netdev_frontend_args.insert("mq".to_owned(), "on".into());
// As documented in the golang runtime, the vectors computation is 2N+2:
// N for tx queues, N for rx queues, 1 for config, and 1 for a possible control vq.
netdev_frontend_args.insert(
"vectors".to_owned(),
(2 * virtio_net_device.get_num_queues() + 2).into(),
);
if virtio_net_device.get_disable_modern() {
netdev_frontend_args.insert("disable-modern".to_owned(), true.into());
}
self.qmp.execute(&qmp::device_add {
bus: Some(bus),
id: Some(format!("frontend-{}", virtio_net_device.get_netdev_id())),
driver: virtio_net_device.get_device_driver().clone(),
arguments: netdev_frontend_args,
})?;
debug!(
sl!(),
"hotplug_network_device(): PCI after {}: {:#?}",
virtio_net_device.get_netdev_id(),
self.qmp.execute(&qapi_qmp::query_pci {})?
);
Ok(())
}
}
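The `vectors` value passed to `device_add` above follows the 2N+2 rule from the golang runtime's comment; spelled out as a helper (illustrative, not in the diff):

```rust
// 2N+2 MSI-X vectors for a virtio-net device with N queue pairs:
// N for tx, N for rx, 1 for config, and 1 for a possible control vq.
fn virtio_net_vectors(num_queues: u32) -> u32 {
    2 * num_queues + 2
}

fn main() {
    assert_eq!(virtio_net_vectors(1), 4);
    assert_eq!(virtio_net_vectors(8), 18);
}
```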
fn vcpu_id_from_core_id(core_id: i64) -> String {


@@ -11,13 +11,14 @@ use std::{
};
use anyhow::{anyhow, Context, Result};
use dbs_utils::net::Tap;
use kata_types::config::KATA_PATH;
use nix::{
fcntl,
sched::{setns, CloneFlags},
};
use crate::device::Tap;
use crate::{DEFAULT_HYBRID_VSOCK_NAME, JAILER_ROOT};
pub fn get_child_threads(pid: u32) -> HashSet<u32> {


@@ -0,0 +1,475 @@
//
// Copyright (c) 2024 Ant Group
//
// SPDX-License-Identifier: Apache-2.0
//
use std::collections::HashMap;
use std::path::Path;
use anyhow::Result;
use oci_spec::runtime::Spec;
use super::{resolve_cdi_device_kind, ContainerDevice};
use agent::types::Device;
const CDI_PREFIX: &str = "cdi.k8s.io";
// Sort the devices by the guest PCI path taken from the first element of each device's options.
fn sort_devices_by_guest_pcipath(devices: &mut [ContainerDevice]) {
// Extract first guest_pcipath from device_options
let extract_first_guest_pcipath = |options: &[String]| -> Option<String> {
options
.first()
.and_then(|option| option.split('=').nth(1))
.map(|path| path.to_string())
};
devices.sort_by(|a, b| {
let guest_path_a = extract_first_guest_pcipath(&a.device.options);
let guest_path_b = extract_first_guest_pcipath(&b.device.options);
guest_path_a.cmp(&guest_path_b)
});
}
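The ordering logic above compares the value part of each device's first `key=value` option; since `Option` ordering puts `None` first, devices without options sort to the front. A reduced sketch using plain option lists in place of `ContainerDevice` (an assumption for brevity):

```rust
// Sketch of sort_devices_by_guest_pcipath(): each device carries options of
// the form "pci_host_path=<guest_pci_path>", and devices are ordered by the
// value of the first such option. Devices with no options sort first.
fn sort_by_first_option_value(devices: &mut [Vec<String>]) {
    let first_value = |options: &[String]| -> Option<String> {
        options
            .first()
            .and_then(|option| option.split('=').nth(1))
            .map(|path| path.to_string())
    };
    devices.sort_by(|a, b| first_value(a).cmp(&first_value(b)));
}

fn main() {
    let mut devices = vec![
        vec!["pci_host_path03=BB:DD03.F03".to_string()],
        vec!["pci_host_path01=BB:DD01.F01".to_string()],
        vec!["pci_host_path02=BB:DD02.F02".to_string()],
    ];
    sort_by_first_option_value(&mut devices);
    assert_eq!(devices[0][0], "pci_host_path01=BB:DD01.F01");
    assert_eq!(devices[1][0], "pci_host_path02=BB:DD02.F02");
    assert_eq!(devices[2][0], "pci_host_path03=BB:DD03.F03");
}
```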
// Annotate container devices with CDI annotations in OCI Spec
pub fn annotate_container_devices(
spec: &mut Spec,
container_devices: Vec<ContainerDevice>,
) -> Result<Vec<Device>> {
let mut devices_agent: Vec<Device> = Vec::new();
// Make sure that annotations is Some().
if spec.annotations().is_none() {
spec.set_annotations(Some(HashMap::new()));
}
// Step 1: Extract all devices and filter out devices without device_info for vfio_devices
let vfio_devices: Vec<ContainerDevice> = container_devices
.into_iter()
.map(|device| {
// push every device's Device to agent_devices
devices_agent.push(device.device.clone());
device
})
.filter(|device| device.device_info.is_some())
.collect();
// Step 2: Group devices by vendor_id-class_id
let mut grouped_devices: HashMap<String, Vec<ContainerDevice>> = HashMap::new();
for device in vfio_devices {
// Extract the vendor/class key and insert into the map if both are present
if let Some(key) = device
.device_info
.as_ref()
.and_then(|info| resolve_cdi_device_kind(&info.vendor_id, &info.class_id))
{
grouped_devices
.entry(key.to_owned())
.or_default()
.push(device);
}
}
// Step 3: Sort devices within each group by guest_pcipath
grouped_devices
.iter_mut()
.for_each(|(vendor_class, container_devices)| {
// The *offset* is a monotonically increasing counter tracking how many devices have been
// seen in previous IOMMU groups. It is advanced by total_of - 1 after each group is processed.
let offset: &mut usize = &mut 0;
sort_devices_by_guest_pcipath(container_devices);
container_devices
.iter()
.enumerate()
.for_each(|(base, container_device)| {
let total_of = container_device.device.options.len();
// annotate device with cdi information in OCI Spec.
for index in 0..total_of {
if let Some(iommu_grpid) =
Path::new(&container_device.device.container_path)
.file_name()
.and_then(|name| name.to_str())
{
spec.annotations_mut().as_mut().unwrap().insert(
format!("{}/vfio{}.{}", CDI_PREFIX, iommu_grpid, index), // cdi.k8s.io/vfioX.y
format!("{}={}", vendor_class, base + *offset), // vendor/class=name
);
}
}
// update the offset with *total_of*.
*offset += total_of - 1;
});
});
Ok(devices_agent)
}
#[cfg(test)]
mod tests {
use std::path::PathBuf;
use crate::cdi_devices::DeviceInfo;
use agent::types::Device;
use oci_spec::runtime::SpecBuilder;
use super::*;
#[test]
fn test_sort_devices_by_guest_pcipath() {
let mut devices = vec![
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0xffff".to_string(),
class_id: "0x030x".to_string(),
host_path: PathBuf::from("/dev/device3"),
}),
device: Device {
options: vec!["pci_host_path03=BB:DD03.F03".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0xffff".to_string(),
class_id: "0x030x".to_string(),
host_path: PathBuf::from("/dev/device1"),
}),
device: Device {
options: vec!["pci_host_path01=BB:DD01.F01".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0xffff".to_string(),
class_id: "0x030x".to_string(),
host_path: PathBuf::from("/dev/device2"),
}),
device: Device {
options: vec!["pci_host_path02=BB:DD02.F02".to_string()],
..Default::default()
},
},
];
sort_devices_by_guest_pcipath(&mut devices);
let expected_devices_order = vec![
"/dev/device1".to_string(),
"/dev/device2".to_string(),
"/dev/device3".to_string(),
];
let actual_devices_order: Vec<String> = devices
.iter()
.map(|cd| {
cd.device_info
.as_ref()
.unwrap()
.host_path
.display()
.to_string()
})
.collect();
assert_eq!(actual_devices_order, expected_devices_order);
}
#[test]
fn test_sort_devices_with_empty_options() {
let mut devices = vec![
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0xffff".to_string(),
class_id: "0x030x".to_string(),
host_path: PathBuf::from("/dev/device1"),
}),
device: Device {
options: vec![], // empty
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0xffff".to_string(),
class_id: "0x030x".to_string(),
host_path: PathBuf::from("/dev/device2"),
}),
device: Device {
options: vec!["pci_host_path02=BB:DD02.F02".to_string()],
..Default::default()
},
},
];
sort_devices_by_guest_pcipath(&mut devices);
// As the first device has no options, ignore it.
let expected_devices_order = vec!["BB:DD02.F02".to_string()];
let actual_devices_order: Vec<String> = devices
.iter()
.filter_map(|d| d.device.options.first())
.map(|option| option.split('=').nth(1).unwrap_or("").to_string())
.collect();
assert_eq!(actual_devices_order, expected_devices_order);
}
#[test]
fn test_annotate_container_devices() {
let devices = vec![
ContainerDevice {
device_info: None,
device: Device {
id: "test0000x".to_string(),
container_path: "/dev/xvdx".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/vdx".to_string(),
options: vec![],
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x1002".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device2"),
}),
device: Device {
container_path: "/dev/device2".to_string(),
options: vec!["pci_host_path02=BB:DD02.F02".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x1002".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device3"),
}),
device: Device {
container_path: "/dev/device3".to_string(),
options: vec!["pci_host_path03=BB:DD03.F03".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x1002".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device1"),
}),
device: Device {
container_path: "/dev/device1".to_string(),
options: vec!["pci_host_path01=BB:DD01.F01".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: None,
device: Device {
id: "test0000yx".to_string(),
container_path: "/dev/xvdyx".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/vdyx".to_string(),
options: vec![],
},
},
];
let annotations = HashMap::new();
let mut spec = SpecBuilder::default()
.annotations(annotations)
.build()
.unwrap();
// do annotate container devices
let _devices = annotate_container_devices(&mut spec, devices);
let expected_annotations: HashMap<String, String> = vec![
(
"cdi.k8s.io/vfiodevice3.0".to_owned(),
"amd.com/gpu=2".to_owned(),
),
(
"cdi.k8s.io/vfiodevice1.0".to_owned(),
"amd.com/gpu=0".to_owned(),
),
(
"cdi.k8s.io/vfiodevice2.0".to_owned(),
"amd.com/gpu=1".to_owned(),
),
]
.into_iter()
.collect();
assert_eq!(Some(expected_annotations), spec.annotations().clone());
}
#[test]
fn test_annotate_container_multi_vendor_devices() {
let devices = vec![
ContainerDevice {
device_info: None,
device: Device {
id: "test0000x".to_string(),
container_path: "/dev/xvdx".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/vdx".to_string(),
options: vec![],
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x10de".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device2"),
}),
device: Device {
container_path: "/dev/device2".to_string(),
options: vec!["pci_host_path02=BB:DD02.F02".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x10de".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device3"),
}),
device: Device {
container_path: "/dev/device3".to_string(),
options: vec!["pci_host_path03=BB:DD03.F03".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x8086".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device1"),
}),
device: Device {
container_path: "/dev/device1".to_string(),
options: vec!["pci_host_path01=BB:DD01.F01".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: Some(DeviceInfo {
vendor_id: "0x8086".to_string(),
class_id: "0x0302".to_string(),
host_path: PathBuf::from("/dev/device4"),
}),
device: Device {
container_path: "/dev/device4".to_string(),
options: vec!["pci_host_path04=BB:DD01.F04".to_string()],
..Default::default()
},
},
ContainerDevice {
device_info: None,
device: Device {
id: "test0000yx".to_string(),
container_path: "/dev/xvdyx".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/vdyx".to_string(),
options: vec![],
},
},
];
let annotations = HashMap::new();
let mut spec = SpecBuilder::default()
.annotations(annotations)
.build()
.unwrap();
let _devices = annotate_container_devices(&mut spec, devices);
let expected_annotations: HashMap<String, String> = vec![
(
"cdi.k8s.io/vfiodevice1.0".to_owned(),
"intel.com/gpu=0".to_owned(),
),
(
"cdi.k8s.io/vfiodevice2.0".to_owned(),
"nvidia.com/gpu=0".to_owned(),
),
(
"cdi.k8s.io/vfiodevice3.0".to_owned(),
"nvidia.com/gpu=1".to_owned(),
),
(
"cdi.k8s.io/vfiodevice4.0".to_owned(),
"intel.com/gpu=1".to_owned(),
),
]
.into_iter()
.collect();
assert_eq!(Some(expected_annotations), spec.annotations().clone());
}
#[test]
fn test_annotate_container_without_vfio_devices() {
let devices = vec![
ContainerDevice {
device_info: None,
device: Device {
id: "test0000x".to_string(),
container_path: "/dev/xvdx".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/vdx".to_string(),
options: vec![],
},
},
ContainerDevice {
device_info: None,
device: Device {
id: "test0000y".to_string(),
container_path: "/dev/yvdy".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/vdy".to_string(),
options: vec![],
},
},
ContainerDevice {
device_info: None,
device: Device {
id: "test0000z".to_string(),
container_path: "/dev/zvdz".to_string(),
field_type: "virtio-blk".to_string(),
vm_path: "/dev/zvdz".to_string(),
options: vec![],
},
},
];
let annotations = HashMap::from([(
"cdi.k8s.io/vfiodeviceX".to_owned(),
"katacontainer.com/device=Y".to_owned(),
)]);
let mut spec = SpecBuilder::default()
.annotations(annotations)
.build()
.unwrap();
// do annotate container devices
let annotated_devices = annotate_container_devices(&mut spec, devices.clone()).unwrap();
let actual_devices = devices
.iter()
.map(|d| d.device.clone())
.collect::<Vec<Device>>();
let expected_annotations: HashMap<String, String> = HashMap::from([(
"cdi.k8s.io/vfiodeviceX".to_owned(),
"katacontainer.com/device=Y".to_owned(),
)]);
assert_eq!(Some(expected_annotations), spec.annotations().clone());
assert_eq!(annotated_devices, actual_devices);
}
}

@@ -0,0 +1,63 @@
//
// Copyright (c) 2024 Ant Group
//
// SPDX-License-Identifier: Apache-2.0
//
pub mod container_device;
use agent::types::Device;
use std::collections::HashMap;
use std::path::PathBuf;
#[derive(Clone, Default)]
pub struct DeviceInfo {
pub class_id: String,
pub vendor_id: String,
pub host_path: PathBuf,
}
#[derive(Clone, Default)]
pub struct ContainerDevice {
pub device_info: Option<DeviceInfo>,
pub device: Device,
}
lazy_static! {
// *CDI_DEVICE_KIND_TABLE* is a static hash map storing a mapping between device vendor and class
// identifiers and their corresponding CDI kind strings. It is essentially a lookup table that
// lets the system determine the appropriate CDI kind for a given device based on its vendor and
// class information.
// Note: our device mapping is designed to be flexible and responsive to user needs. The current
// list is not exhaustive and will be extended as required.
pub static ref CDI_DEVICE_KIND_TABLE: HashMap<&'static str, &'static str> = {
let mut m = HashMap::new();
m.insert("0x10de-0x030", "nvidia.com/gpu");
m.insert("0x8086-0x030", "intel.com/gpu");
m.insert("0x1002-0x030", "amd.com/gpu");
m.insert("0x15b3-0x020", "nvidia.com/nic");
// TODO: it will be updated as required.
m
};
}
// Sort device options by guest PCI path (the value after '=' in each option)
pub fn sort_options_by_pcipath(mut device_options: Vec<String>) -> Vec<String> {
device_options.sort_by(|a, b| {
let extract_path = |s: &str| s.split('=').nth(1).map(|path| path.to_string());
let guest_path_a = extract_path(a);
let guest_path_b = extract_path(b);
guest_path_a.cmp(&guest_path_b)
});
device_options
}
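The sort key is the substring after `=`, compared lexically, so it relies on zero-padded path components. A self-contained usage sketch of the same logic:

```rust
// Standalone copy of the sort-by-guest-path logic, for illustration only.
fn sort_options_by_pcipath(mut device_options: Vec<String>) -> Vec<String> {
    device_options.sort_by(|a, b| {
        // Extract the guest path (the part after '='); options without '=' sort first.
        let extract_path = |s: &str| s.split('=').nth(1).map(|p| p.to_string());
        extract_path(a).cmp(&extract_path(b))
    });
    device_options
}

fn main() {
    let opts = vec![
        "pci_host_path02=BB:DD02.F02".to_string(),
        "pci_host_path01=BB:DD01.F01".to_string(),
    ];
    let sorted = sort_options_by_pcipath(opts);
    assert_eq!(sorted[0], "pci_host_path01=BB:DD01.F01");
    assert_eq!(sorted[1], "pci_host_path02=BB:DD02.F02");
    println!("ok");
}
```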
// Resolve the CDI vendor ID/device Class by a lookup table based on the provided vendor and class.
pub fn resolve_cdi_device_kind<'a>(vendor_id: &'a str, class_id: &'a str) -> Option<&'a str> {
let vendor_class = format!("{}-{}", vendor_id, class_id);
// The first 12 characters of the string (e.g. "0x10de-0x030") identify both the
// manufacturer and the device category; the lookup returns "nvidia.com/gpu",
// "amd.com/gpu", etc.
CDI_DEVICE_KIND_TABLE.get(&vendor_class[..12]).copied()
}

@@ -4,7 +4,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use std::convert::TryFrom;
use std::{collections::HashMap, convert::TryFrom};
use anyhow::{Context, Result};
use kata_types::{
@@ -22,6 +22,34 @@ struct InitialSize {
orig_toml_default_mem: u32,
}
// generate initial resources (vcpu and memory in MiB) from annotations
impl TryFrom<&HashMap<String, String>> for InitialSize {
type Error = anyhow::Error;
fn try_from(an: &HashMap<String, String>) -> Result<Self> {
let mut vcpu: u32 = 0;
let annotation = Annotation::new(an.clone());
let (period, quota, memory) =
get_sizing_info(annotation).context("failed to get sizing info")?;
let mut cpu = oci::LinuxCpu::default();
cpu.set_period(Some(period));
cpu.set_quota(Some(quota));
// although it may not actually be a Linux container, we only use the calculation inside
// LinuxContainerCpuResources::try_from to generate our vcpu number
if let Ok(cpu_resource) = LinuxContainerCpuResources::try_from(&cpu) {
vcpu = get_nr_vcpu(&cpu_resource);
}
let mem_mb = convert_memory_to_mb(memory);
Ok(Self {
vcpu,
mem_mb,
orig_toml_default_mem: 0,
})
}
}
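The vCPU count derived from CFS period/quota is effectively a ceiling division. A dependency-free sketch of that sizing step — note `nr_vcpu_from_cfs` is a hypothetical helper written for illustration, since `get_nr_vcpu` is not shown in this diff:

```rust
// Hypothetical helper: derive a vCPU count from CFS quota/period,
// rounding up so a fractional CPU request still gets a whole vCPU.
fn nr_vcpu_from_cfs(quota: i64, period: u64) -> u32 {
    if quota <= 0 || period == 0 {
        return 0; // unlimited or unset: leave sizing to defaults
    }
    // ceil(quota / period) without floating point
    ((quota as u64 + period - 1) / period) as u32
}

fn main() {
    assert_eq!(nr_vcpu_from_cfs(200_000, 100_000), 2); // 2 CPUs
    assert_eq!(nr_vcpu_from_cfs(150_000, 100_000), 2); // 1.5 CPUs -> 2 vCPUs
    assert_eq!(nr_vcpu_from_cfs(-1, 100_000), 0);      // unlimited
    println!("ok");
}
```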
// generate initial resources (vcpu and memory in MiB) from the spec's information
impl TryFrom<&oci::Spec> for InitialSize {
type Error = anyhow::Error;
@@ -32,19 +60,7 @@ impl TryFrom<&oci::Spec> for InitialSize {
// podsandbox, from annotation
ContainerType::PodSandbox => {
let spec_annos = spec.annotations().clone().unwrap_or_default();
let annotation = Annotation::new(spec_annos);
let (period, quota, memory) =
get_sizing_info(annotation).context("failed to get sizing info")?;
let mut cpu = oci::LinuxCpu::default();
cpu.set_period(Some(period));
cpu.set_quota(Some(quota));
// although it may not actually be a Linux container, we only use the calculation inside
// LinuxContainerCpuResources::try_from to generate our vcpu number
if let Ok(cpu_resource) = LinuxContainerCpuResources::try_from(&cpu) {
vcpu = get_nr_vcpu(&cpu_resource);
}
mem_mb = convert_memory_to_mb(memory);
return InitialSize::try_from(&spec_annos);
}
// single container, from container spec
_ => {
@@ -107,6 +123,13 @@ impl InitialSizeManager {
})
}
pub fn new_from(annotation: &HashMap<String, String>) -> Result<Self> {
Ok(Self {
resource: InitialSize::try_from(annotation)
.context("failed to construct static resource")?,
})
}
pub fn setup_config(&mut self, config: &mut TomlConfig) -> Result<()> {
// update this data to the hypervisor config for later use by hypervisor
let hypervisor_name = &config.runtime.hypervisor_name;

@@ -23,6 +23,7 @@ pub mod rootfs;
pub mod share_fs;
pub mod volume;
pub use manager::ResourceManager;
pub mod cdi_devices;
pub mod cpu_mem;
use kata_types::config::hypervisor::SharedFsInfo;

@@ -6,7 +6,6 @@
use std::sync::Arc;
use agent::types::Device;
use agent::{Agent, Storage};
use anyhow::Result;
use async_trait::async_trait;
@@ -20,6 +19,7 @@ use persist::sandbox_persist::Persist;
use tokio::sync::RwLock;
use tracing::instrument;
use crate::cdi_devices::ContainerDevice;
use crate::cpu_mem::initial_size::InitialSizeManager;
use crate::network::NetworkConfig;
use crate::resource_persist::ResourceState;
@@ -116,7 +116,7 @@ impl ResourceManager {
inner.handler_volumes(cid, spec).await
}
pub async fn handler_devices(&self, cid: &str, linux: &Linux) -> Result<Vec<Device>> {
pub async fn handler_devices(&self, cid: &str, linux: &Linux) -> Result<Vec<ContainerDevice>> {
let inner = self.inner.read().await;
inner.handler_devices(cid, linux).await
}

@@ -25,6 +25,7 @@ use persist::sandbox_persist::Persist;
use tokio::{runtime, sync::RwLock};
use crate::{
cdi_devices::{sort_options_by_pcipath, ContainerDevice, DeviceInfo},
cgroups::{CgroupArgs, CgroupsResource},
cpu_mem::{cpu::CpuResource, initial_size::InitialSizeManager, mem::MemResource},
manager::ManagerArgs,
@@ -292,7 +293,7 @@ impl ResourceManagerInner {
.await
}
pub async fn handler_devices(&self, _cid: &str, linux: &Linux) -> Result<Vec<Device>> {
pub async fn handler_devices(&self, _cid: &str, linux: &Linux) -> Result<Vec<ContainerDevice>> {
let mut devices = vec![];
let linux_devices = linux.devices().clone().unwrap_or_default();
@@ -329,7 +330,10 @@ impl ResourceManagerInner {
vm_path: device.config.virt_path,
..Default::default()
};
devices.push(agent_device);
devices.push(ContainerDevice {
device_info: None,
device: agent_device,
});
}
}
LinuxDeviceType::C => {
@@ -361,14 +365,33 @@ impl ResourceManagerInner {
// create agent device
if let DeviceType::Vfio(device) = device_info {
let device_options = sort_options_by_pcipath(device.device_options);
let agent_device = Device {
id: device.device_id, // just for kata-agent
container_path: d.path().display().to_string().clone(),
field_type: vfio_mode,
options: device.device_options,
options: device_options,
..Default::default()
};
devices.push(agent_device);
let vendor_class = device
.devices
.first()
.unwrap()
.device_vendor_class
.as_ref()
.unwrap()
.get_vendor_class_id()
.context("get vendor class failed")?;
let device_info = Some(DeviceInfo {
vendor_id: vendor_class.0.to_owned(),
class_id: vendor_class.1.to_owned(),
host_path: d.path().clone(),
});
devices.push(ContainerDevice {
device_info,
device: agent_device,
});
}
}
_ => {

View File

@@ -4,7 +4,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use crate::types::{ContainerProcess, Response};
use crate::types::{ContainerProcess, TaskResponse};
#[derive(thiserror::Error, Debug)]
pub enum Error {
@@ -13,5 +13,5 @@ pub enum Error {
#[error("failed to find process {0}")]
ProcessNotFound(ContainerProcess),
#[error("unexpected response {0} to shim {1}")]
UnexpectedResponse(Response, String),
UnexpectedResponse(TaskResponse, String),
}

View File

@@ -6,7 +6,8 @@
use std::sync::Arc;
use anyhow::{Context, Result};
use containerd_shim_protos::{events::task::TaskOOM, protobuf::Message as ProtobufMessage};
use containerd_shim_protos::events::task::{TaskExit, TaskOOM};
use containerd_shim_protos::protobuf::Message as ProtobufMessage;
use tokio::sync::mpsc::{channel, Receiver, Sender};
/// message receiver buffer size
@@ -47,7 +48,10 @@ impl Message {
}
const TASK_OOM_EVENT_TOPIC: &str = "/tasks/oom";
const TASK_EXIT_EVENT_TOPIC: &str = "/tasks/exit";
const TASK_OOM_EVENT_URL: &str = "containerd.events.TaskOOM";
const TASK_EXIT_EVENT_URL: &str = "containerd.events.TaskExit";
pub trait Event: std::fmt::Debug + Send {
fn r#type(&self) -> String;
@@ -68,3 +72,17 @@ impl Event for TaskOOM {
self.write_to_bytes().context("get oom value")
}
}
impl Event for TaskExit {
fn r#type(&self) -> String {
TASK_EXIT_EVENT_TOPIC.to_string()
}
fn type_url(&self) -> String {
TASK_EXIT_EVENT_URL.to_string()
}
fn value(&self) -> Result<Vec<u8>> {
self.write_to_bytes().context("get exit value")
}
}
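Both `TaskOOM` and `TaskExit` follow the same shape: a constant topic for routing plus a type URL identifying the serialized payload. A minimal sketch of the pattern with an invented stand-in payload (`FakeExit` below is for illustration only; the real code serializes protobuf messages):

```rust
// Sketch of the Event pattern: topic for routing, type URL for decoding,
// value() for the serialized body.
trait Event {
    fn r#type(&self) -> String;   // containerd topic, e.g. "/tasks/exit"
    fn type_url(&self) -> String; // payload type, e.g. "containerd.events.TaskExit"
    fn value(&self) -> Vec<u8>;   // serialized body (protobuf in the real code)
}

// Invented stand-in for TaskExit, serialized here as plain little-endian bytes.
struct FakeExit {
    pid: u32,
}

impl Event for FakeExit {
    fn r#type(&self) -> String {
        "/tasks/exit".to_string()
    }
    fn type_url(&self) -> String {
        "containerd.events.TaskExit".to_string()
    }
    fn value(&self) -> Vec<u8> {
        self.pid.to_le_bytes().to_vec()
    }
}

fn main() {
    let e = FakeExit { pid: 42 };
    assert_eq!(e.r#type(), "/tasks/exit");
    assert_eq!(e.value(), vec![42, 0, 0, 0]);
    println!("ok");
}
```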

@@ -6,7 +6,7 @@
use std::sync::Arc;
use crate::{message::Message, ContainerManager, Sandbox};
use crate::{message::Message, types::SandboxConfig, ContainerManager, Sandbox};
use anyhow::Result;
use async_trait::async_trait;
use kata_types::config::TomlConfig;
@@ -39,6 +39,7 @@ pub trait RuntimeHandler: Send + Sync {
msg_sender: Sender<Message>,
config: Arc<TomlConfig>,
init_size_manager: InitialSizeManager,
sandbox_config: SandboxConfig,
) -> Result<RuntimeInstance>;
fn cleanup(&self, id: &str) -> Result<()>;

@@ -4,10 +4,10 @@
// SPDX-License-Identifier: Apache-2.0
//
use crate::{types::ContainerProcess, ContainerManager};
use anyhow::Result;
use async_trait::async_trait;
use oci_spec::runtime as oci;
use runtime_spec as spec;
use std::sync::Arc;
#[derive(Clone)]
pub struct SandboxNetworkEnv {
@@ -26,13 +26,7 @@ impl std::fmt::Debug for SandboxNetworkEnv {
#[async_trait]
pub trait Sandbox: Send + Sync {
async fn start(
&self,
dns: Vec<String>,
spec: &oci::Spec,
state: &spec::State,
network_env: SandboxNetworkEnv,
) -> Result<()>;
async fn start(&self) -> Result<()>;
async fn stop(&self) -> Result<()>;
async fn cleanup(&self) -> Result<()>;
async fn shutdown(&self) -> Result<()>;
@@ -43,6 +37,12 @@ pub trait Sandbox: Send + Sync {
async fn direct_volume_stats(&self, volume_path: &str) -> Result<String>;
async fn direct_volume_resize(&self, resize_req: agent::ResizeVolumeRequest) -> Result<()>;
async fn agent_sock(&self) -> Result<String>;
async fn wait_process(
&self,
cm: Arc<dyn ContainerManager>,
process_id: ContainerProcess,
shim_pid: u32,
) -> Result<()>;
// metrics function
async fn agent_metrics(&self) -> Result<String>;

@@ -8,18 +8,25 @@ mod trans_from_agent;
mod trans_from_shim;
mod trans_into_agent;
mod trans_into_shim;
pub mod utils;
use std::fmt;
use std::{
collections::{hash_map::RandomState, HashMap},
fmt,
};
use crate::SandboxNetworkEnv;
use anyhow::{Context, Result};
use kata_sys_util::validate;
use kata_types::mount::Mount;
use oci_spec::runtime as oci;
use strum::Display;
/// Request: request from shim
/// Request and Response messages need to be paired
/// TaskRequest: TaskRequest from shim
/// TaskRequest and TaskResponse messages need to be paired
#[derive(Debug, Clone, Display)]
pub enum Request {
pub enum TaskRequest {
CreateContainer(ContainerConfig),
CloseProcessIO(ContainerProcess),
DeleteProcess(ContainerProcess),
@@ -38,10 +45,10 @@ pub enum Request {
ConnectContainer(ContainerID),
}
/// Response: response to shim
/// Request and Response messages need to be paired
/// TaskResponse: TaskResponse to shim
/// TaskRequest and TaskResponse messages need to be paired
#[derive(Debug, Clone, Display)]
pub enum Response {
pub enum TaskResponse {
CreateContainer(PID),
CloseProcessIO,
DeleteProcess(ProcessStateInfo),
@@ -134,6 +141,17 @@ pub struct ContainerConfig {
pub stderr: Option<String>,
}
#[derive(Clone, Debug)]
pub struct SandboxConfig {
pub sandbox_id: String,
pub hostname: String,
pub dns: Vec<String>,
pub network_env: SandboxNetworkEnv,
pub annotations: HashMap<String, String, RandomState>,
pub hooks: Option<oci::Hooks>,
pub state: runtime_spec::State,
}
#[derive(Debug, Clone)]
pub struct PID {
pub pid: u32,

@@ -5,8 +5,8 @@
//
use super::{
ContainerConfig, ContainerID, ContainerProcess, ExecProcessRequest, KillRequest, Request,
ResizePTYRequest, ShutdownRequest, UpdateRequest,
ContainerConfig, ContainerID, ContainerProcess, ExecProcessRequest, KillRequest,
ResizePTYRequest, ShutdownRequest, TaskRequest, UpdateRequest,
};
use anyhow::{Context, Result};
use containerd_shim_protos::api;
@@ -37,7 +37,7 @@ fn trans_from_shim_mount(from: &api::Mount) -> Mount {
}
}
impl TryFrom<api::CreateTaskRequest> for Request {
impl TryFrom<api::CreateTaskRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::CreateTaskRequest) -> Result<Self> {
let options = if from.has_options() {
@@ -45,7 +45,7 @@ impl TryFrom<api::CreateTaskRequest> for Request {
} else {
None
};
Ok(Request::CreateContainer(ContainerConfig {
Ok(TaskRequest::CreateContainer(ContainerConfig {
container_id: from.id.clone(),
bundle: from.bundle.clone(),
rootfs_mounts: from.rootfs.iter().map(trans_from_shim_mount).collect(),
@@ -58,29 +58,29 @@ impl TryFrom<api::CreateTaskRequest> for Request {
}
}
impl TryFrom<api::CloseIORequest> for Request {
impl TryFrom<api::CloseIORequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::CloseIORequest) -> Result<Self> {
Ok(Request::CloseProcessIO(
Ok(TaskRequest::CloseProcessIO(
ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
))
}
}
impl TryFrom<api::DeleteRequest> for Request {
impl TryFrom<api::DeleteRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::DeleteRequest) -> Result<Self> {
Ok(Request::DeleteProcess(
Ok(TaskRequest::DeleteProcess(
ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
))
}
}
impl TryFrom<api::ExecProcessRequest> for Request {
impl TryFrom<api::ExecProcessRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::ExecProcessRequest) -> Result<Self> {
let spec = from.spec();
Ok(Request::ExecProcess(ExecProcessRequest {
Ok(TaskRequest::ExecProcess(ExecProcessRequest {
process: ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
terminal: from.terminal,
stdin: (!from.stdin.is_empty()).then(|| from.stdin.clone()),
@@ -92,10 +92,10 @@ impl TryFrom<api::ExecProcessRequest> for Request {
}
}
impl TryFrom<api::KillRequest> for Request {
impl TryFrom<api::KillRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::KillRequest) -> Result<Self> {
Ok(Request::KillProcess(KillRequest {
Ok(TaskRequest::KillProcess(KillRequest {
process: ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
signal: from.signal,
all: from.all,
@@ -103,47 +103,47 @@ impl TryFrom<api::KillRequest> for Request {
}
}
impl TryFrom<api::WaitRequest> for Request {
impl TryFrom<api::WaitRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::WaitRequest) -> Result<Self> {
Ok(Request::WaitProcess(
Ok(TaskRequest::WaitProcess(
ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
))
}
}
impl TryFrom<api::StartRequest> for Request {
impl TryFrom<api::StartRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::StartRequest) -> Result<Self> {
Ok(Request::StartProcess(
Ok(TaskRequest::StartProcess(
ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
))
}
}
impl TryFrom<api::StateRequest> for Request {
impl TryFrom<api::StateRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::StateRequest) -> Result<Self> {
Ok(Request::StateProcess(
Ok(TaskRequest::StateProcess(
ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
))
}
}
impl TryFrom<api::ShutdownRequest> for Request {
impl TryFrom<api::ShutdownRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::ShutdownRequest) -> Result<Self> {
Ok(Request::ShutdownContainer(ShutdownRequest {
Ok(TaskRequest::ShutdownContainer(ShutdownRequest {
container_id: from.id.to_string(),
is_now: from.now,
}))
}
}
impl TryFrom<api::ResizePtyRequest> for Request {
impl TryFrom<api::ResizePtyRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::ResizePtyRequest) -> Result<Self> {
Ok(Request::ResizeProcessPTY(ResizePTYRequest {
Ok(TaskRequest::ResizeProcessPTY(ResizePTYRequest {
process: ContainerProcess::new(&from.id, &from.exec_id).context("new process id")?,
width: from.width,
height: from.height,
@@ -151,47 +151,47 @@ impl TryFrom<api::ResizePtyRequest> for Request {
}
}
impl TryFrom<api::PauseRequest> for Request {
impl TryFrom<api::PauseRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::PauseRequest) -> Result<Self> {
Ok(Request::PauseContainer(ContainerID::new(&from.id)?))
Ok(TaskRequest::PauseContainer(ContainerID::new(&from.id)?))
}
}
impl TryFrom<api::ResumeRequest> for Request {
impl TryFrom<api::ResumeRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::ResumeRequest) -> Result<Self> {
Ok(Request::ResumeContainer(ContainerID::new(&from.id)?))
Ok(TaskRequest::ResumeContainer(ContainerID::new(&from.id)?))
}
}
impl TryFrom<api::StatsRequest> for Request {
impl TryFrom<api::StatsRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::StatsRequest) -> Result<Self> {
Ok(Request::StatsContainer(ContainerID::new(&from.id)?))
Ok(TaskRequest::StatsContainer(ContainerID::new(&from.id)?))
}
}
impl TryFrom<api::UpdateTaskRequest> for Request {
impl TryFrom<api::UpdateTaskRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::UpdateTaskRequest) -> Result<Self> {
Ok(Request::UpdateContainer(UpdateRequest {
Ok(TaskRequest::UpdateContainer(UpdateRequest {
container_id: from.id.to_string(),
value: from.resources().value.to_vec(),
}))
}
}
impl TryFrom<api::PidsRequest> for Request {
impl TryFrom<api::PidsRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(_from: api::PidsRequest) -> Result<Self> {
Ok(Request::Pid)
Ok(TaskRequest::Pid)
}
}
impl TryFrom<api::ConnectRequest> for Request {
impl TryFrom<api::ConnectRequest> for TaskRequest {
type Error = anyhow::Error;
fn try_from(from: api::ConnectRequest) -> Result<Self> {
Ok(Request::ConnectContainer(ContainerID::new(&from.id)?))
Ok(TaskRequest::ConnectContainer(ContainerID::new(&from.id)?))
}
}

@@ -6,37 +6,16 @@
use std::{
any::type_name,
convert::{Into, TryFrom, TryInto},
time,
convert::{Into, TryFrom},
};
use anyhow::{anyhow, Result};
use containerd_shim_protos::api;
use super::{ProcessExitStatus, ProcessStateInfo, ProcessStatus, Response};
use super::utils::option_system_time_into;
use super::{ProcessExitStatus, ProcessStateInfo, ProcessStatus, TaskResponse};
use crate::error::Error;
fn system_time_into(time: time::SystemTime) -> ::protobuf::well_known_types::timestamp::Timestamp {
let mut proto_time = ::protobuf::well_known_types::timestamp::Timestamp::new();
proto_time.seconds = time
.duration_since(time::UNIX_EPOCH)
.unwrap_or_default()
.as_secs()
.try_into()
.unwrap_or_default();
proto_time
}
fn option_system_time_into(
time: Option<time::SystemTime>,
) -> protobuf::MessageField<protobuf::well_known_types::timestamp::Timestamp> {
match time {
Some(v) => ::protobuf::MessageField::some(system_time_into(v)),
None => ::protobuf::MessageField::none(),
}
}
impl From<ProcessExitStatus> for api::WaitResponse {
fn from(from: ProcessExitStatus) -> Self {
Self {
@@ -89,11 +68,11 @@ impl From<ProcessStateInfo> for api::DeleteResponse {
}
}
impl TryFrom<Response> for api::CreateTaskResponse {
impl TryFrom<TaskResponse> for api::CreateTaskResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::CreateContainer(resp) => Ok(Self {
TaskResponse::CreateContainer(resp) => Ok(Self {
pid: resp.pid,
..Default::default()
}),
@@ -105,11 +84,11 @@ impl TryFrom<Response> for api::CreateTaskResponse {
}
}
impl TryFrom<Response> for api::DeleteResponse {
impl TryFrom<TaskResponse> for api::DeleteResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::DeleteProcess(resp) => Ok(resp.into()),
TaskResponse::DeleteProcess(resp) => Ok(resp.into()),
_ => Err(anyhow!(Error::UnexpectedResponse(
from,
type_name::<Self>().to_string()
@@ -118,11 +97,11 @@ impl TryFrom<Response> for api::DeleteResponse {
}
}
impl TryFrom<Response> for api::WaitResponse {
impl TryFrom<TaskResponse> for api::WaitResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::WaitProcess(resp) => Ok(resp.into()),
TaskResponse::WaitProcess(resp) => Ok(resp.into()),
_ => Err(anyhow!(Error::UnexpectedResponse(
from,
type_name::<Self>().to_string()
@@ -131,11 +110,11 @@ impl TryFrom<Response> for api::WaitResponse {
}
}
impl TryFrom<Response> for api::StartResponse {
impl TryFrom<TaskResponse> for api::StartResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::StartProcess(resp) => Ok(api::StartResponse {
TaskResponse::StartProcess(resp) => Ok(api::StartResponse {
pid: resp.pid,
..Default::default()
}),
@@ -147,11 +126,11 @@ impl TryFrom<Response> for api::StartResponse {
}
}
impl TryFrom<Response> for api::StateResponse {
impl TryFrom<TaskResponse> for api::StateResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::StateProcess(resp) => Ok(resp.into()),
TaskResponse::StateProcess(resp) => Ok(resp.into()),
_ => Err(anyhow!(Error::UnexpectedResponse(
from,
type_name::<Self>().to_string()
@@ -160,13 +139,13 @@ impl TryFrom<Response> for api::StateResponse {
}
}
impl TryFrom<Response> for api::StatsResponse {
impl TryFrom<TaskResponse> for api::StatsResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
let mut any = ::protobuf::well_known_types::any::Any::new();
let mut response = api::StatsResponse::new();
match from {
Response::StatsContainer(resp) => {
TaskResponse::StatsContainer(resp) => {
if let Some(value) = resp.value {
any.type_url = value.type_url;
any.value = value.value;
@@ -182,11 +161,11 @@ impl TryFrom<Response> for api::StatsResponse {
}
}
impl TryFrom<Response> for api::PidsResponse {
impl TryFrom<TaskResponse> for api::PidsResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::Pid(resp) => {
TaskResponse::Pid(resp) => {
let mut processes: Vec<api::ProcessInfo> = vec![];
let mut p_info = api::ProcessInfo::new();
let mut res = api::PidsResponse::new();
@@ -203,11 +182,11 @@ impl TryFrom<Response> for api::PidsResponse {
}
}
impl TryFrom<Response> for api::ConnectResponse {
impl TryFrom<TaskResponse> for api::ConnectResponse {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::ConnectContainer(resp) => {
TaskResponse::ConnectContainer(resp) => {
let mut res = api::ConnectResponse::new();
res.set_shim_pid(resp.pid);
Ok(res)
@@ -220,18 +199,18 @@ impl TryFrom<Response> for api::ConnectResponse {
}
}
impl TryFrom<Response> for api::Empty {
impl TryFrom<TaskResponse> for api::Empty {
type Error = anyhow::Error;
fn try_from(from: Response) -> Result<Self> {
fn try_from(from: TaskResponse) -> Result<Self> {
match from {
Response::CloseProcessIO => Ok(api::Empty::new()),
Response::ExecProcess => Ok(api::Empty::new()),
Response::KillProcess => Ok(api::Empty::new()),
Response::ShutdownContainer => Ok(api::Empty::new()),
Response::PauseContainer => Ok(api::Empty::new()),
Response::ResumeContainer => Ok(api::Empty::new()),
Response::ResizeProcessPTY => Ok(api::Empty::new()),
Response::UpdateContainer => Ok(api::Empty::new()),
TaskResponse::CloseProcessIO => Ok(api::Empty::new()),
TaskResponse::ExecProcess => Ok(api::Empty::new()),
TaskResponse::KillProcess => Ok(api::Empty::new()),
TaskResponse::ShutdownContainer => Ok(api::Empty::new()),
TaskResponse::PauseContainer => Ok(api::Empty::new()),
TaskResponse::ResumeContainer => Ok(api::Empty::new()),
TaskResponse::ResizeProcessPTY => Ok(api::Empty::new()),
TaskResponse::UpdateContainer => Ok(api::Empty::new()),
_ => Err(anyhow!(Error::UnexpectedResponse(
from,
type_name::<Self>().to_string()

@@ -0,0 +1,28 @@
// Copyright 2024 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use std::convert::TryInto;
use std::time;
fn system_time_into(time: time::SystemTime) -> ::protobuf::well_known_types::timestamp::Timestamp {
let mut proto_time = ::protobuf::well_known_types::timestamp::Timestamp::new();
proto_time.seconds = time
.duration_since(time::UNIX_EPOCH)
.unwrap_or_default()
.as_secs()
.try_into()
.unwrap_or_default();
proto_time
}
pub fn option_system_time_into(
time: Option<time::SystemTime>,
) -> protobuf::MessageField<protobuf::well_known_types::timestamp::Timestamp> {
match time {
Some(v) => ::protobuf::MessageField::some(system_time_into(v)),
None => ::protobuf::MessageField::none(),
}
}
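The conversion truncates to whole seconds and collapses every error (pre-epoch times, overflow) to 0 via `unwrap_or_default`. A dependency-free sketch of just that seconds logic, using only `std`:

```rust
use std::convert::TryInto;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Mirror of the seconds computation: any error collapses to 0.
fn epoch_seconds(t: SystemTime) -> i64 {
    t.duration_since(UNIX_EPOCH)
        .unwrap_or_default()
        .as_secs()
        .try_into()
        .unwrap_or_default()
}

fn main() {
    let t = UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    assert_eq!(epoch_seconds(t), 1_700_000_000);
    // Pre-epoch times make duration_since return Err, hence 0.
    let before = UNIX_EPOCH - Duration::from_secs(5);
    assert_eq!(epoch_seconds(before), 0);
    println!("ok");
}
```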

@@ -7,7 +7,7 @@ use std::sync::Arc;
use anyhow::Result;
use async_trait::async_trait;
use common::{message::Message, RuntimeHandler, RuntimeInstance};
use common::{message::Message, types::SandboxConfig, RuntimeHandler, RuntimeInstance};
use kata_types::config::TomlConfig;
use resource::cpu_mem::initial_size::InitialSizeManager;
use tokio::sync::mpsc::Sender;
@@ -34,6 +34,7 @@ impl RuntimeHandler for LinuxContainer {
_msg_sender: Sender<Message>,
_config: Arc<TomlConfig>,
_init_size_manager: InitialSizeManager,
_sandbox_config: SandboxConfig,
) -> Result<RuntimeInstance> {
todo!()
}

@@ -7,7 +7,7 @@
use anyhow::{anyhow, Context, Result};
use common::{
message::Message,
types::{Request, Response},
types::{ContainerProcess, SandboxConfig, TaskRequest, TaskResponse},
RuntimeHandler, RuntimeInstance, Sandbox, SandboxNetworkEnv,
};
use hypervisor::Param;
@@ -92,10 +92,7 @@ impl RuntimeHandlerManagerInner {
#[instrument]
async fn init_runtime_handler(
&mut self,
spec: &oci::Spec,
state: &spec::State,
network_env: SandboxNetworkEnv,
dns: Vec<String>,
sandbox_config: SandboxConfig,
config: Arc<TomlConfig>,
init_size_manager: InitialSizeManager,
) -> Result<()> {
@@ -117,6 +114,7 @@ impl RuntimeHandlerManagerInner {
self.msg_sender.clone(),
config.clone(),
init_size_manager,
sandbox_config,
)
.await
.context("new runtime instance")?;
@@ -137,21 +135,22 @@ impl RuntimeHandlerManagerInner {
let instance = Arc::new(runtime_instance);
self.runtime_instance = Some(instance.clone());
// start sandbox
instance
.sandbox
.start(dns, spec, state, network_env)
.await
.context("start sandbox")?;
Ok(())
}
#[instrument]
async fn start_runtime_handler(&self) -> Result<()> {
if let Some(instance) = self.runtime_instance.as_ref() {
instance.sandbox.start().await.context("start sandbox")?;
}
Ok(())
}
#[instrument]
async fn try_init(
&mut self,
spec: &oci::Spec,
state: &spec::State,
mut sandbox_config: SandboxConfig,
spec: Option<&oci::Spec>,
options: &Option<Vec<u8>>,
) -> Result<()> {
// return if runtime instance has init
@@ -159,8 +158,6 @@ impl RuntimeHandlerManagerInner {
return Ok(());
}
let mut dns: Vec<String> = vec![];
#[cfg(feature = "linux")]
LinuxContainer::init().context("init linux container")?;
#[cfg(feature = "wasm")]
@@ -168,15 +165,8 @@ impl RuntimeHandlerManagerInner {
#[cfg(feature = "virt")]
VirtContainer::init().context("init virt container")?;
let spec_mounts = spec.mounts().clone().unwrap_or_default();
for m in &spec_mounts {
if get_mount_path(&Some(m.destination().clone())) == DEFAULT_GUEST_DNS_FILE {
let contents = fs::read_to_string(&Path::new(&get_mount_path(m.source()))).await?;
dns = contents.split('\n').map(|e| e.to_string()).collect();
}
}
let mut config = load_config(spec, options).context("load config")?;
let mut config =
load_config(&sandbox_config.annotations, options).context("load config")?;
// Sandbox sizing information *may* be provided in two scenarios:
// 1. The upper layer runtime (ie, containerd or crio) provide sandbox sizing information as an annotation
@@ -186,8 +176,14 @@ impl RuntimeHandlerManagerInner {
// 2. If this is not a sandbox infrastructure container, but instead a standalone single container (analogous to "docker run..."),
// then the container spec itself will contain appropriate sizing information for the entire sandbox (since it is
// a single container.
let mut initial_size_manager =
InitialSizeManager::new(spec).context("failed to construct static resource manager")?;
let mut initial_size_manager = if let Some(spec) = spec {
InitialSizeManager::new(spec).context("failed to construct static resource manager")?
} else {
InitialSizeManager::new_from(&sandbox_config.annotations)
.context("failed to construct static resource manager")?
};
initial_size_manager
.setup_config(&mut config)
.context("failed to setup static resource mgmt config")?;
@@ -195,53 +191,18 @@ impl RuntimeHandlerManagerInner {
update_component_log_level(&config);
let dan_path = dan_config_path(&config, &self.id);
let mut network_created = false;
// set netns to None if we want no network for the VM
let netns = if config.runtime.disable_new_netns {
None
} else if dan_path.exists() {
info!(sl!(), "Do not create a netns due to DAN");
None
} else {
let mut netns_path = None;
if let Some(linux) = &spec.linux() {
let linux_namespaces = linux.namespaces().clone().unwrap_or_default();
for ns in &linux_namespaces {
if ns.typ() != oci::LinuxNamespaceType::Network {
continue;
}
// get netns path from oci spec
if ns.path().is_some() {
netns_path = ns.path().clone().map(|p| p.display().to_string());
}
// if we get empty netns from oci spec, we need to create netns for the VM
else {
let ns_name = generate_netns_name();
let netns = NetNs::new(ns_name)?;
let path = Some(PathBuf::from(netns.path()).display().to_string());
netns_path = path;
network_created = true;
}
break;
}
}
netns_path
};
if config.runtime.disable_new_netns || dan_path.exists() {
sandbox_config.network_env.netns = None;
}
let network_env = SandboxNetworkEnv {
netns,
network_created,
};
self.init_runtime_handler(
spec,
state,
network_env,
dns,
Arc::new(config),
initial_size_manager,
)
.await
.context("init runtime handler")?;
self.init_runtime_handler(sandbox_config, Arc::new(config), initial_size_manager)
.await
.context("init runtime handler")?;
self.start_runtime_handler()
.await
.context("start runtime handler")?;
// the sandbox creation can reach here only once and the sandbox is created
// so we can safely create the shim management socket right now
@@ -294,7 +255,8 @@ impl RuntimeHandlerManager {
.context("failed to load the sandbox state")?;
let config = if let Ok(spec) = load_oci_spec() {
load_config(&spec, &None).context("load config")?
let annotations = spec.annotations().clone().unwrap_or_default();
load_config(&annotations, &None).context("load config")?
} else {
TomlConfig::default()
};
@@ -350,19 +312,81 @@ impl RuntimeHandlerManager {
}
#[instrument]
async fn try_init_runtime_instance(
async fn task_init_runtime_instance(
&self,
spec: &oci::Spec,
state: &spec::State,
options: &Option<Vec<u8>>,
) -> Result<()> {
let mut dns: Vec<String> = vec![];
let spec_mounts = spec.mounts().clone().unwrap_or_default();
for m in &spec_mounts {
if get_mount_path(&Some(m.destination().clone())) == DEFAULT_GUEST_DNS_FILE {
let contents = fs::read_to_string(&Path::new(&get_mount_path(m.source()))).await?;
dns = contents.split('\n').map(|e| e.to_string()).collect();
}
}
let mut network_created = false;
let mut netns = None;
if let Some(linux) = &spec.linux() {
let linux_namespaces = linux.namespaces().clone().unwrap_or_default();
for ns in &linux_namespaces {
if ns.typ() != oci::LinuxNamespaceType::Network {
continue;
}
// get netns path from oci spec
if ns.path().is_some() {
netns = ns.path().clone().map(|p| p.display().to_string());
}
// if we get empty netns from oci spec, we need to create netns for the VM
else {
let ns_name = generate_netns_name();
let raw_netns = NetNs::new(ns_name)?;
let path = Some(PathBuf::from(raw_netns.path()).display().to_string());
netns = path;
network_created = true;
}
break;
}
}
let network_env = SandboxNetworkEnv {
netns,
network_created,
};
let mut inner: tokio::sync::RwLockWriteGuard<'_, RuntimeHandlerManagerInner> =
self.inner.write().await;
let sandbox_config = SandboxConfig {
sandbox_id: inner.id.clone(),
dns,
hostname: spec.hostname().clone().unwrap_or_default(),
network_env,
annotations: spec.annotations().clone().unwrap_or_default(),
hooks: spec.hooks().clone(),
state: state.clone(),
};
inner.try_init(sandbox_config, Some(spec), options).await
}
#[instrument]
async fn init_runtime_instance(
&self,
sandbox_config: SandboxConfig,
spec: Option<&oci::Spec>,
options: &Option<Vec<u8>>,
) -> Result<()> {
let mut inner = self.inner.write().await;
inner.try_init(spec, state, options).await
inner.try_init(sandbox_config, spec, options).await
}
#[instrument(parent = &*(ROOTSPAN))]
pub async fn handler_message(&self, req: Request) -> Result<Response> {
if let Request::CreateContainer(container_config) = req {
pub async fn handler_message(&self, req: TaskRequest) -> Result<TaskResponse> {
if let TaskRequest::CreateContainer(container_config) = req {
// get oci spec
let bundler_path = format!(
"{}/{}",
@@ -379,7 +403,7 @@ impl RuntimeHandlerManager {
annotations: spec.annotations().clone().unwrap_or_default(),
};
self.try_init_runtime_instance(&spec, &state, &container_config.options)
self.task_init_runtime_instance(&spec, &state, &container_config.options)
.await
.context("try init runtime instance")?;
let instance = self
@@ -387,20 +411,37 @@ impl RuntimeHandlerManager {
.await
.context("get runtime instance")?;
let container_id = container_config.container_id.clone();
let shim_pid = instance
.container_manager
.create_container(container_config, spec)
.await
.context("create container")?;
Ok(Response::CreateContainer(shim_pid))
let container_manager = instance.container_manager.clone();
let process_id =
ContainerProcess::new(&container_id, "").context("create container process")?;
let pid = shim_pid.pid;
tokio::spawn(async move {
let result = instance
.sandbox
.wait_process(container_manager, process_id, pid)
.await;
if let Err(e) = result {
error!(sl!(), "sandbox wait process error: {:?}", e);
}
});
Ok(TaskResponse::CreateContainer(shim_pid))
} else {
self.handler_request(req).await.context("handler request")
self.handler_request(req)
.await
.context("handler TaskRequest")
}
}
#[instrument(parent = &(*ROOTSPAN))]
pub async fn handler_request(&self, req: Request) -> Result<Response> {
pub async fn handler_request(&self, req: TaskRequest) -> Result<TaskResponse> {
let instance = self
.get_runtime_instance()
.await
@@ -409,24 +450,24 @@ impl RuntimeHandlerManager {
let cm = instance.container_manager.clone();
match req {
Request::CreateContainer(req) => Err(anyhow!("Unreachable request {:?}", req)),
Request::CloseProcessIO(process_id) => {
TaskRequest::CreateContainer(req) => Err(anyhow!("Unreachable TaskRequest {:?}", req)),
TaskRequest::CloseProcessIO(process_id) => {
cm.close_process_io(&process_id).await.context("close io")?;
Ok(Response::CloseProcessIO)
Ok(TaskResponse::CloseProcessIO)
}
Request::DeleteProcess(process_id) => {
TaskRequest::DeleteProcess(process_id) => {
let resp = cm.delete_process(&process_id).await.context("do delete")?;
Ok(Response::DeleteProcess(resp))
Ok(TaskResponse::DeleteProcess(resp))
}
Request::ExecProcess(req) => {
TaskRequest::ExecProcess(req) => {
cm.exec_process(req).await.context("exec")?;
Ok(Response::ExecProcess)
Ok(TaskResponse::ExecProcess)
}
Request::KillProcess(req) => {
TaskRequest::KillProcess(req) => {
cm.kill_process(&req).await.context("kill process")?;
Ok(Response::KillProcess)
Ok(TaskResponse::KillProcess)
}
Request::ShutdownContainer(req) => {
TaskRequest::ShutdownContainer(req) => {
if cm.need_shutdown_sandbox(&req).await {
sandbox.shutdown().await.context("do shutdown")?;
@@ -435,59 +476,67 @@ impl RuntimeHandlerManager {
let tracer = kata_tracer.lock().await;
tracer.trace_end();
}
Ok(Response::ShutdownContainer)
Ok(TaskResponse::ShutdownContainer)
}
Request::WaitProcess(process_id) => {
TaskRequest::WaitProcess(process_id) => {
let exit_status = cm.wait_process(&process_id).await.context("wait process")?;
if cm.is_sandbox_container(&process_id).await {
sandbox.stop().await.context("stop sandbox")?;
}
Ok(Response::WaitProcess(exit_status))
Ok(TaskResponse::WaitProcess(exit_status))
}
Request::StartProcess(process_id) => {
TaskRequest::StartProcess(process_id) => {
let shim_pid = cm
.start_process(&process_id)
.await
.context("start process")?;
Ok(Response::StartProcess(shim_pid))
let pid = shim_pid.pid;
tokio::spawn(async move {
let result = sandbox.wait_process(cm, process_id, pid).await;
if let Err(e) = result {
error!(sl!(), "sandbox wait process error: {:?}", e);
}
});
Ok(TaskResponse::StartProcess(shim_pid))
}
Request::StateProcess(process_id) => {
TaskRequest::StateProcess(process_id) => {
let state = cm
.state_process(&process_id)
.await
.context("state process")?;
Ok(Response::StateProcess(state))
Ok(TaskResponse::StateProcess(state))
}
Request::PauseContainer(container_id) => {
TaskRequest::PauseContainer(container_id) => {
cm.pause_container(&container_id)
.await
.context("pause container")?;
Ok(Response::PauseContainer)
Ok(TaskResponse::PauseContainer)
}
Request::ResumeContainer(container_id) => {
TaskRequest::ResumeContainer(container_id) => {
cm.resume_container(&container_id)
.await
.context("resume container")?;
Ok(Response::ResumeContainer)
Ok(TaskResponse::ResumeContainer)
}
Request::ResizeProcessPTY(req) => {
TaskRequest::ResizeProcessPTY(req) => {
cm.resize_process_pty(&req).await.context("resize pty")?;
Ok(Response::ResizeProcessPTY)
Ok(TaskResponse::ResizeProcessPTY)
}
Request::StatsContainer(container_id) => {
TaskRequest::StatsContainer(container_id) => {
let stats = cm
.stats_container(&container_id)
.await
.context("stats container")?;
Ok(Response::StatsContainer(stats))
Ok(TaskResponse::StatsContainer(stats))
}
Request::UpdateContainer(req) => {
TaskRequest::UpdateContainer(req) => {
cm.update_container(req).await.context("update container")?;
Ok(Response::UpdateContainer)
Ok(TaskResponse::UpdateContainer)
}
Request::Pid => Ok(Response::Pid(cm.pid().await.context("pid")?)),
Request::ConnectContainer(container_id) => Ok(Response::ConnectContainer(
TaskRequest::Pid => Ok(TaskResponse::Pid(cm.pid().await.context("pid")?)),
TaskRequest::ConnectContainer(container_id) => Ok(TaskResponse::ConnectContainer(
cm.connect_container(&container_id)
.await
.context("connect")?,
@@ -503,9 +552,10 @@ impl RuntimeHandlerManager {
/// 4. If above three are not set, then get default path from DEFAULT_RUNTIME_CONFIGURATIONS
/// in kata-containers/src/libs/kata-types/src/config/default.rs, in array order.
#[instrument]
fn load_config(spec: &oci::Spec, option: &Option<Vec<u8>>) -> Result<TomlConfig> {
fn load_config(an: &HashMap<String, String>, option: &Option<Vec<u8>>) -> Result<TomlConfig> {
const KATA_CONF_FILE: &str = "KATA_CONF_FILE";
let annotation = Annotation::new(spec.annotations().clone().unwrap_or_default());
let annotation = Annotation::new(an.clone());
let config_path = if let Some(path) = annotation.get_sandbox_config_path() {
path
} else if let Ok(path) = std::env::var(KATA_CONF_FILE) {
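The `load_config` hunk above swaps the full `oci::Spec` parameter for a plain annotations map, keeping the lookup order its doc comment describes: an explicit sandbox-config-path annotation wins, then the `KATA_CONF_FILE` environment variable, then the defaults. A minimal self-contained sketch of that precedence, using only the standard library (the annotation key and function name here are illustrative stand-ins, not the actual kata-containers API):

```rust
use std::collections::HashMap;
use std::env;

const KATA_CONF_FILE: &str = "KATA_CONF_FILE";

// Hypothetical stand-in mirroring the lookup order in the hunk above:
// 1. config-path annotation, 2. KATA_CONF_FILE env var, 3. default path.
fn resolve_config_path(annotations: &HashMap<String, String>, default: &str) -> String {
    if let Some(path) = annotations.get("io.katacontainers.config_path") {
        path.clone()
    } else if let Ok(path) = env::var(KATA_CONF_FILE) {
        path
    } else {
        default.to_string()
    }
}

fn main() {
    let mut ann = HashMap::new();
    ann.insert(
        "io.katacontainers.config_path".to_string(),
        "/etc/kata/custom.toml".to_string(),
    );
    // An annotation-provided path takes precedence over env var and default.
    assert_eq!(resolve_config_path(&ann, "/usr/share/defaults/kata/configuration.toml"),
               "/etc/kata/custom.toml");
    println!("ok");
}
```

Passing only the annotations map (rather than the whole spec) also lets the restore path at `load_config(&annotations, &None)` above reuse the same function when no full OCI spec is available.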


@@ -43,3 +43,6 @@ default = ["cloud-hypervisor"]
# Enable the Cloud Hypervisor driver
cloud-hypervisor = []
# Enable the built-in VMM Dragonball
dragonball = []


@@ -21,7 +21,9 @@ use kata_types::k8s;
use oci_spec::runtime as oci;
use oci::{LinuxResources, Process as OCIProcess};
use resource::{ResourceManager, ResourceUpdateOp};
use resource::{
cdi_devices::container_device::annotate_container_devices, ResourceManager, ResourceUpdateOp,
};
use tokio::sync::RwLock;
use super::{
@@ -174,10 +176,12 @@ impl Container {
.as_ref()
.context("OCI spec missing linux field")?;
let devices_agent = self
let container_devices = self
.resource_manager
.handler_devices(&config.container_id, linux)
.await?;
let devices_agent = annotate_container_devices(&mut spec, container_devices)
.context("annotate container devices failed")?;
// update vcpus, mems and host cgroups
let resources = self


@@ -19,14 +19,14 @@ use std::sync::Arc;
use agent::{kata::KataAgent, AGENT_KATA};
use anyhow::{anyhow, Context, Result};
use async_trait::async_trait;
use common::{message::Message, RuntimeHandler, RuntimeInstance};
use common::{message::Message, types::SandboxConfig, RuntimeHandler, RuntimeInstance};
use hypervisor::Hypervisor;
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
use hypervisor::{dragonball::Dragonball, HYPERVISOR_DRAGONBALL};
#[cfg(not(target_arch = "s390x"))]
use hypervisor::{firecracker::Firecracker, HYPERVISOR_FIRECRACKER};
use hypervisor::{qemu::Qemu, HYPERVISOR_QEMU};
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
use kata_types::config::DragonballConfig;
#[cfg(not(target_arch = "s390x"))]
use kata_types::config::FirecrackerConfig;
@@ -57,7 +57,9 @@ impl RuntimeHandler for VirtContainer {
// register
#[cfg(not(target_arch = "s390x"))]
{
#[cfg(feature = "dragonball")]
let dragonball_config = Arc::new(DragonballConfig::new());
#[cfg(feature = "dragonball")]
register_hypervisor_plugin("dragonball", dragonball_config);
let firecracker_config = Arc::new(FirecrackerConfig::new());
@@ -91,6 +93,7 @@ impl RuntimeHandler for VirtContainer {
msg_sender: Sender<Message>,
config: Arc<TomlConfig>,
init_size_manager: InitialSizeManager,
sandbox_config: SandboxConfig,
) -> Result<RuntimeInstance> {
let hypervisor = new_hypervisor(&config).await.context("new hypervisor")?;
@@ -114,6 +117,7 @@ impl RuntimeHandler for VirtContainer {
agent.clone(),
hypervisor.clone(),
resource_manager.clone(),
sandbox_config,
)
.await
.context("new virt sandbox")?;
@@ -147,7 +151,7 @@ async fn new_hypervisor(toml_config: &TomlConfig) -> Result<Arc<dyn Hypervisor>>
// TODO: support other hypervisor
// issue: https://github.com/kata-containers/kata-containers/issues/4634
match hypervisor_name.as_str() {
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
HYPERVISOR_DRAGONBALL => {
let mut hypervisor = Dragonball::new();
hypervisor


@@ -12,11 +12,15 @@ use agent::{
use anyhow::{anyhow, Context, Result};
use async_trait::async_trait;
use common::message::{Action, Message};
use common::{Sandbox, SandboxNetworkEnv};
use containerd_shim_protos::events::task::TaskOOM;
use common::types::utils::option_system_time_into;
use common::types::ContainerProcess;
use common::{types::SandboxConfig, ContainerManager, Sandbox, SandboxNetworkEnv};
use containerd_shim_protos::events::task::{TaskExit, TaskOOM};
use hypervisor::VsockConfig;
#[cfg(not(target_arch = "s390x"))]
use hypervisor::{dragonball::Dragonball, HYPERVISOR_DRAGONBALL, HYPERVISOR_FIRECRACKER};
use hypervisor::HYPERVISOR_FIRECRACKER;
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
use hypervisor::{dragonball::Dragonball, HYPERVISOR_DRAGONBALL};
use hypervisor::{qemu::Qemu, HYPERVISOR_QEMU};
use hypervisor::{utils::get_hvsock_path, HybridVsockConfig, DEFAULT_GUEST_VSOCK_CID};
use hypervisor::{BlockConfig, Hypervisor};
@@ -27,6 +31,7 @@ use kata_types::config::hypervisor::HYPERVISOR_NAME_CH;
use kata_types::config::TomlConfig;
use oci_spec::runtime as oci;
use persist::{self, sandbox_persist::Persist};
use protobuf::SpecialFields;
use resource::manager::ManagerArgs;
use resource::network::{dan_config_path, DanNetworkConfig, NetworkConfig, NetworkWithNetNsConfig};
use resource::{ResourceConfig, ResourceManager};
@@ -73,6 +78,7 @@ pub struct VirtSandbox {
agent: Arc<dyn Agent>,
hypervisor: Arc<dyn Hypervisor>,
monitor: Arc<HealthCheck>,
sandbox_config: Option<SandboxConfig>,
}
impl std::fmt::Debug for VirtSandbox {
@@ -91,6 +97,7 @@ impl VirtSandbox {
agent: Arc<dyn Agent>,
hypervisor: Arc<dyn Hypervisor>,
resource_manager: Arc<ResourceManager>,
sandbox_config: SandboxConfig,
) -> Result<Self> {
let config = resource_manager.config().await;
let keep_abnormal = config.runtime.keep_abnormal;
@@ -102,6 +109,7 @@ impl VirtSandbox {
hypervisor,
resource_manager,
monitor: Arc::new(HealthCheck::new(true, keep_abnormal)),
sandbox_config: Some(sandbox_config),
})
}
@@ -290,8 +298,8 @@ impl VirtSandbox {
fn has_prestart_hooks(
&self,
prestart_hooks: Vec<oci::Hook>,
create_runtime_hooks: Vec<oci::Hook>,
prestart_hooks: &[oci::Hook],
create_runtime_hooks: &[oci::Hook],
) -> bool {
!prestart_hooks.is_empty() || !create_runtime_hooks.is_empty()
}
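The `has_prestart_hooks` change above replaces owned `Vec<oci::Hook>` parameters with borrowed `&[oci::Hook]` slices, so the caller keeps ownership of the hook vectors and can still pass them to `execute_oci_hook_functions` afterwards. A tiny self-contained sketch of the same signature pattern (using `String` in place of `oci::Hook`):

```rust
// Borrowed slices: the function only inspects the hooks, so it should
// not take ownership of (or clone) the caller's Vecs.
fn has_prestart_hooks(prestart: &[String], create_runtime: &[String]) -> bool {
    !prestart.is_empty() || !create_runtime.is_empty()
}

fn main() {
    let prestart = vec!["prestart-hook".to_string()];
    let create_runtime: Vec<String> = Vec::new();
    assert!(has_prestart_hooks(&prestart, &create_runtime));
    // The caller still owns `prestart` and can use it again here.
    assert_eq!(prestart.len(), 1);
    assert!(!has_prestart_hooks(&[], &create_runtime));
    println!("ok");
}
```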
@@ -300,32 +308,32 @@ impl VirtSandbox {
#[async_trait]
impl Sandbox for VirtSandbox {
#[instrument(name = "sb: start")]
async fn start(
&self,
dns: Vec<String>,
spec: &oci::Spec,
state: &spec::State,
network_env: SandboxNetworkEnv,
) -> Result<()> {
async fn start(&self) -> Result<()> {
let id = &self.sid;
// if sandbox running, return
// if sandbox not running try to start sandbox
if self.sandbox_config.is_none() {
return Err(anyhow!("sandbox config is missing"));
}
let sandbox_config = self.sandbox_config.as_ref().unwrap();
// if sandbox is not in SandboxState::Init then return,
// otherwise try to create sandbox
let mut inner = self.inner.write().await;
if inner.state == SandboxState::Running {
warn!(sl!(), "sandbox is running, no need to start");
if inner.state != SandboxState::Init {
warn!(sl!(), "sandbox is started");
return Ok(());
}
self.hypervisor
.prepare_vm(id, network_env.netns.clone())
.prepare_vm(id, sandbox_config.network_env.netns.clone())
.await
.context("prepare vm")?;
// generate device and setup before start vm
// should after hypervisor.prepare_vm
let resources = self
.prepare_for_start_sandbox(id, network_env.clone())
.prepare_for_start_sandbox(id, sandbox_config.network_env.clone())
.await?;
self.resource_manager
@@ -338,28 +346,33 @@ impl Sandbox for VirtSandbox {
info!(sl!(), "start vm");
// execute pre-start hook functions, including Prestart Hooks and CreateRuntime Hooks
let (prestart_hooks, create_runtime_hooks) = if let Some(hooks) = spec.hooks().as_ref() {
(
hooks.prestart().clone().unwrap_or_default(),
hooks.create_runtime().clone().unwrap_or_default(),
)
} else {
(Vec::new(), Vec::new())
};
let (prestart_hooks, create_runtime_hooks) =
if let Some(hooks) = sandbox_config.hooks.as_ref() {
(
hooks.prestart().clone().unwrap_or_default(),
hooks.create_runtime().clone().unwrap_or_default(),
)
} else {
(Vec::new(), Vec::new())
};
self.execute_oci_hook_functions(&prestart_hooks, &create_runtime_hooks, state)
.await?;
self.execute_oci_hook_functions(
&prestart_hooks,
&create_runtime_hooks,
&sandbox_config.state,
)
.await?;
// 1. if there are pre-start hook functions, network config might have been changed.
// We need to rescan the netns to handle the change.
// 2. Do not scan the netns if we want no network for the VM.
// TODO In case of vm factory, scan the netns to hotplug interfaces after the VM is started.
let config = self.resource_manager.config().await;
if self.has_prestart_hooks(prestart_hooks, create_runtime_hooks)
if self.has_prestart_hooks(&prestart_hooks, &create_runtime_hooks)
&& !config.runtime.disable_new_netns
&& !dan_config_path(&config, &self.sid).exists()
{
if let Some(netns_path) = network_env.netns {
if let Some(netns_path) = &sandbox_config.network_env.netns {
let network_resource = NetworkConfig::NetNs(NetworkWithNetNsConfig {
network_model: config.runtime.internetworking_model.clone(),
netns_path: netns_path.to_owned(),
@@ -369,7 +382,7 @@ impl Sandbox for VirtSandbox {
.await
.network_info
.network_queues as usize,
network_created: network_env.network_created,
network_created: sandbox_config.network_env.network_created,
});
self.resource_manager
.handle_network(network_resource)
@@ -399,8 +412,8 @@ impl Sandbox for VirtSandbox {
let agent_config = self.agent.agent_config().await;
let kernel_modules = KernelModule::set_kernel_modules(agent_config.kernel_modules)?;
let req = agent::CreateSandboxRequest {
hostname: spec.hostname().clone().unwrap_or_default(),
dns,
hostname: sandbox_config.hostname.clone(),
dns: sandbox_config.dns.clone(),
storages: self
.resource_manager
.get_storage_for_sandbox()
@@ -519,6 +532,44 @@ impl Sandbox for VirtSandbox {
Ok(())
}
async fn wait_process(
&self,
cm: Arc<dyn ContainerManager>,
process_id: ContainerProcess,
shim_pid: u32,
) -> Result<()> {
let exit_status = cm.wait_process(&process_id).await?;
info!(sl!(), "container process exited with {:?}", exit_status);
if cm.is_sandbox_container(&process_id).await {
self.stop().await.context("stop sandbox")?;
}
let cid = process_id.container_id();
if cid.is_empty() {
return Err(anyhow!("container id is empty"));
}
let eid = process_id.exec_id();
let id = if eid.is_empty() {
cid.to_string()
} else {
eid.to_string()
};
let event = TaskExit {
container_id: cid.to_string(),
id,
pid: shim_pid,
exit_status: exit_status.exit_code as u32,
exited_at: option_system_time_into(exit_status.exit_time),
special_fields: SpecialFields::new(),
};
let msg = Message::new(Action::Event(Arc::new(event)));
let lock_sender = self.msg_sender.lock().await;
lock_sender.send(msg).await.context("send exit event")?;
Ok(())
}
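In the new `wait_process` above, the `TaskExit` event's `id` field is the exec id when one is present and falls back to the container id for the init process. That selection can be sketched in isolation:

```rust
// Mirrors the id selection in wait_process above: an exec'd process is
// identified by its exec id; the container's init process by its
// container id (empty exec id).
fn event_id(cid: &str, eid: &str) -> String {
    if eid.is_empty() {
        cid.to_string()
    } else {
        eid.to_string()
    }
}

fn main() {
    assert_eq!(event_id("container-1", ""), "container-1");
    assert_eq!(event_id("container-1", "exec-1"), "exec-1");
    println!("ok");
}
```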
async fn agent_sock(&self) -> Result<String> {
self.agent.agent_sock().await
}
@@ -591,7 +642,7 @@ impl Persist for VirtSandbox {
resource: Some(self.resource_manager.save().await?),
hypervisor: match hypervisor_state.hypervisor_type.as_str() {
// TODO support other hypervisors
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
HYPERVISOR_DRAGONBALL => Ok(Some(hypervisor_state)),
#[cfg(not(target_arch = "s390x"))]
HYPERVISOR_NAME_CH => Ok(Some(hypervisor_state)),
@@ -630,7 +681,7 @@ impl Persist for VirtSandbox {
let h = sandbox_state.hypervisor.unwrap_or_default();
let hypervisor = match h.hypervisor_type.as_str() {
// TODO support other hypervisors
#[cfg(not(target_arch = "s390x"))]
#[cfg(all(feature = "dragonball", not(target_arch = "s390x")))]
HYPERVISOR_DRAGONBALL => {
let hypervisor = Arc::new(Dragonball::restore((), h).await?) as Arc<dyn Hypervisor>;
Ok(hypervisor)
@@ -659,6 +710,7 @@ impl Persist for VirtSandbox {
hypervisor,
resource_manager,
monitor: Arc::new(HealthCheck::new(true, keep_abnormal)),
sandbox_config: None,
})
}
}


@@ -7,7 +7,7 @@ use std::sync::Arc;
use anyhow::Result;
use async_trait::async_trait;
use common::{message::Message, RuntimeHandler, RuntimeInstance};
use common::{message::Message, types::SandboxConfig, RuntimeHandler, RuntimeInstance};
use kata_types::config::TomlConfig;
use resource::cpu_mem::initial_size::InitialSizeManager;
use tokio::sync::mpsc::Sender;
@@ -33,6 +33,7 @@ impl RuntimeHandler for WasmContainer {
_msg_sender: Sender<Message>,
_config: Arc<TomlConfig>,
_init_size_manager: InitialSizeManager,
_sandbox_config: SandboxConfig,
) -> Result<RuntimeInstance> {
todo!()
}


@@ -25,7 +25,7 @@ const MESSAGE_BUFFER_SIZE: usize = 8;
pub struct ServiceManager {
receiver: Option<Receiver<Message>>,
handler: Arc<RuntimeHandlerManager>,
task_server: Option<Server>,
server: Option<Server>,
binary: String,
address: String,
namespace: String,
@@ -37,7 +37,7 @@ impl std::fmt::Debug for ServiceManager {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("ServiceManager")
.field("receiver", &self.receiver)
.field("task_server.is_some()", &self.task_server.is_some())
.field("server.is_some()", &self.server.is_some())
.field("binary", &self.binary)
.field("address", &self.address)
.field("namespace", &self.namespace)
@@ -60,8 +60,8 @@ impl ServiceManager {
let (sender, receiver) = channel::<Message>(MESSAGE_BUFFER_SIZE);
let rt_mgr = RuntimeHandlerManager::new(id, sender).context("new runtime handler")?;
let handler = Arc::new(rt_mgr);
let mut task_server = unsafe { Server::from_raw_fd(task_server_fd) };
task_server = task_server.set_domain_unix();
let mut server = unsafe { Server::from_raw_fd(task_server_fd) };
server = server.set_domain_unix();
let event_publisher = new_event_publisher(namespace)
.await
.context("new event publisher")?;
@@ -69,7 +69,7 @@ impl ServiceManager {
Ok(Self {
receiver: Some(receiver),
handler,
task_server: Some(task_server),
server: Some(server),
binary: containerd_binary.to_string(),
address: address.to_string(),
namespace: namespace.to_string(),
@@ -136,24 +136,24 @@ impl ServiceManager {
}
fn registry_service(&mut self) -> Result<()> {
if let Some(t) = self.task_server.take() {
if let Some(t) = self.server.take() {
let task_service = Arc::new(Box::new(TaskService::new(self.handler.clone()))
as Box<dyn shim_async::Task + Send + Sync>);
let t = t.register_service(shim_async::create_task(task_service));
self.task_server = Some(t);
self.server = Some(t);
}
Ok(())
}
async fn start_service(&mut self) -> Result<()> {
if let Some(t) = self.task_server.as_mut() {
if let Some(t) = self.server.as_mut() {
t.start().await.context("task server start")?;
}
Ok(())
}
async fn stop_service(&mut self) -> Result<()> {
if let Some(t) = self.task_server.as_mut() {
if let Some(t) = self.server.as_mut() {
t.stop_listen().await;
}
Ok(())
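`registry_service` above uses `Option::take()` because ttrpc's builder-style `register_service` consumes the server by value (`self -> Self`); the server is moved out of the `Option`, rebuilt, and stored back. A self-contained sketch of that take-modify-put-back pattern with illustrative stand-in types:

```rust
// Stand-in for a builder-style server whose register call consumes self.
struct Server {
    services: Vec<&'static str>,
}

impl Server {
    fn register_service(mut self, name: &'static str) -> Self {
        self.services.push(name);
        self
    }
}

struct Manager {
    server: Option<Server>,
}

impl Manager {
    fn registry_service(&mut self) {
        // take() moves the Server out of the Option so the consuming
        // register_service(self) -> Self call is possible...
        if let Some(s) = self.server.take() {
            let s = s.register_service("task");
            // ...then the updated Server is stored back.
            self.server = Some(s);
        }
    }
}

fn main() {
    let mut m = Manager { server: Some(Server { services: Vec::new() }) };
    m.registry_service();
    assert_eq!(m.server.as_ref().unwrap().services, vec!["task"]);
    println!("ok");
}
```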


@@ -10,7 +10,7 @@ use std::{
};
use async_trait::async_trait;
use common::types::{Request, Response};
use common::types::{TaskRequest, TaskResponse};
use containerd_shim_protos::{api, shim_async};
use ttrpc::{self, r#async::TtrpcContext};
@@ -31,10 +31,10 @@ impl TaskService {
req: TtrpcReq,
) -> ttrpc::Result<TtrpcResp>
where
Request: TryFrom<TtrpcReq>,
<Request as TryFrom<TtrpcReq>>::Error: std::fmt::Debug,
TtrpcResp: TryFrom<Response>,
<TtrpcResp as TryFrom<Response>>::Error: std::fmt::Debug,
TaskRequest: TryFrom<TtrpcReq>,
<TaskRequest as TryFrom<TtrpcReq>>::Error: std::fmt::Debug,
TtrpcResp: TryFrom<TaskResponse>,
<TtrpcResp as TryFrom<TaskResponse>>::Error: std::fmt::Debug,
{
let r = req.try_into().map_err(|err| {
ttrpc::Error::Others(format!("failed to translate from shim {:?}", err))


@@ -6,7 +6,7 @@
use anyhow::{Context, Result};
use common::{
message::Message,
types::{ContainerConfig, Request},
types::{ContainerConfig, TaskRequest},
};
use runtimes::RuntimeHandlerManager;
use tokio::sync::mpsc::channel;
@@ -18,7 +18,7 @@ async fn real_main() {
let (sender, _receiver) = channel::<Message>(MESSAGE_BUFFER_SIZE);
let manager = RuntimeHandlerManager::new("xxx", sender).unwrap();
let req = Request::CreateContainer(ContainerConfig {
let req = TaskRequest::CreateContainer(ContainerConfig {
container_id: "xxx".to_owned(),
bundle: ".".to_owned(),
rootfs_mounts: Vec::new(),


@@ -82,7 +82,6 @@ BINDIR := $(EXEC_PREFIX)/bin
QEMUBINDIR := $(PREFIXDEPS)/bin
CLHBINDIR := $(PREFIXDEPS)/bin
FCBINDIR := $(PREFIXDEPS)/bin
ACRNBINDIR := $(PREFIXDEPS)/bin
STRATOVIRTBINDIR := $(PREFIXDEPS)/bin
SYSCONFDIR := /etc
LOCALSTATEDIR := /var
@@ -99,7 +98,6 @@ RUNTIME_NAME = $(TARGET)
GENERATED_FILES += $(COLLECT_SCRIPT)
GENERATED_VARS = \
VERSION \
CONFIG_ACRN_IN \
CONFIG_QEMU_IN \
CONFIG_QEMU_COCO_DEV_IN \
CONFIG_QEMU_NVIDIA_GPU_IN \
@@ -159,7 +157,6 @@ KERNELTDXPARAMS += $(ROOTMEASURECONFIG)
# Name of default configuration file the runtime will use.
CONFIG_FILE = configuration.toml
HYPERVISOR_ACRN = acrn
HYPERVISOR_FC = firecracker
HYPERVISOR_QEMU = qemu
HYPERVISOR_CLH = cloud-hypervisor
@@ -170,7 +167,7 @@ HYPERVISOR_REMOTE = remote
DEFAULT_HYPERVISOR ?= $(HYPERVISOR_QEMU)
# List of hypervisors this build system can generate configuration for.
HYPERVISORS := $(HYPERVISOR_ACRN) $(HYPERVISOR_FC) $(HYPERVISOR_QEMU) $(HYPERVISOR_CLH) $(HYPERVISOR_STRATOVIRT) $(HYPERVISOR_REMOTE)
HYPERVISORS := $(HYPERVISOR_FC) $(HYPERVISOR_QEMU) $(HYPERVISOR_CLH) $(HYPERVISOR_STRATOVIRT) $(HYPERVISOR_REMOTE)
QEMUPATH := $(QEMUBINDIR)/$(QEMUCMD)
QEMUVALIDHYPERVISORPATHS := [\"$(QEMUPATH)\"]
@@ -193,11 +190,6 @@ FCVALIDHYPERVISORPATHS := [\"$(FCPATH)\"]
FCJAILERPATH = $(FCBINDIR)/$(FCJAILERCMD)
FCVALIDJAILERPATHS = [\"$(FCJAILERPATH)\"]
ACRNPATH := $(ACRNBINDIR)/$(ACRNCMD)
ACRNVALIDHYPERVISORPATHS := [\"$(ACRNPATH)\"]
ACRNCTLPATH := $(ACRNBINDIR)/$(ACRNCTLCMD)
ACRNVALIDCTLPATHS := [\"$(ACRNCTLPATH)\"]
STRATOVIRTPATH = $(STRATOVIRTBINDIR)/$(STRATOVIRTCMD)
STRATOVIRTVALIDHYPERVISORPATHS := [\"$(STRATOVIRTPATH)\"]
@@ -538,30 +530,6 @@ ifneq (,$(FCCMD))
KERNELPATH_FC = $(KERNELDIR)/$(KERNEL_NAME_FC)
endif
ifneq (,$(ACRNCMD))
KNOWN_HYPERVISORS += $(HYPERVISOR_ACRN)
CONFIG_FILE_ACRN = configuration-acrn.toml
CONFIG_ACRN = config/$(CONFIG_FILE_ACRN)
CONFIG_ACRN_IN = $(CONFIG_ACRN).in
CONFIG_PATH_ACRN = $(abspath $(CONFDIR)/$(CONFIG_FILE_ACRN))
CONFIG_PATHS += $(CONFIG_PATH_ACRN)
SYSCONFIG_ACRN = $(abspath $(SYSCONFDIR)/$(CONFIG_FILE_ACRN))
SYSCONFIG_PATHS += $(SYSCONFIG_ACRN)
CONFIGS += $(CONFIG_ACRN)
# acrn-specific options (all should be suffixed by "_ACRN")
DEFMAXVCPUS_ACRN := 1
DEFBLOCKSTORAGEDRIVER_ACRN := virtio-blk
DEFNETWORKMODEL_ACRN := macvtap
KERNELTYPE_ACRN = compressed
KERNEL_NAME_ACRN = $(call MAKE_KERNEL_NAME,$(KERNELTYPE_ACRN))
KERNELPATH_ACRN = $(KERNELDIR)/$(KERNEL_NAME_ACRN)
endif
ifeq (,$(KNOWN_HYPERVISORS))
$(error "ERROR: No hypervisors known for architecture $(ARCH) (looked for: $(HYPERVISORS))")
endif
@@ -586,10 +554,6 @@ ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_FC))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_FC)
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_ACRN))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_ACRN)
endif
ifeq ($(DEFAULT_HYPERVISOR),$(HYPERVISOR_CLH))
DEFAULT_HYPERVISOR_CONFIG = $(CONFIG_FILE_CLH)
endif
@@ -609,7 +573,6 @@ SHAREDIR := $(SHAREDIR)
# list of variables the user may wish to override
USER_VARS += ARCH
USER_VARS += BINDIR
USER_VARS += CONFIG_ACRN_IN
USER_VARS += CONFIG_CLH_IN
USER_VARS += CONFIG_FC_IN
USER_VARS += CONFIG_STRATOVIRT_IN
@@ -618,12 +581,6 @@ USER_VARS += CONFIG_QEMU_IN
USER_VARS += CONFIG_REMOTE_IN
USER_VARS += DESTDIR
USER_VARS += DEFAULT_HYPERVISOR
USER_VARS += ACRNCMD
USER_VARS += ACRNCTLCMD
USER_VARS += ACRNPATH
USER_VARS += ACRNVALIDHYPERVISORPATHS
USER_VARS += ACRNCTLPATH
USER_VARS += ACRNVALIDCTLPATHS
USER_VARS += CLHPATH
USER_VARS += CLHVALIDHYPERVISORPATHS
USER_VARS += FIRMWAREPATH_CLH
@@ -667,9 +624,7 @@ USER_VARS += MACHINETYPE
USER_VARS += KERNELDIR
USER_VARS += KERNELTYPE
USER_VARS += KERNELTYPE_FC
USER_VARS += KERNELTYPE_ACRN
USER_VARS += KERNELTYPE_CLH
USER_VARS += KERNELPATH_ACRN
USER_VARS += KERNELPATH
USER_VARS += KERNELCONFIDENTIALPATH
USER_VARS += KERNELSEPATH
@@ -722,12 +677,10 @@ USER_VARS += SHAREDIR
USER_VARS += SYSCONFDIR
USER_VARS += DEFVCPUS
USER_VARS += DEFMAXVCPUS
USER_VARS += DEFMAXVCPUS_ACRN
USER_VARS += DEFMEMSZ
USER_VARS += DEFMEMSLOTS
USER_VARS += DEFMAXMEMSZ
USER_VARS += DEFBRIDGES
USER_VARS += DEFNETWORKMODEL_ACRN
USER_VARS += DEFNETWORKMODEL_CLH
USER_VARS += DEFNETWORKMODEL_FC
USER_VARS += DEFNETWORKMODEL_QEMU
@@ -739,7 +692,6 @@ USER_VARS += DEFDISABLEGUESTSELINUX
USER_VARS += DEFGUESTSELINUXLABEL
USER_VARS += DEFAULTEXPFEATURES
USER_VARS += DEFDISABLEBLOCK
USER_VARS += DEFBLOCKSTORAGEDRIVER_ACRN
USER_VARS += DEFBLOCKSTORAGEDRIVER_FC
USER_VARS += DEFBLOCKSTORAGEDRIVER_QEMU
USER_VARS += DEFBLOCKSTORAGEDRIVER_STRATOVIRT
@@ -1106,9 +1058,6 @@ endif
ifneq (,$(findstring $(HYPERVISOR_FC),$(KNOWN_HYPERVISORS)))
@printf "\t$(HYPERVISOR_FC) hypervisor path (FCPATH) : %s\n" $(abspath $(FCPATH))
endif
ifneq (,$(findstring $(HYPERVISOR_ACRN),$(KNOWN_HYPERVISORS)))
@printf "\t$(HYPERVISOR_ACRN) hypervisor path (ACRNPATH) : %s\n" $(abspath $(ACRNPATH))
endif
ifneq (,$(findstring $(HYPERVISOR_STRATOVIRT),$(KNOWN_HYPERVISORS)))
@printf "\t$(HYPERVISOR_STRATOVIRT) hypervisor path (STRATOVIRTPATH) : %s\n" $(abspath $(STRATOVIRTPATH))
endif


@@ -20,10 +20,6 @@ FCCMD := firecracker
# Firecracker's jailer binary name
FCJAILERCMD := jailer
# ACRN binary name
ACRNCMD := acrn-dm
ACRNCTLCMD := acrnctl
# cloud-hypervisor binary name
CLHCMD := cloud-hypervisor


@@ -9,8 +9,6 @@ import (
"fmt"
"os"
"strings"
"syscall"
"unsafe"
vc "github.com/kata-containers/kata-containers/src/runtime/virtcontainers"
"github.com/sirupsen/logrus"
@@ -45,41 +43,6 @@ const (
cpuTypeUnknown = -1
)
const acrnDevice = "/dev/acrn_hsm"
// ioctl_ACRN_CREATE_VM is the IOCTL to create VM in ACRN.
// Current Linux mainstream kernel doesn't have support for ACRN.
// Due to this several macros are not defined in Linux headers.
// Until the support is available, directly use the value instead
// of macros.
// https://github.com/kata-containers/runtime/issues/1784
const ioctl_ACRN_CREATE_VM = 0xC030A210 //nolint
const ioctl_ACRN_PAUSE_VM = 0xA213 //nolint
const ioctl_ACRN_DESTROY_VM = 0xA211 //nolint
type acrn_vm_creation struct { //nolint
vmid uint16 //nolint
reserved0 uint16 //nolint
vcpu_num uint16 //nolint
reserved1 uint16 //nolint
name [16]uint8
vm_flag uint64 //nolint
ioreq_buf uint64 //nolint
cpu_affinity uint64 //nolint
}
var io_request_page [4096]byte
type acrn_io_request struct { // nolint
io_type uint32 // nolint
completion_polling uint32 // nolint
reserved0 [14]uint32 // nolint
data [8]uint64 // nolint
reserved1 uint32 // nolint
kernel_handled uint32 // nolint
processed uint32 // nolint
}
// cpuType save the CPU type
var cpuType int
@@ -155,28 +118,6 @@ func setCPUtype(hypervisorType vc.HypervisorType) error {
required: false,
},
}
case vc.AcrnHypervisor:
archRequiredCPUFlags = map[string]string{
cpuFlagLM: "64Bit CPU",
cpuFlagSSE4_1: "SSE4.1",
}
archRequiredCPUAttribs = map[string]string{
archGenuineIntel: "Intel Architecture CPU",
}
archRequiredKernelModules = map[string]kernelModule{
kernelModvhost: {
desc: msgKernelVirtio,
required: false,
},
kernelModvhostnet: {
desc: msgKernelVirtioNet,
required: false,
},
kernelModvhostvsock: {
desc: msgKernelVirtioVhostVsock,
required: false,
},
}
case vc.MockHypervisor:
archRequiredCPUFlags = map[string]string{
cpuFlagVMX: "Virtualization support",
@@ -249,68 +190,6 @@ func kvmIsUsable() error {
return genericKvmIsUsable()
}
// acrnIsUsable determines if it will be possible to create a full virtual machine
// by creating a minimal VM and then deleting it.
func acrnIsUsable() error {
flags := syscall.O_RDWR | syscall.O_CLOEXEC
f, err := syscall.Open(acrnDevice, flags, 0)
if err != nil {
return err
}
defer syscall.Close(f)
kataLog.WithField("device", acrnDevice).Info("device available")
var createVM acrn_vm_creation
copy(createVM.name[:], "KataACRNVM")
ioreq_buf := (*acrn_io_request)(unsafe.Pointer(&io_request_page))
createVM.ioreq_buf = uint64(uintptr(unsafe.Pointer(ioreq_buf)))
ret, _, errno := syscall.Syscall(syscall.SYS_IOCTL,
uintptr(f),
uintptr(ioctl_ACRN_CREATE_VM),
uintptr(unsafe.Pointer(&createVM)))
if ret != 0 || errno != 0 {
if errno == syscall.EBUSY {
kataLog.WithField("reason", "another hypervisor running").Error("cannot create VM")
}
kataLog.WithFields(logrus.Fields{
"ret": ret,
"errno": errno,
"VM_name": createVM.name,
}).Info("Create VM Error")
return errno
}
ret, _, errno = syscall.Syscall(syscall.SYS_IOCTL,
uintptr(f),
uintptr(ioctl_ACRN_PAUSE_VM),
0)
if ret != 0 || errno != 0 {
kataLog.WithFields(logrus.Fields{
"ret": ret,
"errno": errno,
}).Info("PAUSE VM Error")
return errno
}
ret, _, errno = syscall.Syscall(syscall.SYS_IOCTL,
uintptr(f),
uintptr(ioctl_ACRN_DESTROY_VM),
0)
if ret != 0 || errno != 0 {
kataLog.WithFields(logrus.Fields{
"ret": ret,
"errno": errno,
}).Info("Destroy VM Error")
return errno
}
kataLog.WithField("feature", "create-vm").Info("feature available")
return nil
}
func archHostCanCreateVMContainer(hypervisorType vc.HypervisorType) error {
switch hypervisorType {
case vc.QemuHypervisor:
@@ -321,8 +200,6 @@ func archHostCanCreateVMContainer(hypervisorType vc.HypervisorType) error {
fallthrough
case vc.FirecrackerHypervisor:
return kvmIsUsable()
case vc.AcrnHypervisor:
return acrnIsUsable()
case vc.RemoteHypervisor:
return nil
case vc.MockHypervisor:

View File

@@ -916,7 +916,6 @@ func TestGetHypervisorInfoSocket(t *testing.T) {
}
hypervisors := []TestHypervisorDetails{
{vc.AcrnHypervisor, false},
{vc.ClhHypervisor, true},
{vc.FirecrackerHypervisor, true},
{vc.MockHypervisor, false},

View File

@@ -1,263 +0,0 @@
# Copyright (c) 2017-2019 Intel Corporation
# Copyright (c) 2021 Adobe Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "@CONFIG_ACRN_IN@"
# XXX: Project:
# XXX: Name: @PROJECT_NAME@
# XXX: Type: @PROJECT_TYPE@
[hypervisor.acrn]
path = "@ACRNPATH@"
ctlpath = "@ACRNCTLPATH@"
kernel = "@KERNELPATH_ACRN@"
image = "@IMAGEPATH@"
# rootfs filesystem type:
# - ext4 (default)
# - xfs
# - erofs
rootfs_type=@DEFROOTFSTYPE@
# List of valid annotation names for the hypervisor
# Each member of the list is a regular expression, which is the base name
# of the annotation, e.g. "path" for "io.katacontainers.config.hypervisor.path"
enable_annotations = @DEFENABLEANNOTATIONS@
# List of valid annotations values for the hypervisor
# Each member of the list is a path pattern as described by glob(3).
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @ACRNVALIDHYPERVISORPATHS@
valid_hypervisor_paths = @ACRNVALIDHYPERVISORPATHS@
# List of valid annotations values for ctlpath
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @ACRNVALIDCTLPATHS@
valid_ctlpaths = @ACRNVALIDCTLPATHS@
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = "@KERNELPARAMS@"
# Path to the firmware.
# If you want acrn to use the default firmware, leave this option empty
firmware = "@FIRMWAREPATH@"
# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0 --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending on the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# hotplug functionality. For example, `default_maxvcpus = 240` specifies that up to 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what you are doing.
default_maxvcpus = @DEFMAXVCPUS_ACRN@
# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Up to 30 devices per bridge can be hot plugged.
# * Up to 5 PCI bridges can be cold plugged per VM.
# This limitation could be a bug in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0 --> will be set to @DEFBRIDGES@
# > 1 <= 5 --> will be set to the specified number
# > 5 --> will be set to 5
default_bridges = @DEFBRIDGES@
# Default memory size in MiB for SB/VM.
# If unspecified then it will be set to @DEFMEMSZ@ MiB.
default_memory = @DEFMEMSZ@
# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. ACRN only supports virtio-blk.
block_device_driver = "@DEFBLOCKSTORAGEDRIVER_ACRN@"
# This option changes the default hypervisor and kernel parameters
# to enable debug output where available.
#
# Default false
#enable_debug = true
# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
#disable_nesting_checks = true
# If host doesn't support vhost_net, set to true. Thus we won't create vhost fds for nics.
# Default false
#disable_vhost_net = true
# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/kata-containers/tree/main/tools/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,poststart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered while scanning for hooks,
# but it will not abort container execution.
#guest_hook_path = "/usr/share/oci/hooks"
# disable applying SELinux on the VMM process (default false)
disable_selinux=@DEFDISABLESELINUX@
[agent.@PROJECT_TYPE@]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
#enable_debug = true
# Enable agent tracing.
#
# If enabled, the agent will generate OpenTelemetry trace spans.
#
# Notes:
#
# - If the runtime also has tracing enabled, the agent spans will be
# associated with the appropriate runtime parent span.
# - If enabled, the runtime will wait for the container to shutdown,
# increasing the container shutdown time slightly.
#
# (default: disabled)
#enable_tracing = true
# Enable debug console.
# If enabled, user can connect guest OS running inside hypervisor
# through "kata-runtime exec <sandbox-id>" command
#debug_console_enabled = true
# Agent connection dialing timeout value in seconds
# (default: 45)
dial_timeout = 45
# Confidential Data Hub API timeout value in seconds
# (default: 50)
#cdh_api_timeout = 50
[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
# - bridged (Deprecated)
# Uses a linux bridge to interconnect the container interface to
# the VM. Works for most cases except macvlan and ipvlan.
# ***NOTE: This feature has been deprecated with plans to remove this
# feature in the future. Please use other network models listed below.
#
#
# - macvtap
# Used when the Container network interface can be bridged using
# macvtap.
#
# - none
# Used with a customized network. Only creates a tap device. No veth pair.
#
# - tcfilter
# Uses tc filter rules to redirect traffic from the network interface
# provided by plugin to a tap interface connected to the VM.
#
internetworking_model="@DEFNETWORKMODEL_ACRN@"
# disable guest seccomp
# Determines whether container seccomp profiles are passed to the virtual
# machine and applied by the kata agent. If set to true, seccomp is not applied
# within the guest
# (default: true)
disable_guest_seccomp=@DEFDISABLEGUESTSECCOMP@
# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
#enable_tracing = true
# Set the full url to the Jaeger HTTP Thrift collector.
# The default if not set will be "http://localhost:14268/api/traces"
#jaeger_endpoint = ""
# Sets the username to be used if basic auth is required for Jaeger.
#jaeger_user = ""
# Sets the password to be used if basic auth is required for Jaeger.
#jaeger_password = ""
# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have some potential impacts to your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `internetworking_model=bridged` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# (default: false)
#disable_new_netns = true
# if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
# The container cgroups in the host are not created, just one single cgroup per sandbox.
# The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# The sandbox cgroup is constrained if there is no container type annotation.
# See: https://pkg.go.dev/github.com/kata-containers/kata-containers/src/runtime/virtcontainers#ContainerType
sandbox_cgroup_only=@DEFSANDBOXCGROUPONLY@
# If enabled, the runtime will not create Kubernetes emptyDir mounts on the guest filesystem. Instead, emptyDir mounts will
# be created on the host and shared via virtio-fs. This is potentially slower, but allows sharing of files from host to guest.
disable_guest_empty_dir=@DEFDISABLEGUESTEMPTYDIR@
# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features not stable enough for production,
# they may break compatibility, and are prepared for a big version bump.
# Supported experimental features:
# (default: [])
experimental=@DEFAULTEXPFEATURES@
# If enabled, user can run pprof tools with shim v2 process through kata-monitor.
# (default: false)
# enable_pprof = true
# Indicates the CreateContainer request timeout needed for the workload(s)
# If using guest_pull, this includes the time to pull the image inside the guest
# Defaults to @DEFCREATECONTAINERTIMEOUT@ second(s)
# Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
# In essence, the timeout used for guest pull is the smaller of runtime-request-timeout and create_container_timeout.
create_container_timeout = @DEFCREATECONTAINERTIMEOUT@
# Base directory of directly attachable network config.
# Network devices for VM-based containers are allowed to be placed in the
# host netns to eliminate as many hops as possible, which is what we
# called a "Directly Attachable Network". The config, set by special CNI
# plugins, is used to tell the Kata containers what devices are attached
# to the hypervisor.
# (default: /run/kata-containers/dans)
dan_conf = "@DEFDANCONF@"

View File

@@ -60,12 +60,6 @@ func (device *BlockDevice) Attach(ctx context.Context, devReceiver api.DeviceRec
return err
}
hypervisorType := devReceiver.GetHypervisorType()
if hypervisorType == "acrn" {
deviceLogger().Debug("Special casing for ACRN to increment BlockIndex")
index = index + 1
}
drive := &config.BlockDrive{
File: device.DeviceInfo.HostPath,
Format: "raw",

View File

@@ -30,7 +30,7 @@ type CPUDevice struct {
type HypervisorState struct {
BlockIndexMap map[int]struct{}
// Type of hypervisor, E.g. qemu/firecracker/acrn.
// Type of hypervisor, E.g. qemu/firecracker
Type string
UUID string
// clh specific: refer to 'virtcontainers/clh.go:CloudHypervisorState'

View File

@@ -44,7 +44,6 @@ var DEFAULTRUNTIMECONFIGURATION = "@CONFIG_PATH@"
// defaultRuntimeConfiguration.
var DEFAULTSYSCONFRUNTIMECONFIGURATION = "@SYSCONFIG@"
var defaultHypervisorPath = "/usr/bin/qemu-system-x86_64"
var defaultHypervisorCtlPath = "/usr/bin/acrnctl"
var defaultJailerPath = "/usr/bin/jailer"
var defaultImagePath = "/usr/share/kata-containers/kata-containers.img"
var defaultKernelPath = "/usr/share/kata-containers/vmlinuz.container"

View File

@@ -50,7 +50,6 @@ const (
firecrackerHypervisorTableType = "firecracker"
clhHypervisorTableType = "clh"
qemuHypervisorTableType = "qemu"
acrnHypervisorTableType = "acrn"
dragonballHypervisorTableType = "dragonball"
stratovirtHypervisorTableType = "stratovirt"
remoteHypervisorTableType = "remote"
@@ -83,7 +82,6 @@ type hypervisor struct {
Path string `toml:"path"`
JailerPath string `toml:"jailer_path"`
Kernel string `toml:"kernel"`
CtlPath string `toml:"ctlpath"`
Initrd string `toml:"initrd"`
Image string `toml:"image"`
RootfsType string `toml:"rootfs_type"`
@@ -109,7 +107,6 @@ type hypervisor struct {
SnpCertsPath string `toml:"snp_certs_path"`
HypervisorPathList []string `toml:"valid_hypervisor_paths"`
JailerPathList []string `toml:"valid_jailer_paths"`
CtlPathList []string `toml:"valid_ctlpaths"`
VirtioFSDaemonList []string `toml:"valid_virtio_fs_daemon_paths"`
VirtioFSExtraArgs []string `toml:"virtio_fs_extra_args"`
PFlashList []string `toml:"pflashes"`
@@ -225,16 +222,6 @@ func (h hypervisor) path() (string, error) {
return ResolvePath(p)
}
func (h hypervisor) ctlpath() (string, error) {
p := h.CtlPath
if h.CtlPath == "" {
p = defaultHypervisorCtlPath
}
return ResolvePath(p)
}
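`ctlpath()` follows the same fallback pattern as the other path getters: use the configured value if set, otherwise the compiled-in default, then validate via `ResolvePath`. The selection step can be sketched as (`pickPath` is a hypothetical helper, not the runtime's API):

```go
package main

import "fmt"

// pickPath returns the configured path if set, otherwise the default.
// The runtime then passes the result through ResolvePath, which
// resolves symlinks and rejects invalid paths.
func pickPath(configured, fallback string) string {
	if configured == "" {
		return fallback
	}
	return configured
}

func main() {
	fmt.Println(pickPath("", "/usr/bin/acrnctl"))                  // /usr/bin/acrnctl
	fmt.Println(pickPath("/opt/acrn/acrnctl", "/usr/bin/acrnctl")) // /opt/acrn/acrnctl
}
```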
func (h hypervisor) jailerPath() (string, error) {
p := h.JailerPath
@@ -1022,79 +1009,6 @@ func newQemuHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) {
}, nil
}
func newAcrnHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) {
hypervisor, err := h.path()
if err != nil {
return vc.HypervisorConfig{}, err
}
hypervisorctl, err := h.ctlpath()
if err != nil {
return vc.HypervisorConfig{}, err
}
kernel, err := h.kernel()
if err != nil {
return vc.HypervisorConfig{}, err
}
image, err := h.image()
if err != nil {
return vc.HypervisorConfig{}, err
}
if image == "" {
return vc.HypervisorConfig{},
errors.New("image must be defined in the configuration file")
}
rootfsType, err := h.rootfsType()
if err != nil {
return vc.HypervisorConfig{}, err
}
firmware, err := h.firmware()
if err != nil {
return vc.HypervisorConfig{}, err
}
kernelParams := h.kernelParams()
blockDriver, err := h.blockDeviceDriver()
if err != nil {
return vc.HypervisorConfig{}, err
}
return vc.HypervisorConfig{
HypervisorPath: hypervisor,
HypervisorPathList: h.HypervisorPathList,
KernelPath: kernel,
ImagePath: image,
RootfsType: rootfsType,
HypervisorCtlPath: hypervisorctl,
HypervisorCtlPathList: h.CtlPathList,
FirmwarePath: firmware,
KernelParams: vc.DeserializeParams(vc.KernelParamFields(kernelParams)),
NumVCPUsF: h.defaultVCPUs(),
DefaultMaxVCPUs: h.defaultMaxVCPUs(),
MemorySize: h.defaultMemSz(),
MemSlots: h.defaultMemSlots(),
DefaultMaxMemorySize: h.defaultMaxMemSz(),
EntropySource: h.GetEntropySource(),
EntropySourceList: h.EntropySourceList,
DefaultBridges: h.defaultBridges(),
HugePages: h.HugePages,
Debug: h.Debug,
DisableNestingChecks: h.DisableNestingChecks,
BlockDeviceDriver: blockDriver,
DisableVhostNet: h.DisableVhostNet,
GuestHookPath: h.guestHookPath(),
DisableSeLinux: h.DisableSeLinux,
EnableAnnotations: h.EnableAnnotations,
DisableGuestSeLinux: true, // Guest SELinux is not supported in ACRN
}, nil
}
func newClhHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) {
hypervisor, err := h.path()
if err != nil {
@@ -1393,9 +1307,6 @@ func updateRuntimeConfigHypervisor(configPath string, tomlConf tomlConfig, confi
case qemuHypervisorTableType:
config.HypervisorType = vc.QemuHypervisor
hConfig, err = newQemuHypervisorConfig(hypervisor)
case acrnHypervisorTableType:
config.HypervisorType = vc.AcrnHypervisor
hConfig, err = newAcrnHypervisorConfig(hypervisor)
case clhHypervisorTableType:
config.HypervisorType = vc.ClhHypervisor
hConfig, err = newClhHypervisorConfig(hypervisor)

View File

@@ -1770,9 +1770,6 @@ func TestUpdateRuntimeConfigHypervisor(t *testing.T) {
configFile := "/some/where/configuration.toml"
// Note: We cannot test acrnHypervisorTableType since
// newAcrnHypervisorConfig() expects ACRN binaries to be
// installed.
var entries = []tableTypeEntry{
{clhHypervisorTableType, true},
{dragonballHypervisorTableType, true},

View File

@@ -578,13 +578,6 @@ func addHypervisorPathOverrides(ocispec specs.Spec, config *vc.SandboxConfig, ru
config.HypervisorConfig.JailerPath = value
}
if value, ok := ocispec.Annotations[vcAnnotations.CtlPath]; ok {
if !checkPathIsInGlobs(runtime.HypervisorConfig.HypervisorCtlPathList, value) {
return fmt.Errorf("hypervisor control %v required from annotation is not valid", value)
}
config.HypervisorConfig.HypervisorCtlPath = value
}
if value, ok := ocispec.Annotations[vcAnnotations.KernelParams]; ok {
if value != "" {
params := vc.DeserializeParams(strings.Fields(value))

View File

@@ -483,9 +483,6 @@ func TestAddAssetAnnotations(t *testing.T) {
vcAnnotations.HypervisorPath: fakeAssetFile,
vcAnnotations.HypervisorHash: "bbbbb",
vcAnnotations.HypervisorCtlPath: fakeAssetFile,
vcAnnotations.HypervisorCtlHash: "cc",
vcAnnotations.ImagePath: fakeAssetFile,
vcAnnotations.ImageHash: "52ss2550983",
@@ -533,7 +530,6 @@ func TestAddAssetAnnotations(t *testing.T) {
// Check that it works if all path lists are enabled
runtimeConfig.HypervisorConfig.HypervisorPathList = []string{tmpdirGlob}
runtimeConfig.HypervisorConfig.JailerPathList = []string{tmpdirGlob}
runtimeConfig.HypervisorConfig.HypervisorCtlPathList = []string{tmpdirGlob}
err = addAnnotations(ocispec, &config, runtimeConfig)
assert.NoError(err)

View File

@@ -1,684 +0,0 @@
//go:build linux
// Copyright (c) 2019 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
package virtcontainers
import (
"context"
"fmt"
"os"
"os/exec"
"path/filepath"
"strconv"
"strings"
"syscall"
"time"
"github.com/pkg/errors"
"github.com/prometheus/procfs"
"github.com/sirupsen/logrus"
"github.com/kata-containers/kata-containers/src/runtime/pkg/device/config"
hv "github.com/kata-containers/kata-containers/src/runtime/pkg/hypervisors"
"github.com/kata-containers/kata-containers/src/runtime/pkg/katautils/katatrace"
persistapi "github.com/kata-containers/kata-containers/src/runtime/virtcontainers/persist/api"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/types"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/utils"
)
// acrnTracingTags defines tags for the trace span
var acrnTracingTags = map[string]string{
"source": "runtime",
"package": "virtcontainers",
"subsystem": "hypervisor",
"type": "acrn",
}
// AcrnState keeps track of VM UUID, PID.
type AcrnState struct {
PID int
}
// Acrn is a Hypervisor interface implementation for the Linux acrn hypervisor.
type Acrn struct {
sandbox *Sandbox
ctx context.Context
arch acrnArch
store persistapi.PersistDriver
id string
acrnConfig Config
config HypervisorConfig
state AcrnState
}
const (
acrnConsoleSocket = "console.sock"
acrnStopSandboxTimeoutSecs = 15
)
// agnostic list of kernel parameters
var acrnDefaultKernelParameters = []Param{
{"panic", "1"},
}
func (a *Acrn) kernelParameters() string {
// get a list of arch kernel parameters
params := a.arch.kernelParameters(a.config.Debug)
// use default parameters
params = append(params, acrnDefaultKernelParameters...)
// set the maximum number of vCPUs
params = append(params, Param{"maxcpus", fmt.Sprintf("%d", a.config.DefaultMaxVCPUs)})
// add the params specified by the provided config. As the kernel
// honours the last parameter value set and since the config-provided
// params are added here, they will take priority over the defaults.
params = append(params, a.config.KernelParams...)
paramsStr := SerializeParams(params, "=")
return strings.Join(paramsStr, " ")
}
// Adds all capabilities supported by Acrn implementation of hypervisor interface
func (a *Acrn) Capabilities(ctx context.Context) types.Capabilities {
span, _ := katatrace.Trace(ctx, a.Logger(), "Capabilities", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
return a.arch.capabilities(a.config)
}
func (a *Acrn) HypervisorConfig() HypervisorConfig {
return a.config
}
// Get the CPU APIC ID, to identify the vCPU that will be assigned to a VM, by reading `/proc/cpuinfo`
func (a *Acrn) getNextApicid() (string, error) {
fs, err := procfs.NewFS("/proc")
if err != nil {
return "", err
}
cpuinfo, err := fs.CPUInfo()
if err != nil {
return "", err
}
prevIdx := -1
fileName := filepath.Join(a.config.VMStorePath, "cpu_affinity_idx")
_, err = os.Stat(fileName)
if err == nil {
data, err := os.ReadFile(fileName)
if err != nil {
a.Logger().Error("Loading cpu affinity index from file failed!")
return "", err
}
prevIdx, err = strconv.Atoi(string(data))
if err != nil {
a.Logger().Error("CreateVM: Convert from []byte to integer failed!")
return "", err
}
if prevIdx >= (len(cpuinfo) - 1) {
prevIdx = -1
}
}
currentIdx := prevIdx + 1
err = os.WriteFile(fileName, []byte(strconv.Itoa(currentIdx)), defaultFilePerms)
if err != nil {
a.Logger().Error("Storing cpu affinity index to file failed!")
return "", err
}
return cpuinfo[currentIdx].APICID, nil
}
// get the acrn binary path
func (a *Acrn) acrnPath() (string, error) {
p, err := a.config.HypervisorAssetPath()
if err != nil {
return "", err
}
if p == "" {
p, err = a.arch.acrnPath()
if err != nil {
return "", err
}
}
if _, err = os.Stat(p); os.IsNotExist(err) {
return "", fmt.Errorf("acrn path (%s) does not exist", p)
}
return p, nil
}
// get the ACRNCTL binary path
func (a *Acrn) acrnctlPath() (string, error) {
ctlpath, err := a.config.HypervisorCtlAssetPath()
if err != nil {
return "", err
}
if ctlpath == "" {
ctlpath, err = a.arch.acrnctlPath()
if err != nil {
return "", err
}
}
if _, err = os.Stat(ctlpath); os.IsNotExist(err) {
return "", fmt.Errorf("acrnctl path (%s) does not exist", ctlpath)
}
return ctlpath, nil
}
// Logger returns a logrus logger appropriate for logging acrn messages
func (a *Acrn) Logger() *logrus.Entry {
return virtLog.WithField("subsystem", "acrn")
}
func (a *Acrn) memoryTopology() (Memory, error) {
memMb := uint64(a.config.MemorySize)
return a.arch.memoryTopology(memMb), nil
}
func (a *Acrn) appendImage(devices []Device, imagePath string) ([]Device, error) {
if imagePath == "" {
return nil, fmt.Errorf("Image path is empty: %s", imagePath)
}
var err error
devices, err = a.arch.appendImage(devices, imagePath)
if err != nil {
return nil, err
}
return devices, nil
}
func (a *Acrn) buildDevices(ctx context.Context, imagePath string) ([]Device, error) {
var devices []Device
if imagePath == "" {
return nil, fmt.Errorf("Image Path should not be empty: %s", imagePath)
}
_, console, err := a.GetVMConsole(ctx, a.id)
if err != nil {
return nil, err
}
// Add bridges before any other devices. This way we make sure that
// bridge gets the first available PCI address.
devices = a.arch.appendBridges(devices)
//Add LPC device to the list of other devices.
devices = a.arch.appendLPC(devices)
devices = a.arch.appendConsole(devices, console)
devices, err = a.appendImage(devices, imagePath)
if err != nil {
return nil, err
}
// Create virtio blk devices with a dummy backend as a placeholder
// for the container rootfs (as acrn doesn't support hot-plug).
// Once the container rootfs is known, replace the dummy backend
// with actual path (using block rescan feature in acrn)
devices, err = a.createDummyVirtioBlkDev(ctx, devices)
if err != nil {
return nil, err
}
return devices, nil
}
// setup sets the Acrn structure up.
func (a *Acrn) setup(ctx context.Context, id string, hypervisorConfig *HypervisorConfig) error {
span, _ := katatrace.Trace(ctx, a.Logger(), "setup", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
if err := a.setConfig(hypervisorConfig); err != nil {
return err
}
a.id = id
var err error
a.arch, err = newAcrnArch(a.config)
if err != nil {
return err
}
return nil
}
func (a *Acrn) createDummyVirtioBlkDev(ctx context.Context, devices []Device) ([]Device, error) {
span, _ := katatrace.Trace(ctx, a.Logger(), "createDummyVirtioBlkDev", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
// Since acrn doesn't support hot-plug, dummy virtio-blk
// devices are added and later replaced with container-rootfs.
// Starting from driveIndex 1, as 0 is allocated for VM rootfs.
for driveIndex := 1; driveIndex <= AcrnBlkDevPoolSz; driveIndex++ {
drive := config.BlockDrive{
File: "nodisk",
Index: driveIndex,
}
devices = a.arch.appendBlockDevice(devices, drive)
}
return devices, nil
}
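Because acrn cannot hot-plug devices, the pool of placeholder drives above is declared up front at VM creation and later re-pointed at real container rootfs files. The allocation can be sketched as (`blockDrive`/`dummyPool` are illustrative names):

```go
package main

import "fmt"

type blockDrive struct {
	file  string
	index int
}

// dummyPool pre-creates placeholder virtio-blk drives. Index 0 is
// reserved for the VM rootfs, so placeholders start at 1, exactly as
// createDummyVirtioBlkDev does.
func dummyPool(size int) []blockDrive {
	drives := make([]blockDrive, 0, size)
	for i := 1; i <= size; i++ {
		drives = append(drives, blockDrive{file: "nodisk", index: i})
	}
	return drives
}

func main() {
	fmt.Println(len(dummyPool(8))) // 8
}
```

The pool size bounds how many container rootfs block devices a sandbox can ever use, which is why the original code validates drive indices against `AcrnBlkDevPoolSz`.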
func (a *Acrn) setConfig(config *HypervisorConfig) error {
a.config = *config
return nil
}
// CreateVM is the VM creation
func (a *Acrn) CreateVM(ctx context.Context, id string, network Network, hypervisorConfig *HypervisorConfig) error {
// Save the tracing context
a.ctx = ctx
span, ctx := katatrace.Trace(ctx, a.Logger(), "CreateVM", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
if err := a.setup(ctx, id, hypervisorConfig); err != nil {
return err
}
memory, err := a.memoryTopology()
if err != nil {
return err
}
kernelPath, err := a.config.KernelAssetPath()
if err != nil {
return err
}
imagePath, err := a.config.ImageAssetPath()
if err != nil {
return err
}
kernel := Kernel{
Path: kernelPath,
ImagePath: imagePath,
Params: a.kernelParameters(),
}
devices, err := a.buildDevices(ctx, imagePath)
if err != nil {
return err
}
acrnPath, err := a.acrnPath()
if err != nil {
return err
}
acrnctlPath, err := a.acrnctlPath()
if err != nil {
return err
}
vmName := fmt.Sprintf("sbx-%s", a.id)
if len(vmName) > 15 {
return fmt.Errorf("VM Name len is %d but ACRN supports max VM name len of 15.", len(vmName))
}
apicID, err := a.getNextApicid()
if err != nil {
return err
}
acrnConfig := Config{
ACPIVirt: true,
Path: acrnPath,
CtlPath: acrnctlPath,
Memory: memory,
Devices: devices,
Kernel: kernel,
Name: vmName,
ApicID: apicID,
}
a.acrnConfig = acrnConfig
return nil
}
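CreateVM derives the VM name from the sandbox ID and rejects anything longer than ACRN's 15-character limit. That check can be isolated as (`vmName` is a hypothetical helper for illustration):

```go
package main

import "fmt"

// vmName builds the ACRN VM name from a sandbox ID, enforcing the
// 15-character limit that CreateVM checks.
func vmName(sandboxID string) (string, error) {
	name := "sbx-" + sandboxID
	if len(name) > 15 {
		return "", fmt.Errorf("VM name %q is %d chars; ACRN allows at most 15", name, len(name))
	}
	return name, nil
}

func main() {
	n, err := vmName("0a1b2c3d4e5")
	fmt.Println(n, err) // sbx-0a1b2c3d4e5 <nil>
}
```

Note the "sbx-" prefix consumes 4 of the 15 characters, so only the first 11 characters of a sandbox ID can be used.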
// StartVM will start the Sandbox's VM.
func (a *Acrn) StartVM(ctx context.Context, timeoutSecs int) error {
span, ctx := katatrace.Trace(ctx, a.Logger(), "StartVM", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
if a.config.Debug {
params := a.arch.kernelParameters(a.config.Debug)
strParams := SerializeParams(params, "=")
formatted := strings.Join(strParams, " ")
// The name of this field matches a similar one generated by
// the runtime and allows users to identify which parameters
// are set here, which come from the runtime and which are set
// by the user.
a.Logger().WithField("default-kernel-parameters", formatted).Debug()
}
vmPath := filepath.Join(a.config.VMStorePath, a.id)
err := os.MkdirAll(vmPath, DirMode)
if err != nil {
return err
}
defer func() {
if err != nil {
if err := os.RemoveAll(vmPath); err != nil {
a.Logger().WithError(err).Error("Failed to clean up vm directory")
}
}
}()
var strErr string
var PID int
a.Logger().Error("StartVM: LaunchAcrn() function called")
PID, strErr, err = LaunchAcrn(a.acrnConfig, virtLog.WithField("subsystem", "acrn-dm"))
if err != nil {
return fmt.Errorf("%s", strErr)
}
a.state.PID = PID
if err = a.waitVM(ctx, timeoutSecs); err != nil {
a.Logger().WithError(err).Debug("acrn wait failed")
return err
}
return nil
}
// waitVM will wait for the Sandbox's VM to be up and running.
func (a *Acrn) waitVM(ctx context.Context, timeoutSecs int) error {
span, _ := katatrace.Trace(ctx, a.Logger(), "waitVM", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
if timeoutSecs < 0 {
return fmt.Errorf("Invalid timeout %ds", timeoutSecs)
}
time.Sleep(time.Duration(timeoutSecs) * time.Second)
return nil
}
// StopVM will stop the Sandbox's VM.
func (a *Acrn) StopVM(ctx context.Context, waitOnly bool) (err error) {
span, _ := katatrace.Trace(ctx, a.Logger(), "StopVM", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
a.Logger().Info("Stopping acrn VM")
defer func() {
if err != nil {
a.Logger().Info("StopVM failed")
} else {
a.Logger().Info("acrn VM stopped")
}
}()
fileName := filepath.Join(a.config.VMStorePath, "cpu_affinity_idx")
data, err := os.ReadFile(fileName)
if err != nil {
a.Logger().Error("Loading cpu affinity index from file failed!")
return err
}
currentIdx, err := strconv.Atoi(string(data))
if err != nil {
a.Logger().Error("Converting from []byte to integer failed!")
return err
}
currentIdx = currentIdx - 1
err = os.WriteFile(fileName, []byte(strconv.Itoa(currentIdx)), defaultFilePerms)
if err != nil {
a.Logger().Error("Storing cpu affinity index to file failed!")
return err
}
pid := a.state.PID
shutdownSignal := syscall.SIGINT
if waitOnly {
// NOP
shutdownSignal = syscall.Signal(0)
}
return utils.WaitLocalProcess(pid, acrnStopSandboxTimeoutSecs, shutdownSignal, a.Logger())
}
func (a *Acrn) updateBlockDevice(drive *config.BlockDrive) error {
if drive.Swap {
return fmt.Errorf("Acrn doesn't support swap")
}
var err error
if drive.File == "" || drive.Index >= AcrnBlkDevPoolSz {
return fmt.Errorf("Empty filepath or invalid drive index, Drive ID:%s, Drive Index:%d",
drive.ID, drive.Index)
}
slot := AcrnBlkdDevSlot[drive.Index]
//Explicitly set PCIPath to NULL, so that VirtPath can be used
drive.PCIPath = types.PciPath{}
args := []string{"blkrescan", a.acrnConfig.Name, fmt.Sprintf("%d,%s", slot, drive.File)}
a.Logger().WithFields(logrus.Fields{
"drive": drive,
"path": a.config.HypervisorCtlPath,
}).Info("updateBlockDevice with acrnctl path")
cmd := exec.Command(a.config.HypervisorCtlPath, args...)
if err := cmd.Run(); err != nil {
a.Logger().WithError(err).Error("updating Block device with newFile path")
}
return err
}
func (a *Acrn) HotplugAddDevice(ctx context.Context, devInfo interface{}, devType DeviceType) (interface{}, error) {
span, _ := katatrace.Trace(ctx, a.Logger(), "HotplugAddDevice", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
switch devType {
case BlockDev:
//The drive placeholder has to exist prior to Update
return nil, a.updateBlockDevice(devInfo.(*config.BlockDrive))
default:
return nil, fmt.Errorf("HotplugAddDevice: unsupported device: devInfo:%v, deviceType%v",
devInfo, devType)
}
}
func (a *Acrn) HotplugRemoveDevice(ctx context.Context, devInfo interface{}, devType DeviceType) (interface{}, error) {
span, _ := katatrace.Trace(ctx, a.Logger(), "HotplugRemoveDevice", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
// Not supported. return success
return nil, nil
}
func (a *Acrn) PauseVM(ctx context.Context) error {
span, _ := katatrace.Trace(ctx, a.Logger(), "PauseVM", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
// Not supported. return success
return nil
}
func (a *Acrn) ResumeVM(ctx context.Context) error {
span, _ := katatrace.Trace(ctx, a.Logger(), "ResumeVM", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
// Not supported. return success
return nil
}
// AddDevice will add extra devices to acrn command line.
func (a *Acrn) AddDevice(ctx context.Context, devInfo interface{}, devType DeviceType) error {
var err error
span, _ := katatrace.Trace(ctx, a.Logger(), "AddDevice", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
switch v := devInfo.(type) {
case types.Volume:
// Not supported. return success
err = nil
case types.Socket:
a.acrnConfig.Devices = a.arch.appendSocket(a.acrnConfig.Devices, v)
case types.VSock:
a.acrnConfig.Devices = a.arch.appendVSock(a.acrnConfig.Devices, v)
case Endpoint:
a.acrnConfig.Devices = a.arch.appendNetwork(a.acrnConfig.Devices, v)
case config.BlockDrive:
a.acrnConfig.Devices = a.arch.appendBlockDevice(a.acrnConfig.Devices, v)
case config.VhostUserDeviceAttrs:
// Not supported. return success
err = nil
case config.VFIODev:
// Not supported. return success
err = nil
default:
err = nil
a.Logger().WithField("unknown-device-type", devInfo).Error("Adding device")
}
return err
}
// GetVMConsole builds the path of the console where we can read logs coming
// from the sandbox.
func (a *Acrn) GetVMConsole(ctx context.Context, id string) (string, string, error) {
span, _ := katatrace.Trace(ctx, a.Logger(), "GetVMConsole", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
consoleURL, err := utils.BuildSocketPath(a.config.VMStorePath, id, acrnConsoleSocket)
if err != nil {
a.Logger().Error("GetVMConsole returned error")
return consoleProtoUnix, "", err
}
return consoleProtoUnix, consoleURL, nil
}
func (a *Acrn) SaveVM() error {
a.Logger().Info("Save sandbox")
// Not supported. return success
return nil
}
func (a *Acrn) Disconnect(ctx context.Context) {
span, _ := katatrace.Trace(ctx, a.Logger(), "Disconnect", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
// Not supported.
}
func (a *Acrn) GetThreadIDs(ctx context.Context) (VcpuThreadIDs, error) {
span, _ := katatrace.Trace(ctx, a.Logger(), "GetThreadIDs", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
// Not supported. return success
//Just allocating an empty map
return VcpuThreadIDs{}, nil
}
func (a *Acrn) GetTotalMemoryMB(ctx context.Context) uint32 {
return a.config.MemorySize
}
func (a *Acrn) ResizeMemory(ctx context.Context, reqMemMB uint32, memoryBlockSizeMB uint32, probe bool) (uint32, MemoryDevice, error) {
return 0, MemoryDevice{}, nil
}
func (a *Acrn) ResizeVCPUs(ctx context.Context, reqVCPUs uint32) (currentVCPUs uint32, newVCPUs uint32, err error) {
return 0, 0, nil
}
func (a *Acrn) Cleanup(ctx context.Context) error {
span, _ := katatrace.Trace(ctx, a.Logger(), "Cleanup", acrnTracingTags, map[string]string{"sandbox_id": a.id})
defer span.End()
return nil
}
func (a *Acrn) GetPids() []int {
return []int{a.state.PID}
}
func (a *Acrn) GetVirtioFsPid() *int {
return nil
}
func (a *Acrn) fromGrpc(ctx context.Context, hypervisorConfig *HypervisorConfig, j []byte) error {
return errors.New("acrn is not supported by VM cache")
}
func (a *Acrn) toGrpc(ctx context.Context) ([]byte, error) {
return nil, errors.New("acrn is not supported by VM cache")
}
func (a *Acrn) Save() (s hv.HypervisorState) {
s.Pid = a.state.PID
s.Type = string(AcrnHypervisor)
return
}
func (a *Acrn) Load(s hv.HypervisorState) {
a.state.PID = s.Pid
}
func (a *Acrn) Check() error {
if err := syscall.Kill(a.state.PID, syscall.Signal(0)); err != nil {
return errors.Wrapf(err, "failed to ping acrn process")
}
return nil
}
func (a *Acrn) GenerateSocket(id string) (interface{}, error) {
socket, err := generateVMSocket(id, a.config.VMStorePath)
if err != nil {
return "", err
}
vsock, _ := socket.(types.VSock)
vsock.VhostFd.Close()
return socket, err
}
func (a *Acrn) IsRateLimiterBuiltin() bool {
return false
}
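The `updateBlockDevice` flow above swaps a dummy virtio-blk backing file by shelling out to `acrnctl blkrescan` with a `slot,path` argument, where the slot comes from the `AcrnBlkdDevSlot` table recorded at launch. A minimal standalone sketch of that argument construction (hypothetical helper names, not the runtime's API):

```go
package main

import "fmt"

// blkrescanArg builds the "slot,path" argument that updateBlockDevice
// hands to `acrnctl blkrescan`: the PCI slot pre-recorded for the
// drive index, paired with the new backing file.
func blkrescanArg(slotForIndex []int, index int, file string) string {
	return fmt.Sprintf("%d,%s", slotForIndex[index], file)
}

func main() {
	// Slots as they might have been recorded at launch
	// (slot 2 is skipped, being reserved for GVT-g).
	slots := []int{0, 1, 3, 4, 5, 6, 7, 8, 9}
	fmt.Println(blkrescanArg(slots, 2, "/run/kata/rootfs.img")) // 3,/run/kata/rootfs.img
}
```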


@@ -1,802 +0,0 @@
//go:build linux
// Copyright (c) 2019 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
package virtcontainers
import (
"bytes"
"context"
"fmt"
"os"
"os/exec"
"strings"
"github.com/kata-containers/kata-containers/src/runtime/pkg/device/config"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/types"
"github.com/sirupsen/logrus"
)
type acrnArch interface {
// acrnPath returns the path to the acrn binary
acrnPath() (string, error)
// acrnctlPath returns the path to the acrnctl binary
acrnctlPath() (string, error)
// kernelParameters returns the kernel parameters
// if debug is true then kernel debug parameters are included
kernelParameters(debug bool) []Param
//capabilities returns the capabilities supported by acrn
capabilities(config HypervisorConfig) types.Capabilities
// memoryTopology returns the memory topology using the given amount of memoryMb and hostMemoryMb
memoryTopology(memMb uint64) Memory
// appendConsole appends a console to devices
appendConsole(devices []Device, path string) []Device
// appendImage appends an image to devices
appendImage(devices []Device, path string) ([]Device, error)
// appendBridges appends bridges to devices
appendBridges(devices []Device) []Device
// appendLPC appends LPC to devices
// UART device emulated by the acrn-dm is connected to the system by the LPC bus
appendLPC(devices []Device) []Device
// appendSocket appends a socket to devices
appendSocket(devices []Device, socket types.Socket) []Device
// appendVSock appends a vsock PCI to devices
appendVSock(devices []Device, vsock types.VSock) []Device
// appendNetwork appends a endpoint device to devices
appendNetwork(devices []Device, endpoint Endpoint) []Device
// appendBlockDevice appends a block drive to devices
appendBlockDevice(devices []Device, drive config.BlockDrive) []Device
// handleImagePath handles the Hypervisor Config image path
handleImagePath(config HypervisorConfig) error
}
type acrnArchBase struct {
path string
ctlpath string
kernelParamsNonDebug []Param
kernelParamsDebug []Param
kernelParams []Param
}
const acrnPath = "/usr/bin/acrn-dm"
const acrnctlPath = "/usr/bin/acrnctl"
// acrn GVT-g slot is hard coded to 2 as there is
// no simple way to pass arguments of PCI slots from
// the device model (acrn-dm) to the ACRNGT module.
const acrnGVTgReservedSlot = 2
const acrnLPCDev = "lpc"
const acrnHostBridge = "hostbridge"
var baselogger *logrus.Entry
// AcrnBlkDevPoolSz defines the number of dummy virtio-blk
// devices that will be created for hot-plugging container
// rootfs. Since acrn doesn't support hot-plug, dummy virtio-blk
// devices are added and later replaced with the container rootfs.
var AcrnBlkDevPoolSz = 8
// AcrnBlkdDevSlot provides the translation between a
// virtio-blk device index and the slot it is currently
// attached to.
// Allocating extra 1 to accommodate for VM rootfs
// which is at driveIndex 0
var AcrnBlkdDevSlot = make([]int, AcrnBlkDevPoolSz+1)
// acrnKernelParamsNonDebug is a list of the default kernel
// parameters that will be used in standard (non-debug) mode.
var acrnKernelParamsNonDebug = []Param{
{"quiet", ""},
}
// acrnKernelParamsSystemdNonDebug is a list of the default systemd related
// kernel parameters that will be used in standard (non-debug) mode.
var acrnKernelParamsSystemdNonDebug = []Param{
{"systemd.show_status", "false"},
}
// acrnKernelParamsDebug is a list of the default kernel
// parameters that will be used in debug mode (as much boot output as
// possible).
var acrnKernelParamsDebug = []Param{
{"debug", ""},
}
// acrnKernelParamsSystemdDebug is a list of the default systemd related kernel
// parameters that will be used in debug mode (as much boot output as
// possible).
var acrnKernelParamsSystemdDebug = []Param{
{"systemd.show_status", "true"},
{"systemd.log_level", "debug"},
{"systemd.log_target", "kmsg"},
{"printk.devkmsg", "on"},
}
var acrnKernelRootParams = []Param{
{"root", "/dev/vda1 rw rootwait"},
}
var acrnKernelParams = []Param{
{"tsc", "reliable"},
{"no_timer_check", ""},
{"nohpet", ""},
{"console", "tty0"},
{"console", "ttyS0"},
{"console", "hvc0"},
{"log_buf_len", "16M"},
{"consoleblank", "0"},
}
// Device is the acrn device interface.
type Device interface {
Valid() bool
AcrnParams(slot int, config *Config) []string
}
// ConsoleDeviceBackend is the character device backend for acrn
type ConsoleDeviceBackend string
const (
// Socket creates a 2 way stream socket (TCP or Unix).
Socket ConsoleDeviceBackend = "socket"
// Stdio sends traffic from the guest to acrn's standard output.
Stdio ConsoleDeviceBackend = "console"
// File backend only supports console output to a file (no input).
File ConsoleDeviceBackend = "file"
// TTY is an alias for Serial.
TTY ConsoleDeviceBackend = "tty"
// PTY creates a new pseudo-terminal on the host and connect to it.
PTY ConsoleDeviceBackend = "pty"
)
// BEPortType marks the port as console port or virtio-serial port
type BEPortType int
const (
// SerialBE marks the port as serial port
SerialBE BEPortType = iota
//ConsoleBE marks the port as console port (append @)
ConsoleBE
)
// ConsoleDevice represents an acrn console device.
type ConsoleDevice struct {
// Name of the socket
Name string
//Path to virtio-console backend (can be omitted for pty, tty, stdio)
Path string
//Backend device used for virtio-console
Backend ConsoleDeviceBackend
// PortType marks the port as serial or console port (@)
PortType BEPortType
}
// VSOCKDevice represents an AF_VSOCK socket.
type VSOCKDevice struct {
//Guest CID assigned by Host.
ContextID uint64
}
// NetDeviceType is an acrn networking device type.
type NetDeviceType string
const (
// TAP is a TAP networking device type.
TAP NetDeviceType = "tap"
// MACVTAP is a macvtap networking device type.
MACVTAP NetDeviceType = "macvtap"
)
// NetDevice represents a guest networking device
type NetDevice struct {
// Type is the netdev type (e.g. tap).
Type NetDeviceType
// IfName is the interface name
IFName string
//MACAddress is the networking device interface MAC address
MACAddress string
}
// BlockDevice represents an acrn block device.
type BlockDevice struct {
// mem path to block device
FilePath string
//BlkIndex - Blk index corresponding to slot
Index int
}
// BridgeDevice represents an acrn bridge device like pci-bridge, pxb, etc.
type BridgeDevice struct {
// Emul is a string describing the type of PCI device e.g. virtio-net
Emul string
// Config is an optional string, depending on the device, that can be
// used for configuration
Config string
// Function is PCI function. Func can be from 0 to 7
Function int
}
// LPCDevice represents an acrn LPC device.
type LPCDevice struct {
// Emul is a string describing the type of PCI device e.g. virtio-net
Emul string
// Function is PCI function. Func can be from 0 to 7
Function int
}
// Memory is the guest memory configuration structure.
type Memory struct {
// Size is the amount of memory made available to the guest.
// It should be suffixed with M or G for sizes in megabytes or
// gigabytes respectively.
Size string
}
// Kernel is the guest kernel configuration structure.
type Kernel struct {
// Path is the guest kernel path on the host filesystem.
Path string
// ImagePath is the guest image path on the host filesystem.
ImagePath string
// Params is the kernel parameters string.
Params string
}
// Config is the acrn configuration structure.
// It allows for passing custom settings and parameters to the acrn-dm API.
type Config struct {
// Devices is a list of devices for acrn to create and drive.
Devices []Device
// Path is the acrn binary path.
Path string
// CtlPath is the acrnctl binary path.
CtlPath string
// Name is the acrn guest name
Name string
// APICID to identify vCPU that will be assigned for this VM.
ApicID string
// Kernel is the guest kernel configuration.
Kernel Kernel
// Memory is the guest memory configuration.
Memory Memory
acrnParams []string
// ACPI virtualization support
ACPIVirt bool
}
// MaxAcrnVCPUs returns the maximum number of vCPUs supported
func MaxAcrnVCPUs() uint32 {
return uint32(8)
}
func newAcrnArch(config HypervisorConfig) (acrnArch, error) {
a := &acrnArchBase{
path: acrnPath,
ctlpath: acrnctlPath,
kernelParamsNonDebug: acrnKernelParamsNonDebug,
kernelParamsDebug: acrnKernelParamsDebug,
kernelParams: acrnKernelParams,
}
if err := a.handleImagePath(config); err != nil {
return nil, err
}
return a, nil
}
func (a *acrnArchBase) acrnPath() (string, error) {
p := a.path
return p, nil
}
func (a *acrnArchBase) acrnctlPath() (string, error) {
ctlpath := a.ctlpath
return ctlpath, nil
}
func (a *acrnArchBase) kernelParameters(debug bool) []Param {
params := a.kernelParams
if debug {
params = append(params, a.kernelParamsDebug...)
} else {
params = append(params, a.kernelParamsNonDebug...)
}
return params
}
func (a *acrnArchBase) memoryTopology(memoryMb uint64) Memory {
mem := fmt.Sprintf("%dM", memoryMb)
memory := Memory{
Size: mem,
}
return memory
}
func (a *acrnArchBase) capabilities(config HypervisorConfig) types.Capabilities {
var caps types.Capabilities
caps.SetBlockDeviceSupport()
caps.SetBlockDeviceHotplugSupport()
caps.SetNetworkDeviceHotplugSupported()
return caps
}
// Valid returns true if the CharDevice structure is valid and complete.
func (cdev ConsoleDevice) Valid() bool {
if cdev.Backend != "tty" && cdev.Backend != "pty" &&
cdev.Backend != "console" && cdev.Backend != "socket" &&
cdev.Backend != "file" {
return false
} else if cdev.PortType != ConsoleBE && cdev.PortType != SerialBE {
return false
} else if cdev.Path == "" {
return false
} else {
return true
}
}
// AcrnParams returns the acrn parameters built out of this console device.
func (cdev ConsoleDevice) AcrnParams(slot int, config *Config) []string {
var acrnParams []string
var deviceParams []string
acrnParams = append(acrnParams, "-s")
deviceParams = append(deviceParams, fmt.Sprintf("%d,virtio-console,", slot))
if cdev.PortType == ConsoleBE {
deviceParams = append(deviceParams, "@")
}
switch cdev.Backend {
case "pty":
deviceParams = append(deviceParams, "pty:pty_port")
case "tty":
deviceParams = append(deviceParams, fmt.Sprintf("tty:tty_port=%s", cdev.Path))
case "socket":
deviceParams = append(deviceParams, fmt.Sprintf("socket:%s=%s", cdev.Name, cdev.Path))
case "file":
deviceParams = append(deviceParams, fmt.Sprintf("file:file_port=%s", cdev.Path))
case "stdio":
deviceParams = append(deviceParams, "stdio:stdio_port")
default:
// do nothing. Error should be already caught
}
acrnParams = append(acrnParams, strings.Join(deviceParams, ""))
return acrnParams
}
// AcrnNetdevParam converts the NetDevice to its acrn command-line parameters.
func (netdev NetDevice) AcrnNetdevParam() []string {
var deviceParams []string
switch netdev.Type {
case TAP:
deviceParams = append(deviceParams, netdev.IFName)
deviceParams = append(deviceParams, fmt.Sprintf(",mac=%s", netdev.MACAddress))
case MACVTAP:
deviceParams = append(deviceParams, netdev.IFName)
deviceParams = append(deviceParams, fmt.Sprintf(",mac=%s", netdev.MACAddress))
default:
deviceParams = append(deviceParams, netdev.IFName)
}
return deviceParams
}
const (
// MinimalGuestCID is the smallest valid context ID for a guest.
MinimalGuestCID uint64 = 3
// MaxGuestCID is the largest valid context ID for a guest.
MaxGuestCID uint64 = 1<<32 - 1
)
// Valid returns true if the VSOCKDevice structure is valid and complete.
func (vsock VSOCKDevice) Valid() bool {
if vsock.ContextID < MinimalGuestCID || vsock.ContextID > MaxGuestCID {
return false
}
return true
}
// AcrnParams returns the acrn parameters built out of this vsock device.
func (vsock VSOCKDevice) AcrnParams(slot int, config *Config) []string {
var acrnParams []string
acrnParams = append(acrnParams, "-s")
acrnParams = append(acrnParams, fmt.Sprintf("%d,vhost-vsock,cid=%d", slot, uint32(vsock.ContextID)))
return acrnParams
}
// Valid returns true if the NetDevice structure is valid and complete.
func (netdev NetDevice) Valid() bool {
if netdev.IFName == "" {
return false
} else if netdev.MACAddress == "" {
return false
} else if netdev.Type != TAP && netdev.Type != MACVTAP {
return false
} else {
return true
}
}
// AcrnParams returns the acrn parameters built out of this network device.
func (netdev NetDevice) AcrnParams(slot int, config *Config) []string {
var acrnParams []string
acrnParams = append(acrnParams, "-s")
acrnParams = append(acrnParams, fmt.Sprintf("%d,virtio-net,%s", slot, strings.Join(netdev.AcrnNetdevParam(), "")))
return acrnParams
}
// Valid returns true if the BlockDevice structure is valid and complete.
func (blkdev BlockDevice) Valid() bool {
return blkdev.FilePath != ""
}
// AcrnParams returns the acrn parameters built out of this block device.
func (blkdev BlockDevice) AcrnParams(slot int, config *Config) []string {
var acrnParams []string
device := "virtio-blk"
acrnParams = append(acrnParams, "-s")
acrnParams = append(acrnParams, fmt.Sprintf("%d,%s,%s",
slot, device, blkdev.FilePath))
// Update the global array (BlkIndex<->slot)
// Used to identify slots for the hot-plugged virtio-blk devices
if blkdev.Index <= AcrnBlkDevPoolSz {
AcrnBlkdDevSlot[blkdev.Index] = slot
} else {
baselogger.WithFields(logrus.Fields{
"device": device,
"index": blkdev.Index,
}).Info("Invalid device index")
}
return acrnParams
}
// Valid returns true if the BridgeDevice structure is valid and complete.
func (bridgeDev BridgeDevice) Valid() bool {
if bridgeDev.Function != 0 || bridgeDev.Emul != acrnHostBridge {
return false
}
return true
}
// AcrnParams returns the acrn parameters built out of this bridge device.
func (bridgeDev BridgeDevice) AcrnParams(slot int, config *Config) []string {
var acrnParams []string
acrnParams = append(acrnParams, "-s")
acrnParams = append(acrnParams, fmt.Sprintf("%d:%d,%s", slot,
bridgeDev.Function, bridgeDev.Emul))
return acrnParams
}
// Valid returns true if the BridgeDevice structure is valid and complete.
func (lpcDev LPCDevice) Valid() bool {
return lpcDev.Emul == acrnLPCDev
}
// AcrnParams returns the acrn parameters built out of this bridge device.
func (lpcDev LPCDevice) AcrnParams(slot int, config *Config) []string {
var acrnParams []string
var deviceParams []string
acrnParams = append(acrnParams, "-s")
acrnParams = append(acrnParams, fmt.Sprintf("%d:%d,%s", slot,
lpcDev.Function, lpcDev.Emul))
//define UART port
deviceParams = append(deviceParams, "-l")
deviceParams = append(deviceParams, "com1,stdio")
acrnParams = append(acrnParams, strings.Join(deviceParams, ""))
return acrnParams
}
func (config *Config) appendName() {
if config.Name != "" {
config.acrnParams = append(config.acrnParams, config.Name)
}
}
func (config *Config) appendDevices() {
slot := 0
for _, d := range config.Devices {
if !d.Valid() {
continue
}
if slot == acrnGVTgReservedSlot {
slot++ /*Slot 2 is assigned for GVT-g in acrn, so skip 2 */
baselogger.Info("Slot 2 is assigned for GVT-g in acrn, so skipping this slot")
}
config.acrnParams = append(config.acrnParams, d.AcrnParams(slot, config)...)
slot++
}
}
func (config *Config) appendACPI() {
if config.ACPIVirt {
config.acrnParams = append(config.acrnParams, "-A")
}
}
func (config *Config) appendMemory() {
if config.Memory.Size != "" {
config.acrnParams = append(config.acrnParams, "-m")
config.acrnParams = append(config.acrnParams, config.Memory.Size)
}
}
func (config *Config) appendCPUAffinity() {
if config.ApicID == "" {
return
}
config.acrnParams = append(config.acrnParams, "--cpu_affinity")
config.acrnParams = append(config.acrnParams, config.ApicID)
}
func (config *Config) appendKernel() {
if config.Kernel.Path == "" {
return
}
config.acrnParams = append(config.acrnParams, "-k")
config.acrnParams = append(config.acrnParams, config.Kernel.Path)
if config.Kernel.Params == "" {
return
}
config.acrnParams = append(config.acrnParams, "-B")
config.acrnParams = append(config.acrnParams, config.Kernel.Params)
}
// LaunchAcrn can be used to launch a new acrn instance.
//
// The Config parameter contains a set of acrn parameters and settings.
//
// This function writes its log output via logger parameter.
func LaunchAcrn(config Config, logger *logrus.Entry) (int, string, error) {
baselogger = logger
config.appendACPI()
config.appendMemory()
config.appendDevices()
config.appendCPUAffinity()
config.appendKernel()
config.appendName()
return LaunchCustomAcrn(context.Background(), config.Path, config.acrnParams, logger)
}
// LaunchCustomAcrn can be used to launch a new acrn instance.
//
// The path parameter is used to pass the acrn executable path.
//
// params is a slice of options to pass to acrn-dm
//
// This function writes its log output via logger parameter.
func LaunchCustomAcrn(ctx context.Context, path string, params []string,
logger *logrus.Entry) (int, string, error) {
errStr := ""
if path == "" {
path = "acrn-dm"
}
/* #nosec */
cmd := exec.CommandContext(ctx, path, params...)
var stderr bytes.Buffer
cmd.Stderr = &stderr
logger.WithFields(logrus.Fields{
"Path": path,
"Params": params,
}).Info("launching acrn with:")
err := cmd.Start()
if err != nil {
logger.Errorf("Unable to launch %s: %v", path, err)
errStr = stderr.String()
logger.Errorf("%s", errStr)
}
return cmd.Process.Pid, errStr, err
}
func (a *acrnArchBase) appendImage(devices []Device, path string) ([]Device, error) {
if _, err := os.Stat(path); os.IsNotExist(err) {
return nil, err
}
ImgBlkdevice := BlockDevice{
FilePath: path,
Index: 0,
}
devices = append(devices, ImgBlkdevice)
return devices, nil
}
// appendBridges appends to devices the given bridges
func (a *acrnArchBase) appendBridges(devices []Device) []Device {
devices = append(devices,
BridgeDevice{
Function: 0,
Emul: acrnHostBridge,
Config: "",
},
)
return devices
}
// appendLPC appends an LPC device to devices
func (a *acrnArchBase) appendLPC(devices []Device) []Device {
devices = append(devices,
LPCDevice{
Function: 0,
Emul: acrnLPCDev,
},
)
return devices
}
func (a *acrnArchBase) appendConsole(devices []Device, path string) []Device {
console := ConsoleDevice{
Name: "console0",
Backend: Socket,
PortType: ConsoleBE,
Path: path,
}
devices = append(devices, console)
return devices
}
func (a *acrnArchBase) appendSocket(devices []Device, socket types.Socket) []Device {
serialSocket := ConsoleDevice{
Name: socket.Name,
Backend: Socket,
PortType: SerialBE,
Path: socket.HostPath,
}
devices = append(devices, serialSocket)
return devices
}
func (a *acrnArchBase) appendVSock(devices []Device, vsock types.VSock) []Device {
vmsock := VSOCKDevice{
ContextID: vsock.ContextID,
}
devices = append(devices, vmsock)
return devices
}
func networkModelToAcrnType(model NetInterworkingModel) NetDeviceType {
switch model {
case NetXConnectMacVtapModel:
return MACVTAP
default:
//TAP should work for most other cases
return TAP
}
}
func (a *acrnArchBase) appendNetwork(devices []Device, endpoint Endpoint) []Device {
switch ep := endpoint.(type) {
case *VethEndpoint:
netPair := ep.NetworkPair()
devices = append(devices,
NetDevice{
Type: networkModelToAcrnType(netPair.NetInterworkingModel),
IFName: netPair.TAPIface.Name,
MACAddress: netPair.TAPIface.HardAddr,
},
)
case *MacvtapEndpoint:
devices = append(devices,
NetDevice{
Type: MACVTAP,
IFName: ep.Name(),
MACAddress: ep.HardwareAddr(),
},
)
default:
// Return devices as is for unsupported endpoint.
baselogger.WithField("Endpoint", endpoint).Error("Unsupported N/W Endpoint")
}
return devices
}
func (a *acrnArchBase) appendBlockDevice(devices []Device, drive config.BlockDrive) []Device {
if drive.File == "" {
return devices
}
devices = append(devices,
BlockDevice{
FilePath: drive.File,
Index: drive.Index,
},
)
return devices
}
func (a *acrnArchBase) handleImagePath(config HypervisorConfig) error {
if config.ImagePath != "" {
a.kernelParams = append(a.kernelParams, acrnKernelRootParams...)
a.kernelParamsNonDebug = append(a.kernelParamsNonDebug, acrnKernelParamsSystemdNonDebug...)
a.kernelParamsDebug = append(a.kernelParamsDebug, acrnKernelParamsSystemdDebug...)
}
return nil
}
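The slot bookkeeping that `appendDevices` performs — walking the device list, assigning PCI slots in order, and skipping `acrnGVTgReservedSlot` — can be sketched on its own. A minimal, self-contained version (hypothetical names, not the runtime's `Config` type):

```go
package main

import "fmt"

const gvtgReservedSlot = 2 // slot 2 is reserved for GVT-g in acrn

// assignSlots mirrors the slot-allocation loop in appendDevices:
// each valid device gets the next free slot, and the GVT-g slot
// is skipped. The returned map plays the role of AcrnBlkdDevSlot.
func assignSlots(indices []int) map[int]int {
	slotFor := make(map[int]int, len(indices))
	slot := 0
	for _, idx := range indices {
		if slot == gvtgReservedSlot {
			slot++ // skip the reserved GVT-g slot
		}
		slotFor[idx] = slot
		slot++
	}
	return slotFor
}

func main() {
	m := assignSlots([]int{0, 1, 2, 3})
	fmt.Println(m[0], m[1], m[2], m[3]) // 0 1 3 4
}
```

Note how drive index 2 lands in slot 3: the reserved slot is skipped transparently, which is why a separate index-to-slot table is needed at all.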


@@ -1,305 +0,0 @@
// Copyright (c) 2019 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
//go:build linux
// +build linux
package virtcontainers
import (
"fmt"
"net"
"os"
"path/filepath"
"testing"
"github.com/kata-containers/kata-containers/src/runtime/pkg/device/config"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/persist/fs"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/types"
"github.com/stretchr/testify/assert"
)
const (
acrnArchBaseAcrnPath = "/usr/bin/acrn"
acrnArchBaseAcrnCtlPath = "/usr/bin/acrnctl"
)
var acrnArchBaseKernelParamsNonDebug = []Param{
{"quiet", ""},
}
var acrnArchBaseKernelParamsDebug = []Param{
{"debug", ""},
}
var acrnArchBaseKernelParams = []Param{
{"root", "/dev/vda"},
}
func newAcrnArchBase() *acrnArchBase {
return &acrnArchBase{
path: acrnArchBaseAcrnPath,
ctlpath: acrnArchBaseAcrnCtlPath,
kernelParamsNonDebug: acrnArchBaseKernelParamsNonDebug,
kernelParamsDebug: acrnArchBaseKernelParamsDebug,
kernelParams: acrnArchBaseKernelParams,
}
}
func TestAcrnArchBaseAcrnPaths(t *testing.T) {
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
p, err := acrnArchBase.acrnPath()
assert.NoError(err)
assert.Equal(p, acrnArchBaseAcrnPath)
ctlp, err := acrnArchBase.acrnctlPath()
assert.NoError(err)
assert.Equal(ctlp, acrnArchBaseAcrnCtlPath)
}
func TestAcrnArchBaseKernelParameters(t *testing.T) {
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
// with debug params
expectedParams := acrnArchBaseKernelParams
debugParams := acrnArchBaseKernelParamsDebug
expectedParams = append(expectedParams, debugParams...)
p := acrnArchBase.kernelParameters(true)
assert.Equal(expectedParams, p)
// with non-debug params
expectedParams = acrnArchBaseKernelParams
nonDebugParams := acrnArchBaseKernelParamsNonDebug
expectedParams = append(expectedParams, nonDebugParams...)
p = acrnArchBase.kernelParameters(false)
assert.Equal(expectedParams, p)
}
func TestAcrnArchBaseCapabilities(t *testing.T) {
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
config := HypervisorConfig{}
c := acrnArchBase.capabilities(config)
assert.True(c.IsBlockDeviceSupported())
assert.True(c.IsBlockDeviceHotplugSupported())
assert.False(c.IsFsSharingSupported())
assert.True(c.IsNetworkDeviceHotplugSupported())
}
func TestAcrnArchBaseMemoryTopology(t *testing.T) {
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
mem := uint64(8192)
expectedMemory := Memory{
Size: fmt.Sprintf("%dM", mem),
}
m := acrnArchBase.memoryTopology(mem)
assert.Equal(expectedMemory, m)
}
func TestAcrnArchBaseAppendConsoles(t *testing.T) {
var devices []Device
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
path := filepath.Join(filepath.Join(fs.MockRunStoragePath(), "test"), consoleSocket)
expectedOut := []Device{
ConsoleDevice{
Name: "console0",
Backend: Socket,
PortType: ConsoleBE,
Path: path,
},
}
devices = acrnArchBase.appendConsole(devices, path)
assert.Equal(expectedOut, devices)
}
func TestAcrnArchBaseAppendImage(t *testing.T) {
var devices []Device
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
image, err := os.CreateTemp("", "img")
assert.NoError(err)
defer os.Remove(image.Name())
err = image.Close()
assert.NoError(err)
devices, err = acrnArchBase.appendImage(devices, image.Name())
assert.NoError(err)
assert.Len(devices, 1)
expectedOut := []Device{
BlockDevice{
FilePath: image.Name(),
Index: 0,
},
}
assert.Equal(expectedOut, devices)
}
func TestAcrnArchBaseAppendBridges(t *testing.T) {
function := 0
emul := acrnHostBridge
config := ""
var devices []Device
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
devices = acrnArchBase.appendBridges(devices)
assert.Len(devices, 1)
expectedOut := []Device{
BridgeDevice{
Function: function,
Emul: emul,
Config: config,
},
}
assert.Equal(expectedOut, devices)
}
func TestAcrnArchBaseAppendLpcDevice(t *testing.T) {
function := 0
emul := acrnLPCDev
var devices []Device
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
devices = acrnArchBase.appendLPC(devices)
assert.Len(devices, 1)
expectedOut := []Device{
LPCDevice{
Function: function,
Emul: emul,
},
}
assert.Equal(expectedOut, devices)
}
func testAcrnArchBaseAppend(t *testing.T, structure interface{}, expected []Device) {
var devices []Device
var err error
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
switch s := structure.(type) {
case types.Socket:
devices = acrnArchBase.appendSocket(devices, s)
case config.BlockDrive:
devices = acrnArchBase.appendBlockDevice(devices, s)
}
assert.NoError(err)
assert.Equal(devices, expected)
}
func TestAcrnArchBaseAppendSocket(t *testing.T) {
name := "archserial.test"
hostPath := "/tmp/archserial.sock"
expectedOut := []Device{
ConsoleDevice{
Name: name,
Backend: Socket,
PortType: SerialBE,
Path: hostPath,
},
}
socket := types.Socket{
HostPath: hostPath,
Name: name,
}
testAcrnArchBaseAppend(t, socket, expectedOut)
}
func TestAcrnArchBaseAppendBlockDevice(t *testing.T) {
path := "/tmp/archtest.img"
index := 5
expectedOut := []Device{
BlockDevice{
FilePath: path,
Index: index,
},
}
drive := config.BlockDrive{
File: path,
Index: index,
}
testAcrnArchBaseAppend(t, drive, expectedOut)
}
func TestAcrnArchBaseAppendNetwork(t *testing.T) {
var devices []Device
assert := assert.New(t)
acrnArchBase := newAcrnArchBase()
macAddr := net.HardwareAddr{0x02, 0x00, 0xCA, 0xFE, 0x00, 0x04}
vethEp := &VethEndpoint{
NetPair: NetworkInterfacePair{
TapInterface: TapInterface{
ID: "uniqueTestID0",
Name: "br0_kata",
TAPIface: NetworkInterface{
Name: "tap0_kata",
},
},
VirtIface: NetworkInterface{
Name: "eth0",
HardAddr: macAddr.String(),
},
NetInterworkingModel: DefaultNetInterworkingModel,
},
EndpointType: VethEndpointType,
}
macvtapEp := &MacvtapEndpoint{
EndpointType: MacvtapEndpointType,
EndpointProperties: NetworkInfo{
Iface: NetlinkIface{
Type: "macvtap",
},
},
}
expectedOut := []Device{
NetDevice{
Type: TAP,
IFName: vethEp.NetPair.TAPIface.Name,
MACAddress: vethEp.NetPair.TAPIface.HardAddr,
},
NetDevice{
Type: MACVTAP,
IFName: macvtapEp.Name(),
MACAddress: macvtapEp.HardwareAddr(),
},
}
devices = acrnArchBase.appendNetwork(devices, vethEp)
devices = acrnArchBase.appendNetwork(devices, macvtapEp)
assert.Equal(expectedOut, devices)
}
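`TestAcrnArchBaseKernelParameters` above exercises the rule that the base kernel parameter list is extended with either the debug or the non-debug set. A simplified sketch of that merge (simplified `param` type, not the runtime's `Param` struct; note it copies the base slice to avoid aliasing):

```go
package main

import "fmt"

type param struct{ key, value string }

// kernelParameters returns the base list extended with the debug or
// non-debug parameters, as the acrnArchBase method does.
func kernelParameters(base, debug, nonDebug []param, isDebug bool) []param {
	out := append([]param{}, base...) // copy so callers don't share a backing array
	if isDebug {
		return append(out, debug...)
	}
	return append(out, nonDebug...)
}

func main() {
	base := []param{{"root", "/dev/vda"}}
	dbg := []param{{"debug", ""}}
	quiet := []param{{"quiet", ""}}
	fmt.Println(kernelParameters(base, dbg, quiet, true))
	fmt.Println(kernelParameters(base, dbg, quiet, false))
}
```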


@@ -1,287 +0,0 @@
//go:build linux
// Copyright (c) 2019 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
package virtcontainers
import (
"context"
"fmt"
"path/filepath"
"testing"
"github.com/kata-containers/kata-containers/src/runtime/pkg/device/config"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/persist"
"github.com/kata-containers/kata-containers/src/runtime/virtcontainers/types"
"github.com/stretchr/testify/assert"
)
func newAcrnConfig() HypervisorConfig {
return HypervisorConfig{
KernelPath: testAcrnKernelPath,
ImagePath: testAcrnImagePath,
HypervisorPath: testAcrnPath,
HypervisorCtlPath: testAcrnCtlPath,
NumVCPUsF: defaultVCPUs,
MemorySize: defaultMemSzMiB,
BlockDeviceDriver: config.VirtioBlock,
DefaultBridges: defaultBridges,
DefaultMaxVCPUs: MaxAcrnVCPUs(),
// Adding this here, as hypervisorconfig.valid()
// forcefully adds it even when 9pfs is not supported
Msize9p: defaultMsize9p,
}
}
func testAcrnKernelParameters(t *testing.T, kernelParams []Param, debug bool) {
assert := assert.New(t)
acrnConfig := newAcrnConfig()
acrnConfig.KernelParams = kernelParams
if debug {
acrnConfig.Debug = true
}
a := &Acrn{
config: acrnConfig,
arch: &acrnArchBase{},
}
expected := fmt.Sprintf("panic=1 maxcpus=%d foo=foo bar=bar", a.config.DefaultMaxVCPUs)
params := a.kernelParameters()
assert.Equal(params, expected)
}
func TestAcrnKernelParameters(t *testing.T) {
params := []Param{
{
Key: "foo",
Value: "foo",
},
{
Key: "bar",
Value: "bar",
},
}
testAcrnKernelParameters(t, params, true)
testAcrnKernelParameters(t, params, false)
}
func TestAcrnCapabilities(t *testing.T) {
assert := assert.New(t)
a := &Acrn{
ctx: context.Background(),
arch: &acrnArchBase{},
}
caps := a.Capabilities(a.ctx)
assert.True(caps.IsBlockDeviceSupported())
assert.True(caps.IsBlockDeviceHotplugSupported())
assert.True(caps.IsNetworkDeviceHotplugSupported())
}
func testAcrnAddDevice(t *testing.T, devInfo interface{}, devType DeviceType, expected []Device) {
assert := assert.New(t)
a := &Acrn{
ctx: context.Background(),
arch: &acrnArchBase{},
}
err := a.AddDevice(context.Background(), devInfo, devType)
assert.NoError(err)
assert.Exactly(a.acrnConfig.Devices, expected)
}
func TestAcrnAddDeviceSerialPortDev(t *testing.T) {
name := "serial.test"
hostPath := "/tmp/serial.sock"
expectedOut := []Device{
ConsoleDevice{
Name: name,
Backend: Socket,
PortType: SerialBE,
Path: hostPath,
},
}
socket := types.Socket{
HostPath: hostPath,
Name: name,
}
testAcrnAddDevice(t, socket, SerialPortDev, expectedOut)
}
func TestAcrnAddDeviceBlockDev(t *testing.T) {
path := "/tmp/test.img"
index := 1
expectedOut := []Device{
BlockDevice{
FilePath: path,
Index: index,
},
}
drive := config.BlockDrive{
File: path,
Index: index,
}
testAcrnAddDevice(t, drive, BlockDev, expectedOut)
}
func TestAcrnHotplugUnsupportedDeviceType(t *testing.T) {
assert := assert.New(t)
acrnConfig := newAcrnConfig()
a := &Acrn{
ctx: context.Background(),
id: "acrnTest",
config: acrnConfig,
}
_, err := a.HotplugAddDevice(a.ctx, &MemoryDevice{0, 128, uint64(0), false}, FsDev)
assert.Error(err)
}
func TestAcrnUpdateBlockDeviceInvalidPath(t *testing.T) {
assert := assert.New(t)
path := ""
index := 1
acrnConfig := newAcrnConfig()
a := &Acrn{
ctx: context.Background(),
id: "acrnBlkTest",
config: acrnConfig,
}
drive := &config.BlockDrive{
File: path,
Index: index,
}
err := a.updateBlockDevice(drive)
assert.Error(err)
}
func TestAcrnUpdateBlockDeviceInvalidIdx(t *testing.T) {
assert := assert.New(t)
path := "/tmp/test.img"
index := AcrnBlkDevPoolSz + 1
acrnConfig := newAcrnConfig()
a := &Acrn{
ctx: context.Background(),
id: "acrnBlkTest",
config: acrnConfig,
}
drive := &config.BlockDrive{
File: path,
Index: index,
}
err := a.updateBlockDevice(drive)
assert.Error(err)
}
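The two tests above drive `updateBlockDevice`'s guard clauses: an empty file path and an index outside the block-device pool must each return an error. A hedged standalone sketch of that validation pattern (the pool size, type names, and messages here are local stand-ins, not the runtime's actual implementation):

```go
package main

import "fmt"

// acrnBlkDevPoolSz is a stand-in for the runtime's block-device pool size.
const acrnBlkDevPoolSz = 8

// blockDrive is a local stand-in for config.BlockDrive.
type blockDrive struct {
	File  string
	Index int
}

// updateBlockDevice mirrors the guard clauses the tests exercise:
// the path must be non-empty and the index must fit inside the pool.
func updateBlockDevice(d *blockDrive) error {
	if d.File == "" {
		return fmt.Errorf("block device path cannot be empty")
	}
	if d.Index < 0 || d.Index >= acrnBlkDevPoolSz {
		return fmt.Errorf("block device index %d out of range [0, %d)",
			d.Index, acrnBlkDevPoolSz)
	}
	return nil
}

func main() {
	fmt.Println(updateBlockDevice(&blockDrive{File: "", Index: 1}))
	fmt.Println(updateBlockDevice(&blockDrive{File: "/tmp/test.img", Index: acrnBlkDevPoolSz + 1}))
	fmt.Println(updateBlockDevice(&blockDrive{File: "/tmp/test.img", Index: 1}))
}
```

Validating early and returning a descriptive error is what lets the tests assert failure without ever touching a real hypervisor.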
func TestAcrnGetSandboxConsole(t *testing.T) {
assert := assert.New(t)
store, err := persist.GetDriver()
assert.NoError(err)
a := &Acrn{
ctx: context.Background(),
config: HypervisorConfig{
VMStorePath: store.RunVMStoragePath(),
RunStorePath: store.RunStoragePath(),
},
store: store,
}
sandboxID := "testSandboxID"
expected := filepath.Join(store.RunVMStoragePath(), sandboxID, consoleSocket)
proto, result, err := a.GetVMConsole(a.ctx, sandboxID)
assert.NoError(err)
assert.Equal(result, expected)
assert.Equal(proto, consoleProtoUnix)
}
func TestAcrnCreateVM(t *testing.T) {
assert := assert.New(t)
acrnConfig := newAcrnConfig()
store, err := persist.GetDriver()
assert.NoError(err)
a := &Acrn{
store: store,
config: HypervisorConfig{
VMStorePath: store.RunVMStoragePath(),
RunStorePath: store.RunStoragePath(),
},
}
sandbox := &Sandbox{
ctx: context.Background(),
id: "testSandbox",
config: &SandboxConfig{
HypervisorConfig: acrnConfig,
},
state: types.SandboxState{BlockIndexMap: make(map[int]struct{})},
}
a.sandbox = sandbox
a.state.PID = 1
network, err := NewNetwork()
assert.NoError(err)
err = a.CreateVM(context.Background(), sandbox.id, network, &sandbox.config.HypervisorConfig)
assert.NoError(err)
assert.Exactly(acrnConfig, a.config)
}
func TestAcrnMemoryTopology(t *testing.T) {
mem := uint32(1000)
assert := assert.New(t)
a := &Acrn{
arch: &acrnArchBase{},
config: HypervisorConfig{
MemorySize: mem,
},
}
expectedOut := Memory{
Size: fmt.Sprintf("%dM", mem),
}
memory, err := a.memoryTopology()
assert.NoError(err)
assert.Exactly(memory, expectedOut)
}
func TestAcrnSetConfig(t *testing.T) {
assert := assert.New(t)
config := newAcrnConfig()
a := &Acrn{}
assert.Equal(a.config, HypervisorConfig{})
err := a.setConfig(&config)
assert.NoError(err)
assert.Equal(a.config, config)
}


@@ -102,9 +102,6 @@ const (
// QemuHypervisor is the QEMU hypervisor.
QemuHypervisor HypervisorType = "qemu"
// AcrnHypervisor is the ACRN hypervisor.
AcrnHypervisor HypervisorType = "acrn"
// ClhHypervisor is the Cloud Hypervisor.
ClhHypervisor HypervisorType = "clh"
@@ -173,12 +170,6 @@ type HypervisorConfig struct {
// HypervisorPathList is the list of hypervisor paths names allowed in annotations
HypervisorPathList []string
// HypervisorCtlPathList is the list of hypervisor control paths names allowed in annotations
HypervisorCtlPathList []string
// HypervisorCtlPath is the hypervisor ctl executable host path.
HypervisorCtlPath string
// JailerPath is the jailer executable host path.
JailerPath string


@@ -65,6 +65,13 @@ const (
// IPVlanEndpointType is ipvlan network interface.
IPVlanEndpointType EndpointType = "ipvlan"
// VfioEndpointType is a VFIO device that will be claimed as a network interface
// in the guest VM. Unlike PhysicalEndpointType, which requires a VF network interface
// with its network configured on the host before creating the sandbox, VfioEndpointType
// does not need a host network interface and instead has its network configured
// through DAN.
VfioEndpointType EndpointType = "vfio"
)
// Set sets an endpoint type based on the input string.
@@ -94,6 +101,9 @@ func (endpointType *EndpointType) Set(value string) error {
case "ipvlan":
*endpointType = IPVlanEndpointType
return nil
case "vfio":
*endpointType = VfioEndpointType
return nil
default:
return fmt.Errorf("Unknown endpoint type %s", value)
}
@@ -118,6 +128,8 @@ func (endpointType *EndpointType) String() string {
return string(TuntapEndpointType)
case IPVlanEndpointType:
return string(IPVlanEndpointType)
case VfioEndpointType:
return string(VfioEndpointType)
default:
return ""
}
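The `Set` and `String` hunks above extend the same string-to-type mapping with the new `vfio` variant. A self-contained sketch of that round-trip pattern, with a trimmed-down `EndpointType` standing in for the virtcontainers type (only two variants modelled here):

```go
package main

import (
	"errors"
	"fmt"
)

// EndpointType is a local stand-in for the virtcontainers endpoint type.
type EndpointType string

const (
	IPVlanEndpointType EndpointType = "ipvlan"
	VfioEndpointType   EndpointType = "vfio"
)

// Set maps a raw string onto a known endpoint type, returning an error
// for anything unrecognised, mirroring the switch in the diff above.
func (e *EndpointType) Set(value string) error {
	switch value {
	case "ipvlan":
		*e = IPVlanEndpointType
	case "vfio":
		*e = VfioEndpointType
	default:
		return errors.New("unknown endpoint type " + value)
	}
	return nil
}

func main() {
	var et EndpointType
	if err := et.Set("vfio"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(et)) // vfio
}
```

The corresponding test addition (`TestVfioEndpointTypeSet`) simply feeds `"vfio"` through `Set` and asserts the resulting constant, the same shape as the existing macvtap test.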


@@ -42,6 +42,10 @@ func TestMacvtapEndpointTypeSet(t *testing.T) {
testEndpointTypeSet(t, "macvtap", MacvtapEndpointType)
}
func TestVfioEndpointTypeSet(t *testing.T) {
testEndpointTypeSet(t, "vfio", VfioEndpointType)
}
func TestEndpointTypeSetFailure(t *testing.T) {
var endpointType EndpointType
