We have had those tests broken for months. It's time to get rid of
those.
NOTE that we could easily revert this commit and re-add those tests as
soon as we find someone to maintain and be responsible for such
integration.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Previous set for the Mount.type with `bind` is wrong, and for local
storage, the type of Mount should be `local`.
This commit aims to correct the type with "local".
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
As the disable_guest_empty_dir order is wrong which causes
the bool value is not correct and it got a wrong result.
This commit aims to correct the parameters order.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The test case designed to verify policy failures due to an "unexpected
capability" was misconfigured. It was using "CAP_SYS_CHROOT" as the
unexpected capability to be added.
This configuration was flawed for two main reasons:
1.Incorrect Syntax: Kubernetes Pod specs expect capability names without
the "CAP_" prefix (e.g., "SYS_CHROOT", not "CAP_SYS_CHROOT").
This made the test case's premise incorrect from a K8s API perspective.
2.Part of Default Set: "SYS_CHROOT" is already included in the
`default_caps` list for a standard container. Therefore, adding it would
not trigger a policy violation, defeating the purpose of the
"unexpected capability" test.
Furthermore, a related issue was observed where a malformed capability
like "CAP_CAP_SYS_CHROOT" was being generated, causing parsing failures
in the `oci-spec-rs` library. This was a symptom of incorrect string
manipulation when handling capabilities.
This commit corrects the test by selecting "SYS_NICE" as the unexpected
capability. "SYS_NICE" is a more suitable choice because:
- It is a valid Linux capability.
- It is relatively harmless.
- It is **not** part of the default capability set defined in
`genpolicy-settings.json`.
By using "SYS_NICE", the test now accurately simulates a scenario where
a Pod requests a legitimate but non-default capability, which the policy
(generated from a baseline Pod without this capability) should correctly
reject. This change fixes the test's logic and also resolves the
downstream `oci-spec-rs` parsing error by ensuring only valid capability
names are processed.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Detected a format mismatch in OCI Spec Capabilities fields between
`runtime-rs` (no `CAP_` prefix) and `runtime-go` (with `CAP_` prefix).
This introduces a normalization of caps in match_caps(p_caps, i_caps).
This ensures robust and consistent processing of Capabilities regardless
of whether the OCI Spec originates from `runtime-rs` or `runtime-go`.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Currently, the initdata module only detects virtio-blk devices
(/dev/vd*) when searching for the initdata block device. However,
when using virtio-scsi, the devices appear as /dev/sd* in the
guest, causing the initdata detection to fail.
This commit extends the device detection logic to support both
device types:
- virtio-blk devices: /dev/vda, /dev/vdb, etc.
- virtio-scsi devices: /dev/sda, /dev/sdb, etc.
This commits aims to address issue of theinitdata device not being
found when using virtio-scsi
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Thankfully there's only one piece that's still SNP specific (for the
supported TEEs). Let's adjust it so we can have an easy and smooth
execution when adding a TDX CI machine.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
There are several changes needed in order to get this test working with
CC, and yet we still are skipping it.
Basically, we need to:
* Pull an authenticated image inside the guest, which requires:
* Using Trustee to release the credential
* We still depend on a PR to be merged on Trustee side
* https://github.com/confidential-containers/trustee/pull/1035
* We still depend on a Trustee bump (including the PR above) on our
side
Apart from those changes, I ended up "duplicating" the tests by adding a
"-tee" version of those, which already have:
* The proper kbs annotations set up
* Dropped host mounts
* Increases the memory needed
Last but not least, as "bats" probably means "being a terrible script",
I had to re-arrange a few things otherwise the tests would not even run
due to bats-isms that I am sincerely not able to pin-point.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We added the tests using virtio-9p as we knew it'd require incremental
changes to be able to use any kind of guest-pull method.
Now, as in the coming commits we'll be actually ensuring that guest-pull
works and is in use, we can enforce the experimental_force_guest_pull
usage for the nvidia cases.
Note: We're using experimental_force_guest_pull instead of
nydus-snapshotter due to stability concerns with the snapshotter.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
It was just missed when adding those configurations.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
It takes either a shim name or "", but we were treating this (thankfully
only in this specific file) as a boolean.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Adjust output to the setup_file and teardown_file behavior.
With this, we will be able to observe relevant logging rather than
adding to the output variable.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Previous commit enabled getting the physical address reduction from
processor but just stored it for later use. This commit adds handling
of the value to ProtectionDevice and enables the QEMU driver to use it.
Signed-off-by: Pavel Mores <pmores@redhat.com>
An implementation of cbitpos acquisition is supplied that was missing
so far. We also get the physical address reduction value from the same
source (CPUID Fn8000_001f function). This has been hardcoded at 1 so far,
following the Go runtime example, but it's better to get it from the
processor.
Signed-off-by: Pavel Mores <pmores@redhat.com>
- version.rs gets generated from version.rs.in
- version.rs.in contains values read from VERSION
- so version.rs (and maybe other Agent files too) must be
re-generated when the VERSION file changes
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
The new image reference has changed to mcr.microsoft.com/oss/v2/kubernetes/pause:3.6
from mcr.microsoft.com/oss/kubernetes/pause:3.6.
The new image uses by default UID=0, GID=0 while the older. The older image had:
UID=65535, GID=65535.
There is a new pause_container_id_policy field in genpolicy-settings.json, informing
genpolicy about the way AdditionalGids gets updated - "v1" for the older behavior
and "v2" for the newer AKS version:
- When using v1, the default value of AdditionalGids is {65535}.
- When using v2, the default value of AdditionalGids is {}.
UID=65535 and GID=65535 are still hard-coded by default in genpolicy-settings.json.
We might be able to remove/ignore these fields in the future, if we'll stop relying
on policy::KataSpec::get_process_fields to use these fields.
A new CI function adapt_common_policy_settings_for_aks() changes the pause container
UID, GID, pause_container_id_policy, and image ref settings values when testing on
AKS Hosts - i.e., when testing coco-dev or mariner Hosts.
The genpolicy workarounds for the unexpected behavior with guest pull enabled have
been improved to use the current container's GID instead of hard-coding GID=0 as the
guest pull default. Also, AdditionalGids gets updated when the current container's GID is
changing, instead of always changing the AdditionalGids at the very end of
policy::AgentPolicy::get_container_process(), when the relevant evolution of the GID
value was no longer available.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Make it easier to understand the source of the UID/GID/AdditionalGids
values from the container in the auto-generated policy.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Parallelize busybox builds to build a bit faster and create the
build directory prior to Docker execution, which on my
environment, helps with permission issues when building busybox
without the kata-containers/build directory existing beforehand.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Correct the hardcoded value of disable_guest_empty_dir, instead,
we use the real value of it which comes from the configuration.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
A sandbox annotation that determines if it should create Kubernetes
emptyDir mounts on the guest filesystem.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
It acts as if it should create Kubernetes emptyDir mounts on the
guest filesystem. If enabled, the runtime will not create Kubernetes
emptyDir mounts on the guest filesystem.Instead, emptyDir mounts will
be created on the host and shared via virtio-fs.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Let's ensure Trustee is deployed as some of the tests rely images that
live behind authentication. /o\
The approach taken here to deploy Trustee is exactly the same one taken
on the other CoCo tests, apart from an env var passed to ensure we're
using the NVIDIA remote verifier (which will be in handy very very
soon).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This isn't really related to remote hypervisor though it was useful for
its debugging. It's a small helper I've been using regularly during
development for quite some time that I think might be useful more broadly.
Signed-off-by: Pavel Mores <pmores@redhat.com>
The remote hypervisor launches no VM, it just instructs the Cloud API
Adaptor to do so, therefore it has no need for an image or initrd to boot
from and should be exempt from the mandate for one or the other to be
specified.
Signed-off-by: Pavel Mores <pmores@redhat.com>
The go runtime's .proto file - which is also used by the Cloud API
Adaptor - puts the Hypervisor service into the "hypervisor" package.
runtime-rs has to do the same to avoid an "unimplemented" error.
Signed-off-by: Pavel Mores <pmores@redhat.com>
With the change made to the matrix when the CC GPU runner was added,
there was a change in the job name (@sprt saw that coming, but I
didn't).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Same deal as the previous commut, just enabling the tests here, with the
same list of improvements that we will need to go through in order to
get is working in a perfect way.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
While the primary goal of this change is to detect regressions to the
NVIDIA SNP GPU scenario, various improvements to reflect a more
realistic CC setting are planned in subsequent changes, such as:
* moving away from the overlayfs snapshotter
* disabling filesystem sharing
* applying a pod security policy
* activating the GPUs only after attestation
* using a refined approach for GPU cold-plugging without requiring
annotations
* revisiting pod timeout and overhead parameters (the podOverhead value
was increased due to CUDA vectorAdd requiring about 6Gi of
podOverhead, as well as the inference and embedqa requiring at least
12Gi, respectively, 14Gi of podOverhead to run without invoking the
host's oom-killer. We will revisit this aspect after addressing
points 1. and 2.)
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
For the nvidia-gpu-snp and nvidia-gpu-tdx we must set containerd to
allow the CDI annotation to be passed to down.
This solution may become obsolete soon enough, but the cleanest way to
have it properly working is by adding it here (even if we remove it
before the next release).
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
It's been noticed that as more RAM is needed to run the CC tests, we
also need to update the podOverhead of the NVIDIA CC runtime classes to
avoid getting OOM Killed.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Since the nerdctl's network hook would call pselect6 syscall
by xtables-nft-multi, thus we'd better add it to the seccomp's
whitelist.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>