Add GPU annotations for remote hypervisor to help
with the right instance selection based on number of GPUs
and model
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
We may decide to add this later on, but for now this is only targetting
TEEs and the confidential image / initrd.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Return of proper error to the initiator is not guaranteed.
Method StopVM could kill shim process together with VM pieces.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Corrects device filemode permissions typo/regression in rustjail to `666` instead of `066`.
`666` is the standard and expected value for these devices in containers.
Fixes: #10454
Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>
When QEMU is terminated by signal 15, it deletes the PidFile.
Upon detecting that QEMU has exited, the shim executes the stopVM function.
If the PidFile is not found, the PID is set to 0.
Subsequently, the shim executes `kill -9 0`, which terminates the current process group.
This prevents any further logic from being executed, resulting in resources not being cleaned up.
Signed-off-by: wangyaqi54 <wangyaqi54@jd.com>
Semantics are lifted straight out of the go runtime for compatibility.
We introduce DeviceVirtioScsi to represent a virtio-scsi device and
instantiate it if block device driver in the configuration file is set
to virtio-scsi. We also introduce ObjectIoThread which is instantiated
if the configuration file additionally enables iothreads.
Signed-off-by: Pavel Mores <pmores@redhat.com>
When do update_container_namespaces updating namespaces, setting
all UTS(and IPC) namespace paths to None resulted in hostnames
set prior to the update becoming ineffective. This was primarily
due to an error made while aligning with the oci spec: in an attempt
to match empty strings with None values in oci-spec-rs, all paths
were incorrectly set to None.
Fixes#10325
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Use rstest for unit test rather than TestData arrays where
possible to make the code more compact, easier to read
and open the possibility to enhance test cases with a
description more easily.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
While Kubernetes defines `binaryData` as `[]byte`,
when defined in a YAML file the raw bytes are
base64 encoded. Therefore, we need to read the YAML
value as `String` and not as `Vec<u8>`.
Fixes: #10410
Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
The root cause is that the CDH client is a global variable, and unit tests `test_unseal_env` and `test_unseal_file`
share this lock-free global variable, leading to resource contention and destruction.
Merging the two unit tests into one test_sealed_secret will resolve this issue.
Fixes: #10403
Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>
This reverts commit 31e09058af, as it's
breaking the agent unit tests CI.
This is a stop gap till Chengyu Zhu finds the time to properly address
the issue, avoiding the CI to be blocked for now.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Refactor CDH-related operations into the cdh_handler function to make the `create_container` code clearer.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Users must set the mount path to `/sealed/<path>` for kata agent to detect the sealed secret mount
and handle it in createcontainer stage.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Introduced `unseal_file` function to unseal secret as files:
- Implemented logic to handle symlinks and regular files within the sealed secret directory.
- For each entry, call CDH to unseal secrets and the unsealed contents are written to a new file, and a symlink is created to replace the sealed symlink.
Fixes: #8123
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
add_device() now checks if QEMU is running already by checking if we have
a QMP connection. If we do a new function hotplug_device() is called
which hotplugs the device if it's a network one.
Signed-off-by: Pavel Mores <pmores@redhat.com>
With the helpers from previous commit, the actual hotplugging
implementation, though lengthy, is mostly just assembling a QMP command
to hotplug the network device backend and then doing the same for the
corresponding frontend.
Note that hotplug_network_device() takes cmdline_generator types Netdev
and DeviceVirtioNet. This is intentional and aims to take advantage of
the similarity between parameter sets needed to coldplug and hotplug
devices reuse and simplify our code. To enable using the types from qmp,
accessors were added as needed.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Before adding network device hotplugging functionality itself we add
a couple of helpers in a separate commit since their functionality is
non-trivial.
To hotplug a device we need a free PCI slot. We add find_free_slot()
which can be called to obtain one. It looks for PCI bridges connected
to the root bridge and looks for an unoccupied slot on each of them. The
first found is returned to the caller. The algorithm explicitly doesn't
support any more complex bridge hierarchies since those are never produced
when coldplugging PCI bridges.
Sending netdev queue and vhost file descriptors to QEMU is slightly
involved and implemented in pass_fd(). The actual socket has to be passed
in an SCM_RIGHTS socket control message (also called ancillary data, see
man 3 cmsg) so we have to use the msghdr structure and sendmsg() call
(see man 2 sendmsg) to send the message. Since qapi-rs doesn't support
sending messages with ancillary data we have to do the sending sort of
"under it", manually, by retrieving qapi-rs's socket and using it directly.
Signed-off-by: Pavel Mores <pmores@redhat.com>
NetworkConfig::index has been used to generate an id for a network device
backend. However, it turns out that it's not unique (it's always zero
as confirmed by a comment at its definition) so it's not suitable to
generate an id that needs to be unique.
Use the host device name instead.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Network device hotplugging will use the same infrastructure (Netdev,
DeviceVirtioNet) as coldplugging, i.e. QemuCmdLine. To make the code
of network device setup visible outside of QemuCmdLine we factor it out
to a non-member function `get_network_device()` and make QemuCmdLine just
delegate to it.
Signed-off-by: Pavel Mores <pmores@redhat.com>
The function takes a whole QemuCmdLine but only actually uses
HypervisorConfig. We increase callability of the function by limiting
its interface to what it needs. This will come handy shortly.
Signed-off-by: Pavel Mores <pmores@redhat.com>
At least one PCI bridge is necessary to hotplug PCI devices. We only
support PCI (at this point at least) since that's what the go runtime
does (note that looking at the code in virtcontainers it might seem that
other bus types are supported, however when the bridge objects are passed
to govmm, all but PCI bridges are actually ignored). The entire logic of
bridge setup is lifted from runtime-go for compatibility's sake.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Inorder to support sandbox api, intorduce the sandbox_config
struct and split the sandbox start stage from init process.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Docker cannot exit normally after the container process exits when
used with runtime-rs since it doesn't receive the exit event. This
commit enable runtime-rs to send TaskExit to containerd after process
exits.
Also, it moves "system_time_into" and "option_system_time_into" from
crates/runtimes/common/src/types/trans_into_shim.rs to a new utility
mod.
Signed-off-by: Sicheng Liu <lsc2001@outlook.com>
When the sandbox api was enabled, the pasue container
wouldn't be created, thus the shared sandbox pidns
should be fallbacked to the first container's init process,
instead of return any error here.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
When using network adapters that support SR-IOV, a VFIO device can be
plugged into a guest VM and claimed as a network interface. This can
significantly enhance network performance.
Fixes: #9758
Signed-off-by: Lei Huang <leih@nvidia.com>
rename the task_service to service, in order to
incopperate with the following added sandbox
services.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
In order to make different from sandbox request/response, this commit
changed the task request/response to TaskRequest/TaskResponse.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Since the wait_vm would be called before calling stop_vm,
which would take the reader lock, thus blocking the stop_vm
getting the writer lock, which would trigge the dead lock.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Since the block_on would block on the current thread
which would prevent other async tasks to be run on this
worker thread, thus change it to use the async task for
this task.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
This PR introduces support for selectively compiling Dragonball in
runtime-rs. By default, Dragonball will continue to be compiled into
the containerd-shim-kata-v2 executable, but users now have the option
to disable Dragonball compilation.
Fixes#10310
Signed-off-by: sidney chang <2190206983@qq.com>
Add cdi devices including ContainerDevice definition and
annotation_container_device method to annotate vfio device
in OCI Spec annotations which is inserted into Guest with
its mapping of vendor-class and guest pci path.
Fixes#10145
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
We need vfio device's properties device, vendor and
class, but we can only get property device and vendor.
just extend it with class is ok.
Fixes#10145
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
As we don't have any CI, nor maintainer to keep ACRN code around, we
better have it removed than give users the expectation that it should or
would work at some point.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This change technically affects the path for enabled guest selinux as well,
however since this is not implemented in runtime-rs anyway nothing should
break. When guest selinux support is added this change will come handy.
Signed-off-by: Pavel Mores <pmores@redhat.com>
If guest selinux is off the runtime has to ensure that container OCI spec
contains no selinux labels for the container rootfs and process. Failure
to do so causes kata agent to try and apply the labels which fails since
selinux is not enabled in guest, which in turn causes container launch
to fail.
This is largely inspired by golang runtime(*) with a slight deviation
in ordering of checks. This change simply checks the disable_guest_selinux
config setting and if it's true it clears both rootfs and process label if
necessary. Golang runtime, on the other hand, seems to first check if
process label is non-empty and only then it checks the config setting,
meaning that if process label is empty the rootfs label is not reset
even if it's non-empty. Frankly, this looks like a potential bug though
probably unlikely to manifest since it can be assumed that the labels are
either both empty, or both non-empty.
(*) 4fd4b02f2e/src/runtime/virtcontainers/kata_agent.go (L1005)
Signed-off-by: Pavel Mores <pmores@redhat.com>
In order to handle the setting we have to first parse it and make its
value available to the rest of the program.
The yes() function is added to comply with serde which seems to insist
on default values being returned from functions. Long term, this is
surely not the best place for this function to live, however given that
this is currently the first and only place where it's used it seems
appropriate to put it near its use. If it ends up being reused elsewhere
a better place will surely emerge.
Signed-off-by: Pavel Mores <pmores@redhat.com>
kata-shim was not reporting `inactive_file` in memory stat.
This memory is deducted by containerd when calculating the size of container working set, as it can be paged out by the operating
system under memory pressure. Without reporting `inactive_file`, containerd will over report container memory usage.
[Here](https://github.com/containerd/containerd/blob/v1.7.22/pkg/cri/server/container_stats_list_linux.go#L117) is where containerd
deducts `inactive_file` from memory usage.
Note that kata-shim correctly reports `total_inactive_file` for cgroup v1, but this was not implemented for cgroup v2.
This commit:
- Adds code in kata-shim to report "inactive_file" memory for cgroup v2
- Implements reporting of all available cgroup v2 memory stats to containerd
- Uses defensive coding to avoid assuming existence of any memory.stat fields
The list of available cgroup v2 memory stats defined by containerd can be found
[here](https://pkg.go.dev/github.com/containerd/cgroups/v2/stats#MemoryStat).
Fixes#10280
Signed-off-by: Alex Man <alexman@stripe.com>
This patch adds support to call kata agents SetPolicy
API. Also adds tests for SetPolicy API using agent-ctl.
Fixes#9711
Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
agent built with policy feature initializes the policy engine using a
policy document from a default path, which is installed & linked during
UVM rootfs build. This commit adds support to provide a default agent
policy as environment variable.
This targets development/testing scenarios where kata-agent
is wanted to be started as a local process.
Fixes#10301
Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
Add two parameters for enabling cosign signature image verification.
- `enable_signature_verification`: to activate signature verification
- `image_policy`: URI of the image policy
config
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
new version of the anyhow crate has changed the backtrace capture thus
unit tests of kata-agent that compares a raised error with an expected
one would fail. To fix this, we need only panics to have backtraces,
thus set `RUST_BACKTRACE=1` and `RUST_LIB_BACKTRACE=0` for tests due to
document
https://docs.rs/anyhow/latest/anyhow/
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
It's a prerequisite PR to make built-in vmm dragonball compilation
options configurable.
Extract TAP device-related code from dragonball's dbs_utils into a
separate library within the runtime-rs hypervisor module.
To enhance functionality and reduce dependencies, the extracted code
has been reimplemented using the libc crate and the ifreq structure.
Fixes#10182
Signed-off-by: sidney chang <2190206983@qq.com>
Remove the recently added default UID/GID values, because the genpolicy
design is to initialize those fields before this new code path gets
executed.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Some container images are configured such that the user (and group)
under which their entrypoint should run is not a number (or pair of
numbers), but a user name.
For example, in a Dockerfile, one might write:
> USER 185
indicating that the entrypoint should run under UID=185.
Some images, however, might have:
> RUN groupadd --system --gid=185 spark
> RUN useradd --system --uid=185 --gid=spark spark
> ...
> USER spark
indicating that the UID:GID pair should be resolved at runtime via
/etc/passwd.
To handle such images correctly, read through all /etc/passwd files in
all layers, find the latest version of it (i.e., the top-most layer with
such a file), and, in so doing, ensure that whiteouts of this file are
respected (i.e., if one layer adds the file and some subsequent layer
removes it, don't use it).
Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>
Disabling the UID Policy rule was a workaround for #9928. Re-enable
that rule here and add a new test/CI temporary workaround for this
issue. This new test workaround will be removed after fixing #9928.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
kata-agent incorrectly reports CPU time for cgroup v2, causing 1000x underreporting.
For cgroup v2, kata-agent reads the cpu.stat file, which reports the time consumed by the processes in the cgroup in µs.
However, there was a bug in kata-agent where it returned this value in µs without converting it to ns.
This commit adds the necessary µs to ns conversion for cgroup v2, aligning it with v1 behavior and kata-shim's expectations.
This fixes#10278
Signed-off-by: Alex Man <alexman@stripe.com>
ITA stands for Intel Trust Authority, which is in the process to being
renamed to ITTS (Intel Tiber Trust Services).
As we've bumped guest-components on trustee, let's make sure we also
bump image-rs to the commit that brings ITA support in:
* 1db6c3a876
The reason we need to bump the dependency here is to avoid kbs_protocol
mismatch between the version used by the agent and the trustee one.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
In the existing implementation for the CopyFile subcommand,
- cmd line argument list is too long, including various metadata information.
- in case of a regular file, passing the actual data as bytes stream adds to the size and complexity of the input.
- the copy request will fail when the file size exceeds that of the allowed ttrpc max data length limit of 4Mb.
This change refactors the CopyFile handler and modifies the input to a known 'source' 'destination' syntax.
Fixes#9708
Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
It will panic when users do GPU vfio passthrough with cdi in runtime.
The root cause is that CustomSpec.Annotations is nil when new element
added.
To address this issue, initialization is introduced when it's nil.
Fixes#10266
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Move device code with nvdimm driver to nvdimm_device_handler, including
nvdimm device and pmem device.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Removed the `StorageHandlerManager` struct and its associated implementations and
introduced a type alias `StorageHandlerManager` for `HandlerManager` to simplify the code.
The new type alias maintains the same functionality while reducing redundancy.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Introduced `HandlerManager` struct to manage registered handlers, which will be used to storage and device management for kata-agent.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
I know right now we're always passing a value for that, but this doesn't
really have to be set unless attestation is used. Thus, let's also omit
it in case it's empty.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This is a quick and simple pre-req for supporting initData, which will
take advantage of the mrconfigid in the TDX case.
While already adding mrconfigid, which is hardcoded empty right now,
let's do the same for mrowner and mrownerconfig, and leave it prepared
for future expansions.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
The reason we're relying on yet another function to do so is because the
TDX object will be used in its qom / qapi json format.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This error is specific to SNP platforms, so let's make sure we only
error this out when an SNP platform is used.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Under normal circumstances, the virtual machine only requests memory
from the host and does not actively release it back to host when it is
no longer needed, leading to a waste of memory resources.
Free page reporting is a sub-feature of virtio-balloon. When this
feature is enabled, the Linux guest kernel will send information about
released pages to dragonball via virtio-balloon, and dragonball will
then release these pages.
This commit adds an option enable_balloon_f_reporting to runtime-rs.
When this option is enabled, runtime-rs will insert a virtio-balloon
device with the f_reporting option enabled during the Dragonball virtual
machine startup.
Signed-off-by: Hui Zhu <teawater@antgroup.com>
...or by using a binary with additional suffix.
This allows having multiple versions of nydus-overlayfs installed on the
host, telling nydus-snapshotter which one to use while still detecting
Nydus is used.
Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
Move `unseal_env` and `secure_mount` functions on the global `CDH_CLIENT` instance to access the CDH client.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Introduced a global `CDH_CLIENT` instance to hold the cdh client and
implemented `init_cdh_client` function to initialize the cdh client if not already set.
Fixes: #10231
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
- Added `driver_types` method to `StorageHandler` trait to return driver
types managed by each handler.
- Implemented driver_types method for all storage handlers.
- Updated `STORAGE_HANDLERS` initialization to use `driver_types` for
handler registration.
Fixes: #10242
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
- Updated the `add_handler` function in `StorageHandlerManager` to accept a slice of IDs (`&[&str]`) instead of a single ID (`&str`).
This change allows a single handler to be registered for multiple storage device types.
- Refactored calls to `add_handler` in `Storage` of kata-agent to use the new function, passing arrays of storage drivers instead of single driver.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
This commit includes a fix for pulling an image on platforms that do not
support xattr.
Some platforms/file-systems do not support xattrs, this would make the
image pull fail because of failing to set xattr. This commit will check
whether the target path supports xattr. If yes, the unpacking will
maintain xattrs; if not, it will not set xattrs.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
slog's is_enabled() is documented as:
- "best effort", and
- Sometime resulting in false positives.
Use AGENT_CONFIG.log_level.as_usize() instead, to avoid those false
positives.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
PhysicalEndpoint unbinds its VF interface and rebinds it as a VFIO device,
then cold-plugs the VFIO device into the guest kernel.
When `cold_plug_vfio` is set to "no-port", cold-plugging the VFIO device
will fail.
This change checks if `cold_plug_vfio` is enabled before creating PhysicalEndpoint
to avoid unnecessary VFIO rebind operations.
Fixes: #10162
Signed-off-by: Lei Huang <leih@nvidia.com>
This patch re-generates the client code for Cloud Hypervisor v41.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.
Fixes: #10203
Signed-off-by: Bo Chen <chen.bo@intel.com>
This commit allows `cdh_api_timeout` to be configured from the configuration file.
The configuration is commented out with specifying a default value (50s) because
the default value is configured in the agent.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This commit updates CDH_API_TIMEOUT to use AGENT_CONFIG.cdh_api_timeout
and changes it from a `const` to `lazy_static` to accommodate runtime-determined values.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
To make the `cdh_api_timeout` variable configurable, it has been added to
the `AgentConfig` structure.
This change includes storing the variable as a `time::Duration` type and
generalizing the existing `hotplug_timeout` code to handle both timeouts.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
* genpolicy: deny UpdateEphemeralMountsRequest
Deny UpdateEphemeralMountsRequest by default, because paths to
critical Guest components can be redirected using such request.
Signed-off-by: Dan Mihai <Daniel.Mihai@microsoft.com>
Add bind mounts for volumes defined by docker container images, unless
those mounts have been defined in the input K8s YAML file too.
For example, quay.io/opstree/redis defines two mounts:
/data
/node-conf
Before these changes, if these mounts were not defined in the YAML file
too, the auto-generated policy did not allow this container image to
start.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Improved error handling to provide clearer feedback on request failures.
For example:
Improve createcontainer request timeout error message from
"Error: failed to create containerd task: failed to create shim task:context deadline exceed"
to "Error: failed to create containerd task: failed to create shim task: CreateContainerRequest timed out: context deadline exceed".
Fixes: #10173 -- part II
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Container/Sandbox clean up should not fail if root FS is not mounted.
This PR handles EINVAL errors when umount2 is called.
Fixes: #10166
Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
Introduces `secure_mount` API in the cdh. It includes:
- Adding the `SecureMountServiceClient`.
- Implementing the `secure_mount` function to handle secure mounting requests.
- Updating the confidential_data_hub.proto file to define SecureMountRequest and SecureMountResponse messages
and adding the SecureMountService service.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
renames the sealed_secret.proto file to confidential_data_hub.proto and
updates the corresponding API namespace from sealed_secret to confidential_data_hub.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Hooks are executed on the host, so we don't expect to run hooks and thus
require that no hook paths are set.
Additional Kernel modules expand the attack surface, so require that
none are set. If a use case arises, modules should be allowlisted via
settings.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
CopyFile is invoked by the host's FileSystemShare.ShareFile function,
which puts all files into directories with a common pattern. Copying
files anywhere else is dangerous and must be prevented. Thus, we check
that the target path prefix matches the expected directory pattern of
ShareFile, and that this directory is not escaped by .. traversal.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
when use debug console, the shell run in child process may not be
exited, in some scenes.
eg. directly Ctrl-C in the host to terminate the kata-runtime process,
that will block the task handling the console connection,while waiting
for the child to exit.
Signed-off-by: soulfy <liukai254@jd.com>
Initialize the trusted stroage when the device is defined
as "/dev/trusted_store" with shell script as first step.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
After enable secure storage integrity for trusted storage, the initialize
time will take more times, the default value will be NOT enabled but add this config to
allow the user to enable if they care more strict security.
Fixes: #8142
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Our code for handling images being pulled inside the guest relies on a
containerType ("sandbox" or "container") being set as part of the
container annotations, which is done by the CRI Engine being used, and
depending on the used CRI Engine we check for a specfic annotation
related to the image-name, which is then passed to the agent.
However, when running kata-containers without kubernetes, specifically
when using `nerdctl`, none of those annotations are set at all.
One thing that we can do to allow folks to use `nerdctl`, however, is to
take advantage of the `--label` flag, and document on our side that
users must pass `io.kubernetes.cri.image-name=$image_name` as part of
the label.
By doing this, and changing our "fallback" so we can always look for
such annotation, we ensure that nerdctl will work when using the nydus
snapshotter, with kata-containers, to perform image pulling inside the
pod sandbox / guest.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Provides a test runner that generates a policy and validates it
with canned requests. The initial set of test cases is mostly for
illustration and will be expanded incrementally.
In order to enable both cross-compilation on Ubuntu test runners as well
as native compilation on the Alpine tools builder, it is easiest to
switch to the vendored openssl-src variant. This builds OpenSSL from
source, which depends on Perl at build time.
Adding the test to the Makefile makes it execute in CI, on a variety of
architectures. Building on ppc64le requires a newer version of the
libz-ng-sys crate.
Fixes: #10061
Signed-off-by: Markus Rudy <mr@edgeless.systems>
cargo clippy has two new warnings that need addressing:
- assigning_clones
These were fixed by clippy itself.
- suspicious_open_options
I added truncate(false) because we're opening the file for reading.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it.
Fixes: kata-containers#10112
Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
- Add --version flag to the genpolicy tool that prints the current
version
- Add version.rs.in template to store the version information
- Update makefile to autogenerate version.rs from version.rs.in
- Add license to Cargo.toml
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
1. Use the new value of AllowRequestsFailingPolicy after setting up a
new Policy. Before this change, the only way to enable
AllowRequestsFailingPolicy was to change the default Policy file,
built into the Guest rootfs image.
2. Ignore errors returned by regorus while evaluating Policy rules, if
AllowRequestsFailingPolicy was enabled. For example, trying to
evaluate the UpdateInterfaceRequest rules using a policy that didn't
define any UpdateInterfaceRequest rules results in a "not found"
error from regorus. Allow AllowRequestsFailingPolicy := true to
bypass that error.
3. Add simple CI test for AllowRequestsFailingPolicy.
These changes are restoring functionality that was broken recently by
commmit df23eb09a6.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Since we can't find a homogeneous value for the resource/cgroup
management of multiple hypervisors, and we have decoupled the
env vars in the Makefile, we don't need the generic ones.
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
To avoid overriding env vars when multiple hypervisors are
available, we add per-hypervisor vars for static resource
management and cgroups handling. We reflect that in the
relevant config files as well.
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
It's better to check the container's status before
try to send signal to it. Since there's no need
to send signal to it when the container's stopped.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Since stop sandbox would be called in multi path,
thus it's better to set and check the sandbox's state.
Fixes: #10042
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Generate policy that validates each exec command line argument, instead
of joining those args and validating the resulting string. Joining the
args ignored the fact that some of the args might include space
characters.
The older format from genpolicy-settings.json was similar to:
"ExecProcessRequest": {
"commands": [
"sh -c cat /proc/self/status"
],
"regex": []
},
That format will not be supported anymore. genpolicy will detect if its
users are trying to use the older "commands" field and will exit with
a relevant error message in that case.
The new settings format is:
"ExecProcessRequest": {
"allowed_commands": [
[
"sh",
"-c",
"cat /proc/self/status"
]
],
"regex": []
},
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
It's time to delete the kata oci spec implemented just
for kata. As we have already done align OCI Spec with
oci-spec-rs.
Fixes#9766
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This commit aligns the OCI Spec implementation in runtime-rs
with the OCI Spec definitions and related operations provided
by oci-spec-rs. Key changes as below:
(1) Leveraged oci-spec-rs to align Kata Runtime OCI Spec with
the official OCI Spec.
(2) Introduced runtime-spec to separate OCI Spec definitions
from Kata-specific State data structures.
(3) Preserved the original code logic and implementation as
much as possible.
(4) Made minor code adjustments to adhere to Rust programming
conventions;
Fixes#9766
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Utilized oci-spec-rs to align OCI Spec structures
and data representations in runk with the OCI Spec.
Fixes#9766
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This commit aligns the OCI Spec used within agent-ctl
with the oci-spec-rs definition and operations. This
enhancement ensures that agent-ctl adheres to the latest
OCI standards and provides a more consistent and reliable
experience for managing container images and configurations.
Fixes#9766
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This commit transitions the data implementation for OCI Spec
from kata-oci-spec to oci-spec-rs. While both libraries adhere
to the OCI Spec standard, significant implementation details
differ. To ensure data exchange through TTRPC services, this
commit reimplements necessary data conversion logic.
This conversion bridges the gap between oci-spec-rs data and
TTRPC data formats, guaranteeing consistent and reliable data
transfer across the system.
Fixes#9766
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>