Commit Graph

4749 Commits

Author SHA1 Message Date
GabyCT
b05811587e
Merge pull request #10245 from ChengyuZhu6/handler-manager
agent: Refactor storage handler registration
2024-09-06 09:45:39 -06:00
Fabiano Fidêncio
65a4562050 runtime: qemu: tdx: Add omitempty to QuoteGenerationSocket
I know right now we're always passing a value for that, but this doesn't
really have to be set unless attestation is used.  Thus, let's also omit
it in case it's empty.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 15:05:55 +02:00
Fabiano Fidêncio
7818484120 runtime: qemu: tdx: Support mrconfigid / mrowner/ mrownerconfig
This is a quick and simple pre-req for supporting initData, which will
take advantage of the mrconfigid in the TDX case.

While already adding mrconfigid, which is hardcoded empty right now,
let's do the same for mrowner and mrownerconfig, and leave it prepared
for future expansions.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 15:05:54 +02:00
Fabiano Fidêncio
8285957678 runtime: qemu: Rename prepareObjectWithTDXQgs to prepareTDXObject
The reason we're relying on yet another function to do so is because the
TDX object will be used in its qom / qapi json format.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 14:36:09 +02:00
Archana Choudhary
f2625b0014 genpolicy: add support for PodDisruptionBudget
yaml

Prevent panic for PDB specs

Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-09-05 11:33:47 -07:00
Dan Mihai
7ab95b56f1
Merge pull request #10251 from microsoft/saulparedes/support_readonly_hostpath
genpolicy: support readonly hostpath
2024-09-05 09:27:15 -07:00
Saul Paredes
24c2d13fd3 genpolicy: support readonly emptyDir mount
Set emptyDir access based on volume mount readOnly value

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-09-04 15:05:44 -07:00
Saul Paredes
36a4104753 genpolicy: support readonly hostpath
Set hostpath access based on volume mount readOnly value

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-09-04 14:55:22 -07:00
Fabiano Fidêncio
7d048f5963
Merge pull request #10254 from fidencio/topic/remove-amd-specific-warning-from-non-amd-systems
runtime: Don't error out about SNP cert path on non SNP platforms
2024-09-04 23:42:32 +02:00
Steve Horsman
f66e8c41a1
Merge pull request #10250 from squarti/remote-machine-type-default
runtime: fix bad default machine_type for remote hypervisor
2024-09-04 17:34:04 +01:00
Fabiano Fidêncio
b10256a7ca runtime: Don't error out about SNP cert path on non SNP platforms
This error is specific to SNP platforms, so let's make sure we only
error this out when an SNP platform is used.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-04 11:54:52 +02:00
Hui Zhu
447a7feccf runtime-rs: configuration-dragonball.toml.in: Add config for balloon
Add enable_balloon_f_reporting config to
configuration-dragonball.toml.in.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-04 17:25:38 +08:00
Hui Zhu
ad9968ce2d runtime-rs: Add enable_balloon_f_reporting for dragonball
Under normal circumstances, the virtual machine only requests memory
from the host and does not actively release it back to host when it is
no longer needed, leading to a waste of memory resources.

Free page reporting is a sub-feature of virtio-balloon. When this
feature is enabled, the Linux guest kernel will send information about
released pages to dragonball via virtio-balloon, and dragonball will
then release these pages.

This commit adds an option enable_balloon_f_reporting to runtime-rs.
When this option is enabled, runtime-rs will insert a virtio-balloon
device with the f_reporting option enabled during the Dragonball virtual
machine startup.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-04 16:38:13 +08:00
Paul Meyer
3be719c805 virtcontainers: allow specifying nydus-overlayfs binary by path
...or by using a binary with additional suffix.
This allows having multiple versions of nydus-overlayfs installed on the
host, telling nydus-snapshotter which one to use while still detecting
Nydus is used.

Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
2024-09-04 08:29:40 +02:00
Silenio Quarti
9e1388728e runtime: fix bad default machine_type for remote hypervisor
Fixes: #10249

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-09-03 20:53:19 -04:00
ChengyuZhu6
77521cc8d2 agent:cdh: introduce a function to check initialization of cdh client
introduce a function to check initialization of cdh client.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:52:50 +08:00
ChengyuZhu6
07e0e843e8 agent:cdh: switch to the new method for initializing cdh client
Decouple the cdh client from AgentService and refactor cdh client usage and initialization.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:51:55 +08:00
ChengyuZhu6
bc8156c3ae agent:cdh: Refactor cdh client methods for better integration
Move `unseal_env` and `secure_mount` functions on the global `CDH_CLIENT` instance to access the CDH client.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:51:54 +08:00
ChengyuZhu6
0ad35dc91b agent:cdh: Initialize CDH client as a global asynchronous instance
Introduced a global `CDH_CLIENT` instance to hold the cdh client and
implemented `init_cdh_client` function to initialize the cdh client if not already set.

Fixes: #10231

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-04 04:49:54 +08:00
ChengyuZhu6
0d519162b5 agent:storage: Refactor storage handler registration
- Added `driver_types` method to `StorageHandler` trait to return driver
  types managed by each handler.
- Implemented driver_types method for all storage handlers.
- Updated `STORAGE_HANDLERS` initialization to use `driver_types` for
  handler registration.

Fixes: #10242

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-03 18:38:52 +08:00
ChengyuZhu6
e47eb0d7d4 kata-types:mount: support registering multiple IDs to a single handler
- Updated the `add_handler` function in `StorageHandlerManager` to accept a slice of IDs (`&[&str]`) instead of a single ID (`&str`).
  This change allows a single handler to be registered for multiple storage device types.
- Refactored calls to `add_handler` in `Storage` of kata-agent to use the new function, passing arrays of storage drivers instead of single driver.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-03 18:38:36 +08:00
Fabiano Fidêncio
1309c49c09 agent: Update image-rs to 02af65abc
As this brings in proper support to handle gzip whiteouts.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-02 14:15:04 +02:00
Aurélien Bombo
886b3047ac
Merge pull request #10222 from microsoft/danmihai1/log-level-false-positives
agent: avoid policy.txt log without debug enabled
2024-08-30 10:09:04 -07:00
Alex Lyn
8241423ba5
Merge pull request #10224 from amshinde/update-image-rs-xattr
agent: image-rs: check xattrs for image unpacking
2024-08-29 09:33:22 +08:00
GabyCT
dd9f41547c
Merge pull request #10160 from microsoft/saulparedes/support_priority_class
genpolicy: add priorityClassName as a field in PodSpec interface
2024-08-28 14:36:20 -06:00
Archana Shinde
c747852bce agent: image-rs: check xattrs for image unpacking
This commit includes a fix for pulling an image on platforms that do not
support xattr.

Some platforms/file-systems do not support xattrs, this would make the
image pull fail because of failing to set xattr. This commit will check
whether the target path supports xattr. If yes, the unpacking will
maintain xattrs; if not, it will not set xattrs.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-08-28 00:02:46 -07:00
Archana Choudhary
ae2cdedba8 genpolicy: add priorityClassName as a field in PodSpec interface
This allows generation of policy for pods specifying priority classes.

Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-08-27 19:54:02 -07:00
Dan Mihai
aa8bdbde5a agent: avoid policy.txt log without debug enabled
slog's is_enabled() is documented as:
- "best effort", and
- Sometime resulting in false positives.

Use AGENT_CONFIG.log_level.as_usize() instead, to avoid those false
positives.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-28 02:33:56 +00:00
Alex Lyn
f24983b3cf
Merge pull request #10210 from l8huang/cold-vf
runtime: check if  cold_plug_vfio is enabled before create PhysicalEndpoint
2024-08-27 15:23:55 +08:00
Silenio Quarti
11ba8f05ca runtime: Allow machine_type in kata config for remote hypervisors
Fixes: #10211

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-08-26 10:17:40 -04:00
Lei Huang
70168a467d runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint
PhysicalEndpoint unbinds its VF interface and rebinds it as a VFIO device,
then cold-plugs the VFIO device into the guest kernel.

When `cold_plug_vfio` is set to "no-port", cold-plugging the VFIO device
will fail.

This change checks if `cold_plug_vfio` is enabled before creating PhysicalEndpoint
to avoid unnecessary VFIO rebind operations.

Fixes: #10162

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-08-23 15:42:17 -07:00
Bo Chen
a0bd78b358
Merge pull request #10205 from likebreath/0819/upgrade_clh_v41.0
Upgrade to Cloud Hypervisor v41.0
2024-08-23 10:01:41 -07:00
Fabiano Fidêncio
45f69373a6
Merge pull request #10199 from BbolroC/make-cdh-api-timeout-configurable
agent/config: Make CDH_API_TIMEOUT configurable
2024-08-23 11:04:10 +02:00
Alex Lyn
44bf7ccb46
Merge pull request #10141 from soulfy/fix-delete-failed
agent: kill child process when console socket closed
2024-08-23 14:00:53 +08:00
Bo Chen
254f8bca74 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v41.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #10203

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-08-22 11:05:54 -07:00
Hyounggyu Choi
7d0aba1a24 runtime: Enable to get cdh_api_timeout from configuration file
This commit allows `cdh_api_timeout` to be configured from the configuration file.
The configuration is commented out with specifying a default value (50s) because
the default value is configured in the agent.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 14:47:37 +02:00
Hyounggyu Choi
8615516823 agent: Add agent.cdh_api_timeout to README
This commit adds an explanation for `cdh_api_timeout` to the README file.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 14:47:37 +02:00
Hyounggyu Choi
2512ddeab2 agent/cdh: Use AGENT_CONFIG.cdh_api_timeout for CDH_API_TIMEOUT
This commit updates CDH_API_TIMEOUT to use AGENT_CONFIG.cdh_api_timeout
and changes it from a `const` to `lazy_static` to accommodate runtime-determined values.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 10:09:16 +02:00
Hyounggyu Choi
6139e253a0 agent/config: Add cdh_api_timeout to AgentConfig
To make the `cdh_api_timeout` variable configurable, it has been added to
the `AgentConfig` structure.
This change includes storing the variable as a `time::Duration` type and
generalizing the existing `hotplug_timeout` code to handle both timeouts.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-08-22 10:09:16 +02:00
Dan Mihai
6654491cc3 genpolicy: deny UpdateEphemeralMountsRequest
* genpolicy: deny UpdateEphemeralMountsRequest

Deny UpdateEphemeralMountsRequest by default, because paths to
critical Guest components can be redirected using such request.

Signed-off-by: Dan Mihai <Daniel.Mihai@microsoft.com>
2024-08-20 18:28:17 -07:00
Fabiano Fidêncio
e0ae398a2e
Merge pull request #10151 from squarti/rootdir2
runtime: Files are not synced between host and guest VMs
2024-08-18 12:32:52 +02:00
Fabiano Fidêncio
a78d82f4f1
Merge pull request #10159 from squarti/main
agent: Handle EINVAL error when umounting container rootfs
2024-08-16 22:07:50 +02:00
Dan Mihai
79c1d0a806
Merge pull request #10136 from microsoft/danmihai1/docker-image-volume2
genpolicy: add bind mounts for image volumes
2024-08-16 13:07:01 -07:00
Fabiano Fidêncio
28aa4314ba
Merge pull request #10175 from ChengyuZhu6/error_message
runtime: Add specific error message for gRPC request timeouts
2024-08-16 22:06:49 +02:00
Dan Mihai
c22ac4f72c genpolicy: add bind mounts for image volumes
Add bind mounts for volumes defined by docker container images, unless
those mounts have been defined in the input K8s YAML file too.

For example, quay.io/opstree/redis defines two mounts:
/data
/node-conf
Before these changes, if these mounts were not defined in the YAML file
too, the auto-generated policy did not allow this container image to
start.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-16 15:11:05 +00:00
Fabiano Fidêncio
8d63723910
Merge pull request #10161 from microsoft/saulparedes/ignore_role_resource
genpolicy: ignore Role resource
2024-08-16 16:50:16 +02:00
ChengyuZhu6
ca05aca548 runtime: Add specific error message for gRPC request timeouts
Improved error handling to provide clearer feedback on request failures.

For example:
Improve createcontainer request timeout error message from
"Error: failed to create containerd task: failed to create shim task:context deadline exceed"
to "Error: failed to create containerd task: failed to create shim task: CreateContainerRequest timed out: context deadline exceed".

Fixes: #10173 -- part II

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-16 20:24:48 +08:00
Chengyu Zhu
ba3c484d12
Merge pull request #9999 from ChengyuZhu6/trusted-storage
Trusted image storage
2024-08-16 15:39:50 +08:00
Silenio Quarti
5d815ffde1 runtime: Files are not synced between host and guest VMs
This PR resolves the default kubelet root dir symbolic link and
uses it as the absolute path for the fs watcher regexs

Fixes: https://github.com/kata-containers/kata-containers/issues/9986

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-08-15 23:19:08 -04:00
Silenio Quarti
0dd16e6b25 agent: Handle EINVAL error when umounting container rootfs
Container/Sandbox clean up should not fail if root FS is not mounted.
This PR handles EINVAL errors when umount2 is called.

Fixes: #10166

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-08-15 19:41:46 -04:00
Dan Mihai
905c76bd47
Merge pull request #10153 from microsoft/saulparedes/support_cron_job
genpolicy: Add support for cron jobs
2024-08-15 11:11:00 -07:00
Aurélien Bombo
0223eedda5
Merge pull request #10050 from burgerdev/request-hardening
genpolicy: hardening some agent requests
2024-08-15 08:31:21 -07:00
ChengyuZhu6
5f4209e008 agent:README: add secure_image_storage_integrity to agent's README
add secure_image_storage_integrity to agent's README.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 20:32:44 +08:00
ChengyuZhu6
a9b436f788 agent:cdh: Introduces secure_mount API in cdh
Introduces `secure_mount` API in the cdh. It includes:

- Adding the `SecureMountServiceClient`.
- Implementing the `secure_mount` function to handle secure mounting requests.
- Updating the confidential_data_hub.proto file to define SecureMountRequest and SecureMountResponse messages
  and adding the SecureMountService service.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:23 +08:00
ChengyuZhu6
1528d543b2 agent:cdh: Rename sealed_secret API namespace to confidential_data_hub
renames the sealed_secret.proto file to confidential_data_hub.proto and
updates the corresponding API namespace from sealed_secret to confidential_data_hub.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-08-15 13:55:23 +08:00
Saul Paredes
5ad47b8372 genpolicy: ignore Role resource
Ignore Role resources because they don't need a Policy.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-14 12:57:06 -07:00
Fupan Li
cadcf5f92d runtime-rs: Add the wait_vm support for hypervisors
Add the wait_vm method for hypervisors. This is a
prerequisite for sandbox api support.

Fixes: #7043

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-08-14 12:01:34 +08:00
Saul Paredes
88451d26d0 genpolicy: add support for cron jobs
Add support for cron jobs

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-13 10:47:42 -07:00
Markus Rudy
bce5cb2ce5 genpolicy: harden CreateSandboxRequest checks
Hooks are executed on the host, so we don't expect to run hooks and thus
require that no hook paths are set.

Additional Kernel modules expand the attack surface, so require that
none are set. If a use case arises, modules should be allowlisted via
settings.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-13 09:01:58 +02:00
Markus Rudy
aee23409da genpolicy: harden CopyFileRequest checks
CopyFile is invoked by the host's FileSystemShare.ShareFile function,
which puts all files into directories with a common pattern. Copying
files anywhere else is dangerous and must be prevented. Thus, we check
that the target path prefix matches the expected directory pattern of
ShareFile, and that this directory is not escaped by .. traversal.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-13 09:01:58 +02:00
soulfy
722b576eb3 agent: kill child process when console socket closed
when use debug console, the shell run in child process may not be
exited, in some scenes.
eg. directly Ctrl-C in the host to terminate the kata-runtime process,
that will block the task handling the console connection,while waiting
for the child to exit.

Signed-off-by: soulfy <liukai254@jd.com>
2024-08-13 10:18:03 +08:00
ChengyuZhu6
df993b0f88 agent:rpc: initialize trusted storage device
Initialize the trusted stroage when the device is defined
as "/dev/trusted_store" with shell script as first step.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
2024-08-12 16:36:54 +08:00
ChengyuZhu6
94347e2537 agent:config: Support secure_storage_integrity option for trusted storage
After enable secure storage integrity for trusted storage, the initialize
time will take more times, the default value will be NOT enabled but add this config to
allow the user to enable if they care more strict security.

Fixes: #8142

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
2024-08-12 16:36:54 +08:00
Fabiano Fidêncio
44b08b84b0
Merge pull request #10113 from Freax13/fix/no-scsi-off
qemu: don't emit scsi parameter
2024-08-08 16:23:36 +02:00
Dan Mihai
2da77c6979
Merge pull request #10068 from burgerdev/genpolicy-test
genpolicy: add crate-scoped integration test
2024-08-06 16:10:46 -07:00
Fabiano Fidêncio
f33f2d09f7 runtime: image-pull: Make it work with nerdctl
Our code for handling images being pulled inside the guest relies on a
containerType ("sandbox" or "container") being set as part of the
container annotations, which is done by the CRI Engine being used, and
depending on the used CRI Engine we check for a specfic annotation
related to the image-name, which is then passed to the agent.

However, when running kata-containers without kubernetes, specifically
when using `nerdctl`, none of those annotations are set at all.

One thing that we can do to allow folks to use `nerdctl`, however, is to
take advantage of the `--label` flag, and document on our side that
users must pass `io.kubernetes.cri.image-name=$image_name` as part of
the label.

By doing this, and changing our "fallback" so we can always look for
such annotation, we ensure that nerdctl will work when using the nydus
snapshotter, with kata-containers, to perform image pulling inside the
pod sandbox / guest.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-06 17:07:45 +02:00
Markus Rudy
69535e5458 genpolicy: add crate-scoped integration test
Provides a test runner that generates a policy and validates it
with canned requests. The initial set of test cases is mostly for
illustration and will be expanded incrementally.

In order to enable both cross-compilation on Ubuntu test runners as well
as native compilation on the Alpine tools builder, it is easiest to
switch to the vendored openssl-src variant. This builds OpenSSL from
source, which depends on Perl at build time.

Adding the test to the Makefile makes it execute in CI, on a variety of
architectures. Building on ppc64le requires a newer version of the
libz-ng-sys crate.

Fixes: #10061

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-05 11:52:01 +02:00
Markus Rudy
4d1416529d genpolicy: fix clippy v1.78.0 warnings
cargo clippy has two new warnings that need addressing:
- assigning_clones
  These were fixed by clippy itself.
- suspicious_open_options
  I added truncate(false) because we're opening the file for reading.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-08-05 11:48:30 +02:00
Fabiano Fidêncio
43dca8deb4
Merge pull request #10121 from microsoft/saulparedes/add_version_flag
genpolicy: add --version flag
2024-08-03 21:22:10 +02:00
GabyCT
76af5a444b
Merge pull request #10075 from microsoft/saulparedes/hooks
genpolicy: reject create custom hook settings
2024-08-02 15:36:34 -06:00
Tom Dohrmann
322c80e7c8
qemu: don't emit scsi parameter
This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it.

Fixes: kata-containers#10112
Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
2024-08-02 07:30:39 +02:00
Tom Dohrmann
b7999ac765
runtime-rs: don't emit scsi parameter for block devices
This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it.

Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
2024-08-02 07:30:23 +02:00
Saul Paredes
194cc7ca81 genpolicy: add --version flag
- Add --version flag to the genpolicy tool that prints the current
version
- Add version.rs.in template to store the version information
- Update makefile to autogenerate version.rs from version.rs.in
- Add license to Cargo.toml

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-08-01 17:18:17 -07:00
Dan Mihai
9e99329bef genpolicy: reject create sandbox hooks
Reject CreateSandboxRequest hooks, because these hooks may be used by an
attacker.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-01 16:58:35 -07:00
Dan Mihai
c2a55552b2 agent: fix the AllowRequestsFailingPolicy functionality
1. Use the new value of AllowRequestsFailingPolicy after setting up a
   new Policy. Before this change, the only way to enable
   AllowRequestsFailingPolicy was to change the default Policy file,
   built into the Guest rootfs image.

2. Ignore errors returned by regorus while evaluating Policy rules, if
   AllowRequestsFailingPolicy was enabled. For example, trying to
   evaluate the UpdateInterfaceRequest rules using a policy that didn't
   define any UpdateInterfaceRequest rules results in a "not found"
   error from regorus. Allow AllowRequestsFailingPolicy := true to
   bypass that error.

3. Add simple CI test for AllowRequestsFailingPolicy.

These changes are restoring functionality that was broken recently by
commmit df23eb09a6.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-08-01 22:37:18 +00:00
Wainer Moschetta
528745fc88
Merge pull request #10052 from nubificus/feat_fix_qemu_after_8070
runtime-rs: Fix QEMU backend for runtime-rs
2024-07-30 11:00:14 -03:00
Fupan Li
e3f0d2a751 runtime-rs: enable dragonball hypervisor support initrd
enable the dragonball support initrd.

Fixes: #10023

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-30 14:50:24 +08:00
Fupan Li
4fbf9d67a5
Merge pull request #10043 from lifupan/fix_sandbox
runtime-rs : fix the issue of stop sandbox
2024-07-29 09:22:26 +08:00
Dan Mihai
5546ce4031
Merge pull request #10069 from microsoft/danmihai1/exec-args
genpolicy: validate each exec command line arg
2024-07-26 11:39:44 -07:00
Anastassios Nanos
d11657a581 runtime-rs: Remove unused env vars from build
Since we can't find a homogeneous value for the resource/cgroup
management of multiple hypervisors, and we have decoupled the
env vars in the Makefile, we don't need the generic ones.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-07-26 14:03:50 +00:00
Anastassios Nanos
3f58ea9258 runtime-rs: Decouple Makefile env VARS
To avoid overriding env vars when multiple hypervisors are
available, we add per-hypervisor vars for static resource
management and cgroups handling. We reflect that in the
relevant config files as well.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-07-26 14:02:35 +00:00
Fupan Li
c51ba73199 container: fix the issue of send signal to process
It's better to check the container's status before
try to send signal to it. Since there's no need
to send signal to it when the container's stopped.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-26 19:23:43 +08:00
Fupan Li
e156516bde sandbox: fix the issue of stop sandbox
Since stop sandbox would be called in multi path,
thus it's better to set and check the sandbox's state.

Fixes: #10042

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-26 19:23:34 +08:00
Qi Feng Huo
a113fc93c8 initdata: fix unit test code for initdata annotation
Added ut code for initdata annotation

Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
2024-07-26 18:24:05 +08:00
Qi Feng Huo
8d61029676 initdata: add unit test code for initdata annotation
Added ut code for initdata annotation

Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
2024-07-26 14:20:57 +08:00
Qi Feng Huo
b80057dfb5 initdata: Merge branch 'main' into annotation
- Merge branch 'main' into feature branch annotation
2024-07-26 14:01:04 +08:00
Archana Shinde
d7637f93f9
Merge pull request #9899 from amshinde/multiple-networks-fix
Fix issue while adding multiple networks with nerdctl
2024-07-25 11:56:27 -07:00
Dan Mihai
a37f10fc87 genpolicy: validate each exec command line arg
Generate policy that validates each exec command line argument, instead
of joining those args and validating the resulting string. Joining the
args ignored the fact that some of the args might include space
characters.

The older format from genpolicy-settings.json was similar to:

    "ExecProcessRequest": {
      "commands": [
                "sh -c cat /proc/self/status"
        ],
      "regex": []
    },

That format will not be supported anymore. genpolicy will detect if its
users are trying to use the older "commands" field and will exit with
a relevant error message in that case.

The new settings format is:

    "ExecProcessRequest": {
      "allowed_commands": [
        [
          "sh",
          "-c",
          "cat /proc/self/status"
        ]
      ],
      "regex": []
    },

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-25 16:57:17 +00:00
Alex Lyn
abb0a2659a
Merge pull request #9944 from Apokleos/align-ocispec-rs
Align kata oci spec with oci-spec-rs
2024-07-25 19:36:52 +08:00
Alex Lyn
bb2b60dcfc oci: Delete the kata oci spec
It's time to delete the kata oci spec implemented just
for kata. As we have already done align OCI Spec with
oci-spec-rs.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
b56313472b agent: Align agent OCI spec with oci-spec-rs
Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
882385858d runtime-rs: Align oci spec in runtime-rs with oci-spec-rs
This commit aligns the OCI Spec implementation in runtime-rs
with the OCI Spec definitions and related operations provided
by oci-spec-rs. Key changes as below:
(1) Leveraged oci-spec-rs to align Kata Runtime OCI Spec with
the official OCI Spec.
(2) Introduced runtime-spec to separate OCI Spec definitions
from Kata-specific State data structures.
(3) Preserved the original code logic and implementation as
much as possible.
(4) Made minor code adjustments to adhere to Rust programming
conventions;

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
bf813f85f2 runk: Align oci spec with oci-spec-rs
Utilized oci-spec-rs to align OCI Spec structures
and data representations in runk with the OCI Spec.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
b3eab5ffea genpolicy: Align agent-ctl OCI Spec with oci-spec-rs
Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
c500fd5761 agent-ctl: Align agent-ctl OCI Spec with oci-spec-rs
This commit aligns the OCI Spec used within agent-ctl
with the oci-spec-rs definition and operations. This
enhancement ensures that agent-ctl adheres to the latest
OCI standards and provides a more consistent and reliable
experience for managing container images and configurations.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
faffee8909 libs: update Cargo config and lock file
update Cargo.toml and Cargo.lock for adding runtime-spec

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:47:01 +08:00
Alex Lyn
8b5499204d protocols: Reimplement OCI Spec to TTRPC Data Translation
This commit transitions the data implementation for OCI Spec
from kata-oci-spec to oci-spec-rs. While both libraries adhere
to the OCI Spec standard, significant implementation details
differ. To ensure data exchange through TTRPC services, this
commit reimplements necessary data conversion logic.
This conversion bridges the gap between oci-spec-rs data and
TTRPC data formats, guaranteeing consistent and reliable data
transfer across the system.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 17:46:07 +08:00
Anastassios Nanos
cda00ed176 runtime-rs: Add FC specific KERNELPARAMS
To avoid overriding KERNELPARAMS for other hypervisors, add
FC-specific KERNELPARAMS.

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2024-07-25 08:53:57 +00:00
Alex Lyn
4e003a2125
Merge pull request #10058 from Apokleos/enhance-vsock-connect
runtime-rs: enhance debug info for agent connect.
2024-07-25 11:29:04 +08:00
Alex Lyn
36385a114d runtime-rs: enhance debug info for agent connect.
we need more friendly logs for debugging agent conntion
cases when kata pods fail.

Fixes #10057

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-25 08:51:57 +08:00
Aurélien Bombo
f08b594733
Merge pull request #9576 from microsoft/saulparedes/support_env_from
genpolicy: Add support for envFrom
2024-07-24 13:39:54 -07:00
Archana Shinde
49fbae4fb1 agent: Wait for interface in update_interface
For nerdctl and docker runtimes, network is hot-plugged instead of
cold-plugged. While this change was made in the runtime,
we did not have the agent waiting for the device to be ready.
On some systems, the device hotplug could take some time causing
the update_interface rpc call to fail as the interface is not available.

Add a watcher for the network interface based on the pci-path of the
network interface. Note, waiting on the device based on name is really
not reliable especially in case multiple networks are hotplugged.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-24 10:45:56 -07:00
Pavel Mores
dd1e09bd9d runtime-rs: add experimental support for memory hotunplugging to qemu-rs
Hotunplugging memory is not guaranteed or even likely to work.
Nevertheless I'd really like to have this code in for tests and
observation.  It shouldn't hurt, from experience so far.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
Pavel Mores
3095b65ac3 runtime-rs: support hotplugging memory in QemuInner
The bulk of this implementation are simple though tedious sanity checks,
alignment computations and logging.

Note that before any hotplugging, we query qemu directly for the current
size of hotplugged memory.  This ensures that any request to resize memory
will be properly compared to the actual already available amount and only
necessary amount will be added.

Note also that we borrow checked_next_multiple_of() from CH implementation.
While this might look uncleanly it's just a rather temporary solution since
an equivalent function will apparently be part of std soon, likely the
upcoming 1.75.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
Pavel Mores
4a1c828bf8 runtime-rs: support hotplugging memory in Qmp
The algorithm is rather simple - we query qemu for existing memory devices
to figure out the index of the one we're about to add.  Then we add a
backend object and a corresponding frontend device.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
Pavel Mores
0e0b146b87 runtime-rs: support storage & retrieval of guest memblock size in qemu-rs
This will be used for ensuring that hotplugged memory block sizes are
properly aligned.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-07-24 13:22:41 +02:00
Alex Lyn
efb7390357 kata-sys-utils: align OCI Spec with oci-spec-rs
Do align oci spec and fix warnings to make clippy
happy.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-24 14:38:48 +08:00
Alex Lyn
012029063c runtime-spec: Introduce runtime-spec for Container State
As part of aligning the Kata OCI Spec with oci-spec-rs,
the concept of "State" falls outside the scope of the OCI
Spec itself. While we'll retain the existing code for State
management for now, to improve code organizationand clarity,
we propose moving the State-related code from the oci/ dir
to a dedicated directory named runtime-spec/.
This separation will be completed in subsequent commits with
the removal of the oci/ directory.

Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-24 14:38:30 +08:00
Dan Mihai
f26d595e5d
Merge pull request #9910 from microsoft/saulparedes/set_policy_rego_via_env
tools: Allow setting policy rego file via
2024-07-22 11:00:30 -07:00
Alex Lyn
67466aa27f kata-types: do alignment of oci-spec for kata-types
Fixes #9766

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-21 22:54:43 +08:00
Saul Paredes
2681fc7eb0 genpolicy: Add support for envFrom
This change adds support for the `envFrom` field in the `Pod` resource

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-19 09:53:58 -07:00
Greg Kurz
dc97f3f540
Merge pull request #10045 from lifupan/cleanup_container
runtime-rs: container: fix the issue of missing cleanup container
2024-07-19 16:36:04 +02:00
Alex Lyn
d0dc67bb96
Merge pull request #8597 from amshinde/vfio-hotplug-support
Implement hotplug support for physical endpoints
2024-07-19 13:41:11 +08:00
Fupan Li
8a2f7b7a8c container: fix the issue of missing cleanup container
When create container failed, it should cleanup the container
thus there's no device/resource left.

Fixes: #10044

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-07-19 11:02:55 +08:00
ms-mahuber
ddff762782 tools: Allow setting policy rego file via
environment variable

* Set policy file via env var

* Add restrictive policy file to kata-opa folder

* Change restrictive policy file name

* Change relative default path location

* Add license headers

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-18 15:05:45 -07:00
Aurélien Bombo
ab6f37aa52
Merge pull request #10022 from microsoft/danmihai1/probes-and-lifecycle
genpolicy: container.exec_commands args validation
2024-07-18 12:21:31 -07:00
Archana Shinde
1636c201f4 network: Implement network hotunplug for physical endpoints
Similar to HotAttach, the HotDetach method signature for network
endoints needs to be changed as well to allow for the method to make
use of device manager to manage the hot unplug of physical network
devices.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:41 -07:00
Archana Shinde
c6390f2a2a vfio: Introduce function to get vfio dev path
This function will be later used to get the vfio dev path.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:41 -07:00
Archana Shinde
1e304e6307 network: Implement hotplug for physical endpoints
Enable physical network interfaces to be hotplugged.
For this, we need to change the signature of the HotAttach method
to make use of Sandbox instead of Hypervisor. Similar approach was
followed for Attach method, but this change was overlooked for
HotAttach.
The signature change is required in order to make use of
device manager and receiver for physical network
enpoints.

Fixes: #8405

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:40 -07:00
Archana Shinde
2fef4bc844 vfio: use driver_override field for device binding.
The current implementation for device binding using driver bind/unbind
and new_id fails in the scenario when the physical device is not bound
to a driver before assigning it to vfio.
There exists and updated mechanism to accomplish the same that does not
have the same issue as above.
The driver_override field for a device allows us to specify the driver for a device
rather than relying on the bound driver to provide a positive match of the
device. It also has other advantages referenced here:
https://patchwork.kernel.org/project/linux-pci/patch/1396372540.476.160.camel@ul30vt.home/

So use the updated driver_override mechanism for binding/unbinding a
physical device/virtual function to vfio-pci.

Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-07-17 16:42:40 -07:00
Dan Mihai
f31c1b121e
Merge pull request #9812 from microsoft/saulparedes/test_policy_on_tdx
gha: enable policy testing on TDX
2024-07-17 08:47:44 -07:00
Dan Mihai
9f4d1ffd43 genpolicy: container.exec_commands args validation
Keep track of individual exec args instead of joining them in the
policy text. Verifying each arg results in a more precise policy,
because some of the args might include space characters.

This improved validation applies to commands specified in K8s YAML
files using:

- livenessProbe
- readinessProbe
- startupProbe
- lifecycle.postStart
- lifecycle.preStop

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-17 01:19:23 +00:00
stevenhorsman
eb07f5ef5e agent: doc: Fix ordering of options
- Fix the config options to be back in alphabetical order to be
easier to find

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
7cc81ce867 agent: image: Set image-rs auth config
If the agent-config has a value for `image_registry_auth`,
Then pass this to the image-rs client and enable auth mode too

Fixes: #8122

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
stevenhorsman
265322990a agent: config: Add config option to provide auth for guest-pull
Add optional config for agent.image_registry_auth, to specify
the uri of credentials to be used when pulling images in the guest
from an authenticated registry

Fixes: #8122

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-16 21:39:31 -03:00
Saul Paredes
0b3d193730 genpolicy: Support cpath for mount sources
Add setting to allow specifying the cpath for a mount source.

cpath is the root path for most files used by a container. For example,
the container rootfs and various files copied from the Host to the
Guest when shared_fs=none are hosted under cpath.

mount_source_cpath is the root of the paths used a storage mount
sources. Depending on Kata settings, mount_source_cpath might have the
same value as cpath - but on TDX for example these two paths are
different: TDX uses "/run/kata-containers" as cpath,
but "/run/kata-containers/shared/containers" as mount_source_cpath.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-15 14:09:49 -07:00
Dan Mihai
bcaf7fc3b4
Merge pull request #10008 from microsoft/danmihai1/runAsUser
genpolicy: add support for runAsUser fields
2024-07-15 12:08:50 -07:00
Xynnn007
a56b15112a agent: add ocicrypt config
ocicrypt config is for kata-agent to connect to CDH to request for image
decryption key. This value is specified by an env. We use this
workaround the same as CCv0 branch.

In future, we will consider better ways instead of writting files and
setting envs inside inner logic of kata-agent.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-07-15 12:00:50 +01:00
Xynnn007
1072658219 agent: Enable kata-cc-rustls-tls in image-rs
- Enable the kata-cc-rustls-tls feature in image-rs, so that it
can get resources from the KBS in order to retrieve the registry
credentials.
- Also bump to the latest image-rs to pick up protobuf fixes
- Add libprotobuf-dev dependency to the agent packaging
as it is needed by the new image-rs feature
- Add extra env in the agent make test as the
new version of the anyhow crate has changed the backtrace capture thus unit
tests of kata-agent that compares a raised error with an expected one
would fail. To fix this, we need only panics to have backtraces, thus
set RUST_BACKTRACE=0 for tests due to document
https://docs.rs/anyhow/latest/anyhow/

Fixes #9538

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-15 12:00:50 +01:00
Fupan Li
a7179be31d
Merge pull request #9534 from Tim-Zhang/fix-stdin-stuck
Fix ctr exec stuck problem
2024-07-15 13:19:19 +08:00
Dan Mihai
f087044ecb genpolicy: add support for runAsUser
Add ability to auto-generate policy for SecurityContext.runAsUser and
PodSecurityContext.runAsUser.

Fixes: #8879

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:10:43 +00:00
Dan Mihai
5282701b5b genpolicy: add link to allow_user() active issue
Improve comment to workaround in rules.rego, to explain better the
reason for that workaround.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-13 01:05:58 +00:00
Qi Feng Huo
4d66ee1935 initdata: add initdata annotation in hypervisor config
- Add Initdata annotation for hypervisor config, so that it can be passed when CreateVM

Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
2024-07-11 10:56:18 +08:00
Silenio Quarti
8260ce8d15 runtime: Initialize SharedFS for remote hypervisor
Sets SharedFS config to NoSharedFS for remote hypervisor in order to start the file watcher which syncs files from the host to the guest VMs. 

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
2024-07-10 14:31:25 -03:00
Xuewei Niu
7f71eac6de
Merge pull request #9868 from l8huang/dan
runtime: implement DAN in Go kata-runtime
2024-07-10 19:09:46 +08:00
Alex Lyn
dafff26f01
Merge pull request #9814 from Apokleos/bugfix-pcipath
runtime-rs: bugfix for root bus slot allocation
2024-07-10 16:19:06 +08:00
Steve Horsman
78bbc51ff0
Merge pull request #9806 from niteeshkd/nd_snp_certs
runtime: pass certificates to get extended attestation report for SNP coco
2024-07-10 08:57:45 +01:00
Lei Huang
171d298dea runtime: implement DAN in Go kata-runtime
The DAN feature has already been implemented in kata-runtime-rs, and
this commit brings the same capability to the Go kata-runtime.

Fixes: #9758

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-07-10 00:22:30 -07:00
Alex Lyn
806e959b01 runtime-rs: bugfix for device slot allocation failed in dragonball
In dragonball Vfio device passthrough scenarois, the first passthrough
device will be allocated slot 0 which is occupied by root device.
It will cause error, looks like as below:
```
...
6: failed to add VFIO passthrough device: NoResource\n
7: no resource available for VFIO device"): unknown
...
```
To address such problem, we adopt another method with no pre-allocated
guest device id and just let dragonball auto allocate guest device id
and return it to runtime. With this idea, add_device will return value
Result<DeviceType> and apply the change to related code.

Fixes #9813

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-10 10:59:57 +08:00
Alex Lyn
27947cbb0b dragonball: make add vfio device return guest device id
Fixes #9813

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-07-10 10:59:51 +08:00
Hyounggyu Choi
37b907dfbc
Merge pull request #9859 from BbolroC/set-ocispec-for-vfio-ap
tests: Extend vfio-ap hotplug test to use a zcrypttest tool
2024-07-09 14:03:45 +02:00
Steve Horsman
ff498c55d1
Merge pull request #9719 from fitzthum/sealed-secret
Support Confidential Sealed Secrets (as env vars)
2024-07-09 09:43:51 +01:00
Niteesh Dubey
529660fafb runtime: pass certificates for SNP coco
This will be used to get extended attestation report.

Fixes: #9805

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-07-09 03:46:00 +00:00
Tim Zhang
8801554889 runtime-rs: Fix ctr exec stuck problem
Fixes: #9532

Instead of call agent.close_stdin in close_io, we call agent.write_stdin
with 0 len data when the stdin pipe ends.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-07-09 11:44:36 +08:00
Linda Yu
b4d61f887b agent: unittest for sealed secret as env in kata
To test unsealing secrets stored in environment variables,
we create a simple test server that takes the place of
the CDH. We start this server and then use it to
unseal a test secret.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-07-08 17:32:45 -05:00
Linda Yu
6003608fe6 agent: support sealed secret as env in kata
When sealed-secret is enabled, the Kata Agent
intercepts environment variables containing
sealed secrets and uses the CDH to unseal
the value.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
2024-07-08 17:31:33 -05:00
stevenhorsman
d511820974 agent: Bump image-rs
- Bump the commit of image-rs we are pulling in to 413295415
Note: This is the last commmit before a change to whiteout handling
was introduced that lead to the error `'failed to unpack: convert whiteout"`
when pulling the agnhost:2.21 image

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 10:25:28 +01:00
ChengyuZhu6
e71c7ab932 agent/image: Remove functions about merging container spec for guest pull
Let me explain why:

In our previous approach, we implemented guest pull by passing PullImageRequest to the guest.
However, this method  resulted in the loss of specifications essential for running the container,
such as commands specified in YAML, during the CreateContainer stage. To address this,
it is necessary to integrate the OCI specifications and process information
from the image’s configuration with the container in guest pull.

The snapshotter method does not care this issue. Nevertheless, a problem arises
when two containers in the same pod attempt to pull the same image, like InitContainer.
This is because the image service searches for the existing configuration,
which resides in the guest. The configuration, associated with <image name, cid>,
is stored in the directory /run/kata-containers/<cid>. Consequently, when the InitContainer finishes
its task and terminates, the directory ceases to exist. As a result, during the creation
of the application container, the OCI spec and process information cannot
be merged due to the absence of the expected configuration file.

Fixes: kata-containers#9665
Fixes: kata-containers#9666
Fixes: kata-containers#9667
Fixes: kata-containers#9668

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
ChengyuZhu6
c9d1a758cd agent/image: Reuse the mountpoint in image-rs
Currently, the image is pulled by image-rs in the guest and mounted at
`/run/kata-containers/image/cid/rootfs`. Finally, the agent rebinds
`/run/kata-containers/image/cid/rootfs` to `/run/kata-containers/cid/rootfs` in CreateContainer.
However, this process requires specific cleanup steps for these mount points.

To simplify, we reuse the mount point `/run/kata-containers/cid/rootfs`
and allow image-rs to directly mount the image there, eliminating the need for rebinding.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-05 08:10:04 +08:00
stevenhorsman
05cd1cc7a0 agent: Add CreateContainer support for pre-pulled bundle
- Add a check in setup_bundle to see if the bundle already exists
and if it does then skip the setup.

This commit is cherry-picked from 44ed3ab80e.

The reason that k8s-kill-all-process-in-container.bats failed is that
deletion of the directory `/root/kata-containers/cid/rootfs` failed during removing container
because it was mounted twice (one in image-rs and one in set_bundle ) and only unmounted once in removing container.

Fixes: #9664

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Dave Hay <david_hay@uk.ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-07-05 08:10:00 +08:00
Biao Lu
6c1a2f01f8 protocols: add support for sealed_secret service
To unseal a secret, the Kata agent will contact the CDH
using ttRPC. Add the proto that describes the sealed
secret service and messages that will be used.

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Biao Lu <biao.lu@intel.com>
2024-07-04 01:03:41 -05:00
Anastassios Nanos
db75b5f3c4
Merge pull request #8070 from nubificus/feat_add-fc-runtime-rs
runtime-rs: firecracker hypervisor backend
2024-07-03 22:29:30 +03:00
Dan Mihai
ada53744ea
Merge pull request #9907 from microsoft/saulparedes/allow_empty_env_vars
genpolicy: allow some empty env vars
2024-07-03 08:07:23 -07:00
George Pyrros
2d19f3fbd7 runtime-rs: firecracker hypervisor backend
Add a basic runtime-rs `Hypervisor` trait implementation for
AWS Firecracker

- Add basic hypervisor operations (setup / start / stop / add_device)
- Implement AWS Firecracker API on a separate file `fc_api.rs`
- Add support for running jailed (include all sandbox-related content)
- Add initial device support (limited as hotplug is not supported)
- Add separate config for runtime-rs (FC)

Notes:
- devmapper is the only snapshotter supported
- to account for no sharefs support, we copy files in the sandbox (as
  in the GO runtime)
- nerdctl spawn is broken (TODO: #7703)

Fixes: #5268

Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: Charalampos Mainas <cmainas@nubificus.co.uk>
Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>
2024-07-03 08:30:30 +00:00
Aurélien Bombo
33d08a8417
Merge pull request #9825 from microsoft/mahuber/main
osbuilder: allow rootfs builds w/o git or version file deps
2024-07-02 09:38:13 -07:00
Wainer Moschetta
ec695f67e1
Merge pull request #9577 from microsoft/saulparedes/topology
genpolicy: add topologySpreadConstraints support
2024-07-02 11:24:26 -03:00
Amulya Meka
dd12089e0d
Merge pull request #9914 from Amulyam24/qemu-fix
kata-deploy: fix qemu static build on ppc64le
2024-07-02 10:45:03 +05:30
Saul Paredes
f3f3caa80a genpolicy: update sample
Update pod-one-container.yaml sample

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-07-01 13:49:08 -07:00
Dan Mihai
75aee526a9 genpolicy: add topologySpreadConstraints support
Allow genpolicy to process Pod YAML files including
topologySpreadConstraints.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-07-01 13:32:49 -07:00
Hyounggyu Choi
99690ab202 runtime: Instantiate/pass vfio-ap device to ociSpec
This commit adds the missing step of passing an attached vfio-ap device
to a container via ociSpec. It instantiates and passes a vfio-ap device
(e.g. a Z crypto device).
A device at `/dev/z90crypt` covers all use cases at the time of writing.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-01 11:40:49 +02:00
Amulyam24
259ec408b5 kata-deploy: fix qemu static build for v8.2.1 on ppc64le
Do not install the packages librados-dev and librbd-dev as they are not needed for building static qemu.

Add machine option cap-ail-mode-3=off while creating the VM to qemu cmdline.
Fixes: #9893

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-07-01 14:56:43 +05:30
Archana Shinde
82a1892d34 agent: Add additional info while returning errors for update_interface
This should provide additional context for errors while updating network
interface.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-06-27 12:56:53 -07:00
Archana Shinde
2127288437 agent: Bring interface down before renaming it.
In case we are dealing with multiple interfaces and there exists a
network interface with a conflicting name, we temporarily rename it to
avoid name conflicts.
Before doing this, we need to rename bring the interface down.
Failure to do so results in netlink returning Resource busy errors.

The resource needs to be down for subsequent operation when the name is
swapped back as well.

This solves the issue of passing multiple networks in case of nerdctl
as:
nerdctl run --rm  --net foo --net bar docker.io/library/busybox:latest ip a

Fixes: #9900

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-06-27 12:56:53 -07:00
Bo Chen
25e3cab028 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v40.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #9929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-27 09:59:00 -07:00
Alex Lyn
d66c214ae7
Merge pull request #9849 from markyangcc/main
runtime: fix missing of VhostUserDeviceReconnect parameter assignment
2024-06-27 21:48:37 +08:00
Zvonko Kaiser
893fd2b59c
Merge pull request #9916 from zvonkok/config-fix
gpu: Missing separator
2024-06-26 14:46:47 +02:00
Greg Kurz
fe7ef878d2
Merge pull request #9913 from gkurz/update-kata-ctl-deps
kata-ctl: Update Cargo.lock
2024-06-26 14:31:03 +02:00
Zvonko Kaiser
e0aa54301f gpu: Missing separator
Add the correct separator for replacement

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-26 10:40:35 +00:00
Greg Kurz
ac33a389c0
Merge pull request #9879 from pmores/remove-dependency-on-containerd-bundle-dir-tree
runtime-rs: remove attempt to access sandbox bundle from container bu…
2024-06-26 10:57:50 +02:00
Greg Kurz
db7b2f7aaa kata-ctl: Update Cargo.lock
A previous change missed to refresh Cargo.lock.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-26 08:27:52 +02:00
Saul Paredes
ce19419d72 genpolicy: allow some empty env vars
Updated genpolicy settings to allow 2 empty environment variables that
may be forgotten to specify (AZURE_CLIENT_ID and AZURE_TENANT_ID)

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-25 10:53:05 -07:00
Aurélien Bombo
0582a9c75b
Merge pull request #9864 from 3u13r/feat/genpolicy/layers-cache-file-path
genpolicy: allow specifying layer cache file
2024-06-25 10:42:22 -07:00
Alex Lyn
2c5b3a5c20
Merge pull request #9830 from gaohuatao-1/ght/count-rs
runtime-rs: fix the bug of func count_files
2024-06-25 15:00:46 +08:00
Aurélien Bombo
b0cdf4eb0d
Merge pull request #9579 from microsoft/saulparedes/add_seccomp_support
genpolicy: ignore SeccompProfile in PodSpec
2024-06-24 08:58:01 -07:00
Leonard Cohnen
6a3ed38140 genpolicy: allow specifying layer cache file
Add --layers-cache-file-path flag to allow the user to
specify where the cache file for the container layers
is saved. This allows e.g. to have one cache file
independent of the user's working directory.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-06-24 14:53:27 +02:00
Wainer Moschetta
f7e0d6313b
Merge pull request #9865 from wainersm/qemu-coco-dev_updates
runtime: updates to qemu-coco-dev configuration
2024-06-21 10:14:30 -03:00
Saul Paredes
44afb4aa5f genpolicy: ignore SeccompProfile in PodSpec
Ignore SeccompProfile in PodSpec

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-20 09:42:17 -07:00
Dan Mihai
7aeaf2502a
Merge pull request #9856 from microsoft/danmihai1/new-policy-rules
genpolicy: reject untested CreateContainer field values
2024-06-20 09:34:53 -07:00
Steve Horsman
d5b4da7331
Merge pull request #9881 from stevenhorsman/remote-hypervisor-policy
runtime: Support policy in remote hypervisor
2024-06-20 14:01:29 +01:00
stevenhorsman
779754dcf6 runtime: Support policy in remote hypervisor
Move the `sandbox.agent.setPolicy` call out of the remoteHypervisor
if, block, so we can use the policy implementation on peer pods

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-19 16:43:53 +01:00
Pavel Mores
6a4919eeb9 runtime-rs: fix misleading log message
get_vmm_master_tid() currently returns an error with the message "cannot
get qemu pid (though it seems running)" when it finds a valid
QemuInner::qemu_process instance but fails to extract the PID out of it.

This condition however in fact means that a qemu child process was running
(otherwise QemuInner::qemu_process would be None) but isn't anymore (id()
returns None).

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:15:24 +02:00
Pavel Mores
af5492e773 runtime-rs: made Qemu::stop_vm() idempotent
Since Hypervisor::stop_vm() is called from the WaitProcess request handling
which appears to be per-container, it can be called multiple times during
kata pod shutdown.  Currently the function errors out on any subsequent
call after the initial one since there's no VM to stop anymore.  This
commit makes the function tolerate that condition.

While it seems conceivable that sandbox shouldn't be stopped by WaitProcess
handling, and the right fix would then have to happen elsewhere, this
commit at least makes qemu driver's behaviour consistent with other
hypervisor drivers in runtime-rs.

We also slightly improve the error message in case there's no
QemuInner::qemu_process instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:15:24 +02:00
Pavel Mores
5fbbff9e5e runtime-rs: remove attempt to access sandbox bundle from container bundle
Since no objections were raised in the linked issue (#9847) this commit
removes the attempt to derive sandbox bundle path from container bundle
path.  As described in more detail in the linked issue, this is container
runtime specific and doesn't seem to serve any purpose.

As for implementation, we hoist the only part of
get_shim_info_from_sandbox() that's still useful (getting the socket
address) directly into the caller and remove the function altogether.

Fixes #9847

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-19 17:09:15 +02:00
Dan Mihai
4df66568cf genpolicy: reject untested CreateContainer field values
Reject CreateContainerRequest field values that are not tested by
Kata CI and that might impact the confidentiality of CoCo Guests.

This change uses a "better safe than sorry" approach to untested
fields. It is very possible that in the future we'll encounter
reasonable use cases that will either:

- Show that some of these fields are benign and don't have to be
  verified by Policy, or
- Show that Policy should verify legitimate values of these fields

These are the new CreateContainerRequest Policy rules:

    count(input.shared_mounts) == 0
    is_null(input.string_user)

    i_oci := input.OCI
    is_null(i_oci.Hooks)
    is_null(i_oci.Linux.Seccomp)
    is_null(i_oci.Solaris)
    is_null(i_oci.Windows)

    i_linux := i_oci.Linux
    count(i_linux.GIDMappings) == 0
    count(i_linux.MountLabel) == 0
    count(i_linux.Resources.Devices) == 0
    count(i_linux.RootfsPropagation) == 0
    count(i_linux.UIDMappings) == 0
    is_null(i_linux.IntelRdt)
    is_null(i_linux.Resources.BlockIO)
    is_null(i_linux.Resources.Network)
    is_null(i_linux.Resources.Pids)
    is_null(i_linux.Seccomp)
    i_linux.Sysctl == {}

    i_process := i_oci.Process
    count(i_process.SelinuxLabel) == 0
    count(i_process.User.Username) == 0

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-06-18 18:09:31 +00:00
markyangcc
a28bf266f9 runtime: fix missing of VhostUserDeviceReconnect parameter assignment
Commit 'ca02c9f5124e' implements the vhost-user-blk reconnection functionality,
However, it has missed assigning VhostUserDeviceReconnect when new the QEMU
HypervisorConfig, resulting in VhostUserDeviceReconnect always set to default value 0.

Real change is this line, most of changes caused by go format,

return vc.HypervisorConfig{
	// ...
	VhostUserDeviceReconnect: h.VhostUserDeviceReconnect,
}, nil

Fixes: #9848
Signed-off-by: markyangcc <mmdou3@163.com>
2024-06-18 12:15:10 +08:00
Alex Lyn
388cd7dde4
Merge pull request #9772 from pmores/add-base-qmp-framework
runtime-rs: add base qmp framework
2024-06-18 09:53:28 +08:00
Alex Lyn
275c498dc9
Merge pull request #9834 from lifupan/main
sandbox: fix the issue of failed to get the vmm master tid
2024-06-18 08:57:21 +08:00
Alex Lyn
d3fb6bfd35
Merge pull request #9860 from stevenhorsman/tokio-vulnerability-bump
Tokio vulnerability bump
2024-06-18 08:35:34 +08:00
Wainer dos Santos Moschetta
bdbee78517 runtime: allow default_{vcpus,memory} annotations to qemu-coco-dev
This is a counterpart of commit abf52420a4 for the qemu-coco-dev
configuration. By allowing default_vcpu and default_memory annotations
users can fine-tune the VM based on the size of the container
image to avoid issues related with pulling large images in the guest.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 18:59:52 -03:00
Wainer dos Santos Moschetta
baa8d9d99c runtime: set shared_fs=none to qemu-coco-dev configuration
Just like the TEE configurations (sev, snp, tdx) we want to have the
qemu-coco-dev using shared_fs=none.

Fixes: #9676
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 18:42:46 -03:00
Wainer Moschetta
b6a28bd932
Merge pull request #9786 from microsoft/saulparedes/add_back_insecure_registry_pull
genpolicy: add back support for insecure
2024-06-17 15:21:25 -03:00
Wainer Moschetta
68415dabcd
Merge pull request #9815 from msanft/fix/genpolicy/flag-name
genpolicy: fix settings path flag name
2024-06-17 15:13:25 -03:00
stevenhorsman
53659f1ede libs: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
35f6be97df runtime-rs: Update tokio dependency
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

If possible it would be good to add the many runtime-rs creates into the
runtime-rs workspace and provide a centralised version to avoid the updates
in many places.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
3bb1a67d80 agent-ctl: Update rustjail dependencies
- Run `cargo update -p rustjail` to pick up rustjail's bump of
tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
d2d35d2dcc runk: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
adda401a8c genpolicy: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:03:01 +01:00
stevenhorsman
b7928f465e agent: Update tokio dependencies
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-17 13:02:47 +01:00
Steve Horsman
cce735a09e
Merge pull request #9840 from stevenhorsman/bump-agent-rust-1.75.0
versions: Bump rust toolchain
2024-06-17 11:28:07 +01:00
Fupan Li
b218c4bc10
Merge pull request #9836 from lifupan/main_fix
sandbox: fix the issue of double initial_size_manager config
2024-06-17 09:15:51 +08:00
Bo Chen
a68aeca356
Merge pull request #9575 from likebreath/0430/clh_v39.0
versions: Upgrade to Cloud Hypervisor v39.0
2024-06-14 09:10:19 -07:00
stevenhorsman
3fb176970f dragonball: Fix device manager warning
- Fix the lint error:
```
error: you seem to use `.enumerate()` and immediately discard the index
   --> src/device_manager/mod.rs:427:33
    |
427 |         for (_index, device) in self.virtio_devices.iter().enumerate() {
    |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
 by removing the unnecessary enumerate

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
stevenhorsman
1ea2671f2f dragonball: Fix lint with rust 1.75.0
The ci failed with:
```
error: use of `or_insert_with` to construct default value
   --> src/address_space_manager.rs:650:14
    |
650 |             .or_insert_with(NumaNode::new);
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `or_default()`
    |
```

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-14 16:45:16 +01:00
Steve Horsman
ab8a9882c1
Merge pull request #9818 from EmmEff/fix-spelling
runtime: fix minor spelling issues
2024-06-14 13:12:56 +01:00
Steve Horsman
99bf95f773
Merge pull request #9827 from littlejawa/fix_panic_on_metrics_gathering
runtime: avoid panic on metrics gathering
2024-06-14 11:12:43 +01:00
Pavel Mores
380f8ad03f runtime-rs: add base vCPU hotplugging support
We take advantage of the Inner pattern to enable QemuInner::resize_vcpu()
take `&mut self` which we need to call non-const functions on Qmp.

This runs on Intel architecture but will need to be verified and ported
(if necessary) to other architectures in the future.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Pavel Mores
8231c6c4a3 runtime-rs: instantiate Qmp as (optional) member of QemuInner
The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make
it accessible from QemuInner.  This is fine for now however if the constant
needs to be accessed from additional places in the future we could consider
moving it to somewhere more visible.

The Debug impl for Qmp is empty since first, we don't actually want it,
it's only forced by Hypervisor trait bounds, and second, it doesn't have
anything to display anyway.  If Qmp gets any members in the future that
can be meaningfully displayed they should be handled by Qmp's Debug::fmt().

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Pavel Mores
6fdb262dca runtime-rs: add Qmp object to encapsulate QMP functionality
The constructor handles QMP connection initialisation, too, so there can
be non-functional Qmp instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-06-14 10:13:32 +02:00
Manuel Huber
62fd84dfd8 build: allow rootfs builds w/o git or VERSION file deps
We set the VERSION variable consistently across Makefiles to
'unknown'  if the file is empty or not present.
We also use git commands consistently for calculating the COMMIT,
COMMIT_NO variables, not erroring out when building outside of
a git repository.
In create_summary_file we also account for a missing/empty VERSION
file.
This makes e.g. the UVM build process in an environment where we
build outside of git with a minimal/reduced set of files smoother.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2024-06-13 22:46:52 +00:00
Mike Frisch
c2f61b0fe3 runtime: spelling fixes
Minor spelling fixes in runtime log messages.

Signed-off-by: Mike Frisch <mikef17@gmail.com>
2024-06-13 12:11:34 -04:00
Greg Kurz
b85b1c1058
Merge pull request #9790 from gkurz/kill-some-dead-runtime-code
Kill some dead runtime code
2024-06-13 15:45:51 +02:00
gaohuatao
4cb4e44234 runtime-rs: fix the bug of func count_files
When the total number of files observed is greater than limit, return -1 directly.
runtime has fixed this bug, it should b ported to runtime-rs.

Fixes:#9829

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2024-06-13 16:02:33 +08:00
Fupan Li
cd68ef372f sandbox: fix the issue of double initial_size_manager config
It shouldn't call the initial_size_manager's setup_config
in the load_config since it had been called in the sandbox's
try_init function.

Fixes: #9778

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-06-13 15:44:51 +08:00
Fupan Li
61687992f4 sandbox: fix the issue of failed to get the vmm master tid
For kata container, the container's pid is meaning less to
containerd/crio since the container's pid is belonged to VM,
and containerd/crio couldn't use it. Thus we just return any
tid of kata shim or hypervisor. But since the hypervisor had
been stopped before deleting the container, and it wouldn't
get the hypervisor's tid for some supported hypervisor, thus
we'd better to return the kata shim's pid instead of hypervisor's
tid.

Fixes: #9777

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-06-13 10:27:04 +08:00
Fabiano Fidêncio
56423cbbfe
Merge pull request #9706 from burgerdev/burgerdev/genpolicy-devices
genpolicy: add support for devices
2024-06-12 23:03:41 +02:00
Fabiano Fidêncio
3a0247ed43
Merge pull request #9819 from stevenhorsman/config-envvar-precedence
agent: config: Ensure envs take precedence
2024-06-12 11:26:02 +02:00
Julien Ropé
9c86eb1d35 runtime: avoid panic on metrics gathering
While running with a remote hypervisor, whenever kata-monitor tries to access
metrics from the shim, the shim does a "panic" and no metric can be gathered.

The function GetVirtioFsPid() is called on metrics gathering, and had a call
to "panic()". Since there is no virtiofs process for remote hypervisor, the
right implementation is to return nil. The caller expects that, and will skip
metrics gathering for virtiofs.

Fixes: #9826

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-06-12 10:02:44 +02:00
Xuewei Niu
92cc5e0adb
Merge pull request #9781 from gaohuatao-1/ght/shm 2024-06-12 12:39:28 +08:00
Moritz Sanft
84903c898c
genpolicy: fix settings path flag name
This corrects the warning to point to the \`-j\` flag,
which is the correct flag for the JSON settings file.
Previously, the warning was confusing, as it pointed to
the \`-p\` flag, which specifies to the path for the Rego ruleset.

Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
2024-06-11 21:17:18 +02:00
Greg Kurz
1acf8d0c35 govmm: Drop QEMU's NoShutdown knob
Code is not used.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Greg Kurz
cb5b548ad7 govmm: Drop QEMU's Daemonize knob
Code isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Greg Kurz
33eaf69d5f virtcontainers: Drop QEMU's Daemonize knob
QEMU isn't started as daemon anymore and this won't change (see #5736
for details). Drop the related code.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-06-11 19:55:54 +02:00
Dan Mihai
d47f40210a
Merge pull request #9808 from microsoft/saulparedes/oci_from_settings
genpolicy: load OCI version from settings
2024-06-11 10:42:04 -07:00
Saul Paredes
3e9d6c11a1 genpolicy: add back support for insecure
registries

Adding back changes from
77540503f9.

Fixes: #9008

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-11 09:42:23 -07:00
Bo Chen
2398442c58 runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v39.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8694, #9574

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-06-11 09:42:17 -07:00
stevenhorsman
40e02b34cb agent: config: Ensure envs take precedence
- Update the config parsing logic so that when reading from the
agent-config.toml file any envs are still processed
- Add units tests to formalise that the envs take precedence over values
from the command line and the config file

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-11 16:31:10 +01:00
Steve Horsman
59ff40f054
Merge pull request #9811 from mkulke/mkulke/use-kebabcase-for-enum-values-in-config-file-parsing
agent: convert enum vals to kebab-case in cfg file
2024-06-11 14:49:30 +01:00
gaohuatao
638e9acf89 runtime: fix the bug of func countFiles
When the total number of files observed is greater than limit, return (-1, err).
When the returned err is not nil, the func countFiles should return -1.

Fixes:#9780

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2024-06-11 18:17:18 +08:00
Alex Lyn
1c8db85d54
Merge pull request #9784 from Apokleos/bufix-testcases
kata-types: fix bug in kata-types several test cases
2024-06-11 10:01:45 +08:00
Saul Paredes
6a84562c16 genpolicy: load OCI version from settings
Load OCI version from genpolicy-settings.json and validate it in
rules.rego

Fixes: #9593

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-06-10 15:30:39 -07:00
Magnus Kulke
abc704a720 agent: convert enum vals to kebab-case in cfg file
fixes #9810

Add an annotation to the enum values in the agent config that will
deserialize them using a kebab-case conversion, aligning the behaviour
to parsing of params specified via kernel cmdline.

drive-by fix: add config override for guest_component_procs variable

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-06-10 21:55:05 +02:00
Alex Lyn
27685c91e5 kata-types: fix bug in kata-types several test cases
(1) As mis-use of cap.set causing previous Caps lost which
causing assert! failed, just replacing cap.set with cap.add.

(2) It will return error if there's no such name setting when
do update_config_by_annotation {
    ...
if config.runtime.name.is_empty() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "Runtime name is missing in the configuration",
            ));
        }
...
}

Fixes #9783

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-06-07 09:16:23 +08:00
Niteesh Dubey
62d3d7c58f runtime: enable kernel-hashes for SNP confidential container
This is required to provide the hashes of kernel, initrd and cmdline
needed during the attestation of the coco.

Fixes: #9150

Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
2024-06-05 15:02:02 +00:00
Fabiano Fidêncio
138ef2c55f
Merge pull request #9678 from AdithyaKrishnan/main
TEEs: Skip a few CI tests for SEV/SNP
2024-06-04 23:42:51 +02:00
GabyCT
6c2e8bed77
Merge pull request #9725 from 3u13r/feat/genpolicy/filter-by-runtime
genpolicy: add ability to filter for runtimeClassName
2024-06-04 10:06:14 -06:00
Wainer Moschetta
b5561074c3
Merge pull request #9377 from beraldoleal/yqbump
deps: bumping yq to v4.40.7
2024-06-03 14:34:58 -03:00
Ryan Savino
6db08ed620 runtime: sev: snp: Use shared_fs=none
Disabling 9p for SEV and SNP TEEs.

Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
2024-06-03 01:14:16 -05:00
Markus Rudy
13310587ed genpolicy: check requested devices
CreateContainerRequest objects can specify devices to be created inside
the guest VM. This change ensures that requested devices have a
corresponding entry in the PodSpec.

Devices that are added to the pod dynamically, for example via the
Device Plugin architecture, can be allowlisted globally by adding their
definition to the settings file.

Fixes: #9651
Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-05-31 22:05:49 +02:00
Markus Rudy
ea578f0a80 genpolicy: add support for VolumeDevices
This adds structs and fields required to parse PodSpecs with
VolumeDevices and PVCs with non-default VolumeModes.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2024-05-31 19:34:14 +02:00
Beraldo Leal
c99ba42d62 deps: bumping yq to v4.40.7
Since yq frequently updates, let's upgrade to a version from February to
bypass potential issues with versions 4.41-4.43 for now. We can always
upgrade to the newest version if necessary.

Fixes #9354
Depends-on:github.com/kata-containers/tests#5818

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Beraldo Leal
4f6732595d ci: skip go version check
golang.mk is not ready to deal with non GOPATH installs. This is
breaking test on s390x.

Since previous steps here are installing go and yq our way, we could
skip this aditional check. A full refactor to golang.mk would be needed
to work with different paths.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Magnus Kulke
9f04dc4c8b agent: introduce config for coco attestion procs
fixes #9748

A configuration option `guest_component_procs` has been introduced that
indicates which guest component processes are supposed to be spawned by
the agent. The default behaviour remains that all of those processes are
actively spawned by the agent. At the moment this is based on presence
of binaries in the rootfs and the guest_component_api_rest option.

The new option is incremental:

none -> attestation-agent -> confidential-data-hub -> api-server-rest

e.g. api-server-rest implies attestation-agent and confidential-data-hub

the `none` option has been removed from guest_component_api_rest, since
this is addresses by the introduced option.

To not change expected behaviour for  non-coco guests we still will still
only attempt to spawn the processes if the requested attestation binaries
are present on the rootfs, and issue in warning in those cases.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-05-31 12:15:41 +02:00
Leonard Cohnen
1d1690e2a4 genpolicy: add ability to filter for runtimeClassName
Add the CLI flag --runtime-class-names, which is used during
policy generation. For resources that can define a
runtimeClassName (e.g., Pods, Deployments, ReplicaSets,...)
the value must have any of the --runtime-class-names as
prefix, otherwise the resource is ignored.

This allows to run genpolicy on larger yaml
files defining many different resources and only generating
a policy for resources which will be deployed in a
confidential context.

Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
2024-05-31 03:17:02 +02:00
Greg Kurz
b3cb19b6a7
Merge pull request #9639 from emanuellima1/rng-impl
runtime-rs: Add RNG to QEMU cmdline
2024-05-30 12:00:11 +02:00
Emanuel Lima
138d985c64
runtime-rs: Add RNG to QEMU cmdline
It creates this line, as the Golang runtime does:
-object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-05-29 13:11:00 -03:00
Zvonko Kaiser
d4832b3b74 vfio: Fix hotpunplug
We need to remove the device from the tracking map, a container
restart will increment the bus index and we will get out of root-ports
and crash the machine.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-28 07:37:30 +00:00
Zvonko Kaiser
4c93bb2d61 qemu: Add CDI device handling for any container type
We need special handling for pod_sandbox, pod_container and
single_container how and when to inject CDI devices

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-27 10:13:01 +00:00
Zvonko Kaiser
c7b41361b2 gpu: reintroduce pcie_root_port and add pcie_switch_port
In Kubernetes we still do not have proper VM sizing
at sandbox creation level. This KEP tries to mitigates
that: kubernetes/enhancements#4113 but this can take
some time until Kube and containerd or other runtimes
have those changes rolled out.

Before we used a static config of VFIO ports, and we
introduced CDI support which needs a patched contianerd.
We want to eliminate the patched continerd in the GPU case
as well.

Fixes: #8860

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-05-27 10:13:01 +00:00
Fupan Li
6f6a164451
Merge pull request #9268 from zvonkok/kata-agent-createcontainer
kata-agent: CreateContainer Hook
2024-05-27 16:36:22 +08:00
Alex Lyn
713c929a64
Merge pull request #9656 from pmores/document-qemu-rs-conventions
runtime-rs: document architecture & implementation conventions in qem…
2024-05-27 10:38:58 +08:00
Xuewei Niu
bb7a1c56e9
Merge pull request #9693 from sidneychang/9690/Adjust-indentation 2024-05-27 00:20:34 +08:00
Alex Lyn
55dbf6121a
Merge pull request #9604 from Apokleos/qmp-cmdline01
runtime-rs: add QMP support for Qemu(part I)
2024-05-26 20:22:59 +08:00
Alex Lyn
028b10ce7a
Merge pull request #9687 from l8huang/vfio-pci-gk
agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device
2024-05-26 17:48:25 +08:00
Steve Horsman
b89c3e35dd
Merge pull request #9583 from cncal/update_check_error_message
runtime: make kata-runtime check error more understandable when /dev/kvm doesn't exist
2024-05-24 17:49:43 +01:00
Alex Lyn
41fb7aeb89 runtime-rs: add QMP params suppport in cmdline
Fixes: #9603

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-05-24 22:16:24 +08:00
Alex Lyn
7ed6c6896b runtime-rs: add an option dbg_monitor_socket for HMP support
This option allows to add a debug monitor socket when
`enable_debug = true` to control QEMU within debugging case.

Fixes: #9603

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-05-24 22:16:17 +08:00
Lei Huang
3624573b12 agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device
The `update_env_pci()` function need the PCI address mapping to
translate the host PCI address to guest PCI address in below
environment variables:
- PCIDEVICE_<prefix>_<resource-name>_INFO
- PCIDEVICE_<prefix>_<resource-name>

So collect PCI address mapping for both vfio-pci-gk and
vfio-pci devices.

Fixes #9614

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-05-23 21:20:01 -07:00
Fupan Li
d73876252e
Merge pull request #9690 from justxuewei/agent-timeout
runtime-rs: Remove obsoleted dial_timeout config
2024-05-24 10:31:12 +08:00
Zvonko Kaiser
3affd83e14
Merge pull request #9605 from l8huang/skip-env
kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO
2024-05-23 18:45:00 +02:00
Fabiano Fidêncio
d83cf39ba1
Merge pull request #9680 from kata-containers/dependabot/go_modules/src/runtime/go_modules-5e29427af7
build(deps): bump golang.org/x/net from 0.24.0 to 0.25.0 in /src/runtime in the go_modules group across 1 directory
2024-05-23 12:55:29 +02:00
Fabiano Fidêncio
0e33ecf7fc
Merge pull request #9653 from JakubLedworowski/fixes-9497-ensure-quote-generation-service-is-added-to-qemu-cmd-2
runtime: Enable connection to Quote Generation Service (QGS)
2024-05-22 15:49:23 +02:00
sidneychang
8938f35627 runtime-rs: Adjust indentation in ifneq statements within Makefile.
Replace tab indentation with spaces for the three lines within the ifneq statements, aligning them with the surrounding code.

Fixes:#9692

Signed-off-by: sidneychang <2190206983@qq.com>
2024-05-22 20:24:35 +08:00
Fabiano Fidêncio
94f7bbf253
Merge pull request #9682 from fidencio/topic/allow-increasing-cpus-and-memory-via-annotation-for-tdx
runtime: tdx: Allow default_{cpu,memory} annotations
2024-05-22 12:07:28 +02:00
Xuewei Niu
d31616cec3 runtime-rs: Remove obsoleted dial_timeout config
The `dial_timeout` works fine for Runtime-go, but is obsoleted in
Runtime-rs.

When the pod cannot connect to the Agent upon starting, we need to adjust
the `reconnect_timeout_ms` to increase the number of connection attempts to
the Agent.

Fixes: #9688

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-05-22 17:57:05 +08:00
Jakub Ledworowski
fc680139e5 runtime: Enable connection to Quote Generation Service (QGS)
For the TD attestation to work the connection to QGS on the host is needed.
By default QGS runs on vsock port 4050, but can be modified by the host owner.
Format of the qemu object follows the SocketAddress structure, so it needs to be provided in the JSON format, as in the example below:
-object '{"qom-type":"tdx-guest","id":"tdx","quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}}'

Fixes: #9497
Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>
2024-05-22 11:16:24 +02:00
Alex Lyn
0331859740
Merge pull request #9642 from gkurz/drop-unused-knobs-qemu-rs
runtime-rs: Drop some useless QEMU arguments
2024-05-22 16:13:14 +08:00
Alex Lyn
ce030d1804
Merge pull request #9641 from cmaf/runtime-resize-mem-1
runtime: Add missing check in ResizeMemory for CH
2024-05-22 14:05:30 +08:00
Alex Lyn
b7af00be2a
Merge pull request #9624 from cncal/bugfix_duplicated_devices
runtime: fix duplicated devices requested to the agent
2024-05-22 12:45:46 +08:00
Steve Horsman
f41f642b90
Merge pull request #9635 from kata-containers/dependabot/go_modules/src/runtime/go_modules-f0df977846
build(deps): bump github.com/containerd/containerd from 1.7.11 to 1.7.16 in /src/runtime in the go_modules group across 1 directory
2024-05-21 21:19:32 +01:00
Steve Horsman
9b0ed3dfa7
Merge pull request #9657 from ajaypvictor/remote-hyp-annotations
runtime: Disable number of cpu comparison on remote hypervisor scenario
2024-05-21 21:19:12 +01:00
Lei Huang
b0a91b0d13 kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO
The new version of sriov-network-device-plugin adds an env
`PCIDEVICE_<prefix>_<resource-name>_INFO`, which has a json
value; kata-agent can't parse it as env
`PCIDEVICE_<prefix>_<resource-name>` which has value in format
"DDDD:BB:SS.F".

This change updates env `PCIDEVICE_<prefix>_<resource-name>_INFO`.

Signed-off-by: Lei Huang <leih@nvidia.com>
2024-05-21 10:46:41 -07:00
stevenhorsman
865fa9da15 runtime: Resolve go static-checks failure
Remove `rand.Seed` call to resolve the following failure:
```
rand.Seed is deprecated: As of Go 1.20 there is no reason to call Seed with a random value.
```

The go rand.Seed docs: https://pkg.go.dev/math/rand@go1.20#Seed
back this up and states:
> If Seed is not called, the generator is seeded randomly at program startup.
so I believe we can just delete the call.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 11:08:59 +01:00
Fabiano Fidêncio
abf52420a4 runtime: tdx: Allow default_{cpu,memory} annotations
For now, let's allow the users to set the default_cpu and default_memory
when using TDX, as they may hit issues related to the size of the
container image that must be pulled and unpacked inside the guest,

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-21 10:26:39 +02:00
stevenhorsman
75a201389d runtime: update go version in go.mod
- Make due to us bumping the golang version used in our CI
but `make vendor` fails without the go version in the runtime go.mod
being increased, so update this and run go mod tidy

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-21 09:11:46 +01:00
dependabot[bot]
735185b15c build(deps): bump github.com/containerd/containerd
Bumps the go_modules group with 1 update in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd).


Updates `github.com/containerd/containerd` from 1.7.11 to 1.7.16
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.7.11...v1.7.16)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-21 09:11:46 +01:00
Ajay Victor
abe607b0c7 runtime: Disable number of cpu comparison on remote hypervisor scenario
Fixes https://github.com/kata-containers/kata-containers/issues/9238

Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
2024-05-21 13:34:21 +05:30
dependabot[bot]
01868b2849
---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-20 22:06:41 +00:00
Fabiano Fidêncio
072b929b6f
Merge pull request #9660 from malt3/fix/genpolicy/namespace_empty_string
genpolicy: detect empty string in ns as default
2024-05-20 21:34:13 +02:00
Tim Zhang
857d2bbc8e agent: Fix ctr exec stuck problem
Fixes: #9532

Close stdin when write_stdin receives data of length 0.

Stop call notify_term_close() in close_stdin, because it could
discard stdout unexpectedly.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2024-05-20 14:52:14 +08:00
Fabiano Fidêncio
f2de259387
runtime: tdx: Use shared_fs=none
We shouldn't be using 9p, at all, with TEEs, as off right now we have no
way to ensure the channels are encrypted.  The way to work this around
for now is using guest pull, either with containerd + nydus snapshotter
or with CRI-O; or even tardev snapshotter for pulling on the host (which
is the approach used by MSFT).

This is only done for TDX for now, leaving the generic, AMD, and IBM
related stuff for the folks working on those to switch and debug
possible issues on their environment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:47:09 +02:00
Malte Poll
babdab9078 genpolicy: detect empty string in ns as default
In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace:

- ` ` (namespace field missing)
- `namespace: ""` (namespace field is the empty string)
- `namespace: "default"`(namespace field has the explicit value `default`)

Genpolicy currently does not handle the empty string case correctly.

Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>
2024-05-18 12:44:59 +02:00
Pavel Mores
b9febc4458 runtime-rs: document architecture & implementation conventions in qemu-rs
Implementation of QemuCmdLine has a fairly uniform and repetitive structure
that's guided by a set of conventions.  These conventions have however been
mostly implicit so far, leading to a superfluous and annoying
request/force-push churn during qemu-rs PR reviews.

This commit aims to make things explicit so that contributors can take them
into account before an initial PR submission.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-05-17 12:21:44 +02:00
Chelsea Mafrica
5d2af555da runtime: Add missing check in ResizeMemory for CH
ResizeMemory for Cloud Hypervisor is missing a check for the new
requested memory being greater than the max hotplug size after
alignment. Add the check, and since an earlier check for this
setsrequested memory to the max hotplug size, do the same in the
post-alignment check.

Fixes #9640

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-05-15 11:29:18 -07:00
Greg Kurz
bd6420e0cc runtime-rs: Drop some useless QEMU arguments
All these settings are hardcoded as `false` and result in
no extra options on the QEMU command line, like the go
runtime does. There actually not needed :
- we're never going to ask QEMU to survive a guest shutdown
- we're never going to run QEMU daemonized since it prevents
  log collection
- we're never going to ask QEMU to start with the guest stopped

No need to keep this code around then.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-05-15 18:33:43 +02:00
Greg Kurz
e2117d3b71
Merge pull request #9571 from emanuellima1/fix-impl-rtc
runtime-rs: Fix constructing the RTC struct
2024-05-14 09:17:27 +02:00
cncal
232db2d906 runtime: fix duplicated devices requested to the agent
By default, when a container is created with the `--privileged` flag,
all devices in `/dev` from the host are mounted into the guest. If
there is a block device(e.g. `/dev/dm`) followed by a generic
device(e.g. `/dev/null`),two identical block devices(`/dev/dm`)
would be requested to the kata agent causing the agent to exit with error:

> Conflicting device updates for /dev/dm-2

As the generic device type does not hit any cases defined in `switch`,
the variable `kataDevice` which is defined outside of the loop is still
the value of the previous block device rather than `nil`. Defining `kataDevice`
in the loop fixes this bug.

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-12 16:38:37 +08:00
Emanuel Lima
59c1567f80 runtime-rs: Fix constructing the RTC struct
RTC was being built in a wrong fashion on commit #2bc5e3c6e2ab0145fa9e8be95df0d5086c07a517

RTC was being constructed inside the QemuCmdLine struct,
but it should've been built inside the devices vector.

Signed-off-by: Emanuel Lima <emlima@redhat.com>
2024-05-09 15:00:47 -03:00
Fabiano Fidêncio
77f457c0e1
runtime: tdx: Drop sept-ve-disable=on
This was needed when we were using an old (and not maintained anymore)
host stack.  Considering what we have as part of the distros, Today,
this can simply be dropped, as I cannot find any reference of this one
being needed in any up-to-date documentation.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
416d00228c
Revert "qemu: tdx: Adapt command line" (partially)
This reverts commit b7cccfa019.

The `private=on` bit has never made its way upstream, and was removed
from the latest iteration that we're using.  With that in mind, let's
revert its usage in the code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
1c3037fd25
Revert "govmm: tdx: Expose the private=on|off knob"
This reverts commit 582b5b6b19.

The `private=on` bit has never made its way upstream, and was removed
from the latest iteration that we're using.  With that in mind, let's
revert its addition, and later on its usage in the code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
Fabiano Fidêncio
f48450b360
runtime: config: tdx: Add QEMU / OVMF placeholder var
Let's add the PLACEHOLDER_FOR_DISTRO_{QEMU,OVMF}_WITH_TDX_SUPPORT
variables instead of actually setting a path, so we can easily replace
those as part of our deployment scripts.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-09 07:59:12 +02:00
GabyCT
3b8a910393
Merge pull request #9596 from lifupan/main
db: fix the issue of failed to init pci root bus
2024-05-08 13:14:20 -06:00
Alex Lyn
875e6e3815
Merge pull request #9601 from cncal/fix_redundant_log
qemu: the error is logged only when it occurs
2024-05-08 08:59:01 +08:00
GabyCT
22087f9db9
Merge pull request #9598 from lifupan/main_shim
runtime-rs: fix the issue of the leak of dead shim
2024-05-07 10:14:11 -06:00
GabyCT
a564422b7b
Merge pull request #9582 from cncal/main
build: fix the confusing build message if yq doesn't exist in GOPATH/bin
2024-05-07 09:34:27 -06:00
Fabiano Fidêncio
ddf6b367c7
Merge pull request #9568 from kata-containers/dependabot/go_modules/src/runtime/go_modules-22ef55fa20
build(deps): bump the go_modules group across 5 directories with 8 updates
2024-05-07 13:14:48 +02:00
cncal
15d511af97 qemu: the error is logged only when it occurs
Everytime I create contianer on arm64 machine, containerd/kata logs a redundant warning
as follows:
``` shell
time="2024-05-07" level=warning msg="<nil>" arch=arm64 name=containerd-shim-v2
pid=xxx sandbox=fdd1f05 source=virtcontainers/hypervisor
```
I added an error statement so that the error would be logged when it occurs.

Signed-off-by: cncal <flycalvin@qq.com>
2024-05-07 14:28:04 +08:00
Fupan Li
3694f3d9fe runtime-rs: fix the issue of the leak of dead shim
We should init and asign the runtime instance to runtime
handler, otherwise, if the pause container failed to start,
which means the runtime instance failed to start, then the
following delete & shutdown request wouldn't be run, thus
the dead shim would be left.

Fixes: #9597

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-05-06 17:31:31 +08:00
Fupan Li
26bee78e8d db: fix the issue of failed to init pci root bus
dragonball reserves 2048G of mmio space for the pci root bus by default
on physical addresses greater than 4G. However, for some machines with
smaller physical address widths, such as 39-bit wide physical addresses,
dragonball reserves the mmio space when initializing the memory. It is
less than 2048G, so this commit dynamically calculates and allocates the
mmio size of each pci root bus.

Fixes: #9509

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-05-06 11:34:18 +08:00
Aurélien Bombo
0cc2b07a8c tests: adapt Mariner CI to unblock CH v39 upgrade
The CH v39 upgrade in #9575 is currently blocked because of a bug in the
Mariner host kernel. To address this, we temporarily tweak the Mariner
CI to use an Ubuntu host and the Kata guest kernel, while retaining the
Mariner initrd. This is tracked in #9594.

Importantly, this allows us to preserve CI for genpolicy. We had to
tweak the default rules.rego however, as the OCI version is now
different in the Ubuntu host. This is tracked in #9593.

This change has been tested together with CH v39 in #9588.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-05-03 16:29:12 +00:00