Commit Graph

14877 Commits

Author SHA1 Message Date
Gabriela Cervantes
fdaf12d16c metrics: Remove unused remove img var in common script
This PR removes the remove_img variable in the metrics common script
as it is not being used.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-11 17:45:18 +00:00
Gabriela Cervantes
04d1122a46 tests: Decrease iterations in soak test
This PR decreases the number of iterations in the kubernetes soak test
as this is already taking more than 2 hours for the kata coco ci
stability.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-11 17:39:06 +00:00
Gabriela Cervantes
c48c6f974e tests: Enable stressng k8s stability test for Kata CoCo CI
This PR enables the stressng k8s stability test for Kata CoCo CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-11 17:38:13 +00:00
Alex Man
7e400f7bb2 agent: Fix CPU usage reporting for cgroup v2 in kata-agent
kata-agent incorrectly reports CPU time for cgroup v2, causing 1000x underreporting.

For cgroup v2, kata-agent reads the cpu.stat file, which reports the time consumed by the processes in the cgroup in µs.
However, there was a bug in kata-agent where it returned this value in µs without converting it to ns.

This commit adds the necessary µs to ns conversion for cgroup v2, aligning it with v1 behavior and kata-shim's expectations.

This fixes #10278

Signed-off-by: Alex Man <alexman@stripe.com>
2024-09-11 10:29:03 -07:00
Fabiano Fidêncio
1178fe20e9 tests: Adapt error parser for failed image decryption
With an older version of image-rs, we were getting the following error:
```
       Message:   failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key no suitable key found for decrypting layer key:
```

However, with the version of image-rs we are bumping to, the error comes
as:
```
       Message:   failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key

 Caused by:
     no suitable key found for decrypting layer key:
      keyprovider: failed to unwrap key by ttrpc
```

Due to this change, I'm splitting the check in two different ones.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 17:07:56 +02:00
Dan Mihai
66dda37877
Merge pull request #10271 from Sumynwa/sumsharma/agent_ctl_issue_9689_local
agent-ctl: Refactor CopyFile Handler
2024-09-11 07:35:09 -07:00
Fabiano Fidêncio
f6cfc33314
Merge pull request #10292 from fidencio/topic/ci-tdx-adapt-how-we-get-the-host-ip
ci: tdx: Adapt how we get the host IP
2024-09-11 14:42:22 +02:00
Fabiano Fidêncio
e2200f0690
versions: trustee: Update to a version that supports ITA
ITA stands for Intel Trust Authority, which is in the process to being
renamed to ITTS (Intel Tiber Trust Services).

Proper ITA / ITTS support on Trustee was finished as part of:
* 6f767fa15f

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 13:39:35 +02:00
Fabiano Fidêncio
d3e3ee7755
versions: guest-components: Update to a version that supports ITA
ITA stands for Intel Trust Authority, which is in the process to being
renamed to ITTS (Intel Tiber Trust Services).

As we've bumped guest-components on trustee, let's make sure we also
bump image-rs to the commit that brings ITA support in:
* https://github.com/confidential-containers/guest-components/commit/1db6c3a87665dde58d0efa56f4e4af5fc

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 13:36:56 +02:00
Fabiano Fidêncio
f94d80783d
agent: image-rs: Update to a version that supports ITA
ITA stands for Intel Trust Authority, which is in the process to being
renamed to ITTS (Intel Tiber Trust Services).

As we've bumped guest-components on trustee, let's make sure we also
bump image-rs to the commit that brings ITA support in:
* 1db6c3a876

The reason we need to bump the dependency here is to avoid kbs_protocol
mismatch between the version used by the agent and the trustee one.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 13:36:46 +02:00
Fabiano Fidêncio
3946aa7283
ci: tdx: Adapt how we get the host IP
In the process of switching the TDX CI machine we've noticed that
`hostname -i` in one of the machines returns an one and only IP address,
while in another machine it returns a full list of IPs.

As we're only interested in the first one, let's adapt the code to
always return the first one.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-11 09:31:43 +02:00
Sumedh Alok Sharma
b4bbbf65c6 ci: Do not start CDH/attestation procs with kata-agent as local process.
Since CDH/attestation related processes and its dependencies are not fully
available, the setup fails to start kata-agent as local process. This
fix removes these procs to prevent kata-agent from trying to start them.

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 11:53:59 +05:30
Sumedh Alok Sharma
8045a7a2ba ci: Install policy document on host to run kata-agent as local process.
The test setup starts kata-agent as a local process without the
UVM. The agent policy initialization fails due to missing policy
document at `/etc/kata-opa/default-policy.rego`. The fix
- installs a relaxed `allow-all.rego` policy document
- cleans up the install during exit

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 11:25:08 +05:30
Sumedh Alok Sharma
822f898433 ci: Install bats as dependencies
Install bats as part of dependencies for running the tests.

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 10:57:15 +05:30
Sumedh Alok Sharma
2c774fb207 ci: Add tests for CopyFile api.
This commit introduces test cases for testing
CopyFile API using kata-agent-ctl with improved command
semantics and handling.
- copy a file to /run/kata-containers
- copy symlink to /run/kata-containers
- copy directory to /run/kata-containers
- copy file to /tmp
- copy large file to /run/kata-containers

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 10:54:01 +05:30
Sumedh Alok Sharma
2af1113426 agent-ctl: Refactor CopyFile handler
In the existing implementation for the CopyFile subcommand,
- cmd line argument list is too long, including various metadata information.
- in case of a regular file, passing the actual data as bytes stream adds to the size and complexity of the input.
- the copy request will fail when the file size exceeds that of the allowed ttrpc max data length limit of 4Mb.

This change refactors the CopyFile handler and modifies the input to a known 'source' 'destination' syntax.

Fixes #9708

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-11 10:54:01 +05:30
Alex Lyn
d0968032f7
Merge pull request #10276 from Apokleos/fix-runtime-cdi
runtime: Fix runtime/cdi panic with assignment to entry in nil map
2024-09-11 09:00:11 +08:00
Alex Lyn
3f541aff4a
Merge pull request #10282 from teawater/dup
runtime-rs: configuration-dragonball.toml.in: Remove duplication
2024-09-10 11:46:40 +08:00
Hui Zhu
dfea12bc53 runtime-rs: configuration-dragonball.toml.in: Remove duplication
Remove duplicated description of enable_balloon_f_reporting from
configuration-dragonball.toml.in.

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2024-09-10 07:34:29 +08:00
David Esparza
6f8897249b
Merge pull request #10277 from GabyCT/topic/fixsk
tests: Increase timeout to wait for soak stability test deployment
2024-09-09 14:07:10 -06:00
Gabriela Cervantes
5a52fe1a75 tests: Increase timeout to wait for soak stability test deployment
This PR increases the timeout to wait that the deployment for the soak
stability test is ready in order to avoid random failures saying that
the deployment is not ready yet.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-09 16:13:40 +00:00
Alex Lyn
1684c1962c runtime: Fix runtime/cdi panic with assignment to entry in nil map
It will panic when users do GPU vfio passthrough with cdi in runtime.
The root cause is that CustomSpec.Annotations is nil when new element
added.
To address this issue, initialization is introduced when it's nil.

Fixes #10266

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-09-09 20:15:10 +08:00
Alex Lyn
f31839af63
Merge pull request #10253 from teawater/enable_balloon_f_reporting
Add support of dragonball virtio-balloon free page reporting
2024-09-09 17:37:52 +08:00
Fabiano Fidêncio
026a4d92a9
Merge pull request #10272 from fidencio/topic/add-tdx-mrconfigid-mrowner-mrownerconfig-support
runtime: qemu: tdx: Add support for setting mrconfigid / mrowner / mrownerconfig
2024-09-08 14:11:30 +02:00
Fabiano Fidêncio
51ee4c381a
Merge pull request #10257 from fidencio/topic/kata-deploy-remove-unused-vars-for-cleanup
kata-deploy: Remove kata-cleanup unneeded vars
2024-09-07 11:27:14 +02:00
Chengyu Zhu
3a37652d01
Merge pull request #10213 from ChengyuZhu6/device
Refine device management for kata-agent
2024-09-07 12:02:32 +08:00
ChengyuZhu6
75816d17f1 agent: switch to new device subsystem
Switch to new device subsystem to handle various devices in kata-agent.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
df55f37dfe agent: Move unit tests about vfio device to vfio_device_handler
Move unit tests about vfio device to vfio_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
41c2d81fd3 agent: Move unit tests about scsi device to scsi_device_handler
Move unit tests about scsi device to scsi_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
f45129cb44 agent: Move unit tests about network device to network_device_handler
Move unit tests about network device to network_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
52203db760 agent: Move unit tests about block device to block_device_handler
Move unit tests about block device to block_device_handler.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
e1afb92a28 agent: Move common unit tests about device
Move common unit tests about device to mod.rs

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:43 +08:00
ChengyuZhu6
25bd04c02a agent: Use DeviceHandlerManager to handle various devices
Use DeviceHandlerManager to handle various devices.

Fixes: #10218

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:45:42 +08:00
ChengyuZhu6
5fc645c869 agent: Move network device code to network_device_handler
Move network device code to network_device_handler to simplify the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
07f104085a agent: Move vfio device code to vfio_device_handler
Move vfio device code to vfio_device_handler to simplify the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
0cb87767ae agent: Move device code with virtio scsi driver to scsi_device_handler
Move scsi device code to scsi_device_handler to simplify the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
0738d75a92 agent: Move device code with nvdimm driver to nvdimm_device_handler
Move device code with nvdimm driver to nvdimm_device_handler, including
nvdimm device and pmem device.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
bbf934161b agent: Move virtio-block device handlers to block_device_handler
Move virtio-block device handlers to block_device_handler to simplify
the code.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
4e33665be8 kata-types: Move device driver constants to kata-types
Move device driver constants and add DeviceHandlerManager type alias.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 09:40:30 +08:00
ChengyuZhu6
0b3ad2f830 kata-types: Replace StorageHandlerManager with type alias
Removed the `StorageHandlerManager` struct and its associated implementations and
introduced a type alias `StorageHandlerManager` for `HandlerManager` to simplify the code.
The new type alias maintains the same functionality while reducing redundancy.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 07:53:31 +08:00
ChengyuZhu6
281f0d7f29 kata-types: Add HandlerManager to manage registered handlers
Introduced `HandlerManager` struct to manage registered handlers, which will be used to storage and device management for kata-agent.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-09-07 07:51:48 +08:00
GabyCT
b05811587e
Merge pull request #10245 from ChengyuZhu6/handler-manager
agent: Refactor storage handler registration
2024-09-06 09:45:39 -06:00
GabyCT
37ddb837c4
Merge pull request #10267 from GabyCT/topic/updatemlcomments
metrics: Update openVINO and oneDNN tests references
2024-09-06 09:42:21 -06:00
Fabiano Fidêncio
65a4562050 runtime: qemu: tdx: Add omitempty to QuoteGenerationSocket
I know right now we're always passing a value for that, but this doesn't
really have to be set unless attestation is used.  Thus, let's also omit
it in case it's empty.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 15:05:55 +02:00
Fabiano Fidêncio
7818484120 runtime: qemu: tdx: Support mrconfigid / mrowner/ mrownerconfig
This is a quick and simple pre-req for supporting initData, which will
take advantage of the mrconfigid in the TDX case.

While already adding mrconfigid, which is hardcoded empty right now,
let's do the same for mrowner and mrownerconfig, and leave it prepared
for future expansions.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 15:05:54 +02:00
Fabiano Fidêncio
8285957678 runtime: qemu: Rename prepareObjectWithTDXQgs to prepareTDXObject
The reason we're relying on yet another function to do so is because the
TDX object will be used in its qom / qapi json format.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-06 14:36:09 +02:00
Fabiano Fidêncio
29ce2205a1
Merge pull request #10268 from microsoft/saulparedes/pdb-support
genpolicy: add support for PodDisruptionBudget yaml
2024-09-06 09:53:36 +02:00
Dan Mihai
1885478e2e
Merge pull request #10270 from Sumynwa/sumsharma/enable_agent_tests_in_ci
ci: Enable kata agent API tests
2024-09-05 14:24:49 -07:00
Archana Choudhary
f2625b0014 genpolicy: add support for PodDisruptionBudget
yaml

Prevent panic for PDB specs

Signed-off-by: Archana Choudhary <archana1@microsoft.com>
2024-09-05 11:33:47 -07:00
Sumedh Alok Sharma
e1ac2f4416 ci: Enable kata agent api tests
This commit enables running tests for kata agent apis.
The 'api-tests' directory will contain bats test files for
individual APIs.

Fixes #10269

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-06 00:02:55 +05:30