While enabling attestation for IBM SE, it was observed that
the kernel config `CONFIG_S390_UV_UAPI` is missing.
This config is required to expose the ultravisor interface in the guest VM.
This commit adds the missing config.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Now, using vanilla Kubernetes, let's re-enable the TDX CI and hope it
becomes more stable than it used to be.
The cleanup-snapshotter step now takes ~4 minutes, which matches the
other platforms, especially considering there's a total of 210 seconds
of sleep in the process.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Move the `sandbox.agent.setPolicy` call out of the remoteHypervisor
if block, so we can use the policy implementation on peer pods.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
get_vmm_master_tid() currently returns an error with the message "cannot
get qemu pid (though it seems running)" when it finds a valid
QemuInner::qemu_process instance but fails to extract the PID from it.
However, this condition in fact means that a qemu child process was running
(otherwise QemuInner::qemu_process would be None) but isn't anymore (id()
returns None).
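A minimal sketch, with a simplified signature and hypothetical error messages
(not the actual runtime-rs code), of how the two cases can be told apart:
```
// `qemu_process` being Some proves a child was spawned; `id()` returning None
// means that child has already exited, so the error should say so.
fn get_vmm_master_tid(qemu_process: Option<&tokio::process::Child>) -> anyhow::Result<u32> {
    match qemu_process {
        Some(child) => child
            .id()
            .ok_or_else(|| anyhow::anyhow!("qemu process has already exited")),
        None => Err(anyhow::anyhow!("qemu process was never started")),
    }
}
```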
Signed-off-by: Pavel Mores <pmores@redhat.com>
Since Hypervisor::stop_vm() is called from the WaitProcess request handling,
which appears to be per-container, it can be called multiple times during
kata pod shutdown. Currently the function errors out on any subsequent
call after the initial one, since there's no VM to stop anymore. This
commit makes the function tolerate that condition.
While it seems conceivable that the sandbox shouldn't be stopped by WaitProcess
handling, and the right fix would then have to happen elsewhere, this
commit at least makes the qemu driver's behaviour consistent with the other
hypervisor drivers in runtime-rs.
We also slightly improve the error message for the case where there's no
QemuInner::qemu_process instance.
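A minimal sketch, with simplified types and no claim to match the actual
driver code, of what tolerating an already-stopped VM looks like:
```
// Hypothetical, stripped-down version of the qemu driver state; the real
// QemuInner holds much more than the child process handle.
struct QemuInner {
    qemu_process: Option<tokio::process::Child>,
}

impl QemuInner {
    async fn stop_vm(&mut self) -> anyhow::Result<()> {
        match self.qemu_process.take() {
            Some(mut child) => {
                // First call: the VM is still up, shut it down.
                child.kill().await?;
                Ok(())
            }
            // Subsequent calls: nothing left to stop, succeed instead of erroring.
            None => Ok(()),
        }
    }
}
```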
Signed-off-by: Pavel Mores <pmores@redhat.com>
We've noticed a bunch of issues related to deploying and deleting the
nydus-snapshotter. As we don't see the same issues on other machines
using vanilla Kubernetes, let's avoid using k3s for now and follow the
flow used by the other machines.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR fixes the variable names for the network that was created, and also
removes the networks created for the tests, to ensure a clean environment
when running all the tests and to avoid failures, especially on bare metal
environments where the network already exists.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Reject CreateContainerRequest field values that are not tested by
Kata CI and that might impact the confidentiality of CoCo Guests.
This change uses a "better safe than sorry" approach to untested
fields. It is very possible that in the future we'll encounter
reasonable use cases that will either:
- Show that some of these fields are benign and don't have to be
verified by Policy, or
- Show that Policy should verify legitimate values of these fields
These are the new CreateContainerRequest Policy rules:
count(input.shared_mounts) == 0
is_null(input.string_user)
i_oci := input.OCI
is_null(i_oci.Hooks)
is_null(i_oci.Linux.Seccomp)
is_null(i_oci.Solaris)
is_null(i_oci.Windows)
i_linux := i_oci.Linux
count(i_linux.GIDMappings) == 0
count(i_linux.MountLabel) == 0
count(i_linux.Resources.Devices) == 0
count(i_linux.RootfsPropagation) == 0
count(i_linux.UIDMappings) == 0
is_null(i_linux.IntelRdt)
is_null(i_linux.Resources.BlockIO)
is_null(i_linux.Resources.Network)
is_null(i_linux.Resources.Pids)
is_null(i_linux.Seccomp)
i_linux.Sysctl == {}
i_process := i_oci.Process
count(i_process.SelinuxLabel) == 0
count(i_process.User.Username) == 0
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
This PR will avoid the failures when collecting artifacts for the GHA workflows.
This will ensure that we collect and archive the system's data for the
purpose of debugging.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
TDX CI has been having some issues with the Nydus snapshotter cleanup,
which every now and then gets stuck for hours.
With this in mind, let's disable the TDX CI, so we avoid it blocking the
progress of the Kata Containers project, and re-enable it as soon as the
issue is solved on Intel's side.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The following error was observed during the deployment of nydus snapshotter:
```
Error from server (NotFound):
the server could not find the requested resource ( pods/log nydus-snapshotter-5v82v)
'kubectl logs nydus-snapshotter-5v82v -n nydus-system' failed after 3 tries
Error: Process completed with exit code 1.
```
This error can occur when a pod is re-created by a daemonset during the retry interval.
This commit addresses the issue by using `--selector` rather than the pod name
for `kubectl logs/describe`.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
To build the build-kata-deploy image, ci/install_yq.sh should be copied to
tools/packaging/kata-deploy/local-build/dockerbuild, as this script installs
yq within the image. Currently, if
tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh exists then
make won't copy it again. This can cause problems as, for example, the current
update of the yq version (commit c99ba42d) in ci/install_yq.sh won't force the
rebuild of the build-kata-deploy image.
Note: this isn't a problem on a fresh dev or CI environment.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
- Bump tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019
If possible it would be good to add the many runtime-rs crates into the
runtime-rs workspace and provide a centralised version, to avoid having to
update it in many places.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Run `cargo update -p rustjail` to pick up rustjail's bump of
tokio to 1.38.0 to fix the security vulnerability
https://rustsec.org/advisories/RUSTSEC-2024-0019
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Fix the lint error:
```
error: you seem to use `.enumerate()` and immediately discard the index
--> src/device_manager/mod.rs:427:33
|
427 | for (_index, device) in self.virtio_devices.iter().enumerate() {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
by removing the unnecessary `enumerate()` call.
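A small illustrative sketch of the resulting pattern, with stand-in names
rather than the actual dragonball code:
```
// Iterate directly instead of enumerating and discarding the index.
fn attach_all(virtio_devices: &[String]) {
    // Before: for (_index, device) in virtio_devices.iter().enumerate() { ... }
    for device in virtio_devices.iter() {
        println!("attaching {device}");
    }
}
```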
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The CI failed with:
```
error: use of `or_insert_with` to construct default value
--> src/address_space_manager.rs:650:14
|
650 | .or_insert_with(NumaNode::new);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `or_default()`
|
```
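The fix is to follow the lint's suggestion and use `or_default()`. Below is a
small illustrative sketch; the `NumaNode` here is a stand-in type that merely
shares the relevant property (it implements `Default`), not the real
address-space-manager type:
```
use std::collections::HashMap;

// Because NumaNode implements Default, or_default() inserts the same value
// that or_insert_with(NumaNode::new) did.
#[derive(Default)]
struct NumaNode {
    cpus: Vec<u32>,
}

fn node_for(nodes: &mut HashMap<u32, NumaNode>, id: u32) -> &mut NumaNode {
    // Before: nodes.entry(id).or_insert_with(NumaNode::new)
    nodes.entry(id).or_default()
}
```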
Signed-off-by: stevenhorsman <steven@uk.ibm.com>