This will help us on two fronts:
* catching possible issues related to kata-deploy cleanup
* doing more (like, in the future, collecting logs) after the tests run
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
kata-debug is a tool used as part of the Kata Containers CI to gather
information from the node, to help debug issues with Kata
Containers.
As one can imagine, this can be expanded and used outside of the CI context,
and any contribution back to the script is very much welcome.
The resulting container is stored at the [Kata Containers quay.io
space](https://quay.io/repository/kata-containers/kata-debug) and can
be used as shown below:
```sh
kubectl debug $NODE_NAME -it --image=quay.io/kata-containers/kata-debug:latest
```
Fixes: #7397
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to get the full path of the `versions.yaml` file in the
merge-builds.sh script, as we do a `pushd` there, and that leads to a
failure when merging the artefacts because the `versions.yaml` file does
not exist in that path.
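
A minimal sketch of the idea, not the actual merge-builds.sh code: resolve the
file to an absolute path before the `pushd`, so the path stays valid after the
directory change. The helper names `resolve_path` and `copy_versions_file` are
hypothetical and only illustrate the pattern.

```sh
#!/usr/bin/env bash

# Hypothetical helper: print the absolute path of a file whose directory exists.
resolve_path() {
    local path="$1"
    local dir
    dir="$(cd "$(dirname "${path}")" && pwd)" || return 1
    printf '%s/%s\n' "${dir}" "$(basename "${path}")"
}

# Hypothetical merge step: resolve the path first, then change directory.
# After the pushd, a relative "versions.yaml" would no longer point at the file.
copy_versions_file() {
    local versions_yaml
    local work_dir="$2"
    versions_yaml="$(resolve_path "$1")" || return 1
    pushd "${work_dir}" >/dev/null || return 1
    cp "${versions_yaml}" ./versions.yaml
    local rc=$?
    popd >/dev/null || return 1
    return "${rc}"
}
```

Called as `copy_versions_file versions.yaml /some/build/dir`, the copy works
regardless of the directory the script was started from.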
Fixes: #7405
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make it simpler to figure out which version of Kata
Containers has been deployed, and also which artefacts come with it.
This will help us immensely in the future, for the TEEs use case, so we
can easily know whether we can deploy a specific guest kernel for a
specific host kernel.
Fixes: #7394
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Although this file is far from being an SBOM, it'll help folks
easily visualise which components are part of a release, and even have
SBOMs generated from that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We haven't been using or shipping this kernel for a very long time.
Regardless, we're leaving behind the logic in the kernel scripts to
build it, in case it becomes necessary in the future.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds the TensorFlow function to the gha-run script so that it
can be triggered in GHA.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR enables the TensorFlow benchmark on gha for the kata metrics CI.
Fixes: #7362
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the `function` keyword before the function definitions in the
memory inside container script, for uniformity across the script.
Fixes: #7386
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR kills the hypervisor and the kata shim in the
init_env stage prior to launching any metrics test.
Additionally, this PR adds info messages in the main blocks
of the blogbench test to help with debugging.
Fixes: #7366
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR adds the C-Ray performance test so that it becomes part of the kata
metrics CI.
Fixes: #7375
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We don't need to export KUBECONFIG there. Let's just make sure we have
the server correctly set up and avoid doing that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
CC-GPU seems to have issues with v6.1, so downgrade the kernels used for
SEV-SNP to a known-working version. It is worth mentioning that TDX is also
still on 5.19.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Now that we have a new TDX machine plugged into our CI, let's re-enable
the TDX tests.
Fixes: #7368
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Kernel v6.1.38 is the current latest LTS version, so switch to it. No
patches should be necessary. Some CONFIG options have been removed:
- CONFIG_MEMCG_SWAP is covered by CONFIG_SWAP and CONFIG_MEMCG
- CONFIG_ARCH_RANDOM is unconditionally compiled in
- CONFIG_ARM64_CRYPTO is covered by CONFIG_CRYPTO and ARCH=arm64
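
As a rough illustration only, the list above can be turned into a check that
flags leftover references in a kernel config fragment. The function name
`find_removed_options` and the single-file fragment layout are assumptions made
for this sketch, not part of the kernel build scripts.

```sh
#!/usr/bin/env bash

# Options named above as no longer needed with v6.1.
REMOVED_OPTIONS="CONFIG_MEMCG_SWAP CONFIG_ARCH_RANDOM CONFIG_ARM64_CRYPTO"

# Hypothetical helper: print every line of a config fragment that still sets
# (or explicitly unsets) one of the removed options.
find_removed_options() {
    local fragment="$1"
    local opt
    for opt in ${REMOVED_OPTIONS}; do
        # Match "CONFIG_FOO=..." and "# CONFIG_FOO is not set", but not
        # longer names that merely start with CONFIG_FOO.
        grep -E "^(# )?${opt}(=| is not set)" "${fragment}" || true
    done
}
```

Running `find_removed_options path/to/fragment.conf` prints nothing when the
fragment is already clean.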
Fixes: #6086
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This PR updates the machine learning documentation related to the
TensorFlow and PyTorch benchmarks.
Fixes: #7359
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the TensorFlow MobileNet documentation to the machine
learning README.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Issue #4747 and pull request #4748 address exec hang issues, where the exec
command hangs when a process's stdout is not closed. However, that PR could
cause the exec command not to work as expected, leading to CI failures, and
it was reverted in #7042. This PR resolves the exec hang issues and has
undergone 1000 rounds of testing to verify that it does not cause any CI
failures.
Fixes: #4747
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>