Commit Graph

13492 Commits

Author SHA1 Message Date
Archana Shinde
87f0097b18 docs: Document Intel Discrete GPUs usage with Kata
This document describes the steps needed to pass an entire Intel Discrete GPU
as well as a GPU SR-IOV interface to a Kata Container.

Fixes: #9083

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-04-16 11:50:02 -07:00
Dan Mihai
2c4d1ef76b tests: k8s: inject agent policy failures (part 3)
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.

These test cases are using K8s Pods. Additional policy failures
are injected during CI using other types of K8s resources - e.g.,
using Jobs and Replication Controllers - from separate PRs.

Fixes: #9491

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-16 18:15:12 +00:00
Dan Mihai
c26dad8fe5
Merge pull request #9294 from burgerdev/burgerdev/genpolicy-configurable-pause
genpolicy: support insecure registries and custom pause containers
2024-04-16 09:39:33 -07:00
GabyCT
9238daf729
Merge pull request #9464 from microsoft/danmihai1/rc-tests
tests: k8s: inject agent policy failures (part2)
2024-04-16 10:01:39 -06:00
Hyounggyu Choi
d523e865c0 rootfs: Make OPA build working in docker for s390x and ppc64le
This commit makes the OPA build from source work in `ubuntu-rootfs-osbuilder`.
To achieve this, the configuration is changed as follows:

- Switch the make target to `ci-build-linux-static`, which does not trigger a docker-in-docker build
- Install go in the builder image for s390x and ppc64le

Fixes: #9466

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-16 16:49:12 +02:00
Greg Kurz
aca6a1bcb5
Merge pull request #9353 from pmores/pr-8866-follow-up
runtime-rs: refactor qemu driver
2024-04-16 16:07:36 +02:00
Fabiano Fidêncio
7bb5490676
Merge pull request #9479 from wainersm/fix_coco_nontee_jobs
gha: make run-kata-coco-tests inherit secrets
2024-04-16 13:46:52 +02:00
Hyounggyu Choi
7b11fd2546
Merge pull request #9471 from BbolroC/coco-kernel-version-s390x
version: Add coco name and version for {image,initrd} for s390x
2024-04-15 16:03:20 +02:00
Wainer dos Santos Moschetta
77541008fc gha: make run-kata-coco-tests inherit secrets
The new CoCo non-tee job introduced in commit 0d5399ba92 needs to read secrets
like AZ_TENANT_ID, so the run-kata-coco-tests workflow should inherit the secrets from
the caller workflow.

Fixes #9477
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-15 10:53:44 -03:00
Zvonko Kaiser
78e3ebb011 version: add initrd, image NVIDIA sections
Fixes: #9472

For initrd and image, the NVIDIA-related builds will not use the default targets; we will pin them to a specific release.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-04-15 13:31:35 +00:00
Wainer Moschetta
c85e1ca674
Merge pull request #9404 from ldoktor/ci-mcp-timeout
ci.ocp: Increase the MCP update time
2024-04-15 09:42:14 -03:00
Hyounggyu Choi
3ec209dcf1
Merge pull request #9469 from BbolroC/coco-kernel-config-s390x
kernel: Adjust s390x config for confidential containers
2024-04-15 13:55:28 +02:00
Hyounggyu Choi
8fce600493 version: Add coco name and version for {image,initrd} for s390x
In order to build a coco {image,initrd}, it is required to
specify its name and version in versions.yaml. This commit
adds the configuration for each of them.

Fixes: #9470

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-15 12:53:00 +02:00
Hyounggyu Choi
a792dc3e2b kernel: Adjust s390x config for confidential containers
`CONFIG_TN3270_TTY` and `CONFIG_S390_AP_IOMMU` were dropped for s390x
in 6.7.x, which is used for the confidential kernel.
However, they are still used for the vanilla kernel, so we need to add them
to the whitelist.

Fixes: #9465

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-15 10:28:59 +02:00
Hyounggyu Choi
32f58abfde
Merge pull request #9403 from BbolroC/runtime-rs-ci-qemu
CI: Enable GHA cri-containerd workflow for runtime-rs with QEMU
2024-04-15 09:31:25 +02:00
Xuewei Niu
402d8a968e
Merge pull request #9430 from UiPath/fix-agent-shutdown
agent: shutdown vm on exit when agent is used as init process
2024-04-15 10:47:07 +08:00
Wainer Moschetta
0a04f54a8e
Merge pull request #9454 from GabyCT/topic/pulltype
gha: Define unbound PULL TYPE variable
2024-04-12 14:48:56 -03:00
Wainer Moschetta
a0b21d0e14
Merge pull request #9424 from wainersm/cc_guest_pull-encrypted
CC: run guest-pull tests on non-TEE jobs
2024-04-12 09:34:35 -03:00
Hyounggyu Choi
cf20a6a4ae gha: Add qemu-runtime-rs to VMM matrix for run-cri-containerd
This commit expands the VMM matrix for run-cri-containerd,
adding a new item `qemu-runtime-rs` for a test scenario where
the VMM is QEMU and runtime-rs is employed.
This expansion affects the workflows for both x86_64 and s390x platforms.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-12 12:25:53 +02:00
Hyounggyu Choi
606f8e1ab2 runtime-rs: Adjust configuration for qemu-runtime-rs
To make `qemu-runtime-rs` work in CI, we have to rename a configuration
template file and `CONFIG_FILE_QEMU` in the Makefile.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-12 12:25:53 +02:00
Hyounggyu Choi
3c217c6c15 ci|cri-containerd: Introduce qemu-runtime-rs for KATA_HYPERVISOR
`qemu-runtime-rs` will be utilized to handle a test scenario where
the VMM is QEMU and runtime-rs is employed.

Note: Some of the tests are skipped. They are going to be reintegrated in
the follow-up PR (Check out #9375).

Fixes: #9371

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-12 12:25:53 +02:00
Alexandru Matei
9e01732f7a agent: shutdown vm on exit when agent is used as init process
The Linux kernel panics when the init process exits.
The kernel is booted with panic=1, hence this leads to a
vm reboot.
When used as a service, the kata-agent service has an ExecStop
option which does a full sync and shuts down the vm.
This patch mimics this behavior when kata-agent is used as
the init process.

Fixes: #9429

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-04-12 11:32:31 +03:00
Alexandru Matei
54923164b5 clh: isClhRunning waits for full timeout when clh exits
isClhRunning uses signal 0 to test whether the process is
still alive or not. This doesn't work because the process is a
direct child of the shim. Once it is dead the process becomes
a zombie.
Since no one waits for it, the process lingers until
its parent dies and init reaps it. Hence sending signal 0 in
isClhRunning will always return success whether the process is
dead or not.
This patch calls wait to reap the process; if that succeeds, it
is our child process and it has exited; if not, we send the signal.
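
The wait-then-signal check described above can be sketched as follows. This is
a minimal illustration in Python, not the shim's actual code: `is_running` and
`demo` are hypothetical names, and the demo assumes a Linux host where
`os.waitid` and `os.posix_spawn` are available.

```python
import os
import sys


def is_running(pid: int) -> bool:
    """Report whether `pid` is alive, reaping it first if it is our exited child."""
    try:
        # Non-blocking wait: returns (pid, status) if our child has exited,
        # (0, 0) if it is our child and still running.
        reaped_pid, _status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            return False  # the child was dead and has now been reaped
    except ChildProcessError:
        pass  # not our child: fall back to the signal-0 probe
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def demo() -> tuple:
    """Show that signal 0 alone reports a zombie child as alive."""
    pid = os.posix_spawn(sys.executable, [sys.executable, "-c", "pass"], os.environ)
    # Block until the child exits, but leave it unreaped (a zombie).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    try:
        os.kill(pid, 0)
        naive_says_alive = True
    except ProcessLookupError:
        naive_says_alive = False
    return naive_says_alive, is_running(pid)
```

Calling `demo()` spawns a short-lived child, probes it with signal 0 while it
is a zombie, and then applies the wait-first check to the same pid.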

Fixes: #9431

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-04-12 11:31:53 +03:00
Dan Mihai
e51cbdcff9 tests: k8s: inject agent policy failures (part2)
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.

These test cases are using K8s Replication Controllers. Additional
policy failures will be injected using other types of K8s resources
- e.g., using Pods and/or Jobs - in separate PRs.

Fixes: #9463

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-11 21:08:53 +00:00
Markus Rudy
77540503f9 genpolicy: add support for insecure registries
genpolicy is a handy tool to use in CI systems, to prepare workloads
before applying them to the Kubernetes API server. However, many modern
build systems like Bazel or Nix restrict network access, and rightfully
so, so any registry interaction must take place on localhost.
Configuring certificates for localhost is tricky at best, and since
there are no privacy concerns for localhost traffic, genpolicy should
allow contacting some registries insecurely. As this is a runtime
environment detail, not a target environment detail, configuring
insecure registries does not belong in the JSON settings, so it's
implemented as command line flags.
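
As a rough illustration of the idea, the sketch below shows a repeatable
command line flag that selects plain HTTP for listed registries. It is written
in Python and is not genpolicy's code: the flag name `--insecure-registry` and
the helpers `parse_args` and `registry_url` are assumptions made for this
example.

```python
import argparse


def parse_args(argv):
    """Parse a repeatable flag listing registries to contact without TLS."""
    parser = argparse.ArgumentParser(description="sketch of a policy tool CLI")
    # Hypothetical flag name, chosen for illustration only.
    parser.add_argument(
        "--insecure-registry",
        action="append",
        default=[],
        metavar="HOST[:PORT]",
        help="registry to contact over plain HTTP (may be repeated)",
    )
    return parser.parse_args(argv)


def registry_url(image_ref, insecure_registries):
    """Build the registry API base URL for an image reference."""
    # Simplification: treat everything before the first "/" as the registry.
    registry = image_ref.split("/", 1)[0]
    scheme = "http" if registry in insecure_registries else "https"
    return f"{scheme}://{registry}/v2/"
```

For example, parsing `["--insecure-registry", "localhost:5000"]` and passing
the resulting list to `registry_url("localhost:5000/app:v1", ...)` selects the
`http` scheme, while any registry not listed keeps `https`.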

Fixes: #9008

Signed-off-by: Markus Rudy <webmaster@burgerdev.de>
2024-04-11 22:29:03 +02:00
Wainer dos Santos Moschetta
4f74617897 tests: pass --overwrite-existing to aks get-credentials
By passing --overwrite-existing to `aks get-credentials` it will stop
asking if I want to overwrite the existing credentials. This is handy
for running the scripts locally.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta
3508f3a43a tests/k8s: use CoCo image on guest-pull when non-TEE
When running on non-TEE environments (e.g. KATA_HYPERVISOR=qemu) the tests should
be stressing the CoCo image (/opt/kata/share/kata-containers/kata-containers-confidential.img)
although currently the default image/initrd is built to be able to do guest-pull as well.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta
c24f13431d tests/k8s: enable guest-pull tests on non-TEE
Enabled guest-pull tests on non-TEE environments. They now require the SNAPSHOTTER environment
variable so that they do not run on jobs where nydus-snapshotter is not installed.

Fixes: #9410
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta
0d5399ba92 gha: Create CoCo tests jobs on non-TEE
Created the new run-k8s-tests-coco-nontee jobs for running CoCo tests on
non-TEE. It currently generates the run-k8s-tests-coco-nontee(qemu, nydus, guest-pull)
job only to run the guest-pull tests.

Fixes: #9410
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Gabriela Cervantes
5420595d03 tests/k8s: Add uninstall kbs client command function
This PR adds a function to uninstall the kbs client command,
especially for when we are running on baremetal devices.

Fixes #9460

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-11 17:06:11 +00:00
Steve Horsman
6b2d655857
Merge pull request #9457 from justxuewei/fs_manager_tests
agent: Fix the issue with the "test_new_fs_manager" test
2024-04-11 17:02:58 +01:00
Fabiano Fidêncio
5611233ed8
Merge pull request #9439 from microsoft/danmihai1/job-tests
tests: k8s: inject agent policy failures
2024-04-11 17:21:54 +02:00
Markus Rudy
bc2292bc27 genpolicy: make pause container image configurable
CRIs don't always use a pause container, but even if they do the
concrete container choice is not specified. Even if the CRI config can
be tweaked, it's not guaranteed that registries in the public internet
can be reached. To be portable across CRI implementations and
configurations, the genpolicy user needs to be able to configure the
container the tool should append to the policy.

Signed-off-by: Markus Rudy <webmaster@burgerdev.de>
2024-04-11 16:26:35 +02:00
Markus Rudy
8b30fa103f genpolicy: parse json settings during config init
Decouple initialization of the Settings struct from creating the
AgentPolicy struct, so that the settings are available for evaluating,
extending or overriding command line arguments.
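
The decoupling can be pictured with the following sketch, in Python rather
than the tool's own language. All names here (`Settings`, `Config`,
`AgentPolicy`, `init_config`, the `pause_image` key) are hypothetical, and the
rule that a command line value takes precedence over the settings value is
only one possible way of combining the two.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    pause_image: str  # hypothetical settings key


@dataclass
class Config:
    settings: Settings
    pause_image: str  # effective value after combining both sources


@dataclass
class AgentPolicy:
    config: Config


def load_settings(raw_json: str) -> Settings:
    """Parse the JSON settings on their own, before any policy exists."""
    data = json.loads(raw_json)
    return Settings(pause_image=data["pause_image"])


def init_config(raw_json: str, cli_pause_image: Optional[str] = None) -> Config:
    """Load settings during config init so they can be combined with CLI input."""
    settings = load_settings(raw_json)
    # Assumed precedence: an explicit CLI value wins, otherwise use the settings.
    effective = cli_pause_image if cli_pause_image is not None else settings.pause_image
    return Config(settings=settings, pause_image=effective)


def build_policy(config: Config) -> AgentPolicy:
    """The policy is created later from an already-initialized config."""
    return AgentPolicy(config=config)
```

Because `init_config` no longer depends on `AgentPolicy`, the settings can be
inspected before the policy is built.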

Signed-off-by: Markus Rudy <webmaster@burgerdev.de>
2024-04-11 16:17:33 +02:00
Xuewei Niu
50f78ec52c agent: Fix the issue with the "test_new_fs_manager" test
This patch introduces a one-time cpath to mitigate the cgroup residuals. It
might break the device cgroup merging rules when the cgroup has children.

Fixes: #9456

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-04-11 18:06:05 +08:00
GabyCT
08dcdc62de
Merge pull request #9423 from GabyCT/topic/improvecleanup
tests: Improve the kbs_k8s_delete function
2024-04-10 14:28:21 -06:00
Gabriela Cervantes
4a2ee3670f gha: Define unbound PULL TYPE variable
This PR defines the PULL_TYPE variable to avoid unbound variable
failures when this is being tested locally.

Fixes #9453

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-10 17:16:19 +00:00
GabyCT
dab837d71d
Merge pull request #9450 from GabyCT/topic/fixinnydus
gha: Fix indentation in gha run script
2024-04-10 11:07:56 -06:00
David Esparza
9e1368dbc5
Merge pull request #9391 from dborquez/add-onednn-openvino-ml-benchs
add onednn and openvino ml-benchmarks
2024-04-09 19:03:00 -06:00
Dan Mihai
ea31df8bff
Merge pull request #9185 from microsoft/saulparedes/genpolicy_add_containerd_pull
genpolicy: Add optional toggle to pull images using containerd
2024-04-09 12:29:19 -07:00
Gabriela Cervantes
6ebdcf8974 gha: Fix indentation in gha run script
This PR fixes the indentation in the gha run script.

Fixes #9449

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-09 16:37:17 +00:00
Greg Kurz
89353249fc
Merge pull request #8988 from beraldoleal/ci-docs
docs: adding an initial CI documentation
2024-04-09 18:26:15 +02:00
Dan Mihai
2252490a96 tests: k8s: inject agent policy failures
Auto-generate the policy and then simulate attacks from the K8s
control plane by modifying the test yaml files. The policy then
detects and blocks those changes.

These test cases are using K8s Jobs. Additional policy failures
will be injected using other types of K8s resources - e.g., using
Pods and/or Replication Controllers - in future PRs.

Fixes: #9406

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-09 15:36:57 +00:00
David Esparza
facf3c9364
metrics: Add onednn benchmark.
This PR adds an onednn test to exercise additional ML benchmarks.

Onednn is an Intel-optimized library for Deep Neural Networks.

Fixes: #9390

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
3bde511d0d
metrics: Add openvino benchmark.
This PR adds an openvino test in order to exercise additional ML
benchmarks.

OpenVino is used to optimize and deploy deep learning models.

Fixes: #9389

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
b37c5f8ba1
metrics:libs: Add HTTPS and HTTP vars to docker build.
Include the HTTP and HTTPS env variables when building docker
images because they are required to download packages
such as Phoronix.

Added a check that verifies that docker images are built
as root.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
3355dd9e2b
metrics:libs: Adds a function to set new kata configuration.
Adds a function that receives as its single parameter the name of
a valid Kata configuration file, which will be set as
the default kata configuration to start kata containers.

Adds a second function that returns the path to the current
kata configuration file.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
cb4380d1c9
metrics: common: Add function to clean the cache.
The function clears the Page Cache only.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
David Esparza
3a419ba3b1
metrics: common: Add function to update kata config.
Add an extra function that updates the kata config
to use the maximum number of vcpus available and
to use the available memory in the system.
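
A possible shape for such an update is sketched below in Python. It is not the
function added by this commit: the helper names are hypothetical, and it
assumes the configuration exposes `default_vcpus` and `default_memory` (in MiB)
as plain `key = value` lines.

```python
import re


def available_memory_mib(meminfo_text: str) -> int:
    """Parse the MemAvailable line of /proc/meminfo (the value is in kB)."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found")
    return int(match.group(1)) // 1024


def set_max_resources(config_text: str, vcpus: int, memory_mib: int) -> str:
    """Rewrite the default_vcpus and default_memory lines of a config text."""
    config_text = re.sub(
        r"^default_vcpus\s*=.*$",
        f"default_vcpus = {vcpus}",
        config_text,
        flags=re.MULTILINE,
    )
    config_text = re.sub(
        r"^default_memory\s*=.*$",
        f"default_memory = {memory_mib}",
        config_text,
        flags=re.MULTILINE,
    )
    return config_text
```

On a real host the inputs could come from `os.cpu_count()` and the contents of
`/proc/meminfo`; keeping the functions text-based makes them easy to check
without touching an installed configuration.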

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
Beraldo Leal
959e56525c docs: adding an initial CI documentation
This is actually a first attempt to document our CI, and all this
content was based on the document created by Fabiano Fidencio (kudos to
him). We are just moving the content and discussion from Google Docs to
here.

I used the "poetic license" to add some notes on what I believe our CI
will look like in the future.

Fixes #9006

Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-04-09 09:21:47 -04:00