kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-08-15 22:53:43 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	460988c5f7	ci: cache: Remove the script used to cache artefacts on Jenkins That's not needed anymore, as we've switched to using ORAS and an OCI registry to cache the artefacts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:27:55 +02:00
Fabiano Fidêncio	4533a7a416	ci: cache: Also store the ${component} sha256sum This is something that was done by our Jenkins jobs, but that I ended up missing when writing `d0c257b3a7`. Now, let's also add the sha256sum to the cached artefact, and in an upcoming PR (after this one is merged) we will also start checking for that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:25:26 +02:00
Fabiano Fidêncio	eccc76df63	ci: cache: Use the cached artefacts from ORAS In the previous series related to the artefacts we build, we switched from storing the artefacts on Jenkins to storing them in ghcr.io/kata-containers/cached-artefacts/${artefact_name}. Now, let's take advantage of that and actually use the artefacts coming from that "package" (as GitHub calls it). NOTE: One thing that I've noticed we're missing is storing and checking the sha256sum of the artefact. The storing part will be done in a different commit, and checking the sha256sum will be done in a different PR, as we need to ensure those were pushed to the registry before actually biting the bullet and checking for them. Fixes: #7834 -- part 2 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:13:47 +02:00
Jeremi Piotrowski	6f30d00ae7	Merge pull request #7956 from fidencio/topic/ci-reduce-the-machine-size-used ci: Reduce the size of the AKS VMs	2023-09-15 08:49:08 +02:00
Steve Horsman	1b8f3fa9ae	Merge pull request #7957 from fidencio/topic/ci-cache-using-oras-part-1 ci: cache: Allow pushing our artefacts to an OCI registry	2023-09-15 07:45:24 +01:00
Fabiano Fidêncio	094b6b2cf8	ci: k8s: Temporarily disable tests that require a bigger VM instance The list of tests which require a bigger VM instance is: * k8s-number-cpus.bats -- failing on all CIs * k8s-parallel.bats -- only failing on the cbl-mariner CI * k8s-scale-nginx.bats -- only failing on the cbl-mariner CI We'll keep those disabled while we re-work the logic to only run those in a bigger (and more expensive) VM instance. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 01:33:19 +02:00
GabyCT	6fe5cd3bd5	Merge pull request #7937 from GabyCT/topic/iperfbandwidth metrics: Add iperf value for cpu utilization	2023-09-14 16:47:19 -06:00
Fabiano Fidêncio	d0c257b3a7	ci: cache: Push cached artefacts to ghcr.io Let's push the artefacts to ghcr.io and stop relying on jenkins for that. Fixes: #7834 -- part 1 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	108f1b60dd	kata-deploy: Generate latest_{artefact,image_builder} files Right now this is not used, but it'll be used when we start caching the artefacts using ORAS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	be2eb7b378	ci: cache: Install ORAS in the kata-deploy binaries builder container ORAS is the tool which will help us to deal with our artefacts being pushed to and pulled from a container registry. As both the push to and the pull from will be done inside the kata-deploy binaries builder container, we need it installed there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	92fff129fd	ci: k8s: Don't set cpu limit request for k8s-inotify test Without setting the cpu limit / request to 1, we can make this test run in a smaller VM instance without any issue. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 22:03:16 +02:00
Fabiano Fidêncio	faf98c0623	ci: Reduce the size of the AKS VMs We do not need a very powerful machine for our tests, as we're not building anything there. The instance we switched to (Standard_D2s_v5) still has nested virt available, as shown here[0], but has half of the amount of vCPUs / Memory, which should be fine only for running the tests, costing us basically half of the price[1]. [0]: https://learn.microsoft.com/en-us/azure/virtual-machines/dv5-dsv5-series [1]: https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/#pricing Fixes: #7955 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 22:03:16 +02:00
Fabiano Fidêncio	adc18ecdb1	ci: cache: For consistency, read all used env vars Instead of having some of them only being considered if explicitly passed to the script. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 20:24:48 +02:00
Fabiano Fidêncio	c7a851efd7	ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker As the environment variables are now being passed down from the GitHub Actions, let's make sure they're exposed to the container used to build the kata-deploy binaries, and during the build process we'll be able to use those to log in and push the artefacts to the OCI registry, using ORAS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 20:24:48 +02:00
Fabiano Fidêncio	2e8b41f39c	Merge pull request #7954 from fidencio/topic/ci-cache-using-oras-part-0 ci: cache: Export env vars needed to use ORAS	2023-09-14 20:23:55 +02:00
Fabiano Fidêncio	6bd15a85d5	ci: cache: Export env vars needed to use ORAS We do the build of our artefacts inside a container image, and we need to expose some env vars to the container so ORAS can be used there to push the artefacts we want to cache to ghcr.io. The env vars we're exposing are: * ARTEFACT_REGISTRY: The registry where we're going to save the artefacts. * ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as ORAS does not use the same json file used by docker. * ARTEFACT_REGISTRY_PASSWORD: The password to log in to the registry, as ORAS does not use the same json file used by docker. * TARGET_BRANCH: The target branch, which will be part of the tag of the artefact, as we may end up caching the artefacts for both main and stable branches. Fixes: #7834 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 19:36:33 +02:00
Gabriela Cervantes	cd4fd1292a	metrics: Add iperf cpu utilization limit for qemu This PR adds the iperf cpu utilization limit for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-14 17:17:47 +00:00
Gabriela Cervantes	df5cd10ea0	metrics: Add iperf value for cpu utilization This PR adds the iperf value for cpu utilization for kata metrics. Fixes #7936 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-14 16:06:49 +00:00
Jeremi Piotrowski	b54dd8cdf4	Merge pull request #7704 from jepio/vfio-part-1 gha: vfio: Import test script	2023-09-14 16:45:31 +02:00
Jeremi Piotrowski	a96050a7ad	tests: Apply timeout to 'ctr t kill' This task has been observed to hang at times. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	9d93036783	tests/vfio: Bump VM image to Fedora 38 We need a very recent L2 guest kernel to fix all the bugs that occur in nested virtualization. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	faee59b520	tests/vfio: Accept single device in vfio group for CLH cloud hypervisor does not emulate pcie switches or pci bridges, so we need to accept a lonely device. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	df3dc1105c	tests/vfio: Get rid of sync's It is fine to start a VM with the disk image without syncing it as we now run the test in an ephemeral Azure instance. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	7211c3dccc	gha: vfio: Set test timeout to 15m Sometimes the test gets stuck running commands in the container - need to investigate why later. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	1b02f89e4f	packaging: kernel: Enable VIRTIO_IOMMU on x86_64 Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is enabled. We need to add it to the whitelist because dragonball uses kernel v5.10 which restricted VIRTIO_IOMMU to ARM64 only. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	3a1db7a86b	runtime: clh: Support enabling iommu by enabling IOMMU on the default PCI segment. For hotplug to work we need a virtualized iommu and clh exposes one if there is some device or PCI segment that requests it. I would have preferred to add a separate PCI segment for hotplugging vfio devices but unfortunately kata assumes there is only one segment all over the place. See create_pci_root_bus_path(), split_vfio_pci_option() and grep for '0000'. Enabling the IOMMU on the default PCI segment requires enabling IOMMU on every device that is attached to it, which is why it is sprinkled all over the place. CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for that device. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	9f1a42c6cc	tests/vfio: Give commands 30s to execute This is to catch the case of the guest getting stuck. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	b46b0ecf8b	tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms This shouldn't be hiding behind only a qemu check, we need this for clh as well. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	bfc93927fb	runtime: Remove redundant check in checkPCIeConfig There is no way for this branch to be hit, as port is only set when it is different than config.NoPort. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	7c4e73b609	runtime: Add test cases for checkPCIeConfig These test cases show which options are valid for CLH/Qemu, and test that we correctly catch unsupported combinations. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	fc51e4b9eb	runtime: Check config for supported CLH (cold|hot)_plug_vfio values The only supported options are hot_plug_vfio=root-port or no-port. cold_plug_vfio is not supported yet. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	509771e6f5	runtime: clh: Add hot_plug_vfio entry to config hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have not implemented support for cold_plug_vfio in CLH yet. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	5f6475a28a	tests/vfio: Gather debug info and disable tdp_mmu tdp_mmu had some issues up until around Linux v6.3 that make it work particularly badly when running nested on Hyper-V. Reload the module at the start of the test and disable the tdp_mmu param. Gather debug info at the end of the test to make it easier to figure out what went wrong. This uses github actions group syntax so that each section can be collapsed. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	8fffdc81c5	tests/vfio: Capture journal from vm For debugging (though this doesn't get exposed yet). Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	df815087e7	tests/vfio: Change to get the test working in GHA - reduce memory and cpu usage to fit in a D4s_v5 - source correct lib - mount workspace from 9p - disable cpu mitigations for speed - drop unused commands and variables - install containerd - install kata from built artifacts Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	a92ddeea15	tests/vfio: Move dependency installation to gha-run.sh To match the flow of other github actions workflows. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	5a551a85b1	gha: vfio: Import jobs scripts from tests repo This imports the vfio test scripts github.com/kata-containers/tests. The test case doesn't work yet but doing the changes in a separate commit will make it easier to track the changes. The only change in this commit is renaming vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Fabiano Fidêncio	a1e3fa7ac4	Merge pull request #7905 from microsoft/danmihai1/mariner-annotations tests: fix kernel and initrd annotations	2023-09-14 10:37:42 +02:00
GabyCT	1d331124ad	Merge pull request #7925 from GabyCT/topic/bandwidthlimit metrics: Add iperf bandwidth value for kata metrics	2023-09-13 17:43:55 -06:00
Gabriela Cervantes	49e2fa189c	metrics: Increase jitter value for qemu This PR increases the jitter value for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-13 22:36:09 +00:00
Gabriela Cervantes	49234433a7	metrics: Increase value limit for jitter in clh This PR increases the value limit for jitter in clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-13 21:27:08 +00:00
David Esparza	0a24d3f718	Merge pull request #7923 from GabyCT/topic/addcassandradoc metrics: Add Cassandra Metrics documentation	2023-09-13 10:17:00 -06:00
GabyCT	c565053bac	Merge pull request #7895 from GabyCT/topic/removewarning metrics: Remove warning from metrics documentation	2023-09-13 10:16:38 -06:00
Fabiano Fidêncio	8b9df1d32e	Merge pull request #7929 from fidencio/topic/use-tcp-port-ping-on-docker-nerdctl-tests ci: docker: nerdctl: Switch to tcp port 80 ping	2023-09-13 15:46:31 +02:00
Peng Tao	55ca7e8aec	Merge pull request #7907 from Xuanqing-Shi/7876/network-devices-naming-conflict runtime: Naming conflict of network devices	2023-09-13 19:29:41 +08:00
Fabiano Fidêncio	813bfdec01	ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io This will ensure that we're calling the correct binary for the hypervisor. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:10:14 +02:00
Fabiano Fidêncio	46bc0b1c01	ci: nerdctl: Create the containerd config Otherwise we'll fail to configure kata-containers in the `install-kata` step. This is mostly needed because the nerdctl-full tarball doesn't provide a containerd configuration, just the binary, as containerd does not actually require a configuration file to run with the default config. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Fabiano Fidêncio	13968aa7f6	ci: nerdctl: Switch to tcp port 80 ping TIL that the Azure VMs we use are created without an explicit outbound connectivity defined. This leads us to issues using `ping ...` as part of our tests, and when consulting Jeremi Piotrowski about the issue he pointed me to two interesting links: * https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access * https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity For your own sanity, do not read the comments, after all this is the internet. :-) Anyways, the suggestion is to use nping instead, which is provided by the nmap package, so we can explicitly switch to using the tcp port 80 for the ping. With this in mind, I'm switching the image we use for the test and using one that provides nping as a possible entry point, and from now on (this part of) the tests should work. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Fabiano Fidêncio	e0c811678b	ci: docker: Switch to tcp port 80 ping TIL that the Azure VMs we use are created without an explicit outbound connectivity defined. This leads us to issues using `ping ...` as part of our tests, and when consulting Jeremi Piotrowski about the issue he pointed me to two interesting links: * https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access * https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity For your own sanity, do not read the comments, after all this is the internet. :-) Anyways, the suggestion is to use nping instead, which is provided by the nmap package, so we can explicitly switch to using the tcp port 80 for the ping. With this in mind, I'm switching the image we use for the test and using one that provides nping as a possible entry point, and from now on (this part of) the tests should work. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Peng Tao	9766f9090c	Merge pull request #7719 from beraldoleal/nullable Remove gogoproto.nullable extension	2023-09-13 15:11:56 +08:00

11659 Commits