kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-04-28 03:42:09 +00:00

Author	SHA1	Message	Date
stevenhorsman	1022d8d260	metrics: Update range for clh tests In `ef0e8669fb` we had been seeing some significantly lower minvalues in the jitter.Result test, so I lowered the mid-value rather than having a very high minpercent, but it appears that the variability of this result is very high, so we are still getting the occasional high value, so reset the midval and just have bigger ranges on both sides, to try and keep the test stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:54:30 +00:00
stevenhorsman	d77008b817	metrics: Further reduce repeats for boot time tests on qemu I've seen failures on the third run, so reduce it further to just run twice on qemu Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:53:26 +00:00
stevenhorsman	97151cce4e	metrics: Improve iperf timeout The kubectl wait has a built in timeout of 30s, so wrapping it in waitForProcess, means we have 180/2 * 30 delay, which is much longer than intended, so just set the timeout directly. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:53:26 +00:00
Zvonko Kaiser	4bb0eb4590	Merge pull request #10954 from kata-containers/topic/metrics-kata-deploy Rework and fix metrics issues	2025-03-04 20:22:53 -05:00
stevenhorsman	b220cca253	shellcheck: Fix shellcheck SC2066 > Since you double-quoted this, it will not word split, and the loop will only run once. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	c5ff513e0b	shellcheck: Fix shellcheck SC2068 > Double quote array expansions to avoid re-splitting elements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	58672068ff	shellcheck: Fix shellcheck SC2145 > Argument mixes string and array. Use * or separate argument. - Swap echos for printfs and improve formatting - Replace $@ with $* - Split arrays into separate arguments Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	c69509be1c	metrics: Reduce repeats for boot time tests on qemu On qemu the run seems to error after ~4-7 runs, so try a cut down version of repetitions to see if this helps us get results in a stable way. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:42:00 +00:00
stevenhorsman	0962cd95bc	metrics: Increase minpercent range for qemu iperf test We have a new metrics machine and environment and the iperf jitter result failed as it finished too quickly, so increase the minpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:32:26 +00:00
stevenhorsman	ef0e8669fb	metrics: Increase minpercent range for clh tests We have a new metrics machine and environment and the fio write.bw and iperf3 parallel.Results tests failed for clh, as below the minimum range, so increase the minpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:32:26 +00:00
stevenhorsman	f81c85e73d	metrics: Increase maxpercent range for clh boot times We have a new metrics machine and environment and the boot time test failed for clh, so increase the maxpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	435ee86fdd	metrics: Update iperf affinity The iperf deployment is quite a lot out of date and uses `master` for its affinity and toleration, so update this to control-plane, so it can run on newer Kubernetes clusters Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	85bbc0e969	metrics: Increase wait time The new metrics runner seems slower, and the iperf3 tests are failing with: ``` pod rejected: RuntimeClass "kata" not found ``` so give more time for it to succeed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	4ce94c2d1b	Revert "metrics: Add init_env function to latency test" This reverts commit `9ac29b8d38`. to remove the duplicate `init_env` call Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	658a5e032b	metrics: Increase containerd start timeout - Move `kill_kata_components` from common.bash into the metrics code base as the only user of it - Increase the timeout on the start of containerd as the last 10 nightlies metric tests have failed with: ``` 223478 Killed sudo timeout -s SIGKILL "${TIMEOUT}" systemctl start containerd ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	3fab7944a3	workflows: Improve metrics jobs - As the metrics tests are largely independent, allow subsequent tests to run even if previous ones failed. The results might not be perfect if clean-up is required, but we can work on that later. - Move the test results check out of the latency test, where it seems arbitrary, and into its own job step - Add timeouts to steps that might fail/hang if there are containerd/K8s issues Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	6f918d71f5	workflows: Update metrics jobs Currently the run-metrics job runs a manual install and does this in a separate job before the metrics tests run. This doesn't make sense as if we have multiple CI runs in parallel (like we often do), there is a high chance that the setup for another PR runs between the metrics setup and the runs, meaning it's not testing the correct version of code. We want to remove this from happening, so install (and delete to cleanup) kata as part of the metrics test jobs. Also switch to kata-deploy rather than manual install for simplicity and in order to test what we recommend to users. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
Balint Tobik	1943a1c96d	tests: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-29 11:26:27 +01:00
stevenhorsman	d031e479ab	metrics: Increase minval range for blogbench test In the last couple of days I've seen the blogbench metrics write latency test on clh fail a few times because the latency was too low, so adjust the minimum range to tolerate quicker finishes. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 15:58:31 +00:00
stevenhorsman	aaae5b6d0f	metrics: clh: Increase network-iperf3 range We hit a failure with: ``` time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]" ``` The range is very big, but in the last 3 test runs I reviewed we have got a minimum value of 0.015s and a max value of 0.052, so there is a ~350% difference possible so I think we need to have a wide range to make this stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:57 +00:00
stevenhorsman	e946d9d5d3	metrics: qemu: Increase latency test range After the kernel version bump, in the latest nightly run https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400 The sequential read throughput result was 79.7% of the expected (so failed) and the sequential write was 84% of the expected, so was fairly close, so increase their minimum ranges to make them more robust. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:50 +00:00
stevenhorsman	dc069d83b5	metrics: Increase latency test range The bump to kernel 6.12 seems to have reduced the latency in the metrics test, so increase the ranges for the minimal value, to account for this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-08 15:11:49 +00:00
stevenhorsman	b87b4b6756	metrics: Increase ranges range for qemu failing tests We've also seen the qemu metrics tests are failing due to the results being slightly outside the max range for network-iperf3 parallel and minimum for network-iperf3 jitter tests on PRs that have no code changes, so we've increased the bounds to not see false negatives. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:52:16 +00:00
stevenhorsman	4011071526	metrics: Increase minval range for failing tests We've seen a couple of instances recently where the metrics tests are failing due to the results being below the minimum value by ~2%. For tests like latency I'm not sure why values being too low would be an issue, but I've updated the minpercent range of the failing tests to try and get them passing. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:50:02 +00:00
Gabriela Cervantes	52ef092489	metrics: Update fast footprint script to use grep This PR updates the fast footprint script to remove the use of egrep as this command has been deprecated and change it to use grep command. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-30 17:43:08 +00:00
Gabriela Cervantes	fdaf12d16c	metrics: Remove unused remove img var in common script This PR removes the remove_img variable in the metrics common script as it is not being used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:45:18 +00:00
Gabriela Cervantes	fcc35dd3a7	metrics: Update openVINO and oneDNN tests references This PR updates the machine learning tests references or urls for the openVINO and oneDNN scripts as currently they are referring to a different performance benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-05 15:39:21 +00:00
Gabriela Cervantes	5b0ab7f17c	metrics: Remove metrics report for Kata Containers This PR removes the metrics report which is no longer being used in Kata Containers. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-03 16:11:07 +00:00
Gabriela Cervantes	aa8635727d	metrics: Remove unused variable in oneDNN benchmark This PR removes an unused variable in oneDNN metrics benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-29 15:52:47 +00:00
Gabriela Cervantes	3affde5b28	docs: Add oneDNN benchmark information to metrics README This PR adds the oneDNN benchmark information to the machine learning metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-27 16:32:50 +00:00
Gabriela Cervantes	2fa8e85439	metrics: Add OpenVINO general information into README This PR adds the OpenVINO benchmark general information into the machine learning README metrics information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-22 16:08:06 +00:00
Gabriela Cervantes	59e31baaee	metrics: Remove unused variable in openvino script This PR removes an unused variable in the openvino script for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-21 16:05:55 +00:00
David Esparza	dcd0c0b269	metrics: Remove duplicated headers from results file. This PR removes duplicated entries (vcpus count, and available memory), from onednn and openvino results files. Fixes: #10119 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-08-01 18:11:06 -06:00
Gabriela Cervantes	7454908690	metrics: Update memory tests to use grep -F This PR updates the memory tests like fast footprint to use grep -F instead of fgrep as this command has been deprecated. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-01 17:20:57 +00:00
Gabriela Cervantes	3d17a7038a	metrics: Update launch times to use grep -F This PR updates the metrics launch times to use grep -F instead of fgrep as this command has been deprecated. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-23 17:13:52 +00:00
David Esparza	60f52a4b93	metrics: update avg reference values for blogbench. This PR updates the Blogbench reference values for read and write operations used in the CI check metrics job. This is due to the update to version 1.2 of Blogbench. Fixes: #10039 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-18 15:47:14 -06:00
David Esparza	1fdc5c1183	Merge pull request #10028 from amshinde/upgrade-blogbench-1.2 metric: Upgrade blogbench to 1.2	2024-07-18 11:30:17 -06:00
Archana Shinde	30e5e88ff1	metric: Upgrade blogbench to 1.2 Move to blogbench 1.2 version from 1.1. This version includes an important fix for the read_score test which was reported to be broken in the previous version. It essentially fixes this issue here: https://github.com/jedisct1/Blogbench/issues/4 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 11:32:09 -07:00
GabyCT	2056eda5f0	Merge pull request #9922 from GabyCT/topic/updateblogname metrics: Update container name in blogbench test	2024-07-11 10:05:35 -06:00
Aurélien Bombo	25e0e2fb35	ci: fix run-nydus tests GH-9973 introduced: * New function get_kata_memory_and_vcpus() in tests/metrics/lib/common.bash. * A call to get_kata_memory_and_vcpus() from extract_kata_env(), which is defined in tests/common.bash. Because the nydus test only sources tests/common.bash, it can't find get_kata_memory_and_vcpus() and errors out. We fix this by moving the get_kata_memory_and_vcpus() call from tests/common.bash to tests/metrics/lib/json.bash so that it doesn't impact the nydus test. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-10 17:19:08 +00:00
Alex Lyn	e4997760f1	Merge pull request #9987 from kata-containers/remove_double_process_check_from_memory_usage_test metrics: Remove duplicate check of processes from memory test.	2024-07-10 10:12:18 +08:00
David Esparza	e77d44614b	metrics: Remove duplicate check of processes from memory test. This PR removes the common_init function call from the memory usage script to eliminate duplicate checking that is also done from the init_env function. It also eliminates duplication of nested conditionals. Fixes: #9984 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 12:34:51 -06:00
David Esparza	04df85a44f	metrics: Add num_vcpus and free_mem to metrics results template. This PR retrieves the free memory and the vcpus count from a kata container and includes them to the json results file of any metric. Additionally this PR parses the requested vcpus quantity and the requested amount memory from kata configuration file and includes this pair of values into the json results file of any metric. Finally, the file system defined in the kata configuration file is included in the results template. Fixes: #9972 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 10:29:29 -06:00
David Esparza	a554541495	metrics: Improvement to the description of certain functions. This PR rephrased the description and usage of certain functions such as: - set_kata_configuration_performance - set_kata_config_file - get_current_kata_config_file - check_if_root - check_ctr_images Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 10:29:29 -06:00
Gabriela Cervantes	b7da1291ea	metrics: Remove variable in sysbench that is not being used This PR removes the CI_JOB variable, which was previously used but is no longer supported, from the metrics sysbench test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-02 15:29:50 +00:00
Gabriela Cervantes	e3318a04f7	metrics: Update container name in blogbench test This PR updates the container name to put a random name instead of using a hard coded name. This PR is a general improvement to avoid random bug failures, especially when we are running on baremetal environments. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-01 19:28:16 +00:00
Chelsea Mafrica	0b83c8549a	tests: Update help section in openvino test Test reports that it is a onednn test when it is openvino; update description. Fixes: #9948 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-07-01 14:24:50 +00:00
Gabriela Cervantes	671d9af456	metrics: Improve variable definition in memory inside containers script This PR improves the variable definition in memory inside the container script for metrics. This change declares and assigns the variables separately to avoid masking return values. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 16:56:12 +00:00
Gabriela Cervantes	a96ff49060	metrics: Use function definition to have uniformity This PR uses the function definition to have uniformity across all the launch times script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-11 17:36:08 +00:00
GabyCT	6d58fce4a9	Merge pull request #9677 from GabyCT/topic/memoryusags metrics: Improve variable definition in memory usage script	2024-05-29 10:16:56 -06:00


356 Commits