The changes done are:
* cpu/cpu.shares was replaced by cpu.weight
* The weight, according to our reference[0], is calculated by:
weight = (1 + ((request - 2) * 9999) / 262142)
* cpu/cpu.cfs_quota_us & cpu/cpu.cfs_period_us were replaced by cpu.max,
where quota and period are written together (in this order)
[0]: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This reverts commit 091ad2a1b2, in order
to ensure tests would be running with cgroupsv2 on the guest.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
In the last couple of days I've seen the blogbench
metrics write latency test on clh fail a few times because
the latency was too low, so adjust the minimum range
to tolerate quicker finishes.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The static-checks targets are `pull_request`, so
they can run the PR workflow version, so we want to
update the required-tests.yaml so that static-check
workflow changes do trigger static checks in order
to test them properly.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Now we have the build-assets running on the gh-hosted
runners, try the same approach for the static-checks
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
I've noticed the following error when running the tests with SEV:
```
2025-01-21T17:10:28.7999896Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2025-01-21T17:10:28.8000614Z # @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
2025-01-21T17:10:28.8001217Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2025-01-21T17:10:28.8001857Z # IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
2025-01-21T17:10:28.8003009Z # Someone could be eavesdropping on you right now (man-in-the-middle attack)!
2025-01-21T17:10:28.8003348Z # It is also possible that a host key has just been changed.
2025-01-21T17:10:28.8004422Z # The fingerprint for the ED25519 key sent by the remote host is
2025-01-21T17:10:28.8005019Z # SHA256:x7wF8zI+LLyiwphzmUhqY12lrGY4gs5qNCD81f1Cn1E.
2025-01-21T17:10:28.8005459Z # Please contact your system administrator.
2025-01-21T17:10:28.8006734Z # Add correct host key in /home/kata/.ssh/known_hosts to get rid of this message.
2025-01-21T17:10:28.8007031Z # Offending ED25519 key in /home/kata/.ssh/known_hosts:178
2025-01-21T17:10:28.8007254Z # remove with:
2025-01-21T17:10:28.8008172Z # ssh-keygen -f "/home/kata/.ssh/known_hosts" -R "10.244.0.71"
```
And this was causing a failure to ssh into the confidential pod.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Relying on dmesg is really not ideal, as we may lose important info,
mainly those which happen very early in the boot, depending on the size
of kernel ring buffer.
So, for this specific test, let's increase the kernel ring buffer, by
default, to 4M.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Let's make sure that we don't use Kata Containers' agent as init for the
Confidential related rootfses*, as we don't want to increase the agent's
complexity for no reason ... mainly when we can rely on a proper init
system.
*:
- images already used systemd as init
- initrds are now using systemd as init
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Bumps the go_modules group with 1 update in the /src/runtime directory: [golang.org/x/net](https://github.com/golang/net).
Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [golang.org/x/net](https://github.com/golang/net).
Bumps the go_modules group with 1 update in the /tools/testing/kata-webhook directory: [golang.org/x/net](https://github.com/golang/net).
Updates `golang.org/x/net` from 0.25.0 to 0.33.0
- [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0)
Updates `golang.org/x/net` from 0.23.0 to 0.33.0
- [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0)
Updates `golang.org/x/net` from 0.23.0 to 0.33.0
- [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0)
---
updated-dependencies:
- dependency-name: golang.org/x/net
dependency-type: indirect
dependency-group: go_modules
- dependency-name: golang.org/x/net
dependency-type: direct:production
dependency-group: go_modules
- dependency-name: golang.org/x/net
dependency-type: indirect
dependency-group: go_modules
...
Signed-off-by: dependabot[bot] <support@github.com>
When the agent is run as the init process cgroupfs is being
setup. In the case of cgroupsV1 we needed to enable the memory hiearchy
this is now per default enabled in cgroupsV2. Additionally the file
/sys/fs/cgroup/memory/memory.use_hierarchy isn't even available with V2.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The `Create AKS cluster` step in `run-k8s-tests-on-aks.yaml` is likely
to fail fail since we are trying to issue `PUT` to `aks` in a relatively
high frequency, while the `aks` end has it's limit on `bucket-size` and
`refill-rate`, documented here [1].
Use `nick-fields/retry@v3` to retry in 10 seconds after request fail,
based on observations that AKS were request 7, or 8 second delays
before retry as part of their 429 response
[1] https://learn.microsoft.com/en-us/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apisFixes: #10772
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
With this change, `virtiofsd` (gnu target) could be built and then to be
used with other components.
Depends: #10741Fixes: #10739
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
While working on #10559, I realized that some parts of the codebase use
$GH_PR_NUMBER, while other parts use $PR_NUMBER.
Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests
without realizing that TEE tests use $PR_NUMBER, the tests on that PR
fail on TEEs:
https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45
...
44 error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context
...
135 image: ghcr.io/kata-containers/csi-kata-directvolume:
...
So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the
future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER.
Note that since some test scripts also refer to that variable, the CI
for this PR will fail (would have also happened with the converse
substitution), hence I'm not adding the ok-to-test label and we should
force-merge this after review.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
With ubuntu 20.04 image, virtiofsd gnu target couldn't be built due to
"unsupported ISA subset z" reported by "cc".
Updating to ubuntu 22.04 image addresses this problem.
Relates: #10739
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
The pyinstaller is located per default under /usr/local/bin
some prior versions were installing it to ${HOME}.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Skip logging empty lines of text from the Guest console output, if
there are any such lines.
Without this change, the Guest console log from CLH + /dev/pts/0 has
twice as many lines of text. Half of these lines are empty.
Fixes: #10737
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
On s390x, some tests for trusted storage occasionally failed due to:
```bash
etcdserver: request timed out
```
or
```bash
Internal error occurred: resource quota evaluation timed out
```
These timeouts were not observed previously on k3s but occur
sporadically on kubeadm. Importantly, they appear to be temporary
and transient, which means they can be ignored in most cases.
To address this, we introduced a new wrapper function, `retry_kubectl_apply()`,
for `kubectl create`. This function retries applying a given manifest up to 5
times if it fails due to a timeout. However, it will still catch and handle
any other errors during pod creation.
Fixes: #10651
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>