We graduate the `CPUManagerPolicyOptions` feature to beta
in the 1.23 cycle, and we add new experimental feature gates
to guard new options which are planned in the 1.23 and in the
following cycles.
We introduce additional feature gate called `CPUManagerPolicyAlphaOptions` and
`CPUManagerPolicyBetaOptions`. The basic idea is to avoid the
cumbersome process of adding a feature gate for each option, and to have
feature gates which track the maturity level of _groups_ of options.
Besides this change, the graduation process, and the process in general,
for adding new policy options is still unchanged.
The `full-pcpus-only` option added in the 1.22 cycle is intentionally
moved into the beta policy options
For more details:
- KEP: https://github.com/kubernetes/enhancements/pull/2933
- sig-arch discussion:
https://groups.google.com/u/1/g/kubernetes-sig-architecture/c/Nxsc7pfe5rw
Signed-off-by: Francesco Romani <fromani@redhat.com>
The Topology Manager e2e tests wants to run on real multi-NUMA system
and want to consume real devices supported by device plugins; SRIOV
devices happen to be the most commonly available of such devices.
CI machines aren't multi NUMA nor expose SRIOV devices, so the biggest portion
of the tests will just skip, and we need to keep it like this until we
figure out how to enable these features.
However, some organizations can and want to run the testsuite on bare metal;
in this case, the current test will skip (not fail) with misconfigured
boxes, and this reports a misleading result. It will be much better to
fail if the test preconditions aren't met.
To satisfy both needs, we add an option, controlled by an environment
variable, to fail (not skip) if the machine on which the test run
doesn't meet the expectations (multi-NUMA, 4+ cores per NUMA cell,
expose SRIOV VFs).
We keep the old behaviour as default to keep being CI friendly.
Signed-off-by: Francesco Romani <fromani@redhat.com>
In older versions of Kubernetes (at least pre-0.19, it's the earliest
this test will run unmodified on), Pods that depended on devices could be
restarted after the device plugin had been removed. Currently however,
this isn't possible, as during ContainerManager.GetResources(), we
attempt to DeviceManager.GetDeviceRunContainerOptions() which fails as
there's no cached endpoint information for the plugin type.
This commit therefore breaks apart the existing test into two:
- One active test that validates that assignments are maintained across
restarts
- One skipped test that validates the behaviour after GPUs have been
removed, in case we decide that this is a bug that should be fixed in
the future.
Prior to this change, the pod was not getting scheduled on the node as
we don't have a running scheduler in e2e_node. PodClient solves this
problem by manually assigning the pod to the node.
The current GPU installer was built in 2017, from source that no longer
exists in Kubernetes ([adding commit][1]. The image was built on 2017-06-13.
Unfortunately, this installer no longer appears to work. When debugging
on the same node type as used by test-infra, it failed to build the
driver as the kernel sha was no longer available.
This lead to needing to find a new way to install GPUs. The smallest
logical change was switching to [cos-gpu-installer][2]
. There is a newer version of this available on [googlesource][3] that
I have not yet tested as it's not clear what the state of the project
is, as I couldn't find docs outside of the source itself.
We install things to the same location as previously to avoid needing
extra downstream changes. There are a couple of weird issues here
however, like needing to run the container twice to correctly update the
LD Cache.
[1]: 1e77594958/cluster/gce/gci/nvidia-gpus/Dockerfile
[2]: https://github.com/GoogleCloudPlatform/cos-gpu-installer
[3]: https://cos.googlesource.com/cos/tools/+/refs/heads/master/src/cmd/cos_gpu_installer/
This commit adds an e2e test for the kubelet flags `--lock-file` and
`exit-on-lock-contention`. Eventually we would like to move them to the
kubelet configuration file rather than flags.
This test is based on the premise that whenever there is a lock
contention of the lock file (e.g. /var/run/kubelet.lock), the running
kubelet must terminate and the waiting for the lock on the lock file to
be released before starting again.
In this test we simulate that behaviour of a file contention. The test
would try to acquire the lock on the lock file.
Success of the test is determined kubelet health check when the lock is
acquired by the test and passes when the lock on the lock file is
released.
Signed-off-by: Imran Pochi <imran@kinvolk.io>
For some reason when we send them to journald, many log lines are
consistently dropped as soon as the PLEG is started.
If we log directly to file, we don't have this problem. As a bonus, if
the tests crash, the kubelet logs will always be available since they
were already written; otherwise we normally wait until the end of the
test run to collect them from journald, meaning that we often end up
with empty logs.
- recover to last-known-good ConfigMap.KubeletConfigKey
~12m to run in CI, 13m locally
- non-nil last-known-good to a new non-nil last-known-good
~24m to run in CI
- recover to last-known-good ConfigMap
~12m to run in CI
- state transitions
~8m to run in CI
Add e2e tests to cover the basic flows for the `full-pcpus-only` option:
negative flow to ensure rejection with proper error message, and
positive flow to verify the actual cpu allocation.
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>