Changing the encryption key doesn't work with the watch cache because it
doesn't break decoding of newly written objects: a new object gets written
with the new key and decoded with that same key.
This is a verbatim copy of the current pkg/controller/tainteviction code,
revision fc268ecd09 (v1.33.0 plus one commit),
minus the TimedWorker helper.
The intent is to modify the code such that it enforces eviction of pods which
use tainted devices.
WithTB was originally defined as "uses the existing logger". But what we want
there, and in the newer TContext.Run, is the usual per-test logging, now
scoped to the sub-test.
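
A minimal sketch of the intended behavior, assuming the ktesting package
under test/utils/ktesting and a Run method roughly as described above (the
exact signatures here are an assumption, not a reference):

    import (
        "testing"

        "k8s.io/kubernetes/test/utils/ktesting"
    )

    func TestExample(t *testing.T) {
        tCtx := ktesting.Init(t)
        tCtx.Run("sub-test", func(tCtx ktesting.TContext) {
            // Output from this logger gets attributed to
            // TestExample/sub-test rather than to the parent test.
            tCtx.Logger().Info("running inside the sub-test")
        })
    }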
The purpose of the tracker is to emulate a ResourceSlice informer, including
cache and event handlers. In contrast to that informer, the tracker adds taints
from a DeviceTaint such that they appear in the ResourceSlice device
definition. Code using the tracker doesn't need to care where the taints are
coming from.
The main advantage is that it enables fine-grained reactions to taints that
only affect a few devices, the common case. Without this tracker, the pod
eviction controller would have to sync all pods whenever any slice or any
taint changes.
In the scheduler it avoids re-evaluating the selection criteria repeatedly.
The tracker serves as a cross-pod-scheduling cache.
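
A purely illustrative sketch of the merge idea with made-up types (not the
real tracker API): consumers only ever see devices whose taints are already
merged in, no matter where a taint was defined.

    type device struct {
        name   string
        taints []string
    }

    // mergeTaints returns a copy of the devices with the additional taints
    // that apply to them appended, emulating what the tracker does for the
    // devices of a ResourceSlice before handing it to event handlers.
    func mergeTaints(devices []device, extra map[string][]string) []device {
        merged := make([]device, len(devices))
        for i, d := range devices {
            d.taints = append(append([]string(nil), d.taints...), extra[d.name]...)
            merged[i] = d
        }
        return merged
    }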
This adds the "DeviceTaint" top-level type to v1alpha3 and related fields to
ResourceSlice and ResourceClaim. It's complete enough to bring up an API
server and generate files.
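
As a rough illustration only, a DeviceTaint can be thought of as shaped like
a node taint; the field names below are assumptions, not the final API:

    // DeviceTaint marks a device so that pods which don't tolerate the
    // taint are not scheduled onto it or get evicted from it, depending on
    // the effect. Field names are assumptions modeled on node taints.
    type DeviceTaint struct {
        Key    string
        Value  string
        Effect string // e.g. "NoSchedule" or "NoExecute"
    }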
If there was an unexpected status, the code extracting the expected error
message crashed with a panic. This happened once so far; the reason remains
unknown because the unexpected status then didn't get logged.
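
A hypothetical sketch of the kind of defensive extraction meant here,
assuming the status is a metav1.Status (the real code may use a different
type): only dig into optional fields when they are present, and log the full
status otherwise.

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // expectedErrorMessage returns the message of the first cause, if any.
    // Callers should log the full status when ok is false instead of
    // indexing into fields that may not be there.
    func expectedErrorMessage(status *metav1.Status) (msg string, ok bool) {
        if status == nil || status.Details == nil || len(status.Details.Causes) == 0 {
            return "", false
        }
        return status.Details.Causes[0].Message, true
    }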
When the new RollingUpdate option is used, the DRA driver gets deployed such
that it uses unique socket paths and file locking to serialize gRPC calls.
This enables the kubelet to pick arbitrarily between two concurrently running
instances. The handover is seamless (no downtime, no removal of
ResourceSlices by the kubelet).
For file locking, the fileutils package from etcd is used because that was
already a Kubernetes dependency. Unfortunately that package brings in some
additional indirect dependencies for DRA drivers (zap, multierr), but those
seem acceptable.
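
A rough sketch of the serialization idea (not the actual helper code), using
go.etcd.io/etcd/client/pkg/v3/fileutil; the wrapper and its name are made up:

    import (
        "os"

        "go.etcd.io/etcd/client/pkg/v3/fileutil"
    )

    // withFileLock takes a lock on the shared lock file before running the
    // call, so two concurrently running plugin instances never issue their
    // gRPC calls at the same time.
    func withFileLock(lockPath string, call func() error) error {
        lock, err := fileutil.LockFile(lockPath, os.O_RDWR|os.O_CREATE, 0o600)
        if err != nil {
            return err
        }
        defer lock.Close() // closing the locked file releases the lock
        return call()
    }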
The key difference is that the kubelet must remember all plugin instances
because it could always happen that the new instance dies and leaves only the
old one running.
The endpoints of each instance must be different. Registering a plugin with the
same endpoint as some other instance is not supported and triggers an error,
which should get reported as "not registered" to the plugin. This should only
happen when the kubelet missed some unregistration event and re-registers the
same instance again. The recovery in this case is for the plugin to shut
down and remove its socket, which the kubelet should observe, and then try
again after a restart.
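
A hypothetical sketch of that bookkeeping with made-up types (not the real
kubelet code): every instance of a plugin gets remembered, keyed by its
endpoint, and a duplicate endpoint is rejected.

    import "fmt"

    type instance struct {
        pluginName string
        endpoint   string
    }

    // instances maps plugin name -> endpoint -> instance. All instances of
    // a plugin get remembered because the newer one may die and leave only
    // the older one running.
    type instances map[string]map[string]instance

    func (all instances) register(inst instance) error {
        if _, exists := all[inst.pluginName][inst.endpoint]; exists {
            // Same endpoint as an already registered instance: not
            // supported, report an error back to the plugin.
            return fmt.Errorf("plugin %s: endpoint %s already registered", inst.pluginName, inst.endpoint)
        }
        if all[inst.pluginName] == nil {
            all[inst.pluginName] = map[string]instance{}
        }
        all[inst.pluginName][inst.endpoint] = inst
        return nil
    }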
When doing an update of a DaemonSet, first the old pod gets stopped and
then the new one is started. This causes the kubelet to remove all
ResourceSlices directly after the old pod's removal and forces the new pod
to recreate all of them.
Now the kubelet waits 30 seconds before it deletes ResourceSlices. If a new
driver registers during that period, nothing is done at all. The new driver
finds the existing ResourceSlices and only needs to update them if something
changed.
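
A rough sketch of the grace-period idea with made-up names (not the actual
kubelet code): deletion of the driver's ResourceSlices gets scheduled with a
delay and cancelled if the driver registers again in time.

    import (
        "sync"
        "time"
    )

    const resourceSliceWipeDelay = 30 * time.Second

    type sliceWiper struct {
        mutex  sync.Mutex
        timers map[string]*time.Timer
        wipe   func(driverName string) // deletes the driver's ResourceSlices
    }

    // DriverUnregistered schedules deletion of the driver's ResourceSlices.
    func (w *sliceWiper) DriverUnregistered(driverName string) {
        w.mutex.Lock()
        defer w.mutex.Unlock()
        w.timers[driverName] = time.AfterFunc(resourceSliceWipeDelay, func() { w.wipe(driverName) })
    }

    // DriverRegistered cancels a pending deletion, if any, so a new driver
    // instance finds the existing ResourceSlices untouched.
    func (w *sliceWiper) DriverRegistered(driverName string) {
        w.mutex.Lock()
        defer w.mutex.Unlock()
        if t := w.timers[driverName]; t != nil {
            t.Stop()
            delete(w.timers, driverName)
        }
    }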
The downside is that if the driver gets removed permanently, this creates a
delay during which pods might still get scheduled to the node although the
driver is not going to run there anymore, so those pods will be stuck.
While these tests already have the LinuxOnly tag, other tests have both that
and this e2eskipper line. Let's add it here too, just in case.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>