Add unit test with a volume plugin that does not support SELinux. That
simulates a CSi driver whose spec.SELinuxMount is empty or false.
This requires a little refactoring, each unit test now has a flag if it
runs with a volume plugin that supports SELinux.
Reset SELinuxChangePolicy of Pods that have no SELinux label set to
Recursive. Kubelet cannot mount with `-o context=<label>`, if the label is
not known.
This fixes the e2e test error revealed by the previous commit - it changed the
e2e test to check for events when no events are expected and it found a
warning about a Pod with no label, but MountOption policy.
When a Pod reaches its final state (Succeeded or Failed), its volumes are
getting unmounted and therefore their SELinux mount option will not
conflict with any other pod.
Let the SELinux controller monitor "pod updated" events to see the pod is
finished
This was broken since 666a41c2ea when the label value became non-integer encoded
The chance of one controller revision hash label being int-parsable: 7/27 ^ 8 = 0.00002041 = ~0
The chance of both being int-parsable: 0.00002041^2 = ~0
Hash comparison locks in differences in content failing EqualRevision
even when the semantic content is normalized to be equal.
Certain failures during SetupDevice and MapPodDevice are not treated as
transient in the csi raw block plugin implementation, while they are in
the file mode plugin. This can lead to certain failures causing volumes
to be marked as unmounted incorrectly.
This patch brings the block plugin up to parity with the fs one by
marking the equivalent calls as transient. This mostly covers API server
and some csi driver calls.
The taint toleration plugin records taint keys and values
from non-matching nodes. Taint keys and values may be
sensitive information in some environments.
Use a generic message, and show the info in logs instead.
Fixes a bug where startup probe workers terminate incorrectly for sidecar
containers with restartPolicy=Always when the pod has restartPolicy=Never,
causing main containers to remain stuck in Initializing state.
Changes:
- Add container-level restart policy check for init containers only
- Extract complex boolean logic to named variable for readability
- Refactor test helper to use existing newWorker() function
- Add comprehensive unit and e2e tests for both scenarios
When gRPC notifies the kubelet that a connection ended, the kubelet tries to
reconnect because it needs to know when a DRA driver comes back. The same code
gets called when a connection goes idle, by default after 30 minutes. In that
and only that case the conn.Connect call deadlocks while calling into the gRPC
idle manager.
This can be reproduced with a new unit test which artificially shortens the
idle timeout. This fix is to move the Connect call into a goroutine because
then both HandleConn and Connect can proceed. It's sufficient that Connect
finishes at some point, it doesn't need to be immediately.
DRA also calls Register at pkg/kubelet/cm/container_manager_linux.go NewContainerManager(), causing volume stats collector being ignored.
Fix this by moving it out of `sync.Once()`, allowing multiple calls to `Register()` func.