Kubelet in standalone mode won't have kubeclient, it cannot get node.status
and get devices from it. Such a kubelet cannot mount attachable volumes
anyway.
DesiredStateOfWorld must remember both
- the effective SELinux label to apply as a mount option (non-empty for
RWOP volumes, empty otherwise)
- and the label that _would_ be used if the mount option would be used by
all access modes.
Mismatch warning metrics must be generated from the second label.
The path module has a few different functions:
Clean, Split, Join, Ext, Dir, Base, IsAbs. These functions do not
take into account the OS-specific path separator, meaning that they
won't behave as intended on Windows.
For example, Dir is supposed to return all but the last element of the
path. For the path "C:\some\dir\somewhere", it is supposed to return
"C:\some\dir\", however, it returns ".".
Instead of these functions, the ones in filepath should be used instead.
SELinux context discovered from Pod is not final, it can be cleared when a
volume plugin does not support SELinux or the volume is not
ReadWriteOncePod. Update the existing log line + add a new one for easier
debugging.
Add volume reconstruction logs to V(2) to see initial kubelet
ActualStateOfWorld after kubelet start. Kubelet logs SetUp / TearDown
events at V(2) already, so we can track the whole volume mount state in
V(2) logs.
To preserve fix in https://github.com/kubernetes/kubernetes/pull/110670,
add an unit test that check a volume is *uncertain* even after final mount
error when it was reconstructed.
And actually fix a regression introduced in the previous patch.
Move reconciler logic from reconstruct{new}.go to:
- reconciler.go - only the functionality used by the current (old)
reconciler.
- reconciler_new.go - only the functionality used by the new reconciler.
- reconciler_common.go - common functions.
Subsequent SELinux work (see http://kep.k8s.io/1710) will need
ActualStateOfWorld populated around the time kubelet starts mounting
volumes.
Therefore reconstruct volumes before starting reconciler, but do not depend
on the desired state of world populated nor node.status - both need a
working API server, which may not be available at that time.
All reconstructed volumes are marked as Uncertain and reconciler will sort
them out - call SetUp to ensure the volume is really mounted when a pod
needs the volume or call TearDown then there is no such pod.
Finish the reconstruction when the API server becomes available:
- Clean up volumes that failed reconstruction and are not needed.
- Update devicePath of reconstructed volumes from node.status. Make sure
not to overwrite devicePath that may have been updated when the volume
was mounted by reconcile().
Hiding all this rework behind SELinuxMountReadWriteOncePod FeatureGate,
just to make sure we have a way back if this commit is buggy.
When a volume is already mounted with an unexpected SELinux label,
kubelet must unmount it first and then mount it back with the expected one.
Report an error to user, just in case the unmount takes too long.
In therory, this error should not happen too often, because two Pods with
different SELinux label will not enter Desired State of World, see
dsw.AddPodToVolume. It can happen when DSW and ASW SELinux labels only when
a volume has been deleted from DSW (= Pod was deleted) or a volume was
reconstructed after kubelet restart. In both cases, volume manager should
unmount the volume quickly.
In PodExistsInVolume with volumeObj.seLinuxMountContext != nil we know that
the volume has been previously mounted with a given SELinuxMountContext.
Either it has been mounted by this kubelet and we know it's correct or it
was by a previous instance of kubelet and the context has been
reconstructed from the filesystem. In both cases, the actual context is
correct, regardless if the volume plugin or PV access mode supports SELinux
mounts.
The Desired State of World can require a different SELinux mount context than
is in the Actual State of World and it's perfectly OK. For example when
user changes SELinux context of Pods or when the context is reconstructed
after kubelet restart.
Don't spam log and don't report errors to the user as event - reconciler
will do the right thing and unmount the old volume (with wrong context) and
mount a new one in the next reconciliation. It's not an error, it's
expected workflow.
github.com/opencontainers/selinux/go-selinux needs OS that supports SELinux
and SELinux enabled in it to return useful data, therefore add an interface
in front of it, so we can mock its behavior in unit tests.