This adds a test for the just-added wrapping error message, as well as
for the other error messages that initialization can already fail with.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
These tests will create the userns record mapping file, so let's use a
temporary directory for that.
Without specifying one, we were mistakenly using the
"/tmp/non-existant-dir.This-is-not-used-in-tests/" directory.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
The error we are wrapping is already verbose, so let's just use minimal
wrapping, as is usually the case in Go code.
Note that the error from parseUserNsFileAndRecord() can be returned to the
user, so we added some context about the user namespace. Otherwise, a JSON
parsing error would not make clear which of the many JSON files the kubelet
parses it refers to.
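A rough sketch of the pattern (parseRecordFile and the record type are
illustrative stand-ins for parseUserNsFileAndRecord() and its data):

    package userns

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    // record is a placeholder for the kubelet's userns mapping record.
    type record struct {
        PodMappings map[string][]uint32 `json:"podMappings"`
    }

    func parseRecordFile(path string) (*record, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            // Minimal wrapping: the os error is already verbose.
            return nil, fmt.Errorf("read mappings: %w", err)
        }
        var r record
        if err := json.Unmarshal(data, &r); err != nil {
            // This error can reach the user, so say which of the many JSON
            // files the kubelet parses could not be decoded.
            return nil, fmt.Errorf("parse user namespace mappings file %q: %w", path, err)
        }
        return &r, nil
    }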
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Most error messages are properly wrapped already, but this one was missing.
The kubelet logs will now show something like this:
E0201 12:00:03.505680 3007049 run.go:74] "command failed" err="failed to run Kubelet: failed to create kubelet: record pod mappings: create user namespace store: mkdir XXX: permission denied"
Before this commit, the message was not so clear:
E0120 16:02:40.484404 474711 run.go:74] "command failed" err="failed to run Kubelet: failed to create kubelet: mkdir XXX: permission denied"
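The extra prefixes come from plain %w wrapping at each layer, roughly like
this (the function names are illustrative, not the kubelet's actual ones):

    package kubelet

    import (
        "fmt"
        "os"
    )

    // newUserNsStore creates the directory used to record mappings; the
    // mkdir error already names the path, so a short prefix is enough.
    func newUserNsStore(dir string) error {
        if err := os.MkdirAll(dir, 0o700); err != nil {
            return fmt.Errorf("create user namespace store: %w", err)
        }
        return nil
    }

    // recordPodMappings adds its own prefix, yielding
    // "record pod mappings: create user namespace store: mkdir ...: permission denied".
    func recordPodMappings(dir string) error {
        if err := newUserNsStore(dir); err != nil {
            return fmt.Errorf("record pod mappings: %w", err)
        }
        return nil
    }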
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
The winkernel code was originally based on the iptables code but never
made use of some parts of it. (e.g., it logs a warning if you didn't
set `--cluster-cidr`, even though it doesn't actually use
`--cluster-cidr` if you do set it.)
Blocking API calls during a scheduling cycle, as the DRA plugin is doing, slow
down overall scheduling, i.e. they also affect pods which don't use DRA.
It is easy to move the blocking calls into a goroutine and let the scheduling
cycle end with "pod unschedulable". The hard part is handling an error when
those API calls then fail in the background. There is a solution for that
(see https://github.com/kubernetes/kubernetes/pull/120963), but it's complex.
Instead, publishing the modified PodSchedulingContext can also be done
later. In the more common case of a pod which is ready for binding except for
its claims, that'll be in PreBind, which runs in a separate goroutine already.
In the less common case that a pod cannot be scheduled, that'll be in
Unreserve, which is still blocking.
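A rough sketch of the approach, with simplified types and signatures instead
of the real scheduler framework hooks:

    package dra

    import (
        "context"
        "fmt"
    )

    // schedulingContext stands in for the modified PodSchedulingContext that
    // still has to be written back to the apiserver.
    type schedulingContext struct {
        podName      string
        selectedNode string
    }

    // stateData is what the plugin remembers in the cycle state between
    // scheduling phases.
    type stateData struct {
        modified *schedulingContext // pending update, nil if nothing to publish
    }

    // publish is a placeholder for the blocking apiserver update.
    func publish(ctx context.Context, sc *schedulingContext) error {
        fmt.Printf("updating PodSchedulingContext for %s (selected node %s)\n",
            sc.podName, sc.selectedNode)
        return nil
    }

    // preBind already runs in its own goroutine per pod, so blocking here
    // does not delay the scheduling cycle of other pods.
    func preBind(ctx context.Context, state *stateData) error {
        if state.modified != nil {
            if err := publish(ctx, state.modified); err != nil {
                return err
            }
            state.modified = nil
        }
        return nil
    }

    // unreserve covers the less common case that the pod could not be
    // scheduled; it is still blocking, but that path is rare.
    func unreserve(ctx context.Context, state *stateData) {
        if state.modified != nil {
            _ = publish(ctx, state.modified) // best effort in this path
            state.modified = nil
        }
    }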
This moves adding a pod to ReservedFor out of the main scheduling cycle into
PreBind. There it is done concurrently in different goroutines. For claims
which were specifically allocated for a pod (the most common case), that
usually makes no difference because the claim is already reserved.
It starts to matter when that pod then cannot be scheduled for other reasons,
because then the claim gets unreserved to allow deallocating it. It also
matters for claims that are created separately and then get used multiple times
by different pods.
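For illustration, reserving each claim in its own goroutine (the helper names
here are made up; the real plugin works with ResourceClaim objects):

    package dra

    import (
        "context"

        "golang.org/x/sync/errgroup"
    )

    // reserveClaims adds the pod to the ReservedFor list of every claim it
    // uses. Running the updates concurrently keeps PreBind close to a single
    // API round-trip even for pods with several claims.
    func reserveClaims(ctx context.Context, pod string, claims []string,
        reserve func(ctx context.Context, claim, pod string) error) error {
        g, ctx := errgroup.WithContext(ctx)
        for _, claim := range claims {
            claim := claim // capture loop variable for the goroutine
            g.Go(func() error {
                return reserve(ctx, claim, pod)
            })
        }
        // The pod only proceeds to binding if every claim could be reserved.
        return g.Wait()
    }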
Because multiple pods might get added to the same claim rapidly and
independently of each other, it makes sense to do all claim status updates via patching:
then it is no longer necessary to have an up-to-date copy of the claim because
the patch operation will succeed if (and only if) the patched claim is valid.
Server-side apply cannot be used for this because a client always has to send
the full list of entries that it wants to be set, i.e. it cannot add one
entry unless it knows the full list.
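For illustration, a patch that adds one reservedFor entry without knowing the
rest of the list. The claimPatcher interface and the exact patch format are
assumptions: this is a strategic-merge-style patch that presumes the
reservedFor list merges entries rather than being replaced, and the real
plugin goes through the generated clientset's Patch on the claim's status
subresource:

    package dra

    import (
        "context"
        "fmt"

        "k8s.io/apimachinery/pkg/types"
    )

    // claimPatcher is a hypothetical stand-in for the typed ResourceClaims
    // client's Patch call on the "status" subresource.
    type claimPatcher interface {
        Patch(ctx context.Context, claim types.NamespacedName, patch []byte) error
    }

    // addReservedFor builds a patch that adds a single entry to
    // status.reservedFor. No up-to-date copy of the claim is needed: the
    // apiserver applies the patch to the current object and rejects it only
    // if the result would be invalid.
    func addReservedFor(ctx context.Context, c claimPatcher, claim types.NamespacedName,
        podName string, podUID types.UID) error {
        patch := fmt.Sprintf(
            `{"status": {"reservedFor": [{"resource": "pods", "name": %q, "uid": %q}]}}`,
            podName, podUID)
        return c.Patch(ctx, claim, []byte(patch))
    }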