If someone gains the ability to create static pods, they might try to use that
ability to run code which gets access to resources associated with an existing
claim that was previously allocated for some other pod. Such an attempt already
fails: the claim status tracks which pods are allowed to use the claim, the
static pod is not in that list, the node is not authorized to add it, and the
kubelet checks that list before starting the pod in
195803cde5/pkg/kubelet/cm/dra/manager.go (L218-L222).
Even if the pod were started, DRA drivers typically manage node-local resources
which can already be accessed via such an attack without involving DRA. DRA
drivers which manage non-node-local resources have to consider access by a
compromised node as part of their threat model.
Nonetheless, it is better not to accept static pods which reference
ResourceClaims or ResourceClaimTemplates in the first place, because there
is no valid use case for that.
This is done at different levels for defense in depth:
- configuration validation in the kubelet
- admission checking of node restrictions
- API validation
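As a rough illustration of the kubelet-side configuration validation, here is a
minimal sketch (the helper name and placement are assumptions, not the actual
implementation) that rejects any static pod whose spec.resourceClaims is
non-empty:

```go
package kubelet

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// rejectStaticPodWithClaims is a hypothetical helper: static pods must not
// reference ResourceClaims or ResourceClaimTemplates, so any entry in
// spec.resourceClaims makes the pod invalid during configuration validation.
func rejectStaticPodWithClaims(pod *v1.Pod) error {
	if len(pod.Spec.ResourceClaims) > 0 {
		return fmt.Errorf("static pod %s/%s must not reference resourceclaims",
			pod.Namespace, pod.Name)
	}
	return nil
}
```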
Co-authored-by: Jordan Liggitt <liggitt@google.com>
Code changes by Jordan, with one small change (resourceClaims -> resourceclaims).
Unit tests by Patrick.
This makes it clear that the error is due to the user namespace
configuration. Otherwise the returned error looks too generic and is not
clear.
Before this PR, the error was:
Warning FailedCreatePodSandBox 1s kubelet Failed to create pod sandbox: the handler "" is not known
Now it is:
Warning FailedCreatePodSandBox 1s kubelet Failed to create pod sandbox: runtime does not support user namespaces
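A minimal sketch of the idea (illustrative names, not the actual kubelet code):
return a dedicated error when the runtime reports no user namespace support, so
that this message surfaces instead of the generic handler error.

```go
package kuberuntime

import (
	"errors"
	"fmt"
)

// errUserNSUnsupported is a hypothetical sentinel, shown only to illustrate
// replacing the generic `the handler "" is not known` message with one that
// points at the user namespace configuration.
var errUserNSUnsupported = errors.New("runtime does not support user namespaces")

func checkUserNamespaceSupport(supported bool) error {
	if !supported {
		return fmt.Errorf("failed to create pod sandbox: %w", errUserNSUnsupported)
	}
	return nil
}
```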
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
When using an old runtime like containerd 1.7, this message is not
implemented and what we get here is an empty, non-nil slice. Check the
length of the slice instead of comparing it against nil.
While we are at it, just return false and no error. In the following
commits we will wrap the error, and we didn't find any more information
to add here.
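A minimal sketch of the check, assuming a plain string slice rather than the
real CRI response type: an empty, non-nil slice from an old runtime must be
treated the same as a missing one.

```go
package kuberuntime

// userNamespacesSupported is a hypothetical helper standing in for the real
// CRI status handling: containerd 1.7 returns an empty (but non-nil) slice
// here, so a nil check is not enough. Checking the length covers both cases,
// and no error is returned because there is no extra information to add.
func userNamespacesSupported(features []string) (bool, error) {
	if len(features) == 0 {
		return false, nil
	}
	// With a newer runtime the reported features would be inspected here.
	return true, nil
}
```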
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Finalizers do not work as expected when an informer with a field
selector is used. Any time a pod changing its state gets excluded by the
field selector, a synthetic delete event is issued even though the pod
with a finalizer set is still present. This makes the scheduler
schedule the high and medium priority pods before any of the low
priority pod finalizers are removed. Instead, rely on a preStop hook and
TerminationGracePeriodSeconds to keep all low priority pods matched by
the field selector long enough for all high priority pods to set their
.status.nominatedNodeName field.
Also, update the check for how many medium priority pods are expected to
be scheduled. Each node can accept 10 pods of the given extended
resource. Given that 5 high priority pods are created per node, there
are always 5 times the number of nodes spots left for the medium
priority pods.
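The updated expectation boils down to this arithmetic (a sketch with
illustrative names; the numbers come from the description above):

```go
// expectedMediumPriorityPods spells out the arithmetic behind the updated
// check: each node has capacity for 10 pods of the extended resource and 5
// high priority pods are created per node, leaving 5 * numNodes spots for
// the medium priority pods.
func expectedMediumPriorityPods(numNodes int) int {
	const extendedResourceCapacityPerNode = 10
	const highPriorityPodsPerNode = 5
	return (extendedResourceCapacityPerNode - highPriorityPodsPerNode) * numNodes
}
```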
* Change: Handling nil runtime.Object
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Return only if there is error in rollout_history
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Return the unknown revision error directly in rollout_history.go
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Remove unintended newline
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Using the Go idiomatic way for checking if historyInfo[o.Revision] exists (see the sketch after this list)
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Remove 'error:' from returned error message in rollout_history.go
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Check for printer.PrintObj returned err
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Add TestRolloutHistoryErrors test
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Simple typo fix on Complete() function description
Signed-off-by: Taha Farahani <tahacodes@proton.me>
* Change: Checking for error on o.Complete in TestRolloutHistoryErrors
Signed-off-by: Taha Farahani <tahacodes@proton.me>
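For the comma-ok lookup mentioned above, a minimal sketch with simplified
types (a hypothetical helper, not the actual kubectl code):

```go
package main

import "fmt"

// findRevision shows the comma-ok map lookup used to detect an unknown
// revision explicitly instead of silently reading a zero value.
func findRevision(historyInfo map[int64]string, revision int64) (string, error) {
	info, ok := historyInfo[revision]
	if !ok {
		return "", fmt.Errorf("unable to find the specified revision")
	}
	return info, nil
}
```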
---------
Signed-off-by: Taha Farahani <tahacodes@proton.me>
This PR fixes a possible panic caused by decoding a JSON document
followed by a YAML document that is shorter than the first JSON
document.
The panic occurs because the stream has already consumed the JSON data.
When we fall back to the YAML reader, the YAML offset starts at zero
while the amount of data consumed from the stream is non-zero. This can
lead to consuming a negative number of bytes, because
`d.yaml.InputOffset() - d.stream.Consumed()` is negative, which causes a
panic.
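A minimal sketch of the failure condition and a guard, using illustrative
names rather than the decoder's real fields:

```go
package main

import "fmt"

// bytesToDiscard illustrates the guard: when the YAML reader's offset is
// smaller than what the stream already consumed (possible after a long JSON
// document followed by a shorter YAML document), clamping at zero avoids
// asking the buffer to drop a negative number of bytes, which is what
// triggered the panic.
func bytesToDiscard(yamlInputOffset, streamConsumed int64) int64 {
	if n := yamlInputOffset - streamConsumed; n > 0 {
		return n
	}
	return 0
}

func main() {
	fmt.Println(bytesToDiscard(5, 20)) // 0 instead of -15
}
```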
Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
With StreamingCollectionEncodingToJSON and
StreamingCollectionEncodingToProtobuf, the WatchList must re-justify its
necessity. To prevent an ecosystem from building around a feature that
may not be promoted, we will stop serving list-via-watch until
performance numbers can justify its inclusion.
This also stops the kube-controller-manager from using list-via-watch by
default. The fallback is a regular list, so during version skew in an
upgrade the "right" thing will happen and the new
StreamingCollectionEncoding will be used.