The existing test had two problems:
- It only made connections from within the cluster, so for VIP-type
LBs, the connections would always be short-circuited and so this
only tested kube-proxy's LBSR implementation, not the cloud's.
- For non-VIP-type LBs, it would only work if pod-to-LB connections
were not masqueraded, which is not the case for most network
plugins.
Fix this by (a) testing connectivity from the test binary, so as to
test filtering external IPs, and ensure we're testing the cloud's
behavior; and (b) using both pod and node IPs when testing the
in-cluster case.
Also some general cleanup of the test case.
* Add `Linux{Sandbox,Container}SecurityContext.SupplementalGroupsPolicy` and `ContainerStatus.user` in cri-api
* Add `PodSecurityContext.SupplementalGroupsPolicy`, `ContainerStatus.User` and its featuregate
* Implement DropDisabledPodFields for PodSecurityContext.SupplementalGroupsPolicy and ContainerStatus.User fields
* Implement kubelet so to wire between SecurityContext.SupplementalGroupsPolicy/ContainerStatus.User and cri-api in kubelet
* Clarify `SupplementalGroupsPolicy` is an OS depdendent field.
* Make `ContainerStatus.User` is initially attached user identity to the first process in the ContainerStatus
It is because, the process identity can be dynamic if the initially attached identity
has enough privilege calling setuid/setgid/setgroups syscalls in Linux.
* Rewording suggestion applied
* Add TODO comment for updating SupplementalGroupsPolicy default value in v1.34
* Added validations for SupplementalGroupsPolicy and ContainerUser
* No need featuregate check in validation when adding new field with no default value
* fix typo: identitiy -> identity
The previous attempt to fix this in
6aa779f4ed (diff-efa2cd1347df22ace5a516ea794152d00ef2a079db135c81787ed920ecb73658)
didn't address the root cause (or perhaps created it, not sure): the goroutine
must not be started if watch creation failed.
Instead, the error gets logged (as before) and an empty watch gets returned to
the caller (new). This is necessary because the function doesn't have an error
return value and changing that now would be disruptive. The empty watch is
valid and usable, so callers won't crash when they calls Stop.
This showed up recently in failed unit tests, probably because test
cancellation makes this error more likely:
"Unable start event watcher (will not retry!)" err="broadcaster already
stopped" logger="TestGarbageCollectorConstruction leaked goroutine"
The logger value and a preceding warning show that this occurs after test
completion.
Currently if etcd.yaml does not have a diff on "kubeadm upgrade"
certificate renewal for it is also skipped.
Check if kube-apiserver.yaml needs an upgrade, if so and if
cert renewal is not disabled, renew etcd's certs and restart
its static pod.
With the new `cri-client` staging repository it's finally possible to
decouple `kubeadm` from `crictl`.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>