The --docker-disable-shared-pid flag has been deprecated since 1.10 and
has been superseded by ShareProcessNamespace in the pod API, which is
scheduled for beta in 1.12.
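For reference, a minimal sketch of the replacement, assuming the k8s.io/api/core/v1 types (the container name and image are placeholders): PID namespace sharing is now requested per pod rather than configured on the kubelet.
```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	share := true
	pod := v1.Pod{
		Spec: v1.PodSpec{
			// Per-pod replacement for the node-level --docker-disable-shared-pid flag.
			ShareProcessNamespace: &share,
			Containers: []v1.Container{
				{Name: "app", Image: "nginx"},
			},
		},
	}
	fmt.Println("shareProcessNamespace:", *pod.Spec.ShareProcessNamespace)
}
```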
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Refactor kubelet node status setters, add test coverage
This internal refactor moves the node status setters to a new package, explicitly injects dependencies to facilitate unit testing, and adds individual unit tests for the setters.
I gave each setter a distinct commit to facilitate review.
Non-goals:
- I intentionally excluded the class of setters that return a "modified" boolean, as I want to think more carefully about how to cleanly handle the behavior, and this PR is already rather large.
- I would like to clean up the status update control loops as well, but that belongs in a separate PR.
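As a rough illustration of the setter pattern described above (names are illustrative, not necessarily the exact ones in this PR): each setter becomes a function value whose dependencies are injected as arguments, so unit tests can pass fakes directly.
```go
package nodestatus

import (
	"net"

	v1 "k8s.io/api/core/v1"
)

// Setter modifies a node in place and returns an error if the update fails.
type Setter func(node *v1.Node) error

// NodeAddress returns a Setter with its dependencies injected, so a unit test
// can supply a canned address list instead of a real cloud provider.
func NodeAddress(nodeIP net.IP, nodeAddressesFunc func() ([]v1.NodeAddress, error)) Setter {
	return func(node *v1.Node) error {
		addrs, err := nodeAddressesFunc()
		if err != nil {
			return err
		}
		_ = nodeIP // a real setter would also validate/filter against the configured node IP
		node.Status.Addresses = addrs
		return nil
	}
}
```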
```release-note
NONE
```
When --cgroups-per-qos=false (default is true), the kubelet sets pod
container management to the podContainerManagerNoop implementation, and
GetPodContainerName() returns '/' as the cgroup parent (the default cgroup root).
(1) In case of the 'systemd' cgroup driver, '/' is an invalid parent because the
docker daemon expects a '.slice' suffix and throws this error:
'cgroup-parent for systemd cgroup should be a valid slice named as \"xxx.slice\"'
(5fc12449d8/daemon/daemon_unix.go (L618))
'/' corresponds to '-.slice' (the root slice) in systemd, but I don't think
we want to assign the root slice instead of the runtime-specific default value.
In case of the docker runtime, this will be 'system.slice'
(e2593239d9/daemon/oci_linux.go (L698))
(2) In case of the 'cgroupfs' cgroup driver, '/' is a valid parent, but I don't
think we want to assign the root instead of the runtime-specific default value.
In case of the docker runtime, this will be '/docker'
(e2593239d9/daemon/oci_linux.go (L695))
The current fix does not set the cgroup parent when --cgroups-per-qos is disabled.
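A minimal sketch of the resulting behavior, with illustrative names (not the exact kubelet code): when QoS cgroups are disabled, no parent is passed and the runtime falls back to its own default.
```go
package main

import "fmt"

// cgroupParentFor sketches the fix: when --cgroups-per-qos is disabled, return
// an empty parent so the runtime falls back to its own default ("system.slice"
// under the systemd driver, "/docker" under cgroupfs for the docker runtime)
// instead of receiving "/", which the systemd driver rejects.
func cgroupParentFor(cgroupsPerQOS bool, podContainerName string) string {
	if !cgroupsPerQOS {
		return ""
	}
	return podContainerName
}

func main() {
	fmt.Printf("qos disabled: %q\n", cgroupParentFor(false, "/"))                    // "" -> runtime default
	fmt.Printf("qos enabled:  %q\n", cgroupParentFor(true, "/kubepods/pod-example")) // pod-level parent
}
```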
Automatic merge from submit-queue (batch tested with PRs 65771, 65849). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Add a new conversion path to replace GenericConversionFunc
reflect.Call is very expensive. We currently use a hand-written switch block as part of AddGenericConversionFunc to avoid the bulk of the reflect.Call cost for top-level a->b conversions of our primary types. Instead of having these be hand-written, we should generate them.
The pattern for generating them looks like:
```go
scheme.AddConversionFunc(&v1.Type{}, &internal.Type{}, func(a, b interface{}, scope conversion.Scope) error {
	return Convert_v1_Type_to_internal_Type(a.(*v1.Type), b.(*internal.Type), scope)
})
```
which matches AddDefaultObjectFunc (an approach proved out last year). The conversion machinery then does a simple map lookup based on the incoming types and invokes the function. Like defaulting, it's up to the caller to match the types to the arguments, which we do by generating this code. This bypasses reflect.Call and in the future allows Go mid-stack inlining to optimize this code.
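Conceptually, the registry then reduces to a map keyed on the (source, destination) type pair. A simplified sketch of that dispatch follows; it is not the actual scheme internals, scope is omitted, and all names here are illustrative.
```go
package conversion

import (
	"fmt"
	"reflect"
)

// conversionFunc mirrors the registered closure's shape (scope omitted for brevity).
type conversionFunc func(a, b interface{}) error

type typePair struct{ src, dst reflect.Type }

// funcs maps a (source, destination) type pair directly to its conversion
// function, so dispatch is a map lookup plus an ordinary call instead of a
// reflect.Call through a generic conversion path.
var funcs = map[typePair]conversionFunc{}

// AddConversionFunc registers fn for converting *a-like values into *b-like values.
func AddConversionFunc(a, b interface{}, fn conversionFunc) {
	funcs[typePair{reflect.TypeOf(a), reflect.TypeOf(b)}] = fn
}

// Convert looks up the registered function for the concrete types of a and b.
// The closure performs the type assertions itself, just like the generated code above.
func Convert(a, b interface{}) error {
	fn, ok := funcs[typePair{reflect.TypeOf(a), reflect.TypeOf(b)}]
	if !ok {
		return fmt.Errorf("no conversion registered for %T -> %T", a, b)
	}
	return fn(a, b)
}
```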
As part of this change I made the registration of custom functions generated instead of hand-registered, and also strengthened the generator's error checking to error out when it sees a manual conversion. Since custom functions are automatically used by the generator, we don't really have a case for not registering the functions.
Once this is fully tested out, we can remove the reflection based path and the old registration methods, and all conversion will work from point to point methods (whether generated or custom).
Much of the need for the reflection path has been removed by changes to generation (to omit fields) and changes to Go (to make assigning equivalent structs easy).
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66175, 66324, 65828, 65901, 66350). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Start cloudResourceSyncsManager before getNodeAnyWay (initializeModules) to avoid the kubelet getting stuck retrieving node addresses from a cloud provider.
**What this PR does / why we need it**:
This PR starts cloudResourceSyncsManager before getNodeAnyWay (initializeModules); otherwise the kubelet gets stuck forever in setNodeAddress->kl.cloudResourceSyncManager.NodeAddresses() (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L470) retrieving node addresses from a cloud provider, and because of that cloudResourceSyncsManager never gets started at all.
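The ordering problem and the fix boil down to starting the producer goroutine before any consumer blocks on it. A self-contained sketch follows; the addressSyncer type merely stands in for the kubelet's cloudResourceSyncManager and is not real kubelet code.
```go
package main

import "fmt"

// addressSyncer stands in for the cloudResourceSyncManager: Run polls the
// cloud provider and publishes addresses; NodeAddresses blocks until the
// first result is available.
type addressSyncer struct {
	addrs chan []string
}

func (s *addressSyncer) Run(stop <-chan struct{}) {
	s.addrs <- []string{"10.0.0.1"} // pretend this came from the cloud provider
	<-stop
}

func (s *addressSyncer) NodeAddresses() []string {
	return <-s.addrs
}

func main() {
	s := &addressSyncer{addrs: make(chan []string, 1)}
	stop := make(chan struct{})

	// The fix in spirit: start the syncer before getNodeAnyWay/initializeModules
	// reaches setNodeAddress. If this goroutine were started only after the
	// NodeAddresses() call below, main would block forever, mirroring the bug.
	go s.Run(stop)

	fmt.Println("node addresses:", s.NodeAddresses())
	close(stop)
}
```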
**Release note**:
```release-note
None
```
@ingvagabund @derekwaynecarr @sjenning @kubernetes/sig-node-bugs
Instead of hiding these behind a helper, we just register them in a
uniform way. We are careful to keep the call-order of the setters the
same, though we can consider re-ordering in a future PR to achieve
fewer appends.
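In spirit, registration is now just appending function values to one slice in the existing call order. A self-contained toy sketch follows (a plain string stands in for the node object; nothing here is the actual kubelet code):
```go
package main

import "fmt"

// setter is a toy stand-in for the node status Setter type.
type setter func(node *string) error

// defaultNodeStatusFuncs registers every setter in one slice, preserving the
// call order the old helper used; the update loop just ranges over it.
func defaultNodeStatusFuncs() []setter {
	var setters []setter
	setters = append(setters,
		func(node *string) error { *node += " +addresses"; return nil },
		func(node *string) error { *node += " +machine-info"; return nil },
		// ... remaining setters, kept in the original order ...
	)
	return setters
}

func main() {
	node := "node:"
	for _, set := range defaultNodeStatusFuncs() {
		if err := set(&node); err != nil {
			fmt.Println("setter failed:", err)
		}
	}
	fmt.Println(node)
}
```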
Automatic merge from submit-queue (batch tested with PRs 65052, 65594). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Derive kubelet serving certificate CSR template from node status addresses
xref https://github.com/kubernetes/features/issues/267, fixes #55633
Builds on https://github.com/kubernetes/kubernetes/pull/65587
* Makes the cloud provider authoritative when recording node status addresses
* Makes the node status addresses authoritative for the kube-apiserver determining how to speak to a kubelet (stops paying attention to the hostname label when determining how to reach a kubelet, which was only done to support kubelets < 1.5)
* Updates kubelet certificate rotation to be driven from node status (see the sketch after this list)
* Avoids needing to compute node addresses a second time, and differently, in order to request serving certificates.
* Allows the kubelet to react to changes in its status addresses by updating its serving certificate
* Allows the kubelet to be driven by external cloud providers recording node addresses on the node status
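A simplified sketch of the CSR-template derivation implied above (the nodeAddress type and splitAddresses helper are illustrative, not the kubelet's actual code): status addresses are split into DNS and IP subject alternative names.
```go
package main

import (
	"fmt"
	"net"
)

// nodeAddress mirrors the shape of a v1.NodeAddress for illustration.
type nodeAddress struct {
	Type    string // "Hostname", "InternalIP", "ExternalIP", ...
	Address string
}

// splitAddresses turns node status addresses into SAN entries for the kubelet
// serving certificate CSR: parseable IPs become IP SANs, everything else
// becomes a DNS SAN. Duplicate and empty addresses are skipped.
func splitAddresses(addrs []nodeAddress) (dnsNames []string, ips []net.IP) {
	seen := map[string]bool{}
	for _, a := range addrs {
		if a.Address == "" || seen[a.Address] {
			continue
		}
		seen[a.Address] = true
		if ip := net.ParseIP(a.Address); ip != nil {
			ips = append(ips, ip)
		} else {
			dnsNames = append(dnsNames, a.Address)
		}
	}
	return dnsNames, ips
}

func main() {
	dns, ips := splitAddresses([]nodeAddress{
		{Type: "Hostname", Address: "localhost"},
		{Type: "InternalIP", Address: "127.0.0.1"},
	})
	fmt.Println(dns, ips) // the CSR template would carry both as SANs
}
```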
test procedure:
```sh
# setup
export FEATURE_GATES=RotateKubeletServerCertificate=true
export KUBELET_FLAGS="--rotate-server-certificates=true --cloud-provider=external"
# cleanup from previous runs
sudo rm -fr /var/lib/kubelet/pki/
# startup
hack/local-up-cluster.sh
# wait for a node to register, verify it didn't set addresses
kubectl get nodes
kubectl get node/127.0.0.1 -o jsonpath={.status.addresses}
# verify the kubelet server isn't available, and that it didn't populate a serving certificate
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
ls -la /var/lib/kubelet/pki
# set an address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
-H "Content-Type: application/merge-patch+json" \
--data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"}]}}'
# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...
# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname, but NOT the IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
# set a hostname and IP address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
-H "Content-Type: application/merge-patch+json" \
--data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"},{"type":"InternalIP","address":"127.0.0.1"}]}}'
# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...
# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname AND IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
```
```release-note
* kubelets that specify `--cloud-provider` now only report addresses in Node status as determined by the cloud provider
* kubelet serving certificate rotation now reacts to changes in reported node addresses, and will request certificates for addresses set by an external cloud provider
```
Automatic merge from submit-queue (batch tested with PRs 66076, 65792, 65649). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
kubernetes: fix printf format errors
These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.
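For illustration, the class of mistake the stricter checker reports looks like this (example fabricated for clarity, not taken from the diff):
```go
package main

import "fmt"

func main() {
	name, count := "kubelet", 3

	// The kind of call Go 1.11's vet flags (wrong verb for the argument type):
	//   fmt.Printf("component %d restarted %s times\n", name, count)
	// and missing arguments for format verbs:
	//   fmt.Printf("component %s restarted %d times\n", name)

	// Corrected form:
	fmt.Printf("component %s restarted %d times\n", name, count)
}
```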
```release-note
NONE
```
Lubomir I. Ivanov <neolit123@gmail.com>
applied an amend for:
pkg/cloudprovider/providers/vsphere/nodemanager.go
Automatic merge from submit-queue (batch tested with PRs 65987, 65962). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Fix pod worker deadlock.
Preemption will be stuck forever if `killPodNow` times out once. The sequence is:
* `killPodNow` creates the response channel (size 0) and sends it to the pod worker.
* `killPodNow` times out and returns.
* The pod worker finishes killing the pod and tries to send the response back via the channel.
However, because the channel size is 0 and the receiver has already exited, the pod worker will be stuck forever.
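A self-contained sketch of the failure mode and one way out of it (a response channel with a buffer of one, so the worker's send cannot block; this illustrates the idea, not necessarily the exact change in this PR):
```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Buffered with capacity 1: the worker's send succeeds even if the caller
	// already timed out and will never receive. With an unbuffered channel
	// (capacity 0), that send would block the pod worker forever.
	respCh := make(chan error, 1)

	// Pod worker: finishes killing the pod after the caller's timeout expired.
	go func() {
		time.Sleep(200 * time.Millisecond)
		respCh <- nil // does not block, thanks to the buffer
	}()

	// killPodNow-style caller: gives up after a short timeout.
	select {
	case err := <-respCh:
		fmt.Println("pod killed:", err)
	case <-time.After(50 * time.Millisecond):
		fmt.Println("timed out waiting for pod worker")
	}

	// Give the worker time to finish so the example exits cleanly.
	time.Sleep(300 * time.Millisecond)
}
```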
In @jingxu97's case, this causes a critical system pod (apiserver) to be unable to come up, because the csi pod can't be preempted.
I checked the history, and the bug was introduced 2 years ago in 6fefb428c1.
I think we should at least cherry-pick this to `1.11`, since preemption is beta and enabled by default in 1.11.
@kubernetes/sig-node-bugs @derekwaynecarr @dashpole @yujuhong
Signed-off-by: Lantao Liu <lantaol@google.com>
```release-note
none
```