Merge pull request #65987 from Random-Liu/fix-pod-worker-deadlock


Automatic merge from submit-queue (batch tested with PRs 65987, 65962). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix pod worker deadlock.

Preemption gets stuck forever if `killPodNow` times out even once. The sequence is:
* `killPodNow` creates the response channel (size 0) and sends it to the pod worker.
* `killPodNow` times out and returns.
* The pod worker finishes killing the pod and tries to send the response back via the channel.

However, because the channel size is 0 and the receiver has already exited, the pod worker blocks on the send forever.
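
The sequence above can be reproduced outside the kubelet with a small standalone sketch. This is not the kubelet code: `workerFinishes`, the sleep durations, and the `done` channel are hypothetical names and values chosen for illustration, and only the `response` struct and the channel sizes (0 versus 1) mirror the change in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// response mirrors the shape of the struct used in killPodNow.
type response struct {
	err error
}

// workerFinishes is a hypothetical helper: it reports whether a simulated
// pod worker can deliver its response after the caller has already timed
// out and stopped receiving from the channel.
func workerFinishes(bufSize int) bool {
	ch := make(chan response, bufSize)
	done := make(chan struct{})

	// Simulated pod worker: killing the pod takes longer than the caller waits.
	go func() {
		time.Sleep(100 * time.Millisecond)
		ch <- response{err: nil} // blocks forever if ch is unbuffered and nobody receives
		close(done)
	}()

	// Simulated killPodNow: wait for the response or give up after a timeout.
	select {
	case <-ch:
	case <-time.After(10 * time.Millisecond):
		// Timed out: the caller returns and never reads from ch again.
	}

	// Observe whether the worker got past its send.
	select {
	case <-done:
		return true
	case <-time.After(500 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("unbuffered:", workerFinishes(0))
	fmt.Println("buffered(1):", workerFinishes(1))
}
```

With a buffer of 1 the send completes even though no receiver is left, so the worker goroutine can move on; with size 0 it stays parked on the send, which is the stuck state described above.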

In @jingxu97's case, this prevented a critical system pod (apiserver) from coming up, because the CSI pod couldn't be preempted.

I checked the history, and the bug was introduced 2 years ago in 6fefb428c1.

I think we should at least cherry-pick this to `1.11`, since preemption is beta and enabled by default in 1.11.

@kubernetes/sig-node-bugs @derekwaynecarr @dashpole @yujuhong 
Signed-off-by: Lantao Liu <lantaol@google.com>

```release-note
none
```

Author: Kubernetes Submit Queue (committed by GitHub)
Date: 2018-07-09 16:53:59 -07:00
parent 24ee75e265 0f4c739b2c
commit 55620e2be6
GPG Key ID: 4AEE18F83AFDEB23 (no known key found for this signature in database)

1 changed file with 1 addition and 1 deletion

pkg/kubelet/pod_workers.go

@@ -306,7 +306,7 @@ func killPodNow(podWorkers PodWorkers, recorder record.EventRecorder) eviction.K
 		type response struct {
 			err error
 		}
-		ch := make(chan response)
+		ch := make(chan response, 1)
 		podWorkers.UpdatePod(&UpdatePodOptions{
 			Pod:        pod,
 			UpdateType: kubetypes.SyncPodKill,
