Merge pull request #55028 from sjenning/remove-orphaned-checkpoints

Automatic merge from submit-queue (batch tested with PRs 55050, 53464, 54936, 55028, 54928). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: dockershim: remove orphaned checkpoint files

Fixes https://github.com/kubernetes/kubernetes/issues/55070

Currently, `ListPodSandbox()` returns a combined list of sandboxes populated from both the runtime and the dockershim checkpoint files. However, the sandboxes recorded in the checkpoint files might no longer exist in the runtime.

The kubelet sees a sandbox returned by `ListPodSandbox()`, determines that it should not be running, and calls `StopPodSandbox()` on it. `StopContainer()` then returns an error because the container does not exist, but the checkpoint file is not cleaned up. As a result, every subsequent call to `StopPodSandbox()` fails in the same way.

This PR removes the checkpoint file if `StopContainer()` fails because the container is not found.

Apart from the corrupt-checkpoint case, the only other place `RemoveCheckpoint()` is called is `RemoveSandbox()`. If the container does not exist, what `RemoveSandbox()` would have done has effectively been done already, so this is just cleanup.
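The behavior described above can be sketched in isolation. This is a minimal illustration, not the dockershim code: `fakeRuntime`, `fakeCheckpoints`, `errNotFound`, `listPodSandbox`, and `stopPodSandbox` are hypothetical stand-ins for the runtime client, the checkpoint handler, and the not-found check, assumed here only to show why removing the checkpoint breaks the retry loop.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errNotFound is a hypothetical stand-in for the runtime's
// "container not found" error.
var errNotFound = errors.New("container not found")

// fakeRuntime is an illustrative stand-in for the container runtime.
type fakeRuntime struct{ containers map[string]bool }

// StopContainer fails with errNotFound when the container is unknown.
func (r *fakeRuntime) StopContainer(id string) error {
	if !r.containers[id] {
		return errNotFound
	}
	return nil
}

// fakeCheckpoints is an illustrative stand-in for the checkpoint files.
type fakeCheckpoints struct{ files map[string]bool }

// RemoveCheckpoint deletes the checkpoint entry for id, if any.
func (c *fakeCheckpoints) RemoveCheckpoint(id string) { delete(c.files, id) }

// listPodSandbox returns the sorted union of sandboxes known to the
// runtime and sandboxes that only exist as checkpoints.
func listPodSandbox(r *fakeRuntime, c *fakeCheckpoints) []string {
	seen := map[string]bool{}
	for id := range r.containers {
		seen[id] = true
	}
	for id := range c.files {
		seen[id] = true
	}
	ids := make([]string, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// stopPodSandbox mirrors the shape of the fix: a not-found error is not
// reported, and the orphaned checkpoint is removed so the sandbox stops
// showing up in later listings.
func stopPodSandbox(r *fakeRuntime, c *fakeCheckpoints, id string) error {
	err := r.StopContainer(id)
	if err == nil {
		return nil
	}
	if !errors.Is(err, errNotFound) {
		return err
	}
	c.RemoveCheckpoint(id)
	return nil
}

func main() {
	r := &fakeRuntime{containers: map[string]bool{"live": true}}
	c := &fakeCheckpoints{files: map[string]bool{"live": true, "orphan": true}}

	fmt.Println(listPodSandbox(r, c)) // includes the orphaned sandbox
	_ = stopPodSandbox(r, c, "orphan")
	fmt.Println(listPodSandbox(r, c)) // orphan no longer listed
}
```

In this sketch, the second listing no longer contains the orphaned sandbox, so a caller that stops everything it considers stale would not retry it.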

@derekwaynecarr @eparis @freehan @dcbw
Kubernetes Submit Queue 2017-11-03 12:59:19 -07:00 committed by GitHub
commit b448dfa0e9

@@ -243,6 +243,9 @@ func (ds *dockerService) StopPodSandbox(podSandboxID string) error {
 		if !libdocker.IsContainerNotFoundError(err) {
 			glog.Errorf("Failed to stop sandbox %q: %v", podSandboxID, err)
 			errList = append(errList, err)
+		} else {
+			// remove the checkpoint for any sandbox that is not found in the runtime
+			ds.checkpointHandler.RemoveCheckpoint(podSandboxID)
 		}
 	}
 	return utilerrors.NewAggregate(errList)