Merge pull request #62937 from vikaschoudhary16/fix-dockershim-e2e

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix dockershim e2e

**What this PR does / why we need it**:
Delete the checkpoint file when `GetCheckpoint` fails because the checkpoint is corrupt. Before checkpointmanager was introduced, [`GetCheckpoint` in dockershim deleted a corrupt checkpoint file implicitly](https://github.com/kubernetes/kubernetes/pull/56040/files#diff-9a174fa21408b7faeed35309742cc631L116). checkpointmanager's `GetCheckpoint` does not perform this implicit deletion, so a few e2e tests that verify the deletion are failing.
This PR adds explicit deletion of the checkpoint file when it is found to be corrupt.
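
To illustrate the behavior described above, here is a minimal, self-contained sketch of "read a checkpoint, and remove it explicitly if it is corrupt". It is not the dockershim code: `memStore` and `getOrCleanup` are hypothetical names, and the two sentinel errors are local stand-ins for `ErrCheckpointNotFound` and `ErrCorruptCheckpoint` from the checkpointmanager `errors` package.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins (assumption) for the sentinel errors exported by
// k8s.io/kubernetes/pkg/kubelet/checkpointmanager/errors.
var (
	ErrCheckpointNotFound = errors.New("checkpoint is not found")
	ErrCorruptCheckpoint  = errors.New("checkpoint is corrupted")
)

// memStore is a hypothetical in-memory checkpoint store. The value
// "corrupt" simulates a checkpoint that fails verification on read.
type memStore struct {
	data map[string]string
}

// GetCheckpoint returns the stored value, or a sentinel error if the
// checkpoint is missing or corrupt. It never deletes anything itself.
func (s *memStore) GetCheckpoint(id string) (string, error) {
	v, ok := s.data[id]
	if !ok {
		return "", ErrCheckpointNotFound
	}
	if v == "corrupt" {
		return "", ErrCorruptCheckpoint
	}
	return v, nil
}

// RemoveCheckpoint deletes the checkpoint; deleting a missing one is a no-op.
func (s *memStore) RemoveCheckpoint(id string) error {
	delete(s.data, id)
	return nil
}

// getOrCleanup reads a checkpoint and, when the read reports corruption,
// removes the checkpoint explicitly. The original read error is returned.
func getOrCleanup(s *memStore, id string) (string, error) {
	v, err := s.GetCheckpoint(id)
	if err == ErrCorruptCheckpoint {
		if rmErr := s.RemoveCheckpoint(id); rmErr != nil {
			fmt.Printf("failed to delete corrupt checkpoint %q: %v\n", id, rmErr)
		}
	}
	return v, err
}

func main() {
	s := &memStore{data: map[string]string{"good": "sandbox-a", "bad": "corrupt"}}

	_, err := getOrCleanup(s, "bad")
	fmt.Println("first read: ", err)

	_, err = getOrCleanup(s, "bad")
	fmt.Println("second read:", err)
}
```

In this sketch the first read of `bad` reports corruption and triggers the removal, so the second read reports that the checkpoint is not found. That is the observable effect the failing e2e tests rely on.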

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62738 


**Special notes for your reviewer**:
No new behavior is introduced. The deletion of corrupt checkpoints that previously happened implicitly is now done explicitly.

**Release note**:

```release-note
None
```
/cc @dashpole @sjenning @derekwaynecarr
Kubernetes Submit Queue 2018-04-26 16:26:14 -07:00 committed by GitHub
commit 7ff38a23f0
2 changed files with 14 additions and 0 deletions

@@ -84,6 +84,7 @@ go_library(
        "//pkg/kubelet/apis/kubeletconfig:go_default_library",
        "//pkg/kubelet/checkpointmanager:go_default_library",
        "//pkg/kubelet/checkpointmanager/checksum:go_default_library",
        "//pkg/kubelet/checkpointmanager/errors:go_default_library",
        "//pkg/kubelet/cm:go_default_library",
        "//pkg/kubelet/container:go_default_library",
        "//pkg/kubelet/dockershim/cm:go_default_library",

@@ -31,6 +31,7 @@ import (
	utilerrors "k8s.io/apimachinery/pkg/util/errors"
	runtimeapi "k8s.io/kubernetes/pkg/kubelet/apis/cri/runtime/v1alpha2"
	"k8s.io/kubernetes/pkg/kubelet/checkpointmanager"
	"k8s.io/kubernetes/pkg/kubelet/checkpointmanager/errors"
	kubecontainer "k8s.io/kubernetes/pkg/kubelet/container"
	"k8s.io/kubernetes/pkg/kubelet/dockershim/libdocker"
	"k8s.io/kubernetes/pkg/kubelet/qos"
@@ -209,6 +210,12 @@ func (ds *dockerService) StopPodSandbox(ctx context.Context, r *runtimeapi.StopP
		// actions will only have sandbox ID and not have pod namespace and name information.
		// Return error if encounter any unexpected error.
		if checkpointErr != nil {
			if checkpointErr != errors.ErrCheckpointNotFound {
				err := ds.checkpointManager.RemoveCheckpoint(podSandboxID)
				if err != nil {
					glog.Errorf("Failed to delete corrupt checkpoint for sandbox %q: %v", podSandboxID, err)
				}
			}
			if libdocker.IsContainerNotFoundError(statusErr) {
				glog.Warningf("Both sandbox container and checkpoint for id %q could not be found. "+
					"Proceed without further sandbox information.", podSandboxID)
@@ -517,6 +524,12 @@ func (ds *dockerService) ListPodSandbox(_ context.Context, r *runtimeapi.ListPod
		err := ds.checkpointManager.GetCheckpoint(id, checkpoint)
		if err != nil {
			glog.Errorf("Failed to retrieve checkpoint for sandbox %q: %v", id, err)
			if err == errors.ErrCorruptCheckpoint {
				err = ds.checkpointManager.RemoveCheckpoint(id)
				if err != nil {
					glog.Errorf("Failed to delete corrupt checkpoint for sandbox %q: %v", id, err)
				}
			}
			continue
		}
		result = append(result, checkpointToRuntimeAPISandbox(id, checkpoint))