Merge pull request #54509 from vmware/node_poweroff_test

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

E2E test to verify pod failover during node power-off

**What this PR does / why we need it**:

This PR adds test to verify volume status after the node where the pod got provisioned being powered off and failed over to a different node.

Test performs following tasks:

1. Create a StorageClass
2. Create a PVC with the StorageClass
3. Create a Deployment with 1 replica, using the PVC
4. Verify the pod got provisioned on a node
5. Verify the volume is attached to the node
6. Power off the node where pod got provisioned
7. Verify the pod got provisioned on a different node
8. Verify the volume is attached to the new node
9. Verify the volume is detached from the previous node
10. Power on the previous node
11. Delete the Deployment
12. Delete the PVC
13. Delete the StorageClass

**Which issue this PR fixes**:

Fixes https://github.com/vmware/kubernetes/issues/272

**Special notes for your reviewer**:

Test logs:
```
# go run hack/e2e.go --check-version-skew=false --v --test --test_args='--ginkgo.focus=Node\sPoweroff'
flag provided but not defined: -check-version-skew
Usage of /tmp/go-build212295472/command-line-arguments/_obj/exe/e2e:
  -get
                go get -u kubetest if old or not installed (default true)
  -old duration
                Consider kubetest old if it exceeds this (default 24h0m0s)
2017/10/24 11:48:28 e2e.go:55: NOTICE: go run hack/e2e.go is now a shim for test-infra/kubetest
2017/10/24 11:48:28 e2e.go:56:   Usage: go run hack/e2e.go [--get=true] [--old=24h0m0s] -- [KUBETEST_ARGS]
2017/10/24 11:48:28 e2e.go:57:   The separator is required to use --get or --old flags
2017/10/24 11:48:28 e2e.go:58:   The -- flag separator also suppresses this message
2017/10/24 11:48:28 e2e.go:77: Calling kubetest --check-version-skew=false --v --test --test_args=--ginkgo.focus=Node\sPoweroff...
2017/10/24 11:48:28 util.go:154: Running: ./cluster/kubectl.sh --match-server-version=false version
2017/10/24 11:48:28 util.go:156: Step './cluster/kubectl.sh --match-server-version=false version' finished in 350.700421ms
2017/10/24 11:48:28 util.go:154: Running: ./hack/e2e-internal/e2e-status.sh
Skeleton Provider: prepare-e2e not implemented
Client Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.0-alpha.1.1627+54fc02df4a3a2a", GitCommit:"54fc02df4a3a2a12e14fb72d84a1aaa658ba6689", GitTreeState:"clean", BuildDate:"2017-10-24T18:33:37Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.0-alpha.1.1437+ba66fcb63de9e9", GitCommit:"ba66fcb63de9e9b72e2ccf8b823df33a22df0522", GitTreeState:"clean", BuildDate:"2017-10-20T07:16:05Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}
2017/10/24 11:48:28 util.go:156: Step './hack/e2e-internal/e2e-status.sh' finished in 315.334518ms
2017/10/24 11:48:28 util.go:154: Running: ./hack/ginkgo-e2e.sh --ginkgo.focus=Node\sPoweroff
Conformance test: not doing test setup.
Oct 24 11:48:30.391: INFO: Overriding default scale value of zero to 1
Oct 24 11:48:30.391: INFO: Overriding default milliseconds value of zero to 5000
I1024 11:48:30.637436     409 e2e.go:378] Starting e2e run "ed9fdfc7-b8eb-11e7-a595-0050569c26b8" on Ginkgo node 1
Running Suite: Kubernetes e2e suite
===================================
Random Seed: 1508870909 - Will randomize all specs
Will run 1 of 717 specs
 
Oct 24 11:48:30.678: INFO: >>> kubeConfig: /root/.kube/config
Oct 24 11:48:30.685: INFO: Waiting up to 4h0m0s for all (but 0) nodes to be schedulable
Oct 24 11:48:30.719: INFO: Waiting up to 10m0s for all pods (need at least 0) in namespace 'kube-system' to be running and ready
Oct 24 11:48:30.857: INFO: 17 / 17 pods in namespace 'kube-system' are running and ready (0 seconds elapsed)
Oct 24 11:48:30.857: INFO: expected 4 pod replicas in namespace 'kube-system', 4 are Running and Ready.
Oct 24 11:48:30.863: INFO: Waiting for pods to enter Success, but no pods in "kube-system" match label map[name:e2e-image-puller]
Oct 24 11:48:30.863: INFO: Dumping network health container logs from all nodes...
Oct 24 11:48:30.877: INFO: Client version: v1.9.0-alpha.1.1627+54fc02df4a3a2a
Oct 24 11:48:30.879: INFO: Server version: v1.9.0-alpha.1.1437+ba66fcb63de9e9
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[sig-storage] Node Poweroff [Feature:vsphere] [Slow] [Disruptive]
  verify volume status after node power off
  /root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_node_poweroff.go:149
[BeforeEach] [sig-storage] Node Poweroff [Feature:vsphere] [Slow] [Disruptive]
  /root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:133
STEP: Creating a kubernetes client
Oct 24 11:48:30.885: INFO: >>> kubeConfig: /root/.kube/config
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [sig-storage] Node Poweroff [Feature:vsphere] [Slow] [Disruptive]
  /root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_node_poweroff.go:64
Oct 24 11:48:30.984: INFO: Waiting up to 4h0m0s for all (but 0) nodes to be schedulable
[It] verify volume status after node power off
  /root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_node_poweroff.go:149
STEP: Creating a Storage Class
STEP: Creating PVC using the Storage Class
STEP: Waiting for PVC to be in bound phase
Oct 24 11:48:31.141: INFO: Waiting up to 5m0s for PersistentVolumeClaim pvc-zxz56 to have phase Bound
Oct 24 11:48:31.150: INFO: PersistentVolumeClaim pvc-zxz56 found but phase is Pending instead of Bound.
Oct 24 11:48:33.155: INFO: PersistentVolumeClaim pvc-zxz56 found and phase=Bound (2.013403698s)
STEP: Creating a Deployment
I1024 11:48:33.180161     409 deployment_util.go:254] Waiting deployment "deployment-ef6b820e-b8eb-11e7-a595-0050569c26b8" to complete
Oct 24 11:48:33.192: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:0, Replicas:0, UpdatedReplicas:0, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:0, Conditions:[]v1beta1.DeploymentCondition(nil), CollisionCount:(*int32)(nil)}
Oct 24 11:48:35.197: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:37.197: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:39.196: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:41.197: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:43.197: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:45.197: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:47.198: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:49.198: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:51.196: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
Oct 24 11:48:53.197: INFO: deployment status: v1beta1.DeploymentStatus{ObservedGeneration:1, Replicas:1, UpdatedReplicas:1, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:1, Conditions:[]v1beta1.DeploymentCondition{v1beta1.DeploymentCondition{Type:"Available", Status:"True", LastUpdateTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, LastTransitionTime:v1.Time{Time:time.Time{sec:63644467713, nsec:0, loc:(*time.Location)(0x5db10c0)}}, Reason:"MinimumReplicasAvailable", Message:"Deployment has minimum availability."}}, CollisionCount:(*int32)(nil)}
STEP: Get pod from the deployement
STEP: Verify disk is attached to the node: kubernetes-node5
STEP: Power off the node: kubernetes-node5
Oct 24 11:49:07.337: INFO: Waiting for pod to be failed over from "kubernetes-node5"
Oct 24 11:49:17.336: INFO: Waiting for pod to be failed over from "kubernetes-node5"
Oct 24 11:49:27.340: INFO: Waiting for pod to be failed over from "kubernetes-node5"
Oct 24 11:49:37.340: INFO: The pod has been failed over from "kubernetes-node5" to "kubernetes-node7"
STEP: Waiting for disk to be attached to the new node: kubernetes-node7
Oct 24 11:49:47.534: INFO: Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" has successfully attached to "kubernetes-node7".
STEP: Waiting for disk to be detached from the previous node: kubernetes-node5
Oct 24 11:49:57.707: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:50:07.702: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:50:17.710: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:50:27.733: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:50:37.713: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:50:47.723: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:50:57.705: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:51:07.710: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:51:17.719: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:51:27.716: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:51:37.717: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:51:47.712: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:51:57.707: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:52:07.724: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:52:17.716: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:52:27.711: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:52:37.716: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:52:47.709: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:52:57.714: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:53:07.715: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:53:17.711: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:53:27.714: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:53:37.713: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:53:47.705: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:53:57.711: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:54:07.712: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:54:17.705: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:54:27.712: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:54:37.707: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:54:47.698: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:54:57.705: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:55:07.711: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:55:17.699: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:55:27.702: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:55:37.704: INFO: Waiting for Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" to detach from "kubernetes-node5".
Oct 24 11:55:47.703: INFO: Volume "[vsanDatastore] 165fee59-08d0-b42e-e5c4-020047ab7bb1/kubernetes-dynamic-pvc-ee347cf2-b8eb-11e7-8558-005056a2ed7b.vmdk" has successfully detached from "kubernetes-node5".
STEP: Power on the previous node: kubernetes-node5
Oct 24 11:55:49.168: INFO: Deleting PersistentVolumeClaim "pvc-zxz56"
[AfterEach] [sig-storage] Node Poweroff [Feature:vsphere] [Slow] [Disruptive]
  /root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:134
Oct 24 11:55:49.253: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "e2e-tests-node-poweroff-l245b" for this suite.
Oct 24 11:55:57.630: INFO: namespace: e2e-tests-node-poweroff-l245b, resource: bindings, ignored listing per whitelist
Oct 24 11:55:57.643: INFO: namespace e2e-tests-node-poweroff-l245b deletion completed in 8.379395732s
 
• [SLOW TEST:446.758 seconds]
[sig-storage] Node Poweroff [Feature:vsphere] [Slow] [Disruptive]
/root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/storage/framework.go:22
  verify volume status after node power off
  /root/divyenp/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_node_poweroff.go:149
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSOct 24 11:55:57.647: INFO: Running AfterSuite actions on all node
Oct 24 11:55:57.647: INFO: Running AfterSuite actions on node 1
 
Ran 1 of 717 Specs in 446.969 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 716 Skipped PASS
 
Ginkgo ran 1 suite in 7m27.797177022s
Test Suite Passed
2017/10/24 11:55:57 util.go:156: Step './hack/ginkgo-e2e.sh --ginkgo.focus=Node\sPoweroff' finished in 7m28.760818768s
2017/10/24 11:55:57 e2e.go:81: Done
```
VMware Reviewers: @divyenpatel @pshahzeb 

**Release note**:

```release-note
NONE
```
This commit is contained in:
Kubernetes Submit Queue 2017-11-14 00:56:26 -08:00 committed by GitHub
commit ea66c00522
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 328 additions and 19 deletions

View File

@ -25,6 +25,7 @@ import (
"k8s.io/api/core/v1"
extensions "k8s.io/api/extensions/v1beta1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/uuid"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/apimachinery/pkg/watch"
clientset "k8s.io/client-go/kubernetes"
@ -211,3 +212,86 @@ func WaitForDeploymentRevision(c clientset.Interface, d *extensions.Deployment,
func CheckDeploymentRevisionAndImage(c clientset.Interface, ns, deploymentName, revision, image string) error {
return testutils.CheckDeploymentRevisionAndImage(c, ns, deploymentName, revision, image)
}
func CreateDeployment(client clientset.Interface, replicas int32, podLabels map[string]string, namespace string, pvclaims []*v1.PersistentVolumeClaim, command string) (*extensions.Deployment, error) {
deploymentSpec := MakeDeployment(replicas, podLabels, namespace, pvclaims, false, command)
deployment, err := client.Extensions().Deployments(namespace).Create(deploymentSpec)
if err != nil {
return nil, fmt.Errorf("deployment %q Create API error: %v", deploymentSpec.Name, err)
}
Logf("Waiting deployment %q to complete", deploymentSpec.Name)
err = WaitForDeploymentComplete(client, deployment)
if err != nil {
return nil, fmt.Errorf("deployment %q failed to complete: %v", deploymentSpec.Name, err)
}
return deployment, nil
}
// MakeDeployment creates a deployment definition based on the namespace. The deployment references the PVC's
// name. A slice of BASH commands can be supplied as args to be run by the pod
func MakeDeployment(replicas int32, podLabels map[string]string, namespace string, pvclaims []*v1.PersistentVolumeClaim, isPrivileged bool, command string) *extensions.Deployment {
if len(command) == 0 {
command = "while true; do sleep 1; done"
}
zero := int64(0)
deploymentName := "deployment-" + string(uuid.NewUUID())
deploymentSpec := &extensions.Deployment{
ObjectMeta: metav1.ObjectMeta{
Name: deploymentName,
Namespace: namespace,
},
Spec: extensions.DeploymentSpec{
Replicas: &replicas,
Template: v1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: podLabels,
},
Spec: v1.PodSpec{
TerminationGracePeriodSeconds: &zero,
Containers: []v1.Container{
{
Name: "write-pod",
Image: "busybox",
Command: []string{"/bin/sh"},
Args: []string{"-c", command},
SecurityContext: &v1.SecurityContext{
Privileged: &isPrivileged,
},
},
},
RestartPolicy: v1.RestartPolicyAlways,
},
},
},
}
var volumeMounts = make([]v1.VolumeMount, len(pvclaims))
var volumes = make([]v1.Volume, len(pvclaims))
for index, pvclaim := range pvclaims {
volumename := fmt.Sprintf("volume%v", index+1)
volumeMounts[index] = v1.VolumeMount{Name: volumename, MountPath: "/mnt/" + volumename}
volumes[index] = v1.Volume{Name: volumename, VolumeSource: v1.VolumeSource{PersistentVolumeClaim: &v1.PersistentVolumeClaimVolumeSource{ClaimName: pvclaim.Name, ReadOnly: false}}}
}
deploymentSpec.Spec.Template.Spec.Containers[0].VolumeMounts = volumeMounts
deploymentSpec.Spec.Template.Spec.Volumes = volumes
return deploymentSpec
}
// GetPodsForDeployment gets pods for the given deployment
func GetPodsForDeployment(client clientset.Interface, deployment *extensions.Deployment) (*v1.PodList, error) {
replicaSet, err := deploymentutil.GetNewReplicaSet(deployment, client.ExtensionsV1beta1())
if err != nil {
return nil, fmt.Errorf("Failed to get new replica set for deployment %q: %v", deployment.Name, err)
}
if replicaSet == nil {
return nil, fmt.Errorf("expected a new replica set for deployment %q, found none", deployment.Name)
}
podListFunc := func(namespace string, options metav1.ListOptions) (*v1.PodList, error) {
return client.Core().Pods(namespace).List(options)
}
rsList := []*extensions.ReplicaSet{replicaSet}
podList, err := deploymentutil.ListPods(deployment, rsList, podListFunc)
if err != nil {
return nil, fmt.Errorf("Failed to list Pods of Deployment %q: %v", deployment.Name, err)
}
return podList, nil
}

View File

@ -33,6 +33,7 @@ go_library(
"vsphere_volume_disksize.go",
"vsphere_volume_fstype.go",
"vsphere_volume_master_restart.go",
"vsphere_volume_node_poweroff.go",
"vsphere_volume_ops_storm.go",
"vsphere_volume_perf.go",
"vsphere_volume_placement.go",
@ -63,6 +64,7 @@ go_library(
"//vendor/google.golang.org/api/googleapi:go_default_library",
"//vendor/k8s.io/api/batch/v1:go_default_library",
"//vendor/k8s.io/api/core/v1:go_default_library",
"//vendor/k8s.io/api/extensions/v1beta1:go_default_library",
"//vendor/k8s.io/api/policy/v1beta1:go_default_library",
"//vendor/k8s.io/api/rbac/v1beta1:go_default_library",
"//vendor/k8s.io/api/storage/v1:go_default_library",

View File

@ -46,6 +46,14 @@ const (
storageclass4 = "sc-user-specified-ds"
)
// volumeState represents the state of a volume.
type volumeState int32
const (
volumeStateDetached volumeState = 1
volumeStateAttached volumeState = 2
)
// Sanity check for vSphere testing. Verify the persistent disk attached to the node.
func verifyVSphereDiskAttached(vsp *vsphere.VSphere, volumePath string, nodeName types.NodeName) (bool, error) {
var (
@ -101,40 +109,58 @@ func waitForVSphereDisksToDetach(vsp *vsphere.VSphere, nodeVolumes map[k8stype.N
return nil
}
// Wait until vsphere vmdk is deteched from the given node or time out after 5 minutes
func waitForVSphereDiskToDetach(vsp *vsphere.VSphere, volumePath string, nodeName types.NodeName) error {
// Wait until vsphere vmdk moves to expected state on the given node, or time out after 6 minutes
func waitForVSphereDiskStatus(vsp *vsphere.VSphere, volumePath string, nodeName types.NodeName, expectedState volumeState) error {
var (
err error
diskAttached = true
detachTimeout = 5 * time.Minute
detachPollTime = 10 * time.Second
err error
diskAttached bool
currentState volumeState
timeout = 6 * time.Minute
pollTime = 10 * time.Second
)
if vsp == nil {
vsp, err = vsphere.GetVSphere()
if err != nil {
return err
}
var attachedState = map[bool]volumeState{
true: volumeStateAttached,
false: volumeStateDetached,
}
err = wait.Poll(detachPollTime, detachTimeout, func() (bool, error) {
var attachedStateMsg = map[volumeState]string{
volumeStateAttached: "attached to",
volumeStateDetached: "detached from",
}
err = wait.Poll(pollTime, timeout, func() (bool, error) {
diskAttached, err = verifyVSphereDiskAttached(vsp, volumePath, nodeName)
if err != nil {
return true, err
}
if !diskAttached {
framework.Logf("Volume %q appears to have successfully detached from %q.",
volumePath, nodeName)
currentState = attachedState[diskAttached]
if currentState == expectedState {
framework.Logf("Volume %q has successfully %s %q", volumePath, attachedStateMsg[currentState], nodeName)
return true, nil
}
framework.Logf("Waiting for Volume %q to detach from %q.", volumePath, nodeName)
framework.Logf("Waiting for Volume %q to be %s %q.", volumePath, attachedStateMsg[expectedState], nodeName)
return false, nil
})
if err != nil {
return err
}
if diskAttached {
return fmt.Errorf("Gave up waiting for Volume %q to detach from %q after %v", volumePath, nodeName, detachTimeout)
if currentState != expectedState {
err = fmt.Errorf("Gave up waiting for Volume %q to be %s %q after %v", volumePath, attachedStateMsg[expectedState], nodeName, timeout)
}
return nil
return err
}
// Wait until vsphere vmdk is attached from the given node or time out after 6 minutes
func waitForVSphereDiskToAttach(vsp *vsphere.VSphere, volumePath string, nodeName types.NodeName) error {
return waitForVSphereDiskStatus(vsp, volumePath, nodeName, volumeStateAttached)
}
// Wait until vsphere vmdk is detached from the given node or time out after 6 minutes
func waitForVSphereDiskToDetach(vsp *vsphere.VSphere, volumePath string, nodeName types.NodeName) error {
return waitForVSphereDiskStatus(vsp, volumePath, nodeName, volumeStateDetached)
}
// function to create vsphere volume spec with given VMDK volume path, Reclaim Policy and labels

View File

@ -0,0 +1,197 @@
/*
Copyright 2017 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package storage
import (
"fmt"
"os"
"path/filepath"
"time"
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"
"github.com/vmware/govmomi/find"
"golang.org/x/net/context"
vimtypes "github.com/vmware/govmomi/vim25/types"
"k8s.io/api/core/v1"
extensions "k8s.io/api/extensions/v1beta1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/pkg/cloudprovider/providers/vsphere"
"k8s.io/kubernetes/test/e2e/framework"
)
/*
Test to verify volume status after node power off:
1. Verify the pod got provisioned on a different node with volume attached to it
2. Verify the volume is detached from the powered off node
*/
var _ = SIGDescribe("Node Poweroff [Feature:vsphere] [Slow] [Disruptive]", func() {
f := framework.NewDefaultFramework("node-poweroff")
var (
client clientset.Interface
namespace string
vsp *vsphere.VSphere
workingDir string
err error
)
BeforeEach(func() {
framework.SkipUnlessProviderIs("vsphere")
client = f.ClientSet
namespace = f.Namespace.Name
framework.ExpectNoError(framework.WaitForAllNodesSchedulable(client, framework.TestContext.NodeSchedulableTimeout))
nodeList := framework.GetReadySchedulableNodesOrDie(f.ClientSet)
Expect(nodeList.Items).NotTo(BeEmpty(), "Unable to find ready and schedulable Node")
Expect(len(nodeList.Items) > 1).To(BeTrue(), "At least 2 nodes are required for this test")
vsp, err = vsphere.GetVSphere()
Expect(err).NotTo(HaveOccurred())
workingDir = os.Getenv("VSPHERE_WORKING_DIR")
Expect(workingDir).NotTo(BeEmpty())
})
/*
Steps:
1. Create a StorageClass
2. Create a PVC with the StorageClass
3. Create a Deployment with 1 replica, using the PVC
4. Verify the pod got provisioned on a node
5. Verify the volume is attached to the node
6. Power off the node where pod got provisioned
7. Verify the pod got provisioned on a different node
8. Verify the volume is attached to the new node
9. Verify the volume is detached from the old node
10. Delete the Deployment and wait for the volume to be detached
11. Delete the PVC
12. Delete the StorageClass
*/
It("verify volume status after node power off", func() {
By("Creating a Storage Class")
storageClassSpec := getVSphereStorageClassSpec("test-sc", nil)
storageclass, err := client.StorageV1().StorageClasses().Create(storageClassSpec)
Expect(err).NotTo(HaveOccurred(), fmt.Sprintf("Failed to create storage class with err: %v", err))
defer client.StorageV1().StorageClasses().Delete(storageclass.Name, nil)
By("Creating PVC using the Storage Class")
pvclaimSpec := getVSphereClaimSpecWithStorageClassAnnotation(namespace, "1Gi", storageclass)
pvclaim, err := framework.CreatePVC(client, namespace, pvclaimSpec)
Expect(err).NotTo(HaveOccurred(), fmt.Sprintf("Failed to create PVC with err: %v", err))
defer framework.DeletePersistentVolumeClaim(client, pvclaim.Name, namespace)
By("Waiting for PVC to be in bound phase")
pvclaims := []*v1.PersistentVolumeClaim{pvclaim}
pvs, err := framework.WaitForPVClaimBoundPhase(client, pvclaims, framework.ClaimProvisionTimeout)
Expect(err).NotTo(HaveOccurred(), fmt.Sprintf("Failed to wait until PVC phase set to bound: %v", err))
volumePath := pvs[0].Spec.VsphereVolume.VolumePath
By("Creating a Deployment")
deployment, err := framework.CreateDeployment(client, int32(1), map[string]string{"test": "app"}, namespace, pvclaims, "")
defer client.Extensions().Deployments(namespace).Delete(deployment.Name, &metav1.DeleteOptions{})
By("Get pod from the deployement")
podList, err := framework.GetPodsForDeployment(client, deployment)
Expect(podList.Items).NotTo(BeEmpty())
pod := podList.Items[0]
node1 := types.NodeName(pod.Spec.NodeName)
By(fmt.Sprintf("Verify disk is attached to the node: %v", node1))
isAttached, err := verifyVSphereDiskAttached(vsp, volumePath, node1)
Expect(err).NotTo(HaveOccurred())
Expect(isAttached).To(BeTrue(), "Disk is not attached to the node")
By(fmt.Sprintf("Power off the node: %v", node1))
govMoMiClient, err := vsphere.GetgovmomiClient(nil)
Expect(err).NotTo(HaveOccurred())
f := find.NewFinder(govMoMiClient.Client, true)
ctx, _ := context.WithCancel(context.Background())
vmPath := filepath.Join(workingDir, string(node1))
vm, err := f.VirtualMachine(ctx, vmPath)
Expect(err).NotTo(HaveOccurred())
_, err = vm.PowerOff(ctx)
Expect(err).NotTo(HaveOccurred())
defer vm.PowerOn(ctx)
err = vm.WaitForPowerState(ctx, vimtypes.VirtualMachinePowerStatePoweredOff)
Expect(err).NotTo(HaveOccurred(), "Unable to power off the node")
// Waiting for the pod to be failed over to a different node
node2, err := waitForPodToFailover(client, deployment, node1)
Expect(err).NotTo(HaveOccurred(), "Pod did not fail over to a different node")
By(fmt.Sprintf("Waiting for disk to be attached to the new node: %v", node2))
err = waitForVSphereDiskToAttach(vsp, volumePath, node2)
Expect(err).NotTo(HaveOccurred(), "Disk is not attached to the node")
By(fmt.Sprintf("Waiting for disk to be detached from the previous node: %v", node1))
err = waitForVSphereDiskToDetach(vsp, volumePath, node1)
Expect(err).NotTo(HaveOccurred(), "Disk is not detached from the node")
By(fmt.Sprintf("Power on the previous node: %v", node1))
vm.PowerOn(ctx)
err = vm.WaitForPowerState(ctx, vimtypes.VirtualMachinePowerStatePoweredOn)
Expect(err).NotTo(HaveOccurred(), "Unable to power on the node")
})
})
// Wait until the pod failed over to a different node, or time out after 3 minutes
func waitForPodToFailover(client clientset.Interface, deployment *extensions.Deployment, oldNode types.NodeName) (types.NodeName, error) {
var (
err error
newNode types.NodeName
timeout = 3 * time.Minute
pollTime = 10 * time.Second
)
err = wait.Poll(pollTime, timeout, func() (bool, error) {
newNode, err = getNodeForDeployment(client, deployment)
if err != nil {
return true, err
}
if newNode != oldNode {
framework.Logf("The pod has been failed over from %q to %q", oldNode, newNode)
return true, nil
}
framework.Logf("Waiting for pod to be failed over from %q", oldNode)
return false, nil
})
if err != nil {
if err == wait.ErrWaitTimeout {
framework.Logf("Time out after waiting for %v", timeout)
}
framework.Logf("Pod did not fail over from %q with error: %v", oldNode, err)
return "", err
}
return getNodeForDeployment(client, deployment)
}
func getNodeForDeployment(client clientset.Interface, deployment *extensions.Deployment) (types.NodeName, error) {
podList, err := framework.GetPodsForDeployment(client, deployment)
if err != nil {
return "", err
}
return types.NodeName(podList.Items[0].Spec.NodeName), nil
}