Production-Grade Container Scheduling and Management
Latest commit f2951a54f9 (Kubernetes Submit Queue, 2016-09-11): Merge pull request #30674 from ivan4th/add-e2e-tests-for-wrapped-volume-race

Automatic merge from submit-queue

Add e2e tests that check for wrapped volume race

This PR adds two new e2e tests that reproduce the race condition fixed in #29641 (see e.g. #29297)

In order to observe the race, you need to revert the PR that fixes it, via e.g.
```
git revert -n df1e925143
```
or
```
curl -sL https://github.com/kubernetes/kubernetes/pull/29641.patch | patch -p1 -R
```

The tests are `[Slow]` because they need to run several passes that involve creating pods with many volumes. They are also `[Serial]` because load on the cluster may affect reproducibility of the race. Each takes about 450s when it fails on a standard GCE cluster created by `go run hack/e2e.go -v --up`. When the tests succeed (fix PR not reverted), the `git_repo` test takes about 66s, while the `configmap` test takes about 546s, because configmap mounting is slower and still requires 3 passes x 5 pods x 50 configmap volumes to fail consistently with the fix PR reverted. These times could probably be reduced, but I've already spent quite a bit of time tuning the numbers to balance reproducibility against speed.
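If you want to iterate on just these tests rather than the whole suite, focusing ginkgo on them should work; the focus pattern below is an assumption based on the test names in the failure output and may need adjusting:

```
# Bring up a test cluster, then run only the wrapped-volume race tests.
# [Slow]/[Serial] specs are skipped by some CI jobs, but an explicit focus runs them.
go run hack/e2e.go -v --up
go run hack/e2e.go -v --test --test_args="--ginkgo.focus=Wrapped\sEmptyDir\svolumes"
go run hack/e2e.go -v --down
```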

I managed to reproduce the problem fairly reliably for `configMap` and `gitRepo` volumes. I also tried to reproduce it for `secret` volumes, but without success so far, because they use the tmpfs-based `emptyDir` variety. For `downwardAPI` volumes I expect the same difficulty reproducing the race as with `secret` volumes, although I think some e2e flakes were caused by the bug, e.g. #29633.

The tests operate by creating several pods (via an RC) with many volumes and waiting for them to become Running. They set node affinity on the pods so that they are all scheduled onto a single node (the first one in the node list). The race condition leads to volume mount failures with slow retries, causing the test to time out.

The test failures look like this:

configmap:
```
• Failure [435.547 seconds]
[k8s.io] Wrapped EmptyDir volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709
  should not cause race condition when used for configmaps [Serial] [Slow] [It]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:170

  Failed waiting for pod wrapped-volume-race-8c097734-6376-11e6-9ffa-5254003793ad-acbtt to enter running state
  Expected error:
      <*errors.errorString | 0xc8201758d0>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  not to have occurred

  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395
```
You'll see errors like this in the kubelet log on the first node in the cluster:
```
E0816 00:27:23.319431    3510 configmap.go:174] Error creating atomic writer: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory
E0816 00:27:23.319478    3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14\" (\"e5986355-6347-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-16 00:28:27.319450118 +0000 UTC (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume "kubernetes.io/configmap/e5986355-6347-11e6-a5d7-42010af00002-racey-configmap-14" (spec.Name: "racey-configmap-14") pod "e5986355-6347-11e6-a5d7-42010af00002" (UID: "e5986355-6347-11e6-a5d7-42010af00002") with: stat /var/lib/kubelet/pods/e5986355-6347-11e6-a5d7-42010af00002/volumes/kubernetes.io~configmap/racey-configmap-14: no such file or directory
```

git_repo:
```
• Failure [455.035 seconds]
[k8s.io] Wrapped EmptyDir volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:709
  should not cause race condition when used for git_repo [Serial] [Slow] [It]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:179

  Failed waiting for pod wrapped-volume-race-71b12b3d-6375-11e6-9ffa-5254003793ad-b0slz to enter running state
  Expected error:
      <*errors.errorString | 0xc8201758d0>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  not to have occurred

  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/wrapped_empty_dir.go:395
```
Errors in the kubelet log:
```
E0815 23:41:08.670203    3510 nestedpendingoperations.go:232] Operation for "\"kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8\" (\"97636bd8-6341-11e6-a5d7-42010af00002\")" failed. No retries permitted until 2016-08-15 23:42:12.670181604 +0000 UTC (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume "kubernetes.io/git-repo/97636bd8-6341-11e6-a5d7-42010af00002-racey-git-repo-8" (spec.Name: "racey-git-repo-8") pod "97636bd8-6341-11e6-a5d7-42010af00002" (UID: "97636bd8-6341-11e6-a5d7-42010af00002") with: failed to exec 'git clone http://10.0.68.35:2345 test': : chdir /var/lib/kubelet/pods/97636bd8-6341-11e6-a5d7-42010af00002/volumes/kubernetes.io~git-repo/racey-git-repo-8: no such file or directory
```

Generally, the race causes unexpected "no such file or directory" errors in the kubelet logs, followed by volume mount failures.
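For reference, one way to pull these errors from a node's kubelet log on a GCE cluster might look like the sketch below; the log path and node-name handling are assumptions and may differ by node image (systemd-based images keep the log in `journalctl -u kubelet` instead):

```
# Pick the first node in the node list (the one the test pins its pods to)
# and grep its kubelet log for the mount failures.
NODE=$(kubectl get nodes -o name | head -n 1 | cut -d/ -f2)
gcloud compute ssh "$NODE" --command \
  "grep 'MountVolume.SetUp failed' /var/log/kubelet.log | tail -n 20"
```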

I've added the race tests to the existing e2e test `empty_dir_wrapper.go` ("EmptyDir wrapper volumes"). That test was added in #18445, the same PR that introduced the race bug. Its original purpose was to make sure that no conflicts occur between different wrapped emptyDir volumes, so I've replaced "should becomes" with "should not conflict" in the first `It(...)`.

Kubernetes


Are you ...

  • Interested in learning more about using Kubernetes? Please see our user-facing documentation on kubernetes.io
  • Interested in hacking on the core Kubernetes code base? Keep reading!

Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes is:

  • lean: lightweight, simple, accessible
  • portable: public, private, hybrid, multi cloud
  • extensible: modular, pluggable, hookable, composable
  • self-healing: auto-placement, auto-restart, auto-replication

Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale, combined with best-of-breed ideas and practices from the community.


Kubernetes is ready for Production!

With the 1.0.1 release, Kubernetes is ready to serve your production workloads.

Kubernetes can run anywhere!

You can run Kubernetes on your local workstation under Vagrant, on cloud providers (e.g. GCE, AWS, Azure), and on physical hardware. Essentially, anywhere Linux runs, you can run Kubernetes. Check out the Getting Started Guides for details.
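For example, bringing up a small cluster with the scripts in this repository looks roughly like the sketch below (provider support and defaults vary by release, so treat it as a starting point):

```
# Choose a provider (vagrant, gce, aws, ...) and bring up a cluster.
export KUBERNETES_PROVIDER=vagrant
cluster/kube-up.sh

# Tear the cluster down when you're done.
cluster/kube-down.sh
```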

Concepts

Kubernetes works with the following concepts:

Cluster
A cluster is a set of physical or virtual machines and other infrastructure resources used by Kubernetes to run your applications. Kubernetes can run anywhere! See the Getting Started Guides for instructions for a variety of services.
Node
A node is a physical or virtual machine running Kubernetes, onto which pods can be scheduled.
Pod
Pods are a colocated group of application containers with shared volumes. They're the smallest deployable units that can be created, scheduled, and managed with Kubernetes. Pods can be created individually, but it's recommended that you use a replication controller even if creating a single pod.
Replication controller
Replication controllers manage the lifecycle of pods. They ensure that a specified number of pods are running at any given time, by creating or killing pods as required.
Service
Services provide a single, stable name and address for a set of pods. They act as basic load balancers.
Label
Labels are used to organize and select groups of objects based on key:value pairs.
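
As a quick, hypothetical walkthrough of how these concepts fit together (the names below are made up; recent releases of `kubectl run` create a deployment by default, so `--generator=run/v1` is used here to create a replication controller):

```
# Create a replication controller that keeps 3 nginx pods running.
kubectl run my-nginx --image=nginx --replicas=3 --generator=run/v1

# Pods created this way carry the label run=my-nginx; select them by that label.
kubectl get pods -l run=my-nginx

# Put a single, stable service name and address in front of the pods.
kubectl expose rc my-nginx --port=80
kubectl get service my-nginx
```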

Documentation

Kubernetes documentation is organized into several categories.

Community, discussion, contribution, and support

See which companies are committed to driving quality in Kubernetes on our community page.

Do you want to help "shape the evolution of technologies that are container packaged, dynamically scheduled and microservices oriented"?

You should consider joining the Cloud Native Computing Foundation. For details about who's involved and how Kubernetes plays a role, read their announcement.

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

Are you ready to add to the discussion?

We have presence on:

You can also view recordings of past events and presentations on our Media page.

For Q&A, our threads are at:

Want to contribute to Kubernetes?

If you're interested in being a contributor and want to get involved in developing Kubernetes, start in the Kubernetes Developer Guide and also review the contributor guidelines.

Or, if you just have an idea for a new feature, see the Kubernetes Features repository for details on how to propose it.

Support

While there are many different channels you can use to reach us, you can help make sure we get you the help you need quickly by starting in the right place.

If you need support, start with the troubleshooting guide and work your way through the process that we've outlined.

That said, if you have questions, reach out to us one way or another. We don't bite!

Community resources

You can find more projects, tools and articles related to Kubernetes on the awesome-kubernetes list. Add your project there and help us make it better.

Instructive & educational resources for the Kubernetes community. By the community.

  • Community Documentation: learn more about the current happenings in the Kubernetes community.
