Automatic merge from submit-queue

Don't try to attach volumes which are already attached to other nodes

This PR is a replacement for https://github.com/kubernetes/kubernetes/pull/40148. I was not able to push fixes and rebases to the original branch because I no longer have access to the GitHub organization. CC @saad-ali You probably have to update the PR link in [Q2 2017 (v1.7)](https://docs.google.com/spreadsheets/d/1t4z5DYKjX2ZDlkTpCnp18icRAQqOE85C1T1r2gqJVck/edit#gid=14624465). I assume the PR will need a new "ok to test".

**ORIGINAL PR DESCRIPTION**

This PR fixes an issue with the attach/detach volume controller. There are cases where the `desiredStateOfWorld` contains the same volume for multiple nodes, causing the attach/detach controller to attach that volume to more than one node. This of course fails for single-attach volumes like AWS EBS, Azure Disks, and similar.

I observed this situation on Azure when using Azure Disks and replication controllers that start rescheduling Pods. When you delete a Pod that belongs to an RC, the RC immediately schedules a new Pod on another node. This creates a short window (at most a few seconds) in which two Pods try to attach/mount the same volume on different nodes. Because the old Pod is still alive, the attach/detach controller does not try to detach the volume and instead starts attaching it to the new Pod immediately.

This behavior was probably not noticed before on other clouds because the bogus attach attempt fails quickly and goes unnoticed. Once the two-Pod situation resolves after a few seconds, a detach for the old Pod is initiated and the new Pod can attach successfully.

On Azure, however, attaching and detaching take quite long, so the first bogus attach attempt alone eats up significant time. When an attach fails on Azure with a report that the volume is already attached somewhere else, the cloud provider immediately issues a detach call for the same volume+node it tried to attach to, in order to abort the failed attach request right away. You can find this here: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure_storage.go#L74

The complete attach->fail->abort flow eats up valuable time, and the attach/detach controller cannot proceed with other work while it is happening. This means that if the old Pod disappears in the meantime, the controller cannot even start the detach for the volume, which delays the whole process of rescheduling and reattaching.

Also, I and other people have observed very strange behavior where disks ended up "attached" to multiple VMs at the same time, as reported by the Azure Portal. This causes the controller to fail reattaching forever. It is hard to figure out why and when this happens, and there is no known reproducer yet, but I can imagine that it correlates with the behavior described above.

I was not sure whether there are cases where it is perfectly fine to have a volume mounted on multiple Pods/nodes. At least technically this should be possible with network-based volumes, e.g. NFS. Can someone with more knowledge about volumes help me here? I may need to add a check before skipping the attach in `reconcile`.

CC @colemickens @rootfs

```release-note
Don't try to attach volume to new node if it is already attached to another node and the volume does not support multi-attach.
```
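To make the intended fix concrete, here is a minimal, self-contained Go sketch of the guard described above: before issuing an attach, the reconcile loop skips the operation if the volume does not support multi-attach and is already attached to a different node. The names `actualStateOfWorld`, `GetNodesForAttachedVolume`, and `isMultiAttachForbidden` are illustrative assumptions, not the controller's actual API.

```go
// Hypothetical sketch of the multi-attach guard in the reconciler's attach
// loop; types and function names are assumptions for illustration only.
package main

import "fmt"

type VolumeName string
type NodeName string

// actualStateOfWorld stands in for the controller's cache of volumes that
// are currently attached, mapping each volume to the nodes it is on.
type actualStateOfWorld struct {
	attachedNodes map[VolumeName][]NodeName
}

func (asw *actualStateOfWorld) GetNodesForAttachedVolume(v VolumeName) []NodeName {
	return asw.attachedNodes[v]
}

// isMultiAttachForbidden reports whether a volume may only be attached to a
// single node at a time (e.g. AWS EBS, Azure Disk). Network volumes such as
// NFS would return false. Assumption: the real controller would derive this
// from the volume plugin / access modes, not from the volume itself.
func isMultiAttachForbidden(v VolumeName) bool {
	return true
}

// tryAttach mirrors the reconcile step: before issuing an attach, skip the
// operation if the volume is single-attach and already attached elsewhere.
func tryAttach(asw *actualStateOfWorld, volume VolumeName, node NodeName) bool {
	if isMultiAttachForbidden(volume) {
		for _, attachedNode := range asw.GetNodesForAttachedVolume(volume) {
			if attachedNode != node {
				fmt.Printf("skipping attach of %s to %s: already attached to %s\n",
					volume, node, attachedNode)
				return false
			}
		}
	}
	fmt.Printf("attaching %s to %s\n", volume, node)
	return true
}

func main() {
	asw := &actualStateOfWorld{attachedNodes: map[VolumeName][]NodeName{
		"azure-disk-1": {"node-a"},
	}}
	tryAttach(asw, "azure-disk-1", "node-b") // skipped: still attached to node-a
	tryAttach(asw, "azure-disk-1", "node-a") // proceeds: same node, no conflict
}
```

With this guard, the expensive attach->fail->abort round trip on Azure never starts: the controller simply retries on a later reconcile pass, by which time the old Pod has gone away and the detach has freed the volume.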
# Kubernetes
Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.
Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale using a system called Borg, combined with best-of-breed ideas and practices from the community.
Kubernetes is hosted by the Cloud Native Computing Foundation (CNCF). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Kubernetes plays a role, read the CNCF announcement.
## To start using Kubernetes
- See our documentation on kubernetes.io.
- Try our interactive tutorial.
- Take a free course on Scalable Microservices with Kubernetes.
## To start developing Kubernetes
The community repository hosts all information about building Kubernetes from source, how to contribute code and documentation, who to contact about what, etc.
If you want to build Kubernetes right away there are two options:
You have a working Go environment.

```sh
$ go get -d k8s.io/kubernetes
$ cd $GOPATH/src/k8s.io/kubernetes
$ make
```
You have a working Docker environment.

```sh
$ git clone https://github.com/kubernetes/kubernetes
$ cd kubernetes
$ make quick-release
```
If you are less impatient, head over to the developer's documentation.
## Support
If you need support, start with the troubleshooting guide and work your way through the process that we've outlined.
That said, if you have questions, reach out to us one way or another.