Production-Grade Container Scheduling and Management
Go to file
Kubernetes Submit Queue 5e70562c6a
Merge pull request #57512 from cofyc/improve_rbd_highavailability
Automatic merge from submit-queue (batch tested with PRs 57572, 57512, 57770). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

RBD Plugin: Pass monitors addresses in a comma-separed list instead of trying one by one.

**What this PR does / why we need it**:

In production, monitors may crash (or have a network problem), if we try monitors one by one, rbd
command will hang a long time (e.g. `rbd map -m <unconnectable_host_ip>`
on linux 4.4 timed out in 6 minutes) when trying a unconnectable monitor. This is unacceptable.

Actually, we can simply pass a comma-separated list monitor addresses to `rbd`
command utility. Kernel rbd/libceph modules will pick monitor randomly
and try one by one, `rbd` command utility succeed soon if there is a
good one in monitors list.

[Docs](http://docs.ceph.com/docs/jewel/man/8/rbd/#cmdoption-rbd-m) about `-m` option of `rbd` is wrong,  'rbd' utility simply pass '-m <mon>' parameter to kernel rbd/libceph modules, which
takes a comma-seprated list of one or more monitor addresses (e.g. ip1[:port1][,ip2[:port2]...]) in its first version in linux (see 602adf4002/net/ceph/ceph_common.c (L239)). Also, libceph choose monitor randomly, so we can simply pass all addresses without randomization (see 602adf4002/net/ceph/mon_client.c (L132)).

From what I saw, there is no need to iterate monitor hosts one by one.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

Run `rbd map` against unconnectable monitor address logs on Linux 4.4:

```
root@myhost:~# uname -a
Linux myhost 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@myhost:~# time rbd map kubernetes-dynamic-pvc-941ff4d2-b951-11e7-8836-049fca8e58df --pool <pool> --id <id> -m <unconnectable_host_ip> --key=<password>
rbd: sysfs write failed
2017-12-20 18:55:11.810583 7f7ec56863c0  0 monclient(hunting): authenticate timed out after 300
2017-12-20 18:55:11.810638 7f7ec56863c0  0 librados: client.<id> authentication error (110) Connection timed out
rbd: couldn't connect to the cluster!
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (110) Connection timed out

real	6m0.018s
user	0m0.052s
sys	0m0.064s
```  

We can simply pass a comma-separated list of monitors, if there is a good one in them, `rbd map` succeed soon.

```
root@myhost:~# time rbd map kubernetes-dynamic-pvc-941ff4d2-b951-11e7-8836-049fca8e58df --pool <pool> --id <id> -m <unconnectable_host_ip>,<good_host_ip> --key=<password>

/dev/rbd3

real	0m0.426s
user	0m0.008s
sys	0m0.008s
```

**Release note**:

```release-note
NONE
```
2018-01-03 13:46:32 -08:00
.github
api Generate specs after fixing typo in documentation. 2018-01-02 17:59:53 +00:00
build Bump rules_go to 0.8.1 2017-12-23 13:12:02 -08:00
cluster Merge pull request #57756 from mborsz/exec-manifest 2018-01-03 02:25:42 -08:00
cmd Merge pull request #57797 from kargakis/i-can-do-reviews 2018-01-03 11:15:41 -08:00
docs Run update-api-reference-docs.sh. 2018-01-03 00:18:03 +00:00
examples Autogenerate BUILD files 2017-12-23 13:12:11 -08:00
Godeps Update to latest gophercloud 2018-01-02 12:13:45 -05:00
hack remove /k8s.io/kubernetes/pkg/kubectl/testing 2018-01-03 09:39:33 +08:00
logo
pkg Merge pull request #57512 from cofyc/improve_rbd_highavailability 2018-01-03 13:46:32 -08:00
plugin Merge pull request #57128 from liggitt/kubelet-admin 2018-01-03 08:30:33 -08:00
staging Merge pull request #57572 from sttts/sttts-move-metrics-md 2018-01-03 13:29:41 -08:00
test Merge pull request #57721 from sbezverk/e2e_bootstrapper_decommission 2018-01-02 16:26:40 -08:00
third_party Autogenerate BUILD files 2017-12-23 13:12:11 -08:00
translations
vendor Update to latest gophercloud 2018-01-02 12:13:45 -05:00
.bazelrc
.generated_files
.gitattributes
.gitignore
.kazelcfg.json
BUILD.bazel
CHANGELOG-1.2.md
CHANGELOG-1.3.md
CHANGELOG-1.4.md
CHANGELOG-1.5.md
CHANGELOG-1.6.md
CHANGELOG-1.7.md Update CHANGELOG-1.7.md for v1.7.12. 2017-12-29 13:42:49 +01:00
CHANGELOG-1.8.md
CHANGELOG-1.9.md
CHANGELOG-1.10.md
CHANGELOG.md
code-of-conduct.md
CONTRIBUTING.md
labels.yaml
LICENSE
Makefile
Makefile.generated_files
OWNERS
OWNERS_ALIASES
README.md
SUPPORT.md
Vagrantfile
WORKSPACE

Kubernetes

Submit Queue Widget GoDoc Widget CII Best Practices


Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale using a system called Borg, combined with best-of-breed ideas and practices from the community.

Kubernetes is hosted by the Cloud Native Computing Foundation (CNCF). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Kubernetes plays a role, read the CNCF announcement.


To start using Kubernetes

See our documentation on kubernetes.io.

Try our interactive tutorial.

Take a free course on Scalable Microservices with Kubernetes.

To start developing Kubernetes

The community repository hosts all information about building Kubernetes from source, how to contribute code and documentation, who to contact about what, etc.

If you want to build Kubernetes right away there are two options:

You have a working Go environment.
$ go get -d k8s.io/kubernetes
$ cd $GOPATH/src/k8s.io/kubernetes
$ make
You have a working Docker environment.
$ git clone https://github.com/kubernetes/kubernetes
$ cd kubernetes
$ make quick-release

For the full story, head over to the developer's documentation.

Support

If you need support, start with the troubleshooting guide and work your way through the process that we've outlined.

That said, if you have questions, reach out to us one way or another.

Analytics