Merge pull request #66264 from joejulian/workaround_for_slow_arm64_math

Automatic merge from submit-queue (batch tested with PRs 66341, 66405, 66403, 66264, 66447). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

extend timeout to workaround slow arm64 math

**What this PR does / why we need it**:

The math/big functions are slow on arm64. There is improvement coming
with go1.11 but until such time as that version can be used to build releases, 
if a server uses rsa certificates on arm64, the math load for the multitude
of watches over-taxes the ability of the processor and the TLS connections
time out. Retries will also not succeed and serve to exacerbate the problem.

By extending the timeout, the TLS connections will eventually be
successful and the load will drop.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64649

**Special notes for your reviewer**:
This was tested on a Raspberry Pi 3

**Release note**:
```release-note
Extend TLS timeouts to work around slow arm64 math/big
```
This commit is contained in:
Kubernetes Submit Queue 2018-07-20 16:02:14 -07:00 committed by GitHub
commit b914542b9c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -34,14 +34,15 @@ import (
"k8s.io/apiserver/pkg/storage/value"
)
var (
// The short keepalive timeout and interval have been chosen to aggressively
// detect a failed etcd server without introducing much overhead.
keepaliveTime = 30 * time.Second
keepaliveTimeout = 10 * time.Second
// dialTimeout is the timeout for failing to establish a connection.
dialTimeout = 10 * time.Second
)
// The short keepalive timeout and interval have been chosen to aggressively
// detect a failed etcd server without introducing much overhead.
const keepaliveTime = 30 * time.Second
const keepaliveTimeout = 10 * time.Second
// dialTimeout is the timeout for failing to establish a connection.
// It is set to 20 seconds as times shorter than that will cause TLS connections to fail
// on heavily loaded arm64 CPUs (issue #64649)
const dialTimeout = 20 * time.Second
func newETCD3HealthCheck(c storagebackend.Config) (func() error, error) {
// constructing the etcd v3 client blocks and times out if etcd is not available.