Commit Graph

1732 Commits

Author SHA1 Message Date
Tim Hockin
d33e57d2c8 Better log line in e2e 2018-06-07 17:01:49 -07:00
Jan Chaloupka
ab616a88b9 Promote sysctl annotations to API fields 2018-06-05 23:17:00 +02:00
Krzysztof Siedlecki
aa022310a4 Collecting etcd metrics 2018-06-04 16:23:08 +02:00
Kubernetes Submit Queue
5710943612 Merge pull request #63839 from wgliang/master.movepkg
Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move pkg/scheduler/schedulercache -> pkg/scheduler/cache

**What this PR does / why we need it**:
Move pkg/scheduler/schedulercache -> pkg/scheduler/cache

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63813

**Special notes for your reviewer**:

In order to prevent name conflicts still rename the `cache` to `schedulercache`.

**Release note**:

```release-note
NONE
```
2018-06-01 12:12:15 -07:00
Kubernetes Submit Queue
2cef77db8e Merge pull request #64480 from verult/repd-ig-fix
Automatic merge from submit-queue (batch tested with PRs 62460, 64480, 63774, 64540, 64337). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Modified regional PD test to fetch template name from GCE

**What this PR does / why we need it**: Previously, the regional PD failover e2e test assumes a specific relationship between the names of an instance group and its corresponding template. It turns out to not always hold true for different types of clusters. Instead, the test should fetch the correct template name by calling out to GCE.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #59988

Need to cherry pick this back to 1.10 along with #64223 

**Release note**:

```release-note
NONE
```

/assign @saad-ali @wojtek-t 
/sig storage
2018-05-31 14:12:15 -07:00
Guoliang Wang
761cf41427 Move pkg/scheduler/schedulercache -> pkg/scheduler/cache 2018-05-31 22:55:34 +08:00
Kubernetes Submit Queue
f3d54f3f95 Merge pull request #64311 from liggitt/min-startup-pods
Automatic merge from submit-queue (batch tested with PRs 64318, 64269, 64438, 64516, 64311). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "Change default min-startup-pods value"

This reverts commit de0bf05f46.

The number of pods to require is a cluster-dependent setting, and should be set by the invoker, not defaulted here

https://github.com/kubernetes/test-infra/pull/8135 must be merged first

```release-note
NONE
```
2018-05-30 11:25:29 -07:00
Kubernetes Submit Queue
a2d8636559 Merge pull request #64316 from krzysied/scheduling_latency_metric
Automatic merge from submit-queue (batch tested with PRs 63328, 64316, 64444, 64449, 64453). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixing scheduling latency metrics

**What this PR does / why we need it**:
Allows to measure and to display scheduling latency metrics during tests. Provides new functionality of resetting scheduler latency metrics.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63493

**Special notes for your reviewer**:
E2eSchedulingLatency, SchedulingAlgorithmLatency, BindingLatency are now available 
as subtypes of OperationLatency.

**Release note**:

```release-note
NONE
```
2018-05-30 08:42:16 -07:00
Krzysztof Siedlecki
0e833bfc83 Fixing scheduling latency metrics 2018-05-30 11:20:12 +02:00
Cheng Xing
82f9d9365e Modified regional PD test to fetch template name from GCE 2018-05-29 15:51:24 -07:00
Jan Safranek
39263a3f5d Add reliable wait for volume server startup.
Remove sleep(20) and check for readiness of volume servers by checking
logs.
2018-05-29 09:54:20 +02:00
Kubernetes Submit Queue
35038bd59a Merge pull request #64308 from jsafrane/rbd-startup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move Ceph server secret creation to common code.

The secret should be created only on one place.

**Release note**:

```release-note
NONE
```

@jeffvance @copejon @rootfs @msau42 PTAL
2018-05-28 18:02:37 -07:00
Kubernetes Submit Queue
13f084933f Merge pull request #64406 from soltysh/issue64359
Automatic merge from submit-queue (batch tested with PRs 64399, 64324, 64404, 64406, 64396). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add daemonset when to getReplicasFromRuntimeObject when cleaning objects in e2e

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64359

**Special notes for your reviewer**:
/assign @dims 

**Release note**:

```release-note
NONE
```
2018-05-28 15:06:23 -07:00
Maciej Szulik
a3e841871c Add daemonset when to getReplicasFromRuntimeObject when cleaning objects in e2e 2018-05-28 17:50:36 +02:00
wojtekt
23bf2246fe Fix GKE Regional Clusters upgrade tests 2018-05-28 15:00:23 +02:00
Patrick Ohly
cc2a5954ac e2e/storage: central argument handling
Putting the command line argument handling into the central test
context seems like the better solution, in particular considering that
argument handling might get changed in the future to use Viper.
2018-05-28 11:46:50 +02:00
Kubernetes Submit Queue
9872a0502b Merge pull request #64288 from gnufied/take-volume-resize-beta
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move volume resize feature to beta

Move volume resizing feature to beta. 

xref https://github.com/kubernetes/features/issues/284

```release-note
Move Volume expansion to Beta
```
2018-05-26 01:34:17 -07:00
Kubernetes Submit Queue
5f578f3385 Merge pull request #63979 from soltysh/drop_reapers
Automatic merge from submit-queue (batch tested with PRs 63859, 63979). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Drop reapers

/assign @deads2k @juanvallejo 

**Release note**:
```release-note
kubectl delete does not use reapers for removing objects anymore, but relies on server-side GC entirely
```
2018-05-26 00:32:11 -07:00
Jiaying Zhang
6e0badc0d1 Fix DsFromManifest() after we switch from extensions/v1beta1 to apps/v1
in cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml.
2018-05-25 16:05:35 -07:00
Maciej Szulik
383872615d Remove kubectl reapers 2018-05-25 22:18:05 +02:00
Hemant Kumar
23a73283b3 Fix breaking volume resize e2e tests 2018-05-25 15:32:44 -04:00
Kubernetes Submit Queue
b8db949560 Merge pull request #64266 from shyamjvs/measure-max-scheduler-throughput-metric
Automatic merge from submit-queue (batch tested with PRs 63232, 64257, 64183, 64266, 64134). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Measure scheduler throughput in density test

This is a step towards exposing scheduler-related metrics on [perf-dash](http://perf-dash.k8s.io/).
This particular PR adds scheduler throughput computation and makes the results available in our test artifacts.
So if you do some experiments, you'll have some historical baseline data to compare against.

xref https://github.com/kubernetes/kubernetes/issues/63493

fyi - @wojtek-t @davidopp @bsalamat @misterikkit 
cc @kubernetes/sig-scheduling-misc @kubernetes/sig-scalability-misc 

```release-note
NONE
```
2018-05-25 08:24:22 -07:00
Jordan Liggitt
2a9d491fb8 Revert "Change default min-startup-pods value"
This reverts commit de0bf05f46.
2018-05-25 09:05:38 -04:00
Shyam Jeedigunta
f363f549c0 Measure scheduler throughput in density test 2018-05-25 14:49:11 +02:00
Jan Safranek
5a099d70c9 Move Ceph server secret creation to common code. 2018-05-25 14:02:59 +02:00
Kubernetes Submit Queue
deb632e727 Merge pull request #64204 from sttts/sttts-unify-NewNoxuInstance
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apiextensions: unify mono- and multi-versioned test helpers

The mono-versioned helpers are a special case of the multi-versioned ones.

Fixes part of https://github.com/kubernetes/kubernetes/issues/64136.
2018-05-25 04:49:37 -07:00
Kubernetes Submit Queue
10377f6593 Merge pull request #63896 from mtaufen/refactor-test-metrics
Automatic merge from submit-queue (batch tested with PRs 64013, 63896, 64139, 57527, 62102). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor test utils that deal with Kubelet metrics for clarity

I found these functions hard to understand, because the names did not
accurately reflect their behavior. For example, GetKubeletMetrics
assumed that all of the metrics passed in were measuring latency.
The caller of GetKubeletMetrics was implicitly making this assumption,
but it was not obvious at the call site.

```release-note
NONE
```
2018-05-23 19:44:15 -07:00
Kubernetes Submit Queue
0a22c159e5 Merge pull request #64015 from cofyc/improvetests
Automatic merge from submit-queue (batch tested with PRs 62756, 63862, 61419, 64015, 64063). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Handle TERM signal to reduce pod terminating time.

**What this PR does / why we need it**:

Register signal handler for `TERM`.

For container, process is run as PID 1. PID 1 is special in linux, it ignore any signal with the default action. So process will not terminate on `SIGINT` or `SIGTERM` unless it is coded to do so.

By default, docker use `TERM` signal to termiante pods, it will wait 10 seconds before sending `KILL` signal.

```
$ docker run -d --rm --name test busybox sh -c 'while true; do sleep 1; done'
aa827df5d7bfffc5ca4fae2429d0b761a5a142c57ba3e1faa59b305c92f1c875
$ time docker stop test
test

real	0m10.408s
user	0m0.048s
sys	0m0.020s
```

It's better to register a exit handler for `TERM`, it reduces a lot of time in waiting pods to termiante.

```
$ docker run -d --rm --name test busybox sh -c 'trap exit TERM; while true; do sleep 1; done'
e331bff454dba8e45df6065c3bd2a928e1c41303aafdf88ede38def3e4e5781f
$ time docker stop test
test

real	0m0.690s
user	0m0.048s
sys	0m0.020s
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-23 18:14:16 -07:00
Kubernetes Submit Queue
5fe35cdbf9 Merge pull request #61419 from enisoc/apps-v1-deploy
Automatic merge from submit-queue (batch tested with PRs 62756, 63862, 61419, 64015, 64063). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use apps/v1 Deployment/ReplicaSet in controller and kubectl

This updates the Deployment controller and integration/e2e tests to use apps/v1, as part of #55714.

This also requires updating any other components that use the `deployment/util` package, most notably `kubectl`. That means client versions 1.11 and above will only work with server versions 1.9 and above. This is well within our client-server version skew policy of +/-1 minor version.

However, this PR *only* updates the parts of `kubectl` that used `deployment/util`. So although kubectl now requires apps/v1, it still also depends on extensions/v1beta1. Migrating other parts of kubectl to apps/v1 is beyond the scope of this PR, which was just to change the Deployment controller and fix all the fallout.

```release-note
kubectl: This client version requires the `apps/v1` APIs, so it will not work against a cluster version older than v1.9.0. Note that kubectl only guarantees compatibility with clusters that are +/-1 minor version away.
```
2018-05-23 18:14:13 -07:00
Dr. Stefan Schimanski
818147d6fb apiextensions: make CreateNewCustomResourceDefinition return created CRD 2018-05-23 21:41:55 +02:00
Anthony Yeh
436db71751 Set explicit labels/selector for apps/v1 Deployment/RS. 2018-05-22 13:43:07 -07:00
Anthony Yeh
a6a5190494 test/e2e: Use apps/v1 Deployment/ReplicaSet.
This must be done at the same time as the controller update,
since they share code.
2018-05-22 13:43:06 -07:00
Jacob Gillespie
98bc39dcd5 Add Logf message for skipped succeeded pods 2018-05-22 12:40:20 -07:00
Jacob Gillespie
31bf75c116 Fix running e2e tests with completed kube-system pods 2018-05-21 09:16:36 -05:00
Michael Taufen
83509a092f Refactor test utils that deal with Kubelet metrics for clarity
I found these functions hard to understand, because the names did not
accurately reflect their behavior. For example, GetKubeletMetrics
assumed that all of the metrics passed in were measuring latency.
The caller of GetKubeletMetrics was implicitly making this assumption,
but it was not obvious at the call site.
2018-05-18 11:32:29 -07:00
Yecheng Fu
55bc8d74a2 Handle TERM signal to reduce pod terminating time. 2018-05-18 16:55:00 +08:00
Kubernetes Submit Queue
e6688fc65a Merge pull request #63946 from msau42/fix-reconstruction-flake
Automatic merge from submit-queue (batch tested with PRs 63920, 63716, 63928, 60553, 63946). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Wait for pod deletion instead of termination in reconstruction test

**What this PR does / why we need it**:
Change volume test to wait for pod deletion instead of pod termination

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Addresses https://github.com/kubernetes/kubernetes/issues/63923

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-18 01:07:25 -07:00
Michelle Au
46b62c20e4 Wait for pod deletion instead of termination 2018-05-17 10:08:54 -07:00
Ashley Gau
101dde0c22 check for NEG healthcheck with correct name 2018-05-16 14:30:21 -07:00
Kubernetes Submit Queue
5686fcfcf8 Merge pull request #62328 from serathius/monitoring-default-none
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Deprecate InfluxDB cluster monitoring

InfluxDB cluster monitoring addon will no longer be supported and will be removed in k8s 1.12.
Default monitoring solution will be changed to `standalone`.
Heapster will still be deployed for backward compatibility of `kubectl top`

```release-note
Stop using InfluxDB as default cluster monitoring
InfluxDB cluster monitoring is deprecated and will be removed in v1.12
```
cc @piosz
2018-05-16 07:07:05 -07:00
Kubernetes Submit Queue
c2b4fc99df Merge pull request #63911 from shyamjvs/change-default-minStartupPods-value
Automatic merge from submit-queue (batch tested with PRs 63850, 63911). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Change default min-startup-pods value

As ~all our e2e jobs on CI (not just scalability ones) seem to be setting this flag to 8. See - https://github.com/kubernetes/test-infra/blob/master/jobs/config.json

/cc @wojtek-t 

```release-note
NONE
```
2018-05-16 04:04:19 -07:00
Kubernetes Submit Queue
8c240523ca Merge pull request #63870 from shyamjvs/autocalculate-allowed-not-ready-nodes
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Auto-calculate allowed-not-ready-nodes in test framework

Actually we (sig-scalability) are pretty much the only users of this flag.
This reduces the overhead of having to provide its value based on num-nodes each time we run our tests.

/cc @wojtek-t 

```release-note
NONE
```
2018-05-16 02:36:59 -07:00
Shyam Jeedigunta
de0bf05f46 Change default min-startup-pods value 2018-05-16 10:50:00 +02:00
Shyam Jeedigunta
6514dd656c Auto-calculate allowed-not-ready-nodes in test framework 2018-05-16 09:50:13 +02:00
Shyam Jeedigunta
00abb651a9 Decrease default node schedulable timeout in e2e framework 2018-05-15 16:50:04 +02:00
Ashley Gau
054b4a7978 check for new backend naming scheme 2018-05-14 09:58:21 -07:00
Kubernetes Submit Queue
92ba95c39c Merge pull request #63446 from deads2k/client-08-remove-old
Automatic merge from submit-queue (batch tested with PRs 63367, 63718, 63446, 63723, 63720). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

finish new dynamic client and deprecate old dynamic client

Builds on a couple other pulls.  This completes the transition to the new dynamic client.

@kubernetes/sig-api-machinery-pr-reviews 
@caesarxuchao @sttts 

```release-note
The old dynamic client has been replaced by a new one.  The previous dynamic client will exist for one release in `client-go/deprecated-dynamic`.  Switch as soon as possible.
```
2018-05-11 14:49:16 -07:00
Kubernetes Submit Queue
dbc491c031 Merge pull request #63594 from justinsb/introduce_tooling_to_e2e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

e2e: add a tooling argument to differentiate tooling

We expect lots of tools to be able to install on provider=gce - the cluster
API, kops, kube-up etc.

We introduce a new optional flag to e2e ('tooling') to enable switching on
the tooling, not just the cloud.

This will prove useful for upgrade tests, for example, where the mechanism
will likely vary by tooling, but is currently tightly bound to the provider
(i.e. cloud)

```release-note
NONE
```
2018-05-11 12:35:47 -07:00
David Eads
fd044d152e fix dynamic client name 2018-05-11 13:12:09 -04:00
Kubernetes Submit Queue
2c165efbb4 Merge pull request #63246 from losipiuk/lo/autoscaler-e2e-gpu-tests
Automatic merge from submit-queue (batch tested with PRs 63246, 63185). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

 Add scale-up test from 0 for GPU node-pool

**Release note**:
```release-note
NONE
```
2018-05-11 03:30:10 -07:00