This was broken back in #43637 when the logic in
`(*StatefulSetTester).CreateStatefulSet` switched from using
`generated.ReadOrDie` to read the entire service.yaml file and pass it
to kubectl to using `manifest.SvcFromManifest`, which assumes that the
file contains only a single service.
Fixes#52750
Automatic merge from submit-queue (batch tested with PRs 52843, 52710, 52821, 52844). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
improve retrying logic when checking CA status
This should reduce the flake rate in cluster size autoscaling test suite.
Automatic merge from submit-queue (batch tested with PRs 48406, 52819). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixed nil dereference in dynamic provisioning e2e tests
**What this PR does / why we need it**: Fixed nil dereference in dynamic provisioning e2e tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52815
**Release note**:
```release-note-none
NONE
```
/sig storage
/assign @saad-ali
/cc @wongma7
/release-note-none
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Retry if possible while creating latency pods in density test
Saw the [last run](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/37) of density test on 5k-node fail due to it:
```
Expected error:
<*errors.StatusError | 0xc44f2fd7a0>: {
ErrStatus: {
TypeMeta: {Kind: "", APIVersion: ""},
ListMeta: {SelfLink: "", ResourceVersion: "", Continue: ""},
Status: "Failure",
Message: "timeout",
Reason: "",
Details: nil,
Code: 500,
},
}
timeout
not to have occurred
```
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Support kubernetes-anywhere provider
**What this PR does / why we need it**:
Implements a new `kubernetes-anywhere` provider to allow upgrade testing in the e2e binary. This is the final step to allow https://github.com/kubernetes/test-infra/pull/4495 and https://github.com/kubernetes/kubernetes-anywhere/pull/450.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubeadm/issues/311
**Special notes for your reviewer**:
Some questions I had
- Does the `--provider` flag specified [here](dbbf6261e0/jobs/config.json (L8587)) get sent to the flag defined [here](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/framework/test_context.go#L219)? Or should I add another `--provider` flag inside `--upgrade_args` like this: `--upgrade_args=... --provider=kubernetes-anywhere`?
- Is it necessary to add waiting logic after the `make` command, or will it implicitly handle that by itself?
Some other points:
- I chose `sed` to manipulate the current kubernetes-anywhere `.config` rather than duplicating another [`anywhere.go`](https://github.com/kubernetes/test-infra/blob/master/kubetest/anywhere.go). One suggestion was to use `jq` but since the config on disk is not serialized to JSON yet, I'm not sure how that'd work.
- Since I don't have a GCE/GKE account or vCenter, I can't actually verify the e2e binary works. I've managed to build it, but if somebody could quickly run a smoke test, I'd appreciate it. This is my first poke around test-infra and e2e, so there might be some plumbing missing
/cc @jessicaochen @luxas @pipejakob @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 52500, 52533). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add mount options e2e test
**What this PR does / why we need it**: A test for newly added StorageClass.mountOptions and PV.mountOptions: provision a pv using a class with its storageclass.mountoptions set, and the end result should be that the mount options can be seen from the mounter.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: Fixes#52138
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
In autoscaling integration test, use allocatable instead of capacity for node memory
This makes the remaining cluster autoscaling test (integration test of HPA and CA working together to scale up the cluster) use node allocatable resources when computing how much memory we need to consume in order to trigger scale up/prevent scale down. Follow up to #52650 as that one is already merging.
cc @wasylkowski
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix: update system spec to support Docker 17.03
Docker 17.03 is 1.13 with bug fixes so they are of the same minor version release. We've validated them both in https://github.com/kubernetes/kubernetes/issues/42926. This PR changes the system spec to support Docker 17.03.
**This should be in 1.8.**
**Release note**:
```
Kubernetes 1.8 supports docker version 17.03.x.
```
/assign @Random-Liu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
In cluster size autoscaling tests, use allocatable instead of capacity for node memory
This makes cluster size autoscaling e2e tests use node allocatable resources when computing how much memory we need to consume in order to trigger scale up/prevent scale down. It should fix failing tests in GKE.
Automatic merge from submit-queue (batch tested with PRs 52350, 52659). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add e2e test for storageclass.reclaimpolicy
**What this PR does / why we need it**: Adds another dynamic provisioning test where the storageclass.reclaimpolicy == retain. Have to manually delete the PV at the end of the test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: https://github.com/kubernetes/kubernetes/issues/52138
**Special notes for your reviewer**: I have not tested it but it's ready for review, I will comment and edit this when i've verified it actually works.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51796, 52223)
Add bsalamat to sig-scheduling-maintainers
**What this PR does / why we need it**:
Adds bsalamat to sig-scheduling-maintainers.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # N/A
**Release note**:
```release-note
NONE
```
@kubernetes/sig-scheduling-pr-reviews @davidopp @timothysc @k82cn @wojtek-t
Automatic merge from submit-queue
Bugfix: Fix e2e Flaky Apps/Job BackoffLimit test
This fix is linked to the PR #51153 that introduce the `JobSpec.BackoffLimit`.
Previously the Timeout used in the test was too aggressive and generates flaky test execution. Now it used the default `framework.JobTimeout` used in others tests.
**What this PR does / why we need it**:
This PR should fix flaky "[sig-apps] Job should exceed backoffLimit" test, due to a too short timeout duration.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#51153
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)
Improve apiserver metrics reporting
Normalize "WATCHLIST" to "WATCH", add "scope" to the other metrics (listing 50k pods is != listing pods in a namespace), and add a new scope "resource" to cover individual resource calls.
This roughly aligns metrics with our ACL model (technically resource scope is GET, but POST to a subresource and POST to a namespace are not the same thing).
```release-note
WATCHLIST calls are now reported as WATCH verbs in prometheus for the apiserver_request_* series. A new "scope" label is added to all apiserver_request_* values that is either 'cluster', 'resource', or 'namespace' depending on which level the query is performed at.
```
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)
Add more tests for pod preemption
**What this PR does / why we need it**:
Adds more e2e and integration tests for pod preemption.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This PR is based on #50949. Only the last commit is new.
**Release note**:
```release-note
NONE
```
ref/ #47604
@kubernetes/sig-scheduling-pr-reviews @davidopp
Automatic merge from submit-queue
[fluentd-gcp addon] Remove some e2e tests out of blocking suites
Fixes https://github.com/kubernetes/kubernetes/issues/52433
Some Stackdriver Logging e2e tests are broken in release-blocking suites:
- Due to the change in Docker 1.13, on some systems logs are automatically split by 16K chunks. This PR removes an e2e test that assumes otherwise
- In large clusters, it's not possible to ingest system logs from all nodes
Since it's not a Kubernetes problem per se, mitigating this by removing these tests from blocking suites.
Automatic merge from submit-queue
Fix failing autoscaling test in GKE
This should fix `[sig-autoscaling] Cluster size autoscaling [Slow] should increase cluster size if pending pods are small and there is another node pool that is not autoscaled [Feature:ClusterSizeAutoscalingScaleUp]` by getting a list of nodes from GKE nodepool in a different way (filtering nodes by labels.) Currently, gcloud command used for it is failing, as we only have GKE node pool name in the test and not the actual MIG name.
Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)
Remove the conversion of client config
It was needed because the clientset code in client-go was a copy of the clientset code in Kubernetes.. client-go is authoritative now, so we can remove the nasty copy.
This fix is linked to the PR #51153 that introduce the
JobSpec.BackoffLimit.
Previously the Timeout used in the test was too agressive and generates
flaky test execution. Now it used the default framework.JobTimeout used
in others tests.