The unit test for the ingress controller was previously adding
a cluster twice, which resulted in a cluster being deleted and added
back. The deletion was racing the controller shutdown to close
informer channels. This change ensures that the informer clears its
map of informers when Stop() is called to prevent a double close, and
that the test no longer adds the cluster twice.
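A minimal sketch of the double-close hazard and the fix described above, assuming a simplified informer that keeps one stop channel per cluster. The names `informerSet`, `addCluster` and `size` are hypothetical and do not come from the federation code:

```go
package main

import (
	"fmt"
	"sync"
)

// informerSet is a hypothetical stand-in for the federated informer: it
// tracks one stop channel per cluster name.
type informerSet struct {
	mu        sync.Mutex
	stopChans map[string]chan struct{}
}

func newInformerSet() *informerSet {
	return &informerSet{stopChans: make(map[string]chan struct{})}
}

// addCluster registers a cluster. Adding a name that already exists first
// closes and removes the old entry, i.e. a delete followed by an add.
func (s *informerSet) addCluster(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if ch, ok := s.stopChans[name]; ok {
		close(ch)
		delete(s.stopChans, name)
	}
	s.stopChans[name] = make(chan struct{})
}

// Stop closes every channel and then resets the map, so a later Stop or
// delete finds nothing left to close a second time.
func (s *informerSet) Stop() {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, ch := range s.stopChans {
		close(ch)
	}
	s.stopChans = make(map[string]chan struct{})
}

// size reports how many informers are still tracked.
func (s *informerSet) size() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.stopChans)
}

func main() {
	s := newInformerSet()
	s.addCluster("cluster-a")
	s.Stop()
	s.Stop() // safe: the first call cleared the map
	fmt.Println("remaining informers:", s.size())
}
```

Without the map reset in `Stop`, the second call would close an already closed channel and panic.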
Automatic merge from submit-queue
Add fabianofranz as approver for test/e2e/kubectl.go
Adding myself as approver for `kubectl` end-to-end tests.
```release-note
NONE
```
Automatic merge from submit-queue
Fixed incorrect result of getMinTolerationTime.
For the following case, `getMinTolerationTime` should return 1, but it returned -1:
1. For tolerations[0], TolerationSeconds is nil, so minTolerationTime is not set.
2. For tolerations[1], its TolerationSeconds (1) is bigger than `minTolerationTime`, so minTolerationTime stays at -1, which means infinite.
```
+ {
+ tolerations: []v1.Toleration{
+ {
+ TolerationSeconds: nil,
+ },
+ {
+ TolerationSeconds: &one,
+ },
+ },
+ },
```
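The corrected comparison could look roughly like the sketch below. It is not the actual controller code: `toleration` is a simplified stand-in for `v1.Toleration`, and `minTolerationSeconds` is a hypothetical name that returns plain seconds, with -1 meaning infinite.

```go
package main

import "fmt"

// toleration is a simplified stand-in for v1.Toleration.
type toleration struct {
	TolerationSeconds *int64
}

// minTolerationSeconds returns the smallest finite TolerationSeconds, or -1
// when every toleration is unbounded (nil TolerationSeconds).
func minTolerationSeconds(tolerations []toleration) int64 {
	minSeconds := int64(-1) // -1 means "infinite"
	for _, t := range tolerations {
		if t.TolerationSeconds == nil {
			continue // unbounded toleration: does not lower the minimum
		}
		// Treat -1 as "not set yet" so the first finite value always wins.
		if s := *t.TolerationSeconds; minSeconds == -1 || s < minSeconds {
			minSeconds = s
		}
	}
	return minSeconds
}

func main() {
	one := int64(1)
	got := minTolerationSeconds([]toleration{
		{TolerationSeconds: nil},
		{TolerationSeconds: &one},
	})
	fmt.Println(got) // prints 1
}
```

The key point is the `minSeconds == -1` guard: comparing a finite value against the -1 sentinel with `<` alone would never replace it.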
Automatic merge from submit-queue (batch tested with PRs 42969, 42966)
kubeadm: update kubeadm banner to beta
**What this PR does / why we need it**: Updates the intro banner for kubeadm, which used to state it is in alpha (but we are going to beta). This also updates the tagged github group (one that no longer exists) to the sig-cluster-lifecycle-misc group.
**Special notes for your reviewer**: /cc @jbeda
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42969, 42966)
kubeadm: fixed warning nil logging
**What this PR does / why we need it**: Fix bug in warning aggregation for preflight checks. Would cause logging like this:
`[preflight] WARNING: %!s(<nil>)`
Will now only append non-nil cases to warning.
**Special notes for your reviewer**: /cc @jbeda
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
[Federation] Unjoin only the joined clusters while bringing down the federation control plane.
A few other minor improvements.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
hack/godep-restore.sh: use godep v79 which works
Godep v74 gives me:
```shell
godep: Checking dependency: k8s.io/metrics/pkg/apis/custom_metrics
godep: Dep (k8s.io/metrics/pkg/apis/custom_metrics) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/api/resource) not found
godep: Checking dependency: k8s.io/metrics/pkg/apis/custom_metrics/install
godep: Dep (k8s.io/metrics/pkg/apis/custom_metrics/install) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/apimachinery/announced) not found
godep: Checking dependency: k8s.io/metrics/pkg/apis/custom_metrics/v1alpha1
godep: Dep (k8s.io/metrics/pkg/apis/custom_metrics/v1alpha1) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/api/resource) not found
godep: Checking dependency: k8s.io/metrics/pkg/apis/metrics
godep: Dep (k8s.io/metrics/pkg/apis/metrics) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/apis/meta/v1) not found
godep: Checking dependency: k8s.io/metrics/pkg/apis/metrics/install
godep: Dep (k8s.io/metrics/pkg/apis/metrics/install) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/apimachinery/announced) not found
godep: Checking dependency: k8s.io/metrics/pkg/apis/metrics/v1alpha1
godep: Dep (k8s.io/metrics/pkg/apis/metrics/v1alpha1) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/api/resource) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/clientset_generated/clientset
godep: Dep (k8s.io/metrics/pkg/client/clientset_generated/clientset) restored, but was unable to load it with error:
Package (k8s.io/client-go/discovery) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/clientset_generated/clientset/fake
godep: Dep (k8s.io/metrics/pkg/client/clientset_generated/clientset/fake) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/runtime) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/clientset_generated/clientset/scheme
godep: Dep (k8s.io/metrics/pkg/client/clientset_generated/clientset/scheme) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/apis/meta/v1) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/clientset_generated/clientset/typed/metrics/v1alpha1
godep: Dep (k8s.io/metrics/pkg/client/clientset_generated/clientset/typed/metrics/v1alpha1) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/apis/meta/v1) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/clientset_generated/clientset/typed/metrics/v1alpha1/fake
godep: Dep (k8s.io/metrics/pkg/client/clientset_generated/clientset/typed/metrics/v1alpha1/fake) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/apis/meta/v1) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/custom_metrics
godep: Dep (k8s.io/metrics/pkg/client/custom_metrics) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/api/meta) not found
godep: Checking dependency: k8s.io/metrics/pkg/client/custom_metrics/fake
godep: Dep (k8s.io/metrics/pkg/client/custom_metrics/fake) restored, but was unable to load it with error:
Package (k8s.io/apimachinery/pkg/labels) not found
godep: Checking dependency: vbom.ml/util/sortorder
godep: Error checking some deps.
2,64s user 2,75s system 11% cpu 47,395s total
```
v79 works.
Automatic merge from submit-queue
Fix taint based pod eviction for clusters where controller manager is not running with allocate-node-cidrs set
Fixes https://github.com/kubernetes/kubernetes/issues/42733
In my cluster, I have not set allocate-node-cidrs, and it is causing taint based pod eviction to fail.
@gmarek @kubernetes/sig-scheduling-bugs @davidopp @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
Fixes kubectl skew test failure when using kubectl.sh
Fixes leftovers from https://github.com/kubernetes/kubernetes/pull/42737.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
Fix DefaultTolerationSeconds admission plugin
DefaultTolerationSeconds is not working as expected. It is supposed to add default tolerations (for the unreachable and notReady conditions), but no pod was getting these tolerations, and the API server was throwing this error:
```
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.769212 32070 admission.go:71] expected pod but got Pod
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.789055 32070 admission.go:71] expected pod but got Pod
Mar 08 13:44:02 fedora25 hyperkube[32070]: E0308 13:44:02.006784 32070 admission.go:71] expected pod but got Pod
Mar 08 13:45:39 fedora25 hyperkube[32070]: E0308 13:45:39.754669 32070 admission.go:71] expected pod but got Pod
Mar 08 14:48:16 fedora25 hyperkube[32070]: E0308 14:48:16.673181 32070 admission.go:71] expected pod but got Pod
```
The reason for this error is that the input to admission plugins is internal API objects, not versioned objects, so expecting a versioned object is incorrect. Because of this, no pod got the desired tolerations, and the pod description always showed:
```
Tolerations: <none>
```
After this fix, the correct tolerations are being assigned to pods as follows:
```
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
```
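The failure mode can be sketched with two distinct Go types that share a name in spirit but not in identity. `internalPod`, `versionedPod` and `admit` below are hypothetical stand-ins, not the real API types or plugin code; the sketch only shows why a type assertion against the wrong type rejects every object:

```go
package main

import "fmt"

// internalPod and versionedPod are hypothetical stand-ins for the internal
// and versioned Pod types; to Go they are unrelated types.
type internalPod struct{ Tolerations []string }
type versionedPod struct{ Tolerations []string }

// admit mimics an admission plugin. It asserts the internal type, which is
// what admission plugins receive according to the description above.
func admit(obj interface{}) error {
	pod, ok := obj.(*internalPod)
	if !ok {
		return fmt.Errorf("expected *internalPod but got %T", obj)
	}
	pod.Tolerations = append(pod.Tolerations,
		"node.alpha.kubernetes.io/notReady",
		"node.alpha.kubernetes.io/unreachable",
	)
	return nil
}

func main() {
	p := &internalPod{}
	fmt.Println(admit(p), p.Tolerations)
	// Passing the other type fails the assertion and adds nothing.
	fmt.Println(admit(&versionedPod{}))
}
```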
@davidopp @kevin-wangzefeng @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scheduling-bugs @derekwaynecarr
Fixes https://github.com/kubernetes/kubernetes/issues/42716
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
AppArmor cluster upgrade test
Add a cluster upgrade test for AppArmor. I still need to test this (having some trouble with the cluster-upgrade tests), but wanted to start the review process.
/cc @dchen1107 @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
[Federation][e2e] Add framework for upgrade test in federation
Adding a framework for federation upgrade tests. Please refer to #41791.
cc @madhusudancs @nikhiljindal @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42642, 42899, 42922)
[Federation] Deployments unaware of ReadyReplicas
The Deployment controller was not propagating ReadyReplicas to underlying clusters causing these errors:
```
Error syncing cluster controller: Deployment.apps "federation-deployment" is invalid: status.availableReplicas: Invalid value: 5: cannot be greater than readyReplicas
```
This was caught in e2e testing and is a 1.6 regression for support that was added in #37959. Without this fix, users will be unable to scale up their deployments.
Automatic merge from submit-queue (batch tested with PRs 42642, 42899, 42922)
Update cadvisor godeps to v0.25.0
Completes #42008, a 1.6 issue.
The cadvisor changes include only a couple minor bug fixes, mainly for the devicemapper storage driver.
cc @dchen1107
```release-note
Disable devicemapper thin_ls due to excessive iops
```
Automatic merge from submit-queue
Invalid environment var names are reported and pod starts
When processing EnvFrom items, all invalid keys are collected and
reported as a single event.
The Pod is allowed to start.
fixes #42583
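A rough sketch of the collect-then-report behaviour, under stated assumptions: the validity rule `envNameRE` (C-identifier style) and the helper `splitEnv` are illustrative guesses, and the real kubelet validation and event wording may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// envNameRE is an assumed validity rule (C-identifier style); the real
// validation used by the kubelet may differ.
var envNameRE = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// splitEnv separates usable variables from invalid keys. Invalid keys are
// collected rather than treated as a fatal error.
func splitEnv(data map[string]string) (valid map[string]string, invalid []string) {
	valid = make(map[string]string)
	for k, v := range data {
		if envNameRE.MatchString(k) {
			valid[k] = v
		} else {
			invalid = append(invalid, k)
		}
	}
	sort.Strings(invalid) // stable order for the single report
	return valid, invalid
}

func main() {
	valid, invalid := splitEnv(map[string]string{
		"GOOD_KEY": "1",
		"1bad":     "2",
		"also-bad": "3",
	})
	if len(invalid) > 0 {
		// One event for all invalid keys instead of one per key.
		fmt.Printf("event: skipped invalid environment variable names: %s\n",
			strings.Join(invalid, ", "))
	}
	fmt.Println("container starts with", len(valid), "variable(s)")
}
```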
Automatic merge from submit-queue (batch tested with PRs 41830, 42630)
Arrange for elasticsearch to shutdown cleanly
Kubernetes initiates "graceful shutdown" by sending SIGTERM to pid 1, which
is exactly what elasticsearch is expecting (good!)
The way the existing startup scripts worked however, this signal arrived at
the shell wrapper, not elasticsearch, and the shell wrapper exited,
killing the container immediately (bad!)
Before this change:
```
1 ? Ss 0:00 /bin/sh -c /run.sh
6 ? S 0:00 /bin/bash /run.sh
13 ? S 0:00 \_ /bin/su -c /elasticsearch/bin/elasticsearch elasticsearch
14 ? Ss 0:00 \_ sh -c /elasticsearch/bin/elasticsearch
15 ? Sl 19:18 \_ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
```
After this change:
```
1 ? Ssl 0:29 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch start
```
Automatic merge from submit-queue
[Federation] Kubefed Init should use the right RBAC API version clientset
**What this PR does / why we need it**:
Implements the need as described in https://github.com/kubernetes/kubernetes/issues/41263
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubernetes/issues/41263
**Special notes for your reviewer**:
@madhusudancs @shashidharatd @marun
cc @kubernetes/sig-federation-bugs
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)
Let GC print specific message for RESTMapping failure
Make the error messages reported in https://github.com/kubernetes/kubernetes/issues/39816 more specific, and print each message only once.
I'll also update the garbage collector's doc to clearly state we don't support tpr yet.
We'll wait for the watchable discovery feature (@sttts are you going to work on that?) to land in 1.7, and then enable the garbage collector to handle TPR.
cc @hongchaodeng @MikaelCluseau @djMax
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)
Fix deployment generator after introducing deployments in apps/v1beta1
This PR does two things:
1. Switches all generators to produce versioned objects. This bypasses the problem of an object existing in multiple versions, which made the generators unstable (i.e. they did not always produce exactly the same object).
2. Introduces new generator for `apps/v1beta1` deployments.
@kargakis @janetkuo ptal
@kubernetes/sig-apps-pr-reviews @kubernetes/sig-cli-pr-reviews ptal
This is a followup to https://github.com/kubernetes/kubernetes/pull/39683, so I'm adding 1.6 milestone.
```release-note
Introduce new generator for apps/v1beta1 deployments
```
Automatic merge from submit-queue (batch tested with PRs 42608, 42444)
Return nil when deleting non-existent GCE PD
When the GCE cloud provider tries to delete a disk and the disk cannot be found
in any of the zones, the function should return a nil error. This modified behavior is also consistent with AWS.
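The idea is an idempotent delete. A minimal sketch, assuming a fake in-memory provider (`fakeCloud`, `findDisk` and `errDiskNotFound` are hypothetical names, not the GCE provider's API):

```go
package main

import (
	"errors"
	"fmt"
)

// errDiskNotFound is a hypothetical sentinel for "disk not in any zone".
var errDiskNotFound = errors.New("disk not found")

// fakeCloud is an in-memory stand-in for a cloud provider.
type fakeCloud struct {
	disks map[string]bool
}

// findDisk reports errDiskNotFound when the disk does not exist.
func (c *fakeCloud) findDisk(name string) error {
	if !c.disks[name] {
		return errDiskNotFound
	}
	return nil
}

// DeleteDisk treats a missing disk as already deleted and returns nil, so
// callers can retry a delete safely.
func (c *fakeCloud) DeleteDisk(name string) error {
	if err := c.findDisk(name); err != nil {
		if errors.Is(err, errDiskNotFound) {
			return nil
		}
		return err
	}
	delete(c.disks, name)
	return nil
}

func main() {
	c := &fakeCloud{disks: map[string]bool{"pd-1": true}}
	fmt.Println(c.DeleteDisk("pd-1")) // deletes the disk
	fmt.Println(c.DeleteDisk("pd-1")) // already gone: still nil
}
```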
Automatic merge from submit-queue (batch tested with PRs 42877, 42853)
discriminate more when parsing kube-env :(
Exactly match the key. Right now CA_KEY matches ETCD_CA_KEY and we just pick the first because fml.
I HATE BASH
more fixes for kubelet rbac enablement upgrades.
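The actual fix lives in a shell script, but the difference between loose and exact key matching can be sketched in Go. This assumes kube-env consists of `KEY: value` lines; `lookupNaive` and `lookupExact` are hypothetical helpers written for this illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// lookupNaive returns the value of the first line that merely contains the
// key, which is the kind of loose match described above.
func lookupNaive(env, key string) (string, bool) {
	for _, line := range strings.Split(env, "\n") {
		if strings.Contains(line, key) {
			parts := strings.SplitN(line, ":", 2)
			if len(parts) == 2 {
				return strings.TrimSpace(parts[1]), true
			}
		}
	}
	return "", false
}

// lookupExact only accepts a line whose key equals the requested key.
func lookupExact(env, key string) (string, bool) {
	for _, line := range strings.Split(env, "\n") {
		parts := strings.SplitN(line, ":", 2)
		if len(parts) == 2 && strings.TrimSpace(parts[0]) == key {
			return strings.TrimSpace(parts[1]), true
		}
	}
	return "", false
}

func main() {
	env := "ETCD_CA_KEY: etcd-secret\nCA_KEY: ca-secret"
	v, _ := lookupNaive(env, "CA_KEY")
	fmt.Println("naive:", v) // picks up the ETCD_CA_KEY line
	v, _ = lookupExact(env, "CA_KEY")
	fmt.Println("exact:", v)
}
```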
Automatic merge from submit-queue (batch tested with PRs 42877, 42853)
Remove unused functions and make logs slightly better
Zero-risk cleanup: removes functions that are no longer used and adds a few more log messages to help debug problems.
cc @aveshagarwal
Automatic merge from submit-queue (batch tested with PRs 36704, 42719)
Extend timeouts in taints test to account for slow Pod deletions
Fix #42685
Before merging this we need a consensus on what to do with slow Pod deletions.
Automatic merge from submit-queue
Use Prometheus instrumentation conventions
The `System` and `Subsystem` parameters are subject to removal.
(x-ref: https://github.com/prometheus/client_golang/issues/240)
All metrics should use base units, which is seconds in the duration
case.
Counters should always end in `_total` and metrics should avoid
referring to potential label dimensions. Those should rather be
mentioned in the documentation string.
@kubernetes/sig-instrumentation
Reference docs:
https://prometheus.io/docs/practices/instrumentation/
https://prometheus.io/docs/practices/naming/
**Release note**:
```
Breaking change: Renamed REST client Prometheus metrics to follow the instrumentation conventions ("request_latency_microseconds" -> "rest_client_request_latency_seconds", "request_status_codes" -> "rest_client_requests_total"). Please update your alerting pipeline if you rely on them.
```
Automatic merge from submit-queue
e2e test: Log container output on TestContainerOutput error
When a pod started with TestContainerOutput or TestContainerOutputRegexp
fails for an unknown reason, we should log all output of all its containers
so we can analyze what went wrong.
This would help us see what went wrong in https://github.com/kubernetes/kubernetes/issues/40811, where a container runs for 3 minutes and then dies, and we want to see what it did during those 3 minutes.
```release-note
NONE
```