This is the old behaviour and we did not intent to change it due to enabled authn/z in general.
As the kube-apiserver this sets the "system:unsecured" user info.
Automatic merge from submit-queue (batch tested with PRs 65256, 64236, 64919, 64879, 57932). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Cancellable leader election
**What this PR does / why we need it**:
Adds ability to cancel leader election. Useful in integration tests where the whole app is started and stopped in each test.
**Special notes for your reviewer**:
I used the `context` package - it is impossible/hard to achieve the same behaviour with just channels without spawning additional goroutines but it is trivial with `context`. See `acquire()` and `renew()` methods.
**Release note**:
```release-note
NONE
```
/kind enhancement
/sig api-machinery
Automatic merge from submit-queue (batch tested with PRs 63049, 59731). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
re-enable nodeipam in kube-controller-manager
**What this PR does / why we need it**:
Re-enables nodeipam controller for external clouds. Also does a small refactor so that we don't need to pass in `allocateNodeCidr` into the controller.
In v1.10 we made a change (9187b343e1 (diff-f11913dc67d80d36b3d06a93f61c49cf) in https://github.com/kubernetes/kubernetes/pull/57492) where nodeipam would be disabled for any cluster that sets `--cloud-provider=external`. The original intention behind this was that the nodeipam controller is cloud specific for some clouds (only GCE at the moment) so it should be moved to the CCM (cloud controller manager). After some discussions with wg-cloud-provider it makes sense to re-enable nodeipam controller in KCM and have GCE CCM enable its own cloud-specific IPAM controller as part of [Initialize()](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go#L33-L35). This would allow for GCE to run nodeipam in both KCM (by setting --cloud-provider=gce and --allocate-node-cidr) and in the CCM (once implemented in `Initialize()`) without disabling nodeipam in the KCM for all external clouds and avoids having to implement nodeipam in CCM.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Re-enable nodeipam controller for external clouds.
```
Automatic merge from submit-queue (batch tested with PRs 61306, 60270, 62496, 62181, 62234). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
split up the huge set of flags into smaller option structs
**What this PR does / why we need it**:
To make generic, we do following work:
1. Spliting `KubeControllerManagerConfiguration` in kube-controller-manager and cloud-controller-manager into fewer smaller struct options order by controller, and modify relative flag. Also part of #59483.
2. Spliting `componentconfig` in controller-manager into fewer smaller config order by controller too.
All works follow #59582, using `option+config` logic.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
With d7ddcca231, we lost the logging
of the flags. We should at least log what the command line flags
were used to start processes as those incredibly useful for trouble shooting.
Add a separate method in a new file for creating cloud providers.
Currently the code is all mixed into the controller manager. We
should actively control what is made available to the cloud provider
so list explicitly the parms needed and move the code out. This will
avoid linkages to sneak in as we will catch it better during reviews.
Automatic merge from submit-queue (batch tested with PRs 58626, 58791). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
serviceaccount: check token is issued by correct iss before verifying
Right now if a JWT for an unknown issuer, for any subject hits the
serviceaccount token authenticator, we return a errors as if the token
was meant for us but we couldn't find a key to verify it. We should
instead return nil, false, nil.
This change helps us support multiple service account token
authenticators with different issuers.
https://github.com/kubernetes/kubernetes/issues/58790
```release-note
NONE
```
Right now if a JWT for an unknown issuer, for any subject hits the
serviceaccount token authenticator, we return a errors as if the token
was meant for us but we couldn't find a key to verify it. We should
instead return nil, false, nil.
This change helps us support multiple service account token
authenticators with different issuers.
Prepatory work fpr removing cloud provider dependency from node
controller running in Kube Controller Manager. Splitting the node
controller into its two major pieces life-cycle and CIDR/IP
management. Both pieces currently need the the cloud system to do their work.
Removing lifecycles dependency on cloud will be fixed ina followup PR.
Moved node scheduler code to live with node lifecycle controller.
Got the IPAM/Lifecycle split completed. Still need to rename pieces.
Made changes to the utils and tests so they would be in the appropriate
package.
Moved the node based ipam code to nodeipam.
Made the relevant tests pass.
Moved common node controller util code to nodeutil.
Removed unneeded pod informer sync from node ipam controller.
Fixed linter issues.
Factored in feedback from @gmarek.
Factored in feedback from @mtaufen.
Undoing unneeded change.
Automatic merge from submit-queue (batch tested with PRs 57651, 56411, 56779, 57523, 57624). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use authentication client with explicit version
**What this PR does / why we need it**:
Authentication client without explicit version has been deprecated, change them to the one with explicit version.
**Which issue(s) this PR fixes**:
Fixes partially #55993
**Special notes for your reviewer**:
/cc @caesarxuchao @sttts
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix garbage collector when leader-elect=false
**What this PR does / why we need it**:
In a 1.8.x master with --leader-elect=false, the garbage collector controller
does not work.
When deleting a deployment with v1meta.DeletePropagationForeground, the deployment
had its deletionTimestamp set and a foreground Deletion finalizer was added,
but the deployment, rs and pod were not deleted.
This is an issue with how the garbage collector graph_builder behaves when the
stopCh=nil. This PR creates a dummy stop channel for the garbage collector controller (and other
controllers started by the controller-manager) so that they can work more like they do when
when the controller-manager is configured with --leader-elect=true.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57044
**Special notes for your reviewer**:
**Release note**:
```release-note
Fix garbage collection when the controller-manager uses --leader-elect=false
```
Seperate loop and plugin control in the kube-controller-manager.
Adding an "--external-plugin" flag to specify a plugin to load when
cloud-provider is set to "external". Flag has no effect currently
when the cloud-provider is not set to external. The expectation is
that the cloud provider and external plugin flags would go away once
all cloud providers are on stage 2 cloud-controller-manager solutions.
Managing the control loops more directly based on start up flags.
Addressing issue brought up by @wlan0
Switched to using the main node controller in CCM.
Changes to enable full NodeController to start in CCM.
Fix related tests.
Unifying some common code between KCM and CCM.
Fix related tests and comments.
Folded in feedback from @jhorwit2 and @wlan0
**What this PR does / why we need it**:
In a 1.8.x master with --leader-elect=false, the garbage collector controller
does not work.
When deleting a deployment with v1meta.DeletePropagationForeground, the deployment
had its deletionTimestamp set and a foreground Deletion finalizer was added,
but the deployment, rs and pod were not deleted.
This is an issue with how the garbage collector graph_builder behaves when the
stopCh=nil. This PR creates a dummy stop channel for the garbage collector controller (and other
controllers started by the controller-manager) so that they can work more like they do when
when the controller-manager is configured with --leader-elect=true.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57044
**Special notes for your reviewer**:
**Release note**:
<!-- Write your release note:
1. Enter your extended release note in the below block. If the PR requires additional action from users switching to the new release, include the string "action required".
2. If no release note is required, just write "NONE".
-->
```release-note
Garbage collection doesn't work when the controller-manager uses --leader-elect=false
```
running (renamed to GetAllCurrentZones). Added E2E test to confirm this
behavior.
Added node informer to cloud-provider controller to keep track of zones
with k8s nodes in them.
Automatic merge from submit-queue (batch tested with PRs 51034, 53239). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix conditional for warning while starting KCM without secret file
@liggitt @spiffxp @lavalamp
Fixes#53291
A small bug was introduced in this PR - https://github.com/kubernetes/kubernetes/pull/50288, where the warning message is printed when the file is specified, and it is not printed if it is left blank - exactly the opposite of the intended behavior.
This fixes that.
```
release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 52445, 52380, 52516, 52531, 52538). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
remove repeated import'k8s.io/client-go/kubernetes' in controllermana…
**What this PR does / why we need it**:
There are duplicate importing "k8s.io/client-go/kubernetes", we just need 'clientset'.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 50173, 50324, 50288, 50263, 50333)
Honor --use-service-account-credentials and warn when missing private key
Fixes#50275 by logging a warning and failing to start rather than continue to run ignoring the user's specified config
- Move public key functions to client-go/util/cert
- Move pki file helper functions to client-go/util/cert
- Standardize on certutil package alias
- Update dependencies to client-go/util/cert
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Use the core client with version
**What this PR does / why we need it**:
Replace the **deprecated** `clientSet.Core()` with `clientSet.CoreV1()`.
**Which issue this PR fixes**: fixes#49535
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49665, 49689, 49495, 49146, 48934)
make it possible to allow discovery errors for controllers
Update the discovery client to return partial discovery information *and* an error. Since we can aggregate API servers, discovery of some resources can fail independently. Callers of this function who want to tolerate the errors can, existing callers will still get an error and fail in normal blocks.
@kubernetes/sig-api-machinery-misc @sttts
Tokens controller previously needed a bit of extra help in order to be
safe for concurrent use. The new MutationCache allows it to keep a local
cache and still use a shared informer. The filtering event handler lets
it only see changes to secrets it cares about.
Automatic merge from submit-queue (batch tested with PRs 45374, 44537, 45739, 44474, 45888)
Allow kcm and scheduler to lock on ConfigMaps.
**What this PR does / why we need it**:
Plumbs through the ability to lock on ConfigMaps through the kcm and scheduler.
**Which issue this PR fixes**
Fixes: #44857
Addresses issues with: #45415
**Special notes for your reviewer**:
**Release note**:
```
Add leader-election-resource-lock support to kcm and scheduler to allow for locking on ConfigMaps as well as Endpoints(default)
```
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews @jamiehannaford @bsalamat @mikedanese
Automatic merge from submit-queue
Add support for IP aliases for pod IPs (GCP alpha feature)
```release-note
Adds support for allocation of pod IPs via IP aliases.
# Adds KUBE_GCE_ENABLE_IP_ALIASES flag to the cluster up scripts (`kube-{up,down}.sh`).
KUBE_GCE_ENABLE_IP_ALIASES=true will enable allocation of PodCIDR ips
using the ip alias mechanism rather than using routes. This feature is currently
only available on GCE.
## Usage
$ CLUSTER_IP_RANGE=10.100.0.0/16 KUBE_GCE_ENABLE_IP_ALIASES=true bash -x cluster/kube-up.sh
# Adds CloudAllocator to the node CIDR allocator (kubernetes-controller manager).
If CIDRAllocatorType is set to `CloudCIDRAllocator`, then allocation
of CIDR allocation instead is done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.
- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
```
Automatic merge from submit-queue
Remove alphaProvisioner in PVController and AlphaStorageClassAnnotation
remove alpha annotation and alphaProvisioner
**Release note**:
```release-note
NONE
```
If CIDRAllocatorType is set to `CloudCIDRAllocator`, then allocation
of CIDR allocation instead is done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.
- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
Automatic merge from submit-queue (batch tested with PRs 42692, 42169, 42173)
Add pprof trace support
Add support for `/debug/pprof/trace`
Can wait for master to reopen for 1.7.
cc @smarterclayton @wojtek-t @gmarek @timothysc @jeremyeder @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Make controller-manager resilient to stale serviceaccount tokens
Now that the controller manager is spinning up controller loops using service accounts, we need to be more proactive in making sure the clients will actually work.
Future additional work:
* make a controller that reaps invalid service account tokens (c.f. https://github.com/kubernetes/kubernetes/issues/20165)
* allow updating the client held by a controller with a new token while the controller is running (c.f. https://github.com/kubernetes/kubernetes/issues/4672)
Automatic merge from submit-queue
Switch service controller to shared informers
Originally part of #40097
cc @deads2k @smarterclayton @gmarek @wojtek-t @timothysc @sttts @liggitt @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Remove alpha provisioning
This is the first part of https://github.com/kubernetes/features/issues/36
@kubernetes/sig-storage-misc
**Release note**:
```release-note
Alpha version of dynamic volume provisioning is removed in this release. Annotation
"volume.alpha.kubernetes.io/storage-class" does not have any special meaning. A default storage class
and DefaultStorageClass admission plugin can be used to preserve similar behavior of Kubernetes cluster,
see https://kubernetes.io/docs/user-guide/persistent-volumes/#class-1 for details.
```
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
TaintController
```release-note
This PR adds a manager to NodeController that is responsible for removing Pods from Nodes tainted with NoExecute Taints. This feature is beta (as the rest of taints) and enabled by default. It's gated by controller-manager enable-taint-manager flag.
```
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
ResyncPeriod Comment
ResyncPeriod Comment:
// ResyncPeriod returns a function which generates a duration each time it is
// invoked; this is so that multiple controllers don't get into lock-step and all
// hammer the apiserver with list requests simultaneously.
Automatic merge from submit-queue (batch tested with PRs 39661, 39740, 39801, 39468, 39743)
add --controllers to controller manager
Adds a `--controllers` flag to the `kube-controller-manager` to indicate which controllers are enabled and disabled. From the help:
```
--controllers stringSlice A list of controllers to enable. '*' enables all on-by-default controllers, 'foo' enables the controller named 'foo', '-foo' disables the controller named 'foo'.
All controllers: certificatesigningrequests, cronjob, daemonset, deployment, disruption, endpoint, garbagecollector, horizontalpodautoscaling, job, namespace, podgc, replicaset, replicationcontroller, resourcequota, serviceaccount, statefuleset
```
PV controller should not use Controller.Requeue, as as it is not available in
shared informers. We need to implement our own work queues instead where we
can enqueue volumes/claims as we want.
Currently, the HPA considers unready pods the same as ready pods when
looking at their CPU and custom metric usage. However, pods frequently
use extra CPU during initialization, so we want to consider them
separately.
This commit causes the HPA to consider unready pods as having 0 CPU
usage when scaling up, and ignores them when scaling down. If, when
scaling up, factoring the unready pods as having 0 CPU would cause a
downscale instead, we simply choose not to scale. Otherwise, we simply
scale up at the reduced amount caculated by factoring the pods in at
zero CPU usage.
The effect is that unready pods cause the autoscaler to be a bit more
conservative -- large increases in CPU usage can still cause scales,
even with unready pods in the mix, but will not cause the scale factors
to be as large, in anticipation of the new pods later becoming ready and
handling load.
Similarly, if there are pods for which no metrics have been retrieved,
these pods are treated as having 100% of the requested metric when
scaling down, and 0% when scaling up. As above, this cannot change the
direction of the scale.
This commit also changes the HPA to ignore superfluous metrics -- as
long as metrics for all ready pods are present, the HPA we make scaling
decisions. Currently, this only works for CPU. For custom metrics, we
cannot identify which metrics go to which pods if we get superfluous
metrics, so we abort the scale.