Currently, the HPA considers unready pods the same as ready pods when
looking at their CPU and custom metric usage. However, pods frequently
use extra CPU during initialization, so we want to consider them
separately.
This commit causes the HPA to consider unready pods as having 0 CPU
usage when scaling up, and to ignore them when scaling down. If, when
scaling up, factoring the unready pods as having 0 CPU would cause a
downscale instead, we simply choose not to scale. Otherwise, we
scale up at the reduced amount calculated by factoring the pods in at
zero CPU usage.
The effect is that unready pods cause the autoscaler to be a bit more
conservative -- large increases in CPU usage can still cause scales,
even with unready pods in the mix, but will not cause the scale factors
to be as large, in anticipation of the new pods later becoming ready and
handling load.
Similarly, if there are pods for which no metrics have been retrieved,
these pods are treated as having 100% of the requested metric when
scaling down, and 0% when scaling up. As above, this cannot change the
direction of the scale.
This commit also changes the HPA to ignore superfluous metrics -- as
long as metrics for all ready pods are present, the HPA will make scaling
decisions. Currently, this only works for CPU. For custom metrics, we
cannot identify which metrics go to which pods if we get superfluous
metrics, so we abort the scale.
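The rules above can be sketched roughly as follows. This is a simplified illustration, not the controller code: `desiredReplicas` and its parameters are hypothetical names, utilization is expressed as a percentage of the CPU request, and the real controller's tolerance band is left out.

```go
package main

import "math"

// desiredReplicas is a hypothetical helper illustrating the scaling rules.
// readyUtil holds the utilization (percent of request) of each ready pod
// that reported metrics; unready and missing are pod counts; target is the
// target utilization in percent.
func desiredReplicas(readyUtil []float64, unready, missing int, target float64) int {
	current := len(readyUtil) + unready + missing
	if len(readyUtil) == 0 || target <= 0 {
		return current
	}
	sum := 0.0
	for _, u := range readyUtil {
		sum += u
	}
	// Direction is decided from the ready pods alone.
	ratio := sum / float64(len(readyUtil)) / target

	switch {
	case ratio > 1.0:
		// Scaling up: unready pods and pods without metrics count as 0.
		n := float64(current)
		adjusted := sum / n / target
		if adjusted <= 1.0 {
			return current // would flip to a downscale, so do not scale
		}
		return int(math.Ceil(adjusted * n))
	case ratio < 1.0:
		// Scaling down: unready pods are ignored, and pods without
		// metrics count as using 100% of their request.
		n := float64(len(readyUtil) + missing)
		adjusted := (sum + 100*float64(missing)) / n / target
		if adjusted >= 1.0 {
			return current // would flip to an upscale, so do not scale
		}
		return int(math.Ceil(adjusted * n))
	}
	return current
}
```

For example, two ready pods at 200% with two unready pods and a 50% target give 8 replicas under this sketch, rather than the 16 that would result from applying the ready pods' ratio to all four pods.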
Automatic merge from submit-queue
CRI: Add security context for sandbox/container
Part of #29478. This PR
- adds security context for sandbox and fixes #33139
- encapsulates container security context in `SecurityContext` and adds missing features
- Note that capability support is not fully implemented in this PR because it is under discussion at #33614.
cc/ @yujuhong @yifan-gu @Random-Liu @kubernetes/sig-node
Automatic merge from submit-queue
Fix kubectl drain for statefulset
Support deleting pets for `kubectl drain`.
Use evict to delete pods.
Fixes: #33727
```release-note
Adds support for StatefulSets in kubectl drain.
Switches to use the eviction sub-resource instead of deletion in kubectl drain, if the server supports it.
```
@foxish @caesarxuchao
Automatic merge from submit-queue
Rename experimental-runtime-integration-type to experimental-cri
Also rename the field in the component config to `EnableCRI`
Automatic merge from submit-queue
copy PodInitialized condition to v1
Copied from pkg/api/types.go
I might batch this change with other similar changes, but want to get this reviewed first.
cc @dchen1107 @yujuhong
Automatic merge from submit-queue
Default kube-proxy to the old behavior for proxier sync.
Fix #36281.
This PR defaults `minSyncPeriod` to 0 and makes kube-proxy fall back to the old behavior to fix the immediate problem #36266.
@bprashanth
Automatic merge from submit-queue
Rename ScheduledJobs to CronJobs
I went with @smarterclayton's idea of registering named types in the schema. This way we can support both the new (CronJobs) and old (ScheduledJobs) resource names. Fixes #32150.
fyi @erictune @caesarxuchao @janetkuo
Not ready yet, but getting close there...
**Release note**:
```release-note
Rename ScheduledJobs to CronJobs.
```
Automatic merge from submit-queue
azure: loadbalancer rules use DSR
**What this PR does / why we need it**:
Enables "direct server return" on the load balancer in Azure, which causes the DIP to be preserved when traffic goes through the load balancer. This enables service traffic to go to the Service Port rather than having to go through the NodePort.
**Special notes for your reviewer**:
N/A.
**Tested with...**:
```shell
kubectl run nginx --image=nginx
kubectl run nginx2 --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer
kubectl expose deployment nginx2 --port=80 --type=LoadBalancer
```
Ensuring that both services got external IPs and that the resources created looked correct.
**Release note**:
```release-note
azure: load balancer preserves destination ip address
```
CC: @brendandburns
Automatic merge from submit-queue
Adding more e2e tests for federated namespace cascading deletion and fixing bugs
Ref https://github.com/kubernetes/kubernetes/issues/33612
Adding more e2e tests for testing cascading deletion of federated namespace.
New tests now verify that cascading deletion happens when DeleteOptions.OrphanDependents=false and does not happen when DeleteOptions.OrphanDependents=true.
Also updated the deletion helper to always add OrphanFinalizer. The generic registry will remove it if DeleteOptions.OrphanDependents=false. Also updated the namespace registry to do the same.
We need to add the orphan finalizer to keep the orphan-by-default behavior. We assume that the dependents are going to be orphaned and hence add that finalizer. If the user does not want the orphan behavior, they can say so using DeleteOptions, and the registry will then remove that finalizer.
cc @kubernetes/sig-cluster-federation @caesarxuchao @derekwaynecarr
Automatic merge from submit-queue
Restore old apiserver cert CN
This patch got lost during rebase of https://github.com/kubernetes/kubernetes/pull/35109:
- set `host@<unix-timestamp>` as CN in self-signed apiserver certs
- skip non-domain CN in getNamedCertificateMap
Automatic merge from submit-queue
Add more events to disruption controller
To provide users with information that their PDB may not be working as intended.
cc: @davidopp
Automatic merge from submit-queue
[RFC] Prepare for deprecating NodeLegacyHostIP
Ref https://github.com/kubernetes/kubernetes/issues/9267#issuecomment-257994766
*What this PR does*
- Add comments saying "LegacyHostIP" will be deprecated in 1.7;
- Add v1.NodeLegacyHostIP to be consistent with the internal API (useful for client-go migration #35159)
- Let cloudproviders who used to only set LegacyHostIP set the IP as both InternalIP and ExternalIP
- Master used to ssh tunnel to the node's ExternalIP or LegacyHostIP to do [healthz check](https://github.com/kubernetes/kubernetes/blame/master/pkg/master/master.go#L328-L332). OTOH, if on-prem, kubelet only [sets](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L430-L431) LegacyHostIP or InternalIP. In order to deprecate LegacyHostIP in 1.7, I let the healthz check use InternalIP if ExternalIP is not available. (The healthz check is the only consumer of LegacyHostIP in k8s.)
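
The address fallback for the healthz check could look roughly like the sketch below. The types and the `healthzAddress` helper are hypothetical stand-ins defined only for this illustration, and the exact preference order (ExternalIP, then LegacyHostIP, then InternalIP) is an assumption rather than something stated in this PR.

```go
package main

// AddressType and Address are simplified, hypothetical stand-ins for the
// node address types in the API.
type AddressType string

const (
	ExternalIP   AddressType = "ExternalIP"
	LegacyHostIP AddressType = "LegacyHostIP"
	InternalIP   AddressType = "InternalIP"
)

type Address struct {
	Type  AddressType
	Value string
}

// healthzAddress returns the first address found, walking the preference
// list in order. The order itself is an assumption made for this sketch.
func healthzAddress(addrs []Address) (string, bool) {
	for _, want := range []AddressType{ExternalIP, LegacyHostIP, InternalIP} {
		for _, a := range addrs {
			if a.Type == want && a.Value != "" {
				return a.Value, true
			}
		}
	}
	return "", false
}
```

With this shape, an on-prem node that only reports InternalIP still yields an address for the health check.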
@liggitt @justinsb @bgrant0607
```release-note
LegacyHostIP will be deprecated in 1.7.
```
Automatic merge from submit-queue
Add caching for discovery info with invalidation on cache-miss
TODO:
- [x] write tests for `CachedDiscoveryClient`
- [x] write tests for `DeferredDiscoveryRESTMapper` on cache-miss
- [x] find better way/structure to get rid of `invalidateCh` in c06ba3175b
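
As a rough illustration of "invalidation on cache-miss", the sketch below refetches discovery data once when a lookup misses. `cachedLookup`, its `fetch` field, and the kind-to-resource map are hypothetical and much simpler than `CachedDiscoveryClient` or `DeferredDiscoveryRESTMapper`.

```go
package main

import "sync"

// cachedLookup is a hypothetical, simplified cache over discovery data.
// fetch stands in for a round trip to the API server and returns a
// kind -> resource mapping.
type cachedLookup struct {
	mu    sync.Mutex
	fetch func() map[string]string
	cache map[string]string
}

// Resolve answers from the cache when possible. On a miss the cached data
// may be stale, so it is invalidated and fetched again exactly once before
// the lookup is retried.
func (c *cachedLookup) Resolve(kind string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if r, ok := c.cache[kind]; ok {
		return r, true
	}
	c.cache = c.fetch() // invalidate and refill
	r, ok := c.cache[kind]
	return r, ok
}
```

Hits never touch the server, while a type registered after the cache was filled is picked up on its first lookup.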
Automatic merge from submit-queue
Fix LBaaS version detection in openstack cloudprovider
`lbversion` is the local variable used for version detection when `os.lbOpts.LBVersion` is not specified.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1391837
@ncdc @derekwaynecarr @anguslees
Automatic merge from submit-queue
Fix possible race in operationNotSupportedCache
Because we can run multiple workers to delete namespaces simultaneously, the
operationNotSupportedCache needs to be guarded with a mutex to avoid concurrent
map read/write errors.
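
A minimal sketch of the guarded cache, assuming a simple key type and method names chosen for illustration (they are not taken from the actual change):

```go
package main

import "sync"

// operationKey is a hypothetical key: a resource plus the verb that the
// server reported as unsupported.
type operationKey struct {
	resource string
	verb     string
}

// operationNotSupportedCache remembers unsupported operations. The mutex
// makes it safe to share between namespace deletion workers.
type operationNotSupportedCache struct {
	lock sync.RWMutex
	m    map[operationKey]bool
}

// isSupported reports whether the operation has not been marked unsupported.
func (c *operationNotSupportedCache) isSupported(key operationKey) bool {
	c.lock.RLock()
	defer c.lock.RUnlock()
	return !c.m[key]
}

// setNotSupported records that the operation is not supported.
func (c *operationNotSupportedCache) setNotSupported(key operationKey) {
	c.lock.Lock()
	defer c.lock.Unlock()
	if c.m == nil {
		c.m = make(map[operationKey]bool)
	}
	c.m[key] = true
}
```

Readers take the read lock so concurrent lookups do not serialize, while writes are exclusive.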
Automatic merge from submit-queue
Fix the crossbuild that #35132 broke
@dashpole @dchen1107 @vishh
A quick LGTM would be nice in order to not block any releases.
Automatic merge from submit-queue
kubelet bootstrap: start hostNetwork pods before we have PodCIDR
Network readiness was checked in the pod admission phase, but pods that
fail admission are not retried. Move the check to the pod start phase.
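
Conceptually, the check now sits where a failure is retried. The sketch below uses hypothetical names (`podSpec`, `checkNetworkAtStart`) and is not the kubelet code:

```go
package main

import "errors"

// errNetworkNotReady is returned from the start path; because pod starts
// are retried, the pod gets another chance once networking comes up.
var errNetworkNotReady = errors.New("network is not ready")

// podSpec is a hypothetical, minimal view of a pod.
type podSpec struct {
	name        string
	hostNetwork bool
}

// checkNetworkAtStart lets hostNetwork pods start before the node has a
// PodCIDR, and makes every other pod wait until the network is ready.
func checkNetworkAtStart(p podSpec, networkReady bool) error {
	if !networkReady && !p.hostNetwork {
		return errNetworkNotReady
	}
	return nil
}
```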
Issue #35409
Issue #35521
Automatic merge from submit-queue
Add DisruptedPod list to PodDisruptionBudgetStatus
To ensure that PodDisruptionBudget is race free, a list of pods that are planned to be disrupted needs to be added to the status. When evicting a pod, the API server will add it to this list. The disruption controller will skip pods on that list when calculating the number of healthy pods. Pods are removed from the list either when they are gone or when it turns out they were not actually disrupted.
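
The bookkeeping can be sketched as follows. The types and helper names are hypothetical, timestamps are plain Unix seconds, and using a timeout to decide that a pod "was not actually disrupted" is an assumption of this sketch.

```go
package main

// pod is a hypothetical, minimal view of a pod.
type pod struct {
	name  string
	ready bool
}

// countHealthy counts ready pods, skipping any pod that is already
// recorded as about to be disrupted.
func countHealthy(pods []pod, disrupted map[string]int64) int {
	healthy := 0
	for _, p := range pods {
		if _, pending := disrupted[p.name]; pending {
			continue
		}
		if p.ready {
			healthy++
		}
	}
	return healthy
}

// pruneDisrupted keeps an entry only while its pod still exists and the
// eviction was recorded less than timeout seconds ago. Entries for pods
// that are gone, or that outlived the timeout, are dropped.
func pruneDisrupted(pods []pod, disrupted map[string]int64, now, timeout int64) map[string]int64 {
	exists := make(map[string]bool, len(pods))
	for _, p := range pods {
		exists[p.name] = true
	}
	kept := make(map[string]int64)
	for name, at := range disrupted {
		if exists[name] && now-at < timeout {
			kept[name] = at
		}
	}
	return kept
}
```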
Automatic merge from submit-queue
Add missing expansion files to versioned clientset
I copied the expansion functions that only existed in the internalclientset to release_1_5.
Most changes are mechanical. This is needed for migrating k8s to use the versioned clientset, so I added the 1.5 milestone.
Automatic merge from submit-queue
CRI: rearrange kubelet runtime initialization
Consolidate the code used by docker+cri and remote+cri for consistency, and to
prevent changing one without the other. Enforce that
`--experimental-runtime-integration-type` has to be set in order for kubelet
to use the CRI interface, *even for out-of-process shims*. This simplifies the
temporary `if` logic in kubelet while CRI still co-exists with older logic.
Automatic merge from submit-queue
SetSelfLink is inefficient
Generating self links, especially for lists, is inefficient. Replace
use of net.URL.String() call with direct encoding that reduces number of
allocations. Switch from calling meta.ExtractList|SetList to a function
that iterates over each object in the list.
In steady state, this significantly reduces CPU and allocations for nodes
performing frequent small get/list operations, and for larger LISTs.
@wojtek-t this is the next big chunk of CPU use during the large N nodes simulation test (11% of master CPU). It takes a few allocations out of the critical path.