Commit Graph

61839 Commits

Author SHA1 Message Date
Da K. Ma
4df591fc5d Updated comments to correct flag of taint.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-02-17 10:01:25 +08:00
Kubernetes Submit Queue
3a60b0b4f2
Merge pull request #59686 from nicksardo/gce-roles
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

GCE: Create cloud-provider roles and bindings via addons

**What this PR does / why we need it**:
This removes the `cloud-provider` role and role binding from the RBAC bootstrapper and replaces them with a policy applied via addon mgr. This also creates a new clusterrole allowing the service account to create events for any namespace.

**Special notes for your reviewer**:
/assign @bowei @timstclair 
/cc timstclair

**Release note**:
```release-note
GCE: A role and clusterrole will now be provided with GCE/GKE for allowing the cloud-provider to post warning events on all services and watching configmaps in the kube-system namespace.
```
2018-02-16 16:31:40 -08:00
Kubernetes Submit Queue
31ea4c9981
Merge pull request #59936 from rramkumar1/local-up-cluster-ipvs
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move ipvs module loading logic in local-cluster-up.sh

**What this PR does / why we need it**:
This PR makes the module loading logic for ipvs kube-proxy a little more robust. Previously we were attempting to load the modules and not checking that it succeeded. Now we make sure the loading was successful before proceeding with using ipvs as the proxier.

/assign @cblecker 

Release Note
```release-note
None
```
2018-02-16 16:31:37 -08:00
Kubernetes Submit Queue
270ed995f4
Merge pull request #59841 from dashpole/metrics_after_reclaim
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reevaluate eviction thresholds after reclaim functions

**What this PR does / why we need it**:
When the node comes under `DiskPressure` due to inodes or disk space, the eviction manager runs garbage collection functions to clean up dead containers and unused images.
Currently, we use the strategy of trying to measure the disk space and inodes freed by garbage collection.  However, as #46789 and #56573 point out, there are gaps in the implementation that can cause extra evictions even when they are not required.  Furthermore, for nodes which frequently cycle through images, it results in a large number of evictions, as running out of inodes always causes an eviction.

This PR changes this strategy to call the garbage collection functions and ignore the results.  Then, it triggers another collection of node-level metrics, and sees if the node is still under DiskPressure.
This way, we can simply observe the decrease in disk or inode usage, rather than trying to measure how much is freed.
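
The strategy described above can be sketched roughly as follows. This is a minimal illustration, not the eviction manager's code: `nodeStats`, `thresholds`, `underDiskPressure`, and `reclaimThenRecheck` are hypothetical names, and the numbers are made up.

```go
package main

import "fmt"

// nodeStats is a hypothetical snapshot of node-level disk metrics.
type nodeStats struct {
	availableBytes  int64
	availableInodes int64
}

// thresholds holds hypothetical minimums below which the node counts as
// being under DiskPressure.
type thresholds struct {
	minBytes  int64
	minInodes int64
}

// underDiskPressure reports whether either resource is below its minimum.
func underDiskPressure(s nodeStats, t thresholds) bool {
	return s.availableBytes < t.minBytes || s.availableInodes < t.minInodes
}

// reclaimThenRecheck runs every reclaim function without trying to measure
// what each one freed, then takes a fresh observation and reports whether
// the node is still under pressure (that is, whether eviction is still needed).
func reclaimThenRecheck(reclaimFuncs []func() error, observe func() nodeStats, t thresholds) bool {
	for _, reclaim := range reclaimFuncs {
		if err := reclaim(); err != nil {
			fmt.Println("reclaim failed:", err)
		}
	}
	return underDiskPressure(observe(), t)
}

func main() {
	disk := nodeStats{availableBytes: 50, availableInodes: 10}
	// Hypothetical garbage collection that happens to free space and inodes.
	gcImages := func() error {
		disk.availableBytes += 100
		disk.availableInodes += 500
		return nil
	}
	observe := func() nodeStats { return disk }
	limits := thresholds{minBytes: 100, minInodes: 100}

	stillNeeded := reclaimThenRecheck([]func() error{gcImages}, observe, limits)
	fmt.Println("eviction still needed:", stillNeeded)
}
```

The point of the design is that the decision depends only on the second observation, so an inaccurate "bytes freed" estimate can no longer trigger an unnecessary eviction.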

**Which issue(s) this PR fixes**:
Fixes #46789
Fixes #56573
Related PR #56575

**Special notes for your reviewer**:
This will look cleaner after #57802  removes arguments from [makeSignalObservations](https://github.com/kubernetes/kubernetes/pull/57802/files#diff-9e5246d8c78d50ce4ba440f98663f3e9R719).

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority important-soon
cc @kubernetes/sig-node-pr-reviews
2018-02-16 16:31:33 -08:00
Kubernetes Submit Queue
b544314c2f
Merge pull request #59964 from nikhiljindal/kubemciComments
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updating code to use TempDir in manifest test

Follow up based on comments in https://github.com/kubernetes/kubernetes/pull/59234

```release-note
NONE
```

cc @MrHohn @madhusudancs @G-Harmon
2018-02-16 16:23:50 -08:00
Kubernetes Submit Queue
6efdc940e8
Merge pull request #59683 from oomichi/cleanup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove unused getClusterCIDR()

**What this PR does / why we need it**:

getClusterCIDR() has been unused since the PR 57305[1], so this
removes the method for code cleanup.

[1]: https://github.com/kubernetes/kubernetes/pull/57305

**Release note**: "NONE"
2018-02-16 15:41:26 -08:00
Kubernetes Submit Queue
cfa6d35c85
Merge pull request #59827 from dashpole/depreciate_cadvisor_port
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Deprecate kubelet flag for cadvisor port

**Which issue(s) this PR fixes**:
Issue: #56523
TL;DR the Kubelet's `stats/summary` API is the preferred way of monitoring the node.  If you need additional metrics from cAdvisor,  it can be run as a daemonset.

**Release note**:
```release-note
Deprecate the kubelet's cadvisor port
```

/assign @mtaufen @tallclair 
cc @kubernetes/sig-node-pr-reviews
2018-02-16 15:02:06 -08:00
Kubernetes Submit Queue
df92baf6e4
Merge pull request #59874 from dims/log-command-line-flags
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Log the command line flags

**What this PR does / why we need it**:

With d7ddcca231, we lost the logging
of the flags. We should at least log which command line flags
were used to start processes, as those are incredibly useful for troubleshooting.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
/assign @deads2k 
/assign @liggitt 

**Release note**:

```release-note
NONE
```
2018-02-16 14:22:25 -08:00
Kubernetes Submit Queue
930f86574f
Merge pull request #57885 from cimomo/kubelet-fixes
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve comments for kubelet

**What this PR does / why we need it**:
Improve comments and fix typos for kubelet.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 13:38:49 -08:00
Kubernetes Submit Queue
9df102b4e2
Merge pull request #59956 from mlmhl/fix_pv_controller_metric_e2e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Wait for bound pvc metric updated before validating

**What this PR does / why we need it**:

We should wait for both the bound PV and PVC metrics to be updated before validating the metric values (currently we only wait for the bound PV metric to be updated).

**Release note**:

```release-note
NONE
```

/sig storage
2018-02-16 12:06:10 -08:00
Rohit Ramkumar
ab53cb2429 Move ipvs module loading logic 2018-02-16 11:43:02 -08:00
nikhiljindal
0694dd7065 Updating code to use TempDir in manifest test 2018-02-16 11:18:27 -08:00
Kubernetes Submit Queue
244549f02a
Merge pull request #59769 from dashpole/capacity_ephemeral_storage
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Collect ephemeral storage capacity on initialization

**What this PR does / why we need it**:
We have had some node e2e flakes where a pod can be rejected if it requests ephemeral storage.  This is because we don't set capacity and allocatable for ephemeral storage on initialization.
This PR causes cAdvisor to do one round of stats collection during initialization, which will allow it to get the disk capacity when it first sets the node status.
It also sets the node to NotReady if capacities have not been initialized yet.

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
/assign @jingxu97 @Random-Liu 

/sig node
/kind bug
/priority important-soon
2018-02-16 11:17:02 -08:00
Kubernetes Submit Queue
9586cd06c2
Merge pull request #59920 from juju-solutions/bug/cleancredreq
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Clean-up not needed method in juju charms

**What this PR does / why we need it**: Improve code quality. Remove code that is not offering any functionality.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 10:35:36 -08:00
Kubernetes Submit Queue
eac5bc0035
Merge pull request #57136 from k82cn/k8s_54313
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updated PID pressure node condition.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:

```release-note
Updated PID pressure node condition
```
2018-02-16 10:35:33 -08:00
Kubernetes Submit Queue
d594a13d69
Merge pull request #59954 from msau42/index-sc
Automatic merge from submit-queue (batch tested with PRs 57700, 59954). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Index PVs by StorageClass in assume cache

**What this PR does / why we need it**:
Performance optimization for delayed binding in the scheduler to only search for PVs with a matching StorageClass name.  This means that if you prebind the PV to a PVC, the PV must have a matching StorageClass name.  This behavior is different from when you prebind with immediate binding, which doesn't care about StorageClass.
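
As a rough illustration of the indexing idea, the sketch below keeps a secondary index from StorageClass name to PV names so a lookup only touches matching PVs. The types `persistentVolume` and `pvCache` are hypothetical and much simpler than the scheduler's assume cache.

```go
package main

import "fmt"

// persistentVolume is a hypothetical, pared-down stand-in for a PV object.
type persistentVolume struct {
	name         string
	storageClass string
}

// pvCache stores every PV by name plus a secondary index keyed by
// StorageClass name.
type pvCache struct {
	byName  map[string]persistentVolume
	byClass map[string]map[string]struct{}
}

func newPVCache() *pvCache {
	return &pvCache{
		byName:  map[string]persistentVolume{},
		byClass: map[string]map[string]struct{}{},
	}
}

// add stores or updates a PV and keeps the StorageClass index consistent.
func (c *pvCache) add(pv persistentVolume) {
	if old, ok := c.byName[pv.name]; ok {
		// Drop the entry under the previous class before re-indexing.
		delete(c.byClass[old.storageClass], old.name)
	}
	c.byName[pv.name] = pv
	if c.byClass[pv.storageClass] == nil {
		c.byClass[pv.storageClass] = map[string]struct{}{}
	}
	c.byClass[pv.storageClass][pv.name] = struct{}{}
}

// listByClass returns only the PVs whose StorageClass name matches.
func (c *pvCache) listByClass(class string) []persistentVolume {
	var out []persistentVolume
	for name := range c.byClass[class] {
		out = append(out, c.byName[name])
	}
	return out
}

func main() {
	cache := newPVCache()
	cache.add(persistentVolume{name: "pv-a", storageClass: "fast"})
	cache.add(persistentVolume{name: "pv-b", storageClass: "slow"})
	cache.add(persistentVolume{name: "pv-c", storageClass: "fast"})
	fmt.Println("PVs in class fast:", len(cache.listByClass("fast")))
}
```

A PV with a different StorageClass name never appears in the candidate list, which is why a prebound PV must carry the matching class under delayed binding.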

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56102

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 09:24:30 -08:00
Kubernetes Submit Queue
f9c3a0abc7
Merge pull request #57700 from porridge/improve-msg-conn-kill
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve the error message.

**What this PR does / why we need it**:

Makes the error message more descriptive and less scary. Previously it
was far from obvious whether the connection kill is a symptom or the cause of the
problem, see for example https://github.com/kubernetes/kubernetes/issues/55779#issuecomment-353582852

In particular, the crucial missing piece of information is that this is a
way of handling a timeout.

**Release note**:
```release-note
NONE
```
2018-02-16 08:50:07 -08:00
David Ashpole
e0830d0b71 reevaluate eviction thresholds after reclaim functions 2018-02-16 08:35:24 -08:00
Kubernetes Submit Queue
11ecad2629
Merge pull request #59914 from iliastsi/feature-csi-stale-path
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

csi: Remove stale volume path

**What this PR does / why we need it**:

The CSI mounter creates the following paths during SetUp():

   * .../pods/\<podID\>/volumes/kubernetes.io~csi/\<specVolId\>/mount/
   * .../pods/\<podID\>/volumes/kubernetes.io~csi/\<specVolId\>/volume_data.json

During TearDown(), it does not remove the `.../kubernetes.io~csi/<specVolId>/`
directory, leaving behind orphan volumes: method cleanupOrphanedPodDirs()
complains with 'Orphaned pod found, but volume paths are still present
on disk'.

Fix that by removing the above directory in removeMountDir().

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 08:04:32 -08:00
Kubernetes Submit Queue
f223f90542
Merge pull request #59870 from deads2k/admission-21-decorator
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add an admission decorator chain

Admission decorators are good wrappers for general function, but we logically need a chain of them.  This builds a chain similar to admission.
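
The chaining idea can be sketched like this. The types `handler`, `decorator`, and `decoratorChain` are hypothetical simplifications and do not mirror the real admission interfaces.

```go
package main

import "fmt"

// handler is a hypothetical stand-in for an admission plugin.
type handler func(request string) error

// decorator wraps a named handler with extra behaviour.
type decorator func(name string, h handler) handler

// decoratorChain applies several decorators as one unit.
type decoratorChain []decorator

// decorate wraps h with every decorator in order, so the last decorator
// in the chain ends up outermost.
func (c decoratorChain) decorate(name string, h handler) handler {
	for _, d := range c {
		h = d(name, h)
	}
	return h
}

// withTrace returns a decorator that records its label before delegating.
func withTrace(label string, trace *[]string) decorator {
	return func(name string, h handler) handler {
		return func(request string) error {
			*trace = append(*trace, label+":"+name)
			return h(request)
		}
	}
}

func main() {
	var trace []string
	chain := decoratorChain{withTrace("metrics", &trace), withTrace("audit", &trace)}
	admit := chain.decorate("quota", func(request string) error {
		trace = append(trace, "admit:"+request)
		return nil
	})
	if err := admit("create pod"); err != nil {
		fmt.Println("rejected:", err)
	}
	fmt.Println(trace)
}
```

Because the chain itself has the same `decorate` shape as a single decorator, callers can pass either one without caring how many wrappers are involved.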

/assign @sttts 
@kubernetes/sig-api-machinery-pr-reviews
2018-02-16 07:23:39 -08:00
Kubernetes Submit Queue
0e81651e77
Merge pull request #59909 from jsafrane/volumemanager-approvers
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add jsafrane as AWS approver.

**What this PR does / why we need it**:
I contributed several PRs in AWS storage and I'm willing to share review/approval duty.

**Release note**:

```release-note
NONE
```

/assign @justinsb
2018-02-16 06:02:01 -08:00
Kubernetes Submit Queue
1db96b0e9b
Merge pull request #59668 from brycecarman/ccm-iam-role
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add AWS cloud provider option for IAM role

**What this PR does / why we need it**:
Adds the option to provide an IAM role ARN in the AWS cloud provider config file that should be assumed when communicating with the AWS APIs. 
For example, this allows running Controller Manager in an account separate from the worker nodes, but still allows all resources created to interact with the workers. ELBs created would be in the same account as the worker nodes, for instance.

**Which issue(s) this PR fixes** *(optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged)*:
Fixes #59526

**Special notes for your reviewer**:
None

**Release note**:

```release-note
Add AWS cloud provider option to use an assumed IAM role 
```
2018-02-16 06:01:45 -08:00
David Eads
1ae856484b add an admission decorator chain 2018-02-16 08:54:31 -05:00
Kubernetes Submit Queue
ada9400915
Merge pull request #59917 from gmarek/quotas
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add quotas to density and load tests

@kubernetes/sig-scalability-misc 

```release-note
NONE
```
2018-02-16 03:56:24 -08:00
Kubernetes Submit Queue
fc45081784
Merge pull request #59913 from bskiba/e2e-regional
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix cluster autoscaler test to support regional clusters.

**What this PR does / why we need it**:
Fixes cluster autoscaler e2e tests to work with regional clusters.

**Release note**:
```NONE```
2018-02-16 03:17:10 -08:00
Marek Grabowski
77a1268fed Add quotas to density and load tests 2018-02-16 09:53:26 +00:00
Kubernetes Submit Queue
1252dc66b0
Merge pull request #59808 from x13n/sd-version-bump
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

use prometheus-to-sd 0.2.4 and fluentd-gcp-image 2.0.16

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

cc @tallclair
2018-02-16 01:33:52 -08:00
Daniel Kłobuszewski
a88ddac1e4 use prometheus-to-sd 0.2.4 and fluentd-gcp-image 2.0.16 2018-02-16 09:16:59 +01:00
Kubernetes Submit Queue
430c1a68c8
Merge pull request #59955 from nikhiljindal/kubemcie2e
Automatic merge from submit-queue (batch tested with PRs 59809, 59955). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updating kubemci e2e test to not add kubeconfig flag for get-status

Follow up to https://github.com/kubernetes/kubernetes/pull/59234

Updating RunKubemciCmd to not add the --kubeconfig flag and adding a RunKubemciWithKubeconfig method that adds the kubeconfig param before calling RunKubemciCmd
And Updating get-status to use RunKubemciCmd instead of RunKubemciWithKubeconfig.

```release-note
NONE
```

cc @MrHohn @G-Harmon @madhusudancs
2018-02-15 22:42:33 -08:00
Kubernetes Submit Queue
01ec7a9eb8
Merge pull request #59809 from phsiao/59733_port_forward_with_target_port
Automatic merge from submit-queue (batch tested with PRs 59809, 59955). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubectl port-forward should resolve service port to target port

**What this PR does / why we need it**:

Continuing the work in #59705, this PR adds support for looking up the targetPort for a service, and enables using svc/name to select a pod.
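
A rough sketch of the lookup, assuming pared-down types: `servicePort`, `containerPort`, and `resolveTargetPort` are hypothetical and are not the functions added in pkg/kubectl/util/service_port.go.

```go
package main

import "fmt"

// servicePort is a hypothetical, simplified view of one service port.
type servicePort struct {
	port           int    // port exposed by the service
	targetPort     int    // numeric target port; 0 when unset or named
	targetPortName string // named target port; "" when numeric
}

// containerPort is a hypothetical, simplified view of one pod port.
type containerPort struct {
	name string
	port int
}

// resolveTargetPort maps a service port to the pod port that should
// receive forwarded traffic.
func resolveTargetPort(svc []servicePort, pod []containerPort, port int) (int, error) {
	for _, sp := range svc {
		if sp.port != port {
			continue
		}
		if sp.targetPortName != "" {
			// A named target port is looked up among the pod's ports.
			for _, cp := range pod {
				if cp.name == sp.targetPortName {
					return cp.port, nil
				}
			}
			return 0, fmt.Errorf("pod has no container port named %q", sp.targetPortName)
		}
		if sp.targetPort == 0 {
			// An unset target port defaults to the service port.
			return sp.port, nil
		}
		return sp.targetPort, nil
	}
	return 0, fmt.Errorf("service does not expose port %d", port)
}

func main() {
	svc := []servicePort{{port: 443, targetPortName: "https"}}
	pod := []containerPort{{name: "https", port: 8443}}
	target, err := resolveTargetPort(svc, pod, 443)
	fmt.Println(target, err)
}
```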

**Which issue(s) this PR fixes**:
Fixes #15180
Fixes #59733

**Special notes for your reviewer**:

I decided to create pkg/kubectl/util/service_port.go to contain two functions that might be re-usable.

**Release note**:
```release-note
`kubectl port-forward` now supports specifying a service to port forward to: `kubectl port-forward svc/myservice 8443:443`
```
2018-02-15 22:42:30 -08:00
Bryce Carman
3b99e1b487 Add AWS cloud provider option for IAM role
Currently the AWS cloud provider uses the EC2 instance role when
interacting with AWS APIs. This change gives the option to provide an IAM
role that the cloud provider will assume before calling the APIs. All
resources created by the role will be owned by that account instead of
the account where the EC2 instance is running.
2018-02-15 21:14:58 -08:00
Kubernetes Submit Queue
c105796e4b
Merge pull request #59953 from Random-Liu/fix-pod-scheduled
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix pod scheduled.

Fix `PodScheduled` condition.

The test `[k8s.io] EquivalenceCache [Serial] validates pod affinity works properly when new replica pod is scheduled` for cri-containerd is flaky.
The reason is that it assumes all existing pods should have the `PodScheduled` condition, but that is not the case:
```
Feb 15 15:31:01.359: INFO: with-label-390d246e-1265-11e8-beb8-0a580a3c7b55       bootstrap-e2e-minion-group-l6qw  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:30:59 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:31:00 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:30:59 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-7mzxc                                     bootstrap-e2e-minion-group-hztx  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 14:17:05 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 14:17:59 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-kvrsx                                     bootstrap-e2e-minion-group-l6qw  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:24:54 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:25:20 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-llwjh        
```

I'm not sure why this doesn't happen with docker. One theory is that we don't prepull images in cri-containerd and we start pods a bit faster there, which exposes the race condition.

/cc @kubernetes/sig-node-bugs 
Signed-off-by: Lantao Liu <lantaol@google.com>



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2018-02-15 20:16:44 -08:00
Kubernetes Submit Queue
f60083549a
Merge pull request #59944 from aveshagarwal/master-update-reviewers
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update reviewers for sig-scheduling.

@bsalamat @timothysc @kubernetes/sig-scheduling-misc 

**Release note**: 
```release-note
None
```
2018-02-15 20:16:41 -08:00
Kubernetes Submit Queue
99c87cf679
Merge pull request #59923 from jsafrane/volumemanager-logs
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Rework volume manager log levels

- all normal logs to go to level 4
- too frequent / duplicate logs go to level 5 (e.g. when something else logged similar message not too far away).

I checked that there is no excessive spam in the log - reconciler runs every 100ms, but it does not log anything if there is nothing to do.

**What this PR does / why we need it**:
This will help us debug flakes. E2e tests do not log levels 10-12 used in volume manager

**Release note**:

```release-note
NONE
```

/sig storage
/sig node
cc: @jingxu97 @sjenning
2018-02-15 20:16:38 -08:00
Kubernetes Submit Queue
72e1cf21c4
Merge pull request #59933 from mikedanese/rm-cert-controller
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

gke-certificates-controller: rm -rf

Fixes https://github.com/kubernetes/kubernetes/issues/53439

```release-note
NONE
```
2018-02-15 20:16:36 -08:00
Kubernetes Submit Queue
c7c5d89e32
Merge pull request #59873 from jsafrane/fix-downward-flake
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix DownwardAPI refresh race.

WaitForAttachAndMount should mark only pod in DesiredStateOfWorldPopulator (DSWP) and DSWP should mark the volume to be remounted only when the new pod has been processed.

Otherwise DSWP and reconciler race who gets the new pod first. If it's reconciler, then DownwardAPI and Projected volumes of the pod are not refreshed with new content and they are updated after the next periodic sync (60-90 seconds).

Fixes #59813 

/assign @jingxu97 @saad-ali 
/sig storage
/sig node

```release-note
None
```
2018-02-15 20:16:32 -08:00
Kubernetes Submit Queue
bfdd94c6a0
Merge pull request #59170 from cofyc/fix_kubelet_volume_metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix kubelet PVC stale metrics

**What this PR does / why we need it**:

Volumes on each node change, so we should not only add PVC metrics into a
gauge vector. It's better to use a collector to collect metrics from internal
stats.

Currently, if a PV (bound to a PVC `testpvc`) is attached and used by node A, then later migrated to node B or just deleted from node A, the `testpvc` metrics will not disappear from the kubelet on node A. After running for a long time, the `kubelet` process will keep a lot of stale volume metrics in memory.

For these dynamic metrics, it's better to use a collector to collect metrics from a data source (`StatsProvider` here), like [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) scraping metrics from kube-apiserver.
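
The difference between the two approaches can be sketched without any metrics library. Everything here is hypothetical: `pushRegistry` imitates setting gauges that are never removed, and `collect` imitates a collector that rebuilds its samples from the data source on each scrape.

```go
package main

import "fmt"

// statsProvider is a hypothetical data source returning current usage
// per PVC name.
type statsProvider func() map[string]float64

// pushRegistry imitates a gauge vector: values are set as volumes are
// seen, and nothing ever removes them.
type pushRegistry struct {
	gauges map[string]float64
}

func (r *pushRegistry) record(stats map[string]float64) {
	for pvc, used := range stats {
		r.gauges[pvc] = used
	}
}

// collect imitates a collector: the sample set is built from scratch on
// every scrape, so volumes that left the node simply stop appearing.
func collect(provider statsProvider) map[string]float64 {
	samples := map[string]float64{}
	for pvc, used := range provider() {
		samples[pvc] = used
	}
	return samples
}

func main() {
	volumes := map[string]float64{"testpvc": 10}
	provider := func() map[string]float64 { return volumes }

	registry := &pushRegistry{gauges: map[string]float64{}}
	registry.record(provider())

	// The volume is later deleted from this node.
	delete(volumes, "testpvc")
	registry.record(provider())

	fmt.Println("gauge entries after delete:", len(registry.gauges))
	fmt.Println("collector samples after delete:", len(collect(provider)))
}
```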

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/57686

**Special notes for your reviewer**:

**Release note**:

```release-note
Fix kubelet PVC stale metrics
```
2018-02-15 18:44:08 -08:00
nikhiljindal
d2fe556309 Updating kubemci e2e test to not add kubeconfig flag for get-status 2018-02-15 18:23:57 -08:00
mlmhl
dcbd1ae3cf wait for bound pvc metric updated before validating 2018-02-16 09:57:30 +08:00
David Ashpole
b259543985 collect ephemeral storage capacity on initialization 2018-02-15 17:33:22 -08:00
Michelle Au
5271edd9e2 Index PVs by StorageClass in assume cache 2018-02-15 17:12:32 -08:00
Lantao Liu
f69b4e9262 Fix pod scheduled.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-16 00:51:20 +00:00
Kubernetes Submit Queue
271c267fff
Merge pull request #59830 from khenidak/az-ratelimit
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Azure - ARM Read/Write rate limiting

**What this PR does / why we need it**:

Azure cloud provider currently runs with:
1. A single ARM rate limiter for both `read [put/post/delete]` and `write` operations, while ARM provides [different rates for read/write](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits). This causes write operations to stop even if write request quota is still available.
2. The cloud provider uses the rate limiter's `Accept()` instead of `TryAccept()`. This causes the control loop to wait for a prolonged time `in case of no request quota available` for **all** requests, even for those that do not require ARM interaction. For example, the `Service` control loop will wait for a prolonged time trying to create a `LoadBalancer` service even though it could fail and move on to the next service, which is `ClusterIP`. This PR moves the cloud provider to `TryAccept()`.

**Which issue(s) this PR fixes**:
Fixes # https://github.com/kubernetes/kubernetes/issues/58770

**Special notes for your reviewer**:
`n/a`

**Release note**:

```release-note
- Separate current ARM rate limiter into read/write
- Improve control over how ARM rate limiter is used within Azure cloud provider
```

cc @jackfrancis (need your help carefully reviewing this one) @brendanburns @jdumars
2018-02-15 16:43:37 -08:00
Kubernetes Submit Queue
281cb00776
Merge pull request #59939 from dims/avoid-calls-to-cloud-instances-unless-taint-present
Automatic merge from submit-queue (batch tested with PRs 59939, 59830). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Avoid call to get cloud instances

**What this PR does / why we need it**:

if a node does not have the taint, we really don't need to make calls
to get the list of instances from the cloud provider

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
Found when reviewing code for #59887

**Release note**:

```release-note
NONE
```
2018-02-15 16:43:34 -08:00
Davanum Srinivas
265e5ae085 Log the command line flags
With d7ddcca231, we lost the logging
of the flags. We should at least log which command line flags
were used to start processes, as those are incredibly useful for troubleshooting.
2018-02-15 18:04:04 -05:00
Kubernetes Submit Queue
cdecea5455
Merge pull request #59948 from JulienBalestra/revert-kubelet-pod-status
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: revert the get pod with updated status

**What this PR does / why we need it**:

Following #59892, this PR finishes reverting #57106.

The first revert didn't solve the reboot test in the [test grid](https://k8s-testgrid.appspot.com/google-gce#gci-gce-reboot).


**Special notes for your reviewer**:

cc @dashpole  @Random-Liu 

**Release note**:
```release-note
None
```
2018-02-15 14:51:53 -08:00
Nick Sardo
911a082d65 Add cloud-provider policies to be applied via addon mgr 2018-02-15 14:49:33 -08:00
JulienBalestra
493f335830 kubelet: revert the get pod status 2018-02-15 22:24:35 +01:00
Avesh Agarwal
9b19141281 Update reviewers for sig-scheduling. 2018-02-15 16:08:52 -05:00
Kubernetes Submit Queue
f88ff2ab41
Merge pull request #59937 from MrHohn/addon-manager-reviewer
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a reviewer to addon-manager

**What this PR does / why we need it**:
Would like to keep an eye on this until it goes away.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE 

**Special notes for your reviewer**:
/assign @mikedanese 

**Release note**:

```release-note
NONE
```
2018-02-15 12:17:42 -08:00