Commit Graph

2367 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
93055c7730 Merge pull request #65330 from freehan/neg-rate-limit
Automatic merge from submit-queue (batch tested with PRs 59214, 65330). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add rate limiting for NEG calls

```release-note
None
```
2018-06-25 18:19:04 -07:00
Kubernetes Submit Queue
3079c1df2f Merge pull request #65389 from Random-Liu/add-crictl-into-sudoer-path
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add /home/kubernetes/bin into sudoers path, so that `sudo crictl` works.

Add `/home/kubernetes/bin` to sudoers path, so that user can call `sudo crictl` directly.

Without this fix, user has to either use the full path `sudo /home/kubernetes/bin/crictl` or switch to root, which is not a good user experience.

/cc @yujuhong @feiskyer @filbranden @kubernetes/sig-node-pr-reviews @kubernetes/sig-gcp-pr-reviews 
**Release note**:

```release-note
User can now use `sudo crictl` on GCE cluster.
```
2018-06-23 00:00:53 -07:00
Lantao Liu
2af997470f Add /home/kubernetes/bin into sudoers path, so that sudo crictl works. 2018-06-22 17:10:55 -07:00
Jeff Grafton
23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Minhan Xia
760e17542c add rate limiting for NEG calls 2018-06-22 11:16:07 -07:00
Kubernetes Submit Queue
b48339704f Merge pull request #65024 from jingax10/calico_custom_branch
Automatic merge from submit-queue (batch tested with PRs 65024, 65287, 65345, 64693, 64941). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a helper function to customize K8s addon yamls and use it to customize Calico addons on GKE

**What this PR does / why we need it**:

Allow customizing Calico addon in GCP. With #65022, this allows us to do a couple of things:, e.g., run Calico 3.0+ on GCP, use a non-default MTU etc.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65045, #65067

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-22 06:10:10 -07:00
Kubernetes Submit Queue
ea0c683e2d Merge pull request #65301 from wojtek-t/heapster_node_first
Automatic merge from submit-queue (batch tested with PRs 65301, 65291, 65307, 63845, 65313). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Create heapster node first

This should help with mitigating failures like this:
https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-correctness/127/build-log.txt
2018-06-22 03:08:07 -07:00
Kubernetes Submit Queue
7888a34f47 Merge pull request #65176 from kawych/master
Automatic merge from submit-queue (batch tested with PRs 65123, 65176, 65139, 65084, 65056). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Pass cluster_location argument to Heapster

**What this PR does / why we need it**:
Fixes Stackdriver monitoring on GCE clusters where cluster location is not a single zone, for example regional clusters.

**Release note**:
```release-note
Pass cluster_location argument to Heapster
```
2018-06-21 16:56:10 -07:00
Jing Ai
4dce8973ce Add a helper function to customize K8s addon yamls and use it to customize Calico addons on GKE. 2018-06-21 15:07:22 -07:00
wojtekt
226aa7306c Create heapster node first 2018-06-21 11:00:18 +02:00
Lantao Liu
e862da1709 Update crictl to v1.11.0. 2018-06-19 18:04:15 -07:00
Aleksandra Malinowska
e9611b5b00 Cluster Autoscaler 1.3.0 2018-06-19 15:58:06 +02:00
Karol Wychowaniec
eefdff659d Pass cluster_location argument to Heapster 2018-06-18 13:54:22 +02:00
Aleksandra Malinowska
4be77c5fea Update Cluster Autoscaler to v1.3.0-beta.2 2018-06-15 19:18:13 +02:00
immutablet
02e57ac118 Add kms-plugin-container.manifest to release manifest tarball. 2018-06-12 16:04:20 -07:00
Kubernetes Submit Queue
8e03228c1a Merge pull request #64643 from dashpole/memcg_poll
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use unix.EpollWait to determine when memcg events are available to be Read

**What this PR does / why we need it**:
This fixes a file descriptor leak introduced in https://github.com/kubernetes/kubernetes/pull/60531 when the `--experimental-kernel-memcg-notification` kubelet flag is enabled.  The root of the issue is that `unix.Read` blocks indefinitely when reading from an event file descriptor and there is nothing to read.  Since we refresh the memcg notifications, these reads accumulate until the memcg threshold is crossed, at which time all reads complete.  However, if the node never comes under memory pressure, the node can run out of file descriptors.

This PR changes the eviction manager to use `unix.EpollWait` to wait, with a 10 second timeout, for events to be available on the eventfd.  We only read from the eventfd when there is an event available to be read, preventing an accumulation of `unix.Read` threads, and allowing the event file descriptors to be reclaimed by the kernel.

This PR also breaks the creation, and updating of the memcg threshold into separate portions, and performs creation before starting the periodic synchronize calls.  It also moves the logic of configuring memory thresholds into memory_threshold_notifier into a separate file.

This also reverts https://github.com/kubernetes/kubernetes/pull/64582, as the underlying leak that caused us to disable it for testing is fixed here.

Fixes #62808

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority critical-urgent
2018-06-11 17:29:19 -07:00
Kubernetes Submit Queue
ec434662bd Merge pull request #64503 from kgolab/kg-ca-rbac
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Create system:cluster-autoscaler account & role and introduce it to C…

**What this PR does / why we need it**:

This PR adds cluster-autoscaler ClusterRole & binding, to be used by the Cluster Autoscaler (kubernetes/autoscaler repository).
It also updates GCE scripts to make CA use the cluster-autoscaler user account.

User account instead of Service account is chosen to be more in line with kube-scheduler.

**Which issue(s) this PR fixes**:

Fixes [issue 383](https://github.com/kubernetes/autoscaler/issues/383) from kubernetes/autoscaler.

**Special notes for your reviewer**:

This PR might be treated as a security fix since prior to it CA on GCE was using system:cluster-admin account, assumed due to default handling of unsecured & unauthenticated traffic over plain HTTP.

**Release note**:

```release-note
A cluster-autoscaler ClusterRole is added to cover only the functionality required by Cluster Autoscaler and avoid abusing system:cluster-admin role.

action required: Cloud providers other than GCE might want to update their deployments or sample yaml files to reuse the role created via add-on.
```
2018-06-11 17:29:13 -07:00
Kubernetes Submit Queue
de8cc31355 Merge pull request #64977 from aleksandra-malinowska/cluster-autoscaler-1.3.0-beta.1
Automatic merge from submit-queue (batch tested with PRs 64945, 64977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cluster Autoscaler 1.3.0-beta.1

Update Cluster Autoscaler version to 1.3.0-beta.1

```release-note
Update Cluster Autoscaler version to 1.3.0-beta.1. Release notes: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.3.0-beta.1
```
2018-06-11 12:38:14 -07:00
Karol Gołąb
9e2fa69d20 Limit the mounted directory to cluster-autoscaler/ 2018-06-11 21:03:47 +02:00
Aleksandra Malinowska
77a6892e92 Cluster Autoscaler 1.3.0-beta.1 2018-06-11 15:22:10 +02:00
Karol Gołąb
faa4dc39c4 Disambiguate a comment 2018-06-11 10:56:02 +02:00
Kubernetes Submit Queue
c2b27efd3b Merge pull request #60699 from CaoShuFeng/remove-enable-custom-metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove deprecated option '--enable-custom-metrics'

**Release note**:
```release-note
deprecated and inactive option '--enable-custom-metrics' is removed in 1.11
```
2018-06-08 11:23:02 -07:00
Karol Gołąb
c70b554af9 Create system:cluster-autoscaler account & role and introduce it to CA start-up script 2018-06-08 14:15:52 +02:00
David Ashpole
796b31edcc re-enable memcg for testing on gce 2018-06-07 13:03:38 -07:00
Kubernetes Submit Queue
e2d997cfea Merge pull request #64276 from wangzhen127/manifests-seccomp
Automatic merge from submit-queue (batch tested with PRs 64276, 64094, 64719, 64766, 64750). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use default seccomp profile for GCE manifests

**What this PR does / why we need it**:
This PR sets the default seccomp profile of unprivileged addons to 'docker/default' for GCE manifests. This PR is a followup of #62662. We are using 'docker/default' instead of 'runtime/default' in addons in order to handle node version skew. When seccomp profile is applied automatically by default later, we can remove those annotations.

This is PR is part of #39845.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-05 11:35:10 -07:00
Cao Shufeng
d767ce24a9 remove deprecated option '--enable-custom-metrics' 2018-06-05 11:19:23 +08:00
Kubernetes Submit Queue
898831ad9d Merge pull request #64592 from ravisantoshgudimetla/revert-64364-remove-rescheduler
Automatic merge from submit-queue (batch tested with PRs 63453, 64592, 64482, 64618, 64661). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "Remove rescheduler and corresponding tests from master"

Reverts kubernetes/kubernetes#64364

After discussing with @bsalamat on how DS controllers(ref: https://github.com/kubernetes/kubernetes/pull/63223#discussion_r192277527) cannot create pods if the cluster is at capacity and they have to rely on rescheduler for making some space, we thought it is better to 

- Bring rescheduler back.
- Make rescheduler priority aware.
- If cluster is full and if **only** DS controller is not able to create pods, let rescheduler be run and let it evict some pods which have less priority.
- The DS controller pods will be scheduled now.

So, I am reverting this PR now. Step 2, 3 above are going to be in rescheduler.

/cc @bsalamat @aveshagarwal @k82cn 

Please let me know your thoughts on this. 

```release-note
Revert #64364 to resurrect rescheduler. More info https://github.com/kubernetes/kubernetes/issues/64725 :)
```
2018-06-04 16:56:11 -07:00
Kubernetes Submit Queue
4f088e6263 Merge pull request #64591 from cadmuxe/custom_netd
Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add netd as an addon for GCP

**What this PR does / why we need it**:
Add netd as an addon for GKE.
The PR will add setup functions and var to help deploy netd daemon on GKE.
Please checkout more detail for netd at https://github.com/GoogleCloudPlatform/netd

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-06-04 12:26:16 -07:00
Kubernetes Submit Queue
36a3daa355 Merge pull request #61610 from rajansandeep/kubeupaddon
Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Create CoreDNS and kube-dns folders

**What this PR does / why we need it**:
Separate the CoreDNS and kube-dns manifests by creating their own folders (dns/coredns and dns/kube-dns) 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #61435 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
cc @MrHohn
2018-06-04 12:26:12 -07:00
Koonwah Chen
daf5e15535 add NON_MASTER_NODE_LABELS to config-test.sh 2018-06-03 20:47:26 -07:00
Koonwah Chen
37059e7efa Code clean up 2018-06-03 19:41:47 -07:00
Koonwah Chen
bb8272ead4 support netd on k8s 2018-06-03 01:35:27 -07:00
Kubernetes Submit Queue
586e558c3b Merge pull request #59938 from rramkumar1/gce-cluster-up-ipvs
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add ipvs module loading logic to gce scripts

**What this PR does / why we need it**:
Add ipvs module loading logic to gce scripts. 
Fixes a part of #59402.

/cc @Lion-Wei 
/assign @roberthbailey @m1093782566 

**Release note**:
```release-note
None
```
2018-05-31 20:55:44 -07:00
Koonwah Chen
d903d32856 Add netd as an addon for GKE. 2018-05-31 19:25:15 -07:00
RaviSantosh Gudimetla
872addf9e3 Revert "Remove rescheduler and corresponding tests from master" 2018-05-31 22:18:49 -04:00
Kubernetes Submit Queue
a7998a2a0e Merge pull request #64292 from awly/gce-pull-exec-plugin
Automatic merge from submit-queue (batch tested with PRs 64582, 64292). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Pull gke-exec-auth-plugin binary on Nodes

If the plugin URL is set and VM is not master, pull the plugin binary.

**What this PR does / why we need it**: implement deployment of https://github.com/kubernetes/cloud-provider-gcp/tree/master/cmd/gke-exec-auth-plugin on Node VMs.

**Release note**:
```release-note
NONE
```
2018-05-31 19:04:03 -07:00
Kubernetes Submit Queue
01e21b8516 Merge pull request #64582 from dashpole/turn_off_memcg
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Disable memcg for testing prior to 1.11 release

**What this PR does / why we need it**:
Turn off kubelet memory cgroup notifications on GCE to unblock scalability testing.
Related issue: #62808 

```release-note
NONE
```
/sig node
/kind bug
/priority critical-urgent
/assign @shyamjvs @yujuhong
2018-05-31 19:00:28 -07:00
Andrew Lytvynov
7e444a453b Quote shell variable expansion 2018-05-31 16:04:19 -07:00
David Ashpole
c844b9afc4 disable memcg for testing prior to 1.11 release 2018-05-31 15:25:58 -07:00
Zhen Wang
227f7d761d Use default seccomp profile for GCE manifests 2018-05-31 10:35:26 -07:00
Rohit Ramkumar
cc87e73dd8 Add ipvs module loading logic to gce scripts 2018-05-31 08:40:05 -07:00
ravisantoshgudimetla
7559a3678b Build files generated 2018-05-29 20:04:43 -04:00
ravisantoshgudimetla
aeccffc339 Phase out rescheduler in favor of priority and preemption 2018-05-29 19:52:06 -04:00
Sandeep Rajan
753632d85b create coredns and kube-dns folders 2018-05-29 11:52:57 -04:00
Kubernetes Submit Queue
930b3939f1 Merge pull request #64294 from vishh/shutdown-script
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding a shutdown script that would enable handling preemptible VM terminations gracefully in GCP environment

This PR adds a shutdown script to COS nodes in GCP k8s clusters that will make preemptible nodes sleep for however long they can between the time they receive an ACPI shutdown request and get's terminated.
https://cloud.google.com/compute/docs/instances/preemptible#preemption_process

This will then allow for catching termination signals via GCE metadata APIs and gracefully evict pods in k8s.

xref https://github.com/kubernetes/release/pull/560/
2018-05-25 22:33:33 -07:00
Vishnu kannan
9475292cd8 Adding a shutdown script that would enable handling preemptible VM terminations gracefully in GCP environment
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2018-05-25 16:20:24 -07:00
Kubernetes Submit Queue
d7c40cf69e Merge pull request #64275 from mtaufen/dkcfg-beta
Automatic merge from submit-queue (batch tested with PRs 63417, 64249, 64242, 64128, 64275). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

graduate DynamicKubeletConfig feature to beta

Everything in the https://github.com/kubernetes/features/issues/281 beta target except a few documentation-related items is complete. Docs should be done very soon, so I'm sending this PR to graduate to beta before freeze.

```release-note
The dynamic Kubelet config feature is now beta, and the DynamicKubeletConfig feature gate is on by default. In order to use dynamic Kubelet config, ensure that the Kubelet's --dynamic-config-dir option is set. 
```

/cc @luxas
2018-05-24 20:49:22 -07:00
Kubernetes Submit Queue
e299a5ea90 Merge pull request #63904 from hzxuzhonghu/gce-alpha-feature
Automatic merge from submit-queue (batch tested with PRs 64060, 63904, 64218, 64208, 64247). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert enable PodPreset admission and also enable settings.k8s.io/v1a…

…lpha1 api resource



**What this PR does / why we need it**:

Enable PodPreset admission for there are alpha feature test cases covering it.  Simultaneously enable sttings.k8s.io/v1alpha1 api resource.

Fixes #63843 

**Release note**:

```release-note
NONE
```
2018-05-24 17:01:14 -07:00
Andrew Lytvynov
1f7671b18d Pull gke-exec-auth-plugin binary on Nodes
If the plugin URL is set and VM is not master, pull the plugin binary.
2018-05-24 15:08:35 -07:00
Kubernetes Submit Queue
972a74e238 Merge pull request #63755 from tomoe/dumpstack-docker
Automatic merge from submit-queue (batch tested with PRs 63434, 64172, 63975, 64180, 63755). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Dump Stack when docker fails on healthcheck

Save stack dump of docker daemon in order to be able to
investigate why docker daemon was unresposive to `docker ps`

See https://github.com/moby/moby/blob/master/daemon/daemon.go on
how docker sets up a trap for SIGUSR1 with `setupDumpStackTrap()`

**What this PR does / why we need it**:

This allows us to investigate why docker daemon was unresponsive to "docker ps" command. 

**Special notes for your reviewer**:
Manually tested on Ubuntu and COS.

**Release note**:

```release-note
NONE
```
2018-05-24 12:18:25 -07:00