1711 Commits

Author SHA1 Message Date
Shyam Jeedigunta
cc8bb857f9 Allow creating special node for heapster in GCE 2017-06-28 21:27:36 +02:00
Kubernetes Submit Queue
63d4af44ac Merge pull request #48004 from dnardo/gke
Automatic merge from submit-queue (batch tested with PRs 48004, 48205, 48130, 48207)

Do not set CNI in cases where there is a private master and network policy provider is set.

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
In GCE and in a "private master" setup, do not set the network-plugin provider to CNI by default if a network policy provider is given.
```
2017-06-28 10:35:10 -07:00
gmarek
10ce8e2c0d Fix bug in cluster-subnet logic 2017-06-28 14:27:52 +02:00
Zach Loafman
903bc643b1 Bump GCE ContainerVM to container-vm-v20170627
Remove the built-in kubelet (finally), pick up security fixes.
2017-06-27 16:14:55 -07:00
Kubernetes Submit Queue
ede78d9ee7 Merge pull request #47513 from gmarek/subnet
Automatic merge from submit-queue

Make big clusters work again after introduction of subnets

This PR does two things:
  - make IP aliases automatically pick the Node IP Range based on the number of Nodes,
  - fix the logic for starting clusters with more than 4095 Nodes, which was broken by the introduction of subnets.

cc @wojtek-t @shyamjvs 

```release-note
Setting env var ENABLE_BIG_CLUSTER_SUBNETS=true will allow kube-up.sh to start clusters bigger than 4095 Nodes on GCE.
```
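
The first bullet above amounts to a sizing calculation: pick a range just large enough for the requested node count. The sketch below only illustrates that idea and is not the logic in cluster/gce/util.sh. The function names, the `10.40.0.0` base address, and the allowance of four reserved addresses per range are all assumptions made for the example.

```python
import ipaddress


def smallest_prefix(num_nodes: int, reserved: int = 4) -> int:
    """Return the longest IPv4 prefix length whose range holds num_nodes.

    `reserved` is an assumed allowance for addresses the network keeps
    for itself; it is not taken from the PR.
    """
    needed = num_nodes + reserved
    # (needed - 1).bit_length() is the number of host bits required to
    # give `needed` distinct addresses.
    host_bits = (needed - 1).bit_length()
    return 32 - host_bits


def node_ip_range(num_nodes: int, base: str = "10.40.0.0") -> ipaddress.IPv4Network:
    """Build a node range at a hypothetical base address."""
    return ipaddress.ip_network(f"{base}/{smallest_prefix(num_nodes)}")
```

Under these assumptions, 4092 nodes still fit in a /20, while 4095 nodes already need a /19. That boundary is the kind the PR title refers to.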

Ref https://github.com/kubernetes/kubernetes/issues/47344
2017-06-27 08:52:50 -07:00
Kubernetes Submit Queue
0dad2d0803 Merge pull request #47983 from yguo0905/memcg
Automatic merge from submit-queue (batch tested with PRs 48092, 47894, 47983)

Enables memcg notification in cluster/node e2e tests

Ref: https://github.com/kubernetes/kubernetes/issues/42676

This PR sets Kubelet flag `--experimental-kernel-memcg-notification=true` when running cluster/node e2e tests on COS and Ubuntu images.

Tested:
```
e2e-node-cos:
I0623 00:09:06.641776    1080 server.go:147] Starting server "kubelet" with command "/usr/bin/systemd-run --unit=kubelet-777178888.service --slice=runtime.slice --remain-after-exit /tmp/node-e2e-20170622T170739/kubelet --kubelet-cgroups=/kubelet.slice --cgroup-root=/ --api-servers http://localhost:8080 --address 0.0.0.0 --port 10250 --read-only-port 10255 --volume-stats-agg-period 10s --allow-privileged true --serialize-image-pulls false --pod-manifest-path /tmp/node-e2e-20170622T170739/pod-manifest571288056 --file-check-frequency 10s --pod-cidr 10.100.0.0/24 --eviction-pressure-transition-period 30s --feature-gates  --eviction-hard memory.available<250Mi,nodefs.available<10%%,nodefs.inodesFree<5%% --eviction-minimum-reclaim nodefs.available=5%%,nodefs.inodesFree=5%% --v 4 --logtostderr --network-plugin=kubenet --cni-bin-dir /tmp/node-e2e-20170622T170739/cni/bin --cni-conf-dir /tmp/node-e2e-20170622T170739/cni/net.d --hostname-override tmp-node-e2e-bfe5799d-cos-stable-59-9460-64-0 --experimental-mounter-path=/tmp/node-e2e-20170622T170739/cluster/gce/gci/mounter/mounter --experimental-kernel-memcg-notification=true"

e2e-node-ubuntu:
I0623 00:03:28.526984    2279 server.go:147] Starting server "kubelet" with command "/usr/bin/systemd-run --unit=kubelet-1407651753.service --slice=runtime.slice --remain-after-exit /tmp/node-e2e-20170622T170203/kubelet --kubelet-cgroups=/kubelet.slice --cgroup-root=/ --api-servers http://localhost:8080 --address 0.0.0.0 --port 10250 --read-only-port 10255 --volume-stats-agg-period 10s --allow-privileged true --serialize-image-pulls false --pod-manifest-path /tmp/node-e2e-20170622T170203/pod-manifest083943734 --file-check-frequency 10s --pod-cidr 10.100.0.0/24 --eviction-pressure-transition-period 30s --feature-gates  --eviction-hard memory.available<250Mi,nodefs.available<10%%,nodefs.inodesFree<5%% --eviction-minimum-reclaim nodefs.available=5%%,nodefs.inodesFree=5%% --v 4 --logtostderr --network-plugin=kubenet --cni-bin-dir /tmp/node-e2e-20170622T170203/cni/bin --cni-conf-dir /tmp/node-e2e-20170622T170203/cni/net.d --hostname-override tmp-node-e2e-e48cdd73-ubuntu-gke-1604-xenial-v20170420-1 --experimental-kernel-memcg-notification=true"

e2e-node-containervm:
I0623 00:14:35.392383    2774 server.go:147] Starting server "kubelet" with command "/tmp/node-e2e-20170622T171318/kubelet --runtime-cgroups=/docker-daemon --kubelet-cgroups=/kubelet --cgroup-root=/ --system-cgroups=/system --api-servers http://localhost:8080 --address 0.0.0.0 --port 10250 --read-only-port 10255 --volume-stats-agg-period 10s --allow-privileged true --serialize-image-pulls false --pod-manifest-path /tmp/node-e2e-20170622T171318/pod-manifest507536807 --file-check-frequency 10s --pod-cidr 10.100.0.0/24 --eviction-pressure-transition-period 30s --feature-gates  --eviction-hard memory.available<250Mi,nodefs.available<10%,nodefs.inodesFree<5% --eviction-minimum-reclaim nodefs.available=5%,nodefs.inodesFree=5% --v 4 --logtostderr --network-plugin=kubenet --cni-bin-dir /tmp/node-e2e-20170622T171318/cni/bin --cni-conf-dir /tmp/node-e2e-20170622T171318/cni/net.d --hostname-override tmp-node-e2e-9e3fdd7c-e2e-node-containervm-v20161208-image"

e2e-cos:
Jun 23 17:54:38 e2e-test-ygg-minion-group-t5r0 kubelet[2005]: I0623 17:54:38.646374    2005 flags.go:52] FLAG: --experimental-kernel-memcg-notification="true"

e2e-ubuntu:
Jun 23 18:25:27 e2e-test-ygg-minion-group-19qp kubelet[1547]: I0623 18:25:27.722253    1547 flags.go:52] FLAG: --experimental-kernel-memcg-notification="true"

e2e-containervm:
I0623 18:55:51.886632    3385 flags.go:52] FLAG: --experimental-kernel-memcg-notification="false"
```

**Release note**:
```
None
```

/sig node
/area node-e2e
/assign @dchen1107 @dashpole
2017-06-26 21:08:10 -07:00
gmarek
536f48ef15 Fix test commands in cluster/gce/util.sh 2017-06-26 21:27:04 +02:00
gmarek
64f6606833 Make big clusters work again after introduction of subnets 2017-06-26 21:27:04 +02:00
Yang Guo
50d49d9c51 Enables memcg notification in cluster/node e2e tests 2017-06-26 11:40:22 -07:00
Kubernetes Submit Queue
14edc46c2e Merge pull request #47892 from ajitak/npd-config
Automatic merge from submit-queue (batch tested with PRs 47993, 47892, 47591, 47469, 47845)

Bump up npd version to v0.4.1

```
Bump up npd version to v0.4.1
```

Fixes #47219
2017-06-23 18:05:46 -07:00
Kubernetes Submit Queue
de86a83535 Merge pull request #47993 from dnardo/ip-masq-agent
Automatic merge from submit-queue (batch tested with PRs 47993, 47892, 47591, 47469, 47845)

Use a different env var to enable the ip-masq-agent addon.

We shouldn't mix setting the non-masq-cidr with enabling the addon.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```

https://github.com/kubernetes/kubernetes/issues/47865
2017-06-23 18:05:44 -07:00
Daniel Nardo
00b62df554 Do not set CNI on a private master when enabling network policy. 2017-06-23 17:07:56 -07:00
Daniel Nardo
c75de4e39f Use a different env var to enable the ip-masq-agent addon.
We shouldn't mix setting the non-masq-cidr with enabling the addon.
2017-06-23 14:47:49 -07:00
Tim St. Clair
ed8ea24f65 Strip versions from known api groups in audit policy 2017-06-23 11:55:45 -07:00
Kubernetes Submit Queue
cdc9770346 Merge pull request #46792 from ianchakeres/avoid-redundant-copy-to-staging
Automatic merge from submit-queue (batch tested with PRs 47403, 46646, 46906, 46527, 46792)

Avoid redundant copying of tars during kube-up for gce if the same file already exists

**What this PR does / why we need it**: 

Whenever I execute cluster/kube-up.sh it copies my tar files to google cloud, even if the files haven't changed. This PR checks to see whether the files already exist, and avoids uploading them again. These files are large and can take a long time to upload.

**Which issue this PR fixes**: fixes #46791

**Special notes for your reviewer**:

Here is the new output:

cluster/kube-up.sh 
... Starting cluster in us-central1-b using provider gce
... calling verify-prereqs
... calling verify-kube-binaries
... calling kube-up
Project: PROJECT
Zone: us-central1-b
+++ Staging server tars to Google Storage: gs://kubernetes-staging-PROJECT/kubernetes-devel
+++ kubernetes-server-linux-amd64.tar.gz uploaded earlier, cloud and local file md5 match (md5 = 3a095kcf27267a71fe58f91f89fab1bc)
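
The check behind that last output line can be sketched as a hash comparison. This is a minimal illustration, not the code added to kube-up.sh: `file_md5` and `needs_upload` are hypothetical names, and the remote md5 is assumed to arrive as a hex string obtained separately from the storage service.

```python
import hashlib
from typing import Optional


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 of a file, read in chunks so large tars fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(local_path: str, remote_md5: Optional[str]) -> bool:
    """Decide whether the local tar must be copied to staging.

    remote_md5 is None when no staged copy exists (an assumption about
    how the caller reports a missing object).
    """
    if remote_md5 is None:
        return True
    return file_md5(local_path) != remote_md5.strip().lower()
```

When `needs_upload` returns `False`, the caller can skip the copy and print a message like the one shown above.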


**Release note**:
```cluster/kube-up.sh on gce now avoids redundant copying of kubernetes tars if the local and cloud files' md5 hash match```
2017-06-23 02:59:31 -07:00
Kubernetes Submit Queue
9e71b122f5 Merge pull request #47922 from dnardo/ip-masq-agent
Automatic merge from submit-queue

Remove limits from ip-masq-agent for now and disable ip-masq-agent in GCE

ip-masq-agent when issuing an iptables-save will read any configured iptables on the node.  This means that the ip-masq-agent's memory requirements would grow with the number of iptables (i.e. services) on the node.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#47865
**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-06-22 20:41:26 -07:00
Daniel Nardo
630fb9657a Remove limits from ip-masq-agent for now.
ip-masq-agent when issuing an iptables-save will read
any configured iptables on the node.  This means that
the ip-masq-agent's memory requirements would grow
with the number of iptables (i.e. services) on the node.

Disable ip-masq-agent in GCE
2017-06-22 17:01:22 -07:00
Tim St. Clair
dcdcb19c47 Don't audit log tokens in TokenReviews 2017-06-22 13:38:44 -07:00
Ajit Kumar
caff16c678 Bump up npd version to v0.4.1 2017-06-22 13:13:50 -07:00
Kubernetes Submit Queue
afa78083de Merge pull request #47794 from dnardo/ip-masq-agent
Automatic merge from submit-queue

Add ip-masq-agent readiness label by default.  

Since we are setting the non-masq-cidr in the kubelet to 0.0.0.0/0 we
need to ensure the ip-masq-agent runs.

pr/#46473 made the NON_MASQUERADE_CIDR default to 0.0.0.0/0 which means we need to have this label set now.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes #47752
**Special notes for your reviewer**:

**Release note**:

```release-note
ip-masq-agent is now the default for GCE
```
2017-06-20 23:26:30 -07:00
Kubernetes Submit Queue
26c431affa Merge pull request #47632 from mwielgus/node-taints-scripts
Automatic merge from submit-queue (batch tested with PRs 45268, 47573, 47632, 47818)

NODE_TAINTS in gce startup scripts

Currently there is no way to pass a list of taints that should be added on node registration (at least not in gce or other salt-based deployments). This PR adds the necessary plumbing to pass the taints from the user or instance group template to the kubelet startup flags.

```release-note
Taints support in gce/salt startup scripts. 
```

The PR was manually tested. 
```
NODE_TAINTS: 'dedicated=ml:NoSchedule'
```
in kube-env results in 
```
spec:
[...]
  taints:
  - effect: NoSchedule
    key: dedicated
    timeAdded: null
    value: ml
```
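
The mapping from the `NODE_TAINTS` string to the taint fields shown above can be illustrated with a small parser. This is a sketch, not the plumbing added by the PR: `parse_taints` is a hypothetical helper, and it assumes that multiple taints are comma-separated and that each has the form `key=value:effect`.

```python
from typing import Dict, List


def parse_taints(spec: str) -> List[Dict[str, str]]:
    """Split a taint string into key/value/effect dictionaries."""
    taints = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # The effect follows the last colon; the rest is key[=value].
        key_value, sep, effect = item.rpartition(":")
        if not sep or not effect:
            raise ValueError(f"taint {item!r} has no effect")
        key, _, value = key_value.partition("=")
        if not key:
            raise ValueError(f"taint {item!r} has no key")
        taints.append({"key": key, "value": value, "effect": effect})
    return taints
```

For the example above, `parse_taints("dedicated=ml:NoSchedule")` yields one entry with key `dedicated`, value `ml`, and effect `NoSchedule`.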

cc: @davidopp @gmarek @dchen1107 @MaciekPytel
2017-06-20 18:18:59 -07:00
Daniel Nardo
fc279e069e Add ip-masq-agent readiness label by default. Since we are
setting the non-masq-cidr in the kubelet to 0.0.0.0/0 we
need to ensure the ip-masq-agent runs.

Add node label pre-req back to ip-masq-agent.

Make gce test consistent with gce default scripts.
2017-06-20 16:19:50 -07:00
Kubernetes Submit Queue
d746cbbb39 Merge pull request #47634 from mwielgus/expander-price
Automatic merge from submit-queue (batch tested with PRs 46604, 47634)

Set price expander in Cluster Autoscaler for GCE

With CA 0.6 we will make the price-preferred node expander the default one for GCE. For other cloud providers we will stick to the default one (random) until the community implements the required interfaces in the CA repo.

https://github.com/kubernetes/autoscaler/issues/82

cc: @MaciekPytel @aleksandra-malinowska
2017-06-20 03:15:57 -07:00
Marcin Wielgus
9143569891 NODE_TAINTS in gce startup scripts 2017-06-20 00:51:56 +02:00
Kubernetes Submit Queue
c5f38f4478 Merge pull request #47669 from caseydavenport/fix-typha
Automatic merge from submit-queue

Set Typha replica count to 0 when Calico is not enabled

**What this PR does / why we need it**:
A replacement for https://github.com/kubernetes/kubernetes/pull/47624, which turned out not to be the right fix. 

**Which issue this PR fixes**
https://github.com/kubernetes/kubernetes/issues/47622

**Release note**:
```release-note
NONE
```
2017-06-19 15:06:02 -07:00
Marcin Wielgus
8d801d918d Set price expander in Cluster Autoscaler for gce 2017-06-19 23:52:47 +02:00
Kubernetes Submit Queue
cc645a8c6f Merge pull request #46327 from supereagle/mark-network-plugin-dir-deprecated
Automatic merge from submit-queue (batch tested with PRs 46327, 47166)

mark --network-plugin-dir deprecated for kubelet

**What this PR does / why we need it**:

**Which issue this PR fixes** : fixes #43967

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-06-19 11:23:54 -07:00
Casey Davenport
2ba0f1c211 Set Typha replica count to 0 when Calico is not enabled 2017-06-19 11:08:17 -07:00
Kubernetes Submit Queue
b6faf34862 Merge pull request #47530 from mindprince/issue-47388-remove-dead-code
Automatic merge from submit-queue (batch tested with PRs 47530, 47679)

Use cos-stable-59-9460-64-0 instead of cos-beta-59-9460-20-0.

Remove dead code that has now moved to another repo as part of #47467

**Release note**:
```release-note
NONE
```

/sig node
2017-06-16 20:57:58 -07:00
Kubernetes Submit Queue
d7e5a8b67e Merge pull request #47626 from Q-Lee/metadata-fix
Automatic merge from submit-queue (batch tested with PRs 47626, 47674, 47683, 47290, 47688)

The KUBE-METADATA-SERVER firewall must be applied before the universal tcp ACCEPT



**What this PR does / why we need it**: the metadata firewall rule was broken by being appended after the universal tcp accept.
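
Why ordering matters here can be shown with a toy first-match evaluator. This is a deliberately simplified model, not iptables itself and not the rules from the PR: the `(destination, target)` rule shape is an assumption for the illustration, and `169.254.169.254` is used as the metadata server address.

```python
from typing import List, Optional, Tuple

# (destination to match, or None to match anything; target to apply)
Rule = Tuple[Optional[str], str]

METADATA_IP = "169.254.169.254"


def first_match(rules: List[Rule], destination: str) -> str:
    """Return the target of the first rule that matches the destination."""
    for match_dest, target in rules:
        if match_dest is None or match_dest == destination:
            return target
    return "DEFAULT"


universal_accept: Rule = (None, "ACCEPT")
metadata_rule: Rule = (METADATA_IP, "KUBE-METADATA-SERVER")

# Appending puts the metadata rule after the catch-all, so it is never reached.
appended = [universal_accept, metadata_rule]
# Prepending lets the metadata rule see the traffic first.
prepended = [metadata_rule, universal_accept]
```

With `appended`, traffic to the metadata address hits `ACCEPT` first; with `prepended`, it reaches `KUBE-METADATA-SERVER`, while other traffic still falls through to `ACCEPT`.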

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-06-16 19:56:55 -07:00
Kubernetes Submit Queue
b00b6b97b7 Merge pull request #47606 from cheftako/requestCAFile
Automatic merge from submit-queue (batch tested with PRs 38751, 44282, 46382, 47603, 47606)

Working on fixing #43716.

This will create the necessary certificates.
On GCE it will upload those certificates to Metadata.
They are then pulled down onto the kube-apiserver.
They are written to the /etc/srv/kubernetes/pki directory.
Finally they are loaded via the appropriate command line flags.
The requestheader-client-ca-file can be seen by running the following:
kubectl get ConfigMap extension-apiserver-authentication
--namespace=kube-system -o yaml
Minor bug fixes.
Made sure AGGR_MASTER_NAME is set up in all configs.
Clean up variable names.
Added additional requestheader configuration parameters.
Added check so that if there is no Aggregator CA contents we won't start
the aggregator with the relevant flags.

**What this PR does / why we need it**:
This PR creates a request header CA. It also creates a proxy client cert/key pair.
It causes these files to end up on the kube-apiserver and sets the CLI flags so they are properly loaded.
Without it the customer either has to set them up themselves or re-use the master CA which is a security vulnerability.
Currently this creates everything on GCE.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43716

**Special notes for your reviewer**:
This is a reapply of pull/47094 with the GKE issue resolved.

**Release note**: None
2017-06-16 18:05:55 -07:00
Rohit Agarwal
3a86c97cf6 Use cos-stable-59-9460-64-0 instead of cos-beta-59-9460-20-0.
- It contains a fix for ipaliasing.
- It contains a fix which decouples GPU driver installation from kernel
version.

Remove dead code that has now moved to another repo as part of #47467
2017-06-16 13:48:50 -07:00
Quintin Lee
b886897f9d Prepend the metadata firewall in gce, so it isn't superseded. 2017-06-16 10:08:48 -07:00
Kubernetes Submit Queue
6742fda0bb Merge pull request #47624 from caseydavenport/fix-typha
Automatic merge from submit-queue

Don't start any Typha instances if not using Calico

**What this PR does / why we need it**:

Don't start any Typha instances if Calico isn't being used.  A recent change now includes all add-ons on the master, but we don't always want a Typha replica.

**Which issue this PR fixes**

Fixes https://github.com/kubernetes/kubernetes/issues/47622

**Release note**:
```release-note
NONE
```


cc @dnardo
2017-06-15 22:58:31 -07:00
Kubernetes Submit Queue
c8dc08ea87 Merge pull request #47562 from verult/VolumeDirFlag
Automatic merge from submit-queue (batch tested with PRs 47562, 47605)

Adding option in node start script to add "volume-plugin-dir" flag to kubelet.

**What this PR does / why we need it**: Adds a variable to allow specifying FlexVolume driver directory through cluster/kube-up.sh. Without this, the process of setting up FlexVolume in a non-default directory is very manual.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47561
2017-06-15 21:54:28 -07:00
Kubernetes Submit Queue
8e1cf60a48 Merge pull request #47481 from sakshamsharma/encprovenv
Automatic merge from submit-queue

Add encryption provider support via environment variables

These changes are needed to allow cloud providers to use the encryption providers as an alpha feature. The version checks can be done in the respective cloud providers.

Context: #46460 and #46916

@destijl @jcbsmpsn @smarterclayton
2017-06-15 20:39:57 -07:00
Casey Davenport
199ecdbbdd Don't start any Typha instances if not using Calico 2017-06-15 17:20:32 -07:00
Walter Fender
e06795533f Working on fixing #43716.
This will create the necessary certificates.
On GCE it will upload those certificates to Metadata.
They are then pulled down onto the kube-apiserver.
They are written to the /etc/srv/kubernetes/pki directory.
Finally they are loaded via the appropriate command line flags.
The requestheader-client-ca-file can be seen by running the following:
kubectl get ConfigMap extension-apiserver-authentication
--namespace=kube-system -o yaml
Minor bug fixes.
Made sure AGGR_MASTER_NAME is set up in all configs.
Clean up variable names.
Added additional requestheader configuration parameters.
Added check so that if there is no Aggregator CA contents we won't start
the aggregator with the relevant flags.
2017-06-15 10:48:34 -07:00
Kubernetes Submit Queue
b2d844bd77 Merge pull request #47492 from bowei/fix-gci-gcloud
Automatic merge from submit-queue

Fix dangling reference to gcloud alpha API for GCI (should be beta)

This reference to the alpha API was missed (fixed in GCE, but not GCI)

Fixes #47494

```release-note
none
```
2017-06-14 21:39:21 -07:00
Kubernetes Submit Queue
454233512d Merge pull request #47482 from timstclair/audit-policy
Automatic merge from submit-queue (batch tested with PRs 47510, 47516, 47482, 47521, 47537)

Fix typos in audit policy config

For kubernetes/features#22
2017-06-14 20:32:47 -07:00
Kubernetes Submit Queue
fa23890bd9 Merge pull request #47510 from mwielgus/allow-zero-size-migs
Automatic merge from submit-queue (batch tested with PRs 47510, 47516, 47482, 47521, 47537)

Allow autoscaler min at 0 in GCE

Allow scaling migs to zero in GCE startup scripts. This only makes sense when there is more than 1 mig. The main use case (for now) will be to test scaling to zero in e2e tests.
2017-06-14 20:32:43 -07:00
Saksham Sharma
a50114ac02 Add encryption provider support via env variables 2017-06-14 18:40:36 -07:00
Cheng Xing
6eecd3fb59 Adding option in node start script to add "volume-plugin-dir" flag to kubelet. 2017-06-14 17:56:06 -07:00
Dawn Chen
d6e1e21230 Revert "Set up proxy certs for Aggregator." 2017-06-14 13:44:34 -07:00
Ian Chakeres
b2450d2eb7 Moved gsutil_get_tar_md5 function before copy-to-staging function 2017-06-14 07:49:59 -07:00
Marcin Wielgus
5e390eff1a Allow autoscaler min at 0 in GCE 2017-06-14 07:36:18 +02:00
Bowei Du
f927946dea Fix dangling reference to gcloud alpha API for GCI (should be beta)
This reference to the alpha API was missed (fixed in GCE, but not GCI)
2017-06-13 21:52:34 -07:00
Tim St. Clair
947efaf2d7 Fix typos in audit policy config 2017-06-13 18:34:19 -07:00
Ian Chakeres
14391d3eb8 Moved md5 command to a separate function and added comments 2017-06-13 16:12:21 -07:00
Kubernetes Submit Queue
5d2dbb58d7 Merge pull request #46796 from mikedanese/gce-2
Automatic merge from submit-queue

enable Node authorizer and NodeRestriction admission controller

Fixes https://github.com/kubernetes/kubernetes/issues/46999
Fixes https://github.com/kubernetes/kubernetes/issues/47135

```release-note
gce kube-up: The `Node` authorization mode and `NodeRestriction` admission controller are now enabled
```
2017-06-13 02:03:14 -07:00