Commit Graph

1269 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
b29d9cdbcf Merge pull request #39898 from ixdy/bazel-release-tars
Automatic merge from submit-queue

Build release tars using bazel

**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.

For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```

**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.

Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. the nodes then use the `$binary.docker_tag` file to find the right image.

With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.

My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)

Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.

Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.

**Release note**:

```release-note
NONE
```
2017-01-18 14:24:48 -08:00
Kubernetes Submit Queue
6dfe5c49f6 Merge pull request #38865 from vwfs/ext4_no_lazy_init
Automatic merge from submit-queue

Enable lazy initialization of ext3/ext4 filesystems

**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #30752, fixes #30240

**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```

**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.

These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount

Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```

The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.

When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system. 

As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory. 

I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.

As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
2017-01-18 09:09:52 -08:00
Mik Vyatskov
83df5b8495 Remove library copying from fluentd image 2017-01-17 15:00:48 +01:00
Mik Vyatskov
5b96233423 Sync fluentd daemonset liveness probe with static pod liveness probe 2017-01-17 13:29:54 +01:00
Jordan Liggitt
7e98e06e48 Include system:masters group in the bootstrap admin client certificate 2017-01-16 14:01:24 -05:00
Jeff Grafton
bc4b6ac397 Build release tarballs in bazel and add make bazel-release rule 2017-01-13 16:17:44 -08:00
Kubernetes Submit Queue
31483bf546 Merge pull request #39770 from ixdy/ubuntu-slim-base-image
Automatic merge from submit-queue

Update images that use ubuntu-slim base image to :0.6

**What this PR does / why we need it**: `ubuntu-slim:0.4` is somewhat old, being based on Ubuntu 16.04, whereas `ubuntu-slim:0.6` is based on Ubuntu 16.04.1.

**Special notes for your reviewer**: I haven't pushed any of these images yet, so I expect all of the e2e builds to fail. If we're happy with the changes, I can push the images and then re-trigger tests.

**Release note**:

```release-note
NONE
```

cc @aledbf as FYI
2017-01-12 20:39:13 -08:00
Zihong Zheng
f62be637c8 Update kubectl to stable version for Addon Manager 2017-01-12 13:49:13 -08:00
Jeff Grafton
1c2ea28080 Update images that use ubuntu-slim base image to :0.6 2017-01-11 15:07:04 -08:00
Kubernetes Submit Queue
addc6cae4a Merge pull request #38212 from mikedanese/kubeletauth
Automatic merge from submit-queue (batch tested with PRs 38212, 38792, 39641, 36390, 39005)

Generate a kubelet CA and kube-apiserver cert-pair for kubelet auth.

cc @cjcullen
2017-01-10 19:48:09 -08:00
Piotr Szczesniak
da7b81c4d8 Added owners to monitoring and logging related directories 2017-01-10 12:14:10 +01:00
Kubernetes Submit Queue
f4a8713088 Merge pull request #36229 from wojtek-t/bump_etcd_version
Automatic merge from submit-queue (batch tested with PRs 36229, 39450)

Bump etcd to 3.0.14 and switch to v3 API in etcd.

Ref #20504

**Release note**:

```release-note
Switch default etcd version to 3.0.14.
Switch default storage backend flag in apiserver to `etcd3` mode.
```
2017-01-04 17:36:06 -08:00
CJ Cullen
d0997a3d1f Generate a kubelet CA and kube-apiserver cert-pair for kubelet auth.
Plumb through to kubelet/kube-apiserver on gci & cvm.
2017-01-03 14:30:45 -08:00
Kubernetes Submit Queue
2d15499984 Merge pull request #39151 from Crassirostris/fluentd-gcp-default-format
Automatic merge from submit-queue

Try parse golang logs by default

Glog by default logs to stderr, so Stackdriver Logging shows them all as errors. This PR makes fluentd try to parse messages using glog format and if succeeded, set timestamp and severity accordingly.

CC @piosz @fgrzadkowski
2017-01-03 05:50:33 -08:00
Yifan Gu
dd59aa1c3b cluster/gce: Rename coreos to container-linux. 2016-12-30 15:32:02 -08:00
Kubernetes Submit Queue
1f2f05df4b Merge pull request #39140 from kerneltime/master
Automatic merge from submit-queue

Remove kube-up for vsphere

**What this PR does / why we need it**:
Kube-up for vSphere does not work in master or 1.5 branch due to changes in networking model within kubernetes.
Kube-up is deprecated
Kube-up for vSphere is not being maintained instead the focus is on kubernetes-anywhere.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
fixes https://github.com/kubernetes/kubernetes/issues/37150, fixes https://github.com/kubernetes/kubernetes/issues/36440, fixes https://github.com/kubernetes/kubernetes/issues/27947, fixes https://github.com/kubernetes/kubernetes/issues/24407, fixes https://github.com/kubernetes/kubernetes/issues/22390, fixes https://github.com/kubernetes/kubernetes/issues/14368, fixes https://github.com/kubernetes/kubernetes/issues/14363, fixes https://github.com/kubernetes/kubernetes/issues/3630, fixes https://github.com/kubernetes/kubernetes/issues/22885


**Special notes for your reviewer**:

This is related to https://github.com/kubernetes/kubernetes.github.io/pull/2021

**Release note**:

```release-note
Remove the deprecated vsphere kube-up.
```
2016-12-29 11:38:08 -08:00
Dawn Chen
9d3f4d7152 Revert "Make fluentd pods critical" 2016-12-22 15:58:36 -08:00
Ritesh H Shukla
35a750ac21 Remove kube-up for vsphere 2016-12-22 20:15:37 +00:00
Kubernetes Submit Queue
0e9cb8f86b Merge pull request #39146 from Crassirostris/make-fluentd-critical
Automatic merge from submit-queue

Make fluentd pods critical

Related to https://github.com/kubernetes/kubernetes/issues/38322

Make fluentd critical so it will be evicted with less probability.

CC @piosz @fgrzadkowski
2016-12-22 05:17:25 -08:00
Kubernetes Submit Queue
a30b6e2d16 Merge pull request #38622 from Crassirostris/fluentd-liveness-probe
Automatic merge from submit-queue

Add liveness probe for fluentd-gcp

It's known that fluentd can hung up during execution until manual restart.

Liveness probe fixes this problem in the following way: if no buffer chunks were sent or created in the last 5 minutes, fluentd is hanging and should be restarted.

CC @piosz
2016-12-22 02:43:28 -08:00
Mik Vyatskov
4a7b3d7528 Make fluentd pods critical 2016-12-21 19:56:46 +01:00
deads2k
2e2a2e4b94 update gce for RBAC, controllers, proxy, kubelet (p1) 2016-12-21 13:51:49 -05:00
Mik Vyatskov
a3940ba874 Add liveness probe for fluentd-gcp 2016-12-21 17:32:28 +01:00
Mik Vyatskov
5105102940 Try parse golang logs by default 2016-12-20 17:31:13 +01:00
Wojciech Tyczynski
498a893fa3 Switch to etcd v3 API by default 2016-12-20 11:57:46 +01:00
Wojciech Tyczynski
76f115a8ee Bump etcd to 3.0.14 2016-12-20 11:57:45 +01:00
Alexander Block
13a2bc8afb Enable lazy initialization of ext3/ext4 filesystems 2016-12-18 11:08:51 +01:00
Kubernetes Submit Queue
699964c972 Merge pull request #38836 from bprashanth/kubelet_critical
Automatic merge from submit-queue

Admit critical pods in the kubelet

Haven't verified in a live cluster yet, just unittested, so applying do-not-merge label.
2016-12-16 17:21:46 -08:00
Kubernetes Submit Queue
e3c6ab1c8f Merge pull request #35582 from surajssd/use-daemonset-registry-proxy
Automatic merge from submit-queue

Use daemonset in docker registry add on

When using registry add on with kubernetes cluster it will be right to use `daemonset` to bring up a pod on each node of cluster, right now the docs suggests to bring up a pod on each node manually by dropping the pod manifests into directory `/etc/kubernetes/manifests`.
2016-12-16 12:29:46 -08:00
bprashanth
4fff49bb93 Make kube-proxy a critical pod 2016-12-15 18:58:13 -08:00
Piotr Szczesniak
a52637f09f Migrated fluentd to daemon set 2016-12-15 13:48:32 +01:00
Kubernetes Submit Queue
14e7b85b18 Merge pull request #38213 from Crassirostris/fluentd-gcp-logging-loop
Automatic merge from submit-queue (batch tested with PRs 38760, 38213)

Avoid exporting fluentd-gcp own logs

To prevent fluentd from exporting its own logs, redirect the output to a file. Ability to read fluentd logs remains, but because these logs will not be exported, we can increase the verbosity of these logs.

Same change should be made for fluentd-es image.

CC @piosz
2016-12-14 07:09:48 -08:00
Suraj Deshmukh
9afdfa2b74 Use daemonset in docker registry add on
Using daemonset to bring up a pod on each node of cluster,
right now the docs suggests to bring up a pod on each node by
manually dropping the pod manifests into directory /etc/kubernetes/manifests.
2016-12-14 19:22:03 +05:30
Mik Vyatskov
e52c3e77e2 Avoid exporting fluentd-gcp own logs 2016-12-14 14:43:05 +01:00
bprashanth
e4302a2b41 Bump up glbc version 2016-12-12 19:08:37 -08:00
Kubernetes Submit Queue
37cd01dc8c Merge pull request #38438 from MrHohn/addon-manager-coreos
Automatic merge from submit-queue

Keeps addon manager yamls in sync

From #38437.

We should have kept all addon manager YAML files in sync. This does not fix the release scripts issue, but we should still have this.

@mikedanese @ixdy
2016-12-11 11:41:35 -08:00
Kubernetes Submit Queue
d8c925319a Merge pull request #38523 from MrHohn/kube-dns-rename
Automatic merge from submit-queue (batch tested with PRs 38058, 38523)

Renames kube-dns configure files from skydns* to kubedns*

`skydns-` prefix and `-rc` suffix are confusing and misleading. Renaming it to `kubedns` in existing yaml files and scripts.

@bowei @thockin
2016-12-10 17:04:53 -08:00
Kubernetes Submit Queue
3d47fcc8ac Merge pull request #38286 from Crassirostris/fluentd-es-logging-loop
Automatic merge from submit-queue

Avoid exporting fluentd-es own logs

Follow-up of https://github.com/kubernetes/kubernetes/pull/38213 for fluentd-es version

CC @piosz
2016-12-09 05:27:05 -08:00
Zihong Zheng
4ad06df18f Renames kube-dns configure files from skydns* to kubedns* 2016-12-08 20:01:19 -08:00
Zihong Zheng
95910cc40b Keeps addon manager yamls in sync 2016-12-08 19:54:14 -08:00
Kubernetes Submit Queue
7f2622e668 Merge pull request #32663 from anguslees/extraroutes
Automatic merge from submit-queue

openstack: Implement the `Routes` provider API

``` release-note

Implement the Routes provider API for OpenStack using Neutron extraroute extension.  This removes the need for flannel/etc where supported.  To use, ensure all your nodes are on the same Neutron (private) network and specify the router ID in new `[Route]` section of provider config:

    [Route]
    router-id = <router UUID>
```
2016-12-07 21:36:13 -08:00
Euan Kemp
b8d2099b3f cluster: bindmount more cert paths
/etc/ssl/certs is currently mounted through in a number of places.
However, on Gentoo and CoreOS (and probably others), the files in
/etc/ssl/certs are just symlinks to files in /usr/share/ca-certificates.

For these components to correclty work, the target of the symlinks needs
to be available as well.

This is especially important for kube-controller-manager, where this
issue was noticed.

This change was originally part of #33965, but was split out for ease of
review.
2016-12-07 15:21:53 -08:00
Mik Vyatskov
a971941ee3 Avoid exporting fluentd-es own logs 2016-12-07 13:58:50 +01:00
Marcin Wielgus
af6b6a9af3 Bump Cluster Autoscaler to 0.4.0 2016-12-07 10:55:33 +01:00
Mike Danese
e225625a80 add a configuration for kubelet to register as a node with taints
and deprecate register-schedulable
2016-12-06 10:32:54 -08:00
gmarek
aef56cdf21 Increase max mutating inflight requests in large clusters 2016-12-05 09:33:05 +01:00
Angus Lees
29fadb3541 openstack-heat: Drop flannel for cloud Routes API 2016-12-05 15:24:01 +11:00
Kubernetes Submit Queue
ce4af7f0b5 Merge pull request #37941 from Crassirostris/fluentd-gcp-config-unification
Automatic merge from submit-queue (batch tested with PRs 37692, 37785, 37647, 37941, 37856)

Use unified gcp fluentd image for gci and cvm

Follow-up of https://github.com/kubernetes/kubernetes/pull/37681

Actually unify the pod specs for CVM and GCI, to simplify the configuration

CC @piosz
2016-12-03 11:45:02 -08:00
Dawn Chen
38a63e388d Set kernel.softlockup_panic =1 based on the flag. 2016-12-02 16:09:16 -08:00
Mik Vyatskov
74a3b77c73 Use unified gcp fluentd image for gci and cvm 2016-12-01 17:29:27 +01:00