Commit Graph

54770 Commits

Author SHA1 Message Date
Mik Vyatskov
ccf40abd50 Make advanced audit policy on GCP configurable 2017-09-13 14:36:26 +02:00
Marcin Wielgus
6ae3abd606 Bump Cluster Autoscaler to 0.7.0-beta1 2017-09-13 14:06:59 +02:00
Shyam Jeedigunta
6ae0eb8806 Fix bug with gke in logdump 2017-09-13 14:03:03 +02:00
Kubernetes Submit Queue
991afb2436 Merge pull request #52375 from jiayingz/deviceplugin-e2e
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)

Extends GPUDevicePlugin e2e test to exercise device plugin restarts.

**What this PR does / why we need it**:
This is part of issue #52189 but does not fix it.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-13 04:04:55 -07:00
Kubernetes Submit Queue
c9759ae318 Merge pull request #52289 from crassirostris/sd-logging-trim-long-lines
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)

[fluentd-gcp addon] Trim too long log entries due to Stackdriver limitations

Stackdriver doesn't support log entries bigger than 100KB, so by default the fluentd plugin simply drops such entries. To avoid losing them and to make the problem more visible, this change trims long lines instead.

/cc @igorpeshansky

```release-note
[fluentd-gcp addon] Fluentd will trim lines exceeding 100KB instead of dropping them.
```
2017-09-13 04:04:52 -07:00
Kubernetes Submit Queue
a789fc777f Merge pull request #52316 from jpbetz/salt-request-timeout-quickfix
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)

Small fix in salt manifest for kube-apiserver for request-timeout flag

**What this PR does / why we need it**:

Fixes a minor bug in salt manifest (typo from #51480)

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes
**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

xref: #51355
2017-09-13 04:04:50 -07:00
Mik Vyatskov
a9fb3c8efb Add new api groups to the GCE advanced audit policy 2017-09-13 12:07:48 +02:00
Aleksandra Malinowska
c173296632 log gcloud command error 2017-09-13 11:56:55 +02:00
Mik Vyatskov
d8525f8bd1 [fluentd-gcp addon] Trim too long log entries due to Stackdriver limitation 2017-09-13 10:27:17 +02:00
Kubernetes Submit Queue
be78d113b1 Merge pull request #52201 from timothysc/ephemeral_gate
Automatic merge from submit-queue

Version gates the ephemeral storage e2e test

Version gates the ephemeral storage e2e test.

**Release note**:
```
NONE
```

@kubernetes/sig-testing-pr-reviews
2017-09-12 23:24:42 -07:00
Kubernetes Submit Queue
dc02dfe560 Merge pull request #52301 from tallclair/psp-seccomp
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

'*' is valid for allowed seccomp profiles

**What this PR does / why we need it**:
This should be valid on a PodSecurityPolicy, but is currently rejected:
```
seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
```

**Which issue this PR fixes**: fixes #52300

```release-note
NONE
```
2017-09-12 21:46:02 -07:00
Kubernetes Submit Queue
83c2f358c9 Merge pull request #52360 from shyamjvs/add-debug-statements
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

Make log-dump use 'gcloud ssh' for GKE also

Fixes https://github.com/kubernetes/test-infra/issues/4323

I tested it locally (with some hacking for mimicking gke's DumpClusterLogs function in kubetest) and it worked.

cc @ericchiang
2017-09-12 21:45:59 -07:00
Kubernetes Submit Queue
c6a9b1e198 Merge pull request #52125 from yujuhong/fix-file-sync
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

dockershim: check if f.Sync() returns an error and surface it

```release-note
dockershim: check the error when syncing the checkpoint.
```
2017-09-12 21:45:56 -07:00
Kubernetes Submit Queue
e81aeb59aa Merge pull request #52343 from crassirostris/audit-policy-switch-to-beta
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

Switch default audit policy to beta and omit RequestReceived stage

Related to https://github.com/kubernetes/kubernetes/issues/52265

```release-note
By default, clusters on GCE no longer send RequestReceived audit events if advanced audit is configured.
```
2017-09-12 21:45:54 -07:00
Kubernetes Submit Queue
5bc9d7b412 Merge pull request #52339 from liggitt/alpha-test
Automatic merge from submit-queue (batch tested with PRs 52339, 52343, 52125, 52360, 52301)

Prevent enabling alpha APIs by default

related to #47691
This is a follow up to #51839 to add a check that we do not enable alpha APIs by default
2017-09-12 21:45:52 -07:00
Balaji Subramaniam
e2e356964a Make CPU manager release allocated CPUs when container enters completed phase. 2017-09-12 21:01:01 -07:00
Kubernetes Submit Queue
9636522137 Merge pull request #52352 from enisoc/sts-deflake
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

StatefulSet: Deflake e2e RunHostCmd more.

It turns out that at some points while the Node is recovering from a reboot, we get a different kind of error ("unable to upgrade connection"). Since we can't distinguish these transient errors from an error encountered after successfully executing the remote command, let's just retry all errors for 5min. If this doesn't work, I'm gonna blame it on sig-node.

ref #48031
2017-09-12 19:40:06 -07:00
Kubernetes Submit Queue
b04f81d342 Merge pull request #52344 from smarterclayton/no_log_pull
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

Log at higher verbosity levels some common SyncPod errors

This log message was 90% of all glog.Errorf level statements reported on a production cluster, hiding other more impactful errors. We already log it in start container, but for extra caution we continue to log it at v(3) here (the downside of not logging a start container error is worse than some log spam at higher levels).

HandleError() is intended only for unknown and unexpected errors.

```release-note
NONE
```

@derekwaynecarr @sjenning
2017-09-12 19:40:03 -07:00
Kubernetes Submit Queue
434fffb6e0 Merge pull request #52231 from mkumatag/guestbook_multiarch
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

Port Guestbook tests to multiarch

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52232 

**Special notes for your reviewer**:

**Release note**:

```NONE
NONE
```
2017-09-12 19:39:59 -07:00
Kubernetes Submit Queue
32f1521cc2 Merge pull request #52046 from dashpole/soft_eviction
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

[BugFix] Soft Eviction timer works correctly

fixes #51516

thresholdsMet should not exclude previously met thresholds when we do not have new stats for a threshold.

/assign @vishh @derekwaynecarr 
cc @kubernetes/sig-node-bugs
2017-09-12 19:39:55 -07:00
Kubernetes Submit Queue
83b4c0ac84 Merge pull request #48226 from wongma7/pd-predicate-log
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)

Log get PVC/PV errors in MaxPD predicate only at high verbosity

The error is effectively ignored since even if a PVC/PV doesn't exist it gets counted, and it's rarely actionable either so let's reduce the verbosity.

Basically, a user somewhere on the cluster has to have done something "wrong" for this error to occur, e.g. if, while the pod is running, the pod's PVC is deleted or the PVC's PV is deleted. From that point forward, the logs will be spammed every time the predicate is evaluated on a node where that "wrong" pod exists.

**Release note**:

```release-note
NONE
```
2017-09-12 19:39:52 -07:00
Kubernetes Submit Queue
58b126d98b Merge pull request #52248 from zjj2wry/set-env-list
Automatic merge from submit-queue

fix kubectl set env --list description

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2017-09-12 17:43:28 -07:00
Kubernetes Submit Queue
39659ac1dd Merge pull request #51252 from andyzhangx/azuredisk-windows
Automatic merge from submit-queue

Azuredisk mount on windows node

**What this PR does / why we need it**:
This PR enables Azure disk on Windows nodes, so that a customer can create a pod with an Azure disk mounted on a Windows node.
There are a few pending items still left:
1) The fstype is currently forced to NTFS; this will change if there is a requirement for it.
2) The GetDeviceNameFromMount function is not implemented (it is empty). On Linux, "cat /proc/mounts" lists all mount points in the OS, but Windows has no such place, and I am still figuring out an alternative. The empty function causes a few warning log messages, but it does not affect the main logic for now.

**Special notes for your reviewer**:
1. This PR depends on https://github.com/kubernetes/kubernetes/pull/51240, which allows Windows mount paths in config validation.
2. There is a bug in Docker on Windows (https://github.com/moby/moby/issues/34729): the ContainerPath can currently only be a drive letter (e.g. D:); a directory path fails in the end.

The example pod with mount path is like below:

```
kind: Pod
apiVersion: v1
metadata:
  name: pod-uses-shared-hdd-5g
  labels:
    name: storage
spec:
  containers:
  - image: microsoft/iis
    name: az-c-01
    volumeMounts:
    - name: blobdisk01
      mountPath: 'F:'
  nodeSelector:
    beta.kubernetes.io/os: windows
  volumes:
  - name: blobdisk01
    persistentVolumeClaim:
      claimName: pv-dd-shared-hdd-5
```

**Release note**:

```release-note
```
2017-09-12 17:43:13 -07:00
Jiaying Zhang
06b31849e1 Extends GPUDevicePlugin e2e test to exercise device plugin restarts. 2017-09-12 16:58:19 -07:00
Chao Xu
6c5a8d5db9 Remove the conversion of client config, because client-go is authoritative now 2017-09-12 16:02:17 -07:00
Shyam Jeedigunta
05fcefc0df Make log-dump use 'gcloud ssh' for GKE also 2017-09-13 00:14:57 +02:00
fabriziopandini
56d830776b fix Kubeadm phase addon 2017-09-12 23:52:20 +02:00
fabriziopandini
36562db310 fix kubeadm token create error 2017-09-12 23:30:14 +02:00
Kubernetes Submit Queue
108ee22096 Merge pull request #52305 from MrHohn/kube-proxy-ds-warning
Automatic merge from submit-queue

[GCE kube-up] Add a warning for kube-proxy DaemonSet option

**What this PR does / why we need it**:
Add a warning for the kube-proxy DaemonSet option in GCE kube-up so that users are aware of the risks.

Ref: https://github.com/kubernetes/kubernetes/issues/23225

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE 

**Special notes for your reviewer**:
/assign @bowei 

**Release note**:

```release-note
NONE
```
2017-09-12 13:53:44 -07:00
Kubernetes Submit Queue
523ec9c78a Merge pull request #52304 from luxas/deb_owners
Automatic merge from submit-queue

Add OWNERS for build/debs

**What this PR does / why we need it**:

Makes this directory reflect the actual ownership over this file.
@mikedanese, @pipejakob and myself have worked on the kubeadm e2e CI and the building of debs using bazel, which this folder is responsible for.

@jbeda is already implicitly an owner here

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
@kubernetes/sig-cluster-lifecycle-pr-reviews
2017-09-12 11:44:44 -07:00
Kubernetes Submit Queue
3ed9191ac7 Merge pull request #52164 from soltysh/set_job_image
Automatic merge from submit-queue

Update set image description to remove job from resources that can update container image

**What this PR does / why we need it**:
This addresses the comment raised in https://github.com/kubernetes/kubernetes/issues/48388#issuecomment-322500960 by @harrissAvalon

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2017-09-12 11:44:33 -07:00
Kubernetes Submit Queue
b05d8ad1ec Merge pull request #52338 from PiotrProkop/fix-hugepages
Automatic merge from submit-queue (batch tested with PRs 51041, 52297, 52296, 52335, 52338)

Fix pagesize mount option name

**What this PR does / why we need it**:
Fixes #52337  .
2017-09-12 11:10:18 -07:00
Kubernetes Submit Queue
36b43013c6 Merge pull request #52335 from crassirostris/sd-logging-add-metric
Automatic merge from submit-queue (batch tested with PRs 51041, 52297, 52296, 52335, 52338)

[fluentd-gcp addon] Restore the metric for the number of read log entries

This metric, previously removed, makes it possible to monitor the number of log entries that were read but were not sent by the output plugin because the liveness probe removed the data.
2017-09-12 11:10:15 -07:00
Kubernetes Submit Queue
51afd82cb8 Merge pull request #52296 from gnufied/fix-glusterfs-expand-unit
Automatic merge from submit-queue (batch tested with PRs 51041, 52297, 52296, 52335, 52338)

Glusterfs expands in units of GB not GiB

When expanding glusterfs volumes, we should use GB units not GiB.  More information - https://github.com/heketi/heketi/wiki/API

Fixes https://github.com/kubernetes/kubernetes/issues/52298 

```release-note
Fixes Glusterfs storage allocation units
```
2017-09-12 11:10:13 -07:00
Kubernetes Submit Queue
8e95e39c15 Merge pull request #52297 from derekwaynecarr/code-hygiene
Automatic merge from submit-queue (batch tested with PRs 51041, 52297, 52296, 52335, 52338)

Use cAdvisor constant for crio imagefs

**What this PR does / why we need it**:
code hygiene to use a constant from cAdvisor

**Release note**:
```release-note
NONE
```
2017-09-12 11:10:10 -07:00
Kubernetes Submit Queue
a63e3deec3 Merge pull request #51041 from balajismaniam/cpuman-e2e-tests
Automatic merge from submit-queue

Node e2e tests for the CPU Manager. 

**What this PR does / why we need it**:
- Adds node e2e tests for the CPU Manager implementation in https://github.com/kubernetes/kubernetes/pull/49186.

**Special notes for your reviewer**: 
- Previous PR in this series: #51180
- Only `test/e2e_node/cpu_manager_test.go` must be reviewed as a part of this PR (i.e., the last commit). Rest of the comments belong in #51357 and #51180.
- The tests have been run on `n1-standard-n4` and `n1-standard-n2` instances on GCE. 

To run this node e2e test, use the following command:
```sh
make test-e2e-node TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS="CPU Manager" SKIP="" PARALLELISM=1
```

CC @ConnorDoyle @sjenning
2017-09-12 10:46:06 -07:00
Anthony Yeh
bff5f7e6b0 StatefulSet: Deflake e2e RunHostCmd more.
It turns out that at some points while the Node is recovering from a
reboot, we get a different kind of error ("unable to upgrade
connection"). Since we can't distinguish these transient errors from an
error encountered after successfully executing the remote command,
let's just retry all errors for 5min. If this doesn't work, I'm gonna
blame it on sig-node.
2017-09-12 10:12:46 -07:00
Kubernetes Submit Queue
6b6b1e5779 Merge pull request #52291 from derekwaynecarr/fix-summary
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)

Summary tests should expect rss usage now

**What this PR does / why we need it**:
Fixes summary test to expect rss usage now.

Previously, cAdvisor reported rss and not total_rss, but that has been fixed in the most recent version of cAdvisor, which is now in the project.

See: https://github.com/kubernetes/kubernetes/pull/43399#issuecomment-287858599

**Release note**:
```release-note
NONE
```
2017-09-12 08:46:17 -07:00
Kubernetes Submit Queue
4775dae1c0 Merge pull request #52263 from crassirostris/event-exporter-metric-fix
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)

[fluentd-gcp addon] Update event-exporter to address metrics problem

Follow-up of https://github.com/GoogleCloudPlatform/k8s-stackdriver/pull/37:

```
In the clusters with CA, the number of metric streams will continuously grow if the host is included.
```

Name is updated b/c otherwise addon manager will not be able to pick up the change.
2017-09-12 08:46:15 -07:00
Kubernetes Submit Queue
1f072babe8 Merge pull request #52169 from dims/remove-links-to-specific-cloud-providers
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)

Remove links to GCE/AWS cloud providers from PersistentVolumeController

**What this PR does / why we need it**:

We should be able to build a cloud-controller-manager without having to
pull in code specific to GCE and AWS clouds. Note that this is a tactical
fix for now; we should allow a PVLabeler to be passed into the
PersistentVolumeController, maybe come up with better interfaces, etc. Since
it is too late to do all that for 1.8, we just move cloud-specific code
to where it belongs, and we check for the PVLabeler method and use it where
needed.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

Fixes #51629

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-12 08:46:12 -07:00
Kubernetes Submit Queue
63b3eea81e Merge pull request #52196 from luxas/kubeadm_enable_rotation
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)

kubeadm: Enable certificate rotation

**What this PR does / why we need it**:

Enables cert rotation as planned for the v1.8 cycle in https://github.com/kubernetes/kubeadm/issues/386
Can now be done as everything's in place in the code now that beta.1 is released with all the necessary features (Kubelet clientcert rotation now beta, woot!)

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

fixes: https://github.com/kubernetes/kubeadm/issues/386

**Special notes for your reviewer**:

This file does _ONLY_ affect the kubeadm e2e CI.
What will actually end up in the debs/rpms is going into kubernetes/release right before v1.8 is released (due to how those scripts work, not optimal :/ )

**Release note**:

```release-note
kubeadm: Enable kubelet client certificate rotation
```
@kubernetes/sig-cluster-lifecycle-pr-reviews @kubernetes/sig-auth-pr-reviews
2017-09-12 08:46:09 -07:00
Kubernetes Submit Queue
a4b7100c20 Merge pull request #52007 from oracle/for/upstream/master/ccm-sa-run-jitter
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)

Fixed CCM service controller start jitter

**What this PR does / why we need it**: The start jitter for the service controller was applied regardless of whether the service controller was being run. This should help startup time for CCMs without the service controller implementation. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/cc @wlan0 @andrewsykim @luxas @jhorwit2

/area cloudprovider
/sig cluster-lifecycle
2017-09-12 08:46:05 -07:00
Clayton Coleman
a5ac80cbce Log at higher verbosity levels some common SyncPod errors 2017-09-12 10:52:31 -04:00
Mik Vyatskov
0933f5c8e0 Switch default audit policy to beta and omit RequestReceived stage 2017-09-12 16:36:13 +02:00
Jordan Liggitt
d8bf50267a Prevent enabling alpha APIs by default 2017-09-12 09:48:03 -04:00
Kubernetes Submit Queue
13b9c9afd3 Merge pull request #52306 from luxas/kubeadm_selfhosting_alpha
Automatic merge from submit-queue (batch tested with PRs 52119, 52306)

kubeadm: Mark self-hosting alpha in v1.8

**What this PR does / why we need it**:

Self-hosting is alpha in v1.8, not beta. We targeted it to be beta, hence the initial value of this feature gate, but it is now being changed back to alpha.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
@kubernetes/sig-cluster-lifecycle-pr-reviews
2017-09-12 05:50:25 -07:00
Kubernetes Submit Queue
d8847a8f1d Merge pull request #52119 from mtaufen/sync-files
Automatic merge from submit-queue

fsync config checkpoint files after writing

@yujuhong brought up that it's possible for a hard reboot to result in empty checkpoint files, if they haven't been synced to disk yet. This PR ensures that Kubelet configuration checkpoints are synced after writing to avoid this issue.

fixes #52222

**Release note**:
```release-note
NONE
```
2017-09-12 05:41:25 -07:00
PiotrProkop
8465f96d8d Fix pagesize mount option name 2017-09-12 14:34:39 +02:00
Kubernetes Submit Queue
0ae98b6ffe Merge pull request #52146 from resouer/eclass-fix
Automatic merge from submit-queue

Note equivalence class for dev and other fix

**What this PR does / why we need it**:
1. Add a note for predicate developers to respect equivalence class design
2. Add comments and reorder the related data structure, ref https://github.com/kubernetes/community/pull/1031
3. Fix some nits (typo, code length etc)


**Special notes for your reviewer**:

**Release note**:

```release-note
Scheduler predicate developers should respect the equivalence class cache
```
2017-09-12 04:36:10 -07:00
Mik Vyatskov
683fc23000 [fluentd-gcp addon] Restore the metric for the number of read log entries 2017-09-12 13:24:55 +02:00