Commit Graph

48823 Commits

Author SHA1 Message Date
David Ashpole
20eb016597 dont attach a GPU to ubuntu machines 2017-05-23 14:34:18 -07:00
Nick Sardo
729303f0de Watching ClusterId from within GCE cloud provider 2017-05-23 14:11:24 -07:00
Random-Liu
dc023144a3 Move docker validation test to separate project. 2017-05-23 14:07:15 -07:00
Kubernetes Submit Queue
98c66f9fca Merge pull request #46267 from Random-Liu/extend-stop-container-request-timeout
Automatic merge from submit-queue

Double `StopContainer` request timeout.

Doubled `StopContainer` request timeout to leave some time for `SIGKILL` container.

@yujuhong @feiskyer
2017-05-23 14:06:23 -07:00
p0lyn0mial
c5019bf696 remove init blocks from all admission plugins 2017-05-23 22:00:32 +02:00
Andy Goldstein
23e32b100f Fix in-cluster kubectl --namespace override
Before this change, if the config was empty, ConfirmUsable() would
return an "invalid configuration" error instead of examining and
honoring the value of the --namespace flag. This change looks at the
overrides first, and returns the overridden value if it exists before
attempting to check if the config is usable. This is most applicable to
in-cluster clients, where they don't have a kubeconfig but they do have
a token and can use KUBERNETES_SERVICE_HOST/_PORT.
2017-05-23 15:56:48 -04:00
Tim St. Clair
7bc9b30049
Generated code 2017-05-23 12:44:41 -07:00
Kubernetes Submit Queue
f8815c96e0 Merge pull request #46285 from yastij/scheduling-delete-todo
Automatic merge from submit-queue

removing generic_scheduler todo after discussion (#46027)

**What this PR does / why we need it**:

**Which issue this PR fixes** #46027 

**Special notes for your reviewer**: just a quick clean cc @wojtek-t 

**Release note**:
```release-note
```
2017-05-23 12:43:15 -07:00
Tim St. Clair
6875e95378
Append X-Forwarded-For in proxy handler 2017-05-23 12:40:01 -07:00
Andy Goldstein
3b69884843 Use storage instead of REST for the CRD finalizer
Switch the custom resource definition finalizer controller to use
storage instead of a REST client, because a client could incorrectly try
to delete ThirdPartyResources whose names happen to collide with the
CustomResource instances.
2017-05-23 14:14:55 -04:00
Kubernetes Submit Queue
1e2105808b Merge pull request #45136 from vishh/cos-nvidia-driver-install
Automatic merge from submit-queue

Enable "kick the tires" support for Nvidia GPUs in COS

This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available on the host in a special directory on the host -
* `/home/kubernetes/bin/nvidia/lib` for libraries
*  `/home/kubernetes/bin/nvidia/bin` for debug utilities

Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.

Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.

This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS. 
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead and latency, and additional noise that will be created when GPU tests break.

**Try out this PR**
1. Get Quota for GPUs in any region
2. `export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.

**Another option is to run a e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up` 
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.

TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
2017-05-23 10:46:10 -07:00
Kubernetes Submit Queue
9ebfe9662f Merge pull request #46286 from zjj2wry/timstamps-timestamps
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)

fix typo in kubelet

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-05-23 10:29:58 -07:00
Kubernetes Submit Queue
1602e2a338 Merge pull request #45587 from foxish/pdb-maxunavailab
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)

PDB Max Unavailable Field

Completes https://github.com/kubernetes/features/issues/285

```release-note
Adds a MaxUnavailable field to PodDisruptionBudget
```


Individual commits are self-contained; Last commit can be ignored because it is autogenerated code.
cc @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews
2017-05-23 10:29:56 -07:00
Nick Sardo
f40f45abc1 Defer test stop & cleanup 2017-05-23 10:11:46 -07:00
Tim St. Clair
4c98cab4db
Update audit API with missing pieces 2017-05-23 09:55:00 -07:00
Random-Liu
5f0288e022 Double StopContainer request timeout. 2017-05-23 09:35:48 -07:00
Andy Goldstein
d1a0384678 GC: allow ignored resources to be customized
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
2017-05-23 12:05:09 -04:00
Andy Goldstein
d30fb0d9d5 GC: update required verbs for deletable resources
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
2017-05-23 12:00:10 -04:00
Humble Chirammal
8700776d26 Add CephFS volume source to describe printer.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2017-05-23 20:27:00 +05:30
Anirudh
48d76edc74 PDB MaxUnavailable: Generated 2017-05-23 07:42:24 -07:00
Kubernetes Submit Queue
8e07e61a43 Merge pull request #46223 from smarterclayton/scheduler_max
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)

Scheduler should use a shared informer, and fix broken watch behavior for cached watches

Can be used either from a true shared informer or a local shared
informer created just for the scheduler.

Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event.  This means that we broke behavior when introducing the watch cache.  This may have API implications for filtering watch consumers - but on the other hand, it prevents clients filtering from seeing objects outside of their watch correctly, which can lead to other subtle bugs.

```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect.  If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent.  However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version.  That meant the new object was not matched by the filter.  This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
2017-05-23 07:42:00 -07:00
Kubernetes Submit Queue
1f45c4846b Merge pull request #45766 from sttts/sttts-audit-event-in-context
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)

Audit: fill audit.Event in handler chain

Related:
- external API types https://github.com/kubernetes/kubernetes/pull/45315
- policy checker https://github.com/kubernetes/kubernetes/pull/46009

Decisions:
- ~~[ ] decide whether we want to send an event before `WriteHeader` https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38664161~~ Follow-up described in https://github.com/kubernetes/kubernetes/pull/46065/files#r117438531
- [ ] decide how to handle `AuditID`s and the IP chain https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38659371. Is the variant in the proposal (https://github.com/kubernetes/community/pull/625) final? Then we need the API type update.
- ~~[ ] decide how to mark intermediate/incomplete events? set a special reason in `ResponseStatus.Reason` vs. having extra fields for that `Event.NonFinal`
 https://github.com/kubernetes/kubernetes/pull/45766#discussion_r116795888~~ Follow-up of #46065
- [ ] decide whether and how to protect the `Audit-Level` header https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38937691

TODOs:
- ~~[ ] move `AuditIDHeader`, `AuditLevelHeader` to types https://github.com/kubernetes/kubernetes/pull/45766#discussion_r117064094, @timstclair for the type PR~~ Follow-up of https://github.com/kubernetes/kubernetes/pull/46065
- [x] add SourceIP/ForwardedFor support https://github.com/kubernetes/kubernetes/pull/45766#discussion_r116778101
- [x] adapt ObjectReference.Resource to API PR https://github.com/kubernetes/kubernetes/pull/45766#pullrequestreview-38656828
2017-05-23 07:41:56 -07:00
Anirudh
63e51dc66e PDB MaxUnavailable: e2e tests 2017-05-23 07:18:44 -07:00
Anirudh
078f9566d9 PDB MaxUnavailable: kubectl changes 2017-05-23 07:18:44 -07:00
Anirudh
ce48d4fb5c PDB MaxUnavailable: Disruption Controller Changes 2017-05-23 07:18:44 -07:00
Anirudh
2b0de599a7 PDB MaxUnavailable: API changes 2017-05-23 07:18:43 -07:00
Kubernetes Submit Queue
4a1483efda Merge pull request #46216 from deads2k/owners-02-tighten
Automatic merge from submit-queue

tighten and simplify owners in some staging repos

With the move to staging, we can have much cleaner owners across the related packages.  This pares down the list of OWNERS to better match for code and activity.  It should help get PRs directed to people more active and familiar with the areas for quicker review.

@kubernetes/sig-api-machinery-misc 
@lavalamp @smarterclayton ptal.
2017-05-23 06:15:54 -07:00
Kubernetes Submit Queue
4871f4a75b Merge pull request #45637 from xilabao/hide-api-version
Automatic merge from submit-queue

remove --api-version
2017-05-23 06:15:45 -07:00
zhengjiajin
c79b0c797f fix typo in kubelet 2017-05-23 19:54:10 +08:00
Yassine TIJANI
a348a4e881 removing this todo after discussion (#46027) 2017-05-23 13:34:14 +02:00
Matt Potter
743cc5d685 autogen BUILD file 2017-05-23 11:37:48 +01:00
Matt Potter
ae102d64c4 refactor to use sets.String 2017-05-23 11:37:48 +01:00
Matt Potter
b8c0314861 deduplicate endpoints before DNS registration 2017-05-23 11:37:48 +01:00
Dr. Stefan Schimanski
9fdc36a47a Update bazel 2017-05-23 11:20:14 +02:00
Dr. Stefan Schimanski
ce942d19c3 audit: wire through non-nil context everywhere 2017-05-23 11:20:14 +02:00
Dr. Stefan Schimanski
0b5bcb0219 audit: add audit event to the context and fill in handlers 2017-05-23 11:20:14 +02:00
Dr. Stefan Schimanski
c1bf6e832e apiserver: move LongRunningRequestCheck type into endpoints/request 2017-05-23 11:20:13 +02:00
Kubernetes Submit Queue
8bee44b65f Merge pull request #46234 from wojtek-t/faster_selflink
Automatic merge from submit-queue (batch tested with PRs 46060, 46234)

Speedup generating selflinks for list and watch requests

I've seen profiles, where GenerateSelflink was 8-9% of whole cpu usage of apiserver (profiles over 30s). Most of this where spent in getting RequestInfo from the context and creating the context.

This PR changes the API of the GenerateLink method of the namer which results in computing the context and requestInfo only once per LIST/WATCH request (instead of computing it for every single returned element of LIST/WATCH).

@smarterclayton @deads2k - can one of you please take a look?
2017-05-23 01:41:57 -07:00
Kubernetes Submit Queue
7e75998233 Merge pull request #46060 from MrHohn/fix-serviceregistry-externaltraffic
Automatic merge from submit-queue (batch tested with PRs 46060, 46234)

Randomize test nodePort to prevent collision

Fix #37982.

/assign @bowei 

**Release note**:

```release-note
NONE
```
2017-05-23 01:41:55 -07:00
Kubernetes Submit Queue
286bcc6f5c Merge pull request #45995 from humblec/glusterfs-mount-3
Automatic merge from submit-queue

Add `auto_unmount` mount option for glusterfs fuse mount.

libfuse has an auto_unmount option which, if enabled, ensures that
the file system is unmounted at FUSE server termination by running a
separate monitor process that performs the unmount when that occurs.
(This feature would probably better be called "robust auto-unmount",
as FUSE servers usually do try to unmount their file systems upon
termination, it's just this mechanism is not crash resilient.)
This change implements that option and behavior for glusterfs.

This option will be only supported for clients with version >3.11.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
2017-05-23 00:29:41 -07:00
Kubernetes Submit Queue
455e9fff09 Merge pull request #46176 from vmware/vSphereStoragePolicySupport
Automatic merge from submit-queue

vSphere storage policy support for dynamic volume provisioning

Till now, vSphere cloud provider provides support to configure persistent volume with VSAN storage capabilities - kubernetes#42974. Right now this only works with VSAN.

Also there might be other use cases:

- The user might need a way to configure a policy on other datastores like VMFS, NFS etc.
- Use Storage IO control, VMCrypt policies for a persistent disk.

We can achieve about 2 use cases by using existing storage policies which are already created on vCenter using the Storage Policy Based Management service. The user will specify the SPBM policy ID as part of dynamic provisioning 

- resultant persistent volume will have the policy configured with it. 
- The persistent volume will be created on the compatible datastore that satisfies the storage policy requirements. 
- If there are multiple compatible datastores, the datastore with the max free space would be chosen by default.
- If the user specifies the datastore along with the storage policy ID, the volume will created on this datastore if its compatible. In case if the user specified datastore is incompatible, it would error out the reasons for incompatibility to the user.
- Also, the user will be able to see the associations of persistent volume object with the policy on the vCenter once the volume is attached to the node.

For instance in the below example, the volume will created on a compatible datastore with max free space that satisfies the "Gold" storage policy requirements.

```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
       name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
      diskformat: zeroedthick
      storagepolicyName: Gold
```

For instance in the below example, the vSphere CP checks if "VSANDatastore" is compatible with "Gold" storage policy requirements. If yes, volume will be provisioned on "VSANDatastore" else it will error that "VSANDatastore" is not compatible with the exact reason for failure.

```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
       name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
      diskformat: zeroedthick
      storagepolicyName: Gold
      datastore: VSANDatastore
```

As a part of this change, 4 commits have been added to this PR.

1. Vendor changes for vmware/govmomi
2. Changes to the VsphereVirtualDiskVolumeSource in the Kubernetes API. Added 2 additional fields StoragePolicyName, StoragePolicyID
3. Swagger and Open spec API changes.
4. vSphere Cloud Provider changes to implement the storage policy support.

**Release note**:


```release-note
vSphere cloud provider: vSphere Storage policy Support for dynamic volume provisioning
```
2017-05-22 23:41:10 -07:00
jianglingxia
adc0faa2c4 fix the invalid link 2017-05-23 14:36:43 +08:00
Kubernetes Submit Queue
3bfae793f0 Merge pull request #46008 from NickrenREN/openstack-add-metric
Automatic merge from submit-queue

Recording openstack metrics

add openstack operation metrics


**Release note**:
```release-note
Add support for emitting metrics from openstack cloudprovider about storage operations.
```

/assign @gnufied
2017-05-22 21:54:02 -07:00
xiangpengzhao
a320c94cc3
Update examples in Makefile with WHAT arguments changed. 2017-05-23 12:36:45 +08:00
Kubernetes Submit Queue
644a544d62 Merge pull request #46062 from alexandercampbell/correct-deprecation-errors
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

kubectl: fix deprecation warning bug

**What this PR does / why we need it**:

Some kubectl commands were deprecated but would fail to print the
correct warning message when a flag was given before the command name.

	# Correctly prints the warning that "resize" is deprecated and
	# "scale" is now preferred.
	kubectl resize [...]

	# Should print the same warning but no warning is printed.
	kubectl --v=1 resize [...]

This was due to a fragile check on os.Args[1].

This commit implements a new function deprecatedCmd() that is used to
construct new "passthrough" commands which are marked as deprecated and
hidden.

Note that there is an existing "filters" system that may be preferable
to the system created in this commit. I'm not sure why the "filters"
array was not used for all deprecated commands in the first place.

**Release note**:

```release-note
NONE
```
2017-05-22 20:58:07 -07:00
Kubernetes Submit Queue
31bd852ec1 Merge pull request #46247 from marun/fed-override-etcd-default-image
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

[Federation][kubefed]: Add support for etcd image override

This PR adds support for overriding the default etcd image used by ``kubefed init`` by providing an argument to ``--etcd-image``.  This is primarily intended to allow consumers like openshift to provide a different default, but as a nice side-effect supports code-free validation of non-default etcd images. 

**Release note**:

```release-note
'kubefed init' now supports overriding the default etcd image name with the --etcd-image parameter.
```
cc: @kubernetes/sig-federation-pr-reviews
2017-05-22 20:58:05 -07:00
Kubernetes Submit Queue
cc6e51c6e8 Merge pull request #45427 from ncdc/gc-shared-informers
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

Use shared informers in gc controller if possible

Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.

cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek
2017-05-22 20:58:03 -07:00
Kubernetes Submit Queue
2718429e4f Merge pull request #45952 from harryge00/update-es-image
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

remove the elasticsearch template

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```
NONE
```
Loading file-based index template has been disabled since 2.0.0-beta1 version of Elasticsearch.  https://www.elastic.co/guide/en/elasticsearch/reference/2.0/breaking_20_index_api_changes.html#_file_based_index_templates 

So the `template-k8s-logstash.json` is not longer useful.

On the other hand, as https://github.com/kubernetes/kubernetes/issues/25127 indicated, we might better curl the elasticsearch API to load this template.
2017-05-22 20:58:01 -07:00
Kubernetes Submit Queue
6f5193593d Merge pull request #46201 from wojtek-t/address_kubeproxy_todos
Automatic merge from submit-queue

Address remaining TODOs in kube-proxy.

Followup PR from the previous two.
2017-05-22 20:54:14 -07:00
Kubernetes Submit Queue
99a8f7c303 Merge pull request #43590 from dashpole/eviction_complete_deletion
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

Eviction does not evict unless the previous pod has been cleaned up

Addresses #43166
This PR makes two main changes:
First, it makes the eviction loop re-trigger immediately if there may still be pressure.  This way, if we already waited 10 seconds to delete a pod, we dont need to wait another 10 seconds for the next synchronize call.
Second, it waits for the pod to be cleaned up (including volumes, cgroups, etc), before moving on to the next synchronize call.  It has a timeout for this operation currently set to 30 seconds.
2017-05-22 20:00:03 -07:00