Automatic merge from submit-queue
Build and push kube-dns for 1.4 release.
Fix#31355.
Following docker images had been uploaded:
gcr.io/google_containers/kubedns-amd64:1.7
gcr.io/google_containers/kubedns-arm:1.7
gcr.io/google_containers/kubedns-arm64:1.7
Build for ppc64le is disabled by default, and it failed to be built using:
`KUBE_BUILD_PPC64LE=y make release`
I'm still working on making the ppc64le build. Updates will be added following this thread.
@girishkalele @thockin
Automatic merge from submit-queue
Add ReclaimPolicy to the resource printer for 'get pv'
Propose we add the RECLAIMPOLICY (persistentVolumeReclaimPolicy) from resource_printer.go to show the policy when a user does a ```kubectl get pv```
```
[root@k8dev nfs]# kubectl get pv
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM REASON AGE
pv-nfs 1Gi RWO Retain Available 1m
pv-nfs2 1Gi RWO Delete Available 4s
```
Automatic merge from submit-queue
Allow services which use same port, different protocol to use the same nodePort for both
fix#20092
@thockin @smarterclayton ptal.
Automatic merge from submit-queue
Filter internal Kubernetes labels from Prometheus metrics
**What this PR does / why we need it**:
Kubernetes uses Docker labels as storage for some internal labels. The
majority of these labels are not meaningful metric labels and a few of
them are even harmful as they're not static and cause wrong aggregation
results.
This change provides a custom labels func to only attach meaningful
labels to cAdvisor exported metrics.
**Which issue this PR fixes**
google/cadvisor#1312
**Special notes for your reviewer**:
Depends on google/cadvisor#1429. Once that is merged, I'll update the vendor update commit.
**Release note**:
```release-note
Remove environment variables and internal Kubernetes Docker labels from cAdvisor Prometheus metric labels.
Old behavior:
- environment variables explicitly whitelisted via --docker-env-metadata-whitelist were exported as `container_env_*=*`. Default is zero so by default non were exported
- all docker labels were exported as `container_label_*=*`
New behavior:
- Only `container_name`, `pod_name`, `namespace`, `id`, `image`, and `name` labels are exposed
- no environment variables will be exposed ever via /metrics, even if whitelisted
```
---
Given that we have full control over the exported label set, I shortened the pod_name, pod_namespace and container_name label names. Below an example of the change (reformatted for readability).
```
# BEFORE
container_cpu_cfs_periods_total{
container_label_io_kubernetes_container_hash="5af8c3b4",
container_label_io_kubernetes_container_name="sync",
container_label_io_kubernetes_container_restartCount="1",
container_label_io_kubernetes_container_terminationMessagePath="/dev/termination-log",
container_label_io_kubernetes_pod_name="popularsearches-web-3165456836-2bfey",
container_label_io_kubernetes_pod_namespace="popularsearches",
container_label_io_kubernetes_pod_terminationGracePeriod="30",
container_label_io_kubernetes_pod_uid="6a291e48-47c4-11e6-84a4-c81f66bdf8bd",
id="/docker/68e1f15353921f4d6d4d998fa7293306c4ac828d04d1284e410ddaa75cf8cf25",
image="redacted.com/popularsearches:42-16-ba6bd88",
name="k8s_sync.5af8c3b4_popularsearches-web-3165456836-2bfey_popularsearches_6a291e48-47c4-11e6-84a4-c81f66bdf8bd_c02d3775"
} 72819
# AFTER
container_cpu_cfs_periods_total{
container_name="sync",
pod_name="popularsearches-web-3165456836-2bfey",
namespace="popularsearches",
id="/docker/68e1f15353921f4d6d4d998fa7293306c4ac828d04d1284e410ddaa75cf8cf25",
image="redacted.com/popularsearches:42-16-ba6bd88",
name="k8s_sync.5af8c3b4_popularsearches-web-3165456836-2bfey_popularsearches_6a291e48-47c4-11e6-84a4-c81f66bdf8bd_c02d3775"
} 72819
```
Feedback requested on:
* Label names. Other suggestions? Should we keep these very long ones?
* Do we need to export io.kubernetes.pod.uid? It makes working with the metrics a bit more complicated and the pod name is already unique at any time (but not over time). The UID is aslo part of `name`.
As discussed with @timstclair, this should be added to v1.4 as the current labels are harmful.
PTAL @jimmidyson @fabxc @vishh
Automatic merge from submit-queue
Split the version metric out to its own package
This PR breaks a client dependency on prometheus. Combined with #30638, the client will no longer depend on these packages.
Automatic merge from submit-queue
Add ExternalName kube-dns e2e test
ExternalName allows kubedns to return CNAME records for external
services. No proxying is involved.
Built on top of and includes #30599
See original issue at
https://github.com/kubernetes/kubernetes/issues/13748
Feature tracking at
https://github.com/kubernetes/features/issues/33
The e2e test is at least as comprehensive as the one for headless services (namely, only to some degree)
```release-note
Add ExternalName services as CNAME references to external ones
```
Automatic merge from submit-queue
return destroy func to clean up internal resources of storage
What?
Provide a destroy func to clean up internal resources of storage.
It changes **unit tests** to clean up resources. (Maybe fix integration test in another PR.)
Why?
Although apiserver is designed to be long running, there are some cases that it's not.
See https://github.com/kubernetes/kubernetes/issues/31262#issuecomment-242208771
We need to gracefully shutdown and clean up resources.
Automatic merge from submit-queue
gci: decouple from the built-in kubelet version
Prior to this change, configure.sh would:
(1) compare versions of built-in kubelet and downloaded kubelet, and
(2) bind-mount downloaded kubelet at /usr/bin/kubelet in case of
version mismatch
With this change, configure.sh:
(1) compares the two versions only on test clusters, and
(2) uses the actual file paths to start kubelet w/o any bind-mounting
To allow (2), this change also provides its own version of kubelet
systemd service file.
Effectively with this change we will always use the downloaded kubelet
binary along with its own systemd service file on non-test clusters. The
main advantage is this change does not rely on the kubelet being built in to
the OS image.
@dchen1107 @wonderfly can you please review
cc/ @kubernetes/goog-image FYI
Automatic merge from submit-queue
Improve godoc for goroutinemap
Improves the godoc of goroutinemap; found while preparing to use this type in another PR.
@saad-ali
Automatic merge from submit-queue
Fix sort hint in `hack/verify-golint.sh`
The `verify-golint.sh` sorts all items with `LANG=C sort`, but it only hints developers to use `sort`, which causes a little trouble for me.
/cc @jfrazelle @sttts
Automatic merge from submit-queue
Node Conformance Test: Refactor node e2e framework
For #30122, #30174.
Based on #30348.
**Please only review the last 3 commits.**
This PR is part of our roadmap to package node conformance test.
The 1st commit is from #30348, it removed unnecessary dependencies in the node e2e test framework, because we've statically linked these dependencies.
The PR refactored the node e2e framework. Moving different utilities into different packages under `pkg/`.
We need to do this because:
1) Files like e2e_remote.go and e2e_build.go should only be used by runner, but they were compiled into the test suite because they were placed in the same package. The worst thing is that it will introduce some never used flags in the test suite binary.
2) Make the directory structure more clear. Only test should be placed in `test/e2e_node`, other utilities should be placed in different packages in `pkg/`.
@dchen1107 @vishh
/cc @kubernetes/sig-node @kubernetes/sig-testing
Automatic merge from submit-queue
Refactor to simplify the hard-traveled path of the KubeletConfiguration object
### There are two main goals of this PR:
- Make `NewMainKubelet` take `KubeletConfiguration` and `KubeletDeps` as its only arguments.
- Finally eliminate the legacy `KubeletConfig` type.
### Why am I doing this?
Long story short, I started adding an endpoint to the Kubelet to display the *current* config that the Kubelet was running with, and I realized a few things:
- There were so many transformations to the configuration, in so many different places, before it was used that I wasn't confident the values initially passed in on the `KubeletConfiguration` would be the correct values to report by the time someone used the endpoint to check on them.
- Trying to reconstruct a `KubeletConfiguration` object from a mix of the `Kubelet` object and the legacy `KubeletConfig` object would just add to the mess (not to mention maintenance burden), and it would be much easier if we passed the `KubeletConfiguration` all the way down to where we construct the `Kubelet` object, and then just store a reference to the `KubeletConfiguration` object on the `Kubelet` for later retrieval.
- My hope is that by eliminating unnecessary internal transformations to the config information, and by consolidating the remaining ones in a single place (`NewMainKubelet`), we can have a much clearer understanding of what happens to the config before it makes it to the `Kubelet` object, and also a better ability to report up-to-date information on the status of the Kubelet.
So I started cleaning things up :-).
### Discussion points
It was relatively simple to get `NewMainKubelet` to just take the legacy `KubeletConfig` as its only argument, because most of its arguments were just passing through `KubeletConfig` fields or passing information that was generated solely from `KubeletConfig` fields.
Completely eliminating the legacy `KubeletConfig` type has been more difficult, because the fields of the `KubeletConfiguration` do not have a one-to-one relationship with the fields of the `KubeletConfig`. While I was able to eliminate many of the `KubeletConfig` fields, I'm starting to get into the nontrivial stuff and I'd like to get a discussion started on what should happen with the remaining fields (pending cherry-picking notwithstanding).
On my `kconf-refactor` branch, the legacy `KubeletConfig` object is down to the following 27 fields (from the initial 93). I'd really appreciate any guidance people have on what should happen with these fields.
```
type KubeletConfig struct {
Auth server.AuthInterface
AutoDetectCloudProvider bool
Builder KubeletBuilder
CAdvisorInterface cadvisor.Interface
Cloud cloudprovider.Interface
ContainerManager cm.ContainerManager
DockerClient dockertools.DockerInterface
EventClient *clientset.Clientset
Hostname string
HostNetworkSources []string
HostPIDSources []string
HostIPCSources []string
KubeClient *clientset.Clientset
Mounter mount.Interface
NetworkPlugins []network.NetworkPlugin
NodeName string
OOMAdjuster *oom.OOMAdjuster
OSInterface kubecontainer.OSInterface
PodConfig *config.PodConfig
Recorder record.EventRecorder
Reservation kubetypes.Reservation
TLSOptions *server.TLSOptions
Writer kubeio.Writer
VolumePlugins []volume.VolumePlugin
EvictionConfig eviction.Config
ContainerRuntimeOptions []kubecontainer.Option
Options []Option
}
```
The patterns I've seen so far with respect to eliminating `KubeletConfig` fields may be of some help:
- Some fields could just be eliminated, because they were either the same on `KubeletConfiguration` or just a typecast away from being the same.
- Some fields from `KubeletConfiguration` just ended up in substructures of `KubeletConfig`; it was easy to just remove those substructure fields from `KubeletConfig` and construct them using local vars in `NewMainKubelet` instead.
- Some fields, e.g. `Runonce`, were able to move into the `KubeletConfiguration`.
**P.S.** Part of the way I'm making the transition is by adding an extra `KubeletConfiguration` argument to functions that originally took a `KubeletConfig`, and field-by-field, switching those functions over to using information from the `KubeletConfiguration`. Once the `KubeletConfig` is gone, I'll remove the `KubeletConfig` argument, and the transition will be complete.
**Final note:**
Please try to keep in mind that this is not a general Kubelet cleanup effort, it is just me cleaning things up that are directly in the path of what I'm trying to do. Let's keep this focused on cleanup related to the path that config takes on it's way to the Kubelet.
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```
Removed Flags
- Removes the --auth-path flag. This has been deprecated in favor of --kubeconfig for two releases.
```
Automatic merge from submit-queue
Keep vendor/ and Godep/ when creating the staging client, add a readme
In copy.sh, instead of removing the vendor/, moving it to _vendor. vendor/ is needed when we publish the staging client to its own repository.
Automatic merge from submit-queue
Fixed integer overflow bug in rate limiter.
```release-note
Fix overflow issue in controller-manager rate limiter
```
This PR fixes a bug in the delayed work-queue used by some controllers.
The integer overflow bug would previously cause hotlooping behavior after a few failures
as `time.Duration(..)` on values larger than MaxInt64 behaves unpredictably, and
after a certain value returns 0 always.
cc @bprashanth @pwittrock
This refactor removes the legacy KubeletConfig object and adds a new
KubeletDeps object, which contains injected runtime objects and
separates them from static config. It also reduces NewMainKubelet to two
arguments: a KubeletConfiguration and a KubeletDeps.
Some mesos and kubemark code was affected by this change, and has been
modified accordingly.
And a few final notes:
KubeletDeps:
KubeletDeps will be a temporary bin for things we might consider
"injected dependencies", until we have a better dependency injection
story for the Kubelet. We will have to discuss this eventually.
RunOnce:
We will likely not pull new KubeletConfiguration from the API server
when in runonce mode, so it doesn't make sense to make this something
that can be configured centrally. We will leave it as a flag-only option
for now. Additionally, it is increasingly looking like nobody actually uses the
Kubelet's runonce mode anymore, so it may be a candidate for deprecation
and removal.
Automatic merge from submit-queue
[GarbageCollector] add absent owner cache
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
Reducing the Request sent to the API server by the garbage collector to check if an owner exists.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#26120
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
```
Currently when processing an item in the dirtyQueue, the garbage collector issues GET to check if any of its owners exist. If the owner is a replication controller with 1000 pods, the garbage collector sends a GET for the RC 1000 times. This PR caches the owner's UID if it does not exist according to the API server. This cuts 1/3 of the garbage collection time of the density test in the gce-500 and gce-scale, where the QPS is the bottleneck.
Automatic merge from submit-queue
rkt: Improve support for privileged pod (pod whose all containers are privileged)
Fix https://github.com/kubernetes/kubernetes/issues/31100
This takes advantage of https://github.com/coreos/rkt/pull/2983 . By appending the new `--all-run` insecure-options to `rkt run-prepared` command when all the containers are privileged. The pod now gets more privileged power.
Automatic merge from submit-queue
Add sysctl support
Implementation of proposal https://github.com/kubernetes/kubernetes/pull/26057, feature https://github.com/kubernetes/features/issues/34
TODO:
- [x] change types.go
- [x] implement docker and rkt support
- [x] add e2e tests
- [x] decide whether we want apiserver validation
- ~~[ ] add documentation~~: api docs exist. Existing PodSecurityContext docs is very light and links back to the api docs anyway: 6684555ed9/docs/user-guide/security-context.md
- [x] change PodSecurityPolicy in types.go
- [x] write admission controller support for PodSecurityPolicy
- [x] write e2e test for PodSecurityPolicy
- [x] make sure we are compatible in the sense of https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
- [x] test e2e with rkt: it only works with kubenet, not with no-op network plugin. The later has no sysctl support.
- ~~[ ] add RunC implementation~~ (~~if that is already in kube,~~ it isn't)
- [x] update whitelist
- [x] switch PSC fields to annotations
- [x] switch PSP fields to annotations
- [x] decide about `--experimental-whitelist-sysctl` flag to be additive or absolute
- [x] decide whether to add a sysctl node whitelist annotation
### Release notes:
```release-note
The pod annotation `security.alpha.kubernetes.io/sysctls` now allows customization of namespaced and well isolated kernel parameters (sysctls), starting with `kernel.shm_rmid_forced`, `net.ipv4.ip_local_port_range`, `net.ipv4.tcp_max_syn_backlog` and `net.ipv4.tcp_syncookies` for Kubernetes 1.4.
The pod annotation `security.alpha.kubernetes.io/unsafeSysctls` allows customization of namespaced sysctls where isolation is unclear. Unsafe sysctls must be enabled at-your-own-risk on the kubelet with the `--experimental-allowed-unsafe-sysctls` flag. Future versions will improve on resource isolation and more sysctls will be considered safe.
```
Automatic merge from submit-queue
Fix scale x->x in kubectl for ReplicationController
Fix#31374
This fixes problem introduced in #31051 (which in turn was fixing a different problem).
@lavalamp - FYI
Automatic merge from submit-queue
add throughput in perf data and disable --cgroups-per-qos
This PR adds throughput data to printed perf data for benchmark. It also disables --cgrous-per-qos in jenkinds-benchmark.properties.