This is necessary to show any RootFs usage on systems where the backing
filesystem of overlay2 is xfs.
The current test only created directories (for mount points) in the
upper layer of the overlay. Outside of the mount namespace, only the
directories are visible. When running `du` on those, usually filesystems
will show some usage, but not xfs, which shows a disk usage of 0 for
directories.
Fix this by creating a file in the root directory, outside the volumes,
in order to trigger some disk usage that can be measured by `du`.
Tested: make test-e2e-node REMOTE=true HOSTS=centos-e2e-node FOCUS="Summary API"
Automatic merge from submit-queue (batch tested with PRs 61829, 61908, 61307, 61872, 60100). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
node authorizer sets up access rules for dynamic config
This PR makes the node authorizer automatically set up access rules for
dynamic Kubelet config.
I also added some validation to the node strategy, which I discovered we
were missing while writing this.
This PR is based on another WIP from @liggitt.
```release-note
The node authorizer now automatically sets up rules for Node.Spec.ConfigSource when the DynamicKubeletConfig feature gate is enabled.
```
This makes it easier to figure out which execution was last when looking
at the output of `systemd list-units kubelet-*.service`.
We try to find the name of the /tmp/node-e2e-* directory and use the
same timestamp if we can. Otherwise, we just call Now() again, which
isn't as nice (as the unit name and directory name will not match) but
will still produce unit names that will be ordered when launching
multiple subsequent executions on the same host.
Curl is more ubiquitous than wget. For instance, the GCE centos-7 and
rhel-7 image families ship curl by default, but not wget.
Looking at the shell scripts under cluster/, they tend to use curl more
than wget. (The ones that use wget, such as get-kube.sh, try curl first
and only fallback to wget if it's not available.)
Tested: by running node-e2e-test on Ubuntu, COS and CentOS.
This PR makes the node authorizer automatically set up access rules for
dynamic Kubelet config.
I also added some validation to the node strategy, which I discovered we
were missing while writing this.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Replace package "golang.org/x/net/context" with "context"
**What this PR does / why we need it**:
Replace package "golang.org/x/net/context" with "context"
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60560
**Special notes for your reviewer**:
As of Go 1.7 this package(golang.org/x/net/context) is available in the standard library under the name context. see (https://godoc.org/golang.org/x/net/context)
It is almost machinery replace.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61378, 60915, 61499, 61507, 61478). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Capture pod startup phases as metrics
Learning from https://github.com/kubernetes/kubernetes/issues/60589, we should also start collecting and graphing sub-parts of pod-startup latency.
/sig scalability
/kind feature
/priority important-soon
/cc @wojtek-t
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59740, 59728, 60080, 60086, 58714). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
more concise to merge the slice
**What this PR does / why we need it**:
more concise to merge the slice
**Special notes for your reviewer**:
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes the races around devicemanager Allocate() and endpoint deletion.
There is a race in predicateAdmitHandler Admit() that getNodeAnyWayFunc()
could get Node with non-zero deviceplugin resource allocatable for a
non-existing endpoint. That race can happen when a device plugin fails,
but is more likely when kubelet restarts as with the current registration
model, there is a time gap between kubelet restart and device plugin
re-registration. During this time window, even though devicemanager could
have removed the resource initially during GetCapacity() call, Kubelet
may overwrite the device plugin resource capacity/allocatable with the
old value when node update from the API server comes in later. This
could cause a pod to be started without proper device runtime config set.
To solve this problem, introduce endpointStopGracePeriod. When a device
plugin fails, don't immediately remove the endpoint but set stopTime in
its endpoint. During kubelet restart, create endpoints with stopTime set
for any checkpointed registered resource. The endpoint is considered to be
in stopGracePeriod if its stoptime is set. This allows us to track what
resources should be handled by devicemanager during the time gap.
When an endpoint's stopGracePeriod expires, we remove the endpoint and
its resource. This allows the resource to be exported through other channels
(e.g., by directly updating node status through API server) if there is such
use case. Currently endpointStopGracePeriod is set as 5 minutes.
Given that an endpoint is no longer immediately removed upon disconnection,
mark all its devices unhealthy so that we can signal the resource allocatable
change to the scheduler to avoid scheduling more pods to the node.
When a device plugin endpoint is in stopGracePeriod, pods requesting the
corresponding resource will fail admission handler.
Tested:
Ran GPUDevicePlugin e2e_node test 100 times and all passed now.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/60176
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes the races around devicemanager Allocate() and endpoint deletion.
```
There is a race in predicateAdmitHandler Admit() that getNodeAnyWayFunc()
could get Node with non-zero deviceplugin resource allocatable for a
non-existing endpoint. That race can happen when a device plugin fails,
but is more likely when kubelet restarts as with the current registration
model, there is a time gap between kubelet restart and device plugin
re-registration. During this time window, even though devicemanager could
have removed the resource initially during GetCapacity() call, Kubelet
may overwrite the device plugin resource capacity/allocatable with the
old value when node update from the API server comes in later. This
could cause a pod to be started without proper device runtime config set.
To solve this problem, introduce endpointStopGracePeriod. When a device
plugin fails, don't immediately remove the endpoint but set stopTime in
its endpoint. During kubelet restart, create endpoints with stopTime set
for any checkpointed registered resource. The endpoint is considered to be
in stopGracePeriod if its stoptime is set. This allows us to track what
resources should be handled by devicemanager during the time gap.
When an endpoint's stopGracePeriod expires, we remove the endpoint and
its resource. This allows the resource to be exported through other channels
(e.g., by directly updating node status through API server) if there is such
use case. Currently endpointStopGracePeriod is set as 5 minutes.
Given that an endpoint is no longer immediately removed upon disconnection,
mark all its devices unhealthy so that we can signal the resource allocatable
change to the scheduler to avoid scheduling more pods to the node.
When a device plugin endpoint is in stopGracePeriod, pods requesting the
corresponding resource will fail admission handler.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add node-e2e test for ShareProcessNamespace
**What this PR does / why we need it**: Adds a node-e2e test for kubernetes/features#495
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59554
**Special notes for your reviewer**: This requires a feature gate to be enabled in both the kubelet and API server. I'm not sure which jenkins configs need to be updated (or if these are even still used) so I just updated a pile of them.
opened kubernetes/test-infra#7030 for https://github.com/kubernetes/test-infra/blob/master/jobs/config.json
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix todo:Move function readinessCheck to util
**What this PR does / why we need it**:
fix todo:Move function readinessCheck to util in test/e2e_node/services
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60376, 55584, 60358, 54631, 60291). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove gcloud docker -- since it's deprecated
docker handles this now and it raises an error.
try 3
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60236, 60332, 57375, 60451, 57408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update device plugin e2e_node test to not changing Kubelet config
as DevicePlugins feature is enabled by default now.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 59159, 60318, 60079, 59371, 57415). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Made a couple API changes to deviceplugin/v1beta1 to avoid future
incompatible API changes:
- Add GetDevicePluginOptions rpc call. This is needed when we switch
from Registration service to probe-based plugin watcher.
- Change AllocateRequest and AllocateResponse to allow device requests
from multiple containers in a pod. Currently only made mechanical
change on the devicemanager and test code to cope with the API but
still issues an Allocate call per container. We can modify the
devicemanager in 1.11 to issue a single Allocate call per pod.
The change will also facilitate incremental API change to communicate
pod level information through Allocate rpc if there is such future
need.
**What this PR does / why we need it**:
Made a couple API changes to deviceplugin/v1beta1 to avoid future incompatible API changes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/59370
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 60324, 60269, 59771, 60314, 59941). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
expunge the word 'manifest' from Kubelet's config API
The word 'manifest' technically refers to a container-group specification
that predated the Pod abstraction. We should avoid using this legacy
terminology where possible. Fortunately, the Kubelet's config API will
be beta in 1.10 for the first time, so we still had the chance to make
this change.
I left the flags alone, since they're deprecated anyway.
I changed a few var names in files I touched too, but this PR is the
just the first shot, not the whole campaign
(`git grep -i manifest | wc -l -> 1248`).
```release-note
Some field names in the Kubelet's now v1beta1 config API differ from the v1alpha1 API: PodManifestPath is renamed to PodPath, ManifestURL is renamed to PodURL, ManifestURLHeader is renamed to PodURLHeader.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update test framework featuregates type
**What this PR does / why we need it**:
A cleanup following #53025 and #57962.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref: #53025
and #57962.
**Special notes for your reviewer**:
but yeah, not sure if it's worthy to do this :)
**Release note**:
```release-note
NONE
```
incompatible changes:
- Add GetDevicePluginOptions rpc call. This is needed when we switch
from Registration service to probe-based plugin watcher.
- Change AllocateRequest and AllocateResponse to allow device requests
from multiple containers in a pod. Currently only made mechanical
change on the devicemanager and test code to cope with the API but
still issues an Allocate call per container. We can modify the
devicemanager in 1.11 to issue a single Allocate call per pod.
The change will also facilitate incremental API change to communicate
pod level information through Allocate rpc if there is such future
need.
The word 'manifest' technically refers to a container-group specification
that predated the Pod abstraction. We should avoid using this legacy
terminology where possible. Fortunately, the Kubelet's config API will
be beta in 1.10 for the first time, so we still had the chance to make
this change.
I left the flags alone, since they're deprecated anyway.
I changed a few var names in files I touched too, but this PR is the
just the first shot, not the whole campaign
(`git grep -i manifest | wc -l -> 1248`).
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
set default enabled admission plugins by official document
**What this PR does / why we need it**:
https://kubernetes.io/docs/admin/admission-controllers/#is-there-a-recommended-set-of-admission-controllers-to-use
recommend running the following set of admission controllers
```
If you previously had not set the `--admission-control` flag, your cluster behavior may change (to be more standard). See [https://kubernetes.io/docs/admin/admission-controllers/] for explanation of admission control.
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Set default enabled admission plugins `NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota`
```
Automatic merge from submit-queue (batch tested with PRs 60158, 60156, 58111, 57583, 60055). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Bugfix/logf newline
**What this PR does / why we need it**:
Removes all redundant new lines being passed into the `Logf()` function. This involved going through code in both `test/e2e` and `test/e2e_node`, finding the newline redundancies in calls to `Logf()` and removing them.
**Which issue(s) this PR fixes**:
Fixes [#57102](https://github.com/kubernetes/kubernetes/issues/57102)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58716, 59977, 59316, 59884, 60117). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove deprecated /proxy paths
These were deprecated in v1.2.
ref https://github.com/kubernetes/kubernetes/issues/59885
```release-note
kube-apiserver: the root /proxy paths have been removed (deprecated since v1.2). Use the /proxy subresources on objects that support HTTP proxying.
```
@kubernetes/sig-api-machinery-api-reviews
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies
**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59927, 59989, 59950). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix e2e node setKubeletConfiguration helper
The helper should have been using `apiequality.Semantic.DeepEqual`,
instead of `reflect.DeepEqual`. Previously, nil vs empty containers
were treated as not equal, but they should be considered equal for
objects managed by Kubernetes API machinery, like KubeletConfiguration.
This should fix the failing eviction tests.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reevaluate eviction thresholds after reclaim functions
**What this PR does / why we need it**:
When the node comes under `DiskPressure` due to inodes or disk space, the eviction manager runs garbage collection functions to clean up dead containers and unused images.
Currently, we use the strategy of trying to measure the disk space and inodes freed by garbage collection. However, as #46789 and #56573 point out, there are gaps in the implementation that can cause extra evictions even when they are not required. Furthermore, for nodes which frequently cycle through images, it results in a large number of evictions, as running out of inodes always causes an eviction.
This PR changes this strategy to call the garbage collection functions and ignore the results. Then, it triggers another collection of node-level metrics, and sees if the node is still under DiskPressure.
This way, we can simply observe the decrease in disk or inode usage, rather than trying to measure how much is freed.
**Which issue(s) this PR fixes**:
Fixes#46789Fixes#56573
Related PR #56575
**Special notes for your reviewer**:
This will look cleaner after #57802 removes arguments from [makeSignalObservations](https://github.com/kubernetes/kubernetes/pull/57802/files#diff-9e5246d8c78d50ce4ba440f98663f3e9R719).
**Release note**:
```release-note
NONE
```
/sig node
/kind bug
/priority important-soon
cc @kubernetes/sig-node-pr-reviews
The helper should have been using `apiequality.Semantic.DeepEqual`,
instead of `reflect.DeepEqual`. Previously, nil vs empty containers
were treated as not equal, but they should be considered equal for
objects managed by Kubernetes API machinery, like KubeletConfiguration.
This should fix the failing eviction tests.
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Updated PID pressure node condition.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313
**Release note**:
```release-note
Updated PID pressure node condition
```
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Graduate kubeletconfig API group to beta
Regarding https://github.com/kubernetes/features/issues/281, this PR moves the kubeletconfig API group to beta.
After #53088, the KubeletConfiguration type should not contain any deprecated or experimental fields, and we should not have to remove any more fields from the type before graduating it to beta.
We need the community to double check for two things, however:
1. Are there any fields currently in the KubeletConfiguration type that you were going to mark deprecated this quarter, but haven't yet?
2. Are there any fields currently in the KubeletConfiguration type that are experimental or alpha, but were not explicitly denoted as such?
Please comment on this PR if you can answer "yes" to either of those two questions. Please cc anyone with a stake in the kubeletconfig API, so we get as much coverage as possible.
/cc @thockin @dchen1107 @Random-Liu @yujuhong @dashpole @tallclair @vishh @abw @freehan @dnardo @bowei @MrHohn @luxas @liggitt @ncdc @derekwaynecarr @mikedanese
@kubernetes/sig-network-pr-reviews, @kubernetes/sig-node-pr-reviews
```release-note
action required: The `kubeletconfig` API group has graduated from alpha to beta, and the name has changed to `kubelet.config.k8s.io`. Please use `kubelet.config.k8s.io/v1beta1`, as `kubeletconfig/v1alpha1` is no longer available.
```
**TODO:**
- [x] Move experimental/non-gated-alpha/soon-to-be-deprecated fields to `KubeletFlags`
- [x] #53088
- [x] #54154
- [x] #54160
- [x] #55562
- [x] #55983
- [x] #57851
- [x] Lift embedded structure out of strings
- [x] #53025
- [x] #54643
- [x] #54823
- [x] #55254
- [x] Resolve relative paths against the location config files are loaded from
- [x] #55648
- [x] Rename to `kubelet.config.k8s.io`
- [x] Comments
- [x] Make sure existing comments at least read sensibly.
- [x] Note default values in comments on the versioned struct.
- [x] Remove any reference to default values in comments on the internal struct.
- [x] Most fields should be `+optional` and `omitempty`. Add where necessary. ~Where omitted, explicitly comment.~ Edit: We should not distinguish between nil and empty, see below items.
- [x] Ensure defaults are specified via `pkg/kubelet/apis/kubelet.config.k8s.io/v1beta1/defaults.go`, not `cmd/kubelet/app/options/options.go`.
- [x] #57770
- [x] Ensure kubeadm does not persist v1alpha1 KubeletConfiguration objects (or feature-gates this functionality)
- [x] Don't make a distinction between empty and nil, because of #43203.
- [x] #59515
- [x] #59681
- [x] Take the opportunity to fix insecure Kubelet defaults @tallclair
- [x] #59666
- [x] Remove CAdvisorPort from KubeletConfiguration wrt #56523.
- [x] #59580
- [x] Hide `ConfigTrialDuration` until we're more sure what to do with it.
- [x] #59628
- [x] Fix `// default: x` comments after rebasing on recent changes.
This is a more accurate name for the condition, as it describes the
status of the Kubelet's configuration.
Also cleans up capitalization of internal names.