Automatic merge from submit-queue (batch tested with PRs 42443, 38924, 42367, 42391, 42310)
Fix StatefulSet e2e flake
**What this PR does / why we need it**:
Fixes StatefulSet e2e flake by ensuring that the StatefulSet controller has observed the unreadiness of Pods prior to attempting to exercise scale functionality.
**Which issue this PR fixes**
fixes#41889
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)
Bump test timeouts to make secret tests work in large clusters
The previous Get/Update pattern with no retry on resource version mismatch
would flake with the following error:
"the object has been modified; please apply your changes to the latest
version and try again"
Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)
Move node problem detector test into node e2e.
Move current NPD e2e test into node e2e.
In fact, current NPD e2e test is only a functionality test for NPD. It creates test NPD pod, sets test configuration, generates test logs and verifies test result.
It doesn't actually test the NPD really deployed in the cluster.
So it doesn't actually need to run in cluster e2e. Running it in node e2e will:
1) Make it easier to run the test.
2) Make it more light weight to introduce this as a pre/post submit test in NPD repo in the future.
Except this, I'm working on a cluster e2e to run some basic functionality test and benchmark test against the real NPD deployed in the cluster. Will send the PR later.
/cc @dchen1107 @kubernetes/node-problem-detector-reviewers
Automatic merge from submit-queue (batch tested with PRs 42128, 42064, 42253, 42309, 42322)
Add storage.k8s.io/v1 API
This is combined version of reverted #40088 (first 4 commits) and #41646. The difference is that all controllers and tests use old `storage.k8s.io/v1beta1` API so in theory all tests can pass on GKE.
Release note:
```release-note
StorageClassName attribute has been added to PersistentVolume and PersistentVolumeClaim objects and should be used instead of annotation `volume.beta.kubernetes.io/storage-class`. The beta annotation is still working in this release, however it will be removed in a future release.
```
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
Adjust parameters of GCL cluster logging load tests
This PR increases the amount of logs produced in load tests to match the number of nodes and provide the predictable load of 100 KB/sec on each node.
Also this PR reduces in half amount of time, given for ingesting logs.
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
Take into account number of restarts in cluster logging tests
Before, in cluster logging tests, we only measured e2e number of lines delivered to the backend.
Also, befure https://github.com/kubernetes/kubernetes/pull/41795 was merged, from the k8s perspective, fluentd was always working properly, even if it's crashlooping inside.
Now we can detect whether fluentd is truly working properly, experiencing no, or almost no OOMs duing its operation.
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Admission Controller: Add Pod Preset
Based off the proposal in https://github.com/kubernetes/community/pull/254
cc @pmorie @pwittrock
TODO:
- [ ] tests
**What this PR does / why we need it**: Implements the Pod Injection Policy admission controller
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Added new Api `PodPreset` to enable defining cross-cutting injection of Volumes and Environment into Pods.
```
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)
Ingress-glbc upgrade tests
Basically #41676 but with some fixes and added comments. @bprashanth has been away this week and it's desirable to have this in before code freeze.
export functions from pkg/api/validation
add settings API
add settings to pkg/registry
add settings api to pkg/master/master.go
add admission control plugin for pod preset
add new admission control plugin to kube-apiserver
add settings to import_known_versions.go
add settings to codegen
add validation tests
add settings to client generation
add protobufs generation for settings api
update linted packages
add settings to testapi
add settings install to clientset
add start of e2e
add pod preset plugin to config-test.sh
Signed-off-by: Jess Frazelle <acidburn@google.com>
Automatic merge from submit-queue
Extend experimental support to multiple Nvidia GPUs
Extended from #28216
```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with support for multiple Nvidia GPUs.
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```
1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```
TODO:
- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue (batch tested with PRs 41234, 42186, 41615, 42028, 41788)
Additional upgrade e2e tests
**What this PR does / why we need it**: Add basic upgrade tests for DaemonSet and Job, and add "during upgrade" testing to ConfigMap test. Add a simple harness for testing upgrade tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**: continuation of #41296 @krousey please review, thanks
**Release note**: `NONE`
Automatic merge from submit-queue (batch tested with PRs 41205, 42196, 42068, 41588, 41271)
Implements an upgrade test for Job
**What this PR does / why we need it**:
This PR implements a cluster upgrade test for Job. Some functionality for Job testing has been moved from the e2e package to the framework package to facilitate code reuse between the e2e package and the upgrade package without introducing cyclic dependencies.
We need this PR to help automate the testing of cluster upgrades between versions.
**Release note**
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
make kubectl taint command respect effect NoExecute
**What this PR does / why we need it**:
Part of feature forgiveness implementation, make kubectl taint command respect effect NoExecute.
**Which issue this PR fixes**:
Related Issue: #1574
Related PR: #39469
**Special notes for your reviewer**:
**Release note**:
```release-note
make kubectl taint command respect effect NoExecute
```
Automatic merge from submit-queue (batch tested with PRs 41962, 42055, 42062, 42019, 42054)
PV upgrade test
**What this PR does / why we need it**:
This PR adds a PV upgrade test to the new upgrade test framework. Before, this test had to be done manually. Currently the upgrade test framework only works on the GCE environment, so I plan to add support for other providers later. In order to write the test, I had to modify and refactor some volume test util libraries. I reran the impacted tests to make sure they still passed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
It's probably easier to review the two commits separately. I split it up into the refactor changes, and the upgrade test changes.
**Release note**:
NONE
cc @saad-ali @krousey
Automatic merge from submit-queue (batch tested with PRs 42044, 41694, 41927, 42050, 41987)
Use existing nginx image in the node
**What this PR does / why we need it**:
Fixes a test flake: `[k8s.io] Garbage collector should orphan pods created by rc if delete options say so`
**Which issue this PR fixes**
fixes#35771
**Special notes for your reviewer**:
**Release note**:
```release-note NONE
```
Automatic merge from submit-queue (batch tested with PRs 35408, 41915, 41992, 41964, 41925)
e2e/upgrade: add sysctls
Add sysctl upgrade tests.
How can these be effectively tested?
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Guaranteed admission for Critical Pods
This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.
cc: @vishh
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
Modify CronJob API to add job history limits, cleanup jobs in controller
**What this PR does / why we need it**:
As discussed in #34710: this adds two limits to `CronJobSpec`, to limit the number of finished jobs created by a CronJob to keep.
**Which issue this PR fixes**: fixes#34710
**Special notes for your reviewer**:
cc @soltysh, please have a look and let me know what you think -- I'll then add end to end testing and update the doc in a separate commit. What is the timeline to get this into 1.6?
The plan:
- [x] API changes
- [x] Changing versioned APIs
- [x] `types.go`
- [x] `defaults.go` (nothing to do)
- [x] `conversion.go` (nothing to do?)
- [x] `conversion_test.go` (nothing to do?)
- [x] Changing the internal structure
- [x] `types.go`
- [x] `validation.go`
- [x] `validation_test.go`
- [x] Edit version conversions
- [x] Edit (nothing to do?)
- [x] Run `hack/update-codegen.sh`
- [x] Generate protobuf objects
- [x] Run `hack/update-generated-protobuf.sh`
- [x] Generate json (un)marshaling code
- [x] Run `hack/update-codecgen.sh`
- [x] Update fuzzer
- [x] Actual logic
- [x] Unit tests
- [x] End to end tests
- [x] Documentation changes and API specs update in separate commit
**Release note**:
```release-note
Add configurable limits to CronJob resource to specify how many successful and failed jobs are preserved.
```
Automatic merge from submit-queue (batch tested with PRs 42106, 42094, 42069, 42098, 41852)
Pod deletion observation is flaking, increase timeout and debug more
We can afford to wait longer than 30 seconds, and we should be printing
more error and output information about the cause of the failure.
Fixes / triages #41902