Commit Graph

2539 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
900237fada Merge pull request #118635 from ffromani/devmgr-check-pod-running
kubelet: devices: skip allocation for running pods
2023-07-15 05:43:16 -07:00
Shiming Zhang
b2613dd381 Add e2e to check that hostIPs and Downward API works 2023-07-14 09:35:31 +08:00
Kubernetes Prow Robot
047d040ce7 Merge pull request #119012 from pohly/dra-batch-node-prepare
kubelet: support batched prepare/unprepare in v1alpha3 DRA plugin API
2023-07-12 10:57:37 -07:00
Patrick Ohly
d743c50bb9 kubelet: support batched prepare/unprepare in v1alpha3 DRA plugin API
Combining all prepare/unprepare operations for a pod enables plugins to
optimize the execution. Plugins can continue to use the v1beta2 API for now,
but should switch. The new API is designed so that plugins which want to work
on each claim one-by-one can do so and then report errors for each claim
separately, i.e. partial success is supported.
2023-07-12 14:50:30 +02:00
Francesco Romani
d78671447f e2e: node: add test to check device-requiring pods are cleaned up
Make sure orphanded pods (pods deleted while kubelet is down) are
handled correctly.
Outline:
1. create a pod (not static pod)
2. stop kubelet
3. while kubelet is down, force delete the pod on API server
4. restart kubelet
the pod becomes an orphaned pod and is expected to be killed by HandlePodCleanups.

There is a similar test already, but here we want to check device
assignment.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 13:25:36 +02:00
Francesco Romani
5cf50105a2 e2e: node: devices: improve the node reboot test
The recently added e2e device plugins test to cover node reboot
works fine if runs every time on CI environment (e.g CI) but
doesn't handle correctly partial setup when run repeatedly on
the same instance (developer setup).

To accomodate both flows, we extend the error management, checking
more error conditions in the flow.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 13:25:36 +02:00
Francesco Romani
b926aba268 e2e: node: devicemanager: update tests
Fix e2e device manager tests.
Most notably, the workload pods needs to survive a kubelet
restart. Update tests to reflect that.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 13:25:36 +02:00
Kubernetes Prow Robot
da61644869 Merge pull request #119179 from gjkim42/add-prestop-e2e-test
node-e2e: Add container lifecycle e2e tests for preStop hook
2023-07-11 10:33:23 -07:00
Kubernetes Prow Robot
86038ae590 Merge pull request #116846 from moshe010/e2e--node-pod-resources
kubelet pod-resources: add e2e for KubeletPodResourcesGet feature
2023-07-11 04:53:24 -07:00
Gunju Kim
8fb5b6eb4c node-e2e: Add container lifecycle e2e tests for preStop hook
This ensures that the container's pre-stop hook is invoked if the
startup or liveness probe fails.
2023-07-10 08:55:48 +09:00
Kubernetes Prow Robot
0e14098333 Merge pull request #116429 from SergeyKanzhelev/sidecar
Add SidecarContainers feature
2023-07-07 17:39:03 -07:00
Gunju Kim
03c2217687 Sidecar: Add e2e tests
Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
2023-07-08 07:26:12 +09:00
Davanum Srinivas
10dc1ca084 Skip GracefulNodeShutdown on older systemd versions
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-07-07 16:08:42 -04:00
Kubernetes Prow Robot
ddb2013363 Merge pull request #119103 from pohly/e2e-unexpected-args
e2e: detect unexpected command line arguments
2023-07-07 04:37:04 -07:00
Kubernetes Prow Robot
d48fc2ad2d Merge pull request #119035 from saschagrunert/critical-pod
Fix `should be able to create and delete a critical pod` test
2023-07-06 00:51:03 -07:00
Kubernetes Prow Robot
88e42040b1 Merge pull request #118981 from ffromani/e2e-podres-deflake
e2e: node: podresources: cooldown the rate limit
2023-07-05 12:00:50 -07:00
Patrick Ohly
16e9cc42c1 e2e node: remove unused test/e2e_node/gcp
The test package was not included anywhere and thus just dead code that doesn't
need to be maintained anymore.
2023-07-05 14:31:32 +02:00
Patrick Ohly
932d0337b8 e2e: detect unexpected command line arguments
Invalid flags are detected by flag parsing, but optional arguments are just
passed through to the E2E suites. None of them support any, so rejecting them
with an error message is useful because it helps catch typos (like a missing
hyphen before a flag).
2023-07-05 13:34:09 +02:00
Sascha Grunert
bcbc12cd79 Fix should be able to create and delete a critical pod test
The namespace the crictical pod was referring to was wrong, because it
was using the generated one instead of `kube-system`. This and the
resulting test condition is now fixed.

The test seems to run only in `ci-crio-cgroupv1-node-e2e-flaky` for now.

Closes https://github.com/kubernetes/kubernetes/issues/109296

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-07-03 11:15:59 +02:00
Davanum Srinivas
b01a4145b2 Install ecr-credential-provider during node e2e tests
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-30 09:18:44 -04:00
Kubernetes Prow Robot
3180aa4272 Merge pull request #118977 from dims/copy-container-logs-for-easier-debugging
Copy container logs for easier debugging
2023-06-29 12:09:43 -07:00
Davanum Srinivas
f96d83af66 Copy container logs for easier debugging
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-29 11:56:00 -04:00
Francesco Romani
dfc150ca18 e2e: node: podresources: cooldown the rate limit
We have a e2e test which want to get a rate limit error. To do so, we
sent an abnormally high amount of calls in a tight loop.

The relevant test per se is reportedly fine, but wwe need to play nicer
with *other* tests which may run just after and which need to query the API.
If the testsuite runs "too fast", it's possible an innocent test falls in the
same rate limit watch period which was saturated by the ratelimit test,
so the innocent test can still be throttled because the throttling period
is not exhausted yet, yielding false negatives, leading to flakes.

We can't reset the period for the rate limit, we just wait "long enough" to
make sure we absorb the burst and other legit queries are not rejected.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-06-29 17:40:36 +02:00
Davanum Srinivas
52ef833b6c Bump cadvisor version in tests to v0.47.2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-29 09:58:31 -04:00
Kubernetes Prow Robot
9516a25ce4 Merge pull request #118951 from dims/drop-docker.log-and-add-cloud-init-output.log
Drop docker.log and add cloud init output.log
2023-06-29 03:51:47 -07:00
Davanum Srinivas
931456a142 Simplify the node name for metrics - just use localhost
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-28 18:21:12 -04:00
Davanum Srinivas
3e5fafd57a Drop docker.log and add cloud-init-output.log
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-28 18:08:40 -04:00
Kubernetes Prow Robot
2190775b69 Merge pull request #118280 from stlaz/e2e_psa_labels
Set all PSa labels in tests
2023-06-28 11:14:43 -07:00
Kubernetes Prow Robot
470889278f Merge pull request #118910 from dims/better-url-for-scraping-metrics-from-kubelet
Better URL for scraping metrics from kubelet in node e2e tests
2023-06-27 16:00:41 -07:00
Kubernetes Prow Robot
48a6fb0c42 Merge pull request #118909 from dims/bump-to-latest-node-problem-detector-version-with-arm64
Bump to latest node-problem-detector version with arm64
2023-06-27 14:58:42 -07:00
Kubernetes Prow Robot
e4b3b4f20c Merge pull request #118904 from dims/cleanup-pods-at-the-end-in-pod-conditions-e2e-node-test
Cleanup pods at the end in Pod conditions e2e node test
2023-06-27 14:58:31 -07:00
Davanum Srinivas
a75b00ea39 Better URL for scraping metrics from kubelet
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-27 16:14:59 -04:00
Davanum Srinivas
685b0c5efa Bump to latest node-problem-detector version with arm64
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-27 16:04:12 -04:00
Kubernetes Prow Robot
76b2198da1 Merge pull request #118901 from dims/set-aws-specific-credential-provider-when-running-there
Set AWS specific credential provider when running there
2023-06-27 08:56:51 -07:00
Davanum Srinivas
6c587b43e9 Cleanup pods at the end in Pod conditions e2e node test
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-27 09:36:00 -04:00
Patrick Ohly
e7df337eba e2e: replace gomega.Equal(true/false) with gomega.BeTrue/BeFalse()
The failure message becomes a bit nicer. Found by the new ginkgolinter, for
example:

    test/e2e/windows/memory_limits.go:160:2: ginkgo-linter: wrong boolean assertion; consider using `gomega.Eventually(ctx, func() bool {
     eventList, err := f.ClientSet.CoreV1().Events(f.Namespace.Name).List(ctx, metav1.ListOptions{})
     ...
    }, 3*time.Minute, 10*time.Second).Should(gomega.BeTrue())` instead (ginkgolinter)
2023-06-27 14:20:20 +02:00
Patrick Ohly
8b33e8bdd1 e2e: fix gomega.Expect calls without assertions
"gomega.Expect" is not the same as "assert" in C: it always has to be combined
with a statement of what is expected.

Found with the new ginkgolinter, for example:

    test/e2e/node/pod_resize.go:242:3: ginkgo-linter: "Expect": missing assertion method. Expected "Should()", "To()", "ShouldNot()", "ToNot()" or "NotTo()" (ginkgolinter)
              gomega.Expect(found == true)
2023-06-27 14:20:20 +02:00
Davanum Srinivas
0ef1f2f2d8 Set AWS specific credential provider when running there
NOTE: we are not installing the ecr-credential-provider binary
itself here we are, we need to do it out-of-band from the test
suite itself before it runs.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-06-27 07:45:59 -04:00
Dr. Stefan Schimanski
764da8a01d FIXUP: cmd/kube-apiserver/app/options: split apart controlplane part 2023-06-26 21:50:38 +02:00
Dr. Stefan Schimanski
6e079545c4 cmd/kube-apiserver: move options completion into options package 2023-06-26 15:20:40 +02:00
Moshe Levi
38222014c6 kubelet pod-resources: add e2e for KubeletPodResourcesGet feature
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-06-26 08:10:24 +03:00
Michal Wozniak
e3ee9b9adc Fix the deletion of rejected pods 2023-06-22 09:18:34 +02:00
Kubernetes Prow Robot
a532733703 Merge pull request #118754 from pacoxu/fix-prometheus-client
fix metrics test with 1.16.0 prometheus client
2023-06-21 20:25:39 -07:00
Stanislav Laznicka
7f532891c9 e2e tests: set all PSa labels instead of just enforcing 2023-06-21 15:05:13 +02:00
Paco Xu
420fbd11e4 ignore Histogram for prometheus client v1.16.0 2023-06-21 09:38:05 +08:00
Paco Xu
bbae445d17 fix metrics test with 1.16.0 prometheus client 2023-06-20 16:46:31 +08:00
Ed Bartosh
a1e0aa0e50 DRA Node E2E: add NodeAlphaFeature to fix CI
Added NodeAlphaFeature:DynamicResourceAllocation to the Node DRA test
to fix failing containerd serial jobs. Those jobs skip tests labeled
with NodeAlphaFeature.
2023-06-17 13:12:40 +03:00
Kubernetes Prow Robot
e56002ab04 Merge pull request #118665 from bart0sh/PR119-DRA-E2E-remove-NodeFeature
DRA Node E2E: remove NodeFeature label
2023-06-14 18:10:18 -07:00
Ed Bartosh
a83edd35c4 DRA Node E2E: relabel test suite to fix CI
Removed NodeFeature:DynamicResourceAllocation label from the
tests to fix cos-cgroupv1/v2-containerd-node-e2e-serial CI jobs.

It turned out that labeling DRA Node tests as NodeFeature was
a mistake. Re-labeling with NodeAlphaFeature would not work either.
It would fail certain containerd jobs as DRA requires containerd >= 1.7
2023-06-14 20:46:24 +03:00
Kubernetes Prow Robot
99e050f88e Merge pull request #117597 from CoderSherlock/master
Added e2e_node test for sigkilled pods exit code and exit reason check
2023-06-14 10:34:29 -07:00