Commit Graph

2580 Commits

Author SHA1 Message Date
Fan Shang Xiang
8d9517318a Extend npd e2e timeout to fix npd e2e error 2023-08-29 17:22:28 +08:00
Kubernetes Prow Robot
232d343d58 Merge pull request #119969 from saschagrunert/cni-plugins
Update CNI plugins to v1.3.0
2023-08-23 12:41:57 -07:00
Dixita Narang
d2dbc583a0 Adding coverage for OOM Kill scenario due to node allocatable memory limits, when pod level memory limits are not set 2023-08-22 00:45:17 +00:00
Davanum Srinivas
3e9a4c15a8 Restrict what imports get into code within test/e2e_node
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-08-21 15:04:23 -04:00
Kubernetes Prow Robot
4dee8398ae Merge pull request #120078 from tzneal/investigate-test-failure
expect the new resource_scape_error metric
2023-08-21 04:13:34 -07:00
Todd Neal
b8512cfe24 expect the new resource_scape_error metric 2023-08-20 14:17:54 -05:00
Todd Neal
905f07f1ac Revert "mark the OOM killer as serial to reduce flakes"
This reverts commit bd6f548746.

Running as serial didn't completely eliminate the flake, so I think
there's something more going on here. Reverting the change to serial
since it's not a solution.
2023-08-20 13:38:07 -05:00
Todd Neal
bd6f548746 mark the OOM killer as serial to reduce flakes
In testing I could only reproduce the flake by running stress-ng to load
the CPU. Running it as serial should reduce and hopefully eliminate the
flakiness.
2023-08-18 13:18:50 -05:00
Todd Neal
577197559a remove the legacy test dependency
This removes the import which added a bunch of apparently
old failing tests.
2023-08-17 12:54:20 -05:00
Sascha Grunert
7933368460 Update CNI plugins to v1.3.0
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-08-17 09:50:53 +02:00
Kubernetes Prow Robot
4d166947cf Merge pull request #119097 from pacoxu/fix-eviction-pid
PIDPressure condition is triggered slow on CRI-O with large PID pressure/heavy load
2023-08-16 16:36:19 -07:00
Kubernetes Prow Robot
88d14edc26 Merge pull request #119197 from saschagrunert/stop-container-runtime-err
Check dbus error on container runtime start/stop
2023-08-16 15:27:52 -07:00
Kubernetes Prow Robot
b1e35d5616 Merge pull request #119974 from tzneal/bump-busybox-test-version
bump the busybox test version to resolve test failures
2023-08-16 12:44:13 -07:00
Kubernetes Prow Robot
dd44792cec Merge pull request #119880 from saschagrunert/seccomp-filter
Make seccomp status checks in e2e tests more robust
2023-08-16 12:43:54 -07:00
Todd Neal
b75c5d33e5 bump the busybox test version to resolve test failures
- bump busybox version
- specify the path to /bin/sleep to avoid calling a new shell
  builtin
2023-08-16 08:50:20 -05:00
Kubernetes Prow Robot
c41c448b80 Merge pull request #119890 from tzneal/containers-lifecycle-flake
crio: increase test buffer to eliminate test flakes
2023-08-15 23:13:45 -07:00
Kubernetes Prow Robot
061ae8a68b Merge pull request #119765 from tzneal/detect-nfsv3-and-change-mount-path
fix mirror pod nfs test failure due to differing NFS versions
2023-08-15 23:12:44 -07:00
Kubernetes Prow Robot
3111fee8bf Merge pull request #119670 from lengrongfu/fix/oomkill-multi-target-container
fix OOM killer
2023-08-15 19:43:40 -07:00
Kubernetes Prow Robot
3525255622 Merge pull request #119212 from CoderSherlock/master
Added oomkill test for init container and fix typos
2023-08-15 15:17:48 -07:00
Todd Neal
e258228e4a use a buffer equivalent to grace period to eliminate test flakes
This modifies the test to wait up to 2x the grace period for the pod to
be removed.
2023-08-11 14:08:11 -05:00
Todd Neal
717c149a73 fix mirror pod nfs test failure due to differing NFS versions
/exports *(rw,fsid=0,insecure,no_root_squash)

can be mounted as `/exports` using NFSv3 and `/` using NFSv4

Mount as '/', since clients that support both can try both.
2023-08-11 07:27:05 -05:00
Sascha Grunert
8ab6bee676 Make seccomp status checks in e2e tests more robust
The tests were introduced in
ca7be7dc6d
and have checked for `ecc` in `/proc/self/status` since their creation.

We got a new field `Seccomp_filters:` with the Linux commit
c818c03b66,
which means that `ecc` would now match both fields and could interfere
with test results depending on the host.

The field `Seccomp:` was introduced in
2f4b3bf6b2
and has never changed since then, which means we can match it directly
to make the tests stricter.

Refers to https://github.com/kubernetes-sigs/cri-tools/pull/1236

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-08-10 09:51:03 +02:00
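The difference between matching the substring `ecc` and matching the exact field `Seccomp:` can be illustrated with a small sketch. It is not the test code from the commit: `seccompMode`, `countMatches`, and the sample status text are assumptions made up for the example, and the sample only imitates the layout of `/proc/self/status`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// seccompMode returns the value of the `Seccomp:` field from status text in
// the /proc/<pid>/status layout. The second result is false if the field is
// absent. (Hypothetical helper for illustration.)
func seccompMode(status string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(status))
	for scanner.Scan() {
		line := scanner.Text()
		// Match the full field name including the colon, so that
		// `Seccomp_filters:` is not picked up.
		if strings.HasPrefix(line, "Seccomp:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "Seccomp:")), true
		}
	}
	return "", false
}

// countMatches counts the lines of status that contain substr.
func countMatches(status, substr string) int {
	n := 0
	for _, line := range strings.Split(status, "\n") {
		if strings.Contains(line, substr) {
			n++
		}
	}
	return n
}

func main() {
	// Made-up sample in the style of a newer kernel's status file.
	sample := "Name:\tsleep\nNoNewPrivs:\t0\nSeccomp:\t2\nSeccomp_filters:\t1\n"

	// The loose substring check hits both Seccomp lines.
	fmt.Println("lines matching `ecc`:", countMatches(sample, "ecc"))

	// The strict field check yields exactly one value.
	mode, ok := seccompMode(sample)
	fmt.Println("Seccomp field:", mode, "found:", ok)
}
```

With the sample above, the substring check matches two lines, while the strict check returns the single `Seccomp:` value.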
lengrongfu
c23cee1be3 fix OOM killer
Signed-off-by: lengrongfu <rongfu.leng@daocloud.io>
2023-07-30 11:16:12 +08:00
Davanum Srinivas
b4ef4015a2 Avoid pulling mounter.tar through the CDN
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-07-28 22:15:55 -04:00
upodroid
a65d207507 calculate the correct machine-type 2023-07-26 23:10:06 +00:00
upodroid
7d13c9b096 set map to nil if an empty string is passed 2023-07-26 10:32:27 +03:00
upodroid
1c99f9591b add node-env and instance-type flags to node-e2e tests 2023-07-21 21:46:37 +00:00
Gunju Kim
e0a6eb93a1 node_e2e: Fix createStaticSystemNodeCriticalPod's invalid spec
This fixes `createStaticSystemNodeCriticalPod` to set pod's
restartPolicy instead of container's restartPolicy.
2023-07-20 20:18:05 +09:00
Itamar Holder
ee82654e39 Add pod_swap_usage_bytes as an expected metric in e2e test
Use haveKeys() matcher from previous commit to ensure
required keys exist.

Signed-off-by: Itamar Holder <iholder@redhat.com>
2023-07-19 14:44:05 +03:00
Itamar Holder
81abfca407 Add a haveKeys() helper function to match multiple keys
Signed-off-by: Itamar Holder <iholder@redhat.com>
2023-07-19 14:44:04 +03:00
Kubernetes Prow Robot
b4d793c450 Merge pull request #118865 from iholder101/kubelet/add-swap-to-summary-stats
Add swap to stats to Summary API and Prometheus endpoints (`/stats/summary` and `/metrics/resource`)
2023-07-17 19:49:18 -07:00
Kubernetes Prow Robot
da2fdf8cc3 Merge pull request #118764 from iholder101/Swap/burstableQoS-impl
Add full cgroup v2 swap support with automatically calculated swap limit for LimitedSwap and Burstable QoS Pods
2023-07-17 19:49:07 -07:00
Kubernetes Prow Robot
d17f3ba2cf Merge pull request #119168 from gjkim42/sidecar-allow-probes-and-lifecycle-hooks
Allow all probes and lifecycle for restartable init containers
2023-07-17 18:11:07 -07:00
Itamar Holder
4cb5547f93 Adjust summary API e2e test
Signed-off-by: Itamar Holder <iholder@redhat.com>
2023-07-18 02:55:56 +03:00
Gunju Kim
3bf282652f Allow restartable init containers to have lifecycle 2023-07-18 08:12:24 +09:00
Kubernetes Prow Robot
92856db662 Merge pull request #118973 from ffromani/kubelet-podresources-getallocatable-ga
node: podresources: getallocatable: move to GA
2023-07-17 13:47:33 -07:00
Paco Xu
709eb6c030 eviction for pid trigger PIDPressure condition slowly on CRI-O
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2023-07-17 16:34:28 +08:00
Kubernetes Prow Robot
900237fada Merge pull request #118635 from ffromani/devmgr-check-pod-running
kubelet: devices: skip allocation for running pods
2023-07-15 05:43:16 -07:00
Itamar Holder
619be9c153 Add a swap e2e test
Signed-off-by: Itamar Holder <iholder@redhat.com>
2023-07-14 14:52:28 +03:00
Shiming Zhang
b2613dd381 Add e2e to check that hostIPs and Downward API work 2023-07-14 09:35:31 +08:00
CoderSherlock
b7cbebcd03 Added oomkill test for init container and fix typos 2023-07-13 17:19:34 +00:00
Kubernetes Prow Robot
047d040ce7 Merge pull request #119012 from pohly/dra-batch-node-prepare
kubelet: support batched prepare/unprepare in v1alpha3 DRA plugin API
2023-07-12 10:57:37 -07:00
Patrick Ohly
d743c50bb9 kubelet: support batched prepare/unprepare in v1alpha3 DRA plugin API
Combining all prepare/unprepare operations for a pod enables plugins to
optimize the execution. Plugins can continue to use the v1beta2 API for now,
but should switch. The new API is designed so that plugins which want to work
on each claim one-by-one can do so and then report errors for each claim
separately, i.e. partial success is supported.
2023-07-12 14:50:30 +02:00
Francesco Romani
01c3a51a78 node: podresources: getallocatable: move to GA
lock the feature gate to GA, and remove the now-redundant code.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 14:11:22 +02:00
Francesco Romani
d78671447f e2e: node: add test to check device-requiring pods are cleaned up
Make sure orphaned pods (pods deleted while kubelet is down) are
handled correctly.
Outline:
1. create a pod (not static pod)
2. stop kubelet
3. while kubelet is down, force delete the pod on API server
4. restart kubelet
the pod becomes an orphaned pod and is expected to be killed by HandlePodCleanups.

There is a similar test already, but here we want to check device
assignment.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 13:25:36 +02:00
Francesco Romani
5cf50105a2 e2e: node: devices: improve the node reboot test
The recently added e2e device plugin test that covers node reboot
works fine if it runs on a fresh environment every time (e.g. CI) but
doesn't correctly handle partial setup when run repeatedly on
the same instance (developer setup).

To accommodate both flows, we extend the error management, checking
more error conditions in the flow.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 13:25:36 +02:00
Francesco Romani
b926aba268 e2e: node: devicemanager: update tests
Fix e2e device manager tests.
Most notably, the workload pods need to survive a kubelet
restart. Update tests to reflect that.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-07-12 13:25:36 +02:00
Kubernetes Prow Robot
da61644869 Merge pull request #119179 from gjkim42/add-prestop-e2e-test
node-e2e: Add container lifecycle e2e tests for preStop hook
2023-07-11 10:33:23 -07:00
Kubernetes Prow Robot
86038ae590 Merge pull request #116846 from moshe010/e2e--node-pod-resources
kubelet pod-resources: add e2e for KubeletPodResourcesGet feature
2023-07-11 04:53:24 -07:00
Sascha Grunert
3bae26ae58 Check dbus error on container runtime start/stop
We should evaluate the error, otherwise we risk hanging indefinitely
waiting for the `reschan` in:

64939b66c6/test/e2e_node/util.go (L419)

We also increase the timeout, because it can take a bit longer for
runtimes to terminate depending on the work they have to do on
running containers.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-07-10 13:45:40 +02:00