Commit Graph

132375 Commits

Author SHA1 Message Date
Patrick Ohly
aefd2effc5 test: automatically lower Ginkgo parallelism when using race detection
When race detection is enabled, merely running 25 e2e.test instances was too much
and the OOM killer shut down the Prow test pod because of the memory overhead.

A CI job could control that via GINKGO_PARALLEL_NODES, but we should also have
saner defaults which take this into account.
2025-09-16 20:25:53 +02:00
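The default selection described in the commit above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the function name `parallelNodes` and the lowered value 8 are hypothetical, while GINKGO_PARALLEL_NODES and the 25 instances come from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parallelNodes picks how many parallel e2e.test instances to start.
// explicit is the value of GINKGO_PARALLEL_NODES ("" when unset); a valid
// explicit value always wins. Otherwise the default depends on whether
// race detection is enabled. Both defaults are parameters because the
// actual lowered value is not stated in the commit message.
func parallelNodes(explicit string, raceEnabled bool, normalDefault, raceDefault int) int {
	if n, err := strconv.Atoi(explicit); err == nil && n > 0 {
		return n
	}
	if raceEnabled {
		return raceDefault
	}
	return normalDefault
}

func main() {
	explicit := os.Getenv("GINKGO_PARALLEL_NODES")
	// 8 is a hypothetical lowered default, chosen only for this sketch.
	fmt.Println("without race detection:", parallelNodes(explicit, false, 25, 8))
	fmt.Println("with race detection:   ", parallelNodes(explicit, true, 25, 8))
}
```

Keeping the explicit setting ahead of both defaults matches the message: a CI job can still control the value, and the default only matters when it does not.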
Patrick Ohly
7a62519b36 E2E: treat data races in e2e suite as failures
Ginkgo itself doesn't do this, so prune-junit-xml drops the output and
Spyglass doesn't show the test as failed. If the data race warning is
captured, we now treat it as a test failure unless the test has already
failed for other reasons.

While at it, the entire report cleanup gets moved to our junit package.
2025-09-16 19:34:36 +02:00
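A minimal sketch of the idea in the commit above, assuming the captured output of each test is available as a string. The `testResult` type and `markRaces` function are hypothetical and do not reflect the actual junit package; the marker text is what the Go race detector prints at the start of a report.

```go
package main

import (
	"fmt"
	"strings"
)

// raceMarker is the first line of a Go race detector report.
const raceMarker = "WARNING: DATA RACE"

// testResult is a hypothetical, simplified view of one reported test.
type testResult struct {
	Name    string
	Failed  bool
	Message string
	Output  string // captured stdout/stderr of the test
}

// markRaces turns a passing test into a failed one when its captured
// output contains a data race warning. Tests that already failed keep
// their original failure message.
func markRaces(results []testResult) {
	for i := range results {
		r := &results[i]
		if !r.Failed && strings.Contains(r.Output, raceMarker) {
			r.Failed = true
			r.Message = "data race detected, see test output"
		}
	}
}

func main() {
	results := []testResult{
		{Name: "clean", Output: "ok"},
		{Name: "racy", Output: "==================\nWARNING: DATA RACE\nRead at ..."},
		{Name: "broken", Failed: true, Message: "timeout", Output: "WARNING: DATA RACE"},
	}
	markRaces(results)
	for _, r := range results {
		fmt.Printf("%s failed=%v message=%q\n", r.Name, r.Failed, r.Message)
	}
}
```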
Patrick Ohly
9b696ff58c build: also support KUBE_RACE for test binaries
This is relevant for building e2e.test in
pull-kubernetes-e2e-kind-alpha-beta-features-race: if it contains data races,
tests might be flaky.
2025-09-09 17:10:10 +02:00
Kubernetes Prow Robot
a8905a154b Merge pull request #133179 from nmn3m/fix-strings-title
Replace deprecated strings.Title with cases.Title
2025-09-09 05:53:30 -07:00
Kubernetes Prow Robot
1bec132e1e Merge pull request #133939 from pohly/scheduler-perf-testing-B-metrics
scheduler_perf: reset and stop testing.B metrics
2025-09-09 02:59:31 -07:00
Kubernetes Prow Robot
ddd8e70b1e Merge pull request #133933 from zhifei92/cleanup-duplicate-logs
Clean up duplicate logs
2025-09-08 14:05:43 -07:00
Kubernetes Prow Robot
01f11bfa24 Merge pull request #133930 from bart0sh/PR198-migrate-stats-to-contextual-logging
chore(kubelet): migrate stats to contextual logging
2025-09-08 14:05:36 -07:00
Kubernetes Prow Robot
a3fcc7070e Merge pull request #129768 from liggitt/delete-finalizer-race
protect against race between deletion and adding finalizers
2025-09-08 14:05:28 -07:00
Kubernetes Prow Robot
164e467deb Merge pull request #130157 from zhifei92/migrate-kubelet-metrics-to-contextual-logging
chore(kubelet): migrate metrics to contextual logging
2025-09-08 12:21:27 -07:00
Kubernetes Prow Robot
29218e6d07 Merge pull request #131826 from yanhuan0802/bugfix-detach-typo
fix typo for forceDetachTimeoutExpired
2025-09-08 11:23:29 -07:00
Patrick Ohly
af6da561dd scheduler_perf: reset and stop testing.B metrics
Before, metrics gathered by testing.B (runtime_seconds,
-benchmem's B/op and allocs/op) covered the entire test case, including
starting the apiserver and the initialization steps of a workload. Now those
metrics are also limited to the period where the workload is configured to
collect metrics.
2025-09-08 19:17:24 +02:00
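The effect described in the commit above can be illustrated with the standard testing.B timer controls. This is a sketch under assumptions, not the scheduler_perf code: `setup` and `workload` are hypothetical placeholders for starting the apiserver and running the measured part of a workload.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

var sink int // keeps the compiler from discarding the workload result

// setup is a hypothetical stand-in for expensive initialization.
func setup() { time.Sleep(2 * time.Millisecond) }

// workload is a hypothetical stand-in for the measured operation.
func workload() int {
	sum := 0
	for i := 0; i < 1000; i++ {
		sum += i
	}
	return sum
}

// benchmarkMeasuredOnly limits testing.B's own metrics (time per op and,
// with ReportAllocs, B/op and allocs/op) to the measured period.
func benchmarkMeasuredOnly(b *testing.B) {
	setup()
	b.ReportAllocs()
	b.ResetTimer() // discard time and allocations spent in setup
	for i := 0; i < b.N; i++ {
		sink = workload()
	}
	b.StopTimer() // anything after this point is not measured
	setup()       // stands in for teardown work
}

// measure runs the benchmark without the "go test" command.
func measure() testing.BenchmarkResult {
	return testing.Benchmark(benchmarkMeasuredOnly)
}

func main() {
	r := measure()
	fmt.Println("iterations:", r.N)
	fmt.Println("ns/op:", r.NsPerOp())
}
```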
Kubernetes Prow Robot
447ca5ff02 Merge pull request #133898 from HadrienPatte/client-go/compatibility-matrix
Update client-go compatibility matrix to include releases up to 1.34
2025-09-08 09:23:28 -07:00
Kubernetes Prow Robot
90b03f1af0 Merge pull request #133910 from bitoku/fix-graceful-shutdown
Fix GracefulNodeShutdown perma failing test
2025-09-08 07:39:38 -07:00
Kubernetes Prow Robot
08946ca255 Merge pull request #132606 from Peac36/fix/132539
add paths section to scheduler statusz endpoint
2025-09-08 07:39:31 -07:00
zhangzhifei16
39170e2ed6 chore: Clean up duplicate logs
2025-09-08 21:37:13 +08:00
Kubernetes Prow Robot
597a684bb0 Merge pull request #133172 from ania-borowiec/move_handle_and_plugin
Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler
2025-09-08 06:05:31 -07:00
Lan
cfeeff7ace chore(kubelet): migrate stats to contextual logging
Signed-off-by: Lan <gcslyp@gmail.com>
Co-Authored-By: Ed Bartosh <eduard.bartosh@intel.com>
2025-09-08 15:35:02 +03:00
Ayato Tokubi
5ed98e97e1 Remove getLocalNode to fix GracefulNodeShutdown e2e.
getLocalNode tried to get a ready node and failed if there was none.
The e2e test sends a termination signal to the kubelet, so having no ready nodes is expected. Because of this, the e2e test was permafailing.

Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-09-08 12:20:55 +00:00
Nikola
b42b96f518 add paths section to scheduler statusz endpoint
Signed-off-by: Nikola <peac36@abv.bg>
2025-09-08 13:13:42 +03:00
Kubernetes Prow Robot
d95e07d0f8 Merge pull request #133926 from pohly/dra-kubelet-grpc-idle
DRA kubelet: avoid deadlock when gRPC connection to driver goes idle
2025-09-08 01:09:28 -07:00
Patrick Ohly
06c5eb992e DRA kubelet: avoid deadlock when gRPC connection to driver goes idle
When gRPC notifies the kubelet that a connection ended, the kubelet tries to
reconnect because it needs to know when a DRA driver comes back. The same code
gets called when a connection goes idle, by default after 30 minutes. In that
and only that case the conn.Connect call deadlocks while calling into the gRPC
idle manager.

This can be reproduced with a new unit test which artificially shortens the
idle timeout. The fix is to move the Connect call into a goroutine, because
then both HandleConn and Connect can proceed. It's sufficient that Connect
finishes at some point; it doesn't need to happen immediately.
2025-09-08 08:59:55 +02:00
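The shape of the fix in the commit above can be modeled without gRPC. The sketch below is an assumption-laden analogy, not the kubelet or gRPC code: `manager`, `notify`, and `connect` are hypothetical, and a plain mutex stands in for whatever the idle manager holds while the callback runs. It only shows why deferring the call to a goroutine lets both sides proceed.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// manager is a hypothetical component that invokes a callback while
// holding an internal lock.
type manager struct {
	mu        sync.Mutex
	connected bool
}

// notify runs handler with the lock held, similar to a callback fired
// from inside a state transition.
func (m *manager) notify(handler func()) {
	m.mu.Lock()
	defer m.mu.Unlock()
	handler()
}

// connect needs the same lock, so calling it directly from inside the
// handler would block forever.
func (m *manager) connect() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.connected = true
}

func (m *manager) isConnected() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.connected
}

// reconnectAsync handles the notification by starting connect in a
// goroutine. The handler returns immediately, notify releases the lock,
// and connect then finishes at some later point. The returned channel is
// closed once connect is done.
func reconnectAsync(m *manager) <-chan struct{} {
	done := make(chan struct{})
	m.notify(func() {
		go func() {
			defer close(done)
			m.connect()
		}()
	})
	return done
}

func main() {
	m := &manager{}
	select {
	case <-reconnectAsync(m):
		fmt.Println("connected:", m.isConnected())
	case <-time.After(time.Second):
		fmt.Println("timed out")
	}
}
```

Calling `m.connect()` synchronously inside the handler would try to take the lock that `notify` still holds, which is the kind of self-deadlock the goroutine avoids.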
zhangzhifei16
d38c1df3f3 chore(kubelet): migrate metrics to contextual logging.
2025-09-08 10:53:42 +08:00
Huan Yan
7aa6cabd63 fix typo for forceDetachTimeoutExpired
2025-09-07 16:37:34 +08:00
Kubernetes Prow Robot
d9b31d602d Merge pull request #133893 from HirazawaUi/close-connections
Kubeadm: Close container runtime connections after use
2025-09-06 01:35:24 -07:00
Kubernetes Prow Robot
cca2ff05f7 Merge pull request #133838 from macsko/fix_race_in_movepodstoactiveorbackoffq
Fix race in movePodsToActiveOrBackoffQueue
2025-09-06 00:31:25 -07:00
HirazawaUi
8118636321 Close container runtime connections after use
2025-09-06 14:42:36 +08:00
Kubernetes Prow Robot
b508767369 Merge pull request #132655 from ylink-lfs/ci/httpd_removal
ci: remove httpd usage while using agnhost instead
2025-09-05 20:23:24 -07:00
Kubernetes Prow Robot
4786451d81 Merge pull request #130213 from zhifei92/migrate-kubelet-container-to-contextual-logging
chore(kubelet): migrate container to contextual logging
2025-09-05 15:41:25 -07:00
Kubernetes Prow Robot
e8b19be173 Merge pull request #133440 from carlory/deflake-service-tests
deflake e2e test: Services should implement NodePort and HealthCheckNodePort correctly when ExternalTrafficPolicy changes
2025-09-05 14:37:42 -07:00
Kubernetes Prow Robot
3b687533aa Merge pull request #133217 from bart0sh/PR187-migrate-utils-logs-to-contextual-logging
Kubelet: Migrate util/ and logs/ to contextual logging
2025-09-05 14:37:34 -07:00
Kubernetes Prow Robot
1166dcb0ef Merge pull request #130154 from zhifei92/watchdog-migrate-structured-logging
chore(kubelet): migrate watchdog to contextual logging
2025-09-05 14:37:26 -07:00
Kubernetes Prow Robot
dc5066c229 Merge pull request #133888 from omerap12/fix-external-metrics-overflow
hpa: prevent integer overflow in external metrics sum
2025-09-05 13:45:32 -07:00
Kubernetes Prow Robot
b9c467483e Merge pull request #133690 from pohly/log-client-go-leaderelection
client-go leaderelection: structured, contextual logging
2025-09-05 13:45:25 -07:00
Kubernetes Prow Robot
edcac86acb Merge pull request #133907 from ardaguclu/fix-wait-prefix
Remove redundant experimental prefix in wait command
2025-09-05 11:13:32 -07:00
Kubernetes Prow Robot
c02a198b55 Merge pull request #130487 from zhifei92/migrate-kubelet-prober-to-contextual-logging
chore(kubelet): migrate prober to contextual logging
2025-09-05 11:13:25 -07:00
zhangzhifei16
80e6f9e20a chore(kubelet): migrate container to contextual logging
fix: the failed ci
2025-09-05 21:55:15 +08:00
Kubernetes Prow Robot
ef4add4509 Merge pull request #133356 from mayuka-c/issue-133175
Replace usage of deprecated ErrWaitTimeout with recommended method across all Pkgs
2025-09-05 06:43:34 -07:00
Kubernetes Prow Robot
5988bf7f36 Merge pull request #133067 from tchap/kubectl-logs-ctx
kubectl/logs: Add LogOptions.RunLogsContext
2025-09-05 06:43:27 -07:00
zhangzhifei16
f1b28b0d1f chore(kubelet): migrate watchdog to contextual logging
fix: fix failed typecheck

fix unit test
2025-09-05 21:40:44 +08:00
Arda Güçlü
98f81fc291 Remove redundant experimental prefix in wait command
2025-09-05 16:07:54 +03:00
Kubernetes Prow Robot
078a8f1894 Merge pull request #130581 from zhifei92/migrate-kubelet-config-to-contextual-logging
chore(kubelet): migrate config to contextual logging
2025-09-05 05:25:26 -07:00
Kubernetes Prow Robot
26e94a35d3 Merge pull request #133900 from lucming/typo-sattsfied
fix typo for sattsfied
2025-09-05 03:39:25 -07:00
Omer Aplatony
d75d4860e7 kubelet: migrate module logs to contextual logging
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
2025-09-05 13:22:00 +03:00
Omer Aplatony
9c1cf79d74 kubelet: migrate utils to contextual logging
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
2025-09-05 13:19:56 +03:00
Kubernetes Prow Robot
f278302957 Merge pull request #133894 from lalitc375/k8s-name
Add k8s-long-name and k8s-short-name format validation tags
2025-09-05 01:01:36 -07:00
Kubernetes Prow Robot
c4fdf5f027 Merge pull request #133239 from Peac36/fix/133184
add paths section to kubelet statusz endpoint
2025-09-05 01:01:29 -07:00
Kubernetes Prow Robot
56f6358c11 Merge pull request #133890 from huww98/fix-volume-metrics
kubelet/metrics: fix multiple Register call
2025-09-04 23:23:00 -07:00
Lalit Chauhan
c88f2f3142 Add k8s-long-name, k8s-short-name format validation tags
2025-09-05 04:50:40 +00:00
lucming
c8681531ab fix typo for sattsfied
2025-09-05 11:47:18 +08:00
Kubernetes Prow Robot
ecf2c52f75 Merge pull request #133729 from HirazawaUi/add-HirazawaUi-to-reviewer
Self nominate HirazawaUi as sig-node reviewer
2025-09-04 19:21:27 -07:00