7076 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
de0b01ed56 Merge pull request #136097 from atiratree/automated-cherry-pick-of-#135428-upstream-release-1.35
Automated cherry pick of #135428: schedule pod availability checks at the correct time in StatefulSets
2026-01-10 09:39:40 +05:30
Kubernetes Prow Robot
072e21c15a Merge pull request #136098 from jsafrane/automated-cherry-pick-of-#135629-upstream-release-1.35
Automated cherry pick of #135629: selinux: Fix the controller to ignore finished pods
2026-01-09 21:29:44 +05:30
Filip Křepinský
69e322920e mark QuotaMonitor as not running and invalidate monitors list
to prevent close of closed channel panic
2026-01-08 13:19:24 +01:00
Jan Safranek
923cb5be10 Add unit test with CSIDriver.SELinuxMount=false
Add unit test with a volume plugin that does not support SELinux. That
simulates a CSI driver whose spec.SELinuxMount is empty or false.

This requires a little refactoring, each unit test now has a flag if it
runs with a volume plugin that supports SELinux.
2026-01-08 11:15:04 +01:00
Jan Safranek
2aeedbd767 Use only enqueuePod to add pods to the controller queue
enqueuePod already creates the right key for a pod, it's better to reuse it
than copy the code around.
2026-01-08 11:15:04 +01:00
Jan Safranek
44b1306e55 Fix policy of Pods with unknown SELinux label
Reset SELinuxChangePolicy of Pods that have no SELinux label set to
Recursive. Kubelet cannot mount with `-o context=<label>`, if the label is
not known.

This fixes the e2e test error revealed by the previous commit - it changed the
e2e test to check for events when no events are expected and it found a
warning about a Pod with no label, but MountOption policy.
2026-01-08 11:15:04 +01:00
Jan Safranek
7d9af54b95 Add new unit tests
2026-01-08 11:15:04 +01:00
Jan Safranek
6edce1ddec Rework unit tests to builder pattern
2026-01-08 11:15:03 +01:00
Jan Safranek
b84206f5af selinux: Do not report conflicts with finished pods
When a Pod reaches its final state (Succeeded or Failed), its volumes are
getting unmounted and therefore their SELinux mount option will not
conflict with any other pod.

Let the SELinux controller monitor "pod updated" events to see that the pod
is finished.
2026-01-08 11:15:03 +01:00
Jan Safranek
9993d83107 refactoring: use a common function to enqueue Pod
addPod and deletePod have the same implementation, merge them into
enqueuePod
2026-01-08 11:15:03 +01:00
Filip Křepinský
802ed9eaa9 add StatefulSetAvailabilityCheck test
2026-01-08 11:03:05 +01:00
Filip Křepinský
04da1f09e5 replace "k8s.io/klog/v2/ktesting" with "k8s.io/kubernetes/test/utils/ktesting"
for advanced features (e.g. Eventually)
2026-01-08 11:03:05 +01:00
Filip Křepinský
f8578e8d8b schedule pod availability checks at the correct time in StatefulSets
2026-01-08 11:03:05 +01:00
Filip Křepinský
e7c2ecf799 wire now (time) to the availability checks in the StatefulSet controller
- this helps to make the controller reconciliation consistent
2026-01-08 11:03:05 +01:00
Heba
aceb89debc KEP-5471: Extend tolerations operators (#134665)
* Add numeric operations to tolerations

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* add default feature gate

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add integration tests

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add toleration value validation

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

* Add validate options for new operators

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove log

Signed-off-by: helayoty <heelayot@microsoft.com>

* Update feature gate check

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove IsValidNumericString func

Signed-off-by: helayoty <heelayot@microsoft.com>

* Implement IsDecimalInteger

Signed-off-by: helayoty <heelayot@microsoft.com>

* code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add logs to v1/toleration

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>

* Update integration tests and address code review feedback

Signed-off-by: helayoty <heelayot@microsoft.com>

* Add feature gate to the scheduler framework

Signed-off-by: helayoty <heelayot@microsoft.com>

* Remove extra test

Signed-off-by: helayoty <heelayot@microsoft.com>

* Fix integration test

Signed-off-by: helayoty <heelayot@microsoft.com>

* pass feature gate via TolerationsTolerateTaint

Signed-off-by: helayoty <heelayot@microsoft.com>

---------

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
Signed-off-by: helayoty <heelayot@microsoft.com>
2025-11-10 12:42:54 -08:00
Kubernetes Prow Robot
59d65dad34 Merge pull request #134945 from tchap/kcm-controllers-check-threads
pkg/controller: Improve goroutine management (part 2)
2025-11-06 00:43:01 -08:00
Kubernetes Prow Robot
50b4bcbab5 Merge pull request #134210 from yliaog/admit_quota
DRA extended resource quota
2025-11-06 00:42:53 -08:00
Kubernetes Prow Robot
6723beac00 Merge pull request #135154 from kubernetes/revert-134840-ahmet/mini-cleanup
Revert "controller: duplicate utility method cleanup"
2025-11-05 22:49:04 -08:00
Kubernetes Prow Robot
ca03752ee7 Merge pull request #135104 from mimowo/mutable-job-directives
Allow mutable job scheduling directives on suspended Jobs
2025-11-05 21:57:11 -08:00
Kubernetes Prow Robot
f025bcace9 Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
870062df4f adjusts DRA extended resource quota to include devices usages from regular resource claims
2025-11-05 23:24:24 +00:00
Maciej Szulik
499bff4ca4 Revert "controller: duplicate utility method cleanup"
2025-11-05 21:06:09 +01:00
Michał Woźniak
5a7c90fb76 Allow mutable scheduling directives for suspended Jobs
2025-11-05 19:37:33 +00:00
Patrick Ohly
60744fc8b9 DRA device taint eviction: track evicting rules
This avoids having to call the rule lister (which can fail theoretically, but
not in practice) and having to iterate over rules which can be ignored (might
be a small performance boost).
2025-11-05 20:03:17 +01:00
Patrick Ohly
9527987293 DRA device taint eviction: use NOP queue during simulation
It's slightly more efficient and a bit cleaner.
2025-11-05 20:03:17 +01:00
Patrick Ohly
eaee6b6bce DRA device taints: add separate feature gate for rules
Support for DeviceTaintRules depends on a significant amount of
additional code:
- ResourceSlice tracker is a NOP without it.
- Additional informers and corresponding permissions in scheduler and controller.
- Controller code for handling status.

Not all users necessarily need DeviceTaintRules, so adding a second feature
gate for that code makes it possible to limit the blast radius of bugs in that
code without having to turn off device taints and tolerations entirely.
2025-11-05 20:03:17 +01:00
Kubernetes Prow Robot
9ef1a14d68 Merge pull request #134840 from ahmetb/ahmet/mini-cleanup
controller: duplicate utility method cleanup
2025-11-05 08:06:58 -08:00
Kubernetes Prow Robot
9a192aa1c3 Merge pull request #134432 from Karthik-K-N/fix-sv-test
Fix storage version test flake
2025-11-05 06:56:52 -08:00
Ayato Tokubi
320987ead3 Addressed comments
2025-11-05 10:44:50 +00:00
Ayato Tokubi
5102591a6b Refactor resource claim metrics to use structured labels and add "source" dimension.
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 09:52:47 +00:00
Kubernetes Prow Robot
c1a6a3ca71 Merge pull request #134152 from pohly/dra-device-taints-1.35
DRA: device taints: new ResourceSlice API, new features
2025-11-04 15:32:07 -08:00
Ondra Kupka
024382658b controller/volume/vacprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
e08d03b1b5 controller/volume/selinuxwarning: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
1e6ad423bf controller/volume/pvprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
0caae6f704 controller/volume/pvcprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
ed74779a0f controller/volume/persistentvolume: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
8eab454e38 controller/volume/expand: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
27774052ab controller/volume/ephemeral: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
12205df76d controller/volume/attachdetach: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
9d4ff6ecf2 controller/tainteviction: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d2a443db75 controller/serviceaccount: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
c641df792b controller/resourcequota: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d908a470a5 controller/garbagecollector: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Kubernetes Prow Robot
97cb47a913 Merge pull request #135080 from dejanzele/feat/promote-job-managedby-to-ga
KEP-4368: Job Managed By; Promote to GA
2025-11-04 13:42:12 -08:00
Patrick Ohly
bbf8bc766e DRA device taints: DeviceTaintRule status
To update the right statuses, the controller must collect more information
about why a pod is being evicted. Updating the DeviceTaintRule statuses then is
handled by the same work queue as evicting pods.

Both operations already share the same client instance and thus QPS+server-side
throttling, so they might as well share the same work queue. Deleting pods is
not necessarily more important than informing users or vice-versa, so there is
no strong argument for having different queues.

While at it, switch the unit tests to use the same mock work queue as
in staging/src/k8s.io/dynamic-resource-allocation/internal/workqueue. Because
there is no time to add it properly to a staging repo, the implementation gets
copied.
2025-11-04 21:57:24 +01:00
Patrick Ohly
0689b628c7 generated files
2025-11-04 21:57:24 +01:00
Patrick Ohly
f4a453389d DRA device taint eviction: configurable number of workers
It might never be necessary to change the default, but it is hard to be sure.
It's better to have the option, just in case.
2025-11-04 21:57:24 +01:00
Kubernetes Prow Robot
a058cf788a Merge pull request #134624 from yt2985/podcertificates-beta
Promote Pod Certificates feature to beta
2025-11-04 11:42:12 -08:00
Dejan Zele Pejchev
3dabd4417d KEP-4368: Job Managed By; Promote to GA
Signed-off-by: Dejan Zele Pejchev <pejcev.dejan@gmail.com>
2025-11-04 10:59:45 +01:00
Kubernetes Prow Robot
d6aa2db57e Merge pull request #135027 from omerap12/remove-reactor-hpa
Remove unused delete reactor
2025-11-04 01:30:10 -08:00