3547 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
2a8408811a Merge pull request #127064 from macsko/dont_panic_when_scheduling_queue_empty
Don't panic when popping from empty scheduling queue
2024-09-04 15:05:46 +01:00
Joe Betz
2595aa1309 generate 2024-09-03 14:26:26 -04:00
Kubernetes Prow Robot
4bc6a11d78 Merge pull request #127083 from sanposhiho/scheduler-smaller-event
feat: implement Pod smaller update events
2024-09-03 14:05:22 +01:00
Kensei Nakada
03e3779d40 feat: implement Pod smaller update events 2024-09-03 16:25:28 +09:00
Maciej Skoczeń
1f157bcb90 Don't panic when popping from empty scheduling queue 2024-09-02 12:12:19 +00:00
Kubernetes Prow Robot
e90364f45d Merge pull request #126465 from googs1025/podEligibleToPreemptOthers_refactor
feat: add ctx param for PodEligibleToPreemptOthers
2024-09-02 12:02:56 +01:00
Kubernetes Prow Robot
59051eb003 Merge pull request #126029 from sanposhiho/backoff-preenqueue
scheduler: impose a backoff penalty on gated Pods
2024-08-28 21:58:01 +01:00
Kensei Nakada
b5a156971f scheduler: impose a backoff penalty on gated Pods 2024-08-27 09:57:59 +09:00
Kensei Nakada
baf69640d3 fix(scheduler_one): call Done() as soon as possible 2024-08-27 09:30:47 +09:00
Kubernetes Prow Robot
072825f9a3 Merge pull request #126904 from sanposhiho/move-internal
chore: move the scheduler internal components out of internal dir
2024-08-26 20:51:02 +01:00
Kubernetes Prow Robot
0bcbc3b77a Merge pull request #124003 from carlory/scheduler-rm-non-csi-limit
kube-scheduler remove non-csi volumelimit plugins
2024-08-26 12:02:13 +01:00
Kensei Nakada
8519d3399f chore: move the scheduler internal components out of internal dir 2024-08-25 13:10:29 +09:00
Maciej Skoczeń
dc5e1a404f Reduce length of NodeTree logs 2024-08-23 08:28:13 +00:00
Kubernetes Prow Robot
e955c1d6a8 Merge pull request #126808 from macsko/move_activeq_fields_follow_up
Don't expose lock outside activeQueue in scheduling queue
2024-08-22 20:33:47 +01:00
Kubernetes Prow Robot
b1559c66ca Merge pull request #126807 from pohly/dra-resourceslice-update
DRA scheduler: ResourceSlice update
2024-08-22 15:18:09 +01:00
Maciej Skoczeń
eabdc612dd Use queue.Add instead of activeQ.AddOrUpdate in scheduling queue tests 2024-08-22 10:28:36 +00:00
Maciej Skoczeń
3eefd62f94 Make update and delete active queue methods 2024-08-22 09:26:05 +00:00
Maciej Skoczeń
9773a39b28 Don't expose lock outside activeQueue in scheduling queue 2024-08-22 09:21:35 +00:00
Patrick Ohly
e85d3babf0 DRA scheduler: fix re-scheduling after ResourceSlice changes
Making unschedulable pods schedulable again after ResourceSlice cluster events
was accidentally left out when adding structured parameters to Kubernetes 1.30.

All E2E tests were defined so that a driver starts first. A new test with a
different order (create pod first, wait for unschedulable, start driver)
triggered the bug and now passes.
2024-08-22 10:09:32 +02:00
Patrick Ohly
6dd2ade762 DRA scheduler: reduce log verbosity
That a pod with no claims remains unschedulable on claim changes is a pretty
normal case. It should only be logged when debugging.
2024-08-22 10:09:32 +02:00
Maciej Skoczeń
a7ad94f93b Unexport podRef in scheduling queue's nominator 2024-08-21 07:25:57 +00:00
Maciej Skoczeń
e303808896 Move scheduling queue's nominator to a separate file 2024-08-21 07:25:55 +00:00
Maciej Skoczeń
33815db3c1 Move NominatedPodsForNode to scheduling queue directly 2024-08-21 07:24:52 +00:00
Patrick Ohly
89e2feaf46 DRA scheduler: fix feature gate check for PodSchedulingContext event
The event is only relevant when DRAControlPlaneController (= "classic DRA") is
enabled.

This change has no effect in practice because the only plugin using this event,
the dynamic resource plugin, also checks feature gates when asking for events
and correctly only asks for PodSchedulingContext events when
DRAControlPlaneController is enabled.
2024-08-20 10:49:08 +02:00
Kubernetes Prow Robot
b8dcc2c983 Merge pull request #126802 from googs1025/fix/faker/scheduler_queue
[Flake Test] scheduler(queue): fix flake test for InFlightPods
2024-08-20 00:48:02 -07:00
googs1025
ff983bbfbf scheduler(queue): fix flake test for InFlightPods 2024-08-20 14:41:38 +08:00
Kubernetes Prow Robot
113b12c6fb Merge pull request #124439 from bells17/csi-translation-lib-structured-and-contextual-logging
Migrate k8s.io/csi-translation-lib/.* to structured logging
2024-08-19 18:13:54 -07:00
Maciej Skoczeń
8e630a9f68 Move activeQ related fields to separate struct in scheduling queue 2024-08-19 07:35:31 +00:00
googs1025
fc0fcd0044 feat: add ctx param for PodEligibleToPreemptOthers 2024-08-14 20:06:05 +08:00
Kubernetes Prow Robot
03e8154063 Merge pull request #126644 from Huang-Wei/fix-preemption
Fix a scheduler preemption issue where the victim isn't properly patched, leading to preemption not functioning as expected
2024-08-13 22:12:09 -07:00
Kubernetes Prow Robot
5e2cead785 Merge pull request #126534 from googs1025/scheduler_cleanup
scheduler: use logger instead of new klog.FromContext(ctx)
2024-08-13 22:10:56 -07:00
Kubernetes Prow Robot
5b95fdb374 Merge pull request #126476 from pohly/scheduler-framework-filter-docs
scheduler: document behavior of Error status returned by Filter
2024-08-13 22:10:42 -07:00
Kubernetes Prow Robot
af782d05aa Merge pull request #126292 from googs1025/scheduler_ut
scheduler(ut): call close method when finish profileMap in ut
2024-08-13 21:03:24 -07:00
Kubernetes Prow Robot
6cf49df138 Merge pull request #126158 from macsko/use_generics_in_scheduling_queue_heap
Use generics in scheduling queue's heap
2024-08-13 21:03:02 -07:00
Kubernetes Prow Robot
ea1143efc7 Merge pull request #126022 from macsko/new_node_to_status_map_structure
Change structure of NodeToStatus map in scheduler
2024-08-13 21:02:55 -07:00
Toru Komatsu
a7242fcff7 Implement PVC/Add QueueingHint in CSILimit plugin (#124703)
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-08-13 21:02:42 -07:00
Wei Huang
f6a11da279 fix a scheduler preemption issue that victim is not patched properly 2024-08-12 15:25:10 -07:00
carlory
cba2b3f773 kube-scheduler remove non-csi volumelimit plugins 2024-08-05 15:02:32 +08:00
googs1025
6427243676 use logger instead of new klog.FromContext(ctx) 2024-08-04 21:09:02 +08:00
Patrick Ohly
d71d59b91e scheduler: document behavior of Error status returned by Filter
This behavior was useful for https://github.com/kubernetes/kubernetes/pull/125488 but
wasn't obvious when reading the documentation.
2024-07-31 08:55:46 +02:00
Maciej Skoczeń
98be7dfc5d Change structure of NodeToStatus map in scheduler 2024-07-25 07:48:35 +00:00
Maciej Skoczeń
6b33e2e632 Use generics in scheduling queue's heap 2024-07-24 06:55:47 +00:00
googs1025
9eaede70ed call close method when finish profileMap in ut 2024-07-24 13:15:07 +08:00
Kubernetes Prow Robot
39a80796b6 Merge pull request #122628 from sanposhiho/pod-smaller-events
add(scheduler/framework): implement smaller Pod update events
2024-07-23 18:01:46 -07:00
Kubernetes Prow Robot
a00181d4d4 Merge pull request #121902 from carlory/kep-3751-pv-controller
[kep-3751] pvc bind pv with vac
2024-07-23 11:02:13 -07:00
Kubernetes Prow Robot
43691598da Merge pull request #126227 from sanposhiho/queueing_hint_execution_duration_seconds
feature: support queueing_hint_execution_duration_seconds metric
2024-07-23 02:12:29 -07:00
Kensei Nakada
3f59d9fc4c fix typo 2024-07-23 17:43:21 +09:00
carlory
3a6a4830df pvc bind pv with vac 2024-07-23 15:04:11 +08:00
Kubernetes Prow Robot
d21b17264e Merge pull request #125488 from pohly/dra-1.31
DRA for 1.31
2024-07-22 11:45:55 -07:00
Patrick Ohly
9f36c8d718 DRA: add DRAControlPlaneController feature gate for "classic DRA"
In the API, the effect of the feature gate is that alpha fields get dropped on
create. They get preserved during updates if already set. The
PodSchedulingContext registration is *not* restricted by the feature gate.
This enables deleting stale PodSchedulingContext objects after disabling
the feature gate.

The scheduler checks the new feature gate before setting up an informer for
PodSchedulingContext objects and when deciding whether it can schedule a
pod. If any claim depends on a control plane controller, the scheduler bails
out, leading to:

    Status:       Pending
    ...
      Warning  FailedScheduling             73s   default-scheduler  0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.

The rest of the changes prepare for testing the new feature separately from
"structured parameters". The goal is to have base "dra" jobs which just enable
and test those, then "classic-dra" jobs which add DRAControlPlaneController.
2024-07-22 18:09:34 +02:00