Commit Graph

88 Commits

Author SHA1 Message Date
Patrick Ohly
c339eafb76 scheduler: allow PreBind to return "Pending" and "Unschedulable"
Any error result from PreBind was treated as a pod scheduling failure. This was
overlooked when moving blocking API calls in the DRA plugin into a PreBind
implementation, leading to:

    E0604 15:45:50.980929  306340 schedule_one.go:1048] "Error scheduling pod; retrying" err="waiting for resource driver" pod="test/test-draqld28"

That's because DRA's PreBind does some updates in the apiserver, then returns
Pending to wait for the outcome.

The fix is to allow PreBind to return the same special status codes as other
extension points.
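
The fix described above can be illustrated with a minimal Go sketch. It is not the scheduler's actual code: `Code`, `Status`, and `classifyPreBind` are hypothetical stand-ins, assumed here only to show "Pending" and "Unschedulable" being treated as retryable outcomes rather than as scheduling errors.

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a hypothetical stand-in for the scheduler framework's status code.
type Code int

const (
	Success Code = iota
	Error
	Unschedulable
	Pending
)

// Status is a simplified, hypothetical status carrying a code and a reason.
type Status struct {
	Code   Code
	Reason string
}

// classifyPreBind decides how a PreBind result is treated in this sketch.
// Pending and Unschedulable mean "retry later" and are not reported as errors;
// any other non-success code is surfaced as a scheduling error.
func classifyPreBind(s *Status) (retryLater bool, err error) {
	if s == nil || s.Code == Success {
		return false, nil
	}
	switch s.Code {
	case Pending, Unschedulable:
		return true, nil
	default:
		return false, errors.New(s.Reason)
	}
}

func main() {
	retry, err := classifyPreBind(&Status{Code: Pending, Reason: "waiting for resource driver"})
	fmt.Println(retry, err)
}
```

In this sketch, a plugin that has started an asynchronous update and returns Pending leaves the pod waiting for a retry instead of producing an "Error scheduling pod" log entry.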
2024-06-06 15:28:08 +02:00
AxeZhan
cf73c9d93c remove EvaluatedNodes field in Diagnosis struct 2024-06-04 14:20:55 +08:00
Gabe
c8f0ea1a54 Don't fill in NodeToStatusMap with UnschedulableAndUnresolvable 2024-05-31 15:52:16 +00:00
Gabe
7ea3bf4db4 Revert "scheduler: preallocation for NodeToStatusMap"
This reverts commit 9fcd791c01.
2024-05-29 14:09:58 +00:00
AxeZhan
d6d1e6ad8a base on allNodes when calculating nextStartNodeIndex 2024-05-18 00:30:38 +08:00
Kensei Nakada
9cd62186e8 cleanup: eliminate unnecessary NodeToStatusMap creation 2024-05-11 12:14:22 +00:00
AxeZhan
bcf1c55837 evaluated nodes only consider filter stage 2024-05-10 12:40:12 +08:00
Kensei Nakada
9fcd791c01 scheduler: preallocation for NodeToStatusMap 2024-05-07 00:01:24 +00:00
Kensei Nakada
2b56de43e5 register Node/UpdateNodeTaint event for plugins which have Node/Add only and don't have Node/UpdateNodeTaint 2024-03-16 14:13:06 +00:00
Oleg Guba
ba525460e0 change result size to numAllNodes 2024-03-01 02:06:17 -08:00
Oleg Guba
e6dd36759f [kubernetes/scheduler] use lockless diagnosis collection in findNodesThatPassFilters 2024-02-29 20:43:50 -08:00
Aleksandra Malinowska
dd1e617ba0 Scheduler first fit (#123384)
* Don't evaluate extra nodes if there's no score plugin defined

* Fix existing unit test (add no op scoring plugin)

* Add unit tests for no score plugin scenario

* address review comments

* add a test with non-filter, non-scoring extender
2024-02-26 11:07:19 -08:00
AxeZhan
630ff96f9d Revert "Scheduler first fit" 2024-02-14 20:43:59 +08:00
Kubernetes Prow Robot
919d4624a0 Merge pull request #122503 from sunbinnnnn/scheduler-extender-support-ignore-bind
Support ignore scheduler extender error when binding
2024-01-08 17:30:44 +01:00
Neil Sun
87816ffb2c Support ignore scheduler extender error when binding
Signed-off-by: sunbinnnnn <sunbinnnnn@hotmail.com>
2024-01-08 21:06:25 +08:00
Kensei Nakada
09abd6be5a address reviews 2024-01-02 02:10:41 +00:00
Kensei Nakada
041efcd1d4 scheduler: update an old comment 2023-12-22 02:01:13 +00:00
Aleksandra Malinowska
f89c744b7b Only run Prioritize() for extenders with prioritizeVerb configured 2023-12-21 13:42:27 +01:00
Aleksandra Malinowska
e19be41f58 Don't evaluate extra nodes if there's no score plugin defined 2023-12-21 13:29:46 +01:00
AxeZhan
be48c93689 Sched framework: expose NodeInfo in all functions of PluginsRunner interface 2023-12-15 11:30:06 +08:00
Paco Xu
1160521a4f Revert "Scheduler first fit" 2023-12-14 17:27:25 +08:00
Kubernetes Prow Robot
517091cdc5 Merge pull request #122058 from aleksandra-malinowska/scheduler-first-fit
Scheduler first fit
2023-12-14 05:10:19 +01:00
Kubernetes Prow Robot
5322af7f9e Merge pull request #122022 from sanposhiho/extender-fix
fix: requeue pods rejected by Extenders properly
2023-12-14 05:10:01 +01:00
Kubernetes Prow Robot
6bd8f96f35 Merge pull request #122001 from olderTaoist/scheduler-metric
report scheduling_algorithm_duration_seconds metric when pods are unschedulable
2023-12-14 05:09:25 +01:00
Toru Komatsu
01916625da Remove unnecessary error catch in scheduling failure (#121981)
* Deleted from the cache in the handling of scheduling failures due to missing Node

Signed-off-by: utam0k <k0ma@utam0k.jp>

* Support only `nodes`

* Remove unnecessary error catch

Signed-off-by: utam0k <k0ma@utam0k.jp>

* Fix a build error

Signed-off-by: utam0k <k0ma@utam0k.jp>

* Fix a build error

Signed-off-by: utam0k <k0ma@utam0k.jp>

---------

Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-12-14 05:09:08 +01:00
olderTaoist
78b4ab11d5 also report scheduling_algorithm_duration_seconds metric when the pods are unschedulable 2023-12-06 19:17:03 +08:00
Aleksandra Malinowska
3df00d1bdd Only run Prioritize() for extenders with prioritizeVerb configured 2023-11-28 17:13:13 +01:00
Aleksandra Malinowska
199dc03bdd Don't evaluate extra nodes if there's no score plugin defined 2023-11-28 10:39:49 +01:00
Kensei Nakada
468e2dac81 fix: requeue pods rejected by Extenders properly 2023-11-23 13:20:02 +00:00
Patrick Ohly
2a23061f6c scheduler: fix performance regression at -v3 + contextual logging
The logging instrumentation for contextual logging that was added for 1.29
slowed down the scheduler (i.e. logging verbosity <= 3) by a significant
percentage (-28.66% for SchedulingBasic/5000Nodes at -v3) if (and only if!)
contextual logging was enabled.

Retrieving the logger from the context causes no measurable slowdown; only
the various WithName/WithValues calls cause it.

By being more careful about when to use those, the performance impact can be
avoided:
- At -v3 or lower, only `WithValues("pod")` is used once per scheduling cycle.
  This has the intended effect that all log messages for the cycle include the
  pod information. Once contextual logging is GA, "pod" key/value pairs can
  be removed from all log calls.
- At -v4 or higher, richer log entries get produced where `WithValues` is also
  used for the node (when applicable) and `WithName` is used for the current
  operation and plugin.

With these changes, enabling contextual logging causes no measurable slowdown
at -v3 or lower. At -v4, the slowdown depends on the test case (-30.51%
throughput for SchedulingBasic/5000Nodes, no change for
SchedulingCSIPVs/5000Nodes). For some unknown reason (measuring bias?),
SchedulingCSIPVs/500Nodes has a ~3% *higher* throughput with contextual
logging.
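
The verbosity-dependent use of `WithValues` and `WithName` described above can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's code: `Logger` is a hypothetical, minimal stand-in for a contextual logger, `loggerForCycle` and `loggerForPlugin` are invented helper names, and the pod, node, and plugin names are placeholders.

```go
package main

import "fmt"

// Logger is a hypothetical, minimal stand-in for a contextual logger.
type Logger struct {
	name      string
	values    []any
	verbosity int
}

// WithValues returns a copy of the logger with extra key/value pairs.
// The copy is what makes frequent calls costly on a hot path.
func (l Logger) WithValues(kv ...any) Logger {
	vals := make([]any, 0, len(l.values)+len(kv))
	vals = append(vals, l.values...)
	vals = append(vals, kv...)
	l.values = vals
	return l
}

// WithName returns a copy of the logger with a name segment appended.
func (l Logger) WithName(name string) Logger {
	if l.name != "" {
		name = l.name + "/" + name
	}
	l.name = name
	return l
}

// Enabled reports whether messages at the given level would be emitted.
func (l Logger) Enabled(level int) bool { return level <= l.verbosity }

// loggerForCycle attaches the pod once per scheduling cycle, at any verbosity.
func loggerForCycle(base Logger, pod string) Logger {
	return base.WithValues("pod", pod)
}

// loggerForPlugin adds the node and plugin name only at -v4 or higher,
// so that the extra copies are skipped at -v3 or lower.
func loggerForPlugin(cycle Logger, plugin, node string) Logger {
	if !cycle.Enabled(4) {
		return cycle
	}
	return cycle.WithName(plugin).WithValues("node", node)
}

func main() {
	for _, v := range []int{3, 4} {
		cycle := loggerForCycle(Logger{verbosity: v}, "test/pod-a")
		l := loggerForPlugin(cycle, "ExamplePlugin", "node-1")
		fmt.Printf("v=%d name=%q values=%v\n", v, l.name, l.values)
	}
}
```

At verbosity 3 the per-plugin logger is the per-cycle logger unchanged, so the only allocation per cycle is the single `WithValues("pod", ...)` call.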
2023-11-03 17:28:55 +01:00
Kubernetes Prow Robot
fd5c406112 Merge pull request #120933 from mengjiao-liu/contextual-logging-scheduler-remaining-part
kube-scheduler: convert the remaining part to use contextual logging
2023-10-27 10:30:58 +02:00
Kensei Nakada
27bb66fd7b cleanup: rename failedPlugin to plugin in framework.Status 2023-10-25 12:03:56 +00:00
Mengjiao Liu
b0a73213d6 kube-scheduler: convert the remaining part to use contextual logging 2023-10-24 17:56:48 +08:00
Kensei Nakada
4f5bc7e8d7 fix based on reviews 2023-10-20 02:53:06 +00:00
Kensei Nakada
cb5dc46edf feature(scheduler): simplify QueueingHint by introducing new statuses 2023-10-19 11:02:11 +00:00
Kubernetes Prow Robot
130a5a423f Merge pull request #119785 from sanposhiho/waitonpermit-fiterror
fix: register the plugin that rejects Pods in WaitOnPermit to UnschedulablePlugins
2023-08-15 23:13:04 -07:00
Kubernetes Prow Robot
719d1a84f7 Merge pull request #119778 from sanposhiho/bugfix-unschedulableandunresolvable
fix: when PreFilter returns UnschedulableAndUnresolvable, copy the state in all nodes in statusmap
2023-08-15 23:12:57 -07:00
Heba Elayoty
224087abfa Add Pod Scheduling SLI Duration metric (#119049)
Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com>
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-08-15 15:17:41 -07:00
Kensei Nakada
cf3f0bd778 fix: register the plugin that rejects Pods in WaitOnPermit to UnschedulablePlugins 2023-08-12 07:18:01 +00:00
Kensei Nakada
b008223705 fix: when PreFilter returns UnschedulableAndUnresolvable, copy the state in all nodes in statusmap 2023-08-12 06:58:49 +00:00
Patrick Ohly
2f30fae0e8 scheduler: fix data race after binding failure
When binding has failed, `Done` gets called by
`handleBindingCycleError`. Calling it again is at best redundant and, at worst,
suffers from a data race:
- the `assumedPodInfo` is placed in the backoff queue
- an event causes the `Pod` pointer to get updated in it
- reading `assumedPodInfo.Pod.UID` races with that write

This race was found with `go test -race`.
2023-08-02 11:04:10 +02:00
Kensei Nakada
c7e7eee554 feature(scheduling_queue): track events per Pods (#118438)
* feature(scheduling_queue): track events per Pods

* fix typos

* record events in one slice and make each in-flight Pod to refer it

* fix: use Pop() in test before AddUnschedulableIfNotPresent to register in-flight Pods

* eliminate MakeNextPodFuncs

* call Done inside the scheduling queue

* fix comment

* implement done() not to require lock in it

* fix UTs

* improve the receivedEvents implementation based on suggestions

* call DonePod when we don't call AddUnschedulableIfNotPresent

* fix UT

* use queuehint to filter out events for in-flight Pods

* fix based on suggestion from aldo

* fix based on suggestion from Wei

* rename lastEventBefore → previousEvent

* fix based on suggestion

* address comments from aldo

* fix based on the suggestion from Abdullah

* gate in-flight Pods logic by the SchedulingQueueHints feature gate
2023-07-17 15:53:07 -07:00
kerthcet
c0eb0caf4a Support fine-grained rescheduling in ReservePlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 13:30:29 +08:00
kerthcet
278a8376e1 Fix: fiterror in permit plugin not handled perfectly
We only added the failed plugins, but this does not work unless
we wrap the status in a fitError, because the failed plugins are only copied
to podInfo if the error is a fitError

Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 10:35:59 +08:00
Kubernetes Prow Robot
d9714078f8 Merge pull request #118551 from sanposhiho/event-to-register
feature(scheduler): implement ClusterEventWithHint to filter out useless events
2023-06-26 06:41:45 -07:00
Kensei Nakada
6f8d38406a feature(scheduler): implement ClusterEventWithHint to filter out useless events 2023-06-22 13:36:19 +00:00
Heba Elayoty
902c711fb4 Unset gated pod info timestamp in addToActiveQ
Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com>
2023-06-21 14:16:08 -07:00
Kubernetes Prow Robot
d58492b19c Merge pull request #114688 from sanposhiho/sanposhiho/scheduling-one-score
feature(schedule_one): use heap to find the highest score node
2023-06-08 15:40:12 -07:00
Mengjiao Liu
074900e81b scheduler: update the scheduler interface and cache methods to use contextual logging 2023-05-29 13:26:32 +08:00
Kensei Nakada
0535e74224 feature(schedule_one): use heap to find the highest score node 2023-05-27 11:34:32 +00:00