When some plugin was registered as "unschedulable" in some previous scheduling
attempt, it kept that attribute for a pod forever. When that plugin then later
failed with an error that requires backoff, the pod was incorrectly moved to the
"unschedulable" queue where it got stuck until the periodic flushing because
there was no event that the plugin was waiting for.
Here's an example where that happened:
framework.go:1280: E0831 20:03:47.184243] Reserve/DynamicResources: Plugin failed err="Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" node="scheduler-perf-dra-7l2v2" plugin="DynamicResources" pod="test/test-dragxd5c"
schedule_one.go:1001: E0831 20:03:47.184345] Error scheduling pod; retrying err="running Reserve plugin \"DynamicResources\": Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" pod="test/test-dragxd5c"
...
scheduling_queue.go:745: I0831 20:03:47.198968] Pod moved to an internal scheduling queue pod="test/test-dragxd5c" event="ScheduleAttemptFailure" queue="Unschedulable" schedulingCycle=9576 hint="QueueSkip"
Pop still needs the information about unschedulable plugins to update the
UnschedulableReason metric. It can reset that information before returning the
PodInfo for the next scheduling attempt.
The 'NewCloudControllerManagerCommand' function create a cobra.Command
object that is used for the main entry point to various CCM
implementations. This function accepts an 'additionalFlags' parameter,
allowing users to register additional controller-specific options
beyond the standard set used for all controllers. While we were dumping
the standard set of flags in the usage string - seen when running the
command with '--help' or when parsing fails - we were not dumping the
additional options. Correct this oversight.
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This uses the generic ptr.To in k8s.io/utils to replace functions and
code constructs which only serve to return pointers to intstr
values. Other uses of the deprecated pointer package are updated in
modified files.
Signed-off-by: Stephen Kitt <skitt@redhat.com>
Instead of modifying the PodSchedulingContext and then creating or updating it,
now the required changes (selected node, potential nodes) are tracked and the
actual input for an API call is created if (and only if) needed at the end.
This makes the code easier to read and change. In particular, replacing the
Update call with Patch or Apply is easy.
After 12 months as reviewer, constantly reviewing and contributing
to client-go, I think is fair to move to the next ladder.
Change-Id: I49e579dcefcd39c6f0b29400c90467df00719cca