Production-Grade Container Scheduling and Management
Kubernetes Submit Queue bc7ccfe93b Merge pull request #50106 from julia-stripe/improve-scheduler-error-handling
Automatic merge from submit-queue

Retry scheduling pods after errors more consistently in scheduler

**What this PR does / why we need it**:

This fixes two places in the scheduler where pods can get stuck in Pending forever. In both places, an error occurs and `sched.config.Error` is not called afterwards. This is a problem because `sched.config.Error` is responsible for requeuing pods to retry scheduling when there are issues (see [here](2540b333b2/plugin/pkg/scheduler/factory/factory.go (L958))), so if we don't call `sched.config.Error`, the pod will never get scheduled (unless the scheduler is restarted).

One of these (where it returns when `ForgetPod` fails instead of continuing and reporting an error) is a regression from [this refactor](https://github.com/kubernetes/kubernetes/commit/ecb962e6585#diff-67f2b61521299ca8d8687b0933bbfb19L234), and with the [old behavior](80f26fa8a8/plugin/pkg/scheduler/scheduler.go (L233-L237)) the error was reported correctly. As far as I can tell changing the error handling in that refactor wasn't intentional.

When `AssumePod` fails, no error has ever been reported, but I think reporting one will help the scheduler recover when something goes wrong instead of letting pods possibly never get scheduled.

This will help prevent issues like https://github.com/kubernetes/kubernetes/issues/49314 in the future.

**Release note**:

```release-note
Fix incorrect retry logic in scheduler
```
2017-08-07 01:35:17 -07:00
.github Merge pull request #46714 from castrojo/new-issue-template 2017-06-22 16:43:47 -07:00
api Merge pull request #49678 from smarterclayton/429_metric 2017-08-05 01:28:00 -07:00
build Merge pull request #50052 from foxyriver/clean-install 2017-08-05 12:33:01 -07:00
cluster Merge pull request #49855 from zouyee/kiq 2017-08-05 19:07:50 -07:00
cmd Update generated code 2017-08-06 15:32:28 +02:00
docs Merge pull request #49678 from smarterclayton/429_metric 2017-08-05 01:28:00 -07:00
examples Update wordpress to 4.8.0 2017-07-20 10:08:49 +08:00
federation Update generated code 2017-08-06 15:32:28 +02:00
Godeps c-go: Add dependencies for http-cache 2017-08-04 14:39:22 -07:00
hack Merge pull request #47181 from dims/fail-on-swap-enabled 2017-08-04 14:29:36 -07:00
logo
pkg Merge pull request #49481 from jianglingxia/jlx72417 2017-08-06 08:52:56 -07:00
plugin Merge pull request #50106 from julia-stripe/improve-scheduler-error-handling 2017-08-07 01:35:17 -07:00
staging Merge pull request #50159 from liggitt/includeObject 2017-08-06 22:49:09 -07:00
test Fix printer hack to get a versioned client 2017-08-06 15:30:13 +02:00
third_party Merge pull request #47614 from mengqiy/fix_naming 2017-07-19 21:51:49 -07:00
translations removed 'Storage' option from 'kubectl top' like options 2017-06-23 08:34:53 -07:00
vendor c-go: Add dependencies for http-cache 2017-08-04 14:39:22 -07:00
.bazelrc move build related files out of the root directory 2017-05-15 15:53:54 -07:00
.generated_files
.gitattributes
.gitignore Remove verify_gen_openapi make rule. 2017-04-25 17:41:33 -07:00
.kazelcfg.json Switch from gazel to kazel, and move kazelcfg into build/root 2017-07-18 12:48:51 -07:00
BUILD.bazel move build related files out of the root directory 2017-05-15 15:53:54 -07:00
CHANGELOG.md Merge pull request #50104 from MrHohn/kube-proxy-1.7.3-changelog 2017-08-03 12:55:04 -07:00
code-of-conduct.md
CONTRIBUTING.md Close kubernetes/community#420 2017-03-08 09:59:30 -08:00
labels.yaml Update labels.yaml 2017-07-11 11:21:18 -07:00
LICENSE
Makefile move build related files out of the root directory 2017-05-15 15:53:54 -07:00
Makefile.generated_files move build related files out of the root directory 2017-05-15 15:53:54 -07:00
OWNERS Add jregan to OWNERS for kubectl isolation work. 2017-05-30 14:32:48 -07:00
OWNERS_ALIASES Add sig-testing OWNERS_ALIASES 2017-07-25 11:05:18 -07:00
README.md update submit-queue URL in README.md 2017-08-01 20:36:17 -07:00
Vagrantfile
WORKSPACE move build related files out of the root directory 2017-05-15 15:53:54 -07:00

Kubernetes



Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale using a system called Borg, combined with best-of-breed ideas and practices from the community.

Kubernetes is hosted by the Cloud Native Computing Foundation (CNCF). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Kubernetes plays a role, read the CNCF announcement.


To start using Kubernetes

See our documentation on kubernetes.io.

Try our interactive tutorial.

Take a free course on Scalable Microservices with Kubernetes.

To start developing Kubernetes

The community repository hosts all information about building Kubernetes from source, how to contribute code and documentation, who to contact about what, etc.

If you want to build Kubernetes right away there are two options:

You have a working Go environment.

    $ go get -d k8s.io/kubernetes
    $ cd $GOPATH/src/k8s.io/kubernetes
    $ make

You have a working Docker environment.

    $ git clone https://github.com/kubernetes/kubernetes
    $ cd kubernetes
    $ make quick-release

If you are less impatient, head over to the developer's documentation.

Support

If you need support, start with the troubleshooting guide and work your way through the process that we've outlined.

That said, if you have questions, reach out to us one way or another.
