mirror of
https://github.com/k3s-io/kubernetes.git
synced 2025-08-16 07:13:53 +00:00
Merge pull request #58107 from ironcladlou/quota-controller-deadlock
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix quota controller worker deadlock

The resource quota controller worker pool can deadlock when:

* Worker goroutines are idle waiting for work from queues
* The Sync() method detects discovery updates to apply

The problem is that workers acquire a read lock while idle, making write lock acquisition dependent upon the presence of work in the queues. The Sync() method blocks on a pending write lock acquisition and won't unblock until every existing worker processes one item from its queue and releases its read lock. While the Sync() method's lock is pending, all new read lock acquisitions will block; if a worker does process work and release its lock, it will then block on its next read lock acquisition, which is effectively blocked on Sync(). This can easily deadlock all the workers processing from one queue while any workers on the other queue remain blocked waiting for work.

Fix the deadlock by refactoring workers to acquire a read lock *after* work is popped from the queue. This allows writers to get locks while workers are idle, while preserving the worker pause semantics necessary to allow safe sync.

```release-note
Fixes an infrequent problem causing the resource quota controller to become stuck in clusters with low ResourceQuota churn, potentially preventing quota from being recalculated until the controller is restarted or until bursts of diverse quota activity unstick the controller.
```

/cc @kubernetes/sig-api-machinery-bugs
commit 15b1d165fb
@@ -237,15 +237,13 @@ func (rq *ResourceQuotaController) addQuota(obj interface{}) {
 // worker runs a worker thread that just dequeues items, processes them, and marks them done.
 func (rq *ResourceQuotaController) worker(queue workqueue.RateLimitingInterface) func() {
 	workFunc := func() bool {
-
-		rq.workerLock.RLock()
-		defer rq.workerLock.RUnlock()
-
 		key, quit := queue.Get()
 		if quit {
 			return true
 		}
 		defer queue.Done(key)
+		rq.workerLock.RLock()
+		defer rq.workerLock.RUnlock()
 		err := rq.syncHandler(key.(string))
 		if err == nil {
 			queue.Forget(key)
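The lock ordering described in the commit message can be illustrated in isolation. The following is a minimal sketch, not the controller's code: the `pool` type and the `processOne`, `trySync`, and `runDemo` helpers are hypothetical names, and a plain Go channel stands in for the workqueue. It shows that when the read lock is taken only after an item has been received, a writer can acquire the lock while every worker is idle.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pool is a hypothetical stand-in for the controller. A channel replaces
// the real workqueue; workerLock plays the role of rq.workerLock.
type pool struct {
	workerLock sync.RWMutex
	queue      chan string
}

// processOne mirrors the fixed ordering: block for work first, then hold
// the read lock only while the item is being handled. It returns true
// when the queue has been closed.
func (p *pool) processOne(handle func(string)) bool {
	key, ok := <-p.queue
	if !ok {
		return true
	}
	p.workerLock.RLock()
	defer p.workerLock.RUnlock()
	handle(key)
	return false
}

// trySync reports whether a writer (the analogue of Sync) could take the
// write lock within the given timeout.
func (p *pool) trySync(timeout time.Duration) bool {
	acquired := make(chan struct{})
	go func() {
		p.workerLock.Lock()
		defer p.workerLock.Unlock()
		close(acquired)
	}()
	select {
	case <-acquired:
		return true
	case <-time.After(timeout):
		return false
	}
}

// runDemo starts two idle workers, attempts a write lock while the queue
// is empty, then feeds three items and waits for the workers to exit.
func runDemo() (bool, int) {
	p := &pool{queue: make(chan string)}
	var wg sync.WaitGroup
	var mu sync.Mutex
	processed := 0
	handle := func(string) {
		mu.Lock()
		processed++
		mu.Unlock()
	}
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for !p.processOne(handle) {
			}
		}()
	}
	// Workers hold no lock while idle, so the writer is not starved.
	synced := p.trySync(time.Second)
	for _, k := range []string{"a", "b", "c"} {
		p.queue <- k
	}
	close(p.queue)
	wg.Wait()
	return synced, processed
}

func main() {
	synced, processed := runDemo()
	fmt.Println("writer acquired lock while workers idle:", synced)
	fmt.Println("items processed:", processed)
}
```

Moving the `RLock()` call above the channel receive in `processOne` would reproduce the pre-fix ordering, in which the writer has to wait for every idle worker to receive an item before it can proceed.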