mirror of
https://github.com/k3s-io/kubernetes.git
synced 2025-07-23 19:56:01 +00:00
Merge pull request #55884 from mpolednik/dpi-race-fix
Automatic merge from submit-queue (batch tested with PRs 55839, 54495, 55884, 55983, 56069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

deviceplugin: fix race when multiple plugins are registered

**What this PR does / why we need it**:
When multiple device plugins register with Kubelet concurrently, a race can crash the Kubelet.

Consider two plugins, D1 and D2. The call order is roughly:

D1 -> manager.go:register -> endpoint.go:listAndWatch -> device_plugin_handler.go:(*D1).callback
D2 -> manager.go:register -> endpoint.go:listAndWatch -> device_plugin_handler.go:(*D2).callback

The callback function accesses HandlerImpl's allDevices map, which maps (resourceName -> DeviceID). If both plugins reach these accesses at the same time, Kubelet crashes with "fatal error: concurrent map read and map write". This can be solved by making sure the handler is locked while allDevices is being updated.

The fix is needed to avoid Kubelet crashes when multiple device plugins try to register with Kubelet at the same moment. This occurs frequently when a single binary registers itself as multiple plugins.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
commit 0b1d023aa7
@@ -92,6 +92,10 @@ func NewHandlerImpl(updateCapacityFunc func(v1.ResourceList)) (*HandlerImpl, err
 	deviceManagerMonitorCallback := func(resourceName string, added, updated, deleted []pluginapi.Device) {
 		var capacity = v1.ResourceList{}
 		kept := append(updated, added...)
+
+		handler.Lock()
+		defer handler.Unlock()
+
 		if _, ok := handler.allDevices[resourceName]; !ok {
 			handler.allDevices[resourceName] = sets.NewString()
 		}