This PR also loosens the check to allow zero, since the API doc
explains that a value of zero indicates no waiting.
Signed-off-by: Dave Chen <dave.chen@arm.com>
- Change the feature gate from alpha to beta and enable it by default
- Update a few of the unit tests because the feature gate is now enabled
by default
- Small refactor in `nodeshutdown_manager` which adds a `featureEnabled`
function (which checks that the feature gate is enabled and that
`kubeletConfig.ShutdownGracePeriod > 0`).
- Use `featureEnabled()` to exit early from shutdown manager in the case
that the feature is disabled
- Update kubelet config defaulting to be explicit that
`ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` default to
zero and update the godoc comments.
- Update defaults and add featureGate tag in api config godoc.
With this feature now in beta and the feature gate enabled by default,
to enable graceful shutdown all that will be required is to configure
`ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` in the
kubelet config. If not configured, they will be defaulted to zero, and
graceful shutdown will effectively be disabled.
* Removes discovery v1alpha1 API
* Replaces per-Endpoint Topology with a read-only DeprecatedTopology
in GA API
* Adds per-Endpoint Zone field in GA API
The streamwatcher has a synchronization problem that may lead to
a goroutine blocking forever when closing a stream watch.
This occasionally happens when informers are cancelled together with the
watch request using the stop channel, which leads to an increasing
number of blocked goroutines if informers are dynamically created and
deleted again.
The function `receive` checks under a lock whether the watch has been stopped
before an error is reported to the result channel.
The problem is that the watcher might be stopped in between, by a call to
the `Stop` method. In the actual code this is done by the
`cache.Reflector` using the streamwatcher, via a defer that is executed after
the caller has already stopped reading from the result channel.
As a result, the stopping flag might be set after the check, and the
attempt to send the error event then blocks forever
because there will never be a receiver again.
The fix introduces a dedicated local stop channel that is closed by the
`Stop` method and used in a select statement together with the send
operation to finally abort the loop.