- Change the feature gate from alpha to beta and enable it by default
- Update a few of the unit tests due to feature gate being enabled by
default
- Small refactor in `nodeshutdown_manager` which adds `featureEnabled`
function (which checks that feature gate and that
`kubeletConfig.ShutdownGracePeriod > 0`).
- Use `featureEnabled()` to exit early from shutdown manager in the case
that the feature is disabled
- Update kubelet config defaulting to be explicit that
`ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` default to
zero and update the godoc comments.
- Update defaults and add featureGate tag in api config godoc.
With this feature now in beta and the feature gate enabled by default,
to enable graceful shutdown all that will be required is to configure
`ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` in the
kubelet config. If not configured, they will be defaulted to zero, and
graceful shutdown will effectively be disabled.
We will have two layers of the validation.
- the first part of the validation logic will be implemented under the
`ValidateKubeletConfiguration` method
- the second one that requires knowledge about machine topology and
node allocatable resources will be implemented under the memory manager.
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Implements KEP 2000, Graceful Node Shutdown:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2000-graceful-node-shutdown
* Add new FeatureGate `GracefulNodeShutdown` to control
enabling/disabling the feature
* Add two new KubeletConfiguration options
* `ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods`
* Add new package, `nodeshutdown` that implements the Node shutdown
manager
* The node shutdown manager uses the systemd inhibit package, to
create an system inhibitor, monitor for node shutdown events, and
gracefully terminate pods upon a node shutdown.
It covers deviceplugin & cpumanager.
It has drawback, since cpuset and all other structs including cadvisor's keep
cpu as int, but for protobuf based interface is better to have fixed
int.
This patch also introduces additional interface CPUsProvider, while
DeviceProvider might have been extended too.
Checkpoint not covered by unit test.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
* Rename const for topology.../zone
* Rename const for topology.../region
* Rename const for failure-domain.../zone
* Rename const for failure-domain.../region
* Restore old names for compat