Merge pull request #63755 from tomoe/dumpstack-docker

Automatic merge from submit-queue (batch tested with PRs 63434, 64172, 63975, 64180, 63755). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Dump Stack when docker fails on healthcheck

Save a stack dump of the docker daemon so that we can
investigate why the docker daemon was unresponsive to `docker ps`.

See https://github.com/moby/moby/blob/master/daemon/daemon.go for
how docker sets up a trap for SIGUSR1 with `setupDumpStackTrap()`.
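
The signalling step can be sketched as a small helper. This is only an illustration under stated assumptions: `dump_docker_stacks` and its `signal_cmd` parameter are hypothetical names introduced here and are not part of this change. Only the `pkill -SIGUSR1 dockerd` invocation comes from the patch itself.

```bash
# Hypothetical helper (not part of this PR): ask dockerd for a stack dump,
# but only when the monitored runtime is docker.
#   $1: container runtime name, e.g. "docker"
#   $2: command used to send the signal (assumption: defaults to pkill;
#       injectable so the logic can be dry-run without a docker daemon)
function dump_docker_stacks {
  local runtime_name="$1"
  local signal_cmd="${2:-pkill}"
  if [[ "${runtime_name}" == "docker" ]]; then
    # dockerd traps SIGUSR1 and writes its stacks to a file.
    "${signal_cmd}" -SIGUSR1 dockerd
  fi
}
```

Passing a stub as the second argument makes it possible to check which runtimes trigger the signal without touching a live daemon.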

**What this PR does / why we need it**:

This allows us to investigate why the docker daemon was unresponsive to the `docker ps` command.

**Special notes for your reviewer**:
Manually tested on Ubuntu and COS.

**Release note**:

```release-note
NONE
```
Kubernetes Submit Queue 2018-05-24 12:18:25 -07:00 committed by GitHub
commit 972a74e238

@@ -50,6 +50,12 @@ function container_runtime_monitoring {
  while true; do
    if ! timeout 60 ${healthcheck_command} > /dev/null; then
      echo "Container runtime ${container_runtime_name} failed!"
      if [[ "$container_runtime_name" == "docker" ]]; then
        # Dump stack of docker daemon for investigation.
        # Log file name looks like goroutine-stacks-TIMESTAMP and will be saved to
        # the exec root directory, which is /var/run/docker/ on Ubuntu and COS.
        pkill -SIGUSR1 dockerd
      fi
      systemctl kill --kill-who=main "${container_runtime_name}"
      # Wait for a while, as we don't want to kill it again before it is really up.
      sleep 120
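
After a failed healthcheck, the most recent dump can be located with something like the sketch below. It assumes the dump files match the glob `goroutine-stacks-*` in the exec root directory named in the comment above. `latest_stack_dump` is a hypothetical helper introduced here for illustration, not part of the patch.

```bash
# Hypothetical helper (not part of this PR): print the path of the newest
# goroutine stack dump in a directory, or nothing if there is none.
#   $1: directory to search (assumption: defaults to /var/run/docker,
#       the exec root mentioned for Ubuntu and COS)
function latest_stack_dump {
  local dir="${1:-/var/run/docker}"
  local newest="" f
  for f in "${dir}"/goroutine-stacks-*; do
    # Skip the literal pattern when the glob matches nothing.
    [[ -e "${f}" ]] || continue
    # Keep whichever file has the later modification time.
    if [[ -z "${newest}" || "${f}" -nt "${newest}" ]]; then
      newest="${f}"
    fi
  done
  if [[ -n "${newest}" ]]; then
    printf '%s\n' "${newest}"
  fi
}
```

Comparing modification times with `-nt` avoids relying on the exact timestamp format in the file name, which this sketch does not assume.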