clh: isClhRunning waits for full timeout when clh exits

isClhRunning uses signal 0 to test whether the process is still
alive. This doesn't work because the process is a direct child of
the shim: once it dies it becomes a zombie, and since no one waits
for it, it lingers until its parent dies and init reaps it. Hence
sending signal 0 in isClhRunning always succeeds, whether the
process is dead or not.

This patch first calls wait4 with WNOHANG to reap the process. If
that returns the pid, the process was our child and has exited, so
isClhRunning reports it as not running. Otherwise we fall back to
sending signal 0.

Fixes: #9431

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Alexandru Matei 2024-04-08 15:44:46 +03:00
parent 6b2d655857
commit 54923164b5

@@ -1467,7 +1467,12 @@ func (clh *cloudHypervisor) isClhRunning(timeout uint) (bool, error) {
 	timeStart := time.Now()
 	cl := clh.client()
 	for {
-		err := syscall.Kill(pid, syscall.Signal(0))
+		waitedPid, err := syscall.Wait4(pid, nil, syscall.WNOHANG, nil)
+		if waitedPid == pid && err == nil {
+			return false, nil
+		}
+		err = syscall.Kill(pid, syscall.Signal(0))
 		if err != nil {
 			return false, nil
 		}
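
The behaviour described in the commit message can be reproduced outside the shim with a small standalone program. The sketch below is illustrative only and is not part of the patch: the helper names `signalable`, `reapIfExited` and `demo` are hypothetical, it assumes a Unix-like system with a `true` binary on the PATH, and it uses a short sleep as a crude way to let the child exit. It checks signal 0 against an unreaped child, reaps the child with a non-blocking `Wait4`, then checks signal 0 again.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// signalable reports whether signal 0 can be delivered to pid.
// This succeeds for running processes and for unreaped zombies alike.
func signalable(pid int) bool {
	return syscall.Kill(pid, syscall.Signal(0)) == nil
}

// reapIfExited does a non-blocking wait on pid. It returns true only when
// pid is our child and has exited, in which case the zombie is reaped.
func reapIfExited(pid int) bool {
	waitedPid, err := syscall.Wait4(pid, nil, syscall.WNOHANG, nil)
	return err == nil && waitedPid == pid
}

// demo starts a short-lived child and returns the result of the signal-0
// check before and after the child has been reaped.
func demo() (before, after bool, err error) {
	cmd := exec.Command("true") // assumption: "true" exists on PATH
	if err = cmd.Start(); err != nil {
		return false, false, err
	}
	pid := cmd.Process.Pid

	// Give the child time to exit. cmd.Wait is deliberately not called,
	// so the exited child stays around as a zombie.
	time.Sleep(200 * time.Millisecond)
	before = signalable(pid)

	// Poll until the child has been reaped, with a generous deadline.
	deadline := time.Now().Add(5 * time.Second)
	for !reapIfExited(pid) {
		if time.Now().After(deadline) {
			return before, false, fmt.Errorf("child %d was not reaped in time", pid)
		}
		time.Sleep(10 * time.Millisecond)
	}
	after = signalable(pid)
	return before, after, nil
}

func main() {
	before, after, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("signal 0 succeeds before reaping:", before)
	fmt.Println("signal 0 succeeds after reaping: ", after)
}
```

On a typical Linux system the first check should succeed even though the child has already exited, and the second should fail once the zombie is gone. This is why the patch reaps first and only then falls back to signal 0, which still covers the case where the pid is not a child of the caller.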