The container status waiting reason toggles between `ImagePullBackOff`
and the actual pull error, resulting in a bad user experience for
consumers like kubectl. For example, the output of
`kubectl get pods` returns either:
```
NAME   READY   STATUS                      RESTARTS   AGE
pod    0/1     SignatureValidationFailed   0          10s
```
or
```
NAME   READY   STATUS             RESTARTS   AGE
pod    0/1     ImagePullBackOff   0          18s
```
depending on the state the image pull is in. We now improve that
behavior by preserving the actual pull error in the `message` of the
`waiting` state during back-off:
```json
{
  "waiting": {
    "message": "Back-off pulling image \"quay.io/crio/unsigned:latest\": SignatureValidationFailed: image pull failed for quay.io/crio/unsigned:latest because the signature validation failed: Source image rejected: A signature was required, but no signature exists",
    "reason": "ImagePullBackOff"
  }
}
```
The `SignatureValidationFailed` value is inherited from the previously
known state:
```json
{
  "waiting": {
    "message": "image pull failed for quay.io/crio/unsigned:latest because the signature validation failed: Source image rejected: A signature was required, but no signature exists",
    "reason": "SignatureValidationFailed"
  }
}
```
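For illustration, a minimal Go sketch of how the back-off waiting state can carry the previous error forward; `backOffWaitingState` and its parameters are hypothetical names, not the verbatim kubelet change:
```go
package sketch

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// backOffWaitingState builds the waiting state for an image pull
// back-off, preserving the previous pull error in the message.
// Illustrative sketch only, not the actual kubelet implementation.
func backOffWaitingState(image, lastReason, lastMessage string) v1.ContainerState {
	msg := fmt.Sprintf("Back-off pulling image %q", image)
	if lastReason != "" && lastMessage != "" {
		// Append the previously known reason and message, e.g.
		// "SignatureValidationFailed: image pull failed for ...".
		msg = fmt.Sprintf("%s: %s: %s", msg, lastReason, lastMessage)
	}
	return v1.ContainerState{
		Waiting: &v1.ContainerStateWaiting{
			Reason:  "ImagePullBackOff",
			Message: msg,
		},
	}
}
```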
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
NestedNumberAsFloat64 will convert int64s to float64 only if the int64 value can be represented
exactly by a float64. The original test for this property used a roundtrip conversion from int64 to
float64 and back, and the behavior of these conversions is inconsistent across architectures.
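A deterministic, architecture-independent alternative is to check that the integer's significant bits fit into float64's 53-bit mantissa instead of round-tripping. A minimal sketch; `exactFloat64` is a hypothetical helper, not the actual implementation:
```go
package sketch

import "math/bits"

// exactFloat64 reports whether v can be represented exactly by a
// float64, i.e. whether the span from its most significant to its
// least significant set bit fits within the 53-bit mantissa. This
// avoids the int64 -> float64 -> int64 roundtrip, whose behavior
// differs across architectures.
func exactFloat64(v int64) bool {
	if v == 0 {
		return true
	}
	u := uint64(v)
	if v < 0 {
		// Two's complement negation; math.MinInt64 becomes 2^63,
		// which is a power of two and therefore exact.
		u = -u
	}
	width := bits.Len64(u) - bits.TrailingZeros64(u)
	return width <= 53
}
```
For example, `exactFloat64(1<<53)` is true (a power of two), while `exactFloat64(1<<53 + 1)` is false because it needs 54 significant bits.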
As a quick fix for a flake, bceec5a3ff
introduced polling with wait.Poll in all callers of CheckDaemonStatus.
This commit reverts all callers to what they were before (CheckDaemonStatus +
ExpectNoError) and implements polling according to E2E best practices
(https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/writing-good-e2e-tests.md#polling-and-timeouts):
- no logging while polling
- support for progress reporting while polling
- last but not least, an informative failure message in case of a
timeout, including a dump of the daemon set as YAML (see the sketch below)
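A hedged sketch of what that pattern can look like, assuming a condition function in the shape of the new CheckDaemonStatus; `waitForDaemonSet` and the interval/timeout values are illustrative, not the actual e2e framework helpers:
```go
package sketch

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"sigs.k8s.io/yaml"
)

// waitForDaemonSet polls the condition without logging and, on
// timeout, returns an informative error that includes the DaemonSet
// dumped as YAML.
func waitForDaemonSet(ctx context.Context, c kubernetes.Interface, ns, name string, cond wait.ConditionWithContextFunc) error {
	if err := wait.PollUntilContextTimeout(ctx, 2*time.Second, 5*time.Minute, true, cond); err != nil {
		ds, getErr := c.AppsV1().DaemonSets(ns).Get(ctx, name, metav1.GetOptions{})
		if getErr != nil {
			return fmt.Errorf("daemon set %s/%s not ready: %w (failed to fetch it for the dump: %v)", ns, name, err, getErr)
		}
		dump, _ := yaml.Marshal(ds)
		return fmt.Errorf("daemon set %s/%s not ready: %w\n%s", ns, name, err, string(dump))
	}
	return nil
}
```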
The util for checking the daemon status checked only once whether the
Status of the DaemonSet reported that all the desired Pods are
scheduled and ready.
However, the pattern used in the e2e tests for this function did not
take into consideration that the controller needs to propagate the Pod
status to the DaemonSet status, and asserted on the condition only
once after waiting for all the Pods to be ready.
In order to avoid more code churn, change the CheckDaemonStatus
signature to the wait.Condition type and use it in an async poll loop
in the tests, as sketched below.
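As a sketch of the resulting shape (assumed, not the exact upstream signature), the check can return a wait-style condition that the tests then poll:
```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// checkDaemonStatus returns a condition reporting whether the
// DaemonSet status says every desired Pod is scheduled and ready.
// Polling it gives the controller time to propagate Pod status to
// the DaemonSet status.
func checkDaemonStatus(c kubernetes.Interface, ns, name string) wait.ConditionWithContextFunc {
	return func(ctx context.Context) (bool, error) {
		ds, err := c.AppsV1().DaemonSets(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		s := ds.Status
		return s.DesiredNumberScheduled == s.CurrentNumberScheduled &&
			s.DesiredNumberScheduled == s.NumberReady, nil
	}
}
```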
The RPC call usually does not take much time for containerd or CRI-O. We
now assume the default timeout is fine and therefore resolve the `TODO`.
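For illustration only (the function name and the timeout value are assumptions, not the actual kubelet code), relying on the service's default timeout for a CRI RPC looks like this:
```go
package sketch

import (
	"context"
	"time"

	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// defaultTimeout is an assumed default; the real value lives in the
// remote runtime service configuration.
const defaultTimeout = 2 * time.Minute

// imageStatus issues the RPC with the default timeout instead of a
// bespoke one, since the call is fast on containerd and CRI-O.
func imageStatus(ctx context.Context, client runtimeapi.ImageServiceClient, ref string) (*runtimeapi.ImageStatusResponse, error) {
	ctx, cancel := context.WithTimeout(ctx, defaultTimeout)
	defer cancel()
	return client.ImageStatus(ctx, &runtimeapi.ImageStatusRequest{
		Image: &runtimeapi.ImageSpec{Image: ref},
	})
}
```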
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>