mirror of
https://github.com/k3s-io/kubernetes.git
synced 2025-11-03 07:11:01 +00:00
The node-kubelet-flaky e2e job, which runs the `Node Performance Testing [Serial] [Slow] [Flaky]` e2e tests, has been flaking because of inconsistencies in the CPU manager checkpoint file. The apparent cause is that the checkpoint file is deleted (which is required in order to change the CPU manager policy used by these e2e tests) right after the e2e test asserts that a pod no longer exists. However, after a pod is deleted, the CPU manager may still be cleaning up the resources used by the pod, which may result in the checkpoint file being created again. Whenever this happened, the kubelet would panic when the tests subsequently tried to change the CPU manager policy from "none" to "static" or vice versa (this is done 4 times in these tests).

Signed-off-by: alejandrox1 <alarcj137@gmail.com>