The verification Job assumed the DaemonSet model: it waited for the
DaemonSet to exist, for its pods, and for `rollout status daemonset/...`,
then required every node in the cluster to be labeled. None of that holds
for `deploymentMode: job`, where the install happens via the dispatcher
and the per-node Jobs it fans out, and only the targeted (worker) nodes
get labeled.

Make the hook mode-aware:

- Hook weight: in job mode the install dispatcher runs as a
  post-install hook at weight 5, so verification now runs at weight 10
  (after it); daemonset mode keeps weight 0 (the DaemonSet is a normal
  resource).
- Readiness wait: in job mode, wait for the install dispatcher Job to
  complete and then for the per-node install Jobs
  (kata-deploy/stage=install) to finish (with the same CRI-restart
  retry logic) instead of a DaemonSet rollout.
- Label check: in job mode, verify exactly the nodes the dispatcher
  targeted are labeled, rather than comparing the labeled count against
  all nodes in the cluster.
- Grant the verification ClusterRole read access to batch/jobs (used by
  the job-mode waits; harmless in daemonset mode).
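
The weight ordering and the label check above can be sketched in a few
lines of Python. This is only an illustration of the decision logic, not
the hook's actual implementation: the function names and the node lists
are hypothetical, and the job-mode check assumes that "exactly the nodes
the dispatcher targeted" means every targeted node must carry the label.
The daemonset branch uses a set difference where the real hook compares
counts.

```python
from typing import Iterable, Set

# Weight of the install dispatcher hook in job mode, per the commit message.
DISPATCHER_WEIGHT = 5


def verification_hook_weight(deployment_mode: str) -> int:
    """Return the hook weight for the verification Job (hypothetical helper).

    Job mode must run after the dispatcher (weight 5), hence 10.
    Daemonset mode keeps weight 0 because the DaemonSet is a normal resource.
    """
    return 10 if deployment_mode == "job" else 0


def unlabeled_nodes(
    deployment_mode: str,
    all_nodes: Iterable[str],
    targeted_nodes: Iterable[str],
    labeled_nodes: Iterable[str],
) -> Set[str]:
    """Return the nodes that were expected to be labeled but are not.

    Assumption: in job mode only the dispatcher's targets are expected to
    be labeled; in daemonset mode every node in the cluster is.
    """
    if deployment_mode == "job":
        expected = set(targeted_nodes)
    else:
        expected = set(all_nodes)
    return expected - set(labeled_nodes)
```

With a control-plane node that is never targeted, the job-mode check
passes once both workers are labeled, while the daemonset-mode check
would still report the control-plane node as missing.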

The daemonset code path is unchanged and the default render (no
verification.pod) is byte-for-byte identical.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>