mirror of
https://github.com/k3s-io/kubernetes.git
synced 2025-07-23 11:50:44 +00:00
Merge pull request #50310 from shyamjvs/block-on-master-startup
Automatic merge from submit-queue

Block on master-creation step for large clusters (>50 nodes) in kube-up

I recently noticed a failure in our 5000-node scale test where the master failed to initialize in time. kube-up nevertheless went on to create all 5000 nodes, because it does not block on master creation. It turned out the master VM wasn't even created:

```
W0808 10:00:49.340] ERROR: (gcloud.compute.instances.create) Could not fetch resource: ... Try a different zone, or try again later.
```

Even some of our 100-node tests are flaking occasionally during cluster startup (with the master validation step timing out), and I think the reason is the same (issue: https://github.com/kubernetes/kubernetes/issues/49453).

We should block on that step for large clusters.

cc @kubernetes/sig-scalability-misc @gmarek
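To illustrate why the failure described above went unnoticed, here is a minimal sketch of how a backgrounded command behaves in bash. `fake_create_master` is a hypothetical stand-in that always fails; it is not part of kube-up. The point is only that launching a job with `&` reports success immediately, and the real exit status appears only if the script later waits on the job.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for master creation that always fails.
fake_create_master() {
  echo "ERROR: could not create master" >&2
  return 1
}

# Backgrounded: the launching statement itself succeeds, so the script carries on.
fake_create_master 2>/dev/null &
launch_status=$?
master_pid=$!

# The failure only surfaces once the job is explicitly waited on.
wait "${master_pid}"
wait_status=$?

echo "launch_status=${launch_status} wait_status=${wait_status}"
# prints "launch_status=0 wait_status=1"
```

Any work started between the launch and the `wait` (node creation, in the scenario described here) runs regardless of whether the background job succeeded.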
commit 3e0eff9f55
@@ -953,6 +953,7 @@ function delete-subnetworks() {
 #
 # Assumed vars:
 #   KUBE_TEMP: temporary directory
+#   NUM_NODES: #nodes in the cluster
 #
 # Args:
 #   $1: host name
@@ -1044,7 +1045,13 @@ function create-master() {
   create-certs "${MASTER_RESERVED_IP}"
   create-etcd-certs ${MASTER_NAME}
 
-  create-master-instance "${MASTER_RESERVED_IP}" &
+  if [[ "${NUM_NODES}" -ge "50" ]]; then
+    # We block on master creation for large clusters to avoid doing too much
+    # unnecessary work in case master start-up fails (like creation of nodes).
+    create-master-instance "${MASTER_RESERVED_IP}"
+  else
+    create-master-instance "${MASTER_RESERVED_IP}" &
+  fi
 }
 
 # Adds master replica to etcd cluster.
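
The change above can be tried in isolation with the sketch below. It mirrors the `-ge 50` branch from the diff, but `stub_create_master_instance` and `create_master_sketch` are hypothetical names introduced for illustration, and the stub simply succeeds or fails on request. How the real kube-up script reacts to a non-zero status depends on error handling outside this hunk, which is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical stub standing in for create-master-instance; it succeeds only
# when its first argument is "ok".
stub_create_master_instance() {
  [[ "$1" == "ok" ]]
}

# Blocks on the stub for clusters of 50 or more nodes, otherwise backgrounds it.
# The return status reflects the stub only on the blocking path.
create_master_sketch() {
  local num_nodes="$1" outcome="$2"
  if [[ "${num_nodes}" -ge 50 ]]; then
    stub_create_master_instance "${outcome}"
  else
    stub_create_master_instance "${outcome}" &
  fi
}

create_master_sketch 5000 fail; large_status=$?
create_master_sketch 3 fail; small_status=$?
wait  # reap the background job left by the small-cluster call

echo "large=${large_status} small=${small_status}"
# prints "large=1 small=0"
```

With the blocking path, the caller sees the failure before any nodes are created; with the backgrounded path, the caller sees success and proceeds, which keeps small-cluster startup parallel at the cost of detecting failures later.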