runtime-rs: Increase reconnect_timeout_ms for confidential VMs

The Go runtime's CoCo dev config uses dial_timeout = 45s, but all
runtime-rs confidential VM configs had reconnect_timeout_ms set to
3000ms (3s) or 5000ms (SE). This is too short for confidential VMs,
especially on arm64 where UEFI firmware (AAVMF) adds significant
boot time on top of the measured boot process, causing ECONNRESET
errors on the vsock connection before the agent is ready.

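Bump reconnect_timeout_ms to 45000ms across all confidential VM
configs (coco-dev, SNP, TDX, SE) to match the Go runtime. The same
configs also raise dial_timeout_ms to 100ms, so the documented retry
count becomes 45000 / 100 = 450.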

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
Fabiano Fidêncio
2026-04-14 18:14:10 +02:00
parent 35e48fdfd1
commit d04bb98e09
4 changed files with 20 additions and 20 deletions

@@ -544,17 +544,17 @@ kernel_modules = []
 debug_console_enabled = false
 # Agent dial timeout in millisecond.
-# (default: 10)
-dial_timeout_ms = 10
+# (default: 100)
+dial_timeout_ms = 100
 # Agent reconnect timeout in millisecond.
-# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
+# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 450)
 # If you find pod cannot connect to the agent when starting, please
 # consider increasing this value to increase the retry times.
 # You'd better not change the value of dial_timeout_ms, unless you have an
 # idea of what you are doing.
-# (default: 3000)
-reconnect_timeout_ms = 3000
+# (default: 45000)
+reconnect_timeout_ms = 45000
 # Timeout in seconds for guest components (attestation-agent, confidential-data-hub)
 # to create their Unix sockets after being spawned by the agent.

@@ -521,17 +521,17 @@ kernel_modules = []
 debug_console_enabled = false
 # Agent dial timeout in millisecond.
-# (default: 10)
-dial_timeout_ms = 90
+# (default: 100)
+dial_timeout_ms = 100
 # Agent reconnect timeout in millisecond.
-# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
+# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 450)
 # If you find pod cannot connect to the agent when starting, please
 # consider increasing this value to increase the retry times.
 # You'd better not change the value of dial_timeout_ms, unless you have an
 # idea of what you are doing.
-# (default: 3000)
-reconnect_timeout_ms = 5000
+# (default: 45000)
+reconnect_timeout_ms = 45000
 # Timeout in seconds for guest components (attestation-agent, confidential-data-hub)
 # to create their Unix sockets after being spawned by the agent.

@@ -563,17 +563,17 @@ kernel_modules = []
 debug_console_enabled = false
 # Agent dial timeout in millisecond.
-# (default: 10)
-dial_timeout_ms = 10
+# (default: 100)
+dial_timeout_ms = 100
 # Agent reconnect timeout in millisecond.
-# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
+# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 450)
 # If you find pod cannot connect to the agent when starting, please
 # consider increasing this value to increase the retry times.
 # You'd better not change the value of dial_timeout_ms, unless you have an
 # idea of what you are doing.
-# (default: 3000)
-reconnect_timeout_ms = 3000
+# (default: 45000)
+reconnect_timeout_ms = 45000
 # Timeout in seconds for guest components (attestation-agent, confidential-data-hub)
 # to create their Unix sockets after being spawned by the agent.

@@ -539,17 +539,17 @@ kernel_modules = []
 debug_console_enabled = false
 # Agent dial timeout in millisecond.
-# (default: 10)
-dial_timeout_ms = 10
+# (default: 100)
+dial_timeout_ms = 100
 # Agent reconnect timeout in millisecond.
-# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
+# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 450)
 # If you find pod cannot connect to the agent when starting, please
 # consider increasing this value to increase the retry times.
 # You'd better not change the value of dial_timeout_ms, unless you have an
 # idea of what you are doing.
-# (default: 3000)
-reconnect_timeout_ms = 3000
+# (default: 45000)
+reconnect_timeout_ms = 45000
 # Timeout in seconds for guest components (attestation-agent, confidential-data-hub)
 # to create their Unix sockets after being spawned by the agent.
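
The config comments in the hunks above document the retry count as reconnect_timeout_ms / dial_timeout_ms. As a rough sanity check, that arithmetic can be sketched in a few lines of Python. The helper name `retry_times` is hypothetical, and rounding down is an assumption, because this commit does not show how runtime-rs actually schedules its retries.

```python
# Sketch only: mirrors the formula from the config comments,
#   Retry times = reconnect_timeout_ms / dial_timeout_ms
# `retry_times` is a hypothetical helper, not a runtime-rs function, and
# integer (floor) division is an assumption about how a remainder is handled.


def retry_times(reconnect_timeout_ms: int, dial_timeout_ms: int) -> int:
    """Return the retry count implied by the documented formula."""
    if dial_timeout_ms <= 0:
        raise ValueError("dial_timeout_ms must be positive")
    return reconnect_timeout_ms // dial_timeout_ms


# (dial_timeout_ms, reconnect_timeout_ms) pairs as they appear in the hunks.
settings = {
    "old (3000ms configs)": (10, 3000),
    "old (SE config)": (90, 5000),
    "new (all four configs)": (100, 45000),
}

for label, (dial_ms, reconnect_ms) in settings.items():
    print(f"{label}: {retry_times(reconnect_ms, dial_ms)} retries "
          f"over {reconnect_ms / 1000:g}s")
```

Under these assumptions the new values give 450 retries over a 45s window, which matches the updated "(default: 450)" comment. The old 3000ms configs give 300, matching the previous comment.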