Commit Graph

610 Commits

Author SHA1 Message Date
Aurélien Bombo
1217dd1584 Merge pull request #12373 from kata-containers/disable-guest-empty-dir
runtime: Set `disable_guest_empty_dir = true` by default
2026-06-24 20:09:46 -05:00
Aurélien Bombo
e191c5b716 runtime-go/rs: Reconcile hugepage emptyDirs and disable_guest_empty_dir
This addresses an issue where the disable_guest_empty_dir=true code paths did
not take into account that hugepage-backed emptyDirs should always be recreated
in the guest (using guest hugepages).

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-06-24 15:22:13 -05:00
Christophe de Dinechin
631fd96715 runtime: Change default log level from Warn to Info
When the kata configuration does not set log_level to debug, the
containerd-shim-v2 defaults to WarnLevel, which suppresses important
diagnostic information logged at Info level.

Key Info-level logs that are currently hidden:
- QEMU command line (qemu.go:3566) - critical for debugging VM issues
- VM lifecycle events (creation, start, stop)
- Device hotplug operations (VFIO, network, volumes)
- Resource configuration (NUMA, memory)
- QMP socket details

Info level provides significantly better diagnostic data without
flooding logs with excessive detail (which would occur at Debug level).
This change improves troubleshooting capabilities for production
deployments where debug mode is not enabled.

Note: runtime-rs already defaults to Info level (see
src/runtime-rs/crates/shim/src/logger.rs:13,30), so this change only
affects the Go runtime.

Fixes: #13260

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2026-06-23 10:29:33 +02:00
Cameron Baird
730307f32c factory: Default to normal sandbox boot path when factory init not done
The behavior we had before was that, for a starting k8s pod,
it sees enable_template=true and therefore:

1. Tries NewFactory with fetchOnly=true
2. When that fails (because template.Fetch fails to find the artifacts,
	we retry with fetchOnly=false. This creates a direct factory
	which creates the template from scratch
	(hence we pay a full pod sandbox boot time here)
	and then restores from that. Hence the boot times
	are strictly worse on this path.

Now, even when enable_template=true, we don't try to force a direct factory.
Instead we just revert to the standard sandbox boot path.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2026-06-19 18:00:02 +00:00
Cameron Baird
c0f9744225 runtime: Implement support for VM Template factory in clh
Add support for VM Template factory on the clh path.

In order to support snapshot/restore-based VM templating,
the following changes were needed:
1. For clh.go, implement SaveVM, PauseVM, restoreVM, ResumeVM
2. Remove initrd config check for VM Templating path. The
        root disk image (when using image mode) is created in memory
        and therefore captured in the VM snapshot.
3. Truncate the memory file to the size of the VM at factory VM
        create time. This allows CLH to use the memory file
        as the backing for the template VM memory, allowing O(1)
        snapshot times.
4. CLH uses memory zones as backing for its memory on the template paths
5. Update StartVM in CLH to use the restore path when template is
        configured and available

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2026-06-19 18:00:02 +00:00
Fabiano Fidêncio
0ddb2ee1f1 Merge pull request #13160 from LandonTClipp/kata_visible_devices
feat(agent): translate KATA_VISIBLE_DEVICES into CDI GPU requests
2026-06-16 19:10:35 +02:00
LandonTClipp
a1dd28cb52 feat(runtime): plumb VISIBLE_CDI_DEVICES through the Go runtime
Add a `visible_cdi_devices` TOML option to the Go runtime so the
agent.visible_cdi_devices=true kernel parameter is emitted to the guest
when enabled. Wire the option through the NVIDIA GPU configuration
templates and add tests verifying the kernel-params flow.

Signed-off-by: LandonTClipp <lclipp@coreweave.com>
2026-06-16 11:44:09 +02:00
Fabiano Fidêncio
6203e28bef runtime: Propagate block-mode device read-only flag to the VMM
Block-mode volumes (e.g. Kubernetes volumeDevices) are passed to the
container as device nodes in spec.Linux.Devices and carry no mount "ro"
option. Their read-only intent is expressed only via the cgroup device
access in spec.Linux.Resources.Devices ("rm" = read+mknod, no write, for
read-only; "rwm" for read-write).

The device path ignored that signal: newLinuxDeviceInfo() built the
DeviceInfo without ever setting ReadOnly (it only consumed FileMode, the
node permission bits, not the read/write access), so the device was
always attached read-write.

This is problematic for filesystems such as XFS, which inspect the block
device read-only state to decide whether to attempt journal/log recovery.
When the guest device is writable, XFS tries to replay the log even for a
read-only mount, which fails badly. Mounting "-o ro" in the guest is not
enough; the device itself must advertise read-only, which only happens
when the VMM opens the backing device read-only (DeviceInfo.ReadOnly ->
BlockDrive.ReadOnly -> qemu read-only=on / clh Readonly).

Derive the read-only flag from two independent signals, combined with OR
so either one marks the device read-only:

  - the cgroup device access rule that exactly matches the device, so a
    block-mode volume marked read-only by the orchestrator (e.g. a pod
    volume with persistentVolumeClaim.readOnly: true) is honored, and
  - the host block device's own read-only flag (queried via the BLKROGET
    ioctl). Block-mode volumes frequently carry no read-only signal in
    the OCI spec at all, so the device flag is often the only reliable
    source.

The BLKROGET probe is shared (pkg/device/config.BlockDeviceIsReadOnly,
Linux-only with a stub on other platforms) between the device-node path
(newLinuxDeviceInfo, probing /dev/block/<major>:<minor>) and the
bind-mounted/filesystem block path (createDeviceInfo). None of this
relies on external host tooling such as "blockdev --setro".

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor
2026-06-15 23:18:36 +02:00
Manuel Huber
70d8f1bf3d runtime: remove file_mem_backend config option
Remove the Go runtime file_mem_backend and valid_file_mem_backends
config knobs, along with the corresponding sandbox annotation handling.

The runtime still enables file-backed shared memory automatically for
virtio-fs by using /dev/shm as the backing directory. This only removes
the user-selectable backend path.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-12 00:07:16 +00:00
Fabiano Fidêncio
a2bb3f64b0 Merge pull request #12436 from mythi/tdx-updates-2026-3
runtime(-rs): tdx: use TDX QGS via unix-domain-socket by default
2026-06-03 08:50:26 +02:00
Fabiano Fidêncio
9b5b829265 runtime: oci: derive sandbox CPUs from shares only if unconstrained
The shares-based fallback added for cpuManagerPolicy=static fired whenever
the quota-based CPU count was 0, including for BestEffort sandboxes that
have no CPU request. Those sandboxes still carry the cgroup-floor shares
value (2), so the fallback derived ceil(2/1024)=1 and inflated every such
sandbox by one vCPU. For peer-pods (static resource management) this
changed the VM sizing to default_vcpus+1, regressing the libvirt
instance-type CI checks.

Gate the fallback on the quota being explicitly unconstrained (< 0), which
is the actual cpuManagerPolicy=static signal, instead of on numCPU == 0.
BestEffort sandboxes (quota 0/absent) now correctly contribute 0 vCPUs
while the static-policy case still recovers the CPU count from shares.

Add unit tests covering the static-policy, rounding, BestEffort, and
explicit-quota cases.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-01 09:50:49 +02:00
Fabiano Fidêncio
23c5250933 runtime/qemu: emit id= for VFIODevice on -device cmdline
Without an explicit id= on the vfio-pci device, QEMU auto-generates
an internal name that does not match vfioDev.ID, so any subsequent
qomGetPciPath(vfioDev.ID) call via QMP fails with "Device 'X' not
found". This breaks resolveColdPlugVFIOGuestPciPaths which needs the
device ID to look up the guest PCI path, leaving GuestPciPath nil and
causing update_interface to fail repeatedly as the agent can't find
the interface to configure.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-28 21:54:52 +02:00
Mikko Ylinen
733c6791d3 runtime(-rs): make TDX QGS port=0 change backwards compatible
Changing Kata runtime configurations to use TDX QGS port=0 (unix domain
socket transport) means cluster admins must also reconfigure qgsd to
the same and have /var/run/tdx-qgs/qgs.sock available.

Since the early days of TDX attestation in Kata, the configuration has used
vsock with cid=2, port=4050. To avoid unncessary breakages when Kata default
moves to unix domain socket, fall back to the old configuration if
/var/run/tdx-qgs/qgs.sock is not available on the worker node.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-05-26 17:01:52 +03:00
Zvonko Kaiser
aeadb1af35 Merge pull request #12948 from fidencio/topic/numa
runtime (go): agent: Add NUMA support for QEMU
2026-05-25 15:33:14 +02:00
Fabiano Fidêncio
1cbe930fc9 runtime: Add pxb-pcie NUMA-aware PCIe topology for VFIO devices
When NUMA placement is active and VFIO devices are cold-plugged,
create a pxb-pcie (PCIe Expander Bridge) per NUMA node that has
devices.  Each pxb-pcie carries a numa_node property that gives the
guest kernel correct NUMA affinity for all PCI devices beneath it.

Root ports are created on each pxb-pcie bus instead of pcie.0, and
VFIODevice.Attach() assigns each device to the root port on its host
NUMA node's pxb bridge.  Non-VFIO devices remain on pcie.0.

NUMA placement is "active" when there is more than one guest NUMA
node OR a single guest node mapped to a specific host node (the
latter happens when maybeRightSizeAutoNUMA() collapses a multi-node
sandbox to the GPU's host NUMA node).  In both cases
buildNUMATopology() also emits the matching
memory-backend-ram,host-nodes=,policy=bind entries so guest memory
is sourced from the right host node.

So pxb-pcie can never capture a leaf virtio-pci device as the
default bus, every virtio-pci device emitter (NetDevice, VSOCK,
vhost-user-{net,scsi,blk,fs}) now appends bus=pcie.0 explicitly when
the machine actually exposes a pcie.0 root.  Detection is done via a
new hasPCIeRoot() helper that returns true only for q35/virt machine
types — ppc64le's pseries (pci.0), s390x's s390-ccw-virtio (CCW
transport) and microvm (no PCI) intentionally skip the pin to avoid
"Bus 'pcie.0' not found" at startup.

This is the only QEMU mechanism that works for both regular and
confidential (TDX/SNP) guests, as it operates through the PCI bus
hierarchy rather than ACPI table injection.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-24 22:00:46 +02:00
Fabiano Fidêncio
b688619314 runtime: oci: Fix sandbox CPU sizing with cpuManagerPolicy=static
When cpuManagerPolicy=static is configured, kubelet sets the sandbox
CPU quota to -1 (unconstrained) because it uses cpuset pinning instead
of CFS quota. This causes CalculateSandboxSizing to compute 0 workload
CPUs, resulting in the VM starting with only default_vcpus.

Fall back to deriving the CPU count from sandbox CPU shares (1024
shares per CPU) when the quota-based calculation yields 0.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-24 22:00:46 +02:00
Fabiano Fidêncio
d0d7deb262 runtime: Add host NUMA distance discovery and build guest NUMA topology
Add sysfs-based host NUMA distance reading (GetHostNUMADistances) that
parses /sys/devices/system/node/nodeN/distance to mirror the host NUMA
distance matrix into the guest via -numa dist entries.

Implement buildNUMATopology() which translates the GuestNUMANodes
configuration into govmm NUMANode and NUMADist slices. Each guest NUMA
node gets a floor-divided share of vCPUs and memory, with the last node
absorbing any remainder. This handles the common Kata case of +1 VMM
overhead vCPU gracefully. Memory backends are selected based on
hugepages/virtio-fs/file-backed-mem configuration.

Guard multi-NUMA topology generation to amd64 and arm64 only, since
other architectures (s390x, riscv64) do not support QEMU NUMA/DIMM.

Wire buildNUMATopology() into CreateVM so the QEMU config includes NUMA
nodes and distances.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-05-24 22:00:46 +02:00
Fabiano Fidêncio
447e2a3faf runtime: Add VFIO device NUMA node detection and placement validation
Add PCISysFsDevicesNUMANode property and GetPCIDeviceNUMANode() helper
to read /sys/bus/pci/devices/<BDF>/numa_node when discovering VFIO
devices. Store the result in the new NUMANode field on VFIODev (-1 for
unknown/no affinity).

Wire NUMA node detection into both GetAllVFIODevicesFromIOMMUGroup()
(legacy VFIO path) and GetDeviceFromVFIODev() (IOMMUFD path) so every
discovered VFIO device carries its host NUMA node.

Add validateVFIODeviceNUMAPlacement() which runs at the end of
buildNUMATopology(). It checks every cold-plugged VFIO device's host
NUMA node against the guest NUMA topology and logs a warning if a device
is on a host NUMA node not covered by any guest NUMA node (indicating
potential cross-NUMA memory access overhead), or an info message
confirming correct placement.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-05-24 22:00:46 +02:00
Fabiano Fidêncio
1e9da61d48 govmm: Add multi-NUMA memory backend and distance matrix support
Introduce NUMANode and NUMADist types, add NUMANodes/NUMADists fields to
Config, and implement appendMultiNUMAMemoryKnobs() to generate per-node
memory-backend objects with host-nodes/policy=bind, -numa node entries
with cpus= ranges, and -numa dist entries for the distance matrix.

Gate the multi-NUMA path in appendMemoryKnobs() behind isDimmSupported()
to ensure architectures without DIMM support (s390x, riscv64) fall back
to the single-node path. Drop 386 from isDimmSupported since 32-bit x86
is not a supported Kata target.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-05-24 22:00:46 +02:00
Florian Vichot
554e8f91b1 kata-monitor: use full URI for connecting to containerd
Without the protocol in the URI, grpc-go defaults to the DNS resolver,
which results in an error for unix sockets (`name resolver error: produced
zero addresses`).

We also remove the `getAddressAndDialer(...)` and `dial(...)` functions, as
they are no longer necessary, grpc-go supports connecting to unix sockets
directly. This also removes the matching tests.

This also adds a `Makefile` and tweaks the Dockerfile to simplify building
the Docker image.

Fixes #12398

Signed-off-by: Florian Vichot <florian.vichot@gmail.com>
2026-05-23 16:47:46 +02:00
Fabiano Fidêncio
c7e3f95883 tests: remove disabled tracing tests and CI job
The run-tracing job in basic-ci-amd64.yaml has been disabled
(if: false) due to issue #9763, with no path to re-enablement.
Remove the job definition and the backing
tests/functional/tracing/ directory.

Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-23 08:46:12 +02:00
Fabiano Fidêncio
b17dd2a902 runtime: Fix concurrent map read/write panic in Wait()
Wait() was releasing s.mu immediately after getContainer(), then
calling getExec() — which reads c.execs — without holding any lock.
Concurrent Exec() or Delete() calls that write to c.execs under s.mu
triggered a "concurrent map read and map write" fatal panic.

Add a dedicated sync.RWMutex to the container struct that protects the
execs map. getExec() now acquires a read lock internally, and all
writes go through new setExec()/deleteExec() helpers that acquire the
write lock. This keeps the locking concern local to the map and avoids
complicating the s.mu usage in Wait().

Add a regression test (TestConcurrentExecAccess) that exercises
concurrent getExec reads against setExec/deleteExec writes; this
reliably reproduces the panic under the race detector without the fix.

Fixes: #12825

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-13 21:14:28 +02:00
Fabiano Fidêncio
36a2d8e7f2 agent: Make launch_process_timeout configurable
The hardcoded DEFAULT_LAUNCH_PROCESS_TIMEOUT of 6 seconds in the kata
agent is insufficient for environments with NVIDIA GPUs and NVSwitches,
where the attestation-agent needs significantly more time to collect
evidence during initialization (e.g. ~2 seconds per NVSwitch).

When the timeout expires, the agent (PID 1) exits with an error, causing
the guest kernel to perform an orderly shutdown before the
attestation-agent has finished starting.

Make this timeout configurable via the kernel parameter
agent.launch_process_timeout (in seconds), preserving the 6-second
default for backward compatibility. The Go runtime is wired up to pass
this value from the TOML config's [agent.kata] section through to the
kernel command line.

The NVIDIA GPU configs set the new default to 15 seconds.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-10 14:47:01 +02:00
llink5
f7878cc385 runtime: fix Docker 26+ networking by rescanning after Start
Docker 26+ configures container networking (veth pair, IP addresses,
routes) after task creation rather than before. Kata's endpoint scan
runs during CreateSandbox, before the interfaces exist, resulting in
VMs starting without network connectivity (no -netdev passed to QEMU).

Add RescanNetwork() which runs asynchronously after the Start RPC.
It polls the network namespace until Docker's interfaces appear, then
hotplugs them to QEMU and informs the guest agent to configure them
inside the VM.

Additional fixes:
- mountinfo parser: find fs type dynamically instead of hardcoded
  field index, fixing parsing with optional mount tags (shared:,
  master:)
- IsDockerContainer: check CreateRuntime hooks for Docker 26+
- DockerNetnsPath: extract netns path from libnetwork-setkey hook
  args with path traversal protection
- detectHypervisorNetns: verify PID ownership via /proc/pid/cmdline
  to guard against PID recycling
- startVM guard: rescan when len(endpoints)==0 after VM start

Fixes: #9340

Signed-off-by: llink5 <llink5@users.noreply.github.com>
2026-04-02 21:23:16 +02:00
stevenhorsman
12578b41f2 govmm: Delete old files
The govmm workflow isn't run by us and it and the other CI files
are just legacy from when it was a separate repo, so let's clean up
this debt rather than having to update it frequently.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-30 10:45:28 +01:00
stevenhorsman
b3179bdd8e workflows: Update actions/checkout version
Update the action to resolve the following warning in GHA:
> Node.js 20 actions are deprecated. The following actions are running
> on Node.js 20 and may not work as expected:
> actions/checkout@11bd71901b.
> Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-30 10:45:28 +01:00
PiotrProkop
64735222c6 runtime: allow specifying logical/physical sector size for block devices
Add two new configuration knobs that control the logical and physical
sector sizes advertised by virtio-blk devices to the guest:

  block_device_logical_sector_size  (config file)
  block_device_physical_sector_size (config file)

  io.katacontainers.config.hypervisor.blk_logical_sector_size  (annotation)
  io.katacontainers.config.hypervisor.blk_physical_sector_size (annotation)

The annotation names are abbreviated relative to the config file keys
because Kubernetes enforces a 63-character limit on annotation name
segments, and the full names would exceed it.

Both settings default to 0 (let QEMU decide). When set, they are passed
as logical_block_size and physical_block_size in the QMP device_add
command during block device hotplug.

Setting logical_sector_size smaller then container filesystem
block size will cause EINVAL on mount. The physical_sector_size can
always be set independently.

Values must be 0 or a power of 2 in the range [512, 65536]; other
values are rejected with an error at sandbox creation time.

Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2026-03-27 18:56:54 +01:00
Roaa Sakr
858620d2e7 clh: Add VFIO device cold-plug support
Enable VFIO device pass-through at VM creation time on Cloud Hypervisor,
in addition to the existing hot-plug path.

Signed-off-by: Roaa Sakr <romoh@microsoft.com>
2026-03-25 16:39:25 -07:00
Zvonko Kaiser
8ff5d164c6 runtime: make CDI annotation vendor-agnostic with lookup table
Replace hardcoded NVIDIA vendor ID (0x10de) and class (0x030) checks
with a vendor-agnostic lookup table (cdiDeviceKind) that maps PCI
vendor/class pairs to CDI device kinds. This makes it straightforward
to add support for new device types by adding entries to the table.

Refactor siblingAnnotation to resolve device BDFs once upfront and
reuse them for both CDI type detection and sibling matching, eliminating
redundant sysfs reads. Devices not in the lookup table (e.g. NVSwitches)
are skipped with errNoSiblingFound, while known device types that fail
to match a sibling produce a hard error.

Consolidate the hot-plug and cold-plug device loops into a single loop
over extracted container paths, removing duplicated filtering logic.

Export GetPCIDeviceProperty from the device drivers package to allow
vendor/class lookup from sysfs in the container annotation path.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Aurélien Bombo
a4fd32a29a runtime: Support trusted ephemeral data storage
* Introduces the `emptydir_mode` config flag to allow instructing the runtime
   to create a block device for emptyDir volumes.
 * The block device is created in the original emptyDir folder on the host
   so that Kubelet can monitors its disk usage and evict the pod if it exceeds
   its sizeLimit. This matches runc and virtio-fs.
 * The block device's disk image file is sparse to minimize host disk
   footprint.

Fixes: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Hyounggyu Choi
347ce5e3bc runtime: Skip to call sandboxDevices() for remote hypervisor
The remote hypervisor delegates VM creation to a remote service.
The VM runs on cloud infrastructure, not the local host kernel.
So requiring a KVM/MSHV device is semantically wrong and would
cause a hard failure on any host where these devices are absent
(e.g., a VM that doesn't expose nested virtualization).

Skip sandboxDevices() entirely when the configured hypervisor type
is remoteHypervisor{}.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-03-03 13:44:12 +01:00
Fabiano Fidêncio
0a73638744 runtime: add configurable kubelet root dir
Different kubernetes distributions, such as k0s, use a different kubelet
root dir location instead of the default /var/lib/kubelet, so ConfigMap
and Secret volume propagation were failing.

This adds a kubelet_root_dir config option that the go runtime uses when
matching volume paths and kata-deploy now sets it automatically for k0s
via a drop-in file.

runtime-rs does not need this option: it identifies ConfigMap/Secret,
projected, and downward-api volumes by volume-type path segment
(kubernetes.io~configmap, etc.), not by kubelet root prefix.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-27 14:10:57 +01:00
Steve Horsman
3442fc7d07 Merge pull request #12477 from kata-containers/workflow-improvements
workflow: Recommended improvements
2026-02-27 11:57:22 +00:00
Hyounggyu Choi
be5ae7d1e1 Merge pull request #12573 from BbolroC/support-memory-hotplug-go-runtime-s390x
runtime: Support memory hotplug via virtio-mem on s390x
2026-02-27 09:59:40 +01:00
Hyounggyu Choi
b9f3d5aa67 runtime: Support memory hotplug with virtio-mem on s390x
This commit adds logic to properly handle memory hotplug
for QemuCCWVirtio in the ExecMemdevAdd() path.

The new logic is triggered only when virtio-mem is enabled.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-02-26 14:21:34 +01:00
stevenhorsman
1b2ca678e5 runtime: Fix identifier names
Fix identifiers that are non compliant with go's conventions
e.g. not capitalising initialisations

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
e86338c9c0 runtime: Remove explicit types in variable declarations
QF1011 - use the short declaration as the type can be inferred

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
f60ee411f0 runtime: Update poorly chosen Duration names
ST1011 - having time.Duration values with variable names of MS/Secs
is misleading

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
a0ccb63f47 runtime: Use ReplaceAll over Replace
strings.ReplaceAll was introduced in Go 1.12 as a more readable and self-documenting way to say "replace everything".

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
a78d212dfc kata-monitor: Switch to switch statements
Resolve: `QF1003: could use tagged switch`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
6f438bfb19 runtime: Improve receiver name
Update from `this` to fix:
```
ST1006: receiver name should be a reflection of its identity; don't use generic names such as "this" or "self" (staticcheck)
```

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
f1960103d1 runtime: Improve split statement
strings.SplitN(s, sep, -1) is functionally identical to strings.Split(s, sep)
as -1 says to return all substrings, so choose the more concise version

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
8cd3aa8c84 runtime: Remove embedded field from selector
GenericDevice is an embedded (anonymous) field in the device struct, so its fields
and methods are "promoted" to the outer struct, so we go straight to it.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
4351a61f67 runtime: Fix error string formatting
Resolve `ST1005: error strings should not end with punctuation or newlines (staticcheck)`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
312567a137 runtime: Fix double imports
Remove one of the double imports to tidy up the code

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:33:04 +00:00
stevenhorsman
cff8994336 runtime: Switch to switch statements
Resolve: `QF1003: could use tagged switch on major (staticcheck)`
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 14:22:10 +00:00
stevenhorsman
5ca4c34a34 kata-monitor: Fix golangci-lint warning
QF1012: Use fmt.Fprintf(...) instead of Write([]byte(fmt.Sprintf(...))) (staticcheck)
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-24 10:02:48 +00:00
Balint Tobik
295a6a81d0 runtime: refactor hypervisor devices cgroup creation
Separatly added hypervisor devices to cgroup to
omit not relevant warnings and fail if none of them
are available.
Also fix a testcase reload removed kernel modules to later testcases
and skip some tests on ARM because lack of virtualization support
Fixes #6656

Signed-off-by: Balint Tobik <btobik@redhat.com>
2026-02-13 09:23:08 +01:00
stevenhorsman
c5aadada98 workflows: Pin all actions
Previously zizmor only mandated pinning of third-party actions,
but has recommended rolling this out to all actions now.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-12 16:26:45 +00:00
Konstantin Khlebnikov
5d99a141d9 runtime: add hypervisor options for NUMA topology
With enable_numa=true hypervisor will expose host NUMA topology as is:
map vm NUMA nodes to host 1:1 and bind vpus to relates CPUS.

Option "numa_mapping" allows to redefine NUMA nodes mapping:
- map each vm node to particular host node or several numa nodes
- emulate numa on host without numa (useful for tests)

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Co-authored-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-02-09 20:09:25 +01:00