mirror of
				https://github.com/kata-containers/kata-containers.git
				synced 2025-10-22 04:18:53 +00:00 
	1. Implemented metrics collection for runtime-rs shim and dragonball hypervisor. 2. Described the current supported metrics in runtime-rs.(docs/design/kata-metrics-in-runtime-rs.md) Fixes: #5017 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Kata Metrics in Rust Runtime (runtime-rs)
The Rust runtime (runtime-rs) is responsible for:
- Gathering metrics about the shim.
- Gathering metrics from the hypervisor (through a channel).
- Getting metrics from the agent (through ttrpc).
The tables below list all the metrics gathered by runtime-rs. The current status of each entry is marked as:
- ✅: DONE
- 🚧: TODO
Kata Shim
| STATUS | Metric name | Type | Units | Labels |
|---|---|---|---|---|
| 🚧 | kata_shim_agent_rpc_durations_histogram_milliseconds: RPC latency distributions. | HISTOGRAM | milliseconds | |
| ✅ | kata_shim_fds: Kata containerd shim v2 open FDs. | GAUGE | | |
| ✅ | kata_shim_io_stat: Kata containerd shim v2 process IO statistics. | GAUGE | | |
| ✅ | kata_shim_netdev: Kata containerd shim v2 network devices statistics. | GAUGE | | |
| 🚧 | kata_shim_pod_overhead_cpu: Kata Pod overhead for CPU resources (percent). | GAUGE | percent | |
| 🚧 | kata_shim_pod_overhead_memory_in_bytes: Kata Pod overhead for memory resources (bytes). | GAUGE | bytes | |
| ✅ | kata_shim_proc_stat: Kata containerd shim v2 process statistics. | GAUGE | | |
| ✅ | kata_shim_proc_status: Kata containerd shim v2 process status. | GAUGE | | |
| 🚧 | kata_shim_process_cpu_seconds_total: Total user and system CPU time spent in seconds. | COUNTER | seconds | |
| 🚧 | kata_shim_process_max_fds: Maximum number of open file descriptors. | GAUGE | | |
| 🚧 | kata_shim_process_open_fds: Number of open file descriptors. | GAUGE | | |
| 🚧 | kata_shim_process_resident_memory_bytes: Resident memory size in bytes. | GAUGE | bytes | |
| 🚧 | kata_shim_process_start_time_seconds: Start time of the process since the Unix epoch in seconds. | GAUGE | seconds | |
| 🚧 | kata_shim_process_virtual_memory_bytes: Virtual memory size in bytes. | GAUGE | bytes | |
| 🚧 | kata_shim_process_virtual_memory_max_bytes: Maximum amount of virtual memory available in bytes. | GAUGE | bytes | |
| 🚧 | kata_shim_rpc_durations_histogram_milliseconds: RPC latency distributions. | HISTOGRAM | milliseconds | |
| ✅ | kata_shim_threads: Kata containerd shim v2 process threads. | GAUGE | | |
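Several of the shim metrics above (open FDs, process status, threads) describe the shim process itself. As a rough illustration of where such values can come from on Linux, the sketch below reads them from `/proc/self`. It is a minimal sketch using only the standard library; the helper names `count_open_fds`, `parse_status_field`, and `thread_count` are hypothetical and are not taken from runtime-rs, which may collect these values differently.

```rust
use std::fs;
use std::io;

/// Count the entries in /proc/self/fd, which approximates the number of
/// open file descriptors of the current process (Linux only).
/// Hypothetical helper, not part of runtime-rs.
fn count_open_fds() -> io::Result<usize> {
    // Note: reading the directory briefly opens one extra descriptor.
    Ok(fs::read_dir("/proc/self/fd")?.count())
}

/// Extract one numeric field (for example "Threads") from the text of
/// /proc/<pid>/status. Returns None if the field is absent or not numeric.
fn parse_status_field(status: &str, key: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        let (name, value) = line.split_once(':')?;
        if name.trim() != key {
            return None;
        }
        // Values such as "VmRSS:  1234 kB" carry a unit suffix; keep the number.
        value.split_whitespace().next()?.parse::<u64>().ok()
    })
}

/// Read the thread count of the current process from /proc/self/status.
fn thread_count() -> io::Result<Option<u64>> {
    let status = fs::read_to_string("/proc/self/status")?;
    Ok(parse_status_field(&status, "Threads"))
}
```

A collector could call `count_open_fds()` and `thread_count()` on each scrape and store the results in gauges; keeping `parse_status_field` free of I/O makes the parsing step easy to test against a fixed string.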
Kata Hypervisor
Unlike the Go runtime, the hypervisor and the shim in runtime-rs run in the same process, so the process-level metrics that were previously collected separately for the hypervisor and the shim only need to be gathered once. We therefore currently collect those metrics only in the Kata shim.
In addition, we added an interface (VmmAction::GetHypervisorMetrics) for gathering hypervisor metrics, in case we design tailor-made metrics for the hypervisor in the future. The following metrics are exposed from src/dragonball/src/metric.rs.
| Metric name | Type | Units | Labels |
|---|---|---|---|
| kata_hypervisor_scrape_count: Metrics scrape count. | COUNTER | | |
| kata_hypervisor_vcpu: Hypervisor metrics specific to VCPUs' mode of functioning. | IntGauge | | |
| kata_hypervisor_seccomp: Hypervisor metrics for the seccomp filtering. | IntGauge | | |
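The request/response pattern behind gathering hypervisor metrics through a channel can be sketched as follows. This is only an illustration under stated assumptions: the `Action` enum, the `Request` struct, and the functions `spawn_vmm` and `get_hypervisor_metrics` are hypothetical stand-ins built on `std::sync::mpsc`; only the variant name echoes `VmmAction::GetHypervisorMetrics`, and the real Dragonball types and reply format may differ.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical action type; only the variant name mirrors
/// `VmmAction::GetHypervisorMetrics` mentioned in the text.
enum Action {
    GetHypervisorMetrics,
}

/// Hypothetical request envelope: the action plus a sender for the reply.
struct Request {
    action: Action,
    reply: Sender<String>,
}

/// Spawn a stand-in "VMM" thread that answers metric requests.
fn spawn_vmm() -> Sender<Request> {
    let (tx, rx) = channel::<Request>();
    thread::spawn(move || {
        let mut scrape_count: u64 = 0;
        // The loop ends once every request sender has been dropped.
        for req in rx {
            match req.action {
                Action::GetHypervisorMetrics => {
                    scrape_count += 1;
                    let body = format!("kata_hypervisor_scrape_count {}", scrape_count);
                    // Ignore the error if the requester has gone away.
                    let _ = req.reply.send(body);
                }
            }
        }
    });
    tx
}

/// Send one metrics request and block until the reply arrives.
fn get_hypervisor_metrics(vmm: &Sender<Request>) -> Option<String> {
    let (reply_tx, reply_rx) = channel();
    vmm.send(Request {
        action: Action::GetHypervisorMetrics,
        reply: reply_tx,
    })
    .ok()?;
    reply_rx.recv().ok()
}
```

Carrying a per-request reply sender keeps the caller synchronous without shared state: the shim side sends a request, waits for the matching response, and can then merge the returned text into its own metrics output.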