Merge pull request #516 from liubin/feature/472-integrate-fc-metrics

fc: integrate Firecracker's metrics
This commit is contained in:
Peng Tao
2020-09-29 21:30:37 +08:00
committed by GitHub
4 changed files with 1352 additions and 6 deletions

View File

@@ -220,6 +220,566 @@ components:
fixed: false
values: []
since: 2.0.0
- prefix: kata_firecracker
title: Firecracker vmm metrics
desc: Metrics for Firecracker vmm
metrics:
- name: kata_firecracker_api_server
type: GAUGE
unit: ""
help: Metrics related to the internal API server.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: process_startup_time_cpu_us
desc: ""
- value: process_startup_time_us
desc: ""
- value: sync_response_fails
desc: ""
- value: sync_vmm_send_timeout_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_block
type: GAUGE
unit: ""
help: Block Device associated metrics.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: activate_fails
desc: ""
- value: cfg_fails
desc: ""
- value: event_fails
desc: ""
- value: execute_fails
desc: ""
- value: flush_count
desc: ""
- value: invalid_reqs_count
desc: ""
- value: no_avail_buffer
desc: ""
- value: queue_event_count
desc: ""
- value: rate_limiter_event_count
desc: ""
- value: rate_limiter_throttled_events
desc: ""
- value: read_bytes
desc: ""
- value: read_count
desc: ""
- value: update_count
desc: ""
- value: update_fails
desc: ""
- value: write_bytes
desc: ""
- value: write_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_get_api_requests
type: GAUGE
unit: ""
help: Metrics specific to GET API Requests for counting user triggered actions and/or failures.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: instance_info_count
desc: ""
- value: instance_info_fails
desc: ""
- value: machine_cfg_count
desc: ""
- value: machine_cfg_fails
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_i8042
type: GAUGE
unit: ""
help: Metrics specific to the i8042 device.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: error_count
desc: ""
- value: missed_read_count
desc: ""
- value: missed_write_count
desc: ""
- value: read_count
desc: ""
- value: reset_count
desc: ""
- value: write_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_latencies_us
type: GAUGE
unit: ""
help: Performance metrics related for the moment only to snapshots.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: diff_create_snapshot
desc: ""
- value: full_create_snapshot
desc: ""
- value: load_snapshot
desc: ""
- value: pause_vm
desc: ""
- value: resume_vm
desc: ""
- value: vmm_diff_create_snapshot
desc: ""
- value: vmm_full_create_snapshot
desc: ""
- value: vmm_load_snapshot
desc: ""
- value: vmm_pause_vm
desc: ""
- value: vmm_resume_vm
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_logger
type: GAUGE
unit: ""
help: Metrics for the logging subsystem.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: log_fails
desc: ""
- value: metrics_fails
desc: ""
- value: missed_log_count
desc: ""
- value: missed_metrics_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_mmds
type: GAUGE
unit: ""
help: Metrics for the MMDS functionality.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: connections_created
desc: ""
- value: connections_destroyed
desc: ""
- value: rx_accepted
desc: ""
- value: rx_accepted_err
desc: ""
- value: rx_accepted_unusual
desc: ""
- value: rx_bad_eth
desc: ""
- value: rx_count
desc: ""
- value: tx_bytes
desc: ""
- value: tx_count
desc: ""
- value: tx_errors
desc: ""
- value: tx_frames
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_net
type: GAUGE
unit: ""
help: Network-related metrics.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: activate_fails
desc: ""
- value: cfg_fails
desc: ""
- value: event_fails
desc: ""
- value: mac_address_updates
desc: ""
- value: no_rx_avail_buffer
desc: ""
- value: no_tx_avail_buffer
desc: ""
- value: rx_bytes_count
desc: ""
- value: rx_count
desc: ""
- value: rx_event_rate_limiter_count
desc: ""
- value: rx_fails
desc: ""
- value: rx_packets_count
desc: ""
- value: rx_partial_writes
desc: ""
- value: rx_queue_event_count
desc: ""
- value: rx_rate_limiter_throttled
desc: ""
- value: rx_tap_event_count
desc: ""
- value: tap_read_fails
desc: ""
- value: tap_write_fails
desc: ""
- value: tx_bytes_count
desc: ""
- value: tx_count
desc: ""
- value: tx_fails
desc: ""
- value: tx_malformed_frames
desc: ""
- value: tx_packets_count
desc: ""
- value: tx_partial_reads
desc: ""
- value: tx_queue_event_count
desc: ""
- value: tx_rate_limiter_event_count
desc: ""
- value: tx_rate_limiter_throttled
desc: ""
- value: tx_spoofed_mac_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_patch_api_requests
type: GAUGE
unit: ""
help: Metrics specific to PATCH API Requests for counting user triggered actions and/or failures.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: drive_count
desc: ""
- value: drive_fails
desc: ""
- value: machine_cfg_count
desc: ""
- value: machine_cfg_fails
desc: ""
- value: network_count
desc: ""
- value: network_fails
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_put_api_requests
type: GAUGE
unit: ""
help: Metrics specific to PUT API Requests for counting user triggered actions and/or failures.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: actions_count
desc: ""
- value: actions_fails
desc: ""
- value: boot_source_count
desc: ""
- value: boot_source_fails
desc: ""
- value: drive_count
desc: ""
- value: drive_fails
desc: ""
- value: logger_count
desc: ""
- value: logger_fails
desc: ""
- value: machine_cfg_count
desc: ""
- value: machine_cfg_fails
desc: ""
- value: metrics_count
desc: ""
- value: metrics_fails
desc: ""
- value: network_count
desc: ""
- value: network_fails
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_rtc
type: GAUGE
unit: ""
help: Metrics specific to the RTC device.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: error_count
desc: ""
- value: missed_read_count
desc: ""
- value: missed_write_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_seccomp
type: GAUGE
unit: ""
help: Metrics for the seccomp filtering.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: num_faults
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_signals
type: GAUGE
unit: ""
help: Metrics related to signals.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: sigbus
desc: ""
- value: sigsegv
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_uart
type: GAUGE
unit: ""
help: Metrics specific to the UART device.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: error_count
desc: ""
- value: flush_count
desc: ""
- value: missed_read_count
desc: ""
- value: missed_write_count
desc: ""
- value: read_count
desc: ""
- value: write_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_vcpu
type: GAUGE
unit: ""
help: Metrics specific to VCPUs' mode of functioning.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: exit_io_in
desc: ""
- value: exit_io_out
desc: ""
- value: exit_mmio_read
desc: ""
- value: exit_mmio_write
desc: ""
- value: failures
desc: ""
- value: filter_cpuid
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_vmm
type: GAUGE
unit: ""
help: Metrics specific to the machine manager as a whole.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: device_events
desc: ""
- value: panic_count
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- name: kata_firecracker_vsock
type: GAUGE
unit: ""
help: Vsock-related metrics.
labels:
- name: item
desc: ""
manually_edit: false
fixed: true
values:
- value: activate_fails
desc: ""
- value: cfg_fails
desc: ""
- value: conn_event_fails
desc: ""
- value: conns_added
desc: ""
- value: conns_killed
desc: ""
- value: conns_removed
desc: ""
- value: ev_queue_event_fails
desc: ""
- value: killq_resync
desc: ""
- value: muxer_event_fails
desc: ""
- value: rx_bytes_count
desc: ""
- value: rx_packets_count
desc: ""
- value: rx_queue_event_count
desc: ""
- value: rx_queue_event_fails
desc: ""
- value: rx_read_fails
desc: ""
- value: tx_bytes_count
desc: ""
- value: tx_flush_fails
desc: ""
- value: tx_packets_count
desc: ""
- value: tx_queue_event_count
desc: ""
- value: tx_queue_event_fails
desc: ""
- value: tx_write_fails
desc: ""
- name: sandbox_id
desc: ""
manually_edit: false
fixed: false
values: []
since: 2.0.0
- prefix: kata_guest
title: Kata guest OS metrics
desc: Guest OS's metrics in hypervisor.

View File

@@ -9,6 +9,7 @@
* [Metrics list](#metrics-list)
* [Metric types](#metric-types)
* [Kata agent metrics](#kata-agent-metrics)
* [Firecracker metrics](#firecracker-metrics)
* [Kata guest OS metrics](#kata-guest-os-metrics)
* [Hypervisor metrics](#hypervisor-metrics)
* [Kata monitor metrics](#kata-monitor-metrics)
@@ -152,6 +153,7 @@ Metrics is categorized by component where metrics are collected from and for.
* [Metric types](#metric-types)
* [Kata agent metrics](#kata-agent-metrics)
* [Firecracker metrics](#firecracker-metrics)
* [Kata guest OS metrics](#kata-guest-os-metrics)
* [Hypervisor metrics](#hypervisor-metrics)
* [Kata monitor metrics](#kata-monitor-metrics)
@@ -198,6 +200,30 @@ Agent's metrics contains metrics about agent process.
| `kata_agent_total_time`: <br> Agent process total time | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_total_vm`: <br> Agent process total `vm` size | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
### Firecracker metrics
Metrics for Firecracker vmm.
| Metric name | Type | Units | Labels | Introduced in Kata version |
|---|---|---|---|---|
| `kata_firecracker_api_server`: <br> Metrics related to the internal API server. | `GAUGE` | | <ul><li>`item`<ul><li>`process_startup_time_cpu_us`</li><li>`process_startup_time_us`</li><li>`sync_response_fails`</li><li>`sync_vmm_send_timeout_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_block`: <br> Block Device associated metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`event_fails`</li><li>`execute_fails`</li><li>`flush_count`</li><li>`invalid_reqs_count`</li><li>`no_avail_buffer`</li><li>`queue_event_count`</li><li>`rate_limiter_event_count`</li><li>`rate_limiter_throttled_events`</li><li>`read_bytes`</li><li>`read_count`</li><li>`update_count`</li><li>`update_fails`</li><li>`write_bytes`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_get_api_requests`: <br> Metrics specific to GET API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`instance_info_count`</li><li>`instance_info_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_i8042`: <br> Metrics specific to the i8042 device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`reset_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_latencies_us`: <br> Performance metrics related for the moment only to snapshots. | `GAUGE` | | <ul><li>`item`<ul><li>`diff_create_snapshot`</li><li>`full_create_snapshot`</li><li>`load_snapshot`</li><li>`pause_vm`</li><li>`resume_vm`</li><li>`vmm_diff_create_snapshot`</li><li>`vmm_full_create_snapshot`</li><li>`vmm_load_snapshot`</li><li>`vmm_pause_vm`</li><li>`vmm_resume_vm`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_logger`: <br> Metrics for the logging subsystem. | `GAUGE` | | <ul><li>`item`<ul><li>`log_fails`</li><li>`metrics_fails`</li><li>`missed_log_count`</li><li>`missed_metrics_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_mmds`: <br> Metrics for the MMDS functionality. | `GAUGE` | | <ul><li>`item`<ul><li>`connections_created`</li><li>`connections_destroyed`</li><li>`rx_accepted`</li><li>`rx_accepted_err`</li><li>`rx_accepted_unusual`</li><li>`rx_bad_eth`</li><li>`rx_count`</li><li>`tx_bytes`</li><li>`tx_count`</li><li>`tx_errors`</li><li>`tx_frames`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_net`: <br> Network-related metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`event_fails`</li><li>`mac_address_updates`</li><li>`no_rx_avail_buffer`</li><li>`no_tx_avail_buffer`</li><li>`rx_bytes_count`</li><li>`rx_count`</li><li>`rx_event_rate_limiter_count`</li><li>`rx_fails`</li><li>`rx_packets_count`</li><li>`rx_partial_writes`</li><li>`rx_queue_event_count`</li><li>`rx_rate_limiter_throttled`</li><li>`rx_tap_event_count`</li><li>`tap_read_fails`</li><li>`tap_write_fails`</li><li>`tx_bytes_count`</li><li>`tx_count`</li><li>`tx_fails`</li><li>`tx_malformed_frames`</li><li>`tx_packets_count`</li><li>`tx_partial_reads`</li><li>`tx_queue_event_count`</li><li>`tx_rate_limiter_event_count`</li><li>`tx_rate_limiter_throttled`</li><li>`tx_spoofed_mac_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_patch_api_requests`: <br> Metrics specific to PATCH API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`drive_count`</li><li>`drive_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li><li>`network_count`</li><li>`network_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_put_api_requests`: <br> Metrics specific to PUT API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`actions_count`</li><li>`actions_fails`</li><li>`boot_source_count`</li><li>`boot_source_fails`</li><li>`drive_count`</li><li>`drive_fails`</li><li>`logger_count`</li><li>`logger_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li><li>`metrics_count`</li><li>`metrics_fails`</li><li>`network_count`</li><li>`network_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_rtc`: <br> Metrics specific to the RTC device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_seccomp`: <br> Metrics for the seccomp filtering. | `GAUGE` | | <ul><li>`item`<ul><li>`num_faults`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_signals`: <br> Metrics related to signals. | `GAUGE` | | <ul><li>`item`<ul><li>`sigbus`</li><li>`sigsegv`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_uart`: <br> Metrics specific to the UART device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`flush_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vcpu`: <br> Metrics specific to VCPUs' mode of functioning. | `GAUGE` | | <ul><li>`item`<ul><li>`exit_io_in`</li><li>`exit_io_out`</li><li>`exit_mmio_read`</li><li>`exit_mmio_write`</li><li>`failures`</li><li>`filter_cpuid`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vmm`: <br> Metrics specific to the machine manager as a whole. | `GAUGE` | | <ul><li>`item`<ul><li>`device_events`</li><li>`panic_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vsock`: <br> Vsock-related metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`conn_event_fails`</li><li>`conns_added`</li><li>`conns_killed`</li><li>`conns_removed`</li><li>`ev_queue_event_fails`</li><li>`killq_resync`</li><li>`muxer_event_fails`</li><li>`rx_bytes_count`</li><li>`rx_packets_count`</li><li>`rx_queue_event_count`</li><li>`rx_queue_event_fails`</li><li>`rx_read_fails`</li><li>`tx_bytes_count`</li><li>`tx_flush_fails`</li><li>`tx_packets_count`</li><li>`tx_queue_event_count`</li><li>`tx_queue_event_fails`</li><li>`tx_write_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
### Kata guest OS metrics
Guest OS's metrics in hypervisor.