mirror of
https://github.com/kata-containers/kata-containers.git
synced 2026-03-14 08:42:15 +00:00
Compare commits
50 Commits
3.2.0-alph
...
3.2.0-rc0
| Author | SHA1 | Date | |
|---|---|---|---|
| | d424f3c595 | | |
| | cf8899f260 | | |
| | 542012c8be | | |
| | 5979f3790b | | |
| | 006ecce49a | | |
| | 6ad16d4977 | | |
| | 4e812009f5 | | |
| | 29855ed0c6 | | |
| | 025596b289 | | |
| | e1a69c0c92 | | |
| | 1a6b27bf6a | | |
| | a536d4a7bf | | |
| | ad6e53c399 | | |
| | 7ffc0c1225 | | |
| | 35d6d86ab5 | | |
| | f764248095 | | |
| | 2205fb9d05 | | |
| | 11631c681a | | |
| | 7923de8999 | | |
| | e2c31fce23 | | |
| | 2fc5f0e2e0 | | |
| | c0171ea0a7 | | |
| | 58f9a57c20 | | |
| | 07694ef3ae | | |
| | d8439dba89 | | |
| | bda83cee5d | | |
| | badff23c71 | | |
| | 27c02367f9 | | |
| | a0a524efc2 | | |
| | f5e9985afe | | |
| | f910c66d6f | | |
| | 1a94aad44f | | |
| | 6328181762 | | |
| | 8933d54428 | | |
| | 8a584589ff | | |
| | 21f5b65233 | | |
| | 69f05cf9e6 | | |
| | 87d41b3dfa | | |
| | cddcde1d40 | | |
| | b3901c46d6 | | |
| | 5a1b5d3672 | | |
| | ad413d1646 | | |
| | 1512560111 | | |
| | 62e328ca5c | | |
| | 458e1bc712 | | |
| | 1cc1c81c9a | | |
| | 1a5f90dc3f | | |
| | 731e7c763f | | |
| | d74639d8c6 | | |
| | 02cc4fe9db | | |
1
.github/workflows/ci-on-push.yaml
vendored
@@ -3,6 +3,7 @@ on:
|
||||
pull_request_target:
|
||||
branches:
|
||||
- 'main'
|
||||
- 'stable-*'
|
||||
types:
|
||||
# Adding 'labeled' to the list of activity types that trigger this event
|
||||
# (default: opened, synchronize, reopened) so that we can run this
|
||||
|
||||
5
.github/workflows/run-metrics.yaml
vendored
@@ -49,9 +49,12 @@ jobs:
|
||||
- name: run tensorflow test
|
||||
run: bash tests/metrics/gha-run.sh run-test-tensorflow
|
||||
|
||||
- name: run fio test
|
||||
run: bash tests/metrics/gha-run.sh run-test-fio
|
||||
|
||||
- name: make metrics tarball ${{ matrix.vmm }}
|
||||
run: bash tests/metrics/gha-run.sh make-tarball-results
|
||||
|
||||
|
||||
- name: archive metrics results ${{ matrix.vmm }}
|
||||
uses: actions/upload-artifact@v3
|
||||
with:
|
||||
|
||||
@@ -87,7 +87,8 @@ build_and_install_libseccomp() {
|
||||
curl -sLO "${libseccomp_tarball_url}"
|
||||
tar -xf "${libseccomp_tarball}"
|
||||
pushd "libseccomp-${libseccomp_version}"
|
||||
./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"
|
||||
[ "${arch}" == $(uname -m) ] && cc_name="" || cc_name="${arch}-linux-gnu-gcc"
|
||||
CC=${cc_name} ./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"
|
||||
make
|
||||
make install
|
||||
popd
|
||||
|
||||
@@ -14,6 +14,7 @@ Kata Containers design documents:
|
||||
- [`Inotify` support](inotify.md)
|
||||
- [`Hooks` support](hooks-handling.md)
|
||||
- [Metrics(Kata 2.0)](kata-2-0-metrics.md)
|
||||
- [Metrics in Rust Runtime(runtime-rs)](kata-metrics-in-runtime-rs.md)
|
||||
- [Design for Kata Containers `Lazyload` ability with `nydus`](kata-nydus-design.md)
|
||||
- [Design for direct-assigned volume](direct-blk-device-assignment.md)
|
||||
- [Design for core-scheduling](core-scheduling.md)
|
||||
|
||||
50
docs/design/kata-metrics-in-runtime-rs.md
Normal file
@@ -0,0 +1,50 @@
|
||||
# Kata Metrics in Rust Runtime(runtime-rs)
|
||||
|
||||
Rust Runtime(runtime-rs) is responsible for:
|
||||
|
||||
- Gathering metrics about the `shim`.
- Gathering metrics from the `hypervisor` (through a `channel`).
- Getting metrics from the `agent` (through `ttrpc`).
|
||||
|
||||
---
|
||||
|
||||
The metrics gathered by `runtime-rs` are listed below.
|
||||
|
||||
> * Current status of each entry is marked as:
|
||||
> * ✅:DONE
|
||||
> * 🚧:TODO
|
||||
|
||||
### Kata Shim
|
||||
|
||||
| STATUS | Metric name | Type | Units | Labels |
|
||||
| ------ | ------------------------------------------------------------ | ----------- | -------------- | ------------------------------------------------------------ |
|
||||
| 🚧 | `kata_shim_agent_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (RPC actions of Kata agent)<ul><li>`grpc.CheckRequest`</li><li>`grpc.CloseStdinRequest`</li><li>`grpc.CopyFileRequest`</li><li>`grpc.CreateContainerRequest`</li><li>`grpc.CreateSandboxRequest`</li><li>`grpc.DestroySandboxRequest`</li><li>`grpc.ExecProcessRequest`</li><li>`grpc.GetMetricsRequest`</li><li>`grpc.GuestDetailsRequest`</li><li>`grpc.ListInterfacesRequest`</li><li>`grpc.ListProcessesRequest`</li><li>`grpc.ListRoutesRequest`</li><li>`grpc.MemHotplugByProbeRequest`</li><li>`grpc.OnlineCPUMemRequest`</li><li>`grpc.PauseContainerRequest`</li><li>`grpc.RemoveContainerRequest`</li><li>`grpc.ReseedRandomDevRequest`</li><li>`grpc.ResumeContainerRequest`</li><li>`grpc.SetGuestDateTimeRequest`</li><li>`grpc.SignalProcessRequest`</li><li>`grpc.StartContainerRequest`</li><li>`grpc.StatsContainerRequest`</li><li>`grpc.TtyWinResizeRequest`</li><li>`grpc.UpdateContainerRequest`</li><li>`grpc.UpdateInterfaceRequest`</li><li>`grpc.UpdateRoutesRequest`</li><li>`grpc.WaitProcessRequest`</li><li>`grpc.WriteStreamRequest`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| ✅ | `kata_shim_fds`: <br> Kata containerd shim v2 open FDs. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
|
||||
| ✅ | `kata_shim_io_stat`: <br> Kata containerd shim v2 process IO statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/io`)<ul><li>`cancelledwritebytes`</li><li>`rchar`</li><li>`readbytes`</li><li>`syscr`</li><li>`syscw`</li><li>`wchar`</li><li>`writebytes`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| ✅ | `kata_shim_netdev`: <br> Kata containerd shim v2 network devices statistics. | `GAUGE` | | <ul><li>`interface` (network device name)</li><li>`item` (see `/proc/net/dev`)<ul><li>`recv_bytes`</li><li>`recv_compressed`</li><li>`recv_drop`</li><li>`recv_errs`</li><li>`recv_fifo`</li><li>`recv_frame`</li><li>`recv_multicast`</li><li>`recv_packets`</li><li>`sent_bytes`</li><li>`sent_carrier`</li><li>`sent_colls`</li><li>`sent_compressed`</li><li>`sent_drop`</li><li>`sent_errs`</li><li>`sent_fifo`</li><li>`sent_packets`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_pod_overhead_cpu`: <br> Kata Pod overhead for CPU resources(percent). | `GAUGE` | percent | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_pod_overhead_memory_in_bytes`: <br> Kata Pod overhead for memory resources(bytes). | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> |
|
||||
| ✅ | `kata_shim_proc_stat`: <br> Kata containerd shim v2 process statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| ✅ | `kata_shim_proc_status`: <br> Kata containerd shim v2 process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpmd`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_resident_memory_bytes`: <br> Resident memory size in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_virtual_memory_bytes`: <br> Virtual memory size in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_process_virtual_memory_max_bytes`: <br> Maximum amount of virtual memory available in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> |
|
||||
| 🚧 | `kata_shim_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (Kata shim v2 actions)<ul><li>`checkpoint`</li><li>`close_io`</li><li>`connect`</li><li>`create`</li><li>`delete`</li><li>`exec`</li><li>`kill`</li><li>`pause`</li><li>`pids`</li><li>`resize_pty`</li><li>`resume`</li><li>`shutdown`</li><li>`start`</li><li>`state`</li><li>`stats`</li><li>`update`</li><li>`wait`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| ✅ | `kata_shim_threads`: <br> Kata containerd shim v2 process threads. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
|
||||
|
||||
### Kata Hypervisor
|
||||
|
||||
Unlike the Go runtime, the hypervisor and the shim in runtime-rs belong to the **same process**, so the metrics listed above only need to be gathered once for both. For that reason, they are currently collected only under the Kata shim.
|
||||
|
||||
In addition, the interface `VmmAction::GetHypervisorMetrics` was added to gather hypervisor metrics, in case tailor-made hypervisor metrics are designed in the future. The following metrics are exposed from [src/dragonball/src/metric.rs](https://github.com/kata-containers/kata-containers/blob/main/src/dragonball/src/metric.rs).
|
||||
|
||||
| Metric name | Type | Units | Labels |
|
||||
| ------------------------------------------------------------ | ---------- | ----- | ------------------------------------------------------------ |
|
||||
| `kata_hypervisor_scrape_count`: <br> Metrics scrape count | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> |
|
||||
| `kata_hypervisor_vcpu`: <br>Hypervisor metrics specific to VCPUs' mode of functioning. | `IntGauge` | | <ul><li>`item`<ul><li>`exit_io_in`</li><li>`exit_io_out`</li><li>`exit_mmio_read`</li><li>`exit_mmio_write`</li><li>`failures`</li><li>`filter_cpuid`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| `kata_hypervisor_seccomp`: <br> Hypervisor metrics for the seccomp filtering. | `IntGauge` | | <ul><li>`item`<ul><li>`num_faults`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
| `kata_hypervisor_signals`: <br> Hypervisor metrics related to signals. | `IntGauge` | | <ul><li>`item`<ul><li>`sigbus`</li><li>`sigsegv`</li></ul></li><li>`sandbox_id`</li></ul> |
|
||||
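Review note: the hypervisor tables above describe gauge families keyed by an `item` label. As a rough illustration of what one such family looks like once encoded in the Prometheus text exposition format, here is a std-only sketch. The helper `render_gauge_vec` is hypothetical and is not part of this change, which uses the `prometheus` crate's `TextEncoder`. It renders only the `item` label, which is an assumption made for brevity.

```rust
/// Hypothetical helper (not from the patch): render one gauge family with a
/// single `item` label in the Prometheus text exposition format.
fn render_gauge_vec(name: &str, help: &str, samples: &[(&str, i64)]) -> String {
    let mut out = String::new();
    // Metadata lines come first: HELP text, then the metric TYPE.
    out.push_str(&format!("# HELP {} {}\n", name, help));
    out.push_str(&format!("# TYPE {} gauge\n", name));
    // One sample line per label value: name{item="..."} value
    for (item, value) in samples {
        out.push_str(&format!("{}{{item=\"{}\"}} {}\n", name, item, value));
    }
    out
}
```

For example, calling it with the name `kata_hypervisor_signals` and the samples `("sigbus", 0)` and `("sigsegv", 2)` yields two metadata lines followed by one sample line per item.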
1705
src/dragonball/Cargo.lock
generated
File diff suppressed because it is too large
@@ -10,6 +10,7 @@ license = "Apache-2.0"
|
||||
edition = "2018"
|
||||
|
||||
[dependencies]
|
||||
anyhow = "1.0.32"
|
||||
arc-swap = "1.5.0"
|
||||
bytes = "1.1.0"
|
||||
dbs-address-space = { path = "./src/dbs_address_space" }
|
||||
@@ -29,6 +30,8 @@ libc = "0.2.39"
|
||||
linux-loader = "0.6.0"
|
||||
log = "0.4.14"
|
||||
nix = "0.24.2"
|
||||
procfs = "0.12.0"
|
||||
prometheus = { version = "0.13.0", features = ["process"] }
|
||||
seccompiler = "0.2.0"
|
||||
serde = "1.0.27"
|
||||
serde_derive = "1.0.27"
|
||||
@@ -42,8 +45,8 @@ vm-memory = { version = "0.9.0", features = ["backend-mmap"] }
|
||||
crossbeam-channel = "0.5.6"
|
||||
|
||||
[dev-dependencies]
|
||||
slog-term = "2.9.0"
|
||||
slog-async = "2.7.0"
|
||||
slog-term = "2.9.0"
|
||||
test-utils = { path = "../libs/test-utils" }
|
||||
|
||||
[features]
|
||||
|
||||
@@ -16,6 +16,8 @@ use crate::event_manager::EventManager;
|
||||
use crate::vm::{CpuTopology, KernelConfigInfo, VmConfigInfo};
|
||||
use crate::vmm::Vmm;
|
||||
|
||||
use crate::hypervisor_metrics::get_hypervisor_metrics;
|
||||
|
||||
use self::VmConfigError::*;
|
||||
use self::VmmActionError::MachineConfig;
|
||||
|
||||
@@ -58,6 +60,11 @@ pub enum VmmActionError {
|
||||
#[error("Upcall not ready, can't hotplug device.")]
|
||||
UpcallServerNotReady,
|
||||
|
||||
    /// Error when getting Prometheus metrics.
|
||||
/// Currently does not distinguish between error types for metrics.
|
||||
#[error("failed to get hypervisor metrics")]
|
||||
GetHypervisorMetrics,
|
||||
|
||||
/// The action `ConfigureBootSource` failed either because of bad user input or an internal
|
||||
/// error.
|
||||
#[error("failed to configure boot source for VM: {0}")]
|
||||
@@ -135,6 +142,9 @@ pub enum VmmAction {
|
||||
/// Get the configuration of the microVM.
|
||||
GetVmConfiguration,
|
||||
|
||||
/// Get Prometheus Metrics.
|
||||
GetHypervisorMetrics,
|
||||
|
||||
/// Set the microVM configuration (memory & vcpu) using `VmConfig` as input. This
|
||||
/// action can only be called before the microVM has booted.
|
||||
SetVmConfiguration(VmConfigInfo),
|
||||
@@ -208,6 +218,8 @@ pub enum VmmData {
|
||||
Empty,
|
||||
/// The microVM configuration represented by `VmConfigInfo`.
|
||||
MachineConfiguration(Box<VmConfigInfo>),
|
||||
/// Prometheus Metrics represented by String.
|
||||
HypervisorMetrics(String),
|
||||
}
|
||||
|
||||
/// Request data type used to communicate between the API and the VMM.
|
||||
@@ -262,6 +274,7 @@ impl VmmService {
|
||||
VmmAction::GetVmConfiguration => Ok(VmmData::MachineConfiguration(Box::new(
|
||||
self.machine_config.clone(),
|
||||
))),
|
||||
VmmAction::GetHypervisorMetrics => self.get_hypervisor_metrics(),
|
||||
VmmAction::SetVmConfiguration(machine_config) => {
|
||||
self.set_vm_configuration(vmm, machine_config)
|
||||
}
|
||||
@@ -381,6 +394,13 @@ impl VmmService {
|
||||
Ok(VmmData::Empty)
|
||||
}
|
||||
|
||||
/// Get prometheus metrics.
|
||||
fn get_hypervisor_metrics(&self) -> VmmRequestResult {
|
||||
get_hypervisor_metrics()
|
||||
.map_err(|_| VmmActionError::GetHypervisorMetrics)
|
||||
.map(VmmData::HypervisorMetrics)
|
||||
}
|
||||
|
||||
/// Set virtual machine configuration.
|
||||
pub fn set_vm_configuration(
|
||||
&mut self,
|
||||
|
||||
102
src/dragonball/src/hypervisor_metrics.rs
Normal file
@@ -0,0 +1,102 @@
|
||||
// Copyright 2021-2022 Ant Group
|
||||
//
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
extern crate procfs;
|
||||
|
||||
use crate::metric::{IncMetric, METRICS};
|
||||
use anyhow::{anyhow, Result};
|
||||
use prometheus::{Encoder, IntCounter, IntGaugeVec, Opts, Registry, TextEncoder};
|
||||
use std::sync::Mutex;
|
||||
|
||||
const NAMESPACE_KATA_HYPERVISOR: &str = "kata_hypervisor";
|
||||
|
||||
lazy_static! {
|
||||
static ref REGISTERED: Mutex<bool> = Mutex::new(false);
|
||||
|
||||
// custom registry
|
||||
static ref REGISTRY: Registry = Registry::new();
|
||||
|
||||
// hypervisor metrics
|
||||
static ref HYPERVISOR_SCRAPE_COUNT: IntCounter =
|
||||
IntCounter::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"scrape_count"), "Hypervisor metrics scrape count.").unwrap();
|
||||
|
||||
static ref HYPERVISOR_VCPU: IntGaugeVec =
|
||||
IntGaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"vcpu"), "Hypervisor metrics specific to VCPUs' mode of functioning."), &["item"]).unwrap();
|
||||
|
||||
static ref HYPERVISOR_SECCOMP: IntGaugeVec =
|
||||
IntGaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"seccomp"), "Hypervisor metrics for the seccomp filtering."), &["item"]).unwrap();
|
||||
|
||||
static ref HYPERVISOR_SIGNALS: IntGaugeVec =
|
||||
IntGaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_HYPERVISOR,"signals"), "Hypervisor metrics related to signals."), &["item"]).unwrap();
|
||||
}
|
||||
|
||||
/// get prometheus metrics
|
||||
pub fn get_hypervisor_metrics() -> Result<String> {
|
||||
let mut registered = REGISTERED
|
||||
.lock()
|
||||
.map_err(|e| anyhow!("failed to check hypervisor metrics register status {:?}", e))?;
|
||||
|
||||
if !(*registered) {
|
||||
register_hypervisor_metrics()?;
|
||||
*registered = true;
|
||||
}
|
||||
|
||||
update_hypervisor_metrics()?;
|
||||
|
||||
// gather all metrics and return as a String
|
||||
let metric_families = REGISTRY.gather();
|
||||
|
||||
let mut buffer = Vec::new();
|
||||
let encoder = TextEncoder::new();
|
||||
encoder.encode(&metric_families, &mut buffer)?;
|
||||
|
||||
Ok(String::from_utf8(buffer)?)
|
||||
}
|
||||
|
||||
fn register_hypervisor_metrics() -> Result<()> {
|
||||
REGISTRY.register(Box::new(HYPERVISOR_SCRAPE_COUNT.clone()))?;
|
||||
REGISTRY.register(Box::new(HYPERVISOR_VCPU.clone()))?;
|
||||
REGISTRY.register(Box::new(HYPERVISOR_SECCOMP.clone()))?;
|
||||
REGISTRY.register(Box::new(HYPERVISOR_SIGNALS.clone()))?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn update_hypervisor_metrics() -> Result<()> {
|
||||
HYPERVISOR_SCRAPE_COUNT.inc();
|
||||
|
||||
set_intgauge_vec_vcpu(&HYPERVISOR_VCPU);
|
||||
set_intgauge_vec_seccomp(&HYPERVISOR_SECCOMP);
|
||||
set_intgauge_vec_signals(&HYPERVISOR_SIGNALS);
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn set_intgauge_vec_vcpu(icv: &prometheus::IntGaugeVec) {
|
||||
icv.with_label_values(&["exit_io_in"])
|
||||
.set(METRICS.vcpu.exit_io_in.count() as i64);
|
||||
icv.with_label_values(&["exit_io_out"])
|
||||
.set(METRICS.vcpu.exit_io_out.count() as i64);
|
||||
icv.with_label_values(&["exit_mmio_read"])
|
||||
.set(METRICS.vcpu.exit_mmio_read.count() as i64);
|
||||
icv.with_label_values(&["exit_mmio_write"])
|
||||
.set(METRICS.vcpu.exit_mmio_write.count() as i64);
|
||||
icv.with_label_values(&["failures"])
|
||||
.set(METRICS.vcpu.failures.count() as i64);
|
||||
icv.with_label_values(&["filter_cpuid"])
|
||||
.set(METRICS.vcpu.filter_cpuid.count() as i64);
|
||||
}
|
||||
|
||||
fn set_intgauge_vec_seccomp(icv: &prometheus::IntGaugeVec) {
|
||||
icv.with_label_values(&["num_faults"])
|
||||
.set(METRICS.seccomp.num_faults.count() as i64);
|
||||
}
|
||||
|
||||
fn set_intgauge_vec_signals(icv: &prometheus::IntGaugeVec) {
|
||||
icv.with_label_values(&["sigbus"])
|
||||
.set(METRICS.signals.sigbus.count() as i64);
|
||||
icv.with_label_values(&["sigsegv"])
|
||||
.set(METRICS.signals.sigsegv.count() as i64);
|
||||
}
|
||||
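Review note: `get_hypervisor_metrics()` above guards registration with a `Mutex<bool>` so that the collectors are registered on the first scrape only. The same register-once pattern can be sketched with the standard library alone. Everything here (`register`, `scrape`, `SETUP_RUNS`) is a hypothetical stand-in, and the sketch assumes a toolchain where `Mutex::new` is usable in a `static` (Rust 1.63 or later) rather than `lazy_static!`.

```rust
use std::sync::Mutex;

// Guard flag: becomes true once the one-time setup has succeeded.
static REGISTERED: Mutex<bool> = Mutex::new(false);
// Illustration only: counts how many times the setup actually ran.
static SETUP_RUNS: Mutex<u32> = Mutex::new(0);

/// Hypothetical stand-in for `register_hypervisor_metrics()`.
fn register() -> Result<(), String> {
    let mut runs = SETUP_RUNS.lock().map_err(|e| format!("{e:?}"))?;
    *runs += 1;
    Ok(())
}

/// Hypothetical stand-in for a scrape: registers on first use, then
/// returns how many times registration has run so far.
fn scrape() -> Result<u32, String> {
    {
        let mut registered = REGISTERED.lock().map_err(|e| format!("{e:?}"))?;
        if !*registered {
            register()?;
            // Only flip the flag after a successful registration, so a
            // failed attempt is retried on the next scrape.
            *registered = true;
        }
    }
    let runs = SETUP_RUNS.lock().map_err(|e| format!("{e:?}"))?;
    Ok(*runs)
}
```

Setting the flag only after `register()` returns `Ok` mirrors the ordering in the patch. Calling `scrape()` repeatedly keeps the registration count at one.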
@@ -9,6 +9,9 @@
|
||||
//TODO: Remove this, after the rest of dragonball has been committed.
|
||||
#![allow(dead_code)]
|
||||
|
||||
#[macro_use]
|
||||
extern crate lazy_static;
|
||||
|
||||
/// Address space manager for virtual machines.
|
||||
pub mod address_space_manager;
|
||||
/// API to handle vmm requests.
|
||||
@@ -19,6 +22,8 @@ pub mod config_manager;
|
||||
pub mod device_manager;
|
||||
/// Errors related to Virtual machine manager.
|
||||
pub mod error;
|
||||
/// Prometheus Metrics.
|
||||
pub mod hypervisor_metrics;
|
||||
/// KVM operation context for virtual machines.
|
||||
pub mod kvm_context;
|
||||
/// Metrics system.
|
||||
|
||||
590
src/runtime-rs/Cargo.lock
generated
File diff suppressed because it is too large
@@ -121,5 +121,6 @@ impl_agent!(
|
||||
set_ip_tables | crate::SetIPTablesRequest | crate::SetIPTablesResponse | None,
|
||||
get_volume_stats | crate::VolumeStatsRequest | crate::VolumeStatsResponse | None,
|
||||
resize_volume | crate::ResizeVolumeRequest | crate::Empty | None,
|
||||
online_cpu_mem | crate::OnlineCPUMemRequest | crate::Empty | None
|
||||
online_cpu_mem | crate::OnlineCPUMemRequest | crate::Empty | None,
|
||||
get_metrics | crate::Empty | crate::MetricsResponse | None
|
||||
);
|
||||
|
||||
@@ -7,7 +7,7 @@
|
||||
use std::convert::Into;
|
||||
|
||||
use protocols::{
|
||||
agent::{self, OOMEvent},
|
||||
agent::{self, Metrics, OOMEvent},
|
||||
csi, empty, health, types,
|
||||
};
|
||||
|
||||
@@ -19,13 +19,13 @@ use crate::{
|
||||
Empty, ExecProcessRequest, FSGroup, FSGroupChangePolicy, GetIPTablesRequest,
|
||||
GetIPTablesResponse, GuestDetailsResponse, HealthCheckResponse, HugetlbStats, IPAddress,
|
||||
IPFamily, Interface, Interfaces, KernelModule, MemHotplugByProbeRequest, MemoryData,
|
||||
MemoryStats, NetworkStats, OnlineCPUMemRequest, PidsStats, ReadStreamRequest,
|
||||
ReadStreamResponse, RemoveContainerRequest, ReseedRandomDevRequest, ResizeVolumeRequest,
|
||||
Route, Routes, SetGuestDateTimeRequest, SetIPTablesRequest, SetIPTablesResponse,
|
||||
SignalProcessRequest, StatsContainerResponse, Storage, StringUser, ThrottlingData,
|
||||
TtyWinResizeRequest, UpdateContainerRequest, UpdateInterfaceRequest, UpdateRoutesRequest,
|
||||
VersionCheckResponse, VolumeStatsRequest, VolumeStatsResponse, WaitProcessRequest,
|
||||
WriteStreamRequest,
|
||||
MemoryStats, MetricsResponse, NetworkStats, OnlineCPUMemRequest, PidsStats,
|
||||
ReadStreamRequest, ReadStreamResponse, RemoveContainerRequest, ReseedRandomDevRequest,
|
||||
ResizeVolumeRequest, Route, Routes, SetGuestDateTimeRequest, SetIPTablesRequest,
|
||||
SetIPTablesResponse, SignalProcessRequest, StatsContainerResponse, Storage, StringUser,
|
||||
ThrottlingData, TtyWinResizeRequest, UpdateContainerRequest, UpdateInterfaceRequest,
|
||||
UpdateRoutesRequest, VersionCheckResponse, VolumeStatsRequest, VolumeStatsResponse,
|
||||
WaitProcessRequest, WriteStreamRequest,
|
||||
},
|
||||
OomEventResponse, WaitProcessResponse, WriteStreamResponse,
|
||||
};
|
||||
@@ -755,6 +755,14 @@ impl From<agent::WaitProcessResponse> for WaitProcessResponse {
|
||||
}
|
||||
}
|
||||
|
||||
impl From<Empty> for agent::GetMetricsRequest {
|
||||
fn from(_: Empty) -> Self {
|
||||
Self {
|
||||
..Default::default()
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl From<Empty> for agent::GetOOMEventRequest {
|
||||
fn from(_: Empty) -> Self {
|
||||
Self {
|
||||
@@ -789,6 +797,14 @@ impl From<health::VersionCheckResponse> for VersionCheckResponse {
|
||||
}
|
||||
}
|
||||
|
||||
impl From<agent::Metrics> for MetricsResponse {
|
||||
fn from(from: Metrics) -> Self {
|
||||
Self {
|
||||
metrics: from.metrics,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl From<agent::OOMEvent> for OomEventResponse {
|
||||
fn from(from: OOMEvent) -> Self {
|
||||
Self {
|
||||
|
||||
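Review note: the two `From` impls added above convert between the runtime's own request/response types and the generated protocol types. A minimal sketch of how such a pair might be used at a call boundary follows. The `wire` module, `fake_transport`, and the sample metric text are hypothetical stand-ins; the real types live in the `protocols` crate, and the actual call path goes through the `impl_agent!` macro, which is not shown in this diff.

```rust
/// Hypothetical stand-ins for the generated protocol types.
mod wire {
    #[derive(Debug, Default, PartialEq)]
    pub struct GetMetricsRequest {}

    #[derive(Debug, Default)]
    pub struct Metrics {
        pub metrics: String,
    }
}

/// Simplified stand-in for the runtime-side empty request.
#[derive(Debug, Default)]
pub struct Empty;

/// Same shape as the `MetricsResponse` added in this change.
#[derive(PartialEq, Clone, Default, Debug)]
pub struct MetricsResponse {
    pub metrics: String,
}

impl From<Empty> for wire::GetMetricsRequest {
    fn from(_: Empty) -> Self {
        Self::default()
    }
}

impl From<wire::Metrics> for MetricsResponse {
    fn from(from: wire::Metrics) -> Self {
        Self { metrics: from.metrics }
    }
}

/// Hypothetical transport: accepts a wire request, returns a wire response.
fn fake_transport(_req: wire::GetMetricsRequest) -> wire::Metrics {
    wire::Metrics { metrics: "example_metric 1\n".to_string() }
}

/// Convert on the way in and on the way out, so callers never see wire types.
pub fn get_metrics(req: Empty) -> MetricsResponse {
    fake_transport(req.into()).into()
}
```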
@@ -18,13 +18,14 @@ pub use types::{
|
||||
CloseStdinRequest, ContainerID, ContainerProcessID, CopyFileRequest, CreateContainerRequest,
|
||||
CreateSandboxRequest, Empty, ExecProcessRequest, GetGuestDetailsRequest, GetIPTablesRequest,
|
||||
GetIPTablesResponse, GuestDetailsResponse, HealthCheckResponse, IPAddress, IPFamily, Interface,
|
||||
Interfaces, ListProcessesRequest, MemHotplugByProbeRequest, OnlineCPUMemRequest,
|
||||
OomEventResponse, ReadStreamRequest, ReadStreamResponse, RemoveContainerRequest,
|
||||
ReseedRandomDevRequest, ResizeVolumeRequest, Route, Routes, SetGuestDateTimeRequest,
|
||||
SetIPTablesRequest, SetIPTablesResponse, SignalProcessRequest, StatsContainerResponse, Storage,
|
||||
TtyWinResizeRequest, UpdateContainerRequest, UpdateInterfaceRequest, UpdateRoutesRequest,
|
||||
VersionCheckResponse, VolumeStatsRequest, VolumeStatsResponse, WaitProcessRequest,
|
||||
WaitProcessResponse, WriteStreamRequest, WriteStreamResponse,
|
||||
Interfaces, ListProcessesRequest, MemHotplugByProbeRequest, MetricsResponse,
|
||||
OnlineCPUMemRequest, OomEventResponse, ReadStreamRequest, ReadStreamResponse,
|
||||
RemoveContainerRequest, ReseedRandomDevRequest, ResizeVolumeRequest, Route, Routes,
|
||||
SetGuestDateTimeRequest, SetIPTablesRequest, SetIPTablesResponse, SignalProcessRequest,
|
||||
StatsContainerResponse, Storage, TtyWinResizeRequest, UpdateContainerRequest,
|
||||
UpdateInterfaceRequest, UpdateRoutesRequest, VersionCheckResponse, VolumeStatsRequest,
|
||||
VolumeStatsResponse, WaitProcessRequest, WaitProcessResponse, WriteStreamRequest,
|
||||
WriteStreamResponse,
|
||||
};
|
||||
|
||||
use anyhow::Result;
|
||||
@@ -86,6 +87,7 @@ pub trait Agent: AgentManager + HealthService + Send + Sync {
|
||||
|
||||
// utils
|
||||
async fn copy_file(&self, req: CopyFileRequest) -> Result<Empty>;
|
||||
async fn get_metrics(&self, req: Empty) -> Result<MetricsResponse>;
|
||||
async fn get_oom_event(&self, req: Empty) -> Result<OomEventResponse>;
|
||||
async fn get_ip_tables(&self, req: GetIPTablesRequest) -> Result<GetIPTablesResponse>;
|
||||
async fn set_ip_tables(&self, req: SetIPTablesRequest) -> Result<SetIPTablesResponse>;
|
||||
|
||||
@@ -556,6 +556,11 @@ pub struct VersionCheckResponse {
|
||||
pub agent_version: String,
|
||||
}
|
||||
|
||||
#[derive(PartialEq, Clone, Default, Debug)]
|
||||
pub struct MetricsResponse {
|
||||
pub metrics: String,
|
||||
}
|
||||
|
||||
#[derive(PartialEq, Clone, Default, Debug)]
|
||||
pub struct OomEventResponse {
|
||||
pub container_id: String,
|
||||
|
||||
@@ -536,6 +536,10 @@ impl CloudHypervisorInner {
|
||||
caps.set(CapabilityBits::FsSharingSupport);
|
||||
Ok(caps)
|
||||
}
|
||||
|
||||
pub(crate) async fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
todo!()
|
||||
}
|
||||
}
|
||||
|
||||
// Log all output from the CH process until a shutdown signal is received.
|
||||
|
||||
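Review note: `todo!()` panics when it is reached, so a metrics request against a hypervisor whose `get_hypervisor_metrics()` is still a stub would panic rather than return an error. One alternative, sketched below under simplifying assumptions, is a default trait method that returns an error. The trait `HypervisorMetrics` and the unit structs are hypothetical and synchronous; the real `Hypervisor` trait is async and returns `anyhow::Result<String>`.

```rust
/// Hypothetical, synchronous stand-in for the new `Hypervisor` trait method.
trait HypervisorMetrics {
    fn name(&self) -> &str;

    /// Default: report "not implemented" as an error instead of panicking.
    fn get_hypervisor_metrics(&self) -> Result<String, String> {
        Err(format!("{}: hypervisor metrics are not implemented", self.name()))
    }
}

struct Qemu;
struct Dragonball;

// No override: callers get an `Err` they can log or skip.
impl HypervisorMetrics for Qemu {
    fn name(&self) -> &str {
        "qemu"
    }
}

// Override: this backend actually produces metrics text.
impl HypervisorMetrics for Dragonball {
    fn name(&self) -> &str {
        "dragonball"
    }

    fn get_hypervisor_metrics(&self) -> Result<String, String> {
        Ok("kata_hypervisor_scrape_count 1\n".to_string())
    }
}
```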
@@ -152,6 +152,11 @@ impl Hypervisor for CloudHypervisor {
|
||||
let inner = self.inner.read().await;
|
||||
inner.capabilities().await
|
||||
}
|
||||
|
||||
async fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
let inner = self.inner.read().await;
|
||||
inner.get_hypervisor_metrics().await
|
||||
}
|
||||
}
|
||||
|
||||
#[async_trait]
|
||||
|
||||
@@ -92,6 +92,11 @@ impl DragonballInner {
|
||||
))
|
||||
}
|
||||
|
||||
pub(crate) async fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
info!(sl!(), "get hypervisor metrics");
|
||||
self.vmm_instance.get_hypervisor_metrics()
|
||||
}
|
||||
|
||||
pub(crate) async fn disconnect(&mut self) {
|
||||
self.state = VmmState::NotReady;
|
||||
}
|
||||
|
||||
@@ -160,6 +160,11 @@ impl Hypervisor for Dragonball {
|
||||
let inner = self.inner.read().await;
|
||||
inner.capabilities().await
|
||||
}
|
||||
|
||||
async fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
let inner = self.inner.read().await;
|
||||
inner.get_hypervisor_metrics().await
|
||||
}
|
||||
}
|
||||
|
||||
#[async_trait]
|
||||
|
||||
@@ -267,6 +267,15 @@ impl VmmInstance {
|
||||
std::process::id()
|
||||
}
|
||||
|
||||
pub fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
if let Ok(VmmData::HypervisorMetrics(metrics)) =
|
||||
self.handle_request(Request::Sync(VmmAction::GetHypervisorMetrics))
|
||||
{
|
||||
return Ok(metrics);
|
||||
}
|
||||
Err(anyhow!("Failed to get hypervisor metrics"))
|
||||
}
|
||||
|
||||
pub fn stop(&mut self) -> Result<()> {
|
||||
self.handle_request(Request::Sync(VmmAction::ShutdownMicroVm))
|
||||
.map_err(|e| {
|
||||
|
||||
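Review note: the `if let Ok(VmmData::HypervisorMetrics(metrics))` form in `VmmInstance::get_hypervisor_metrics()` collapses two different failures (an error from the VMM, or an unexpected response variant) into one generic message. A `match` can keep them apart. The enums below are simplified stand-ins that reuse names from this diff, and `extract_metrics` is a hypothetical helper, not code from the patch.

```rust
/// Simplified stand-in for the response enum in this diff.
#[derive(Debug, PartialEq)]
enum VmmData {
    Empty,
    HypervisorMetrics(String),
}

/// Simplified stand-in for the error enum in this diff.
#[derive(Debug, PartialEq)]
enum VmmActionError {
    GetHypervisorMetrics,
}

/// Hypothetical helper: unpack the metrics string, keeping the reason for
/// failure in the error message.
fn extract_metrics(resp: Result<VmmData, VmmActionError>) -> Result<String, String> {
    match resp {
        Ok(VmmData::HypervisorMetrics(metrics)) => Ok(metrics),
        // The request succeeded but returned a different variant.
        Ok(other) => Err(format!("unexpected VMM response: {:?}", other)),
        // The request itself failed; keep the original error visible.
        Err(e) => Err(format!("failed to get hypervisor metrics: {:?}", e)),
    }
}
```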
@@ -97,4 +97,5 @@ pub trait Hypervisor: std::fmt::Debug + Send + Sync {
|
||||
async fn get_jailer_root(&self) -> Result<String>;
|
||||
async fn save_state(&self) -> Result<HypervisorState>;
|
||||
async fn capabilities(&self) -> Result<Capabilities>;
|
||||
async fn get_hypervisor_metrics(&self) -> Result<String>;
|
||||
}
|
||||
|
||||
@@ -136,6 +136,10 @@ impl QemuInner {
|
||||
info!(sl!(), "QemuInner::hypervisor_config()");
|
||||
self.config.clone()
|
||||
}
|
||||
|
||||
pub(crate) async fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
todo!()
|
||||
}
|
||||
}
|
||||
|
||||
use crate::device::DeviceType;
|
||||
|
||||
@@ -147,4 +147,9 @@ impl Hypervisor for Qemu {
|
||||
let inner = self.inner.read().await;
|
||||
inner.capabilities().await
|
||||
}
|
||||
|
||||
async fn get_hypervisor_metrics(&self) -> Result<String> {
|
||||
let inner = self.inner.read().await;
|
||||
inner.get_hypervisor_metrics().await
|
||||
}
|
||||
}
|
||||
|
||||
@@ -22,6 +22,8 @@ hyperlocal = "0.8"
|
||||
serde_json = "1.0.88"
|
||||
nix = "0.25.0"
|
||||
url = "2.3.1"
|
||||
procfs = "0.12.0"
|
||||
prometheus = { version = "0.13.0", features = ["process"] }
|
||||
|
||||
agent = { path = "../agent" }
|
||||
common = { path = "./common" }
|
||||
|
||||
@@ -41,4 +41,8 @@ pub trait Sandbox: Send + Sync {
|
||||
async fn direct_volume_stats(&self, volume_path: &str) -> Result<String>;
|
||||
async fn direct_volume_resize(&self, resize_req: agent::ResizeVolumeRequest) -> Result<()>;
|
||||
async fn agent_sock(&self) -> Result<String>;
|
||||
|
||||
// metrics function
|
||||
async fn agent_metrics(&self) -> Result<String>;
|
||||
async fn hypervisor_metrics(&self) -> Result<String>;
|
||||
}
|
||||
|
||||
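Review note: with `agent_metrics()` and `hypervisor_metrics()` on the `Sandbox` trait, a caller can combine them with the shim's own metrics into one text payload. How runtime-rs actually does this is not shown in this part of the diff. The sketch below only illustrates one possible shape, using a hypothetical synchronous trait `SandboxMetrics` and a hypothetical `collect_metrics` helper, and it assumes that a failing source should be skipped with a comment line instead of failing the whole scrape.

```rust
/// Hypothetical, synchronous stand-in for the two new `Sandbox` methods.
trait SandboxMetrics {
    fn agent_metrics(&self) -> Result<String, String>;
    fn hypervisor_metrics(&self) -> Result<String, String>;
}

/// Hypothetical helper: append hypervisor and agent metrics to the shim's.
fn collect_metrics(sandbox: &dyn SandboxMetrics, shim_metrics: &str) -> String {
    let mut out = String::from(shim_metrics);
    let sources = vec![
        ("hypervisor", sandbox.hypervisor_metrics()),
        ("agent", sandbox.agent_metrics()),
    ];
    for (name, result) in sources {
        match result {
            Ok(text) => out.push_str(&text),
            // Lines starting with '#' that are not HELP/TYPE are comments
            // in the Prometheus text format.
            Err(e) => out.push_str(&format!("# {} metrics unavailable: {}\n", name, e)),
        }
    }
    out
}

/// Fake sandbox for illustration: the agent is unreachable.
struct FakeSandbox;

impl SandboxMetrics for FakeSandbox {
    fn agent_metrics(&self) -> Result<String, String> {
        Err("agent not reachable".to_string())
    }

    fn hypervisor_metrics(&self) -> Result<String, String> {
        Ok("kata_hypervisor_scrape_count 1\n".to_string())
    }
}
```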
@@ -4,6 +4,9 @@
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#[macro_use(lazy_static)]
|
||||
extern crate lazy_static;
|
||||
|
||||
#[macro_use]
|
||||
extern crate slog;
|
||||
|
||||
@@ -12,5 +15,6 @@ logging::logger_with_subsystem!(sl, "runtimes");
|
||||
pub mod manager;
|
||||
pub use manager::RuntimeHandlerManager;
|
||||
pub use shim_interface;
|
||||
mod shim_metrics;
|
||||
mod shim_mgmt;
|
||||
pub mod tracer;
|
||||
|
||||
@@ -221,7 +221,7 @@ impl std::fmt::Debug for RuntimeHandlerManager {
|
||||
}
|
||||
|
||||
impl RuntimeHandlerManager {
|
||||
pub async fn new(id: &str, msg_sender: Sender<Message>) -> Result<Self> {
|
||||
pub fn new(id: &str, msg_sender: Sender<Message>) -> Result<Self> {
|
||||
Ok(Self {
|
||||
inner: Arc::new(RwLock::new(RuntimeHandlerManagerInner::new(
|
||||
id, msg_sender,
|
||||
|
||||
235
src/runtime-rs/crates/runtimes/src/shim_metrics.rs
Normal file
@@ -0,0 +1,235 @@
|
||||
// Copyright 2021-2022 Ant Group
|
||||
//
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
extern crate procfs;
|
||||
|
||||
use anyhow::{anyhow, Result};
|
||||
use prometheus::{Encoder, Gauge, GaugeVec, Opts, Registry, TextEncoder};
|
||||
use slog::warn;
|
||||
use std::sync::Mutex;
|
||||
|
||||
const NAMESPACE_KATA_SHIM: &str = "kata_shim";
|
||||
|
||||
// Convenience macro to obtain the scope logger
|
||||
macro_rules! sl {
|
||||
() => {
|
||||
slog_scope::logger().new(o!("subsystem" => "metrics"))
|
||||
};
|
||||
}
|
||||
|
||||
lazy_static! {
|
||||
static ref REGISTERED: Mutex<bool> = Mutex::new(false);
|
||||
|
||||
// custom registry
|
||||
static ref REGISTRY: Registry = Registry::new();
|
||||
|
||||
// shim metrics
|
||||
static ref SHIM_THREADS: Gauge = Gauge::new(format!("{}_{}", NAMESPACE_KATA_SHIM, "threads"),"Kata containerd shim v2 process threads.").unwrap();
|
||||
|
||||
static ref SHIM_PROC_STATUS: GaugeVec =
|
||||
GaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_SHIM,"proc_status"), "Kata containerd shim v2 process status."), &["item"]).unwrap();
|
||||
|
||||
static ref SHIM_PROC_STAT: GaugeVec = GaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_SHIM,"proc_stat"), "Kata containerd shim v2 process statistics."), &["item"]).unwrap();
|
||||
|
||||
static ref SHIM_NETDEV: GaugeVec = GaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_SHIM,"netdev"), "Kata containerd shim v2 network devices statistics."), &["interface", "item"]).unwrap();
|
||||
|
||||
static ref SHIM_IO_STAT: GaugeVec = GaugeVec::new(Opts::new(format!("{}_{}",NAMESPACE_KATA_SHIM,"io_stat"), "Kata containerd shim v2 process IO statistics."), &["item"]).unwrap();
|
||||
|
||||
static ref SHIM_OPEN_FDS: Gauge = Gauge::new(format!("{}_{}", NAMESPACE_KATA_SHIM, "fds"), "Kata containerd shim v2 open FDs.").unwrap();
|
||||
}
|
||||
|
||||
pub fn get_shim_metrics() -> Result<String> {
|
||||
let mut registered = REGISTERED
|
||||
.lock()
|
||||
.map_err(|e| anyhow!("failed to check shim metrics register status {:?}", e))?;
|
||||
|
||||
if !(*registered) {
|
||||
register_shim_metrics()?;
|
||||
*registered = true;
|
||||
}
|
||||
|
||||
update_shim_metrics()?;
|
||||
|
||||
// gather all metrics and return as a String
|
||||
let metric_families = REGISTRY.gather();
|
||||
|
||||
let mut buffer = Vec::new();
|
||||
let encoder = TextEncoder::new();
|
||||
encoder.encode(&metric_families, &mut buffer)?;
|
||||
|
||||
Ok(String::from_utf8(buffer)?)
|
||||
}
|
||||
|
||||
fn register_shim_metrics() -> Result<()> {
|
||||
REGISTRY.register(Box::new(SHIM_THREADS.clone()))?;
|
||||
REGISTRY.register(Box::new(SHIM_PROC_STATUS.clone()))?;
|
||||
REGISTRY.register(Box::new(SHIM_PROC_STAT.clone()))?;
|
||||
REGISTRY.register(Box::new(SHIM_NETDEV.clone()))?;
|
||||
REGISTRY.register(Box::new(SHIM_IO_STAT.clone()))?;
|
||||
REGISTRY.register(Box::new(SHIM_OPEN_FDS.clone()))?;
|
||||
|
||||
// TODO:
|
||||
// REGISTRY.register(Box::new(RPC_DURATIONS_HISTOGRAM.clone()))?;
|
||||
// REGISTRY.register(Box::new(SHIM_POD_OVERHEAD_CPU.clone()))?;
|
||||
// REGISTRY.register(Box::new(SHIM_POD_OVERHEAD_MEMORY.clone()))?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn update_shim_metrics() -> Result<()> {
|
||||
let me = procfs::process::Process::myself();
|
||||
|
||||
let me = match me {
|
||||
Ok(p) => p,
|
||||
Err(e) => {
|
||||
warn!(sl!(), "failed to create process instance: {:?}", e);
|
||||
return Ok(());
|
||||
}
|
||||
};
|
||||
|
||||
SHIM_THREADS.set(me.stat.num_threads as f64);
|
||||
|
||||
match me.status() {
|
||||
Err(err) => error!(sl!(), "failed to get process status: {:?}", err),
|
||||
Ok(status) => set_gauge_vec_proc_status(&SHIM_PROC_STATUS, &status),
|
||||
}
|
||||
|
||||
match me.stat() {
|
||||
Err(err) => {
|
||||
error!(sl!(), "failed to get process stat: {:?}", err);
|
||||
}
|
||||
Ok(stat) => {
|
||||
set_gauge_vec_proc_stat(&SHIM_PROC_STAT, &stat);
|
||||
}
|
||||
}
|
||||
|
||||
match procfs::net::dev_status() {
|
||||
Err(err) => {
|
||||
error!(sl!(), "failed to get host net::dev_status: {:?}", err);
|
||||
}
|
||||
Ok(devs) => {
|
||||
for (_, status) in devs {
|
||||
set_gauge_vec_netdev(&SHIM_NETDEV, &status);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
match me.io() {
|
||||
Err(err) => {
|
||||
error!(sl!(), "failed to get process io stat: {:?}", err);
|
||||
}
|
||||
Ok(io) => {
|
||||
set_gauge_vec_proc_io(&SHIM_IO_STAT, &io);
|
||||
}
|
||||
}
|
||||
|
||||
match me.fd_count() {
|
||||
Err(err) => {
|
||||
error!(sl!(), "failed to get process open fds number: {:?}", err);
|
||||
}
|
||||
Ok(fds) => {
|
||||
SHIM_OPEN_FDS.set(fds as f64);
|
||||
}
|
||||
}
|
||||
|
||||
// TODO:
|
||||
// RPC_DURATIONS_HISTOGRAM & SHIM_POD_OVERHEAD_CPU & SHIM_POD_OVERHEAD_MEMORY
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn set_gauge_vec_proc_status(gv: &prometheus::GaugeVec, status: &procfs::process::Status) {
|
||||
gv.with_label_values(&["vmpeak"])
|
||||
.set(status.vmpeak.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmsize"])
|
||||
.set(status.vmsize.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmlck"])
|
||||
.set(status.vmlck.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmpin"])
|
||||
.set(status.vmpin.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmhwm"])
|
||||
.set(status.vmhwm.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmrss"])
|
||||
.set(status.vmrss.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["rssanon"])
|
||||
.set(status.rssanon.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["rssfile"])
|
||||
.set(status.rssfile.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["rssshmem"])
|
||||
.set(status.rssshmem.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmdata"])
|
||||
.set(status.vmdata.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmstk"])
|
||||
.set(status.vmstk.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmexe"])
|
||||
.set(status.vmexe.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmlib"])
|
||||
.set(status.vmlib.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmpte"])
|
||||
.set(status.vmpte.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["vmswap"])
|
||||
.set(status.vmswap.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["hugetlbpages"])
|
||||
.set(status.hugetlbpages.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["voluntary_ctxt_switches"])
|
||||
.set(status.voluntary_ctxt_switches.unwrap_or(0) as f64);
|
||||
gv.with_label_values(&["nonvoluntary_ctxt_switches"])
|
||||
.set(status.nonvoluntary_ctxt_switches.unwrap_or(0) as f64);
|
||||
}
|
||||
|
||||
fn set_gauge_vec_proc_stat(gv: &prometheus::GaugeVec, stat: &procfs::process::Stat) {
|
||||
gv.with_label_values(&["utime"]).set(stat.utime as f64);
|
||||
gv.with_label_values(&["stime"]).set(stat.stime as f64);
|
||||
gv.with_label_values(&["cutime"]).set(stat.cutime as f64);
|
||||
gv.with_label_values(&["cstime"]).set(stat.cstime as f64);
|
||||
}
|
||||
|
||||
fn set_gauge_vec_netdev(gv: &prometheus::GaugeVec, status: &procfs::net::DeviceStatus) {
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_bytes"])
|
||||
.set(status.recv_bytes as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_packets"])
|
||||
.set(status.recv_packets as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_errs"])
|
||||
.set(status.recv_errs as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_drop"])
|
||||
.set(status.recv_drop as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_fifo"])
|
||||
.set(status.recv_fifo as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_frame"])
|
||||
.set(status.recv_frame as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_compressed"])
|
||||
.set(status.recv_compressed as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "recv_multicast"])
|
||||
.set(status.recv_multicast as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_bytes"])
|
||||
.set(status.sent_bytes as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_packets"])
|
||||
.set(status.sent_packets as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_errs"])
|
||||
.set(status.sent_errs as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_drop"])
|
||||
.set(status.sent_drop as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_fifo"])
|
||||
.set(status.sent_fifo as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_colls"])
|
||||
.set(status.sent_colls as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_carrier"])
|
||||
.set(status.sent_carrier as f64);
|
||||
gv.with_label_values(&[status.name.as_str(), "sent_compressed"])
|
||||
.set(status.sent_compressed as f64);
|
||||
}
|
||||
|
||||
fn set_gauge_vec_proc_io(gv: &prometheus::GaugeVec, io_stat: &procfs::process::Io) {
|
||||
gv.with_label_values(&["rchar"]).set(io_stat.rchar as f64);
|
||||
gv.with_label_values(&["wchar"]).set(io_stat.wchar as f64);
|
||||
gv.with_label_values(&["syscr"]).set(io_stat.syscr as f64);
|
||||
gv.with_label_values(&["syscw"]).set(io_stat.syscw as f64);
|
||||
gv.with_label_values(&["read_bytes"])
|
||||
.set(io_stat.read_bytes as f64);
|
||||
gv.with_label_values(&["write_bytes"])
|
||||
.set(io_stat.write_bytes as f64);
|
||||
gv.with_label_values(&["cancelled_write_bytes"])
|
||||
.set(io_stat.cancelled_write_bytes as f64);
|
||||
}
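The `REGISTERED: Mutex<bool>` guard in `get_shim_metrics` above ensures the collectors are registered only once per process. A minimal sketch of the same one-time-initialization idea, using only the standard library's `std::sync::Once`, might look like the following. This is an illustrative alternative, not what the patch does; `REGISTER_CALLS`, `register_all`, `ensure_registered`, and `register_calls` are hypothetical names, with `register_all` standing in for `register_shim_metrics`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// Hypothetical counter standing in for the side effect of registering collectors.
static REGISTER_CALLS: AtomicUsize = AtomicUsize::new(0);
// Guard that lets the registration closure run at most once per process.
static REGISTER_ONCE: Once = Once::new();

// Hypothetical stand-in for `register_shim_metrics()`.
fn register_all() {
    REGISTER_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Runs `register_all` the first time it is called; later calls are no-ops.
pub fn ensure_registered() {
    REGISTER_ONCE.call_once(register_all);
}

/// Number of times registration has actually run.
pub fn register_calls() -> usize {
    REGISTER_CALLS.load(Ordering::SeqCst)
}
```

One design difference worth noting: `Once::call_once` cannot return an error, whereas the `Mutex<bool>` approach in the patch leaves the flag unset when registration fails, so a later call can retry.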
|
||||
@@ -7,6 +7,7 @@
|
||||
// This defines the handlers corresponding to the url when a request is sent to destined url,
|
||||
// the handler function should be invoked, and the corresponding data will be in the response
|
||||
|
||||
use crate::shim_metrics::get_shim_metrics;
|
||||
use agent::ResizeVolumeRequest;
|
||||
use anyhow::{anyhow, Context, Result};
|
||||
use common::Sandbox;
|
||||
@@ -16,7 +17,7 @@ use url::Url;
|
||||
|
||||
use shim_interface::shim_mgmt::{
|
||||
AGENT_URL, DIRECT_VOLUME_PATH_KEY, DIRECT_VOLUME_RESIZE_URL, DIRECT_VOLUME_STATS_URL,
|
||||
IP6_TABLE_URL, IP_TABLE_URL,
|
||||
IP6_TABLE_URL, IP_TABLE_URL, METRICS_URL,
|
||||
};
|
||||
|
||||
// main router for response, this works as a multiplexer on
|
||||
@@ -43,6 +44,7 @@ pub(crate) async fn handler_mux(
|
||||
(&Method::POST, DIRECT_VOLUME_RESIZE_URL) => {
|
||||
direct_volume_resize_handler(sandbox, req).await
|
||||
}
|
||||
(&Method::GET, METRICS_URL) => metrics_url_handler(sandbox, req).await,
|
||||
_ => Ok(not_found(req).await),
|
||||
}
|
||||
}
|
||||
@@ -146,3 +148,19 @@ async fn direct_volume_resize_handler(
|
||||
_ => Err(anyhow!("handler: Failed to resize volume")),
|
||||
}
|
||||
}
|
||||
|
||||
// returns the metrics gathered from the agent, hypervisor, and shim
|
||||
async fn metrics_url_handler(
|
||||
sandbox: Arc<dyn Sandbox>,
|
||||
_req: Request<Body>,
|
||||
) -> Result<Response<Body>> {
|
||||
// get metrics from agent, hypervisor, and shim
|
||||
let agent_metrics = sandbox.agent_metrics().await.unwrap_or_default();
|
||||
let hypervisor_metrics = sandbox.hypervisor_metrics().await.unwrap_or_default();
|
||||
let shim_metrics = get_shim_metrics().unwrap_or_default();
|
||||
|
||||
Ok(Response::new(Body::from(format!(
|
||||
"{}{}{}",
|
||||
agent_metrics, hypervisor_metrics, shim_metrics
|
||||
))))
|
||||
}
|
||||
|
||||
@@ -459,6 +459,18 @@ impl Sandbox for VirtSandbox {
|
||||
.context("sandbox: failed to get iptables")?;
|
||||
Ok(resp.data)
|
||||
}
|
||||
|
||||
async fn agent_metrics(&self) -> Result<String> {
|
||||
self.agent
|
||||
.get_metrics(agent::Empty::new())
|
||||
.await
|
||||
.map_err(|err| anyhow!("failed to get agent metrics {:?}", err))
|
||||
.map(|resp| resp.metrics)
|
||||
}
|
||||
|
||||
async fn hypervisor_metrics(&self) -> Result<String> {
|
||||
self.hypervisor.get_hypervisor_metrics().await
|
||||
}
|
||||
}
|
||||
|
||||
#[async_trait]
|
||||
|
||||
@@ -18,6 +18,7 @@ use containerd_shim_protos::{
|
||||
shim_async,
|
||||
};
|
||||
use runtimes::RuntimeHandlerManager;
|
||||
use shim_interface::KATA_PATH;
|
||||
use tokio::{
|
||||
io::AsyncWriteExt,
|
||||
process::Command,
|
||||
@@ -26,9 +27,9 @@ use tokio::{
|
||||
use ttrpc::asynchronous::Server;
|
||||
|
||||
use crate::task_service::TaskService;
|
||||
|
||||
/// message buffer size
|
||||
const MESSAGE_BUFFER_SIZE: usize = 8;
|
||||
use shim_interface::KATA_PATH;
|
||||
|
||||
pub struct ServiceManager {
|
||||
receiver: Option<Receiver<Message>>,
|
||||
@@ -52,48 +53,8 @@ impl std::fmt::Debug for ServiceManager {
|
||||
}
|
||||
}
|
||||
|
||||
async fn send_event(
|
||||
containerd_binary: String,
|
||||
address: String,
|
||||
namespace: String,
|
||||
event: Arc<dyn Event>,
|
||||
) -> Result<()> {
|
||||
let any = Any {
|
||||
type_url: event.type_url(),
|
||||
value: event.value().context("get event value")?,
|
||||
..Default::default()
|
||||
};
|
||||
let data = any.write_to_bytes().context("write to any")?;
|
||||
let mut child = Command::new(containerd_binary)
|
||||
.stdin(Stdio::piped())
|
||||
.stdout(Stdio::piped())
|
||||
.stderr(Stdio::piped())
|
||||
.args([
|
||||
"--address",
|
||||
&address,
|
||||
"publish",
|
||||
"--topic",
|
||||
&event.r#type(),
|
||||
"--namespace",
|
||||
&namespace,
|
||||
])
|
||||
.spawn()
|
||||
.context("spawn containerd cmd to publish event")?;
|
||||
|
||||
let stdin = child.stdin.as_mut().context("failed to open stdin")?;
|
||||
stdin
|
||||
.write_all(&data)
|
||||
.await
|
||||
.context("failed to write to stdin")?;
|
||||
let output = child
|
||||
.wait_with_output()
|
||||
.await
|
||||
.context("failed to read stdout")?;
|
||||
info!(sl!(), "get output: {:?}", output);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
impl ServiceManager {
|
||||
// TODO: who manages lifecycle for `task_server_fd`?
|
||||
pub async fn new(
|
||||
id: &str,
|
||||
containerd_binary: &str,
|
||||
@@ -102,11 +63,8 @@ impl ServiceManager {
|
||||
task_server_fd: RawFd,
|
||||
) -> Result<Self> {
|
||||
let (sender, receiver) = channel::<Message>(MESSAGE_BUFFER_SIZE);
|
||||
let handler = Arc::new(
|
||||
RuntimeHandlerManager::new(id, sender)
|
||||
.await
|
||||
.context("new runtime handler")?,
|
||||
);
|
||||
let rt_mgr = RuntimeHandlerManager::new(id, sender).context("new runtime handler")?;
|
||||
let handler = Arc::new(rt_mgr);
|
||||
let mut task_server = unsafe { Server::from_raw_fd(task_server_fd) };
|
||||
task_server = task_server.set_domain_unix();
|
||||
Ok(Self {
|
||||
@@ -119,9 +77,10 @@ impl ServiceManager {
|
||||
})
|
||||
}
|
||||
|
||||
pub async fn run(&mut self) -> Result<()> {
|
||||
pub async fn run(mut self) -> Result<()> {
|
||||
info!(sl!(), "begin to run service");
|
||||
self.start().await.context("start")?;
|
||||
self.registry_service().context("registry service")?;
|
||||
self.start_service().await.context("start service")?;
|
||||
|
||||
info!(sl!(), "wait server message");
|
||||
let mut rx = self.receiver.take();
|
||||
@@ -129,23 +88,15 @@ impl ServiceManager {
|
||||
while let Some(r) = rx.recv().await {
|
||||
info!(sl!(), "receive action {:?}", &r.action);
|
||||
let result = match r.action {
|
||||
Action::Start => self.start().await.context("start listen"),
|
||||
Action::Stop => self.stop_listen().await.context("stop listen"),
|
||||
Action::Start => self.start_service().await.context("start listen"),
|
||||
Action::Stop => self.stop_service().await.context("stop listen"),
|
||||
Action::Shutdown => {
|
||||
self.stop_listen().await.context("stop listen")?;
|
||||
self.stop_service().await.context("stop listen")?;
|
||||
break;
|
||||
}
|
||||
Action::Event(event) => {
|
||||
info!(sl!(), "get event {:?}", &event);
|
||||
send_event(
|
||||
self.binary.clone(),
|
||||
self.address.clone(),
|
||||
self.namespace.clone(),
|
||||
event,
|
||||
)
|
||||
.await
|
||||
.context("send event")?;
|
||||
Ok(())
|
||||
self.send_event(event).await.context("send event")
|
||||
}
|
||||
};
|
||||
|
||||
@@ -165,49 +116,79 @@ impl ServiceManager {
|
||||
|
||||
pub async fn cleanup(sid: &str) -> Result<()> {
|
||||
let (sender, _receiver) = channel::<Message>(MESSAGE_BUFFER_SIZE);
|
||||
let handler = RuntimeHandlerManager::new(sid, sender)
|
||||
.await
|
||||
.context("new runtime handler")?;
|
||||
handler.cleanup().await.context("runtime handler cleanup")?;
|
||||
let handler = RuntimeHandlerManager::new(sid, sender).context("new runtime handler")?;
|
||||
if let Err(e) = handler.cleanup().await {
|
||||
warn!(sl!(), "failed to clean up runtime state, {}", e);
|
||||
}
|
||||
|
||||
let temp_dir = [KATA_PATH, sid].join("/");
|
||||
if std::fs::metadata(temp_dir.as_str()).is_ok() {
|
||||
if fs::metadata(temp_dir.as_str()).is_ok() {
|
||||
// try to remove dir and skip the result
|
||||
fs::remove_dir_all(temp_dir)
|
||||
.map_err(|err| {
|
||||
warn!(sl!(), "failed to clean up sandbox tmp dir");
|
||||
err
|
||||
})
|
||||
.ok();
|
||||
if let Err(e) = fs::remove_dir_all(temp_dir) {
|
||||
warn!(sl!(), "failed to clean up sandbox tmp dir, {}", e);
|
||||
}
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn registry_service(&mut self) -> Result<()> {
|
||||
if let Some(t) = self.task_server.take() {
|
||||
let task_service = Arc::new(Box::new(TaskService::new(self.handler.clone()))
|
||||
as Box<dyn shim_async::Task + Send + Sync>);
|
||||
let t = t.register_service(shim_async::create_task(task_service));
|
||||
self.task_server = Some(t);
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
||||
async fn start(&mut self) -> Result<()> {
|
||||
let task_service = Arc::new(Box::new(TaskService::new(self.handler.clone()))
|
||||
as Box<dyn shim_async::Task + Send + Sync>);
|
||||
let task_server = self.task_server.take();
|
||||
let task_server = match task_server {
|
||||
Some(t) => {
|
||||
let mut t = t.register_service(shim_async::create_task(task_service));
|
||||
t.start().await.context("task server start")?;
|
||||
Some(t)
|
||||
}
|
||||
None => None,
|
||||
};
|
||||
self.task_server = task_server;
|
||||
async fn start_service(&mut self) -> Result<()> {
|
||||
if let Some(t) = self.task_server.as_mut() {
|
||||
t.start().await.context("task server start")?;
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
||||
async fn stop_listen(&mut self) -> Result<()> {
|
||||
let task_server = self.task_server.take();
|
||||
let task_server = match task_server {
|
||||
Some(mut t) => {
|
||||
t.stop_listen().await;
|
||||
Some(t)
|
||||
}
|
||||
None => None,
|
||||
async fn stop_service(&mut self) -> Result<()> {
|
||||
if let Some(t) = self.task_server.as_mut() {
|
||||
t.stop_listen().await;
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
||||
async fn send_event(&self, event: Arc<dyn Event>) -> Result<()> {
|
||||
let any = Any {
|
||||
type_url: event.type_url(),
|
||||
value: event.value().context("get event value")?,
|
||||
..Default::default()
|
||||
};
|
||||
self.task_server = task_server;
|
||||
let data = any.write_to_bytes().context("write to any")?;
|
||||
let mut child = Command::new(&self.binary)
|
||||
.stdin(Stdio::piped())
|
||||
.stdout(Stdio::piped())
|
||||
.stderr(Stdio::piped())
|
||||
.args([
|
||||
"--address",
|
||||
&self.address,
|
||||
"publish",
|
||||
"--topic",
|
||||
&event.r#type(),
|
||||
"--namespace",
|
||||
&self.namespace,
|
||||
])
|
||||
.spawn()
|
||||
.context("spawn containerd cmd to publish event")?;
|
||||
|
||||
let stdin = child.stdin.as_mut().context("failed to open stdin")?;
|
||||
stdin
|
||||
.write_all(&data)
|
||||
.await
|
||||
.context("failed to write to stdin")?;
|
||||
let output = child
|
||||
.wait_with_output()
|
||||
.await
|
||||
.context("failed to read stdout")?;
|
||||
info!(sl!(), "get output: {:?}", output);
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
|
||||
@@ -24,31 +24,31 @@ impl TaskService {
|
||||
pub(crate) fn new(handler: Arc<RuntimeHandlerManager>) -> Self {
|
||||
Self { handler }
|
||||
}
|
||||
}
|
||||
|
||||
async fn handler_message<TtrpcReq, TtrpcResp>(
|
||||
s: &RuntimeHandlerManager,
|
||||
ctx: &TtrpcContext,
|
||||
req: TtrpcReq,
|
||||
) -> ttrpc::Result<TtrpcResp>
|
||||
where
|
||||
Request: TryFrom<TtrpcReq>,
|
||||
<Request as TryFrom<TtrpcReq>>::Error: std::fmt::Debug,
|
||||
TtrpcResp: TryFrom<Response>,
|
||||
<TtrpcResp as TryFrom<Response>>::Error: std::fmt::Debug,
|
||||
{
|
||||
let r = req
|
||||
.try_into()
|
||||
.map_err(|err| ttrpc::Error::Others(format!("failed to translate from shim {:?}", err)))?;
|
||||
let logger = sl!().new(o!("stream id" => ctx.mh.stream_id));
|
||||
debug!(logger, "====> task service {:?}", &r);
|
||||
let resp = s
|
||||
.handler_message(r)
|
||||
.await
|
||||
.map_err(|err| ttrpc::Error::Others(format!("failed to handler message {:?}", err)))?;
|
||||
debug!(logger, "<==== task service {:?}", &resp);
|
||||
resp.try_into()
|
||||
.map_err(|err| ttrpc::Error::Others(format!("failed to translate to shim {:?}", err)))
|
||||
async fn handler_message<TtrpcReq, TtrpcResp>(
|
||||
&self,
|
||||
ctx: &TtrpcContext,
|
||||
req: TtrpcReq,
|
||||
) -> ttrpc::Result<TtrpcResp>
|
||||
where
|
||||
Request: TryFrom<TtrpcReq>,
|
||||
<Request as TryFrom<TtrpcReq>>::Error: std::fmt::Debug,
|
||||
TtrpcResp: TryFrom<Response>,
|
||||
<TtrpcResp as TryFrom<Response>>::Error: std::fmt::Debug,
|
||||
{
|
||||
let r = req.try_into().map_err(|err| {
|
||||
ttrpc::Error::Others(format!("failed to translate from shim {:?}", err))
|
||||
})?;
|
||||
let logger = sl!().new(o!("stream id" => ctx.mh.stream_id));
|
||||
debug!(logger, "====> task service {:?}", &r);
|
||||
let resp =
|
||||
self.handler.handler_message(r).await.map_err(|err| {
|
||||
ttrpc::Error::Others(format!("failed to handler message {:?}", err))
|
||||
})?;
|
||||
debug!(logger, "<==== task service {:?}", &resp);
|
||||
resp.try_into()
|
||||
.map_err(|err| ttrpc::Error::Others(format!("failed to translate to shim {:?}", err)))
|
||||
}
|
||||
}
|
||||
|
||||
macro_rules! impl_service {
|
||||
@@ -56,7 +56,7 @@ macro_rules! impl_service {
|
||||
#[async_trait]
|
||||
impl shim_async::Task for TaskService {
|
||||
$(async fn $name(&self, ctx: &TtrpcContext, req: $req) -> ttrpc::Result<$resp> {
|
||||
handler_message(&self.handler, ctx, req).await
|
||||
self.handler_message(ctx, req).await
|
||||
})*
|
||||
}
|
||||
};
|
||||
|
||||
@@ -16,7 +16,7 @@ const WORKER_THREADS: usize = 2;
|
||||
|
||||
async fn real_main() {
|
||||
let (sender, _receiver) = channel::<Message>(MESSAGE_BUFFER_SIZE);
|
||||
let manager = RuntimeHandlerManager::new("xxx", sender).await.unwrap();
|
||||
let manager = RuntimeHandlerManager::new("xxx", sender).unwrap();
|
||||
|
||||
let req = Request::CreateContainer(ContainerConfig {
|
||||
container_id: "xxx".to_owned(),
|
||||
|
||||
@@ -46,7 +46,7 @@ impl ShimExecutor {
|
||||
self.args.validate(false).context("validate")?;
|
||||
|
||||
let server_fd = get_server_fd().context("get server fd")?;
|
||||
let mut service_manager = service::ServiceManager::new(
|
||||
let service_manager = service::ServiceManager::new(
|
||||
&self.args.id,
|
||||
&self.args.publish_binary,
|
||||
&self.args.address,
|
||||
|
||||
@@ -851,6 +851,21 @@ func (c *Container) checkBlockDeviceSupport(ctx context.Context) bool {
|
||||
return false
|
||||
}
|
||||
|
||||
// Sort the devices starting with device #1 being the VFIO control group
|
||||
// device and the next the actual device(s) e.g. /dev/vfio/<group>
|
||||
func sortContainerVFIODevices(devices []ContainerDevice) []ContainerDevice {
|
||||
var vfioDevices []ContainerDevice
|
||||
|
||||
for _, device := range devices {
|
||||
if deviceManager.IsVFIOControlDevice(device.ContainerPath) {
|
||||
vfioDevices = append([]ContainerDevice{device}, vfioDevices...)
|
||||
continue
|
||||
}
|
||||
vfioDevices = append(vfioDevices, device)
|
||||
}
|
||||
return vfioDevices
|
||||
}
|
||||
|
||||
// create creates and starts a container inside a Sandbox. It has to be
|
||||
// called only when a new container, not known by the sandbox, has to be created.
|
||||
func (c *Container) create(ctx context.Context) (err error) {
|
||||
@@ -893,6 +908,13 @@ func (c *Container) create(ctx context.Context) (err error) {
|
||||
}
|
||||
c.devices = cntDevices
|
||||
}
|
||||
// If modeVFIO is enabled we need 1st to attach the VFIO control group
|
||||
// device /dev/vfio/vfio and 2nd the actual device(s) afterwards.
|
||||
// Sort the devices starting with device #1 being the VFIO control group
|
||||
// device and the next the actual device(s) /dev/vfio/<group>
|
||||
if modeVFIO {
|
||||
c.devices = sortContainerVFIODevices(c.devices)
|
||||
}
|
||||
|
||||
c.Logger().WithFields(logrus.Fields{
|
||||
"devices": c.devices,
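The ordering performed by `sortContainerVFIODevices` in the hunks above (control device first, group devices afterwards) can be sketched as a stable partition. The sketch is written in Rust purely for illustration; the patch itself is Go. `ContainerDevice` here is a simplified hypothetical stand-in, and `is_vfio_control_device` assumes the control device is exactly `/dev/vfio/vfio`, which may not match `deviceManager.IsVFIOControlDevice` in every detail.

```rust
/// Hypothetical, simplified stand-in for the Go `ContainerDevice` type.
#[derive(Debug, Clone, PartialEq)]
pub struct ContainerDevice {
    pub container_path: String,
}

/// Assumption: the VFIO control device is identified by this exact path.
fn is_vfio_control_device(path: &str) -> bool {
    path == "/dev/vfio/vfio"
}

/// Moves the VFIO control device(s) to the front and keeps the remaining
/// devices in their original relative order.
pub fn sort_container_vfio_devices(devices: Vec<ContainerDevice>) -> Vec<ContainerDevice> {
    // `partition` preserves the relative order within each of the two groups.
    let (mut control, rest): (Vec<_>, Vec<_>) = devices
        .into_iter()
        .partition(|d| is_vfio_control_device(&d.container_path));
    control.extend(rest);
    control
}
```

With a single control device the result matches the Go helper. If several control devices were present, the Go version would prepend each one and so reverse their order, while this sketch keeps it.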
|
||||
|
||||
@@ -651,7 +651,8 @@ func (s *Sandbox) coldOrHotPlugVFIO(sandboxConfig *SandboxConfig) (bool, error)
|
||||
hotPlugVFIO := (sandboxConfig.HypervisorConfig.HotPlugVFIO != config.NoPort)
|
||||
|
||||
modeIsGK := (sandboxConfig.VfioMode == config.VFIOModeGuestKernel)
|
||||
modeIsVFIO := (sandboxConfig.VfioMode == config.VFIOModeVFIO)
|
||||
// modeIsVFIO is needed at the container level not the sandbox level.
|
||||
// modeIsVFIO := (sandboxConfig.VfioMode == config.VFIOModeVFIO)
|
||||
|
||||
var vfioDevices []config.DeviceInfo
|
||||
// vhost-user-block device is a PCIe device in Virt, keep track of it
|
||||
@@ -666,13 +667,6 @@ func (s *Sandbox) coldOrHotPlugVFIO(sandboxConfig *SandboxConfig) (bool, error)
|
||||
continue
|
||||
}
|
||||
isVFIODevice := deviceManager.IsVFIODevice(device.ContainerPath)
|
||||
isVFIOControlDevice := deviceManager.IsVFIOControlDevice(device.ContainerPath)
|
||||
// vfio_mode=vfio needs the VFIO control device add it to the list
|
||||
// of devices to be added to the VM.
|
||||
if modeIsVFIO && isVFIOControlDevice && !hotPlugVFIO {
|
||||
vfioDevices = append(vfioDevices, device)
|
||||
}
|
||||
|
||||
if hotPlugVFIO && isVFIODevice {
|
||||
device.ColdPlug = false
|
||||
device.Port = sandboxConfig.HypervisorConfig.HotPlugVFIO
|
||||
|
||||
877
src/tools/kata-ctl/Cargo.lock
generated
File diff suppressed because it is too large
@@ -44,7 +44,12 @@ logging = { path = "../../libs/logging" }
|
||||
slog = "2.7.0"
|
||||
slog-scope = "4.4.0"
|
||||
hyper = "0.14.20"
|
||||
tokio = "1.28.1"
|
||||
tokio = { version = "1.28.1", features = ["signal"] }
|
||||
ttrpc = "0.6.0"
|
||||
|
||||
prometheus = { version = "0.13.0", features = ["process"] }
|
||||
procfs = "0.12.0"
|
||||
lazy_static = "1.2"
|
||||
|
||||
[target.'cfg(target_arch = "s390x")'.dependencies]
|
||||
reqwest = { version = "0.11", default-features = false, features = ["json", "blocking", "native-tls"] }
|
||||
|
||||
@@ -56,6 +56,9 @@ pub enum Commands {
|
||||
/// Gather metrics associated with infrastructure used to run a sandbox
|
||||
Metrics(MetricsCommand),
|
||||
|
||||
/// Start a monitor to get metrics of Kata Containers
|
||||
Monitor(MonitorArgument),
|
||||
|
||||
/// Display version details
|
||||
Version,
|
||||
}
|
||||
@@ -122,6 +125,12 @@ pub enum IpTablesArguments {
|
||||
Metrics,
|
||||
}
|
||||
|
||||
#[derive(Debug, Args)]
|
||||
pub struct MonitorArgument {
|
||||
/// The address to listen on for HTTP requests. (default "127.0.0.1:8090")
|
||||
pub address: Option<String>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Args)]
|
||||
pub struct DirectVolumeCommand {
|
||||
#[clap(subcommand)]
|
||||
|
||||
@@ -3,9 +3,16 @@
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#[macro_use]
|
||||
extern crate lazy_static;
|
||||
|
||||
#[macro_use]
|
||||
extern crate slog;
|
||||
|
||||
mod arch;
|
||||
mod args;
|
||||
mod check;
|
||||
mod monitor;
|
||||
mod ops;
|
||||
mod types;
|
||||
mod utils;
|
||||
@@ -18,7 +25,7 @@ use std::process::exit;
|
||||
use args::{Commands, KataCtlCli};
|
||||
|
||||
use ops::check_ops::{
|
||||
handle_check, handle_factory, handle_iptables, handle_metrics, handle_version,
|
||||
handle_check, handle_factory, handle_iptables, handle_metrics, handle_monitor, handle_version,
|
||||
};
|
||||
use ops::env_ops::handle_env;
|
||||
use ops::exec_ops::handle_exec;
|
||||
@@ -52,6 +59,7 @@ fn real_main() -> Result<()> {
|
||||
Commands::Factory => handle_factory(),
|
||||
Commands::Iptables(args) => handle_iptables(args),
|
||||
Commands::Metrics(args) => handle_metrics(args),
|
||||
Commands::Monitor(args) => handle_monitor(args),
|
||||
Commands::Version => handle_version(),
|
||||
};
|
||||
|
||||
|
||||
181
src/tools/kata-ctl/src/monitor/http_server.rs
Normal file
@@ -0,0 +1,181 @@
|
||||
// Copyright 2022-2023 Ant Group
|
||||
//
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
use crate::monitor::metrics::get_monitor_metrics;
|
||||
use crate::sl;
|
||||
use crate::utils::TIMEOUT;
|
||||
|
||||
use anyhow::{anyhow, Context, Result};
|
||||
use hyper::body;
|
||||
use hyper::service::{make_service_fn, service_fn};
|
||||
use hyper::{Body, Method, Request, Response, Server, StatusCode};
|
||||
use shim_interface::shim_mgmt::client::MgmtClient;
|
||||
use slog::{self, info};
|
||||
use std::collections::HashMap;
|
||||
use std::net::SocketAddr;
|
||||
|
||||
const ROOT_URI: &str = "/";
|
||||
const METRICS_URI: &str = "/metrics";
|
||||
|
||||
async fn handler_mux(req: Request<Body>) -> Result<Response<Body>> {
|
||||
info!(
|
||||
sl!(),
|
||||
"mgmt-svr(mux): recv req, method: {}, uri: {}",
|
||||
req.method(),
|
||||
req.uri().path()
|
||||
);
|
||||
|
||||
match (req.method(), req.uri().path()) {
|
||||
(&Method::GET, ROOT_URI) => root_uri_handler(req).await,
|
||||
(&Method::GET, METRICS_URI) => metrics_uri_handler(req).await,
|
||||
_ => not_found_uri_handler(req).await,
|
||||
}
|
||||
.map_or_else(
|
||||
|e| {
|
||||
Response::builder()
|
||||
.status(StatusCode::INTERNAL_SERVER_ERROR)
|
||||
.body(Body::from(format!("{:?}\n", e)))
|
||||
.map_err(|e| anyhow!("Failed to Build Response {:?}", e))
|
||||
},
|
||||
Ok,
|
||||
)
|
||||
}
|
||||
|
||||
pub async fn http_server_setup(socket_addr: &str) -> Result<()> {
|
||||
let addr: SocketAddr = socket_addr
|
||||
.parse()
|
||||
.context("failed to parse http socket address")?;
|
||||
|
||||
let make_svc =
|
||||
make_service_fn(|_conn| async { Ok::<_, anyhow::Error>(service_fn(handler_mux)) });
|
||||
|
||||
Server::bind(&addr).serve(make_svc).await?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
async fn root_uri_handler(_req: Request<Body>) -> Result<Response<Body>> {
|
||||
Response::builder()
|
||||
.status(StatusCode::OK)
|
||||
.body(Body::from(
|
||||
r#"Available HTTP endpoints:
|
||||
/metrics : Get metrics from sandboxes.
|
||||
"#,
|
||||
))
|
||||
.map_err(|e| anyhow!("Failed to Build Response {:?}", e))
|
||||
}
|
||||
|
||||
async fn metrics_uri_handler(req: Request<Body>) -> Result<Response<Body>> {
|
||||
let mut response_body = String::new();
|
||||
|
||||
response_body += &get_monitor_metrics().context("Failed to Get Monitor Metrics")?;
|
||||
|
||||
if let Some(uri_query) = req.uri().query() {
|
||||
if let Ok(sandbox_id) = parse_sandbox_id(uri_query) {
|
||||
response_body += &get_runtime_metrics(sandbox_id)
|
||||
.await
|
||||
.context(format!("{}\nFailed to Get Runtime Metrics", response_body))?;
|
||||
}
|
||||
}
|
||||
|
||||
Response::builder()
|
||||
.status(StatusCode::OK)
|
||||
.body(Body::from(response_body))
|
||||
.map_err(|e| anyhow!("Failed to Build Response {:?}", e))
|
||||
}
|
||||
|
||||
async fn get_runtime_metrics(sandbox_id: &str) -> Result<String> {
|
||||
// build shim client
|
||||
let shim_client =
|
||||
MgmtClient::new(sandbox_id, Some(TIMEOUT)).context("failed to build shim mgmt client")?;
|
||||
|
||||
// get METRICS_URI
|
||||
let shim_response = shim_client
|
||||
.get(METRICS_URI)
|
||||
.await
|
||||
.context("failed to get METRICS_URI")?;
|
||||
|
||||
// get runtime_metrics
|
||||
let runtime_metrics = String::from_utf8(body::to_bytes(shim_response).await?.to_vec())
|
||||
.context("failed to get runtime_metrics")?;
|
||||
|
||||
Ok(runtime_metrics)
|
||||
}
|
||||
|
||||
async fn not_found_uri_handler(_req: Request<Body>) -> Result<Response<Body>> {
|
||||
Response::builder()
|
||||
.status(StatusCode::NOT_FOUND)
|
||||
.body(Body::from("NOT FOUND"))
|
||||
.map_err(|e| anyhow!("Failed to Build Response {:?}", e))
|
||||
}
|
||||
|
||||
fn parse_sandbox_id(uri: &str) -> Result<&str> {
|
||||
let uri_pairs: HashMap<_, _> = uri
|
||||
.split_whitespace()
|
||||
.map(|s| s.split_at(s.find('=').unwrap_or(0)))
|
||||
.map(|(key, val)| (key, &val[1..]))
|
||||
.collect();
|
||||
|
||||
match uri_pairs.get("sandbox") {
|
||||
Some(sid) => Ok(sid.to_owned()),
|
||||
None => Err(anyhow!("params sandbox not found")),
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_parse_sandbox_id() {
|
||||
assert!(parse_sandbox_id("sandbox=demo_sandbox").unwrap() == "demo_sandbox");
|
||||
assert!(parse_sandbox_id("foo=bar").is_err());
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_root_uri_handler() {
|
||||
let root_resp = handler_mux(
|
||||
Request::builder()
|
||||
.method("GET")
|
||||
.uri("/")
|
||||
.body(hyper::Body::from(""))
|
||||
.unwrap(),
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
assert!(root_resp.status() == StatusCode::OK);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_metrics_uri_handler() {
|
||||
let metrics_resp = handler_mux(
|
||||
Request::builder()
|
||||
.method("GET")
|
||||
.uri("/metrics?sandbox=demo_sandbox")
|
||||
.body(hyper::Body::from(""))
|
||||
.unwrap(),
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
assert!(metrics_resp.status() == StatusCode::INTERNAL_SERVER_ERROR);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_not_found_uri_handler() {
|
||||
let not_found_resp = handler_mux(
|
||||
Request::builder()
|
||||
.method("POST")
|
||||
.uri("/metrics?sandbox=demo_sandbox")
|
||||
.body(hyper::Body::from(""))
|
||||
.unwrap(),
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
assert!(not_found_resp.status() == StatusCode::NOT_FOUND);
|
||||
}
|
||||
}
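As written, `parse_sandbox_id` above splits the query string on whitespace, so a query carrying several `&`-separated parameters (for example `foo=bar&sandbox=demo`) is treated as a single key/value pair and the `sandbox` key is not found. A minimal sketch of a variant that splits on `&` and uses `str::split_once` follows. `parse_sandbox_id_from_query` is a hypothetical name, and it returns `Option` instead of the `anyhow::Result` used in the patch only to keep the example dependency-free. It does not percent-decode values.

```rust
/// Returns the value of the `sandbox` parameter in a URI query string,
/// or `None` when the parameter is absent or has no `=`.
pub fn parse_sandbox_id_from_query(query: &str) -> Option<&str> {
    query
        .split('&')
        // Skip fragments that contain no '=' instead of slicing blindly.
        .filter_map(|pair| pair.split_once('='))
        .find(|(key, _)| *key == "sandbox")
        .map(|(_, value)| value)
}
```

Because fragments without `=` are skipped rather than sliced at a fixed offset, malformed input yields `None` instead of a mangled key.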
|
||||
91
src/tools/kata-ctl/src/monitor/metrics.rs
Normal file
@@ -0,0 +1,91 @@
|
||||
// Copyright 2022-2023 Ant Group
|
||||
//
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
extern crate procfs;
|
||||
|
||||
use anyhow::{anyhow, Context, Result};
|
||||
|
||||
use prometheus::{Encoder, Gauge, IntCounter, Registry, TextEncoder};
|
||||
use std::sync::Mutex;
|
||||
|
||||
const NAMESPACE_KATA_MONITOR: &str = "kata_ctl_monitor";
|
||||
|
||||
lazy_static! {
|
||||
|
||||
static ref REGISTERED: Mutex<bool> = Mutex::new(false);
|
||||
|
||||
// custom registry
|
||||
static ref REGISTRY: Registry = Registry::new();
|
||||
|
||||
// monitor metrics
|
||||
static ref MONITOR_SCRAPE_COUNT: IntCounter =
|
||||
IntCounter::new(format!("{}_{}", NAMESPACE_KATA_MONITOR, "scrape_count"), "Monitor scrape count").unwrap();
|
||||
|
||||
static ref MONITOR_MAX_FDS: Gauge = Gauge::new(format!("{}_{}", NAMESPACE_KATA_MONITOR, "process_max_fds"), "Max FDs for monitor").unwrap();
|
||||
|
||||
static ref MONITOR_OPEN_FDS: Gauge = Gauge::new(format!("{}_{}", NAMESPACE_KATA_MONITOR, "process_open_fds"), "Open FDs for monitor").unwrap();
|
||||
|
||||
static ref MONITOR_RESIDENT_MEMORY: Gauge = Gauge::new(format!("{}_{}", NAMESPACE_KATA_MONITOR, "process_resident_memory_bytes"), "Resident memory size in bytes for monitor").unwrap();
|
||||
}
|
||||
|
||||
/// get monitor metrics
|
||||
pub fn get_monitor_metrics() -> Result<String> {
|
||||
let mut registered = REGISTERED
|
||||
.lock()
|
||||
.map_err(|e| anyhow!("failed to check monitor metrics register status {:?}", e))?;
|
||||
|
||||
if !(*registered) {
|
||||
register_monitor_metrics().context("failed to register monitor metrics")?;
|
||||
*registered = true;
|
||||
}
|
||||
|
||||
update_monitor_metrics().context("failed to update monitor metrics")?;
|
||||
|
||||
// gather all metrics and return as a String
|
||||
let metric_families = REGISTRY.gather();
|
||||
|
||||
let mut buffer = Vec::new();
|
||||
TextEncoder::new()
|
||||
.encode(&metric_families, &mut buffer)
|
||||
.context("failed to encode gathered metrics")?;
|
||||
|
||||
Ok(String::from_utf8(buffer)?)
|
||||
}
|
||||
|
||||
fn register_monitor_metrics() -> Result<()> {
|
||||
REGISTRY.register(Box::new(MONITOR_SCRAPE_COUNT.clone()))?;
|
||||
REGISTRY.register(Box::new(MONITOR_MAX_FDS.clone()))?;
|
||||
REGISTRY.register(Box::new(MONITOR_OPEN_FDS.clone()))?;
|
||||
REGISTRY.register(Box::new(MONITOR_RESIDENT_MEMORY.clone()))?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn update_monitor_metrics() -> Result<()> {
|
||||
MONITOR_SCRAPE_COUNT.inc();
|
||||
|
||||
let me = match procfs::process::Process::myself() {
|
||||
Ok(p) => p,
|
||||
Err(e) => {
|
||||
eprintln!("failed to create process instance: {:?}", e);
|
||||
|
||||
return Ok(());
|
||||
}
|
||||
};
|
||||
|
||||
if let Ok(fds) = procfs::sys::fs::file_max() {
|
||||
MONITOR_MAX_FDS.set(fds as f64);
|
||||
}
|
||||
|
||||
if let Ok(fds) = me.fd_count() {
|
||||
MONITOR_OPEN_FDS.set(fds as f64);
|
||||
}
|
||||
|
||||
if let Ok(statm) = me.statm() {
|
||||
MONITOR_RESIDENT_MEMORY.set(statm.resident as f64);
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
8
src/tools/kata-ctl/src/monitor/mod.rs
Normal file
@@ -0,0 +1,8 @@
|
||||
// Copyright 2022-2023 Ant Group
|
||||
//
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
mod metrics;
|
||||
|
||||
pub mod http_server;
|
||||
@@ -5,15 +5,21 @@
|
||||
|
||||
use crate::arch::arch_specific::get_checks;
|
||||
|
||||
use crate::args::{CheckArgument, CheckSubCommand, IptablesCommand, MetricsCommand};
|
||||
use crate::args::{
|
||||
CheckArgument, CheckSubCommand, IptablesCommand, MetricsCommand, MonitorArgument,
|
||||
};
|
||||
|
||||
use crate::check;
|
||||
|
||||
use crate::monitor::http_server;
|
||||
|
||||
use crate::ops::version;
|
||||
|
||||
use crate::types::*;
|
||||
|
||||
use anyhow::{anyhow, Result};
|
||||
use anyhow::{anyhow, Context, Result};
|
||||
|
||||
const MONITOR_DEFAULT_SOCK_ADDR: &str = "127.0.0.1:8090";
|
||||
|
||||
use slog::{info, o, warn};
|
||||
|
||||
@@ -128,6 +134,17 @@ pub fn handle_metrics(_args: MetricsCommand) -> Result<()> {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn handle_monitor(monitor_args: MonitorArgument) -> Result<()> {
|
||||
tokio::runtime::Runtime::new()
|
||||
.context("failed to create runtime for async http server")?
|
||||
.block_on(http_server::http_server_setup(
|
||||
monitor_args
|
||||
.address
|
||||
.as_deref()
|
||||
.unwrap_or(MONITOR_DEFAULT_SOCK_ADDR),
|
||||
))
|
||||
}
|
||||
|
||||
pub fn handle_version() -> Result<()> {
|
||||
let version = version::get().unwrap();
|
||||
|
||||
|
||||
@@ -25,6 +25,8 @@ use vmm_sys_util::terminal::Terminal;
|
||||
use crate::args::ExecArguments;
|
||||
use shim_interface::shim_mgmt::{client::MgmtClient, AGENT_URL};
|
||||
|
||||
use crate::utils::TIMEOUT;
|
||||
|
||||
const CMD_CONNECT: &str = "CONNECT";
|
||||
const CMD_OK: &str = "OK";
|
||||
const SCHEME_VSOCK: &str = "VSOCK";
|
||||
@@ -32,7 +34,6 @@ const SCHEME_HYBRID_VSOCK: &str = "HVSOCK";
|
||||
|
||||
const EPOLL_EVENTS_LEN: usize = 16;
|
||||
const KATA_AGENT_VSOCK_TIMEOUT: u64 = 5;
|
||||
const TIMEOUT: Duration = Duration::from_millis(2000);
|
||||
|
||||
type Result<T> = std::result::Result<T, Error>;
|
||||
|
||||
|
||||
@@ -14,7 +14,7 @@ use kata_types::mount::{
|
||||
use nix;
|
||||
use reqwest::StatusCode;
|
||||
use slog::{info, o};
|
||||
use std::{fs, time::Duration};
|
||||
use std::fs;
|
||||
use url;
|
||||
|
||||
use agent::ResizeVolumeRequest;
|
||||
@@ -23,7 +23,8 @@ use shim_interface::shim_mgmt::{
|
||||
DIRECT_VOLUME_PATH_KEY, DIRECT_VOLUME_RESIZE_URL, DIRECT_VOLUME_STATS_URL,
|
||||
};
|
||||
|
||||
const TIMEOUT: Duration = Duration::from_millis(2000);
|
||||
use crate::utils::TIMEOUT;
|
||||
|
||||
const CONTENT_TYPE_JSON: &str = "application/json";
|
||||
|
||||
macro_rules! sl {
|
||||
|
||||
@@ -8,10 +8,12 @@
|
||||
use crate::arch::arch_specific;
|
||||
|
||||
use anyhow::{anyhow, Context, Result};
|
||||
use std::fs;
|
||||
use std::{fs, time::Duration};
|
||||
|
||||
const NON_PRIV_USER: &str = "nobody";
|
||||
|
||||
pub const TIMEOUT: Duration = Duration::from_millis(2000);
|
||||
|
||||
pub fn drop_privs() -> Result<()> {
|
||||
if nix::unistd::Uid::effective().is_root() {
|
||||
privdrop::PrivDrop::default()
|
||||
|
||||
@@ -173,8 +173,8 @@ function delete_cluster() {
|
||||
}
|
||||
|
||||
function get_nodes_and_pods_info() {
|
||||
kubectl debug $(kubectl get nodes -o name) -it --image=quay.io/kata-containers/kata-debug:latest
|
||||
kubectl get pods -o name | grep node-debugger | xargs kubectl delete
|
||||
kubectl debug $(kubectl get nodes -o name) -it --image=quay.io/kata-containers/kata-debug:latest || true
|
||||
kubectl get pods -o name | grep node-debugger | xargs kubectl delete || true
|
||||
}
|
||||
|
||||
function main() {
|
||||
|
||||
@@ -66,6 +66,8 @@ Tests relating to networking. General items could include:
|
||||
- parallel bandwidth
|
||||
- write and read percentiles
|
||||
|
||||
For further details see the [network tests documentation](network).
|
||||
|
||||
### Storage
|
||||
|
||||
Tests relating to the storage (graph, volume) drivers.
|
||||
|
||||
@@ -17,7 +17,7 @@ description = "measure container lifecycle timings"
|
||||
checkvar = ".\"boot-times\".Results | .[] | .\"to-workload\".Result"
|
||||
checktype = "mean"
|
||||
midval = 0.69
|
||||
minpercent = 30.0
|
||||
minpercent = 40.0
|
||||
maxpercent = 30.0
|
||||
|
||||
[[metric]]
|
||||
|
||||
@@ -17,7 +17,7 @@ description = "measure container lifecycle timings"
|
||||
checkvar = ".\"boot-times\".Results | .[] | .\"to-workload\".Result"
|
||||
checktype = "mean"
|
||||
midval = 0.71
|
||||
minpercent = 30.0
|
||||
minpercent = 40.0
|
||||
maxpercent = 30.0
|
||||
|
||||
[[metric]]
|
||||
|
||||
@@ -51,3 +51,8 @@ For more details see the [footprint test documentation](footprint_data.md).
|
||||
Measures the memory statistics *inside* the container. This allows evaluation of
|
||||
the overhead the VM kernel and rootfs are having on the memory that was requested
|
||||
by the container co-ordination system, and thus supplied to the VM.
|
||||
|
||||
## `k8s-sysbench`
|
||||
|
||||
`Sysbench` is an open-source, multi-purpose benchmark utility that provides tests for `CPU`, memory
|
||||
and I/O. Currently the `k8s-sysbench` test is measuring the `CPU` performance.
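
As a rough illustration of how a single result value can be pulled out of `sysbench` output, the sketch below applies the same `grep`/`cut`/`sed` pipeline shape that the `k8s-sysbench.sh` script in this change uses for the latency `sum` of a `sysbench memory` run. The sample text and the `extract_latency_sum` helper are assumptions made for this example only; real output will differ in values and may differ in layout.

```shell
#!/bin/bash
# Illustrative sample only: a shortened excerpt shaped like the latency
# section of `sysbench memory` output. The numbers are made up.
sample_output="$(cat << 'EOF'
Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.26
         sum:                                 4415.15
EOF
)"

# Hypothetical helper: keep the "sum" line, take the text after the colon,
# and strip all blanks so that only the number remains.
extract_latency_sum() {
	grep sum | cut -f2 -d':' | sed 's/[[:blank:]]//g'
}

memory_latency_sum="$(printf '%s\n' "${sample_output}" | extract_latency_sum)"
echo "memory-latency-sum: ${memory_latency_sum} ms"
```

The pipeline relies on `sum` appearing on exactly one line of the output; a stricter pattern would be needed if other lines could contain that word.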
|
||||
|
||||
69
tests/metrics/density/k8s-sysbench.sh
Executable file
@@ -0,0 +1,69 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2022-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -o errexit
|
||||
set -o nounset
|
||||
set -o pipefail
|
||||
|
||||
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
|
||||
source "${SCRIPT_PATH}/../lib/common.bash"
|
||||
sysbench_file=$(mktemp sysbenchresults.XXXXXXXXXX)
|
||||
TEST_NAME="${TEST_NAME:-sysbench}"
|
||||
CI_JOB="${CI_JOB:-}"
|
||||
IMAGE="docker.io/library/local-sysbench:latest"
|
||||
DOCKERFILE="${SCRIPT_PATH}/sysbench-dockerfile/Dockerfile"
|
||||
|
||||
function remove_tmp_file() {
|
||||
rm -rf "${sysbench_file}"
|
||||
}
|
||||
|
||||
trap remove_tmp_file EXIT
|
||||
|
||||
function sysbench_memory() {
|
||||
kubectl exec -i "$pod_name" -- sh -c "sysbench memory --threads=2 run" > "${sysbench_file}"
|
||||
metrics_json_init
|
||||
local memory_latency_sum=$(cat "$sysbench_file" | grep sum | cut -f2 -d':' | sed 's/[[:blank:]]//g')
|
||||
metrics_json_start_array
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"memory-latency-sum": {
|
||||
"Result" : $memory_latency_sum,
|
||||
"Units" : "ms"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
metrics_json_save
|
||||
}
|
||||
|
||||
function sysbench_start_deployment() {
|
||||
cmds=("bc" "jq")
|
||||
check_cmds "${cmds[@]}"
|
||||
|
||||
# Check no processes are left behind
|
||||
check_processes
|
||||
|
||||
export pod_name="test-sysbench"
|
||||
|
||||
kubectl create -f "${SCRIPT_PATH}/runtimeclass_workloads/sysbench-pod.yaml"
|
||||
kubectl wait --for=condition=Ready --timeout=120s pod "$pod_name"
|
||||
}
|
||||
|
||||
function sysbench_cleanup() {
|
||||
kubectl delete pod "$pod_name"
|
||||
check_processes
|
||||
}
|
||||
|
||||
function main() {
|
||||
init_env
|
||||
sysbench_start_deployment
|
||||
sysbench_memory
|
||||
sysbench_cleanup
|
||||
}
|
||||
|
||||
main "$@"
|
||||
@@ -0,0 +1,18 @@
|
||||
#
|
||||
# Copyright (c) 2018-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: test-sysbench
|
||||
spec:
|
||||
terminationGracePeriodSeconds: 0
|
||||
runtimeClassName: kata
|
||||
containers:
|
||||
- name: test-sysbench
|
||||
image: localhost:5000/sysbench-kata:latest
|
||||
command:
|
||||
- sleep
|
||||
- "60"
|
||||
17
tests/metrics/density/sysbench-dockerfile/Dockerfile
Normal file
@@ -0,0 +1,17 @@
|
||||
# Copyright (c) 2022-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
# Usage: FROM [image name]
|
||||
FROM ubuntu:20.04
|
||||
|
||||
# Version of the Dockerfile
|
||||
LABEL DOCKERFILE_VERSION="1.0"
|
||||
|
||||
RUN apt-get update && \
|
||||
apt-get install -y build-essential git curl sudo && \
|
||||
apt-get remove -y unattended-upgrades && \
|
||||
curl -OkL https://packagecloud.io/install/repositories/akopytov/sysbench/script.deb.sh && \
|
||||
apt-get install -y sysbench
|
||||
|
||||
CMD ["/bin/bash"]
|
||||
@@ -85,6 +85,14 @@ function run_test_tensorflow() {
|
||||
check_metrics
|
||||
}
|
||||
|
||||
function run_test_fio() {
|
||||
info "Running FIO test using ${KATA_HYPERVISOR} hypervisor"
|
||||
# ToDo: remove the exit once the metrics workflow is stable
|
||||
exit 0
|
||||
|
||||
bash storage/fio-k8s/fio-test-ci.sh
|
||||
}
|
||||
|
||||
function main() {
|
||||
action="${1:-}"
|
||||
case "${action}" in
|
||||
@@ -95,6 +103,7 @@ function main() {
|
||||
run-test-memory-usage-inside-container) run_test_memory_usage_inside_container ;;
|
||||
run-test-blogbench) run_test_blogbench ;;
|
||||
run-test-tensorflow) run_test_tensorflow ;;
|
||||
run-test-fio) run_test_fio ;;
|
||||
*) >&2 die "Invalid argument" ;;
|
||||
esac
|
||||
}
|
||||
|
||||
21
tests/metrics/network/README.md
Normal file
@@ -0,0 +1,21 @@
|
||||
# Kata Containers network metrics
|
||||
|
||||
Kata Containers provides a series of network performance tests. Running these provides a basic reference for measuring network essentials like
|
||||
bandwidth, jitter, latency and parallel bandwidth.
|
||||
|
||||
## Performance tools
|
||||
|
||||
- `iperf3` measures bandwidth, jitter, CPU usage and the quality of a network link.
|
||||
|
||||
## Networking tests
|
||||
|
||||
- `k8s-network-metrics-iperf3.sh` measures bandwidth which is the speed of the data transfer.
|
||||
|
||||
## Running the tests
|
||||
|
||||
Individual tests can be run by hand, for example:
|
||||
|
||||
```
|
||||
$ cd metrics
|
||||
$ bash network/iperf3_kubernetes/k8s-network-metrics-iperf3.sh -b
|
||||
```
|
||||
314
tests/metrics/network/iperf3_kubernetes/k8s-network-metrics-iperf3.sh
Executable file
@@ -0,0 +1,314 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2021-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
# This test measures the following network essentials:
|
||||
# - bandwidth simplex
|
||||
# - jitter
|
||||
#
|
||||
# These metrics/results are obtained from the interconnection between
|
||||
# a client and a server using the iperf3 tool.
|
||||
# The following cases are covered:
|
||||
#
|
||||
# case 1:
|
||||
# container-server <----> container-client
|
||||
#
|
||||
# case 2:
|
||||
# container-server <----> host-client
|
||||
|
||||
set -o pipefail
|
||||
|
||||
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
|
||||
|
||||
source "${SCRIPT_PATH}/../../lib/common.bash"
|
||||
iperf_file=$(mktemp iperfresults.XXXXXXXXXX)
|
||||
TEST_NAME="${TEST_NAME:-network-iperf3}"
|
||||
COLLECT_ALL="${COLLECT_ALL:-false}"
|
||||
|
||||
function remove_tmp_file() {
|
||||
rm -rf "${iperf_file}"
|
||||
}
|
||||
|
||||
trap remove_tmp_file EXIT
|
||||
|
||||
function iperf3_all_collect_results() {
|
||||
metrics_json_init
|
||||
metrics_json_start_array
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"bandwidth": {
|
||||
"Result" : $bandwidth_result,
|
||||
"Units" : "$bandwidth_units"
|
||||
},
|
||||
"jitter": {
|
||||
"Result" : $jitter_result,
|
||||
"Units" : "$jitter_units"
|
||||
},
|
||||
"cpu": {
|
||||
"Result" : $cpu_result,
|
||||
"Units" : "$cpu_units"
|
||||
},
|
||||
"parallel": {
|
||||
"Result" : $parallel_result,
|
||||
"Units" : "$parallel_units"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
}
|
||||
|
||||
function iperf3_bandwidth() {
|
||||
# Run the iperf3 client against the server
|
||||
local transmit_timeout="30"
|
||||
|
||||
kubectl exec -i "$client_pod_name" -- sh -c "iperf3 -J -c ${server_ip_add} -t ${transmit_timeout}" | jq '.end.sum_received.bits_per_second' > "${iperf_file}"
|
||||
export bandwidth_result=$(cat "${iperf_file}")
|
||||
export bandwidth_units="bits per second"
|
||||
|
||||
if [ "$COLLECT_ALL" == "true" ]; then
|
||||
iperf3_all_collect_results
|
||||
else
|
||||
metrics_json_init
|
||||
metrics_json_start_array
|
||||
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"bandwidth": {
|
||||
"Result" : $bandwidth_result,
|
||||
"Units" : "$bandwidth_units"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
fi
|
||||
}
|
||||
|
||||
function iperf3_jitter() {
|
||||
# Start server
|
||||
local transmit_timeout="30"
|
||||
|
||||
kubectl exec -i "$client_pod_name" -- sh -c "iperf3 -J -c ${server_ip_add} -u -t ${transmit_timeout}" | jq '.end.sum.jitter_ms' > "${iperf_file}"
|
||||
result=$(cat "${iperf_file}")
|
||||
export jitter_result=$(printf "%0.3f\n" $result)
|
||||
export jitter_units="ms"
|
||||
|
||||
if [ "$COLLECT_ALL" == "true" ]; then
|
||||
iperf3_all_collect_results
|
||||
else
|
||||
metrics_json_init
|
||||
metrics_json_start_array
|
||||
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"jitter": {
|
||||
"Result" : $jitter_result,
|
||||
"Units" : "ms"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
fi
|
||||
}
|
||||
|
||||
function iperf3_parallel() {
|
||||
# This will measure four parallel connections with iperf3
|
||||
kubectl exec -i "$client_pod_name" -- sh -c "iperf3 -J -c ${server_ip_add} -P 4" | jq '.end.sum_received.bits_per_second' > "${iperf_file}"
|
||||
export parallel_result=$(cat "${iperf_file}")
|
||||
export parallel_units="bits per second"
|
||||
|
||||
if [ "$COLLECT_ALL" == "true" ]; then
|
||||
iperf3_all_collect_results
|
||||
else
|
||||
metrics_json_init
|
||||
metrics_json_start_array
|
||||
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"parallel": {
|
||||
"Result" : $parallel_result,
|
||||
"Units" : "$parallel_units"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
fi
|
||||
}
|
||||
|
||||
function iperf3_cpu() {
|
||||
# Start server
|
||||
local transmit_timeout="80"
|
||||
|
||||
kubectl exec -i "$client_pod_name" -- sh -c "iperf3 -J -c ${server_ip_add} -t ${transmit_timeout}" | jq '.end.cpu_utilization_percent.host_total' > "${iperf_file}"
|
||||
export cpu_result=$(cat "${iperf_file}")
|
||||
export cpu_units="percent"
|
||||
|
||||
if [ "$COLLECT_ALL" == "true" ]; then
|
||||
iperf3_all_collect_results
|
||||
else
|
||||
metrics_json_init
|
||||
metrics_json_start_array
|
||||
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"cpu": {
|
||||
"Result" : $cpu_result,
|
||||
"Units" : "$cpu_units"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
fi
|
||||
}
|
||||
|
||||
function iperf3_start_deployment() {
|
||||
cmds=("bc" "jq")
|
||||
check_cmds "${cmds[@]}"
|
||||
|
||||
# Check no processes are left behind
|
||||
check_processes
|
||||
|
||||
export service="iperf3-server"
|
||||
export deployment="iperf3-server-deployment"
|
||||
|
||||
wait_time=20
|
||||
sleep_time=2
|
||||
|
||||
# Create deployment
|
||||
kubectl create -f "${SCRIPT_PATH}/runtimeclass_workloads/iperf3-deployment.yaml"
|
||||
|
||||
# Check deployment creation
|
||||
local cmd="kubectl wait --for=condition=Available deployment/${deployment}"
|
||||
waitForProcess "$wait_time" "$sleep_time" "$cmd"
|
||||
|
||||
# Create DaemonSet
|
||||
kubectl create -f "${SCRIPT_PATH}/runtimeclass_workloads/iperf3-daemonset.yaml"
|
||||
|
||||
# Expose deployment
|
||||
kubectl expose deployment/"${deployment}"
|
||||
|
||||
# Get the names of the server pod
|
||||
export server_pod_name=$(kubectl get pods -o name | grep server | cut -d '/' -f2)
|
||||
|
||||
# Verify the server pod is working
|
||||
local cmd="kubectl get pod $server_pod_name -o yaml | grep 'phase: Running'"
|
||||
waitForProcess "$wait_time" "$sleep_time" "$cmd"
|
||||
|
||||
# Get the names of client pod
|
||||
export client_pod_name=$(kubectl get pods -o name | grep client | cut -d '/' -f2)
|
||||
|
||||
# Verify the client pod is working
|
||||
local cmd="kubectl get pod $client_pod_name -o yaml | grep 'phase: Running'"
|
||||
waitForProcess "$wait_time" "$sleep_time" "$cmd"
|
||||
|
||||
# Get the ip address of the server pod
|
||||
export server_ip_add=$(kubectl get pod "$server_pod_name" -o jsonpath='{.status.podIP}')
|
||||
}
|
||||
|
||||
function iperf3_deployment_cleanup() {
|
||||
kubectl delete pod "$server_pod_name" "$client_pod_name"
|
||||
kubectl delete ds iperf3-clients
|
||||
kubectl delete deployment "$deployment"
|
||||
kubectl delete service "$deployment"
|
||||
check_processes
|
||||
}
|
||||
|
||||
function help() {
|
||||
echo "$(cat << EOF
|
||||
Usage: $0 "[options]"
|
||||
Description:
|
||||
This script implements a number of network metrics
|
||||
using iperf3.
|
||||
|
||||
Options:
|
||||
-a Run all tests
|
||||
-b Run bandwidth tests
|
||||
-c Run cpu metrics tests
|
||||
-h Help
|
||||
-j Run jitter tests
-p Run parallel tests
|
||||
EOF
|
||||
)"
|
||||
}
|
||||
|
||||
function main() {
|
||||
init_env
|
||||
iperf3_start_deployment
|
||||
|
||||
local OPTIND
|
||||
while getopts ":abchjp" opt
|
||||
do
|
||||
case "$opt" in
|
||||
a) # all tests
|
||||
test_all="1"
|
||||
;;
|
||||
b) # bandwidth test
|
||||
test_bandwith="1"
|
||||
;;
|
||||
c)
|
||||
# run cpu tests
|
||||
test_cpu="1"
|
||||
;;
|
||||
h)
|
||||
help
|
||||
exit 0;
|
||||
;;
|
||||
j) # jitter tests
|
||||
test_jitter="1"
|
||||
;;
|
||||
p)
|
||||
# run parallel tests
|
||||
test_parallel="1"
|
||||
;;
|
||||
:)
|
||||
echo "Missing argument for -$OPTARG";
|
||||
help
|
||||
exit 1;
|
||||
;;
|
||||
esac
|
||||
done
|
||||
shift $((OPTIND-1))
|
||||
|
||||
[[ -z "$test_bandwith" ]] && \
|
||||
[[ -z "$test_jitter" ]] && \
|
||||
[[ -z "$test_cpu" ]] && \
|
||||
[[ -z "$test_parallel" ]] && \
|
||||
[[ -z "$test_all" ]] && \
|
||||
help && die "Must choose at least one test"
|
||||
|
||||
if [ "$test_bandwith" == "1" ]; then
|
||||
iperf3_bandwidth
|
||||
fi
|
||||
|
||||
if [ "$test_jitter" == "1" ]; then
|
||||
iperf3_jitter
|
||||
fi
|
||||
|
||||
if [ "$test_cpu" == "1" ]; then
|
||||
iperf3_cpu
|
||||
fi
|
||||
|
||||
if [ "$test_parallel" == "1" ]; then
|
||||
iperf3_parallel
|
||||
fi
|
||||
|
||||
if [ "$test_all" == "1" ]; then
|
||||
export COLLECT_ALL=true && iperf3_bandwidth && iperf3_jitter && iperf3_cpu && iperf3_parallel
|
||||
fi
|
||||
|
||||
metrics_json_save
|
||||
iperf3_deployment_cleanup
|
||||
}
|
||||
|
||||
main "$@"
|
||||
@@ -0,0 +1,29 @@
|
||||
#
|
||||
# Copyright (c) 2021-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
apiVersion: apps/v1
|
||||
kind: DaemonSet
|
||||
metadata:
|
||||
name: iperf3-clients
|
||||
labels:
|
||||
app: iperf3-client
|
||||
spec:
|
||||
selector:
|
||||
matchLabels:
|
||||
app: iperf3-client
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
app: iperf3-client
|
||||
spec:
|
||||
tolerations:
|
||||
- key: node-role.kubernetes.io/master
|
||||
operator: Exists
|
||||
effect: NoSchedule
|
||||
containers:
|
||||
- name: iperf3-client
|
||||
image: networkstatic/iperf3
|
||||
command: ['/bin/sh', '-c', 'sleep infinity']
|
||||
terminationGracePeriodSeconds: 0
|
||||
@@ -0,0 +1,44 @@
|
||||
#
|
||||
# Copyright (c) 2021-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
metadata:
|
||||
name: iperf3-server-deployment
|
||||
labels:
|
||||
app: iperf3-server
|
||||
spec:
|
||||
replicas: 1
|
||||
selector:
|
||||
matchLabels:
|
||||
app: iperf3-server
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
app: iperf3-server
|
||||
spec:
|
||||
affinity:
|
||||
nodeAffinity:
|
||||
preferredDuringSchedulingIgnoredDuringExecution:
|
||||
- weight: 1
|
||||
preference:
|
||||
matchExpressions:
|
||||
- key: kubernetes.io/role
|
||||
operator: In
|
||||
values:
|
||||
- master
|
||||
tolerations:
|
||||
- key: node-role.kubernetes.io/master
|
||||
operator: Exists
|
||||
effect: NoSchedule
|
||||
containers:
|
||||
- name: iperf3-server
|
||||
image: networkstatic/iperf3
|
||||
args: ['-s']
|
||||
ports:
|
||||
- containerPort: 5201
|
||||
name: server
|
||||
terminationGracePeriodSeconds: 0
|
||||
runtimeClassName: kata
|
||||
@@ -0,0 +1,44 @@
|
||||
#
|
||||
# Copyright (c) 2021-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
metadata:
|
||||
name: iperf3-server-deployment
|
||||
labels:
|
||||
app: iperf3-server
|
||||
spec:
|
||||
replicas: 1
|
||||
selector:
|
||||
matchLabels:
|
||||
app: iperf3-server
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
app: iperf3-server
|
||||
spec:
|
||||
affinity:
|
||||
nodeAffinity:
|
||||
preferredDuringSchedulingIgnoredDuringExecution:
|
||||
- weight: 1
|
||||
preference:
|
||||
matchExpressions:
|
||||
- key: kubernetes.io/role
|
||||
operator: In
|
||||
values:
|
||||
- master
|
||||
tolerations:
|
||||
- key: node-role.kubernetes.io/master
|
||||
operator: Exists
|
||||
effect: NoSchedule
|
||||
containers:
|
||||
- name: iperf3-server
|
||||
image: networkstatic/iperf3
|
||||
args: ['-s']
|
||||
ports:
|
||||
- containerPort: 5201
|
||||
name: server
|
||||
terminationGracePeriodSeconds: 0
|
||||
runtimeClassName: kata
|
||||
@@ -1,11 +1,27 @@
|
||||
# Kata Containers storage I/O tests
|
||||
|
||||
The metrics tests in this directory are designed to be used to assess storage IO.
|
||||
|
||||
## `Blogbench` test
|
||||
|
||||
The `blogbench` script is based on the `blogbench` program which is designed to emulate a busy blog server with a number of concurrent
|
||||
threads performing a mixture of reads, writes and rewrites.
|
||||
|
||||
### Running the `blogbench` test
|
||||
|
||||
The `blogbench` test can be run by hand, for example:
|
||||
```
|
||||
$ cd metrics
|
||||
$ bash storage/blogbench.sh
|
||||
```
|
||||
## `fio` test
|
||||
|
||||
The `fio` test utilises the [fio tool](https://github.com/axboe/fio), configured
|
||||
to perform measurements upon a single test file.
|
||||
|
||||
The test configuration used by the script can be modified by setting a number of
|
||||
environment variables to change or over-ride the test defaults.
|
||||
|
||||
## DAX `virtio-fs` `fio` Kubernetes tests
|
||||
|
||||
A [test](fio-k8s/README.md) to compare `virtio-fs` with and without the DAX option.
|
||||
|
||||
30
tests/metrics/storage/fio-k8s/README.md
Normal file
@@ -0,0 +1,30 @@
|
||||
# FIO test in Kubernetes
|
||||
|
||||
This is an automation to run `fio` with Kubernetes.
|
||||
|
||||
## Requirements:
|
||||
|
||||
- Kubernetes cluster running.
|
||||
- Kata configured as `runtimeclass`.
|
||||
|
||||
## Test structure:
|
||||
|
||||
- [fio-test]: Program wrapper to launch `fio` in a K8s pod.
|
||||
- [pkg]: Library code that could be used for more `fio` automation.
|
||||
- [configs]: Configuration files used by [fio-test].
|
||||
- [DAX-compare-test]: Script to run [fio-test] to generate `fio` data for Kata with/without `virtio-fs DAX` and K8s bare-metal runtime (`runc`).
|
||||
- [report]: Jupyter Notebook to create reports for data generated by [DAX-compare-test].
|
||||
|
||||
## Top-level Makefile targets
|
||||
|
||||
- `build`: Build `fio` metrics.
|
||||
- `test`: quick test, used to verify changes in [fio-test].
|
||||
- `run`: Run `fio` metrics and generate reports.
|
||||
- `test-report-interactive`: Run python notebook in `localhost:8888`, useful to edit the report.
|
||||
- `test-report`: Generate report from data generated by `make test`.
|
||||
|
||||
[fio-test]:cmd/fiotest
|
||||
[configs]:configs
|
||||
[pkg]:pkg
|
||||
[report]:scripts/dax-compare-test/report
|
||||
[DAX-compare-test]:scripts/dax-compare-test/README.md
|
||||
85
tests/metrics/storage/fio-k8s/fio-test-ci.sh
Executable file
@@ -0,0 +1,85 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2022-2023 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
# General env
|
||||
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
|
||||
source "${SCRIPT_PATH}/../../lib/common.bash"
|
||||
FIO_PATH="${GOPATH}/src/github.com/kata-containers/kata-containers/tests/metrics/storage/fio-k8s"
|
||||
TEST_NAME="${TEST_NAME:-fio}"
|
||||
|
||||
function main() {
|
||||
cmds=("bc" "jq")
|
||||
check_cmds "${cmds[@]}"
|
||||
check_processes
|
||||
init_env
|
||||
|
||||
export KUBECONFIG="$HOME/.kube/config"
|
||||
|
||||
pushd "${FIO_PATH}"
|
||||
echo "INFO: Running K8S FIO test"
|
||||
make test-ci
|
||||
popd
|
||||
|
||||
test_result_file="${FIO_PATH}/cmd/fiotest/test-results/kata/randrw-sync.job/output.json"
|
||||
|
||||
metrics_json_init
|
||||
local read_io=$(cat $test_result_file | grep io_bytes | head -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local read_bw=$(cat $test_result_file | grep bw_bytes | head -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local read_90_percentile=$(cat $test_result_file | grep 90.000000 | head -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local read_95_percentile=$(cat $test_result_file | grep 95.000000 | head -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local write_io=$(cat $test_result_file | grep io_bytes | head -2 | tail -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local write_bw=$(cat $test_result_file | grep bw_bytes | head -2 | tail -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local write_90_percentile=$(cat $test_result_file | grep 90.000000 | head -2 | tail -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
local write_95_percentile=$(cat $test_result_file | grep 95.000000 | head -2 | tail -1 | sed 's/[[:blank:]]//g' | cut -f2 -d ':' | cut -f1 -d ',')
|
||||
|
||||
metrics_json_start_array
|
||||
local json="$(cat << EOF
|
||||
{
|
||||
"readio": {
|
||||
"Result" : $read_io,
|
||||
"Units" : "bytes"
|
||||
},
|
||||
"readbw": {
|
||||
"Result" : $read_bw,
|
||||
"Units" : "bytes/sec"
|
||||
},
|
||||
"read90percentile": {
|
||||
"Result" : $read_90_percentile,
|
||||
"Units" : "ns"
|
||||
},
|
||||
"read95percentile": {
|
||||
"Result" : $read_95_percentile,
|
||||
"Units" : "ns"
|
||||
},
|
||||
"writeio": {
|
||||
"Result" : $write_io,
|
||||
"Units" : "bytes"
|
||||
},
|
||||
"writebw": {
|
||||
"Result" : $write_bw,
|
||||
"Units" : "bytes/sec"
|
||||
},
|
||||
"write90percentile": {
|
||||
"Result" : $write_90_percentile,
|
||||
"Units" : "ns"
|
||||
},
|
||||
"write95percentile": {
|
||||
"Result" : $write_95_percentile,
|
||||
"Units" : "ns"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
metrics_json_add_array_element "$json"
|
||||
metrics_json_end_array "Results"
|
||||
metrics_json_save
|
||||
|
||||
check_processes
|
||||
}
|
||||
|
||||
main "$@"
|
||||
@@ -0,0 +1,47 @@
|
||||
# FIO in Kubernetes
|
||||
|
||||
This test runs `fio` jobs to measure how Kata Containers work using virtio-fs DAX. The test works using Kubernetes.
|
||||
The test must run in a single-node cluster because it modifies the Kata configuration file.
|
||||
|
||||
The `virtio-fs` options that this test will use are:
|
||||
|
||||
* `cache mode`: Only `auto` is used, as it is the most compatible mode for most Kata use cases. It is currently the default in Kata.
|
||||
* `thread pool size` Restrict the number of worker threads per request queue, zero means no thread pool.
|
||||
* `DAX`
|
||||
```
|
||||
File contents can be mapped into a memory window on the host, allowing the guest to directly access data from the host page cache. This has several advantages: The guest page cache is bypassed, reducing the memory footprint. No communication is necessary
|
||||
to access file contents, improving I/O performance. Shared file access is coherent between virtual machines on the same host even with mmap.
|
||||
```
|
||||
|
||||
This test by default iterates over different `virtio-fs` configurations.
|
||||
|
||||
| test name | DAX | thread pool size | cache mode |
|
||||
|---------------------------|-----|------------------|------------|
|
||||
| pool_0_cache_auto_no_DAX | no | 0 | auto |
|
||||
| pool_0_cache_auto_DAX | yes | 0 | auto |
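
The test names in the table encode their own settings. As a minimal sketch of that naming scheme, the snippet below recovers the DAX, thread pool size and cache mode values from a name. The `describe_config` helper is hypothetical and is not part of the test code; it only assumes names of the form `pool_<N>_cache_<mode>_<DAX|no_DAX>` as listed above.

```shell
#!/bin/bash
# Hypothetical helper: derive the virtio-fs settings from a test name of the
# form pool_<N>_cache_<mode>_<DAX|no_DAX>.
describe_config() {
	local name="$1"
	local pool cache dax

	# Field 2 is the thread pool size, field 4 is the cache mode.
	pool="$(echo "${name}" | cut -d'_' -f2)"
	cache="$(echo "${name}" | cut -d'_' -f4)"

	# The suffix says whether DAX is enabled; check "no_DAX" first because
	# it also ends in "_DAX".
	case "${name}" in
		*_no_DAX) dax="no" ;;
		*_DAX) dax="yes" ;;
		*) dax="unknown" ;;
	esac

	echo "${name}: DAX=${dax} thread_pool_size=${pool} cache=${cache}"
}

for config in pool_0_cache_auto_no_DAX pool_0_cache_auto_DAX; do
	describe_config "${config}"
done
```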
|
||||
|
||||
The `fio` options used are:
|
||||
|
||||
`ioengine`: How the IO requests are issued to the kernel.
|
||||
* `libaio`: Supports async IO for both direct and buffered IO.
|
||||
* `mmap`: File is memory mapped with mmap(2) and data copied to/from using memcpy(3).
|
||||
|
||||
`rw type`: Type of I/O pattern.
|
||||
* `randread`: Random reads.
|
||||
* `randrw`: Random mixed reads and writes.
|
||||
* `randwrite`: Random writes.
|
||||
* `read`: Sequential reads.
|
||||
* `write`: Sequential writes.
|
||||
|
||||
Additional notes: Some jobs contain a `multi` prefix. This means that the same job runs more than once at the same time using its own file.
|
||||
|
||||
### Static `fio` values:
|
||||
|
||||
Some `fio` values are not modified over all the jobs.
|
||||
|
||||
* `runtime`: Tell `fio` to terminate processing after the specified period of time(seconds).
|
||||
* `loops`: Run the specified number of iterations of this job. Used to repeat the same workload a given number of times.
|
||||
* `iodepth`: Number of I/O units to keep in flight against the file. Note that increasing `iodepth` beyond 1 will not affect synchronous `ioengine`.
|
||||
* `size`: The total size of file I/O for each thread of this job.
|
||||
* `direct`: If value is true, use non-buffered I/O. This is usually O_`DIRECT`.
|
||||
* `blocksize`: The block size in bytes used for I/O units.
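
As a rough illustration of how the options described in this README map onto a `fio` invocation, the sketch below assembles a command line from the two varying options (`ioengine`, `rw`) and the static ones. The helper name `build_fio_cmd` and every numeric value are assumptions made for this example and are not taken from the test's job files; only the flag names are standard `fio` options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) a fio command line.
# $1: ioengine (libaio or mmap), $2: rw type (randread, randrw, ...).
build_fio_cmd() {
    local ioengine="${1:-libaio}"
    local rw="${2:-randread}"
    # The static values below are placeholders, not the test's real settings.
    local cmd=(
        fio
        "--name=${rw}_${ioengine}"
        "--ioengine=${ioengine}"
        "--rw=${rw}"
        --runtime=20
        --loops=1
        --iodepth=16
        --size=200M
        --direct=1
        --bs=4k
    )
    # Join the array elements with single spaces.
    echo "${cmd[*]}"
}

build_fio_cmd libaio randwrite
```

Printing the command instead of executing it keeps the sketch usable on a machine without `fio` installed.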
|
||||
@@ -13,6 +13,16 @@ set -o pipefail
|
||||
DOCKER_RUNTIME=${DOCKER_RUNTIME:-runc}
|
||||
MEASURED_ROOTFS=${MEASURED_ROOTFS:-no}
|
||||
|
||||
#For cross build
|
||||
CROSS_BUILD=${CROSS_BUILD:-false}
|
||||
BUILDX=""
|
||||
PLATFORM=""
|
||||
TARGET_ARCH=${TARGET_ARCH:-$(uname -m)}
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
[ "${TARGET_ARCH}" == "aarch64" ] && TARGET_ARCH=arm64
|
||||
TARGET_OS=${TARGET_OS:-linux}
|
||||
[ "${CROSS_BUILD}" == "true" ] && BUILDX=buildx && PLATFORM="--platform=${TARGET_OS}/${TARGET_ARCH}"
|
||||
|
||||
readonly script_name="${0##*/}"
|
||||
readonly script_dir=$(dirname "$(readlink -f "$0")")
|
||||
readonly lib_file="${script_dir}/../scripts/lib.sh"
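
The cross-build variables introduced in this hunk can be read as a small mapping from (`CROSS_BUILD`, `TARGET_OS`, `TARGET_ARCH`) to the extra words placed between the container engine and its build arguments. The sketch below restates that logic as a function; the name `build_subcommand` is hypothetical, and only the `aarch64` to `arm64` rename from the hunk is mirrored.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the BUILDX/PLATFORM logic of the hunk.
# $1: "true" to cross build, $2: target arch, $3: target OS.
build_subcommand() {
    local cross_build="${1:-false}"
    local target_arch="${2:-$(uname -m)}"
    local target_os="${3:-linux}"

    # The platform string uses "arm64" where uname reports "aarch64".
    if [ "${target_arch}" == "aarch64" ]; then
        target_arch="arm64"
    fi

    if [ "${cross_build}" == "true" ]; then
        echo "buildx build --platform=${target_os}/${target_arch}"
    else
        echo "build"
    fi
}

build_subcommand true aarch64   # prints "buildx build --platform=linux/arm64"
build_subcommand false s390x    # prints "build"
```

When `CROSS_BUILD` is not `true`, both `BUILDX` and `PLATFORM` stay empty, so the native invocation is unchanged.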
|
||||
@@ -154,7 +164,7 @@ build_with_container() {
|
||||
engine_build_args+=" --runtime ${DOCKER_RUNTIME}"
|
||||
fi
|
||||
|
||||
"${container_engine}" build \
|
||||
"${container_engine}" ${BUILDX} build ${PLATFORM} \
|
||||
${engine_build_args} \
|
||||
--build-arg http_proxy="${http_proxy}" \
|
||||
--build-arg https_proxy="${https_proxy}" \
|
||||
@@ -189,6 +199,8 @@ build_with_container() {
|
||||
--env MEASURED_ROOTFS="${MEASURED_ROOTFS}" \
|
||||
--env SELINUX="${SELINUX}" \
|
||||
--env DEBUG="${DEBUG}" \
|
||||
--env ARCH="${ARCH}" \
|
||||
--env TARGET_ARCH="${TARGET_ARCH}" \
|
||||
-v /dev:/dev \
|
||||
-v "${script_dir}":"/osbuilder" \
|
||||
-v "${script_dir}/../scripts":"/scripts" \
|
||||
|
||||
@@ -31,6 +31,16 @@ SELINUX=${SELINUX:-"no"}
|
||||
lib_file="${script_dir}/../scripts/lib.sh"
|
||||
source "$lib_file"
|
||||
|
||||
#For cross build
|
||||
CROSS_BUILD=${CROSS_BUILD:-false}
|
||||
BUILDX=""
|
||||
PLATFORM=""
|
||||
TARGET_ARCH=${TARGET_ARCH:-$(uname -m)}
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
[ "${TARGET_ARCH}" == "aarch64" ] && TARGET_ARCH=arm64
|
||||
TARGET_OS=${TARGET_OS:-linux}
|
||||
[ "${CROSS_BUILD}" == "true" ] && BUILDX=buildx && PLATFORM="--platform=${TARGET_OS}/${TARGET_ARCH}"
|
||||
|
||||
handle_error() {
|
||||
local exit_code="${?}"
|
||||
local line_number="${1:-}"
|
||||
|
||||
@@ -8,6 +8,7 @@ FROM ${IMAGE_REGISTRY}/ubuntu:@OS_VERSION@
|
||||
|
||||
# makedev tries to mknod from postinst
|
||||
RUN [ -x /usr/bin/systemd-detect-virt ] || ( echo "echo docker" >/usr/bin/systemd-detect-virt && chmod +x /usr/bin/systemd-detect-virt )
|
||||
# hadolint ignore=DL3009,SC2046
|
||||
RUN apt-get update && \
|
||||
DEBIAN_FRONTEND=noninteractive \
|
||||
apt-get --no-install-recommends -y install \
|
||||
@@ -18,6 +19,7 @@ RUN apt-get update && \
|
||||
libc_arch="$gcc_arch" && \
|
||||
[ "$gcc_arch" = aarch64 ] && libc_arch=arm64; \
|
||||
[ "$gcc_arch" = ppc64le ] && gcc_arch=powerpc64le && libc_arch=ppc64el; \
|
||||
[ "$gcc_arch" = s390x ] && gcc_arch=s390x && libc_arch=s390x; \
|
||||
[ "$gcc_arch" = x86_64 ] && gcc_arch=x86-64 && libc_arch=amd64; \
|
||||
echo "gcc-$gcc_arch-linux-gnu libc6-dev-$libc_arch-cross")) \
|
||||
git \
|
||||
|
||||
@@ -21,7 +21,13 @@ readonly osbuilder_dir="$(cd "${repo_root_dir}/tools/osbuilder" && pwd)"
|
||||
|
||||
export GOPATH=${GOPATH:-${HOME}/go}
|
||||
|
||||
arch_target="$(uname -m)"
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
if [ $(uname -m) == "${ARCH}" ]; then
|
||||
arch_target="$(uname -m)"
|
||||
else
|
||||
arch_target="${ARCH}"
|
||||
fi
|
||||
|
||||
final_artifact_name="kata-containers"
|
||||
image_initrd_extension=".img"
|
||||
|
||||
|
||||
@@ -1,46 +0,0 @@
|
||||
---
|
||||
apiVersion: apps/v1
|
||||
kind: DaemonSet
|
||||
metadata:
|
||||
name: kubelet-kata-cleanup
|
||||
namespace: kube-system
|
||||
spec:
|
||||
selector:
|
||||
matchLabels:
|
||||
name: kubelet-kata-cleanup
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
name: kubelet-kata-cleanup
|
||||
spec:
|
||||
serviceAccountName: kata-label-node
|
||||
nodeSelector:
|
||||
katacontainers.io/kata-runtime: cleanup
|
||||
containers:
|
||||
- name: kube-kata-cleanup
|
||||
image: quay.io/kata-containers/kata-deploy:stable
|
||||
imagePullPolicy: Always
|
||||
command: [ "bash", "-c", "/opt/kata-artifacts/scripts/kata-deploy.sh reset" ]
|
||||
env:
|
||||
- name: NODE_NAME
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
securityContext:
|
||||
privileged: false
|
||||
volumeMounts:
|
||||
- name: dbus
|
||||
mountPath: /var/run/dbus
|
||||
- name: systemd
|
||||
mountPath: /run/systemd
|
||||
volumes:
|
||||
- name: dbus
|
||||
hostPath:
|
||||
path: /var/run/dbus
|
||||
- name: systemd
|
||||
hostPath:
|
||||
path: /run/systemd
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
maxUnavailable: 1
|
||||
type: RollingUpdate
|
||||
@@ -1,69 +0,0 @@
|
||||
---
|
||||
apiVersion: apps/v1
|
||||
kind: DaemonSet
|
||||
metadata:
|
||||
name: kata-deploy
|
||||
namespace: kube-system
|
||||
spec:
|
||||
selector:
|
||||
matchLabels:
|
||||
name: kata-deploy
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
name: kata-deploy
|
||||
spec:
|
||||
serviceAccountName: kata-label-node
|
||||
containers:
|
||||
- name: kube-kata
|
||||
image: quay.io/kata-containers/kata-deploy:stable
|
||||
imagePullPolicy: Always
|
||||
lifecycle:
|
||||
preStop:
|
||||
exec:
|
||||
command: ["bash", "-c", "/opt/kata-artifacts/scripts/kata-deploy.sh cleanup"]
|
||||
command: [ "bash", "-c", "/opt/kata-artifacts/scripts/kata-deploy.sh install" ]
|
||||
env:
|
||||
- name: NODE_NAME
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
securityContext:
|
||||
privileged: false
|
||||
volumeMounts:
|
||||
- name: crio-conf
|
||||
mountPath: /etc/crio/
|
||||
- name: containerd-conf
|
||||
mountPath: /etc/containerd/
|
||||
- name: kata-artifacts
|
||||
mountPath: /opt/kata/
|
||||
- name: dbus
|
||||
mountPath: /var/run/dbus
|
||||
- name: systemd
|
||||
mountPath: /run/systemd
|
||||
- name: local-bin
|
||||
mountPath: /usr/local/bin/
|
||||
volumes:
|
||||
- name: crio-conf
|
||||
hostPath:
|
||||
path: /etc/crio/
|
||||
- name: containerd-conf
|
||||
hostPath:
|
||||
path: /etc/containerd/
|
||||
- name: kata-artifacts
|
||||
hostPath:
|
||||
path: /opt/kata/
|
||||
type: DirectoryOrCreate
|
||||
- name: dbus
|
||||
hostPath:
|
||||
path: /var/run/dbus
|
||||
- name: systemd
|
||||
hostPath:
|
||||
path: /run/systemd
|
||||
- name: local-bin
|
||||
hostPath:
|
||||
path: /usr/local/bin/
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
maxUnavailable: 1
|
||||
type: RollingUpdate
|
||||
@@ -19,6 +19,29 @@ gid=$(id -g ${USER})
|
||||
http_proxy="${http_proxy:-}"
|
||||
https_proxy="${https_proxy:-}"
|
||||
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
CROSS_BUILD=
|
||||
BUILDX=""
|
||||
PLATFORM=""
|
||||
TARGET_ARCH=${TARGET_ARCH:-$(uname -m)}
|
||||
[ "$(uname -m)" != "${TARGET_ARCH}" ] && CROSS_BUILD=true
|
||||
|
||||
[ "${TARGET_ARCH}" == "aarch64" ] && TARGET_ARCH=arm64
|
||||
|
||||
# used for cross build
|
||||
TARGET_OS=${TARGET_OS:-linux}
|
||||
TARGET_ARCH=${TARGET_ARCH:-$ARCH}
|
||||
|
||||
[ "${CROSS_BUILD}" == "true" ] && BUILDX="buildx" && PLATFORM="--platform=${TARGET_OS}/${TARGET_ARCH}"
|
||||
if [ "${CROSS_BUILD}" == "true" ]; then
|
||||
# check if the current docker support docker buildx
|
||||
docker buildx ls > /dev/null 2>&1 || true
|
||||
[ $? != 0 ] && echo "no docker buildx support, please upgrad your docker" && exit 1
|
||||
# check if docker buildx support target_arch, if not install it
|
||||
r=$(docker buildx ls | grep "${TARGET_ARCH}" || true)
|
||||
[ -z "$r" ] && sudo docker run --privileged --rm tonistiigi/binfmt --install ${TARGET_ARCH}
|
||||
fi
|
||||
|
||||
if [ "${script_dir}" != "${PWD}" ]; then
|
||||
ln -sf "${script_dir}/build" "${PWD}/build"
|
||||
fi
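
A capability check of this kind only works if the probe's exit status reaches the test: once `|| true` is appended to a command, `$?` is always 0 afterwards. The sketch below shows one way to keep the status while remaining safe under `set -o errexit`. The helper name `probe_ok` is an assumption for illustration and is not part of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a probe command quietly and return its status.
probe_ok() {
    "$@" > /dev/null 2>&1
}

# Using the probe inside "if" keeps errexit from aborting the script
# while still letting the failure branch run.
if ! probe_ok docker buildx ls; then
    echo "no docker buildx support, please upgrade your docker"
fi
```

A real script would typically `exit 1` in the failure branch; the sketch only prints the message so it can be run on a host without Docker.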
|
||||
@@ -66,6 +89,9 @@ docker run \
|
||||
--env VIRTIOFSD_CONTAINER_BUILDER="${VIRTIOFSD_CONTAINER_BUILDER:-}" \
|
||||
--env MEASURED_ROOTFS="${MEASURED_ROOTFS:-}" \
|
||||
--env USE_CACHE="${USE_CACHE:-}" \
|
||||
--env CROSS_BUILD="${CROSS_BUILD}" \
|
||||
--env TARGET_ARCH="${TARGET_ARCH}" \
|
||||
--env ARCH="${ARCH}" \
|
||||
--rm \
|
||||
-w ${script_dir} \
|
||||
build-kata-deploy "${kata_deploy_create}" $@
|
||||
|
||||
@@ -38,7 +38,7 @@ readonly rootfs_builder="${repo_root_dir}/tools/packaging/guest-image/build_imag
|
||||
readonly jenkins_url="http://jenkins.katacontainers.io"
|
||||
readonly cached_artifacts_path="lastSuccessfulBuild/artifact/artifacts"
|
||||
|
||||
ARCH=$(uname -m)
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
MEASURED_ROOTFS=${MEASURED_ROOTFS:-no}
|
||||
USE_CACHE="${USE_CACHE:-"yes"}"
|
||||
|
||||
@@ -150,7 +150,7 @@ install_image() {
|
||||
image_type+="-${variant}"
|
||||
fi
|
||||
|
||||
local jenkins="${jenkins_url}/job/kata-containers-main-rootfs-${image_type}-$(uname -m)/${cached_artifacts_path}"
|
||||
local jenkins="${jenkins_url}/job/kata-containers-main-rootfs-${image_type}-${ARCH}/${cached_artifacts_path}"
|
||||
local component="rootfs-${image_type}"
|
||||
|
||||
local osbuilder_last_commit="$(get_last_modification "${repo_root_dir}/tools/osbuilder")"
|
||||
@@ -197,7 +197,7 @@ install_initrd() {
|
||||
initrd_type+="-${variant}"
|
||||
fi
|
||||
|
||||
local jenkins="${jenkins_url}/job/kata-containers-main-rootfs-${initrd_type}-$(uname -m)/${cached_artifacts_path}"
|
||||
local jenkins="${jenkins_url}/job/kata-containers-main-rootfs-${initrd_type}-${ARCH}/${cached_artifacts_path}"
|
||||
local component="rootfs-${initrd_type}"
|
||||
|
||||
local osbuilder_last_commit="$(get_last_modification "${repo_root_dir}/tools/osbuilder")"
|
||||
@@ -208,6 +208,8 @@ install_initrd() {
|
||||
local libseccomp_version="$(get_from_kata_deps "externals.libseccomp.version")"
|
||||
local rust_version="$(get_from_kata_deps "languages.rust.meta.newest-version")"
|
||||
|
||||
[[ "${ARCH}" == "aarch64" && "${CROSS_BUILD}" == "true" ]] && echo "warning: Don't cross build initrd for aarch64 as it's too slow" && exit 0
|
||||
|
||||
install_cached_tarball_component \
|
||||
"${component}" \
|
||||
"${jenkins}" \
|
||||
@@ -247,7 +249,7 @@ install_cached_kernel_tarball_component() {
|
||||
|
||||
install_cached_tarball_component \
|
||||
"${kernel_name}" \
|
||||
"${jenkins_url}/job/kata-containers-main-${kernel_name}-$(uname -m)/${cached_artifacts_path}" \
|
||||
"${jenkins_url}/job/kata-containers-main-${kernel_name}-${ARCH}/${cached_artifacts_path}" \
|
||||
"${kernel_version}-${kernel_kata_config_version}-$(get_last_modification $(dirname $kernel_builder))" \
|
||||
"$(get_kernel_image_name)" \
|
||||
"${final_tarball_name}" \
|
||||
@@ -384,7 +386,7 @@ install_qemu_helper() {
|
||||
|
||||
install_cached_tarball_component \
|
||||
"${qemu_name}" \
|
||||
"${jenkins_url}/job/kata-containers-main-${qemu_name}-$(uname -m)/${cached_artifacts_path}" \
|
||||
"${jenkins_url}/job/kata-containers-main-${qemu_name}-${ARCH}/${cached_artifacts_path}" \
|
||||
"${qemu_version}-$(calc_qemu_files_sha256sum)" \
|
||||
"$(get_qemu_image_name)" \
|
||||
"${final_tarball_name}" \
|
||||
@@ -495,7 +497,7 @@ install_clh_glibc() {
|
||||
install_virtiofsd() {
|
||||
install_cached_tarball_component \
|
||||
"virtiofsd" \
|
||||
"${jenkins_url}/job/kata-containers-main-virtiofsd-$(uname -m)/${cached_artifacts_path}" \
|
||||
"${jenkins_url}/job/kata-containers-main-virtiofsd-${ARCH}/${cached_artifacts_path}" \
|
||||
"$(get_from_kata_deps "externals.virtiofsd.version")-$(get_from_kata_deps "externals.virtiofsd.toolchain")" \
|
||||
"$(get_virtiofsd_image_name)" \
|
||||
"${final_tarball_name}" \
|
||||
@@ -542,7 +544,7 @@ install_shimv2() {
|
||||
|
||||
install_cached_tarball_component \
|
||||
"shim-v2" \
|
||||
"${jenkins_url}/job/kata-containers-main-shim-v2-$(uname -m)/${cached_artifacts_path}" \
|
||||
"${jenkins_url}/job/kata-containers-main-shim-v2-${ARCH}/${cached_artifacts_path}" \
|
||||
"${shim_v2_version}" \
|
||||
"$(get_shim_v2_image_name)" \
|
||||
"${final_tarball_name}" \
|
||||
|
||||
@@ -65,6 +65,8 @@ kernel_url=""
|
||||
#Linux headers for GPU guest fs module building
|
||||
linux_headers=""
|
||||
|
||||
CROSS_BUILD_ARG=""
|
||||
|
||||
MEASURED_ROOTFS=${MEASURED_ROOTFS:-no}
|
||||
|
||||
packaging_scripts_dir="${script_dir}/../scripts"
|
||||
@@ -436,7 +438,7 @@ setup_kernel() {
|
||||
|
||||
info "Copying config file from: ${kernel_config_path}"
|
||||
cp "${kernel_config_path}" ./.config
|
||||
make oldconfig
|
||||
ARCH=${arch_target} make oldconfig ${CROSS_BUILD_ARG}
|
||||
)
|
||||
}
|
||||
|
||||
@@ -447,7 +449,7 @@ build_kernel() {
|
||||
[ -n "${arch_target}" ] || arch_target="$(uname -m)"
|
||||
arch_target=$(arch_to_kernel "${arch_target}")
|
||||
pushd "${kernel_path}" >>/dev/null
|
||||
make -j $(nproc ${CI:+--ignore 1}) ARCH="${arch_target}"
|
||||
make -j $(nproc ${CI:+--ignore 1}) ARCH="${arch_target}" ${CROSS_BUILD_ARG}
|
||||
if [ "${conf_guest}" == "sev" ]; then
|
||||
make -j $(nproc ${CI:+--ignore 1}) INSTALL_MOD_STRIP=1 INSTALL_MOD_PATH=${kernel_path} modules_install
|
||||
fi
|
||||
@@ -658,6 +660,8 @@ main() {
|
||||
|
||||
info "Kernel version: ${kernel_version}"
|
||||
|
||||
[ "${arch_target}" != "" -a "${arch_target}" != $(uname -m) ] && CROSS_BUILD_ARG="CROSS_COMPILE=${arch_target}-linux-gnu-"
|
||||
|
||||
case "${subcmd}" in
|
||||
build)
|
||||
build_kernel "${kernel_path}"
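
The `CROSS_BUILD_ARG` assignment added in `main()` amounts to: pass a `CROSS_COMPILE` prefix to `make` only when a target architecture is set and differs from the host. A minimal sketch of that rule follows; the function name `cross_compile_arg` and the explicit host parameter are assumptions added so the logic can be exercised on any machine.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the make argument for a cross build,
# or nothing for a native build.
# $1: target arch, $2: host arch (defaults to the running machine).
cross_compile_arg() {
    local arch_target="${1:-}"
    local host_arch="${2:-$(uname -m)}"

    if [ -n "${arch_target}" ] && [ "${arch_target}" != "${host_arch}" ]; then
        echo "CROSS_COMPILE=${arch_target}-linux-gnu-"
    fi
}

cross_compile_arg s390x x86_64    # prints "CROSS_COMPILE=s390x-linux-gnu-"
cross_compile_arg x86_64 x86_64   # prints nothing
```

The prefix matches the naming of the `gcc-<arch>-linux-gnu` packages that the kernel builder Dockerfile installs for non-native architectures.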
|
||||
|
||||
@@ -1 +1 @@
|
||||
111
|
||||
112
|
||||
|
||||
@@ -141,18 +141,18 @@ build reproducibility we publish those container images, and when those are used
|
||||
of the projects listed as part of the "versions.yaml" file, users can get as close to the environment we
|
||||
used to build the release artefacts.
|
||||
* Kernel (on all its different flavours): $(get_kernel_image_name)
|
||||
* OVMF (on all its diferent flavours): $(get_ovmf_image_name)
|
||||
* OVMF (on all its different flavours): $(get_ovmf_image_name)
|
||||
* QEMU (on all its different flavurs): $(get_qemu_image_name)
|
||||
* shim-v2: $(get_shim_v2_image_name)
|
||||
* virtiofsd: $(get_virtiofsd_image_name)
|
||||
|
||||
The users who want to rebuild the tarballs using exactly the same images can simply use the following environment
|
||||
variables:
|
||||
* `KERNEL_CONTAINER_BUILDER`
|
||||
* `OVMF_CONTAINER_BUILDER`
|
||||
* `QEMU_CONTAINER_BUILDER`
|
||||
* `SHIM_V2_CONTAINER_BUILDER`
|
||||
* `VIRTIOFSD_CONTAINER_BUILDER`
|
||||
* \`KERNEL_CONTAINER_BUILDER\`
|
||||
* \`OVMF_CONTAINER_BUILDER\`
|
||||
* \`QEMU_CONTAINER_BUILDER\`
|
||||
* \`SHIM_V2_CONTAINER_BUILDER\`
|
||||
* \`VIRTIOFSD_CONTAINER_BUILDER\`
|
||||
|
||||
## Kata Linux Containers Kernel
|
||||
Kata Containers ${runtime_version} suggest to use the Linux kernel [${kernel_version}][kernel]
|
||||
|
||||
@@ -19,6 +19,16 @@ short_commit_length=10
|
||||
|
||||
hub_bin="hub-bin"
|
||||
|
||||
#for cross build
|
||||
CROSS_BUILD=${CROSS_BUILD-:}
|
||||
BUILDX=""
|
||||
PLATFORM=""
|
||||
TARGET_ARCH=${TARGET_ARCH:-$(uname -m)}
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
[ "${TARGET_ARCH}" == "aarch64" ] && TARGET_ARCH=arm64
|
||||
TARGET_OS=${TARGET_OS:-linux}
|
||||
[ "${CROSS_BUILD}" == "true" ] && BUILDX=buildx && PLATFORM="--platform=${TARGET_OS}/${TARGET_ARCH}"
|
||||
|
||||
clone_tests_repo() {
|
||||
# KATA_CI_NO_NETWORK is (has to be) ignored if there is
|
||||
# no existing clone.
|
||||
@@ -189,7 +199,7 @@ get_ovmf_image_name() {
|
||||
}
|
||||
|
||||
get_virtiofsd_image_name() {
|
||||
ARCH=$(uname -m)
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
case ${ARCH} in
|
||||
"aarch64")
|
||||
libc="musl"
|
||||
|
||||
@@ -5,6 +5,8 @@
|
||||
FROM ubuntu:22.04
|
||||
ENV DEBIAN_FRONTEND=noninteractive
|
||||
|
||||
ARG ARCH
|
||||
|
||||
# kernel deps
|
||||
RUN apt-get update && \
|
||||
apt-get install -y --no-install-recommends \
|
||||
@@ -23,4 +25,5 @@ RUN apt-get update && \
|
||||
rsync \
|
||||
cpio \
|
||||
patch && \
|
||||
apt-get clean && apt-get autoclean
|
||||
if [ "${ARCH}" != "$(uname -m)" ]; then apt-get install --no-install-recommends -y gcc-"${ARCH}"-linux-gnu binutils-"${ARCH}"-linux-gnu; fi && \
|
||||
apt-get clean && apt-get autoclean && rm -rf /var/lib/apt/lists/*
|
||||
|
||||
@@ -14,12 +14,26 @@ source "${script_dir}/../../scripts/lib.sh"
|
||||
|
||||
readonly kernel_builder="${repo_root_dir}/tools/packaging/kernel/build-kernel.sh"
|
||||
|
||||
BUILDX=
|
||||
PLATFORM=
|
||||
|
||||
DESTDIR=${DESTDIR:-${PWD}}
|
||||
PREFIX=${PREFIX:-/opt/kata}
|
||||
container_image="${KERNEL_CONTAINER_BUILDER:-$(get_kernel_image_name)}"
|
||||
|
||||
if [ "${CROSS_BUILD}" == "true" ]; then
|
||||
container_image="${container_image}-${ARCH}-cross-build"
|
||||
# Need to build a s390x image due to an issue at
|
||||
# https://github.com/kata-containers/kata-containers/pull/6586#issuecomment-1603189242
|
||||
if [ ${ARCH} == "s390x" ]; then
|
||||
BUILDX="buildx"
|
||||
PLATFORM="--platform=linux/s390x"
|
||||
fi
|
||||
fi
|
||||
|
||||
sudo docker pull ${container_image} || \
|
||||
(sudo docker build -t "${container_image}" "${script_dir}" && \
|
||||
(sudo docker ${BUILDX} build ${PLATFORM} \
|
||||
--build-arg ARCH=${ARCH} -t "${container_image}" "${script_dir}" && \
|
||||
# No-op unless PUSH_TO_REGISTRY is exported as "yes"
|
||||
push_to_registry "${container_image}")
|
||||
|
||||
@@ -27,21 +41,21 @@ sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
-w "${PWD}" \
|
||||
--env MEASURED_ROOTFS="${MEASURED_ROOTFS:-}" \
|
||||
"${container_image}" \
|
||||
bash -c "${kernel_builder} $* setup"
|
||||
bash -c "${kernel_builder} -a ${ARCH} $* setup"
|
||||
|
||||
sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
-w "${PWD}" \
|
||||
"${container_image}" \
|
||||
bash -c "${kernel_builder} $* build"
|
||||
bash -c "${kernel_builder} -a ${ARCH} $* build"
|
||||
|
||||
sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
-w "${PWD}" \
|
||||
--env DESTDIR="${DESTDIR}" --env PREFIX="${PREFIX}" \
|
||||
"${container_image}" \
|
||||
bash -c "${kernel_builder} $* install"
|
||||
bash -c "${kernel_builder} -a ${ARCH} $* install"
|
||||
|
||||
sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
-w "${PWD}" \
|
||||
--env DESTDIR="${DESTDIR}" --env PREFIX="${PREFIX}" \
|
||||
"${container_image}" \
|
||||
bash -c "${kernel_builder} $* build-headers"
|
||||
bash -c "${kernel_builder} -a ${ARCH} $* build-headers"
|
||||
|
||||
@@ -8,8 +8,23 @@ from ubuntu:20.04
|
||||
# This is required to keep build dependencies with security fixes.
|
||||
ARG CACHE_TIMEOUT
|
||||
ARG DEBIAN_FRONTEND=noninteractive
|
||||
ARG DPKG_ARCH
|
||||
ARG ARCH
|
||||
ARG GCC_ARCH
|
||||
ARG PREFIX
|
||||
|
||||
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
|
||||
|
||||
RUN if [ "${ARCH}" != "$(uname -m)" ]; then sed -i 's/^deb/deb [arch=amd64]/g' /etc/apt/sources.list && \
|
||||
dpkg --add-architecture "${DPKG_ARCH#:}" && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal main restricted" >> /etc/apt/sources.list && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal-updates main restricted" >> /etc/apt/sources.list && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal universe" >> /etc/apt/sources.list && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal-updates universe" >> /etc/apt/sources.list && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal multiverse" >> /etc/apt/sources.list && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal-updates multiverse" >> /etc/apt/sources.list && \
|
||||
echo "deb [arch=${DPKG_ARCH#:}] http://ports.ubuntu.com/ focal-backports main restricted universe multiverse" >> /etc/apt/sources.list; fi
|
||||
|
||||
RUN apt-get update && apt-get upgrade -y && \
|
||||
apt-get --no-install-recommends install -y \
|
||||
apt-utils \
|
||||
@@ -19,37 +34,43 @@ RUN apt-get update && apt-get upgrade -y && \
|
||||
bison \
|
||||
ca-certificates \
|
||||
cpio \
|
||||
dpkg-dev \
|
||||
flex \
|
||||
gawk \
|
||||
libaudit-dev \
|
||||
libblkid-dev \
|
||||
libcap-dev \
|
||||
libcap-ng-dev \
|
||||
libdw-dev \
|
||||
libelf-dev \
|
||||
libffi-dev \
|
||||
libglib2.0-0 \
|
||||
libglib2.0-dev \
|
||||
libglib2.0-dev git \
|
||||
libltdl-dev \
|
||||
libmount-dev \
|
||||
libpixman-1-dev \
|
||||
libselinux1-dev \
|
||||
libtool \
|
||||
libaudit-dev${DPKG_ARCH} \
|
||||
libblkid-dev${DPKG_ARCH} \
|
||||
libcap-dev${DPKG_ARCH} \
|
||||
libcap-ng-dev${DPKG_ARCH} \
|
||||
libdw-dev${DPKG_ARCH} \
|
||||
libelf-dev${DPKG_ARCH} \
|
||||
libffi-dev${DPKG_ARCH} \
|
||||
libglib2.0-0${DPKG_ARCH} \
|
||||
libglib2.0-dev${DPKG_ARCH} \
|
||||
libglib2.0-dev${DPKG_ARCH} git \
|
||||
libltdl-dev${DPKG_ARCH} \
|
||||
libmount-dev${DPKG_ARCH} \
|
||||
libpixman-1-dev${DPKG_ARCH} \
|
||||
libselinux1-dev${DPKG_ARCH} \
|
||||
libtool${DPKG_ARCH} \
|
||||
make \
|
||||
ninja-build \
|
||||
pkg-config \
|
||||
libseccomp-dev \
|
||||
libseccomp2 \
|
||||
pkg-config${DPKG_ARCH} \
|
||||
libseccomp-dev${DPKG_ARCH} \
|
||||
libseccomp2${DPKG_ARCH} \
|
||||
patch \
|
||||
python \
|
||||
python-dev \
|
||||
rsync \
|
||||
zlib1g-dev && \
|
||||
if [ "$(uname -m)" != "s390x" ]; then apt-get install -y --no-install-recommends libpmem-dev; fi && \
|
||||
zlib1g-dev${DPKG_ARCH} && \
|
||||
if [ "${ARCH}" != s390x ]; then apt-get install -y --no-install-recommends libpmem-dev${DPKG_ARCH}; fi && \
|
||||
GCC_ARCH="${ARCH}" && if [ "${ARCH}" = "ppc64le" ]; then GCC_ARCH="powerpc64le"; fi && \
|
||||
if [ "${ARCH}" != "$(uname -m)" ]; then apt-get install --no-install-recommends -y gcc-"${GCC_ARCH}"-linux-gnu; fi && \
|
||||
apt-get clean && rm -rf /var/lib/apt/lists/
|
||||
|
||||
RUN git clone https://github.com/axboe/liburing/ ~/liburing && \
|
||||
cd ~/liburing && \
|
||||
git checkout tags/liburing-2.1 && \
|
||||
GCC_ARCH="${ARCH}" && if [ "${ARCH}" = "ppc64le" ]; then GCC_ARCH="powerpc64le"; fi && \
|
||||
if [ "${ARCH}" != "$(uname -m)" ]; then PREFIX="${GCC_ARCH}-linux-gnu"; fi && \
|
||||
./configure --cc=${GCC_ARCH}-linux-gnu-gcc --cxx=${GCC_ARCH}-linux-gnu-cpp --prefix=/usr/${PREFIX}/ && \
|
||||
make && make install && ldconfig
|
||||
|
||||
@@ -14,6 +14,12 @@ readonly qemu_builder="${script_dir}/build-qemu.sh"
|
||||
source "${script_dir}/../../scripts/lib.sh"
|
||||
source "${script_dir}/../qemu.blacklist"
|
||||
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
dpkg_arch=":${ARCH}"
|
||||
[ ${dpkg_arch} == ":aarch64" ] && dpkg_arch=":arm64"
|
||||
[ ${dpkg_arch} == ":x86_64" ] && dpkg_arch=""
|
||||
[ "${dpkg_arch}" == ":ppc64le" ] && dpkg_arch=":ppc64el"
|
||||
|
||||
packaging_dir="${script_dir}/../.."
|
||||
qemu_destdir="/tmp/qemu-static/"
|
||||
container_engine="${USE_PODMAN:+podman}"
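
The `dpkg_arch` lines above derive the `:<arch>` suffix that the Dockerfile appends to package names such as `libglib2.0-dev${DPKG_ARCH}`. The same mapping can be sketched as a single `case` statement; the function name `dpkg_arch_suffix` is hypothetical, and, like the hunk, the sketch treats `x86_64` as the native architecture that needs no suffix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a `uname -m` style arch to a dpkg package suffix.
dpkg_arch_suffix() {
    local arch="${1:-$(uname -m)}"
    case "${arch}" in
        aarch64) echo ":arm64" ;;    # Debian/Ubuntu name for aarch64
        ppc64le) echo ":ppc64el" ;;  # Debian/Ubuntu name for ppc64le
        x86_64)  echo "" ;;          # native in this setup: no suffix
        *)       echo ":${arch}" ;;  # e.g. s390x keeps its name
    esac
}

echo "libglib2.0-dev$(dpkg_arch_suffix aarch64)"   # prints "libglib2.0-dev:arm64"
```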
|
||||
@@ -39,11 +45,14 @@ CACHE_TIMEOUT=$(date +"%Y-%m-%d")
|
||||
[ -n "${build_suffix}" ] && PKGVERSION="kata-static-${build_suffix}" || PKGVERSION="kata-static"
|
||||
|
||||
container_image="${QEMU_CONTAINER_BUILDER:-$(get_qemu_image_name)}"
|
||||
[ "${CROSS_BUILD}" == "true" ] && container_image="${container_image}-cross-build"
|
||||
|
||||
sudo docker pull ${container_image} || (sudo "${container_engine}" build \
|
||||
--build-arg CACHE_TIMEOUT="${CACHE_TIMEOUT}" \
|
||||
--build-arg http_proxy="${http_proxy}" \
|
||||
--build-arg https_proxy="${https_proxy}" \
|
||||
--build-arg DPKG_ARCH="${dpkg_arch}" \
|
||||
--build-arg ARCH="${ARCH}" \
|
||||
"${packaging_dir}" \
|
||||
-f "${script_dir}/Dockerfile" \
|
||||
-t "${container_image}" && \
|
||||
@@ -54,13 +63,14 @@ sudo "${container_engine}" run \
|
||||
--rm \
|
||||
-i \
|
||||
--env BUILD_SUFFIX="${build_suffix}" \
|
||||
--env HYPERVISOR_NAME="${HYPERVISOR_NAME}" \
|
||||
--env PKGVERSION="${PKGVERSION}" \
|
||||
--env QEMU_DESTDIR="${qemu_destdir}" \
|
||||
--env QEMU_REPO="${qemu_repo}" \
|
||||
--env QEMU_VERSION="${qemu_version}" \
|
||||
--env QEMU_TARBALL="${qemu_tar}" \
|
||||
--env PREFIX="${prefix}" \
|
||||
--env HYPERVISOR_NAME="${HYPERVISOR_NAME}" \
|
||||
--env QEMU_VERSION_NUM="${qemu_version}" \
|
||||
--env ARCH="${ARCH}" \
|
||||
-v "${repo_root_dir}:/root/kata-containers" \
|
||||
-v "${PWD}":/share "${container_image}" \
|
||||
bash -c "/root/kata-containers/tools/packaging/static-build/qemu/build-qemu.sh"
|
||||
|
||||
@@ -14,13 +14,19 @@ kata_packaging_scripts="${kata_packaging_dir}/scripts"
|
||||
kata_static_build_dir="${kata_packaging_dir}/static-build"
|
||||
kata_static_build_scripts="${kata_static_build_dir}/scripts"
|
||||
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
|
||||
git clone --depth=1 "${QEMU_REPO}" qemu
|
||||
pushd qemu
|
||||
git fetch --depth=1 origin "${QEMU_VERSION}"
|
||||
git fetch --depth=1 origin "${QEMU_VERSION_NUM}"
|
||||
git checkout FETCH_HEAD
|
||||
scripts/git-submodule.sh update meson capstone
|
||||
${kata_packaging_scripts}/patch_qemu.sh "${QEMU_VERSION}" "${kata_packaging_dir}/qemu/patches"
|
||||
PREFIX="${PREFIX}" ${kata_packaging_scripts}/configure-hypervisor.sh -s "${HYPERVISOR_NAME}" | xargs ./configure --with-pkgversion="${PKGVERSION}"
|
||||
${kata_packaging_scripts}/patch_qemu.sh "${QEMU_VERSION_NUM}" "${kata_packaging_dir}/qemu/patches"
|
||||
if [ "$(uname -m)" != "${ARCH}" ] && [ "${ARCH}" == "s390x" ]; then
|
||||
PREFIX="${PREFIX}" ${kata_packaging_scripts}/configure-hypervisor.sh -s "${HYPERVISOR_NAME}" "${ARCH}" | xargs ./configure --with-pkgversion="${PKGVERSION}" --cc=s390x-linux-gnu-gcc --cross-prefix=s390x-linux-gnu- --prefix="${PREFIX}" --target-list=s390x-softmmu
|
||||
else
|
||||
PREFIX="${PREFIX}" ${kata_packaging_scripts}/configure-hypervisor.sh -s "${HYPERVISOR_NAME}" "${ARCH}" | xargs ./configure --with-pkgversion="${PKGVERSION}"
|
||||
fi
|
||||
make -j"$(nproc +--ignore 1)"
|
||||
make install DESTDIR="${QEMU_DESTDIR}"
|
||||
popd
|
||||
|
||||
@@ -16,6 +16,7 @@ VMM_CONFIGS="qemu fc"
|
||||
|
||||
GO_VERSION=${GO_VERSION}
|
||||
RUST_VERSION=${RUST_VERSION}
|
||||
CC=""
|
||||
|
||||
DESTDIR=${DESTDIR:-${PWD}}
|
||||
PREFIX=${PREFIX:-/opt/kata}
|
||||
@@ -23,29 +24,43 @@ container_image="${SHIM_V2_CONTAINER_BUILDER:-$(get_shim_v2_image_name)}"
|
||||
|
||||
EXTRA_OPTS="${EXTRA_OPTS:-""}"
|
||||
|
||||
[ "${CROSS_BUILD}" == "true" ] && container_image_bk="${container_image}" && container_image="${container_image}-cross-build"
|
||||
sudo docker pull ${container_image} || \
|
||||
(sudo docker build \
|
||||
(sudo docker ${BUILDX} build ${PLATFORM} \
|
||||
--build-arg GO_VERSION="${GO_VERSION}" \
|
||||
--build-arg RUST_VERSION="${RUST_VERSION}" \
|
||||
-t "${container_image}" \
|
||||
"${script_dir}" && \
|
||||
push_to_registry "${container_image}")
|
||||
|
||||
arch=$(uname -m)
|
||||
arch=${ARCH:-$(uname -m)}
|
||||
GCC_ARCH=${arch}
|
||||
if [ ${arch} = "ppc64le" ]; then
|
||||
GCC_ARCH="powerpc64le"
|
||||
arch="ppc64"
|
||||
fi
|
||||
|
||||
#Build rust project using cross build musl image to speed up
|
||||
[[ "${CROSS_BUILD}" == "true" && ${ARCH} != "s390x" ]] && container_image="messense/rust-musl-cross:${GCC_ARCH}-musl" && CC=${GCC_ARCH}-unknown-linux-musl-gcc
|
||||
|
||||
sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
--env CROSS_BUILD=${CROSS_BUILD} \
|
||||
--env ARCH=${ARCH} \
|
||||
--env CC="${CC}" \
|
||||
-w "${repo_root_dir}/src/runtime-rs" \
|
||||
"${container_image}" \
|
||||
bash -c "git config --global --add safe.directory ${repo_root_dir} && make PREFIX=${PREFIX} QEMUCMD=qemu-system-${arch}"
|
||||
|
||||
sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
--env CROSS_BUILD=${CROSS_BUILD} \
|
||||
--env ARCH=${ARCH} \
|
||||
--env CC="${CC}" \
|
||||
-w "${repo_root_dir}/src/runtime-rs" \
|
||||
"${container_image}" \
|
||||
bash -c "git config --global --add safe.directory ${repo_root_dir} && make PREFIX="${PREFIX}" DESTDIR="${DESTDIR}" install"
|
||||
|
||||
|
||||
[ "${CROSS_BUILD}" == "true" ] && container_image="${container_image_bk}-cross-build"
|
||||
|
||||
sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
-w "${repo_root_dir}/src/runtime" \
|
||||
"${container_image}" \
|
||||
|
||||
@@ -8,7 +8,7 @@ set -o errexit
|
||||
set -o nounset
|
||||
set -o pipefail
|
||||
|
||||
ARCH=$(uname -m)
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
ARCH_LIBC=""
|
||||
LIBC=""
|
||||
|
||||
|
||||
@@ -13,6 +13,7 @@ readonly virtiofsd_builder="${script_dir}/build-static-virtiofsd.sh"
|
||||
|
||||
source "${script_dir}/../../scripts/lib.sh"
|
||||
|
||||
ARCH=${ARCH:-$(uname -m)}
|
||||
DESTDIR=${DESTDIR:-${PWD}}
|
||||
PREFIX=${PREFIX:-/opt/kata}
|
||||
kata_version="${kata_version:-}"
|
||||
@@ -32,7 +33,6 @@ package_output_dir="${package_output_dir:-}"
|
||||
[ -n "${virtiofsd_toolchain}" ] || die "Failed to get the rust toolchain to build virtiofsd"
|
||||
[ -n "${virtiofsd_zip}" ] || die "Failed to get virtiofsd binary URL"
|
||||
|
||||
ARCH=$(uname -m)
|
||||
case ${ARCH} in
|
||||
"aarch64")
|
||||
libc="musl"
|
||||
@@ -49,9 +49,10 @@ case ${ARCH} in
|
||||
esac
|
||||
|
||||
container_image="${VIRTIOFSD_CONTAINER_BUILDER:-$(get_virtiofsd_image_name)}"
|
||||
[ "${CROSS_BUILD}" == "true" ] && container_image="${container_image}-cross-build"
|
||||
|
||||
sudo docker pull ${container_image} || \
|
||||
(sudo docker build \
|
||||
(sudo docker $BUILDX build $PLATFORM \
|
||||
--build-arg RUST_TOOLCHAIN="${virtiofsd_toolchain}" \
|
||||
-t "${container_image}" "${script_dir}/${libc}" && \
|
||||
# No-op unless PUSH_TO_REGISTRY is exported as "yes"
|
||||
@@ -64,5 +65,6 @@ sudo docker run --rm -i -v "${repo_root_dir}:${repo_root_dir}" \
|
||||
--env virtiofsd_repo="${virtiofsd_repo}" \
|
||||
--env virtiofsd_version="${virtiofsd_version}" \
|
||||
--env virtiofsd_zip="${virtiofsd_zip}" \
|
||||
--env ARCH="${ARCH}" \
|
||||
"${container_image}" \
|
||||
bash -c "${virtiofsd_builder}"
|
||||
|
||||