mirror of
https://github.com/k8sgpt-ai/k8sgpt.git
synced 2025-10-21 22:47:23 +00:00
* feat: initial Prometheus analyzers

  Added a prometheus integration with two analyzers:

  1. PrometheusConfigValidate
  2. PrometheusConfigRelabelReport

  The integration does not deploy any Prometheus stack in the cluster. Instead, it searches the provided --namespace for a Prometheus configuration, stored in a ConfigMap or Secret. If it finds one, it unmarshals it into memory and runs the analyzers on it.

  PrometheusConfigValidate checks if the actual Prometheus configuration is valid or has any errors.

  PrometheusConfigRelabelReport tries to distill the scrape config relabeling rules to give a concise label set per job that targets need to have to be scraped. This analyzer is unconventional, in that it does not necessarily mean there are issues with the config. It merely tries to give a human-readable explanation of the relabel rules it discovers, leaning on the LLM and prompt.

  Tested on both kube-prometheus and Google Managed Prometheus stacks.

  Signed-off-by: Daniel Clark <danielclark@google.com>

* review: feedback cycle 1

  Simplify ConfigValidate prompt and add comments.

  Signed-off-by: Daniel Clark <danielclark@google.com>

* review: feedback cycle 2

  Add Prometheus configuration discovery to integration activate command. Also improve logging to make this more clear to users.

  Signed-off-by: Daniel Clark <danielclark@google.com>

---------

Signed-off-by: Daniel Clark <danielclark@google.com>
60 lines
2.3 KiB
Go
package ai

const (
	default_prompt = `Simplify the following Kubernetes error message delimited by triple dashes written in --- %s --- language; --- %s ---.
Provide the most possible solution in a step by step style in no more than 280 characters. Write the output in the following format:
Error: {Explain error here}
Solution: {Step by step solution here}
`
	trivy_vuln_prompt = "Explain the following trivy scan result and the detail risk or root cause of the CVE ID, then provide a solution. Response in %s: %s"
	trivy_conf_prompt = "Explain the following trivy scan result and the detail risk or root cause of the security check, then provide a solution."

	prom_conf_prompt = `Simplify the following Prometheus error message delimited by triple dashes written in --- %s --- language; --- %s ---.
This error came when validating the Prometheus configuration file.
Provide step by step instructions to fix, with suggestions, referencing Prometheus documentation if relevant.
Write the output in the following format in no more than 300 characters:
Error: {Explain error here}
Solution: {Step by step solution here}
`

	prom_relabel_prompt = `
Return your prompt in this language: %s, beginning with
The following is a list of the form:
job_name:
{Prometheus job_name}
relabel_configs:
{Prometheus relabel_configs}
kubernetes_sd_configs:
{Prometheus service discovery config}
---
%s
---
For each job_name, describe the Kubernetes service and pod labels,
namespaces, ports, and containers they match.
Return the message:
Discovered and parsed Prometheus scrape configurations.
For targets to be scraped by Prometheus, ensure they are running with
at least one of the following label sets:
Then for each job, write this format:
- Job: {job_name}
- Service Labels:
- {list of service labels}
- Pod Labels:
- {list of pod labels}
- Namespaces:
- {list of namespaces}
- Ports:
- {list of ports}
- Containers:
- {list of container names}
`
)

var PromptMap = map[string]string{
	"default":                       default_prompt,
	"VulnerabilityReport":           trivy_vuln_prompt, // for Trivy integration, the key should match `Result.Kind` in pkg/common/types.go
	"ConfigAuditReport":             trivy_conf_prompt,
	"PrometheusConfigValidate":      prom_conf_prompt,
	"PrometheusConfigRelabelReport": prom_relabel_prompt,
}