mirror of
https://github.com/k8sgpt-ai/k8sgpt.git
synced 2026-03-18 19:17:25 +00:00
Compare commits
4 Commits
copilot/re
...
copilot/ad
| Author | SHA1 | Date | |
|---|---|---|---|
| | 6c7b7c4751 | | |
| | 03bd9a8387 | | |
| | dfdcf9edd2 | | |
| | 2253625f40 | | |
@@ -279,6 +279,7 @@ you will be able to write your own analyzers.
- [x] OperatorGroup
- [x] InstallPlan
- [x] Subscription
- [x] **CustomResource** - Generic analyzer for any CRD (cert-manager, ArgoCD, Kafka, etc.) [Documentation](docs/CRD_ANALYZER.md)

## Examples
252
docs/CRD_ANALYZER.md
Normal file
@@ -0,0 +1,252 @@
# Generic CRD Analyzer Configuration Examples

The Generic CRD Analyzer enables K8sGPT to automatically analyze custom resources from any installed CRD in your Kubernetes cluster. This provides observability for operator-managed resources like cert-manager, ArgoCD, Kafka, and more.

## Basic Configuration

The CRD analyzer is configured via the K8sGPT configuration file (typically `~/.config/k8sgpt/k8sgpt.yaml`). Here's a minimal example:

```yaml
crd_analyzer:
  enabled: true
```

With this basic configuration, the analyzer will:
- Discover all CRDs installed in your cluster
- Apply generic health checks based on common Kubernetes patterns
- Report issues with resources that have unhealthy status conditions

## Configuration Options

### Complete Example

```yaml
crd_analyzer:
  enabled: true
  include:
    - name: certificates.cert-manager.io
      statusPath: ".status.conditions"
      readyCondition:
        type: "Ready"
        expectedStatus: "True"

    - name: applications.argoproj.io
      statusPath: ".status.health.status"
      expectedValue: "Healthy"

    - name: kafkas.kafka.strimzi.io
      readyCondition:
        type: "Ready"
        expectedStatus: "True"

  exclude:
    - name: kafkatopics.kafka.strimzi.io
    - name: servicemonitors.monitoring.coreos.com
```

### Configuration Fields

#### `enabled` (boolean)
- **Default**: `false`
- **Description**: Master switch to enable/disable the CRD analyzer
- **Example**: `enabled: true`

#### `include` (array)
- **Description**: List of CRDs with custom health check configurations
- **Fields**:
  - `name` (string, required): The full CRD name (e.g., `certificates.cert-manager.io`)
  - `statusPath` (string, optional): Dot-separated path to the status field to check (e.g., `.status.health.status`); the leading dot is optional
  - `readyCondition` (object, optional): Configuration for checking a Ready-style condition in `status.conditions`
    - `type` (string): The condition type to check (e.g., `"Ready"`)
    - `expectedStatus` (string): Expected status value (e.g., `"True"`)
  - `expectedValue` (string, optional): Expected value at the `statusPath` (requires `statusPath`)

#### `exclude` (array)
- **Description**: List of CRDs to skip during analysis
- **Fields**:
  - `name` (string): The full CRD name to exclude
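
As a rough illustration, the fields above could map onto Go types like the ones below. The type and field names follow those referenced from `pkg/analyzer/crd.go` (`CRDAnalyzerConfig`, `CRDIncludeConfig`, `CRDReadyCondition`, `CRDExcludeConfig`), but the struct tags, `sampleConfig`, `parseConfig`, and `treatment` are assumptions made for this sketch, not the project's definitions. JSON stands in for YAML so the example needs only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assumed shapes: names follow pkg/analyzer/crd.go, tags are illustrative.
type CRDReadyCondition struct {
	Type           string `json:"type"`
	ExpectedStatus string `json:"expectedStatus"`
}

type CRDIncludeConfig struct {
	Name           string             `json:"name"`
	StatusPath     string             `json:"statusPath"`
	ReadyCondition *CRDReadyCondition `json:"readyCondition"`
	ExpectedValue  string             `json:"expectedValue"`
}

type CRDExcludeConfig struct {
	Name string `json:"name"`
}

type CRDAnalyzerConfig struct {
	Enabled bool               `json:"enabled"`
	Include []CRDIncludeConfig `json:"include"`
	Exclude []CRDExcludeConfig `json:"exclude"`
}

// sampleConfig is the JSON equivalent of a small crd_analyzer section.
const sampleConfig = `{
  "enabled": true,
  "include": [
    {"name": "certificates.cert-manager.io",
     "readyCondition": {"type": "Ready", "expectedStatus": "True"}},
    {"name": "applications.argoproj.io",
     "statusPath": ".status.health.status", "expectedValue": "Healthy"}
  ],
  "exclude": [{"name": "kafkatopics.kafka.strimzi.io"}]
}`

// parseConfig decodes a crd_analyzer section from JSON.
func parseConfig(raw string) (CRDAnalyzerConfig, error) {
	var cfg CRDAnalyzerConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

// treatment reports how a CRD would be handled: exclusions are checked
// first, then include entries; everything else gets the generic checks.
func treatment(cfg CRDAnalyzerConfig, crdName string) string {
	if !cfg.Enabled {
		return "disabled"
	}
	for _, e := range cfg.Exclude {
		if e.Name == crdName {
			return "excluded"
		}
	}
	for _, i := range cfg.Include {
		if i.Name == crdName {
			return "custom"
		}
	}
	return "generic"
}

func main() {
	cfg, err := parseConfig(sampleConfig)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{
		"kafkatopics.kafka.strimzi.io",
		"certificates.cert-manager.io",
		"widgets.example.io", // hypothetical CRD with no explicit entry
	} {
		fmt.Printf("%s -> %s\n", name, treatment(cfg, name))
	}
}
```

Checking `exclude` before `include` matches the order used in the analyzer's main loop, so a CRD listed in both places is skipped.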

## Use Cases

### 1. cert-manager Certificate Analysis

Detect certificates that are not ready or have issuance failures:

```yaml
crd_analyzer:
  enabled: true
  include:
    - name: certificates.cert-manager.io
      readyCondition:
        type: "Ready"
        expectedStatus: "True"
```

**Detected Issues:**
- Certificates with `Ready=False`
- Certificate renewal failures
- Invalid certificate configurations
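
The `readyCondition` check configured above can be sketched roughly as follows. This is a standard-library approximation that works on a plain `map[string]interface{}`; the analyzer itself goes through the Kubernetes `unstructured` helpers, and the function name `readyConditionFailure` is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// readyConditionFailure returns "" when a condition of type condType has the
// expected status, and a description of the problem otherwise.
func readyConditionFailure(obj map[string]interface{}, condType, expected string) string {
	status, _ := obj["status"].(map[string]interface{})
	conditions, ok := status["conditions"].([]interface{})
	if !ok {
		return "Expected status.conditions not found"
	}

	ready := false
	var mismatches []string
	for _, c := range conditions {
		cond, ok := c.(map[string]interface{})
		if !ok {
			continue
		}
		t, _ := cond["type"].(string)
		s, _ := cond["status"].(string)
		m, _ := cond["message"].(string)
		if t != condType {
			continue
		}
		if s == expected {
			ready = true
		} else {
			mismatches = append(mismatches, fmt.Sprintf("%s=%s: %s", t, s, m))
		}
	}
	if ready {
		return ""
	}

	msg := fmt.Sprintf("Ready condition not met: expected %s=%s", condType, expected)
	if len(mismatches) > 0 {
		msg += "; " + strings.Join(mismatches, "; ")
	}
	return msg
}

func main() {
	cert := map[string]interface{}{
		"status": map[string]interface{}{
			"conditions": []interface{}{
				map[string]interface{}{
					"type":    "Ready",
					"status":  "False",
					"message": "Certificate not issued",
				},
			},
		},
	}
	fmt.Println(readyConditionFailure(cert, "Ready", "True"))
}
```

Note that a resource with no `status.conditions` at all is reported as a failure under this check, which is stricter than the generic checks.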

### 2. ArgoCD Application Health

Monitor ArgoCD application sync and health status:

```yaml
crd_analyzer:
  enabled: true
  include:
    - name: applications.argoproj.io
      statusPath: ".status.health.status"
      expectedValue: "Healthy"
```

**Detected Issues:**
- Applications in `Degraded` state
- Sync failures
- Missing resources

### 3. Kafka Operator Resources

Check Kafka cluster health with Strimzi operator:

```yaml
crd_analyzer:
  enabled: true
  include:
    - name: kafkas.kafka.strimzi.io
      readyCondition:
        type: "Ready"
        expectedStatus: "True"
  exclude:
    - name: kafkatopics.kafka.strimzi.io  # Exclude topics to reduce noise
```

**Detected Issues:**
- Kafka clusters not ready
- Broker failures
- Configuration issues

### 4. Prometheus Operator

Monitor Prometheus instances:

```yaml
crd_analyzer:
  enabled: true
  include:
    - name: prometheuses.monitoring.coreos.com
      readyCondition:
        type: "Available"
        expectedStatus: "True"
```

**Detected Issues:**
- Prometheus instances not available
- Configuration reload failures
- Storage issues

## Generic Health Checks

When a CRD is not explicitly configured in the `include` list, the analyzer applies generic health checks:

### Supported Patterns

1. **status.conditions** - Standard Kubernetes conditions
   - Flags `Ready` conditions with status != `"True"`
   - Flags any condition type containing "failed" (case-insensitive) with status = `"True"`

2. **status.phase** - Phase-based resources
   - Flags resources with phase = `"Failed"` or `"Error"` (case-insensitive)

3. **status.health.status** - ArgoCD-style health
   - Flags resources with health status != `"Healthy"` (except `"Unknown"`)

4. **status.state** - State-based resources
   - Flags resources with state = `"Failed"` or `"Error"` (case-insensitive)

5. **Deletion with Finalizers** - Stuck resources
   - Flags resources with `deletionTimestamp` set but still having finalizers
   - This check also runs for CRDs that are configured in the `include` list

## Running the Analyzer

### Enable in Configuration

Add the CRD analyzer to your active filters:

```bash
# Add CustomResource filter
k8sgpt filters add CustomResource

# List active filters to verify
k8sgpt filters list
```

### Run Analysis

```bash
# Basic analysis
k8sgpt analyze --explain

# With specific filter
k8sgpt analyze --explain --filter=CustomResource

# In a specific namespace
k8sgpt analyze --explain --filter=CustomResource --namespace=production
```

### Example Output

```
AI Provider: openai

0: CustomResource/Certificate(default/example-cert)
- Error: Condition Ready is False (reason: Failed): Certificate issuance failed
- Details: The certificate 'example-cert' in namespace 'default' failed to issue.
The Let's Encrypt challenge validation failed due to DNS propagation issues.
Recommendation: Check DNS records and retry certificate issuance.

1: CustomResource/Application(argocd/my-app)
- Error: Health status is Degraded
- Details: The ArgoCD application 'my-app' is in a Degraded state.
This typically indicates that deployed resources are not healthy.
Recommendation: Check application logs and pod status.
```

## Best Practices

### 1. Start with Generic Checks
Begin with just `enabled: true` to see what issues are detected across all CRDs.

### 2. Add Specific Configurations Gradually
Add custom configurations for critical CRDs that need specialized health checks.

### 3. Use Exclusions to Reduce Noise
Exclude CRDs that generate false positives or are less critical.

### 4. Combine with Other Analyzers
Use the CRD analyzer alongside built-in analyzers for comprehensive cluster observability.

### 5. Monitor Performance
If you have many CRDs, the analysis may take longer. Use exclusions to optimize.

## Troubleshooting

### Analyzer Not Running
- Verify `enabled: true` is set in configuration
- Check that `CustomResource` is in active filters: `k8sgpt filters list`
- Ensure configuration file is in the correct location

### No Issues Detected
- Verify CRDs are actually installed: `kubectl get crds`
- Check if custom resources exist: `kubectl get <crd-name> --all-namespaces`
- Review generic health check patterns - your CRDs may use different status fields

### Too Many False Positives
- Add specific configurations for problematic CRDs in the `include` section
- Use the `exclude` list to skip noisy CRDs
- Review the status patterns your CRDs use and configure accordingly

### Configuration Not Applied
- Restart K8sGPT after configuration changes
- Verify YAML syntax is correct
- Check K8sGPT logs for configuration parsing errors
45
examples/crd_analyzer_config.yaml
Normal file
@@ -0,0 +1,45 @@
# Example K8sGPT Configuration with CRD Analyzer
# Place this file at ~/.config/k8sgpt/k8sgpt.yaml

# CRD Analyzer Configuration
crd_analyzer:
  enabled: true

  # Specific CRD configurations with custom health checks
  include:
    # cert-manager certificates
    - name: certificates.cert-manager.io
      readyCondition:
        type: "Ready"
        expectedStatus: "True"

    # ArgoCD applications
    - name: applications.argoproj.io
      statusPath: ".status.health.status"
      expectedValue: "Healthy"

    # Strimzi Kafka clusters
    - name: kafkas.kafka.strimzi.io
      readyCondition:
        type: "Ready"
        expectedStatus: "True"

    # Prometheus instances
    - name: prometheuses.monitoring.coreos.com
      readyCondition:
        type: "Available"
        expectedStatus: "True"

  # CRDs to skip during analysis
  exclude:
    - name: kafkatopics.kafka.strimzi.io
    - name: servicemonitors.monitoring.coreos.com
    - name: podmonitors.monitoring.coreos.com
    - name: prometheusrules.monitoring.coreos.com

# Other K8sGPT configuration...
# ai:
#   providers:
#     - name: openai
#       model: gpt-4
#       # ... other AI config
@@ -64,6 +64,7 @@ var additionalAnalyzerMap = map[string]common.IAnalyzer{
	"InstallPlan": InstallPlanAnalyzer{},
	"CatalogSource": CatalogSourceAnalyzer{},
	"OperatorGroup": OperatorGroupAnalyzer{},
	"CustomResource": CRDAnalyzer{},
}

func ListFilters() ([]string, []string, []string) {
330
pkg/analyzer/crd.go
Normal file
@@ -0,0 +1,330 @@
/*
Copyright 2023 The K8sGPT Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package analyzer

import (
	"fmt"
	"strings"

	"github.com/k8sgpt-ai/k8sgpt/pkg/common"
	"github.com/spf13/viper"
	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	apiextensionsclientset "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

type CRDAnalyzer struct{}

func (CRDAnalyzer) Analyze(a common.Analyzer) ([]common.Result, error) {
	// Load CRD analyzer configuration
	var config common.CRDAnalyzerConfig
	if err := viper.UnmarshalKey("crd_analyzer", &config); err != nil {
		// If no config or error, disable the analyzer
		return nil, nil
	}

	if !config.Enabled {
		return nil, nil
	}

	// Create apiextensions client to discover CRDs
	apiExtClient, err := apiextensionsclientset.NewForConfig(a.Client.GetConfig())
	if err != nil {
		return nil, fmt.Errorf("failed to create apiextensions client: %w", err)
	}

	// List all CRDs in the cluster
	crdList, err := apiExtClient.ApiextensionsV1().CustomResourceDefinitions().List(a.Context, metav1.ListOptions{})
	if err != nil {
		return nil, fmt.Errorf("failed to list CRDs: %w", err)
	}

	var results []common.Result

	// Process each CRD
	for _, crd := range crdList.Items {
		// Check if CRD should be excluded
		if shouldExcludeCRD(crd.Name, config.Exclude) {
			continue
		}

		// Get the CRD configuration (if specified)
		crdConfig := getCRDConfig(crd.Name, config.Include)

		// Analyze resources for this CRD
		crdResults, err := analyzeCRDResources(a, crd, crdConfig)
		if err != nil {
			// Skip this CRD and continue with the others; the error is
			// currently discarded rather than logged
			continue
		}

		results = append(results, crdResults...)
	}

	return results, nil
}

// shouldExcludeCRD checks if a CRD should be excluded from analysis
func shouldExcludeCRD(crdName string, excludeList []common.CRDExcludeConfig) bool {
	for _, exclude := range excludeList {
		if exclude.Name == crdName {
			return true
		}
	}
	return false
}

// getCRDConfig returns the configuration for a specific CRD if it exists
func getCRDConfig(crdName string, includeList []common.CRDIncludeConfig) *common.CRDIncludeConfig {
	for _, include := range includeList {
		if include.Name == crdName {
			return &include
		}
	}
	return nil
}

// analyzeCRDResources analyzes all instances of a CRD
func analyzeCRDResources(a common.Analyzer, crd apiextensionsv1.CustomResourceDefinition, config *common.CRDIncludeConfig) ([]common.Result, error) {
	if a.Client.GetDynamicClient() == nil {
		return nil, fmt.Errorf("dynamic client is nil")
	}

	// Get the preferred version (typically the storage version)
	var version string
	for _, v := range crd.Spec.Versions {
		if v.Storage {
			version = v.Name
			break
		}
	}
	if version == "" && len(crd.Spec.Versions) > 0 {
		version = crd.Spec.Versions[0].Name
	}

	// Construct GVR
	gvr := schema.GroupVersionResource{
		Group:    crd.Spec.Group,
		Version:  version,
		Resource: crd.Spec.Names.Plural,
	}

	// List resources
	var list *unstructured.UnstructuredList
	var err error
	if crd.Spec.Scope == apiextensionsv1.NamespaceScoped {
		if a.Namespace != "" {
			list, err = a.Client.GetDynamicClient().Resource(gvr).Namespace(a.Namespace).List(a.Context, metav1.ListOptions{LabelSelector: a.LabelSelector})
		} else {
			list, err = a.Client.GetDynamicClient().Resource(gvr).Namespace(metav1.NamespaceAll).List(a.Context, metav1.ListOptions{LabelSelector: a.LabelSelector})
		}
	} else {
		// Cluster-scoped
		list, err = a.Client.GetDynamicClient().Resource(gvr).List(a.Context, metav1.ListOptions{LabelSelector: a.LabelSelector})
	}

	if err != nil {
		return nil, err
	}

	var results []common.Result

	// Analyze each resource instance
	for _, item := range list.Items {
		failures := analyzeResource(item, crd, config)
		if len(failures) > 0 {
			resourceName := item.GetName()
			if item.GetNamespace() != "" {
				resourceName = item.GetNamespace() + "/" + resourceName
			}

			results = append(results, common.Result{
				Kind:  crd.Spec.Names.Kind,
				Name:  resourceName,
				Error: failures,
			})
		}
	}

	return results, nil
}

// analyzeResource analyzes a single CR instance for issues
func analyzeResource(item unstructured.Unstructured, crd apiextensionsv1.CustomResourceDefinition, config *common.CRDIncludeConfig) []common.Failure {
	var failures []common.Failure

	// Check for deletion with finalizers (resource stuck in deletion)
	if item.GetDeletionTimestamp() != nil && len(item.GetFinalizers()) > 0 {
		failures = append(failures, common.Failure{
			Text: fmt.Sprintf("Resource is being deleted but has finalizers: %v", item.GetFinalizers()),
		})
	}

	// If custom config is provided, use it
	if config != nil {
		configFailures := analyzeWithConfig(item, config)
		failures = append(failures, configFailures...)
		return failures
	}

	// Otherwise, use generic health checks based on common patterns
	genericFailures := analyzeGenericHealth(item)
	failures = append(failures, genericFailures...)

	return failures
}

// analyzeWithConfig analyzes a resource using custom configuration
func analyzeWithConfig(item unstructured.Unstructured, config *common.CRDIncludeConfig) []common.Failure {
	var failures []common.Failure

	// Check ReadyCondition if specified
	if config.ReadyCondition != nil {
		conditions, found, err := unstructured.NestedSlice(item.Object, "status", "conditions")
		if !found || err != nil {
			failures = append(failures, common.Failure{
				Text: "Expected status.conditions not found",
			})
			return failures
		}

		ready := false
		var conditionMessages []string
		for _, cond := range conditions {
			condMap, ok := cond.(map[string]interface{})
			if !ok {
				continue
			}

			condType, _, _ := unstructured.NestedString(condMap, "type")
			status, _, _ := unstructured.NestedString(condMap, "status")
			message, _, _ := unstructured.NestedString(condMap, "message")

			if condType == config.ReadyCondition.Type {
				if status == config.ReadyCondition.ExpectedStatus {
					ready = true
				} else {
					conditionMessages = append(conditionMessages, fmt.Sprintf("%s=%s: %s", condType, status, message))
				}
			}
		}

		if !ready {
			msg := fmt.Sprintf("Ready condition not met: expected %s=%s", config.ReadyCondition.Type, config.ReadyCondition.ExpectedStatus)
			if len(conditionMessages) > 0 {
				msg += "; " + strings.Join(conditionMessages, "; ")
			}
			failures = append(failures, common.Failure{
				Text: msg,
			})
		}
	}

	// Check ExpectedValue if specified and StatusPath provided
	if config.ExpectedValue != "" && config.StatusPath != "" {
		pathParts := strings.Split(config.StatusPath, ".")
		// Remove leading dot if present
		if len(pathParts) > 0 && pathParts[0] == "" {
			pathParts = pathParts[1:]
		}

		actualValue, found, err := unstructured.NestedString(item.Object, pathParts...)
		if !found || err != nil {
			failures = append(failures, common.Failure{
				Text: fmt.Sprintf("Expected field %s not found", config.StatusPath),
			})
		} else if actualValue != config.ExpectedValue {
			failures = append(failures, common.Failure{
				Text: fmt.Sprintf("Field %s has value '%s', expected '%s'", config.StatusPath, actualValue, config.ExpectedValue),
			})
		}
	}

	return failures
}

// analyzeGenericHealth applies generic health checks based on common Kubernetes patterns
func analyzeGenericHealth(item unstructured.Unstructured) []common.Failure {
	var failures []common.Failure

	// Check for status.conditions (common pattern)
	conditions, found, err := unstructured.NestedSlice(item.Object, "status", "conditions")
	if found && err == nil && len(conditions) > 0 {
		for _, cond := range conditions {
			condMap, ok := cond.(map[string]interface{})
			if !ok {
				continue
			}

			condType, _, _ := unstructured.NestedString(condMap, "type")
			status, _, _ := unstructured.NestedString(condMap, "status")
			reason, _, _ := unstructured.NestedString(condMap, "reason")
			message, _, _ := unstructured.NestedString(condMap, "message")

			// Check for common failure patterns
			if condType == "Ready" && status != "True" {
				msg := fmt.Sprintf("Condition Ready is %s", status)
				if reason != "" {
					msg += fmt.Sprintf(" (reason: %s)", reason)
				}
				if message != "" {
					msg += fmt.Sprintf(": %s", message)
				}
				failures = append(failures, common.Failure{Text: msg})
			} else if strings.Contains(strings.ToLower(condType), "failed") && status == "True" {
				msg := fmt.Sprintf("Condition %s is True", condType)
				if message != "" {
					msg += fmt.Sprintf(": %s", message)
				}
				failures = append(failures, common.Failure{Text: msg})
			}
		}
	}

	// Check for status.phase (common pattern)
	phase, found, _ := unstructured.NestedString(item.Object, "status", "phase")
	if found && phase != "" {
		lowerPhase := strings.ToLower(phase)
		if lowerPhase == "failed" || lowerPhase == "error" {
			failures = append(failures, common.Failure{
				Text: fmt.Sprintf("Resource phase is %s", phase),
			})
		}
	}

	// Check for status.health.status (ArgoCD pattern)
	healthStatus, found, _ := unstructured.NestedString(item.Object, "status", "health", "status")
	if found && healthStatus != "" {
		if healthStatus != "Healthy" && healthStatus != "Unknown" {
			failures = append(failures, common.Failure{
				Text: fmt.Sprintf("Health status is %s", healthStatus),
			})
		}
	}

	// Check for status.state (common pattern)
	state, found, _ := unstructured.NestedString(item.Object, "status", "state")
	if found && state != "" {
		lowerState := strings.ToLower(state)
		if lowerState == "failed" || lowerState == "error" {
			failures = append(failures, common.Failure{
				Text: fmt.Sprintf("Resource state is %s", state),
			})
		}
	}

	return failures
}
410
pkg/analyzer/crd_test.go
Normal file
@@ -0,0 +1,410 @@
/*
Copyright 2023 The K8sGPT Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package analyzer

import (
	"context"
	"strings"
	"testing"

	"github.com/k8sgpt-ai/k8sgpt/pkg/common"
	"github.com/k8sgpt-ai/k8sgpt/pkg/kubernetes"
	"github.com/spf13/viper"
	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/rest"
)

// TestCRDAnalyzer_Disabled tests that analyzer returns nil when disabled
func TestCRDAnalyzer_Disabled(t *testing.T) {
	viper.Reset()
	viper.Set("crd_analyzer", map[string]interface{}{
		"enabled": false,
	})

	a := common.Analyzer{
		Context: context.TODO(),
		Client:  &kubernetes.Client{},
	}

	res, err := (CRDAnalyzer{}).Analyze(a)
	if err != nil {
		t.Fatalf("Analyze error: %v", err)
	}
	if res != nil {
		t.Fatalf("expected nil result when disabled, got %d results", len(res))
	}
}

// TestCRDAnalyzer_NoConfig tests that analyzer returns nil when no config exists
func TestCRDAnalyzer_NoConfig(t *testing.T) {
	viper.Reset()

	a := common.Analyzer{
		Context: context.TODO(),
		Client:  &kubernetes.Client{},
	}

	res, err := (CRDAnalyzer{}).Analyze(a)
	if err != nil {
		t.Fatalf("Analyze error: %v", err)
	}
	if res != nil {
		t.Fatalf("expected nil result when no config, got %d results", len(res))
	}
}

// TestAnalyzeGenericHealth_ReadyConditionFalse tests detection of Ready=False condition
func TestAnalyzeGenericHealth_ReadyConditionFalse(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "cert-manager.io/v1",
			"kind":       "Certificate",
			"metadata": map[string]interface{}{
				"name":      "example-cert",
				"namespace": "default",
			},
			"status": map[string]interface{}{
				"conditions": []interface{}{
					map[string]interface{}{
						"type":    "Ready",
						"status":  "False",
						"reason":  "Failed",
						"message": "Certificate issuance failed",
					},
				},
			},
		},
	}

	failures := analyzeGenericHealth(item)
	if len(failures) != 1 {
		t.Fatalf("expected 1 failure, got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "Ready is False") {
		t.Errorf("expected 'Ready is False' in failure text, got: %s", failures[0].Text)
	}
	if !strings.Contains(failures[0].Text, "Failed") {
		t.Errorf("expected 'Failed' reason in failure text, got: %s", failures[0].Text)
	}
}

// TestAnalyzeGenericHealth_FailedPhase tests detection of Failed phase
func TestAnalyzeGenericHealth_FailedPhase(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "example.io/v1",
			"kind":       "CustomJob",
			"metadata": map[string]interface{}{
				"name":      "failed-job",
				"namespace": "default",
			},
			"status": map[string]interface{}{
				"phase": "Failed",
			},
		},
	}

	failures := analyzeGenericHealth(item)
	if len(failures) != 1 {
		t.Fatalf("expected 1 failure, got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "phase is Failed") {
		t.Errorf("expected 'phase is Failed' in failure text, got: %s", failures[0].Text)
	}
}

// TestAnalyzeGenericHealth_UnhealthyHealthStatus tests ArgoCD-style health status
func TestAnalyzeGenericHealth_UnhealthyHealthStatus(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "argoproj.io/v1alpha1",
			"kind":       "Application",
			"metadata": map[string]interface{}{
				"name":      "my-app",
				"namespace": "argocd",
			},
			"status": map[string]interface{}{
				"health": map[string]interface{}{
					"status": "Degraded",
				},
			},
		},
	}

	failures := analyzeGenericHealth(item)
	if len(failures) != 1 {
		t.Fatalf("expected 1 failure, got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "Health status is Degraded") {
		t.Errorf("expected 'Health status is Degraded' in failure text, got: %s", failures[0].Text)
	}
}

// TestAnalyzeGenericHealth_HealthyResource tests that healthy resources are not flagged
func TestAnalyzeGenericHealth_HealthyResource(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "cert-manager.io/v1",
			"kind":       "Certificate",
			"metadata": map[string]interface{}{
				"name":      "healthy-cert",
				"namespace": "default",
			},
			"status": map[string]interface{}{
				"conditions": []interface{}{
					map[string]interface{}{
						"type":   "Ready",
						"status": "True",
					},
				},
			},
		},
	}

	failures := analyzeGenericHealth(item)
	if len(failures) != 0 {
		t.Fatalf("expected 0 failures for healthy resource, got %d", len(failures))
	}
}

// TestAnalyzeResource_DeletionWithFinalizers tests detection of stuck deletion
func TestAnalyzeResource_DeletionWithFinalizers(t *testing.T) {
	deletionTimestamp := metav1.Now()
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "example.io/v1",
			"kind":       "CustomResource",
			"metadata": map[string]interface{}{
				"name":              "stuck-resource",
				"namespace":         "default",
				"deletionTimestamp": deletionTimestamp.Format("2006-01-02T15:04:05Z"),
				"finalizers":        []interface{}{"example.io/finalizer"},
			},
		},
	}
	item.SetDeletionTimestamp(&deletionTimestamp)
	item.SetFinalizers([]string{"example.io/finalizer"})

	crd := apiextensionsv1.CustomResourceDefinition{}
	failures := analyzeResource(item, crd, nil)

	if len(failures) != 1 {
		t.Fatalf("expected 1 failure for stuck deletion, got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "being deleted") {
		t.Errorf("expected 'being deleted' in failure text, got: %s", failures[0].Text)
	}
	if !strings.Contains(failures[0].Text, "finalizers") {
		t.Errorf("expected 'finalizers' in failure text, got: %s", failures[0].Text)
	}
}
|
||||
|
||||
// TestAnalyzeWithConfig_ReadyConditionCheck tests custom ready condition checking
func TestAnalyzeWithConfig_ReadyConditionCheck(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "cert-manager.io/v1",
			"kind":       "Certificate",
			"metadata": map[string]interface{}{
				"name":      "test-cert",
				"namespace": "default",
			},
			"status": map[string]interface{}{
				"conditions": []interface{}{
					map[string]interface{}{
						"type":    "Ready",
						"status":  "False",
						"message": "Certificate not issued",
					},
				},
			},
		},
	}

	config := &common.CRDIncludeConfig{
		ReadyCondition: &common.CRDReadyCondition{
			Type:           "Ready",
			ExpectedStatus: "True",
		},
	}

	failures := analyzeWithConfig(item, config)
	if len(failures) != 1 {
		t.Fatalf("expected 1 failure, got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "Ready condition not met") {
		t.Errorf("expected 'Ready condition not met' in failure text, got: %s", failures[0].Text)
	}
}

// TestAnalyzeWithConfig_ExpectedValueCheck tests custom status path value checking
func TestAnalyzeWithConfig_ExpectedValueCheck(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "argoproj.io/v1alpha1",
			"kind":       "Application",
			"metadata": map[string]interface{}{
				"name":      "my-app",
				"namespace": "argocd",
			},
			"status": map[string]interface{}{
				"health": map[string]interface{}{
					"status": "Degraded",
				},
			},
		},
	}

	config := &common.CRDIncludeConfig{
		StatusPath:    "status.health.status",
		ExpectedValue: "Healthy",
	}

	failures := analyzeWithConfig(item, config)
	if len(failures) != 1 {
		t.Fatalf("expected 1 failure, got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "Degraded") {
		t.Errorf("expected 'Degraded' in failure text, got: %s", failures[0].Text)
	}
	if !strings.Contains(failures[0].Text, "expected 'Healthy'") {
		t.Errorf("expected \"expected 'Healthy'\" in failure text, got: %s", failures[0].Text)
	}
}

// TestShouldExcludeCRD tests exclusion logic
func TestShouldExcludeCRD(t *testing.T) {
	excludeList := []common.CRDExcludeConfig{
		{Name: "kafkatopics.kafka.strimzi.io"},
		{Name: "prometheuses.monitoring.coreos.com"},
	}

	if !shouldExcludeCRD("kafkatopics.kafka.strimzi.io", excludeList) {
		t.Error("expected kafkatopics to be excluded")
	}

	if shouldExcludeCRD("certificates.cert-manager.io", excludeList) {
		t.Error("expected certificates not to be excluded")
	}
}

// TestGetCRDConfig tests configuration retrieval
func TestGetCRDConfig(t *testing.T) {
	includeList := []common.CRDIncludeConfig{
		{
			Name:       "certificates.cert-manager.io",
			StatusPath: "status.conditions",
			ReadyCondition: &common.CRDReadyCondition{
				Type:           "Ready",
				ExpectedStatus: "True",
			},
		},
	}

	config := getCRDConfig("certificates.cert-manager.io", includeList)
	if config == nil {
		t.Fatal("expected config to be found")
	}
	if config.StatusPath != "status.conditions" {
		t.Errorf("expected StatusPath 'status.conditions', got %s", config.StatusPath)
	}

	config = getCRDConfig("nonexistent.crd.io", includeList)
	if config != nil {
		t.Error("expected nil config for non-existent CRD")
	}
}

// TestAnalyzeGenericHealth_MultipleConditionTypes tests handling multiple condition types
func TestAnalyzeGenericHealth_MultipleConditionTypes(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "example.io/v1",
			"kind":       "CustomResource",
			"metadata": map[string]interface{}{
				"name":      "multi-cond",
				"namespace": "default",
			},
			"status": map[string]interface{}{
				"conditions": []interface{}{
					map[string]interface{}{
						"type":   "Available",
						"status": "True",
					},
					map[string]interface{}{
						"type":    "Ready",
						"status":  "False",
						"reason":  "Pending",
						"message": "Waiting for dependencies",
					},
				},
			},
		},
	}

	failures := analyzeGenericHealth(item)
	if len(failures) != 1 {
		t.Fatalf("expected 1 failure (Ready=False), got %d", len(failures))
	}
	if !strings.Contains(failures[0].Text, "Ready is False") {
		t.Errorf("expected 'Ready is False' in failure text, got: %s", failures[0].Text)
	}
}

// TestAnalyzeGenericHealth_NoStatusFields tests resource without any status fields
func TestAnalyzeGenericHealth_NoStatusFields(t *testing.T) {
	item := unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "example.io/v1",
			"kind":       "CustomResource",
			"metadata": map[string]interface{}{
				"name":      "no-status",
				"namespace": "default",
			},
		},
	}

	failures := analyzeGenericHealth(item)
	if len(failures) != 0 {
		t.Fatalf("expected 0 failures for resource without status, got %d", len(failures))
	}
}

// TestCRDAnalyzer_NilClientConfig tests that the analyzer handles errors gracefully
func TestCRDAnalyzer_NilClientConfig(t *testing.T) {
	viper.Reset()
	viper.Set("crd_analyzer", map[string]interface{}{
		"enabled": true,
	})

	// Create a client with a config that will cause an error when trying to create apiextensions client
	a := common.Analyzer{
		Context: context.TODO(),
		Client:  &kubernetes.Client{Config: &rest.Config{}},
	}

	// The analyzer should handle the error gracefully without panicking
	results, err := (CRDAnalyzer{}).Analyze(a)

	// We expect either an error or no results, but no panic
	if err != nil {
		// Error is expected in this case - that's fine
		if results != nil {
			t.Errorf("Expected nil results when error occurs, got %v", results)
		}
	}
	// The important thing is that we didn't panic
}

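The two lookup helpers exercised by `TestShouldExcludeCRD` and `TestGetCRDConfig` are not shown in this part of the diff. A minimal sketch of how such helpers could behave is given below, assuming an exact, case-sensitive match on the full CRD name. The struct types are local stand-ins for `common.CRDExcludeConfig` and `common.CRDIncludeConfig`, and the function bodies are illustrative rather than the project's actual implementation.

```go
package main

import "fmt"

// Local stand-ins for common.CRDExcludeConfig and common.CRDIncludeConfig,
// trimmed to the fields this sketch needs.
type CRDExcludeConfig struct {
	Name string
}

type CRDIncludeConfig struct {
	Name       string
	StatusPath string
}

// shouldExcludeCRD reports whether name appears in excludeList.
// Assumption: exact, case-sensitive comparison of the full CRD name.
func shouldExcludeCRD(name string, excludeList []CRDExcludeConfig) bool {
	for _, e := range excludeList {
		if e.Name == name {
			return true
		}
	}
	return false
}

// getCRDConfig returns a pointer to the matching include entry,
// or nil when no entry carries that name.
func getCRDConfig(name string, includeList []CRDIncludeConfig) *CRDIncludeConfig {
	for i := range includeList {
		if includeList[i].Name == name {
			// Index into the slice so the pointer refers to the stored entry,
			// not to a per-iteration copy.
			return &includeList[i]
		}
	}
	return nil
}

func main() {
	exclude := []CRDExcludeConfig{{Name: "kafkatopics.kafka.strimzi.io"}}
	include := []CRDIncludeConfig{
		{Name: "certificates.cert-manager.io", StatusPath: "status.conditions"},
	}

	fmt.Println(shouldExcludeCRD("kafkatopics.kafka.strimzi.io", exclude))
	fmt.Println(getCRDConfig("certificates.cert-manager.io", include) != nil)
	fmt.Println(getCRDConfig("nonexistent.crd.io", include) == nil)
}
```

Returning a pointer lets the caller distinguish "no custom configuration" (nil) from a configured CRD, which is the distinction `TestGetCRDConfig` asserts on.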
@@ -97,6 +97,32 @@ type Sensitive struct {
	Masked string
}

// CRDAnalyzerConfig defines the configuration for the generic CRD analyzer
type CRDAnalyzerConfig struct {
	Enabled bool               `yaml:"enabled" json:"enabled"`
	Include []CRDIncludeConfig `yaml:"include" json:"include"`
	Exclude []CRDExcludeConfig `yaml:"exclude" json:"exclude"`
}

// CRDIncludeConfig defines configuration for a specific CRD to analyze
type CRDIncludeConfig struct {
	Name           string             `yaml:"name" json:"name"`
	StatusPath     string             `yaml:"statusPath" json:"statusPath"`
	ReadyCondition *CRDReadyCondition `yaml:"readyCondition" json:"readyCondition"`
	ExpectedValue  string             `yaml:"expectedValue" json:"expectedValue"`
}

// CRDReadyCondition defines the expected ready condition
type CRDReadyCondition struct {
	Type           string `yaml:"type" json:"type"`
	ExpectedStatus string `yaml:"expectedStatus" json:"expectedStatus"`
}

// CRDExcludeConfig defines a CRD to exclude from analysis
type CRDExcludeConfig struct {
	Name string `yaml:"name" json:"name"`
}

type (
	SourceType       string
	AvailabilityMode string
