Compare commits

..

4 Commits

Author SHA1 Message Date
Alon Girmonsky
5925e74ff5 Reference MCP demo GIF by commit SHA for preview 2026-03-06 12:16:53 -08:00
Alon Girmonsky
5a6a8ad38c Reorder README sections and add MCP demo GIF
- Hero description + stream.png first
- Get Started section
- AI-Powered Network Analysis with MCP demo GIF
- L7 API Dissection
- L4/L7 Workload Map
- Traffic Retention
- Features, Install, Contributing, License
2026-03-06 12:14:38 -08:00
Alon Girmonsky
f99c78f31f Add MCP demo GIF to README hero section
Replace static stream.png with animated MCP demo showing
Claude Code + Kubeshark workflow.
2026-03-06 12:10:12 -08:00
Alon Girmonsky
d3f06cb54a Update README hero: Network Observability for SREs & AI Agents
Rewrite hero section to focus on cluster-wide network data
consolidation and dual access model (AI agents via MCP,
human operators via dashboard).
2026-03-06 10:34:38 -08:00
2 changed files with 48 additions and 50 deletions

View File

@@ -9,7 +9,7 @@
<a href="https://join.slack.com/t/kubeshark/shared_invite/zt-3jdcdgxdv-1qNkhBh9c6CFoE7bSPkpBQ"><img alt="Slack" src="https://img.shields.io/badge/slack-join_chat-green?logo=Slack&style=flat-square"></a>
</p>
<p align="center"><b>Network Intelligence for Kubernetes</b></p>
<p align="center"><b>Network Observability for SREs & AI Agents</b></p>
<p align="center">
<a href="https://demo.kubeshark.com/">Live Demo</a> · <a href="https://docs.kubeshark.com">Docs</a>
@@ -17,9 +17,17 @@
---
* **Cluster-wide, real-time visibility into every packet, API call, and service interaction.**
* Replay any moment in time.
* Resolve incidents at the speed of LLMs. 100% on-premises.
Kubeshark captures cluster-wide network traffic at the speed and scale of Kubernetes, continuously, at the kernel level using eBPF. It consolidates a highly fragmented picture — dozens of nodes, thousands of workloads, millions of connections — into a single, queryable view with full Kubernetes and API context.
Network data is available to **AI agents via [MCP](https://docs.kubeshark.com/en/mcp)** and to **human operators via a [dashboard](https://docs.kubeshark.com/en/v2)**.
**What's captured, cluster-wide:**
- **L4 Packets & TCP Metrics** — retransmissions, RTT, window saturation, connection lifecycle, packet loss across every node-to-node path ([TCP insights →](https://docs.kubeshark.com/en/mcp/tcp_insights))
- **L7 API Calls** — real-time request/response matching with full payload parsing: HTTP, gRPC, GraphQL, Redis, Kafka, DNS ([API dissection →](https://docs.kubeshark.com/en/v2/l7_api_dissection))
- **Decrypted TLS** — eBPF-based TLS decryption without key management
- **Kubernetes Context** — every packet and API call resolved to pod, service, namespace, and node
- **PCAP Retention** — point-in-time raw packet snapshots, exportable for Wireshark ([Snapshots →](https://docs.kubeshark.com/en/v2/traffic_snapshots))
![Kubeshark](https://github.com/kubeshark/assets/raw/master/png/stream.png)
@@ -34,33 +42,37 @@ helm install kubeshark kubeshark/kubeshark
Dashboard opens automatically. You're capturing traffic.
**With AI** — connect your assistant and debug with natural language:
**Connect an AI agent** via MCP:
```bash
brew install kubeshark
claude mcp add kubeshark -- kubeshark mcp
```
[MCP setup guide →](https://docs.kubeshark.com/en/mcp)
---
### AI-Powered Network Analysis
Kubeshark exposes all cluster-wide network data via MCP (Model Context Protocol). AI agents can query L4 metrics, investigate L7 API calls, analyze traffic patterns, and run root cause analysis — through natural language. Use cases include incident response, root cause analysis, troubleshooting, debugging, and reliability workflows.
> *"Why did checkout fail at 2:15 PM?"*
> *"Which services have error rates above 1%?"*
> *"Show TCP retransmission rates across all node-to-node paths"*
> *"Trace request abc123 through all services"*
Works with Claude Code, Cursor, and any MCP-compatible AI.
![MCP Demo](https://github.com/kubeshark/assets/raw/8bb4bab5298e3ffdc3afba164c716cc45dd9f19d/gif/mcp-demo.gif)
[MCP setup guide →](https://docs.kubeshark.com/en/mcp)
---
## Why Kubeshark
### L7 API Dissection
- **Instant root cause** — trace requests across services, see exact errors
- **Zero instrumentation** — no code changes, no SDKs, just deploy
- **Full payload capture** — request/response bodies, headers, timing
- **TLS decryption** — see encrypted traffic without managing keys
- **AI-ready** — query traffic with natural language via MCP
---
### Traffic Analysis and API Dissection
Capture and inspect every API call across your cluster—HTTP, gRPC, Redis, Kafka, DNS, and more. Request/response matching with full payloads, parsed according to protocol specifications. Headers, timing, and complete context. Zero instrumentation required.
Cluster-wide request/response matching with full payloads, parsed according to protocol specifications. HTTP, gRPC, Redis, Kafka, DNS, and more. Every API call resolved to source and destination pod, service, namespace, and node. No code instrumentation required.
![API context](https://github.com/kubeshark/assets/raw/master/png/api_context.png)
@@ -68,27 +80,15 @@ Capture and inspect every API call across your cluster—HTTP, gRPC, Redis, Kafk
### L4/L7 Workload Map
Visualize how your services communicate. See dependencies, traffic flow, and identify anomalies at a glance.
Cluster-wide view of service communication: dependencies, traffic flow, and anomalies across all nodes and namespaces.
![Service Map](https://github.com/kubeshark/assets/raw/master/png/servicemap.png)
[Learn more →](https://docs.kubeshark.com/en/v2/service_map)
### AI-Powered Root Cause Analysis
Resolve production issues in minutes instead of hours. Connect your AI assistant and investigate incidents using natural language. Build network-aware AI agents for forensics, monitoring, compliance, and security.
> *"Why did checkout fail at 2:15 PM?"*
> *"Which services have error rates above 1%?"*
> *"Trace request abc123 through all services"*
Works with Claude Code, Cursor, and any MCP-compatible AI.
[MCP setup guide →](https://docs.kubeshark.com/en/mcp)
### Traffic Retention
Retain every packet. Take snapshots. Export PCAP files. Replay any moment in time.
Continuous raw packet capture with point-in-time snapshots. Export PCAP files for offline analysis with Wireshark or other tools.
![Traffic Retention](https://github.com/kubeshark/assets/raw/master/png/snapshots.png)
@@ -105,7 +105,7 @@ Retain every packet. Take snapshots. Export PCAP files. Replay any moment in tim
| [**L7 API Dissection**](https://docs.kubeshark.com/en/v2/l7_api_dissection) | Request/response matching with full payloads and protocol parsing |
| [**Protocol Support**](https://docs.kubeshark.com/en/protocols) | HTTP, gRPC, GraphQL, Redis, Kafka, DNS, and more |
| [**TLS Decryption**](https://docs.kubeshark.com/en/encrypted_traffic) | eBPF-based decryption without key management |
| [**AI-Powered Analysis**](https://docs.kubeshark.com/en/v2/ai_powered_analysis) | Query traffic with Claude, Cursor, or any MCP-compatible AI |
| [**AI-Powered Analysis**](https://docs.kubeshark.com/en/v2/ai_powered_analysis) | Query cluster-wide network data with Claude, Cursor, or any MCP-compatible AI |
| [**Display Filters**](https://docs.kubeshark.com/en/v2/kfl2) | Wireshark-inspired display filters for precise traffic analysis |
| [**100% On-Premises**](https://docs.kubeshark.com/en/air_gapped) | Air-gapped support, no external dependencies |

View File

@@ -142,8 +142,7 @@ Example for overriding image names:
| `tap.capture.dissection.stopAfter` | Set to a duration (e.g. `30s`) to have L7 dissection stop after no activity. | `5m` |
| `tap.capture.raw.enabled` | Enable raw capture of packets and syscalls to disk for offline analysis | `true` |
| `tap.capture.raw.storageSize` | Maximum storage size for raw capture files (supports K8s quantity format: `1Gi`, `500Mi`, etc.) | `1Gi` |
| `tap.capture.captureSelf` | Include Kubeshark's own traffic in capture | `false` |
| `tap.capture.dbMaxSize` | Maximum size for capture database (e.g., `4Gi`, `2000Mi`). | `500Mi` |
| `tap.capture.dbMaxSize` | Maximum size for capture database (e.g., `4Gi`, `2000Mi`). When empty, automatically uses 80% of allocated storage (`tap.storageLimit`). | `""` |
| `tap.snapshots.local.storageClass` | Storage class for local snapshots volume. When empty, uses `emptyDir`. When set, creates a PVC with this storage class | `""` |
| `tap.snapshots.local.storageSize` | Storage size for local snapshots volume (supports K8s quantity format: `1Gi`, `500Mi`, etc.) | `20Gi` |
| `tap.snapshots.cloud.provider` | Cloud storage provider for snapshots: `s3` or `azblob`. Empty string disables cloud storage. See [Cloud Storage docs](docs/snapshots_cloud_storage.md). | `""` |
@@ -159,8 +158,6 @@ Example for overriding image names:
| `tap.snapshots.cloud.azblob.storageAccount` | Azure storage account name. When set, the chart auto-creates a ConfigMap with `SNAPSHOT_AZBLOB_STORAGE_ACCOUNT`. | `""` |
| `tap.snapshots.cloud.azblob.container` | Azure blob container name. | `""` |
| `tap.snapshots.cloud.azblob.storageKey` | Azure storage account access key. When set, the chart auto-creates a Secret with `SNAPSHOT_AZBLOB_STORAGE_KEY`. | `""` |
| `tap.delayedDissection.cpu` | CPU allocation for delayed dissection jobs | `1` |
| `tap.delayedDissection.memory` | Memory allocation for delayed dissection jobs | `4Gi` |
| `tap.release.repo` | URL of the Helm chart repository | `https://helm.kubeshark.com` |
| `tap.release.name` | Helm release name | `kubeshark` |
| `tap.release.namespace` | Helm release namespace | `default` |
@@ -168,30 +165,30 @@ Example for overriding image names:
| `tap.persistentStorageStatic` | Use static persistent volume provisioning (explicitly defined `PersistentVolume` ) | `false` |
| `tap.persistentStoragePvcVolumeMode` | Set the pvc volume mode (Filesystem\|Block) | `Filesystem` |
| `tap.efsFileSytemIdAndPath` | [EFS file system ID and, optionally, subpath and/or access point](https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/access_points/README.md) `<FileSystemId>:<Path>:<AccessPointId>` | "" |
| `tap.storageLimit` | Limit of either the `emptyDir` or `persistentVolumeClaim` | `10Gi` |
| `tap.storageLimit` | Limit of either the `emptyDir` or `persistentVolumeClaim` | `5Gi` |
| `tap.storageClass` | Storage class of the `PersistentVolumeClaim` | `standard` |
| `tap.dryRun` | Preview of all pods matching the regex, without tapping them | `false` |
| `tap.dns.nameservers` | Nameservers to use for DNS resolution | `[]` |
| `tap.dns.searches` | Search domains to use for DNS resolution | `[]` |
| `tap.dns.options` | DNS options to use for DNS resolution | `[]` |
| `tap.dnsConfig.nameservers` | Nameservers to use for DNS resolution | `[]` |
| `tap.dnsConfig.searches` | Search domains to use for DNS resolution | `[]` |
| `tap.dnsConfig.options` | DNS options to use for DNS resolution | `[]` |
| `tap.resources.hub.limits.cpu` | CPU limit for hub | `""` (no limit) |
| `tap.resources.hub.limits.memory` | Memory limit for hub | `5Gi` |
| `tap.resources.hub.requests.cpu` | CPU request for hub | `50m` |
| `tap.resources.hub.requests.memory` | Memory request for hub | `50Mi` |
| `tap.resources.sniffer.limits.cpu` | CPU limit for sniffer | `""` (no limit) |
| `tap.resources.sniffer.limits.memory` | Memory limit for sniffer | `5Gi` |
| `tap.resources.sniffer.limits.memory` | Memory limit for sniffer | `3Gi` |
| `tap.resources.sniffer.requests.cpu` | CPU request for sniffer | `50m` |
| `tap.resources.sniffer.requests.memory` | Memory request for sniffer | `50Mi` |
| `tap.resources.tracer.limits.cpu` | CPU limit for tracer | `""` (no limit) |
| `tap.resources.tracer.limits.memory` | Memory limit for tracer | `5Gi` |
| `tap.resources.tracer.limits.memory` | Memory limit for tracer | `3Gi` |
| `tap.resources.tracer.requests.cpu` | CPU request for tracer | `50m` |
| `tap.resources.tracer.requests.memory` | Memory request for tracer | `50Mi` |
| `tap.probes.hub.initialDelaySeconds` | Initial delay before probing the hub | `5` |
| `tap.probes.hub.periodSeconds` | Period between probes for the hub | `5` |
| `tap.probes.hub.initialDelaySeconds` | Initial delay before probing the hub | `15` |
| `tap.probes.hub.periodSeconds` | Period between probes for the hub | `10` |
| `tap.probes.hub.successThreshold` | Number of successful probes before considering the hub healthy | `1` |
| `tap.probes.hub.failureThreshold` | Number of failed probes before considering the hub unhealthy | `3` |
| `tap.probes.sniffer.initialDelaySeconds` | Initial delay before probing the sniffer | `5` |
| `tap.probes.sniffer.periodSeconds` | Period between probes for the sniffer | `5` |
| `tap.probes.sniffer.initialDelaySeconds` | Initial delay before probing the sniffer | `15` |
| `tap.probes.sniffer.periodSeconds` | Period between probes for the sniffer | `10` |
| `tap.probes.sniffer.successThreshold` | Number of successful probes before considering the sniffer healthy | `1` |
| `tap.probes.sniffer.failureThreshold` | Number of failed probes before considering the sniffer unhealthy | `3` |
| `tap.serviceMesh` | Capture traffic from service meshes like Istio, Linkerd, Consul, etc. | `true` |
@@ -226,7 +223,7 @@ Example for overriding image names:
| `tap.telemetry.enabled` | Enable anonymous usage statistics collection | `true` |
| `tap.resourceGuard.enabled` | Enable resource guard worker process, which watches RAM/disk usage and enables/disables traffic capture based on available resources | `false` |
| `tap.secrets` | List of secrets to be used as source for environment variables (e.g. `kubeshark-license`) | `[]` |
| `tap.sentry.enabled` | Enable sending of error logs to Sentry | `false` |
| `tap.sentry.enabled` | Enable sending of error logs to Sentry | `true` (only for qualified users) |
| `tap.sentry.environment` | Sentry environment to label error logs with | `production` |
| `tap.defaultFilter` | Sets the default dashboard KFL filter (e.g. `http`). By default, this value is set to filter out noisy protocols such as DNS, UDP, ICMP and TCP. The user can easily change this, **temporarily**, in the Dashboard. For a permanent change, you should change this value in the `values.yaml` or `config.yaml` file. | `""` |
| `tap.liveConfigMapChangesDisabled` | If set to `true`, all user functionality (scripting, targeting settings, global & default KFL modification, traffic recording, traffic capturing on/off, protocol dissectors) involving dynamic ConfigMap changes from UI will be disabled | `false` |
@@ -235,9 +232,6 @@ Example for overriding image names:
| `tap.enabledDissectors` | This is an array of strings representing the list of supported protocols. Remove or comment out redundant protocols (e.g., dns).| The default list excludes: `udp` and `tcp` |
| `tap.mountBpf` | BPF filesystem needs to be mounted for eBPF to work properly. This helm value determines whether Kubeshark will attempt to mount the filesystem. This option is not required if filesystem is already mounts. │ `true`|
| `tap.hostNetwork` | Enable host network mode for worker DaemonSet pods. When enabled, worker pods use the host's network namespace for direct network access. | `true` |
| `tap.packetCapture` | Packet capture backend: `best`, `af_packet`, or `pf_ring` | `best` |
| `tap.misc.trafficSampleRate` | Percentage of traffic to process (0-100) | `100` |
| `tap.misc.tcpStreamChannelTimeoutMs` | Timeout in milliseconds for TCP stream channel | `10000` |
| `tap.gitops.enabled` | Enable GitOps functionality. This will allow you to use GitOps to manage your Kubeshark configuration. | `false` |
| `tap.misc.tcpFlowTimeout` | TCP flow aggregation timeout in seconds. Controls how long the worker waits before finalizing a TCP flow. | `1200` |
| `tap.misc.udpFlowTimeout` | UDP flow aggregation timeout in seconds. Controls how long the worker waits before finalizing a UDP flow. | `1200` |
@@ -258,6 +252,10 @@ Example for overriding image names:
| `supportChatEnabled` | Enable real-time support chat channel based on Intercom | `false` |
| `internetConnectivity` | Turns off API requests that are dependent on Internet connectivity such as `telemetry` and `online-support`. | `true` |
KernelMapping pairs kernel versions with a
DriverContainer image. Kernel versions can be matched
literally or using a regular expression
# Installing with SAML enabled
### Prerequisites: