Skip to content

logclaw/logclaw

LogClaw

AI SRE that deploys in your VPC. Real-time anomaly detection, trace-correlated incident tickets, and AI root cause analysis — your logs never leave your infrastructure.

LogClaw Dashboard — real-time log monitoring with AI anomaly detection


TL;DR — Try It

Option A: Managed Cloud (no install — fastest)

Try the full experience instantly at console.logclaw.ai — includes AI root cause analysis, API key management, multi-tenant isolation, and the complete incident pipeline. No Docker required.

Option B: Docker Compose (self-hosted, no Kubernetes)

curl -O https://raw.githubusercontent.com/logclaw/logclaw/main/docker-compose.yml
curl -O https://raw.githubusercontent.com/logclaw/logclaw/main/otel-collector-config.yaml
docker compose up -d

Open http://localhost:3000 — the LogClaw stack is running:

  • Dashboard (:3000) — incidents, log ingestion, config
  • OTel Collector (:4317 gRPC, :4318 HTTP) — send logs via OTLP
  • Bridge (:8080) — anomaly detection + trace correlation
  • Ticketing Agent (:18081) — AI-powered incident management
  • OpenSearch (:9200) — log storage + search
  • Kafka (:9092) — event bus

All images are pulled from ghcr.io/logclaw/ — no registry auth required.

Note: The local stack runs in single-tenant mode with LLM-powered root cause analysis disabled. For AI RCA, API key management, and multi-tenant isolation, use the managed cloud or deploy to Kubernetes with LLM_PROVIDER=claude|openai|ollama.

Option C: Kind Cluster (full Kubernetes stack)

git clone https://github.com/logclaw/logclaw.git && cd logclaw
./scripts/setup-dev.sh

This creates a Kind cluster, installs all operators and services, builds the dashboard, and runs a smoke test. Takes ~20 minutes on a 16 GB laptop.

Container Images

All LogClaw images are published to GHCR as public packages:

Service Image Latest Stable
Dashboard ghcr.io/logclaw/logclaw-dashboard stable / 2.5.0
Bridge ghcr.io/logclaw/logclaw-bridge stable / 1.3.0
Ticketing Agent ghcr.io/logclaw/logclaw-ticketing-agent stable / 1.5.0
Flink Jobs ghcr.io/logclaw/logclaw-flink-jobs stable / 0.1.1

Pull any image directly:

docker pull ghcr.io/logclaw/logclaw-dashboard:stable

See It in Action

Incident Management AI Root Cause Analysis
Incident list with severity and blast radius AI-powered root cause analysis
Log Ingestion Dashboard Overview
OTLP log ingestion pipeline LogClaw dashboard overview

Live demo: console.logclaw.ai | Video walkthrough: logclaw.ai


Open Source vs Cloud vs Enterprise

Capability Open Source (free) Cloud ($0.30/GB) Enterprise (custom)
Log Ingestion (OTLP) Unlimited 1 GB/day free Unlimited
Anomaly Detection Z-score statistical Z-score + ML pipeline Z-score + ML + custom models
AI Root Cause Analysis BYO LLM (Ollama/OpenAI/Claude) Included Included + fine-tuned models
Incident Ticketing PagerDuty, Jira, ServiceNow, OpsGenie, Slack, Zammad All 6 platforms All 6 + custom connectors
Dashboard Full UI (logs, incidents, config) Full UI + hosted Full UI + white-label option
Authentication None (open access) Clerk OAuth + org management SSO (SAML/OIDC) + RBAC
Multi-tenancy Single tenant Multi-org, multi-project, multi-env Full namespace isolation per tenant
API Keys N/A Per-project, SHA-256 hashed, revocable Per-project + custom scoping
Data Residency Your infrastructure LogClaw-managed cloud Your VPC (AWS/Azure/GCP)
Secrets Encryption At rest (OpenSearch) At rest + in transit AES-256-GCM for secrets + full TLS
Config Management Env vars 6-tab settings UI UI + API + GitOps
Retention Configurable via Helm 9-day logs, 97-day incidents Custom retention policies
Air-Gapped Mode Yes (Zammad + Ollama) No Yes
MCP Server Self-hosted Hosted (mcp.logclaw.ai) Both
Support GitHub Issues Email ([email protected]) Dedicated SRE team + SLA
Pricing Free forever (Apache 2.0) $0.30/GB ingested Custom

No per-seat fees. No per-host fees. AI features included at every tier.

Start Free (Cloud)  |  Deploy from GitHub (OSS)  |  Book a Demo (Enterprise)


Architecture

All components below are included in every tier — Open Source, Cloud, and Enterprise.

LogClaw Stack (per tenant, namespace-isolated)
│
├── logclaw-auth-proxy       API key validation + tenant ID injection
├── logclaw-otel-collector   OpenTelemetry Collector (OTLP gRPC + HTTP)
├── logclaw-ingestion        Vector.dev edge ingestion (optional)
├── logclaw-kafka            Strimzi Kafka 3-broker KRaft cluster
├── logclaw-flink            ETL + enrichment + anomaly scoring
├── logclaw-opensearch       OpenSearch cluster (hot-tier log storage)
├── logclaw-bridge           OTLP ETL + trace correlation + lifecycle manager
├── logclaw-ml-engine        Feast Feature Store + KServe/TorchServe + Ollama
├── logclaw-airflow          Apache Airflow (ML training DAGs)
├── logclaw-ticketing-agent  AI-powered RCA + multi-platform ticketing
├── logclaw-agent            In-cluster infrastructure health collector
├── logclaw-dashboard        Next.js web UI (ingestion, incidents, config, dark mode)
└── logclaw-console          Enterprise SaaS console (multi-tenant)

Data flow: Logs → Auth Proxy (API key + tenant injection) → OTel Collector (OTLP ingestion) → Kafka → Bridge (ETL + anomaly + trace correlation) → OpenSearch + Ticketing Agent → Incident tickets

All charts are wired together by the logclaw-tenant umbrella chart — a single helm install deploys the full stack for one tenant.


Quick Start (Production / ArgoCD)

Prerequisites

One-time cluster setup (operators, run once per cluster):

helmfile -f helmfile.d/00-operators.yaml apply

Onboard a new tenant

  1. Copy the template:

    cp gitops/tenants/_template.yaml gitops/tenants/tenant-<id>.yaml
  2. Fill in the required values (tenantId, tier, cloudProvider, secret store config).

  3. Commit and push — ArgoCD will detect the new file and deploy the full stack in ~30 minutes.

Manual install (dev/staging)

helm install logclaw-acme charts/logclaw-tenant \
  --namespace logclaw-acme \
  --create-namespace \
  -f gitops/tenants/tenant-acme.yaml

Running Locally (Step by Step)

Prefer the one-command setup? Run ./scripts/setup-dev.sh and skip to Step 6.

Prerequisites

# macOS (Homebrew)
brew install helm helmfile kind kubectl node python3

# Helm plugins
helm plugin install https://github.com/databus23/helm-diff
helm plugin install https://github.com/helm-unittest/helm-unittest

# Docker Desktop must be running
open -a Docker

1 — Create a local Kubernetes cluster

make kind-create

Verify:

kubectl cluster-info --context kind-logclaw-dev

2 — Install cluster-level operators

make install-operators

Wait for operators to be ready (~3 min):

kubectl get pods -n strimzi-system -w
kubectl get pods -n opensearch-operator-system -w

3 — Install the full tenant stack

make install TENANT_ID=dev-local STORAGE_CLASS=standard

This deploys all 16 helmfile releases in dependency order. Monitor progress:

watch kubectl get pods -n logclaw-dev-local
Time Milestone
T+2 min Namespace, RBAC, NetworkPolicies
T+6 min Kafka broker ready
T+10 min OpenSearch cluster green
T+15 min Bridge + Ticketing Agent running
T+20 min Full stack operational

4 — Build and deploy the Dashboard

The dashboard requires a Docker image build:

docker build -t logclaw-dashboard:dev apps/dashboard/
kind load docker-image logclaw-dashboard:dev --name logclaw-dev

helm upgrade --install logclaw-dashboard-dev-local charts/logclaw-dashboard \
  --namespace logclaw-dev-local \
  --set global.tenantId=dev-local \
  -f charts/logclaw-dashboard/ci/default-values.yaml

5 — Access the services

# Dashboard (main UI)
kubectl port-forward svc/logclaw-dashboard-dev-local 3333:3000 -n logclaw-dev-local
open http://localhost:3333

# OpenSearch (query API)
kubectl port-forward svc/logclaw-opensearch-dev-local 9200:9200 -n logclaw-dev-local

# Airflow (ML pipelines)
kubectl port-forward svc/logclaw-airflow-dev-local-webserver 8080:8080 -n logclaw-dev-local
open http://localhost:8080   # admin / admin

6 — Send logs

LogClaw ingests logs via OTLP (OpenTelemetry Protocol) — the CNCF industry standard. Port-forward the OTel Collector:

kubectl port-forward svc/logclaw-otel-collector-dev-local 4318:4318 -n logclaw-dev-local &

Send a single log via OTLP HTTP:

curl -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -d '{
    "resourceLogs": [{
      "resource": {
        "attributes": [
          {"key": "service.name", "value": {"stringValue": "payment-api"}}
        ]
      },
      "scopeLogs": [{
        "logRecords": [{
          "timeUnixNano": "'$(date +%s)000000000'",
          "severityText": "ERROR",
          "body": {"stringValue": "Connection refused to database"},
          "traceId": "abcdef1234567890abcdef1234567890",
          "spanId": "abcdef12345678"
        }]
      }]
    }]
  }'

Any OpenTelemetry SDK or agent can send logs to LogClaw — no custom integration needed. See OTLP Integration Guide for SDK examples.

Generate and ingest 900 sample Apple Pay logs:

# Generate sample OTel logs
python3 scripts/generate-applepay-logs.py    # → 500 payment flow logs
python3 scripts/generate-applepay-logs-2.py  # → 400 infra/security errors

# Ingest them
./scripts/ingest-logs.sh scripts/applepay-otel-500.json
./scripts/ingest-logs.sh scripts/applepay-otel-400-batch2.json

Or use the helper script:

./scripts/ingest-logs.sh --generate   # generates + ingests all sample logs
./scripts/ingest-logs.sh --smoke      # single test log

7 — See it in action

After ingesting error logs, the Bridge detects anomalies and the Ticketing Agent creates incident tickets. View them:

# Watch Bridge trace correlation in real-time
kubectl logs -f deployment/logclaw-bridge-dev-local -n logclaw-dev-local

# Check auto-created incidents
kubectl port-forward svc/logclaw-opensearch-dev-local 9200:9200 -n logclaw-dev-local &
curl -s 'http://localhost:9200/logclaw-incidents-*/_search?size=5&sort=created_at:desc' | python3 -m json.tool

# Or use the Dashboard
open http://localhost:3333/incidents

8 — Tear down

# Remove just the tenant
make uninstall TENANT_ID=dev-local

# Remove everything including the Kind cluster
make kind-delete

Repository Layout

charts/
├── logclaw-tenant/           # Umbrella chart — single install entry point
├── logclaw-auth-proxy/       # API key validation + tenant ID injection
├── logclaw-otel-collector/   # OpenTelemetry Collector (OTLP gRPC + HTTP)
├── logclaw-ingestion/        # Vector.dev edge ingestion
├── logclaw-kafka/            # Strimzi Kafka + KafkaConnect + MirrorMaker2
├── logclaw-flink/            # Flink ETL + enrichment + anomaly jobs
├── logclaw-opensearch/       # OpenSearch cluster via Opster operator
├── logclaw-bridge/           # OTLP ETL + trace correlation + lifecycle manager
├── logclaw-ml-engine/        # Feast + KServe/TorchServe + Ollama
├── logclaw-airflow/          # Apache Airflow
├── logclaw-ticketing-agent/  # AI-powered RCA + multi-platform ticketing
├── logclaw-agent/            # In-cluster infrastructure health agent
├── logclaw-dashboard/        # Next.js web UI
└── logclaw-console/          # Enterprise SaaS console

apps/
├── bridge/                # Python — OTLP ETL + anomaly detection + trace correlation
├── agent/                 # Go — infrastructure health collector
├── dashboard/             # Next.js — web UI (incidents, logs, config, dark mode)
├── ticketing-agent/       # Python — AI-powered RCA + multi-platform ticketing
├── flink-jobs/            # Java — Flink stream processing jobs
├── logclaw-auth-proxy/    # TypeScript/Express — API key validation + tenant injection
├── logclaw-slack-bot/     # TypeScript/Hono — Slack incident bot (Cloudflare Workers)
├── logclaw-mcp-server/    # TypeScript — MCP server for AI coding tools (8 tools)
└── logclaw-mcp-remote/    # TypeScript — remote MCP client (OAuth 2.1)

cli/                        # Go CLI (logclaw start/stop/status)

scripts/
├── setup-dev.sh                # One-command local dev setup (Kind cluster)
├── setup-gke.sh                # GKE production cluster setup
├── ingest-logs.sh              # Log ingestion helper (--generate, --smoke)
├── generate-applepay-logs.py   # Generate 500 OTel sample logs (batch 1)
├── generate-applepay-logs-2.py # Generate 400 infra/security logs (batch 2)
├── trigger-anomaly.sh          # Trigger test anomaly for demo
└── trigger-request-failure.sh  # Trigger test request failure for demo

operators/                    # Cluster-level operator bootstrap (once per cluster)
├── strimzi/                  # strimzi-kafka-operator 0.41.0
├── flink-operator/           # flink-kubernetes-operator 1.9.0
├── opensearch-operator/      # opensearch-operator 2.6.1
├── eso/                      # external-secrets 0.10.3
└── cert-manager/             # cert-manager v1.16.1

helmfile.d/                   # Ordered helmfile releases (00-operators → 90-dashboard)
gitops/                       # ArgoCD ApplicationSet + per-tenant value files
tests/                        # Helm chart tests + integration test pods
docs/                         # Architecture, onboarding, values reference

Key Features

For a side-by-side comparison across tiers, see Open Source vs Cloud vs Enterprise above.

Trace-Correlated AI Ticket Engine

The Bridge runs a 5-layer trace correlation engine:

  1. ETL Consumer — Consumes enriched logs from Kafka
  2. Anomaly Detector — Statistical anomaly scoring on error rates
  3. OpenSearch Indexer — Indexes logs for search and correlation
  4. Lifecycle Engine — Traces causal chains across services, computes blast radius, creates/deduplicates incidents

When an anomaly is detected, the system:

  • Queries all logs sharing the same trace_id
  • Builds a causal chain showing error propagation across services
  • Computes blast radius (% of services affected)
  • Creates a deduplicated incident ticket with full trace context

Multi-Platform Ticketing

The logclaw-ticketing-agent supports 6 independently-toggleable platforms simultaneously:

Platform Type Egress
PagerDuty SaaS External HTTPS
Jira SaaS External HTTPS
ServiceNow SaaS External HTTPS
OpsGenie SaaS External HTTPS
Slack SaaS External HTTPS
Zammad In-cluster Zero external egress

Per-severity routing (critical → PagerDuty, medium → Jira, etc.) is configurable via config.routing.*.

Air-Gapped Mode

When paired with Zammad (external ITSM chart) and Ollama for local LLM inference, the needsExternalHttps helper sets the NetworkPolicy to zero external egress — fully air-gapped. No logs, tickets, or model calls leave the cluster.

LLM Provider Abstraction

global:
  llm:
    provider: ollama   # claude | openai | ollama | vllm | disabled
    model: llama3.2:8b

Dashboard

The Dashboard provides:

  • Dark mode — system-aware with manual toggle (Light/Dark/System), persisted in localStorage
  • Drag-and-drop upload supporting JSON, NDJSON, CSV, and plain text files
  • Bulk incident actions — select multiple incidents and acknowledge/resolve/escalate in batch
  • CSV export — download incidents as a CSV file
  • Loading skeletons — smooth animated placeholders during data fetches
  • Error boundaries — graceful crash recovery with retry UI
  • LLM fallback badge — indicates when AI RCA is unavailable and rule-based fallback was used
  • Incident auto-deduplication — prevents duplicate incidents for the same anomaly

Log Ingestion — OTLP Native

LogClaw uses OTLP (OpenTelemetry Protocol) as its sole ingestion protocol — the CNCF industry standard supported by every major observability vendor (Datadog, Splunk, Grafana, AWS, GCP, Azure).

Supported transports:

  • gRPC<collector>:4317 (recommended for high-throughput)
  • HTTP/JSON<collector>:4318/v1/logs

Any OpenTelemetry SDK, agent, or collector can send logs directly to LogClaw without custom integrations. The OTel Collector enriches each log with tenant_id, batches them, and writes to Kafka using otlp_json encoding.

{
  "resourceLogs": [{
    "resource": {
      "attributes": [
        {"key": "service.name", "value": {"stringValue": "my-service"}},
        {"key": "host.name", "value": {"stringValue": "my-service-pod-abc12"}}
      ]
    },
    "scopeLogs": [{
      "logRecords": [{
        "timeUnixNano": "1709510400000000000",
        "severityText": "ERROR",
        "body": {"stringValue": "Something went wrong"},
        "traceId": "abcdef1234567890abcdef1234567890",
        "spanId": "abcdef12345678",
        "attributes": [
          {"key": "environment", "value": {"stringValue": "production"}}
        ]
      }]
    }]
  }]
}

See OTLP Integration Guide for Python, Java, and Node.js SDK examples.

MCP Server — AI Coding Tools

The logclaw-mcp-server connects AI coding tools to LogClaw incidents, logs, and anomalies via the Model Context Protocol. Published as an npm package with 8 tools.

npx logclaw-mcp-server

Works with Claude Code, Cursor, Windsurf, and any MCP-compatible client. Also available as a hosted server at https://mcp.logclaw.ai (OAuth 2.1, no install needed).

See MCP Integration Guide for setup instructions.

Slack Bot — Incident Notifications

The logclaw-slack-bot delivers real-time incident notifications to Slack with rich Block Kit formatting, DM support, and OAuth. Runs on Cloudflare Workers.

See Integrations for setup.

Auth Proxy — API Key Validation

The logclaw-auth-proxy sits between ingress and the OTel Collector. It validates API keys against PostgreSQL, injects tenant_id into OTLP payloads, and enforces rate limits (200 req/min unauthenticated, 6000 req/min per tenant). Stateless and horizontally scalable.


Component Versions

Component Version
Apache Kafka (Strimzi) 3.7.0
Apache Flink 1.19.0
OpenSearch 2.14.0
External Secrets Operator 0.10.3
cert-manager v1.16.1
Apache Airflow 1.14.0
Zammad 12.4.1
OpenTelemetry Collector Contrib 0.114.0
KServe 0.13.0
Feast 0.40.0
Next.js (Dashboard) 16.1.6

Development

Dashboard (Next.js)

cd apps/dashboard
npm install
npm run dev
# → http://localhost:3000

Bridge (Python)

cd apps/bridge
pip install -r requirements.txt
export KAFKA_BROKERS="localhost:9092"
export OPENSEARCH_ENDPOINT="http://localhost:9200"
python main.py
# → HTTP API on :8080 (/health, /metrics, /config)

See Bridge docs for configuration reference.

Ticketing Agent (Python)

cd apps/ticketing-agent
pip install -r requirements.txt
export KAFKA_BROKERS="localhost:9092"
export OPENSEARCH_ENDPOINT="http://localhost:9200"
python main.py
# → HTTP API on :8080

Agent (Go)

cd apps/agent
go run main.go
# → HTTP API on :8080 (/health, /ready, /metrics)

Auth Proxy (TypeScript)

cd apps/logclaw-auth-proxy
npm install
npm run dev
# → HTTP API on :4318

Requires a PostgreSQL database with API keys. See API Keys docs.

MCP Server (TypeScript)

cd apps/logclaw-mcp-server
npm install && npm run build
LOGCLAW_API_KEY=lc_proj_test npx .

Helm Charts

# Lint all charts
make lint

# Render templates (dry-run, no cluster needed)
make template TENANT_ID=ci-test

# Diff current vs new
make template-diff TENANT_ID=dev-local

# Package charts as .tgz
make package

# Push to OCI registry
make push HELM_REGISTRY=oci://ghcr.io/logclaw/charts

Docs

Full documentation is available at docs.logclaw.ai.

Getting Started:

Components:

Integrations:

Reference:

Enterprise:


Contributing

We welcome contributions! Please read our guidelines before opening a PR:

Use the issue templates for bug reports and feature requests.


License

Apache 2.0 — see LICENSE

About

LogClaw — Multi-tenant Helm chart monorepo for deploying the full log intelligence stack

Resources

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors