repokeeper

Self-hosted repo health daemon with built-in doc verification. Watches your repos, validates your remotes, detects drift, verifies your docs, and tells you before something breaks.

repokeeper is a single Go binary. Today it extracts verification commands from documentation files, runs them in sandboxed temporary directories, stores the results in SQLite, and emits dated Markdown reports. The planned daemon will monitor a directory of git repositories on a configurable schedule, validate dual-push remote configurations, detect commit drift between push targets, and serve a tiny local dashboard. It knows nothing about your code: only whether your repos are healthy, in sync, and telling the truth.

No cloud. No account. No telemetry. One binary, your repos, the truth.

Status: v0.1.0 — Verification engine complete. Parser, sandbox execution, SQLite storage, Markdown reporting, and CLI are implemented with 115 passing tests. Daemon mode, remote validation, drift detection, and dashboard are planned. See BUILD.md for the roadmap.

Why repokeeper?

Developers who mirror repositories across multiple hosts — GitHub and Codeberg, GitHub and Gitea, any combination — eventually discover drift the hard way. A push that only landed on one remote. A doc that claims a build command works when it doesn't. A repo whose remote configuration silently broke weeks ago. The existing options are manual git remote -v checks, ad-hoc shell scripts, or finding out when it matters most.

Every well-maintained repo also has a verification ritual: clone it fresh, run the documented commands, confirm they work. This ritual is manual, tedious, and the first thing skipped under time pressure.

repokeeper addresses both problems with a single tool.

| Approach | Persistent daemon | Remote validation | Drift detection | Doc verification | Auditable results | Self-hosted |
|---|---|---|---|---|---|---|
| Manual git remote -v | No | Manual | No | No | No | N/A |
| Shell scripts / cron jobs | Partial | DIY | DIY | No | No | Yes |
| GitHub Actions | No | No | No | Partial | Partial | No |
| CI pipelines | No | No | No | Partial | Partial | No |
| Makefile / Taskfile | No | No | No | Partial | No | Yes |
| repokeeper | Planned | Planned | Planned | Yes | Yes | Yes |

repokeeper's design targets:

  • Doc verification now. Point it at a repo. It finds the docs, extracts the commands, runs them in a sandbox, and reports the truth. Zero configuration.
  • Always watching (planned). Runs as a daemon with a configurable schedule. Health checks happen automatically.
  • Remote validation (planned). Verifies that dual-push remotes are configured correctly. Catches misconfigurations before they cause silent data loss.
  • Drift detection (planned). Compares commit state across push targets. Knows when one remote is ahead of another.
  • Sandboxed execution. Every verification command runs in an isolated temporary directory. Your working tree is never modified.
  • Local-first, SQLite-backed. Results live in a local SQLite database. Query your verification history with SQL if you want.
  • Honest reporting. Pass, fail, timeout, skipped — every command block gets a verdict with stdout, stderr, exit code, and wall time. Dated Markdown reports you can commit, diff, or review.
  • Local dashboard (planned). A tiny read-only HTTP dashboard shows repo health at a glance.
  • Zero-knowledge. repokeeper checks health signals — remote URLs, commit refs, command exit codes. It never reads, indexes, uploads, or analyzes your source code.

Usage

Doc verification (implemented)

# verify a repo — run doc-extracted commands in sandbox, store results
repokeeper verify .
# => Running 14 commands...
# => [1/14] echo hello .......................... PASS (0.1s)
# => [2/14] go test ./... ....................... PASS (4.2s)
# => [3/14] go vet ./... ........................ PASS (1.1s)
# => ...
# => Results: 12 passed, 1 failed, 1 timeout
# => Stored to .repokeeper/repokeeper.db

# scan a repo — parse README.md and BUILD.md, index command blocks
repokeeper scan .
# => Found 12 command blocks in README.md (8 runnable)
# => Found 8 command blocks in BUILD.md (6 runnable)
# => Indexed 14 commands to .repokeeper/repokeeper.db

# generate a report — emit dated Markdown from stored results
repokeeper report .
# => Written: .repokeeper/reports/2026-03-22-a1b2c3d.md

# verify with a custom timeout (default: 60s per command)
repokeeper verify . --timeout 120

# verify with fail-fast (stop on first failure)
repokeeper verify . --fail-fast

# scan a specific file
repokeeper scan . --file BUILD.md

# also scan the docs/ directory
repokeeper verify . --docs

# report for a previous run
repokeeper report . --run-id 3

# list all stored runs
repokeeper report . --list

# verify a different repo
repokeeper verify ~/github/bore

Daemon mode (planned)

# start the daemon (watches ~/github by default)
repokeeper serve

# start with a config file
repokeeper serve --config repokeeper.toml

# one-shot health check (no daemon)
repokeeper check

# check a specific directory
repokeeper check --dir ~/projects

# show version and build info
repokeeper version

# repokeeper.toml (planned)

[watch]
directories = ["~/github", "~/projects"]
interval = "30m"
exclude = ["node_modules", ".cache"]

[remotes]
expect_push = [
  "[email protected]:dunamismax/*.git",
  "[email protected]:dunamismax/*.git",
]

[alerts]
stdout = true

[dashboard]
bind = "127.0.0.1:7480"

[verification]
enabled = true
timeout = "5m"
doc_files = ["README.md", "BUILD.md"]

[database]
path = "~/.local/share/repokeeper/health.db"
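
Remote validation is still planned, so the following Go sketch only illustrates one way the `expect_push` patterns could be checked against a repository's configured push URLs. It assumes shell-style glob semantics via the standard library's `path.Match`, where `*` does not cross a `/`; the README does not specify the matching rules. `MissingPushTargets` and the example.com / example.org URLs are placeholders.

```go
package main

import (
	"fmt"
	"path"
)

// MissingPushTargets returns every expected pattern that no configured
// push URL matches. An empty result means the remote setup looks complete.
func MissingPushTargets(expected, pushURLs []string) ([]string, error) {
	var missing []string
	for _, pattern := range expected {
		matched := false
		for _, u := range pushURLs {
			ok, err := path.Match(pattern, u)
			if err != nil {
				return nil, fmt.Errorf("bad pattern %q: %w", pattern, err)
			}
			if ok {
				matched = true
				break
			}
		}
		if !matched {
			missing = append(missing, pattern)
		}
	}
	return missing, nil
}

func main() {
	expected := []string{
		"git@example.com:owner/*.git", // placeholder hosts, not real config
		"git@example.org:owner/*.git",
	}
	// Push URLs as they might be read from `git remote get-url --push --all origin`.
	pushURLs := []string{"git@example.com:owner/repo.git"}

	missing, err := MissingPushTargets(expected, pushURLs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range missing {
		fmt.Println("no push URL matches", m)
	}
}
```

A repository that pushes to only one of the two expected hosts is reported with the unmatched pattern, which is the "push that only landed on one remote" case described earlier.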

Skip directives

Mark code blocks that shouldn't be executed with an HTML comment:

<!-- repokeeper:skip -->

Place the directive immediately before the fenced code block. Blank lines between the directive and the block are allowed. The legacy <!-- repotruth:skip --> directive is also supported.

Architecture

┌──────────────────────────────────────────────────────┐
│                     repokeeper                       │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │       Verification Engine (implemented)        │  │
│  │                                                │  │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐     │  │
│  │  │ Parser   │  │ Sandbox  │  │ Reporter │     │  │
│  │  │          │  │ Engine   │  │          │     │  │
│  │  │ markdown │  │ tempdir  │  │ markdown │     │  │
│  │  │ fenced   │  │ subproc  │  │ from SQL │     │  │
│  │  │ blocks   │  │ timeout  │  │          │     │  │
│  │  └────┬─────┘  └────┬─────┘  └────┬─────┘     │  │
│  │       └──────────────▼─────────────┘           │  │
│  │               ┌──────────┐                     │  │
│  │               │  SQLite  │                     │  │
│  │               │ commands │                     │  │
│  │               │ runs     │                     │  │
│  │               │ results  │                     │  │
│  │               └──────────┘                     │  │
│  └────────────────────────────────────────────────┘  │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │         Health Daemon (planned)                │  │
│  │                                                │  │
│  │  Scheduler · Scanner · Remote Validation       │  │
│  │  Drift Detection · Dashboard · Alerting        │  │
│  └────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘

Verification engine

  • Parser — reads Markdown files, identifies fenced code blocks (```bash, ```sh, ```shell, ```console, ```zsh), extracts command text, tags each block with source file, line number, and language. Ignores output-only blocks, comment-only blocks, and blocks marked with <!-- repokeeper:skip -->.
  • Discovery — finds documentation files to parse. Defaults to README.md and BUILD.md in the repo root. Supports explicit file targeting and recursive docs/ scanning.
  • Sandbox engine — creates an isolated temporary directory per verification run, copies repo contents (excluding .git, node_modules, target, and other heavy dirs), executes each command via sh -c with configurable timeout, captures stdout, stderr, exit code, and wall-clock duration.
  • SQLite storage — stores commands (source file, line, text, language), runs (timestamp, commit hash, repo path), and results (exit code, stdout, stderr, duration, verdict). Schema designed for querying and historical comparison.
  • Reporter — reads from SQLite, generates dated Markdown reports with pass/fail summary, per-command details, and timing. Reports are diffable and committable.

Health daemon (planned)

  • Scheduler — drives periodic health scans on a configurable interval
  • Scanner — discovers git repositories in watched directories, inspects remote configuration
  • Remote validation — validates dual-push remote configurations against expected patterns
  • Drift detection — compares refs across push targets to detect divergence
  • Dashboard — tiny HTTP server serving a read-only status page and JSON API
  • Alerter — fires alerts when repos drift, break, or become misconfigured
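
Drift detection is not implemented yet. As a sketch of the underlying idea only, the Go code below compares ref-to-hash maps from several remotes and reports every ref whose hash differs or is missing somewhere. The types and `DetectDrift` are hypothetical; the input is assumed to come from something like `git ls-remote` output parsed elsewhere. Comparing hashes shows that remotes diverge, not which one is ahead; determining direction would require commit ancestry.

```go
package main

import (
	"fmt"
	"sort"
)

// RemoteRefs maps a ref name to the commit hash one remote reports for it.
type RemoteRefs map[string]string

// Drift describes one ref whose hash differs, or is absent, across remotes.
type Drift struct {
	Ref    string
	Hashes map[string]string // remote name -> hash ("" if the ref is absent)
}

// DetectDrift returns the refs that do not agree across all remotes,
// sorted by ref name.
func DetectDrift(remotes map[string]RemoteRefs) []Drift {
	allRefs := map[string]bool{}
	for _, refs := range remotes {
		for ref := range refs {
			allRefs[ref] = true
		}
	}
	var drifts []Drift
	for ref := range allRefs {
		hashes := map[string]string{}
		distinct := map[string]bool{}
		for name, refs := range remotes {
			hashes[name] = refs[ref] // missing refs yield the empty string
			distinct[refs[ref]] = true
		}
		if len(distinct) > 1 {
			drifts = append(drifts, Drift{Ref: ref, Hashes: hashes})
		}
	}
	sort.Slice(drifts, func(i, j int) bool { return drifts[i].Ref < drifts[j].Ref })
	return drifts
}

func main() {
	remotes := map[string]RemoteRefs{
		"primary": {"refs/heads/main": "a1b2c3d"},
		"mirror":  {"refs/heads/main": "9f8e7d6"},
	}
	for _, d := range DetectDrift(remotes) {
		fmt.Printf("drift on %s: %v\n", d.Ref, d.Hashes)
	}
}
```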

Install

From source (recommended)

Requires Go 1.22+.

go install github.com/dunamismax/repokeeper/cmd/repokeeper@latest

Or clone and build locally:

git clone https://github.com/dunamismax/repokeeper.git
cd repokeeper
go build -o repokeeper ./cmd/repokeeper

Quality checks

go build ./...
go test ./...
go vet ./...

Repository layout

.
├── BUILD.md              # execution manual — phases, decisions, progress
├── README.md             # public-facing project description, honest status
├── LICENSE               # MIT
├── .gitignore            # Go exclusions
├── go.mod                # Go module definition
├── go.sum                # dependency checksums
├── cmd/
│   └── repokeeper/
│       └── main.go       # CLI entry point with verify/scan/report commands
└── internal/
    └── verify/
        ├── parser/
        │   ├── parser.go           # Markdown fenced block extraction and classification
        │   ├── io.go               # file reading
        │   ├── parser_test.go      # 52 unit tests for parser
        │   └── integration_test.go # 6 integration tests against own docs
        ├── discovery/
        │   ├── discovery.go        # file discovery (README.md, BUILD.md, docs/)
        │   └── discovery_test.go   # 9 unit tests for discovery
        ├── sandbox/
        │   ├── sandbox.go          # sandbox execution engine
        │   └── sandbox_test.go     # 26 unit tests for sandbox
        ├── storage/
        │   ├── storage.go          # SQLite schema, storage API, and query layer
        │   └── storage_test.go     # 16 unit tests for storage
        └── reporter/
            ├── reporter.go         # Markdown report generation from stored results
            └── reporter_test.go    # 12 unit tests for reporter

Planned directories (not yet created):

    internal/
        config/           # TOML configuration loading and validation
        scanner/          # repository discovery and git inspection
        health/           # remote validation and drift detection
        store/            # unified health history persistence
        dashboard/        # HTTP server and API routes
        alert/            # alert dispatch (stdout, webhook, file)
        scheduler/        # periodic scan orchestration

Roadmap

| Phase | Name | Status |
|---|---|---|
| 0 | Truthful scaffold | Done |
| 1 | Verification: parser and command extraction | Done |
| 2 | Verification: sandbox execution engine | Done |
| 3 | Verification: SQLite storage and reporting | Done |
| 4 | Verification: CLI surface | Done |
| 5 | Configuration and repository discovery | Planned |
| 6 | Remote validation engine | Planned |
| 7 | Drift detection | Planned |
| 8 | Scheduler and daemon mode | Planned |
| 9 | Dashboard and API | Planned |
| 10 | Alerting | Planned |
| 11 | Hardening and real-world testing | Planned |
| 12 | Distribution and packaging | Planned |

See BUILD.md for the full phase breakdown with tasks, exit criteria, risks, and decisions.

Design principles

  1. Truth over theater. Never claim a verification result that wasn't actually observed. The report, the database, and the CLI must agree.
  2. Zero-knowledge. repokeeper checks health signals — remote URLs, commit refs, command exit codes. It never reads, parses, indexes, or uploads source code.
  3. Local-first. All data stays on your machine. No telemetry, no phone-home, no cloud dependency.
  4. Single static binary. go build produces one binary with zero runtime dependencies. Copy it anywhere and run it.
  5. Zero configuration for verification. Point at a repo, get a report. Configuration exists for the daemon and edge cases, not the happy path.
  6. Sandboxed by default. Verification commands run in isolation. Your working tree is read-only to the verification engine.
  7. Honest reporting. Every check is observable and explainable. No hidden heuristics, no opaque scores.
  8. Operator control. The user decides what to watch, how often to check, what alerts to fire, and what constitutes "healthy."
  9. Stdlib-first. External dependencies are minimal and justified. The only non-stdlib dependency is a pure-Go SQLite driver.

Privacy

repokeeper is privacy-first by design:

  • Never uploads repo content or metadata. All checks are local operations.
  • Never phones home. No telemetry, no analytics, no update checks.
  • Dashboard is local-only (planned). Will bind to 127.0.0.1 by default.
  • SQLite database is yours. Inspect it, back it up, delete it. It contains verification results, not source code.
  • Doc verification runs your commands locally. Commands are extracted from your own docs and run on your machine. Nothing is sent anywhere.

Contributing

The verification engine is stable and tested. The daemon, remote validation, and drift detection are next. Contributions, design feedback, and use case discussions are welcome — open an issue.

Mirrors

License

MIT — see LICENSE.
