OpenRepoWiki automatically generates comprehensive wiki documentation for any GitHub repository. Stop reading through endless code files and get instant insights into what each file and folder does.
Live Demo: openrepowiki.xyz
- Automated Wiki Generation: Creates detailed summaries of repository purpose, functionality, and architecture
- Codebase Analysis: Identifies key files, functions, and their roles within the project
- Dependency Graphs: Visualizes how files relate to each other using Mermaid diagrams
- Code Block Links: Sky blue highlighted code blocks link directly to GitHub source
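As a rough illustration of the dependency-graph feature, the sketch below turns a mapping of files to the files they import into Mermaid flowchart text. It is not the project's actual generator: the function names `to_mermaid` and `_node_id`, the input shape, and the example file paths are all assumptions made for this example.

```python
from typing import Dict, List


def _node_id(path: str) -> str:
    """Turn a file path into an identifier usable as a Mermaid node id."""
    return "".join(ch if ch.isalnum() else "_" for ch in path)


def to_mermaid(dependencies: Dict[str, List[str]]) -> str:
    """Render a {file: [files it imports]} mapping as a Mermaid flowchart."""
    lines = ["flowchart LR"]
    for source in sorted(dependencies):
        for target in sorted(dependencies[source]):
            # Each edge repeats the quoted label so the full path is displayed.
            lines.append(
                f'    {_node_id(source)}["{source}"] --> {_node_id(target)}["{target}"]'
            )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical file names, used only to show the output shape.
    example = {
        "services/api/handlers/jobs.py": ["shared/storage/client.py"],
        "services/processor/main.py": ["shared/github/client.py", "shared/llm/base.py"],
    }
    print(to_mermaid(example))
```

Sorting the sources and targets keeps the output stable between runs, which makes generated wikis easier to diff. Two paths that differ only in punctuation would map to the same node id in this sketch, so a real implementation would likely need a collision-safe id scheme.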
This branch (aws-main) runs on a fully serverless AWS infrastructure:
```mermaid
flowchart TB
    subgraph CDN["CloudFront CDN"]
        CF["openrepowiki.xyz, api.*"]
    end
    CF --> S3F["S3 (Frontend)"]
    CF --> APIGW["API Gateway"]
    APIGW --> JobsLambda["Lambda: Jobs Handler"]
    APIGW --> ReposLambda["Lambda: Repos Handler"]
    APIGW --> AuthLambda["Lambda: Authorizer"]
    JobsLambda --> SFN["Step Functions"]
    SFN --> ECS["ECS Fargate (Processor)"]
    ECS --> DDB["DynamoDB"]
    ECS --> S3A["S3 (Artifacts)"]
    ECS --> SM["Secrets Manager"]
```
| Component | Description |
|---|---|
| CloudFront | CDN with custom domain, SSL termination |
| S3 | Static frontend hosting + artifact storage |
| API Gateway | REST API with Lambda authorizer, WAF protection |
| Lambda | API handlers (jobs, repos) and request authorizer |
| Step Functions | Orchestrates the repository processing workflow |
| ECS Fargate | Runs the LLM-powered code summarization |
| DynamoDB | Stores repository data, job status, summaries |
| WAF | Rate limiting, bot protection, IP filtering |
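To make the hand-off from the Jobs handler to Step Functions more concrete, here is a minimal sketch of how a handler might start one execution per repository. It assumes a client that behaves like `boto3.client("stepfunctions")`, which is passed in as a parameter so the function can be exercised without AWS access. The function name `start_processing_job`, the payload fields, and the URL validation rule are assumptions, not the repository's actual code.

```python
import json
import re
import uuid
from typing import Any, Dict


def start_processing_job(sfn_client: Any, state_machine_arn: str, repo_url: str) -> Dict[str, str]:
    """Start one Step Functions execution for a repository and return its job id.

    `sfn_client` is expected to behave like boto3.client("stepfunctions").
    """
    # Accept only https://github.com/<owner>/<repo> URLs (optional .git suffix or trailing slash).
    match = re.fullmatch(r"https://github\.com/([\w.-]+)/([\w.-]+?)(?:\.git)?/?", repo_url)
    if match is None:
        raise ValueError(f"not a GitHub repository URL: {repo_url}")
    owner, repo = match.groups()

    job_id = str(uuid.uuid4())
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        name=job_id,  # execution names must be unique per state machine
        input=json.dumps({"job_id": job_id, "owner": owner, "repo": repo}),
    )
    return {"job_id": job_id, "execution_arn": response["executionArn"]}
```

In a Lambda handler this would be called with a real Step Functions client and the state machine ARN taken from configuration. Using the job id as the execution name is one possible way to look up the execution later from a job record.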
```
openrepowiki3/
├── frontend/                 # React + Vite frontend
│   └── src/
│       ├── api/              # API client with request signing
│       └── components/       # React components
├── services/
│   ├── api/                  # Lambda API handlers
│   │   └── handlers/         # Jobs, Repos, Authorizer
│   └── processor/            # ECS container for processing
├── shared/                   # Shared utilities
│   ├── github/               # GitHub API client
│   ├── llm/                  # LLM providers (DeepSeek, Gemini, etc.)
│   └── storage/              # DynamoDB client
├── infra/
│   └── terraform/
│       ├── modules/          # Reusable Terraform modules
│       │   ├── apigw/        # API Gateway + authorizer
│       │   ├── cloudfront/   # CDN configuration
│       │   ├── dynamodb/     # Database tables
│       │   ├── ecs/          # Fargate cluster + task
│       │   ├── lambda/       # Lambda functions
│       │   ├── sfn/          # Step Functions
│       │   ├── vpc/          # VPC + networking
│       │   └── waf/          # Web Application Firewall
│       └── env/prod/         # Production environment config
└── docs/                     # Requirements & documentation
```
- AWS CLI configured with appropriate permissions
- Terraform v1.5+
- Node.js 18+
- Python 3.11+
- Docker (for building the ECS container image)
```bash
# Copy and configure environment variables
cp .env.example .env

# Required variables:
# - LLM_PROVIDER (deepseek, gemini, ollama)
# - LLM_APIKEY
# - GITHUB_TOKEN
```

```bash
cd infra/terraform/env/prod

# Initialize Terraform
terraform init

# Review changes
terraform plan

# Apply infrastructure
terraform apply
```

```bash
cd services/api
./build_package.sh

# Upload to Lambda (via Terraform or AWS CLI)
aws lambda update-function-code \
  --function-name openrepowiki-prod-jobs-handler \
  --zip-file fileb://dist/api-lambda-package.zip
```

```bash
cd frontend

# Set production API URL
export VITE_API_URL=https://api.openrepowiki.xyz/v1
export VITE_SIGNING_KEY=your-signing-key

npm install
npm run build:prod

# Sync to S3
aws s3 sync dist/ s3://openrepowiki-prod-frontend/
```

This deployment includes multiple security layers:
| Layer | Protection |
|---|---|
| WAF | Rate limiting, AWS Managed Rules, bot detection |
| Lambda Authorizer | HMAC-signed requests for POST endpoints |
| CORS | Restricted to openrepowiki.xyz origin |
| VPC | Private subnets for ECS, VPC endpoints |
| Secrets Manager | Secure API keys and signing secrets |
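The HMAC layer in the table above can be illustrated with a small sketch of signing and verifying a request using Python's standard `hmac` module. The message layout (timestamp, method, path, body), the five-minute skew window, and the function names are assumptions for illustration; the project's authorizer may use a different scheme.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_request(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 signature over timestamp, method, path, and body."""
    message = f"{timestamp}.{method.upper()}.{path}.".encode() + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(
    secret: bytes,
    method: str,
    path: str,
    body: bytes,
    timestamp: int,
    signature: str,
    max_skew_seconds: int = 300,  # assumed replay window, not taken from the project
    now: Optional[int] = None,
) -> bool:
    """Check the signature and reject requests whose timestamp is too old or too new."""
    current = int(time.time()) if now is None else now
    if abs(current - timestamp) > max_skew_seconds:
        return False
    expected = sign_request(secret, method, path, body, timestamp)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, signature)
```

A client would compute `sign_request(...)` and send the signature and timestamp alongside the POST body, and the authorizer would call `verify_request(...)` with the same secret. Including the timestamp in the signed message limits how long a captured request can be replayed.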
- CloudWatch Logs: All Lambda, ECS, and API Gateway logs
- CloudWatch Metrics: Request counts, latency, errors
- WAF Logs: Blocked requests, rate limit hits
This architecture is designed for cost efficiency:
- Lambda: Pay per invocation, no idle costs
- Fargate Spot: Up to 70% savings on processing
- DynamoDB On Demand: Pay per request
- CloudFront: Caches static assets globally
Caution: Analyzing large repositories can consume 1M+ input/output tokens per repository. Use a cost-effective LLM provider like DeepSeek for production.
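To get a feel for what token volume means in money terms, the following sketch does the arithmetic for one repository. The prices and the output token count are placeholders chosen only to show the calculation, not real provider rates, and `estimate_cost_usd` is a hypothetical helper.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate LLM spend for one repository from token counts and per-million prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder prices in USD per million tokens; check your provider's current rates.
    cost = estimate_cost_usd(
        input_tokens=1_000_000,
        output_tokens=200_000,
        input_price_per_million=0.50,
        output_price_per_million=2.00,
    )
    print(f"Estimated cost per repository: ${cost:.2f}")
```

Running the same calculation with each candidate provider's published prices gives a quick comparison before choosing `LLM_PROVIDER`.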
