This project will include a FastAPI backend (see document-ia-api/README.md) and a worker service that processes messages from a Redis queue (a sketch of the worker loop follows the project layout below).
```
document-ia-api/
├── src/
├── tests/              # Unit and integration tests
├── pyproject.toml      # Poetry configuration
└── poetry.lock         # Dependency lock file
document-ia-worker/
├── src/
├── tests/              # Unit and integration tests
├── pyproject.toml      # Poetry configuration
└── poetry.lock         # Dependency lock file
docker-compose.yml      # Docker Compose file for local development
.env                    # Environment variables (see env.example)
```
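For orientation, here is a minimal sketch of how the worker service might consume messages from the Redis queue. The queue name (`documents`) and the message format are illustrative assumptions, not the project's actual contract.

```python
# Illustrative sketch only: queue name and payload format are assumptions.
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)


def run_worker() -> None:
    """Block on a Redis list and process messages one at a time."""
    while True:
        item = r.blpop("documents", timeout=5)  # hypothetical queue name
        if item is None:
            continue  # no message arrived within the timeout window
        _queue, raw = item
        message = json.loads(raw)
        print(f"Processing document {message.get('id')}")


if __name__ == "__main__":
    run_worker()
```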
For local development with Docker Compose:

- Start the services

```bash
# Start PostgreSQL, Redis, and MinIO in detached mode
docker-compose up -d
```

- Stop the services
```bash
# Stop and remove containers
docker-compose down
```

- View service logs
```bash
# View all service logs
docker-compose logs

# View specific service logs
docker-compose logs postgres
docker-compose logs redis
docker-compose logs minio

# Follow logs in real-time
docker-compose logs -f
```

- Check service status
```bash
# List running containers
docker-compose ps

# Check service health
docker-compose exec postgres pg_isready
docker-compose exec redis redis-cli ping
docker-compose exec minio mc admin info local
```

The docker-compose.yml file uses environment variables from your .env file. Make sure your .env file includes the following variables (see env.example for reference):
```
# PostgreSQL Configuration
POSTGRES_DB=document_ia
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your-secure-postgres-password
POSTGRES_PORT=5432

# Redis Configuration
REDIS_PORT=6379

# MinIO Configuration (S3 compatible)
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadmin
MINIO_API_PORT=9000
MINIO_CONSOLE_PORT=9001
```

These variables are used by Docker Compose to configure the PostgreSQL, Redis, and MinIO services. The application will connect to these services using the same configuration.
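As a rough illustration (the variable names come from the block above; the connection-string formats and localhost defaults are assumptions), the application can assemble its connection settings from the same environment variables:

```python
# Sketch: build connection settings from the same .env variables.
# The real application may use a settings library instead of os.getenv.
import os

POSTGRES_DSN = (
    f"postgresql://{os.getenv('POSTGRES_USER', 'postgres')}:"
    f"{os.getenv('POSTGRES_PASSWORD', '')}"
    f"@localhost:{os.getenv('POSTGRES_PORT', '5432')}"
    f"/{os.getenv('POSTGRES_DB', 'document_ia')}"
)
REDIS_URL = f"redis://localhost:{os.getenv('REDIS_PORT', '6379')}/0"
S3_ENDPOINT_URL = f"http://localhost:{os.getenv('MINIO_API_PORT', '9000')}"
```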
MinIO provides S3-compatible object storage with a web-based management console:
- API Endpoint: http://localhost:9000 (or the port specified in `MINIO_API_PORT`)
- Web Console: http://localhost:9001 (or the port specified in `MINIO_CONSOLE_PORT`)
Before using the application, you need to create the default S3 bucket. Use the provided initialization script:
```bash
# Run the MinIO bucket initialization script
python scripts/init-s3-bucket.py
```

This script will:

- Connect to your MinIO instance
- Create the default bucket (`document-ia`) if it doesn't exist
- Verify the bucket is accessible

Note: Make sure your MinIO service is running (`docker-compose up -d`) before running this script.
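For reference, a bucket-initialization script along these lines could be written with boto3; this is a sketch under that assumption, and the actual scripts/init-s3-bucket.py may differ:

```python
# Sketch of a MinIO bucket bootstrap using boto3 (assumed dependency).
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
    region_name="us-east-1",
)

BUCKET = "document-ia"
try:
    s3.head_bucket(Bucket=BUCKET)  # raises ClientError if the bucket is missing
    print(f"Bucket '{BUCKET}' already exists")
except ClientError:
    s3.create_bucket(Bucket=BUCKET)
    print(f"Bucket '{BUCKET}' created")
```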
To access the MinIO web console:

- Start the services: `docker-compose up -d`
- Open your browser and navigate to http://localhost:9001
- Login with the default credentials (or those specified in your `.env` file)
- Create buckets and manage your S3-compatible storage
MinIO is fully compatible with AWS S3 SDKs. Configure your application to use MinIO instead of AWS S3:
```python
# Example configuration for MinIO
S3_ENDPOINT_URL = "http://localhost:9000"
S3_ACCESS_KEY = "minioadmin"
S3_SECRET_KEY = "minioadmin"
S3_REGION = "us-east-1"  # MinIO default region
```
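For example, a boto3 client can be pointed at MinIO with these values (a sketch, assuming boto3 and reusing the constants from the block above; the bucket and key names are illustrative):

```python
# Sketch: a standard boto3 S3 client aimed at MinIO with the values above.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=S3_ENDPOINT_URL,
    aws_access_key_id=S3_ACCESS_KEY,
    aws_secret_access_key=S3_SECRET_KEY,
    region_name=S3_REGION,
)

# Objects are uploaded and fetched exactly as they would be against AWS S3.
s3.put_object(Bucket="document-ia", Key="example.txt", Body=b"hello")
obj = s3.get_object(Bucket="document-ia", Key="example.txt")
print(obj["Body"].read())
```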
This project uses pre-commit hooks to ensure code quality. The setup includes:

- Ruff: Fast Python linter and formatter
- Pre-commit hooks: Automated checks before commits
- Install dependencies (if not already done):

```bash
poetry install
```

- Install pre-commit hooks:

```bash
# Manual installation
pre-commit install
```

- To make changes to the API without having to bump the version code:

```bash
# This will create a symlink to the infra package
poetry run pip install -e ../document-ia-infra
```

The code follows these standards:

- Python 3.11+ features
- PEP 8 style guidelines (enforced by ruff)
- Type hints for all function parameters and return values
- Comprehensive error handling with custom exceptions
- Structured logging with data sanitization
Testing guidelines:

- Unit tests for all business logic
- Integration tests for external dependencies
- Async test support
- Mock external services in tests
- Test idempotency behavior
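To make the last three points concrete, here is a sketch of an async test that mocks external services and checks idempotent behavior. It assumes pytest-asyncio, and the process_document function is a hypothetical stand-in for the project's real handler.

```python
# Sketch: async test with mocked external services, verifying idempotency.
from unittest.mock import AsyncMock

import pytest


async def process_document(storage, cache, document_id: str) -> None:
    """Hypothetical handler: uploads once, then records the ID to skip duplicates."""
    if await cache.exists(document_id):
        return  # already processed: do nothing
    await storage.upload(key=f"documents/{document_id}", body=b"...")
    await cache.add(document_id)


@pytest.mark.asyncio
async def test_duplicate_messages_upload_only_once():
    storage, cache = AsyncMock(), AsyncMock()
    cache.exists.side_effect = [False, True]  # second delivery is a duplicate
    await process_document(storage, cache, document_id="123")
    await process_document(storage, cache, document_id="123")
    storage.upload.assert_awaited_once()
```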
Security considerations:

- Proper authentication and authorization
- Input sanitization
- Rate limiting
- HTTPS in production
- File upload validation
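As an illustration of the file-upload validation point, a minimal FastAPI sketch is shown below; the route, size limit, and allowed content types are assumptions rather than the project's actual rules.

```python
# Sketch: basic upload validation in FastAPI (limits and types are assumptions).
from fastapi import FastAPI, File, HTTPException, UploadFile

app = FastAPI()

ALLOWED_CONTENT_TYPES = {"application/pdf", "image/png", "image/jpeg"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10 MiB, illustrative limit


@app.post("/documents")
async def upload_document(file: UploadFile = File(...)) -> dict:
    if file.content_type not in ALLOWED_CONTENT_TYPES:
        raise HTTPException(status_code=415, detail="Unsupported file type")
    data = await file.read()
    if len(data) > MAX_UPLOAD_BYTES:
        raise HTTPException(status_code=413, detail="File too large")
    return {"filename": file.filename, "size": len(data)}
```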
The project is configured for production deployment on Heroku with:
- Procfile for process management
- Heroku Postgres and Redis add-ons
- Environment variable configuration
- Proper logging for Heroku
Performance and monitoring:

- Connection pooling for all external services
- Caching strategies with Redis
- Performance metrics logging
- Health checks implementation (see the sketch below)
- Structured logging for easy analysis
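For instance, the health-check item above could be implemented as a small endpoint like this sketch (the route name and the Redis ping are assumptions):

```python
# Sketch: a health-check endpoint that also pings Redis (route name assumed).
import redis.asyncio as aioredis
from fastapi import FastAPI

app = FastAPI()
redis_client = aioredis.Redis(host="localhost", port=6379)


@app.get("/health")
async def health() -> dict:
    try:
        redis_ok = bool(await redis_client.ping())
    except Exception:
        redis_ok = False
    return {"status": "ok" if redis_ok else "degraded", "redis": redis_ok}
```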
Development guidelines:

- Follow Clean Architecture principles
- Use async/await for all I/O operations
- Implement proper error handling
- Write comprehensive tests
- Use type hints
- Sanitize data before logging
- Make operations idempotent
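To make the idempotency guideline concrete, one common approach is to record processed message IDs in Redis with SET NX; the sketch below assumes redis-py, and the key format and TTL are illustrative.

```python
# Sketch: skip messages that were already processed (key format and TTL assumed).
import redis

r = redis.Redis(host="localhost", port=6379)


def process_once(message_id: str) -> bool:
    """Return True if the message was processed, False if it was a duplicate."""
    # SET ... NX succeeds only when the key does not already exist.
    first_time = r.set(f"processed:{message_id}", "1", nx=True, ex=86400)
    if not first_time:
        return False  # duplicate delivery: safely ignore it
    # ... perform the actual work here ...
    return True
```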
[License information here]