go-tesseract-ocr-service

This Go project provides a microservice with a REST API and a web view for converting PDFs and images to text using the Tesseract OCR engine.

It is a proof of concept at this point. In future development it will be split into a multi-tier application architecture for better scalability, again for instructional purposes.

1. How to build and run:

docker build -t ocr-tesseract .

docker run --privileged=true -d -t -i \
    -p 8080:80 \
    -e UPLOADED_FILES_DIR='/tmp/pdf-cache' \
    -v /tmp/pdf-cache:/tmp/pdf-cache ocr-tesseract

2. Main Web Views

The service provides minimal web views for trying out its features:

http://localhost:8080/web/pdf
http://localhost:8080/web/img

3. Endpoints

3.1 API endpoint for PDF submission

http://localhost:8080/api/upload/pdf

3.2 API endpoint for Image submission

http://localhost:8080/api/upload/img

4. Frameworks

This project uses the following tools:

Tesseract OCR: OCR engine
Ghostscript: PDF interpreter used to convert a PDF into a set of images (one per page)
