Autocracy

Absolute power to automate OCR.

Installing

$ npm install -g autocracy

Alternatively, skip the installation and prefix the commands below with npx.

Completions for Zsh will also be installed if the Zsh site-functions directory exists and is owned by your user. To create it:

$ mkdir -p /usr/local/share/zsh/site-functions
$ chown -R $(whoami) /usr/local/share/zsh/site-functions

You will also need to install the tools that Autocracy relies on: Tesseract, MuPDF, and QPDF. On a Mac with Homebrew, these can be installed with brew install tesseract mupdf qpdf. With Apt, run apt install tesseract-ocr mupdf-tools qpdf.

If you are not using Homebrew, check that your Tesseract installation includes the fast training data for your desired languages, which can otherwise be downloaded from here.
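To confirm the dependencies are in place before a first run, something like the following could be used. This is only a sketch: check_tools is a hypothetical helper rather than part of Autocracy, and it assumes the packages above provide executables named tesseract, mutool (the MuPDF command-line tool), and qpdf.

```shell
# check_tools is a hypothetical helper, not part of Autocracy.
# It prints each named tool that cannot be found on the PATH and
# returns non-zero if any are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names for Tesseract, MuPDF, and QPDF.
if check_tools tesseract mutool qpdf; then
    echo "all tools found"
    # List the languages Tesseract has training data for.
    tesseract --list-langs || echo "could not list Tesseract languages"
fi
```

The tesseract --list-langs output shows which languages currently have training data installed.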

Usage

To output text files:

$ autocracy get-text <origin> <destination>

To output new PDF files with embedded text:

$ autocracy make-searchable <origin> <destination>

In either case, the origin should be a directory of PDF files. The destination should be the name of a directory to be created for the results.
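These expectations can be checked before starting a long run. The helper below is a sketch under assumptions, not Autocracy's own behaviour: preflight is a hypothetical name, it only looks for .pdf files directly inside the origin, and it treats an existing destination as an error because the destination is described as a directory to be created.

```shell
# preflight is a hypothetical helper, not part of Autocracy.
# Usage: preflight <origin> <destination>
preflight() {
    origin=$1
    destination=$2
    if [ ! -d "$origin" ]; then
        echo "origin is not a directory: $origin"
        return 1
    fi
    # Look for at least one PDF directly inside the origin
    # (assumption: subdirectories are not searched here).
    if [ -z "$(find "$origin" -maxdepth 1 -type f -iname '*.pdf' | head -n 1)" ]; then
        echo "no PDF files found in: $origin"
        return 1
    fi
    # Assumption: a fresh destination is wanted, since it is to be created.
    if [ -e "$destination" ]; then
        echo "destination already exists: $destination"
        return 1
    fi
    echo "ok: $origin -> $destination"
}
```

It could then gate a run, for example preflight scans text && autocracy get-text scans text, where scans and text are placeholder directory names.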

By default, Autocracy will first attempt to extract any tagged text from within the PDF files. If tagged text is found, it is used instead of (much slower) OCR. To disable this, use the --force-ocr flag. The --preprocess flag applies some processing to the pages to attempt to improve OCR quality. The language expected in the documents defaults to English, but can be specified by passing the --language flag with one of the language codes from this page.
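As an illustration of combining these flags, the sketch below assembles a command that forces OCR with preprocessing on German-language documents. Several details are assumptions rather than documented behaviour: the directory names scans and text are placeholders, deu is assumed to be an accepted language code (it is Tesseract's code for German), and the value is assumed to follow the --language flag separated by a space.

```shell
# Placeholder values chosen for this example only.
origin="scans"
destination="text"
language="deu"   # assumed to be an accepted language code

# Build the command as a string first so it can be reviewed before running.
# Assumption: flags may be given before the origin and destination.
command="autocracy get-text --force-ocr --preprocess --language $language $origin $destination"
echo "$command"
```

Once the printed command looks right, it can be run directly in the shell.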

A directory named .autocracy-cache will be created to contain intermediate files. These will be reused on subsequent invocations of Autocracy. Delete this directory once you have finished.
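Cleaning up could look like the sketch below. It assumes the cache directory is created inside the directory Autocracy was run from, which is not stated here; clean_cache is a hypothetical helper name.

```shell
# clean_cache is a hypothetical helper, not part of Autocracy.
# Usage: clean_cache [directory]   (defaults to the current directory)
clean_cache() {
    cache_dir="${1:-.}/.autocracy-cache"
    if [ -d "$cache_dir" ]; then
        rm -rf "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "nothing to remove at $cache_dir"
    fi
}

# Assumption: the cache sits in the current working directory.
clean_cache .
```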
