Taranis is a similarity search engine built around Faiss library. It allows you to find the most similar vectors (a common mathematical and simplified representation of an image or a sound) of a query vector among hundreds of millions, or billions if you have enough RAM.
Many computer scientists are now able to use machine learning frameworks to classify images without even having to understand how it works. It is very easy to obtain a probability of belonging to a class. In a production environment, regularly adding images or classes causes a bottleneck since you must constantly re-learn your model. One solution is to use a non-evolving model that simply produces an N-dimensional vector for each input image. These vectors can then be compressed, indexed and/or searched by similarity with an external and incremental system.
Taranis is similarity search engine (Think Elasticsearch, but for vectors, not text documents). Taranis is in fact just a wrapper around the Faiss library. It aims at providing what is missing in such a scientific library:
-
Data persitency: reliable storage of raw and compressed vectors (Faiss stores the data in RAM, and can persist to disk on demand, but the writing is not incremental).
-
Web services: Taranis provides a gRPC server, and some rest API endpoints for management/monitoring.
-
Packaging in a ready-to-run Docker image.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
-
MongoDB 4.0+ (for storing databases and indices informations, and raw features)
-
Redis 5.0+ (for storing Faiss indices and encoded vectors, inverted lists, etc.)
-
cmake 3.7+
-
openblas
-
tacopie 3.2.0: https://github.com/Cylix/tacopie
-
cpp_redis 4.3.1: https://github.com/cpp-redis/cpp_redis
-
faiss 1.5.3: https://github.com/facebookresearch/faiss
-
pybind11 2.2.4: https://github.com/pybind/pybind11
-
fmtlib 5.3.0: https://github.com/fmtlib/fmt
docker-compose up -d
A default configuration file is provided inside the the Docker image. There are 4 complementary ways to configure Taranis:
-
One or more configuration files (yaml|tomljson|ini|xml) provided by the
--config-file /my/config/file/path
command line arg. -
One or more configuration directories provided with the
--config-path /my/config/directory
command line arg. Example: providing the path/run/secrets
, Taranis will read every files in all subdirectories of/run/secrets
and associate a key (the subpath) to a value (file content). If there is a file/run/secrets/db/redis/password
containing the textnotagoodpassword
, the configuration will be:db.redis.password=notagoodpassword
-
Every environment variables starting with
TARANIS__
will be parsed. For instance,TARANIS__DB__REDIS__HOST=my-redis-host
is equivalent todb.redis.host=my-redis-host
. -
Every additional command lines provided with the arg
--additional-config
or-C
, such as-C db.redis.host=my-redis-host
.
I am Pierre Letessier, R&D engineer. Taranis is a personal project developed only in my free time, so please be indulgent !
Feedbacks, tests, benchmarks, issues and pull requests are welcome. For pull requests, please fork and create a new branch before to submit it.