Feature Overview
Efficient
Terrier can index large corpora of documents and provides multiple indexing strategies, such as multi-pass and large-scale single-pass indexing. Real-time indexing of document streams is also supported via updatable index structures.
Effective
State-of-the-art retrieval approaches are provided, such as Divergence From Randomness and BM25F, as well as term-dependence proximity models. Support for supervised ranking models via Learning to Rank is also built in.
Flexible
Terrier is ideal for performing information retrieval experiments. It can index and perform batch retrieval experiments for all known TREC test collections. Tools to evaluate experiment results are also included.
Multi-lingual
Terrier uses UTF internally and can support corpora written in languages other than English.
Extensible
Terrier follows a plugin architecture and is easy to extend: you can develop new retrieval techniques, add new ranking features, or experiment with low-level functionality such as index compression.
Interactive
View search results in a handy desktop search application, online through JSP web interfaces, or through the provided website search application. Plan and execute experiments in notebooks using Terrier-Spark.