MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Updated Jun 4, 2024 - Python
Quickly search, compare, and analyze genomic and metagenomic data sets.
JS implementation of probabilistic data structures: Bloom Filter (and its derivatives), HyperLogLog, Count-Min Sketch, Top-K and MinHash
Locality Sensitive Hashing using MinHash in Python/Cython to detect near-duplicate text documents
C++ Implementations of sketch data structures with SIMD Parallelism, including Python bindings
Sketching Algorithms for Clojure (Bloom filter, MinHash, HyperLogLog, Count-Min Sketch)
Detect and visualize text reuse
Weighted MinHash implementation on CUDA (multi-GPU).
Dynatrace hash library for Java
Locality Sensitive Hashing
High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datasets
A resistome profiler for Graphing Resistance Out Of meTagenomes
A Clojure library for querying large data-sets on similarity
Elasticsearch plugin for the b-bit MinHash algorithm
Union, intersection, and set cardinality in loglog space
Quickly estimate the similarity between many sets
SetSketch: Filling the Gap between MinHash and HyperLogLog
Genomic neighbor typing of bacterial pathogens using MinHash 🐀
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity