Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
Updated Nov 22, 2022 - Python
Advanced Duplicate File Finder for Python
Command line utility to remove exact duplicate files.
🍰 A library for creating n-grams, skip-grams, bag of words, bag of n-grams, bag of skip-grams.
File duplicate remover for Synology DSM 213j+
Program to scan and search for file duplicates. (~300MB/s)
Function that removes duplicate items and objects based on a key from an array of objects.
Command Line Interface for deplicate
Takes an input CSV, writes its duplicate records to a separate CSV, then cleans the input CSV by removing those duplicates.
Sort, uniq, reverse, and randomize data
A no-nonsense .NET Core 2.1 CLI duplicate files remover
A tool that deduplicates the lines of a text file at RAM speed and scales well across all cores.
Searches for duplicates across two separate folders, removing the duplicated files from one folder while keeping the other intact.
A powerful data preprocessing application that simplifies the task of preparing data for machine learning models.
Modified Levenshtein distance algorithms that match strings by deletion and capitalization changes only; replacement and insertion of characters are not allowed.
Parallel Computing (Lab)
rm-dup is a script to remove duplicate files
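Several of the tools listed above remove exact duplicate files, but none of the descriptions say how duplicates are detected. One common approach, sketched here purely as an illustration and not as any listed project's implementation, is to group files by size and then by a content hash. The names `file_digest` and `find_exact_duplicates` are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Group regular files under `root` by content.

    Returns a list of groups; each group is a sorted list of two or
    more paths whose contents hash to the same digest.
    """
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size has no possible duplicate
        # Second pass: hash only the files that share a size.
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

The sketch only reports duplicate groups and deletes nothing; which copy to keep is a policy decision left to the caller.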