dedupe

Matcher

A Python library for fuzzy matching and record deduplication

A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
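
To make "fuzzy matching" and "record deduplication" concrete, here is a minimal sketch that uses only Python's standard library. It does not use dedupe's API and is not how the library works internally. The function names `similarity` and `cluster_duplicates`, the 0.85 threshold, and the sample names are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_duplicates(records: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Greedily group records by similarity to the first member of each cluster.

    Hypothetical helper for illustration only; the threshold is an assumption.
    """
    clusters: list[list[str]] = []
    for record in records:
        for cluster in clusters:
            # Compare against the cluster's first record as its representative.
            if similarity(record, cluster[0]) >= threshold:
                cluster.append(record)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([record])
    return clusters


# Made-up sample data: three spellings of one company and one unrelated name.
names = ["Acme Corporation", "ACME Corporation", "Acme Corporatoin", "Globex Inc"]
clusters = cluster_duplicates(names)
for group in clusters:
    print(group)
```

A greedy pass like this compares each record only with cluster representatives, so it is simple but sensitive to input order and to the chosen threshold.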

GitHub

4k stars
120 watching
552 forks
Language: Python
last commit: about 2 months ago
Linked from 2 awesome lists

Tags: clustering, datamade, de-duplicating, dedupe, dedupe-library, entity-resolution, python, python-library, record-linkage

Related projects:

Repository | Description | Stars
kornelski/dupe-krill | A tool that identifies and replaces duplicate files with hardlinks to reduce storage space. | 187
nil0x42/duplicut | Tools to remove duplicates from massive wordlists used in password cracking | 885
dedline-io/dedline-api | Provides static API endpoints for US election registration deadline info | 1
gauss314/defi | Tools and libraries for calculating DeFi metrics and analyzing cryptocurrency data | 559
dipy/dipy | A comprehensive library for the analysis of MR diffusion imaging in medical research. | 723
seriousmanual/dedupe | A utility function to remove duplicate values from arrays | 24
dropbox/pyhive | Provides interfaces to connect and interact with data sources like Hive and Presto using Python. | 1,676
yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 75
debrouwere/python-literate | A tool that integrates Python code with Markdown documentation to generate reproducible HTML reports. | 41
radimspetlik/si-ddpm-fmo | A Python-based framework for training and evaluating deep learning models for single-image deblurring, shape, and trajectory recovery of fast-moving objects. | 5
dissectmalware/batch_deobfuscator | Deobfuscates batch scripts by substituting encoded strings and escaping characters. | 150
crypto-com/python-iavl | A toolset for inspecting and debugging Cosmos SDK application databases using an in-memory data structure called an IAVL tree | 5
egoist/dedep | Tool to manage and retrieve the latest versions of external dependencies used in a Deno project | 66
hi-primus/optimus | A Python library that provides a simple API for data preparation and analysis on various big-data engines | 1,486
deepmed-lab-ecnu/deeprft-aaai2023 | An image deblurring technique based on frequency selection using machine learning models | 18