dedupe

Matcher

A Python library for fuzzy matching and record deduplication

A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
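To make "fuzzy matching and record deduplication" concrete, here is a minimal sketch using only Python's standard-library `difflib`. It does not use the dedupe library's API. The helper names `similarity` and `cluster_duplicates`, the sample records, and the 0.85 threshold are all assumptions chosen for illustration. It shows the basic idea of grouping near-identical strings, not how dedupe itself works.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio after lowercasing and trimming whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def cluster_duplicates(records: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Greedily group each record with the first cluster whose first member is similar enough.

    Hypothetical helper for illustration only; it compares against one
    representative per cluster, so results depend on input order.
    """
    clusters: list[list[str]] = []
    for record in records:
        for cluster in clusters:
            if similarity(record, cluster[0]) >= threshold:
                cluster.append(record)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([record])
    return clusters


# Made-up sample data, not taken from any real dataset.
records = ["Acme Corporation", "ACME Corporation ", "Acme Corp", "Globex Inc"]
clusters = cluster_duplicates(records)
print(clusters)
```

This greedy pass compares every record against every cluster, which is fine for a handful of rows but scales poorly. That cost is one reason to reach for a dedicated library on larger datasets.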

GitHub

4k stars
120 watching
551 forks
Language: Python
last commit: 20 days ago
Linked from 2 awesome lists

clustering, datamade, de-duplicating, dedupe, dedupe-library, entity-resolution, python, python-library, record-linkage


Related projects:

| Repository | Description | Stars |
|---|---|---|
| kornelski/dupe-krill | A tool that identifies and replaces duplicate files with hardlinks to reduce storage space | 186 |
| nil0x42/duplicut | Tools to remove duplicates from massive wordlists used in password cracking | 881 |
| dedline-io/dedline-api | Provides static API endpoints for US election registration deadline info | 1 |
| gauss314/defi | Tools and libraries for calculating DeFi metrics and analyzing cryptocurrency data | 558 |
| dipy/dipy | A comprehensive library for analyzing and processing medical imaging data from diffusion MRI | 716 |
| seriousmanual/dedupe | A utility function to remove duplicate values from arrays | 24 |
| dropbox/pyhive | Provides interfaces to connect and interact with data sources like Hive and Presto using Python | 1,671 |
| yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 71 |
| debrouwere/python-literate | A tool that integrates Python code with Markdown documentation to generate reproducible HTML reports | 41 |
| radimspetlik/si-ddpm-fmo | A Python-based framework for training and evaluating deep learning models for single-image deblurring, shape, and trajectory recovery of fast-moving objects | 5 |
| dissectmalware/batch_deobfuscator | Deobfuscates batch scripts by substituting encoded strings and escaping characters | 145 |
| crypto-com/python-iavl | A toolset for inspecting and debugging Cosmos SDK application databases using an in-memory data structure called an IAVL tree | 5 |
| egoist/dedep | Tool to manage and retrieve the latest versions of external dependencies used in a Deno project | 66 |
| hi-primus/optimus | A Python library that provides a simple API for data preparation and analysis on various big-data engines | 1,481 |
| deepmed-lab-ecnu/deeprft-aaai2023 | A deep learning-based image deblurring system that explores the impact of frequency selection on restoration quality | 18 |