dedupe
Matcher
A Python library for fuzzy matching and record deduplication
A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
4k stars
120 watching
551 forks
Language: Python
Last commit: 20 days ago
Linked from 2 awesome lists
clustering, datamade, de-duplicating, dedupe, dedupe-library, entity-resolution, python, python-library, record-linkage
Related projects:
Repository | Description | Stars |
---|---|---|
kornelski/dupe-krill | A tool that identifies and replaces duplicate files with hardlinks to reduce storage space. | 186 |
nil0x42/duplicut | Tools to remove duplicates from massive wordlists used in password cracking | 881 |
dedline-io/dedline-api | Provides static API endpoints for US election registration deadline info | 1 |
gauss314/defi | Tools and libraries for calculating DeFi metrics and analyzing cryptocurrency data | 558 |
dipy/dipy | A comprehensive library for analyzing and processing medical imaging data from diffusion MRI | 716 |
seriousmanual/dedupe | A utility function to remove duplicate values from arrays | 24 |
dropbox/pyhive | Provides interfaces to connect and interact with data sources like Hive and Presto using Python. | 1,671 |
yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 71 |
debrouwere/python-literate | A tool that integrates Python code with Markdown documentation to generate reproducible HTML reports. | 41 |
radimspetlik/si-ddpm-fmo | A Python-based framework for training and evaluating deep learning models for single-image deblurring, shape, and trajectory recovery of fast-moving objects. | 5 |
dissectmalware/batch_deobfuscator | Deobfuscates batch scripts by substituting encoded strings and escaping characters. | 145 |
crypto-com/python-iavl | A toolset for inspecting and debugging Cosmos SDK application databases using an in-memory data structure called an IAVL tree | 5 |
egoist/dedep | Tool to manage and retrieve the latest versions of external dependencies used in a Deno project | 66 |
hi-primus/optimus | A Python library that provides a simple API for data preparation and analysis on various big-data engines | 1,481 |
deepmed-lab-ecnu/deeprft-aaai2023 | A deep learning-based image deblurring system that explores the impact of frequency selection on restoration quality | 18 |