dedupe
Matcher
A Python library for fuzzy matching and record deduplication
A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
4k stars
120 watching
552 forks
Language: Python
last commit: 3 months ago
Linked from 2 awesome lists
clustering, datamade, de-duplicating, dedupe, dedupe-library, entity-resolution, python, python-library, record-linkage
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A tool that identifies and replaces duplicate files with hardlinks to reduce storage space. | 187 |
| | Tools to remove duplicates from massive wordlists used in password cracking. | 885 |
| | Provides static API endpoints for US election registration deadline info. | 1 |
| | Tools and libraries for calculating DeFi metrics and analyzing cryptocurrency data. | 559 |
| | A comprehensive library for the analysis of MR diffusion imaging in medical research. | 723 |
| | A utility function to remove duplicate values from arrays. | 24 |
| | Provides interfaces to connect and interact with data sources like Hive and Presto using Python. | 1,676 |
| | Debiasing techniques to minimize hallucinations in large visual language models. | 75 |
| | A tool that integrates Python code with Markdown documentation to generate reproducible HTML reports. | 41 |
| | A Python-based framework for training and evaluating deep learning models for single-image deblurring, shape, and trajectory recovery of fast-moving objects. | 5 |
| | Deobfuscates batch scripts by substituting encoded strings and escaping characters. | 150 |
| | A toolset for inspecting and debugging Cosmos SDK application databases using an in-memory data structure called an IAVL tree. | 5 |
| | Tool to manage and retrieve the latest versions of external dependencies used in a Deno project. | 66 |
| | A Python library that provides a simple API for data preparation and analysis on various big-data engines. | 1,486 |
| | An image deblurring technique based on frequency selection using machine learning models. | 18 |