splink

Data linker

A Python package that uses statistical models to link and deduplicate data records from datasets lacking unique identifiers.

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
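
As an illustration of what "probabilistic data linkage" can mean in practice, here is a minimal sketch of Fellegi-Sunter style match scoring in plain Python. It does not use Splink's API, and the field names, the m/u probabilities, and the prior are all hypothetical values chosen for the example. In a real workflow such parameters would usually be estimated from the data (for instance with an EM algorithm) rather than hard-coded.

```python
import math

# Hypothetical parameters, for illustration only.
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
PARAMS = {
    "first_name": {"m": 0.90, "u": 0.010},
    "surname": {"m": 0.95, "u": 0.005},
    "dob": {"m": 0.98, "u": 0.001},
}


def match_weight(record_a, record_b, params=PARAMS):
    """Sum the log2 Bayes factors over all compared fields."""
    weight = 0.0
    for field, p in params.items():
        if record_a.get(field) == record_b.get(field):
            # Agreement is evidence for a match when m > u.
            weight += math.log2(p["m"] / p["u"])
        else:
            # Disagreement is evidence against a match when m > u.
            weight += math.log2((1 - p["m"]) / (1 - p["u"]))
    return weight


def match_probability(weight, prior=0.001):
    """Turn a match weight into a posterior probability.

    `prior` is the assumed probability that a random pair is a match.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)


a = {"first_name": "ann", "surname": "lee", "dob": "1990-01-01"}
b = {"first_name": "ann", "surname": "lee", "dob": "1990-01-01"}
c = {"first_name": "bob", "surname": "ray", "dob": "1985-06-30"}

same_pair = match_probability(match_weight(a, b))
different_pair = match_probability(match_weight(a, c))
```

With these assumed parameters, the pair that agrees on every field scores far higher than the pair that disagrees on every field, which is how records can be linked without a shared unique identifier.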

GitHub

1k stars
20 watching
154 forks
Language: Python
last commit: about 2 months ago
Topics: data-matching, data-science, deduplicate-data, deduplication, duckdb, em-algorithm, entity-resolution, fuzzy-matching, record-linkage, spark, uk-gov-data-science

Related projects:

Repository | Description | Stars
josephfrazier/octopermalinker | A browser extension that automatically updates links to branches on GitHub | 27
byrnereese/linkchecker-mkdocs | Tool to validate links in static generated websites with Markdown files | 10
mommi84/mandolin | A system that uses Markov Logic Networks to discover links in knowledge graphs | 4
rafaelstz/magento2-quicklink | An extension that predicts and preloads links on subsequent pages to improve loading speed | 51
rafguns/linkpred | A tool for predicting links in networks based on heuristics and statistical analysis of relationships | 142
rescribet/link-redux | A JavaScript library and React component suite for rendering Linked Data in web applications | 36
blake-regalia/linked-data.syntaxes | A package of syntax highlighters for various linked data formats | 30
arbazkiraak/burpblh | An extension for Burp Suite to identify broken links in web responses | 56
ged/linkparser | A high-level interface to the CMU Link Grammar parser | 76
arbazkiraak/linksdumper | A tool that extracts and filters links from web responses | 86
dzonatan/ngx-linky | An Angular pipe to find links in text and turn them into HTML links | 41
umbrelladocs/linkspector | A CLI tool that checks for dead hyperlinks in files using multiple markup languages | 70
rmlio/rmlmapper-java | A Java library that generates high-quality Linked Data from multiple semi-structured data sources using RML rules | 161
archi-doc/valuelink | A C# library for creating and managing flexible links between objects in code | 10
remodoy/clj-postgresql | A Clojure library that provides an interface to PostgreSQL databases with support for connection parameter customization and type conversion | 161