fuzzy_match

Record matcher

A tool for finding similar records in large datasets using string similarity and regular expression rules.

Find a needle (a document or record) in a haystack using string similarity and (optionally) regular expression rules. Uses Dice's Coefficient (aka Pair Similarity) and Levenshtein Distance internally.
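To make the two metrics named above concrete, here is a minimal, self-contained Ruby sketch of Dice's coefficient over character bigrams and of Levenshtein distance, plus a toy needle-in-haystack lookup. It is not fuzzy_match's actual code or API: the method names `bigrams`, `dice_coefficient`, `levenshtein_distance`, and `best_match` are hypothetical, and ranking by Dice score with Levenshtein distance as the tie-breaker is an assumption made for illustration.

```ruby
# Illustrative sketch only; all method names here are hypothetical,
# not part of the fuzzy_match gem.

# Character bigrams of a string, lowercased: "night" -> ["ni", "ig", "gh", "ht"].
def bigrams(str)
  str.downcase.each_char.each_cons(2).map(&:join)
end

# Dice's coefficient on bigram multisets: 2 * shared / (size_a + size_b).
# Returns a Float between 0.0 (nothing shared) and 1.0 (identical bigrams).
def dice_coefficient(a, b)
  pairs_a = bigrams(a)
  pairs_b = bigrams(b)
  total = pairs_a.size + pairs_b.size
  return 0.0 if total.zero?

  # Count each bigram of b so a repeated pair is matched at most as often as it occurs.
  remaining = Hash.new(0)
  pairs_b.each { |pair| remaining[pair] += 1 }

  shared = pairs_a.count do |pair|
    next false unless remaining[pair] > 0
    remaining[pair] -= 1
    true
  end
  2.0 * shared / total
end

# Levenshtein distance: the minimum number of single-character insertions,
# deletions, and substitutions needed to turn a into b (two-row dynamic programming).
def levenshtein_distance(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Assumed ranking: highest Dice score wins, smaller edit distance breaks ties.
def best_match(haystack, needle)
  haystack.max_by do |candidate|
    [dice_coefficient(candidate, needle),
     -levenshtein_distance(candidate.downcase, needle.downcase)]
  end
end

best_match(%w[seamus andy ben], "Shamus") # => "seamus"
```

In this sketch "seamus" and "Shamus" share three of their five bigrams ("am", "mu", "us"), giving a Dice score of 0.6, while "andy" and "ben" share none. Scanning every candidate like this is linear in the size of the haystack, which is fine for illustration but says nothing about how the gem itself scales.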

GitHub

676 stars
10 watching
46 forks
Language: Ruby
last commit: over 3 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| wouterrutgers/fuzzy-search | A lightweight JavaScript library for searching similar strings in an array of objects | 224 |
| jamesturk/jellyfish | A Python library providing algorithms and encoding schemes for approximate string matching | 2,066 |
| glench/fuzzyset.js | A fuzzy string matching library that performs approximate string matching and detects likely misspellings | 1,369 |
| danharltey/fastenshtein | An optimized Levenshtein implementation for fast fuzzy matching and string comparison in .NET | 248 |
| hernanmd/fuzzysearcher | An implementation of the ambiguous matching algorithm from Baeza-Yates et al. | 2 |
| rapidfuzz/rapidfuzz-cpp | A C++ library for fast string matching using the Levenshtein Distance algorithm | 244 |
| jhawthorn/fzy.js | A JavaScript implementation of a fuzzy string matching algorithm | 152 |
| rmm5t/liquidmetal | A JavaScript library to improve fuzzy matching in web controls by leveraging a modified Quicksilver scoring algorithm | 295 |
| brianhempel/fuzzy_tools | A toolset for searching and indexing strings in Ruby with fuzzy matching capabilities | 23 |
| kiyoka/fuzzy-string-match | A fast fuzzy string matching library | 285 |
| dgrtwo/fuzzyjoin | Package for joining tables based on inexact matching | 669 |
| lexmag/simetric | Facilities to calculate the distance and similarity between strings using various algorithms | 61 |
| mitsuhiko/insta | A library for comparing expected values against reference data to ensure consistency during development and testing | 2,234 |
| blackrabbitt/mspm | An algorithm implementation for efficient multi-string pattern matching using trie data structures | 26 |
| sindresorhus/matcher | A utility for simple string matching with wildcard patterns | 537 |