java-string-similarity

String similarity library

A Java library implementing various string similarity and distance measures, including Levenshtein, Jaro-Winkler, n-gram, q-gram, Jaccard index, longest common subsequence (LCS) edit distance, and cosine similarity.
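
To make the first measure in that list concrete, here is a minimal standalone sketch of the Levenshtein edit distance and a normalized similarity derived from it. The class name `LevenshteinSketch` and its methods are hypothetical and written for illustration only; this is not the library's own code or API, just the textbook dynamic-programming formulation.

```java
// Hypothetical illustration class; not part of java-string-similarity.
final class LevenshteinSketch {

    private LevenshteinSketch() {
        // Static helpers only.
    }

    /**
     * Minimum number of single-character insertions, deletions, and
     * substitutions needed to turn a into b. Uses two rolling rows, so
     * memory is proportional to b.length() rather than the full matrix.
     */
    static int distance(String a, String b) {
        if (a.equals(b)) {
            return 0;
        }
        if (a.isEmpty()) {
            return b.length();
        }
        if (b.isEmpty()) {
            return a.length();
        }

        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so the freshly computed row becomes the previous one.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Assumed normalization: 1 - distance / max(length), giving a value
     * in [0, 1], where 1 means the strings are identical.
     */
    static double similarity(String a, String b) {
        int maxLength = Math.max(a.length(), b.length());
        if (maxLength == 0) {
            return 1.0; // Two empty strings are treated as identical.
        }
        return 1.0 - (double) distance(a, b) / maxLength;
    }
}
```

Calling `LevenshteinSketch.distance("kitten", "sitting")` yields 3 (two substitutions and one insertion). Dividing by the longer string's length is one common way to turn a distance into a similarity score; other normalizations exist and may suit different data better.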

3k stars
112 watching
414 forks
Language: Java
Last commit: over 2 years ago
Linked from 1 awesome list

Topics: algorithm, cosine-similarity, damerau-levenshtein, distance, distance-measure, jaro-winkler, java, levenshtein-distance, shingles, similarity-measures, string-distance

Related projects:

Repository | Description | Stars
feature23/stringsimilarity.net | A .NET port of Java string similarity library implementing various distance and similarity measures | 452
sindresorhus/leven | A JavaScript implementation of the Levenshtein distance algorithm for measuring string similarity | 718
agext/levenshtein | Calculates Levenshtein distance and similarity metrics between two strings | 86
hbollon/go-edlib | A comprehensive Go library for calculating string similarity and edit distances between strings | 488
cbaggers/mk-string-metrics | Provides efficient algorithms to calculate string similarity metrics | 22
life4/textdistance | A Python library for comparing distances between sequences using various algorithms | 3,410
ztane/python-levenshtein | Fast string computation and similarity functions for text analysis | 1,265
lexmag/simetric | Facilities to calculate the distance and similarity between strings using various algorithms | 61
tonytonyjan/jaro_winkler | An implementation of the Jaro-Winkler distance algorithm to compare strings | 195
globalnamesarchitecture/damerau-levenshtein | Calculates edit distance between two strings using the Damerau-Levenshtein algorithm | 145
dbalatero/levenshtein-ffi | Fast string edit distance computation using the Damerau-Levenshtein algorithm | 150
mateusza/sqlite-levenshtein | A utility extension for computing string similarities between two sequences using the Levenshtein distance algorithm | 15
nektro/zig-leven | Calculates the difference between two strings using the Levenshtein distance algorithm | 7
turnerj/quickenshtein | A high-performance Levenshtein distance calculator with SIMD and threading support | 285
roy-ht/editdistance | A fast implementation of Levenshtein distance for calculating string similarity | 664