java-string-similarity

String similarity library

A Java library implementing various string similarity and distance measures.

Implements a range of string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
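
To give a sense of what the first measure in that list computes, here is a minimal, standalone Java sketch of Levenshtein distance. It is not the library's code or API: the class name `LevenshteinSketch` and the methods `distance` and `similarity` are hypothetical names chosen for illustration, and the normalization used in `similarity` (one minus distance divided by the longer length) is one common convention assumed here.

```java
// Illustrative sketch only; class and method names are hypothetical,
// not taken from java-string-similarity.
public final class LevenshteinSketch {

    private LevenshteinSketch() {}

    /**
     * Minimum number of single-character insertions, deletions and
     * substitutions needed to turn a into b. Compares UTF-16 code units.
     */
    public static int distance(String a, String b) {
        if (a.equals(b)) return 0;
        if (a.isEmpty()) return b.length();
        if (b.isEmpty()) return a.length();

        // Two rows of the dynamic-programming table are enough.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization: 1.0 for identical strings, 0.0 for fully different ones. */
    public static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) return 1.0;
        return 1.0 - (double) distance(a, b) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

The two-row table keeps memory proportional to the length of the second string, at the cost of not being able to recover the actual edit sequence.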

GitHub

3k stars
112 watching
414 forks
Language: Java
Last commit: over 2 years ago
Linked from 1 awesome list

algorithm, cosine-similarity, damerau-levenshtein, distance, distance-measure, jaro-winkler, java, levenshtein-distance, shingles, similarity-measures, string-distance

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| feature23/stringsimilarity.net | A .NET port of Java string similarity library implementing various distance and similarity measures | 448 |
| sindresorhus/leven | A JavaScript implementation of the Levenshtein distance algorithm for measuring string similarity. | 715 |
| agext/levenshtein | Calculates Levenshtein distance and similarity metrics between two strings | 86 |
| hbollon/go-edlib | A comprehensive Go library for calculating string similarity and edit distances between strings | 481 |
| cbaggers/mk-string-metrics | Provides efficient algorithms to calculate string similarity metrics | 22 |
| life4/textdistance | A Python library for comparing distances between sequences using various algorithms. | 3,394 |
| ztane/python-levenshtein | Fast string computation and similarity functions for text analysis | 1,263 |
| lexmag/simetric | Facilities to calculate the distance and similarity between strings using various algorithms | 61 |
| tonytonyjan/jaro_winkler | An implementation of the Jaro-Winkler distance algorithm to compare strings | 195 |
| globalnamesarchitecture/damerau-levenshtein | Calculates edit distance between two strings using the Damerau-Levenshtein algorithm | 144 |
| dbalatero/levenshtein-ffi | Fast string edit distance computation using the Damerau-Levenshtein algorithm | 149 |
| mateusza/sqlite-levenshtein | A utility extension for computing string similarities between two sequences using the Levenshtein distance algorithm | 15 |
| nektro/zig-leven | Calculates the difference between two strings using the Levenshtein distance algorithm | 7 |
| turnerj/quickenshtein | A high-performance Levenshtein Distance calculator with SIMD and threading support. | 284 |
| roy-ht/editdistance | A fast implementation of Levenshtein distance for calculating string similarity | 661 |