cutthai

Thai word segmenter

A tool for Thai word segmentation, written in CoffeeScript.

GitHub

5 stars
1 watching
1 fork
Language: CoffeeScript
Last commit: almost 7 years ago

Related projects:

| Repository | Description | Stars |
|---|---|---|
| pucktada/cutkum | A tool for segmenting Thai text into words using Recurrent Neural Networks in TensorFlow. | 154 |
| remixman/pythonlexto | A Python wrapper around a Java library for segmenting Thai text into individual words. | 3 |
| krakenai/synthai | A deep learning-based project for segmenting Thai text into words and annotating parts of speech with high accuracy. | 41 |
| tchayintr/best2010_cooker | Extracts segmented words from the Thai BEST2010 corpus. | 2 |
| c4n/pythonlexto | A Python wrapper around the Thai word segmenter LexTo, allowing developers to easily integrate it into their applications. | 1 |
| veer66/wordcut | A Node.js-based Thai word breaker with an optional custom dictionary and command-line interface. | 143 |
| tkellen/ruby-ngram | Breaks text into contiguous sequences of words or phrases. | 12 |
| rkcosmos/deepcut | A Thai word tokenization library using a Deep Neural Network. | 420 |
| patois/xray | Tool for filtering and highlighting decompiler output based on regular expressions. | 125 |
| diasks2/pragmatic_segmenter | A rule-based sentence boundary detection gem that works across many languages. | 551 |
| minibikini/paasaa | Tools for detecting the language of unstructured text in Elixir applications. | 115 |
| soumyaxyz/query-segmenter | An unsupervised method to segment queries in search results based on query logs. | 1 |
| naikai/sake | A tool for analyzing single-cell RNA-Seq data to identify patterns and clusters in gene expression. | 27 |
| lfcipriani/punkt-segmenter | An implementation of a sentence boundary detection algorithm in Ruby. | 92 |
| jtmoulia/neotomex | A PEG parser/transformer written in Elixir with a DSL for specifying grammars. | 68 |