ruby-ngram
Text segmenter
Breaks text into contiguous sequences of words or phrases
Break words and phrases into n-grams.
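
As a rough illustration of what breaking text into n-grams means, here is a minimal sketch in plain Ruby. It does not use or reproduce the ruby-ngram API; the helper names `word_ngrams` and `char_ngrams` are hypothetical and rely only on `Enumerable#each_cons` from the standard library.

```ruby
# Hypothetical helpers for illustration only; not the ruby-ngram API.

# Word-level n-grams: every run of n consecutive whitespace-separated words.
def word_ngrams(text, n)
  text.split.each_cons(n).to_a
end

# Character-level n-grams: every run of n consecutive characters.
def char_ngrams(text, n)
  text.chars.each_cons(n).map(&:join)
end

p word_ngrams("the quick brown fox", 2)
# => [["the", "quick"], ["quick", "brown"], ["brown", "fox"]]
p char_ngrams("ruby", 3)
# => ["rub", "uby"]
```

When the input has fewer than n items, both helpers return an empty array, since `each_cons` yields nothing in that case.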
12 stars
4 watching
2 forks
Language: Ruby
Last commit: almost 11 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
pharo-ai/ngrammodel | A tool for splitting text into sequences of words | 4 |
lfcipriani/punkt-segmenter | An implementation of a sentence boundary detection algorithm in Ruby. | 92 |
6/tiny_segmenter | A Ruby port of a Japanese text tokenization algorithm | 21 |
ankane/fasttext-ruby | Efficient text classification and representation learning library for Ruby | 203 |
reddavis/n-gram | Generates sequences of characters from a given text, useful for data analysis and modeling | 37 |
nelstrom/vim-textobj-rubyblock | A Vim plugin for selecting Ruby blocks | 331 |
postmodern/raingrams | A flexible ngrams library in Ruby allowing users to model and generate text | 69 |
ankane/torchtext-ruby | A Ruby library providing data loaders and abstractions for text and NLP tasks | 34 |
abitdodgy/words_counted | A Ruby library that tokenizes input and provides various statistical measures about the tokens | 159 |
diasks2/pragmatic_segmenter | A rule-based sentence boundary detection gem that works across many languages | 551 |
tmm1/rblineprof | A line profiler for Ruby programming language | 771 |
patterns-ai-core/langchainrb | A Ruby library providing an interface to Large Language Model (LLM) providers for text generation and embedding | 1,415 |
ankane/ngt-ruby | A high-performance approximate nearest neighbors search library for Ruby | 50 |
yohasebe/lemmatizer | A Ruby library that provides a lemmatizer for text in English. | 108 |
louismullie/scalpel | A Ruby library that uses a simple rule-based approach to segment sentences into individual words or phrases. | 51 |