pragmatic_segmenter
Sentence segmenter
Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages.
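To make "rule-based sentence boundary detection" concrete, here is a minimal sketch in plain Ruby. It is not the gem's implementation or API: the `naive_segment` method and the `ABBREVIATIONS` list are hypothetical names invented for this illustration, and the rules cover only a few English cases.

```ruby
# Hypothetical illustration only: a tiny rule-based sentence splitter.
# It does not reproduce pragmatic_segmenter's rules or its API.

# Assumed, deliberately short list of abbreviations that end in a period
# but do not end a sentence.
ABBREVIATIONS = %w[mr mrs ms dr prof st vs etc].freeze

def naive_segment(text)
  sentences = []
  current = String.new
  tokens = text.split # split on whitespace, ignoring leading/trailing blanks

  tokens.each_with_index do |token, i|
    current << " " unless current.empty?
    current << token

    # Rule 1: only tokens ending in . ! or ? (optionally followed by a
    # closing quote or bracket) are boundary candidates.
    next unless token.match?(/[.!?]["')\]]*\z/)

    # Rule 2: a known abbreviation followed by a period is not a boundary.
    word = token.downcase.gsub(/[^a-z]/, "")
    next if token.end_with?(".") && ABBREVIATIONS.include?(word)

    # Rule 3: if the next token starts with a lowercase letter,
    # assume the sentence continues.
    following = tokens[i + 1]
    next if following && following.match?(/\A[a-z]/)

    sentences << current
    current = String.new
  end

  sentences << current unless current.empty?
  sentences
end

puts naive_segment("Hello world. My name is Dr. Smith. How are you?")
```

Each rule vetoes a candidate boundary rather than proposing one, which keeps the rules independent and easy to extend. A production segmenter needs far more rules, and per-language variants of them, than this sketch shows.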
551 stars
16 watching
55 forks
Language: Ruby
last commit: 3 months ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
lfcipriani/punkt-segmenter | An implementation of a sentence boundary detection algorithm in Ruby. | 92 |
uglytoad/pragmaticsegmenternet | A C# implementation of sentence boundary detection with rule-based approach. | 33 |
diasks2/pragmatic_tokenizer | A multilingual tokenizer to split strings into tokens, handling various language and formatting nuances. | 90 |
nipunsadvilkar/pysbd | A Python package for out-of-the-box sentence boundary detection using rule-based algorithms. | 807 |
apohllo/srx-english | A Ruby library containing English sentence and word segmentation rules based on the SRX standard. | 18 |
tkellen/ruby-ngram | Breaks text into contiguous sequences of words or phrases | 12 |
diasks2/chat_correct | A tool that highlights errors in user input to help improve English language skills | 43 |
louismullie/scalpel | A Ruby library that uses a simple rule-based approach to segment text into sentences. | 51 |
6/tiny_segmenter | A Ruby port of a Japanese text tokenization algorithm | 21 |
lartpang/pysodmetrics | A library providing an implementation of various metrics for object segmentation and saliency detection in computer vision. | 144 |
diasks2/word_count_analyzer | An analyzer tool to account for variations in word count calculations | 20 |
cslu-nlp/detectormorse | A tool for automatically detecting sentence boundaries in natural language text using machine learning and handcrafted features. | 90 |
dcjones/proseg | An open-source software package for probabilistic cell segmentation in spatial transcriptomics | 45 |
juntang-zhuang/shelfnet | An implementation of a lightweight semantic segmentation model with real-time performance capabilities | 252 |
aalto-speech/morfessor | A tool for unsupervised and semi-supervised morphological segmentation of text data | 185 |