morfessor
Segmenter
Tools and algorithms for morphological segmentation in natural language processing
Morfessor is a tool for unsupervised and semi-supervised morphological segmentation
185 stars
23 watching
29 forks
Language: Python
Last commit: about 4 years ago
Linked from 1 awesome list
python, segmentation, subword-segmentation, subword-units
Related projects:
Repository | Description | Stars |
---|---|---|
recski/hunparse | An NLTK-based parser that provides morphological annotation for languages using KR-style annotations. | 4 |
diasks2/pragmatic_segmenter | A rule-based sentence boundary detection gem that works across many languages | 553 |
machinalis/yalign | Automates the process of extracting parallel sentences from comparable corpora to aid in statistical machine translation | 127 |
fnl/segtok | Provides tools for splitting text into sentences and words | 170 |
zijundeng/pytorch-semantic-segmentation | Provides PyTorch implementations of various models and pipelines for semantic segmentation in deep learning. | 1,724 |
hszhao/semseg | A PyTorch implementation of semantic segmentation models with support for multiprocessing training and various backbones. | 1,343 |
nvidia/semantic-segmentation | Monorepo implementing PyTorch-based neural network architecture for image segmentation | 1,777 |
remixman/pythonlexto | A Python wrapper around a Java library for segmenting Thai text into individual words | 3 |
amir-zeldes/rftokenizer | A tokenizer for segmenting words into morphological components | 27 |
adbar/simplemma | Lemmatization tool for natural language processing | 145 |
apohllo/srx-english | A Ruby library providing English sentence and word segmentation rules based on the SRX standard. | 18 |
lfcipriani/punkt-segmenter | Port of the NLTK Punkt sentence segmentation algorithm in Ruby | 92 |
ikawaha/kagome | A Japanese morphological analyzer that splits words into grammatical components and segments phrases for efficient text processing | 827 |
louismullie/scalpel | A Ruby library that uses a simple rule-based approach to segment sentences into individual words or phrases. | 51 |
cslu-nlp/detectormorse | A tool for automatically detecting sentence boundaries in natural language text using machine learning and handcrafted features. | 90 |