morfessor

segmenter

A tool for unsupervised and semi-supervised morphological segmentation in text data

Morfessor is a tool for unsupervised and semi-supervised morphological segmentation

GitHub

186 stars
23 watching
29 forks
Language: Python
last commit: over 4 years ago
Linked from 1 awesome list

pythonsegmentationsubword-segmentationsubword-units

Backlinks from these awesome lists:

Related projects:

Repository Description Stars
recski/hunparse An NLTK-based parser that provides morphological annotation for languages using KR-style annotations. 4
diasks2/pragmatic_segmenter A rule-based sentence boundary detection gem that works across many languages 559
machinalis/yalign Automates the process of extracting parallel sentences from comparable corpora to aid in statistical machine translation 127
fnl/segtok Provides tools for splitting text into sentences and words 171
zijundeng/pytorch-semantic-segmentation Provides PyTorch implementations of various models and pipelines for semantic segmentation in deep learning. 1,729
hszhao/semseg A PyTorch implementation of semantic segmentation models with support for multiprocessing training and various backbones. 1,347
nvidia/semantic-segmentation Monorepo implementing PyTorch-based neural network architecture for image segmentation 1,787
remixman/pythonlexto A Python wrapper around a Java library for segmenting Thai text into individual words 3
amir-zeldes/rftokenizer A tokenizer for segmenting words into morphological components 27
adbar/simplemma Lemmatization tool for natural language processing 146
apohllo/srx-english A Ruby library providing sentence segmentation rules based on the SRX standard for English language text processing. 18
lfcipriani/punkt-segmenter A Ruby port of the NLTK algorithm to detect sentence boundaries in unstructured text 92
ikawaha/kagome A Japanese morphological analyzer that splits words into grammatical components and segments phrases for efficient text processing 833
louismullie/scalpel A Ruby library that uses a simple rule-based approach to segment sentences into individual words or phrases. 51
cslu-nlp/detectormorse A tool for automatically detecting sentence boundaries in natural language text using machine learning and handcrafted features. 90