simplemma
Lemmatizer
Lemmatization tool for natural language processing
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
146 stars
7 watching
12 forks
Language: Python
Last commit: 12 months ago
Linked from 1 awesome list
corpus-tools, language-detection, language-identification, lemmatiser, lemmatization, lemmatizer, low-resource-nlp, morphological-analysis, nlp, tokenization, tokenizer, wordlist
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A Ruby library that provides a lemmatizer for text in English. | 108 |
| | A lemmatiser tool for multiple languages using affix rules and supervised learning from full-form dictionaries. | 36 |
| | Lemmatizer for Danish and Swedish languages | 76 |
| | A curated collection of German language resources and tools for natural language processing | 453 |
| | A suite of tools for linguistic analysis and correction, including finite state automata manipulation and string correction algorithms. | 28 |
| | An experimental project for fine-tuning the NLB language model on a specific dataset and evaluating its performance on translation tasks. | 7 |
| | A Python wrapper around the Thai word segmenter LexTo, allowing developers to easily integrate it into their applications. | 1 |
| | A dataset of Arabic book reviews for natural language processing tasks | 44 |
| | A Python binding to a C++ NLP tool for Dutch language processing tasks | 47 |
| | A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336 |
| | An LDA-based text clustering pipeline for multiple languages | 82 |
| | Provides tools for part of speech tagging and lemmatization across multiple languages using machine learning models. | 18 |
| | A tool for comparing the performance of different language generation systems. | 467 |
| | Tools for normalizing and deriving sentiment from Arabic text | 26 |
| | Tools for detecting the language of unstructured text in Elixir applications | 116 |