pyvi
VN Text Toolkit
A toolkit for processing Vietnamese text, with tokenization, part-of-speech tagging, accent removal, and accent restoration.
Python Vietnamese Core NLP Toolkit
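To make the accent-removal feature concrete, here is a minimal standard-library sketch of what stripping Vietnamese diacritics involves. It is not pyvi's implementation, and the function name `strip_vietnamese_accents` is hypothetical; the pyvi README documents its own helper for this as `ViUtils.remove_accents`.

```python
import unicodedata


def strip_vietnamese_accents(text: str) -> str:
    """Illustrative helper (hypothetical name, not part of pyvi)."""
    # "đ" and "Đ" are separate letters with no canonical decomposition,
    # so Unicode normalization alone leaves them unchanged; map them by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the base characters and drop every combining mark.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(strip_vietnamese_accents("Trường đại học Bách Khoa Hà Nội"))
```

Removing accents is a simple character mapping, as above. Restoring them is not: one unaccented syllable can correspond to several accented words, so that direction has to choose among candidates using context.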
245 stars
12 watching
49 forks
Language: Jupyter Notebook
last commit: about 2 months ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
phuonglh/vn.vitk | A toolkit for processing and analyzing text data in Vietnamese, with tools for word segmentation, part-of-speech tagging, and dependency parsing. | 214 |
undertheseanlp/underthesea | A comprehensive toolkit for processing and analyzing Vietnamese language texts | 1,414 |
vncorenlp/vncorenlp | A Vietnamese natural language processing toolkit providing annotation pipelines for key NLP components such as word segmentation and named entity recognition. | 592 |
vinairesearch/phobert | Pre-trained language models for Vietnamese NLP tasks | 663 |
venuv/langchain_yt_tools | Custom tools to extract text from YouTube video transcripts | 62 |
nlp-uoregon/trankit | A lightweight toolkit for multilingual natural language processing tasks using transformer-based architectures. | 736 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
wichert/lingua | A tool to simplify translating software source code and check existing translations. | 46 |
skynav/ttt | A collection of tools and conversion utilities for the W3C Timed Text Markup Language (TTML) | 74 |
fyu/lsun | Provides tools and data for training image classification models using the LSUN dataset. | 544 |
divvun/libdivvun | A library for Finite-State Morphology and Constraint Grammar based NLP tasks, providing tools for tokenisation, normalisation, grammar-checking and correction. | 9 |
vchahun/teny | Tools and techniques for improving machine translation in resource-constrained environments. | 3 |
t-vi/pytorch-tvmisc | A collection of utilities and tools for building and improving deep learning models in PyTorch | 468 |
kyubyong/wordvectors | Provides pre-trained word vectors for multiple languages to facilitate NLP tasks | 2,215 |
vocalpy/vak | A Python framework for training and applying neural networks to acoustic communication research | 78 |