ThaiToxicityTweetCorpus
Toxicity dataset
Corpus of annotated Thai tweets to analyze toxicity and sentiment
10 stars
4 watching
3 forks
Language: Jupyter Notebook
Last commit: about 4 years ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Named Entity Recognition for Thai Text using PyThaiNLP and custom implementation. | 53 |
| | A Thai language corpus and lexicon repository for natural language processing | 142 |
| | Analyzes sentiment in Thai text using machine learning algorithms and natural language processing techniques. | 12 |
| | A Python package for text processing and linguistic analysis focused on Thai language | 993 |
| | A Thai word tokenization library using Deep Neural Network | 421 |
| | A collection of datasets for natural language processing research in Thai, including word segmentation and review rating prediction. | 76 |
| | A Java library to tokenize Thai text into groups of characters | 18 |
| | An implementation of a word embedding technique using TensorFlow for Thai language processing | 11 |
| | A collection of Ukrainian Twitter texts for linguistic analysis and research | 15 |
| | A template-based text parsing library | 353 |
| | An article classification dataset created from news articles scraped from Prachathai.com with multiple benchmark models for multi-label classification | 16 |
| | A deep learning-based project for segmenting Thai text into words and annotating parts of speech with high accuracy. | 41 |
| | A toolset for collecting and analyzing tweets from Twitter | 367 |
| | Tools and techniques for improving machine translation in resource-constrained environments. | 3 |
| | A pre-trained BERT model designed to facilitate NLP research and development with limited Thai language resources | 6 |