Tatoeba-Challenge
Translation dataset pack
A collection of machine translation datasets and tools to support real-world low-resource scenarios
811 stars
23 watching
90 forks
Language: Makefile
Last commit: 6 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A multilingual dataset for sentiment analysis and emotion detection from movie subtitles. | 56 |
| | A collection of Ukrainian language tools and resources for machine translation, natural language processing, and text translation. | 30 |
| | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
| | A collection of tools and modeling code for a large multilingual Natural Language Understanding dataset | 541 |
| | A Go package providing an easy interface to use pre-trained NLP models from the HuggingFace repository for tasks like text classification and machine translation. | 293 |
| | Provides a collection of datasets for natural language processing in Ukrainian. | 57 |
| | A tool that simplifies the process of preparing and manipulating natural language processing datasets | 243 |
| | A collection of Go-based resources and tools for data science tasks | 879 |
| | A collection of linguistic datasets and benchmarks for natural language understanding tasks | 8 |
| | Generates training data from the Carla driving simulator in the KITTI dataset format for autonomous vehicle development | 108 |
| | An NLP library providing morphological analysis and language modeling tools for Uralic languages and others. | 71 |
| | A language model trained on Danish Wikipedia data for named entity recognition and masked language modeling | 9 |
| | A linguistic framework for natural language processing tasks. | 216 |
| | A large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain | 328 |
| | A dataset of Hungarian translations of human-language examples to test anaphora resolution algorithms | 1 |