Chatito
Dataset generator
A tool for generating datasets for AI chatbots and natural language processing tasks using a simple domain-specific language.
🎯🗯 Dataset generation for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
876 stars
28 watching
157 forks
Language: TypeScript
Last commit: about 1 year ago
Linked from 1 awesome list
Topics: chatbot, chatbots, chatito, dataset, dataset-generation, named-entity-recognition, nlg, nlp, nlu, text-classification
Related projects:
Repository | Description | Stars |
---|---|---|
candlewill/dialog_corpus | A collection of datasets used to train and improve chatbot systems in both English and Chinese. | 2,033 |
radi-cho/datasetgpt | A command-line interface to generate textual datasets with Large Language Models | 293 |
poio-nlp/poio-corpus | A collection of language resources extracted from publicly available sources. | 7 |
certainlyio/corona_dataset | A collection of data to train chatbots on COVID-19-related questions | 11 |
maluuba/geneva_datasets | Scripts to generate datasets for an image generation task using Generative Adversarial Networks and deep learning techniques | 37 |
fido-ai/ua-datasets | Provides a collection of datasets for natural language processing in Ukrainian. | 55 |
karthikncode/nlp-datasets | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
botman/studio | A bundle of tools and testing environment for developing chatbots using the Laravel PHP framework. | 330 |
instancio/instancio | Automates object creation and population with customizable data generation, reuse, and external feed integration for unit testing. | 923 |
chatopera/insuranceqa-corpus-zh | An insurance industry conversation corpus with pre-processed data for natural language processing and question answering tasks. | 1,020 |
pharo-ai/datasets | A Smalltalk library for loading and managing datasets as data frames. | 9 |
mirfan899/urdu | A collection of Urdu language datasets for various NLP tasks and applications | 71 |
ifttt/polo | A tool that generates sample data from database models for testing and development purposes | 776 |
philipperemy/timit | A collection of acoustic and phonetic speech data designed for training and evaluating automatic speech recognition systems | 294 |
abbey4799/cutegpt | A conversational language model developed to improve understanding of complex instructions and Chinese vocabulary. | 62 |