chazutsu

NLP dataset manager

A tool that simplifies preparing and manipulating natural language processing datasets.

The tool to make NLP datasets ready to use

GitHub

243 stars

14 watching

33 forks

Language: Python

last commit: almost 4 years ago

Linked from 1 awesome list

dataset, machine-learning, natural-language-processing


medium.com/chakki/how-to-load-text-datasets-before-youre-in-trouble-with-them-345cdb1f1b33

Backlinks from these awesome lists:

keon/awesome-nlp

Related projects:

Repository	Description	Stars
karthikncode/nlp-datasets	A curated list of Natural Language Processing datasets used to train and evaluate NLP models.	919
kimtaro/ve	A linguistic framework for natural language processing tasks.	216
chartbeat-labs/textacy	A Python library providing NLP tools and utilities built on top of spaCy for text processing and analysis.	2,217
alexa/massive	A collection of tools and modeling code for a large multilingual Natural Language Understanding dataset	541
languagemachines/luiginlp	A workflow management system for Natural Language Processing tasks	21
goru001/inltk	A comprehensive toolkit for Natural Language Processing tasks in Indic languages, providing pre-trained models and datasets.	825
jd-aig/nlp_baai	A collection of natural language processing models and tools for collaboration on a joint project between BAAI and JDAI.	254
radi-cho/datasetgpt	A command-line interface to generate textual datasets with Large Language Models	293
pks/zipf	A Ruby NLP library providing tools and data structures for natural language processing tasks	3
michael-wzhu/promptcblue	A large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain	328
piskvorky/gensim-data	A repository of pre-trained NLP models and corpora for text processing.	990
dayyass/dayyass	A collection of libraries and tools for natural language processing and reinforcement learning.	39
dkogan/vnlog	A toolkit for manipulating tabular ASCII data with normal UNIX tools.	161
zaibacu/rita-dsl	A DSL for building custom NLP patterns from manual language rules	65
leks-forever/nllb-tuning	An experimental project for fine-tuning the NLLB language model on a specific dataset and evaluating its performance on translation tasks.	7