laion-datasets

AI dataset collection

A collection of large datasets for training and evaluating AI models, aimed at improving image-text matching.

Descriptions of and pointers to LAION datasets

GitHub

239 stars

6 watching

9 forks

Language: HTML

last commit: over 3 years ago

projects.laion.ai/laion-datasets

Related projects:

Repository	Description	Stars
faceperceiver/laion-face	Provides pre-trained face detection and analysis models using large-scale image-text data	281
aitutorials/datasets	A comprehensive collection of datasets from various AI-related sources worldwide	46
laion-ai/clip_benchmark	Evaluates and compares the performance of various CLIP-like models on different tasks and datasets	632
laion-ai/clap	A library for learning audio embeddings from text and audio data using contrastive language-audio pretraining	1,457
laion-ai/aesthetic-predictor	Predicts aesthetic quality of images using CLIP model embeddings	491
logpai/loghub	Provides a collection of system log datasets for AI-driven analytics research	1,883
aisegmentcn/matting_human_datasets	A large dataset of human matting images and corresponding results for training person segmentation models	615
mirfan899/urdu	A collection of Urdu language datasets for various NLP tasks and applications	71
niraj-lunavat/artificial-intelligence	A comprehensive resource for learning and exploring Artificial Intelligence (AI) concepts and applications	1,667
karthikncode/nlp-datasets	A curated list of Natural Language Processing datasets used to train and evaluate NLP models	919
pratyushmaini/llm_dataset_inference	Detects whether a given text sequence is part of the training data used to train a large language model	23
radi-cho/datasetgpt	A command-line interface to generate textual datasets with Large Language Models	293
lemondan/humanparsing-dataset	A collection of detailed pixel-wise annotations for fashion images used in human parsing research	213
poio-nlp/poio-corpus	A collection of language resources extracted from publicly available sources	7
aiplanethub/beyondllm	An open-source toolkit for building and evaluating large language models	267