laion-datasets

AI dataset collection

A repository that collects large datasets for training and testing AI models, aimed at improving image-text matching.

Descriptions of and pointers to LAION datasets

GitHub

235 stars
6 watching
9 forks
Language: HTML
Last commit: about 2 years ago

Related projects:

Repository | Description | Stars
faceperceiver/laion-face | Provides pre-trained face detection and analysis models using large-scale image-text data | 278
aitutorials/datasets | A comprehensive collection of datasets from various AI-related sources worldwide | 46
laion-ai/clip_benchmark | Evaluates and compares the performance of various CLIP-like models on different tasks and datasets | 615
laion-ai/clap | A library for learning audio embeddings from text and audio data using contrastive language-audio pretraining | 1,415
laion-ai/aesthetic-predictor | Predicts aesthetic quality of images using CLIP model embeddings | 487
logpai/loghub | Provides a collection of system log datasets for AI-driven analytics research | 1,833
aisegmentcn/matting_human_datasets | A large dataset of human matting images and corresponding results for training person segmentation models | 610
mirfan899/urdu | A collection of Urdu language datasets for various NLP tasks and applications | 71
niraj-lunavat/artificial-intelligence | A comprehensive resource for learning and exploring Artificial Intelligence (AI) concepts and applications | 1,652
karthikncode/nlp-datasets | A curated list of Natural Language Processing datasets used to train and evaluate NLP models | 919
pratyushmaini/llm_dataset_inference | Detects whether a given text sequence is part of the training data used to train a large language model | 23
radi-cho/datasetgpt | A command-line interface to generate textual datasets with Large Language Models | 293
lemondan/humanparsing-dataset | A collection of detailed pixel-wise annotations for fashion images used in human parsing research | 211
poio-nlp/poio-corpus | A collection of language resources extracted from publicly available sources | 7
aiplanethub/beyondllm | An open-source toolkit for building and evaluating large language models | 261