distilabel

AI data generator

A framework for generating synthetic data and AI feedback to accelerate AI development

Distilabel is a framework for generating synthetic data and AI feedback, built for engineers who need fast, reliable, and scalable pipelines based on verified research papers.

GitHub

2k stars
17 watching
129 forks
Language: Python
Last commit: 6 days ago
Topics: ai, huggingface, llms, openai, python, rlaif, rlhf, synthetic-data, synthetic-dataset-generation

Related projects:

Repository | Description | Stars
gretelai/gretel-synthetics | A toolkit for generating synthetic data while preserving differential privacy | 597
mage-os-lab/module-catalog-data-ai | Automates product content generation using AI to improve SEO and customer experience | 25
jolibrain/joligen | An integrated framework for training custom generative AI models | 244
dmey/synthia | Software for generating synthetic multivariate data with statistical properties preserved | 57
intellabs/fastrag | A framework for efficient and optimized retrieval augmented generative pipelines using state-of-the-art LLMs and Information Retrieval | 1,336
fetchai/uagents | A framework for creating autonomous AI agents with simple decorators and cryptographic security | 987
xiyuzhai-husky-lang/husky | A new programming language designed to support the development of hybrid AI systems | 85
rbbrdckybk/ai-art-generator | Automates large batches of AI-generated artwork locally using GPU acceleration | 634
arrudagates/substate | A utility library for generating and manipulating unique identifiers in a Substrate-based storage system | 6
eli64s/readme-ai | Automates the generation of comprehensive README files using AI-powered language models | 1,590
yuyz0112/dewhale | An AI-powered development platform that generates code and stores it on GitHub, allowing developers to customize and integrate it into their workflows | 1,262
archinetai/audio-diffusion-pytorch | An audio generation library that uses diffusion models to produce high-quality audio samples from noise or text input | 1,961
iiis-li-group/openfe | An automated feature generation tool for tabular data | 782
bin123apple/autocoder | An AI model designed to generate and execute code automatically | 814
code-kern-ai/refinery | A tool to help data scientists manage and annotate natural language data for training AI models | 1,402