distilabel
AI data generator
A framework for generating synthetic data and AI feedback to accelerate AI development
Distilabel is a framework for generating synthetic data and AI feedback, built for engineers who need fast, reliable, and scalable pipelines based on verified research papers.
2k stars
17 watching
138 forks
Language: Python
Last commit: 11 months ago
Topics: ai, huggingface, llms, openai, python, rlaif, rlhf, synthetic-data, synthetic-dataset-generation
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A toolkit for generating synthetic data while preserving differential privacy | 602 |
| | Automates product content generation using AI to improve SEO and customer experience. | 26 |
| | An integrated framework for training custom generative AI models | 246 |
| | Software for generating synthetic multivariate data with statistical properties preserved | 57 |
| | A framework for efficient and optimized retrieval augmented generative pipelines using state-of-the-art LLMs and Information Retrieval. | 1,392 |
| | A framework for creating autonomous AI agents with simple decorators and cryptographic security. | 1,158 |
| | A new programming language designed to support the development of hybrid AI systems. | 86 |
| | Automates large batches of AI-generated artwork locally using GPU acceleration. | 633 |
| | A utility library for generating and manipulating unique identifiers in a Substrate-based storage system | 6 |
| | Automates the generation of comprehensive README files using AI-powered language models. | 1,665 |
| | An AI-powered development platform that generates code and stores it on GitHub, allowing developers to customize and integrate it into their workflows. | 1,301 |
| | An audio generation library that uses diffusion models to produce high-quality audio samples from noise or text input | 1,975 |
| | An automated feature generation tool for tabular data | 806 |
| | An AI model designed to generate and execute code automatically | 816 |
| | A tool to help data scientists manage and annotate natural language data for training AI models | 1,405 |