alignment-handbook
Alignment Recipes
Provides recipes and guidelines for training language models to align with human and AI preferences
Robust recipes to align language models with human and AI preferences
5k stars
112 watching
417 forks
Language: Python
last commit: about 2 months ago
Topics: llm, rlhf, transformers
Related projects:
Repository | Description | Stars |
---|---|---|
zjh-819/llmdatahub | A curated collection of high-quality datasets for training large language models. | 2,708 |
thunlp/promptpapers | A curated list of papers on prompt-based tuning for pre-trained language models, providing insights and advancements in the field. | 4,112 |
ethanyanjiali/minchatgpt | This project demonstrates the effectiveness of reinforcement learning from human feedback (RLHF) in improving small language models like GPT-2. | 214 |
openai/lm-human-preferences | Training methods and tools for fine-tuning language models using human preferences. | 1,240 |
huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,308 |
huggingface/lerobot | A platform providing pre-trained models, datasets, and tools for robotics with focus on imitation learning and reinforcement learning. | 7,874 |
huggingface/peft | An efficient method for fine-tuning large pre-trained models by adapting only a small fraction of their parameters. | 16,699 |
haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions. | 20,683 |
stability-ai/stablelm | Develops and maintains large language models with improved stability and performance. | 15,829 |
dair-ai/ml-papers-explained | An explanation of key concepts and advancements in the field of Machine Learning. | 7,352 |
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357 |
haifengl/smile | A comprehensive machine learning framework that provides a wide range of algorithms and data structures for tasks such as classification, regression, clustering, and visualization. | 6,066 |
instruction-tuning-with-gpt-4/gpt-4-llm | This project generates instruction-following data using GPT-4 to fine-tune large language models for real-world tasks. | 4,244 |
huggingface/text-generation-inference | A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation. | 9,456 |
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms. | 3,968 |