minGPT
Transformer model
A minimal PyTorch implementation of a transformer-based language model
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
20k stars
259 watching
3k forks
Language: Python
last commit: 6 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
minimaxir/gpt-2-simple | A tool for retraining and fine-tuning the OpenAI GPT-2 text generation model on new datasets. | 3,398 |
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,997 |
openai/gpt-2 | A repository providing code and models for research into language modeling and multitask learning | 22,644 |
cornellius-gp/gpytorch | A library for creating scalable and flexible Gaussian process models with ease | 3,605 |
opennmt/ctranslate2 | A high-performance inference engine for transformer models | 3,467 |
keyvank/femtogpt | A Rust implementation of a minimal Generative Pretrained Transformer architecture. | 845 |
mshumer/gpt-prompt-engineer | A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus. | 9,411 |
google-research/vision_transformer | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax | 10,620 |
ther1d/shell_gpt | A command-line tool using AI-powered language models to generate shell commands and code snippets | 9,933 |
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,926 |
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,342 |
eleutherai/pythia | Analyzing knowledge development and evolution in large language models during training | 2,309 |
openai/finetune-transformer-lm | Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,167 |
rasbt/llms-from-scratch | Developing and pretraining a GPT-like Large Language Model from scratch | 35,405 |
gpt-engineer-org/gpt-engineer | An AI-powered platform to experiment with software engineering tasks using natural language input. | 52,634 |