minGPT
Transformer model
A minimal PyTorch implementation of a transformer-based language model
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
20k stars
257 watching
3k forks
Language: Python
Last commit: 3 months ago

Related projects:
Repository | Description | Stars |
---|---|---|
minimaxir/gpt-2-simple | A tool for retraining and fine-tuning the OpenAI GPT-2 text generation model on new datasets. | 3,397 |
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941 |
openai/gpt-2 | A repository providing code and models for research into language modeling and multitask learning | 22,516 |
cornellius-gp/gpytorch | A library for creating scalable and flexible Gaussian process models with ease | 3,580 |
opennmt/ctranslate2 | A high-performance library for efficient inference with Transformer models on CPUs and GPUs. | 3,404 |
keyvank/femtogpt | A Rust implementation of a minimal Generative Pretrained Transformer architecture. | 834 |
mshumer/gpt-prompt-engineer | A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus. | 9,368 |
google-research/vision_transformer | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax | 10,450 |
ther1d/shell_gpt | A command-line tool using AI-powered language models to generate shell commands and code snippets | 9,672 |
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,895 |
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,335 |
eleutherai/pythia | Analyzing knowledge development and evolution in large language models during training | 2,280 |
openai/finetune-transformer-lm | Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,160 |
rasbt/llms-from-scratch | Developing and pretraining a GPT-like Large Language Model from scratch | 32,908 |
gpt-engineer-org/gpt-engineer | An AI-powered development tool that uses natural language to generate and execute code. | 52,392 |