minGPT

Transformer model

A minimal PyTorch implementation of a transformer-based language model

A minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer) training.

GitHub

20k stars
259 watching
3k forks
Language: Python
Last commit: 6 months ago

Related projects:

Repository | Description | Stars
minimaxir/gpt-2-simple | A tool for retraining and fine-tuning the OpenAI GPT-2 text generation model on new datasets | 3,398
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations | 6,997
openai/gpt-2 | A repository providing code and models for research into language modeling and multitask learning | 22,644
cornellius-gp/gpytorch | A library for creating scalable and flexible Gaussian process models with ease | 3,605
opennmt/ctranslate2 | A high-performance inference engine for transformer models | 3,467
keyvank/femtogpt | A Rust implementation of a minimal Generative Pretrained Transformer architecture | 845
mshumer/gpt-prompt-engineer | A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus | 9,411
google-research/vision_transformer | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax | 10,620
ther1d/shell_gpt | A command-line tool using AI-powered language models to generate shell commands and code snippets | 9,933
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,926
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,342
eleutherai/pythia | Analyzing knowledge development and evolution in large language models during training | 2,309
openai/finetune-transformer-lm | This project provides code and model for improving language understanding through generative pre-training using a transformer-based architecture | 2,167
rasbt/llms-from-scratch | Developing and pretraining a GPT-like Large Language Model from scratch | 35,405
gpt-engineer-org/gpt-engineer | An AI-powered platform to experiment with software engineering tasks using natural language input | 52,634