# finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training": a transformer language model is first pre-trained generatively on unlabeled text, then fine-tuned to improve performance on language-understanding tasks.
~2k stars · 73 watching · 503 forks · Language: Python · last commit: almost 6 years ago
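The "generative pre-training" stage optimizes a standard next-token language-modeling objective: at each position the model predicts the following token, and the loss is the average cross-entropy. As a minimal, framework-free sketch (NumPy only; the function name and shapes are illustrative, not this repository's API):

```python
import numpy as np

def lm_loss(logits, tokens):
    """Average next-token cross-entropy: position t predicts token t+1.

    logits: (T, V) array of unnormalized scores over a V-token vocabulary.
    tokens: (T,) array of token ids.
    """
    # Shift by one: logits at positions 0..T-2 predict tokens 1..T-1.
    logits = logits[:-1]
    targets = tokens[1:]
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each target token, averaged.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Toy example: vocabulary of 4 tokens, sequence of length 3.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4))
tokens = np.array([1, 3, 0])
loss = lm_loss(logits, tokens)
```

With uniform logits the loss reduces to log(V), the entropy of a uniform guess over the vocabulary; fine-tuning then continues training from the pre-trained weights with a task-specific head.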
## Related projects

| Repository | Description | Stars |
|---|---|---|
| openai/lm-human-preferences | Training methods and tools for fine-tuning language models using human preferences | 1,229 |
| huggingface/pytorch-openai-transformer-lm | A PyTorch implementation of OpenAI's transformer language model with pre-trained weights and fine-tuning support | 1,511 |
| german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding | 23 |
| microsoft/mpnet | A pre-training method for language understanding that combines masked and permuted language modeling, with code for implementation and fine-tuning | 288 |
| flagai-open/aquila2 | Pre-trained language models and tools for fine-tuning and evaluation | 437 |
| zhuiyitechnology/gau-alpha | An implementation of a Gated Attention Unit-based Transformer for natural language processing tasks | 96 |
| vhellendoorn/code-lms | A guide to using pre-trained large language models for source code analysis and generation | 1,782 |
| fastnlp/cpt | A pre-trained transformer for Chinese natural language understanding and generation tasks | 481 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
| google-research/flan | Tools and datasets for fine-tuning language models on specific tasks | 1,474 |
| microsoft/megatron-deepspeed | A research tool for training large transformer language models at scale | 1,895 |
| open-mmlab/mmengine | A flexible, configurable framework for training deep learning models with PyTorch | 1,179 |
| bigscience-workshop/megatron-deepspeed | Tools and scripts for training large transformer language models at scale | 1,335 |
| proger/uk4b | Pretraining and fine-tuning techniques for language models using metadata-conditioned text generation | 18 |
| luogen1996/lavin | An open-source implementation of a vision-and-language instructed large language model | 508 |