finetune-transformer-lm

Language model trainer

This project provides the code and model for improving language understanding through generative pre-training with a transformer-based architecture.

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"

Stars: 2k
Watching: 73
Forks: 503
Language: Python
Last commit: almost 6 years ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| openai/lm-human-preferences | Training methods and tools for fine-tuning language models using human preferences | 1,229 |
| huggingface/pytorch-openai-transformer-lm | Implementing OpenAI's transformer language model in PyTorch with pre-trained weights and fine-tuning capabilities | 1,511 |
| german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding | 23 |
| microsoft/mpnet | Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning | 288 |
| flagai-open/aquila2 | Provides pre-trained language models and tools for fine-tuning and evaluation | 437 |
| zhuiyitechnology/gau-alpha | An implementation of a Gated Attention Unit-based Transformer model for natural language processing tasks | 96 |
| vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,782 |
| fastnlp/cpt | A pre-trained transformer model for natural language understanding and generation tasks in Chinese | 481 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
| google-research/flan | A repository providing tools and datasets to fine-tune language models for specific tasks | 1,474 |
| microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,895 |
| open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch | 1,179 |
| bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,335 |
| proger/uk4b | Develops pretraining and finetuning techniques for language models using metadata-conditioned text generation | 18 |
| luogen1996/lavin | An open-source implementation of a vision-language instructed large language model | 508 |