# finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training": a transformer language model is first pre-trained generatively on unlabeled text, then fine-tuned to improve performance on language-understanding tasks.
~2k stars · 73 watching · 503 forks · Language: Python · last commit: almost 6 years ago
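The "generative pre-training" stage optimizes a standard next-token language-modeling objective: at each position the model predicts the following token, and the loss is the average cross-entropy. As a minimal, framework-free sketch (NumPy only; the function name and shapes are illustrative, not this repository's API):

```python
import numpy as np

def lm_loss(logits, tokens):
    """Average next-token cross-entropy: position t predicts token t+1.

    logits: (T, V) array of unnormalized scores over a V-token vocabulary.
    tokens: (T,) array of token ids.
    """
    # Shift by one: logits at positions 0..T-2 predict tokens 1..T-1.
    logits = logits[:-1]
    targets = tokens[1:]
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each target token, averaged.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Toy example: vocabulary of 4 tokens, sequence of length 3.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4))
tokens = np.array([1, 3, 0])
loss = lm_loss(logits, tokens)
```

With uniform logits the loss reduces to log(V), the entropy of a uniform guess over the vocabulary; fine-tuning then continues training from the pre-trained weights with a task-specific head.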
## Related projects

| Repository | Description | Stars |
|---|---|---|
| openai/lm-human-preferences | Training methods and tools for fine-tuning language models using human preferences | 1,229 |
| huggingface/pytorch-openai-transformer-lm | A PyTorch implementation of OpenAI's transformer language model with pre-trained weights and fine-tuning support | 1,511 |
| german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding | 23 |
| microsoft/mpnet | A pre-training method for language understanding that combines masked and permuted language modeling, with code for implementation and fine-tuning | 288 |
| flagai-open/aquila2 | Pre-trained language models and tools for fine-tuning and evaluation | 437 |
| zhuiyitechnology/gau-alpha | An implementation of a Gated Attention Unit-based Transformer for natural language processing tasks | 96 |
| vhellendoorn/code-lms | A guide to using pre-trained large language models for source code analysis and generation | 1,782 |
| fastnlp/cpt | A pre-trained transformer for Chinese natural language understanding and generation tasks | 481 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
| google-research/flan | Tools and datasets for fine-tuning language models on specific tasks | 1,474 |
| microsoft/megatron-deepspeed | A research tool for training large transformer language models at scale | 1,895 |
| open-mmlab/mmengine | A flexible, configurable framework for training deep learning models with PyTorch | 1,179 |
| bigscience-workshop/megatron-deepspeed | Tools and scripts for training large transformer language models at scale | 1,335 |
| proger/uk4b | Pretraining and fine-tuning techniques for language models using metadata-conditioned text generation | 18 |
| luogen1996/lavin | An open-source implementation of a vision-and-language instructed large language model | 508 |