MPNet
Language model trainer
Develops a method for pre-training language understanding models that combines masked language modeling (MLM) and permuted language modeling (PLM), and provides code for pre-training and fine-tuning.
Paper: MPNet: Masked and Permuted Pre-training for Language Understanding (https://arxiv.org/pdf/2004.09297.pdf)
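
As a quick illustration of the pre-trained model in use, here is a minimal sketch assuming the Hugging Face Transformers port of MPNet and its public `microsoft/mpnet-base` checkpoint; neither the checkpoint name nor the pooling choice below comes from this repository's own scripts.

```python
# Minimal sketch: load a pre-trained MPNet checkpoint for feature extraction
# or as a starting point for fine-tuning. Assumes the Hugging Face
# Transformers port of MPNet ("microsoft/mpnet-base" is the public base
# checkpoint), not this repository's own training scripts.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModel.from_pretrained("microsoft/mpnet-base")

# Encode a sentence; use the first-token hidden state as a simple
# sentence-level representation.
inputs = tokenizer("MPNet combines masked and permuted pre-training.",
                   return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
sentence_repr = hidden[:, 0]                    # shape: (1, 768)
print(sentence_repr.shape)
```

For a downstream task, the same checkpoint can instead be loaded with a task head (e.g. `AutoModelForSequenceClassification`) and fine-tuned end to end; the first-token pooling above is just one common choice.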
288 stars · 13 watching · 33 forks
Language: Python
Last commit: over 3 years ago

Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| openai/finetune-transformer-lm | Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,167 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types. | 601 |
| brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks. | 230 |
| huggingface/nanotron | A pre-training framework for large language models using 3D parallelism and scalable training techniques. | 1,332 |
| elanmart/psmm | An implementation of a neural network model for character-level language modeling. | 50 |
| microsoft/unicoder | Provides pre-trained models and code for understanding and generation tasks in multiple languages. | 89 |
| openbmb/cpm-live | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. | 511 |
| moses-smt/nplm | A toolkit for training neural network language models. | 14 |
| german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding. | 23 |
| sihengli99/textbind | Enables large language models to generate multi-turn multimodal instruction-response conversations from image-caption pairs with minimal annotation. | 47 |
| vhellendoorn/code-lms | A guide to using pre-trained large language models for source code analysis and generation. | 1,789 |
| cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes than existing models. | 806 |
| ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel at natural language understanding, mathematical computation, and code generation. | 182 |
| wyy-123-xyy/ra-fed | A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs. | 6 |
| chendelong1999/polite-flamingo | Develops training methods to improve the politeness and natural flow of multimodal large language models. | 63 |