MPNet
Language model trainer
Develops a method for pre-training language understanding models that combines masked language modeling (MLM) and permuted language modeling (PLM), and provides code for pre-training and fine-tuning.
Paper: MPNet: Masked and Permuted Pre-training for Language Understanding (https://arxiv.org/pdf/2004.09297.pdf)
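
As a quick illustration of the pre-trained model in use, here is a minimal sketch assuming the Hugging Face Transformers port of MPNet and its public `microsoft/mpnet-base` checkpoint; neither the checkpoint name nor the pooling choice below comes from this repository's own scripts.

```python
# Minimal sketch: load a pre-trained MPNet checkpoint for feature extraction
# or as a starting point for fine-tuning. Assumes the Hugging Face
# Transformers port of MPNet ("microsoft/mpnet-base" is the public base
# checkpoint), not this repository's own training scripts.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModel.from_pretrained("microsoft/mpnet-base")

# Encode a sentence; use the first-token hidden state as a simple
# sentence-level representation.
inputs = tokenizer("MPNet combines masked and permuted pre-training.",
                   return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
sentence_repr = hidden[:, 0]                    # shape: (1, 768)
print(sentence_repr.shape)
```

For a downstream task, the same checkpoint can instead be loaded with a task head (e.g. `AutoModelForSequenceClassification`) and fine-tuned end to end; the first-token pooling above is just one common choice.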
288 stars · 13 watching · 33 forks
Language: Python
Last commit: over 3 years ago

Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| openai/finetune-transformer-lm | Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,167 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types. | 601 |
| brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks. | 230 |
| huggingface/nanotron | A pre-training framework for large language models using 3D parallelism and scalable training techniques. | 1,332 |
| elanmart/psmm | An implementation of a neural network model for character-level language modeling. | 50 |
| microsoft/unicoder | Provides pre-trained models and code for understanding and generation tasks in multiple languages. | 89 |
| openbmb/cpm-live | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. | 511 |
| moses-smt/nplm | A toolkit for training neural network language models. | 14 |
| german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding. | 23 |
| sihengli99/textbind | Enables large language models to generate multi-turn multimodal instruction-response conversations from image-caption pairs with minimal annotation. | 47 |
| vhellendoorn/code-lms | A guide to using pre-trained large language models for source code analysis and generation. | 1,789 |
| cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes than existing models. | 806 |
| ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel at natural language understanding, mathematical computation, and code generation. | 182 |
| wyy-123-xyy/ra-fed | A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs. | 6 |
| chendelong1999/polite-flamingo | Develops training methods to improve the politeness and natural flow of multimodal large language models. | 63 |