MPNet
Language model trainer
Develops a method for pre-training language understanding models that combines masked language modeling and permuted language modeling, and provides implementation and fine-tuning code.
Paper: MPNet: Masked and Permuted Pre-training for Language Understanding (https://arxiv.org/pdf/2004.09297.pdf)
288 stars
13 watching
33 forks
Language: Python
Last commit: about 3 years ago

Related projects:
Repository | Description | Stars
---|---|---
openai/finetune-transformer-lm | Provides code and a pre-trained model for improving language understanding through generative pre-training with a transformer architecture | 2,160
csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230
huggingface/nanotron | A library for training large language models with parallelism and mixed-precision training methods | 1,244
elanmart/psmm | An implementation of a neural network model for character-level language modeling | 50
microsoft/unicoder | Provides pre-trained models and code for understanding and generation tasks in multiple languages | 88
openbmb/cpm-live | A live training platform for large-scale deep learning models, allowing community participation in model development and deployment | 511
moses-smt/nplm | A toolkit for training neural network language models | 14
german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding | 23
sihengli99/textbind | Enables large language models to generate multi-turn multimodal instruction-response conversations from image-caption pairs with minimal annotation | 48
vhellendoorn/code-lms | A guide to using pre-trained large language models for source code analysis and generation | 1,782
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes than existing models | 804
ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel at natural language understanding, mathematical computation, and code generation | 180
wyy-123-xyy/ra-fed | A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs | 6
chendelong1999/polite-flamingo | Develops training methods that improve the politeness and natural flow of multimodal large language models | 63
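The masked-and-permuted pre-training idea summarized at the top of this page can be illustrated with a small sketch. This is a hypothetical helper, not the repository's actual API: it permutes the token positions, keeps a prefix of the permutation as visible context, and marks the rest as prediction targets, with `[MASK]` placeholders occupying the target positions so that every prediction step still sees position information for the full sentence (the property that distinguishes MPNet-style inputs from plain permuted language modeling). The function name, `predict_ratio` parameter, and token handling are all illustrative assumptions.

```python
import random


def mpnet_inputs(tokens, predict_ratio=0.15, seed=0):
    """Hypothetical sketch of MPNet-style input construction.

    Permutes the positions of `tokens`, keeps roughly the first
    (1 - predict_ratio) of the permutation as visible context, and
    returns the remainder as prediction targets. Mask placeholders
    fill the target positions so the model sees the full sentence
    length and all position ids at every prediction step.
    """
    rng = random.Random(seed)
    n = len(tokens)
    perm = list(range(n))
    rng.shuffle(perm)  # a random permutation of the positions

    # At least one token is always held out for prediction.
    split = n - max(1, int(n * predict_ratio))
    visible = [(perm[i], tokens[perm[i]]) for i in range(split)]
    targets = [(perm[i], tokens[perm[i]]) for i in range(split, n)]

    # Placeholders occupy the target positions, so position ids for
    # the whole sentence are present in the model input.
    masked = [(pos, "[MASK]") for pos, _ in targets]
    return visible + masked, targets


# Example: one token is held out and replaced by a mask placeholder.
inputs, targets = mpnet_inputs(["the", "task", "is", "sentence", "classification"])
```

In a real pre-training setup these (position, token) pairs would be converted to input ids and position ids for the transformer; the sketch only shows how the permutation splits the sequence while keeping full positional information.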