MPNet
Language model trainer
Implements MPNet, a pre-training method for language understanding that combines masked language modeling and permuted language modeling, and provides code for pre-training and fine-tuning.
MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
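The pre-trained MPNet checkpoint can also be loaded through the Hugging Face `transformers` library. The sketch below is a minimal usage example (an assumption about typical downstream use, not code from this repository) that loads the published `microsoft/mpnet-base` model and extracts contextual token embeddings:

```python
# Minimal sketch: load the published MPNet base checkpoint for feature
# extraction. Assumes the `transformers` and `torch` packages are installed;
# "microsoft/mpnet-base" is the publicly released checkpoint, not a file
# shipped in this repository.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModel.from_pretrained("microsoft/mpnet-base")

inputs = tokenizer(
    "MPNet combines masked and permuted pre-training.",
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# Last-layer hidden states: one contextual vector per input token.
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 12, 768])
```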
288 stars
13 watching
33 forks
Language: Python
last commit: over 3 years ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Provides code and a model for improving language understanding through generative pre-training of a transformer-based architecture | 2,167 |
| | A framework for training and fine-tuning multimodal language models on various data types | 601 |
| | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
| | A pretraining framework for large language models using 3D parallelism and scalable training techniques | 1,332 |
| | An implementation of a neural network model for character-level language modeling | 50 |
| | Provides pre-trained models and code for understanding and generation tasks in multiple languages | 89 |
| | A live training platform for large-scale deep learning models that allows community participation and collaboration in model development and deployment | 511 |
| | A toolkit for training neural network language models | 14 |
| | Trains German transformer models to improve language understanding | 23 |
| | Enables large language models to generate multi-turn multimodal instruction-response conversations from image-caption pairs with minimal annotations | 47 |
| | A guide to using pre-trained large language models in source code analysis and generation | 1,789 |
| | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models | 806 |
| | A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation | 182 |
| | A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs | 6 |
| | Develops training methods to improve the politeness and natural flow of multimodal large language models | 63 |