metaseq
Transformer platform
A codebase for working with Open Pre-trained Transformer (OPT) models, enabling deployment and fine-tuning of large transformer models (see the inference and fine-tuning sketches after the table below).
Repo for external large-scale work
Archived
7k stars
112 watching
725 forks
Language: Python
last commit: 7 months ago

Related projects:
Repository | Description | Stars |
---|---|---|
huggingface/transformers | A collection of pre-trained machine learning models for natural language and computer vision tasks, which developers can fine-tune and deploy in their own projects. | 135,022 |
google-research/vision_transformer | Provides pre-trained models and code for training Vision Transformer and MLP-Mixer models in JAX/Flax. | 10,450 |
EleutherAI/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941 |
NVIDIA/Megatron-LM | A research framework for training large language models at scale using GPU-optimized techniques. | 10,562 |
google-research/big_vision | Supports large-scale vision model training on GPU machines or Google Cloud TPUs using scalable input pipelines. | 2,334 |
facebookresearch/fairseq | A toolkit for training custom sequence-to-sequence models for various NLP tasks. | 30,522 |
huggingface/optimum | A toolkit providing optimization tools and hardware acceleration for training and inference of machine learning models. | 2,572 |
huggingface/peft | A library for parameter-efficient fine-tuning of large pre-trained models by adapting only a small fraction of their parameters. | 16,437 |
huggingface/transformers.js | A library for running pre-trained machine learning models directly in web browsers, with no server required. | 12,085 |
OpenNMT/CTranslate2 | A high-performance library for efficient inference with Transformer models on CPUs and GPUs. | 3,404 |
OptimalScale/LMFlow | A toolkit for fine-tuning large language models and providing efficient inference capabilities. | 8,273 |
facebookresearch/xformers | Provides optimized building blocks and components for transformer-based architectures in various domains. | 8,658 |
microsoft/Megatron-DeepSpeed | A research framework for training large transformer language models at scale. | 1,895 |
NVIDIA/FasterTransformer | A library of highly optimized transformer encoder and decoder implementations for GPU-accelerated inference, with integrations for various frameworks. | 5,886 |
google/BIG-bench | A benchmark for evaluating the capabilities of large language models across a diverse collection of tasks. | 2,868 |
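As a quick way to try the models this codebase produced: the released OPT checkpoints are mirrored on the Hugging Face Hub, so a minimal inference sketch needs only the `transformers` library (the `facebook/opt-125m` checkpoint is used here for illustration; metaseq's native loading path is not shown):

```python
# Minimal sketch: run inference on an OPT checkpoint via its Hugging Face
# mirror, rather than metaseq's native checkpoint loader.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```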
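Several of the projects above compose naturally: for example, huggingface/peft can wrap the same OPT model for parameter-efficient fine-tuning. A sketch, assuming LoRA adapters on OPT's attention projections (the rank and alpha values below are illustrative, not tuned):

```python
# Sketch of parameter-efficient fine-tuning with LoRA via huggingface/peft.
# Only the injected adapter weights are trained; the base model stays frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
lora_config = LoraConfig(
    r=8,                                  # adapter rank (illustrative)
    lora_alpha=16,                        # scaling factor (illustrative)
    target_modules=["q_proj", "v_proj"],  # OPT attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction
```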