vision_transformer
Vision transformer framework
Provides pre-trained models and code for training Vision Transformer (ViT) and MLP-Mixer models using JAX/Flax.
11k stars
105 watching
1k forks
Language: Jupyter Notebook
last commit: 3 months ago
Linked from 3 awesome lists
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Supports large-scale vision model training on GPU machines or Google Cloud TPUs using scalable input pipelines. | 2,439 |
| | A collection of pre-trained machine learning models for natural language and computer vision tasks that developers can fine-tune and deploy in their own projects. | 136,357 |
| | An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency. | 195 |
| | A codebase for working with Open Pre-trained Transformers, enabling deployment and fine-tuning of transformer models on various platforms. | 6,519 |
| | Provides tools and libraries for training and fine-tuning large language models using transformer architectures. | 6,215 |
| | An interactive visualization tool that helps users understand how large language models like GPT work. | 3,604 |
| | A framework for training large language models using scalable and optimized GPU techniques. | 10,804 |
| | Implementations of various deep learning algorithms and techniques with accompanying documentation. | 57,177 |
| | A library for training transformer language models with reinforcement learning, using various optimization techniques and fine-tuning methods. | 10,308 |
| | A deep learning framework for training vision transformers from scratch on image data. | 1,162 |
| | A toolkit for building and deploying deep learning models in computer vision. | 5,850 |
| | A comprehensive PyTorch-based framework for computer vision tasks. | 2,249 |
| | Pre-trained models and code for fine-tuning on image recognition tasks using deep learning frameworks. | 1,516 |
| | An open-source JavaScript library for running machine learning models in the browser without a server. | 12,363 |
| | Enables vision-language understanding by fine-tuning large language models on visual data. | 25,490 |