vision-longformer
Image encoder
An implementation of a vision transformer architecture for high-resolution image encoding, with multiple efficient attention mechanisms.
243 stars
12 watching
25 forks
Language: Python
Last commit: almost 3 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A C-based library and command-line tools for encoding and decoding image formats commonly used on the web. | 2,042 |
| | A vision-language model that uses a query transformer to encode images as visual tokens and allows a flexible choice of the number of visual tokens. | 101 |
| | An implementation of a neural network architecture that combines the strengths of convolutional and transformer designs to improve performance on image classification tasks. | 559 |
| | A library of classes for asynchronously encoding BitmapData objects into image file formats. | 19 |
| | A project for pre-training models to support image captioning and question answering tasks. | 416 |
| | An artificial intelligence system designed to understand and describe long-form video content. | 329 |
| | A PNG encoder and decoder written in C++ with support for ANSI C. | 2,120 |
| | An Angular web app for generating image descriptions using OpenAI models. | 1 |
| | A C library for encoding and decoding AV1 image files. | 1,589 |
| | A Portable Network Graphics (PNG) image format implementation with support for compression and decompression. | 1,315 |
| | A lightweight PNG image decoder and encoder library with a focus on performance and simplicity. | 749 |
| | A tool that enhances image resolution and adds realistic detail using AI-powered super-resolution techniques. | 994 |
| | An implementation of a QOI image format decoder and encoder in Zig. | 0 |
| | A JPEG encoder and decoder implementation with additional features and optimizations for image compression and decompression. | 139 |
| | An image segmentation system that identifies objects in an image and assigns each pixel to a particular object. | 32 |