vision-longformer

Image encoder

An implementation of a vision transformer architecture designed for high-resolution image encoding, with multiple efficient attention mechanisms.

241 stars
12 watching
25 forks
Language: Python
Last commit: over 2 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| webmproject/libwebp | A library for encoding and decoding image files in the WebP format | 2,027 |
| gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens | 97 |
| microsoft/cvt | An implementation of a new neural network architecture that combines the strengths of convolutional and transformer designs to improve performance on image classification tasks | 555 |
| leeburrows/async-image-encoders | A library of classes for asynchronously encoding BitmapData objects into image file formats | 20 |
| luoweizhou/vlp | A project for pre-training models to support image captioning and question answering tasks | 412 |
| vision-cair/longvu | An artificial intelligence system designed to understand and describe long-form video content | 270 |
| lvandeve/lodepng | A PNG encoder and decoder written in C++ with support for ANSI C | 2,104 |
| slsfi/abbi-ng-ai-image-descriptor | An Angular web app for generating AI-generated image descriptions using OpenAI models | 1 |
| aomediacodec/libavif | A C library for encoding and decoding AV1 image files | 1,578 |
| pnggroup/libpng | A Portable Network Graphics (PNG) image format implementation with support for compression and decompression | 1,290 |
| randy408/libspng | A lightweight PNG image decoder and encoder library with a focus on performance and simplicity | 736 |
| ibm/max-image-resolution-enhancer | Enhances image resolution while adding realistic details using AI-powered super-resolution techniques | 994 |
| levydsa/qoiz | A software implementation of the QOI image format decoder and encoder in Zig | 0 |
| google/jpegli | A JPEG encoder and decoder implementation with improved features and optimizations for image compression and decompression | 126 |
| ibm/max-image-segmenter | An image segmentation system that identifies objects in an image and assigns each pixel to a particular object | 32 |