vision-longformer

Image encoder

An implementation of a vision transformer architecture for high-resolution image encoding that supports multiple efficient attention mechanisms.

GitHub

243 stars
12 watching
25 forks
Language: Python
Last commit: almost 3 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| webmproject/libwebp | A C-based library and command line tools for encoding and decoding image formats commonly used on the web. | 2,042 |
| gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens. | 101 |
| microsoft/cvt | An implementation of a new neural network architecture that combines the strengths of convolutional and transformer designs to improve performance on image classification tasks. | 559 |
| leeburrows/async-image-encoders | A library of classes for asynchronously encoding BitmapData objects into image file formats. | 19 |
| luoweizhou/vlp | A project for pre-training models to support image captioning and question answering tasks. | 416 |
| vision-cair/longvu | An artificial intelligence system designed to understand and describe long-form video content | 329 |
| lvandeve/lodepng | A PNG encoder and decoder written in C++ with support for ANSI C | 2,120 |
| slsfi/abbi-ng-ai-image-descriptor | An Angular web app for generating AI-generated image descriptions using OpenAI models | 1 |
| aomediacodec/libavif | A C library for encoding and decoding AV1 image files | 1,589 |
| pnggroup/libpng | A Portable Network Graphics (PNG) image format implementation with support for compression and decompression. | 1,315 |
| randy408/libspng | A lightweight PNG image decoder and encoder library with a focus on performance and simplicity. | 749 |
| ibm/max-image-resolution-enhancer | Enhances image resolution while adding realistic details using AI-powered super-resolution techniques | 994 |
| levydsa/qoiz | A software implementation of the QOI image format decoder and encoder in Zig. | 0 |
| google/jpegli | A JPEG encoder and decoder implementation with improved features and optimizations for image compression and decompression. | 139 |
| ibm/max-image-segmenter | An image segmentation system that identifies objects in an image and assigns each pixel to a particular object. | 32 |