vision-longformer

Image encoder

An implementation of a vision transformer architecture designed for high-resolution image encoding, with multiple efficient attention mechanisms.

243 stars

12 watching

25 forks

Language: Python

Last commit: over 3 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

weiaicunzai/awesome-image-classification

Related projects:

Repository	Description	Stars
webmproject/libwebp	A C library and command-line tools for encoding and decoding WebP images, a format commonly used on the web.	2,042
gordonhu608/mqt-llava	A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens.	101
microsoft/cvt	An implementation of a new neural network architecture that combines the strengths of convolutional and transformer designs to improve performance on image classification tasks.	559
leeburrows/async-image-encoders	A library of classes for asynchronously encoding BitmapData objects into image file formats.	19
luoweizhou/vlp	A project for pre-training models to support image captioning and question answering tasks.	416
vision-cair/longvu	An artificial intelligence system designed to understand and describe long-form video content.	329
lvandeve/lodepng	A PNG encoder and decoder written in C++ with support for ANSI C.	2,120
slsfi/abbi-ng-ai-image-descriptor	An Angular web app that generates image descriptions using OpenAI models.	1
aomediacodec/libavif	A C library for encoding and decoding AV1 image files.	1,589
pnggroup/libpng	A Portable Network Graphics (PNG) image format implementation with support for compression and decompression.	1,315
randy408/libspng	A lightweight PNG image decoder and encoder library with a focus on performance and simplicity.	749
ibm/max-image-resolution-enhancer	Enhances image resolution while adding realistic details using AI-powered super-resolution techniques.	994
levydsa/qoiz	A decoder and encoder for the QOI image format, implemented in Zig.	0
google/jpegli	A JPEG encoder and decoder implementation with improved features and optimizations for image compression and decompression.	139
ibm/max-image-segmenter	An image segmentation system that identifies objects in an image and assigns each pixel to a particular object.	32