coot-videotext

Video transformer

An open-source implementation of a video-text representation learning framework using transformers and PyTorch.

COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning

GitHub

288 stars

8 watching

55 forks

Language: Python

Last commit: almost 3 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

danieljf24/awesome-video-text-retrieval

Related projects:

Repository	Description	Stars
swintransformer/video-swin-transformer	An implementation of the Video Swin Transformer architecture for video recognition tasks	1,463
hanzhanggit/stackgan	A PyTorch implementation of a generative adversarial network for image synthesis from text descriptions	1,863
chaoyuaw/pytorch-coviar	A PyTorch implementation of a compressed video action recognition system	502
xiadingz/video-caption.pytorch	A PyTorch implementation of video captioning that combines deep learning and computer vision techniques	402
clementpinard/sfmlearner-pytorch	A PyTorch implementation of unsupervised depth and ego-motion learning from video sequences	1,022
thudm/cogview	A framework for generating images from text using transformers	1,735
pixart-alpha/pixart-sigma	A PyTorch model for 4K text-to-image generation using a diffusion transformer	1,711
jeonsworld/vit-pytorch	A PyTorch implementation of the Vision Transformer model for image recognition tasks	1,959
microsoft/megatron-deepspeed	Research tool for training large transformer language models at scale	1,926
bigscience-workshop/megatron-deepspeed	A collection of tools and scripts for training large transformer language models at scale	1,342
pylons/colander	A library for serializing and deserializing data structures into strings, mappings, and lists while performing validation	451
leviswind/pytorch-transformer	Implementation of a transformer-based translation model in PyTorch	240
tongjilibo/bert4torch	An implementation of transformer models in PyTorch for natural language processing tasks	1,257
mchong6/soat	A PyTorch implementation of an image manipulation technique using a pretrained StyleGAN model	380
locuslab/pytorch_fft	Provides an efficient wrapper around CUDA FFTs for PyTorch transformations	315