howto100m

Text-Video Embedding Toolkit

Provides code and tools for learning joint text-video embeddings from the HowTo100M dataset.

Code for the HowTo100M paper

250 stars
5 watching
37 forks
Language: Python
Last commit: over 4 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
antoine77340/mixture-of-embedding-experts | An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks | 118
jwieting/para-nmt-50m | A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets | 102
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117
kronoscode/django-magicembed | Provides a tool to easily embed videos and generate thumbnails in Django web applications | 19
showlab/show-1 | This project enables text-to-video generation by combining pixel and latent diffusion models | 1,103
snailedlt/markdown-videos | Embeds YouTube and Vimeo videos into GitHub markdown with ease using an API and website | 80
nlprinceton/text_embedding | A utility class for generating and evaluating document representations using word embeddings | 54
jwieting/acl2017 | A codebase for training and using models of sentence embeddings | 33
materialsintelligence/mat2vec | Unsupervised word embeddings capture latent knowledge from materials science literature | 619
florianmai/word2mat | A framework for learning sentence embeddings from matrices | 21
samirhodzic/ngx-embed-video | A library for embedding video content from YouTube, Vimeo, and Dailymotion in web applications | 56
showlab/vlog | Transforms video content into a long document containing visual and audio information that can be used for chat or other applications | 538
gink03/alt-i2v | An implementation of a deep learning-based image representation learning approach using a modified fully connected layer and transfer learning from VGG16 | 34
jwieting/paragram-word | Trains word embeddings from a paraphrase database to represent semantic relationships between words | 30
jayleicn/tvqa | PyTorch implementation of video question answering system based on TVQA dataset | 172