Valley

Video Assistant

An offline video assistant system powered by large language models and computer vision techniques.

The official repository of "Video assistant towards large language model makes everything easy"


210 stars

4 watching

14 forks

Language: Python

Last commit: over 2 years ago

Related projects:

Repository	Description	Stars
liuzhao1225/youdub-webui	A web-based video processing tool that uses AI to facilitate cultural and linguistic tasks such as transcription, translation, and audio synthesis.	1,980
damo-nlp-sg/videollama2	An audio-visual language model designed to advance spatial-temporal modeling and audio understanding in video processing.	957
vision-cair/longvu	An artificial intelligence system designed to understand and describe long-form video content.	329
aliaksandrsiarohin/video-preprocessing	Tools for preprocessing videos for various datasets, including video cropping and annotation.	522
pku-yuangroup/video-bench	Evaluates and benchmarks large language models' video understanding capabilities.	121
dvlab-research/llama-vid	An image-based language model that uses large language models to generate visual and text features from videos.	748
nus-hpc-ai-lab/videosys	A comprehensive toolkit for high-performance video generation and processing.	1,819
li-xirong/w2vvpp	A deep learning-based video search system using pre-trained models and datasets.	28
renshuhuai-andy/timechat	A large language model designed to understand long videos by binding visual content with timestamps and producing video token sequences of varying lengths.	314
kylejginavan/youtube_it	A Ruby wrapper for accessing YouTube's video API and managing video content.	595
webpilot-ai/webpilot	An extension for Google Chrome that enables users to engage in conversations with web pages or argue with other users.	1,796
ozmartian/vidcutter	A video editing and management application with cross-platform support.	1,821
opengvlab/internvideo	Develops general video foundation models and related datasets for multimodal understanding and generation through generative and discriminative learning.	1,467
nkasmanoff/pi-card	An AI-powered conversational assistant built on top of a Raspberry Pi.	747
vpgtrans/vpgtrans	Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs.	270