VLog
Video doc generator
Transforms video content into a long document containing visual and audio information that can be used for chat or other applications.
Transform a video into a document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, and LangChain.
546 stars
7 watching
26 forks
Language: Python
Last commit: over 1 year ago
Linked from 1 awesome list
chatgpt, langchain, large-language-model, video-language, whisper
Related projects:
Repository | Description | Stars |
---|---|---|
showlab/show-1 | This project enables text-to-video generation using a combination of pixel and latent diffusion models. | 1,110 |
damo-nlp-sg/videollama2 | An audio-visual language model designed to advance spatial-temporal modeling and audio understanding in video processing. | 957 |
m1guelpf/yt-whisper | Automates transcription and subtitle generation from YouTube videos using OpenAI's Whisper model | 1,373 |
0voice/ffmpeg_develop_doc | A collection of resources and tutorials on using FFmpeg for video processing and playback | 1,969 |
venuv/langchain_yt_tools | Custom tools to extract text from YouTube video transcripts | 63 |
mbzuai-oryx/video-chatgpt | A video conversation model that generates meaningful conversations about videos using large vision and language models | 1,246 |
aspiers/ly2video | Converts music represented by a GNU LilyPond file into a video containing a horizontally scrolling music staff synchronized with audio rendering. | 158 |
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset | 254 |
timothycrosley/portray | Automates the creation of documentation websites for Python projects with minimal configuration | 863 |
platisd/phonix | Generates captions for videos using OpenAI's Whisper API | 39 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 121 |
techgaun/gulp-apidoc | Generates documentation for RESTful web APIs | 5 |
context-labs/autodoc | Tool for auto-generating codebase documentation using Large Language Models | 2,000 |
transitive-bullshit/ffmpeg-generate-video-preview | Generates image strips or GIFs from video files | 153 |
opengvlab/internvideo | Develops general video foundation models and related datasets for multimodal understanding and generation through generative and discriminative learning. | 1,467 |