Video-Bench
Video benchmarking toolkit
Evaluates and benchmarks large language models' video understanding capabilities
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
121 stars
3 watching
2 forks
Language: Python
Last commit: 12 months ago
Tags: benchmark, large-language-models, toolkit
Related projects:
Repository | Description | Stars |
---|---|---|
pku-yuangroup/chronomagic-bench | Provides a benchmarking framework for evaluating the quality of text-to-video generation models | 191 |
pku-yuangroup/languagebind | Extends pretrained models to handle multiple modalities by aligning language and video representations | 751 |
pku-yuangroup/open-sora-dataset | A large video dataset collected from various open-source websites for use in computer vision and multimedia applications | 94 |
felixgithub2017/mmcu | Measures how well large language models understand massive multitask Chinese datasets | 87 |
shawn-ieitsystems/yuan-1.0 | Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591 |
renshuhuai-andy/timechat | A large language model designed to understand long videos by binding visual content with timestamps and producing video token sequences of varying lengths | 314 |
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset | 254 |
huaizhengzhang/awsome-deep-learning-for-video-analysis | A collection of resources and tools for video analysis using deep learning and multi-modal learning techniques | 767 |
tencent/tencent-hunyuan-large | This project makes a large language model accessible for research and development | 1,245 |
bradyfu/video-mme | Comprehensive benchmark for evaluating multi-modal large language models on video analysis tasks | 422 |
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
opengvlab/internvideo | Develops general video foundation models and related datasets for multimodal understanding and generation through generative and discriminative learning | 1,467 |
aliaksandrsiarohin/video-preprocessing | Tools for preprocessing videos for various datasets, including video cropping and annotation | 522 |
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 259 |
qcri/llmebench | A benchmarking framework for large language models | 81 |