Video-Bench
Video benchmarking toolkit
Evaluates and benchmarks large language models' video understanding capabilities
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
117 stars
3 watching
2 forks
Language: Python
Last commit: 11 months ago
Tags: benchmark, large-language-models, toolkit
Related projects:
Repository | Description | Stars |
---|---|---|
pku-yuangroup/chronomagic-bench | A benchmark and dataset for evaluating text-to-video generation models' ability to generate coherent and varied metamorphic time-lapse videos. | 186 |
pku-yuangroup/languagebind | Extending pretraining models to handle multiple modalities by aligning language and video representations | 723 |
pku-yuangroup/open-sora-dataset | A large video dataset collected from various open-source websites for use in computer vision and multimedia applications. | 93 |
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset. | 87 |
shawn-ieitsystems/yuan-1.0 | Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591 |
renshuhuai-andy/timechat | A large language model designed to understand and process long videos with temporal information | 286 |
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset | 250 |
huaizhengzhang/awsome-deep-learning-for-video-analysis | A collection of resources and tools for video analysis using deep learning and multi-modal learning techniques. | 763 |
tencent/tencent-hunyuan-large | This project makes a large language model accessible for research and development | 1,114 |
bradyfu/video-mme | An evaluation framework for large language models in video analysis, providing a comprehensive benchmark of their capabilities. | 406 |
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 506 |
opengvlab/internvideo | Developing video foundation models and datasets for multimodal understanding and applications | 1,413 |
aliaksandrsiarohin/video-preprocessing | Tools for preprocessing videos for various datasets, including video cropping and annotation. | 518 |
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 243 |
qcri/llmebench | A benchmarking framework for large language models | 80 |