TempCompass
Video understanding tester
A tool to evaluate video language models' ability to understand and describe video content
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou
84 stars
4 watching
2 forks
Language: Python
last commit: 8 days ago
Tags: evaluation, temporal-perception, video-llms
Related projects:
Repository | Description | Stars |
---|---|---|
mit-han-lab/temporal-shift-module | Develops a video analysis module with efficient temporal processing capabilities | 2,068 |
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,300 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
boheumd/ma-lmm | Develops an AI model for long-term video understanding | 244 |
dcdmllm/momentor | A video Large Language Model designed for fine-grained comprehension and localization in videos with a custom Temporal Perception Module for improved temporal modeling | 54 |
renshuhuai-andy/timechat | A large language model designed to understand and process long videos with temporal information | 286 |
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset | 250 |
dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 733 |
poyro/poyro | An extension of Vitest for testing LLM applications using local language models | 30 |
researchmm/sttn | Proposes a deep learning model to fill missing regions in video frames and generate completed videos | 474 |
tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks | 288 |
johnsnowlabs/langtest | A tool for testing and evaluating large language models with a focus on AI safety and model assessment | 501 |
krrishdholakia/betterprompt | An API for evaluating the quality of text prompts used in Large Language Models (LLMs) based on perplexity estimation | 38 |
huangb23/vtimellm | A PyTorch-based Video LLM designed to understand and reason about video moments in terms of time boundaries | 225 |
ray-project/llmperf | A tool for evaluating the performance of large language model APIs | 641 |