TempCompass

Video understanding tester

A tool to evaluate video language models' ability to understand and describe video content

[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou

GitHub

91 stars

4 watching

2 forks

Language: Python

Last commit: 9 months ago

evaluation, temporal-perception, video-llms

Related projects:

Repository	Description	Stars
mit-han-lab/temporal-shift-module	Develops a video analysis module with efficient temporal processing capabilities	2,078
lxtgh/omg-seg	Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model	1,336
pku-yuangroup/video-bench	Evaluates and benchmarks large language models' video understanding capabilities	121
boheumd/ma-lmm	Develops an AI model for long-term video understanding	254
dcdmllm/momentor	A video Large Language Model designed for fine-grained comprehension and localization in videos with a custom Temporal Perception Module for improved temporal modeling	58
renshuhuai-andy/timechat	A large language model designed to understand long videos by binding visual content with timestamps and producing video token sequences of varying lengths	314
antoine77340/howto100m	Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset	254
dvlab-research/llama-vid	An image-based language model that uses large language models to generate visual and text features from videos	748
poyro/poyro	An extension of Vitest for testing LLM applications using local language models	31
researchmm/sttn	Proposes a deep learning model to fill missing regions in video frames and generate completed videos	480
tsb0601/mmvp	An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks	296
johnsnowlabs/langtest	A tool for testing and evaluating large language models with a focus on AI safety and model assessment	506
krrishdholakia/betterprompt	An API for evaluating the quality of text prompts used in Large Language Models (LLMs) based on perplexity estimation	43
huangb23/vtimellm	A PyTorch-based Video LLM designed to understand and reason about video moments in terms of time boundaries	231
ray-project/llmperf	A tool for evaluating the performance of large language model APIs	678