TS-LLaVA

Training-free video LLM

This project provides an implementation of TS-LLaVA, an approach that applies large language models to video understanding without additional training by constructing visual tokens through a thumbnail-and-sampling strategy.

TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models

GitHub

7 stars
2 watching
0 forks
Language: Python
Last commit: about 1 month ago