Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena lets you benchmark vision-language models side by side on image inputs. It supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more.

GitHub

451 stars
6 watching
34 forks
Language: Python
last commit: 6 months ago
Topics: chat, chatbot, chatgpt, gradio, large-language-models, llms, multi-modality, vision-language-model, vqa