M3Exam

Data and code for the paper "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models"

GitHub

91 stars
9 watching
12 forks
Language: Python
last commit: over 1 year ago
Topics: ai-education, chatgpt, evaluation, gpt-4, large-language-models, llms, multilingual, multimodal