h2o-LLM-eval
Large language model evaluation framework with Elo leaderboard and A/B testing
49 stars
40 watching
1 fork
Language: Jupyter Notebook
last commit: about 1 year ago