evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
15k stars
263 watching
3k forks
Language: Python
last commit: about 1 month ago
Linked from 2 awesome lists