evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

GitHub

15k stars
263 watching
3k forks
Language: Python
Last commit: about 1 month ago
Linked from 2 awesome lists


Backlinks from these awesome lists: