ARES

RAG model evaluator

A tool for automatically evaluating RAG models by generating synthetic data and fine-tuning classifiers

Automated Evaluation of RAG Systems

GitHub

483 stars
11 watching
52 forks
Language: Python
Last commit: 20 days ago
Linked from 1 awesome list



Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| gomate-community/rageval | An evaluation tool for Retrieval-augmented Generation methods | 132 |
| amazon-science/ragchecker | An automated evaluation framework for assessing and diagnosing Retrieval-Augmented Generation systems | 552 |
| openai/simple-evals | A library for evaluating language models using standardized prompts and benchmarking tests | 1,939 |
| allenai/olmo-eval | An evaluation framework for large language models | 311 |
| declare-lab/instruct-eval | An evaluation framework for large language models trained with instruction tuning methods | 528 |
| huggingface/evaluate | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance | 2,034 |
| ffri/packerdetectiontoolevaluation | An evaluation of packer type estimation and detection tools to improve malware analysis capabilities | 11 |
| kentcdodds/preval.macro | A build-time code evaluation tool for JavaScript | 127 |
| evolvinglmms-lab/lmms-eval | Tools and evaluation suite for large multimodal models | 2,058 |
| mshukor/evalign-icl | Evaluating and improving large multimodal models through in-context learning | 20 |
| ruixiangcui/agieval | Evaluates foundation models on human-centric tasks with diverse exams and question types | 708 |
| arm-doe/pyart | An interactive toolkit for working with weather radar data using Python and atmospheric radar algorithms | 517 |
| whyhow-ai/rule-based-retrieval | A Python package for creating and managing RAG applications with advanced filtering capabilities | 222 |
| martinkersner/py-img-seg-eval | A Python package providing metrics and tools for evaluating image segmentation models | 282 |
| pcmdi/pcmdi_metrics | Provides objective comparisons of Earth System Models with one another and available observations | 102 |