rageval

RAG evaluator

Evaluation tools for Retrieval-augmented Generation (RAG) methods.

GitHub

132 stars
7 watching
11 forks
Language: Python
Last commit: 6 days ago
Linked from 1 awesome list

Tags: evaluation, llm, rag

Related projects:

| Repository | Description | Stars |
|---|---|---|
| stanford-futuredata/ares | A tool for automatically evaluating RAG models by generating synthetic data and fine-tuning classifiers | 483 |
| amazon-science/ragchecker | An automated evaluation framework for assessing and diagnosing Retrieval-Augmented Generation systems | 535 |
| paesslerag/gval | An expression evaluation library for Go that supports arbitrary expressions and parameters | 743 |
| whyhow-ai/rule-based-retrieval | A Python package for creating and managing RAG applications with advanced filtering capabilities | 222 |
| allenai/olmo-eval | An evaluation framework for large language models | 310 |
| maja42/goval | A Go library for evaluating arbitrary arithmetic, string, and logic expressions, with support for variables and custom functions | 159 |
| nullne/evaluator | An expression evaluator library written in Go | 41 |
| huggingface/evaluate | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance | 2,034 |
| mlabonne/llm-autoeval | A tool to automate the evaluation of large language models in Google Colab using various benchmarks and custom parameters | 558 |
| antonmedv/golang-expression-evaluation-comparison | A benchmarking repository comparing the performance of different expression-evaluation packages in Go | 48 |
| huggingface/lighteval | A toolkit for evaluating large language models across multiple backends | 804 |
| thedevsaddam/govalidator | A Go library for validating request data with simple rules, inspired by Laravel's request validation | 1,324 |
| mshukor/evalign-icl | Evaluating and improving large multimodal models through in-context learning | 20 |
| rlancemartin/auto-evaluator | An evaluation tool for question-answering systems using large language models and natural language processing techniques | 1,063 |
| krrishdholakia/betterprompt | An API for evaluating the quality of text prompts used in large language models (LLMs), based on perplexity estimation | 38 |