rageval
RAG evaluator
Evaluation tools for Retrieval-Augmented Generation (RAG) methods.
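As a concrete illustration of what a RAG evaluator measures, here is a minimal, self-contained sketch of one common retrieval-side metric, recall@k. The function and sample data are illustrative assumptions and do not reflect rageval's actual API.

```python
# Hypothetical sketch of a standard RAG retrieval metric (recall@k);
# illustrative only, not rageval's API.

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of gold-relevant passages that appear in the top-k retrieved."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)

# Illustrative data: three gold passages, two recovered in the top five.
retrieved = ["p7", "p2", "p9", "p1", "p4", "p8"]
relevant = {"p1", "p2", "p3"}
print(f"recall@5 = {recall_at_k(retrieved, relevant, 5):.2f}")  # recall@5 = 0.67
```

A full evaluator typically layers generation-side metrics (answer correctness, faithfulness to the retrieved context) on top of retrieval metrics like this one.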
141 stars
7 watching
11 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
Tags: evaluation, llm, rag
Related projects:
| Repository | Description | Stars |
|---|---|---|
| stanford-futuredata/ares | A tool for automatically evaluating RAG models by generating synthetic data and fine-tuning classifiers | 499 |
| amazon-science/ragchecker | A framework for evaluating and diagnosing retrieval-augmented generation systems | 630 |
| paesslerag/gval | An expression evaluation library for Go that supports arbitrary expressions and parameters | 758 |
| whyhow-ai/rule-based-retrieval | A Python package for creating and managing Retrieval-Augmented Generation applications with filtering capabilities | 229 |
| allenai/olmo-eval | A framework for evaluating language models on NLP tasks | 326 |
| maja42/goval | A Go library for evaluating arbitrary arithmetic, string, and logic expressions with support for variables and custom functions | 160 |
| nullne/evaluator | An expression evaluator library written in Go | 41 |
| huggingface/evaluate | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance | 2,063 |
| mlabonne/llm-autoeval | A tool for automating the evaluation of large language models in Google Colab using various benchmarks and custom parameters | 566 |
| antonmedv/golang-expression-evaluation-comparison | A benchmarking repository comparing the performance of different expression evaluation packages in Go | 48 |
| huggingface/lighteval | An all-in-one toolkit for evaluating Large Language Models (LLMs) across multiple backends | 879 |
| thedevsaddam/govalidator | A Go package for validating request data with simple rules, inspired by Laravel's request validation | 1,324 |
| mshukor/evalign-icl | A framework for evaluating and improving large multimodal models through in-context learning | 21 |
| rlancemartin/auto-evaluator | An evaluation tool for question-answering systems using large language models and natural language processing techniques | 1,065 |
| krrishdholakia/betterprompt | An API for evaluating the quality of text prompts for Large Language Models (LLMs) based on perplexity estimation | 43 |