tonic_validate

Quality checker

A framework for evaluating and monitoring the quality of large language model outputs in Retrieval Augmented Generation applications.

Metrics to evaluate the quality of responses from your Retrieval Augmented Generation (RAG) applications.
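
In practice, a RAG evaluation metric takes the question, the retrieved context, and the generated answer and returns a numeric quality score. The sketch below is a minimal, library-agnostic illustration of two such metrics, a rough answer-match score and a context-grounding score; it does not use tonic_validate's own classes or method names, which should be taken from the project's documentation, and the token-overlap heuristic stands in for the LLM-assisted or embedding-based similarity judgments a real framework would use.

```python
import re
from dataclasses import dataclass


@dataclass
class RAGResponse:
    """One benchmark item plus the RAG pipeline's output (illustrative only)."""
    question: str                  # user question sent to the RAG pipeline
    retrieved_context: list[str]   # chunks returned by the retriever
    answer: str                    # answer generated by the LLM
    reference_answer: str          # ground-truth answer for the question


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def _overlap(a: str, b: str) -> float:
    """Fraction of tokens in `a` that also appear in `b` (crude similarity)."""
    tokens_a, tokens_b = _tokens(a), _tokens(b)
    return len(tokens_a & tokens_b) / len(tokens_a) if tokens_a else 0.0


def answer_match_score(resp: RAGResponse) -> float:
    """How much of the reference answer is reflected in the generated answer."""
    return _overlap(resp.reference_answer, resp.answer)


def context_grounding_score(resp: RAGResponse) -> float:
    """How much of the generated answer is supported by the retrieved chunks."""
    return _overlap(resp.answer, " ".join(resp.retrieved_context))


if __name__ == "__main__":
    resp = RAGResponse(
        question="What is the capital of France?",
        retrieved_context=["Paris is the capital and largest city of France."],
        answer="The capital of France is Paris.",
        reference_answer="Paris",
    )
    print(f"answer match:      {answer_match_score(resp):.2f}")
    print(f"context grounding: {context_grounding_score(resp):.2f}")
```

Monitoring in production then amounts to computing scores like these over live traffic and tracking them over time, which is the evaluating-and-monitoring workflow the framework targets.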

GitHub

271 stars
14 watching
27 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list

Topics: evaluation-framework, evaluation-metrics, large-language-models, llm, llmops, llms, rag, retrieval-augmented-generation

Related projects:

Repository | Description | Stars
parsifal-47/muterl | A tool for verifying test quality by introducing small changes to code and checking whether the test suite detects them. | 15
testdriverai/goodlooks | A tool to visually validate web pages using natural language prompts instead of traditional selectors. | 38
raphaelstolt/lean-package-validator-action | Tools to validate the size and contents of software packages during continuous integration. | 0
angelognazzo/reliable-trustworthy-ai | An implementation of a DeepPoly-based verifier for robustness analysis in deep neural networks. | 2
psycoy/mixeval | An evaluation suite and dynamic data release platform for large language models. | 230
krrishdholakia/betterprompt | An API for evaluating the quality of text prompts used in large language models (LLMs) based on perplexity estimation. | 43
gomate-community/rageval | An evaluation tool for Retrieval-Augmented Generation methods. | 141
cmader/qskos | A tool for identifying quality issues in SKOS vocabularies, integrating with online services and development workflows. | 65
orsinium-labs/flake8-pylint | An extension for flake8 that integrates Pylint to check Python code quality and detect potential errors. | 8
qcri/llmebench | A benchmarking framework for large language models. | 81
alecthomas/voluptuous | A Python data validation library providing simple and flexible ways to validate complex data structures. | 1,823
brettz9/eslint-config-ash-nazg | A comprehensive ESLint configuration for JavaScript projects with enhanced error checking and code quality control. | 6
whyhow-ai/rule-based-retrieval | A Python package that enables the creation and management of Retrieval Augmented Generation applications with filtering capabilities. | 229
s-weigand/flake8-nb | A tool to check Python code quality in Jupyter notebooks. | 28
allenai/olmo-eval | A framework for evaluating language models on NLP tasks. | 326