tonic_validate
Quality checker
A framework for evaluating and monitoring the output quality of large language models in Retrieval Augmented Generation (RAG) applications, providing metrics that score the responses your RAG system produces.
258 stars
14 watching
29 forks
Language: Python
last commit: 7 days ago
Linked from 1 awesome list
evaluation-framework, evaluation-metrics, large-language-models, llm, llmops, llms, rag, retrieval-augmented-generation
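To illustrate the kind of answer-quality metric a RAG evaluation framework reports, here is a minimal, self-contained sketch of a token-overlap F1 score between a generated answer and a reference answer. This is a hypothetical example for illustration only; it is not tonic_validate's API, and the function name `token_f1` is invented here (tonic_validate's own metrics are LLM-assisted and richer than plain token overlap).

```python
def token_f1(answer: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer.

    A crude stand-in for an answer-similarity metric: precision is the
    fraction of answer tokens found in the reference, recall the fraction
    of reference tokens covered by the answer.
    """
    ans = answer.lower().split()
    ref = reference.lower().split()
    if not ans or not ref:
        return 0.0

    # Count reference tokens, then consume them as answer tokens match,
    # so repeated tokens are only credited as often as they appear.
    ref_counts: dict[str, int] = {}
    for tok in ref:
        ref_counts[tok] = ref_counts.get(tok, 0) + 1

    common = 0
    for tok in ans:
        if ref_counts.get(tok, 0) > 0:
            common += 1
            ref_counts[tok] -= 1

    if common == 0:
        return 0.0
    precision = common / len(ans)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


# Example: a terse but correct answer scores high precision, lower recall.
score = token_f1("Paris", "The capital is Paris")  # 0.4
```

A real framework in this space typically layers several such signals (answer similarity, context relevance, hallucination checks) rather than a single overlap score.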
Related projects:
| Repository | Description | Stars |
|---|---|---|
| parsifal-47/muterl | A mutation-testing tool that verifies test quality by introducing small changes to the code and checking whether the tests catch them | 15 |
| testdriverai/goodlooks | A tool to visually validate web pages using natural language prompts instead of traditional selectors | 37 |
| raphaelstolt/lean-package-validator-action | Tools to validate the size and contents of software packages during continuous integration | 0 |
| angelognazzo/reliable-trustworthy-ai | An implementation of a DeepPoly-based verifier for robustness analysis of deep neural networks | 1 |
| psycoy/mixeval | An evaluation suite and dynamic data release platform for large language models | 224 |
| krrishdholakia/betterprompt | An API for evaluating the quality of text prompts for large language models (LLMs) based on perplexity estimation | 38 |
| gomate-community/rageval | An evaluation tool for Retrieval-Augmented Generation methods | 132 |
| cmader/qskos | A tool for identifying quality issues in SKOS vocabularies, integrating with online services and development workflows | 65 |
| orsinium-labs/flake8-pylint | A flake8 extension that integrates Pylint to check Python code quality and detect potential errors | 8 |
| qcri/llmebench | A benchmarking framework for large language models | 80 |
| alecthomas/voluptuous | A Python data validation library for simple, expressive validation of complex data structures | 1,819 |
| brettz9/eslint-config-ash-nazg | A comprehensive ESLint configuration for JavaScript projects with enhanced error checking and code quality control | 6 |
| whyhow-ai/rule-based-retrieval | A Python package for creating and managing RAG applications with advanced filtering capabilities | 222 |
| s-weigand/flake8-nb | A tool to check Python code quality in Jupyter notebooks | 28 |
| allenai/olmo-eval | An evaluation framework for large language models | 310 |