evidently
Model monitor
An observability framework for evaluating and monitoring the performance of machine learning models and data pipelines
Evidently is an open-source ML and LLM observability framework for evaluating, testing, and monitoring any AI-powered system or data pipeline, from tabular data to generative AI, with 100+ built-in metrics.
6k stars
48 watching
607 forks
Language: Jupyter Notebook
Last commit: 4 months ago
Linked from 8 awesome lists
Topics: data-drift, data-quality, data-science, data-validation, generative-ai, hacktoberfest, html-report, jupyter-notebook, llm, llmops, machine-learning, mlops, model-monitoring, pandas-dataframe
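As a quick illustration of what the framework does, below is a minimal data-drift check. This is a sketch only: it assumes an Evidently release that exposes the `Report` / `DataDriftPreset` interface (the 0.4.x-era API; newer releases have reorganized these imports), and the column names and sample data are invented for the example.

```python
import numpy as np
import pandas as pd

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

rng = np.random.default_rng(42)

# Reference data: what the model saw at training time (hypothetical columns).
reference = pd.DataFrame({
    "feature_1": rng.normal(0, 1, 500),
    "feature_2": rng.normal(5, 2, 500),
})

# Current data: production traffic, with feature_1 shifted to simulate drift.
current = pd.DataFrame({
    "feature_1": rng.normal(0.8, 1, 500),
    "feature_2": rng.normal(5, 2, 500),
})

# A metric preset bundles the individual drift metrics into one report.
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

# Save as a standalone HTML file (hence the "html-report" topic above).
report.save_html("data_drift_report.html")
```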
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | A framework for evaluating large language models | 4,003 |
| | A toolkit for evaluating and optimizing Large Language Model applications with objective metrics, test data generation, and seamless integrations. | 7,598 |
| | Automates the detection of performance, bias, and security issues in AI applications | 4,125 |
| | A framework for evaluating large language models and systems, providing a registry of benchmarks. | 15,168 |
| | Provides a unified framework to test generative language models on various evaluation tasks. | 7,200 |
| | A Python library that simplifies working with structured outputs from large language models | 8,551 |
| | Provides a comprehensive framework for evaluating Large Language Model (LLM) applications and pipelines with customizable metrics | 455 |
| | An environment for battle-testing prompts to Large Language Models (LLMs) to evaluate response quality and performance. | 2,413 |
| | An interactive tool for analyzing and understanding machine learning models | 3,500 |
| | Automates data quality checks and model training with AI-driven methods to improve machine learning performance | 9,820 |
| | A comprehensive resource for explaining the decisions and behavior of machine learning models. | 4,811 |
| | An open-source package for explaining machine learning models and promoting transparency in AI decision-making | 6,324 |
| | Provides tools and techniques for interpreting machine learning models | 483 |
| | An open-source toolkit for building and evaluating large language models | 267 |
| | An evaluation suite and dynamic data release platform for large language models | 230 |