evidently
Model monitor
An observability framework for evaluating and monitoring the performance of machine learning models and data pipelines
Evidently is an open-source ML and LLM observability framework for evaluating, testing, and monitoring any AI-powered system or data pipeline, from tabular data to generative AI, with 100+ built-in metrics.
6k stars
48 watching
607 forks
Language: Jupyter Notebook
last commit: about 1 month ago
Linked from 8 awesome lists
Tags: data-drift, data-quality, data-science, data-validation, generative-ai, hacktoberfest, html-report, jupyter-notebook, llm, llmops, machine-learning, mlops, model-monitoring, pandas-dataframe
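As a quick illustration of the framework described above, here is a minimal data-drift check sketched with Evidently's `Report` and `DataDriftPreset` API as documented for the 0.4.x releases; the dataframes and column names are hypothetical placeholders, not part of this listing.

```python
# Minimal data-drift report sketch with Evidently (0.4.x-style API).
# The dataframes and column names below are illustrative placeholders.
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Reference data (e.g. a training window) and current data (e.g. recent production traffic).
reference = pd.DataFrame({"feature_a": [0.1, 0.2, 0.3, 0.4], "feature_b": [10, 12, 11, 13]})
current = pd.DataFrame({"feature_a": [0.5, 0.7, 0.6, 0.8], "feature_b": [20, 22, 21, 23]})

# Build a report from the built-in data-drift preset and run it on both datasets.
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

# Export the interactive HTML report produced by the run.
report.save_html("data_drift_report.html")
```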
Related projects:
| Repository | Description | Stars |
|---|---|---|
| confident-ai/deepeval | A framework for evaluating large language models | 4,003 |
| explodinggradients/ragas | A toolkit for evaluating and optimizing Large Language Model applications with objective metrics, test data generation, and seamless integrations | 7,598 |
| giskard-ai/giskard | Automates the detection of performance, bias, and security issues in AI applications | 4,125 |
| openai/evals | A framework for evaluating large language models and systems, providing a registry of benchmarks | 15,168 |
| eleutherai/lm-evaluation-harness | Provides a unified framework to test generative language models on various evaluation tasks | 7,200 |
| instructor-ai/instructor | A Python library that simplifies working with structured outputs from large language models | 8,551 |
| relari-ai/continuous-eval | Provides a comprehensive framework for evaluating Large Language Model (LLM) applications and pipelines with customizable metrics | 455 |
| ianarawjo/chainforge | An environment for battle-testing prompts to Large Language Models (LLMs) to evaluate response quality and performance | 2,413 |
| pair-code/lit | An interactive tool for analyzing and understanding machine learning models | 3,500 |
| cleanlab/cleanlab | Automates data quality checks and model training with AI-driven methods to improve machine learning performance | 9,820 |
| christophm/interpretable-ml-book | A comprehensive resource for explaining the decisions and behavior of machine learning models | 4,811 |
| interpretml/interpret | An open-source package for explaining machine learning models and promoting transparency in AI decision-making | 6,324 |
| h2oai/mli-resources | Provides tools and techniques for interpreting machine learning models | 483 |
| aiplanethub/beyondllm | An open-source toolkit for building and evaluating large language models | 267 |
| psycoy/mixeval | An evaluation suite and dynamic data release platform for large language models | 230 |