evidently
Model monitor
An observability framework for evaluating and monitoring the performance of machine learning models and data pipelines
Evidently is an open-source ML and LLM observability framework for evaluating, testing, and monitoring any AI-powered system or data pipeline, from tabular data to generative AI, with 100+ built-in metrics.
5k stars
48 watching
598 forks
Language: Jupyter Notebook
Last commit: 7 days ago
Linked from 8 awesome lists
Tags: data-drift, data-quality, data-science, data-validation, generative-ai, hacktoberfest, html-report, jupyter-notebook, llm, llmops, machine-learning, mlops, model-monitoring, pandas-dataframe
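As a quick illustration of the kind of check Evidently supports, here is a minimal data-drift report in Python. This is a sketch based on the `Report` and `DataDriftPreset` API from earlier Evidently releases; exact module paths and names may differ in current versions, and the `reference.csv`/`current.csv` files are hypothetical placeholders.

```python
import pandas as pd

# Sketch based on the Report / DataDriftPreset API from earlier
# Evidently releases; module paths may differ in newer versions.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Hypothetical placeholder datasets: a reference window the model
# was trained on and a current window of production data.
reference = pd.read_csv("reference.csv")
current = pd.read_csv("current.csv")

# Build a report from a preset that bundles the standard drift
# metrics, then compare the two datasets column by column.
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

# Save a self-contained HTML report (the html-report output
# referenced in the project tags above).
report.save_html("data_drift_report.html")
```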
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| confident-ai/deepeval | A framework for evaluating large language models | 3,669 |
| explodinggradients/ragas | A toolkit for evaluating and optimizing Large Language Model applications with data-driven insights | 7,233 |
| giskard-ai/giskard | Automates detection and evaluation of performance, bias, and security issues in AI applications | 4,071 |
| openai/evals | A framework for evaluating large language models and systems, providing a registry of benchmarks | 15,015 |
| eleutherai/lm-evaluation-harness | Provides a unified framework to test generative language models on various evaluation tasks | 6,970 |
| instructor-ai/instructor | A Python library that provides structured outputs from large language models (LLMs) and integrates with various LLM providers | 8,163 |
| relari-ai/continuous-eval | Provides a comprehensive framework for evaluating Large Language Model (LLM) applications and pipelines with customizable metrics | 446 |
| ianarawjo/chainforge | An environment for battle-testing prompts to Large Language Models (LLMs) to evaluate response quality and performance | 2,334 |
| pair-code/lit | An interactive tool for analyzing and understanding machine learning models | 3,492 |
| cleanlab/cleanlab | Automates data quality checks and model training with AI-driven methods to improve machine learning performance | 9,756 |
| christophm/interpretable-ml-book | A comprehensive resource for explaining the decisions and behavior of machine learning models | 4,794 |
| interpretml/interpret | An open-source package for explaining machine learning models and promoting transparency in AI decision-making | 6,296 |
| h2oai/mli-resources | Provides tools and techniques for interpreting machine learning models | 484 |
| aiplanethub/beyondllm | An open-source toolkit for building and evaluating large language models | 261 |
| psycoy/mixeval | An evaluation suite and dynamic data release platform for large language models | 224 |