deepeval
LLM evaluator
A framework for evaluating large language models
The LLM Evaluation Framework
4k stars
23 watching
324 forks
Language: Python
Last commit: 4 months ago
Linked from 2 awesome lists
Topics: evaluation-framework, evaluation-metrics, llm-evaluation, llm-evaluation-framework, llm-evaluation-metrics
Related projects:
Repository | Description | Stars |
---|---|---|
| A toolkit for evaluating and optimizing Large Language Model applications with objective metrics, test data generation, and seamless integrations. | 7,598 |
| An observability framework for evaluating and monitoring the performance of machine learning models and data pipelines | 5,519 |
| Provides a unified framework to test generative language models on various evaluation tasks. | 7,200 |
| Provides pre-packaged building blocks for generative AI applications with standardized APIs and service-oriented design. | 5,164 |
| An environment for battle-testing prompts to Large Language Models (LLMs) to evaluate response quality and performance. | 2,413 |
| An AI orchestration framework to build customizable LLM applications with advanced retrieval methods. | 18,094 |
| A low-code framework for building custom deep learning models and neural networks | 11,236 |
| Provides a comprehensive framework for evaluating Large Language Model (LLM) applications and pipelines with customizable metrics | 455 |
| Automates the detection of performance, bias, and security issues in AI applications | 4,125 |
| A framework for evaluating large language models and systems, providing a registry of benchmarks. | 15,168 |
| A Python-based framework for serving large language models with low latency and high scalability. | 2,691 |
| A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 40,053 |
| A Database for AI that stores and manages various data types used in deep learning applications. | 8,237 |
| An evaluation suite and dynamic data release platform for large language models | 230 |
| Enables LLM inference with minimal setup and high performance on various hardware platforms | 69,185 |