inspect_ai
Model inspector
Inspect: A framework for large language model evaluations
669 stars
9 watching
135 forks
Language: Python
Last commit: 10 months ago
Linked from 1 awesome list
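Inspect evaluations are defined as Python tasks that combine a dataset, a solver, and a scorer. The snippet below is a minimal sketch based on Inspect's documented `Task`/`Sample`/`generate`/`exact` API; the model identifier is only a placeholder, and exact signatures may differ between versions.

```python
from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.scorer import exact
from inspect_ai.solver import generate


@task
def hello_world():
    # A single-sample dataset: the model should reply with the target string.
    return Task(
        dataset=[Sample(input="Just reply with Hello World", target="Hello World")],
        solver=[generate()],  # ask the model for a completion
        scorer=exact(),       # score by exact match against the target
    )


# Run the evaluation against a model (the model name here is an example).
eval(hello_world(), model="openai/gpt-4o")
```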
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An evaluation framework for large language models trained with instruction-tuning methods | 535 |
| | A framework for evaluating language models on NLP tasks | 326 |
| | Evaluates language models using standardized benchmarks and prompting techniques | 2,059 |
| | Automates code quality checks for Python programs | 1,049 |
| | An evaluation toolkit and platform for assessing large models across a variety of domains | 307 |
| | Evaluates foundation models on human-centric tasks with diverse exams and question types | 714 |
| | Provides utilities for inspecting and analyzing Python types at runtime | 352 |
| | A tool for testing and evaluating large language models, with a focus on AI safety and model assessment | 506 |
| | A framework for efficiently evaluating and benchmarking large models | 308 |
| | Evaluates the legal knowledge of large language models using a custom benchmarking framework | 273 |
| | An evaluation framework that uses Chinese high-school examination questions to assess large language model capabilities | 565 |
| | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance | 2,063 |
| | Provides tools and techniques for interpreting machine learning models | 483 |
| | Provides pre-trained language models and tools for fine-tuning and evaluation | 439 |
| | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 37 |