inspect_ai
Model inspector
Inspect: A framework for large language model evaluations
669 stars
9 watching
135 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
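As a rough sketch of how an Inspect evaluation is put together, the snippet below wires a one-sample dataset to a generation step and a simple scorer. It assumes the current `inspect_ai` API surface (`Task`, `Sample`, `generate`, `includes`); exact parameter names (for example `solver` vs. the older `plan`) and the model string are assumptions that may differ across versions.

```python
from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.solver import generate
from inspect_ai.scorer import includes

@task
def hello_world():
    # Minimal task: one sample, a plain generation step, and a scorer
    # that checks whether the model output includes the target string.
    return Task(
        dataset=[Sample(input="Just reply with the word hello.", target="hello")],
        solver=generate(),   # may be `plan=` on older inspect_ai releases
        scorer=includes(),
    )

if __name__ == "__main__":
    # Model choice is illustrative; any provider/model Inspect supports works here.
    eval(hello_world(), model="openai/gpt-4o")
```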
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | An evaluation framework for large language models trained with instruction tuning methods | 535 |
| | A framework for evaluating language models on NLP tasks | 326 |
| | Evaluates language models using standardized benchmarks and prompting techniques | 2,059 |
| | Automates code quality checks for Python programs | 1,049 |
| | An evaluation toolkit and platform for assessing large models in various domains | 307 |
| | Evaluates foundation models on human-centric tasks with diverse exams and question types | 714 |
| | Provides utilities for inspecting and analyzing Python types at runtime | 352 |
| | A tool for testing and evaluating large language models with a focus on AI safety and model assessment | 506 |
| | A framework for efficiently evaluating and benchmarking large models | 308 |
| | Evaluates the legal knowledge of large language models using a custom benchmarking framework | 273 |
| | An evaluation framework using Chinese high school examination questions to assess large language model capabilities | 565 |
| | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance | 2,063 |
| | Provides tools and techniques for interpreting machine learning models | 483 |
| | Provides pre-trained language models and tools for fine-tuning and evaluation | 439 |
| | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 37 |