fiddler-auditor
LLM auditor
Fiddler Auditor is an auditing tool for evaluating language models and identifying weaknesses in them before deployment.
173 stars
8 watching
20 forks
Language: Python
last commit: 10 months ago
Linked from 1 awesome list
Topics: ai-observability, evaluation, generative-ai, langchain, llms, nlp, robustness
Related projects:
Repository | Description | Stars |
---|---|---|
mlabonne/llm-autoeval | A tool to automate the evaluation of large language models in Google Colab using various benchmarks and custom parameters | 566 |
dperfly/fiddler2jmeter | A tool to convert Fiddler/Charles requests to JMeter scripts, with request-filtering support | 47 |
iamgroot42/mimir | A Python package for measuring memorization in large language models | 126 |
h2oai/h2o-llm-eval | An evaluation framework for large language models with an Elo rating system and A/B testing capabilities | 50 |
qcri/llmebench | A benchmarking framework for large language models | 81 |
freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
declare-lab/instruct-eval | An evaluation framework for large language models trained with instruction-tuning methods | 535 |
howiehwong/trustllm | A toolkit for assessing trustworthiness in large language models | 491 |
hardlycodeman/audit_helper | Automates Foundry boilerplate setup for smart contract audits | 20 |
relari-ai/continuous-eval | A comprehensive framework for evaluating Large Language Model (LLM) applications and pipelines with customizable metrics | 455 |
thmsmlr/instructor_ex | A library that provides structured outputs for Large Language Models (LLMs) in Elixir | 587 |
allenai/olmo-eval | A framework for evaluating language models on NLP tasks | 326 |
mlgroupjlu/llm-eval-survey | A repository of papers and resources for evaluating large language models | 1,450 |
open-compass/lawbench | Evaluates the legal knowledge of large language models using a custom benchmarking framework | 273 |
adebayoj/fairml | An auditing toolbox to assess the fairness of black-box predictive models | 361 |