trulens

Performance tracker for LLMs

A tool to evaluate and track the performance of large language model (LLM) experiments

Evaluation and Tracking for LLM Experiments

GitHub

2k stars
19 watching
189 forks
Language: Python
last commit: 5 days ago
Linked from 1 awesome list

explainable-ml, llm, llmops, machine-learning, neural-networks

Related projects:

| Repository | Description | Stars |
|---|---|---|
| rlworkgroup/dowel | A tool for logging and tracking machine learning research progress in Python | 32 |
| replicate/keepsake | A tool for version control of machine learning experiments | 1,650 |
| qcri/llmebench | A benchmarking framework for large language models | 80 |
| polyaxon/traceml | A software framework for tracking and analyzing machine learning data, including model performance, inputs, outputs, and project metadata | 504 |
| didactic-drunk/fiber_metrics.cr | A tool to track and measure runtime, memory allocation, and other performance metrics for individual fibers or methods within concurrent applications | 8 |
| neptune-ai/neptune-client | An experiment tracker for machine learning model training that allows users to log and visualize their experiments in detail | 584 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 92 |
| catalyst-team/alchemy | Provides tools and infrastructure to log and visualize experiments in deep learning research | 50 |
| iamgroot42/mimir | Measures memorization in large language models (LLMs) to detect potential privacy issues | 121 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 55 |
| alephbet/gimel | An A/B testing backend built using AWS Lambda and Redis HyperLogLog to efficiently track experiment data in a scalable and cost-effective manner | 227 |
| rwdaigle/metrix | An Elixir library to log custom application metrics in a well-structured format for downstream processing systems | 52 |
| trubrics/trubrics-sdk | An analytics platform designed to track performance and feedback for AI-powered assistants using machine learning and data analysis techniques | 133 |
| leks-forever/nllb-tuning | An experimental project for fine-tuning the NLLB language model on a specific dataset and evaluating its performance on translation tasks | 7 |
| luogen1996/lavin | An open-source implementation of a vision-language instructed large language model | 508 |