trulens
Evaluation and Tracking for LLM Experiments
A tool to evaluate and track the performance of large language model (LLM) experiments.
2k stars
19 watching
189 forks
Language: Python
Last commit: 5 days ago
Linked from 1 awesome list
Tags: explainable-ml, llm, llmops, machine-learning, neural-networks
Related projects:
| Repository | Description | Stars |
|---|---|---|
| rlworkgroup/dowel | A tool for logging and tracking machine learning research progress in Python | 32 |
| replicate/keepsake | A tool for version control of machine learning experiments | 1,650 |
| qcri/llmebench | A benchmarking framework for large language models | 80 |
| polyaxon/traceml | A framework for tracking and analyzing machine learning data, including model performance, inputs, outputs, and project metadata | 504 |
| didactic-drunk/fiber_metrics.cr | A tool to track and measure runtime, memory allocation, and other performance metrics for individual fibers or methods within concurrent applications | 8 |
| neptune-ai/neptune-client | An experiment tracker for machine learning model training that lets users log and visualize their experiments in detail | 584 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 92 |
| catalyst-team/alchemy | Tools and infrastructure to log and visualize experiments in deep learning research | 50 |
| iamgroot42/mimir | Measures memorization in large language models (LLMs) to detect potential privacy issues | 121 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 55 |
| alephbet/gimel | An A/B testing backend built on AWS Lambda and Redis HyperLogLog to track experiment data in a scalable, cost-effective way | 227 |
| rwdaigle/metrix | An Elixir library for logging custom application metrics in a well-structured format for downstream processing systems | 52 |
| trubrics/trubrics-sdk | An analytics platform for tracking performance and feedback of AI-powered assistants using machine learning and data analysis techniques | 133 |
| leks-forever/nllb-tuning | An experimental project for fine-tuning the NLLB language model on a specific dataset and evaluating its translation performance | 7 |
| luogen1996/lavin | An open-source implementation of a vision-language instructed large language model | 508 |