uptrain
AI evaluation platform
An open-source platform to evaluate and improve Generative AI applications through automated checks and analysis
UpTrain is an open-source, unified platform for evaluating and improving Generative AI applications. It provides grades for 20+ preconfigured checks (covering language, code, and embedding use cases), performs root-cause analysis on failure cases, and offers insights on how to resolve them.
2k stars
21 watching
193 forks
Language: Python
last commit: 5 months ago
Linked from 1 awesome list
Topics: autoevaluation, evaluation, experimentation, hallucination-detection, jailbreak-detection, llm-eval, llm-prompting, llm-test, llmops, machine-learning, monitoring, openai-evals, prompt-engineering, root-cause-analysis
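The check-and-grade workflow described above can be illustrated with a toy sketch. Note that this is not UpTrain's actual API; the function names, the word-overlap heuristic, and the 0.5 threshold are invented purely to show the pattern of running preconfigured checks over a sample and flagging failures for later analysis:

```python
# Toy sketch of the "automated check" pattern (NOT UpTrain's API).
# The check, its heuristic, and the threshold are invented for illustration.

def response_completeness(question: str, response: str) -> float:
    """Grade how many content words of the question the response covers."""
    stopwords = {"the", "a", "an", "is", "are", "what", "how", "of", "to"}
    q_terms = {w.lower().strip("?.!,") for w in question.split()} - stopwords
    r_terms = {w.lower().strip("?.!,") for w in response.split()}
    if not q_terms:
        return 1.0
    return len(q_terms & r_terms) / len(q_terms)

def run_checks(sample: dict, checks: dict, threshold: float = 0.5) -> dict:
    """Apply each check to a sample and flag low-grade checks as failures."""
    grades = {name: fn(sample["question"], sample["response"])
              for name, fn in checks.items()}
    failures = [name for name, grade in grades.items() if grade < threshold]
    return {"grades": grades, "failures": failures}

sample = {
    "question": "What is the capital of France?",
    "response": "The capital of France is Paris.",
}
report = run_checks(sample, {"response_completeness": response_completeness})
print(report)  # all question terms covered, so no failures are flagged
```

In a real evaluation platform the grading functions would typically be model-based rather than lexical, but the surrounding loop of grading, thresholding, and collecting failure cases for root-cause analysis is the same.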
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| upgini/upgini | Automated data search and enrichment tool for machine learning pipelines | 321 |
| upb-lea/gym-electric-motor | A Python toolbox for simulating and controlling electric motors, with a focus on reinforcement learning and classical control | 311 |
| cloud-cv/evalai | A platform for comparing and evaluating AI and machine learning algorithms at scale | 1,779 |
| openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer architecture | 2,167 |
| qiangyt/batchai | Automates bulk code checks and generates unit tests to supplement AI tools like Copilot and Cursor | 36 |
| packtpublishing/hands-on-intelligent-agents-with-openai-gym | Teaches software developers to build intelligent agents using deep reinforcement learning and OpenAI Gym | 374 |
| uber-research/upsnet | An instance segmentation and panoptic segmentation model for computer vision tasks | 648 |
| ethicalml/xai | An eXplainability toolbox for machine learning that enables data analysis and model evaluation to mitigate biases and improve performance | 1,135 |
| shu223/ios-genai-sampler | A collection of Generative AI examples on iOS | 80 |
| promptslab/openai-detector | An AI classifier that determines whether text was written by a human or a machine | 122 |
| codeintegrity-ai/mutahunter | Automated unit test generation and mutation testing tool using Large Language Models | 252 |
| ukgovernmentbeis/inspect_ai | A framework for evaluating large language models | 669 |
| oeg-upm/gtfs-bench | A benchmarking framework for evaluating declarative knowledge graph construction engines in the transport domain | 17 |
| aporia-ai/mlnotify | Automated notification system for machine learning model training | 343 |
| trypromptly/llmstack | A no-code multi-agent framework for building and deploying generative AI applications | 1,659 |