uptrain
AI evaluation platform
An open-source platform to evaluate and improve Generative AI applications through automated checks and analysis
UpTrain is an open-source, unified platform to evaluate and improve Generative AI applications. It provides grades for 20+ preconfigured checks (covering language, code, and embedding use cases), performs root-cause analysis on failure cases, and gives insights on how to resolve them.
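For context, here is a minimal sketch of what running these checks looks like, based on the EvalLLM quickstart in UpTrain's README; the API key and sample data are illustrative placeholders, and check names may differ between versions.

```python
import json

# Assumes the EvalLLM/Evals interface from UpTrain's README quickstart.
from uptrain import EvalLLM, Evals

OPENAI_API_KEY = "sk-..."  # placeholder; UpTrain uses an LLM as the evaluator

# Each record holds the retrieved context and the model's response to grade.
data = [{
    "question": "Which checks does UpTrain ship with?",
    "context": "UpTrain provides 20+ preconfigured checks covering "
               "language, code, and embedding use cases.",
    "response": "UpTrain ships with more than 20 preconfigured checks.",
}]

eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

# Each check returns a 0-1 score plus an explanation, which is what
# the root-cause analysis on failure cases builds on.
results = eval_llm.evaluate(
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY],
)
print(json.dumps(results, indent=2))
```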
2k stars
20 watching
191 forks
Language: Python
Last commit: 3 months ago
Linked from 1 awesome list
Topics: autoevaluation, evaluation, experimentation, hallucination-detection, jailbreak-detection, llm-eval, llm-prompting, llm-test, llmops, machine-learning, monitoring, openai-evals, prompt-engineering, root-cause-analysis
Related projects:
| Repository | Description | Stars |
|---|---|---|
| upgini/upgini | Automated data search and enrichment tool for machine learning pipelines | 319 |
| upb-lea/gym-electric-motor | A Python toolbox for simulating and controlling electric motors, with a focus on reinforcement learning and classical control | 303 |
| cloud-cv/evalai | A platform for comparing and evaluating AI and machine learning algorithms at scale | 1,771 |
| openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training of a transformer-based architecture | 2,160 |
| qiangyt/batchai | An AI-powered tool for automating code review and improvement in software projects | 24 |
| packtpublishing/hands-on-intelligent-agents-with-openai-gym | Teaches developers to build intelligent agents using deep reinforcement learning and OpenAI Gym | 373 |
| uber-research/upsnet | An instance and panoptic segmentation model for computer vision tasks | 649 |
| ethicalml/xai | An eXplainability toolbox for machine learning that supports data analysis and model evaluation to mitigate bias and improve performance | 1,125 |
| shu223/ios-genai-sampler | Demonstrates various Generative AI examples and local LLMs on iOS | 81 |
| promptslab/openai-detector | An AI classifier for determining whether text was written by a human or a machine | 122 |
| codeintegrity-ai/mutahunter | Automated unit-test generation and mutation testing using large language models | 243 |
| ukgovernmentbeis/inspect_ai | A framework for evaluating large language models | 615 |
| oeg-upm/gtfs-bench | A benchmarking framework for evaluating declarative knowledge graph construction engines in the transport domain | 17 |
| aporia-ai/mlnotify | Automated notification system for machine learning model training | 343 |
| trypromptly/llmstack | A no-code multi-agent framework for building and deploying generative AI applications | 1,610 |