uptrain

AI evaluation platform

An open-source platform to evaluate and improve Generative AI applications through automated checks and analysis

UpTrain is an open-source, unified platform to evaluate and improve Generative AI applications. It provides grades for 20+ preconfigured checks (covering language, code, and embedding use cases), performs root-cause analysis on failure cases, and gives insights on how to resolve them.
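As a quick illustration of that workflow, here is a minimal sketch following UpTrain's documented Python quickstart, which runs a few of the preconfigured checks against a question/context/response record. The EvalLLM entry point and Evals check names come from the project's README; the sample data and API key below are placeholders, not values from this page.

    # Minimal sketch of running preconfigured UpTrain checks
    # (based on the project's quickstart; data and key are placeholders).
    from uptrain import EvalLLM, Evals

    data = [{
        "question": "Which is the most popular global sport?",
        "context": "Football is followed by billions of fans worldwide.",
        "response": "Football is the most popular sport globally.",
    }]

    # UpTrain uses an LLM (here via OpenAI) to grade each check.
    eval_llm = EvalLLM(openai_api_key="sk-...")  # placeholder key

    # Run a subset of the 20+ preconfigured checks mentioned above.
    results = eval_llm.evaluate(
        data=data,
        checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY],
    )
    print(results)

Each result carries a per-check score and an explanation, which is what feeds the root-cause analysis described above.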

GitHub

2k stars
21 watching
193 forks
Language: Python
Last commit: 5 months ago
Linked from 1 awesome list

Tags: autoevaluation, evaluation, experimentation, hallucination-detection, jailbreak-detection, llm-eval, llm-prompting, llm-test, llmops, machine-learning, monitoring, openai-evals, prompt-engineering, root-cause-analysis

Related projects:

Repository | Description | Stars
upgini/upgini | Automated data search and enrichment tool for machine learning pipelines | 321
upb-lea/gym-electric-motor | A Python toolbox for simulating and controlling electric motors, with a focus on reinforcement learning and classical control | 311
cloud-cv/evalai | A platform for comparing and evaluating AI and machine learning algorithms at scale | 1,779
openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer architecture | 2,167
qiangyt/batchai | Automates bulk code checks and generates unit test code to supplement AI tools like Copilot and Cursor | 36
packtpublishing/hands-on-intelligent-agents-with-openai-gym | Teaches software developers to build intelligent agents using deep reinforcement learning and OpenAI Gym | 374
uber-research/upsnet | An instance segmentation and panoptic segmentation model for computer vision tasks | 648
ethicalml/xai | An eXplainability toolbox for machine learning that enables data analysis and model evaluation to mitigate bias and improve performance | 1,135
shu223/ios-genai-sampler | A collection of Generative AI examples on iOS | 80
promptslab/openai-detector | An AI classifier that determines whether text was written by a human or a machine | 122
codeintegrity-ai/mutahunter | Automated unit test generation and mutation testing tool using Large Language Models | 252
ukgovernmentbeis/inspect_ai | A framework for evaluating large language models | 669
oeg-upm/gtfs-bench | A benchmarking framework for evaluating declarative knowledge graph construction engines in the transport domain | 17
aporia-ai/mlnotify | Automated notification system for machine learning model training | 343
trypromptly/llmstack | A tool for building and deploying generative AI applications with a no-code multi-agent framework | 1,659