uptrain
AI evaluation platform
An open-source platform to evaluate and improve Generative AI applications through automated checks and analysis
UpTrain is an open-source, unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, and embedding use cases), perform root cause analysis on failure cases, and give insights on how to resolve them.
2k stars
21 watching
193 forks
Language: Python
last commit: 7 months ago
Linked from 1 awesome list
autoevaluation, evaluation, experimentation, hallucination-detection, jailbreak-detection, llm-eval, llm-prompting, llm-test, llmops, machine-learning, monitoring, openai-evals, prompt-engineering, root-cause-analysis
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Automated data search and enrichment tool for machine learning pipelines | 321 |
| | A Python toolbox for simulating and controlling electric motors with a focus on reinforcement learning and classical control. | 311 |
| | A platform for comparing and evaluating AI and machine learning algorithms at scale | 1,779 |
| | This project provides code and model for improving language understanding through generative pre-training using a transformer-based architecture. | 2,167 |
| | Automates bulk code checks and generates unit test codes to supplement AI tools like Copilot and Cursor | 36 |
| | Teaching software developers to build intelligent agents using deep reinforcement learning and OpenAI Gym | 374 |
| | Develops an instance segmentation and panoptic segmentation model for computer vision tasks. | 648 |
| | An eXplainability toolbox for machine learning that enables data analysis and model evaluation to mitigate biases and improve performance | 1,135 |
| | A collection of Generative AI examples on iOS | 80 |
| | An AI classifier designed to determine whether text is written by humans or machines. | 122 |
| | Automated unit test generation and mutation testing tool using Large Language Models. | 252 |
| | A framework for evaluating large language models | 669 |
| | Provides a benchmarking framework for evaluating declarative knowledge graph construction engines in the transport domain | 17 |
| | Automated notification system for machine learning model training | 343 |
| | A tool for building and deploying generative AI applications with a no-code multi-agent framework | 1,659 |