ChainForge
An open-source visual programming environment for battle-testing prompts to Large Language Models (LLMs), used to evaluate response quality and performance.
Stars: 2k
Watching: 30
Forks: 179
Language: TypeScript
Last commit: 23 days ago
Topics: ai, evaluation, large-language-models, llmops, llms, prompt-engineering
Related projects:
Related projects:

| Repository | Description | Stars |
|---|---|---|
| promptfoo/promptfoo | A tool for testing and evaluating large language models (LLMs) to ensure they are reliable and secure | 4,754 |
| hegelai/prompttools | A set of tools for testing and evaluating natural language processing models and vector databases | 2,708 |
| eleutherai/lm-evaluation-harness | Provides a unified framework to test generative language models on various evaluation tasks | 6,970 |
| confident-ai/deepeval | A framework for evaluating large language models | 3,669 |
| explodinggradients/ragas | A toolkit for evaluating and optimizing Large Language Model applications with data-driven insights | 7,233 |
| langfuse/langfuse | An integrated development platform for large language models (LLMs) that provides observability, analytics, and management tools | 6,537 |
| llmware-ai/llmware | A framework for building enterprise LLM-based applications using small, specialized models | 6,651 |
| open-compass/opencompass | An LLM evaluation platform supporting various models and datasets | 4,124 |
| openai/evals | A framework for evaluating large language models and systems, providing a registry of benchmarks | 15,015 |
| scisharp/llamasharp | A C#/.NET library to efficiently run Large Language Models (LLMs) on local devices | 2,673 |
| brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4 | 8,440 |
| instructor-ai/instructor | A Python library that provides structured outputs from large language models (LLMs) and facilitates seamless integration with various LLM providers | 8,163 |
| microsoft/prompt-engine | A utility library for crafting prompts to help Large Language Models generate specific outputs | 2,591 |
| fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs | 9,192 |
| pair-code/lit | An interactive tool for analyzing and understanding machine learning models | 3,492 |