betterprompt
Prompt evaluator and test suite for LLM prompts: an API for evaluating the quality of text prompts used with Large Language Models (LLMs) based on perplexity estimation.
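As a rough illustration of the idea, the sketch below scores candidate prompts by computing their perplexity under a generic Hugging Face causal language model; the model choice (`gpt2`) and the `prompt_perplexity` helper are assumptions for demonstration, not betterprompt's actual API.

```python
# Illustrative sketch only: perplexity-based prompt scoring with a generic
# Hugging Face causal LM. Model name and function are assumed for this
# example and are not betterprompt's real interface.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumed model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def prompt_perplexity(prompt: str) -> float:
    """Return the perplexity of `prompt` under the language model.

    Lower perplexity suggests the prompt reads as more 'natural' to the
    model, which is the intuition behind perplexity-based prompt evaluation.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # Passing the input ids as labels makes the model return the average
        # token-level negative log-likelihood in `outputs.loss`.
        outputs = model(**inputs, labels=inputs["input_ids"])
    return math.exp(outputs.loss.item())


if __name__ == "__main__":
    candidates = [
        "Summarize the following article in three bullet points:",
        "article summary give three points bullet now:",
    ]
    # Rank candidate prompts; the lower-perplexity prompt is preferred.
    for prompt in sorted(candidates, key=prompt_perplexity):
        print(f"{prompt_perplexity(prompt):8.2f}  {prompt!r}")
```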
43 stars
3 watching
4 forks
Language: Python
Last commit: 9 months ago

Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | A tool to reduce the complexity of text prompts to minimize API costs and model computations. | 246 |
| | Evaluating and improving large multimodal models through in-context learning | 21 |
| | Evaluates language models using standardized benchmarks and prompting techniques. | 2,059 |
| | An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks | 99 |
| | Evaluates foundation models on human-centric tasks with diverse exams and question types | 714 |
| | An evaluation tool for question-answering systems using large language models and natural language processing techniques | 1,065 |
| | An evaluation toolkit for large vision-language models | 1,514 |
| | Evaluates segmentation performance in medical imaging using multiple metrics | 57 |
| | An evaluation suite for assessing chart understanding in multimodal large language models. | 85 |
| | An implementation of a two-stage framework designed to prompt large language models with answer heuristics for knowledge-based visual question answering tasks. | 270 |
| | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline. | 22 |
| | A comprehensive toolkit for evaluating NLP experiments, offering automated metrics and efficient computation. | 187 |
| | Evaluates German transformer language models with syntactic agreement tests | 7 |
| | An evaluation framework for large language models trained with instruction tuning methods | 535 |