opencompass
LLM evaluator
An LLM evaluation platform supporting various models and datasets
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.
4k stars
26 watching
457 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
Topics: benchmark, chatgpt, evaluation, large-language-model, llama2, llama3, llm, openai
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Fine-tuned language models trained on mixed-quality data | 5,273 |
| | Provides a unified framework to test generative language models on various evaluation tasks. | 7,200 |
| | An evaluation toolkit for large vision-language models | 1,514 |
| | An environment for battle-testing prompts to Large Language Models (LLMs) to evaluate response quality and performance. | 2,413 |
| | An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |
| | A set of extensions built on top of OpenTelemetry to provide observability for large language model applications. | 5,188 |
| | A Chinese language large language model built from OpenLLaMA and fine-tuned on various datasets for multilingual text generation. | 65 |
| | An incremental pre-trained Chinese large language model based on the LLaMA-7B model | 234 |
| | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,487 |
| | An integrated development platform for large language models (LLMs) that provides observability, analytics, and management tools. | 7,123 |
| | A platform for training, serving, and evaluating large language models to enable tool use capability | 4,888 |
| | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy | 5,775 |
| | Evaluates the legal knowledge of large language models using a custom benchmarking framework. | 273 |
| | A flexible framework for adapting pre-trained language models to downstream NLP tasks using textual templates | 4,398 |
| | A multi-platform translator and text processing tool leveraging ChatGPT API | 24,004 |