XiezhiBenchmark
Questionnaire
An evaluation suite for assessing language models' performance on multiple-choice questions
93 stars
1 watching
4 forks
Language: Python
Last commit: about 1 year ago
Related projects:
Repository | Description | Stars |
---|---|---|
| Measures how well large language models understand massive multitask Chinese datasets | 87 |
| Develops and publishes large multilingual language models with an advanced mixture-of-experts architecture. | 37 |
| Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 274 |
| A large language model developed to support multiple languages and applications | 648 |
| A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation | 182 |
| A large language model developed by XVERSE Technology Inc. using a transformer architecture and fine-tuned on diverse datasets for various applications. | 132 |
| A large multimodal model for visual question answering, trained on a dataset of 2.1B image-text pairs and 8.2M instruction sequences. | 78 |
| A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336 |
| Evaluates and benchmarks large language models' video understanding capabilities | 121 |
| A large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain | 328 |
| A multilingual large language model developed by XVERSE Technology Inc., built on a mixture-of-experts architecture and fine-tuned for tasks such as conversation, question answering, and natural language understanding. | 36 |
| Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types. | 143 |
| An open-source framework providing tools and models for analyzing and generating classical Chinese texts using large language models | 263 |
| Develops a multimodal task and dataset to assess vision-language models' ability to handle interleaved image-text inputs. | 33 |
| An analysis project investigating limitations of visual language models in understanding and processing images with potential biases and interference challenges. | 53 |