HalluQA

Hallucination detector

An evaluation framework for assessing the performance of large language models on question-answering tasks with hallucination detection

Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"

GitHub

111 stars

5 watching

4 forks

Language: Python

Last commit: about 2 years ago

arxiv.org/pdf/2310.03368.pdf

Related projects:

Repository	Description	Stars
openkg-org/easydetect	A framework to detect and mitigate hallucinations in multimodal large language models	48
junyangwang0410/haelm	A framework for detecting hallucinations in large language models	17
rucaibox/pope	An evaluation framework for detecting object hallucinations in vision-language models	187
amazon-science/refchecker	Automates fine-grained hallucination detection in large language model outputs	325
opendatalab/ha-dpo	A framework to improve large language model performance by mitigating hallucination effects through data and optimization techniques	73
assafbk/mocha_code	A unified framework and benchmark for detecting and mitigating hallucinations in open-vocabulary image captioning models	13
x-plug/mplug-halowl	Evaluates and mitigates hallucinations in multimodal large language models	82
bradyfu/woodpecker	A method to correct hallucinations in multimodal large language models without requiring retraining	617
1zhou-wang/memvr	An implementation of a method to mitigate hallucinations in large language models using visual re-tracing	28
damo-nlp-sg/vcd	An approach to reduce object hallucinations in large vision-language models by contrasting output distributions derived from original and distorted visual inputs	222
yuqifan1117/hallucidoctor	Tools and frameworks to mitigate hallucinatory toxicity in visual instruction data, allowing researchers to fine-tune MLLMs on specific datasets	41
fuxiaoliu/lrv-instruction	A research project focused on mitigating hallucinations in large multimodal models by improving instruction tuning through robust training methods	262
bcdnlp/faithscore	Evaluates answers generated by large vision-language models to assess hallucinations	27
tianyi-lab/hallusionbench	An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy	259
masaiahhan/correlationqa	An investigation into the relationship between misleading images and hallucinations in large language models	8