bleurt
NLG evaluation metric
BLEURT is an evaluation metric for Natural Language Generation based on transfer learning.
705 stars
13 watching
85 forks
Language: Python
last commit: over 1 year ago
Linked from 1 awesome list
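As a sketch of typical usage (assuming the `bleurt` pip package is installed and a checkpoint has been downloaded; the `"BLEURT-20"` path below is a placeholder, and the import is guarded so the snippet degrades gracefully when the package is absent):

```python
# Hedged sketch of scoring candidates against references with BLEURT.
# Assumes the `bleurt` package and a local checkpoint; "BLEURT-20" is a
# placeholder path, not a guaranteed location.
try:
    from bleurt import score as bleurt_score

    scorer = bleurt_score.BleurtScorer("BLEURT-20")  # placeholder checkpoint path
    scores = scorer.score(
        references=["The cat sat on the mat."],
        candidates=["A cat was sitting on the mat."],
    )
    # `scores` is a list of floats, one per candidate; higher means better.
except ImportError:
    scores = None  # bleurt (or its dependencies) not installed in this environment
```

The scorer loads the fine-tuned checkpoint once and can then be reused across many reference/candidate batches.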
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | A toolset for evaluating and comparing natural language generation models. | 1,350 |
| | Provides implementations of various supervised machine learning evaluation metrics in multiple programming languages. | 1,632 |
| | A repository of papers and resources for evaluating large language models. | 1,450 |
| | A statistical natural language parser used to generate grammatically correct sentences from unstructured text input. | 227 |
| | A collection of research implementations and models for natural language generation. | 694 |
| | Compiles bias evaluation datasets and provides access to original data sources for large language models. | 115 |
| | Provides intermediate representations of data for NLG tasks like Discourse Ordering and Lexicalization. | 69 |
| | A Python library to manipulate and transform indexable data. | 49 |
| | A tool for calculating and analyzing BPEL metrics. | 0 |
| | A Java API for generating natural language texts from syntactic forms. | 810 |
| | A framework for efficient and optimized retrieval-augmented generative pipelines using state-of-the-art LLMs and information retrieval. | 1,392 |
| | A library providing an implementation of various metrics for object segmentation and saliency detection in computer vision. | 150 |
| | A pre-trained language model designed for various NLP tasks, including dialogue generation, code completion, and retrieval. | 94 |
| | Provides benchmarking policies and datasets for offline reinforcement learning. | 85 |