Bilingual Evaluation Understudy (BLEU) NLP
noun phrase
Definition: An automatic evaluation metric for machine translation that scores a candidate translation against one or more human reference translations by combining modified (clipped) n-gram precisions, typically for n = 1 to 4, with a brevity penalty that lowers the score of candidates shorter than the references [Papineni et al. 2002].
Example in context: “Specifically, we utilize two well-known automatic metrics: BLEU (Bilingual Evaluation Understudy) and perplexity (PPL), for assessing the quality of text generation and machine translation.” [Liu and Yin 2023]
Synonyms: BLEU; BLEU score
Related terms: machine translation evaluation; ROUGE; METEOR; BERTScore
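Illustration: The two components named in the definition, modified n-gram precision and the brevity penalty, can be sketched in a few lines of Python. This is a minimal sentence-level sketch, assuming whitespace-tokenized input, uniform weights over 1- to 4-grams, and no smoothing; the function names (ngram_counts, modified_precision, brevity_penalty, bleu) are hypothetical and not taken from any library. Papineni et al. define the metric over a whole test corpus, so scores reported in the literature are normally computed with an established toolkit rather than code like this.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand = ngram_counts(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def brevity_penalty(candidate, references):
    """1.0 when the candidate is longer than the closest reference length,
    otherwise exp(1 - r / c)."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n=4):
    """Geometric mean of modified precisions for n = 1..max_n, scaled by the
    brevity penalty. Assumption: no smoothing, so any zero precision gives 0."""
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)


references = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
degenerate = "the the the the the the the".split()
unigram_precision = modified_precision(degenerate, references, 1)
```

In this example the degenerate candidate repeats "the" seven times, but "the" occurs at most twice in any single reference, so clipping limits the unigram precision to 2/7 instead of 7/7. A candidate identical to a reference of at least four tokens scores 1.0.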