Perplexity in Language Models

1.PPL

PPL (perplexity) is a metric used in natural language processing (NLP) to evaluate how good a language model is. It estimates the probability of a sentence from the probability of each word, normalized by sentence length:

$$\mathrm{PPL}(W) = P(w_1 w_2 \cdots w_N)^{-\frac{1}{N}} = \sqrt[N]{\frac{1}{P(w_1 w_2 \cdots w_N)}} = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_1 \cdots w_{i-1})}}$$

As the formula shows, the smaller the perplexity, the better the model. The last part of the formula, the product of conditional probabilities $P(w_i \mid w_1 \cdots w_{i-1})$, looks very much like the factorization used by generative models such as GPT.
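A minimal sketch of this computation in Python. The per-token conditional probabilities would normally come from a language model; the numbers below are made up for illustration:

```python
import math

def perplexity(token_probs):
    """Perplexity of a sentence from the conditional probabilities
    P(w_i | w_1 .. w_{i-1}) of its tokens: the geometric mean of
    their inverses, computed via log-probs for numerical stability."""
    n = len(token_probs)
    log_prob = sum(math.log(p) for p in token_probs)
    return math.exp(-log_prob / n)

# A model that assigns uniform probability 1/4 to every token
# has perplexity ~ 4, regardless of sentence length.
print(perplexity([0.25] * 8))
```

Summing log-probabilities instead of multiplying raw probabilities avoids underflow on long sentences, which is also why perplexity is usually reported as the exponential of the average cross-entropy loss.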

2.Language Model

  • autoregressive (AR) language model

GPT:

$$\max_\theta \; \log p_\theta(\mathbf{x}) = \sum_{t=1}^{T} \log p_\theta(x_t \mid \mathbf{x}_{<t})$$
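The autoregressive factorization above can be sketched directly, with a hypothetical hand-built conditional table standing in for the model (the probabilities are made up):

```python
import math

# Hypothetical conditional probabilities p(token | context), for illustration only.
COND = {
    ("<s>",): {"the": 0.6, "a": 0.4},
    ("<s>", "the"): {"cat": 0.5, "dog": 0.5},
    ("<s>", "the", "cat"): {"sat": 0.8, "ran": 0.2},
}

def ar_log_prob(tokens):
    """GPT-style objective: log p(x) = sum_t log p(x_t | x_<t),
    each token conditioned only on the tokens to its left."""
    context = ["<s>"]
    total = 0.0
    for t in tokens:
        total += math.log(COND[tuple(context)][t])
        context.append(t)
    return total

print(ar_log_prob(["the", "cat", "sat"]))  # log(0.6) + log(0.5) + log(0.8)
```

Note that the left-to-right conditioning is exactly the product inside the perplexity formula above, which is why perplexity is natural to compute for AR models.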

  • autoencoding (AE) language model

BERT (denoising auto-encoding):

$$\max_\theta \; \log p_\theta(\bar{\mathbf{x}} \mid \hat{\mathbf{x}}) \approx \sum_{t=1}^{T} m_t \log p_\theta(x_t \mid \hat{\mathbf{x}})$$

where $\hat{\mathbf{x}}$ is the corrupted input with some tokens replaced by [MASK], and $m_t = 1$ indicates that $x_t$ is masked.
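The denoising objective can be sketched the same way. Here `p_fill` is a hypothetical stand-in for the masked-token predictor, with made-up probabilities:

```python
import math

def ae_log_prob(tokens, mask, p_fill):
    """BERT-style objective: sum_t m_t * log p(x_t | x_hat),
    where x_hat is the corrupted input and m_t = 1 marks masked tokens."""
    corrupted = ["[MASK]" if m else t for t, m in zip(tokens, mask)]
    return sum(
        math.log(p_fill(t, corrupted, i))
        for i, (t, m) in enumerate(zip(tokens, mask))
        if m
    )

# Hypothetical fill-in probabilities for each masked position (made-up numbers).
def p_fill(token, corrupted, position):
    return {("cat", 1): 0.7, ("sat", 2): 0.9}[(token, position)]

lp = ae_log_prob(["the", "cat", "sat"], mask=[0, 1, 1], p_fill=p_fill)
print(lp)  # log(0.7) + log(0.9)
```

The $\approx$ in the objective shows up here too: each masked token is scored independently given the same corrupted input, rather than conditioned on the other masked tokens.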


[Paper Notes][2020-WWW] Enhanced-RCNN: An Efficient Method for Learning Sentence Similarity

The Related Work section at the front of the paper works well as a survey of prior sentence-similarity methods.

I. Architecture

II. Details

1.Input Encoding

(a)RNN Encoder

(b)CNN Encoder

2.Interactive Sentence Representation

(a) Soft-attention Alignment (similar to ESIM)

(b)Interaction Modeling

3.Similarity Modeling

(a)Fusion Layer

(b)Label Prediction

MLP+softmax
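The soft-attention alignment step above can be sketched with NumPy. This is an ESIM-style sketch under assumed shapes, not the paper's exact implementation:

```python
import numpy as np

def soft_align(a, b):
    """ESIM-style soft alignment between two encoded sentences.
    a: (la, d) and b: (lb, d) are hidden states from the encoders.
    Each token of one sentence is re-represented as an attention-weighted
    mix of the other sentence's tokens."""
    e = a @ b.T                                        # (la, lb) similarity scores
    attn_ab = np.exp(e - e.max(axis=1, keepdims=True))  # softmax over b, rowwise
    attn_ab /= attn_ab.sum(axis=1, keepdims=True)
    attn_ba = np.exp(e.T - e.T.max(axis=1, keepdims=True))  # softmax over a
    attn_ba /= attn_ba.sum(axis=1, keepdims=True)
    a_tilde = attn_ab @ b                              # (la, d)
    b_tilde = attn_ba @ a                              # (lb, d)
    return a_tilde, b_tilde

rng = np.random.default_rng(0)
a, b = rng.normal(size=(5, 8)), rng.normal(size=(7, 8))
a_t, b_t = soft_align(a, b)
print(a_t.shape, b_t.shape)  # (5, 8) (7, 8)
```

The aligned representations `a_tilde`/`b_tilde` would then feed the interaction-modeling and fusion layers before the final MLP + softmax prediction.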
