LASER Multilingual Representations
1. Main idea
First, train an encoder that learns to produce a multilingual, fixed-size sentence representation; then compute a distance between two sentences in the learned embedding space.
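The second step reduces to a distance computation between two fixed-size vectors. A minimal sketch, assuming the embeddings are already available as NumPy arrays (toy 4-d vectors stand in for LASER's actual 1024-d embeddings; the values are made up):

```python
import numpy as np

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine distance between two fixed-size sentence embeddings.

    0 means identical direction; larger values mean less similar.
    """
    sim = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return 1.0 - sim

# Hypothetical embeddings of an English sentence and its French translation.
emb_en = np.array([0.2, 0.8, 0.1, 0.5])
emb_fr = np.array([0.25, 0.75, 0.05, 0.55])

print(cosine_distance(emb_en, emb_fr))  # small distance: the pair is close
```

Because the representation is multilingual, the same distance works across language pairs; mining parallel sentences is then a nearest-neighbor search in this space.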
BERT domain adaptation
1. In the first step, we train a domain classifier with the same model architecture on data from different domains, using domain labels as supervision.
2. In the second step, we select a subset of source-domain data based on the domain probability assigned by the classifier, and fine-tune the original model on the selected source data.
The trained domain classifier is then used to predict, for each source-domain data point, the probability that it belongs to the target domain. The source data points with the highest target-domain probability are selected for fine-tuning BERT for domain adaptation.
- multi-source domain adaptation
- applies to few-shot learning scenarios, in which the selected source-domain data augments the limited target-domain training data
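The selection step above is just a top-k filter on classifier scores. A minimal sketch, assuming we already have a target-domain probability per source example (the classifier outputs here are hypothetical placeholders, not real model predictions):

```python
from typing import List, Sequence, Tuple

def select_source_data(examples: Sequence[str],
                       target_probs: Sequence[float],
                       k: int) -> List[str]:
    """Keep the k source examples with the highest predicted
    probability of belonging to the target domain."""
    ranked: List[Tuple[str, float]] = sorted(
        zip(examples, target_probs), key=lambda pair: pair[1], reverse=True
    )
    return [ex for ex, _ in ranked[:k]]

# Toy source pool with made-up classifier probabilities.
pool = ["src sent 1", "src sent 2", "src sent 3", "src sent 4"]
probs = [0.10, 0.90, 0.40, 0.70]

selected = select_source_data(pool, probs, k=2)
print(selected)  # the two most target-like source sentences
```

In the few-shot setting, `selected` would be concatenated with the small target-domain training set before fine-tuning.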
Method 1: Data Selection using Cross-Entropy (source side)
Method 2: Data Selection using Cross-Entropy Difference (source side)
Method 3 (novel): Data Selection using Bilingual Cross-Entropy Difference (source and target side)
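The three scoring rules can be sketched with per-sentence cross-entropies from in-domain and out-of-domain language models. This assumes such cross-entropy values are already computed (the numbers below are invented for illustration); lower scores mean "more in-domain-like":

```python
def ce_score(h_in: float) -> float:
    """Method 1: rank by in-domain cross-entropy alone (source side)."""
    return h_in

def ced_score(h_in: float, h_out: float) -> float:
    """Method 2: cross-entropy difference, H_in(s) - H_out(s)."""
    return h_in - h_out

def bilingual_ced_score(h_in_src: float, h_out_src: float,
                        h_in_tgt: float, h_out_tgt: float) -> float:
    """Method 3: sum the cross-entropy differences of both sides
    of a sentence pair (source and target)."""
    return ced_score(h_in_src, h_out_src) + ced_score(h_in_tgt, h_out_tgt)

# Hypothetical per-sentence cross-entropies (in-domain vs. out-of-domain LM)
# for two sentence pairs: (h_in_src, h_out_src, h_in_tgt, h_out_tgt).
pair_a = (2.0, 3.5, 2.2, 3.0)   # in-domain-looking on both sides
pair_b = (4.0, 3.0, 4.5, 3.2)   # out-of-domain-looking on both sides

print(bilingual_ced_score(*pair_a))  # negative: keep
print(bilingual_ced_score(*pair_b))  # positive: discard
```

Ranking the corpus by these scores and keeping the lowest-scoring sentences yields the selected subset; the bilingual variant simply requires LMs on both sides of the parallel data.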