Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.
BERT and the Pretrain-Finetune Paradigm
3 questions · Difficulty 2-4

foundation (2/10) · conceptual
The pretrain-then-finetune paradigm popularized by BERT (2018) has become standard in NLP. What is the key advantage over training from scratch on each task?
A. General linguistic knowledge learned on large unlabeled corpora transfers to downstream tasks with minimal labeled data
B. Pretraining randomizes weights more effectively than standard initialization schemes, improving convergence
C. Pretraining is a legal requirement imposed by data licensing on commercial text datasets
D. Finetuning eliminates the need for any labeled data, since the model has already seen all possible examples
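The transfer idea behind the question can be seen in a toy, self-contained sketch. This is not BERT: the four-sentence "corpus", the co-occurrence-count "pretraining", and the nearest-prototype "finetuning" below are all illustrative stand-ins for masked-language-model pretraining and gradient-based finetuning. The point it demonstrates is that features learned from unlabeled text alone can make a downstream task learnable from very few labeled examples.

```python
# "Pretraining": build word vectors from co-occurrence counts on an
# unlabeled corpus (a crude stand-in for BERT's masked-LM objective).
unlabeled_corpus = [
    "the movie was great and fun",
    "the film was terrible and boring",
    "a great fun film",
    "a terrible boring movie",
]

vocab = sorted({w for s in unlabeled_corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}

def cooc_vector(word):
    # Represent a word by counts of the words it co-occurs with.
    vec = [0.0] * len(vocab)
    for sent in unlabeled_corpus:
        words = sent.split()
        if word in words:
            for w in words:
                if w != word:
                    vec[idx[w]] += 1.0
    return vec

embeddings = {w: cooc_vector(w) for w in vocab}

def featurize(sentence):
    # Average the pretrained word vectors (a crude sentence encoder).
    vecs = [embeddings[w] for w in sentence.split() if w in embeddings]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# "Finetuning": one labeled phrase per class suffices, because the
# pretrained features already cluster the two sentiments.
labeled = {"great fun": 1, "terrible boring": -1}
prototypes = {label: featurize(text) for text, label in labeled.items()}

def classify(sentence):
    feats = featurize(sentence)
    return max(prototypes, key=lambda lab: dot(feats, prototypes[lab]))

print(classify("a fun movie"))    # classified as positive (1)
print(classify("a boring film"))  # classified as negative (-1)
```

With only one labeled phrase per class, the classifier generalizes to unseen sentences because the heavy lifting was done on unlabeled text. Training the same classifier on random features instead of the pretrained ones would not separate the classes, which is the contrast the question asks about.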