Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.
Grokking
3 questions
Difficulty 4-6
Question 1 of 3, intermediate (4/10), state theorem
Grokking (Power et al. 2022) is a surprising training phenomenon. What happens?
A. Different layers achieve different accuracy levels, creating a 'grokked' hierarchy
B. The model overfits permanently, and grokking is a name for the irreversible memorization plateau
C. The model suddenly loses all training accuracy after many epochs, requiring re-initialization
D. Training accuracy reaches 100% early, but validation accuracy stays low for many more epochs before suddenly jumping to 100%: long delayed generalization