Batch Normalization
1 question · intermediate (4/10) · conceptual
Batch normalization normalizes activations using the mean and variance computed over the current mini-batch during training. At inference time, it uses running averages instead. Why is this switch necessary?
A. Batch statistics are too expensive to compute at inference time
B. The batch size at inference is always exactly 1 for all deployment scenarios, making per-batch mean and variance mathematically undefined
C. The exponential moving average running statistics computed during training are inherently more accurate than per-batch statistics at inference time
D. Predictions must be deterministic and independent of other examples in a batch; batch statistics would make the output depend on co-occurring inputs
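To make the train/inference distinction concrete, here is a minimal NumPy sketch of the mechanism the question describes. It is illustrative only: the class name `BatchNorm1d`, the `momentum` update rule, and the demo inputs are assumptions, not part of the quiz. The final two checks show why the switch matters, so treat them as a spoiler for the question above.

```python
import numpy as np

class BatchNorm1d:
    """Minimal batch normalization over the feature axis (illustrative sketch)."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.gamma = np.ones(num_features)        # learnable scale
        self.beta = np.zeros(num_features)        # learnable shift
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x, training):
        if training:
            # Training: normalize with statistics of the current mini-batch.
            mean = x.mean(axis=0)
            var = x.var(axis=0)
            # Track exponential moving averages for later use at inference.
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            # Inference: use fixed running statistics, so each example's
            # output does not depend on whatever else is in the batch.
            mean, var = self.running_mean, self.running_var
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta

# The same first example placed in two different batches: training-mode
# outputs differ (batch-dependent), inference-mode outputs are identical.
bn = BatchNorm1d(3)
a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = np.array([[1.0, 2.0, 3.0], [-4.0, 0.0, 9.0]])
print(np.allclose(bn(a, training=True)[0], bn(b, training=True)[0]))    # False
print(np.allclose(bn(a, training=False)[0], bn(b, training=False)[0]))  # True
```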