Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.Disclaimer

Learning Rate Scheduling

3 questionsDifficulty 3-6View topic
Foundation
0 / 3
1 foundation2 intermediateAdapts to your performance
1 / 3
foundation (3/10)conceptual
Why does stochastic gradient descent (SGD) typically use a decaying learning rate schedule?