Neural Network Foundations
A tiny MLP build path: linear layers, activations, losses, manual backprop, train/test behavior, and simple regularization.
Time: ~10 hours
Core loop: forward pass → loss → backward pass → update
Topics: 10 ordered topics
End state: You can build a tiny MLP, explain every tensor shape, and debug whether training is failing because of data, loss, gradients, or updates.
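Everything in the path elaborates that one loop. A minimal sketch of it on a linear regression model, assuming NumPy; all names, sizes, and the learning rate are illustrative choices, not fixed by the path:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))            # 32 examples, 4 features
true_W = rng.normal(size=(4, 1))
y = X @ true_W + 0.01 * rng.normal(size=(32, 1))

W = np.zeros((4, 1))
b = np.zeros((1,))
lr = 0.1

for step in range(100):
    # forward pass
    y_hat = X @ W + b                   # (32, 1)
    # loss: mean squared error
    loss = np.mean((y_hat - y) ** 2)
    # backward pass
    d_y_hat = 2 * (y_hat - y) / len(X)  # (32, 1)
    dW = X.T @ d_y_hat                  # (4, 1)
    db = d_y_hat.sum(axis=0)            # (1,)
    # update
    W -= lr * dW
    b -= lr * db
```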
Build the smallest useful neural block and track the shape of every tensor.
Implement Y = XW + b and explain every shape in the forward and backward pass.
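A minimal sketch of this step, assuming NumPy and a batch-first layout (rows are examples); the shape ledger lives in the comments, and the gradients follow from Y = XW + b:

```python
import numpy as np

def linear_forward(X, W, b):
    # X: (N, D_in), W: (D_in, D_out), b: (D_out,)
    Y = X @ W + b          # (N, D_out); b broadcasts across the batch
    cache = (X, W)         # saved for the backward pass
    return Y, cache

def linear_backward(dY, cache):
    # dY: (N, D_out) — upstream gradient dL/dY
    X, W = cache
    dX = dY @ W.T          # (N, D_in)      — same shape as X
    dW = X.T @ dY          # (D_in, D_out)  — same shape as W
    db = dY.sum(axis=0)    # (D_out,)       — b was broadcast, so sum over batch
    return dX, dW, db
```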
Understand the simplest linear classifier before adding hidden layers.
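In the simplest case, prediction is a single affine map from features to class scores followed by an argmax. A toy sketch with illustrative shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))      # 5 examples, 4 features
W = rng.normal(size=(4, 3))      # 3 classes
b = np.zeros(3)

scores = X @ W + b               # (5, 3): one score per class per example
preds = scores.argmax(axis=1)    # (5,): predicted class indices
```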
Know why nonlinearities make a network more than one big linear map.
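The claim is checkable in a few lines: stacked linear layers without an activation collapse exactly into one linear layer, and an elementwise ReLU is what breaks the collapse. A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
W1, W2 = rng.normal(size=(4, 6)), rng.normal(size=(6, 3))

# Two stacked linear maps equal a single linear map with weights W1 @ W2.
no_act = (X @ W1) @ W2
collapsed = X @ (W1 @ W2)
assert np.allclose(no_act, collapsed)

# With a ReLU in between, no single weight matrix reproduces the map.
with_act = np.maximum(X @ W1, 0.0) @ W2
```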
Turn predictions into a scalar loss and use gradient descent to change parameters.
Pick a scalar objective that matches regression or classification.
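A sketch of the two standard objectives, assuming NumPy: mean squared error for regression, cross-entropy over softmax probabilities for classification (the max-subtraction is the stability trick covered later in the path):

```python
import numpy as np

def mse_loss(y_hat, y):
    # y_hat, y: (N, 1) real-valued predictions and targets
    return np.mean((y_hat - y) ** 2)

def cross_entropy_loss(logits, labels):
    # logits: (N, C), labels: (N,) integer class indices
    shifted = logits - logits.max(axis=1, keepdims=True)   # for stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```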
Update parameters using gradients, learning rates, momentum, and Adam-style variants.
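Sketches of the three update rules, assuming NumPy; the hyperparameter defaults are common illustrative values, not prescriptions from the path:

```python
import numpy as np

def sgd(w, g, lr=0.1):
    # Plain gradient descent: step against the gradient.
    return w - lr * g

def sgd_momentum(w, g, v, lr=0.1, beta=0.9):
    v = beta * v + g                       # running accumulation of gradients
    return w - lr * v, v

def adam(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g              # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * g**2           # second moment (mean of squares)
    m_hat = m / (1 - b1**t)                # bias correction, t starts at 1
    v_hat = v / (1 - b2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```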
Verify gradients numerically before trusting a training loop.
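A minimal central-difference check, assuming NumPy; `numerical_grad` and the test function are illustrative names:

```python
import numpy as np

def numerical_grad(f, w, eps=1e-5):
    # Central differences, one coordinate at a time.
    grad = np.zeros_like(w)
    it = np.nditer(w, flags=["multi_index"])
    for _ in it:
        i = it.multi_index
        old = w[i]
        w[i] = old + eps; f_plus = f(w)
        w[i] = old - eps; f_minus = f(w)
        w[i] = old                         # restore before moving on
        grad[i] = (f_plus - f_minus) / (2 * eps)
    return grad

# Example: the analytic gradient of f(w) = sum(w**2) is 2w.
w = np.random.default_rng(0).normal(size=(3, 2))
analytic = 2 * w
numeric = numerical_grad(lambda w: np.sum(w**2), w)
assert np.allclose(analytic, numeric, atol=1e-6)
```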
Compose layers, cache forward values, and send gradients backward.
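A two-layer sketch of the compose–cache–backpropagate pattern, assuming NumPy and a ReLU between the layers; every cached forward value is marked in the comments:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
W1, b1 = rng.normal(size=(4, 6)), np.zeros(6)
W2, b2 = rng.normal(size=(6, 3)), np.zeros(3)

# Forward pass: keep every intermediate the backward pass will need.
H_pre = X @ W1 + b1            # (8, 6)  cached
H = np.maximum(H_pre, 0.0)     # (8, 6)  cached (ReLU)
Y = H @ W2 + b2                # (8, 3)

# Backward pass: consume the cache in reverse order.
dY = np.ones_like(Y)           # stand-in upstream gradient dL/dY
dW2 = H.T @ dY
db2 = dY.sum(axis=0)
dH = dY @ W2.T
dH_pre = dH * (H_pre > 0)      # ReLU gate: gradient flows only where input > 0
dW1 = X.T @ dH_pre
db1 = dH_pre.sum(axis=0)
dX = dH_pre @ W1.T
```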
Turn logits into probabilities without overflow or underflow.
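The standard fix is to subtract each row's maximum logit before exponentiating: the shift cancels in the ratio, so the softmax value is exactly unchanged, but every exponent is at most zero, so nothing overflows. A sketch assuming NumPy:

```python
import numpy as np

def softmax(logits):
    # logits: (N, C). Subtracting the max is exact, not an approximation:
    # exp(z - m) / sum(exp(z - m)) == exp(z) / sum(exp(z)).
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Naive exp(1000) overflows to inf; the shifted version stays finite.
big = np.array([[1000.0, 1000.5, 999.0]])
print(softmax(big))   # finite probabilities that sum to 1
```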
Connect logits, labels, likelihood, and gradients for classification.
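The identity that ties these together: for softmax followed by cross-entropy, the gradient of the loss with respect to the logits is the predicted probabilities minus the one-hot labels. A sketch assuming NumPy:

```python
import numpy as np

def cross_entropy_and_grad(logits, labels):
    # logits: (N, C), labels: (N,) integer class indices
    N = len(labels)
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), labels]).mean()
    # dL/dlogits = (probs - one_hot(labels)) / N
    dlogits = probs.copy()
    dlogits[np.arange(N), labels] -= 1.0
    return loss, dlogits / N
```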
Separate fitting the toy data from learning a rule that survives held-out examples.
Check whether a tiny network learned a rule or just memorized examples.
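One concrete version of the check, assuming NumPy; the stand-in `predict` and the random data are purely illustrative. Compare accuracy on the training examples against held-out examples and treat a large gap as memorization:

```python
import numpy as np

def accuracy(predict, X, y):
    # predict(X) -> (N, C) scores; y: (N,) integer labels
    return float((predict(X).argmax(axis=1) == y).mean())

# Illustrative stand-in for a trained network's forward pass.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 2))
predict = lambda X: X @ W

X_train, y_train = rng.normal(size=(30, 2)), rng.integers(0, 2, 30)
X_test, y_test = rng.normal(size=(10, 2)), rng.integers(0, 2, 10)

gap = accuracy(predict, X_train, y_train) - accuracy(predict, X_test, y_test)
# Train accuracy near 1.0 with a large gap means the network memorized the
# examples; similar train and test accuracy suggests it learned a rule.
```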
Use weight decay, dropout, and early stopping as concrete controls.
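Sketches of the three controls, assuming NumPy; the decay coefficient, drop rate, and patience are illustrative values:

```python
import numpy as np

# Weight decay: shrink weights toward zero at every update.
def decayed_update(w, g, lr=0.1, wd=1e-4):
    return w - lr * (g + wd * w)

# Inverted dropout: zero activations at train time, scale to keep the mean.
def dropout(h, p_drop, rng, train=True):
    if not train:
        return h                       # no-op at test time
    mask = (rng.random(h.shape) >= p_drop) / (1 - p_drop)
    return h * mask

# Early stopping: keep the parameters with the best validation loss.
best_loss, best_params, patience = np.inf, None, 0
for epoch, val_loss in enumerate([0.9, 0.7, 0.6, 0.65, 0.7]):  # toy losses
    if val_loss < best_loss:
        best_loss, best_params, patience = val_loss, f"params@{epoch}", 0
    else:
        patience += 1
        if patience >= 2:              # stop after 2 epochs with no improvement
            break
```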
Do not just read the pages. For each step, write the shape ledger, answer the practice prompt, and then run a small quiz or diagnostic. The goal is operational fluency: you should be able to predict what will change before the code or the algebra tells you.