Applied ML

Hebbian Learning

Local correlation-based plasticity rules from Hebb (1949) through STDP, Oja, and BCM. Modern reinterpretations link Hebbian dynamics to predictive coding and contrastive learning.

Advanced · Tier 3 · Stable · ~15 min

Why This Matters

Backpropagation requires a global error signal that propagates exact gradients backward through symmetric weights. Real cortex has neither: synapses update from locally available signals, and forward and backward pathways use distinct anatomical projections. Hebbian rules are the family of learning algorithms that respect those locality constraints. They are the default model of cortical plasticity, the algorithm STDP experiments measure, and the starting point for almost every "biologically plausible alternative to backprop" proposal.

Hebbian learning also matters because it keeps reappearing in machine learning under new names: PCA via Oja's rule, normalization via BCM, contrastive learning as approximate energy-based Hebbian descent. Understanding the original gives a sharper lens on the modern variants and on the recurring question of why cortex does not seem to need exact gradients.

Core Ideas

Hebb's postulate (1949). "When an axon of cell A is near enough to excite cell B and repeatedly takes part in firing it, some growth process or metabolic change takes place." The textbook simplification: $\Delta w_{ij} \propto x_i\, x_j$. When pre- and post-synaptic activity correlate, the weight grows. The rule is local in space (only the two endpoints) and in time (only current activity).
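A minimal numerical sketch of the textbook rule on one linear neuron (all constants and names here are illustrative, not from the article). It also previews the instability discussed next: correlated input only ever grows the weights.

```python
import numpy as np

rng = np.random.default_rng(0)

w = rng.normal(scale=0.01, size=2)   # small random initial weights
eta = 0.01                           # learning rate (illustrative)

norms = []
for _ in range(200):
    # Inputs strongly correlated along the direction [1, 1].
    s = rng.normal()
    x = s * np.array([1.0, 1.0]) + 0.1 * rng.normal(size=2)
    y = w @ x                # linear postsynaptic activity
    w += eta * y * x         # pure Hebb: Δw_i = η · y · x_i
    norms.append(np.linalg.norm(w))

# The weight norm can only grow -- each update adds η(x·w)²(2 + η|x|²) ≥ 0
# to |w|².  This is the runaway that Oja's decay term fixes.
print(norms[-1] > norms[0])  # prints True
```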

Stability fixes: Oja and BCM. Pure Hebb is unstable: weights blow up because correlated activity always grows them. Oja (1982) added a normalizing decay term, $\Delta w_i = \eta\, y\,(x_i - y\, w_i)$, and proved that the rule converges to the principal eigenvector of the input covariance: it is online PCA. Bienenstock-Cooper-Munro (BCM, 1982) introduced a sliding modification threshold $\theta_M$ that scales with average post-synaptic activity, producing both LTP (long-term potentiation) and LTD (long-term depression) regimes and accounting for orientation selectivity in visual cortex.
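The online-PCA claim for Oja's rule is easy to verify numerically. A sketch under illustrative assumptions (2-D Gaussian data, fixed learning rate): after one pass, the weight vector should align with the top eigenvector of the input covariance and settle near unit norm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2-D data; its covariance has principal eigenvector along [1, 1].
X = rng.normal(size=(5000, 2)) @ np.array([[1.0, 0.9], [0.9, 1.0]])

w = rng.normal(scale=0.1, size=2)
eta = 0.01
for x in X:
    y = w @ x
    w += eta * y * (x - y * w)   # Oja: Hebbian term minus normalizing decay

# Compare against offline PCA: top eigenvector of the sample covariance.
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)
v1 = eigvecs[:, -1]              # eigenvector with the largest eigenvalue
alignment = abs(w @ v1) / np.linalg.norm(w)
print(round(alignment, 3))       # should be close to 1.0
```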

Spike-timing-dependent plasticity. Bi and Poo (1998, J. Neurosci. 18(24)) measured plasticity in cultured hippocampal neurons as a function of the precise relative timing $\Delta t = t_{\text{post}} - t_{\text{pre}}$. Pre-before-post within roughly 20 ms causes potentiation; post-before-pre causes depression. STDP refines Hebb to a causal rule: A wires to B only if A's spike actually contributed to B's spike. STDP is the dominant phenomenological model of cortical synaptic plasticity.
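The standard phenomenological fit is a pair of exponentials in $\Delta t$. A sketch with illustrative amplitudes and time constants (in the ballpark of the ~20 ms window described above, but not the fitted Bi & Poo values):

```python
import numpy as np

A_plus, A_minus = 0.01, 0.012      # LTP / LTD amplitudes (illustrative)
tau_plus, tau_minus = 20.0, 20.0   # time constants in ms (illustrative)

def stdp_dw(dt):
    """Weight change for dt = t_post - t_pre, in ms."""
    if dt > 0:                      # pre before post: causal -> potentiation
        return A_plus * np.exp(-dt / tau_plus)
    if dt < 0:                      # post before pre: anti-causal -> depression
        return -A_minus * np.exp(dt / tau_minus)
    return 0.0

print(stdp_dw(10.0) > 0, stdp_dw(-10.0) < 0)  # prints True True
```

The sign of the update flips with the sign of $\Delta t$, and the magnitude decays as the spikes move apart, which is what makes the rule causal rather than merely correlational.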

Biological plausibility critiques of backprop. Backprop suffers from the weight-transport problem: the backward pass needs $W^\top$, the transpose of the forward weights. Lillicrap, Cownden, Tweed, and Akerman (2016, Nat. Commun. 7) showed that fixed random feedback weights work nearly as well, eliminating the symmetry requirement. This feedback alignment result, together with target propagation and predictive coding, suggests that approximate gradient signals carried by local Hebbian-like updates can train deep networks. Whether cortex actually implements anything in this family is unsettled.
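Feedback alignment is simple enough to sketch directly: replace $W_2^\top$ in the backward pass with a fixed random matrix $B$ and watch the loss still fall. Everything below (layer sizes, the random linear teacher, the learning rate) is an illustrative toy setup, not the experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 4, 16, 1
W1 = rng.normal(scale=0.5, size=(n_hid, n_in))
W2 = rng.normal(scale=0.5, size=(n_out, n_hid))
B = rng.normal(scale=0.5, size=(n_hid, n_out))   # fixed random feedback, never trained

T = rng.normal(size=(n_out, n_in))               # random linear teacher to regress
eta = 0.01
losses = []
for _ in range(2000):
    x = rng.normal(size=(n_in, 1))
    h = np.tanh(W1 @ x)
    y = W2 @ h
    e = y - T @ x                                # output error
    losses.append(float(e.T @ e))
    # W2 gets its true local gradient; W1 gets B @ e where backprop would use W2.T @ e.
    W2 -= eta * e @ h.T
    W1 -= eta * (B @ e * (1 - h**2)) @ x.T

print(np.mean(losses[-500:]) < np.mean(losses[:500]))
```

The interesting part is why this works: the forward weights $W_2$ drift into alignment with $B^\top$ during training, so the random feedback comes to carry an approximate gradient, without any weight transport.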

Common Confusions

Watch Out

"Neurons that fire together wire together" is the rule. That phrase (Lowel and Singer, popularized by Carla Shatz) is a slogan, not Hebb's actual statement. Hebb required A to participate in firing B, which is a directional and causal claim. STDP makes this precise; symmetric correlation rules do not.

Watch Out

Hebbian learning is unsupervised, so it can replace SGD. Pure Hebbian rules find principal components and do unsupervised feature learning. They do not minimize task loss. Modern "biologically plausible" schemes succeed by smuggling in a global signal (a target, a contrastive phase, a predictive error), which is no longer purely Hebbian.

Last reviewed: April 18, 2026
