Spiking Neural Networks
Discrete-event neuron models trained with surrogate gradients. Energy-efficient on neuromorphic hardware, but rarely competitive with ANNs on standard benchmarks.
Why This Matters
Standard artificial neurons emit continuous activations every forward pass. Biological neurons emit binary spikes asynchronously and stay quiet otherwise. Spiking neural networks (SNNs) preserve that sparsity: a neuron contributes energy only when it fires. On event-driven neuromorphic chips like Intel Loihi 2 and SpiNNaker 2, this asymmetry yields one to three orders of magnitude lower inference energy than a comparable ANN running on a GPU, especially for streaming sensor data from event cameras.
The catch is training. Spike functions are non-differentiable, so vanilla backprop does not apply. A decade of progress (surrogate gradients, ANN-to-SNN conversion, time-to-first-spike coding) has narrowed but not closed the accuracy gap on static-image benchmarks like ImageNet. SNNs remain the right tool when the substrate is event-driven, the power budget is tight, or the input is intrinsically temporal. They are usually the wrong tool when you have a GPU and a static dataset.
Core Ideas
Leaky integrate-and-fire (LIF). The canonical neuron integrates input current I(t) into a membrane potential V(t) that leaks toward rest with time constant τ_m:

τ_m dV/dt = −(V − V_rest) + R I(t)

When V crosses threshold V_th, the neuron emits a spike and resets to V_reset. Discretizing in time gives a recurrent unit with binary output s_t and hidden state V_t.
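The discretized update can be sketched in a few lines of plain Python. The parameter values below (τ_m = 20 steps, unit threshold, hard reset) are illustrative, not from any particular library:

```python
def lif_step(v, i_in, tau_m=20.0, dt=1.0, v_rest=0.0, v_th=1.0, v_reset=0.0):
    """One Euler step of a leaky integrate-and-fire neuron.
    Returns (spike, new_v); spike is 1.0 on a threshold crossing, else 0.0."""
    # Leak toward rest, then integrate the (R-scaled) input current.
    v = v + (dt / tau_m) * (-(v - v_rest) + i_in)
    if v >= v_th:
        return 1.0, v_reset   # spike, then hard reset
    return 0.0, v

# Drive the neuron with a constant suprathreshold current: the membrane
# charges toward 1.5, crosses threshold, resets, and repeats.
v, spikes = 0.0, 0
for _ in range(100):
    s, v = lif_step(v, i_in=1.5)
    spikes += int(s)
print(spikes)   # a handful of regularly spaced spikes
```

Unrolled over timesteps, this is exactly the recurrent unit described above: `v` is the hidden state and the binary spike is the output.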
Surrogate gradients. The spike has zero derivative almost everywhere. Neftci, Mostafa, and Zenke (2019, IEEE Signal Process. Mag. 36(6)) replace the derivative with a smooth surrogate (a fast sigmoid, a triangular pulse) only in the backward pass. The forward pass stays binary, so inference remains spike-driven; the backward pass behaves like training a recurrent net through time.
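The forward/backward split can be made concrete with a minimal sketch, assuming a fast-sigmoid surrogate σ(x) = x/(1+|x|) with an illustrative sharpness β:

```python
def spike_forward(v, v_th=1.0):
    """Forward pass: hard threshold, binary spike output."""
    return [1.0 if x >= v_th else 0.0 for x in v]

def spike_backward_surrogate(v, v_th=1.0, beta=10.0):
    """Backward pass: derivative of the fast sigmoid sigma(x) = x / (1 + |x|),
    evaluated at beta * (v - v_th), used in place of the true spike
    derivative (which is zero almost everywhere)."""
    return [beta / (1.0 + abs(beta * (x - v_th))) ** 2 for x in v]

v = [0.2, 0.9, 1.0, 1.4]
print(spike_forward(v))             # [0.0, 0.0, 1.0, 1.0] -- stays binary
print(spike_backward_surrogate(v))  # peaks at threshold, nonzero everywhere
```

In a real framework the surrogate would be registered as a custom autograd rule; the point of the sketch is only that inference sees the hard threshold while the learning signal flows through the smooth stand-in.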
ANN-to-SNN conversion. Rueckauer et al. (2017, Front. Neurosci. 11) showed that a ReLU ANN can be mapped to a rate-coded SNN by interpreting each ReLU activation as a firing rate and weight-normalizing per layer. Conversion preserves accuracy on CIFAR-10 and ImageNet within a few percent but requires hundreds of timesteps to integrate stable rates, eroding the energy advantage. Direct SNN training tends to need fewer timesteps but more training compute.
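The ReLU-as-firing-rate correspondence can be checked numerically. The sketch below uses a soft-reset (subtract-threshold) integrate-and-fire neuron, a common conversion choice, with illustrative parameters:

```python
def rate_coded(x, T=200):
    """Firing rate of an integrate-and-fire neuron driven by constant
    input x for T timesteps, with soft reset (subtract the threshold,
    keeping the residual charge). The rate approximates relu(x),
    clipped to 1 spike per timestep."""
    v, spikes = 0.0, 0
    for _ in range(T):
        v += x
        if v >= 1.0:
            v -= 1.0
            spikes += 1
    return spikes / T

for x in [-0.5, 0.0, 0.3, 0.7]:
    print(max(x, 0.0), rate_coded(x))   # firing rate tracks ReLU
```

The rate saturates at one spike per timestep, which is why conversion needs per-layer weight normalization (to keep activations below the ceiling) and long integration windows (to make the rate estimate precise), the two costs noted above.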
Time-coded versus rate-coded. Rate codes encode information in firing rate over a window; they are robust but slow. Temporal codes (time-to-first-spike, phase coding) encode in spike timing and can decide a class in a single spike per neuron. Temporal codes are closer to the biological story and to the energy promise, but harder to train.
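A toy latency decoder illustrates the time-to-first-spike idea. The setup is hypothetical: one integrate-and-fire neuron per class, driven by constant input currents, with the earliest spike deciding the class:

```python
def time_to_first_spike(currents, v_th=1.0, T=50):
    """Latency code: each class neuron integrates its input current;
    the first neuron to cross threshold decides the class."""
    v = [0.0] * len(currents)
    for t in range(1, T + 1):
        v = [vi + ci for vi, ci in zip(v, currents)]
        fired = [i for i, vi in enumerate(v) if vi >= v_th]
        if fired:
            return fired[0], t   # winning class index, decision latency
    return -1, T                 # no spike within the window

print(time_to_first_spike([0.10, 0.30, 0.05]))  # -> (1, 4)
```

Stronger evidence means an earlier spike, so the decision arrives in a handful of timesteps and a single spike per active neuron, versus the hundreds of spikes a rate code would need for the same readout.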
Common Confusions
"SNNs are biologically realistic, so they will eventually beat ANNs." Biological plausibility and task accuracy are different axes. SNNs match ANN accuracy on small static-image benchmarks but lag on ImageNet, language, and most modern benchmarks. The case for SNNs is energy-per-inference on neuromorphic hardware, not representational power.
"Surrogate gradients are mathematically sound." They are a heuristic that works empirically. The surrogate is not the gradient of the spike, and convergence guarantees from smooth optimization do not carry over directly. Treat them as a useful trick, not a derivation.
Last reviewed: April 18, 2026
Prerequisites
Foundations this topic depends on.
- Feedforward Networks and Backpropagation (Layer 2)
- Differentiation in Rn (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)
- Matrix Calculus (Layer 1)
- The Jacobian Matrix (Layer 0A)
- The Hessian Matrix (Layer 0A)
- Matrix Operations and Properties (Layer 0A)
- Eigenvalues and Eigenvectors (Layer 0A)
- Activation Functions (Layer 1)
- Convex Optimization Basics (Layer 1)
- Convolutional Neural Networks (Layer 3)
- Vectors, Matrices, and Linear Maps (Layer 0A)