Wave-Based Neural Networks

A small niche of neural network designs built around wave equations, complex-valued arithmetic, and physical wave propagation. Includes complex-valued networks, optical and photonic networks, and Hamiltonian and symplectic architectures.

Advanced · Tier 3 · Current · ~15 min

What It Is

Most neural networks operate on real-valued tensors with pointwise nonlinearities. A small but coherent line of work builds networks where the underlying object is a wave: a complex-valued field, a propagating optical signal, or a Hamiltonian flow. The motivation varies by subfield. Complex arithmetic gives a natural way to represent phase. Optics gives near-zero-energy linear operations at the speed of light. Hamiltonian structure gives long-horizon stability that purely data-driven networks rarely match.

The unifying technical content is small but real. Translation-invariant linear operations (convolutions) become pointwise multiplications in the Fourier domain, so most wave-based architectures inherit the FFT as their workhorse linear layer. Complex weights double the real parameter count without doubling expressivity in any free way, so the gain has to come from inductive bias, not capacity. None of these architectures dominates standard real-valued networks on generic vision or language benchmarks; they are tools for specific physical or signal-processing settings where the wave structure is part of the data.
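The Fourier-domain claim can be checked numerically. The following is a minimal sketch, assuming NumPy and a periodic (circular) convolution; the function names are illustrative and not taken from any of the cited papers.

```python
import numpy as np

def circular_conv_direct(x, k):
    """Circular convolution straight from the definition, O(n^2)."""
    n = len(x)
    return np.array([sum(x[j] * k[(i - j) % n] for j in range(n)) for i in range(n)])

def circular_conv_fft(x, k):
    """The same operation as a pointwise product in the Fourier domain, O(n log n)."""
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(k))

rng = np.random.default_rng(0)
x = rng.normal(size=16) + 1j * rng.normal(size=16)  # complex-valued signal
k = rng.normal(size=16) + 1j * rng.normal(size=16)  # complex-valued kernel

# Largest elementwise disagreement between the two implementations.
max_diff = np.max(np.abs(circular_conv_direct(x, k) - circular_conv_fft(x, k)))
```

The two results should agree to floating-point precision, which is why an FFT, a pointwise multiply, and an inverse FFT can stand in for a global convolution layer.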

Core Ideas

Complex-valued networks. Trabelsi et al. (2018) define complex convolutions, complex batch normalization, and complex ReLU variants, and report parity or modest gains on audio and music tasks where signals are naturally complex (after a short-time Fourier transform). The complex inner product $\langle x, y \rangle = \sum_i \overline{x_i} \, y_i$ encodes phase alignment, which a real network must learn from scratch. The ICLR 2018 paper is the canonical reference for getting complex-valued backprop and initialization right.
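To illustrate how the complex inner product carries phase, here is a small sketch assuming NumPy. It rotates a random complex vector by a known phase and reads that phase back from a single inner product; the variable names and the value of `phi` are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=64) + 1j * rng.normal(size=64)

phi = 0.7                   # hypothetical global phase offset, in radians
y = np.exp(1j * phi) * x    # y is x rotated by phi in the complex plane

# np.vdot conjugates its first argument, matching sum_i conj(x_i) * y_i.
inner = np.vdot(x, y)

recovered_phase = np.angle(inner)  # argument of the inner product
magnitude_ratio = abs(inner) / (np.linalg.norm(x) * np.linalg.norm(y))
```

Here `recovered_phase` should match `phi` and `magnitude_ratio` should be 1 up to rounding, since the two vectors differ only by a global phase. A real-valued network that sees the stacked real and imaginary parts has no built-in operation that exposes this quantity.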

Optical and photonic networks. A diffractive deep neural network (Lin et al. 2018, Science) is a stack of physical phase masks. Light passes through, each layer applies a learned phase shift, and the output intensity at a detector array is the prediction. Training is done in simulation; the masks are then 3D printed. Inference happens at the speed of light with no electrical power for the linear part. Wetzstein et al. (2020, Nature) survey the broader field of inference with deep optics: lensless cameras, Fourier-plane filters, and metasurfaces co-designed with the downstream network.
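A forward pass through such a stack can be simulated in a few lines. The sketch below assumes NumPy and uses the angular spectrum method, one common scalar-diffraction model that is not necessarily the one used in the cited paper. The wavelength, pixel pitch, layer spacing, and the random (untrained) masks are all hypothetical values picked for illustration.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance):
    """Propagate a square complex field through free space by `distance`."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pixel_pitch)           # spatial frequencies
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    arg = (1.0 / wavelength) ** 2 - FX**2 - FY**2   # > 0 for propagating modes
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    # Evanescent modes (arg <= 0) are simply dropped in this sketch.
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def diffractive_forward(field, phase_masks, wavelength, pixel_pitch, distance):
    """Apply each phase mask, propagate to the next plane, return detector intensity."""
    for mask in phase_masks:
        field = angular_spectrum_propagate(
            field * np.exp(1j * mask), wavelength, pixel_pitch, distance
        )
    return np.abs(field) ** 2

rng = np.random.default_rng(0)
n = 32
input_field = np.zeros((n, n), dtype=complex)
input_field[12:20, 12:20] = 1.0                     # a square aperture as input
masks = [rng.uniform(0, 2 * np.pi, size=(n, n)) for _ in range(3)]

intensity = diffractive_forward(
    input_field, masks, wavelength=0.75e-3, pixel_pitch=0.4e-3, distance=30e-3
)
input_energy = np.sum(np.abs(input_field) ** 2)
output_energy = np.sum(intensity)
```

Each layer is a pointwise phase multiplication followed by a Fourier-domain transfer function, so the whole stack is linear in the field; the only nonlinearity is the intensity readout at the detector. In a training loop the masks would be the learned parameters.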

Wave-equation PDE solvers. The wave equation $\partial_t^2 u = c^2 \nabla^2 u$ shows up in seismic imaging, ultrasound, and electromagnetic simulation. Physics-informed and neural-operator approaches treat the wave field as the network output and either penalize the PDE residual or learn the solution operator from data. Fourier neural operators are a natural fit because, for constant wave speed $c$, the Green's function of the wave equation is diagonal in the Fourier basis.
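The diagonal structure is easiest to see in one dimension. The sketch below assumes NumPy, a constant wave speed, and a periodic domain; under those assumptions each Fourier mode evolves independently as a harmonic oscillator with frequency $c|k|$. The function name and test signal are illustrative.

```python
import numpy as np

def propagate_wave_1d(u0, v0, c, length, t):
    """Exact solution of u_tt = c^2 u_xx on a periodic domain, mode by mode.

    u0 is the initial displacement, v0 the initial velocity du/dt.
    """
    n = len(u0)
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)   # angular wavenumbers
    omega = c * np.abs(k)                             # per-mode frequency
    u_hat, v_hat = np.fft.fft(u0), np.fft.fft(v0)
    # sin(omega t) / omega, with the omega -> 0 limit equal to t.
    safe_omega = np.where(omega > 0, omega, 1.0)
    sin_over_omega = np.where(omega > 0, np.sin(omega * t) / safe_omega, t)
    u_hat_t = u_hat * np.cos(omega * t) + v_hat * sin_over_omega
    return np.fft.ifft(u_hat_t).real

# Check against a right-travelling wave u(x, t) = sin(3 (x - c t)).
n, c, t = 64, 2.0, 0.37
x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
u0 = np.sin(3 * x)
v0 = -3 * c * np.cos(3 * x)
u_t = propagate_wave_1d(u0, v0, c, length=2 * np.pi, t=t)
max_err = np.max(np.abs(u_t - np.sin(3 * (x - c * t))))
```

The time evolution is a pointwise multiplication of the Fourier coefficients, which is the same form as a Fourier neural operator layer. With a spatially varying wave speed the modes couple and this exact shortcut no longer applies.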

Hamiltonian and symplectic networks. Greydanus, Dzamba, and Yosinski (2019) parameterize a learned Hamiltonian $H_\theta(q, p)$ and integrate Hamilton's equations $\dot q = \partial H / \partial p$, $\dot p = -\partial H / \partial q$ with a symplectic integrator. Energy is approximately conserved over long horizons, which standard recurrent networks fail at. The connection to waves is structural: many wave systems have a Hamiltonian formulation, and symplectic integrators preserve the geometric structure that purely data-driven networks discard.
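The effect of a symplectic integrator can be seen without any learning. The sketch below uses a fixed harmonic-oscillator Hamiltonian as a stand-in for a learned $H_\theta$ (an assumption made to keep the example self-contained) and compares explicit Euler with leapfrog, a standard symplectic scheme for separable Hamiltonians. The step size and step count are arbitrary illustrative choices.

```python
def hamiltonian(q, p):
    """Stand-in for a learned H_theta: unit-mass, unit-frequency oscillator."""
    return 0.5 * p * p + 0.5 * q * q

def dH_dq(q, p):
    return q

def dH_dp(q, p):
    return p

def euler_step(q, p, dt):
    """Explicit Euler on Hamilton's equations (not symplectic)."""
    return q + dt * dH_dp(q, p), p - dt * dH_dq(q, p)

def leapfrog_step(q, p, dt):
    """Leapfrog: half kick, full drift, half kick (symplectic)."""
    p_half = p - 0.5 * dt * dH_dq(q, p)
    q_new = q + dt * dH_dp(q, p_half)
    p_new = p_half - 0.5 * dt * dH_dq(q_new, p_half)
    return q_new, p_new

def max_energy_drift(step, q, p, dt, n_steps):
    """Largest |H - H_0| seen along a rollout."""
    e0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step(q, p, dt)
        worst = max(worst, abs(hamiltonian(q, p) - e0))
    return worst

euler_drift = max_energy_drift(euler_step, 1.0, 0.0, dt=0.1, n_steps=1000)
leapfrog_drift = max_energy_drift(leapfrog_step, 1.0, 0.0, dt=0.1, n_steps=1000)
```

With these settings the Euler energy grows without bound while the leapfrog energy error stays small and bounded. In a Hamiltonian network the two derivative functions would come from automatic differentiation of the learned scalar $H_\theta$ rather than being written by hand.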

Watch Out

Optical neural networks are not faster on every workload

Diffractive networks are fast and low-power for the linear part of inference. The nonlinearity, the analog-to-digital conversion at the detector, and the training process are all still electronic and slow. The win is in fixed, repeated linear operations on optical inputs (e.g., on-sensor filtering for cameras). For arbitrary inference on digital data, an optical layer adds I/O cost that wipes out the energy savings.

Watch Out

Complex-valued does not mean twice the capacity

A complex weight has two real degrees of freedom but is constrained by complex linearity: $w \cdot (a + ib) = (w_R a - w_I b) + i (w_I a + w_R b)$. A real network with $2n$ inputs and $2n$ outputs has $4n^2$ free parameters in its linear layer; the corresponding complex layer has $2n^2$ real parameters. The complex layer is strictly less expressive as a linear map; the value comes from inductive bias on phase, not raw capacity.
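The constraint can be made concrete by writing a complex layer as a structured real matrix. This is a minimal sketch assuming NumPy; the helper name `complex_to_real_block` and the size `n = 4` are illustrative choices.

```python
import numpy as np

def complex_to_real_block(W):
    """Real (2n x 2n) matrix acting on [Re z; Im z] exactly as W acts on z."""
    return np.block([[W.real, -W.imag],
                     [W.imag,  W.real]])

n = 4
rng = np.random.default_rng(0)
W = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
z = rng.normal(size=n) + 1j * rng.normal(size=n)

complex_out = W @ z
real_out = complex_to_real_block(W) @ np.concatenate([z.real, z.imag])

# The two computations should agree once the complex output is stacked.
stacked = np.concatenate([complex_out.real, complex_out.imag])
max_diff = np.max(np.abs(real_out - stacked))

complex_params = 2 * W.size                   # 2 n^2 real degrees of freedom
real_params = complex_to_real_block(W).size   # 4 n^2 entries in a free real layer
```

The block matrix has $4n^2$ entries but only $2n^2$ of them are independent: the diagonal blocks are tied to each other, and the off-diagonal blocks are tied up to a sign. A generic real layer of the same shape can represent every such matrix and many others, which is the sense in which the complex layer is a strict subset.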


Last reviewed: April 18, 2026