Proof Tasks

Graded checkpoints that turn topic mastery into earned dependency edges. Each task is anchored to a real definition, theorem, or formula on a canonical TheoremPath page. A correct answer writes evidence to your assessment-attempt history; that evidence flows through the PFA model and the knowledge-state engine to update the corresponding edge state in your knowledge state.

Attention shape trace through QK^T

Difficulty 2/5

In a single attention head, the query matrix Q has shape [B, n_q, d_k], the key matrix K has shape [B, n_k, d_k], and the value matrix V has shape [B, n_k, d_v], where B is the batch size. What is the shape of the attent…

Calculation · attention-mechanism-theory · matrix-multiplication-algorithms
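The question preview above is truncated, but the shape bookkeeping it tests is standard. A minimal NumPy sketch of a single attention head (sizes are illustrative, not from the task):

```python
import numpy as np

B, n_q, n_k, d_k, d_v = 2, 5, 7, 16, 32  # illustrative sizes

Q = np.zeros((B, n_q, d_k))  # queries
K = np.zeros((B, n_k, d_k))  # keys
V = np.zeros((B, n_k, d_v))  # values

# Q @ K^T contracts the shared d_k axis: [B, n_q, d_k] x [B, d_k, n_k].
logits = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)  # [B, n_q, n_k]
weights = np.exp(logits)
weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
out = weights @ V                                  # [B, n_q, d_v]

print(logits.shape, out.shape)
```

The logits are `[B, n_q, n_k]` (one score per query–key pair), and the head output is `[B, n_q, d_v]`.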

Softmax axis in attention logits

Difficulty 2/5

In multi-head attention, the logits tensor has shape [batch, heads, query_tokens, key_tokens]. Over which axis must softmax normalize to produce, for each query position, a probability distribution over key positions?

Definition check · softmax-and-numerical-stability · attention-mechanism-theory
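A quick NumPy sketch of the property the task asks about: normalizing over the key axis (the last axis of a `[batch, heads, query_tokens, key_tokens]` tensor) makes each query row a probability distribution over keys.

```python
import numpy as np

batch, heads, query_tokens, key_tokens = 2, 4, 5, 7
rng = np.random.default_rng(0)
logits = rng.normal(size=(batch, heads, query_tokens, key_tokens))

# Softmax over the key axis (axis=-1). Subtracting the row max first
# is the usual numerical-stability trick.
shifted = logits - logits.max(axis=-1, keepdims=True)
probs = np.exp(shifted)
probs /= probs.sum(axis=-1, keepdims=True)

# Every [b, h, q, :] slice now sums to 1.
print(np.allclose(probs.sum(axis=-1), 1.0))
```

Normalizing over any other axis would mix scores across queries or heads instead of producing a per-query distribution over key positions.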

Fourier multiplier for the heat equation

Difficulty 3/5

Consider the 1D heat equation on a periodic domain: partial_t u = alpha * partial_xx u, with alpha > 0. Take the Fourier transform u_hat(k, t) of u(x, t) in the spatial variable. Given an initial Fourier amplitude u_hat(…

Derivation · pde-fundamentals-for-ml · spectral-theory-of-operators
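The question preview is truncated, but the derivation it rests on is the standard one: taking the spatial Fourier transform turns the heat equation into an independent ODE per mode, so each Fourier amplitude decays exponentially at a rate set by the multiplier -alpha k^2. In LaTeX:

```latex
\partial_t u = \alpha\,\partial_{xx} u
\;\xrightarrow{\ \mathcal{F}_x\ }\;
\partial_t \hat{u}(k,t) = -\alpha k^2\,\hat{u}(k,t)
\;\Longrightarrow\;
\hat{u}(k,t) = \hat{u}(k,0)\,e^{-\alpha k^2 t}.
```

High-frequency modes (large |k|) decay fastest, which is why the heat equation smooths initial data.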

What the Fokker-Planck equation describes vs. the SDE itself

Difficulty 3/5

Given an Ito SDE dX_t = b(X_t, t) dt + sigma(X_t, t) dW_t with associated Fokker-Planck (forward Kolmogorov) equation partial_t p(x, t) = - div(b * p) + (1/2) sum_ij partial_i partial_j (D_ij * p) where D = sigma * sigma…

Concept discrimination · fokker-planck-equation · score-matching
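The distinction the task targets can be seen numerically: the SDE generates individual random trajectories, while the Fokker-Planck equation evolves the density p(x, t) of those trajectories. A seeded Euler-Maruyama sketch (not from the task) using the Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW, whose stationary Fokker-Planck solution is the standard normal:

```python
import numpy as np

# SDE side: simulate many individual trajectories of dX = -X dt + sqrt(2) dW.
rng = np.random.default_rng(42)
n_paths, dt, steps = 20_000, 0.01, 500  # integrate to t = 5

x = np.zeros(n_paths)
for _ in range(steps):
    x += -x * dt + np.sqrt(2 * dt) * rng.normal(size=n_paths)

# Fokker-Planck side: the density of these trajectories approaches the
# stationary solution N(0, 1); the empirical moments should match it.
print(x.mean(), x.var())
```

No single trajectory looks Gaussian; only the ensemble density, which is what Fokker-Planck describes, does.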

Why backprop is reverse-mode, not forward-mode

Difficulty 3/5

A network computes L = f4(f3(f2(f1(x)))) where x is a vector input and L is a scalar loss. To compute the gradient dL/dx via the chain rule, the product J_4 * J_3 * J_2 * J_1 (Jacobians) can be evaluated either left-to-r…

Concept discrimination · vector-calculus-chain-rule · feedforward-networks-and-backpropagation
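The asymmetry the task probes can be sketched with random Jacobians (sizes illustrative): because L is scalar, J_4 is a row vector, so evaluating the chain-rule product left-to-right (reverse mode) only ever carries a vector, one vector-Jacobian product per layer, while right-to-left (forward mode) multiplies full matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# Jacobians of f1..f3 are [n, n]; f4 maps to a scalar, so J4 is [1, n].
J1, J2, J3 = (rng.normal(size=(n, n)) for _ in range(3))
J4 = rng.normal(size=(1, n))

# Forward mode: right-to-left, materializes full [n, n] matrix products.
forward = J4 @ (J3 @ (J2 @ J1))

# Reverse mode (backprop): left-to-right, carries only a [1, n] row vector.
v = J4
for J in (J3, J2, J1):
    v = v @ J  # one vector-Jacobian product per layer: O(n^2), not O(n^3)

print(np.allclose(forward, v))
```

Both orders give the same gradient dL/dx; they differ only in cost, which is why backprop uses reverse mode for scalar losses.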

How proof tasks work

Each task is a multiple-choice checkpoint with an exact correct answer grounded in a specific page. The correct answer is not a matter of opinion: it is whatever the canonical content says.

When you submit, the task grades against the canonical answer and writes one AssessmentAttempt row per topic the task is tagged with. That evidence flows through the PFA learning model (Pavlik et al. 2009) and the knowledge-state engine, updating both per-topic mastery scores and the corresponding edge state.
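As a rough sketch of the PFA model's core idea (parameter names and values here are illustrative, not TheoremPath's actual fitted parameters): predicted correctness is a logistic function of a topic-easiness term plus weighted counts of prior successes and failures on that topic.

```python
import math

def pfa_p_correct(beta, gamma, rho, successes, failures):
    """Performance Factors Analysis (Pavlik et al. 2009): logistic of an
    easiness term plus weighted counts of prior correct and incorrect
    attempts on the topic. All parameter values here are illustrative."""
    m = beta + gamma * successes + rho * failures
    return 1.0 / (1.0 + math.exp(-m))

# A few correct attempts raise the mastery estimate; failures move it
# less (or can lower it if rho < 0).
p_before = pfa_p_correct(beta=-0.5, gamma=0.4, rho=0.1, successes=0, failures=0)
p_after  = pfa_p_correct(beta=-0.5, gamma=0.4, rho=0.1, successes=3, failures=1)
print(p_before, p_after)
```

Each graded attempt increments one of the counts, which is how a single AssessmentAttempt row can move the per-topic mastery score.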

Proof tasks complement reading: a topic page tells you what's true; a proof task asks you to commit to which alternative statement is correct. Wrong answers surface a misconception explanation tied to the specific way the alternative diverges from the canonical answer.