Learning Theory
Adaptive Learning Is Not IID
Why diagnostic systems that choose the next question from previous answers need sequential-probability assumptions, not iid sampling assumptions.
Prerequisites
Why This Matters
A diagnostic tutor does not collect answers the way a fixed survey collects answers. The next question is often chosen from the learner's previous answers. That makes the sequence adaptive:
Target -> Question -> Feedback -> Gap -> Retry -> Checkpoint
Once the policy uses past answers, the observations are generally not iid. That is not a flaw. It is a different mathematical object. The right question is not "can we pretend these answers are iid?" The right question is:
What assumptions let an adaptive diagnostic loop behave predictably?
Sequential probability gives a clean answer. If the feedback error at each step is centered after conditioning on the learner history, and if the error is bounded or conditionally sub-Gaussian, then the accumulated diagnostic error can still concentrate. Adaptive does not mean statistically lawless.
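A quick simulation makes the point concrete. The policy below is a made-up adaptive rule, not TheoremPath's actual policy: the next difficulty depends on the previous answer, yet the centered errors still average toward zero.

```python
import random

def run_diagnostic(n_steps=20000):
    """Adaptive loop: the next difficulty depends on the previous answer,
    but each centered error is mean-zero given the history."""
    rng = random.Random(0)
    difficulty = 2
    centered_sum = 0.0
    for _ in range(n_steps):
        p = 1.0 - 0.15 * difficulty           # assumed response model
        y = 1.0 if rng.random() < p else 0.0  # response Y_t in {0, 1}
        centered_sum += y - p                 # D_t = Y_t - E[Y_t | history]
        # Adaptive rule: a miss routes to an easier item, a hit to a harder one.
        difficulty = max(0, difficulty - 1) if y == 0.0 else min(5, difficulty + 1)
    return centered_sum / n_steps

print(abs(run_diagnostic()))  # close to zero despite the adaptivity
```

Replacing `p` in the centering step with anything other than the conditional mean would reintroduce drift; boundedness of the centered error is what makes the concentration argument work.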
The Diagnostic Loop
Here is the learner-facing loop in mathematical language.
| Product step | Mathematical object | What must be recorded |
|---|---|---|
| Target | learning goal | theorem, paper section, capability, or checkpoint |
| Question | adaptive action | selected item and its tested assumptions |
| Feedback | response | correct, incorrect, skipped, confidence, time |
| Gap | weak concept signal | skill or claim whose evidence weakened |
| Retry | future action constraint | similar item scheduled after a delay |
| Checkpoint | stopping or advancement rule | enough evidence to advance, or review needed |
The product value lives in this record. If the learner misses a Hoeffding assumption question, the system should not merely say "study probability." It should route toward the boundedness, independence, or sub-Gaussian assumption that the answer failed to support.
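A minimal sketch of that record as a data structure. Field names mirror the table above; they are illustrative, not TheoremPath's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DiagnosticRecord:
    target: str                        # learning goal (theorem, section, checkpoint)
    question_id: str                   # the adaptively selected item
    tested_assumptions: List[str]      # assumptions the item probes
    feedback: str                      # "correct" | "incorrect" | "skipped"
    gap: Optional[str] = None          # skill or claim whose evidence weakened
    retry_delay: Optional[int] = None  # steps before a similar item reappears

record = DiagnosticRecord(
    target="finite-class uniform convergence",
    question_id="hoeffding-assumptions-01",
    tested_assumptions=["boundedness", "independence", "sub-Gaussian tails"],
    feedback="incorrect",
    gap="boundedness",
    retry_delay=3,
)
print(record.gap)  # route review to the failed assumption, not "study probability"
```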
Formal Setup
Let $\mathcal{F}_{t-1}$ represent the information available before question $t$: the target, previous questions, previous answers, skipped items, and the current learner-state estimate.
An adaptive diagnostic policy chooses the next question

$$Q_t = \pi_t(\mathcal{F}_{t-1}).$$

This means $Q_t$ is measurable with respect to the previous history $\mathcal{F}_{t-1}$. It is not sampled independently of the past.
Let $Y_t$ be the response signal, scaled to $[0,1]$ for simplicity. A centered diagnostic error is

$$D_t = Y_t - \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}].$$

Then

$$\mathbb{E}[D_t \mid \mathcal{F}_{t-1}] = \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}] - \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}] = 0,$$

so $(D_t)_{t \ge 1}$ is a martingale difference sequence with respect to the learner history. The sequence can be dependent and adaptive, but it has no conditional drift after the past is known.
Main Theorem
Adaptive Diagnostic Concentration
Statement
Let $(\mathcal{F}_t)_{t \ge 0}$ be the learner-history filtration. Suppose $(D_t)_{t \ge 1}$ is adapted to $(\mathcal{F}_t)$, satisfies $\mathbb{E}[D_t \mid \mathcal{F}_{t-1}] = 0$, and has bounded increments $|D_t| \le c_t$ almost surely. Then, for any $\varepsilon > 0$,

$$\Pr\!\left(\left|\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right) \le 2\exp\!\left(-\frac{\varepsilon^2}{2\sum_{t=1}^{n} c_t^2}\right).$$

Equivalently, if $c_t = c$ for all $t$,

$$\Pr\!\left(\left|\frac{1}{n}\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right) \le 2\exp\!\left(-\frac{n\varepsilon^2}{2c^2}\right).$$
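To get a feel for the rate, here is the averaged bound evaluated at illustrative values (with $c = 1$, matching responses scaled to $[0,1]$):

```python
import math

def azuma_avg_bound(n, eps, c=1.0):
    """Right-hand side of the averaged bound: 2 * exp(-n * eps^2 / (2 * c^2))."""
    return 2.0 * math.exp(-n * eps ** 2 / (2.0 * c ** 2))

for n in (100, 1_000, 10_000):
    print(n, azuma_avg_bound(n, eps=0.05))
```

Note that the bound only drops below 1 once $n\varepsilon^2 / (2c^2) > \ln 2$, so short diagnostics give weak guarantees at tight tolerances.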
Intuition
The policy may adaptively choose questions, but the centered feedback errors still have zero conditional mean. Bounded martingale differences cannot drift far in one direction for many steps without paying an exponential probability penalty.
Proof Sketch
The partial sums $S_n = \sum_{t=1}^{n} D_t$ form a martingale. Apply the Azuma-Hoeffding inequality to $S_n$ with increment bounds $|D_t| \le c_t$. The average version follows by setting $c_t = c$ and applying the bound at deviation level $n\varepsilon$.
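Written out, that last substitution is just a rescaling of the deviation level:

```latex
\Pr\!\left(\left|\frac{1}{n}\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right)
  = \Pr\!\left(\left|\sum_{t=1}^{n} D_t\right| \ge n\varepsilon\right)
  \le 2\exp\!\left(-\frac{(n\varepsilon)^2}{2 n c^2}\right)
  = 2\exp\!\left(-\frac{n\varepsilon^2}{2c^2}\right).
```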
Why It Matters
This is the statistical reason adaptive diagnostics can be analyzed without pretending that all questions were drawn iid from a fixed question bank.
Failure Mode
The theorem does not say that the learner model is calibrated, that the policy is optimal, or that every checkpoint decision is correct. It only controls a specific centered, bounded error process under the stated assumptions.
Worked Example: A Hoeffding Assumption Miss
Suppose the target is finite-class uniform convergence. The learner gets an item wrong because they apply a Hoeffding-style bound without checking whether the loss is bounded.
The diagnostic record should separate the objects:
| Field | Example value |
|---|---|
| Target | finite-class uniform convergence |
| Question | identify which assumption allows Hoeffding |
| Feedback | incorrect |
| Gap | bounded loss or sub-Gaussian tail control |
| Retry later | similar item on boundedness versus variance-only assumptions |
| Next checkpoint | sub-Gaussian / Hoeffding bridge |
That retry item is not iid with the first item. It was selected because of the first answer. The analysis should therefore condition on the history that caused the retry.
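A sketch of that history-dependent selection; the item bank and identifiers below are hypothetical, invented for illustration.

```python
import random

# Hypothetical item bank keyed by the assumption each item tests.
ITEM_BANK = {
    "boundedness": ["bounded-loss-01", "bounded-vs-variance-02"],
    "independence": ["iid-vs-adaptive-01"],
    "sub-gaussian": ["subgaussian-bridge-01"],
}

def schedule_retry(history, seed=0):
    """Choose the next item conditioned on the recorded history:
    route toward the most recently weakened assumption, if any."""
    rng = random.Random(seed)
    misses = [h["gap"] for h in history if h["feedback"] == "incorrect"]
    if misses:
        return rng.choice(ITEM_BANK[misses[-1]])  # history-dependent, not iid
    all_items = [item for items in ITEM_BANK.values() for item in items]
    return rng.choice(all_items)

history = [{"feedback": "incorrect", "gap": "boundedness"}]
print(schedule_retry(history))  # an item from the "boundedness" pool
```

The selected item is a deterministic-in-distribution function of the history, which is exactly why the analysis conditions on $\mathcal{F}_{t-1}$ rather than assuming iid draws.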
Current Lean Status
This page is not a Lean formalization of TheoremPath's adaptive learning system. The deterministic objects in the loop -- concept graph, learner state, retry set, policy, and checkpoint transition -- are not formalized in this repository yet.
The current checked component is narrower: the Lean artifact
TheoremPath.Probability.Concentration.azumaHoeffdingConditionalSubGaussianTail
records a scoped Azuma-Hoeffding bridge for finite martingale-difference sums
under conditional sub-Gaussian assumptions. That supports the concentration
tool used on this page, but it does not verify the whole product loop.
Common Confusions
Adaptive is not the same as biased
Adaptivity means the next question depends on the previous history. Bias means the centered error has nonzero conditional expectation. A policy can be adaptive while the centered error process remains a martingale difference sequence.
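The distinction can be simulated directly (the response model here is invented for illustration): centering against the conditional mean keeps the error drift-free, while centering against a fixed global guess introduces bias even though the adaptivity is identical in both runs.

```python
import random

def mean_centered_error(n=20000, correct_centering=True):
    """Adaptive difficulty in both runs; only the centering changes."""
    rng = random.Random(1)
    difficulty, total = 1, 0.0
    for _ in range(n):
        p = 0.9 - 0.1 * difficulty                # conditional success probability
        y = 1.0 if rng.random() < p else 0.0
        center = p if correct_centering else 0.5  # 0.5 = naive unconditional guess
        total += y - center
        difficulty = 0 if y == 0.0 else min(4, difficulty + 1)
    return total / n

print(mean_centered_error(correct_centering=True))   # near zero: adaptive, unbiased
print(mean_centered_error(correct_centering=False))  # clearly positive: biased
```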
A concentration bound is not a calibration guarantee
Azuma-Hoeffding controls the sum of centered errors under boundedness or conditional sub-Gaussian assumptions. It does not prove that the mastery model estimates the learner correctly. Calibration is a separate empirical question.
Retry later is a policy, not an iid sample
If a wrong answer causes a similar item to appear later, the item sequence is history-dependent. That is the point of the retry policy, and it is why the analysis should use filtrations and conditional expectation.
Exercises
In a 10-question diagnostic, question 6 is chosen only if the learner missed question 3. Explain why the sequence of questions is not iid.
Let $D_t = Y_t - \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}]$. Prove that $\mathbb{E}[D_t \mid \mathcal{F}_{t-1}] = 0$.
Starting from

$$\Pr\!\left(\left|\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right) \le 2\exp\!\left(-\frac{\varepsilon^2}{2\sum_{t=1}^{n} c_t^2}\right),$$

derive the average-error version when $c_t = c$ for every $t$.
References
Canonical:
- Azuma, K. (1967). "Weighted sums of certain dependent random variables." Tohoku Mathematical Journal.
- Hoeffding, W. (1963). "Probability inequalities for sums of bounded random variables." Journal of the American Statistical Association.
- McDiarmid, C. (1989). "On the method of bounded differences." Surveys in Combinatorics.
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, Chapter 2.
Adaptive learning:
- Corbett, A. T., and Anderson, J. R. (1995). "Knowledge tracing: Modeling the acquisition of procedural knowledge." User Modeling and User-Adapted Interaction.
- Pavlik, P. I., Cen, H., and Koedinger, K. R. (2009). "Performance Factors Analysis: A New Alternative to Knowledge Tracing." Proceedings of AIED 2009.
Related TheoremPath pages:
Last reviewed: May 2, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
- Random Variables (layer 0A · tier 1)
- Radon-Nikodym and Conditional Expectation (layer 0B · tier 1)
- Concentration Inequalities (layer 1 · tier 1)
- Martingale Theory (layer 0B · tier 2)
Derived topics
No published topic currently declares this as a prerequisite.