
Learning Theory

Adaptive Learning Is Not IID

Why diagnostic systems that choose the next question based on previous answers need sequential-probability assumptions, not iid sampling assumptions.

Advanced · Tier 2 · Stable · Supporting · ~35 min

Why This Matters

A diagnostic tutor does not collect answers the way a fixed survey collects answers. The next question is often chosen based on the learner's previous answers. That makes the sequence adaptive:

Target -> Question -> Feedback -> Gap -> Retry -> Checkpoint

Once the policy uses past answers, the observations are generally not iid. That is not a flaw. It is a different mathematical object. The right question is not "can we pretend these answers are iid?" The right question is:

What assumptions let an adaptive diagnostic loop behave predictably?

Sequential probability gives a clean answer. If the feedback error at each step is centered after conditioning on the learner history, and if the error is bounded or conditionally sub-Gaussian, then the accumulated diagnostic error can still concentrate. Adaptive does not mean statistically lawless.
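This claim can be checked numerically. Below is a minimal simulation sketch (the clamped difficulty update and all probabilities are invented for illustration): the policy adapts item difficulty to each answer, yet the average centered error still respects the Azuma-Hoeffding envelope with increment bound c = 1.

```python
import math
import random

def simulate(n=200, seed=0):
    """One adaptive diagnostic episode.

    The policy picks the next item's success probability from the previous
    answer (adaptive), but X_t = R_t - p_t is centered given the history,
    so the running sum of errors is a martingale.
    """
    rng = random.Random(seed)
    p = 0.5                      # illustrative initial success probability
    total = 0.0
    for _ in range(n):
        r = 1.0 if rng.random() < p else 0.0
        total += r - p           # centered error X_t, with |X_t| <= 1
        # adaptive policy: a harder item after a correct answer
        p = max(0.1, min(0.9, p + (0.1 if r else -0.1)))
    return total / n

# Azuma-Hoeffding with c = 1: P(|mean error| >= eta) <= 2 exp(-n eta^2 / 2)
n, eta = 200, 0.2
bound = 2 * math.exp(-n * eta ** 2 / 2)
hits = sum(abs(simulate(n, seed=s)) >= eta for s in range(1000))
print(hits / 1000, "<=", bound)
```

Across 1000 adaptive episodes, the empirical exceedance frequency stays below the theoretical bound, even though no two episodes draw questions iid.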

The Diagnostic Loop

Here is the learner-facing loop in mathematical language.

| Product step | Mathematical object | What must be recorded |
| --- | --- | --- |
| Target | learning goal | theorem, paper section, capability, or checkpoint |
| Question | adaptive action Q_t | selected item and its tested assumptions |
| Feedback | response R_t | correct, incorrect, skipped, confidence, time |
| Gap | weak concept signal | skill or claim whose evidence weakened |
| Retry | future action constraint | similar item scheduled after a delay |
| Checkpoint | stopping or advancement rule | enough evidence to advance, or review needed |

The product value lives in this record. If the learner misses a Hoeffding assumption question, the system should not merely say "study probability." It should route toward the boundedness, independence, or sub-Gaussian assumption that the answer failed to support.
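One way to make that record concrete is a small typed container. The field names below are illustrative assumptions, not the product's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DiagnosticRecord:
    """One iteration of the diagnostic loop (illustrative fields only)."""
    target: str                      # learning goal, e.g. a theorem or checkpoint
    question_id: str                 # adaptive action Q_t
    feedback: str                    # response R_t: "correct" | "incorrect" | "skipped"
    gap: Optional[str] = None        # skill or claim whose evidence weakened
    retry_after: Optional[int] = None  # delay (in items) before a similar item
    checkpoint: Optional[str] = None   # next stopping or advancement rule

record = DiagnosticRecord(
    target="finite-class uniform convergence",
    question_id="hoeffding-assumptions-01",   # hypothetical item id
    feedback="incorrect",
    gap="bounded loss / sub-Gaussian tail control",
    retry_after=5,
    checkpoint="sub-Gaussian / Hoeffding bridge",
)
print(record.gap)
```

The point of the structure is routing: the `gap` field names the specific assumption the answer failed to support, not just "probability."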

Formal Setup

Let \mathcal F_{t-1} represent the information available before question t: the target, previous questions, previous answers, skipped items, and the current learner-state estimate.

An adaptive diagnostic policy chooses the next question

Q_t = \pi_t(\mathcal F_{t-1}).

This means Q_t is measurable with respect to the previous history. It is not sampled independently of the past.

Let R_t be the response signal, scaled to [0, 1] for simplicity. A centered diagnostic error is

X_t = R_t - \mathbb E[R_t \mid \mathcal F_{t-1}].

Then

\mathbb E[X_t \mid \mathcal F_{t-1}] = 0,

so (X_t) is a martingale difference sequence with respect to the learner history. The sequence can be dependent and adaptive, but it has no conditional drift after the past is known.
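The zero-conditional-mean property holds by construction, not by luck. For a Bernoulli response with any history-dependent success probability p, it can be verified by direct enumeration:

```python
# For a {0,1}-valued response R with history-dependent success probability p,
# the centered error X = R - p has conditional mean exactly zero:
#   E[X | history] = p * (1 - p) + (1 - p) * (0 - p) = 0.
def conditional_mean_of_centered_error(p):
    """Exact conditional expectation of X = R - p for R ~ Bernoulli(p)."""
    return p * (1 - p) + (1 - p) * (0 - p)

# p may depend on the history in any way; the identity holds pointwise.
for p in (0.1, 0.5, 0.73):
    assert abs(conditional_mean_of_centered_error(p)) < 1e-12
print("centered error has zero conditional mean for every history")
```

No matter how adversarially the policy chooses p from the past, centering against the conditional mean removes the drift.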

Main Theorem

Theorem

Adaptive Diagnostic Concentration

Statement

Let (\mathcal F_t)_{t=0}^n be the learner-history filtration. Suppose X_t is adapted to \mathcal F_t, satisfies \mathbb E[X_t \mid \mathcal F_{t-1}] = 0, and has bounded increments |X_t| \leq c_t almost surely. Then, for any \epsilon > 0,

\Pr\left[\sum_{t=1}^n X_t \geq \epsilon\right] \leq \exp\left(-\frac{\epsilon^2}{2\sum_{t=1}^n c_t^2}\right).

Equivalently, if c_t \leq c for all t,

\Pr\left[\frac{1}{n}\sum_{t=1}^n X_t \geq \eta\right] \leq \exp\left(-\frac{n\eta^2}{2c^2}\right).
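The average-error form can be inverted to plan session length. A small sketch, assuming the bounded-increment setting with a common bound c: solve \exp(-n\eta^2/(2c^2)) \leq \delta for the smallest n.

```python
import math

def questions_needed(eta, delta, c=1.0):
    """Smallest n with exp(-n * eta^2 / (2 c^2)) <= delta, i.e. the average
    centered diagnostic error exceeds eta with probability at most delta."""
    return math.ceil(2 * c ** 2 * math.log(1 / delta) / eta ** 2)

# e.g. average error within 0.1 with 95% confidence, increments bounded by 1
print(questions_needed(eta=0.1, delta=0.05))  # -> 600
```

The quadratic dependence on 1/\eta is the usual concentration price: halving the tolerated average error quadruples the number of questions.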

Intuition

The policy may adaptively choose questions, but the centered feedback errors still have zero conditional mean. Bounded martingale differences cannot drift far in one direction for many steps without paying an exponential probability penalty.

Proof Sketch

The partial sums M_k = \sum_{t=1}^k X_t form a martingale. Apply the Azuma-Hoeffding inequality to M_n with increment bounds c_t. The average version follows by setting \epsilon = n\eta.
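Written out, the reduction from the sum version to the average version is a single substitution (assuming c_t \leq c for all t):

```latex
\Pr\left[\frac{1}{n}\sum_{t=1}^n X_t \geq \eta\right]
  = \Pr\left[\sum_{t=1}^n X_t \geq n\eta\right]
  \leq \exp\left(-\frac{(n\eta)^2}{2\sum_{t=1}^n c_t^2}\right)
  \leq \exp\left(-\frac{n^2\eta^2}{2nc^2}\right)
  = \exp\left(-\frac{n\eta^2}{2c^2}\right).
```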

Why It Matters

This is the statistical reason adaptive diagnostics can be analyzed without pretending that all questions were drawn iid from a fixed question bank.

Failure Mode

The theorem does not say that the learner model is calibrated, that the policy is optimal, or that every checkpoint decision is correct. It only controls a specific centered, bounded error process under the stated assumptions.

Worked Example: A Hoeffding Assumption Miss

Suppose the target is finite-class uniform convergence. The learner gets an item wrong because they apply a Hoeffding-style bound without checking whether the loss is bounded.

The diagnostic record should separate the objects:

| Field | Example value |
| --- | --- |
| Target | finite-class uniform convergence |
| Question | identify which assumption allows Hoeffding |
| Feedback | incorrect |
| Gap | bounded loss or sub-Gaussian tail control |
| Retry later | similar item on boundedness versus variance-only assumptions |
| Next checkpoint | sub-Gaussian / Hoeffding bridge |

That retry item is not iid with the first item. It was selected because of the first answer. The analysis should therefore condition on the history that caused the retry.
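The history dependence of the retry item can be made explicit in a toy scheduler. Everything here is a sketch: the item ids, the `delay` parameter, and the queue discipline are invented for illustration.

```python
from collections import deque

def run_diagnostic(answers, delay=2):
    """Toy retry policy: a miss schedules a similar item `delay` steps later.

    `answers` maps item id -> correct? (bool). The produced item sequence is
    history-dependent: the retry item appears only because of the earlier miss,
    so any analysis must condition on the history that caused it.
    """
    queue = deque(["hoeffding-assumptions-01"])   # hypothetical first item
    log = []
    while queue:
        item = queue.popleft()
        correct = answers.get(item, True)
        log.append((item, correct))
        if not correct and not item.endswith("-retry"):
            # insert a similar item after `delay` other items (or at the end)
            queue.insert(min(delay, len(queue)), item + "-retry")
    return log

log = run_diagnostic({"hoeffding-assumptions-01": False,
                      "hoeffding-assumptions-01-retry": True})
print(log)
```

Running this yields the retry item only on the branch where the first answer was wrong, which is exactly why the item sequence is not an iid draw from a fixed bank.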

Current Lean Status

This page is not a Lean formalization of TheoremPath's adaptive learning system. The deterministic objects in the loop -- concept graph, learner state, retry set, policy, and checkpoint transition -- are not formalized in this repository yet.

The current checked component is narrower: the Lean artifact TheoremPath.Probability.Concentration.azumaHoeffdingConditionalSubGaussianTail records a scoped Azuma-Hoeffding bridge for finite martingale-difference sums under conditional sub-Gaussian assumptions. That supports the concentration tool used in this page, but it does not verify the whole product loop.

Common Confusions

Watch Out

Adaptive is not the same as biased

Adaptivity means the next question depends on the previous history. Bias means the centered error has nonzero conditional expectation. A policy can be adaptive while the centered error process remains a martingale difference sequence.

Watch Out

A concentration bound is not a calibration guarantee

Azuma-Hoeffding controls the sum of centered errors under boundedness or conditional sub-Gaussian assumptions. It does not prove that the mastery model estimates the learner correctly. Calibration is a separate empirical question.

Watch Out

Retry later is a policy, not an iid sample

If a wrong answer causes a similar item to appear later, the item sequence is history-dependent. That is the point of the retry policy, and it is why the analysis should use filtrations and conditional expectation.

Exercises

ExerciseCore

In a 10-question diagnostic, question 6 is chosen only if the learner missed question 3. Explain why the sequence of questions is not iid.

ExerciseAdvanced

Let X_t = R_t - \mathbb E[R_t \mid \mathcal F_{t-1}]. Prove that \mathbb E[X_t \mid \mathcal F_{t-1}] = 0.

ExerciseCore

Starting from

\Pr\left[\sum_{t=1}^n X_t \geq \epsilon\right] \leq \exp\left(-\epsilon^2 \middle/ \left(2\sum_{t=1}^n c_t^2\right)\right),

derive the average-error version when c_t \leq c for every t.

References

Canonical:

  • Azuma, K. (1967). "Weighted sums of certain dependent random variables." Tohoku Mathematical Journal.
  • Hoeffding, W. (1963). "Probability inequalities for sums of bounded random variables." Journal of the American Statistical Association.
  • McDiarmid, C. (1989). "On the method of bounded differences." Surveys in Combinatorics.
  • Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, Chapter 2.

Adaptive learning:

  • Corbett, A. T., and Anderson, J. R. (1995). "Knowledge tracing: Modeling the acquisition of procedural knowledge." User Modeling and User-Adapted Interaction.
  • Pavlik, P. I., Cen, H., and Koedinger, K. R. (2009). "Performance Factors Analysis: A New Alternative to Knowledge Tracing." Online Submission.


Last reviewed: May 2, 2026
