Learning Theory
Adaptive Learning Is Not IID
Why diagnostic systems that choose the next question from previous answers need sequential-probability assumptions, not iid sampling assumptions.
Prerequisites
Why This Matters
A diagnostic tutor does not collect answers the way a fixed survey collects answers. The next question is often chosen from the learner's previous answers. That makes the sequence adaptive:
Target -> Question -> Feedback -> Gap -> Retry -> Checkpoint
Once the policy uses past answers, the observations are generally not iid. That is not a flaw. It is a different mathematical object. The right question is not "can we pretend these answers are iid?" The right question is:
What assumptions let an adaptive diagnostic loop behave predictably?
Sequential probability gives a clean answer. If the feedback error at each step is centered after conditioning on the learner history, and if the error is bounded or conditionally sub-Gaussian, then the accumulated diagnostic error can still concentrate. Adaptive does not mean statistically lawless.
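A quick simulation makes the point concrete. The policy below is a made-up adaptive rule, not TheoremPath's actual policy: the next difficulty depends on the previous answer, yet the centered errors still average toward zero.

```python
import random

def run_diagnostic(n_steps=20000):
    """Adaptive loop: the next difficulty depends on the previous answer,
    but each centered error is mean-zero given the history."""
    rng = random.Random(0)
    difficulty = 2
    centered_sum = 0.0
    for _ in range(n_steps):
        p = 1.0 - 0.15 * difficulty           # assumed response model
        y = 1.0 if rng.random() < p else 0.0  # response Y_t in {0, 1}
        centered_sum += y - p                 # D_t = Y_t - E[Y_t | history]
        # Adaptive rule: a miss routes to an easier item, a hit to a harder one.
        difficulty = max(0, difficulty - 1) if y == 0.0 else min(5, difficulty + 1)
    return centered_sum / n_steps

print(abs(run_diagnostic()))  # close to zero despite the adaptivity
```

Replacing `p` in the centering step with anything other than the conditional mean would reintroduce drift; boundedness of the centered error is what makes the concentration argument work.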
The Diagnostic Loop
Here is the learner-facing loop in mathematical language.
| Product step | Mathematical object | What must be recorded |
|---|---|---|
| Target | learning goal | theorem, paper section, capability, or checkpoint |
| Question | adaptive action | selected item and its tested assumptions |
| Feedback | response | correct, incorrect, skipped, confidence, time |
| Gap | weak concept signal | skill or claim whose evidence weakened |
| Retry | future action constraint | similar item scheduled after a delay |
| Checkpoint | stopping or advancement rule | enough evidence to advance, or review needed |
The product value lives in this record. If the learner misses a Hoeffding assumption question, the system should not merely say "study probability." It should route toward the boundedness, independence, or sub-Gaussian assumption that the answer failed to support.
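A minimal sketch of that record as a data structure. Field names mirror the table above; they are illustrative, not TheoremPath's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DiagnosticRecord:
    target: str                        # learning goal (theorem, section, checkpoint)
    question_id: str                   # the adaptively selected item
    tested_assumptions: List[str]      # assumptions the item probes
    feedback: str                      # "correct" | "incorrect" | "skipped"
    gap: Optional[str] = None          # skill or claim whose evidence weakened
    retry_delay: Optional[int] = None  # steps before a similar item reappears

record = DiagnosticRecord(
    target="finite-class uniform convergence",
    question_id="hoeffding-assumptions-01",
    tested_assumptions=["boundedness", "independence", "sub-Gaussian tails"],
    feedback="incorrect",
    gap="boundedness",
    retry_delay=3,
)
print(record.gap)  # route review to the failed assumption, not "study probability"
```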
Formal Setup
Let $\mathcal{F}_{t-1}$ represent the information available before question $t$: the target, previous questions, previous answers, skipped items, and the current learner-state estimate.
An adaptive diagnostic policy chooses the next question

$$Q_t = \pi_t(\mathcal{F}_{t-1}).$$

This means $Q_t$ is measurable with respect to the previous history $\mathcal{F}_{t-1}$. It is not sampled independently of the past.
Let $Y_t$ be the response signal, scaled to $[0,1]$ for simplicity. A centered diagnostic error is

$$D_t = Y_t - \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}].$$

Then

$$\mathbb{E}[D_t \mid \mathcal{F}_{t-1}] = \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}] - \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}] = 0,$$

so $(D_t)_{t \ge 1}$ is a martingale difference sequence with respect to the learner history. The sequence can be dependent and adaptive, but it has no conditional drift after the past is known.
Main Theorem
Adaptive Diagnostic Concentration
Statement
Let $(\mathcal{F}_t)_{t \ge 0}$ be the learner-history filtration. Suppose $(D_t)_{t \ge 1}$ is adapted to $(\mathcal{F}_t)$, satisfies $\mathbb{E}[D_t \mid \mathcal{F}_{t-1}] = 0$, and has bounded increments $|D_t| \le c_t$ almost surely. Then, for any $\varepsilon > 0$,

$$\Pr\!\left(\left|\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right) \le 2\exp\!\left(-\frac{\varepsilon^2}{2\sum_{t=1}^{n} c_t^2}\right).$$

Equivalently, if $c_t = c$ for all $t$,

$$\Pr\!\left(\left|\frac{1}{n}\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right) \le 2\exp\!\left(-\frac{n\varepsilon^2}{2c^2}\right).$$
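To get a feel for the rate, here is the averaged bound evaluated at illustrative values (with $c = 1$, matching responses scaled to $[0,1]$):

```python
import math

def azuma_avg_bound(n, eps, c=1.0):
    """Right-hand side of the averaged bound: 2 * exp(-n * eps^2 / (2 * c^2))."""
    return 2.0 * math.exp(-n * eps ** 2 / (2.0 * c ** 2))

for n in (100, 1_000, 10_000):
    print(n, azuma_avg_bound(n, eps=0.05))
```

Note that the bound only drops below 1 once $n\varepsilon^2 / (2c^2) > \ln 2$, so short diagnostics give weak guarantees at tight tolerances.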
Intuition
The policy may adaptively choose questions, but the centered feedback errors still have zero conditional mean. Bounded martingale differences cannot drift far in one direction for many steps without paying an exponential probability penalty.
Proof Sketch
The partial sums $S_n = \sum_{t=1}^{n} D_t$ form a martingale. Apply the Azuma-Hoeffding inequality to $S_n$ with increment bounds $|D_t| \le c_t$. The average version follows by setting $c_t = c$ and applying the bound at deviation level $n\varepsilon$.
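Written out, that last substitution is just a rescaling of the deviation level:

```latex
\Pr\!\left(\left|\frac{1}{n}\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right)
  = \Pr\!\left(\left|\sum_{t=1}^{n} D_t\right| \ge n\varepsilon\right)
  \le 2\exp\!\left(-\frac{(n\varepsilon)^2}{2 n c^2}\right)
  = 2\exp\!\left(-\frac{n\varepsilon^2}{2c^2}\right).
```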
Why It Matters
This is the statistical reason adaptive diagnostics can be analyzed without pretending that all questions were drawn iid from a fixed question bank.
Failure Mode
The theorem does not say that the learner model is calibrated, that the policy is optimal, or that every checkpoint decision is correct. It only controls a specific centered, bounded error process under the stated assumptions.
Worked Example: A Hoeffding Assumption Miss
Suppose the target is finite-class uniform convergence. The learner gets an item wrong because they apply a Hoeffding-style bound without checking whether the loss is bounded.
The diagnostic record should separate the objects:
| Field | Example value |
|---|---|
| Target | finite-class uniform convergence |
| Question | identify which assumption allows Hoeffding |
| Feedback | incorrect |
| Gap | bounded loss or sub-Gaussian tail control |
| Retry later | similar item on boundedness versus variance-only assumptions |
| Next checkpoint | sub-Gaussian / Hoeffding bridge |
That retry item is not iid with the first item. It was selected because of the first answer. The analysis should therefore condition on the history that caused the retry.
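A sketch of that history-dependent selection; the item bank and identifiers below are hypothetical, invented for illustration.

```python
import random

# Hypothetical item bank keyed by the assumption each item tests.
ITEM_BANK = {
    "boundedness": ["bounded-loss-01", "bounded-vs-variance-02"],
    "independence": ["iid-vs-adaptive-01"],
    "sub-gaussian": ["subgaussian-bridge-01"],
}

def schedule_retry(history, seed=0):
    """Choose the next item conditioned on the recorded history:
    route toward the most recently weakened assumption, if any."""
    rng = random.Random(seed)
    misses = [h["gap"] for h in history if h["feedback"] == "incorrect"]
    if misses:
        return rng.choice(ITEM_BANK[misses[-1]])  # history-dependent, not iid
    all_items = [item for items in ITEM_BANK.values() for item in items]
    return rng.choice(all_items)

history = [{"feedback": "incorrect", "gap": "boundedness"}]
print(schedule_retry(history))  # an item from the "boundedness" pool
```

The selected item is a deterministic-in-distribution function of the history, which is exactly why the analysis conditions on $\mathcal{F}_{t-1}$ rather than assuming iid draws.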
Current Lean Status
This page is not a Lean formalization of TheoremPath's adaptive learning system. The deterministic objects in the loop -- concept graph, learner state, retry set, policy, and checkpoint transition -- are not formalized in this repository yet.
The current checked component is narrower: the Lean artifact
TheoremPath.Probability.Concentration.azumaHoeffdingConditionalSubGaussianTail
records a scoped Azuma-Hoeffding bridge for finite martingale-difference sums
under conditional sub-Gaussian assumptions. That supports the concentration
tool used on this page, but it does not verify the whole product loop.
Common Confusions
Adaptive is not the same as biased
Adaptivity means the next question depends on the previous history. Bias means the centered error has nonzero conditional expectation. A policy can be adaptive while the centered error process remains a martingale difference sequence.
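The distinction can be simulated directly (the response model here is invented for illustration): centering against the conditional mean keeps the error drift-free, while centering against a fixed global guess introduces bias even though the adaptivity is identical in both runs.

```python
import random

def mean_centered_error(n=20000, correct_centering=True):
    """Adaptive difficulty in both runs; only the centering changes."""
    rng = random.Random(1)
    difficulty, total = 1, 0.0
    for _ in range(n):
        p = 0.9 - 0.1 * difficulty                # conditional success probability
        y = 1.0 if rng.random() < p else 0.0
        center = p if correct_centering else 0.5  # 0.5 = naive unconditional guess
        total += y - center
        difficulty = 0 if y == 0.0 else min(4, difficulty + 1)
    return total / n

print(mean_centered_error(correct_centering=True))   # near zero: adaptive, unbiased
print(mean_centered_error(correct_centering=False))  # clearly positive: biased
```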
A concentration bound is not a calibration guarantee
Azuma-Hoeffding controls the sum of centered errors under boundedness or conditional sub-Gaussian assumptions. It does not prove that the mastery model estimates the learner correctly. Calibration is a separate empirical question.
Retry later is a policy, not an iid sample
If a wrong answer causes a similar item to appear later, the item sequence is history-dependent. That is the point of the retry policy, and it is why the analysis should use filtrations and conditional expectation.
Exercises
In a 10-question diagnostic, question 6 is chosen only if the learner missed question 3. Explain why the sequence of questions is not iid.
Let $D_t = Y_t - \mathbb{E}[Y_t \mid \mathcal{F}_{t-1}]$. Prove that $\mathbb{E}[D_t \mid \mathcal{F}_{t-1}] = 0$.
Starting from

$$\Pr\!\left(\left|\sum_{t=1}^{n} D_t\right| \ge \varepsilon\right) \le 2\exp\!\left(-\frac{\varepsilon^2}{2\sum_{t=1}^{n} c_t^2}\right),$$

derive the average-error version when $c_t = c$ for every $t$.
References
Canonical:
- Azuma, K. (1967). "Weighted sums of certain dependent random variables." Tohoku Mathematical Journal.
- Hoeffding, W. (1963). "Probability inequalities for sums of bounded random variables." Journal of the American Statistical Association.
- McDiarmid, C. (1989). "On the method of bounded differences." Surveys in Combinatorics.
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, Chapter 2.
Adaptive learning:
- Corbett, A. T., and Anderson, J. R. (1995). "Knowledge tracing: Modeling the acquisition of procedural knowledge." User Modeling and User-Adapted Interaction.
- Pavlik, P. I., Cen, H., and Koedinger, K. R. (2009). "Performance Factors Analysis: A New Alternative to Knowledge Tracing." Proceedings of AIED 2009.
Related TheoremPath pages:
Last reviewed: May 2, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
- Random Variables (layer 0A · tier 1)
- Radon-Nikodym and Conditional Expectation (layer 0B · tier 1)
- Concentration Inequalities (layer 1 · tier 1)
- Martingale Theory (layer 0B · tier 2)
Derived topics
No published topic currently declares this as a prerequisite.