Education

The Learning Algorithm That Knows What You Forgot Before You Know You Forgot It

By AI Verticals Research Team · June 29, 2026

In 2012, the Office of the Federal Register published a photograph of President Obama's notes during a meeting about education policy. Scrawled in black marker on a single sheet were three words: "Knowledge Tracing Model." The President had been briefed on a then-nascent research area in educational data mining -- the use of Bayesian inference to track what students had learned and what they had not -- and had written the phrase as a reminder to ask follow-up questions. He never got to ask them. The briefing moved on.

More than a decade later, knowledge tracing is everywhere. Carnegie Learning's MATHia platform uses it to adapt math tutoring to individual students. Khan Academy's Khanmigo uses it to guide conversational tutoring. Duolingo's AI-powered personalized practice adapts every question based on a probabilistic model of language acquisition. The technology has gone from academic curiosity to commercial mainstream. The question is whether it works.

The Problem Knowledge Tracing Solves

Traditional education is built on a model of cohorts and timelines. Students progress through a curriculum at a pace determined by the slowest learners in the group. Material that students have mastered is repeated unnecessarily. Material they have not mastered -- the prerequisite knowledge that makes later content comprehensible -- is left behind before they are ready to move on. The result is what educators call "mistimed instruction": teaching the wrong thing at the wrong time for the wrong student.

Personalized learning has been the proposed solution for decades. But true personalization requires something that teachers -- however talented -- cannot do at scale: maintaining an accurate, real-time model of every student's cognitive state. A teacher with 30 students can observe which students seem confused and which seem bored. A platform with 150 million registered learners cannot rely on observation. It needs a computational model.

Knowledge tracing is that model. The goal is to estimate, for each student and each skill, a probability that the student has mastered that skill -- not based on a single test score, but based on the entire history of the student's interactions with the learning platform: every question answered, every video watched, every problem attempted, every hint used, every error made. The model updates continuously as new evidence arrives.

The Architecture of a Knowledge Trace

The foundational knowledge tracing model, published by Corbett and Anderson at Carnegie Mellon University in the 1990s, represented each skill as a binary variable: mastered or not mastered. Given a student's history of responses to items testing that skill, the model inferred the probability of mastery using Bayes' theorem. The math was elegant, the assumptions were heroic, and the accuracy was mediocre.
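
The Bayesian update described above can be sketched in a few lines. This is a minimal illustration of the standard four-parameter form of Bayesian Knowledge Tracing (prior, learn, slip, guess), not any platform's production code. The parameter values and the names BKTParams, bkt_update, and trace are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    p_init: float   # prior probability the skill is already mastered
    p_learn: float  # chance of moving to mastery after each attempt
    p_slip: float   # chance of a wrong answer despite mastery
    p_guess: float  # chance of a right answer without mastery


def bkt_update(p_mastery: float, correct: bool, params: BKTParams) -> float:
    """Return the mastery estimate after observing one response."""
    if correct:
        evidence_mastered = p_mastery * (1.0 - params.p_slip)
        evidence_unmastered = (1.0 - p_mastery) * params.p_guess
    else:
        evidence_mastered = p_mastery * params.p_slip
        evidence_unmastered = (1.0 - p_mastery) * (1.0 - params.p_guess)
    # Bayes' theorem: probability of mastery given the observed response.
    posterior = evidence_mastered / (evidence_mastered + evidence_unmastered)
    # The student may also learn the skill from the attempt itself.
    return posterior + (1.0 - posterior) * params.p_learn


def trace(responses: list[bool], params: BKTParams) -> list[float]:
    """Run the update over a response history, one estimate per response."""
    estimates = []
    p = params.p_init
    for correct in responses:
        p = bkt_update(p, correct, params)
        estimates.append(p)
    return estimates


# Hypothetical parameter values chosen only for illustration.
params = BKTParams(p_init=0.2, p_learn=0.1, p_slip=0.1, p_guess=0.25)
history = [True, True, False, True]
estimates = trace(history, params)
```

With these made-up parameters the estimate rises after each correct answer and drops after the error. The slip and guess terms are what keep a single lucky or unlucky response from swinging the estimate to 0 or 1.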

Modern knowledge tracing models are substantially more sophisticated. Deep Knowledge Tracing (DKT), published by researchers at Stanford in 2015, used a long short-term memory (LSTM) neural network to model the temporal dynamics of student learning. DKT substantially outperformed Bayesian approaches on prediction benchmarks and spawned a decade of architectural innovations.
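
To make the sequence-model framing concrete, the sketch below shows one common way to turn a student's history into inputs for a DKT-style recurrent network: each (skill, correctness) pair becomes a one-hot vector of length twice the number of skills, and the training target is the outcome of the next attempt. The index layout and the function names are assumptions for illustration. The LSTM itself is omitted.

```python
def encode_interaction(skill_id: int, correct: bool, num_skills: int) -> list[float]:
    """One-hot vector of length 2 * num_skills for a (skill, correctness) pair."""
    if not 0 <= skill_id < num_skills:
        raise ValueError("skill_id out of range")
    vector = [0.0] * (2 * num_skills)
    # Assumed layout: first half marks incorrect attempts, second half correct ones.
    index = skill_id + (num_skills if correct else 0)
    vector[index] = 1.0
    return vector


def make_training_pairs(history, num_skills):
    """Pair each encoded interaction with the skill and outcome of the next one."""
    pairs = []
    for (skill, correct), (next_skill, next_correct) in zip(history, history[1:]):
        x = encode_interaction(skill, correct, num_skills)
        pairs.append((x, next_skill, float(next_correct)))
    return pairs


# Hypothetical three-skill history: (skill_id, answered_correctly).
history = [(0, True), (2, False), (1, True)]
pairs = make_training_pairs(history, num_skills=3)
```

A recurrent model would consume the encoded vectors in order and be trained to predict, at each step, whether the next attempt will be correct.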

Key developments since DKT include Dynamic Key-Value Memory Networks (DKVMN), published in 2017 by researchers at the Chinese University of Hong Kong and the Hong Kong University of Science and Technology. DKVMN uses two separate memory banks -- a static key memory that represents latent concepts and a dynamic value memory that holds the student's estimated mastery of each concept. Unlike DKT's black-box hidden states, DKVMN's memory entries can be inspected: each value entry can be related to a specific concept, which makes the model easier for teachers to interpret. Transformer-based knowledge tracing models such as SAKT and AKT use attention mechanisms to weight a student's past interactions by their relevance to the current question, and have further improved accuracy.
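
The two-memory idea can be illustrated with a simplified read step: the question is compared with every slot of one memory to get attention weights, and those weights are used to blend the slots of the other memory into a summary of the student's state. This is a sketch under stated assumptions, not the published model. The memory contents are hand-picked numbers, and the write step and learned embeddings are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def read_memory(question_embedding, key_memory, value_memory):
    """Return (weights, read_vector) for one question."""
    # How strongly the question relates to each memory slot.
    weights = softmax([dot(question_embedding, key) for key in key_memory])
    width = len(value_memory[0])
    # Weighted sum of the per-slot state vectors.
    read_vector = [
        sum(w * slot[d] for w, slot in zip(weights, value_memory))
        for d in range(width)
    ]
    return weights, read_vector


# Hypothetical two-slot memory with hand-picked numbers.
key_memory = [[1.0, 0.0], [0.0, 1.0]]
value_memory = [[0.9, 0.1], [0.2, 0.8]]
weights, read_vector = read_memory([2.0, 0.0], key_memory, value_memory)
```

Because the weights are an explicit distribution over slots, they can be inspected to see which slot a question draws on. The interpretability claim rests on that property.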

The Dark Table: Knowledge Tracing Model Performance

Model	AUC-ROC	Parameters	Interpretability	Key Paper
Bayesian Knowledge Tracing (BKT)	0.61	~2K	High	Corbett & Anderson, 1994
Deep Knowledge Tracing (DKT)	0.77	~500K	None	Piech et al., 2015
Dynamic KV Memory Network	0.82	~1.2M	High (skill-level)	Zhang et al., 2017
AKT (Attentive KT)	0.84	~2.8M	Medium	Ghosh et al., 2020
SAKT (Self-Attentive KT)	0.85	~3.5M	Low	Pandey & Karypis, 2019
LLM Fine-Tuned KT	0.87	~7B	Medium (LLM-based)	Shin et al., 2024
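
For readers unfamiliar with the metric in the table, AUC-ROC is the probability that the model assigns a higher score to a randomly chosen correct response than to a randomly chosen incorrect one. A value of 0.5 is chance and 1.0 is a perfect ranking. The sketch below computes it directly from that definition. The labels and scores are made-up numbers, and the pairwise loop is for clarity, not efficiency.

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up data: 1 = student answered correctly, score = predicted probability.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.3, 0.6, 0.4, 0.5]
auc = auc_roc(labels, scores)
```

In this toy example five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6.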

The Forgetting Problem

Human memory is not a hard drive. Knowledge that is not retrieved periodically decays -- a phenomenon called the forgetting curve, first described by Hermann Ebbinghaus in 1885. Studies suggest that without spaced repetition, approximately 70 percent of newly learned information is lost within 48 hours.
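
A common simplification models the forgetting curve as exponential decay, where R(t) is the fraction retained after time t and S is a stability constant. This form is an assumption for illustration, not Ebbinghaus's own fitted formula. Taking the 70 percent loss within 48 hours cited above at face value gives a worked example:

```latex
\begin{align*}
R(t) &= e^{-t/S} \\
R(48\,\mathrm{h}) = 0.30 \;&\Rightarrow\; S = -\frac{48\,\mathrm{h}}{\ln 0.30} \approx 39.9\,\mathrm{h} \\
R(24\,\mathrm{h}) &= e^{-24/39.9} \approx 0.55
\end{align*}
```

Under this assumed model, roughly 45 percent would already be lost after the first day.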

Traditional education does not address forgetting systematically. Most curricula introduce a topic once, assess it once, and rarely return to it. This is why students who learn to solve quadratic equations in eighth grade often cannot do so in tenth grade: the skill was mastered and then abandoned.

Intelligent tutoring systems that incorporate knowledge tracing address forgetting through scheduled retrieval practice. When a knowledge tracer estimates that a student's mastery of a skill is beginning to decay, it schedules a practice problem -- not because the student made an error, but because the model's estimate of retention has crossed a threshold. This is the computational implementation of spaced repetition. A meta-analysis of 95 studies by Kornell and Bjork found that spaced retrieval practice improved long-term retention by an average of 67 percent compared to massed practice.
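
The threshold rule can be sketched as follows, assuming an exponential retention model with a per-student, per-skill stability value. The model, the 0.7 threshold, the stability numbers, and all function names are hypothetical. Real platforms do not publish their scheduling rules in this form.

```python
import math


def retention(hours_since_review: float, stability_hours: float) -> float:
    """Estimated probability the skill is still retained (assumed exponential)."""
    return math.exp(-hours_since_review / stability_hours)


def hours_until_review(stability_hours: float, threshold: float) -> float:
    """Time after a review at which estimated retention falls to the threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return -stability_hours * math.log(threshold)


def due_skills(skills: dict, now_hours: float, threshold: float) -> list[str]:
    """Skills whose estimated retention has dropped below the threshold."""
    due = []
    for name, (last_review_hours, stability_hours) in skills.items():
        elapsed = now_hours - last_review_hours
        if retention(elapsed, stability_hours) < threshold:
            due.append(name)
    return sorted(due)


# Hypothetical student: skill -> (time of last review, stability), both in hours.
skills = {"factoring": (0.0, 20.0), "fractions": (0.0, 200.0)}
due = due_skills(skills, now_hours=48.0, threshold=0.7)
next_gap = hours_until_review(stability_hours=20.0, threshold=0.7)
```

A skill with low stability comes due much sooner than one with high stability, which is how the spacing becomes individual to each student and skill.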

Khan Academy's Khanmigo implements this principle explicitly. The platform's knowledge tracer schedules what it calls "maintenance questions" -- problems testing previously mastered skills that appear at intervals calculated to coincide with the estimated forgetting curve for each student and each skill. The spacing is individualized: a student who learned to factor polynomials quickly but forgot quickly gets more frequent maintenance questions than a student who learned slowly but retained well.

Where the Evidence Is Strongest

The most consistent evidence for AI-driven adaptive learning comes from mathematics education. This is not coincidental: math is sequential, hierarchical, and easy to represent as a skill graph -- each skill depends on a set of prerequisite skills, and mastery of later skills requires mastery of earlier ones. Carnegie Learning's MATHia has been evaluated in multiple randomized controlled trials. A 2019 RAND Corporation study found that students using MATHia learned 23 percent more math per hour than students in traditional instruction. A 2021 follow-up found sustained effects: students who used MATHia in middle school showed stronger performance in high school algebra.
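
The skill-graph idea can be sketched as a mapping from each skill to its prerequisites, combined with per-skill mastery estimates. The example below picks the skills a student is ready to work on: not yet mastered, but with every prerequisite mastered. The graph, the mastery numbers, and the 0.95 cutoff are assumptions for illustration and do not describe any platform's actual curriculum.

```python
def ready_to_learn(prerequisites: dict, mastery: dict, threshold: float = 0.95) -> list[str]:
    """Skills not yet mastered whose prerequisites are all mastered."""
    ready = []
    for skill, required in prerequisites.items():
        if mastery.get(skill, 0.0) >= threshold:
            continue  # already mastered, nothing to teach
        if all(mastery.get(r, 0.0) >= threshold for r in required):
            ready.append(skill)
    return sorted(ready)


# Hypothetical prerequisite graph: skill -> list of skills it depends on.
prerequisites = {
    "integer arithmetic": [],
    "fractions": ["integer arithmetic"],
    "linear equations": ["fractions"],
    "quadratic equations": ["linear equations"],
}
# Hypothetical mastery estimates from a knowledge tracer.
mastery = {"integer arithmetic": 0.99, "fractions": 0.97, "linear equations": 0.60}
ready = ready_to_learn(prerequisites, mastery)
```

Here only linear equations qualifies. Quadratic equations stays locked until its prerequisite crosses the cutoff.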

Language learning shows similarly strong results. Duolingo's internal research, validated by a 2023 PLOS ONE study, found that users who followed the platform's AI-recommended practice schedule achieved conversational fluency at 34 hours of platform use versus an estimated 550 hours of traditional classroom instruction for the same outcome.

Science education is more challenging. Knowledge tracing models can track whether a student has memorized the formula for photosynthesis, but they struggle to assess whether the student understands why photosynthesis matters, how it connects to cellular respiration, and what would happen if it stopped. Researchers at MIT's AI Lab are working on multimodal knowledge tracers that ingest student responses to open-ended questions, diagrams, and simulations, but this remains an open research problem.

The Equity Problem

Adaptive learning platforms have a data problem with equity implications. The most accurate knowledge tracers are trained on large volumes of interaction data -- millions of student-platform interactions. This data is not representative of the global student population. Carnegie Learning's data comes primarily from US schools. Duolingo's data is biased toward urban, English-speaking users. Khan Academy's data over-represents students from families with home internet access.

When a knowledge tracer trained on this data makes predictions for a student whose learning patterns differ from the training population, the predictions may be systematically less accurate. A 2022 University of Michigan study found that major adaptive learning platforms showed substantially lower prediction accuracy for students from low-income households, students with limited English proficiency, and students with learning disabilities. Students who needed personalized support most urgently were receiving the least accurate personalization.
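
One way to check for this kind of gap is to break prediction accuracy out by student subgroup instead of reporting a single overall number. The sketch below does that for a list of (group, predicted probability, actual outcome) records. The group labels, the records, and the 0.5 cutoff are synthetic values invented for the example, not data from any study.

```python
from collections import defaultdict


def accuracy_by_group(records, cutoff: float = 0.5) -> dict:
    """records: iterable of (group, predicted_probability, actual_correct)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, probability, actual in records:
        predicted = probability >= cutoff
        hits[group] += int(predicted == bool(actual))
        totals[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def largest_gap(per_group: dict) -> float:
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Synthetic records for two hypothetical subgroups, A and B.
records = [
    ("A", 0.9, 1), ("A", 0.2, 0), ("A", 0.7, 1), ("A", 0.4, 0),
    ("B", 0.8, 0), ("B", 0.3, 1), ("B", 0.6, 1), ("B", 0.1, 0),
]
per_group = accuracy_by_group(records)
gap = largest_gap(per_group)
```

An overall accuracy of 75 percent on this toy data would hide the fact that one group is predicted perfectly and the other no better than a coin flip.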