How Aurora Works
The scheduling engine behind Aurora, explained from scratch. No statistics background required.
The Core Idea
When you learn something, you start forgetting it immediately. But each time you successfully recall it, your memory gets a little more durable. Spaced repetition exploits this: review right before you'd forget, and each review pushes the next one further into the future.
Aurora tracks two things for every problem you've attempted:
- Stability — how durable your memory is (measured in days)
- Retrievability — the probability you can solve it right now (0–100%)
Stability goes up when you solve problems well. Retrievability decays over time, which is what triggers reviews. The system's job is to keep retrievability high across all your problems using the fewest reviews possible.
Stability
Stability is a number in days. If your stability for Two Sum is 7, that means your memory of how to solve Two Sum is strong enough to last about 7 days before it starts fading significantly.
After each attempt, stability is multiplied by a score computed from how the attempt went (new stability = old stability × multiplier):
- Multiplier above 1.0 → stability grows → your next review is scheduled further out
- Multiplier below 1.0 → stability shrinks → you review sooner
Example
You have stability = 4 days for Container With Most Water. You solve it independently with the optimal approach and confidence 4. Base multiplier: 2.5. Confidence bonus: +0.1. New stability: 4 × 2.6 = 10.4 days. Your next review is ~10 days out.
Next review: you struggle and need a hint (Partial), confidence 2. Base multiplier: 1.1. Confidence penalty: −0.2. New stability: 10.4 × 0.9 ≈ 9.4 days. Stability shrank — and because you had low confidence with a partial solve, the system forces a review in just 1 day regardless.
First Attempt
For your very first attempt at a problem, initial stability starts at a base of 2.0 days multiplied by the same formula. A perfect first solve (YES:OPTIMAL, confidence 5, rewrite) reaches 2.0 × (2.5 + 0.3 + 0.5) = 6.6 days. A typical first solve — optimal approach, confidence 3 — lands at 2.0 × 2.5 = 5 days. This spacing reflects the real cost of coding reviews (15–30 min each): daily recall is counterproductive and builds queue debt faster than it can be cleared.
Stability is clamped between 0.5 days (minimum) and 365 days (maximum).
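Putting the update and the clamp together, a minimal sketch (illustrative code, not Aurora's actual implementation):

```python
def update_stability(current: float, multiplier: float) -> float:
    """Multiply stability by the attempt's score, then clamp to [0.5, 365] days."""
    return max(0.5, min(365.0, current * multiplier))

# Worked example from above: stability 4, independent optimal solve,
# confidence 4 -> multiplier 2.5 + 0.1 = 2.6
print(update_stability(4.0, 2.6))    # → 10.4
print(update_stability(300.0, 2.5))  # → 365.0 (clamped at the maximum)
```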
Retrievability (the Forgetting Curve)
Retrievability answers: “If I sat down to solve this right now, what's the chance I could do it?”
R = e^(−t/S), where R is retrievability, t is days since your last review, and S is stability in days.
[Figure: forgetting curve — retrievability over time]
Why this formula?
This is exponential decay — the same math that describes radioactive decay and cooling coffee. It's used because human forgetting empirically follows this shape (first measured by Hermann Ebbinghaus in 1885 and confirmed many times since).
The key property: you forget fast at first, then slower. The day after a review, retrievability drops quickly. A week later, it's still dropping but more gradually.
e ≈ 2.718 is the natural mathematical constant. The −days/stability part means higher stability = slower decay.
Concrete examples
| Stability | After 1 day | After 3 days | After 7 days | After 14 days |
|---|---|---|---|---|
| 2 days | 61% | 30%* | 30%* | 30%* |
| 7 days | 87% | 65% | 37% | 30%* |
| 30 days | 97% | 90% | 79% | 63% |
* Retrievability has a floor of 30%. Even for a problem you haven't reviewed in months, the system assumes you retained something — the problem name, the general pattern, a vague approach. Without the floor, a problem 60 days overdue would score 0% and be indistinguishable from one you've never seen.
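The curve and its floor fit in a few lines (an illustrative sketch, not Aurora's actual code):

```python
import math

def retrievability(days_since_review: float, stability: float) -> float:
    """R = e^(−t/S), floored at 0.30."""
    return max(0.30, math.exp(-days_since_review / stability))

# Reproducing the stability = 7 row of the table above:
print(round(retrievability(1, 7), 2))   # → 0.87
print(round(retrievability(7, 7), 2))   # → 0.37
print(retrievability(14, 7))            # → 0.3 (the floor kicks in)
```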
What the labels mean
| Retrievability | Label | What it means |
|---|---|---|
| ≥ 80% | Strong | You can probably solve this cold |
| 60–79% | Good | Likely solvable, might need a moment |
| 40–59% | Fading | You might struggle — review soon |
| 20–39% | Weak | Likely can't solve without help |
| < 20% | Critical | Essentially relearning from scratch |
How Attempts Are Scored
When you log an attempt, two things determine how much your stability changes: the base multiplier (how well you did) and modifiers (bonus/penalty details).
Base Multipliers
| Outcome | Optimal | Suboptimal | Brute Force | No Solution |
|---|---|---|---|---|
| Solved independently | 2.5× | 2.0× | 1.5× | — |
| Needed help (Partial) | 1.1× | 1.1× | 1.1× | — |
| Could not solve | — | — | 0.8× | 0.5× |
Why is Partial always 1.1× regardless of quality? If you needed a hint or AI help, you didn't prove you can solve it yourself. Whether you arrived at optimal or brute force with help doesn't meaningfully change how well you know the problem. The quality question is only meaningful when you solved independently.
Modifiers (independent solves only)
| Signal | Modifier | Why |
|---|---|---|
| Rewrote from scratch | +0.5 | Proves you can write it clean, not just patch old code |
| Fast solve (Medium < 10 min) | +0.2 | Speed indicates strong, fluent recall |
Confidence (applies to all attempts)
| Level | Description | Modifier |
|---|---|---|
| 5 | Solve cold, no issues | +0.3 |
| 4 | Can code it, minor bugs | +0.1 |
| 3 | Can pseudocode optimal, maybe code it | 0 |
| 2 | Can pseudocode brute force | −0.2 |
| 1 | Can't solve or pseudocode | −0.4 |
Special scheduling rules
- Could not solve → review is scheduled immediately (due now)
- Partial + confidence ≤ 2 → review forced to 1 day, regardless of stability math
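The scoring tables above combine additively. A sketch of that combination, with the names and lookup structure as assumptions:

```python
# Base multipliers by (outcome, solution quality)
BASE = {
    ("solved", "optimal"): 2.5,
    ("solved", "suboptimal"): 2.0,
    ("solved", "brute_force"): 1.5,
    ("failed", "brute_force"): 0.8,
    ("failed", "none"): 0.5,
}
CONFIDENCE = {5: 0.3, 4: 0.1, 3: 0.0, 2: -0.2, 1: -0.4}

def attempt_multiplier(outcome, quality, confidence, rewrite=False, fast=False):
    if outcome == "partial":
        return 1.1 + CONFIDENCE[confidence]  # solution quality ignored
    m = BASE[(outcome, quality)] + CONFIDENCE[confidence]
    if outcome == "solved":  # modifiers apply to independent solves only
        m += 0.5 if rewrite else 0.0
        m += 0.2 if fast else 0.0
    return m

# Perfect solve from the First Attempt section: 2.5 + 0.3 + 0.5 = 3.3
print(attempt_multiplier("solved", "optimal", 5, rewrite=True))  # → 3.3
# Partial with confidence 2, as in the worked example: 1.1 - 0.2 = 0.9
print(attempt_multiplier("partial", None, 2))
```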
Review Queue Priority
When multiple problems are due, which do you review first? The queue sorts by priority:
Problems you're most likely to have forgotten score highest: urgency is 1 minus retrievability, so a problem at 30% retrievability gets urgency 0.7 and one at 80% gets 0.2.
The weight adjusts for importance:
- Difficulty: Hard 1.1×, Medium 1.0×, Easy 0.8×
- Blind 75: +0.2 bonus — these are the most commonly asked interview problems, so they get a small tiebreaker when urgency is similar
- Weak category: +0.3 if your average retention in that category is below 60% — this helps shore up gaps (e.g., if your Stack problems are all fading, they get prioritized)
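One plausible reading of the rules above, assuming difficulty scales the urgency multiplicatively and the bonuses add on top (the exact combination isn't specified here):

```python
DIFFICULTY = {"hard": 1.1, "medium": 1.0, "easy": 0.8}

def queue_priority(retrievability, difficulty, blind75=False, weak_category=False):
    """Urgency (1 − retrievability) scaled by difficulty, plus flat bonuses."""
    score = (1.0 - retrievability) * DIFFICULTY[difficulty]
    if blind75:
        score += 0.2
    if weak_category:
        score += 0.3
    return score

print(queue_priority(0.3, "medium"))              # → 0.7
print(queue_priority(0.3, "hard", blind75=True))  # 0.7 × 1.1 + 0.2 = 0.97
```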
Why is my queue smaller than my total attempts?
This is the SRS working correctly. The system assigns each problem a next review date based on its stability — some problems are due today, others next week, others in a month. At any given moment, only the subset whose review date has passed appears in the queue. If you have 30 problems in active learning, you might see 5–15 due on a typical day. The queue size fluctuates naturally: it grows when you skip sessions and shrinks as you clear it. A small queue does not mean you have less work to do — it means the work is spread across time as intended.
Readiness Score
The readiness score (0–100) estimates how prepared you are for a coding interview. It combines four factors:
| Factor | Weight | What it measures |
|---|---|---|
| Coverage | 30% | Percentage of NeetCode 150 you've attempted |
| Retention | 40% | Percentage of attempted problems with retrievability > 70% |
| Category Balance | 20% | Your worst category's average retention — penalizes big blind spots |
| Consistency | 10% | Percentage of scheduled reviews completed in the last 14 days |
Retention is weighted highest (40%) because knowing 20 problems cold beats having seen 100 problems once.
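A sketch of the composite, assuming each factor arrives as a 0–1 fraction (the weights are from the table above; the function shape is an assumption):

```python
def readiness_score(coverage, retention, balance, consistency):
    """Weighted sum of 0–1 factors, scaled to 0–100."""
    total = (0.30 * coverage + 0.40 * retention
             + 0.20 * balance + 0.10 * consistency)
    return round(total * 100)

# 50% coverage, 80% retention, 60% worst-category retention, perfect consistency:
print(readiness_score(0.5, 0.8, 0.6, 1.0))  # → 69 (tier B)
```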
| Score | Tier |
|---|---|
| 90–100 | S |
| 75–89 | A |
| 55–74 | B |
| 35–54 | C |
| 0–34 | D |
Mastery
A problem is considered mastered when its stability reaches 45 days or more. At that point, the SRS won't schedule it for at least six weeks — your memory is durable enough that frequent reviews would be wasted effort.
Mastery is a function of consistency, not a single impressive solve. The table below shows roughly how many independent solves it takes to cross the 45-day threshold under different scenarios (starting from a first attempt):
| Scenario | Multiplier path | Stability after each solve | Solves to mastery |
|---|---|---|---|
| Optimal, conf 5, rewrite | 3.3× | 6.6 → 21.8 → 71.9 | 3 |
| Optimal, conf 4 | 2.6× | 5.2 → 13.5 → 35.1 → 91.3 | 4 |
| Optimal, conf 3 (neutral) | 2.5× | 5.0 → 12.5 → 31.3 → 78.1 | 4 |
| Suboptimal, conf 3 | 2.0× | 4.0 → 8.0 → 16.0 → 32.0 → 64.0 | 5 |
| Partial only | 1.1× | 2.2 → 2.4 → 2.7 → … | unreachable in practice |
These are simplified — actual stability depends on retrievability at review time and the exact modifiers logged. Mixing in Partial solves or low-confidence sessions will slow the curve.
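The simplified progression above can be simulated directly (a sketch under the same simplifications; the seed values mirror the 2.0-day base and the 45/365-day thresholds):

```python
def solves_to_mastery(multiplier, initial=2.0, threshold=45.0, cap=365.0):
    """Count consecutive solves at a fixed multiplier until stability ≥ threshold."""
    if multiplier <= 1.0:
        return None  # stability never grows: mastery is unreachable
    stability, solves = initial, 0
    while stability < threshold:
        stability = min(cap, stability * multiplier)
        solves += 1
    return solves

print(solves_to_mastery(3.3))  # → 3  (optimal, conf 5, rewrite)
print(solves_to_mastery(2.0))  # → 5  (suboptimal, conf 3)
print(solves_to_mastery(1.1))  # → 33 (Partial only)
```

Even at a steady 1.1×, crossing 45 days takes over thirty sessions, which is why Partial-only mastery is unreachable in practice.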
Mastered problems still appear in the review queue occasionally — once every 45–365 days — to confirm retention hasn't decayed unexpectedly, especially before interviews. A successful confirmation extends stability further; a failed one resets the problem back into active learning.
Glossary
- Spaced Repetition ↗
- A learning technique where reviews are scheduled at increasing intervals. Each successful review pushes the next one further out. Based on the finding that memory consolidates with well-timed recall practice.
- FSRS (Free Spaced Repetition Scheduler) ↗
- An open-source spaced repetition algorithm by Jarrett Ye. Aurora uses a modified version adapted for coding problems — multi-signal scoring (outcome + solution quality + confidence + timing) replaces FSRS's single pass/fail grade.
- Stability
- How durable your memory of a problem is, measured in days. Higher stability = slower forgetting = longer intervals between reviews. Clamped between 0.5 and 365 days.
- Retrievability
- The estimated probability (0–100%) that you could solve a problem right now without help. Decays exponentially over time since your last review. Has a floor of 30%.
- Mastery
- A problem is mastered when its stability reaches 45 days or more. Mastered problems are reviewed infrequently — roughly once every 45–365 days — as a retention check rather than active learning.
- Exponential Decay ↗
- A pattern where something decreases by a consistent proportion over equal time periods. Memory follows this shape: fast loss early, slowing over time. The formula R = e^(−t/S) captures this.
- Forgetting Curve ↗
- The graph of retrievability over time. First described by Hermann Ebbinghaus in 1885 through experiments on memorizing nonsense syllables. The exponential decay model fits his data and has been confirmed repeatedly since.
- NeetCode 150 ↗
- A curated list of 150 LeetCode problems covering all major algorithms and data structures needed for coding interviews. Organized into categories like Arrays & Hashing, Two Pointers, Stack, etc.
- Blind 75
- A subset of 75 problems from a viral list of the most frequently asked coding interview questions. These are given a small priority bonus in the review queue because they're statistically more likely to appear in interviews.
- Readiness Score
- A 0–100 composite score estimating interview preparedness. Combines coverage (how many problems you've seen), retention (how many you remember), category balance (no blind spots), and consistency (keeping up with reviews).
Further Reading
- FSRS Algorithm Wiki — the research behind modern spaced repetition scheduling
- Forgetting Curve (Wikipedia) — Ebbinghaus's original research and subsequent confirmations
- Spaced Repetition (Wikipedia) — overview of the learning technique and its evidence base
- NeetCode Roadmap — the NeetCode 150 problem list and category structure
- Problem List — browse all 150 problems with optimal complexity and links