Production Engineering · Sports App · 100K+ Users
At 100K+ active users, response time dropped from ~600ms to roughly 5ms — a ~120× improvement over the naive approach. Here's how we got there.
01 — Why the Naive Approach Always Fails
The most natural implementation is also the most dangerous at scale. Counting every user's activity, ranking them, and returning a sorted list feels correct — because it is. The problem is what the database must do to produce that result on every single API call.
At 100K users with millions of records, a naive query can take 600ms or more. Multiply that by 50 concurrent users during peak hours: 50 full-table scans running simultaneously. The database runs out of resources. Response times spike. Users see spinners and leave.
The query is correct. It just doesn't scale — and those are two very different problems.
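To make the cost concrete, here's a minimal in-memory sketch of what the naive approach computes on every request. The aggregation and sort stand in for the real SQL; all names here are illustrative, not the actual schema.

```javascript
// In-memory stand-in for the naive per-request query.
const visits = [
  { userId: "alice", count: 1 },
  { userId: "alice", count: 1 },
  { userId: "bob", count: 1 },
];

function naiveLeaderboard(visitRows) {
  // 1. Full scan: aggregate every activity row, on every request.
  const totals = new Map();
  for (const row of visitRows) {
    totals.set(row.userId, (totals.get(row.userId) ?? 0) + row.count);
  }
  // 2. Sort every user, on every request.
  return [...totals.entries()]
    .map(([userId, score]) => ({ userId, score }))
    .sort((a, b) => b.score - a.score);
}

console.log(naiveLeaderboard(visits));
// Correct output, but the scan + sort repeats per API call: O(rows) each time.
```

With millions of rows, steps 1 and 2 are exactly the 600ms the article describes, and they run once per request rather than once per refresh window.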
[Chart: Query Time at Scale — Naive vs Materialized View]
02 — Materialized Views: Pre-compute What Doesn't Change
The core insight: leaderboard rankings don't need to be real-time. If the ranking refreshes every 5–10 minutes instead of on every request, the vast majority of users will never notice. Their rank doesn't change that fast.
A materialized view stores the result of an expensive query as a pre-computed table. Every subsequent read hits the cached result — no aggregation, no full scans. The expensive work runs once per refresh window, not once per user.
The flow looks like this:
- WRITE (every 5 min, background): Scan table → compute rankings for all 100K users → store in materialized view. Runs once, costs 600ms in the background.
- READ (every API call): Read 20 pre-computed rows from indexed table. No aggregation. No full scans. Returns in ~5ms.
- RESULT: Expensive computation runs once per 5-minute window, not once per request. ~120× improvement at scale.
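The write/read split above can be sketched in memory. The cache variable plays the role of the materialized view, and the commented-out interval stands in for the background refresh job; in a real Postgres deployment this would be `REFRESH MATERIALIZED VIEW` on a schedule. Function names are illustrative.

```javascript
let rankingCache = []; // stand-in for the materialized view

function computeRankings(visitRows) {
  // The expensive aggregation — runs only at refresh time, never per request.
  const totals = new Map();
  for (const { userId } of visitRows) {
    totals.set(userId, (totals.get(userId) ?? 0) + 1);
  }
  return [...totals.entries()]
    .map(([userId, score]) => ({ userId, score }))
    .sort((a, b) => b.score - a.score);
}

function refresh(visitRows) {
  rankingCache = computeRankings(visitRows); // WRITE path, every 5 min
}

function readPage(offset, limit) {
  return rankingCache.slice(offset, offset + limit); // READ path, cheap slice
}

// Background job: setInterval(() => refresh(loadVisits()), 5 * 60 * 1000);
refresh([{ userId: "alice" }, { userId: "alice" }, { userId: "bob" }]);
console.log(readPage(0, 20));
```

Every API call now touches only `readPage`; the scan lives entirely on the refresh side.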
This principle extends further: display name formatting, ranking, aggregation — anything that doesn't change per-request should be computed at refresh time, not in the application layer on every API call. Push computation to where it's cheapest.
03 — RANK() vs ROW_NUMBER(): Fairness Is an Engineering Decision
SQL provides two window functions for ranking rows. They sound interchangeable — they're not. The difference only shows up when users are tied, which, in a competitive feature, happens constantly.
ROW_NUMBER() assigns a unique sequential number to every row. Two users with identical scores still get different ranks — determined by insertion order, ID, or whatever tiebreaker the database picks. Arbitrary.

RANK() assigns the same rank to tied rows and skips ahead accordingly. Two users at the top both get #1. The next user gets #3. That's the correct answer.
In a competitive feature, perceived fairness directly impacts retention. Users who feel the ranking is arbitrary stop competing in it.
- ✗ ROW_NUMBER() — arbitrary: Alice and Bob tie, yet get #1 and #2 anyway. Bob feels robbed. Alice too.
- ✓ RANK() — correct: tied = same rank. Fair.
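The two SQL semantics are easy to reproduce in plain JS for illustration, over a list already sorted by score descending:

```javascript
// ROW_NUMBER()-style numbering: every row gets a unique sequential rank.
function rowNumber(sortedScores) {
  return sortedScores.map((score, i) => ({ score, rank: i + 1 }));
}

// RANK()-style numbering: tied rows share the rank of the first row with
// that score, and the next distinct score skips ahead (1, 1, 3, ...).
function rank(sortedScores) {
  return sortedScores.map((score) => ({
    score,
    rank: sortedScores.indexOf(score) + 1,
  }));
}

const scores = [42, 42, 30]; // two users tied at 42
console.log(rowNumber(scores)); // ranks 1, 2, 3 — tie broken arbitrarily
console.log(rank(scores));      // ranks 1, 1, 3 — tie shared, then skip
```

In the materialized-view refresh, this is just a matter of writing `RANK() OVER (ORDER BY score DESC)` instead of `ROW_NUMBER()` — same cost, fairer result.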
04 — Daily Snapshots for Rank Change Indicators
The ↑3 / ↓2 rank change indicators are the most engagement-driving element of the whole feature. Seeing movement creates a feedback loop — users come back to defend their position. But showing change requires knowing where someone was, which is a fundamentally temporal problem.
The solution is simpler than it sounds: a nightly job that copies the current rankings into an append-only snapshot table. One row per user per day. The delta is a single subtraction — no complex diffing, no event sourcing.
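A sketch of that nightly job, again in memory: the array plays the role of the `daily_ranking_snapshots` table named later in this section, and a real job would be a cron task issuing a single `INSERT ... SELECT` from the materialized view. Everything else here is illustrative.

```javascript
// Append-only snapshot store: one row per user per day, never updated.
const dailyRankingSnapshots = []; // { day, userId, rank }

function takeSnapshot(day, currentRankings) {
  for (const { userId, rank } of currentRankings) {
    dailyRankingSnapshots.push({ day, userId, rank });
  }
}

// Nightly at 23:59: copy whatever the materialized view says right now.
takeSnapshot("2024-01-01", [
  { userId: "alice", rank: 1 },
  { userId: "you", rank: 50 },
]);
```

Because rows are never mutated, yesterday's rank is always a cheap indexed lookup by (user, day).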
Here's how the math works:
- 23:59 Day N — your rank (#50) is saved to daily_ranking_snapshots.
- Throughout the day — you go to the gym. The materialized view picks up your new visit every 5 min.
- API call today — current rank from MV: #47. Yesterday's rank from snapshot: #50. Delta = 50 − 47 = +3 → you moved up.
[Figure: Live Leaderboard with Rank Deltas]
One gotcha: a positive delta means the rank number decreased (e.g., 50 → 47), which means you actually moved up. Lower number = better position. The display layer must map this correctly or every arrow will point the wrong direction.
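The sign convention is worth pinning down in one place so the display layer can't get it backwards. A small sketch, with helper names that are illustrative:

```javascript
// Positive delta = rank number went DOWN = user moved UP the board.
function rankDelta(yesterdayRank, currentRank) {
  return yesterdayRank - currentRank; // 50 - 47 = +3 → up three places
}

function deltaArrow(delta) {
  if (delta > 0) return `↑${delta}`;  // lower rank number = better = up arrow
  if (delta < 0) return `↓${-delta}`;
  return "–";                          // no movement
}

console.log(deltaArrow(rankDelta(50, 47))); // "↑3"
console.log(deltaArrow(rankDelta(10, 12))); // "↓2"
```

Keeping the subtraction and the arrow mapping in one module means there's exactly one place where "positive means up" is encoded.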
05 — Parallelize Independent Queries
The leaderboard API typically needs three pieces of data: the ranked page, the total user count, and your own rank. If you await them sequentially, you pay all three latencies end-to-end. They're independent — run them in parallel and pay only the slowest one.
```javascript
// ✗ sequential — ~12ms total (the three latencies add up)
const page = await getRankedPage(offset);
const total = await getTotalCount();
const myRank = await getMyRank(userId);
```

```javascript
// ✓ parallel — ~5ms total (limited by the slowest query)
const [page, total, myRank] = await Promise.all([
  getRankedPage(offset),
  getTotalCount(),
  getMyRank(userId),
]);
```