
Back-of-the-Envelope Estimation

Back-of-the-envelope estimation is the practice of using a combination of thought experiments and simple mathematical calculations to quickly arrive at a reasonable estimate for a system's capacity needs. It's one of the most important skills in a system design interview because it demonstrates that you can think about scale and make data-driven decisions.

The goal is not to find the exact right answer, but to be in the right order of magnitude.

Why Do We Do This?

Before you can design a system, you need to know what you're designing for.

  • Does your system need to handle 10 requests per second, or 100,000?
  • Do you need to store gigabytes of data, or petabytes?
  • How much will it cost to run?

These estimations will directly influence your choice of technology, architecture, and infrastructure.

The Core Numbers You Should Know

You don't need to be a human calculator, but you should have a few key numbers memorized to speed up your calculations.

Powers of 2

  • 2^10 = 1,024 ≈ 1 Thousand (Kilo)
  • 2^20 = 1,048,576 ≈ 1 Million (Mega)
  • 2^30 ≈ 1 Billion (Giga)
  • 2^40 ≈ 1 Trillion (Tera)
  • 2^50 ≈ 1 Quadrillion (Peta)
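
These round-number substitutions get slightly less accurate as the exponent grows (2^10 is within 2.4% of a thousand, while 2^50 is about 13% above a quadrillion), but they're plenty close for order-of-magnitude work. A quick Python sketch, just for illustration, to check the error yourself:

```python
# How far are the powers of 2 from the round numbers we substitute for them?
approximations = {
    10: 1_000,                  # Kilo
    20: 1_000_000,              # Mega
    30: 1_000_000_000,          # Giga
    40: 1_000_000_000_000,      # Tera
    50: 1_000_000_000_000_000,  # Peta
}

for exponent, approx in approximations.items():
    exact = 2 ** exponent
    error_pct = (exact - approx) / approx * 100
    print(f"2^{exponent} = {exact:,} (off by {error_pct:.1f}%)")
```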

Latency Numbers Every Programmer Should Know

  • L1 cache reference: ~0.5 ns
  • Branch mispredict: ~5 ns
  • L2 cache reference: ~7 ns
  • Mutex lock/unlock: ~25 ns
  • Main memory reference: ~100 ns
  • Send 2K bytes over 1 Gbps network: ~20,000 ns (20 µs)
  • SSD random read: ~150,000-200,000 ns (150-200 µs)
  • Read 1 MB sequentially from memory: ~250,000 ns (250 µs)
  • Round trip within same datacenter: ~500,000 ns (0.5 ms)
  • Read 1 MB sequentially from SSD: ~1,000,000 ns (1 ms)
  • HDD seek: ~10,000,000 ns (10 ms)
  • Read 1 MB sequentially from HDD: ~20,000,000 ns (20 ms)
  • Round trip USA to Europe: ~150,000,000 ns (150 ms)

Key Takeaway: Reading from memory is fast. Reading from disk is slow. Reading over the network is even slower. This is why caching is so important.
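
To make that concrete, here's a minimal Python sketch (latencies copied from the table above) that converts the sequential 1 MB read times into throughput:

```python
# Sequential-read throughput implied by the latency numbers above.
NS_PER_SECOND = 1_000_000_000

read_1mb_ns = {
    "memory": 250_000,      # 250 µs per 1 MB
    "SSD": 1_000_000,       # 1 ms per 1 MB
    "HDD": 20_000_000,      # 20 ms per 1 MB
}

for medium, latency_ns in read_1mb_ns.items():
    mb_per_sec = NS_PER_SECOND / latency_ns
    print(f"{medium}: ~{mb_per_sec:,.0f} MB/s sequential")

# memory: ~4,000 MB/s, SSD: ~1,000 MB/s, HDD: ~50 MB/s -- an 80x spread,
# which is exactly why caching hot data in memory pays off.
```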

Throughput & Storage Calculations

  • Number of seconds in a day: 24 hours * 60 min/hr * 60 sec/min = 86,400 seconds. For easier mental math, round 24 hours up to 25: 25 * 3,600 = ~90,000 seconds.
  • Number of seconds in a month: 90,000 seconds/day * 30 days = ~2.7 Million seconds

A Practical Example: Designing a Twitter-like Service

Let's walk through an estimation exercise for a simplified version of Twitter.

Interviewer: "Let's design a service where users can post short text messages. How would you estimate the scale?"

Step 1: Clarify and State Assumptions

  • Total Users: 500 Million
  • Daily Active Users (DAU): 200 Million (A reasonable fraction of total users)
  • Write Operations (Tweets per day): Each DAU posts, on average, 0.5 tweets per day.
    • 200 Million DAU * 0.5 tweets/DAU = 100 Million tweets per day
  • Read Operations (Feed views per day): Each DAU views their feed, on average, 5 times per day.
    • 200 Million DAU * 5 views/DAU = 1 Billion feed views per day
  • Read/Write Ratio: 1 Billion reads / 100 Million writes = 10:1. This is a very common ratio for social media. The system is "read-heavy".

Step 2: Calculate QPS (Queries Per Second)

Write QPS:

  • 100 Million tweets / 90,000 seconds = 100,000,000 / 90,000 = 10,000 / 9 = ~1,100 QPS (write)

Read QPS:

  • 1 Billion reads / 90,000 seconds = 1,000,000,000 / 90,000 = 100,000 / 9 = ~11,000 QPS (read)

Peak QPS: Traffic is not evenly distributed. It often has peaks. A common rule of thumb is to assume peak traffic is 2x - 3x the average.

  • Peak Write QPS: 1,100 * 2 = ~2,200 QPS
  • Peak Read QPS: 11,000 * 2 = ~22,000 QPS
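
The same arithmetic as a minimal Python sketch, with the values copied from the Step 1 assumptions (the 2x peak factor is the rule of thumb above):

```python
# Back-of-the-envelope QPS from the Step 1 assumptions.
DAU = 200_000_000
TWEETS_PER_DAU = 0.5
VIEWS_PER_DAU = 5
SECONDS_PER_DAY = 90_000   # 86,400 rounded up for easy math
PEAK_FACTOR = 2            # rule of thumb: peak traffic is 2x-3x average

writes_per_day = DAU * TWEETS_PER_DAU   # 100 Million tweets/day
reads_per_day = DAU * VIEWS_PER_DAU     # 1 Billion feed views/day

write_qps = writes_per_day / SECONDS_PER_DAY   # ~1,100
read_qps = reads_per_day / SECONDS_PER_DAY     # ~11,000

print(f"Write QPS: ~{write_qps:,.0f} (peak ~{write_qps * PEAK_FACTOR:,.0f})")
print(f"Read QPS:  ~{read_qps:,.0f} (peak ~{read_qps * PEAK_FACTOR:,.0f})")
```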

Step 3: Estimate Storage Requirements

Size of a single tweet:

  • tweet_id: 8 bytes (64-bit integer)
  • user_id: 8 bytes
  • text: 280 characters * 2 bytes/char (a rough average for UTF-8, which is variable-width) = 560 bytes
  • media_url (optional): ~50 bytes
  • timestamp: 8 bytes
  • Total: 8 + 8 + 560 + 50 + 8 = 634 bytes. Let's round up to ~700 bytes per tweet.

Storage per day:

  • 100 Million tweets/day * 700 bytes/tweet = 70 Billion bytes/day = 70 GB per day

Storage for 5 years:

  • 70 GB/day * 365 days/year * 5 years
  • 70 * 365 * 5 = ~70 * 1,800 = 126,000 GB = ~126 TB
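
Here is the same estimate as a short Python sketch (field sizes from the breakdown above):

```python
# Storage estimate from the per-tweet size breakdown.
TWEET_BYTES = 8 + 8 + 560 + 50 + 8   # id + user_id + text + media_url + timestamp = 634
TWEET_BYTES_ROUNDED = 700            # round up for headroom
TWEETS_PER_DAY = 100_000_000

bytes_per_day = TWEETS_PER_DAY * TWEET_BYTES_ROUNDED   # 70 GB/day
bytes_5_years = bytes_per_day * 365 * 5

print(f"Per day: ~{bytes_per_day / 1e9:,.0f} GB")
print(f"5 years: ~{bytes_5_years / 1e12:,.0f} TB")
# Prints ~128 TB; the prose rounds 365 * 5 down to 1,800, giving ~126 TB.
# Both are the same order of magnitude, which is all we need.
```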

Step 4: Estimate Bandwidth/Network Requirements

Ingress (Writes):

  • 1,100 QPS * 700 bytes/tweet = 770,000 bytes/sec = ~0.77 MB/s

Egress (Reads):

  • Let's assume a feed view loads ~20 tweets.
  • 11,000 QPS * (700 bytes/tweet * 20 tweets/view) = 11,000 * 14,000 = 154,000,000 bytes/sec = ~154 MB/s
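
And the bandwidth arithmetic as a sketch, with QPS and tweet size carried over from the earlier steps (the 20-tweets-per-view figure is the assumption stated above):

```python
# Bandwidth estimate from the QPS and tweet-size numbers above.
TWEET_BYTES = 700
TWEETS_PER_FEED_VIEW = 20   # assumption: each feed view loads ~20 tweets
WRITE_QPS = 1_100           # average write QPS from Step 2
READ_QPS = 11_000           # average read QPS from Step 2

ingress_bytes_per_sec = WRITE_QPS * TWEET_BYTES                        # 770,000
egress_bytes_per_sec = READ_QPS * TWEET_BYTES * TWEETS_PER_FEED_VIEW   # 154,000,000

print(f"Ingress: ~{ingress_bytes_per_sec / 1e6:.2f} MB/s")   # ~0.77 MB/s
print(f"Egress:  ~{egress_bytes_per_sec / 1e6:.0f} MB/s")    # ~154 MB/s
```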

Summary of Estimates

  • Write QPS: ~1.1k (Peak ~2.2k)
  • Read QPS: ~11k (Peak ~22k)
  • Storage (5 years): ~126 TB
  • Ingress: ~0.77 MB/s
  • Egress: ~154 MB/s

Now you have a concrete set of numbers to guide your design. You know you need a system that can handle thousands of queries per second and store terabytes of data. This immediately tells you that a single server won't be enough, and you'll need to think about load balancing, distributed databases, and caching.