Distributed System Design Interview #

I’ve been thinking a lot recently about how to do well in system design interviews. Aside from the obvious utility of getting an offer as a software engineer, I think there’s a happy union here: being good at a system design interview also gives you a good rubric for doing off-the-cuff system design.

Here are my notes on my ideal workflow, which I plan to iterate on as time allows.

  1. gather requirements
  2. API design (maybe you could do this during “gather requirements” as you figure out usage).
  3. block diagram
  4. data storage

1: gather requirements #

I think requirements have a minimum checklist: have you thought about every intrinsic attribute that could affect your system?

The common ones, in my opinion, are:

  • usage requirements (start with this first)
  • seasonality in usage patterns (e.g. are certain days / months heavier than others)
  • read / write load (or read-heavy / write-heavy)
  • latency requirements
  • geolocation
  • uptime requirements
  • scale

Then anything more domain-specific to the problem at hand.

napkin math around requirements #

“Napkin math”, where you do ballpark estimates of the numerical requirements or properties of your system, is a must for system design. In interviews I found myself attempting really complex numerical conversions that I honestly couldn’t do in my head, largely because of switching back and forth between base-2 and base-10 calculations.

To help with this, I have the following framework:

  1. Calculate the data storage in bytes (a base-2 calculation).
  2. Do estimates in multiples of ten where possible (e.g. 500k users), and write them out as powers of 10 (e.g. 5*10^5).
  3. Estimate requests per second by estimating the total activity per day, dividing by 10^5, then multiplying by 1.1.

The third one requires a bit more explanation. Basically:

  • There are 86,400 seconds in a day.
  • 100,000 / 86,400 ~ 1.15, which we can further ballpark to 1.1 or 1.2.

This, combined with using multiples of ten for user counts, makes the math really easy, and multiplying by 1.1 is easy to do in your head.
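
As a sanity check of that shortcut, here is a quick sketch; the 5*10^6 events/day figure is just an illustrative assumption.

```python
# Sanity check of the shortcut: req/s ≈ (events per day / 10^5) * 1.15.
# The 5 * 10^6 events/day figure below is just an illustrative assumption.
events_per_day = 5 * 10**6

exact_rps = events_per_day / 86_400          # the "real" division
approx_rps = events_per_day / 10**5 * 1.15   # the mental-math version

print(f"exact:  {exact_rps:.1f} req/s")      # ~57.9
print(f"approx: {approx_rps:.1f} req/s")     # 57.5
```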

Figure out data storage #

Question: which two properties of the CAP theorem are most critical for storing this data?

Every systems design problem has some sort of data that you need to store. On that note, I’ve seen a couple different flavors of “data storage” discussions:

  1. Thinking about the design patterns to use based on the properties of your business problem.
  2. Discussing real-life databases or categories that you might use (e.g. SQL vs document database).

To hedge your bets, I think you should start with 1, where you think about the data structures, and know how to map them to the databases that implement them.

| design pattern or data structure | database |
| --- | --- |
| hashmap | key-value store |
| b-tree index | indexed database |
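
As a toy illustration of that mapping (nothing below is a real database, just Python standard-library stand-ins): a hashmap gives you O(1) point lookups, which is the interface a key-value store exposes, while keeping keys sorted, roughly what a b-tree index maintains, also enables range scans.

```python
# Toy stand-ins for the table above (not real databases):
# a dict behaves like a key-value store (point lookups only), while a
# sorted key list -- roughly what a b-tree index maintains -- allows range scans.
import bisect

kv = {"user:42": {"name": "Ada"}}                   # hashmap -> key-value store
print(kv["user:42"])                                # O(1) point lookup

index = ["2024-01-01", "2024-01-05", "2024-02-01"]  # keys kept in sorted order
lo = bisect.bisect_left(index, "2024-01-01")
hi = bisect.bisect_right(index, "2024-01-31")
print(index[lo:hi])                                 # range scan: first two keys
```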

A common storage architecture is to:

  1. split the dataset into multiple “shards”
  2. replicate each shard across a replica set (each replica set could elect a single master via a Paxos quorum)
  3. store the metadata of each replica server in a config server
  4. have the config server heartbeat the replicas to do coordination (e.g. disenrolling dead nodes; see the sketch after the diagram below)
  5. have the master keep a write-ahead log to store the transactions and ensure ordering, with a checksum on each entry to verify the operation was written completely; if verification fails, query the master for the correct row data and re-verify
  6. have the replicas replicate the WAL and perform the subsequent writes

```mermaid
graph TD
    CS[Config Server] --> RS1[Replica Set 1]
    CS --> RS2[Replica Set 2]
    CS --> RS3[Replica Set 3]

    subgraph "Shard 1"
        RS1 --> M1[Master]
        RS1 --> S1A[Replica]
        RS1 --> S1B[Replica]
    end

    subgraph "Shard 2"
        RS2 --> M2[Master]
        RS2 --> S2A[Replica]
        RS2 --> S2B[Replica]
    end

    subgraph "Shard 3"
        RS3 --> M3[Master]
        RS3 --> S3A[Replica]
        RS3 --> S3B[Replica]
    end

    %% Heartbeat connections
    CS -.->|Heartbeat| M1
    CS -.->|Heartbeat| M2
    CS -.->|Heartbeat| M3
    CS -.->|Heartbeat| S1A
    CS -.->|Heartbeat| S1B
    CS -.->|Heartbeat| S2A
    CS -.->|Heartbeat| S2B
    CS -.->|Heartbeat| S3A
    CS -.->|Heartbeat| S3B
```
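
Here is a minimal sketch of the config-server side of steps 3–5. All names and thresholds are made up, it shows replicas reporting heartbeats in (the direction can go either way), and a real system would also trigger re-replication when it disenrolls a node.

```python
# Minimal sketch of config-server bookkeeping for steps 3-5 above.
# Names, timeout, and data layout are illustrative, not from a specific system.
import time

HEARTBEAT_TIMEOUT_S = 10.0

class ConfigServer:
    def __init__(self) -> None:
        # shard id -> {replica address -> timestamp of last heartbeat}
        self.shards: dict[str, dict[str, float]] = {}

    def register(self, shard: str, replica: str) -> None:
        self.shards.setdefault(shard, {})[replica] = time.monotonic()

    def heartbeat(self, shard: str, replica: str) -> None:
        self.shards[shard][replica] = time.monotonic()

    def disenroll_dead(self) -> list[str]:
        # Drop replicas whose last heartbeat is too old; a real config server
        # would also kick off re-replication so each shard keeps enough copies.
        now = time.monotonic()
        removed = []
        for shard, replicas in self.shards.items():
            for replica, last_seen in list(replicas.items()):
                if now - last_seen > HEARTBEAT_TIMEOUT_S:
                    del replicas[replica]
                    removed.append(f"{shard}/{replica}")
        return removed

cs = ConfigServer()
cs.register("shard-1", "replica-1a")
cs.register("shard-1", "replica-1b")
cs.heartbeat("shard-1", "replica-1a")
print(cs.disenroll_dead())   # [] -- both replicas are still fresh
```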

Organizing and communicating in a design interview #

  • have three different text sections: this will help communicate to the interviewer what assumptions you have made:
    • requirements
    • API
    • user journey

Deeper notes #

Design patterns per use case #

| problem | solution |
| --- | --- |
| thundering herd / high number of requests | exponential backoff |
| generating garbage / corruption over time | garbage collection process |
| ensuring nodes are available | heartbeating |
| high availability with state | master with a quorum (e.g. via Paxos) |
| distributed consensus | Paxos |

Design Patterns when considering data storage #

Choosing the partitioning key for data locality #

Similar to Google Bigtable, one could choose the partition key for rows so that most queries only have to be sent to a subset of nodes, rather than all of them. This:

  • reduces the query load on all nodes.
  • also reduces the latency of the request, since fewer nodes have to respond.
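
A minimal sketch of the idea, assuming rows are keyed by user_id (the shard count and hashing scheme are arbitrary illustrative choices): because all of one user’s rows hash to the same shard, a per-user query talks to one node instead of fanning out.

```python
# Sketch of partition-key-based routing: keying rows by user_id keeps all of a
# user's rows on one shard, so a per-user query hits one node rather than all.
# Shard count and hashing scheme are arbitrary illustrative choices.
import hashlib

NUM_SHARDS = 8

def shard_for(partition_key: str) -> int:
    # Stable hash so the same key always routes to the same shard.
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

events = [("user42", "2024-01-01", "clicked"),
          ("user42", "2024-01-02", "purchased")]
print({shard_for(user_id) for user_id, _, _ in events})  # a single shard id
```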

Paxos #

The Paxos algorithm can be used to achieve consensus across multiple separate hosts.

The Google Chubby lock server can handle ~20 req / s of content updates, with a significant number of reads and KeepAlives.
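
For intuition, here is a minimal single-decree Paxos sketch: in-process, with no networking, failures, or retries, and proposal numbers assumed unique. The key property it shows is that once a quorum has accepted a value, later proposals re-propose that value rather than overwrite it.

```python
# Minimal single-decree Paxos sketch: in-process, no networking or failure
# handling, and proposal numbers are assumed unique. Illustrative only.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Acceptor:
    promised_n: int = -1              # highest proposal number promised
    accepted_n: int = -1              # proposal number of any accepted value
    accepted_value: Optional[str] = None

    def prepare(self, n: int) -> Optional[Tuple[int, Optional[str]]]:
        # Phase 1b: promise to ignore proposals below n and report any
        # value we have already accepted.
        if n > self.promised_n:
            self.promised_n = n
            return (self.accepted_n, self.accepted_value)
        return None  # reject

    def accept(self, n: int, value: str) -> bool:
        # Phase 2b: accept unless we promised a higher proposal number.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False

def propose(acceptors, n: int, value: str) -> Optional[str]:
    # Phase 1a: collect promises from a quorum.
    promises = [p for p in (a.prepare(n) for a in acceptors) if p is not None]
    if len(promises) <= len(acceptors) // 2:
        return None

    # If anything was already accepted, we must propose the value with the
    # highest accepted proposal number instead of our own.
    already_accepted = [(an, av) for an, av in promises if av is not None]
    if already_accepted:
        value = max(already_accepted)[1]

    # Phase 2a: ask the acceptors to accept (n, value).
    acks = sum(1 for a in acceptors if a.accept(n, value))
    return value if acks > len(acceptors) // 2 else None

acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, 1, "leader=replica-1"))  # chosen: leader=replica-1
print(propose(acceptors, 2, "leader=replica-2"))  # still leader=replica-1
```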

write-ahead log #

A write-ahead log can be used to ensure the durability of the operations that are performed: a transaction can be acknowledged once it has been committed to the write-ahead log.
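
A minimal sketch of that ordering, assuming a single-node, append-only JSON-lines log (the file name and record format are made up): the entry is flushed and fsynced before the in-memory state is updated, so the write can be acknowledged and replayed after a crash.

```python
# Minimal write-ahead-log sketch: append + fsync the log entry first, then
# apply the write and acknowledge. File name and record format are made up.
import json
import os

WAL_PATH = "data.wal"
state: dict[str, str] = {}

def append_to_wal(op: dict) -> None:
    with open(WAL_PATH, "a") as wal:
        wal.write(json.dumps(op) + "\n")
        wal.flush()
        os.fsync(wal.fileno())       # durable on disk before we acknowledge

def write(key: str, value: str) -> None:
    append_to_wal({"op": "set", "key": key, "value": value})
    state[key] = value               # apply only after the entry is durable

write("user:42", "Ada")              # safe to ack the client once this returns
```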

Exponential backoff #
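
Retrying a failed request immediately is how a brief blip turns into a thundering herd; backing off exponentially, ideally with jitter, spreads retries out. A minimal sketch with arbitrary constants:

```python
# Minimal exponential backoff with jitter; the base delay, cap, and attempt
# count are arbitrary illustrative choices.
import random
import time

def call_with_backoff(fn, max_attempts: int = 5,
                      base_s: float = 0.1, cap_s: float = 10.0):
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt; random jitter keeps clients from
            # retrying in lockstep and stampeding the server again.
            delay = min(cap_s, base_s * 2 ** attempt)
            time.sleep(random.uniform(0, delay))
```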

Read locks and write locks #

Read locks (aka shared locks) are a common database pattern that allows multiple readers on a row of data while preventing mutation: subsequent writes are blocked until all readers have released their locks.

Write locks are exclusive locks on a row, blocking both reads and writes.
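
A toy readers-writer lock that captures those semantics (no fairness or writer priority, so it is a sketch rather than production code):

```python
# Toy readers-writer lock: any number of concurrent readers, but a writer
# waits for all readers to release, and everyone waits while a writer holds
# the lock. No fairness or writer priority -- a sketch, not production code.
import threading

class ReadWriteLock:
    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self) -> None:
        with self._cond:
            while self._writer:            # reads block behind an exclusive lock
                self._cond.wait()
            self._readers += 1

    def release_read(self) -> None:
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()    # last reader out lets a writer in

    def acquire_write(self) -> None:
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self) -> None:
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```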