Networking Fundamentals

While you don't need to be a network engineer, understanding the fundamental principles of how data moves across the internet is crucial for designing robust and efficient systems. In a system design interview, this knowledge helps you justify your choices and understand potential bottlenecks.

This chapter focuses on three core components: DNS, TCP/IP, and HTTP/HTTPS.

DNS: The Internet's Phonebook

What it is: DNS (Domain Name System) translates human-readable domain names (like www.google.com) into machine-readable IP addresses (like 172.217.14.228).

Why it matters in System Design:

  • First Point of Contact: DNS resolution is the very first step when a user tries to access your service. A slow DNS lookup means a slow start for your user's request.
  • Load Balancing: DNS can be used for global load balancing. By returning different IP addresses based on the user's geographic location (GeoDNS) or server load, you can direct traffic to the nearest or healthiest datacenter. This is a simple way to achieve high-level scalability and availability.
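  • Resilience: If one of your datacenters goes down, you can update the DNS records to point all traffic to a healthy datacenter. Clients switch over only as their cached records expire, so the speed of this failover depends on the record's TTL.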
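
The load-balancing and resilience points above can be sketched as a toy GeoDNS decision. This is a minimal illustration, not how any particular DNS provider works: the Datacenter type, the REGION_PREFERENCE table, and the addresses (taken from the reserved documentation ranges) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Datacenter:
    region: str
    ip: str
    healthy: bool = True


# Hypothetical preference order: nearest region first, then fallbacks.
REGION_PREFERENCE = {
    "us": ["us", "eu", "asia"],
    "eu": ["eu", "us", "asia"],
    "asia": ["asia", "us", "eu"],
}


def resolve(client_region: str, datacenters: List[Datacenter]) -> Optional[str]:
    """Return the IP of the nearest healthy datacenter, or None if all are down."""
    by_region = {dc.region: dc for dc in datacenters}
    # Unknown client regions fall back to an arbitrary default order.
    for region in REGION_PREFERENCE.get(client_region, ["us", "eu", "asia"]):
        dc = by_region.get(region)
        if dc is not None and dc.healthy:
            return dc.ip
    return None
```

With every datacenter healthy, a client in "eu" is sent to the "eu" address. If that datacenter is marked unhealthy, the same call returns the next region in the preference list.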

Common DNS Record Types:

  • A Record: Maps a domain name to an IPv4 address.
  • AAAA Record: Maps a domain name to an IPv6 address.
  • CNAME (Canonical Name): Maps a domain name to another domain name. Useful for aliasing. For example, ftp.example.com could be a CNAME to example.com.
  • NS (Name Server): Delegates a domain or subdomain to a set of authoritative name servers.
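
To see how these record types interact, here is a small sketch of a lookup against an in-memory record table. The RECORDS table, the lookup function, and the max_hops limit are assumptions for illustration. A real resolver also handles delegation, caching, and many more record types.

```python
from typing import Dict, List, Tuple

# Hypothetical zone data keyed by (name, record type).
RECORDS: Dict[Tuple[str, str], List[str]] = {
    ("example.com", "A"): ["192.0.2.1"],
    ("example.com", "AAAA"): ["2001:db8::1"],
    ("ftp.example.com", "CNAME"): ["example.com"],
    ("example.com", "NS"): ["ns1.example.net", "ns2.example.net"],
}


def lookup(name: str, record_type: str,
           records: Dict[Tuple[str, str], List[str]] = RECORDS,
           max_hops: int = 8) -> List[str]:
    """Return the values for (name, record_type), following CNAME aliases."""
    for _ in range(max_hops):
        direct = records.get((name, record_type))
        if direct is not None:
            return list(direct)
        alias = records.get((name, "CNAME"))
        if alias is None:
            return []  # no such record and no alias to follow
        name = alias[0]  # restart the lookup at the canonical name
    raise ValueError("CNAME chain too long")
```

Looking up the A record for ftp.example.com finds no direct answer, follows the CNAME to example.com, and returns that name's IPv4 address.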

Trade-offs:

  • TTL (Time to Live): DNS records are cached to reduce latency. The TTL value tells caches how long they may keep a record before looking it up again. A low TTL means changes propagate quickly, but it also means more frequent DNS lookups, which increases load on your DNS servers. A high TTL reduces DNS traffic but makes it slow to fail over if a server goes down, because clients keep using the cached address until it expires.
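
The TTL trade-off can be made concrete with a small caching resolver. This is a sketch under stated assumptions: TtlDnsCache, its resolver callback, and the injectable clock are hypothetical names introduced here so the behavior can be observed without waiting in real time.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Cache answers from an upstream resolver for ttl_seconds."""

    def __init__(self, resolver: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolver = resolver
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expiry)
        self.upstream_lookups = 0

    def resolve(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no upstream traffic
        # Missing or expired: ask upstream and cache the new answer.
        self.upstream_lookups += 1
        ip = self._resolver(name)
        self._entries[name] = (ip, now + self._ttl)
        return ip
```

If the upstream record changes while an entry is still fresh, the cache keeps returning the old address until the TTL runs out. A shorter TTL shrinks that stale window at the cost of a higher upstream_lookups count.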

TCP/IP: The Postal Service of the Internet

TCP/IP is a suite of communication protocols that define how data is broken down into packets and sent across a network.

IP (Internet Protocol)

What it is: IP is responsible for addressing and routing. It ensures that each packet is labeled with the correct destination IP address, but it makes no guarantees about delivery. It's like writing an address on an envelope.
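
As a rough illustration of the routing half of that job, the sketch below picks a next hop by longest-prefix match, the rule routers commonly use to choose among overlapping routes. The ROUTES table and the hop names are made up for the example.

```python
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical routing table: (destination network, next hop name).
ROUTES: List[Tuple[ipaddress.IPv4Network, str]] = [
    (ipaddress.ip_network("0.0.0.0/0"), "default-gateway"),
    (ipaddress.ip_network("10.0.0.0/8"), "core-router"),
    (ipaddress.ip_network("10.1.0.0/16"), "rack-switch"),
]


def next_hop(destination: str,
             routes: List[Tuple[ipaddress.IPv4Network, str]] = ROUTES) -> Optional[str]:
    """Return the next hop of the most specific route containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if address in net]
    if not matches:
        return None
    # The longest prefix is the most specific match.
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

Nothing in this lookup confirms that the packet arrives. It only decides where to send it next, which matches IP's lack of delivery guarantees.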

TCP (Transmission Control Protocol)

What it is: TCP runs on top of IP and adds reliability and ordering. It retransmits lost packets, delivers data to the application in the order it was sent, and uses checksums to detect corruption.

How it works (The 3-Way Handshake):

  1. SYN: The client sends a "synchronize" packet to the server to initiate a connection.
  2. SYN-ACK: The server sends a "synchronize-acknowledgment" packet back to the client.
  3. ACK: The client sends an "acknowledgment" packet back, and the connection is established.
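
The three steps can be traced with sequence and acknowledgment numbers. The function below is a simulation only, and the initial sequence numbers passed in are arbitrary example values. In a real connection the operating system chooses them and performs the exchange when a program calls connect() and accept().

```python
from typing import List, Optional, Tuple

Segment = Tuple[str, str, int, Optional[int]]  # (sender, flags, seq, ack)

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three handshake segments for the given initial sequence numbers."""
    # 1. SYN: the client announces its initial sequence number (ISN).
    syn = ("client", "SYN", client_isn, None)
    # 2. SYN-ACK: the server announces its own ISN and acknowledges the
    #    client's SYN, which consumes one sequence number.
    syn_ack = ("server", "SYN-ACK", server_isn, (client_isn + 1) % SEQ_SPACE)
    # 3. ACK: the client acknowledges the server's SYN the same way.
    ack = ("client", "ACK", (client_isn + 1) % SEQ_SPACE, (server_isn + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]
```

Each side acknowledges the other's initial sequence number plus one, so both ends agree on where the byte stream starts before any application data is sent.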

Why it matters in System Design:

  • Reliability vs. Performance: TCP's guarantees (ordering, error checking, retransmission) are essential for most applications (like file transfers or web browsing), but they come with overhead. The handshake adds a full round trip before any data can be sent, and the acknowledgments create extra traffic.
  • TCP vs. UDP: UDP (User Datagram Protocol) is another protocol that runs on top of IP. It's "fire and forget": it offers no guarantees of delivery or order. With no handshake, acknowledgments, or retransmissions, it has lower latency and less overhead than TCP.
    • Use TCP for: Web traffic, email, file transfers, and anything else where data integrity is critical.
    • Use UDP for: Live video streaming, online gaming, VoIP, and anything else where low latency matters more than losing a single packet here and there.
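
The difference shows up directly in the socket API. The sketch below sends the same payload over loopback with each protocol. The function names are made up for this example, and loopback hides the packet loss and reordering that UDP permits on a real network.

```python
import socket
import threading


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a loopback receiver. No connection is set up."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_round_trip(payload: bytes) -> bytes:
    """Send a byte stream over a loopback TCP connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.settimeout(2.0)
    received = []

    def serve() -> None:
        conn, _addr = listener.accept()  # completes the 3-way handshake
        with conn:
            conn.settimeout(2.0)
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:  # empty read: the client closed the connection
                    break
                chunks.append(chunk)
            received.append(b"".join(chunks))

    thread = threading.Thread(target=serve)
    thread.start()
    try:
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            client.sendall(payload)
    finally:
        thread.join()
        listener.close()
    return received[0]
```

The UDP path needs no listen, accept, or connect step. The TCP path pays for connection setup and teardown, and in return delivers the bytes reliably and in order.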

HTTP/HTTPS: The Language of the Web

What it is: HTTP (Hypertext Transfer Protocol) is an application-layer protocol that defines a set of rules for how clients (like web browsers) and servers communicate. It runs on top of TCP.

HTTPS (HTTP Secure): This is HTTP sent over TLS (Transport Layer Security), the successor to SSL. TLS encrypts the communication between the client and server and authenticates the server with a certificate, which protects against eavesdropping, tampering, and man-in-the-middle attacks.
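
On the client side, these protections come from the TLS configuration. The sketch below uses Python's standard ssl module. The helper name build_client_tls_context and the choice of TLS 1.2 as a minimum version are assumptions for the example, not requirements stated in this chapter.

```python
import ssl


def build_client_tls_context() -> ssl.SSLContext:
    """Create a client-side TLS context that verifies the server's identity."""
    # The default context loads the system's trusted CA certificates,
    # requires a valid server certificate, and checks the hostname.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A client would then wrap a connected TCP socket with context.wrap_socket(sock, server_hostname=...) so the TLS handshake runs before any HTTP data is sent. Disabling certificate or hostname checks removes the protection against man-in-the-middle attacks.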

Why it matters in System Design:

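  • Statelessness: HTTP is a stateless protocol. Each request carries everything the server needs to process it, independent of earlier requests. This is a fundamental design principle that makes scaling web services much easier. As long as session state lives in the request itself (for example, in a cookie or token) or in a shared store, any server can handle any request, and you can easily add more servers behind a load balancer.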
  • Common Methods (Verbs):
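    • GET: Retrieve data (e.g., a web page, a user's profile). Should be safe (no side effects on the resource) and idempotent.
    • POST: Create a new resource (e.g., a new user, a new tweet). Not idempotent: repeating the request may create duplicates.
    • PUT: Replace an existing resource completely. Idempotent: repeating the request leaves the same result.
    • PATCH: Partially update an existing resource. Not guaranteed to be idempotent.
    • DELETE: Remove a resource. Idempotent.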
  • Persistent Connections (Keep-Alive): Establishing a TCP connection costs at least one round trip, and more with TLS. HTTP/1.1 made persistent connections the default, allowing the client and server to exchange multiple requests and responses over a single TCP connection, which significantly reduces latency. HTTP/2 goes further with multiplexing, allowing multiple requests and responses to be in flight simultaneously over a single connection.
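
Persistent connections can be observed with Python's standard library alone. The sketch below starts a throwaway HTTP/1.1 server on loopback and sends several GET requests through one client connection, counting how many TCP connections the server accepts. The class and function names are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Tuple


class CountingServer(HTTPServer):
    """HTTP server that counts accepted TCP connections."""
    connections_accepted = 0

    def get_request(self):
        self.connections_accepted += 1
        return super().get_request()


class EchoPathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self) -> None:
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # so the same connection can carry the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the example quiet


def fetch_over_one_connection(paths: List[str]) -> Tuple[List[Tuple[int, str]], int]:
    """GET each path over a single client connection; return results and accept count."""
    server = CountingServer(("127.0.0.1", 0), EchoPathHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    host, port = server.server_address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # Read the full body before reusing the connection.
            results.append((response.status, response.read().decode("utf-8")))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results, server.connections_accepted
```

Calling fetch_over_one_connection with three paths should yield three responses while the server accepts only one TCP connection, so the handshake cost is paid once rather than three times.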

In an interview, showing you understand these layers, from high-level DNS routing down to the choice of TCP vs. UDP for a specific service, demonstrates a practical understanding of how modern systems are built.