Proxies


A proxy server is an intermediary that sits between a client and a destination server, forwarding requests in one direction and responses in the other. Depending on where it is deployed, it acts on behalf of either the client or the server.

Proxies are versatile tools used for security, performance, and monitoring. There are two main types of proxies that are important to understand in system design: Forward Proxies and Reverse Proxies.

Forward Proxy (or just "Proxy")

What it is: A forward proxy sits in front of a group of clients and forwards their requests to the internet. From the perspective of the destination server, the request appears to come from the proxy server itself, not the individual clients.

  • Analogy: A company's mailroom. All outgoing mail from employees goes to the mailroom first, which then sends it out to the postal service. The return address might be the company's address, not the individual employee's desk.
[Diagram: Client 1 and Client 2, inside a private network (e.g., a corporate office), send requests through a Forward Proxy to the Internet.]

Common Use Cases:

  1. Bypassing Firewalls and Censorship: This is a common use case for consumer proxy services (and, in a similar way, VPNs). If a user is on a network that blocks access to a certain website, they can send their request to a proxy server outside that network, which then accesses the website on their behalf.
  2. Filtering Outgoing Traffic: A company or school might use a forward proxy to prevent employees or students from accessing certain websites.
  3. Anonymity: It hides the IP addresses of the clients, providing a degree of anonymity.
  4. Caching: A forward proxy can cache frequently accessed content from the internet to speed up access for all clients behind it.
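
The filtering, anonymity, and caching use cases above can be illustrated with a small in-memory sketch. It is a toy model rather than a real network proxy: the `ForwardProxy` class, the `fake_origin` callback, the host names, and the IP addresses are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple
from urllib.parse import urlsplit

# Hypothetical origin callback: (source_ip, url) -> response body.
Origin = Callable[[str, str], str]


class ForwardProxy:
    """Toy forward proxy: filters destinations, caches bodies, masks client IPs."""

    def __init__(self, proxy_ip: str, blocked_hosts: Set[str], origin: Origin) -> None:
        self.proxy_ip = proxy_ip
        self.blocked_hosts = blocked_hosts
        self.origin = origin
        self.cache: Dict[str, str] = {}

    def handle(self, client_ip: str, url: str) -> Tuple[int, str]:
        host = urlsplit(url).hostname or ""
        # Filtering: refuse requests to blocked destinations.
        if host in self.blocked_hosts:
            return 403, "blocked by proxy policy"
        # Caching: reuse a stored body instead of contacting the origin again.
        if url in self.cache:
            return 200, self.cache[url]
        # Anonymity: the origin is called with the proxy's IP; client_ip is
        # deliberately not forwarded.
        body = self.origin(self.proxy_ip, url)
        self.cache[url] = body
        return 200, body


# Records which source IP the (fake) destination server observes.
seen_ips: List[str] = []


def fake_origin(source_ip: str, url: str) -> str:
    seen_ips.append(source_ip)
    return f"content of {url}"


proxy = ForwardProxy("203.0.113.10", {"blocked.example"}, fake_origin)
blocked = proxy.handle("10.0.0.5", "http://blocked.example/")
first = proxy.handle("10.0.0.5", "http://news.example/a")
second = proxy.handle("10.0.0.6", "http://news.example/a")  # served from cache
```

After these three calls, `seen_ips` holds a single entry, the proxy's own address: the blocked request never left the proxy, and the second request for the same URL was answered from the cache. A production proxy would also need cache expiry and support for HTTPS tunneling, which this sketch omits.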

Reverse Proxy

What it is: A reverse proxy sits in front of a group of servers and forwards requests from the internet to one of those servers. From the perspective of the client, the reverse proxy is the destination server: the client sends its request to the proxy's address and has no knowledge of the backend servers behind it.

  • Analogy: A company's customer service phone number. You call one public number, and a receptionist (the reverse proxy) directs your call to the appropriate department or agent (the backend server). You don't know or care about the direct extension of the agent you're talking to.
[Diagram: Internet clients send requests to a Reverse Proxy (Load Balancer), which distributes them to servers in a private backend network, such as API Server 1 and Web Server 2.]

This is the type of proxy you will almost always be talking about in a system design interview. In fact, a load balancer is a specific type of reverse proxy.
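
To make the load balancer relationship concrete, here is a minimal sketch of the decision a reverse proxy makes when it balances load. It assumes a simple round-robin policy; the `RoundRobinBalancer` class and the backend addresses are hypothetical, and real load balancers typically add health checks and other selection strategies.

```python
from itertools import cycle
from typing import Iterator, List


class RoundRobinBalancer:
    """Picks backends in a fixed rotation (round-robin)."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly in order.
        self._rotation: Iterator[str] = cycle(backends)

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical private addresses; clients only ever see the proxy's address.
balancer = RoundRobinBalancer(["10.0.1.1:8080", "10.0.1.2:8080"])
picks = [balancer.pick() for _ in range(3)]
```

With two backends, three consecutive requests go to the first, the second, and then the first backend again.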

Common Use Cases:

  1. Load Balancing: As discussed in the "Load Balancing" chapter, a reverse proxy can distribute incoming traffic across multiple backend servers to improve scalability and reliability.
  2. Security and Anonymity for Servers: It hides the IP addresses and architecture of your backend servers from the public internet. This provides a layer of security and makes it harder for attackers to target your servers directly.
  3. SSL Termination: A reverse proxy can handle the decryption of incoming HTTPS requests and the encryption of outgoing responses. This offloads the computationally expensive work of SSL/TLS from your backend application servers, allowing them to focus on their core business logic.
  4. Caching: It can cache responses from the backend servers and serve them directly to clients for subsequent requests, reducing the load on the backend.
  5. Request Routing (API Gateway): A sophisticated reverse proxy (often called an API Gateway) can route requests to different microservices based on the URL path or other request attributes. For example, requests to /api/users go to the user service, while requests to /api/orders go to the order service.
  6. Compression: It can compress outgoing responses to reduce the amount of data sent over the network, speeding up load times for clients.
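
The request routing use case can be sketched as a lookup from URL path prefix to backend service. The `/api/users` and `/api/orders` paths come from the example above; the service names, the `ROUTES` table, and the `route` function are hypothetical, and the sketch assumes the path has already been separated from any query string.

```python
from typing import Dict, Optional

# Hypothetical routing table: path prefix -> backend service name.
ROUTES: Dict[str, str] = {
    "/api/users": "user-service",
    "/api/orders": "order-service",
}


def route(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the service for the longest matching path prefix, or None."""
    best: Optional[str] = None
    for prefix in routes:
        # Match whole path segments so /api/usersettings does not hit /api/users.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None


user_target = route("/api/users/42")
order_target = route("/api/orders")
unknown_target = route("/health")
```

Choosing the longest matching prefix lets a more specific route take precedence over a general one if both are present in the table. An unmatched path returns `None`, which a gateway would typically turn into a 404 response.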

Forward Proxy vs. Reverse Proxy

Feature        | Forward Proxy                              | Reverse Proxy
Position       | Sits in front of clients                   | Sits in front of servers
Represents     | The client                                 | The server
Primary Goal   | To protect and manage outgoing traffic     | To protect and manage incoming traffic
Visibility     | The server doesn't know the actual client  | The client doesn't know the actual server
Common Example | A corporate web filter                     | A load balancer or API Gateway

In a system design interview, when you add a load balancer or an API Gateway to your diagram, you are adding a reverse proxy. Being able to use this terminology correctly and explain why you are using a reverse proxy (for load balancing, SSL termination, caching, etc.) demonstrates a solid understanding of web architecture.