Concurrency & Parallelism
Execution & Concurrency Models
Concurrency vs. Parallelism: What's the Difference?
These two terms are often used interchangeably, but they represent distinct concepts. Understanding the difference is crucial for designing efficient and scalable applications.
- Concurrency: Concurrency is about dealing with multiple tasks at once. It's a concept related to the structure of a program. A concurrent program is one where different parts can be in progress at the same time. This doesn't necessarily mean they are running at the same time. For example, on a single-core CPU, the system can switch between different tasks, giving the illusion of simultaneous execution.
- Analogy: A chef juggling multiple tasks in the kitchen—chopping vegetables, stirring a pot, and watching the oven. They are making progress on all tasks over a period of time, but they are only doing one specific action at any given instant.
- Parallelism: Parallelism is about doing multiple tasks at once. It's a concept related to the execution of a program. A parallel program is one where multiple tasks are running simultaneously, typically on different CPU cores. Parallelism is impossible on a single-core machine.
- Analogy: A team of three chefs working in a kitchen. One is chopping vegetables, another is stirring a pot, and a third is washing dishes. All three tasks are happening at the exact same time.
Key takeaway: Concurrency is about structure; parallelism is about execution. You can have a concurrent program that doesn't run in parallel (e.g., on a single-core CPU), but you cannot have parallelism without a concurrent design.
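To make this concrete, here is a minimal sketch of concurrency without parallelism, using Python generators as hand-rolled coroutines (the kitchen task names are invented for illustration). A single thread interleaves two tasks, so both are in progress over the same period even though only one step executes at any instant.
# Sketch: concurrency on one thread -- no parallelism involved
def chop_vegetables():
    for i in range(3):
        print(f"chopping... step {i}")
        yield  # voluntarily hand control back to the scheduler

def stir_pot():
    for i in range(3):
        print(f"stirring... step {i}")
        yield

# A toy round-robin "scheduler": only one step runs at any instant,
# yet both tasks are "in progress" over the same period of time.
tasks = [chop_vegetables(), stir_pot()]
while tasks:
    for task in list(tasks):
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)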
Common Concurrency Models
1. Threads and Processes
- Process: A process is an instance of a running program. Each process has its own isolated memory space (its own stack and heap). Processes are heavyweight and communication between them (Inter-Process Communication or IPC) is relatively slow as it has to go through the operating system.
- Thread: A thread is the smallest unit of execution within a process. A single process can have multiple threads, all of which share the same memory space (the same heap). Each thread has its own stack for local variables and function calls, but they share the main memory of the process.
Threads are the most common model for achieving parallelism on multi-core systems.
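As a minimal sketch of the difference in Python's standard library (one caveat: in CPython, the Global Interpreter Lock means threads only achieve true parallelism for I/O-bound or native-code work, while separate processes sidestep it at the cost of isolated memory):
# Sketch: a thread sees the parent's heap; a process gets its own copy
import threading
import multiprocessing

shared = []

def append_item():
    shared.append("hello")

if __name__ == "__main__":
    t = threading.Thread(target=append_item)
    t.start(); t.join()
    print("after thread:", shared)   # ['hello'] -- same heap is visible

    p = multiprocessing.Process(target=append_item)
    p.start(); p.join()
    print("after process:", shared)  # still ['hello'] -- the child mutated
                                     # its own copy in its own memory space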
The Challenge: Shared State
Because threads share memory, they can read and write to the same variables. This is powerful but also dangerous. If two threads try to modify the same variable at the same time, you can get a race condition, leading to corrupted data and unpredictable behavior.
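A quick sketch of such a race in Python (how many updates are lost varies from run to run, and the interpreter's switch interval can mask it on small workloads, so the shortfall is likely rather than guaranteed):
# Sketch: a race condition on an unsynchronized counter
import threading

count = 0

def increment_many(n=100_000):
    global count
    for _ in range(n):
        # 'count += 1' is a read-modify-write: two threads can read the
        # same old value and each write back old+1, losing an update.
        count += 1

threads = [threading.Thread(target=increment_many) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # often less than 400000 -- updates were lost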
To prevent this, you need synchronization mechanisms:
- Locks (or Mutexes - Mutual Exclusion): A lock is a mechanism that ensures only one thread can enter a "critical section" of code at a time. Before accessing a shared resource, a thread must "acquire" the lock. If another thread already holds the lock, the new thread will block (wait) until the lock is released. This prevents race conditions but can introduce other problems like deadlocks (where two or more threads are each waiting for the other to release a lock; see the sketch after the counter examples below) and starvation (where a thread never gets to run because others keep acquiring the lock).
// Java example of a synchronized method (using a lock)
public class Counter {
    private int count = 0;

    // The 'synchronized' keyword ensures that only one thread can execute
    // this method on a given instance at a time.
    public synchronized void increment() {
        count++;
    }

    // Also synchronized so a reader thread is guaranteed to see the latest
    // value (visibility under the Java Memory Model), not a stale one.
    public synchronized int getCount() {
        return count;
    }
}
# Python example using a Lock
import threading

class Counter:
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()

    def increment(self):
        # Acquire the lock before modifying the shared resource
        with self.lock:
            self.count += 1

    def get_count(self):
        # Taking the lock on reads as well guarantees we never observe
        # a partially updated value.
        with self.lock:
            return self.count
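And a sketch of the deadlock hazard named above: two threads take the same two locks in opposite orders, so each ends up waiting on the other forever. (The sleeps only make the bad interleaving reliable; this program intentionally never finishes.)
# Sketch of a deadlock -- do NOT do this in real code
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def task_one():
    with lock_a:
        time.sleep(0.1)   # give task_two time to grab lock_b
        with lock_b:      # blocks forever: task_two holds lock_b
            pass

def task_two():
    with lock_b:
        time.sleep(0.1)   # give task_one time to grab lock_a
        with lock_a:      # blocks forever: task_one holds lock_a
            pass

threading.Thread(target=task_one).start()
threading.Thread(target=task_two).start()
# Fix: have every thread acquire locks in one agreed-upon global order.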
- Semaphores: A semaphore is like a lock that allows a specified number of threads to access a resource at once; a lock is effectively a semaphore with a count of one. They are useful for managing a pool of limited resources.
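A short sketch with Python's threading.Semaphore, modelling a hypothetical pool of three database connections: up to three workers proceed at once, and the rest wait.
# Sketch: a semaphore guarding a pool of limited resources
import threading
import time

# Allow at most 3 threads into the critical section at once,
# e.g. a hypothetical pool of 3 database connections.
pool = threading.Semaphore(3)

def use_connection(worker_id):
    with pool:  # blocks if 3 threads already hold the semaphore
        print(f"worker {worker_id} got a connection")
        time.sleep(0.5)  # simulate work
    # leaving the 'with' block releases the semaphore for a waiting thread

threads = [threading.Thread(target=use_connection, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()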
2. Asynchronous Programming (Async/Await)
This model is an alternative to traditional threading, popularized by Node.js and now common in many languages like Python, C#, and Rust. It's a form of cooperative multitasking.
Instead of relying on the OS to switch between threads, the program itself yields control voluntarily. It's particularly well-suited for I/O-bound tasks.
How it works:
- The program runs on a single main thread (or a small pool of threads).
- When an operation is encountered that would normally block (like a network request), it's started in the background.
- The function "awaits" the result without blocking the main thread. The event loop is now free to run other tasks.
- When the background operation completes, the function resumes where it left off.
The async and await keywords are syntactic sugar that makes this asynchronous code look and feel like synchronous code.
- Pros:
- Avoids the complexity of locks and race conditions for many use cases.
- Very efficient for I/O-bound workloads: instead of blocking on each request, the main thread keeps running other tasks while the I/O completes.
- Cons:
- Not suitable for CPU-bound tasks. A long-running calculation on the main thread will still block everything else.
- Can lead to "async all the way down" code, where a function that becomes async forces its callers to become async as well.
// JavaScript (Node.js)
async function fetchData() {
    console.log("Starting to fetch data...");
    // The 'await' keyword pauses this function, but not the whole program.
    // The event loop can run other tasks while waiting for the network.
    const response = await fetch('https://api.example.com/data');
    const data = await response.json();
    console.log("Data fetched!");
    return data;
}

console.log("Before calling fetch.");
fetchData();
console.log("After calling fetch. This line runs immediately.");
# Python
import asyncio

async def fetch_data():
    print("Starting to fetch data...")
    # 'await' pauses this coroutine and allows the event loop to run other tasks.
    await asyncio.sleep(2)  # Simulate a network request
    print("Data fetched!")
    return {"data": "some_data"}

async def main():
    print("Before calling fetch.")
    await fetch_data()
    print("After calling fetch.")

asyncio.run(main())
Summary for Interviews
- Concurrency is about dealing with many things at once (structure). Parallelism is about doing many things at once (execution).
- Threads are the standard way to achieve parallelism. They are powerful but introduce the complexity of shared state and race conditions.
- Locks (Mutexes) are the primary tool to prevent race conditions by ensuring only one thread can access a resource at a time. Be aware of the risk of deadlocks.
- Async/Await is a concurrency model that excels at I/O-bound tasks. It uses a non-blocking, single-threaded event loop to handle many operations concurrently without the overhead of traditional threads. It is not a good fit for CPU-bound work.
- Choosing the right model depends on the problem:
- For CPU-bound tasks that can be broken into independent pieces (e.g., video encoding, scientific computing), a multi-threaded or multi-processing approach is ideal.
- For I/O-bound tasks with many concurrent connections (e.g., web servers, APIs), an asynchronous (async/await) model is often more efficient and scalable.
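As a closing sketch of that decision in Python (cpu_bound and io_bound are invented placeholders for real work): fan CPU-heavy tasks out to processes, and overlap many I/O waits with asyncio on a single thread.
# Sketch: picking the model to match the workload
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):               # placeholder CPU-heavy work
    return sum(i * i for i in range(n))

async def io_bound(delay):      # placeholder I/O wait (e.g. a network call)
    await asyncio.sleep(delay)
    return delay

if __name__ == "__main__":
    # CPU-bound: spread the work across cores with separate processes.
    with ProcessPoolExecutor() as pool:
        print(list(pool.map(cpu_bound, [10_000] * 4)))

    # I/O-bound: overlap many waits on one thread with asyncio.
    async def main():
        results = await asyncio.gather(*(io_bound(0.1) for _ in range(100)))
        print(len(results), "I/O tasks finished concurrently")

    asyncio.run(main())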