Introduction to Concurrency¶
Concurrency is about dealing with multiple things at once. This chapter covers Python's tools for concurrent and parallel execution.
Mental Model
Concurrency means structuring a program to handle multiple tasks that overlap in time. Parallelism means actually executing them at the same instant on different cores. Python gives you both: threads and asyncio for concurrency (interleaving work), and processes for true parallelism (simultaneous computation).
Why Concurrency?¶
The Problem: Sequential Execution¶
```python import time
def download_file(url): print(f"Downloading {url}...") time.sleep(2) # Simulate network delay print(f"Finished {url}")
Sequential: 6 seconds total¶
urls = ["file1.zip", "file2.zip", "file3.zip"] for url in urls: download_file(url) ```
Each download waits for the previous one to complete. Total time: 6 seconds.
The Solution: Concurrent Execution¶
```python import time from concurrent.futures import ThreadPoolExecutor
def download_file(url): print(f"Downloading {url}...") time.sleep(2) print(f"Finished {url}")
Concurrent: ~2 seconds total¶
urls = ["file1.zip", "file2.zip", "file3.zip"] with ThreadPoolExecutor() as executor: executor.map(download_file, urls) ```
All downloads happen simultaneously. Total time: ~2 seconds.
Key Terminology¶
Concurrency vs Parallelism¶
| Term | Definition | Analogy |
|---|---|---|
| Concurrency | Managing multiple tasks at once | One chef juggling multiple dishes |
| Parallelism | Executing multiple tasks simultaneously | Multiple chefs cooking simultaneously |
Concurrency is about structure — organizing code to handle multiple tasks. Parallelism is about execution — actually running tasks at the same time.
``` Concurrency (single core): Task A: ██░░██░░██ Task B: ░░██░░██░░ Time →
Parallelism (multiple cores): Task A: ██████████ (Core 1) Task B: ██████████ (Core 2) Time → ```
Threads vs Processes¶
| Aspect | Thread | Process |
|---|---|---|
| Memory | Shared memory space | Separate memory space |
| Creation | Fast, lightweight | Slower, heavier |
| Communication | Easy (shared variables) | Requires IPC (queues, pipes) |
| GIL impact | Affected by GIL | Not affected by GIL |
| Best for | I/O-bound tasks | CPU-bound tasks |
Synchronous vs Asynchronous¶
| Mode | Description |
|---|---|
| Synchronous | Wait for each operation to complete before starting next |
| Asynchronous | Start operations without waiting, handle results when ready |
Python's Concurrency Tools¶
Standard Library Modules¶
| Module | Purpose | Use Case |
|---|---|---|
threading |
Thread-based concurrency | I/O-bound tasks |
multiprocessing |
Process-based parallelism | CPU-bound tasks |
concurrent.futures |
High-level interface | Both (recommended) |
asyncio |
Async I/O | High-concurrency I/O |
queue |
Thread-safe queues | Producer-consumer patterns |
Which to Use?¶
Start here
│
├─ Is the task CPU-intensive (computation)?
│ │
│ ├─ Yes → multiprocessing / ProcessPoolExecutor
│ │
│ └─ No → Continue
│
├─ Is the task I/O-intensive (network, disk)?
│ │
│ ├─ Yes → threading / ThreadPoolExecutor
│ │ or asyncio for very high concurrency
│ │
│ └─ No → Sequential is probably fine
│
└─ Simple parallel map over data?
│
└─ Yes → concurrent.futures (easiest)
Real-World Examples¶
I/O-Bound: Web Scraping¶
```python import requests from concurrent.futures import ThreadPoolExecutor
urls = [ "https://example.com/page1", "https://example.com/page2", "https://example.com/page3", ]
def fetch(url): response = requests.get(url) return len(response.content)
Threads work well — waiting for network I/O¶
with ThreadPoolExecutor(max_workers=10) as executor: results = list(executor.map(fetch, urls)) ```
CPU-Bound: Number Crunching¶
```python from concurrent.futures import ProcessPoolExecutor
def compute_heavy(n): """CPU-intensive calculation.""" return sum(i * i for i in range(n))
numbers = [10_000_000, 20_000_000, 30_000_000]
Processes work well — true parallel computation¶
with ProcessPoolExecutor() as executor: results = list(executor.map(compute_heavy, numbers)) ```
Mixed: Data Pipeline¶
```python from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def download_data(url): """I/O-bound: fetch from network.""" import requests return requests.get(url).json()
def process_data(data): """CPU-bound: heavy computation.""" return expensive_computation(data)
Stage 1: Download (I/O-bound) — use threads¶
with ThreadPoolExecutor() as executor: raw_data = list(executor.map(download_data, urls))
Stage 2: Process (CPU-bound) — use processes¶
with ProcessPoolExecutor() as executor: results = list(executor.map(process_data, raw_data)) ```
Performance Comparison¶
Benchmark: I/O-Bound Task¶
```python import time import requests from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def fetch_url(url): requests.get(url) return url
urls = ["https://httpbin.org/delay/1"] * 5
Sequential: ~5 seconds¶
ThreadPool: ~1 second ✓ Best¶
ProcessPool: ~1.5 seconds (overhead)¶
```
Benchmark: CPU-Bound Task¶
```python import time from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def compute(n): return sum(i ** 2 for i in range(n))
numbers = [5_000_000] * 4
Sequential: ~4 seconds (on 4-core machine)¶
ThreadPool: ~4 seconds (GIL blocks parallelism)¶
ProcessPool: ~1 second ✓ Best¶
```
Common Pitfalls¶
1. Using Threads for CPU-Bound Work¶
```python
Bad: Threads don't help with CPU-bound tasks¶
with ThreadPoolExecutor() as executor: results = executor.map(heavy_computation, data) # No speedup!
Good: Use processes for CPU-bound tasks¶
with ProcessPoolExecutor() as executor: results = executor.map(heavy_computation, data) # Real parallelism ```
2. Too Many Workers¶
```python
Bad: 1000 threads/processes is wasteful¶
with ThreadPoolExecutor(max_workers=1000) as executor: ...
Good: Match workers to task type¶
I/O-bound: 10-50 threads typically sufficient¶
CPU-bound: Match CPU cores¶
import os with ProcessPoolExecutor(max_workers=os.cpu_count()) as executor: ... ```
3. Shared State Without Synchronization¶
```python
Bad: Race condition¶
counter = 0 def increment(): global counter counter += 1 # Not thread-safe!
Good: Use synchronization¶
import threading counter = 0 lock = threading.Lock() def increment(): global counter with lock: counter += 1 ```
Chapter Overview¶
This chapter covers:
- Concurrency Concepts — GIL, CPU vs I/O bound, threads vs processes
- threading Module — Creating threads, synchronization, communication
- multiprocessing Module — Processes, pools, sharing state
- concurrent.futures — Modern, high-level API (recommended)
- Practical Patterns — Decision guide, common patterns, error handling
Key Takeaways¶
- Concurrency = managing multiple tasks; Parallelism = running simultaneously
- Threads share memory, affected by GIL — best for I/O-bound tasks
- Processes have separate memory, bypass GIL — best for CPU-bound tasks
- concurrent.futures provides the cleanest API for most use cases
- Match your concurrency strategy to your task type
- Always consider synchronization when sharing state
Exercises¶
Exercise 1.
Write a program that creates two threads: one prints "Hello" 5 times with a 0.1s delay, and the other prints "World" 5 times with a 0.15s delay. Use threading.Thread to run them concurrently and join() to wait for both. Observe the interleaved output.
Solution to Exercise 1
```python
import threading
import time
def say_hello():
for _ in range(5):
print("Hello")
time.sleep(0.1)
def say_world():
for _ in range(5):
print("World")
time.sleep(0.15)
t1 = threading.Thread(target=say_hello)
t2 = threading.Thread(target=say_world)
t1.start()
t2.start()
t1.join()
t2.join()
print("Both threads finished.")
```
Exercise 2.
Write a function that runs a simulated I/O task (time.sleep(0.5)) both sequentially (4 times) and concurrently using ThreadPoolExecutor with 4 workers. Measure and print the elapsed time for each approach and compute the speedup.
Solution to Exercise 2
```python
import time
from concurrent.futures import ThreadPoolExecutor
def io_task(n):
time.sleep(0.5)
return n
# Sequential
start = time.perf_counter()
for i in range(4):
io_task(i)
seq_time = time.perf_counter() - start
# Concurrent
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
list(executor.map(io_task, range(4)))
conc_time = time.perf_counter() - start
print(f"Sequential: {seq_time:.2f}s")
print(f"Concurrent: {conc_time:.2f}s")
print(f"Speedup: {seq_time / conc_time:.2f}x")
```
Exercise 3.
Demonstrate the difference between threads and processes by running a CPU-bound function (sum of squares up to 5,000,000) four times using ThreadPoolExecutor and four times using ProcessPoolExecutor. Compare the elapsed times and explain which is faster and why.
Solution to Exercise 3
```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def cpu_task(n):
return sum(i * i for i in range(n))
if __name__ == "__main__":
args = [5_000_000] * 4
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as ex:
list(ex.map(cpu_task, args))
thread_time = time.perf_counter() - start
start = time.perf_counter()
with ProcessPoolExecutor(max_workers=4) as ex:
list(ex.map(cpu_task, args))
proc_time = time.perf_counter() - start
print(f"Threads: {thread_time:.2f}s")
print(f"Processes: {proc_time:.2f}s")
print(f"Processes are faster because they bypass the GIL.")
```