Latency and Bandwidth
Mental Model
Latency is how long it takes for the first drop of water to travel through a pipe; bandwidth is how wide the pipe is. For many small requests, latency dominates -- you wait for each round trip. For large transfers, bandwidth dominates -- you wait for the pipe to drain. Knowing which one is your bottleneck tells you what to optimize.
Two Dimensions of Network Performance
Network performance has two key metrics that are often confused:
```
Bandwidth vs Latency

Bandwidth (throughput): How much water the pipe can carry
Latency (delay):        How long for water to reach the other end

High Bandwidth, High Latency (intercontinental fiber):
    ════════════════════════════════════════════▶
    Lots of data, but takes 100 ms to arrive

Low Bandwidth, Low Latency (local network):
    ══════▶
    Less data, but arrives in 1 ms
```
Latency
Latency is the time delay between sending and receiving data.
Latency Components
```
Total Latency = Propagation + Transmission + Queuing + Processing

Propagation:   Time for signal to travel (speed of light)
               ~5 μs per km in fiber

Transmission:  Time to push bits onto wire
               = Data size / Bandwidth

Queuing:       Time waiting in router/switch buffers
               Variable, depends on congestion

Processing:    Time for routers to process headers
               Usually microseconds
```
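The four components can also be added up numerically. The sketch below is a minimal model that assumes the ~5 μs per km fiber figure from the breakdown above; the function name `one_way_latency_ms` and its zero defaults for queuing and processing are illustrative assumptions, not measurements.

```python
def one_way_latency_ms(distance_km, size_bytes, bandwidth_bps,
                       queuing_ms=0.0, processing_ms=0.0):
    """Estimate one-way latency in milliseconds from its four components."""
    propagation_ms = distance_km * 5e-3  # assumes ~5 microseconds per km in fiber
    transmission_ms = size_bytes * 8 / bandwidth_bps * 1000
    return propagation_ms + transmission_ms + queuing_ms + processing_ms


# A 1500-byte packet over 1000 km of fiber on a 1 Gbps link
print(f"{one_way_latency_ms(1000, 1500, 1e9):.3f} ms")
```

In this example propagation (5 ms) dwarfs transmission (0.012 ms), which is why distance matters so much for small packets.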
Typical Latencies
| Path | Round-Trip Time (RTT) |
|---|---|
| Same machine (localhost) | < 0.1 ms |
| Local network (LAN) | 0.1 - 1 ms |
| Same city | 1 - 10 ms |
| Same continent | 10 - 50 ms |
| Cross-continent | 50 - 150 ms |
| Opposite side of globe | 150 - 300 ms |
| Satellite (GEO) | 500 - 700 ms |
Measuring Latency in Python
```python
import socket
import time


def measure_tcp_latency(host, port, iterations=10):
    """Measure TCP connection latency."""
    latencies = []

    for _ in range(iterations):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(5)

        # connect() covers name resolution plus the TCP handshake (about one RTT)
        start = time.perf_counter()
        try:
            sock.connect((host, port))
            latency = (time.perf_counter() - start) * 1000
            latencies.append(latency)
        except OSError as e:  # timeouts, DNS failures, refused connections
            print(f"Error: {e}")
        finally:
            sock.close()

    if latencies:
        avg = sum(latencies) / len(latencies)
        print(f"Average latency to {host}:{port}: {avg:.2f} ms")

    return latencies


# measure_tcp_latency('google.com', 80)
```
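An average alone can hide a slow outlier, so it may help to look at the minimum and median of the returned list as well. The helper below is a sketch; `summarize_latencies` is a hypothetical name, and the sample values are made up for illustration rather than measured.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return min, median, and max of a non-empty list of latencies (ms)."""
    return {
        "min": min(latencies_ms),
        "median": statistics.median(latencies_ms),
        "max": max(latencies_ms),
    }


# Illustrative sample with one slow outlier (not real measurements)
sample = [12.1, 11.8, 12.4, 95.0, 12.0]
print(summarize_latencies(sample))
```

The same function can be applied directly to the list returned by `measure_tcp_latency`.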
HTTP Latency
```python
import requests
import time


def measure_http_latency(url, iterations=5):
    """Measure HTTP request latency."""
    for _ in range(iterations):
        start = time.perf_counter()
        response = requests.get(url, timeout=10)  # avoid hanging forever
        elapsed = (time.perf_counter() - start) * 1000

        print(f"{url}: {elapsed:.0f} ms (status: {response.status_code})")


# measure_http_latency('https://api.github.com')
```
Bandwidth
Bandwidth is the maximum rate of data transfer.
Bandwidth Units
```
Bits vs Bytes:

Network speeds in bits:  1 Gbps = 1,000,000,000 bits/second
File sizes in bytes:     1 GB   = 1,000,000,000 bytes

Conversion: 1 Gbps = 125 MB/s (divide by 8)

Common confusion:
  "1 Gbps internet" ≠ "1 GB per second downloads"
  "1 Gbps internet" = "125 MB per second" maximum
```
Typical Bandwidths
| Connection | Bandwidth | Download 1 GB |
|---|---|---|
| 4G LTE | 50 Mbps | ~3 minutes |
| Home cable | 100 Mbps | ~80 seconds |
| Gigabit fiber | 1 Gbps | ~8 seconds |
| 10 GbE | 10 Gbps | <1 second |
| Datacenter | 100 Gbps | ~0.1 second |
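The download times in the table follow from dividing the file size in bits by the link rate. A minimal sketch of that arithmetic, assuming an ideal link with no protocol overhead; `download_time_seconds` is a hypothetical helper name.

```python
def download_time_seconds(size_bytes, bandwidth_bps):
    """Ideal time to move size_bytes over a link of bandwidth_bps (no overhead)."""
    return size_bytes * 8 / bandwidth_bps


ONE_GB = 1_000_000_000  # decimal gigabyte, as used in the table

for name, bps in [("4G LTE", 50e6), ("Home cable", 100e6), ("Gigabit fiber", 1e9)]:
    print(f"{name}: {download_time_seconds(ONE_GB, bps):.0f} s")
```

Real downloads will usually be somewhat slower, since the nominal link rate is an upper bound.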
Measuring Bandwidth
```python
import requests
import time


def measure_download_bandwidth(url):
    """Measure download bandwidth by streaming a large test file."""
    start = time.perf_counter()
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()

    total_bytes = 0
    for chunk in response.iter_content(chunk_size=8192):
        total_bytes += len(chunk)

    # Elapsed time includes connection setup, so small files understate bandwidth
    elapsed = time.perf_counter() - start

    bandwidth_mbps = (total_bytes * 8) / elapsed / 1_000_000
    bandwidth_mbs = total_bytes / elapsed / 1_000_000

    print(f"Downloaded: {total_bytes / 1_000_000:.1f} MB")
    print(f"Time: {elapsed:.2f} seconds")
    print(f"Bandwidth: {bandwidth_mbps:.1f} Mbps ({bandwidth_mbs:.1f} MB/s)")


# measure_download_bandwidth('http://speedtest.example.com/100MB.bin')
```
Bandwidth-Delay Product
The bandwidth-delay product (BDP) is the amount of data that can be "in flight" on a path at once, that is, sent but not yet acknowledged:
```
BDP = Bandwidth × Latency (RTT)

Example:
  Bandwidth: 1 Gbps (125 MB/s)
  Latency:   100 ms RTT (0.1 s)

  BDP = 125 MB/s × 0.1 s = 12.5 MB

This much data can be in transit at any moment!
```
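The same calculation in code, as a small sketch; `bdp_bytes` is a hypothetical helper that takes the bandwidth in bits per second and the round-trip time in seconds.

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_seconds / 8


# 1 Gbps link with a 100 ms round-trip time
print(f"{bdp_bytes(1e9, 0.1) / 1e6:.1f} MB")
```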
Why BDP Matters
```
Pipe Analogy:

Small BDP (low latency OR low bandwidth):
┌──────────────────────────────┐
│  ●    ●    ●    ●            │   Few packets in flight
└──────────────────────────────┘

Large BDP (high latency AND high bandwidth):
┌──────────────────────────────────────────────────────────┐
│ ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● │
└──────────────────────────────────────────────────────────┘
Many packets in flight - need large buffers!
```
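One consequence of this: a sender that keeps only a fixed window of unacknowledged data in flight can achieve at most window / RTT, however fast the link is. The sketch below illustrates that bound; `max_throughput_bps` is a hypothetical helper, and the 64 KiB window is only an example value.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most window_bytes are in flight."""
    return window_bytes * 8 / rtt_seconds


# A 64 KiB window over a 100 ms round trip
print(f"{max_throughput_bps(64 * 1024, 0.1) / 1e6:.2f} Mbps")
```

To fill a 1 Gbps, 100 ms path, the window would need to be at least the BDP (12.5 MB in the earlier example).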
Latency vs Throughput Trade-offs
Small Requests
For small requests, latency dominates:
```python
import requests
import time


def small_request_test(url, n_requests=100):
    """Latency matters more for small requests."""
    start = time.perf_counter()

    for _ in range(n_requests):
        requests.get(url)

    elapsed = time.perf_counter() - start
    per_request = elapsed / n_requests * 1000

    print(f"{n_requests} small requests: {elapsed:.2f}s total")
    print(f"Per request: {per_request:.1f} ms")
    # Mostly latency, not bandwidth!
```
Large Transfers
For large transfers, bandwidth dominates:
```python
import requests
import time


def large_transfer_test(url):
    """Bandwidth matters more for large transfers."""
    start = time.perf_counter()

    response = requests.get(url)  # Large file

    elapsed = time.perf_counter() - start
    size_mb = len(response.content) / 1_000_000
    bandwidth = size_mb / elapsed

    print(f"Downloaded {size_mb:.1f} MB in {elapsed:.2f}s")
    print(f"Effective bandwidth: {bandwidth:.1f} MB/s")
```
Optimizing for Latency
1. Reduce Round Trips
```python
import requests

BASE_URL = 'https://api.example.com'  # placeholder host; endpoints are illustrative

# Bad: Multiple requests
user = requests.get(f'{BASE_URL}/api/user/123').json()
posts = requests.get(f'{BASE_URL}/api/user/123/posts').json()
comments = requests.get(f'{BASE_URL}/api/user/123/comments').json()
# 3 round trips = 3 × latency

# Good: Single request
data = requests.get(f'{BASE_URL}/api/user/123?include=posts,comments').json()
# 1 round trip
```
2. Use Connection Pooling
```python
import requests

urls = ['https://example.com/a', 'https://example.com/b']  # placeholder URLs

# Bad: New connection each time
for url in urls:
    requests.get(url)  # TCP handshake overhead each time

# Good: Reuse connections
session = requests.Session()
for url in urls:
    session.get(url)  # Reuses the TCP connection for requests to the same host
```
3. Geographic Proximity
```
User in Tokyo:

  → US West Server: 100 ms RTT
  → Tokyo Server:    10 ms RTT

10x improvement from location alone!
```
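Because every sequential round trip costs one RTT, the gain from a closer server grows with the number of round trips an operation needs. A rough sketch using the RTT figures above; `sequential_wait_ms` is a hypothetical helper, and the count of 20 round trips is an arbitrary example.

```python
def sequential_wait_ms(round_trips, rtt_ms):
    """Total time spent waiting on sequential round trips, ignoring transfer time."""
    return round_trips * rtt_ms


for server, rtt_ms in [("US West", 100), ("Tokyo", 10)]:
    print(f"{server}: {sequential_wait_ms(20, rtt_ms)} ms for 20 round trips")
```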
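Optimizing for Bandwidth
1. Compression
```python
import gzip
import requests

url = 'https://api.example.com/data'       # placeholder endpoint
large_data = b'example payload ' * 10_000  # placeholder payload (bytes)

# Request compressed data (requests advertises gzip by default
# and decompresses the response transparently)
response = requests.get(url, headers={'Accept-Encoding': 'gzip'})

# Compress before sending (the server must accept gzip-encoded bodies)
data = gzip.compress(large_data)
requests.post(url, data=data, headers={'Content-Encoding': 'gzip'})
```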
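2. Batch Operations
```python
import requests

BASE_URL = 'https://api.example.com'  # placeholder host
items = ['a', 'b', 'c']               # placeholder items

# Bad: Many small requests
for item in items:
    requests.post(f'{BASE_URL}/api/process', json={'item': item})

# Good: Batch request
requests.post(f'{BASE_URL}/api/process-batch', json={'items': items})
```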
3. Parallel Downloads
```python
import concurrent.futures
import requests

urls = ['https://example.com/a.bin', 'https://example.com/b.bin']  # placeholder URLs


def download(url):
    return requests.get(url).content


# Download files in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(download, urls))
```
Summary
| Metric | Definition | Optimization |
|---|---|---|
| Latency | Time to deliver one bit | Reduce distance, round trips |
| Bandwidth | Bits per second capacity | Compression, parallelism |
| Throughput | Actual achieved rate | Balance latency & bandwidth |
| BDP | Bandwidth × Latency | Size buffers appropriately |
Key formulas:
```
Transfer Time = Latency + (Size / Bandwidth)

For small data:  Transfer Time ≈ Latency
For large data:  Transfer Time ≈ Size / Bandwidth

BDP = Bandwidth × RTT
```
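To see both regimes of the transfer-time formula, it can be evaluated for a small and a large payload. This is a sketch of the simple model above, not a measurement; `transfer_time_seconds` is a hypothetical helper, and the 100 Mbps link and 50 ms latency are example values.

```python
def transfer_time_seconds(size_bytes, bandwidth_bps, latency_seconds):
    """Transfer Time = Latency + (Size / Bandwidth)."""
    return latency_seconds + size_bytes * 8 / bandwidth_bps


LATENCY = 0.05     # 50 ms, example value
BANDWIDTH = 100e6  # 100 Mbps, example value

for label, size in [("1 KB", 1_000), ("100 MB", 100_000_000)]:
    total = transfer_time_seconds(size, BANDWIDTH, LATENCY)
    share = LATENCY / total
    print(f"{label}: {total:.3f} s total, latency is {share:.1%} of it")
```

For the 1 KB payload nearly all of the time is latency; for the 100 MB payload latency is under one percent of the total.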
Rules of thumb:
- Many small requests → optimize latency (reduce round trips)
- Few large transfers → optimize bandwidth (compression, parallelism)
- Real applications → profile to find the bottleneck
Exercises
Exercise 1. Explain the difference between latency and bandwidth. Give a real-world analogy for each.
Solution to Exercise 1
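Latency is the delay before data arrives: the time between sending data and the first bit reaching the other end. Bandwidth is the maximum rate at which data can flow once it is moving. In the pipe analogy, latency is how long the first drop of water takes to travel the length of the pipe, and bandwidth is how wide the pipe is. A wide but very long pipe therefore has high bandwidth and high latency.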
Exercise 2. Calculate the total transfer time for a 1 GB file over a network with 100 Mbps bandwidth and 50 ms latency.
Solution to Exercise 2
Use Transfer Time = Latency + (Size / Bandwidth). 1 GB is 8,000,000,000 bits, and 8,000,000,000 bits / 100,000,000 bits/s = 80 s. Adding the 50 ms latency gives 80.05 s. The latency term is negligible here, so this transfer is bandwidth-bound.
Exercise 3. Explain why high bandwidth does not help if latency is the bottleneck. Give an example scenario.
Solution to Exercise 3
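Each request has to wait a full round trip before the next one can start, so the total time is roughly the number of requests times the RTT, and the transmission term is tiny. Raising the bandwidth 100x barely changes the result:
```python
# 100 sequential 1 KB requests, each waiting one 50 ms round trip
n, size_bits, rtt = 100, 1_000 * 8, 0.05

for label, bandwidth_bps in [("100 Mbps", 100e6), ("10 Gbps", 10e9)]:
    total = n * (rtt + size_bits / bandwidth_bps)
    print(f"{label}: {total:.3f} seconds")
```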
Exercise 4. Write Python code that measures the round-trip time (latency) to a server using urllib.request or requests.
Solution to Exercise 4
```python
import time

import requests


def measure_rtt(url, iterations=5):
    """Return per-request round-trip times in milliseconds."""
    times_ms = []
    with requests.Session() as session:
        for _ in range(iterations):
            start = time.perf_counter()
            session.head(url, timeout=10)  # small request, so latency dominates
            times_ms.append((time.perf_counter() - start) * 1000)
    # The first sample includes connection setup; later ones reuse the connection
    return times_ms


# rtts = measure_rtt('https://api.github.com')
# print(f"Min RTT: {min(rtts):.1f} ms")
```