Latency and Bandwidth¶
Two Dimensions of Network Performance¶
Network performance has two key metrics that are often confused:
Bandwidth vs Latency
Bandwidth (throughput): How much water the pipe can carry
Latency (delay): How long for water to reach the other end
┌─────────────────────────────────────────────────────────────┐
│ │
│ High Bandwidth, High Latency (intercontinental fiber): │
│ ════════════════════════════════════════════▶ │
│ Lots of data, but takes 100ms to arrive │
│ │
│ Low Bandwidth, Low Latency (local network): │
│ ══════▶ │
│ Less data, but arrives in 1ms │
│ │
└─────────────────────────────────────────────────────────────┘
Latency¶
Latency is the time delay between sending and receiving data.
Latency Components¶
Total Latency = Propagation + Transmission + Queuing + Processing
┌────────────────────────────────────────────────────────────────┐
│ │
│ Propagation: Time for signal to travel (speed of light) │
│ ~5 μs per km in fiber │
│ │
│ Transmission: Time to push bits onto wire │
│ = Data size / Bandwidth │
│ │
│ Queuing: Time waiting in router/switch buffers │
│ Variable, depends on congestion │
│ │
│ Processing: Time for routers to process headers │
│ Usually microseconds │
│ │
└────────────────────────────────────────────────────────────────┘
Typical Latencies¶
| Path | Round-Trip Time (RTT) |
|---|---|
| Same machine (localhost) | < 0.1 ms |
| Local network (LAN) | 0.1 - 1 ms |
| Same city | 1 - 10 ms |
| Same continent | 10 - 50 ms |
| Cross-continent | 50 - 150 ms |
| Opposite side of globe | 150 - 300 ms |
| Satellite (GEO) | 500 - 700 ms |
Measuring Latency in Python¶
import socket
import time
def measure_tcp_latency(host, port, iterations=10):
"""Measure TCP connection latency."""
latencies = []
for _ in range(iterations):
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
start = time.perf_counter()
try:
sock.connect((host, port))
latency = (time.perf_counter() - start) * 1000
latencies.append(latency)
except Exception as e:
print(f"Error: {e}")
finally:
sock.close()
if latencies:
avg = sum(latencies) / len(latencies)
print(f"Average latency to {host}:{port}: {avg:.2f} ms")
return latencies
# measure_tcp_latency('google.com', 80)
HTTP Latency¶
import requests
import time
def measure_http_latency(url, iterations=5):
"""Measure HTTP request latency."""
for _ in range(iterations):
start = time.perf_counter()
response = requests.get(url)
elapsed = (time.perf_counter() - start) * 1000
print(f"{url}: {elapsed:.0f} ms (status: {response.status_code})")
# measure_http_latency('https://api.github.com')
Bandwidth¶
Bandwidth is the maximum rate of data transfer.
Bandwidth Units¶
Bits vs Bytes:
Network speeds in bits: 1 Gbps = 1,000,000,000 bits/second
File sizes in bytes: 1 GB = 1,000,000,000 bytes
Conversion: 1 Gbps = 125 MB/s (divide by 8)
Common confusion:
"1 Gbps internet" ≠ "1 GB per second downloads"
"1 Gbps internet" = "125 MB per second" maximum
Typical Bandwidths¶
| Connection | Bandwidth | Download 1 GB |
|---|---|---|
| 4G LTE | 50 Mbps | ~3 minutes |
| Home cable | 100 Mbps | ~80 seconds |
| Gigabit fiber | 1 Gbps | ~8 seconds |
| 10 GbE | 10 Gbps | <1 second |
| Datacenter | 100 Gbps | ~0.1 second |
Measuring Bandwidth¶
import requests
import time
def measure_download_bandwidth(url, size_mb=10):
"""Measure download bandwidth."""
# Use a test file of known size
start = time.perf_counter()
response = requests.get(url, stream=True)
total_bytes = 0
for chunk in response.iter_content(chunk_size=8192):
total_bytes += len(chunk)
elapsed = time.perf_counter() - start
bandwidth_mbps = (total_bytes * 8) / elapsed / 1_000_000
bandwidth_mbs = total_bytes / elapsed / 1_000_000
print(f"Downloaded: {total_bytes / 1_000_000:.1f} MB")
print(f"Time: {elapsed:.2f} seconds")
print(f"Bandwidth: {bandwidth_mbps:.1f} Mbps ({bandwidth_mbs:.1f} MB/s)")
# measure_download_bandwidth('http://speedtest.example.com/100MB.bin')
Bandwidth-Delay Product¶
The bandwidth-delay product (BDP) is the amount of data "in flight":
BDP = Bandwidth × Latency
Example:
Bandwidth: 1 Gbps (125 MB/s)
Latency: 100 ms (0.1 s)
BDP = 125 MB/s × 0.1 s = 12.5 MB
This much data can be in transit at any moment!
Why BDP Matters¶
Pipe Analogy:
Small BDP (low latency OR low bandwidth):
┌──────────────────────────────┐
│ ● ● ● ● │ Few packets in flight
└──────────────────────────────┘
Large BDP (high latency AND high bandwidth):
┌──────────────────────────────────────────────────────────┐
│ ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● │
└──────────────────────────────────────────────────────────┘
Many packets in flight - need large buffers!
Latency vs Throughput Trade-offs¶
Small Requests¶
For small requests, latency dominates:
import requests
import time
def small_request_test(url, n_requests=100):
"""Latency matters more for small requests."""
start = time.perf_counter()
for _ in range(n_requests):
requests.get(url)
elapsed = time.perf_counter() - start
per_request = elapsed / n_requests * 1000
print(f"{n_requests} small requests: {elapsed:.2f}s total")
print(f"Per request: {per_request:.1f} ms")
# Mostly latency, not bandwidth!
Large Transfers¶
For large transfers, bandwidth dominates:
def large_transfer_test(url):
"""Bandwidth matters more for large transfers."""
start = time.perf_counter()
response = requests.get(url) # Large file
elapsed = time.perf_counter() - start
size_mb = len(response.content) / 1_000_000
bandwidth = size_mb / elapsed
print(f"Downloaded {size_mb:.1f} MB in {elapsed:.2f}s")
print(f"Effective bandwidth: {bandwidth:.1f} MB/s")
Optimizing for Latency¶
1. Reduce Round Trips¶
# Bad: Multiple requests
user = requests.get('/api/user/123').json()
posts = requests.get('/api/user/123/posts').json()
comments = requests.get('/api/user/123/comments').json()
# 3 round trips = 3 × latency
# Good: Single request
data = requests.get('/api/user/123?include=posts,comments').json()
# 1 round trip
2. Use Connection Pooling¶
import requests
# Bad: New connection each time
for url in urls:
requests.get(url) # TCP handshake overhead each time
# Good: Reuse connections
session = requests.Session()
for url in urls:
session.get(url) # Reuses TCP connection
3. Geographic Proximity¶
User in Tokyo:
→ US West Server: 100 ms RTT
→ Tokyo Server: 10 ms RTT
10x improvement from location alone!
Optimizing for Bandwidth¶
1. Compression¶
import gzip
import requests
# Request compressed data
response = requests.get(url, headers={'Accept-Encoding': 'gzip'})
# Compress before sending
data = gzip.compress(large_data)
requests.post(url, data=data, headers={'Content-Encoding': 'gzip'})
2. Batch Operations¶
# Bad: Many small requests
for item in items:
requests.post('/api/process', json={'item': item})
# Good: Batch request
requests.post('/api/process-batch', json={'items': items})
3. Parallel Downloads¶
import concurrent.futures
import requests
def download(url):
return requests.get(url).content
# Download files in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
results = list(executor.map(download, urls))
Summary¶
| Metric | Definition | Optimization |
|---|---|---|
| Latency | Time to deliver one bit | Reduce distance, round trips |
| Bandwidth | Bits per second capacity | Compression, parallelism |
| Throughput | Actual achieved rate | Balance latency & bandwidth |
| BDP | Bandwidth × Latency | Size buffers appropriately |
Key formulas:
Transfer Time = Latency + (Size / Bandwidth)
For small data: Transfer Time ≈ Latency
For large data: Transfer Time ≈ Size / Bandwidth
BDP = Bandwidth × RTT
Rules of thumb:
- Many small requests → optimize latency (reduce round trips)
- Few large transfers → optimize bandwidth (compression, parallelism)
- Real applications → profile to find the bottleneck