Latency and Bandwidth

Mental Model

Latency is how long it takes for the first drop of water to travel through a pipe; bandwidth is how wide the pipe is. For many small requests, latency dominates -- you wait for each round trip. For large transfers, bandwidth dominates -- you wait for the pipe to drain. Knowing which one is your bottleneck tells you what to optimize.

Two Dimensions of Network Performance

Network performance has two key metrics that are often confused:

```
Bandwidth vs Latency

Bandwidth (throughput): How much water the pipe can carry
Latency (delay):        How long for water to reach the other end

┌────────────────────────────────────────────────────────────┐
│                                                            │
│  High Bandwidth, High Latency (intercontinental fiber):    │
│  ════════════════════════════════════════════▶             │
│  Lots of data, but takes 100ms to arrive                   │
│                                                            │
│  Low Bandwidth, Low Latency (local network):               │
│  ══════▶                                                   │
│  Less data, but arrives in 1ms                             │
│                                                            │
└────────────────────────────────────────────────────────────┘
```

Latency

Latency is the time delay between sending and receiving data.

Latency Components

```
Total Latency = Propagation + Transmission + Queuing + Processing

┌────────────────────────────────────────────────────────────┐
│                                                            │
│  Propagation:  Time for signal to travel (speed of light)  │
│                ~5 μs per km in fiber                       │
│                                                            │
│  Transmission: Time to push bits onto wire                 │
│                = Data size / Bandwidth                     │
│                                                            │
│  Queuing:      Time waiting in router/switch buffers       │
│                Variable, depends on congestion             │
│                                                            │
│  Processing:   Time for routers to process headers         │
│                Usually microseconds                        │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
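As a rough sketch of how these components add up, the snippet below estimates one-way latency for a single packet. The function name `estimate_one_way_latency_ms`, the 1,500-byte packet, the 1,000 km distance, and the zero defaults for queuing and processing delay are illustrative assumptions; the 5 μs per km figure is the one quoted above.

```python
def estimate_one_way_latency_ms(distance_km, size_bytes, bandwidth_bps,
                                queuing_ms=0.0, processing_ms=0.0):
    """Sum the four latency components and return milliseconds."""
    propagation_ms = distance_km * 5 / 1000  # ~5 μs per km in fiber
    transmission_ms = size_bytes * 8 / bandwidth_bps * 1000
    return propagation_ms + transmission_ms + queuing_ms + processing_ms


# A 1,500-byte packet over 1,000 km of fiber on a 100 Mbps link
total = estimate_one_way_latency_ms(1000, 1500, 100e6)
print(f"Estimated one-way latency: {total:.2f} ms")
```

With these inputs, propagation (5 ms) dwarfs transmission (0.12 ms), which is typical for small packets sent over long distances.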

Typical Latencies

| Path | Round-Trip Time (RTT) |
|---|---|
| Same machine (localhost) | < 0.1 ms |
| Local network (LAN) | 0.1 - 1 ms |
| Same city | 1 - 10 ms |
| Same continent | 10 - 50 ms |
| Cross-continent | 50 - 150 ms |
| Opposite side of globe | 150 - 300 ms |
| Satellite (GEO) | 500 - 700 ms |

Measuring Latency in Python

```python
import socket
import time


def measure_tcp_latency(host, port, iterations=10):
    """Measure TCP connection latency in milliseconds."""
    latencies = []

    for _ in range(iterations):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(5)

        start = time.perf_counter()
        try:
            # Timing covers DNS resolution (for hostnames) plus the TCP handshake
            sock.connect((host, port))
            latency = (time.perf_counter() - start) * 1000
            latencies.append(latency)
        except OSError as e:
            print(f"Error: {e}")
        finally:
            sock.close()

    if latencies:
        avg = sum(latencies) / len(latencies)
        print(f"Average latency to {host}:{port}: {avg:.2f} ms")

    return latencies


measure_tcp_latency('google.com', 80)
```
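An average can hide jitter, so it may help to summarize the samples with a few more statistics. The helper below is a sketch that works on any list of millisecond values, such as the list returned by `measure_tcp_latency`; the name `summarize_latencies` and the sample numbers are made up for illustration.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return min, median, mean, and max of a list of latency samples."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    return {
        "min": min(latencies_ms),
        "median": statistics.median(latencies_ms),
        "mean": statistics.mean(latencies_ms),
        "max": max(latencies_ms),
    }


# Made-up samples: one slow outlier pulls the mean well above the median
samples = [12.1, 11.8, 12.4, 55.0, 12.0]
stats = summarize_latencies(samples)
print(", ".join(f"{name}: {value:.1f} ms" for name, value in stats.items()))
```

The median is usually the better indicator of typical latency, while the maximum shows how bad the outliers are.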

HTTP Latency

```python
import requests
import time


def measure_http_latency(url, iterations=5):
    """Measure HTTP request latency."""
    for _ in range(iterations):
        start = time.perf_counter()
        # Each call opens a new connection, so the time includes
        # DNS lookup, TCP/TLS setup, and the request itself
        response = requests.get(url, timeout=10)
        elapsed = (time.perf_counter() - start) * 1000

        print(f"{url}: {elapsed:.0f} ms (status: {response.status_code})")


measure_http_latency('https://api.github.com')
```

Bandwidth

Bandwidth is the maximum rate of data transfer.

Bandwidth Units

```
Bits vs Bytes:

Network speeds in bits:  1 Gbps = 1,000,000,000 bits/second
File sizes in bytes:     1 GB   = 1,000,000,000 bytes

Conversion: 1 Gbps = 125 MB/s (divide by 8)

Common confusion:
  "1 Gbps internet" ≠ "1 GB per second downloads"
  "1 Gbps internet" = "125 MB per second" maximum
```

Typical Bandwidths

| Connection | Bandwidth | Download 1 GB |
|---|---|---|
| 4G LTE | 50 Mbps | ~3 minutes |
| Home cable | 100 Mbps | ~80 seconds |
| Gigabit fiber | 1 Gbps | ~8 seconds |
| 10 GbE | 10 Gbps | <1 second |
| Datacenter | 100 Gbps | ~0.1 second |
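The download times above follow from dividing the file size in bits by the link rate. A minimal sketch of that arithmetic is shown below; the helper name `download_time_seconds` is an assumption for illustration, and the link rates are the ones listed above.

```python
def download_time_seconds(size_bytes, bandwidth_bps):
    """Ideal transfer time: size in bits divided by link rate in bits/second."""
    return size_bytes * 8 / bandwidth_bps


ONE_GB = 1_000_000_000  # bytes

links = {
    "4G LTE": 50e6,
    "Home cable": 100e6,
    "Gigabit fiber": 1e9,
    "10 GbE": 10e9,
    "Datacenter": 100e9,
}

for name, bps in links.items():
    print(f"{name:<14} {download_time_seconds(ONE_GB, bps):8.2f} s")
```

These are ideal figures; real downloads are slower because of protocol overhead and congestion.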

Measuring Bandwidth

```python
import requests
import time


def measure_download_bandwidth(url):
    """Measure download bandwidth by streaming a large test file."""
    start = time.perf_counter()
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()

    # Stream in chunks so the whole file is never held in memory
    total_bytes = 0
    for chunk in response.iter_content(chunk_size=8192):
        total_bytes += len(chunk)

    elapsed = time.perf_counter() - start
    bandwidth_mbps = (total_bytes * 8) / elapsed / 1_000_000
    bandwidth_mbs = total_bytes / elapsed / 1_000_000

    print(f"Downloaded: {total_bytes / 1_000_000:.1f} MB")
    print(f"Time: {elapsed:.2f} seconds")
    print(f"Bandwidth: {bandwidth_mbps:.1f} Mbps ({bandwidth_mbs:.1f} MB/s)")


measure_download_bandwidth('http://speedtest.example.com/100MB.bin')
```

Bandwidth-Delay Product

The bandwidth-delay product (BDP) is the amount of data "in flight":

```
BDP = Bandwidth × Latency (round-trip time)

Example:
  Bandwidth: 1 Gbps (125 MB/s)
  Latency:   100 ms RTT (0.1 s)

  BDP = 125 MB/s × 0.1 s = 12.5 MB

This much data can be in transit at any moment!
```
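The same calculation can be written as a small helper. This is a sketch only: `bdp_bytes` is a hypothetical name, and the link parameters are example values chosen to match the figures on this page.

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_seconds / 8


examples = {
    "LAN (1 Gbps, 1 ms)": (1e9, 0.001),
    "Intercontinental (1 Gbps, 100 ms)": (1e9, 0.1),
}

for name, (bandwidth, rtt) in examples.items():
    print(f"{name}: {bdp_bytes(bandwidth, rtt) / 1e6:.3f} MB in flight")
```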

Why BDP Matters

```
Pipe Analogy:

Small BDP (low latency OR low bandwidth):
┌──────────────────────────────┐
│ ●       ●       ●       ●    │  Few packets in flight
└──────────────────────────────┘

Large BDP (high latency AND high bandwidth):
┌──────────────────────────────────────────────────┐
│ ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● │
└──────────────────────────────────────────────────┘
Many packets in flight - need large buffers!
```
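One way to see why buffers matter: if a sender may have at most one window of unacknowledged data in flight per round trip, throughput cannot exceed window size divided by RTT. The sketch below applies that simplified bound, ignoring packet loss and slow start; `max_throughput_bps` and the 64 KiB window are assumptions for illustration.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when one window is in flight per RTT."""
    return window_bytes * 8 / rtt_seconds


window = 64 * 1024  # 64 KiB, an assumed buffer size
rtt = 0.1           # 100 ms round trip

print(f"Window-limited throughput: {max_throughput_bps(window, rtt) / 1e6:.1f} Mbps")
```

With these numbers the bound is roughly 5 Mbps no matter how fast the link is. Filling a 1 Gbps link at 100 ms RTT needs a window close to the 12.5 MB BDP.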

Latency vs Throughput Trade-offs

Small Requests

For small requests, latency dominates:

```python
import requests
import time


def small_request_test(url, n_requests=100):
    """Latency matters more for small requests."""
    start = time.perf_counter()

    for _ in range(n_requests):
        requests.get(url)

    elapsed = time.perf_counter() - start
    per_request = elapsed / n_requests * 1000

    print(f"{n_requests} small requests: {elapsed:.2f}s total")
    print(f"Per request: {per_request:.1f} ms")
    # Mostly latency, not bandwidth!
```

Large Transfers

For large transfers, bandwidth dominates:

```python
import requests
import time


def large_transfer_test(url):
    """Bandwidth matters more for large transfers."""
    start = time.perf_counter()

    # Large file; note that response.content holds the whole body in memory
    response = requests.get(url)

    elapsed = time.perf_counter() - start
    size_mb = len(response.content) / 1_000_000
    bandwidth = size_mb / elapsed

    print(f"Downloaded {size_mb:.1f} MB in {elapsed:.2f}s")
    print(f"Effective bandwidth: {bandwidth:.1f} MB/s")
```
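The crossover between the two regimes can be estimated without touching the network, using the simple model Transfer Time = Latency + Size / Bandwidth. The sketch below assumes a 100 Mbps link with 50 ms latency; `transfer_time_seconds` is an illustrative name.

```python
def transfer_time_seconds(size_bytes, bandwidth_bps, latency_seconds):
    """Simple model: latency plus size divided by bandwidth."""
    return latency_seconds + size_bytes * 8 / bandwidth_bps


bandwidth = 100e6  # 100 Mbps (assumed link)
latency = 0.05     # 50 ms (assumed)

for size in (1_000, 1_000_000, 1_000_000_000):  # 1 KB, 1 MB, 1 GB
    total = transfer_time_seconds(size, bandwidth, latency)
    share = latency / total * 100
    print(f"{size:>13,} bytes: {total:8.3f} s, latency is {share:5.1f}% of total")
```

Under these assumptions latency accounts for nearly all of the 1 KB transfer and almost none of the 1 GB transfer.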

Optimizing for Latency

1. Reduce Round Trips

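```python
import requests

# Placeholder host for illustration; replace with your API's base URL
BASE = 'https://api.example.com'

# Bad: Multiple requests
user = requests.get(f'{BASE}/api/user/123').json()
posts = requests.get(f'{BASE}/api/user/123/posts').json()
comments = requests.get(f'{BASE}/api/user/123/comments').json()
# 3 round trips = 3 × latency

# Good: Single request (assumes the API supports an include parameter)
data = requests.get(f'{BASE}/api/user/123?include=posts,comments').json()
# 1 round trip
```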

2. Use Connection Pooling

```python
import requests

# Example list; replace with the URLs you need to fetch
urls = ['https://api.github.com'] * 5

# Bad: New connection each time
for url in urls:
    requests.get(url)  # TCP handshake overhead each time

# Good: Reuse connections
session = requests.Session()
for url in urls:
    session.get(url)  # Reuses TCP connection
```

3. Geographic Proximity

```
User in Tokyo:

  → US West Server: 100 ms RTT
  → Tokyo Server:    10 ms RTT

10x improvement from location alone!
```
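The gap widens when a request needs several sequential round trips. The sketch below assumes three round trips before the first response arrives (roughly one each for the TCP handshake, a TLS 1.3 handshake, and the HTTP request); that count and the function name are assumptions for illustration.

```python
def time_to_first_response_ms(rtt_ms, round_trips=3):
    """Waiting time when `round_trips` sequential exchanges are needed."""
    return rtt_ms * round_trips


for server, rtt_ms in [("US West", 100), ("Tokyo", 10)]:
    print(f"{server}: {time_to_first_response_ms(rtt_ms)} ms")
```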

Optimizing for Bandwidth

1. Compression

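```python
import gzip
import requests

# Placeholder endpoint and payload for illustration
url = 'https://api.example.com/data'
large_data = b'{"key": "value"}' * 10_000

# Request compressed data (requests sends this header by default
# and transparently decompresses gzip responses)
response = requests.get(url, headers={'Accept-Encoding': 'gzip'})

# Compress before sending (the server must accept gzip request bodies)
data = gzip.compress(large_data)
requests.post(url, data=data, headers={'Content-Encoding': 'gzip'})
```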
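How much compression helps depends on the data. The sketch below measures the ratio for a synthetic, highly repetitive JSON payload, which is an assumption chosen for illustration; real payloads may compress less, and already-compressed formats such as JPEG or ZIP barely shrink at all.

```python
import gzip
import json

# Synthetic, repetitive payload (real data may compress less)
records = [{"id": i, "status": "active", "region": "us-west"} for i in range(1000)]
raw = json.dumps(records).encode("utf-8")
compressed = gzip.compress(raw)

ratio = len(raw) / len(compressed)
print(f"Raw: {len(raw):,} bytes, gzip: {len(compressed):,} bytes ({ratio:.1f}x smaller)")

# Round-trip check: decompression restores the original bytes
restored = gzip.decompress(compressed)
print(f"Round trip OK: {restored == raw}")
```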

2. Batch Operations

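```python
import requests

# Placeholder host and data for illustration
BASE = 'https://api.example.com'
items = ['a', 'b', 'c']

# Bad: Many small requests
for item in items:
    requests.post(f'{BASE}/api/process', json={'item': item})

# Good: Batch request
requests.post(f'{BASE}/api/process-batch', json={'items': items})
```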
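When the full list is too large for one request, a common compromise is to send fixed-size batches. The helper below is a minimal sketch of that splitting step; `chunked` and the batch size of 4 are illustrative choices, and the actual POST call is left out so the example runs offline.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


items = list(range(10))
for batch in chunked(items, 4):
    # In real code, one batch request per iteration would go here
    print(batch)
```

Ten items with a batch size of 4 become three requests instead of ten.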

3. Parallel Downloads

```python
import concurrent.futures
import requests

# Example list; replace with the files you need to fetch
urls = ['http://speedtest.example.com/100MB.bin'] * 3


def download(url):
    return requests.get(url).content


# Download files in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(download, urls))
```
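The effect of running requests concurrently can be demonstrated without a network by simulating each download with a short sleep. In the sketch below, `fake_download` and the 0.1 second delay are stand-ins for real transfers, not measurements.

```python
import concurrent.futures
import time


def fake_download(name, delay=0.1):
    """Stand-in for a network download: sleeps instead of fetching."""
    time.sleep(delay)
    return name


names = [f"file{i}" for i in range(5)]

# One after another: total time is about 5 × delay
start = time.perf_counter()
sequential = [fake_download(n) for n in names]
sequential_time = time.perf_counter() - start

# All at once: the waits overlap, so total time is about 1 × delay
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    parallel = list(executor.map(fake_download, names))
parallel_time = time.perf_counter() - start

print(f"Sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Threads help here because the work is mostly waiting on I/O. Once the link itself is saturated, adding workers does not add throughput.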

Summary

| Metric | Definition | Optimization |
|---|---|---|
| Latency | Time to deliver one bit | Reduce distance, round trips |
| Bandwidth | Bits per second capacity | Compression, parallelism |
| Throughput | Actual achieved rate | Balance latency & bandwidth |
| BDP | Bandwidth × RTT | Size buffers appropriately |

Key formulas:

```
Transfer Time = Latency + (Size / Bandwidth)

For small data: Transfer Time ≈ Latency
For large data: Transfer Time ≈ Size / Bandwidth

BDP = Bandwidth × RTT
```

Rules of thumb:

  • Many small requests → optimize latency (reduce round trips)
  • Few large transfers → optimize bandwidth (compression, parallelism)
  • Real applications → profile to find the bottleneck

Exercises

Exercise 1. Explain the difference between latency and bandwidth. Give a real-world analogy for each.

Solution to Exercise 1

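Latency is the delay before data starts to arrive: the time for the first bit to travel from sender to receiver. Bandwidth is the maximum rate at which data can flow once it is moving.

In the pipe analogy from this page, latency is how long the first drop of water takes to travel through the pipe, and bandwidth is how wide the pipe is. A highway works the same way: latency is the drive time for one car between two cities, and bandwidth is the number of lanes, which sets how many cars can arrive per hour.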


Exercise 2. Calculate the total transfer time for a 1 GB file over a network with 100 Mbps bandwidth and 50 ms latency.

Solution to Exercise 2

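Using Transfer Time = Latency + (Size / Bandwidth):

  • Size: 1 GB = 8,000,000,000 bits
  • Bandwidth: 100 Mbps = 100,000,000 bits/second
  • Size / Bandwidth = 80 seconds
  • Latency: 50 ms = 0.05 seconds

Total transfer time ≈ 80 + 0.05 = 80.05 seconds. Bandwidth dominates here: latency contributes less than 0.1% of the total.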


Exercise 3. Explain why high bandwidth does not help if latency is the bottleneck. Give an example scenario.

Solution to Exercise 3

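When each request must wait for a full round trip before the next one starts, total time is roughly the number of requests times the RTT. Adding bandwidth only shrinks the already tiny Size / Bandwidth term. Example: 100 sequential 1 KB API calls over a link with 100 ms RTT take about 10 seconds on both a 100 Mbps and a 10 Gbps connection.

```python
# Model: time per request = RTT + size / bandwidth
rtt_s = 0.1            # 100 ms round trip
size_bits = 1_000 * 8  # 1 KB response
n_requests = 100       # issued one after another

for label, bandwidth_bps in [("100 Mbps", 100e6), ("10 Gbps", 10e9)]:
    total = n_requests * (rtt_s + size_bits / bandwidth_bps)
    print(f"{label}: {total:.3f} s for {n_requests} sequential requests")
```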


Exercise 4. Write Python code that measures the round-trip time (latency) to a server using urllib.request or requests.

Solution to Exercise 4

```python
import time
import requests


def measure_rtt(url, iterations=5):
    """Return per-request round-trip times in milliseconds."""
    times = []
    with requests.Session() as session:
        # Warm-up request so connection setup is paid only once
        session.get(url, timeout=5)
        for _ in range(iterations):
            start = time.perf_counter()
            session.head(url, timeout=5)
            times.append((time.perf_counter() - start) * 1000)
    return times


rtts = measure_rtt('https://api.github.com')
print(f"Min: {min(rtts):.1f} ms, average: {sum(rtts) / len(rtts):.1f} ms")
```