
Latency and Bandwidth

Two Dimensions of Network Performance

Network performance has two key metrics that are often confused:

Bandwidth vs Latency

Bandwidth (throughput):  How much water the pipe can carry
Latency (delay):         How long for water to reach the other end

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  High Bandwidth, High Latency (intercontinental fiber):     │
│  ════════════════════════════════════════════▶              │
│        Lots of data, but takes 100ms to arrive              │
│                                                             │
│  Low Bandwidth, Low Latency (local network):                │
│  ══════▶                                                    │
│        Less data, but arrives in 1ms                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Latency

Latency is the time delay between sending and receiving data.

Latency Components

Total Latency = Propagation + Transmission + Queuing + Processing

┌────────────────────────────────────────────────────────────────┐
│                                                                │
│  Propagation:   Time for signal to travel (speed of light)    │
│                 ~5 μs per km in fiber                          │
│                                                                │
│  Transmission:  Time to push bits onto wire                    │
│                 = Data size / Bandwidth                        │
│                                                                │
│  Queuing:       Time waiting in router/switch buffers          │
│                 Variable, depends on congestion                │
│                                                                │
│  Processing:    Time for routers to process headers            │
│                 Usually microseconds                           │
│                                                                │
└────────────────────────────────────────────────────────────────┘
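The components above can be combined in a quick worked example. This sketch assumes illustrative numbers (1 MB payload, 100 Mbps link, 1,000 km of fiber, and nominal queuing/processing figures), not measured values:

```python
# Worked example: total latency for a 1 MB transfer (illustrative numbers)
distance_km = 1_000          # assumed fiber path length
size_bits = 1_000_000 * 8    # 1 MB payload
bandwidth_bps = 100_000_000  # 100 Mbps link

propagation_s = distance_km * 5e-6        # ~5 us per km in fiber
transmission_s = size_bits / bandwidth_bps  # size / bandwidth
queuing_s = 0.001                         # assumed 1 ms under light congestion
processing_s = 0.0001                     # assumed 100 us across all hops

total_ms = (propagation_s + transmission_s + queuing_s + processing_s) * 1000
print(f"Propagation:  {propagation_s * 1000:.1f} ms")   # 5.0 ms
print(f"Transmission: {transmission_s * 1000:.1f} ms")  # 80.0 ms
print(f"Total:        {total_ms:.1f} ms")
```

Note that for this payload size, transmission time dwarfs propagation; for a tiny payload the balance flips.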

Typical Latencies

Path                      Round-Trip Time (RTT)
Same machine (localhost)  < 0.1 ms
Local network (LAN)       0.1 - 1 ms
Same city                 1 - 10 ms
Same continent            10 - 50 ms
Cross-continent           50 - 150 ms
Opposite side of globe    150 - 300 ms
Satellite (GEO)           500 - 700 ms

Measuring Latency in Python

import socket
import time

def measure_tcp_latency(host, port, iterations=10):
    """Measure TCP connection latency."""
    latencies = []

    for _ in range(iterations):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(5)

        start = time.perf_counter()
        try:
            sock.connect((host, port))
            latency = (time.perf_counter() - start) * 1000
            latencies.append(latency)
        except Exception as e:
            print(f"Error: {e}")
        finally:
            sock.close()

    if latencies:
        avg = sum(latencies) / len(latencies)
        print(f"Average latency to {host}:{port}: {avg:.2f} ms")

    return latencies

# measure_tcp_latency('google.com', 80)

HTTP Latency

import requests
import time

def measure_http_latency(url, iterations=5):
    """Measure HTTP request latency."""
    # Note: the first iteration also pays DNS lookup and TCP/TLS setup costs
    for _ in range(iterations):
        start = time.perf_counter()
        response = requests.get(url)
        elapsed = (time.perf_counter() - start) * 1000

        print(f"{url}: {elapsed:.0f} ms (status: {response.status_code})")

# measure_http_latency('https://api.github.com')

Bandwidth

Bandwidth is the maximum rate of data transfer.

Bandwidth Units

Bits vs Bytes:

Network speeds in bits:    1 Gbps = 1,000,000,000 bits/second
File sizes in bytes:       1 GB = 1,000,000,000 bytes

Conversion: 1 Gbps = 125 MB/s (divide by 8)

Common confusion:
  "1 Gbps internet" ≠ "1 GB per second downloads"
  "1 Gbps internet" = "125 MB per second" maximum
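The divide-by-8 conversion is easy to encode in a tiny helper (the function name here is just illustrative):

```python
def mbps_to_mbytes_per_s(mbps):
    """Convert a link speed in megabits/s to megabytes/s (divide by 8)."""
    return mbps / 8

# "1 Gbps internet" is 1000 Mbps, i.e. at most 125 MB/s of payload:
print(mbps_to_mbytes_per_s(1000))  # 125.0
```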

Typical Bandwidths

Connection     Bandwidth   Download 1 GB
4G LTE         50 Mbps     ~3 minutes
Home cable     100 Mbps    ~80 seconds
Gigabit fiber  1 Gbps      ~8 seconds
10 GbE         10 Gbps     <1 second
Datacenter     100 Gbps    ~0.1 second

Measuring Bandwidth

import requests
import time

def measure_download_bandwidth(url):
    """Measure download bandwidth by streaming a test file of known size."""
    start = time.perf_counter()
    response = requests.get(url, stream=True)

    total_bytes = 0
    for chunk in response.iter_content(chunk_size=8192):
        total_bytes += len(chunk)

    elapsed = time.perf_counter() - start
    bandwidth_mbps = (total_bytes * 8) / elapsed / 1_000_000
    bandwidth_mbs = total_bytes / elapsed / 1_000_000

    print(f"Downloaded: {total_bytes / 1_000_000:.1f} MB")
    print(f"Time: {elapsed:.2f} seconds")
    print(f"Bandwidth: {bandwidth_mbps:.1f} Mbps ({bandwidth_mbs:.1f} MB/s)")

# measure_download_bandwidth('http://speedtest.example.com/100MB.bin')

Bandwidth-Delay Product

The bandwidth-delay product (BDP) is the amount of data "in flight":

BDP = Bandwidth × Latency

Example:
  Bandwidth: 1 Gbps (125 MB/s)
  Latency: 100 ms (0.1 s)

  BDP = 125 MB/s × 0.1 s = 12.5 MB

This much data can be in transit at any moment!
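The same arithmetic in a small sketch (function name is illustrative):

```python
def bandwidth_delay_product(bandwidth_bps, rtt_s):
    """Bytes 'in flight' on the path: bandwidth (bits/s) * delay (s) / 8."""
    return bandwidth_bps * rtt_s / 8

# 1 Gbps link with 100 ms of latency:
bdp_bytes = bandwidth_delay_product(1_000_000_000, 0.100)
print(f"BDP: {bdp_bytes / 1_000_000:.1f} MB")  # 12.5 MB
```

To keep such a pipe full, the sender's window (and the receiver's buffer) must be at least this large.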

Why BDP Matters

Pipe Analogy:

Small BDP (low latency OR low bandwidth):
┌──────────────────────────────┐
│ ●  ●  ●  ●                   │  Few packets in flight
└──────────────────────────────┘

Large BDP (high latency AND high bandwidth):
┌──────────────────────────────────────────────────────────┐
│ ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● │
└──────────────────────────────────────────────────────────┘
Many packets in flight - need large buffers!

Latency vs Throughput Trade-offs

Small Requests

For small requests, latency dominates:

import requests
import time

def small_request_test(url, n_requests=100):
    """Latency matters more for small requests."""
    start = time.perf_counter()

    for _ in range(n_requests):
        requests.get(url)

    elapsed = time.perf_counter() - start
    per_request = elapsed / n_requests * 1000

    print(f"{n_requests} small requests: {elapsed:.2f}s total")
    print(f"Per request: {per_request:.1f} ms")
    # Mostly latency, not bandwidth!

Large Transfers

For large transfers, bandwidth dominates:

def large_transfer_test(url):
    """Bandwidth matters more for large transfers."""
    start = time.perf_counter()

    response = requests.get(url)  # Large file

    elapsed = time.perf_counter() - start
    size_mb = len(response.content) / 1_000_000
    bandwidth = size_mb / elapsed

    print(f"Downloaded {size_mb:.1f} MB in {elapsed:.2f}s")
    print(f"Effective bandwidth: {bandwidth:.1f} MB/s")

Optimizing for Latency

1. Reduce Round Trips

# Bad: Multiple requests
user = requests.get('/api/user/123').json()
posts = requests.get('/api/user/123/posts').json()
comments = requests.get('/api/user/123/comments').json()
# 3 round trips = 3 × latency

# Good: Single request
data = requests.get('/api/user/123?include=posts,comments').json()
# 1 round trip

2. Use Connection Pooling

import requests

# Bad: New connection each time
for url in urls:
    requests.get(url)  # TCP handshake overhead each time

# Good: Reuse connections
session = requests.Session()
for url in urls:
    session.get(url)  # Reuses TCP connection

3. Geographic Proximity

User in Tokyo:

  → US West Server:  100 ms RTT
  → Tokyo Server:    10 ms RTT

  10x improvement from location alone!
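Those RTTs are mostly physics: a rough lower bound follows directly from the ~5 μs/km fiber figure given earlier. The distances below are rough path-length assumptions, not measurements:

```python
def min_rtt_ms(distance_km):
    """Best-case fiber RTT: ~5 us per km, out and back. Ignores queuing etc."""
    return distance_km * 5e-6 * 2 * 1000

# Tokyo -> US West coast: very roughly 8,000 km of path
print(f"{min_rtt_ms(8_000):.0f} ms")  # 80 ms floor before any other delays
# Tokyo -> in-region server: ~500 km
print(f"{min_rtt_ms(500):.0f} ms")    # 5 ms floor
```

No protocol optimization can beat this floor; only moving the endpoints closer can.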

Optimizing for Bandwidth

1. Compression

import gzip
import requests

# Request compressed data
response = requests.get(url, headers={'Accept-Encoding': 'gzip'})

# Compress before sending
data = gzip.compress(large_data)
requests.post(url, data=data, headers={'Content-Encoding': 'gzip'})

2. Batch Operations

# Bad: Many small requests
for item in items:
    requests.post('/api/process', json={'item': item})

# Good: Batch request
requests.post('/api/process-batch', json={'items': items})

3. Parallel Downloads

import concurrent.futures
import requests

def download(url):
    return requests.get(url).content

# Download files in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(download, urls))

Summary

Metric      Definition                 Optimization
Latency     Time to deliver one bit    Reduce distance, round trips
Bandwidth   Bits per second capacity   Compression, parallelism
Throughput  Actual achieved rate       Balance latency & bandwidth
BDP         Bandwidth × Latency        Size buffers appropriately

Key formulas:

Transfer Time = Latency + (Size / Bandwidth)

For small data: Transfer Time ≈ Latency
For large data: Transfer Time ≈ Size / Bandwidth

BDP = Bandwidth × RTT
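The transfer-time formula can be checked directly. This sketch uses illustrative numbers (50 ms latency, 100 Mbps link) to show latency dominating for small payloads and size/bandwidth for large ones:

```python
def transfer_time_s(size_bytes, latency_s, bandwidth_bps):
    """Transfer Time = Latency + (Size / Bandwidth)."""
    return latency_s + (size_bytes * 8) / bandwidth_bps

latency_s = 0.050            # 50 ms
bandwidth_bps = 100_000_000  # 100 Mbps

small = transfer_time_s(1_000, latency_s, bandwidth_bps)        # 1 KB
large = transfer_time_s(100_000_000, latency_s, bandwidth_bps)  # 100 MB
print(f"1 KB:   {small * 1000:.1f} ms  (almost all latency)")
print(f"100 MB: {large:.2f} s  (almost all size / bandwidth)")
```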

Rules of thumb:

  • Many small requests → optimize latency (reduce round trips)
  • Few large transfers → optimize bandwidth (compression, parallelism)
  • Real applications → profile to find the bottleneck