I/O and Peripherals¶
Input/Output Overview¶
I/O (Input/Output) refers to communication between the computer and external devices:
┌─────────────────────────────────────────────────────────────┐
│                          Computer                           │
│                                                             │
│  ┌─────────┐    ┌──────────────────┐    ┌───────────────┐   │
│  │   CPU   │◀══▶│  I/O Controller  │◀══▶│  Peripherals  │   │
│  └─────────┘    │  (Chipset/PCH)   │    │               │   │
│                 └──────────────────┘    │  - Keyboard   │   │
│                          ▲              │  - Mouse      │   │
│                          │              │  - Storage    │   │
│  ┌─────────┐             │              │  - Network    │   │
│  │   RAM   │◀════════════╯              │  - Display    │   │
│  └─────────┘                            └───────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
I/O Methods¶
1. Programmed I/O (Polling)¶
CPU actively checks device status:
CPU Polling Loop:

while True:
    status = read_device_status()
    if status == READY:
        data = read_device_data()
        break
    # CPU wastes cycles waiting!

Pros: Simple. Cons: Wastes CPU cycles busy-waiting.
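The loop above can be made concrete with a simulated device. This is a sketch: `poll_device`, `is_ready`, and `read_data` are hypothetical names, not a real device API, and the "device" is just a timestamp check.

```python
import time

def poll_device(is_ready, read_data, interval=0.001, timeout=1.0):
    """Busy-poll until the device reports ready, sleeping between checks."""
    deadline = time.perf_counter() + timeout
    while time.perf_counter() < deadline:
        if is_ready():
            return read_data()
        time.sleep(interval)  # yield the CPU instead of spinning flat out
    raise TimeoutError("device never became ready")

# Simulated device that becomes ready after ~10 ms
ready_at = time.perf_counter() + 0.01
data = poll_device(lambda: time.perf_counter() >= ready_at,
                   lambda: b'payload')
print(data)
```

Even with the short sleep, the CPU must keep re-checking; interrupts remove that recurring cost entirely.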
2. Interrupt-Driven I/O¶
Device signals CPU when ready:
Interrupt Flow:
1. CPU initiates I/O operation
2. CPU continues other work
3. Device completes → sends interrupt
4. CPU pauses current work
5. CPU handles interrupt (reads data)
6. CPU resumes previous work
┌───────────────────────────────────────────────────────────┐
│ CPU: [Work][Work][Work][Int][Work][Work][Work][Int][Work] │
│                          ▲                      ▲         │
│                      Keyboard                Network      │
│                      interrupt               interrupt    │
└───────────────────────────────────────────────────────────┘
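Python cannot handle hardware interrupts directly, but Unix signals follow the same pattern: register a handler, keep working, and let the OS preempt you when the event fires. A minimal sketch (Unix-only; `SIGALRM` stands in for a device interrupt):

```python
import signal
import time

handled = []

def isr(signum, frame):
    """Stand-in for an interrupt service routine: 'read the device'."""
    handled.append(signum)

signal.signal(signal.SIGALRM, isr)          # 1. register the handler
signal.setitimer(signal.ITIMER_REAL, 0.05)  # 2. "device" fires in 50 ms

time.sleep(0.2)  # 3. CPU does other work; the handler preempts it
print(f"interrupts handled: {len(handled)}")
```

The main flow never checks a status flag; the OS interrupts it exactly once, runs `isr`, and resumes the sleep.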
3. Direct Memory Access (DMA)¶
Device transfers data directly to memory:
DMA Transfer:

┌─────┐                                      ┌─────────┐
│ CPU │             1. Setup DMA             │ Device  │
│     │─────────────────────────────────────▶│         │
└─────┘                                      └────┬────┘
                                                  │
  2. CPU does other work                          │  3. Device transfers
                                                  │     directly to RAM
┌─────────┐                                       │
│   RAM   │◀──────────────────────────────────────┘
└─────────┘
              4. DMA complete interrupt
Pros: CPU stays free during the transfer; high throughput. Cons: more complex setup; possible memory-bus contention.
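Python cannot program a DMA engine, but `os.sendfile` (Linux) illustrates the same principle at the OS level: the kernel moves bytes between two file descriptors without routing them through a user-space buffer. A sketch with hypothetical file names:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    src_path = os.path.join(d, 'src.bin')
    dst_path = os.path.join(d, 'dst.bin')
    with open(src_path, 'wb') as f:
        f.write(b'x' * 65536)

    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        # Kernel-to-kernel copy: the bytes never enter a Python object,
        # loosely analogous to DMA bypassing the CPU for the bulk move
        sent = os.sendfile(dst.fileno(), src.fileno(), 0, 65536)

print(f"copied {sent} bytes")
```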
Common I/O Interfaces¶
USB (Universal Serial Bus)¶
USB Speed Comparison:
┌──────────────┬────────────┬────────────────────┐
│ Version      │ Speed      │ Common Use         │
├──────────────┼────────────┼────────────────────┤
│ USB 2.0      │ 480 Mbps   │ Keyboards, mice    │
│ USB 3.0      │ 5 Gbps     │ External drives    │
│ USB 3.1      │ 10 Gbps    │ Fast storage       │
│ USB 3.2      │ 20 Gbps    │ Docks, displays    │
│ USB4         │ 40 Gbps    │ High-speed I/O     │
└──────────────┴────────────┴────────────────────┘
SATA vs NVMe¶
Storage Interface Comparison:
SATA III:
┌─────┐                        ┌─────────┐
│ CPU │──── SATA ─────────────▶│   SSD   │
└─────┘     (~600 MB/s)        └─────────┘

NVMe (PCIe):
┌─────┐                        ┌─────────┐
│ CPU │──── PCIe ─────────────▶│   SSD   │
└─────┘     (~7,000 MB/s)      └─────────┘
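The practical impact of that bandwidth gap is easy to estimate. A quick sketch (the helper name and 10 GB file size are arbitrary choices for illustration):

```python
def transfer_time_s(size_gb, mb_per_s):
    """Seconds to move size_gb gigabytes at mb_per_s megabytes per second."""
    return size_gb * 1000 / mb_per_s

# Sequential-read rates from the diagram above
for name, bandwidth in [('SATA III', 600), ('NVMe', 7000)]:
    print(f"10 GB over {name:<9}: {transfer_time_s(10, bandwidth):6.1f} s")
```

Reading the same 10 GB file takes roughly ten times longer over SATA III than over NVMe, which is why storage interface choice matters for data-heavy workloads.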
Network Interfaces¶
Network Speed Comparison:
┌──────────────┬────────────┬────────────────────┐
│ Type         │ Speed      │ Bandwidth          │
├──────────────┼────────────┼────────────────────┤
│ 1 GbE        │ 1 Gbps     │ ~125 MB/s          │
│ 10 GbE       │ 10 Gbps    │ ~1.25 GB/s         │
│ 25 GbE       │ 25 Gbps    │ ~3.1 GB/s          │
│ 100 GbE      │ 100 Gbps   │ ~12.5 GB/s         │
└──────────────┴────────────┴────────────────────┘
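The bandwidth column is just the line rate divided by 8 bits per byte; real-world throughput is somewhat lower once protocol overhead is counted:

```python
def gbps_to_mb_per_s(gbps):
    """Convert a line rate in gigabits/s to megabytes/s."""
    return gbps * 1000 / 8  # 1 Gbit = 1000 Mbit; 8 bits per byte

for rate in (1, 10, 25, 100):
    print(f"{rate:>3} GbE ≈ {gbps_to_mb_per_s(rate):>8,.0f} MB/s")
```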
I/O in Python¶
File I/O (Storage)¶
import time

# Buffered I/O (default)
start = time.perf_counter()
with open('large_file.bin', 'rb') as f:
    data = f.read()  # OS handles buffering
read_time = time.perf_counter() - start

size_mb = len(data) / 1e6
bandwidth = size_mb / read_time
print(f"Read: {bandwidth:.0f} MB/s")
Network I/O¶
import socket
import time

def measure_network_latency(host, port):
    """Measure round-trip time to a server."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    message = b'ping'
    start = time.perf_counter()
    sock.sendall(message)          # sendall guarantees the whole message is sent
    response = sock.recv(1024)     # blocks until the server replies
    latency = time.perf_counter() - start
    sock.close()
    return latency * 1000  # ms

# Example
# latency = measure_network_latency('example.com', 80)
# print(f"RTT: {latency:.1f} ms")
Asynchronous I/O¶
import asyncio
import aiohttp

async def fetch_url(session, url):
    """Non-blocking HTTP request."""
    async with session.get(url) as response:
        return await response.text()

async def fetch_many(urls):
    """Fetch multiple URLs concurrently."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# All requests happen concurrently, not sequentially
# results = asyncio.run(fetch_many(urls))
I/O Latency Comparison¶
┌────────────────────────────────────────────────────────────┐
│                     I/O Latency Scale                      │
├────────────────────────────────────────────────────────────┤
│                                                            │
│ L1 cache:          │ 1 ns                                  │
│ RAM:               │████ 60 ns                             │
│ NVMe SSD:          │████████████████ 20,000 ns             │
│ SATA SSD:          │██████████████████ 100,000 ns          │
│ HDD:               │████████████████████████████████████   │
│                      10,000,000 ns                         │
│ Network (local):   │██████████████████ 100,000 ns          │
│ Network (internet):│████████████████████████████████████   │
│                      50,000,000 ns                         │
│                                                            │
└────────────────────────────────────────────────────────────┘
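To make the scale concrete, the chart's numbers can be expressed as multiples of an L1 cache hit:

```python
# Latencies from the chart above, in nanoseconds
latencies_ns = {
    'L1 cache': 1,
    'RAM': 60,
    'NVMe SSD': 20_000,
    'SATA SSD': 100_000,
    'HDD': 10_000_000,
    'Network (local)': 100_000,
    'Network (internet)': 50_000_000,
}

l1 = latencies_ns['L1 cache']
for name, ns in latencies_ns.items():
    print(f"{name:<20} {ns:>12,} ns = {ns // l1:>12,}x an L1 hit")
```

A single HDD seek costs as much as ten million L1 hits, which is why one stray disk read can dominate an otherwise tight loop.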
I/O-Bound vs CPU-Bound¶
I/O-Bound Operations¶
import time
from concurrent.futures import ThreadPoolExecutor

# I/O-bound: waiting for external device
def io_bound_task(path):
    # CPU sits idle while waiting for disk/network
    with open(path, 'rb') as f:
        data = f.read()  # CPU waits for disk
    return data

# Threading helps I/O-bound tasks
def fetch_multiple_files(file_list):
    with ThreadPoolExecutor(max_workers=10) as executor:
        # Threads wait in parallel for I/O
        results = list(executor.map(io_bound_task, file_list))
    return results
CPU-Bound Operations¶
import numpy as np
from multiprocessing import Pool

# CPU-bound: computation keeps CPU busy
def cpu_bound_task(data):
    # CPU actively computing
    return np.sum(data ** 2)

# Multiprocessing helps CPU-bound tasks
def process_multiple_arrays(arrays):
    with Pool(processes=4) as pool:
        # Different processes on different cores
        results = pool.map(cpu_bound_task, arrays)
    return results
Optimizing I/O¶
Strategy 1: Buffering¶
# Bad: many small writes
with open('output.txt', 'w') as f:
    for i in range(1000000):
        f.write(f"{i}\n")  # per-call overhead; frequent small flushes

# Better: batch into larger chunks
buffer = []
with open('output.txt', 'w') as f:
    for i in range(1000000):
        buffer.append(f"{i}\n")
        if len(buffer) >= 10000:
            f.write(''.join(buffer))
            buffer = []
    if buffer:
        f.write(''.join(buffer))
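Python's `open` already wraps files in a buffer (`io.DEFAULT_BUFFER_SIZE`, typically 8 KiB), so a simpler alternative to manual batching is asking for a bigger buffer up front. A sketch (the 1 MiB size and temp-file path are arbitrary choices):

```python
import io
import os
import tempfile

print(f"default buffer: {io.DEFAULT_BUFFER_SIZE} bytes")

path = os.path.join(tempfile.gettempdir(), 'output.txt')
with open(path, 'w', buffering=1024 * 1024) as f:  # 1 MiB buffer
    for i in range(100_000):
        f.write(f"{i}\n")  # coalesced in memory, flushed in large chunks

with open(path) as f:
    line_count = sum(1 for _ in f)
print(f"wrote {line_count} lines")
```

Each `write` still pays Python-level call overhead, but the data reaches the disk in a few large flushes rather than many small ones.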
Strategy 2: Memory Mapping¶
import mmap
import numpy as np

# Memory-map a large file
with open('huge_data.bin', 'r+b') as f:
    mm = mmap.mmap(f.fileno(), 0)
    # Access like memory; the OS pages data in as needed
    data = mm[1000:2000]
    mm.close()

# NumPy memmap
arr = np.memmap('huge_array.dat', dtype='float64', mode='r',
                shape=(1000000000,))
# Access elements without loading the entire file
subset = arr[::1000]  # every 1000th element
Strategy 3: Async I/O¶
import asyncio

async def main():
    # Concurrent I/O operations; async_read_file and async_fetch_url
    # stand in for coroutines defined elsewhere
    results = await asyncio.gather(
        async_read_file('file1.txt'),
        async_read_file('file2.txt'),
        async_fetch_url('http://example.com'),
    )
    return results

# All three I/O operations overlap
# asyncio.run(main())
Summary¶
| Method | How It Works | Best For |
|---|---|---|
| Polling | CPU checks device | Simple, low-speed |
| Interrupts | Device signals CPU | General purpose |
| DMA | Direct memory transfer | High-speed, bulk data |

| Interface | Bandwidth | Typical Use |
|---|---|---|
| USB 3.0 | 625 MB/s | Peripherals, storage |
| SATA III | 600 MB/s | Legacy storage |
| NVMe | 7,000 MB/s | Fast storage |
| 10 GbE | 1.25 GB/s | Networking |
Key points for Python:
- I/O operations often dominate execution time
- Use threading for I/O-bound tasks (GIL released during I/O)
- Use multiprocessing for CPU-bound tasks
- Buffering and batching reduce I/O overhead
- Async I/O enables concurrent operations
- Memory mapping avoids loading entire files