Hardware and Interpreted Languages

The Language Spectrum

Programming languages sit on a spectrum from close to the hardware to highly abstract:

Hardware Distance Spectrum:

Machine Code    Assembly    C/C++    Java    Python
     │             │          │        │        │
     ▼             ▼          ▼        ▼        ▼
┌─────────────────────────────────────────────────────────────┐
│░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│
└─────────────────────────────────────────────────────────────┘
 Close to Hardware                        Close to Human

 Fast execution                           Fast development
 Manual memory                            Automatic memory
 Hardware-specific                        Portable
 Harder to write                          Easier to write

Compiled vs Interpreted

Compiled Languages (C, C++, Rust)

Compilation Process:

Source Code                    Machine Code
┌─────────────┐               ┌─────────────┐
│  int x = 5; │               │ 10111010... │
│  x = x + 1; │ ──Compiler──▶ │ 01001101... │
│  return x;  │               │ 11100010... │
└─────────────┘               └─────────────┘
                                    │
                                    ▼
                              ┌──────────┐
                              │   CPU    │
                              │ (direct) │
                              └──────────┘

Characteristics:
  ✓ Compiled once, run many times
  ✓ Direct CPU execution
  ✓ Optimizations at compile time
  ✗ Platform-specific binaries
  ✗ Slower development cycle

Interpreted Languages (Python, JavaScript, Ruby)

Interpretation Process:

Source Code                Interpreter               CPU
┌─────────────┐           ┌─────────────┐       ┌──────────┐
│  x = 5      │           │             │       │          │
│  x = x + 1  │ ────────▶ │  Interpret  │ ────▶ │ Execute  │
│  return x   │           │  each line  │       │          │
└─────────────┘           └─────────────┘       └──────────┘
                                │
                          Read → Parse → Execute
                          Read → Parse → Execute
                          Read → Parse → Execute
                                │
                           Every time!

Characteristics:
  ✓ No compilation step
  ✓ Platform independent
  ✓ Dynamic and flexible
  ✗ Interpretation overhead
  ✗ Slower execution
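
The Read → Parse → Execute loop above can be sketched as a toy interpreter. This is a minimal illustration, not how any real interpreter is built: the names interpret, evaluate, and OPS are hypothetical, it handles only integer assignments, and it applies operators strictly left to right with no precedence.

```python
import operator

# Hypothetical operator table for this toy language
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(token, env):
    """Return the value of a previously bound name or an integer literal."""
    return env[token] if token in env else int(token)

def interpret(lines):
    env = {}
    for line in lines:                                   # Read one line
        target, expr = (part.strip() for part in line.split("=", 1))  # Parse
        tokens = expr.split()
        value = evaluate(tokens[0], env)                 # Execute
        for symbol, operand in zip(tokens[1::2], tokens[2::2]):
            value = OPS[symbol](value, evaluate(operand, env))
        env[target] = value
    return env

program = ["x = 5", "x = x + 1", "y = x * 2 - 3"]
env = interpret(program)
print(env)  # {'x': 6, 'y': 9}
```

Running the same program again repeats every split and lookup, which is the overhead the diagram labels "Every time!".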

Bytecode Compiled (Python, Java)

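Python actually uses a hybrid approach: source is compiled to bytecode once, and the bytecode is then interpreted. The opcode names below are from CPython 3.10 and earlier; 3.11+ uses BINARY_OP in place of BINARY_ADD.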

Python's Execution Model:

Source (.py)          Bytecode (.pyc)          Execution
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│  x = 5      │      │ LOAD_CONST 5│      │             │
│  x = x + 1  │ ───▶ │ STORE_NAME x│ ───▶ │    PVM      │
│  print(x)   │      │ LOAD_NAME x │      │ (interpret) │
└─────────────┘      │ LOAD_CONST 1│      └─────────────┘
                     │ BINARY_ADD  │
  .py file           │ STORE_NAME x│       Python Virtual
                     └─────────────┘       Machine

                     .pyc file (cached)
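
To see the bytecode for yourself, the standard-library dis module can disassemble compiled code. The sketch below assumes CPython. Exact opcode names vary by version (for example, 3.11+ emits BINARY_OP where older versions emit BINARY_ADD), so the listing will not necessarily match the diagram.

```python
import dis

source = "x = 5\nx = x + 1\n"

# compile() performs the source -> bytecode step shown in the diagram
code = compile(source, "<example>", "exec")

# Collect opcode names; these differ between CPython versions
opnames = [instr.opname for instr in dis.get_instructions(code)]
dis.dis(code)  # Print a human-readable listing

# The virtual machine then executes the bytecode
namespace = {}
exec(code, namespace)
print(namespace["x"])  # 6
```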

Hardware Interaction Layers

Direct Hardware Access (C)

// C code - directly controls memory
int* arr = malloc(1000 * sizeof(int));  // Direct allocation
arr[0] = 42;                            // Direct memory write
free(arr);                              // Manual deallocation

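// Compiles to roughly (simplified x86, assuming 4-byte int):
// MOV eax, 4000            ; Size, folded at compile time (1000 * 4)
// CALL malloc              ; C library call (may ask the OS for more memory)
// MOV DWORD PTR [eax], 42  ; Write to the returned address
// CALL free                ; Deallocate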

Abstracted Hardware Access (Python)

# Python code - hardware abstracted away
arr = [0] * 1000      # Python handles allocation
arr[0] = 42           # Python handles the write
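# Reference counting frees the list once nothing refers to it
# (the cyclic garbage collector only handles reference cycles)

# Internally involves:
# - Create list object
# - Store 1000 references to the same cached int object 0
# - Reference counting for each stored reference
# - Type checking at runtime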
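
A quick way to check what the list actually holds is to compare object identities. This is a small sketch assuming CPython; the variable names are illustrative, and the size reported by sys.getsizeof depends on the platform and Python version.

```python
import sys

arr = [0] * 1000

# Every slot refers to the same int object; no per-element ints are created
shares_one_object = all(item is arr[0] for item in arr)

arr[0] = 42  # Rebinds slot 0 only; the other slots still refer to 0

# Size of the list object itself (header plus 1000 pointers), in bytes
list_bytes = sys.getsizeof(arr)

print(shares_one_object, arr[:3], list_bytes)
```

The list stores pointers rather than raw integers, which is one reason a Python list of numbers takes more memory than the equivalent C array.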

The Abstraction Cost

What Happens in x = x + 1

In C:

1 CPU instruction (approximately):
  ADD [x_address], 1

In Python:

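Many more steps (on the order of 100 machine instructions in CPython):
1. Look up 'x' in the module namespace (dict lookup)
2. Get PyObject* for x
3. Load the constant 1 (a prebuilt int object; nothing is allocated)
4. Check the types of both operands (int? float? custom class?)
5. Find the addition slot on the type (the C-level counterpart of '__add__')
6. Call it with (x, 1)
7. Inside the int addition:
   - Unbox x to a C integer
   - Unbox 1 to a C integer
   - Add them
   - Create a new PyObject for the result (or reuse a cached small int)
   - Set its reference count
8. Bind result to name 'x'
9. Decrement old x's reference count
10. Free the old object if its count reached zero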
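
Some of these steps can be observed from Python itself. The sketch below assumes CPython and uses a value well outside the small-int cache. The variable names are illustrative, and calling type(old).__add__ directly is only an approximation of what the + operator does internally.

```python
import sys

x = 10**6
old = x          # Keep a second reference to the original object
x = x + 1        # Builds a new int object and rebinds the name

rebound = x is not old                  # The name now refers to a different object
via_method = type(old).__add__(old, 1)  # Roughly what + dispatches to for ints
boxed_size = sys.getsizeof(x)           # Bytes for the whole object, not just the value

print(old, x, rebound, via_method, boxed_size)
```

Integers are immutable, so the addition cannot update the value in place as the C version does. It has to produce a separate object and rebind the name.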

Visualization

C: x = x + 1
┌───────────────┐
│   ADD [x], 1  │  ~1 CPU cycle
└───────────────┘

Python: x = x + 1
┌────────────────────────────────────────────────────────────┐
│ LOAD_NAME     │ namespace dict lookup                     │
│ LOAD_CONST    │ fetch cached constant from co_consts      │
│ BINARY_ADD    │ type check, method lookup, unbox, add,    │
│               │ create new object, set refcount           │
│ STORE_NAME    │ dict update, decref old                   │
└────────────────────────────────────────────────────────────┘
                              ~100-1000 CPU cycles

Why Use Interpreted Languages?

Despite the performance cost, interpreted languages are widely used, for a few reasons:

Development Speed

# Python: Write and run immediately
def analyze(data):
    return sum(x**2 for x in data) / len(data)

# vs C: Write, compile, link, run, debug memory issues...

Flexibility

# Dynamic typing - decide at runtime
def process(x):
    if isinstance(x, list):
        return [i * 2 for i in x]
    elif isinstance(x, dict):
        return {k: v * 2 for k, v in x.items()}
    else:
        return x * 2

Rich Ecosystem

# A load, group, and aggregate pipeline in one expression
import pandas as pd
df = pd.read_csv('data.csv').groupby('category').mean(numeric_only=True)

The Best of Both Worlds

Modern Python leverages compiled code where it matters:

Python Ecosystem Strategy:

┌─────────────────────────────────────────────────────────────┐
│                     Python Code                             │
│                  (easy to write)                            │
│                                                             │
│    data_processing()    # Pure Python - slow but flexible  │
│    result = np.dot(A, B) # Calls C code - fast!            │
│    model.fit(X, y)       # Calls C/Fortran - fast!         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              ▼                           ▼
    ┌─────────────────┐         ┌─────────────────┐
    │  Python Layer   │         │ C/Fortran Layer │
    │  (orchestrate)  │         │ (compute)       │
    │  ~10% of time   │         │ ~90% of time    │
    └─────────────────┘         └─────────────────┘
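
The same split shows up even without third-party libraries. As a small sketch (assuming CPython, where the built-in sum is implemented in C), the snippet below compares an explicit Python loop with the built-in. The value of n and the repeat count are arbitrary, and timings will vary by machine.

```python
import timeit

def python_sum(n):
    """Add 0..n-1 with an explicit loop; each iteration runs in the bytecode interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total

n = 200_000

# Built-in sum() runs its loop inside the interpreter's C implementation
loop_time = timeit.timeit(lambda: python_sum(n), number=5)
builtin_time = timeit.timeit(lambda: sum(range(n)), number=5)

print(f"loop:    {loop_time:.4f}s")
print(f"builtin: {builtin_time:.4f}s")
```

Both calls compute the same result. The difference is only where the loop executes.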

Performance Comparison

import time
import numpy as np

# Pure Python
def python_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total

# NumPy (calls C); int64 avoids overflow where the default integer is 32-bit
def numpy_sum(n):
    return np.sum(np.arange(n, dtype=np.int64))

n = 10_000_000

start = time.perf_counter()
python_sum(n)
python_time = time.perf_counter() - start

start = time.perf_counter()
numpy_sum(n)
numpy_time = time.perf_counter() - start

print(f"Python: {python_time:.3f}s")
print(f"NumPy:  {numpy_time:.3f}s")
print(f"Speedup: {python_time/numpy_time:.0f}x")

Example output (exact timings depend on hardware and on the Python and NumPy versions):

Python: 0.850s
NumPy:  0.012s
Speedup: 70x

Summary

Aspect        Compiled (C)           Interpreted (Python)
-----------   --------------------   --------------------
Execution     Direct to CPU          Via interpreter
Speed         Fast                   Slower
Development   Slower                 Faster
Memory        Manual                 Automatic
Typing        Static                 Dynamic
Portability   Compile per platform   Run anywhere

Key insight:

Python's strategy is to be the "glue" language: Python code that is easy to write orchestrates fast compiled libraries (NumPy, TensorFlow, etc.). This gives you the best of both worlds: productivity and performance.