Skip to content

Big-O Notation

Big-O notation describes how algorithm performance scales with input size. Understanding Big-O is essential for choosing algorithms and data structures that perform well at scale.

Mental Model

Big-O answers one question: "if I double the input size, how much longer does this take?" O(1) stays the same, O(n) doubles, O(n^2) quadruples, and O(log n) barely increases. Focus on the dominant term and ignore constants -- Big-O captures the growth rate, not the exact runtime.


Common Complexities

Comparing Orders of Growth

```python def demonstrate_complexity(): def constant(n): return n

def linear(n):
    total = 0
    for i in range(n):
        total += i
    return total

def quadratic(n):
    total = 0
    for i in range(n):
        for j in range(n):
            total += 1
    return total

n = 1000

print(f"O(1): {constant(n)}")
print(f"O(n): {linear(n)}")
print(f"O(n²): {quadratic(100)}")

demonstrate_complexity() ```

Output: O(1): 1000 O(n): 499500 O(n²): 10000

Analysis Examples

```python def linear_search(arr, target): for i, val in enumerate(arr): if val == target: return i return -1

def binary_search(arr, target): left, right = 0, len(arr) - 1 while left <= right: mid = (left + right) // 2 if arr[mid] == target: return mid elif arr[mid] < target: left = mid + 1 else: right = mid - 1 return -1

data = list(range(1000000))

print(f"Linear: {linear_search(data, 999999)}") print(f"Binary: {binary_search(data, 999999)}") ```

Output: Linear: 999999 Binary: 999999

Space Complexity

Memory Usage Patterns

```python def space_o1(): x = 10 return x * 2

def space_on(n): lst = list(range(n)) return sum(lst)

def space_on_sq(n): matrix = [[0] * n for _ in range(n)] return matrix

space_o1() space_on(1000)

matrix = space_on_sq(100) print(f"Matrix created: {len(matrix)}x{len(matrix[0])}") ```

Output: Matrix created: 100x100

Practical Guidelines

Choosing Algorithms

```python import timeit

small_n = 100

setup_small = "arr = list(range(100)); target = 50" linear_time = timeit.timeit("target in arr", setup=setup_small, number=10000)

print(f"Search time for small data: {linear_time:.6f}s") print("For large data, use binary search or hash tables (O(1) lookup)") ```

Output: Search time for small data: 0.000123s For large data, use binary search or hash tables (O(1) lookup)


Runnable Example: search_algorithms_complexity.py

```python """ Search Algorithms and Time Complexity Visualization

This tutorial demonstrates two fundamental search algorithms and visualizes the growth rates of common time complexity classes.

Topics covered: - Sequential (linear) search: O(n) - Binary search: O(log n) on sorted data - Big-O complexity growth rate comparison

Based on concepts from Python-100-Days example01 and ch02/performance materials. """

from math import log2, factorial

=============================================================================

Example 1: Sequential Search - O(n)

=============================================================================

def sequential_search(items: list, target) -> int: """Search for target by checking each element one by one.

Time Complexity: O(n) - must check every element in worst case.
Best for: unsorted data, small lists.

>>> sequential_search([35, 97, 12, 68, 55], 12)
2
>>> sequential_search([35, 97, 12, 68, 55], 99)
-1
"""
for index, item in enumerate(items):
    if item == target:
        return index  # Early return on match
return -1

=============================================================================

Example 2: Binary Search - O(log n)

=============================================================================

def binary_search(items: list, target) -> int: """Search for target by repeatedly halving the search space.

Requires: items must be sorted in ascending order.
Time Complexity: O(log n) - halves search space each step.

>>> binary_search([12, 35, 40, 55, 68, 73, 81, 97], 55)
3
>>> binary_search([12, 35, 40, 55, 68, 73, 81, 97], 99)
-1
"""
start, end = 0, len(items) - 1
while start <= end:
    mid = (start + end) // 2
    if target > items[mid]:
        start = mid + 1
    elif target < items[mid]:
        end = mid - 1
    else:
        return mid
return -1

=============================================================================

Example 3: Comparing Search Performance

=============================================================================

def compare_searches(): """Compare sequential vs binary search performance.""" import time

data = list(range(1_000_000))  # sorted list of 1M elements
target = 999_999  # worst case for sequential

# Sequential search
start = time.perf_counter()
result1 = sequential_search(data, target)
seq_time = time.perf_counter() - start

# Binary search
start = time.perf_counter()
result2 = binary_search(data, target)
bin_time = time.perf_counter() - start

print("=== Search Performance Comparison (1,000,000 elements) ===")
print(f"Sequential search: found at index {result1}, time = {seq_time:.6f}s")
print(f"Binary search:     found at index {result2}, time = {bin_time:.6f}s")
print(f"Speedup: {seq_time / bin_time:.1f}x faster")
print()

=============================================================================

Example 4: Big-O Complexity Classes

=============================================================================

def print_complexity_table(): """Display a table showing how different complexity classes grow.

Common complexity classes (from fastest to slowest):
O(1)        - Constant:    Hash table lookup
O(log n)    - Logarithmic: Binary search
O(n)        - Linear:      Sequential search
O(n log n)  - Linearithmic: Merge sort, quick sort
O(n^2)      - Quadratic:   Bubble sort, selection sort
O(n^3)      - Cubic:       Matrix multiplication (naive)
O(2^n)      - Exponential: Recursive fibonacci (naive)
O(n!)       - Factorial:   Travelling salesman (brute force)
"""
print("=== Big-O Complexity Growth Table ===")
print(f"{'n':>4} | {'log n':>8} | {'n':>8} | {'n log n':>10} | "
      f"{'n^2':>10} | {'n^3':>12} | {'2^n':>12} | {'n!':>14}")
print("-" * 90)

for n in range(1, 11):
    log_n = log2(n) if n > 0 else 0
    n_log_n = n * log2(n) if n > 0 else 0
    n_sq = n ** 2
    n_cube = n ** 3
    two_n = 2 ** n
    n_fact = factorial(n)
    print(f"{n:>4} | {log_n:>8.2f} | {n:>8} | {n_log_n:>10.2f} | "
          f"{n_sq:>10} | {n_cube:>12} | {two_n:>12} | {n_fact:>14}")
print()

=============================================================================

Example 5: Complexity Visualization (requires matplotlib + numpy)

=============================================================================

def plot_complexity_growth(): """Visualize how different complexity classes grow with n.

This creates a plot comparing O(log n), O(n), O(n log n),
O(n^2), O(n^3), O(2^n), and O(n!) growth rates.
"""
try:
    from matplotlib import pyplot as plt
    import numpy as np
except ImportError:
    print("matplotlib and numpy required for plotting")
    return

num = 6
x = list(range(1, num + 1))

complexities = {
    'O(log n)':    [log2(n) for n in x],
    'O(n)':        x,
    'O(n log n)':  [n * log2(n) for n in x],
    'O(n²)':       [n ** 2 for n in x],
    'O(n³)':       [n ** 3 for n in x],
    'O(2ⁿ)':       [2 ** n for n in x],
    'O(n!)':       [factorial(n) for n in x],
}

styles = ['r-.', 'g-*', 'b-o', 'y-x', 'c-^', 'm-+', 'k-d']
fig, ax = plt.subplots(figsize=(10, 6))

for (label, y_data), style in zip(complexities.items(), styles):
    ax.plot(x, y_data, style, label=label)

ax.set_xlabel('Input Size (n)')
ax.set_ylabel('Operations')
ax.set_title('Time Complexity Growth Rates')
ax.legend()
ax.set_xticks(np.arange(1, num + 1, step=1))
ax.set_yticks(np.arange(0, 751, step=50))
plt.tight_layout()
plt.show()

=============================================================================

Main

=============================================================================

if name == 'main': compare_searches() print_complexity_table() # Uncomment to see the plot: # plot_complexity_growth() ```


Exercises

Exercise 1. Write both a linear search and a binary search for a sorted list. Time each when searching for the last element in a list of 1,000,000 items. Report the time difference.

Solution to Exercise 1

```python import time

def linear_search(arr, target): for i, val in enumerate(arr): if val == target: return i return -1

def binary_search(arr, target): lo, hi = 0, len(arr) - 1 while lo <= hi: mid = (lo + hi) // 2 if arr[mid] == target: return mid elif arr[mid] < target: lo = mid + 1 else: hi = mid - 1 return -1

data = list(range(1_000_000)) target = 999_999

start = time.perf_counter() linear_search(data, target) linear_time = time.perf_counter() - start

start = time.perf_counter() binary_search(data, target) binary_time = time.perf_counter() - start

print(f"Linear: {linear_time:.4f}s") print(f"Binary: {binary_time:.6f}s") ```

Linear search checks every element (O(n)), while binary search halves the search space each step (O(log n)).


Exercise 2. Write a function that finds all duplicate values in a list. Implement it in two ways: one with O(n^2) time complexity (nested loops) and one with O(n) time complexity (using a set). Compare their performance on a list of 10,000 elements.

Solution to Exercise 2

```python import time

def find_dupes_quadratic(lst): dupes = [] for i in range(len(lst)): for j in range(i + 1, len(lst)): if lst[i] == lst[j] and lst[i] not in dupes: dupes.append(lst[i]) return dupes

def find_dupes_linear(lst): seen = set() dupes = set() for item in lst: if item in seen: dupes.add(item) seen.add(item) return list(dupes)

data = list(range(5000)) + list(range(5000))

start = time.perf_counter() find_dupes_quadratic(data) quad_time = time.perf_counter() - start

start = time.perf_counter() find_dupes_linear(data) lin_time = time.perf_counter() - start

print(f"O(n^2): {quad_time:.4f}s") print(f"O(n): {lin_time:.6f}s") ```

The set-based approach uses O(1) membership testing, reducing the overall complexity from O(n^2) to O(n).


Exercise 3. Explain why an O(n^2) algorithm with a small constant factor might outperform an O(n log n) algorithm for small input sizes. Demonstrate with a concrete example using timeit.

Solution to Exercise 3

```python import timeit

Simple O(n^2) insertion sort vs O(n log n) sorted()

def insertion_sort(arr): for i in range(1, len(arr)): key = arr[i] j = i - 1 while j >= 0 and arr[j] > key: arr[j + 1] = arr[j] j -= 1 arr[j + 1] = key

n = 10 setup = f"import random; data = random.sample(range({n}), {n})"

t_insert = timeit.timeit("insertion_sort(data[:])", setup=setup + "; from main import insertion_sort", number=10000) t_sorted = timeit.timeit("sorted(data)", setup=setup, number=10000)

print(f"Insertion sort (n={n}): {t_insert:.4f}s") print(f"sorted() (n={n}): {t_sorted:.4f}s") ```

For very small inputs, the overhead of O(n log n) algorithms (function calls, more complex logic) can exceed the theoretical advantage. The constant factors dominate when n is tiny.