
Cloud Computing Basics

Mental Model

Cloud computing is renting someone else's computers by the hour instead of buying your own. You trade upfront cost and maintenance for on-demand flexibility, paying only for what you use. The key decision is how much control you need: full machines (IaaS), just a platform (PaaS), or a finished product (SaaS).

What is Cloud Computing?

Cloud computing provides on-demand access to computing resources over the internet, without owning physical hardware.

```
Traditional (On-Premises):         Cloud:

┌─────────────────────────┐        ┌─────────────────────────┐
│    Your Data Center     │        │     Cloud Provider      │
│                         │        │                         │
│  Buy servers            │        │  Rent by the hour       │
│  Maintain hardware      │        │  No maintenance         │
│  Fixed capacity         │        │  Elastic scaling        │
│  Upfront cost           │        │  Pay-as-you-go          │
│  Full control           │        │  Managed services       │
└─────────────────────────┘        └─────────────────────────┘
```
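The buy-versus-rent trade-off can be made concrete with a back-of-the-envelope break-even calculation. The sketch below is illustrative only: `break_even_hours` is a hypothetical helper, and the purchase price, upkeep, and hourly rate are made-up numbers rather than real quotes.

```python
def break_even_hours(purchase_cost, hourly_upkeep, cloud_hourly_rate):
    """Hours of use after which owning hardware becomes cheaper than renting.

    Returns None when renting costs no more per hour than the upkeep of
    owned hardware, in which case owning never pays off.
    """
    saving_per_hour = cloud_hourly_rate - hourly_upkeep
    if saving_per_hour <= 0:
        return None
    return purchase_cost / saving_per_hour


# Hypothetical figures, for illustration only
hours = break_even_hours(purchase_cost=10_000, hourly_upkeep=0.5, cloud_hourly_rate=3.0)
print(f"Break-even after {hours:.0f} hours of use")
```

If the machine would sit idle most of the time, the break-even point may never be reached, which is the usual argument for renting.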

Service Models

IaaS, PaaS, SaaS

```
Cloud Service Stack:

┌─────────────────────────────────────────────────────────────┐
│  SaaS (Software as a Service)                               │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Gmail, Dropbox, Salesforce, Slack                  │   │
│   │  You manage: Just use it                            │   │
│   │  Provider manages: Everything                       │   │
│   └─────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│  PaaS (Platform as a Service)                               │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Heroku, Google App Engine, AWS Elastic Beanstalk   │   │
│   │  You manage: Code, data                             │   │
│   │  Provider manages: Runtime, OS, servers             │   │
│   └─────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│  IaaS (Infrastructure as a Service)                         │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  AWS EC2, Google Compute, Azure VMs                 │   │
│   │  You manage: OS, runtime, code, data                │   │
│   │  Provider manages: Virtualization, servers, network │   │
│   └─────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```
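The split of responsibilities in the stack above can also be written down as a small lookup table. This is only a sketch: the layer names and the `provider_manages` helper are simplifications chosen for illustration, not an official taxonomy.

```python
# Layers of the stack, from what you write down to the physical hardware
LAYERS = ["code", "data", "runtime", "os", "virtualization", "servers", "network"]

# What the customer is responsible for under each service model
YOU_MANAGE = {
    "IaaS": {"code", "data", "runtime", "os"},
    "PaaS": {"code", "data"},
    "SaaS": set(),  # you just use the product
}


def provider_manages(model):
    """Return the layers the provider handles for the given service model."""
    return [layer for layer in LAYERS if layer not in YOU_MANAGE[model]]


for model in ("IaaS", "PaaS", "SaaS"):
    print(f"{model}: provider manages {', '.join(provider_manages(model))}")
```

Moving from IaaS to SaaS, the customer's set shrinks and the provider's list grows, which is the control-versus-convenience trade-off in a nutshell.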

Major Cloud Providers

| Provider | Strengths | Key Services |
|----------|-----------|--------------|
| AWS | Largest, most services | EC2, S3, Lambda |
| Google Cloud | ML/AI, data analytics | BigQuery, TPUs |
| Azure | Enterprise, Microsoft integration | VMs, Active Directory |

Core Cloud Services

Compute

```
Compute Options:

Virtual Machines (IaaS):
- Full control, any OS
- AWS EC2, GCP Compute Engine, Azure VMs
- Pay by hour, various sizes

Containers:
- Package app + dependencies
- AWS ECS/EKS, GCP GKE, Azure AKS
- Kubernetes orchestration

Serverless (Functions):
- Run code without servers
- AWS Lambda, GCP Functions, Azure Functions
- Pay per execution, auto-scales
```

Storage

```
Storage Types:

Object Storage (files, media):
- AWS S3, GCP Cloud Storage, Azure Blob
- Unlimited capacity, pay per GB
- Access via HTTP/API

Block Storage (VM disks):
- AWS EBS, GCP Persistent Disk
- Attached to VMs
- SSD or HDD options

Database:
- Managed SQL: AWS RDS, Cloud SQL
- NoSQL: DynamoDB, Firestore, CosmosDB
- No maintenance, automatic backups
```

Python in the Cloud

AWS with Boto3

```python
import boto3

# S3: Upload/download files
s3 = boto3.client('s3')

# Upload file
s3.upload_file('local_file.csv', 'my-bucket', 'data/file.csv')

# Download file
s3.download_file('my-bucket', 'data/file.csv', 'downloaded.csv')

# List objects
response = s3.list_objects_v2(Bucket='my-bucket', Prefix='data/')
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])
```
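A single `list_objects_v2` call returns at most 1,000 keys, so a loop over one response only sees the first page of a large prefix. The sketch below uses boto3's paginator to walk every page. `iter_object_keys` is a hypothetical helper name, and it takes the client as an argument so the function can be tried out with a stub before pointing it at a real bucket.

```python
def iter_object_keys(s3_client, bucket, prefix=""):
    """Yield (key, size) for every object under `prefix`, across all pages."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects omit the "Contents" entry
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]
```

With credentials configured, it would be called as `for key, size in iter_object_keys(boto3.client('s3'), 'my-bucket', 'data/'): print(key, size)`.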

Google Cloud

```python
from google.cloud import storage, bigquery

# Cloud Storage
storage_client = storage.Client()
bucket = storage_client.bucket('my-bucket')

# Upload
blob = bucket.blob('data/file.csv')
blob.upload_from_filename('local_file.csv')

# BigQuery
bq_client = bigquery.Client()
query = """
    SELECT *
    FROM project.dataset.table
    WHERE date > '2024-01-01'
"""
df = bq_client.query(query).to_dataframe()
```

Azure

```python
from azure.storage.blob import BlobServiceClient
from azure.identity import DefaultAzureCredential

# Blob Storage
credential = DefaultAzureCredential()
blob_service = BlobServiceClient(
    account_url="https://myaccount.blob.core.windows.net",
    credential=credential
)

# Upload
container = blob_service.get_container_client("my-container")
with open("local_file.csv", "rb") as data:
    container.upload_blob("data/file.csv", data)
```

Cloud for Machine Learning

Managed ML Services

```
ML Platforms:

┌─────────────────────────────────────────────────────────────┐
│  AWS SageMaker                                              │
│  - Managed notebooks                                        │
│  - Built-in algorithms                                      │
│  - Model training and deployment                            │
│  - GPU instances available                                  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  Google Vertex AI                                           │
│  - AutoML (no-code ML)                                      │
│  - Custom training                                          │
│  - TPU access                                               │
│  - ML pipelines                                             │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  Azure ML Studio                                            │
│  - Drag-and-drop ML                                         │
│  - Automated ML                                             │
│  - Integration with VS Code                                 │
└─────────────────────────────────────────────────────────────┘
```

GPU Instances

```python
# Example: Launch GPU instance for training
# (approximate on-demand prices; they vary by region and change over time)

# AWS EC2 GPU instances:
# - p3.2xlarge:   1x V100 (16GB) - ~$3/hour
# - p3.8xlarge:   4x V100        - ~$12/hour
# - p4d.24xlarge: 8x A100        - ~$33/hour

# Google Cloud:
# - a2-highgpu-1g: 1x A100 - ~$3/hour
# - a2-highgpu-8g: 8x A100 - ~$25/hour

# Can also attach GPUs to regular VMs:
# gcloud compute instances create my-gpu-vm \
#     --machine-type=n1-standard-8 \
#     --accelerator=type=nvidia-tesla-v100,count=1
```
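To see what these hourly rates mean for a real job, it helps to multiply them out. The sketch below reuses the approximate figures quoted above; `training_cost` and `cost_per_gpu_hour` are hypothetical helpers, and actual prices should be checked against the provider's current price list.

```python
# (approximate USD per hour, number of GPUs), copied from the list above
GPU_INSTANCES = {
    "p3.2xlarge": (3.0, 1),
    "p3.8xlarge": (12.0, 4),
    "p4d.24xlarge": (33.0, 8),
    "a2-highgpu-1g": (3.0, 1),
    "a2-highgpu-8g": (25.0, 8),
}


def training_cost(instance_type, hours, instances=1):
    """Estimated cost of running `instances` machines for `hours` each."""
    hourly_rate, _ = GPU_INSTANCES[instance_type]
    return hourly_rate * hours * instances


def cost_per_gpu_hour(instance_type):
    """Hourly rate divided by the number of GPUs on the instance."""
    hourly_rate, gpu_count = GPU_INSTANCES[instance_type]
    return hourly_rate / gpu_count


print(f"10 h on p3.2xlarge: ${training_cost('p3.2xlarge', 10):.2f}")
for name in GPU_INSTANCES:
    print(f"{name}: ${cost_per_gpu_hour(name):.2f} per GPU-hour")
```

Comparing per-GPU-hour figures rather than headline rates makes it easier to judge whether a larger instance is actually more expensive for the same amount of work.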

Serverless Computing

AWS Lambda Example

```python
# lambda_function.py
import json
from urllib.parse import unquote_plus

import boto3


def lambda_handler(event, context):
    """Process uploaded S3 file."""
    # Get bucket and key from the S3 event notification
    s3_info = event['Records'][0]['s3']
    bucket = s3_info['bucket']['name']
    # The key sits under 'object' and arrives URL-encoded
    key = unquote_plus(s3_info['object']['key'])

    # Process file
    s3 = boto3.client('s3')
    response = s3.get_object(Bucket=bucket, Key=key)
    content = response['Body'].read().decode('utf-8')

    # Do something with content
    line_count = len(content.splitlines())

    return {
        'statusCode': 200,
        'body': json.dumps({
            'file': key,
            'lines': line_count
        })
    }
```
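Event parsing is the part of a Lambda handler that is easiest to get wrong and easiest to test locally, since it needs no AWS access. The sketch below isolates it in a hypothetical helper, `extract_s3_locations`. The sample event is a hand-trimmed approximation of an S3 notification containing only the fields read here; in such notifications the key is nested under `object` and is URL-encoded.

```python
from urllib.parse import unquote_plus


def extract_s3_locations(event):
    """Return a list of (bucket, key) pairs, one per record in an S3 event."""
    locations = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys are URL-encoded in the notification (a space arrives as '+')
        key = unquote_plus(s3_info["object"]["key"])
        locations.append((bucket, key))
    return locations


# Trimmed-down sample event, for local experimentation only
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "data/my+file.csv"}}}
    ]
}
print(extract_s3_locations(sample_event))
```

Looping over all records, instead of reading only the first, also covers the case where one invocation carries several notifications.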

When to Use Serverless

```
Good for Serverless:
✓ Event-driven processing
✓ Variable/unpredictable load
✓ Short-running tasks (<15 min)
✓ API endpoints

Not Good for Serverless:
✗ Long-running jobs
✗ GPU computation
✗ Stateful applications
✗ Constant high load (cost)
```
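The "constant high load" caveat is easier to see with numbers. The sketch below compares a per-request pricing model with an always-on VM. The default rates, the VM price, and both function names are assumptions chosen for illustration; real prices depend on provider, region, and configuration.

```python
def serverless_monthly_cost(requests_per_month, avg_duration_s, memory_gb,
                            price_per_million_requests=0.20,   # assumed rate
                            price_per_gb_second=0.0000166667):  # assumed rate
    """Request fee plus compute fee (duration x memory) for one month."""
    request_cost = requests_per_month / 1_000_000 * price_per_million_requests
    compute_cost = requests_per_month * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def always_on_monthly_cost(hourly_rate, hours_per_month=730):
    """A VM is billed for every hour it runs, busy or idle."""
    return hourly_rate * hours_per_month


vm = always_on_monthly_cost(hourly_rate=0.10)  # hypothetical small VM
for requests in (100_000, 10_000_000, 1_000_000_000):
    fn = serverless_monthly_cost(requests, avg_duration_s=0.2, memory_gb=0.5)
    cheaper = "serverless" if fn < vm else "always-on VM"
    print(f"{requests:>13,} requests/month: functions ${fn:,.2f} vs VM ${vm:,.2f} -> {cheaper}")
```

Under these assumed rates the function-based option wins at low and moderate traffic and loses once the load is high and steady, which matches the guidance in the lists above.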

Cost Management

Pricing Models

```
Cloud Pricing:

On-Demand:
- Pay by hour/second
- Most expensive
- Maximum flexibility

Reserved (1-3 years):
- 30-75% cheaper
- Commitment required
- Good for steady workloads

Spot/Preemptible:
- 60-90% cheaper
- Can be interrupted
- Good for fault-tolerant batch jobs
```
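A rough model shows why each pricing option suits a different usage pattern. In the sketch below, the on-demand rate, the discounts (picked from within the ranges above), and the 10% rework overhead for interrupted spot jobs are all assumptions, and the three function names are hypothetical.

```python
HOURS_PER_MONTH = 730


def on_demand_cost(rate, hours_used):
    """Pay only for the hours actually used."""
    return rate * hours_used


def reserved_cost(rate, discount):
    """A commitment is paid for every hour of the month, used or not."""
    return rate * (1 - discount) * HOURS_PER_MONTH


def spot_cost(rate, hours_used, discount, rework_fraction=0.0):
    """Discounted rate, plus extra hours to redo work lost to interruptions."""
    return rate * (1 - discount) * hours_used * (1 + rework_fraction)


rate = 3.00  # assumed on-demand USD/hour
for hours in (200, HOURS_PER_MONTH):
    print(f"{hours} h/month: "
          f"on-demand ${on_demand_cost(rate, hours):,.0f}, "
          f"reserved ${reserved_cost(rate, discount=0.40):,.0f}, "
          f"spot ${spot_cost(rate, hours, discount=0.70, rework_fraction=0.10):,.0f}")
```

With these numbers, the reservation only pays off when the machine is busy most of the month, while spot stays cheapest as long as the job tolerates interruptions.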

Cost Optimization Tips

```python
# 1. Use spot instances for training
# 2. Right-size instances (don't over-provision)
# 3. Auto-scaling for variable loads
# 4. Use serverless for intermittent workloads
# 5. Choose appropriate storage tier

# Example: S3 storage tiers (approximate prices; they vary by region)
# Standard:          $0.023/GB/month   (frequent access)
# Infrequent Access: $0.0125/GB/month
# Glacier:           $0.004/GB/month   (archival)
# Glacier Deep:      $0.00099/GB/month (rarely accessed)
```
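The per-GB figures look tiny until they are multiplied by a realistic dataset size. The sketch below applies the tier prices listed above to a given volume. `monthly_storage_cost` is a hypothetical helper, and it deliberately ignores request and retrieval fees, which can dominate for the colder tiers.

```python
# USD per GB per month, copied from the tier list above
PRICE_PER_GB_MONTH = {
    "Standard": 0.023,
    "Infrequent Access": 0.0125,
    "Glacier": 0.004,
    "Glacier Deep": 0.00099,
}


def monthly_storage_cost(size_gb, tier):
    """Storage-only cost for one month; excludes request and retrieval fees."""
    return size_gb * PRICE_PER_GB_MONTH[tier]


size_gb = 5_000  # e.g. a 5 TB dataset
for tier in PRICE_PER_GB_MONTH:
    print(f"{tier:<18} ${monthly_storage_cost(size_gb, tier):>8.2f}/month")
```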

Getting Started

Local Development → Cloud

```bash
# 1. Develop locally
python train.py --data local_data.csv

# 2. Test with cloud storage
python train.py --data s3://bucket/data.csv

# 3. Run on cloud compute
#    Deploy to EC2/GCE or use a managed service

# 4. Scale up
#    Increase instance size or use multiple instances
```
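For the same `--data` flag to accept both a local path and an `s3://` URI, the script has to tell the two apart. One way to do that, using only the standard library, is sketched below. `parse_data_location` is a hypothetical helper, and how `train.py` actually handles its argument is not specified here.

```python
from urllib.parse import urlparse


def parse_data_location(path):
    """Classify a --data argument as an S3 object or a local file."""
    parsed = urlparse(path)
    if parsed.scheme == "s3":
        # s3://bucket/some/key.csv -> bucket="bucket", key="some/key.csv"
        return {"kind": "s3", "bucket": parsed.netloc, "key": parsed.path.lstrip("/")}
    return {"kind": "local", "path": path}


print(parse_data_location("local_data.csv"))
print(parse_data_location("s3://bucket/data.csv"))
```

The training code can then branch on `kind`, for example downloading the object to a temporary file before reading it.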

Basic Cloud Workflow

1. Create cloud account
2. Set up credentials/authentication
3. Install SDK (boto3, google-cloud, azure)
4. Store data in cloud storage
5. Launch compute instance or use managed service
6. Run workload
7. Download results / deploy model
8. Shut down resources!
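The last step is the one most often forgotten, so it is worth scripting. The sketch below lists running EC2 instances and optionally stops them. `stop_running_instances` and its `really_stop` flag are hypothetical names. The function takes the client as an argument, reads only the first page of results, and stops rather than terminates instances, so attached disks may keep incurring storage charges.

```python
def stop_running_instances(ec2_client, really_stop=False):
    """Return the IDs of running instances; stop them if really_stop is True."""
    response = ec2_client.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
    # Default is a dry listing, so nothing is stopped by accident
    if instance_ids and really_stop:
        ec2_client.stop_instances(InstanceIds=instance_ids)
    return instance_ids
```

With credentials configured, `stop_running_instances(boto3.client('ec2'))` lists what is still running, and passing `really_stop=True` stops it.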

Summary

| Service Type | What It Provides | Example |
|--------------|------------------|---------|
| IaaS | Virtual machines | EC2, Compute Engine |
| PaaS | Application platform | Heroku, App Engine |
| SaaS | Complete applications | Gmail, Dropbox |
| Storage | Files and databases | S3, Cloud Storage |
| Serverless | Run code on events | Lambda, Functions |
| ML Platform | Managed ML training | SageMaker, Vertex AI |

Key takeaways:

  • Cloud provides flexible, on-demand computing
  • Pay-as-you-go pricing replaces a large upfront investment
  • Choose service level based on control needs
  • Use managed services to reduce operational burden
  • Monitor costs; it is easy to overspend
  • Use spot/preemptible instances to cut the cost of interruptible work
  • Python SDKs available for all major clouds

Exercises

Exercise 1. Explain the difference between IaaS, PaaS, and SaaS. Give an example of each relevant to data science.

Solution to Exercise 1

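IaaS (Infrastructure as a Service) rents raw virtual machines: you manage the OS, runtime, code, and data, while the provider manages virtualization, servers, and networking. Data science example: an AWS EC2 GPU instance on which you install your own training stack.

PaaS (Platform as a Service) runs your code on a managed platform: you supply code and data, and the provider handles the runtime, OS, and servers. Data science example: deploying a model-serving web app to Heroku or Google App Engine.

SaaS (Software as a Service) is a finished application that you simply use; the provider manages everything. Data science example: a hosted notebook service such as Google Colab.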


Exercise 2. Name three major cloud providers and one service from each that is commonly used for machine learning workloads.

Solution to Exercise 2

AWS: SageMaker (managed notebooks, built-in algorithms, model training and deployment). Google Cloud: Vertex AI (AutoML, custom training, TPU access, ML pipelines). Azure: Azure ML Studio (drag-and-drop ML, automated ML).


Exercise 3. Explain the concept of horizontal scaling (scaling out) vs vertical scaling (scaling up). When would you use each?

Solution to Exercise 3

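Vertical scaling (scaling up) moves the workload to a bigger machine: more CPU, memory, or GPUs on a single instance. It needs no changes to the application, but it is capped by the largest instance size available and usually requires a restart. Use it when the workload is hard to distribute, such as single-node model training or a relational database.

Horizontal scaling (scaling out) adds more machines and spreads the load across them. It offers elasticity and fault tolerance, but the application must be able to share work across instances, typically behind a load balancer or a job queue. Use it for stateless web services, variable traffic, and batch jobs that split into independent pieces.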


Exercise 4. Write Python code that demonstrates a simple client-server interaction using requests to call a REST API endpoint.

Solution to Exercise 4

```python
import requests

# Client side: send a GET request to a REST endpoint.
# httpbin.org is a public echo service, used here as a stand-in server.
url = "https://httpbin.org/get"
response = requests.get(url, params={"q": "cloud"}, timeout=10)
response.raise_for_status()  # raise an exception on 4xx/5xx responses

# Server side: the reply is JSON that echoes the query parameters
data = response.json()
print(f"Status: {response.status_code}")
print(f"Query parameters echoed by the server: {data['args']}")
```