DatetimeIndex

pandas provides rich support for time series data through the DatetimeIndex, which enables label-based time alignment and slicing.

Creating DatetimeIndex

Use pd.date_range to generate a regularly spaced DatetimeIndex for a time series.

1. date_range Function

import pandas as pd

dates = pd.date_range("2020-01-01", periods=5, freq="D")
print(dates)
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05'],
              dtype='datetime64[ns]', freq='D')

2. With End Date

dates = pd.date_range(start="2024-01-01", end="2024-01-10")
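
As a minimal sketch of how the arguments interact: date_range takes some combination of start, end, periods, and freq, and since freq defaults to daily, any two of the other three are enough. Both endpoints are included by default. The dates below are only illustrative.

```python
import pandas as pd

# start + end: both endpoints are included (freq defaults to daily)
by_end = pd.date_range(start="2024-01-01", end="2024-01-10")
print(len(by_end))       # 10

# end + periods: pandas counts backwards from the end date
last_three = pd.date_range(end="2024-01-10", periods=3)
print(last_three[0])     # 2024-01-08 00:00:00
```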

3. Nanosecond Precision

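DatetimeIndex has traditionally stored timestamps at nanosecond resolution (dtype datetime64[ns]), which limits the representable range to roughly the years 1677 through 2262. Since pandas 2.0, second, millisecond, and microsecond resolutions are also supported.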
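
A small sketch of what nanosecond precision means in practice: a Timestamp can carry a nanosecond component, and its integer value counts nanoseconds since the Unix epoch. The timestamp chosen here is arbitrary.

```python
import pandas as pd

# One nanosecond past midnight on 2020-01-01
ts = pd.Timestamp("2020-01-01 00:00:00.000000001")

print(ts.nanosecond)   # 1
print(ts.value)        # integer nanoseconds since 1970-01-01
```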

Converting to Datetime

Use pd.to_datetime to parse strings and other date-like values into datetime objects.

1. to_datetime Function

pd.to_datetime(["2021-01-01", "2021-01-05"])

2. Parsing Dates When Loading CSV

df = pd.read_csv('data.csv', parse_dates=['date_column'])
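
To try this without a file on disk, the sketch below reads from an in-memory string. The CSV content and the price column are hypothetical stand-ins for data.csv; index_col is added so the parsed dates become the DatetimeIndex.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = "date_column,price\n2024-01-02,100.0\n2024-01-03,101.5\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["date_column"],   # parse this column as datetimes
    index_col="date_column",       # and use it as the index
)
print(type(df.index).__name__)     # DatetimeIndex
```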

3. Format Specification

pd.to_datetime("01-15-2024", format="%m-%d-%Y")
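
When some inputs may not match the format, one option is errors="coerce", which turns unparseable values into NaT instead of raising. A minimal sketch with made-up input strings:

```python
import pandas as pd

# Second entry is an impossible date, third is not a date at all
raw = ["01-15-2024", "02-30-2024", "not a date"]

parsed = pd.to_datetime(raw, format="%m-%d-%Y", errors="coerce")
print(parsed.isna().tolist())   # [False, True, True]
```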

Date Range Frequencies

The freq argument accepts frequency strings (offset aliases). The most common ones are listed here.

1. Daily and Sub-daily

pd.date_range("2024-01-01", periods=5, freq="D")    # Daily
pd.date_range("2024-01-01", periods=5, freq="h")    # Hourly ("H" is deprecated since pandas 2.2)
pd.date_range("2024-01-01", periods=5, freq="min")  # Minute ("T" is deprecated since pandas 2.2)
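
Frequency strings can also carry an integer multiple. The sketch below uses the lowercase "h" and "min" aliases; the specific multiples are arbitrary choices for illustration.

```python
import pandas as pd

# Every 15 minutes and every 6 hours, starting at midnight
quarter_hours = pd.date_range("2024-01-01", periods=4, freq="15min")
six_hourly = pd.date_range("2024-01-01", periods=4, freq="6h")

print(quarter_hours[-1])   # 2024-01-01 00:45:00
print(six_hourly[-1])      # 2024-01-01 18:00:00
```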

2. Weekly and Monthly

pd.date_range("2024-01-01", periods=5, freq="W")   # Weekly
pd.date_range("2024-01-01", periods=5, freq="ME")  # Month end ("M" is deprecated since pandas 2.2)
pd.date_range("2024-01-01", periods=5, freq="MS")  # Month start
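
Weekly and monthly frequencies are anchored, so the first generated date is not necessarily the start date. A short sketch of that behavior, with start dates picked to make the roll-forward visible:

```python
import pandas as pd

# "W" is anchored on Sundays; 2024-01-01 is a Monday
weekly = pd.date_range("2024-01-01", periods=3, freq="W")
# "MS" is anchored on the first day of each month
month_starts = pd.date_range("2024-01-15", periods=3, freq="MS")

print(weekly[0])         # 2024-01-07 00:00:00
print(month_starts[0])   # 2024-02-01 00:00:00
```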

3. Business Days

pd.date_range("2024-01-01", periods=5, freq="B")   # Business days
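
The "B" frequency skips Saturdays and Sundays but knows nothing about public holidays, so New Year's Day 2024 (a Monday) still appears in the range above. A minimal sketch of both points:

```python
import pandas as pd

# Starting on a Friday: the weekend is skipped
across_weekend = pd.date_range("2024-01-05", periods=3, freq="B")
print(across_weekend.strftime("%Y-%m-%d").tolist())
# ['2024-01-05', '2024-01-08', '2024-01-09']

# Holidays are not excluded: January 1 is treated as a business day
new_year_week = pd.date_range("2024-01-01", periods=5, freq="B")
print(new_year_week[0])   # 2024-01-01 00:00:00
```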

Indexing with Dates

A DatetimeIndex can be sliced and selected with date strings as well as Timestamp objects.

1. String-based Selection

s = pd.Series(range(5), index=pd.date_range("2020-01-01", periods=5))
s["2020-01-02":"2020-01-04"]  # Label slicing includes both endpoints
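
The inclusive end point is the main difference from positional slicing. A quick sketch comparing the two on the same series:

```python
import pandas as pd

s = pd.Series(range(5), index=pd.date_range("2020-01-01", periods=5))

by_label = s["2020-01-02":"2020-01-04"]   # both endpoints included
by_position = s.iloc[1:3]                 # stop position excluded

print(len(by_label), len(by_position))    # 3 2
```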

2. Partial String Indexing

s["2020-01"]  # All of January 2020
s["2020"]     # All of 2020
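
Partial string indexing is easier to see on a series that spans several months. The sketch below builds a hypothetical 60-day series (not the 5-day series above) and uses .loc, which accepts the same partial strings.

```python
import pandas as pd

# 60 daily observations from 2020-01-15 through 2020-03-14
s = pd.Series(range(60), index=pd.date_range("2020-01-15", periods=60))

jan = s.loc["2020-01"]   # January 15 through January 31
feb = s.loc["2020-02"]   # all of February (2020 is a leap year)

print(len(jan), len(feb))   # 17 29
```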

3. loc with Dates

s.loc["2020-01-02"]
s.loc["2020-01-02":"2020-01-04"]

Time Zone Handling

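A DatetimeIndex is either time zone naive or time zone aware. tz_localize attaches a zone to naive timestamps, and tz_convert moves aware timestamps to another zone.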

1. Localize to UTC

dates = pd.date_range("2020-01-01", periods=5)
dates_utc = dates.tz_localize("UTC")

2. Convert Time Zone

dates_eastern = dates_utc.tz_convert("America/New_York")  # "US/Eastern" is a legacy alias for this zone
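
A short sketch of the difference between the two methods: tz_convert changes the wall-clock reading but not the underlying instant, and it refuses to operate on a naive index. The zone name America/New_York is used here as the IANA name for US Eastern time.

```python
import pandas as pd

naive = pd.date_range("2020-01-01", periods=2)
utc = naive.tz_localize("UTC")                  # attach a zone
new_york = utc.tz_convert("America/New_York")   # same instants, new wall clock

print(new_york[0])                # 2019-12-31 19:00:00-05:00
print((utc == new_york).all())    # True

# A naive index has no zone to convert from
try:
    naive.tz_convert("UTC")
except TypeError as exc:
    print("tz_convert failed on naive index:", exc)
```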

3. Financial Data

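Time zone awareness matters for global financial data: exchanges in different zones open and close at different UTC instants, so timestamps should be localized before series from different markets are compared or combined.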
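
As an illustration, the sketch below combines two single-observation series stamped in different zones. The values and timestamps are hypothetical; the two timestamps were chosen to denote the same instant. pandas aligns aware series on UTC, so the result carries a UTC index.

```python
import pandas as pd

# Hypothetical observations recorded in local exchange time
ny = pd.Series(
    [1.0],
    index=pd.DatetimeIndex(["2024-01-02 16:00"]).tz_localize("America/New_York"),
)
london = pd.Series(
    [2.0],
    index=pd.DatetimeIndex(["2024-01-02 21:00"]).tz_localize("Europe/London"),
)

# 16:00 in New York and 21:00 in London are the same instant in January
combined = ny + london
print(combined.index.tz)   # UTC
print(combined.iloc[0])    # 3.0
```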

Financial Context

Most financial time series in pandas, such as prices and returns, are indexed by a DatetimeIndex.

1. Price Time Series

prices = pd.Series([100, 101, 102], index=pd.date_range("2024-01-01", periods=3))

2. Returns Calculation

returns = prices.pct_change()
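
pct_change computes the simple return from one observation to the next, so the first entry is NaN. The sketch below checks it against the equivalent shift-based formula, using the same three prices.

```python
import pandas as pd

prices = pd.Series([100, 101, 102], index=pd.date_range("2024-01-01", periods=3))

returns = prices.pct_change()
manual = prices / prices.shift(1) - 1   # same calculation written out

print(returns)
```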

3. Event Studies

# Select specific date ranges for analysis
event_window = prices["2024-01-01":"2024-01-31"]
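
An event window is often defined relative to an event date rather than by fixed strings. A minimal sketch, assuming a hypothetical daily price series, event date, and a window of two calendar days on each side:

```python
import pandas as pd

# Hypothetical daily prices for January 2024
prices = pd.Series(
    range(100, 131),
    index=pd.date_range("2024-01-01", periods=31),
    dtype="float64",
)

event_date = pd.Timestamp("2024-01-15")   # hypothetical event
half_width = pd.Timedelta(days=2)

# Label slicing includes both ends of the window
event_window = prices.loc[event_date - half_width : event_date + half_width]
print(len(event_window))   # 5
```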

Runnable Example: time_series_tutorial.py

"""
Pandas Tutorial: Time Series Analysis.

Covers datetime operations, resampling, rolling windows, time zones.
"""

import pandas as pd
import numpy as np

# =============================================================================
# Main
# =============================================================================

if __name__ == "__main__":

    print("="*70)
    print("TIME SERIES ANALYSIS")
    print("="*70)

    # Create time series data
    dates = pd.date_range('2024-01-01', periods=100, freq='D')
    np.random.seed(42)
    ts_data = pd.DataFrame({
        'Date': dates,
        'Sales': np.random.randint(100, 500, 100),
        'Temperature': np.random.uniform(15, 35, 100)
    })
    ts_data.set_index('Date', inplace=True)

    print("\nTime Series Data:")
    print(ts_data.head())

    # Date parsing
    print("\n1. Parse dates from strings:")
    date_strings = ['2024-01-01', '2024-02-15', '2024-03-30']
    parsed_dates = pd.to_datetime(date_strings)
    print(parsed_dates)

    # Date ranges
    print("\n2. Create date ranges:")
    print("Daily:", pd.date_range('2024-01-01', periods=5, freq='D'))
    print("Weekly:", pd.date_range('2024-01-01', periods=5, freq='W'))
    print("Monthly:", pd.date_range('2024-01-01', periods=5, freq='MS'))

    # Accessing datetime components
    print("\n3. Extract datetime components:")
    ts_data['Year'] = ts_data.index.year
    ts_data['Month'] = ts_data.index.month
    ts_data['Day'] = ts_data.index.day
    ts_data['DayOfWeek'] = ts_data.index.dayofweek
    print(ts_data[['Sales', 'Year', 'Month', 'Day', 'DayOfWeek']].head())

    # Resampling (changing frequency)
    print("\n4. Resample to weekly frequency (sum):")
    weekly = ts_data['Sales'].resample('W').sum()
    print(weekly.head())

    print("\n5. Resample to monthly (mean):")
    monthly = ts_data['Sales'].resample('MS').mean()
    print(monthly.head())

    # Rolling windows
    print("\n6. Rolling mean (7-day window):")
    ts_data['Sales_MA7'] = ts_data['Sales'].rolling(window=7).mean()
    print(ts_data[['Sales', 'Sales_MA7']].head(10))

    print("\n7. Rolling statistics:")
    rolling_stats = ts_data['Sales'].rolling(window=7).agg(['mean', 'std', 'min', 'max'])
    print(rolling_stats.head(10))

    # Expanding windows
    print("\n8. Expanding mean (cumulative):")
    ts_data['Cumulative_Mean'] = ts_data['Sales'].expanding().mean()
    print(ts_data[['Sales', 'Cumulative_Mean']].head(10))

    # Shift and lag
    print("\n9. Shift values (lag/lead):")
    ts_data['Sales_Yesterday'] = ts_data['Sales'].shift(1)
    ts_data['Sales_Tomorrow'] = ts_data['Sales'].shift(-1)
    print(ts_data[['Sales', 'Sales_Yesterday', 'Sales_Tomorrow']].head())

    # Percentage change
    print("\n10. Percentage change:")
    ts_data['Sales_Pct_Change'] = ts_data['Sales'].pct_change()
    print(ts_data[['Sales', 'Sales_Pct_Change']].head())

    # Time zones
    print("\n11. Time zone operations:")
    utc_dates = pd.date_range('2024-01-01', periods=3, freq='D', tz='UTC')
    print("UTC:", utc_dates)

    # Convert time zone ("US/Eastern" is a legacy alias for America/New_York)
    eastern = utc_dates.tz_convert('America/New_York')
    print("Eastern:", eastern)

    print("\nKEY TAKEAWAYS:")
    print("- pd.to_datetime(): Parse dates from strings")
    print("- pd.date_range(): Create date sequences")
    print("- resample(): Change time frequency")
    print("- rolling(): Moving window calculations")
    print("- shift(): Lag/lead values")
    print("- pct_change(): Percentage change")
    print("- Time zone handling with tz parameter")