Saving DataFrames¶
Export DataFrames to various file formats for storage and sharing.
to_csv¶
Save DataFrame to CSV file.
1. Basic Save¶
import pandas as pd
df.to_csv('output.csv')
2. Without Index¶
df.to_csv('output.csv', index=False)
3. Custom Separator¶
df.to_csv('output.tsv', sep='\t')
to_csv Keywords¶
Customize CSV output.
1. Select Columns¶
df.to_csv('output.csv', columns=['Name', 'Age'])
2. Handle Missing¶
df.to_csv('output.csv', na_rep='NULL')
3. Float Format¶
df.to_csv('output.csv', float_format='%.2f')
to_excel¶
Save DataFrame to Excel file.
1. Basic Save¶
df.to_excel('output.xlsx')
2. Specify Sheet Name¶
df.to_excel('output.xlsx', sheet_name='Data')
3. Without Index¶
df.to_excel('output.xlsx', index=False)
Multiple Sheets¶
Save multiple DataFrames to one Excel file.
1. ExcelWriter¶
with pd.ExcelWriter('output.xlsx') as writer:
df1.to_excel(writer, sheet_name='Sheet1')
df2.to_excel(writer, sheet_name='Sheet2')
df3.to_excel(writer, sheet_name='Sheet3')
2. Append Mode¶
with pd.ExcelWriter('output.xlsx', mode='a') as writer:
df_new.to_excel(writer, sheet_name='NewSheet')
3. Engine Selection¶
# openpyxl for .xlsx, xlsxwriter for formatting
with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer)
to_json¶
Save DataFrame to JSON file.
1. Basic Save¶
df.to_json('output.json')
2. Orient Options¶
df.to_json('output.json', orient='records') # List of dicts
df.to_json('output.json', orient='columns') # Dict of lists
df.to_json('output.json', orient='index') # Dict of dicts
3. Indent for Readability¶
df.to_json('output.json', orient='records', indent=2)
to_pickle¶
Save DataFrame in pickle format.
1. Save Pickle¶
df.to_pickle('output.pkl')
2. Load Pickle¶
df = pd.read_pickle('output.pkl')
3. Compression¶
df.to_pickle('output.pkl.gz', compression='gzip')
to_parquet¶
Save in Parquet format (efficient columnar storage).
1. Basic Save¶
df.to_parquet('output.parquet')
2. Compression¶
df.to_parquet('output.parquet', compression='snappy')
3. Read Back¶
df = pd.read_parquet('output.parquet')
Financial Example¶
Save stock data workflow.
1. Download and Save¶
import yfinance as yf
ticker = 'WMT'
df = yf.Ticker(ticker).history(start='2020-01-01', end='2020-12-31')
# Save to CSV
df.to_csv(f'{ticker}.csv')
# Save to Excel
df.to_excel(f'{ticker}.xlsx', sheet_name='stocks')
2. Multiple Tickers¶
tickers = ['AAPL', 'MSFT', 'GOOGL']
with pd.ExcelWriter('portfolio.xlsx') as writer:
for ticker in tickers:
df = yf.Ticker(ticker).history(period='1y')
df.to_excel(writer, sheet_name=ticker)
3. Pickle for Speed¶
# Faster save/load for large DataFrames
df.to_pickle('large_data.pkl')
df = pd.read_pickle('large_data.pkl')
Format Comparison¶
Choose the right format.
1. CSV¶
- Pros: Universal, human-readable
- Cons: No type preservation, slow for large data
2. Excel¶
- Pros: Business-friendly, multiple sheets
- Cons: Slower, file size limits
3. Pickle¶
- Pros: Fast, preserves types
- Cons: Python-only, security concerns
4. Parquet¶
- Pros: Fast, compressed, columnar
- Cons: Less universal