Keyword - ignore_index

The ignore_index parameter resets the index of the concatenated result to a new RangeIndex.

Mental Model

When you stack two DataFrames, their original indices tag along, often producing duplicates like [0, 1, 0, 1]. Setting ignore_index=True throws away those old labels and renumbers from scratch -- [0, 1, 2, 3]. Use it whenever the original index values carry no meaning.

Default Behavior

By default (ignore_index=False), each input keeps its original index labels in the result.

1. Preserved Indices

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

df = pd.concat([df1, df2])
print("ignore_index = False (default)")
print(df)
```

```
   A  B
0  1  2
1  3  4
0  5  6
1  7  8
```

2. Duplicate Indices

Notice indices 0 and 1 appear twice.

3. Potential Issues

Duplicate index labels make label-based lookups ambiguous: df.loc[0] returns every row labeled 0 (two rows here) instead of a single row.
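To make the issue concrete, here is a minimal sketch that rebuilds the two frames from the example above and looks up label 0 with loc. The name `rows` is illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

df = pd.concat([df1, df2])    # index labels are [0, 1, 0, 1]

rows = df.loc[0]              # label 0 matches two rows
print(type(rows).__name__)    # DataFrame, not Series
print(len(rows))              # 2
print(df.index.is_unique)     # False
```

Code that expects `df.loc[0]` to return one row (a Series) will behave differently here, which is the usual source of surprises.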

ignore_index=True

Create a new sequential index.

1. Reset Index

```python
dg = pd.concat([df1, df2], ignore_index=True)
print("ignore_index = True")
print(dg)
```

```
   A  B
0  1  2
1  3  4
2  5  6
3  7  8
```

2. Unique Indices

The result gets a new RangeIndex running from 0 to n-1, where n is the total number of rows.

3. Clean Result

No duplicate indices; easier to work with.
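These properties can be checked directly. The sketch below reuses the frames from the earlier example and inspects the index of the concatenated result.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

dg = pd.concat([df1, df2], ignore_index=True)

print(dg.index)              # RangeIndex(start=0, stop=4, step=1)
print(dg.index.is_unique)    # True
print(dg.loc[0])             # a single row, returned as a Series
```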

When to Use

Guidelines for ignore_index.

1. Use ignore_index=True When

```python
# Original index is meaningless (e.g., auto-generated)
# Combining data from multiple files
# Need unique indices
# Index will be reset anyway
```

2. Keep Original Index When

```python
# Index has meaning (dates, IDs)
# Need to trace data back to source
# Using keys parameter instead
```
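As a rough illustration of the keys alternative, the sketch below tags each input with a source label instead of discarding the index. The labels 'first' and 'second' are hypothetical and chosen only for this example.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

# keys adds an outer index level naming the source of each row
# ('first' and 'second' are made-up labels)
tagged = pd.concat([df1, df2], keys=['first', 'second'])
print(tagged)

# Select only the rows that came from df2
from_second = tagged.loc['second']
print(from_second)

# The (source, original label) pairs are unique even though 0 and 1 repeat
print(tagged.index.is_unique)
```

This keeps the original labels available for tracing rows back to their source, at the cost of working with a two-level index.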

3. Common Pattern

```python
# Combine CSVs with ignore_index
all_data = pd.concat(
    [
        pd.read_csv('data_2023.csv'),
        pd.read_csv('data_2024.csv'),
    ],
    ignore_index=True,
)
```
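The same pattern can be tried without files on disk. In this sketch, in-memory buffers stand in for data_2023.csv and data_2024.csv; the column names and values are invented for illustration.

```python
import io
import pandas as pd

# Stand-ins for the two CSV files (contents are made up)
csv_2023 = io.StringIO("id,value\n1,10\n2,20\n")
csv_2024 = io.StringIO("id,value\n3,30\n4,40\n")

# Each read_csv call yields a frame indexed 0..1; concat renumbers to 0..3
frames = [pd.read_csv(f) for f in (csv_2023, csv_2024)]
all_data = pd.concat(frames, ignore_index=True)
print(all_data)
```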

axis=1 Behavior

ignore_index applies to the concatenation axis, so with axis=1 it resets the column labels rather than the row index.

1. Reset Column Names

```python
df1 = pd.DataFrame([[1, 2]], columns=['A', 'B'])
df2 = pd.DataFrame([[3, 4]], columns=['C', 'D'])

result = pd.concat([df1, df2], axis=1, ignore_index=True)
print(result)
```

```
   0  1  2  3
0  1  2  3  4
```

2. Column Names Lost

Columns become 0, 1, 2, 3 instead of A, B, C, D.

3. Usually Not Wanted

For axis=1, typically keep column names.
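For comparison, this short sketch runs the same axis=1 concatenation with and without ignore_index and prints the resulting column labels. The names `kept` and `reset` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2]], columns=['A', 'B'])
df2 = pd.DataFrame([[3, 4]], columns=['C', 'D'])

kept = pd.concat([df1, df2], axis=1)                      # default
reset = pd.concat([df1, df2], axis=1, ignore_index=True)

print(list(kept.columns))    # ['A', 'B', 'C', 'D']
print(list(reset.columns))   # [0, 1, 2, 3]
```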

Comparison with reset_index

Two ways to achieve clean indices.

1. concat with ignore_index

```python
result = pd.concat([df1, df2], ignore_index=True)
```

2. reset_index After

```python
result = pd.concat([df1, df2]).reset_index(drop=True)
```

3. Equivalent Results

Both produce the same output; ignore_index is cleaner.
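A quick way to confirm the equivalence for a simple case is to build both results and compare them. The sketch below uses two small frames with matching columns; `a` and `b` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

a = pd.concat([df1, df2], ignore_index=True)
b = pd.concat([df1, df2]).reset_index(drop=True)

print(a.equals(b))   # True
```

The ignore_index form avoids building the intermediate frame with duplicate labels and keeps the intent in a single call.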


Exercises

Exercise 1. Write code that concatenates two DataFrames and shows the difference between ignore_index=False (default) and ignore_index=True.

Solution to Exercise 1

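```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Default: original labels are kept, so the index is [0, 1, 0, 1]
print(pd.concat([df1, df2]))

# ignore_index=True: labels are renumbered to [0, 1, 2, 3]
print(pd.concat([df1, df2], ignore_index=True))
```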


Exercise 2. Explain when ignore_index=True is useful. What problem does it solve?

Solution to Exercise 2

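ignore_index=True is useful when the original index values carry no meaning, for example auto-generated row numbers from several files being stacked together. It solves the problem of duplicate index labels in the concatenated result: without it, labels such as 0 and 1 repeat, and a lookup like loc[0] returns several rows instead of one. With it, the result gets a fresh RangeIndex from 0 to n-1.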


Exercise 3. Create two DataFrames with custom indices (e.g., starting from 100). Concatenate them with and without ignore_index and compare the index.

Solution to Exercise 3

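```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=[100, 101])
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]}, index=[102, 103])

kept = pd.concat([df1, df2])                      # custom labels preserved
reset = pd.concat([df1, df2], ignore_index=True)  # labels renumbered

print(kept.index.tolist())    # [100, 101, 102, 103]
print(reset.index.tolist())   # [0, 1, 2, 3]
```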


Exercise 4. Write code that concatenates DataFrames along axis=1 with ignore_index=True. What gets reset in this case?

Solution to Exercise 4

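```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'B': [3, 4]})

# With axis=1, ignore_index resets the column labels, not the row index:
# columns 'A' and 'B' become 0 and 1
result = pd.concat([df1, df2], axis=1, ignore_index=True)
print(result)
```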