Download notebook

Transposing data in pandas (switching columns and rows)¶

Switching the columns and rows in a dataset is called transposing, and is one of the simplest forms of reshaping data.

In [3]:

import pandas as pd

df = pd.DataFrame([
    { 'date': '2017-01-01', 'beef': 1001.2, 'ofada': 377.4, 'cafish': 899.6},
    { 'date': '2017-01-01', 'beef': 1129.0, 'ofada': 472.3, 'cafish': 916.5},
    { 'date': '2017-01-01', 'beef': 1067.7, 'ofada': 319.9, 'cafish': 919.8}
])
df
import pandas as pd

df = pd.DataFrame([
    { 'date': '2017-01-01', 'beef': 1001.2, 'ofada': 377.4, 'cafish': 899.6},
    { 'date': '2017-01-01', 'beef': 1129.0, 'ofada': 472.3, 'cafish': 916.5},
    { 'date': '2017-01-01', 'beef': 1067.7, 'ofada': 319.9, 'cafish': 919.8}
])
df

Out[3]:

	date	beef	ofada	cafish
0	2017-01-01	1001.2	377.4	899.6
1	2017-01-01	1129.0	472.3	916.5
2	2017-01-01	1067.7	319.9	919.8

Simple transposition¶

To switch columns and rows, you can just use df.T

In [8]:

df.T
df.T

Out[8]:

	0	1	2
date	2017-01-01	2017-01-01	2017-01-01
beef	1001.2	1129.0	1067.7
ofada	377.4	472.3	319.9
cafish	899.6	916.5	919.8

The problem we can see is the first row should be the column names! You could manually the first row and rename your column headers...

In [41]:

df.T.iloc[1:].rename(columns=df.date)
df.T.iloc[1:].rename(columns=df.date)

Out[41]:

	2017-01-01	2017-01-01	2017-01-01
beef	1001.2	1129.0	1067.7
ofada	377.4	472.3	319.9
cafish	899.6	916.5	919.8

...but there's another way, too.

Transpose but fix column issue with `.set_index`¶

With our starting dataset, the date is just the same as any other column.

In [69]:

df
df

Out[69]:

	date	beef	ofada	cafish
0	2017-01-01	1001.2	377.4	899.6
1	2017-01-01	1129.0	472.3	916.5
2	2017-01-01	1067.7	319.9	919.8

When we transpose, it's the same as any other row.

In [70]:

df.T
df.T

Out[70]:

	0	1	2
date	2017-01-01	2017-01-01	2017-01-01
beef	1001.2	1129.0	1067.7
ofada	377.4	472.3	319.9
cafish	899.6	916.5	919.8

Where do the numbers on the columns come from? When we transpose, the index of the original dataframe becomes the column headers. To allow our dates to become column headers, we just need to set it as the index before we transpose.

In [71]:

df.set_index('date')
df.set_index('date')

Out[71]:

	beef	ofada	cafish
date
2017-01-01	1001.2	377.4	899.6
2017-01-01	1129.0	472.3	916.5
2017-01-01	1067.7	319.9	919.8

In [76]:

df.set_index('date').T
df.set_index('date').T

Out[76]:

date	2017-01-01	2017-01-01	2017-01-01
beef	1001.2	1129.0	1067.7
ofada	377.4	472.3	319.9
cafish	899.6	916.5	919.8

But the opposite happened, too, which is why our foods are bold: the column headers became the index of our rows. To turn them into normal rows, we'll use reset_index().

In [75]:

df.set_index('date').T.reset_index()
df.set_index('date').T.reset_index()

Out[75]:

date	index	2017-01-01	2017-01-01	2017-01-01
0	beef	1001.2	1129.0	1067.7
1	ofada	377.4	472.3	319.9
2	cafish	899.6	916.5	919.8

But they have a ridiculous name now! There's no solution to this that makes me happy.

In [82]:

df.set_index('date').T.reset_index().rename(columns={'index': 'food'})
df.set_index('date').T.reset_index().rename(columns={'index': 'food'})

Out[82]:

date	food	2017-01-01	2017-01-01	2017-01-01
0	beef	1001.2	1129.0	1067.7
1	ofada	377.4	472.3	319.9
2	cafish	899.6	916.5	919.8

If we want to get rid of the index label saying date when it's just a side effect of the transposition, we can use rename_axis.

In [86]:

df.set_index('date').T.reset_index().rename(columns={'index': 'food'}).rename_axis(None, axis=1)
df.set_index('date').T.reset_index().rename(columns={'index': 'food'}).rename_axis(None, axis=1)

Out[86]:

	food	2017-01-01	2017-01-01	2017-01-01
0	beef	1001.2	1129.0	1067.7
1	ofada	377.4	472.3	319.9
2	cafish	899.6	916.5	919.8

Sigh. This is so much more complicated than the original answer.

In [ ]:

Transposing data in pandas (switching columns and rows)¶

Simple transposition¶

Transpose but fix column issue with .set_index¶

Transpose but fix column issue with `.set_index`¶