What is the most Pythonic way to select the first k rows and the last k rows of a pandas DataFrame?

1 Answer

Use head(k) to get the first k rows of a DataFrame and tail(k) to get the last k rows. Both methods take the row count as the parameter n, which defaults to 5.
You can also select the same rows by position with iloc slicing: df.iloc[:k] for the first k rows and df.iloc[-k:] for the last k rows.

Example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'a': np.random.random(5), 'b': np.random.random(5)})
>>> df
          a         b
0  0.398201  0.996588
1  0.687678  0.609697
2  0.865939  0.350159
3  0.037772  0.000470
4  0.383119  0.181807
>>> df.head(n=2)  # first 2 rows
          a         b
0  0.398201  0.996588
1  0.687678  0.609697
>>> df.tail(n=2)  # last 2 rows
          a         b
3  0.037772  0.000470
4  0.383119  0.181807

Using iloc:

>>> df.iloc[:2, :]  # first 2 rows
          a         b
0  0.398201  0.996588
1  0.687678  0.609697
>>> df.iloc[-2:, :]  # last 2 rows
          a         b
3  0.037772  0.000470
4  0.383119  0.181807
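
If you want both ends in a single DataFrame, one possible approach is to concatenate the two selections with pd.concat. The sketch below is only an illustration: the helper name first_and_last is made up for this answer and is not part of pandas, and returning the whole frame when the two slices would overlap is an assumption about what you want in that case.

```python
import numpy as np
import pandas as pd


def first_and_last(df: pd.DataFrame, k: int) -> pd.DataFrame:
    """Hypothetical helper: return the first k and last k rows of df."""
    if k <= 0:
        # df.iloc[-0:] would select every row, so handle k <= 0 explicitly
        # and return an empty frame with the same columns.
        return df.iloc[0:0]
    if 2 * k >= len(df):
        # The two slices would overlap; return all rows once
        # rather than duplicating the middle rows (an assumed policy).
        return df.copy()
    # head(k) and tail(k) do not overlap here, so concat keeps row order.
    return pd.concat([df.head(k), df.tail(k)])


# Deterministic sample data so the selected rows are easy to check.
df = pd.DataFrame({'a': np.arange(5), 'b': np.arange(5) * 10})
print(first_and_last(df, 2))
```

Note the k = 0 case: df.head(0) and df.tail(0) both return an empty frame, but the slice df.iloc[-0:] is the same as df.iloc[0:], which returns every row.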


...