
How can I sort a pandas dataframe by one or more columns in descending order and then select the top n rows from the sorted dataframe?

E.g., sort the following dataframe by Age in descending order and then select the 5 rows with the largest Age.

  Name  Age   DOB
0   AA   29  1967
1   BB   12  1867
2   CC   45  1834
3   DD   34  1945
4   EE   38  2011
5   FF   17  1745
6   GG   22  1867
7   HH   12  1756
8   II   38  2000

1 Answer


You can use the sort_values() method to sort the dataframe by the 'Age' column. To sort in descending order, pass the argument ascending=False. Then apply iloc to the sorted dataframe to select its first 5 rows, which are the 5 rows with the largest Age.

>>> df.sort_values(by='Age',ascending=False)  # sort by Age in descending order
  Name  Age   DOB
2   CC   45  1834
4   EE   38  2011
8   II   38  2000
3   DD   34  1945
0   AA   29  1967
6   GG   22  1867
5   FF   17  1745
1   BB   12  1867
7   HH   12  1756
>>> df.sort_values(by='Age',ascending=False).iloc[:5,:]  # select top 5 rows from the sorted dataframe
  Name  Age   DOB
2   CC   45  1834
4   EE   38  2011
8   II   38  2000
3   DD   34  1945
0   AA   29  1967
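
For a version that runs on its own, here is a minimal sketch that rebuilds the dataframe from the question and shows two variations that may be useful: head(5) as a shorter equivalent of .iloc[:5,:], and nlargest() as an alternative way to get the same rows. It also sketches the multi-column case from the question; using DOB as the tie-breaker column is only an assumption for illustration, and the variable names (top5, top5_nlargest, top5_multi) are made up here.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "Name": ["AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH", "II"],
    "Age": [29, 12, 45, 34, 38, 17, 22, 12, 38],
    "DOB": [1967, 1867, 1834, 1945, 2011, 1745, 1867, 1756, 2000],
})

# Sort by Age in descending order and keep the first 5 rows.
# head(5) selects the same rows as .iloc[:5, :].
top5 = df.sort_values(by="Age", ascending=False).head(5)

# nlargest returns the 5 rows with the largest Age,
# already ordered from largest to smallest.
top5_nlargest = df.nlargest(5, "Age")

# Several columns: Age descending, then DOB ascending to break ties
# (DOB as the second key is an illustrative choice, not from the question).
top5_multi = df.sort_values(by=["Age", "DOB"], ascending=[False, True]).head(5)

print(top5)
print(top5_nlargest)
print(top5_multi)
```

When sorting by several columns, ascending can be a list with one flag per column, so each column gets its own sort direction.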

...