Can I apply multiple aggregate functions (e.g., sum, min, max) to rows/columns of a Pandas dataframe at once? If yes, which function should I use?

1 Answer


Yes. Use the dataframe's aggregate() method (agg() is an alias) and pass it a list of aggregate functions to apply all of them at once. By default the functions are applied to each column; pass axis=1 to apply them to each row instead. To apply different aggregate functions to different columns, pass a dictionary that maps each column name to its list of functions.

Here are examples:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A':[1,2,3,4,5], 'B':[11,12,13,14,15], 'C':[21,22,23,24,25]})
>>> df
   A   B   C
0  1  11  21
1  2  12  22
2  3  13  23
3  4  14  24
4  5  15  25

Apply sum and min to all columns

>>> df.aggregate([np.sum, np.min])
       A   B    C
sum   15  65  115
amin   1  11   21
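
The amin label above comes from the name of the NumPy function, and the exact labels (and any warnings) produced by NumPy callables may differ between pandas and NumPy versions. As a minimal sketch, the same aggregation can also be requested with string names, which label the rows sum and min. The variable name result is only illustrative.

```python
import pandas as pd

# Same dataframe as in the examples above.
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [11, 12, 13, 14, 15],
                   'C': [21, 22, 23, 24, 25]})

# String names select the dataframe's own sum/min reductions,
# so no NumPy import is needed; agg() is an alias of aggregate().
result = df.agg(['sum', 'min'])
print(result)
```

Each string becomes the row label of the corresponding result.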

Apply sum, min, and max to all columns

>>> df.aggregate([np.sum, np.min, np.max])
       A   B    C
sum   15  65  115
amin   1  11   21
amax   5  15   25
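
The question also asks about rows. The examples above aggregate each column, which is the default (axis=0). A minimal sketch of the row-wise variant, assuming the same dataframe, passes axis=1 so that each row is reduced across its columns. The variable name row_stats is only illustrative.

```python
import pandas as pd

# Same dataframe as in the examples above.
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [11, 12, 13, 14, 15],
                   'C': [21, 22, 23, 24, 25]})

# axis=1 applies every function to each row instead of each column.
# The result has one row per original row and one column per function.
row_stats = df.agg(['sum', 'min', 'max'], axis=1)
print(row_stats)

# For example, row 0 holds 1, 11 and 21, so its sum is 33.
print(row_stats.loc[0, 'sum'])
```

With axis=1 the function names become column labels rather than row labels.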

Apply different aggregate functions to different columns

>>> df.aggregate({'A': [np.sum, np.min], 'B': [np.min, np.max], 'C': [np.sum, np.max]})
         A     B      C
sum   15.0   NaN  115.0
amin   1.0  11.0    NaN
amax   NaN  15.0   25.0

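NaN means that the aggregate function was not applied to that column. For example, in the code above, sum was not requested for column 'B', so the cell in row sum and column 'B' is NaN. Because NaN is a floating-point value, the remaining results are shown as floats (15.0 instead of 15).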
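
If only one aggregate function is needed per column, the NaN cells can be avoided by mapping each column to a single function instead of a list. This is a minimal sketch, assuming the same dataframe and string function names; the variable names per_column and mixed are only illustrative.

```python
import pandas as pd

# Same dataframe as in the examples above.
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [11, 12, 13, 14, 15],
                   'C': [21, 22, 23, 24, 25]})

# One function per column: the result is a Series indexed by column
# name, so there are no empty (NaN) cells.
per_column = df.agg({'A': 'sum', 'B': 'min', 'C': 'max'})
print(per_column)

# Lists of functions per column: the result is a dataframe, and any
# function/column pair that was not requested is NaN.
mixed = df.agg({'A': ['sum', 'min'], 'B': ['min', 'max'], 'C': ['sum', 'max']})
print(mixed.loc['sum', 'B'])  # sum was not requested for 'B'
```

The first form returns a Series and the second a dataframe, so choose based on whether every column needs the same set of statistics.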


...