+3 votes
in Programming Languages by (15.7k points)
I have a Pandas DataFrame with just one column. The column has some repeated values. How can I find the count of each unique element in the dataframe?

1 Answer

+1 vote
by (45.8k points)

You can group by the column with groupby() and then call transform(len) on the result. Because transform() returns one value per row, each row receives the size of the group its value belongs to.

Here is an example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A':[1,1,2,2,1,2,3,2,6,5,4,4,6,4,5,6]})
>>> df
    A
0   1
1   1
2   2
3   2
4   1
5   2
6   3
7   2
8   6
9   5
10  4
11  4
12  6
13  4
14  5
15  6

>>> df['size'] = df.groupby('A')['A'].transform(len)
>>> df
    A  size
0   1     3
1   1     3
2   2     4
3   2     4
4   1     3
5   2     4
6   3     1
7   2     4
8   6     3
9   5     2
10  4     3
11  4     3
12  6     3
13  4     3
14  5     2
15  6     3

The new column 'size' holds, for each row, the number of times that row's value appears in column A. If you want to convert the above dataframe to a dictionary with unique elements as keys and their counts as values, you can call to_dict('split') on it and build the dictionary from the resulting rows:

>>> {v[0]:v[1] for v in df.to_dict('split')['data']}
{1: 3, 2: 4, 3: 1, 6: 3, 5: 2, 4: 3}
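
If you only need the counts and not the extra column, a shorter route is probably value_counts() on the column, or groupby().size(). The snippet below is a minimal sketch that assumes the same df as above; the names counts, counts_dict and sizes are only illustrative.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({'A': [1, 1, 2, 2, 1, 2, 3, 2, 6, 5, 4, 4, 6, 4, 5, 6]})

# value_counts() returns a Series indexed by the unique values,
# ordered from the most frequent value to the least frequent
counts = df['A'].value_counts()

# Plain dictionary of unique value -> count
counts_dict = counts.to_dict()

# groupby().size() gives the same counts, ordered by the values themselves
sizes = df.groupby('A').size()

print(counts_dict)
print(sizes)
```

Both calls return one entry per unique value instead of one per row, so there is no need to add a column and collapse the duplicates afterwards.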