I have a NumPy array that contains several duplicate rows. How can I remove the duplicates so that each row appears only once?

E.g.

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])

1 Answer

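You can use numpy.unique() to keep only the unique rows. By default it flattens the array and returns unique scalar values, so to compare whole rows you need to pass axis=0 (available since NumPy 1.13). Note that the result comes back sorted, so the original row order is not necessarily preserved.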

>>> import numpy as np
>>> data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 2, 3], [4, 5, 6]])
>>> data
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])
>>> np.unique(data, axis=0)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
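
If the order of first appearance matters, one possible approach is to ask np.unique() for the index of each row's first occurrence via return_index=True and then sort those indices. The sketch below is only an illustration: the helper name unique_rows_keep_order and the sample array are made up for this example and are not part of NumPy.

```python
import numpy as np


def unique_rows_keep_order(arr):
    """Drop duplicate rows, keeping rows in order of first appearance.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # return_index=True also returns, for each unique row, the index of
    # its first occurrence in the original array.
    _, first_idx = np.unique(arr, axis=0, return_index=True)
    # Sorting those indices restores the original relative order.
    return arr[np.sort(first_idx)]


# Sample data chosen so that sorted order differs from original order.
sample = np.array([[7, 8, 9], [1, 2, 3], [7, 8, 9], [4, 5, 6]])

print(np.unique(sample, axis=0))       # rows come back sorted
print(unique_rows_keep_order(sample))  # rows keep first-seen order
```

With this sample, plain np.unique(sample, axis=0) starts with [1, 2, 3], while the helper starts with [7, 8, 9], the row that appeared first in the input.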


...