
I have a NumPy array that contains several duplicate rows. How can I remove the duplicates so that each row appears only once?

E.g.

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])

1 Answer


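You can use numpy.unique() to keep only the unique rows. By default it flattens the array and returns unique elements, so to compare whole rows you need to pass axis=0 (available in NumPy 1.13 and later). The unique rows are returned in sorted order.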

>>> import numpy as np
>>> data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 2, 3], [4, 5, 6]])
>>> data
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])
>>> np.unique(data, axis=0)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
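
If the rows need to stay in their order of first appearance rather than the sorted order np.unique() returns, one option is to ask for the first-occurrence indices with return_index=True and index the original array with them. This is only a minimal sketch: the helper name unique_rows_in_order and the sample data are made up for illustration and are not part of NumPy or the original question.

```python
import numpy as np

def unique_rows_in_order(arr):
    """Return the unique rows of a 2-D array in first-occurrence order.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # return_index=True also returns, for each unique row, the index of
    # its first occurrence in the original array.
    _, first_idx = np.unique(arr, axis=0, return_index=True)
    # Sorting those indices restores the original order of appearance.
    return arr[np.sort(first_idx)]

# Sample data chosen so that sorted order differs from original order.
data = np.array([[7, 8, 9], [1, 2, 3], [7, 8, 9], [4, 5, 6], [1, 2, 3]])
result = unique_rows_in_order(data)
print(result)
```

Plain np.unique(data, axis=0) would list [1, 2, 3] first for this input, while the helper keeps [7, 8, 9] first because that row appears first in data.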

...