I have a NumPy array that contains several duplicate rows. How can I remove the duplicate rows?

E.g.

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])


Best answer

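You can use numpy.unique() to keep only the unique rows. By default it flattens the array and returns the unique elements, so pass axis=0 to make it compare whole rows instead. Note that the result comes back in sorted order, not in order of first appearance.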

>>> import numpy as np
>>> data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 2, 3], [4, 5, 6]])
>>> data
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])
>>> np.unique(data, axis=0)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
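
If you also need the rows to stay in the order they first appear, one possible approach is to ask numpy.unique() for the index of each first occurrence with return_index=True and then sort those indices. The sketch below is only an illustration: the helper name unique_rows_keep_order and the sample array are made up for this example and are not part of NumPy.

```python
import numpy as np


def unique_rows_keep_order(arr):
    """Hypothetical helper: drop duplicate rows, keeping first-appearance order."""
    # return_index=True also returns, for each unique row, the index of its
    # first occurrence in the original array.
    _, first_idx = np.unique(arr, axis=0, return_index=True)
    # Sorting those indices restores the original ordering of the kept rows.
    return arr[np.sort(first_idx)]


# Sample data chosen so that sorted order and first-appearance order differ.
data = np.array([[7, 8, 9], [1, 2, 3], [7, 8, 9], [4, 5, 6], [1, 2, 3]])
deduped = unique_rows_keep_order(data)
print(deduped.tolist())  # [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
```

Plain np.unique(data, axis=0) on the same array would return the rows sorted as [1, 2, 3], [4, 5, 6], [7, 8, 9], so this variant is only worth using when the original order matters.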