Python: How to shuffle two related lists (training data and labels) in the same order

Question

In Python, how can I shuffle two related lists (training data and labels) so that both end up in the same order?

1 Answer

answered Oct 21, 2019 by pkumar81 (349k points)
selected Apr 11, 2023 by pkumar81

Best answer

You can try one of the following two approaches to shuffle both data and labels in the same order.

Approach 1: Use NumPy's np.random.permutation() with the number of elements in your data to generate a random ordering of the indices 0 to n-1. Index both the data and the labels with that same array so that every row stays paired with its label.

>>> import numpy as np
>>> X=np.array([[1.1,2.2,3.3,4.4],[1.2,2.3,3.4,4.5],[2.1,2.2,2.3,2.4],[3.1,3.2,3.3,3.4],[4.1,4.2,4.3,4.4]])
>>> X
array([[1.1, 2.2, 3.3, 4.4],
       [1.2, 2.3, 3.4, 4.5],
       [2.1, 2.2, 2.3, 2.4],
       [3.1, 3.2, 3.3, 3.4],
       [4.1, 4.2, 4.3, 4.4]])
>>> y=np.array([0,1,2,3,4])

>>> p = np.random.permutation(len(y))
>>> p
array([1, 0, 3, 4, 2])

>>> X_shuffled=X[p]
>>> X_shuffled
array([[1.2, 2.3, 3.4, 4.5],
       [1.1, 2.2, 3.3, 4.4],
       [3.1, 3.2, 3.3, 3.4],
       [4.1, 4.2, 4.3, 4.4],
       [2.1, 2.2, 2.3, 2.4]])

>>> y_shuffled=y[p]
>>> y_shuffled
array([1, 0, 3, 4, 2])
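
If you shuffle in several places, Approach 1 can be wrapped in a small helper. The following is only a sketch: the function name shuffle_in_unison and its seed parameter are illustrative and not part of NumPy, and it assumes both inputs are NumPy arrays with the same length along the first axis. It uses np.random.default_rng() so that a seed can be passed for reproducible results.

```python
import numpy as np

def shuffle_in_unison(X, y, seed=None):
    """Return shuffled copies of X and y that share one random ordering.

    Hypothetical helper for illustration; assumes X and y are NumPy
    arrays with the same number of elements along the first axis.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of elements")
    rng = np.random.default_rng(seed)   # seed=None gives a fresh random order
    p = rng.permutation(len(y))         # random ordering of 0 .. n-1
    return X[p], y[p]                   # the same index array is applied to both

X = np.array([[1.1, 2.2, 3.3, 4.4],
              [1.2, 2.3, 3.4, 4.5],
              [2.1, 2.2, 2.3, 2.4],
              [3.1, 3.2, 3.3, 3.4],
              [4.1, 4.2, 4.3, 4.4]])
y = np.array([0, 1, 2, 3, 4])

X_shuffled, y_shuffled = shuffle_in_unison(X, y, seed=42)
```

Because each label in this example equals its original row number, you can confirm that the pairing survived by checking that X_shuffled equals X[y_shuffled].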

Approach 2: You can also use the shuffle() function from sklearn.utils, which shuffles the data and labels in the same order in a single call. Passing random_state makes the result reproducible.

>>> from sklearn.utils import shuffle
>>> X_shuffled,y_shuffled = shuffle(X, y, random_state=0)
>>> X_shuffled
array([[2.1, 2.2, 2.3, 2.4],
       [1.1, 2.2, 3.3, 4.4],
       [1.2, 2.3, 3.4, 4.5],
       [3.1, 3.2, 3.3, 3.4],
       [4.1, 4.2, 4.3, 4.4]])
>>> y_shuffled
array([2, 0, 1, 3, 4])
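
Both approaches above work on NumPy arrays. If your data and labels are plain Python lists, as in the question title, the same idea can be sketched with the standard library alone: pair the items up, shuffle the pairs, and split them again. The function name shuffle_lists_together and its seed parameter are illustrative only and do not come from any library.

```python
import random

def shuffle_lists_together(data, labels, seed=None):
    """Return shuffled copies of two equal-length lists, keeping pairs aligned.

    Hypothetical helper for illustration; the input lists are not modified.
    """
    if len(data) != len(labels):
        raise ValueError("data and labels must have the same length")
    rng = random.Random(seed)            # private generator; seed is optional
    pairs = list(zip(data, labels))      # bind each item to its label
    rng.shuffle(pairs)                   # shuffle the pairs in place
    if not pairs:                        # zip(*[]) cannot be unpacked into two
        return [], []
    shuffled_data, shuffled_labels = zip(*pairs)
    return list(shuffled_data), list(shuffled_labels)

data = [[1.1, 2.2], [1.2, 2.3], [2.1, 2.2], [3.1, 3.2]]
labels = [0, 1, 2, 3]

data_shuffled, labels_shuffled = shuffle_lists_together(data, labels, seed=0)
```

This makes copies of both lists, which is fine for small datasets. For large numeric data, the NumPy indexing in Approach 1 is likely the better fit.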
