# How to select some of the rows from a CSR (Compressed Sparse Row) matrix in Python

I have an MxN sparse matrix and want to randomly select some of its rows. How can I do that?

If your CSR matrix is X and the rows you want are listed in n = [n1, n2, n3, ..., nk], then X[n] returns a new CSR matrix containing just those rows, in the order given. For example:

    >>> import numpy as np
    >>> from scipy.sparse import csr_matrix
    >>> row = np.array([0, 0, 1, 2, 2, 2, 3, 3, 4, 5, 6, 6, 7, 7, 8, 8, 8, 9])
    >>> col = np.array([0, 4, 3, 2, 3, 1, 1, 4, 2, 3, 1, 4, 3, 2, 2, 3, 1, 4])
    >>> data = np.array([1] * len(row))
    >>> X = csr_matrix((data, (row, col)), shape=(10, 5))
    >>> X
    <10x5 sparse matrix of type '<type 'numpy.int32'>'
        with 18 stored elements in Compressed Sparse Row format>
    >>> X.toarray()
    array([[1, 0, 0, 0, 1],
           [0, 0, 0, 1, 0],
           [0, 1, 1, 1, 0],
           [0, 1, 0, 0, 1],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 1, 0],
           [0, 1, 0, 0, 1],
           [0, 0, 1, 1, 0],
           [0, 1, 1, 1, 0],
           [0, 0, 0, 0, 1]])
    >>> c = [2, 3, 4]
    >>> X[c]
    <3x5 sparse matrix of type '<type 'numpy.int32'>'
        with 6 stored elements in Compressed Sparse Row format>
    >>> X[c].toarray()
    array([[0, 1, 1, 1, 0],
           [0, 1, 0, 0, 1],
           [0, 0, 1, 0, 0]])
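The example above picks a fixed list of rows, while the question asks for a random selection. One possible way to get the random indices is NumPy's `Generator.choice` with `replace=False`, sketched below. The helper name `sample_rows` and its `k` and `seed` parameters are made up for illustration and are not part of SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sample_rows(X, k, seed=None):
    """Return k distinct, randomly chosen rows of X and their original indices.

    Hypothetical helper for illustration only; not a SciPy function.
    """
    rng = np.random.default_rng(seed)
    # Draw k distinct row indices; sorting keeps the original row order.
    idx = np.sort(rng.choice(X.shape[0], size=k, replace=False))
    # Indexing a CSR matrix with an index array selects those rows.
    return X[idx], idx


# Same 10x5 matrix as in the example above.
row = np.array([0, 0, 1, 2, 2, 2, 3, 3, 4, 5, 6, 6, 7, 7, 8, 8, 8, 9])
col = np.array([0, 4, 3, 2, 3, 1, 1, 4, 2, 3, 1, 4, 3, 2, 2, 3, 1, 4])
data = np.ones(len(row), dtype=np.int64)
X = csr_matrix((data, (row, col)), shape=(10, 5))

sub, idx = sample_rows(X, k=3, seed=0)
print(idx)            # the three row indices that were drawn
print(sub.toarray())  # the corresponding rows as a dense array
```

Passing a `seed` makes the draw reproducible. Drop the `np.sort` call if you would rather get the rows back in random order, and use `replace=True` if the same row may be drawn more than once.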