
My compressed sparse matrix has many 0s and a few 1s. How can I find the indices of 1s in each row of the matrix?

array([[1, 0, 1],
       [0, 0, 1],
       [1, 1, 1]])

1 Answer


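You can use scipy.sparse.find(), tolil(), or the indices attribute to get the indices of the nonzero elements of a CSR matrix. find() returns the row indices, column indices, and values of the nonzero elements for the whole matrix, while tolil().rows and the indices attribute of each row give the column indices row by row.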

The following example shows how to use each of them in Python 3:

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> row = np.array([0, 0, 1, 2, 2, 2])
>>> col = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 1, 1, 1, 1, 1])
>>> X = csr_matrix((data, (row, col)), shape=(3, 3))
>>> X.toarray()
array([[1, 0, 1],
       [0, 0, 1],
       [1, 1, 1]], dtype=int32)

Approach 1
>>> idx = []
>>> for v in X:  # iterating a CSR matrix yields one 1-row matrix per row
...     idx.append(v.indices)

>>> idx  #nonzero indices in each row
[array([0, 2], dtype=int32), array([2], dtype=int32), array([0, 1, 2], dtype=int32)]
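
Looping over the rows can be slow for large matrices. A possible alternative, sketched here, splits the matrix's own indices array at the row boundaries stored in indptr. The helper name row_indices is hypothetical and not part of SciPy. Note that indices lists every stored entry, so if the matrix might hold explicitly stored zeros, calling eliminate_zeros() first would be needed.

```python
import numpy as np
from scipy.sparse import csr_matrix

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 1, 1, 1, 1, 1])
X = csr_matrix((data, (row, col)), shape=(3, 3))


def row_indices(mat):
    """Hypothetical helper: one array of stored column indices per row."""
    mat = mat.tocsr()
    # Row i occupies mat.indices[indptr[i]:indptr[i + 1]], so splitting at
    # the interior indptr values yields one chunk per row.
    return np.split(mat.indices, mat.indptr[1:-1])


idx = row_indices(X)
print([a.tolist() for a in idx])
```

This avoids creating a temporary one-row matrix for every row, at the cost of relying on the CSR layout directly.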

Approach 2
>>> X.tolil().rows #nonzero indices in each row
array([list([0, 2]), list([2]), list([0, 1, 2])], dtype=object)

Approach 3
>>> from scipy import sparse
>>> sparse.find(X)
(array([0, 2, 2, 0, 1, 2], dtype=int32), array([0, 0, 1, 2, 2, 2], dtype=int32), array([1, 1, 1, 1, 1, 1], dtype=int32))
>>> list(zip(sparse.find(X)[0], sparse.find(X)[1])) #nonzero indices in the whole matrix
[(0, 0), (2, 0), (2, 1), (0, 2), (1, 2), (2, 2)]
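
Since find() reports the whole matrix at once, its output still has to be grouped to answer the per-row question. A minimal sketch of one way to do that follows; the variable name per_row is an assumption for illustration. It also filters on the value 1, which matters only if the matrix could contain other nonzero values.

```python
import numpy as np
from scipy import sparse
from scipy.sparse import csr_matrix

X = csr_matrix(np.array([[1, 0, 1],
                         [0, 0, 1],
                         [1, 1, 1]]))

# Call find() once and unpack row indices, column indices, and values.
rows, cols, vals = sparse.find(X)

# Collect the column index of every entry equal to 1 under its row.
per_row = [[] for _ in range(X.shape[0])]
for r, c, v in zip(rows, cols, vals):
    if v == 1:
        per_row[r].append(int(c))

# find() does not promise a row-major order, so sort each row's columns.
per_row = [sorted(cs) for cs in per_row]
print(per_row)
```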

...