# Python: Compute the sum of values in all rows for a column in a compressed sparse matrix

I have a compressed sparse row (CSR) matrix whose values are all 1 or 0, and most of its elements are 0. I want to count how many 1s there are in a particular column. How can I do this?

NumPy's sum() function can be applied to a single column of the matrix to add up the values in all of its rows. Because the matrix contains only 0s and 1s, that sum equals the number of 1s in the column.

Here is an example:

For the following 7x7 sparse matrix, I compute the sum over all rows of column 6 (zero-based index, so the last column).

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> r = np.array([0, 0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6])
>>> c = np.array([0, 3, 4, 1, 3, 5, 6, 3, 1, 6, 0, 1, 3])
>>> data = np.array([1]*len(r))
>>> X = csr_matrix((data, (r, c)), shape=(7, 7))
>>> X
<7x7 sparse matrix of type '<class 'numpy.int64'>'
with 13 stored elements in Compressed Sparse Row format>
>>> X.toarray()
array([[1, 0, 0, 1, 0, 0, 0],
       [0, 1, 0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0, 1, 1],
       [0, 0, 0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0, 0, 0]])
>>> col=6
>>> np.sum(X[:,col])
2
>>>
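
As a variation on the session above, here is a minimal sketch that uses the sparse matrix's own sum() method instead of np.sum(), and also computes the counts for every column in one call. It rebuilds the same 7x7 matrix; the names count_ones and col_sums are only illustrative, and the approach assumes the stored values are exactly 0 or 1, so that a sum is the same as a count.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Rebuild the 7x7 example matrix from the session above.
r = np.array([0, 0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6])
c = np.array([0, 3, 4, 1, 3, 5, 6, 3, 1, 6, 0, 1, 3])
data = np.ones(len(r), dtype=np.int64)
X = csr_matrix((data, (r, c)), shape=(7, 7))

col = 6

# Sum a single column with the sparse matrix's own sum() method.
count_ones = int(X[:, col].sum())

# Sum every column at once; axis=0 collapses the rows.
# np.asarray(...).ravel() flattens the result to a plain 1-D array.
col_sums = np.asarray(X.sum(axis=0)).ravel()

print(count_ones)
print(col_sums)
```

If many columns need to be checked, computing col_sums once and indexing into it is likely cheaper than slicing the CSR matrix column by column, since CSR storage is organized by row.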