I want to select all rows from a pandas DataFrame where one column contains a given substring or matches a string pattern (e.g. 'abc*'). Which function should I use?

1 Answer

Best answer

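Series.str.contains() works much like SQL's LIKE operator: it returns a boolean mask that is True wherever the pattern occurs in the string, and you can use that mask to filter the DataFrame. By default the pattern is interpreted as a regular expression. You can use it as follows: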

>>> import pandas as pd
>>> df = pd.DataFrame({'A': ['apple1', 'apple2', 'banana', 'grape', 'apple3'], 'B': [i*2 for i in range(1, 6)], 'C': [i*3 for i in range(1, 6)]})
>>> df
        A   B   C
0  apple1   2   3
1  apple2   4   6
2  banana   6   9
3   grape   8  12
4  apple3  10  15
>>> df['A'].str.contains('apple')
0     True
1     True
2    False
3    False
4     True
Name: A, dtype: bool
>>> df[df['A'].str.contains('apple')]
        A   B   C
0  apple1   2   3
1  apple2   4   6
4  apple3  10  15
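A few variations may be useful for pattern-style matching. Note that a pattern such as 'abc*' is read as a regular expression ('ab' followed by zero or more 'c'), not as a shell-style wildcard, so for a LIKE 'abc%' style prefix match str.startswith() or an anchored regex is probably closer to what you want. The following is only a sketch: the None entry and the variable names are made up for illustration, and na=False is one possible way to treat missing values (here, as non-matches).

```python
import pandas as pd

# Illustrative data; the None entry is added here to show missing-value handling
df = pd.DataFrame({
    'A': ['apple1', 'apple2', 'banana', 'grape', 'apple3', None],
    'B': [i * 2 for i in range(1, 7)],
})

# Literal substring match (regex=False); na=False treats missing values as non-matches
literal = df[df['A'].str.contains('apple', regex=False, na=False)]

# Prefix match, roughly the equivalent of SQL's LIKE 'apple%'
prefix = df[df['A'].str.startswith('apple', na=False)]

# Regular expression: values ending in the digit 1 or 2
regex = df[df['A'].str.contains(r'[12]$', na=False)]

# Case-insensitive match
nocase = df[df['A'].str.contains('APPLE', case=False, na=False)]
```

Without na=False, a missing value in the column produces a missing entry in the mask, and indexing the DataFrame with such a mask can raise an error.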


...