+4 votes
in Programming Languages by (56.6k points)
Which function should I use to query the columns of a Pandas DataFrame?

E.g. if a DataFrame has columns "age" and "salary", how can I select salaries of a particular age?

1 Answer

+2 votes
by (348k points)
Best answer

A Pandas DataFrame has a query() method that selects rows by evaluating a boolean expression written in terms of the DataFrame's column names.

Here is an example:

The following DataFrame has three columns: id, age, and salary. query() takes the boolean expression as a string and returns only the rows for which the expression is true.

>>> import pandas as pd
>>> df = pd.DataFrame({'id': [11, 12, 13, 14, 15, 16, 17], 'age': [21, 43, 12, 54, 23, 76, 34], 'salary': [4234, 4321, 654, 2342, 65456, 6453, 12334]})
>>> df
   id  age  salary
0  11   21    4234
1  12   43    4321
2  13   12     654
3  14   54    2342
4  15   23   65456
5  16   76    6453
6  17   34   12334
>>> df.query('salary > 2345 and age < 50')
   id  age  salary
0  11   21    4234
1  12   43    4321
4  15   23   65456
6  17   34   12334
>>> df.query('salary > 12345 and age > 45')
Empty DataFrame
Columns: [id, age, salary]
Index: []
>>> df.query('salary > 5345 and age > 45')
   id  age  salary
5  16   76    6453
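
To get the salaries for one particular age, as asked in the question, you can filter on age with query() and then pick the salary column from the result. The sketch below reuses the same DataFrame; target_age is a hypothetical variable name introduced for illustration, and the '@' prefix is how a query() expression refers to a Python variable from the surrounding scope.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [11, 12, 13, 14, 15, 16, 17],
    'age': [21, 43, 12, 54, 23, 76, 34],
    'salary': [4234, 4321, 654, 2342, 65456, 6453, 12334],
})

# Hypothetical value chosen for illustration
target_age = 43

# '@target_age' refers to the Python variable, not to a column
salaries = df.query('age == @target_age')['salary']

# The same selection written as a boolean mask, for comparison
salaries_mask = df.loc[df['age'] == target_age, 'salary']

print(salaries)
```

Both forms return a Series that keeps the original row index. query() tends to read better when several conditions are combined, while the boolean mask avoids parsing a string expression.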


...