I created a copy of a dataframe using "=" (e.g., df2=df1).

When I change any value in df1, the corresponding value in df2 also changes. How can I prevent this? Is there a way to create a copy of a dataframe so that changing a value in one dataframe leaves the other unchanged?

1 Answer


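You can use the dataframe's copy() method with the parameter "deep=True". Assignment with "=" does not copy anything: it only makes df2 a second name for the same object, which is why changes made through df1 also show up in df2. copy() instead creates a new object with its own copy of the original dataframe's indices and data.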

"deep=True" is also the default value of this parameter, so df1.copy() behaves the same way. With "deep=True", modifications to the data or indices of the copy are not reflected in the original object, and vice versa.
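
To make the difference between "=" and copy() concrete, here is a minimal sketch on a tiny frame. The names alias and clone, and the sample values, are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Small illustrative frame (values chosen arbitrarily for this sketch)
df1 = pd.DataFrame({"A": [1, -2, 3], "B": [-4, 5, -6]})

alias = df1          # plain assignment: a second name for the same object
clone = df1.copy()   # deep=True is the default: independent data and index

print(alias is df1)  # True, both names refer to one DataFrame
print(clone is df1)  # False, the copy is a separate object

df1[df1 < 0] = 0     # replace negative values in the original

print(alias.loc[1, "A"])  # 0, the alias sees the change
print(clone.loc[1, "A"])  # -2, the copy keeps the original value
```

The longer session that follows applies the same idea to a frame of random values.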

>>> import pandas as pd
>>> import numpy as np
>>> df1 = pd.DataFrame(np.random.randn(10,3), columns=['A','B','C'])
>>> df1
          A         B         C
0  1.170990  0.006632  1.119248
1  1.070425  1.143718 -0.449758
2 -2.178855  0.277188  0.805898
3 -1.568562 -0.231959 -0.087659
4  1.939483  0.182899 -0.308923
5  0.451356  0.920912 -0.693982
6 -0.433843 -0.908450 -0.808124
7 -0.641044 -1.359938 -0.930583
8 -0.908536 -0.025954  1.316114
9  0.036994  0.789888 -0.202191
>>> df2=df1.copy(deep=True)
>>> df2
          A         B         C
0  1.170990  0.006632  1.119248
1  1.070425  1.143718 -0.449758
2 -2.178855  0.277188  0.805898
3 -1.568562 -0.231959 -0.087659
4  1.939483  0.182899 -0.308923
5  0.451356  0.920912 -0.693982
6 -0.433843 -0.908450 -0.808124
7 -0.641044 -1.359938 -0.930583
8 -0.908536 -0.025954  1.316114
9  0.036994  0.789888 -0.202191


Now I change all negative values to 0 in df1. The values in df2 will remain unchanged.

>>> df1[df1<0]=0
>>> df1
          A         B         C
0  1.170990  0.006632  1.119248
1  1.070425  1.143718  0.000000
2  0.000000  0.277188  0.805898
3  0.000000  0.000000  0.000000
4  1.939483  0.182899  0.000000
5  0.451356  0.920912  0.000000
6  0.000000  0.000000  0.000000
7  0.000000  0.000000  0.000000
8  0.000000  0.000000  1.316114
9  0.036994  0.789888  0.000000
>>> df2
          A         B         C
0  1.170990  0.006632  1.119248
1  1.070425  1.143718 -0.449758
2 -2.178855  0.277188  0.805898
3 -1.568562 -0.231959 -0.087659
4  1.939483  0.182899 -0.308923
5  0.451356  0.920912 -0.693982
6 -0.433843 -0.908450 -0.808124
7 -0.641044 -1.359938 -0.930583
8 -0.908536 -0.025954  1.316114
9  0.036994  0.789888 -0.202191


...