+3 votes
in Programming Languages by (14.7k points)

I can create a copy of a pandas dataframe using the assignment operator. However, if I modify the copy, it also modifies the original dataframe. 

E.g.

>>> df
   A   B   C
0  1  11  21
1  2  12  22
2  3  13  23
3  4  14  24
>>> df2=df
>>> df2
   A   B   C
0  1  11  21
1  2  12  22
2  3  13  23
3  4  14  24
>>> df2.iat[1,2]=45
>>> df2
   A   B   C
0  1  11  21
1  2  12  45
2  3  13  23
3  4  14  24
>>> df
   A   B   C
0  1  11  21
1  2  12  45
2  3  13  23
3  4  14  24

Is there any way to make a copy of the dataframe so that modifications to the copy are not reflected in the source dataframe?

1 Answer

0 votes
by (25.6k points)

A pandas DataFrame has a copy() method that makes a copy of the dataframe. If you set its parameter deep=True, modifications to the copy are not reflected in the source dataframe. deep=True is also the default, so df.copy() alone is enough.

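When you set deep=False, a new DataFrame object is created without copying the source dataframe's data or index; only references to the data and index are copied. When you use the assignment operator, no new dataframe is created at all: df2=df just makes df2 another name for the same object, which is why the change made through df2 in your example also shows up in df.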

Here is an example:

>>> df
   A   B   C
0  1  11  21
1  2  12  22
2  3  13  23
3  4  14  24
>>> df2=df.copy(deep=True)
>>> df2
   A   B   C
0  1  11  21
1  2  12  22
2  3  13  23
3  4  14  24
>>> df2.iat[1,2]=55
>>> df2
   A   B   C
0  1  11  21
1  2  12  55
2  3  13  23
3  4  14  24
>>> df
   A   B   C
0  1  11  21
1  2  12  22
2  3  13  23
3  4  14  24
>>>

In the above example, I changed a value in df2, but it was not reflected in df.
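
The difference between assignment and copy() can also be checked with an identity test. Here is a minimal sketch that rebuilds the dataframe from the question; the names alias and deep are only illustrative and do not come from the original post.

```python
import pandas as pd

# Rebuild the 4x3 dataframe shown in the question.
df = pd.DataFrame({"A": [1, 2, 3, 4],
                   "B": [11, 12, 13, 14],
                   "C": [21, 22, 23, 24]})

alias = df        # plain assignment: a second name for the same object
deep = df.copy()  # deep=True is the default: the data and index are copied

print(alias is df)     # True, both names refer to one dataframe
print(deep is df)      # False, copy() returned a separate dataframe

deep.iat[1, 2] = 55    # modify only the deep copy
print(df.iat[1, 2])    # 22, the original is unchanged
print(deep.iat[1, 2])  # 55
```

Because alias is df evaluates to True, any change made through one name is visible through the other; the deep copy stays independent.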

...