How do I convert a pandas DataFrame to a JSON string and save it in a gzip-compressed JSON file?

1 Answer


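You can use the pandas DataFrame.to_json() method to convert a DataFrame to a JSON string. If you pass a file path as the first argument, the JSON is written to that file; if you omit it, the method returns the JSON string instead. The 'compression' parameter controls on-the-fly compression of the output file. With the default compression='infer', pandas chooses gzip, bz2, zip, or xz when the filename ends in '.gz', '.bz2', '.zip', or '.xz', respectively. You can also set it explicitly, for example compression='gzip', in which case the file extension does not matter.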
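As a minimal sketch of the two behaviors described above: calling to_json() without a path returns the JSON string, while a path ending in '.gz' lets the default compression='infer' select gzip. The file name people.json.gz and the temporary directory are illustrative assumptions, not part of the original answer.

```python
import gzip
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["AA", "BB"], "age": [34, 12]})

# No path given: to_json() returns the JSON text as a Python string.
json_str = df.to_json(orient="records")

# Path ending in ".gz": the default compression="infer" selects gzip.
# (The file name and temp directory are made up for this sketch.)
path = os.path.join(tempfile.mkdtemp(), "people.json.gz")
df.to_json(path, orient="records")

# A gzip file starts with the magic bytes 0x1f 0x8b.
with open(path, "rb") as f:
    is_gzip = f.read(2) == b"\x1f\x8b"

# Decompress with the standard library to inspect what was written.
with gzip.open(path, "rt", encoding="utf-8") as f:
    file_text = f.read()

print(json_str)
print(is_gzip, file_text == json_str)
```

Checking the magic bytes is just a quick way to confirm that the file on disk is really compressed and not plain JSON with a misleading name.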

The following example shows how to write a DataFrame to a gzip JSON file and how to read it back:

import pandas as pd

df = pd.DataFrame({"name": ['AA', 'BB', 'CC', 'DD', 'EE', 'HH', 'II'], "age": [34, 12, 56, 43, 23, 41, 52]})

# write dataframe to a gzip JSON file
filename = "testfile.json.gz"
df.to_json(filename, orient='records', compression='gzip')

# read the gzip JSON file back into a dataframe and print it
df1 = pd.read_json(filename, compression='gzip')
print(df1)

The above code prints the following output:

  name  age
0   AA   34
1   BB   12
2   CC   56
3   DD   43
4   EE   23
5   HH   41
6   II   52
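If you need the JSON string itself before saving it, one possible alternative is to build the string with to_json() and compress it yourself with Python's standard gzip module. This is only a sketch under assumptions: the file name manual.json.gz and the temporary directory are hypothetical, and the data is a shortened version of the example above.

```python
import gzip
import io
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["AA", "BB", "CC"], "age": [34, 12, 56]})

# Step 1: convert the DataFrame to a JSON string (no path argument).
json_str = df.to_json(orient="records")

# Step 2: write the string through gzip in text mode.
# (The file name and temp directory are made up for this sketch.)
path = os.path.join(tempfile.mkdtemp(), "manual.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as f:
    f.write(json_str)

# Read it back with pandas; compression is inferred from the ".gz" suffix.
df_back = pd.read_json(path)

# Or decompress manually and parse the string from an in-memory buffer.
with gzip.open(path, "rt", encoding="utf-8") as f:
    df_back2 = pd.read_json(io.StringIO(f.read()))

print(df_back)
```

Both read paths should give the same rows and columns; the manual route is mainly useful when the JSON string is also needed elsewhere, for example to send over a network.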