
I am using the PubChem API to fetch the chemical name for a given SMILES string. I am accessing the API with the following Python code.

import json
import requests

# Raw string so the backslash in the SMILES is kept literally
smi = r"CC(C)NC(=O)C=1C=C(C=CC1/N=C/2\CN(CCN2C)C)[N+](=O)[O-]"
pubchem_url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/" + str(smi) + "/synonyms/JSON"
f = requests.get(pubchem_url).text
chem_df = json.loads(f)
chem_name = chem_df['InformationList']['Information'][0]['Synonym'][0]

However, for this SMILES string, the code above fails with an error. When I opened the URL in a browser, it returned the following message. How can I fix it?

{
  "Fault": {
    "Code": "PUGREST.BadRequest",
    "Message": "Unable to standardize the given structure - perhaps some special characters need to be escaped or data packed in a MIME form?",
    "Details": [
      "error: ",
      "status: 400",
      "output: Caught ncbi::CException: Standardization failed",
      "Output Log:",
      "Record 1: Warning: Cactvs Ensemble cannot be created from input string",
      "Record 1: Error: Unable to convert input into a compound object",
      "",
      ""
    ]
  }
}

1 Answer

Best answer

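The SMILES string you are using contains forward slashes (/) and a backslash (\). When the string is placed directly in the URL path, these characters are not escaped, and a forward slash in particular is read as a path separator, so it seems that PubChem receives a malformed structure and cannot standardize it. Instead of embedding the SMILES string in the URL path, pass it as a query parameter through the params argument of requests.get(), which percent-encodes it for you.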

Here is the example code to fix the error:

>>> import requests
>>> import json
>>> # Raw string so the backslash in the SMILES is kept literally
>>> smi = r"CC(C)NC(=O)C=1C=C(C=CC1/N=C/2\CN(CCN2C)C)[N+](=O)[O-]"
>>> # The SMILES is no longer part of the URL path
>>> pubchem_url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/synonyms/JSON"
>>> # requests percent-encodes the value and appends it as ?smiles=...
>>> f = requests.get(pubchem_url, params={'smiles': smi})
>>> chem_df = json.loads(f.text)
>>> chem_name = chem_df['InformationList']['Information'][0]['Synonym'][0]
>>> chem_name
'CHEMBL3342035'
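
To see why passing the SMILES as a query parameter helps, here is a minimal offline sketch using only the standard library. It assumes that requests encodes the params dictionary in a comparable way (it is an illustration, not the exact code path inside requests), and the names query and decoded are only for this example.

```python
from urllib.parse import urlencode, parse_qs

# Raw string so the backslash in the SMILES is kept literally.
smi = r"CC(C)NC(=O)C=1C=C(C=CC1/N=C/2\CN(CCN2C)C)[N+](=O)[O-]"

# Percent-encode the SMILES as a query-string parameter.
# Characters such as "/" and "\" are replaced by %XX escapes.
query = urlencode({"smiles": smi})
print(query)

# Decoding the query string recovers the original SMILES unchanged,
# which is what the server does before interpreting the structure.
decoded = parse_qs(query)["smiles"][0]
print(decoded == smi)
```

In the printed query string the forward slashes appear as %2F and the backslash as %5C, so they can no longer be mistaken for path separators.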

