pubchem tools

is a small package for querying the PubChem APIs. For simple and easy access, one can use the pubchempy library. This package is neither as polished nor as complete as pubchempy, but it works well for things like finding CAS numbers, safety codes, and anything else one knows to be in the record. Key point: it works by pulling the full record once, not by querying the service every time one asks a question.

pubchem has two interfaces

Actually, the PUG-REST API is available in two ways:

  • pug
  • pug_view

and they provide slightly different facilities. With the former, one can get simple answers to questions like: what is the CID (the PubChem compound ID) of compound 'a'? The latter fetches the full record as a hierarchical JSON object.
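As an illustration of the two entry points, here is a minimal sketch using only the Python standard library. The function names (cid_url, record_url, fetch_json, fetch_record) are hypothetical and are not the names used inside this package; the URL patterns and the "IdentifierList" / "Record" keys follow the public PubChem REST layout as far as I know it, so treat them as assumptions to verify.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest"


def cid_url(name):
    """Build the pug URL that resolves a compound name to its CID(s)."""
    return f"{BASE}/pug/compound/name/{quote(name, safe='')}/cids/JSON"


def record_url(cid):
    """Build the pug_view URL that returns the full record of one CID."""
    return f"{BASE}/pug_view/data/compound/{int(cid)}/JSON"


def fetch_json(url, timeout=30):
    """Download a URL and parse the body as JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_record(name):
    """Double call: name -> first CID via pug, then CID -> record via pug_view."""
    cids = fetch_json(cid_url(name))["IdentifierList"]["CID"]
    return fetch_json(record_url(cids[0]))["Record"]
```

Calling fetch_record('aspirin') would perform both requests; nothing is fetched when the functions are merely defined.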

using only pug

Trying the various output formats of pug did not give a response as rich as the one from pug_view. Thus, the double call (pug first, then pug_view) is needed at the moment.

Handling the JSON hierarchy

The first step is to convert the internal lists of dicts into objects that are easier to navigate, e.g. by turning them into dicts. Then a crawler over the dict structure can find the information we need. We use dictDigUtils, a ca. 5 kB set of functions that help dig through the hierarchy returned by the Python json library.

The keys that hold lists are:

  • Section
  • Value
  • StringWithMarkup
  • Number
  • Boolean

Functions like clean_section help manage this.
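To make the idea concrete, here is a minimal sketch of such a clean-up pass. It is not the package's clean_section; the function name simplify is hypothetical, and the field names "TOCHeading", "String" and "Markup" are assumptions about the pug_view JSON layout. The sample record is a hand-made toy, not real PubChem output.

```python
def simplify(node):
    """Recursively collapse the list-of-dict layers of a pug_view style record."""
    if isinstance(node, list):
        return [simplify(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "Section" and isinstance(value, list):
            # Key each subsection by its heading so it can be addressed by name.
            for section in value:
                heading = section.get("TOCHeading", "untitled")
                rest = {k: v for k, v in section.items() if k != "TOCHeading"}
                out[heading] = simplify(rest)
        elif key == "StringWithMarkup" and isinstance(value, list):
            # Keep only the text and drop the formatting instructions.
            out[key] = [item.get("String", "") for item in value]
        elif key in ("Number", "Boolean") and isinstance(value, list) and len(value) == 1:
            # A one-element list carries a single scalar.
            out[key] = value[0]
        else:
            out[key] = simplify(value)
    return out


# Hand-made toy record for illustration only.
sample = {
    "RecordTitle": "toy compound",
    "Section": [
        {"TOCHeading": "Chemical and Physical Properties",
         "Section": [
             {"TOCHeading": "Molecular Weight",
              "Information": [{"Value": {"Number": [18.015], "Unit": "g/mol"}}]},
         ]},
    ],
}
flat = simplify(sample)
```

After this pass, the toy value can be reached by heading, e.g. flat['Chemical and Physical Properties']['Molecular Weight'].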

A Pubchem class is then built, which uses simple get_value calls to provide properties such as molecular_weight, molecular_formula, etc.

full record

is available as the _record_ private variable. Some parts are simplified during processing to remove formatting instructions, which are not needed here. You can always use:

a = Pubchem('your molecule')
get_value('Molecular Weight', a._record_)

to query the record directly. dictDigUtils also provides dict_list_keys to discover the possible keys within it.
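For readers who want to see how such a lookup can work, the following is a minimal sketch of a recursive search and a key listing over nested dicts and lists. The names find_values and key_paths are hypothetical stand-ins; they are not the package's get_value or dict_list_keys, whose actual behavior may differ. The toy record is hand-made.

```python
def find_values(key, node):
    """Collect every value stored under `key` anywhere in nested dicts/lists."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(find_values(key, v))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(key, item))
    return found


def key_paths(node, prefix=""):
    """List slash-separated paths of all dict keys, to see what is available."""
    paths = []
    if isinstance(node, dict):
        for k, v in node.items():
            path = f"{prefix}/{k}" if prefix else str(k)
            paths.append(path)
            paths.extend(key_paths(v, path))
    elif isinstance(node, list):
        for item in node:
            paths.extend(key_paths(item, prefix))
    return paths


# Hand-made toy record for illustration only.
toy = {
    "Computed": {"Molecular Weight": 18.015},
    "Experimental": [{"Density": "toy value"}, {"Molecular Weight": "18.02 g/mol"}],
}
hits = find_values("Molecular Weight", toy)
paths = key_paths(toy)
```

Here hits holds both entries found under 'Molecular Weight', which shows why a search may return a list rather than a single value.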

names and synonyms

The PUG-REST API recognizes any name that is listed in the synonym list, so no other tricks are needed to find the material. However, non-English names may be transliterated, e.g. the German ä as ae. The list of synonyms is now available as the synonyms attribute of the class.
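If a name with umlauts is not found, one option is to also try its transliterated spelling. The helper below is a small sketch of that idea; name_variants is a hypothetical function, not part of this package, and the mapping covers only the common German letters.

```python
import unicodedata

# Common German transliterations; extend for other languages as needed.
_GERMAN = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def name_variants(name):
    """Return the name and, if different, its ASCII transliteration."""
    # Normalize first so that 'a' + combining diaeresis becomes a single 'ä'.
    name = unicodedata.normalize("NFC", name)
    ascii_name = name.translate(_GERMAN)
    return [name] if ascii_name == name else [name, ascii_name]
```

Each variant can then be tried in turn until the service returns a CID.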

InChI

There are two fields coming from Pubchem:

  • InChI key
  • InChI, which is marked by the InChI= prefix in the text
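The two fields can be told apart by their shape. The sketch below assumes the standard InChIKey layout (14 letters, 10 letters, 1 letter, joined by hyphens) and the InChI= prefix; classify_identifier is a hypothetical helper, not a function of this package.

```python
import re

# Standard InChIKey layout: 14 + 10 + 1 uppercase letters, hyphen separated.
INCHI_KEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def classify_identifier(text):
    """Return 'InChI', 'InChIKey' or 'unknown' for a string from the record."""
    text = text.strip()
    if text.startswith("InChI="):
        return "InChI"
    if INCHI_KEY_RE.fullmatch(text):
        return "InChIKey"
    return "unknown"
```

For example, the InChI of water, 'InChI=1S/H2O/h1H2', and its key, 'XLYOFNOQVPJJNP-UHFFFAOYSA-N', fall into the first and second class respectively.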

density

The data set may contain several forms of this value, and unfortunately they are not uniform across chemicals. There are:

  • a number
  • a number and unit
  • an upper bound (< a value)
  • a reference to a table image
  • a text describing value ranges and conditions

So, for now, the function just returns the list it finds.
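A caller who receives that list may want to sort the entries by kind. The following is a rough sketch under simplifying assumptions: classify_density is a hypothetical helper, units are assumed to contain no spaces, and anything that is not clearly a number, a number with unit, or an upper bound (including table image references) is returned as plain text.

```python
import re

_NUMBER = r"[-+]?\d+(?:\.\d+)?"


def classify_density(entry):
    """Classify one density entry; returns a tuple starting with the kind."""
    text = entry.strip()
    if re.fullmatch(_NUMBER, text):
        return ("number", float(text))
    bound = re.match(rf"<\s*({_NUMBER})", text)
    if bound:
        return ("upper_bound", float(bound.group(1)))
    # A number followed by a single unit token that contains no spaces.
    with_unit = re.fullmatch(rf"({_NUMBER})\s*([^\d\s.]\S*)", text)
    if with_unit:
        return ("number_unit", float(with_unit.group(1)), with_unit.group(2))
    return ("text", text)
```

The strings used to try it out, such as '1.35 g/cm3' or '< 1', are made-up examples rather than values taken from PubChem.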

A somewhat more generic solution is to allow a filter in the general search. This is now the get_value_filtered() function, which handles the density and the CAS numbers. For density, the filter cuts out the vapor density values. Other approaches did not work, because these values are listed under the same key.
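The filtering idea can be sketched as a list of values plus a predicate. The code below is not the package's get_value_filtered(); filter_values, is_cas_number and is_not_vapor are hypothetical names. The CAS test uses the usual registry number layout and check digit rule, and the vapor filter is a plain substring test, which is an assumption about how such entries are worded.

```python
import re

CAS_RE = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def is_cas_number(text):
    """True if text looks like a CAS number and its check digit is valid."""
    match = CAS_RE.fullmatch(text.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... counting from the right.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def is_not_vapor(text):
    """True unless the entry mentions vapor (either spelling)."""
    lowered = text.lower()
    return "vapor" not in lowered and "vapour" not in lowered


def filter_values(values, keep):
    """Keep only the string values for which the predicate holds."""
    return [v for v in values if isinstance(v, str) and keep(v)]
```

For instance, filter_values(found, is_cas_number) would keep '50-78-2' and drop a mistyped '50-78-3', while filter_values(found, is_not_vapor) drops entries that describe vapor density.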

1.0.9: bugfix, corrected a typo in pubchem.

Source Distribution

pubchemTool-1.0.9.tar.gz (7.0 kB)

Uploaded Source

Built Distribution

pubchemTool-1.0.9-py3-none-any.whl (7.3 kB)

Uploaded Python 3