pubchem tools
is a small package for querying the PubChem APIs. For simple and easy access, one can use the pubchempy library. This package is neither as polished nor as complete as pubchempy, but it works well for things like finding CAS numbers, safety codes, and anything else one knows to be in the record. Key point: it pulls the full record once and works on that, instead of querying the server every time one asks a question.
pubchem has two interfaces
The PUG-REST API is available in two ways, pug and pug_view, and they provide slightly different facilities. Using the former, one can get simple answers to questions like: what is the CID (the PubChem compound ID) of compound 'a'? The latter fetches the full record as a hierarchical JSON object.
using only pug
Trying the various output methods of pug did not provide a response as rich as that of pug_view. Thus, the double call is needed at the moment.
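The double call can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code of this package: the function names are made up for the example, and the URL patterns are those of the public PubChem REST service. The first call resolves a name to a CID through pug, the second pulls the full record through pug_view.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest"


def cid_url(name):
    """PUG REST URL that resolves a compound name to its CID(s)."""
    return f"{BASE}/pug/compound/name/{quote(name)}/cids/JSON"


def record_url(cid):
    """PUG View URL that returns the full hierarchical record of a CID."""
    return f"{BASE}/pug_view/data/compound/{cid}/JSON"


def fetch_json(url, timeout=30):
    """Download a URL and decode its JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_record(name):
    """Double call: name -> CID via pug, then CID -> record via pug_view."""
    cids = fetch_json(cid_url(name))["IdentifierList"]["CID"]
    # Assumption of this sketch: the first CID returned is the one wanted.
    return fetch_json(record_url(cids[0]))
```

Calling fetch_record('aspirin') would then return the nested JSON object that the rest of this description works on.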
Handling the JSON hierarchy
One thing to do is to convert the internal lists of dicts into other objects, e.g. turning them into dicts. Then a crawler across the dict structure can discover the information we need. We use dictDigUtils, a set of functions of ca. 5 kB that helps dig through the hierarchy returned by the json Python library.
Lists are:
- Section
- Value
- StringWithMarkup
- Number
- Boolean
Functions like clean_section help manage this.
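The idea of turning lists into dicts and crawling the result can be sketched like this. It is not the dictDigUtils code: find_values and sections_by_heading are hypothetical helpers, and the sample record is a hand-made fragment that only mimics the shape of a pug_view response.

```python
def find_values(node, key):
    """Collect every value stored under `key`, at any depth of the tree."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


def sections_by_heading(sections):
    """Turn a list of section dicts into a dict keyed by 'TOCHeading'."""
    return {s["TOCHeading"]: s for s in sections if "TOCHeading" in s}


# Hand-made sample in the shape of a pug_view record (illustration only).
record = {
    "Record": {
        "RecordTitle": "Aspirin",
        "Section": [
            {
                "TOCHeading": "Chemical and Physical Properties",
                "Section": [
                    {
                        "TOCHeading": "Molecular Weight",
                        "Information": [
                            {"Value": {"StringWithMarkup": [{"String": "180.16"}]}}
                        ],
                    }
                ],
            }
        ],
    }
}

top = sections_by_heading(record["Record"]["Section"])
weights = find_values(top["Chemical and Physical Properties"], "String")
```

Keying the sections by their heading makes lookups direct, while the recursive crawler covers the cases where the depth of a value is not known in advance.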
A Pubchem class is built on top of this, with simple get_value calls that provide parameters such as molecular_weight, molecular_formula, etc.
full record
is available as the _record_ private variable. Some parts are simplified during processing to remove formatting instructions that are not needed here. You can always use:
a = Pubchem('your molecule')
get_value('Molecular Weight', a._record_)
to query the record. dictDigUtils provides dict_list_keys to discover the possible keys within.
names and synonyms
The PUG-REST API recognizes any name that is listed in the synonym list, so no other tricks are needed to find the material. However, foreign names may be transliterated, e.g. the German ä as ae. The list of synonyms is now available as the synonyms element of the class.
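If a name containing umlauts is not found, one option is to transliterate it before the lookup. The helper below is a hypothetical sketch, not part of this package, and it assumes the synonym list stores the ae/oe/ue spelling described above.

```python
# Map German umlauts and ß to their usual ASCII transliterations.
UMLAUTS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def transliterate(name):
    """Return `name` with German umlauts replaced by ASCII digraphs."""
    return name.translate(UMLAUTS)


query = transliterate("Essigsäure")
```

The transliterated string can then be passed on as the name to search for.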
InChI
There are two fields coming from Pubchem:
- InChI key
- InChI, which is indicated by InChI= in the text
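The two fields can be told apart by their form alone. The sketch below is an illustration, not code from this package: classify_identifier is a hypothetical helper that relies on the InChI= prefix and on the standard 27-character layout of an InChIKey (14 letters, 10 letters and 1 letter, separated by hyphens).

```python
import re

# Standard InChIKey layout: 14 + 10 + 1 uppercase letters, hyphen separated.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def classify_identifier(text):
    """Return 'InChI', 'InChIKey' or 'unknown' for a string from the record."""
    text = text.strip()
    if text.startswith("InChI="):
        return "InChI"
    if INCHIKEY_RE.fullmatch(text):
        return "InChIKey"
    return "unknown"


kind = classify_identifier("BSYNRYMUTXBXSQ-UHFFFAOYSA-N")
```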
density
The data set may contain several versions of this, and unfortunately they are not really uniform across the chemicals. There are:
- a number
- a number and unit
- < a value
- reference to a table image
- a text describing value ranges and conditions
So, for now, the function just returns the list it finds.
A somewhat more generic solution is to allow a filter in the general search. This is now the get_value_filtered() function, which handles the density and the CAS numbers. For density, use a cut to exclude vapor values. Other approaches did not work, because the values were listed under the same key.
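The kind of cut meant here can be sketched as a plain substring filter over the list of strings found under one key. This is not the implementation or the signature of get_value_filtered(): filter_values is a hypothetical stand-in, and the density strings are made-up examples of the mixed forms listed above.

```python
def filter_values(values, exclude=()):
    """Keep the entries that contain none of the `exclude` substrings.

    The comparison is case-insensitive.
    """
    lowered = [word.lower() for word in exclude]
    return [v for v in values if not any(w in v.lower() for w in lowered)]


# Made-up entries, all found under the same 'Density' key.
densities = [
    "1.05",
    "1.05 g/cu cm at 25 °C",
    "Relative vapor density (air = 1): 2.1",
]

liquid_only = filter_values(densities, exclude=("vapor",))
```

Since all entries sit under the same key, the text of each entry is the only thing left to filter on.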
1.0.9: fixed a typo in pubchem.
Hashes for pubchemTool-1.0.9-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 02ab41ef2710ba0654d888297a06a65dd3c97479f49b4173e5b4e4179fd380fc |
MD5 | 2fc6ed6fa07a1d266193d16ce3daa435 |
BLAKE2b-256 | 8bab9e1efdd329b1542d898a3512aebb99f76ac10cdedf594e2a4f586b75b2ab |