A document-level information extraction pipeline for layered cathode materials for sodium-ion batteries.
CathodeDataExtractor
CathodeDataExtractor is a lightweight document-level information extraction pipeline. It automatically extracts
properties related to synthesis parameters, cycling performance, and rate performance of cathode materials from the
literature on layered cathode materials for sodium-ion batteries.
Installation
pip install cathodedataextractor
Features
- Built on the open-source libraries pymatgen, text2chem, and ChemDataExtractor v2 with some modifications.
- A BatterySciBERT-uncased multi-label text classification model for filtering documents.
- An automated, comprehensive data extraction pipeline for cathode materials.
- Paragraph multi-class classification algorithms for documents (HTML/XML) from the RSC and Elsevier.
- A normalised entity handling process.
- An effective chemical abbreviation detection module.
- A heuristic multi-level relation extraction algorithm for electrochemical properties.
The pipeline can also extract information directly from plain text strings.
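To give a feel for what chemical abbreviation detection involves, here is a minimal sketch that pairs a parenthesised abbreviation with the words preceding it. It is not the module shipped with CathodeDataExtractor: the function name `find_abbreviations`, the regular expression, and the initial-letter matching rule are all assumptions made for illustration, and it uses only the standard library.

```python
import re

# Hypothetical pattern: "(ABBR)" where ABBR starts with an uppercase letter
# and is 2 to 10 characters long. Numeric citations such as "(13)" are skipped.
ABBREVIATION_PATTERN = re.compile(r"\(([A-Z][A-Za-z0-9-]{1,9})\)")


def find_abbreviations(text):
    """Return {abbreviation: long form} for 'long form (ABBR)' definitions."""
    found = {}
    for match in ABBREVIATION_PATTERN.finditer(text):
        abbreviation = match.group(1)
        # Letters of the abbreviation, ignoring digits and hyphens.
        letters = [char.lower() for char in abbreviation if char.isalpha()]
        words = text[:match.start()].split()
        # Take as many preceding words as the abbreviation has letters.
        candidate = words[-len(letters):]
        # Accept only if every word starts with the corresponding letter.
        if len(candidate) == len(letters) and all(
            word.lower().startswith(letter)
            for word, letter in zip(candidate, letters)
        ):
            found[abbreviation] = " ".join(candidate)
    return found
```

This rule only catches abbreviations formed from word initials; names such as plural forms or element-based shorthand would need further heuristics.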
Quick start
Extract from documents
from glob import iglob

from cathodedataextractor.information_extraction_pipe import Pipeline

pipeline = Pipeline()
# '*ml' matches every file in the current directory whose name ends in "ml" (e.g. .html, .xml)
for document in iglob('*ml'):
    extraction_results = pipeline.extract(document)
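Note that the `'*ml'` glob also matches names such as `.yml` or `.toml`. If a directory mixes those with article files, a stricter filter may help. The sketch below is one way to collect only HTML/XML files with the standard library; `find_documents` and `DOCUMENT_SUFFIXES` are hypothetical names, not part of CathodeDataExtractor.

```python
from pathlib import Path

# Assumption: article full texts are stored as .html or .xml files.
DOCUMENT_SUFFIXES = {".html", ".xml"}


def find_documents(directory="."):
    """Return sorted paths of HTML/XML files directly inside `directory`."""
    return sorted(
        str(path)
        for path in Path(directory).iterdir()
        # Compare suffixes case-insensitively so "paper.XML" is kept too.
        if path.is_file() and path.suffix.lower() in DOCUMENT_SUFFIXES
    )
```

Each returned path can then be passed to `extract` in place of the `iglob` results in the loop above.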
Extract from string
from cathodedataextractor.information_extraction_pipe import Pipeline
extraction_results = Pipeline.from_string(
    'Apart from the conventional cationic redox of transition metals, '
    'both Na-deficit and Na-excess materials have showcased the ability '
    'to exploit oxygen redox activity as O2–/O2n– for a charge '
    'compensation mechanism. To realize cathodes with enhanced energy '
    'density, a technique like the incorporation of alkali metal ions '
    'into transition metal layers has been adopted. Recent work by Boisse '
    '(13) et al. displayed the impact of honeycomb cation ordering of '
    'a highly stabilized intermediate phase for a Na2RuO3 cathode material '
    'in instigating the anionic redox activity and providing a capacity '
    'of 180 mAh g–1 at 0.2C with a capacity retention of 89% for over '
    '50 cycles. More devoted efforts to realize the utmost potential '
    'of anionic redox ought to be carried out in the future.')
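The passage above contains one cycling record: a capacity, a C-rate, a capacity retention, and a cycle count. As a rough illustration of the values a relation extraction step would target, the following standard-library sketch pulls those four numbers out of a sentence with regular expressions. It is not the pipeline's heuristic algorithm, and it does not show the structure of `extraction_results`; the function name, field names, and patterns are assumptions.

```python
import re

# Publishers often typeset "g-1" with an en dash (U+2013) or minus sign (U+2212).
_MINUS = r"[-\u2013\u2212]"
_NUMBER = r"(\d+(?:\.\d+)?)"

# Hypothetical field names, each mapped to (pattern, type conversion).
FIELDS = {
    "capacity_mAh_per_g": (re.compile(_NUMBER + r"\s*mAh\s*g" + _MINUS + "1"), float),
    "rate_C": (re.compile(r"\bat\s+" + _NUMBER + r"\s*C\b"), float),
    "retention_percent": (re.compile(r"retention\s+of\s+" + _NUMBER + r"\s*%"), float),
    "cycles": (re.compile(r"(\d+)\s+cycles"), int),
}


def extract_cycling_record(sentence):
    """Return the first value found for each field, or None if absent."""
    record = {}
    for name, (pattern, convert) in FIELDS.items():
        match = pattern.search(sentence)
        record[name] = convert(match.group(1)) if match else None
    return record
```

A sentence-level regex like this cannot link values to a specific material or handle several records in one sentence, which is where a multi-level relation extraction approach comes in.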
Issues?
Report an issue on GitHub or contact me directly at gouyx@mail2.sysu.edu.cn.
Source Distribution
File details
Details for the file cathodedataextractor-0.0.4.tar.gz.
File metadata
- Download URL: cathodedataextractor-0.0.4.tar.gz
- Upload date:
- Size: 65.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.5.0 importlib_metadata/6.7.0 pkginfo/1.9.6 requests/2.21.0 requests-toolbelt/1.0.0 tqdm/4.66.1 CPython/3.7.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | c23d7d1e982a93d6dc9014bd39d344d939e4020115b5a080f5ac1e8f946c513f
MD5 | a85a2f9bae3e93dddfc7428221ef14a9
BLAKE2b-256 | 3d7a5a5e6df1ce4adb2428a1caf6468e7a125e31f3833da6d0baa9c15e2a76f4