Tool for quickly creating a composition-based feature vector.
Project description
CBFV Package
Tool to quickly create composition-based feature vectors from materials datafiles.
Installation
The source code is currently hosted on GitHub at: https://github.com/kaaiian/CBFV
Binary installers for the latest released version are available at the Python Package Index (PyPI).
# PyPI
pip install CBFV
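To confirm that the install succeeded without running any featurization, one option is to ask Python whether the module can be located. This is a small sketch using only the standard library; the helper name is_installed is made up for illustration and is not part of CBFV.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# `CBFV` is the import name used in the examples in this README.
print("CBFV installed:", is_installed("CBFV"))
```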
Making the composition-based feature vector
The CBFV package assumes your data is stored in a pandas dataframe of the following structure:
formula | target |
---|---|
Tc1V1 | 248.539 |
Cu1Dy1 | 66.8444 |
Cd3N2 | 91.5034 |
To featurize this data, the generate_features function can be called as follows:
from CBFV import composition
X, y, formulae, skipped = composition.generate_features(df)
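The call above assumes that df already exists. A minimal sketch of building it with pandas, using the values from the table above, might look like this; the sanity check is an illustrative addition, not something CBFV requires.

```python
import pandas as pd

# Values copied from the example table: one row per formula/target pair.
df = pd.DataFrame({
    "formula": ["Tc1V1", "Cu1Dy1", "Cd3N2"],
    "target": [248.539, 66.8444, 91.5034],
})

# Check the layout before featurizing: both expected columns must be present.
missing = {"formula", "target"} - set(df.columns)
if missing:
    raise ValueError(f"missing required columns: {sorted(missing)}")

print(df)
```

The resulting df can then be passed to composition.generate_features as shown.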
Extended Functionality
The featurization scheme can be adjusted by setting the elem_prop parameter. The following featurization schemes are included within CBFV:
- jarvis
- magpie
- mat2vec
- oliynyk (default)
- onehot
- random_200
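Because the scheme is selected by name, a typo in elem_prop is easy to make. The sketch below checks a name against the list above before it is handed to generate_features. It assumes the list is complete, and check_elem_prop is a hypothetical helper, not a CBFV function.

```python
# Scheme names as listed in this README; treated here as the complete set (an assumption).
ELEM_PROP_SCHEMES = ("jarvis", "magpie", "mat2vec", "oliynyk", "onehot", "random_200")


def check_elem_prop(name: str = "oliynyk") -> str:
    """Hypothetical helper: return `name` if it is a listed scheme, else raise ValueError."""
    if name not in ELEM_PROP_SCHEMES:
        raise ValueError(
            f"unknown elem_prop {name!r}; expected one of {ELEM_PROP_SCHEMES}"
        )
    return name


print(check_elem_prop("magpie"))
```

A call could then read composition.generate_features(df, elem_prop=check_elem_prop("magpie")).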
Duplicate formula handling is controlled by the drop_duplicates parameter. It is set to False by default to preserve datapoints that share a formula but differ in some other respect, for example heat capacity measurements performed on the same material at different temperatures.
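To see why dropping duplicates can lose information, the following sketch uses plain pandas on data where one formula was measured at two temperatures. It illustrates the general idea only and makes no claim about how CBFV removes duplicates internally.

```python
import pandas as pd

# Two measurements of Tc1V1 at different temperatures, plus one other compound.
df = pd.DataFrame({
    "formula": ["Tc1V1", "Tc1V1", "Cd3N2"],
    "target": [248.539, 66.8444, 91.5034],
    "temp": [373, 473, 273],
})

# Rows whose formula appears more than once in the data.
repeated = df[df.duplicated(subset="formula", keep=False)]
print(repeated)

# De-duplicating on the formula alone keeps only the first Tc1V1 row,
# so the 473 measurement would be discarded.
deduped = df.drop_duplicates(subset="formula")
print(deduped)
```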
The extend_features parameter specifies whether columns other than ['formula', 'target'] should be considered during featurization. It is set to False by default to exclude nuisance information. Setting extend_features=True allows additional information (e.g. ['temperature', 'pressure']) to be preserved.
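Before enabling extend_features, it can help to list which columns would count as additional information. A minimal pandas sketch, assuming the column names shown in this README; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "formula": ["Tc1V1", "Tc1V1", "Cd3N2"],
    "target": [248.539, 66.8444, 91.5034],
    "temp": [373, 473, 273],
})

# Columns CBFV always expects, versus everything else in the DataFrame.
core_cols = ["formula", "target"]
extra_cols = [col for col in df.columns if col not in core_cols]

# These are the columns that extend_features=True is meant to preserve.
print("extra columns:", extra_cols)
```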
The sum_feat parameter specifies whether to calculate the sum features when generating the CBFVs for the chemical formulae. It is set to False by default.
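This README does not spell out how the sum features are computed. One plausible reading is a count-weighted sum of an elemental property over the formula, in contrast to a per-atom average. The toy sketch below illustrates that distinction using atomic numbers as a stand-in property. The parser and function names are hypothetical, and this is not CBFV's internal code.

```python
import re

# Atomic numbers, used here only as a stand-in elemental property.
ATOMIC_NUMBER = {"Cd": 48, "N": 7, "Tc": 43, "V": 23}


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'Cd3N2' into {'Cd': 3.0, 'N': 2.0}."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def sum_and_average(formula: str, prop: dict) -> tuple:
    """Return (count-weighted sum, per-atom average) of `prop` over the formula."""
    counts = parse_formula(formula)
    total_atoms = sum(counts.values())
    prop_sum = sum(n * prop[element] for element, n in counts.items())
    return prop_sum, prop_sum / total_atoms


prop_sum, prop_avg = sum_and_average("Cd3N2", ATOMIC_NUMBER)
print(prop_sum, prop_avg)  # 3*48 + 2*7 = 158, and 158 / 5 atoms
```

The sum grows with the number of atoms in the formula while the average does not, which is one reason a featurizer might offer the sum as an optional extra.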
Calling generate_features with these parameters can be implemented as follows:
formula | target | temp |
---|---|---|
Tc1V1 | 248.539 | 373 |
Tc1V1 | 66.8444 | 473 |
Cd3N2 | 91.5034 | 273 |
from CBFV import composition
X, y, formulae, skipped = composition.generate_features(df,
elem_prop='magpie',
drop_duplicates=False,
extend_features=True,
sum_feat=True)
Project details
Download files
File details
Details for the file composition_based_feature_vector-1.0.6.tar.gz.
File metadata
- Download URL: composition_based_feature_vector-1.0.6.tar.gz
- Upload date:
- Size: 5.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.27.1
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | a71e95c91eb83a680ecd9c862410a071c79429afddf853394f4dbf469f359f4d |
MD5 | 2b87466d750479fd03141400c40003fd |
BLAKE2b-256 | 841ff83fbf0c6435f3507bd3919f7cdb336282ed50cc3136ce1e7919a6ad40eb |
File details
Details for the file composition_based_feature_vector-1.0.6-py3-none-any.whl.
File metadata
- Download URL: composition_based_feature_vector-1.0.6-py3-none-any.whl
- Upload date:
- Size: 543.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.27.1
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 9d89e5fa27af9cbe47a8985dcdca2ef4702ff458c7043a882c7e3ccf8e30da76 |
MD5 | 534eb4faca19ad01d78ef9088d8c9286 |
BLAKE2b-256 | 928e00fa3b1718d5793dea7d4af38081bba672832476d6ebe008409a5ef578ac |