Tool for quickly creating a composition-based feature vector

CBFV Package

A tool to quickly create composition-based feature vectors (CBFVs) from materials data files.

Installation

The source code is currently hosted on GitHub at: https://github.com/kaaiian/CBFV

Binary installers for the latest released version are available from the Python Package Index (PyPI):

# PyPI
pip install CBFV

Making the composition-based feature vector

The CBFV package assumes your data is stored in a pandas dataframe of the following structure:

formula target
Tc1V1 248.539
Cu1Dy1 66.8444
Cd3N2 91.5034
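As a minimal sketch, a dataframe with this structure could be built directly with pandas. The values are the three rows shown above; in practice the data would more likely be read from a file.

```python
import pandas as pd

# Input in the layout CBFV expects: one 'formula' column and one 'target' column.
df = pd.DataFrame({
    "formula": ["Tc1V1", "Cu1Dy1", "Cd3N2"],
    "target": [248.539, 66.8444, 91.5034],
})

print(df)
```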

To featurize this data, the generate_features function can be called as follows:

from CBFV import composition
X, y, formulae, skipped = composition.generate_features(df)

Extended Functionality

The featurization scheme can be changed by setting the elem_prop parameter. The following featurization schemes are included within CBFV:

  • jarvis
  • magpie
  • mat2vec
  • oliynyk (default)
  • onehot
  • random_200
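Because the scheme is selected by name, it can help to validate the string before featurizing. The sketch below is not part of CBFV: ELEM_PROP_SCHEMES and check_elem_prop are hypothetical helpers built only from the names listed above, and the lowercase spelling is assumed to be the one the package expects.

```python
# Scheme names as listed in this README (hypothetical constant, not a CBFV API).
ELEM_PROP_SCHEMES = ("jarvis", "magpie", "mat2vec", "oliynyk", "onehot", "random_200")


def check_elem_prop(name):
    """Return `name` unchanged if it is a listed scheme, else raise ValueError."""
    if name not in ELEM_PROP_SCHEMES:
        raise ValueError(
            f"Unknown elem_prop {name!r}; expected one of {ELEM_PROP_SCHEMES}"
        )
    return name


scheme = check_elem_prop("magpie")
```

The validated name could then be passed as elem_prop=scheme to generate_features.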

Duplicate formula handling is controlled by the drop_duplicates parameter. It is set to False by default to preserve data points that differ in something other than their formula, for example heat capacity measurements performed on the same material at different temperatures.
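Before choosing a value for drop_duplicates, it may be useful to check whether the data contains repeated formulae at all. The following sketch uses plain pandas on example data; it only inspects the dataframe and does not show how CBFV handles duplicates internally.

```python
import pandas as pd

# Example data: the same formula measured at two temperatures.
df = pd.DataFrame({
    "formula": ["Tc1V1", "Tc1V1", "Cd3N2"],
    "target": [248.539, 66.8444, 91.5034],
    "temp": [373, 473, 273],
})

# keep=False flags every row whose formula occurs more than once.
dup_mask = df["formula"].duplicated(keep=False)
n_duplicated_rows = int(dup_mask.sum())
repeated_formulae = sorted(df.loc[dup_mask, "formula"].unique())

print(n_duplicated_rows, repeated_formulae)
```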

The extend_features parameter specifies whether columns other than ['formula', 'target'] should be considered during featurization. It is set to False by default to exclude nuisance information. Setting extend_features=True allows additional columns (e.g. ['temperature', 'pressure']) to be preserved.
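To see which columns extend_features=True would bring into consideration, the non-standard columns can be listed with plain pandas. This is an illustrative sketch on example data, not CBFV's own column-selection code.

```python
import pandas as pd

df = pd.DataFrame({
    "formula": ["Tc1V1", "Tc1V1", "Cd3N2"],
    "target": [248.539, 66.8444, 91.5034],
    "temp": [373, 473, 273],
})

# Everything besides the two required columns counts as an "extra" column here.
extra_columns = [c for c in df.columns if c not in ("formula", "target")]

print(extra_columns)
```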

The sum_feat parameter specifies whether to calculate the sum features when generating the CBFVs for the chemical formulae. It is set to False by default.
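The idea of a sum feature can be illustrated with a toy calculation. The sketch below assumes that a sum feature is the elemental property summed over all atoms in the formula, and that the corresponding average divides by the total atom count. It uses atomic number as a stand-in property and a simplified parser for flat formulae; parse_formula and sum_and_avg are hypothetical helpers, not CBFV functions, and CBFV's actual computation may differ.

```python
import re

# Atomic numbers, used only as a stand-in elemental property for illustration.
ATOMIC_NUMBER = {"Tc": 43, "V": 23, "Cu": 29, "Dy": 66, "Cd": 48, "N": 7}


def parse_formula(formula):
    """Parse a flat formula such as 'Cd3N2' into {'Cd': 3.0, 'N': 2.0}."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def sum_and_avg(formula, prop=ATOMIC_NUMBER):
    """Return (count-weighted sum, per-atom average) of `prop` for a formula."""
    counts = parse_formula(formula)
    total_atoms = sum(counts.values())
    prop_sum = sum(n * prop[element] for element, n in counts.items())
    return prop_sum, prop_sum / total_atoms


prop_sum, prop_avg = sum_and_avg("Cd3N2")
print(prop_sum, prop_avg)
```

Unlike the average, the sum grows with the number of atoms written in the formula, so it is sensitive to how the formula is normalized.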

Calling generate_features with these parameters can be implemented as follows:

formula target temp
Tc1V1 248.539 373
Tc1V1 66.8444 473
Cd3N2 91.5034 273

from CBFV import composition
X, y, formulae, skipped = composition.generate_features(df,
                                                        elem_prop='magpie',
                                                        drop_duplicates=False,
                                                        extend_features=True,
                                                        sum_feat=True)

