Python package for chemical machine learning.
Artificial Intelligence Library for Chemical Applications (AILCA)
This is a Python package for machine learning with chemical data. It provides pre-processing modules for several kinds of chemical data, such as engineering conditions, chemical formulas, and molecular structures. It also includes wrapper classes and functions for chemical machine learning. The package is built on Scikit-learn and PyTorch.
Installation
Before installing AILCA, install several required packages in your environment. We highly recommend using Anaconda to build your Python environment for AILCA.
- Install the cheminformatics package RDKit. RDKit is available through Anaconda. You can install it using the following command in the Anaconda prompt.
conda install -c rdkit rdkit
- Install the deep learning framework PyTorch. If you want to build your machine learning models using a GPU, CUDA >= 11.1 must be installed on your machine. With CUDA 11.1, you can install PyTorch using the following command.
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge
- Install the graph-based deep learning framework PyTorch Geometric. It is required to build machine learning models that predict target values from molecular and crystal structures. You can install PyTorch Geometric using the following command.
conda install pytorch-geometric -c rusty1s -c conda-forge
- Install the required packages listed in requirements.txt on GitHub. After downloading the requirements file, you can install all required packages using the following command.
conda install --file requirements.txt
- (Optional) If your operating system is Windows, install Graphviz to visualize interpretable information of machine learning algorithms. You can install Graphviz using the following command.
conda install -c conda-forge python-graphviz
- Finally, install AILCA in your Python environment with the following command.
pip install ailca
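After working through the steps above, a quick check that each prerequisite is importable can save some debugging. The following is a minimal sketch, not part of AILCA itself: the helper names are made up for illustration, and `ailca` as the import name is an assumption, since the instructions above only give the pip name.

```python
from importlib.util import find_spec

# Import names of the prerequisites listed above.
# "ailca" is an ASSUMED import name; only the pip name is documented.
REQUIRED_MODULES = ["rdkit", "torch", "torch_geometric", "sklearn", "ailca"]
OPTIONAL_MODULES = ["graphviz"]  # only needed for visualization


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be found."""
    return {name: find_spec(name) is not None for name in names}


def report(required=REQUIRED_MODULES, optional=OPTIONAL_MODULES):
    """Print one status line per module and return the missing required ones."""
    for name, found in check_modules(required).items():
        print(f"{name:16s} {'ok' if found else 'MISSING'}")
    for name, found in check_modules(optional).items():
        print(f"{name:16s} {'ok' if found else 'not installed (optional)'}")
    return [name for name, found in check_modules(required).items() if not found]


if __name__ == "__main__":
    report()
```

`find_spec` only locates a module without importing it, so the check stays fast even for large packages such as PyTorch.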
Examples
Follow the instructions in PyTorch Installation to install the PyTorch package in your environment.
Installation of PyTorch Geometric
Installation of RDKit
Package, module name
Many projects use the same name for the package and the module, and you can certainly do that. In this example, however, the package and module names differ: example_pypi_package and examplepy.
Open the example_pypi_package folder with Visual Studio Code and press Ctrl + Shift + F (Windows / Linux) or Cmd + Shift + F (macOS) to find all occurrences of both names and replace them with your package and module names. Also remember to rename the folder src/examplepy.
Roughly speaking, the package name is used in pip install <PACKAGENAME> and the module name is used in import <MODULENAME>. Both names should consist of lowercase basic letters (a-z). They may contain underscores (_) if you really need them. Hyphen-minus (-) should not be used.
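The naming rule above can be checked mechanically. This is a small sketch with a hypothetical helper name; rejecting leading, trailing, and doubled underscores is an extra assumption on top of the rule as stated.

```python
import re

# Lowercase a-z words joined by single underscores.
# Assumption: leading, trailing, and doubled underscores are rejected too.
_NAME_RE = re.compile(r"[a-z]+(?:_[a-z]+)*")


def is_recommended_name(name: str) -> bool:
    """Return True if the name follows the recommendation in the text above."""
    return _NAME_RE.fullmatch(name) is not None


for candidate in ("examplepy", "example_pypi_package", "example-pypi-package", "Example"):
    print(candidate, is_recommended_name(candidate))
```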
You'll also need to make sure the URL "https://pypi.org/project/example-pypi-package/" (replace example-pypi-package with your package name, with every _ becoming -) is not occupied.
Details on naming convention
Underscores (_) can be used, but such use is discouraged. Numbers can be used if the name does not start with a number, but such use is also discouraged.
A name that starts with a number and/or contains a hyphen-minus (-) should not be used: although technically legal, such a name causes a lot of trouble, because users have to use importlib to import it.
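To make the importlib point concrete: `import my-module` is a syntax error, so a module with such a name can only be loaded by passing its name as a string. The sketch below creates a throwaway file named my-module.py (a made-up name) in a temporary directory and imports it that way; `import_by_name` is a hypothetical helper.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_by_name(module_name: str, search_dir: str):
    """Import a module whose name is not a valid identifier, e.g. 'my-module'."""
    sys.path.insert(0, search_dir)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(search_dir)


# Demo: "import my-module" would not even parse, but this works.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "my-module.py").write_text("ANSWER = 42\n", encoding="utf-8")
    mod = import_by_name("my-module", tmp)
    print(mod.ANSWER)
```

Every user of the module would need this workaround, which is why such names are best avoided.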
Don't be fooled by the URL "pypi.org/project/example-pypi-package/" and the name "example-pypi-package" on pypi.org. pypi.org and pip convert every _ to - and use the latter on the website and in pip commands, but the real name still contains _, which users should use when importing the package.
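The conversion described above follows the name normalization rule from PEP 503: runs of -, _ and . collapse into a single - and the result is lowercased. A short sketch, with hypothetical function names, that derives the project URL you would check for availability:

```python
import re


def normalize_project_name(name: str) -> str:
    """PEP 503 normalization: runs of '-', '_' and '.' become '-', lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pypi_project_url(name: str) -> str:
    """Build the pypi.org project URL for a package name."""
    return f"https://pypi.org/project/{normalize_project_name(name)}/"


print(pypi_project_url("example_pypi_package"))
```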
Namespace packages are also available if you need sub-packages.
Other changes
Make necessary changes in setup.py.
The package's version number __version__ is in src/examplepy/__init__.py. You may want to change that.
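Keeping __version__ in one file means other tools can read it from there without importing the package. The following is a sketch of one common way to do that with a regular expression; the function names are hypothetical, and whether the example's setup.py does exactly this is not stated here.

```python
import re
from pathlib import Path

# Matches a line such as: __version__ = "0.1.0"
_VERSION_RE = re.compile(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE)


def parse_version(source_text: str) -> str:
    """Extract the __version__ string from the text of an __init__.py file."""
    match = _VERSION_RE.search(source_text)
    if match is None:
        raise ValueError("__version__ assignment not found")
    return match.group(1)


def read_version(init_file) -> str:
    """Read __version__ from a file without importing the package."""
    return parse_version(Path(init_file).read_text(encoding="utf-8"))


print(parse_version('"""Example module."""\n__version__ = "0.1.0"\n'))
```

Calling `read_version("src/examplepy/__init__.py")` from the repository root would return the current version string.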
The example package is designed to be compatible with Python 3.6, 3.7, 3.8, and 3.9, and will be tested against these versions. If you need to change the version range, you should change:
- classifiers and python_requires in setup.py
- envlist in tox.ini
- matrix: python: in .github/workflows/test.yml
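Because the same version list appears in three files, it is easy to update one and forget the others. The sketch below derives all of the values from a single list so they can be compared against what is in each file. The function name and dictionary keys are hypothetical, and the exact formatting used in the example's files may differ.

```python
def version_settings(versions):
    """Derive the related settings from a list like ["3.6", "3.7", "3.8", "3.9"]."""
    # Sort numerically so that "3.10" comes after "3.9".
    ordered = sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
    return {
        # setup.py: trove classifiers, one per supported version
        "classifiers": [f"Programming Language :: Python :: {v}" for v in ordered],
        # setup.py: lower bound only; add an upper bound if you need one
        "python_requires": f">={ordered[0]}",
        # tox.ini: envlist entries such as py36, py37
        "tox_envlist": ", ".join("py" + v.replace(".", "") for v in ordered),
        # .github/workflows/test.yml: values for the python matrix
        "workflow_matrix": ordered,
    }


settings = version_settings(["3.6", "3.7", "3.8", "3.9"])
for key, value in settings.items():
    print(f"{key}: {value}")
```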
If you plan to upload to TestPyPI, which is a playground of PyPI for testing purposes, change twine upload --repository pypi dist/* to twine upload --repository testpypi dist/* in the file .github/workflows/release.yml.
Development
pip
pip is a Python package manager. You already have pip if you use Python 3.4 or later, which includes it by default. See the pip documentation to check whether pip is installed or to install it.
Use VS Code
Visual Studio Code is a widely used code editor, and our example package is configured to work with it.
Install the VS Code extension "Python".
The "Python" VS Code extension will suggest that you install pylint. The example package is also configured to use pytest with VS Code and the Python extension, so install both pylint and pytest:
pip install pylint pytest
(You will likely be prompted to install them; in that case, you don't need to run the command yourself.)
The content of vscode.env is now PYTHONPATH=/;src/;${PYTHONPATH}, which is good for Windows. If you use Linux or macOS, you need to change it to PYTHONPATH=/:src/:${PYTHONPATH} (replacing ; with :). If PYTHONPATH is not set properly, you'll see linting errors in test files and pytest won't be able to run the tests/test_*.py files correctly.
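The separator difference comes from the operating system's path-list convention, which Python exposes as os.pathsep (";" on Windows, ":" on Linux and macOS). A minimal sketch, with a hypothetical helper name, that produces the right line for the machine it runs on:

```python
import os


def pythonpath_line(entries=("/", "src/"), sep=os.pathsep):
    """Build the PYTHONPATH line for vscode.env using the given separator."""
    parts = list(entries) + ["${PYTHONPATH}"]
    return "PYTHONPATH=" + sep.join(parts)


# Uses the separator of the current platform by default.
print(pythonpath_line())
```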
Close and reopen VS Code. You can now click the lab flask icon in the left menu and run all tests there with pytest. pytest seems better than the standard unittest framework, and it supports unittest, so you can keep using import unittest in your test files.
The example package also has a .editorconfig file. You may install the VS Code extension "EditorConfig for VS Code", which uses the file. With the current configuration, the EditorConfig tool can automatically use spaces (4 spaces for .py, 2 for others) for indentation, set UTF-8 encoding and LF line endings, trim trailing whitespace in non-Markdown files, etc.
In VS Code, you can go to File -> Preferences -> Settings, type "Python Formatting Provider" in the search box, and choose one of the three Python code formatting tools (autopep8, black, and yapf); you'll be prompted to install it. The shortcuts for formatting a code file are Shift + Alt + F (Windows), Shift + Option (Alt) + F (macOS), and Ctrl + Shift + I (Linux).
Write your package
In the src/examplepy/ folder (examplepy should have been replaced by your module name), rename module1.py and write your code in it. Add more module .py files if you need to.
Write your tests
In the tests/ folder, rename test_module1.py (to test_*.py) and write your unit test code (with unittest) in it. Add more test_*.py files if you need to.
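As an illustration of what such a test file can look like, here is a minimal unittest-style sketch that pytest can also collect. The function `add_one` is a hypothetical stand-in defined inline so the example is self-contained; in a real test file you would import your own function from your module instead.

```python
import unittest


def add_one(number):
    """Hypothetical stand-in for a function imported from your module."""
    return number + 1


class TestAddOne(unittest.TestCase):
    def test_integer(self):
        self.assertEqual(add_one(1), 2)

    def test_negative(self):
        self.assertEqual(add_one(-1), 0)

    def test_rejects_string(self):
        # Adding an int to a str raises TypeError.
        with self.assertRaises(TypeError):
            add_one("1")
```

Saved as tests/test_something.py, this class is discovered by pytest through the test_*.py file name, with no pytest-specific imports needed.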
The testing tool `tox` will be used in the automation with GitHub Actions CI/CD. If you want to use `tox` locally, read the "Use tox locally" section.
Use tox locally
Install tox and run it:
pip install tox
tox
In our configuration, tox runs a check of the source distribution using check-manifest (which requires your repo to be git-initialized (git init) and its files added (git add .) at least), setuptools's check, and unit tests using pytest. You don't need to install check-manifest and pytest yourself, though; tox will install them in a separate environment.
The automated tests run against several Python versions, but on your machine you might be using only one version of Python. If that is Python 3.9, run:
tox -e py39
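The environment name is "py" followed by the major and minor version digits. If you are unsure which one matches your interpreter, a short sketch like this (hypothetical helper name) prints it:

```python
import sys


def tox_env_name(version_info=sys.version_info):
    """Return the tox environment name for a Python version, e.g. (3, 9) -> 'py39'."""
    major, minor = version_info[0], version_info[1]
    return f"py{major}{minor}"


print(f"tox -e {tox_env_name()}")
```

The printed environment only helps if it is one that your tox.ini actually lists.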
If you add more files to the root directory (example_pypi_package/), you'll need to add them to the check-manifest --ignore list in tox.ini.
Thanks to GitHub Actions' automated process, you don't need to generate distribution files locally. But if you insist, read the "Generate distribution files" section.
Generate distribution files
Install tools
Install or upgrade setuptools and wheel:
python -m pip install --user --upgrade setuptools wheel
(If python3 is the command on your machine, change python to python3 in the above command, or add the line alias python=python3 to your ~/.bashrc or ~/.bash_aliases file if you use bash on Linux.)
Generate dist
From the example_pypi_package directory, run the following command to generate a source distribution (sdist) and a wheel in the dist folder:
python setup.py sdist bdist_wheel
Install locally
Optionally, you can install dist version of your package locally before uploading to PyPI or TestPyPI:
pip install dist/example_pypi_package-0.1.0.tar.gz
(You may need to uninstall the existing package first:
pip uninstall example_pypi_package
There may be several installed packages with the same name, so run pip uninstall multiple times until it says there are no more packages to remove.)
Upload to PyPI
Register on PyPI and get token
Register an account on PyPI, go to Account settings § API tokens, and click "Add API token". The PyPI token appears only once, so copy it somewhere. If you missed it, delete the old token and add a new one.
(Register a TestPyPI account if you are uploading to TestPyPI)
Set secret in GitHub repo
On the page of your newly created or existing GitHub repo, click Settings -> Secrets -> New repository secret. The Name should be PYPI_API_TOKEN and the Value should be your PyPI token (which starts with pypi-).
Push or release
The example package has automated tests and upload (publishing) already set up with GitHub Actions:
- Every time you git push to your master or main branch, the package is automatically tested against the desired Python versions with GitHub Actions.
- Every time a new release (either the initial version or an updated version) is created, the package is automatically uploaded to PyPI with GitHub Actions.
View it on pypi.org
After your package is published on PyPI, go to https://pypi.org/project/example-pypi-package/ (_ becomes -). Copy the command on the page and execute it to download and install your package from PyPI (or from test.pypi.org if you use that).
If you publish the package to PyPI manually, follow the steps below.
Install Twine
Install or upgrade Twine:
python -m pip install --user --upgrade twine
Create a .pypirc file in your $HOME (~) directory with the following content:
[pypi]
username = __token__
password = <PyPI token>
(Use [testpypi] instead of [pypi] if you are uploading to TestPyPI.)
Replace <PyPI token> with your real PyPI token (which starts with pypi-).
(If you don't manually create $HOME/.pypirc, you will be prompted for a username (which should be __token__) and a password (which should be your PyPI token) when you run Twine.)
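If you prefer to generate the .pypirc content rather than type it, the standard-library configparser module writes the same INI layout. This is only a sketch: `render_pypirc` is a hypothetical helper, the token shown is a placeholder, and the function returns the text instead of touching any file in your home directory.

```python
import configparser
import io


def render_pypirc(token: str, repository: str = "pypi") -> str:
    """Return .pypirc text for the 'pypi' or 'testpypi' section."""
    if repository not in ("pypi", "testpypi"):
        raise ValueError("repository must be 'pypi' or 'testpypi'")
    # interpolation=None keeps any '%' in the token from being interpreted.
    config = configparser.ConfigParser(interpolation=None)
    config[repository] = {"username": "__token__", "password": token}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder token for illustration only; never commit a real one.
print(render_pypirc("pypi-EXAMPLE-PLACEHOLDER"))
```

Since the file holds a secret, it should stay out of version control.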
Upload
Run Twine to upload all of the archives under the dist folder:
python -m twine upload --repository pypi dist/*
(Use testpypi instead of pypi if you are uploading to TestPyPI.)
Update
When you have finished developing a newer version of your package, do the following.
Modify the version number __version__ in src/examplepy/__init__.py.
Delete all old versions in dist.
Run the following command again to regenerate dist:
python setup.py sdist bdist_wheel
Run the following command again to upload dist:
python -m twine upload --repository pypi dist/*
(Use testpypi instead of pypi if needed.)
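The "delete all old versions in dist" step is easy to forget, and stale archives would then be picked up by dist/* on the next upload. A minimal sketch of that cleanup follows; `clean_dist` is a hypothetical helper, and the assumption is that only .tar.gz and .whl files count as old archives. The demo runs against a temporary directory so nothing real is deleted.

```python
import tempfile
from pathlib import Path


def clean_dist(dist_dir="dist"):
    """Delete .tar.gz and .whl files in dist_dir; return the removed file names."""
    removed = []
    directory = Path(dist_dir)
    if not directory.is_dir():
        return removed
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.name.endswith((".tar.gz", ".whl")):
            path.unlink()
            removed.append(path.name)
    return removed


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("pkg-0.1.0.tar.gz", "pkg-0.1.0-py3-none-any.whl", "notes.txt"):
        Path(tmp, name).write_text("", encoding="utf-8")
    removed = clean_dist(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
    print(removed, remaining)
```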