End-to-end machine learning on your desktop or server.

Mission

  • AutoML with aidb: keeps track of the moving parts of machine learning (model tuning, feature selection, and dataset splitting) so that data scientists can stay focused on data science.
  • Local-first: empowers non-cloud users (academic/institute HPCs, private cloud companies, desktop hackers, or even EC2 users) with the same quality of ML services in their local IDE as those available in the cloud (e.g. SageMaker).
  • Integrated: it doesn’t force your entire workflow into the confines of a GUI app or a specific IDE; instead, it integrates with your existing scripts and tools.

Functionality:

  • Calculates and saves model metrics in local files.
  • Visually compares model metrics to help you find the best model.
  • Queues hypertuning jobs and batches.
  • Treats cross-validated splits (k-fold) and validation sets (3rd split) as first-class citizens.
  • Supports feature engineering to select the most informative columns.
  • If you need to scale (data size, training time), just switch to cloud_queue=True.
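
To illustrate what tracking k-fold splits together with a held-out third split involves, here is a minimal standard-library sketch. It does not use the pydatasci API; the function name split_indices and its parameters are hypothetical and chosen only for this example.

```python
import random

def split_indices(n_samples, n_folds=3, holdout_fraction=0.2, seed=0):
    """Return (holdout, folds); folds is a list of (train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Set aside the held-out third split before forming any folds.
    n_holdout = int(n_samples * holdout_fraction)
    holdout = indices[:n_holdout]
    remaining = indices[n_holdout:]

    # Deal the remaining indices into n_folds roughly equal groups.
    groups = [remaining[i::n_folds] for i in range(n_folds)]

    folds = []
    for i, test in enumerate(groups):
        # Each fold trains on every group except the one it tests on.
        train = [idx for j, group in enumerate(groups) if j != i for idx in group]
        folds.append((train, test))
    return holdout, folds

holdout, folds = split_indices(n_samples=10, n_folds=4, holdout_fraction=0.2)
```

Every sample index lands in exactly one place per fold (train, test, or the held-out split), which is the bookkeeping that becomes tedious to maintain by hand across many experiments.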

Installation:

Requires Python 3+. You only need to do this the first time you use the package. Enter the following commands one by one and follow any instructions returned by the command prompt to resolve errors:

Starting from the command line:

$ pip install --upgrade pydatasci
$ python

Once inside the Python shell:

>>> import pydatasci as pds
>>> pds.create_folder()
>>> pds.create_config()
>>> from pydatasci import aidb
>>> aidb.create_db()

PyDataSci makes use of appdirs for an operating system (OS) agnostic location to store configuration and database files. This not only keeps your $HOME directory clean, but also helps prevent careless users from deleting your database.

The installation process checks not only that the corresponding appdirs folder exists on your system but also that you have permission to read from and write to that location. If these conditions are not met, you will be given instructions during the installation on how to create the folder and/or grant yourself the necessary permissions. We have attempted to support both Windows (icacls permissions and backslashes \\, \) and POSIX systems, including Mac and Linux (chmod permissions and slashes /). If you run into trouble with the installation process on your OS, please submit a GitHub issue so that we can attempt to resolve it and release a fix as quickly as possible.
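
The kind of check described above can be sketched with the standard library alone. This is not PyDataSci's actual implementation; check_data_dir is a hypothetical helper, and in practice the path would presumably come from appdirs rather than a temporary directory.

```python
import os
import tempfile

def check_data_dir(path):
    """Report whether `path` exists and is readable and writable."""
    exists = os.path.isdir(path)
    return {
        "exists": exists,
        "readable": exists and os.access(path, os.R_OK),
        "writable": exists and os.access(path, os.W_OK),
    }

# Demonstrate on a throwaway directory rather than the real data folder.
with tempfile.TemporaryDirectory() as tmp:
    status_present = check_data_dir(tmp)
    status_missing = check_data_dir(os.path.join(tmp, "does_not_exist"))
```

A missing folder reports False for all three keys, which is the case where the installer would tell you how to create the folder or adjust its permissions.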

Installation Location Based on OS (appdirs.user_data_dir('pydatasci')):

  • Mac: /Users/Username/Library/Application Support/pydatasci
  • Linux - Alpine and Ubuntu: /root/.local/share/pydatasci
  • Windows: C:\Users\Username\AppData\Local\pydatasci

Deleting & Recreating the Database:

After deleting the database, you need to either reload the aidb module or restart the Python shell before you can recreate the database.

>>> aidb.delete_db(True)
>>> from importlib import reload
>>> reload(aidb)
>>> aidb.create_db()
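
Why a reload helps can be shown with a stand-in module. The sketch below assumes, without confirmation from the source, that aidb sets up database-related state when it is imported; demo_state and db_path are hypothetical names used only for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny stand-in module whose module-level code sets state on import.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_state.py"), "w") as f:
    f.write("db_path = 'demo.sqlite3'\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import demo_state

# Simulate module state being invalidated, as a delete might do.
demo_state.db_path = None

# reload() re-runs the module-level code, restoring the initial state.
importlib.reload(demo_state)
restored = demo_state.db_path
```

Restarting the Python shell has the same effect, because the module-level code runs again on the next import.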

PyPI Package - Steps to Build & Upload:

$ pip3 install --upgrade wheel twine
$ python3 setup.py sdist bdist_wheel
$ python3 -m twine upload --repository pypi dist/*
$ rm -r build dist pydatasci.egg-info
# proactively update the version number in setup.py next time
$ pip install --upgrade pydatasci; pip install --upgrade pydatasci

Download files

Source Distribution

pydatasci-0.0.35.tar.gz (5.6 kB)

Built Distribution

pydatasci-0.0.35-py3-none-any.whl (18.1 kB)

File details

Details for the file pydatasci-0.0.35.tar.gz.

File metadata

  • Download URL: pydatasci-0.0.35.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.6

File hashes

Hashes for pydatasci-0.0.35.tar.gz
Algorithm Hash digest
SHA256 5f97b7f4bda4af370be46c45019ec5a480c538c6c96ab3938e95e363069f4658
MD5 49c481b2c38a2742cbb531a9182b8144
BLAKE2b-256 964c6c359b05bfda189489d231b6be547a2d5fb2f01fcbf4b2ef281cdfc74316

File details

Details for the file pydatasci-0.0.35-py3-none-any.whl.

File metadata

  • Download URL: pydatasci-0.0.35-py3-none-any.whl
  • Upload date:
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.6

File hashes

Hashes for pydatasci-0.0.35-py3-none-any.whl
Algorithm Hash digest
SHA256 77b18610dfda8ebf7a85163e13684a55a7de1ccd23d77c065ed29db5a39b56be
MD5 b9e7991cb2e26da4602c915a38be4206
BLAKE2b-256 e47077850b5682721aa214a7e85b1ec00b7cf5389b3fc521e8b1e4f7338d1bd8
