
End-to-end machine learning on your desktop or server.

Project description

pre-alpha


Mission

  • Automated
    AIdb is an autoML tool that keeps track of the moving parts of machine learning (model tuning, feature selection, dataset splitting, and cross-validation) so that data scientists can perform best-practice ML without the coding overhead.

  • Local-first
    We empower non-cloud users (academic/institute HPCs, private cloud companies, desktop hackers, or even remote server SSH'ers) with the same quality of ML services found in public clouds (e.g. SageMaker).

  • Integrated
    We don’t force your entire workflow into the confines of a GUI app or specific IDE because we integrate with your existing code.

Functionality:

  • Calculates and saves model metrics in a local SQLite file.
  • Visually compares model metrics to find the best model.
  • Queue for hypertuning jobs and batches.
  • Treats cross-validated splits (k-fold) and validation sets (3rd split) as first-class citizens.
  • Feature engineering to select the most informative columns.
  • If you need to scale (data size, training time) just switch to cloud_queue=True.
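As a rough illustration of the first two items (saving metrics to SQLite and comparing them to find the best model), here is a minimal sketch that uses only the standard-library sqlite3 module. It is not the AIdb API: the `metrics` table, its columns, and the helpers `save_metrics` and `best_model` are hypothetical names chosen for this example.

```python
import sqlite3


def save_metrics(conn, model_name, accuracy, loss):
    """Insert one row of metrics for a trained model (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics ("
        "model_name TEXT, accuracy REAL, loss REAL)"
    )
    conn.execute(
        "INSERT INTO metrics (model_name, accuracy, loss) VALUES (?, ?, ?)",
        (model_name, accuracy, loss),
    )
    conn.commit()


def best_model(conn):
    """Return the name of the model with the highest accuracy, or None."""
    row = conn.execute(
        "SELECT model_name FROM metrics ORDER BY accuracy DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


# In-memory database for the demo; passing a file path instead would
# persist the metrics in a local SQLite file.
conn = sqlite3.connect(":memory:")
save_metrics(conn, "model_a", 0.91, 0.30)
save_metrics(conn, "model_b", 0.94, 0.22)
print(best_model(conn))
```

Keeping metrics in a single local file means the comparison is an ordinary SQL query, with no server to run.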

Installation:

Requires Python 3+. You will only need to do this the first time you use the package. Enter the following commands one-by-one and follow any instructions returned by the command prompt to resolve errors should they arise:

Starting from the command line:

$ pip install --upgrade pydatasci
$ python

Once inside the Python shell:

>>> import pydatasci as pds
>>> pds.create_folder()
>>> pds.create_config()
>>> from pydatasci import aidb
>>> aidb.create_db()

PyDataSci makes use of the Python package, appdirs, for an operating system (OS) agnostic location to store configuration and database files. This not only keeps your $HOME directory clean, but also helps prevent careless users from deleting your database.

The installation process checks not only that the corresponding appdirs folder exists on your system but also that you have the permissions necessary to read from and write to that location. If these conditions are not met, you will be given instructions during installation on how to create the folder and/or grant yourself the appropriate permissions.
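The kind of check described above can be sketched with the standard library alone. This is an illustration under assumptions, not the package's actual implementation: `check_folder` is a hypothetical helper, and `os.access` is only an approximation of effective permissions, particularly on Windows.

```python
import os
import tempfile


def check_folder(path):
    """Return a list of problems that would block using `path` for storage."""
    problems = []
    if not os.path.isdir(path):
        # Nothing else can be checked until the folder exists.
        problems.append("folder does not exist")
        return problems
    if not os.access(path, os.R_OK):
        problems.append("no read permission")
    if not os.access(path, os.W_OK):
        problems.append("no write permission")
    return problems


# Demo against a throwaway directory rather than the real appdirs folder.
with tempfile.TemporaryDirectory() as tmp:
    print(check_folder(tmp))                         # existing folder
    print(check_folder(os.path.join(tmp, "nope")))   # missing folder
```

An empty list means the folder is usable; otherwise each entry names one condition to fix before retrying.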

We have attempted to support both Windows (icacls permissions and backslashes, C:\\) and POSIX systems, including Mac and Linux (chmod letter permissions and forward slashes, /). Note: because the ordering of the appdirs author and app directories varies between operating systems, we do not use the appdirs appauthor directory, only the appname directory.

If you run into trouble with the installation process on your OS, please submit a GitHub issue so that we can attempt to resolve, document, and release a fix as quickly as possible.

Installation Location Based on OS
appdirs.user_data_dir('pydatasci'):

  • Mac:
    /Users/Username/Library/Application Support/pydatasci

  • Linux - Alpine and Ubuntu:
    /root/.local/share/pydatasci

  • Windows:
    C:\Users\Username\AppData\Local\pydatasci
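To make the pattern behind these locations explicit, the sketch below rebuilds the three paths from a platform name and a home directory. It is a deliberate simplification, not how appdirs works internally: the real library also honors settings such as `XDG_DATA_HOME` on Linux and asks Windows for the local application data folder. The function `sketch_data_dir` and its parameters are hypothetical.

```python
import ntpath
import posixpath


def sketch_data_dir(appname, platform, home):
    """Approximate the per-OS data folder for `appname` under `home`.

    `platform` uses sys.platform-style values: "darwin", "win32",
    or anything else for Linux-like systems.
    """
    if platform == "darwin":
        return posixpath.join(home, "Library", "Application Support", appname)
    if platform == "win32":
        # ntpath joins with backslashes regardless of the host OS.
        return ntpath.join(home, "AppData", "Local", appname)
    return posixpath.join(home, ".local", "share", appname)


print(sketch_data_dir("pydatasci", "darwin", "/Users/Username"))
print(sketch_data_dir("pydatasci", "linux", "/root"))
print(sketch_data_dir("pydatasci", "win32", r"C:\Users\Username"))
```

Note that only the app name appears in each path, matching the decision above to skip the appauthor directory.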

Deleting & Recreating the Database:

After deleting the database, you need to either reload the aidb module or restart the Python shell before you can recreate the database.

>>> from pydatasci import aidb
>>> aidb.delete_db(True)
>>> from importlib import reload
>>> reload(aidb)
>>> aidb.create_db()

Usage

Let's get started.

# comment

PyPI Package

Steps to Build & Upload:

$ pip3 install --upgrade wheel twine
$ python3 setup.py sdist bdist_wheel
$ python3 -m twine upload --repository pypi dist/*
$ rm -r build dist pydatasci.egg-info
# proactively update the version number in setup.py next time
$ pip install --upgrade pydatasci; pip install --upgrade pydatasci
