Oracle Accelerated Data Science (ADS)

The Oracle Accelerated Data Science (ADS) SDK is maintained by the Oracle Cloud Infrastructure (OCI) Data Science service team. It speeds up common data science activities by providing tools that automate and simplify routine tasks. It also gives data scientists a friendly, Pythonic interface to OCI services, notably OCI Data Science, Model Catalog, Model Deployment, Jobs, ML Pipelines, Data Flow, Object Storage, Vault, Big Data Service, Data Catalog, and the Autonomous Database. ADS gives you an interface to manage the life cycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.

With ADS you can:

  • Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into pandas DataFrames.
  • Tune models using hyperparameter optimization with the ADSTuner tool.
  • Generate detailed evaluation reports of your model candidates with the ADSEvaluator module.
  • Save machine learning models to the OCI Data Science Model Catalog.
  • Deploy models as HTTP endpoints with Model Deployment.
  • Launch distributed ETL, data processing, and model training jobs in Spark with OCI Data Flow.
  • Train machine learning models in OCI Data Science Jobs.
  • Define and run end-to-end machine learning orchestration covering every step of the machine learning life cycle as repeatable, continuous ML Pipelines.
  • Manage the life cycle of conda environments through the ads conda command line interface (CLI).

Installation

You have various options when installing ADS.

Installing the oracle-ads base package

  python3 -m pip install oracle-ads

Installing OCI AI Operators

To use the AI Forecast Operator, install the "forecast" dependencies using the following command:

  python3 -m pip install 'oracle_ads[forecast]>=2.9.0'

Installing extra libraries

To work with gradient boosting models, install the boosted module. This module includes XGBoost and LightGBM model classes.

  python3 -m pip install 'oracle-ads[boosted]'

For big data use cases using Oracle Big Data Service (BDS), install the bds module. It includes the following libraries: ibis-framework[impala], hdfs[kerberos], and sqlalchemy.

  python3 -m pip install 'oracle-ads[bds]'

To work with a broad set of data formats (for example, Excel and Avro), install the data module. It includes the fastavro, openpyxl, pandavro, asteval, datefinder, htmllistparse, and sqlalchemy libraries.

  python3 -m pip install 'oracle-ads[data]'

To work with geospatial data, install the geo module. It includes geopandas and the libraries from the viz module.

  python3 -m pip install 'oracle-ads[geo]'

Install the notebook module to use ADS within an OCI Data Science service notebook session. This module installs the ipywidgets and ipython libraries.

  python3 -m pip install 'oracle-ads[notebook]'

To work with ONNX-compatible runtimes and libraries designed to maximize performance and model portability, install the onnx module. It includes the following libraries: onnx, onnxruntime, onnxmltools, skl2onnx, xgboost, lightgbm, and the libraries from the viz module.

  python3 -m pip install 'oracle-ads[onnx]'

For infrastructure tasks, install the opctl module. It includes the following libraries: oci-cli, docker, conda-pack, nbconvert, nbformat, and inflection.

  python3 -m pip install 'oracle-ads[opctl]'

For hyperparameter optimization tasks, install the optuna module. It includes optuna and the libraries from the viz module.

  python3 -m pip install 'oracle-ads[optuna]'

Install the tensorflow module to include tensorflow and the libraries from the viz module.

  python3 -m pip install 'oracle-ads[tensorflow]'

For text-related tasks, install the text module. It includes the wordcloud and spacy libraries.

  python3 -m pip install 'oracle-ads[text]'

Install the torch module to include pytorch and the libraries from the viz module.

  python3 -m pip install 'oracle-ads[torch]'

Install the viz module to include libraries for visualization tasks. Key packages include bokeh, folium, and seaborn.

  python3 -m pip install 'oracle-ads[viz]'

See the [project.optional-dependencies] section of the pyproject.toml file for the full list of modules and the extra libraries each one installs.

Note

Multiple extra dependencies can be installed together. For example:

  python3 -m pip install 'oracle-ads[notebook,viz,text]'
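
After installing several extras, it can help to confirm which optional libraries are importable in the current environment. The following is a minimal sketch using only the Python standard library. The helper `available_extras` and the `EXTRA_PROBES` mapping are hypothetical and not part of ADS; the probe module names are assumptions based on the libraries listed above.

```python
from importlib.util import find_spec

# Hypothetical mapping: extra name -> one top-level module that extra is
# expected to provide. These names are assumptions, not an official list.
EXTRA_PROBES = {
    "notebook": "ipywidgets",
    "viz": "seaborn",
    "text": "spacy",
}


def available_extras(probes=EXTRA_PROBES):
    """Return the extras whose probe module can be located by the import system."""
    return sorted(
        name for name, module in probes.items() if find_spec(module) is not None
    )


# The result depends on what is installed in the active environment.
print(available_extras())
```

An extra missing from the result suggests its libraries are not installed in the active interpreter, which is a common cause of import errors in notebook sessions.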

Documentation

Examples

Load data from Object Storage

  import ads
  from ads.common.auth import default_signer
  import oci
  import pandas as pd

  # Authenticate with the API key stored in the default OCI config file.
  ads.set_auth(auth="api_key", oci_config_location=oci.config.DEFAULT_LOCATION, profile="DEFAULT")

  # Replace the placeholders with your own values.
  bucket_name = "<bucket_name>"
  key = "<key>"
  namespace = "<namespace>"
  df = pd.read_csv(f"oci://{bucket_name}@{namespace}/{key}", storage_options=default_signer())
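
The Object Storage path in the example above follows the pattern `oci://<bucket_name>@<namespace>/<key>`. As a small illustration, a helper along these lines could build that URI in one place. The function `oci_uri` is a hypothetical name introduced here and is not part of ADS.

```python
def oci_uri(bucket_name: str, namespace: str, key: str) -> str:
    """Build an oci://<bucket>@<namespace>/<key> URI.

    Hypothetical helper for illustration only; not provided by ADS.
    """
    if not bucket_name or not namespace:
        raise ValueError("bucket_name and namespace must be non-empty")
    # Strip a leading slash so the key does not produce a double slash.
    return f"oci://{bucket_name}@{namespace}/{key.lstrip('/')}"


print(oci_uri("my-bucket", "my-namespace", "data/train.csv"))
# oci://my-bucket@my-namespace/data/train.csv
```

The resulting string can be passed to `pd.read_csv` together with `storage_options=default_signer()`, as in the example above.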

Load data from ADB

This example uses bind variables, which keep user-supplied values out of the SQL text and so protect against SQL injection.

  import ads
  import pandas as pd

  connection_parameters = {
      "user_name": "<user_name>",
      "password": "<password>",
      "service_name": "<tns_name>",
      "wallet_location": "<file_path>",
  }

  df = pd.DataFrame.ads.read_sql(
      """
      SELECT *
      FROM SH.SALES
      WHERE ROWNUM <= :max_rows
      """,
      bind_variables={"max_rows": 100},
      connection_parameters=connection_parameters,
  )
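
To see why bind variables matter without needing a database wallet, the same idea can be tried locally. The sketch below uses Python's built-in sqlite3 module rather than ADB, so it only illustrates the principle: the `:max_rows` placeholder is sent separately from the SQL text. The `sales` table and the `fetch_sales` function are invented for this illustration.

```python
import sqlite3

# In-memory database with a small, made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [(i, i * 1.5) for i in range(10)]
)


def fetch_sales(connection, max_rows):
    """Return at most max_rows rows, passing the limit as a bind variable."""
    # :max_rows is a named placeholder; its value is never spliced into the SQL string.
    query = "SELECT id, amount FROM sales ORDER BY id LIMIT :max_rows"
    return connection.execute(query, {"max_rows": max_rows}).fetchall()


rows = fetch_sales(conn, 3)
print(len(rows))  # 3
```

Because the value travels as a parameter, a malicious string supplied in its place is treated as data rather than executed as SQL.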

Contributing

This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide.

Getting Started instructions for developers are in README-development.md.

Security

Consult the security guide SECURITY.md for our responsible security vulnerability disclosure process.

License

Copyright (c) 2020, 2024 Oracle and/or its affiliates. Licensed under the Universal Permissive License v1.0

