Python library for building and deploying AI projects

Project description

DXC Industrialized AI Starter

DXC Industrialized AI Starter makes it easier to build and deploy Industrialized AI. The library helps you:

  • Access, clean, and explore raw data
  • Build data pipelines
  • Run AI experiments
  • Publish microservices

Installation

Install the library with pip, then import it in Python:

1. pip install DXC-Industrialized-AI-Starter
2. from dxc import ai
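
Before running the import, it can help to confirm that the package is visible to the current interpreter. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of this library.

```python
import importlib.util


def is_installed(top_level_package: str) -> bool:
    """Return True if the top-level package can be found on the import path."""
    return importlib.util.find_spec(top_level_package) is not None


# "dxc" is the top-level package name used by `from dxc import ai`.
if not is_installed("dxc"):
    print("dxc not found; run: pip install DXC-Industrialized-AI-Starter")
```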

Getting Started

Access, Clean, and Explore Raw Data

Here's a quick example of using the library to access, clean, and explore raw data.

#Access raw data
df = ai.read_data_frame_from_remote_json(json_url)
df = ai.read_data_frame_from_remote_csv(csv_url)
df = ai.read_data_frame_from_local_json()
df = ai.read_data_frame_from_local_csv()
df = ai.read_data_frame_from_local_excel_file()

#Clean data
raw_data = ai.clean_dataframe(df)

#Explore raw data
ai.visualize_missing_data(raw_data)
ai.explore_features(raw_data)
ai.plot_distributions(raw_data)
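
To give a feel for what a missing-data summary contains, here is a rough, library-independent sketch that counts missing values per column in plain Python. It does not show how `ai.visualize_missing_data` is implemented; the sample rows and the `count_missing` helper are hypothetical, and treating `None` and empty strings as "missing" is an assumption.

```python
from collections import Counter

# Hypothetical sample rows; None and "" stand in for missing values.
rows = [
    {"department_name": "Parks", "est_unit_cost": 32000, "replacement_make": "Ford"},
    {"department_name": "", "est_unit_cost": None, "replacement_make": "Toyota"},
    {"department_name": "Police", "est_unit_cost": None, "replacement_make": None},
]


def count_missing(records):
    """Count, per column, how many records have a missing (None or empty) value."""
    missing = Counter()
    for record in records:
        for column, value in record.items():
            if value is None or value == "":
                missing[column] += 1
    return dict(missing)


missing_by_column = count_missing(rows)
print(missing_by_column)
```

Columns with a high missing count are natural candidates to inspect before cleaning or modelling.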

Build Data Pipelines

The example below shows how to write raw data to MongoDB and build a data pipeline over it.

# Insert data into MongoDB
data_layer = {
    "connection_string": "<your connection_string>",
    "collection_name": "<your collection_name>",
    "database_name": "<your database_name>"
}
wrt_raw_data = ai.write_raw_data(data_layer, raw_data, date_fields = [])

#Example for creating pipeline
pipeline = [
        {
            '$group':{
                '_id': {
                    "funding_source":"$funding_source",
                    "request_type":"$request_type",
                    "department_name":"$department_name",
                    "replacement_body_style":"$replacement_body_style",
                    "equipment_class":"$equipment_class",
                    "replacement_make":"$replacement_make",
                    "replacement_model":"$replacement_model",
                    "procurement_plan":"$procurement_plan"
                    },
                "avg_est_unit_cost":{"$avg":"$est_unit_cost"},
                "avg_est_unit_cost_error":{"$avg":{ "$subtract": [ "$est_unit_cost", "$actual_unit_cost" ] }}
            }
        }
]

df = ai.access_data_from_pipeline(wrt_raw_data, pipeline)
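
The `$group` stage above buckets documents by the fields listed under `_id` and computes two averages per bucket. As an illustration only, the sketch below emulates that computation in plain Python on a few hypothetical documents. It uses just two of the grouping keys for brevity, and it assumes every document has numeric `est_unit_cost` and `actual_unit_cost` values (MongoDB's `$avg` skips non-numeric values, which this sketch does not handle). The helper name `group_and_average` is made up for this example.

```python
from collections import defaultdict

# Hypothetical raw documents, invented for illustration.
documents = [
    {"department_name": "Parks", "request_type": "Replacement",
     "est_unit_cost": 30000, "actual_unit_cost": 28000},
    {"department_name": "Parks", "request_type": "Replacement",
     "est_unit_cost": 34000, "actual_unit_cost": 35000},
    {"department_name": "Police", "request_type": "Addition",
     "est_unit_cost": 50000, "actual_unit_cost": 47000},
]

GROUP_KEYS = ("department_name", "request_type")


def group_and_average(docs, group_keys):
    """Emulate a $group stage with two $avg accumulators."""
    buckets = defaultdict(list)
    for doc in docs:
        # Documents with identical values for all group keys share a bucket.
        buckets[tuple(doc[k] for k in group_keys)].append(doc)

    results = []
    for key, members in buckets.items():
        costs = [d["est_unit_cost"] for d in members]
        # Mirrors {"$subtract": ["$est_unit_cost", "$actual_unit_cost"]}.
        errors = [d["est_unit_cost"] - d["actual_unit_cost"] for d in members]
        results.append({
            "_id": dict(zip(group_keys, key)),
            "avg_est_unit_cost": sum(costs) / len(costs),
            "avg_est_unit_cost_error": sum(errors) / len(errors),
        })
    return results


grouped = group_and_average(documents, GROUP_KEYS)
for row in grouped:
    print(row)
```

Each result keeps the grouping values under a nested `_id` key, which is the same shape the aggregation pipeline produces.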

Run AI Experiments

The following snippet designs and runs an AI experiment on the pipeline output.

experiment_design = {
    #model options include ['regression()', 'classification()']
    "model": ai.regression(),
    "labels": df.avg_est_unit_cost_error,
    "data": df,
    #Tell the model which column is 'output'
    #Also note columns that aren't purely numerical
    #Examples include ['nlp', 'date', 'categorical', 'ignore']
    "meta_data": {
      "avg_est_unit_cost_error": "output",
      "_id.funding_source": "categorical",
      "_id.department_name": "categorical",
      "_id.replacement_body_style": "categorical",
      "_id.replacement_make": "categorical",
      "_id.replacement_model": "categorical",
      "_id.procurement_plan": "categorical"
  }
}

trained_model = ai.run_experiment(experiment_design)
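
The dotted names in `meta_data` (for example `_id.funding_source`) suggest that the nested `_id` document from the pipeline is flattened into dotted column names. The sketch below illustrates that flattening and a simple sanity check of a `meta_data` mapping before running an experiment. Both helpers (`flatten`, `check_meta_data`) are hypothetical and not part of the library, and the rule that exactly one column is marked `"output"` is an assumption based on the comments in the snippet above.

```python
def flatten(document, parent_key=""):
    """Flatten nested dicts into one dict with dotted keys."""
    flat = {}
    for key, value in document.items():
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


def check_meta_data(meta_data, columns):
    """Return (output_column, names in meta_data that are not real columns)."""
    outputs = [name for name, role in meta_data.items() if role == "output"]
    if len(outputs) != 1:
        raise ValueError(f"expected exactly one 'output' column, found {len(outputs)}")
    unknown = sorted(name for name in meta_data if name not in columns)
    return outputs[0], unknown


# Hypothetical pipeline result row.
record = {
    "_id": {"funding_source": "General Fund", "department_name": "Parks"},
    "avg_est_unit_cost_error": 500.0,
}
flat = flatten(record)

meta = {
    "avg_est_unit_cost_error": "output",
    "_id.funding_source": "categorical",
    "_id.replacement_make": "categorical",  # not present in the sample record
}
output_column, unknown_columns = check_meta_data(meta, flat.keys())
print(output_column, unknown_columns)
```

A check like this catches misspelled column names early, before any training time is spent.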

Publish Microservice

The example below publishes a trained model as a microservice.

# trained_model is the output of the run_experiment() function
microservice_design = {
    "microservice_name": "<Name of your microservice>",
    "microservice_description": "<Brief description about your microservice>",
    "execution_environment_username": "<Algorithmia username>",
    "api_key": "<your api_key>",
    "api_namespace": "<your api namespace>",
    "model_path":"<your model_path>"
}

# Publish the microservice and display the URL of the API
api_url = ai.publish_microservice(microservice_design, trained_model)
print("api url: " + api_url)
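
Because `microservice_design` carries an API key, one option is to read the credentials from environment variables instead of hardcoding them in a notebook or script. The sketch below is one possible way to do that with the standard library. The helper `build_microservice_design`, the environment variable names, and the default microservice name are all hypothetical; only the dictionary keys come from the example above.

```python
import os


def build_microservice_design(env=os.environ):
    """Build the microservice_design dict from environment variables."""
    # Maps each design field to a (hypothetical) environment variable name.
    required = {
        "execution_environment_username": "ALGORITHMIA_USERNAME",
        "api_key": "ALGORITHMIA_API_KEY",
        "api_namespace": "ALGORITHMIA_API_NAMESPACE",
        "model_path": "ALGORITHMIA_MODEL_PATH",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))

    design = {field: env[var] for field, var in required.items()}
    # Optional, non-secret fields fall back to placeholder defaults.
    design["microservice_name"] = env.get("MICROSERVICE_NAME", "my_microservice")
    design["microservice_description"] = env.get("MICROSERVICE_DESCRIPTION", "")
    return design


# Usage (requires the variables above to be set):
# microservice_design = build_microservice_design()
# api_url = ai.publish_microservice(microservice_design, trained_model)
```

Failing fast on missing variables keeps a half-configured design from ever reaching the publish step.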

Docs

For detailed and complete documentation, please click here

Example of colab notebook

Here is a quick example of a Google Colab notebook.

Contributing Guide

To learn more about contributing and the contribution guidelines, please click here.

Reporting Issues

If you find any issues, feel free to report them here with a clear description of the problem.

