Python library for building and deploying AI projects
Project description
DXC Industrialized AI Starter
DXC Industrialized AI Starter makes it easier to build and deploy industrialized AI. The library helps you:
- Access, clean, and explore raw data
- Build data pipelines
- Run AI experiments
- Publish microservices
Installation
Install the library with pip, then import it in Python:
1. pip install DXC-Industrialized-AI-Starter
2. from dxc import ai
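Before running the examples, it can help to confirm that the package is importable in the current environment. The sketch below uses only the Python standard library; the helper name `is_installed` is hypothetical and is not part of the DXC library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# "dxc" is the top-level package used in `from dxc import ai`.
if not is_installed("dxc"):
    print("dxc not found; run: pip install DXC-Industrialized-AI-Starter")
```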
Getting Started
Access, Clean, and Explore Raw Data
Here's a quick example of using the library to access, clean, and explore raw data.
# Access raw data: use whichever reader matches your source.
# json_url and csv_url are URLs that you supply.
df = ai.read_data_frame_from_remote_json(json_url)
df = ai.read_data_frame_from_remote_csv(csv_url)
df = ai.read_data_frame_from_local_json()
df = ai.read_data_frame_from_local_csv()
df = ai.read_data_frame_from_local_excel_file()
# Clean data
raw_data = ai.clean_dataframe(df)
# Explore raw data
ai.visualize_missing_data(raw_data)
ai.explore_features(raw_data)
ai.plot_distributions(raw_data)
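To make the idea of exploring missing data concrete, here is a minimal standard-library sketch that counts empty cells per column in CSV text. It is an illustration only and not how the library implements `visualize_missing_data`; the function name `missing_counts` and the sample data are hypothetical.

```python
import csv
import io


def missing_counts(csv_text: str) -> dict:
    """Count empty or absent cells per column in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in counts:
            value = row.get(name)
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts


# Hypothetical sample: one missing cost and one missing department.
sample = "id,est_unit_cost,department_name\n1,100,fleet\n2,,fleet\n3,250,\n"
print(missing_counts(sample))
```

Columns with high counts are the ones worth inspecting before cleaning.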
Build Data Pipelines
The example below shows how to build a data pipeline.
# Insert data into MongoDB
data_layer = {
    "connection_string": "<your connection_string>",
    "collection_name": "<your collection_name>",
    "database_name": "<your database_name>"
}
wrt_raw_data = ai.write_raw_data(data_layer, raw_data, date_fields = [])
# Example for creating a pipeline
pipeline = [
    {
        '$group': {
            '_id': {
                "funding_source": "$funding_source",
                "request_type": "$request_type",
                "department_name": "$department_name",
                "replacement_body_style": "$replacement_body_style",
                "equipment_class": "$equipment_class",
                "replacement_make": "$replacement_make",
                "replacement_model": "$replacement_model",
                "procurement_plan": "$procurement_plan"
            },
            "avg_est_unit_cost": {"$avg": "$est_unit_cost"},
            "avg_est_unit_cost_error": {"$avg": {"$subtract": ["$est_unit_cost", "$actual_unit_cost"]}}
        }
    }
]
df = ai.access_data_from_pipeline(wrt_raw_data, pipeline)
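The `$group` stage above groups documents by the fields listed under `_id` and averages two values per group. The plain-Python sketch below mirrors that computation on a list of dictionaries so the result shape is easier to see. It is not the library's or MongoDB's implementation: the function name `group_average_costs` and the sample records are hypothetical, only two of the eight grouping fields are used for brevity, and every record is assumed to carry numeric `est_unit_cost` and `actual_unit_cost` values.

```python
from collections import defaultdict


def group_average_costs(records, key_fields):
    """Group records by key_fields and average cost and cost error per group."""
    # Each accumulator holds [sum of estimates, sum of errors, record count].
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        acc = totals[key]
        acc[0] += rec["est_unit_cost"]
        acc[1] += rec["est_unit_cost"] - rec["actual_unit_cost"]
        acc[2] += 1
    return [
        {
            "_id": dict(zip(key_fields, key)),
            "avg_est_unit_cost": est_total / count,
            "avg_est_unit_cost_error": err_total / count,
        }
        for key, (est_total, err_total, count) in totals.items()
    ]


# Hypothetical sample records.
records = [
    {"department_name": "fleet", "request_type": "new", "est_unit_cost": 100, "actual_unit_cost": 90},
    {"department_name": "fleet", "request_type": "new", "est_unit_cost": 200, "actual_unit_cost": 230},
    {"department_name": "parks", "request_type": "new", "est_unit_cost": 50, "actual_unit_cost": 50},
]
for group in group_average_costs(records, ["department_name", "request_type"]):
    print(group)
```

A positive `avg_est_unit_cost_error` means the estimates in that group ran above the actual cost on average; a negative value means they ran below.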
Run AI Experiments
The following snippet runs an AI experiment:
experiment_design = {
    # Model options include ['regression()', 'classification()']
    "model": ai.regression(),
    "labels": df.avg_est_unit_cost_error,
    "data": df,
    # Tell the model which column is 'output'
    # Also note columns that aren't purely numerical
    # Examples include ['nlp', 'date', 'categorical', 'ignore']
    "meta_data": {
        "avg_est_unit_cost_error": "output",
        "_id.funding_source": "categorical",
        "_id.department_name": "categorical",
        "_id.replacement_body_style": "categorical",
        "_id.replacement_make": "categorical",
        "_id.replacement_model": "categorical",
        "_id.procurement_plan": "categorical"
    }
}
trained_model = ai.run_experiment(experiment_design)
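The `meta_data` keys use dotted names such as `_id.funding_source`, which suggests that the nested `_id` document produced by the pipeline ends up as flat, dot-separated column names. That is an inference from the example, not a documented behavior of the library. The sketch below shows how such flattening can work in plain Python; the `flatten` helper and the sample document are hypothetical.

```python
def flatten(doc: dict, parent: str = "") -> dict:
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in doc.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested documents, carrying the dotted prefix along.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical document shaped like one result of the $group stage.
grouped = {
    "_id": {"funding_source": "grant", "department_name": "fleet"},
    "avg_est_unit_cost_error": -10.0,
}
print(flatten(grouped))
```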
Publish Microservice
The example below publishes a microservice.
# trained_model is the output of the run_experiment() function
microservice_design = {
    "microservice_name": "<Name of your microservice>",
    "microservice_description": "<Brief description about your microservice>",
    "execution_environment_username": "<Algorithmia username>",
    "api_key": "<your api_key>",
    "api_namespace": "<your api namespace>",
    "model_path": "<your model_path>"
}
# Publish the microservice and display the URL of the API
api_url = ai.publish_microservice(microservice_design, trained_model)
print("api url: " + api_url)
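Because the design dictionary is filled in by hand, a quick check for leftover `<placeholder>` values can catch mistakes before publishing. The sketch below is a hypothetical helper and not part of the library. It assumes that all six keys shown in the example are required and that their values are strings.

```python
REQUIRED_KEYS = (
    "microservice_name",
    "microservice_description",
    "execution_environment_username",
    "api_key",
    "api_namespace",
    "model_path",
)


def unfilled_fields(design: dict) -> list:
    """Return required keys that are missing, empty, or still a <placeholder>."""
    problems = []
    for key in REQUIRED_KEYS:
        value = str(design.get(key, "")).strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            problems.append(key)
    return problems


# Hypothetical, partially filled design.
draft = {"microservice_name": "cost-error-model", "api_key": "<your api_key>"}
print(unfilled_fields(draft))
```

An empty list means every field has been given a real value.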
Docs
For detailed and complete documentation, please click here
Example of colab notebook
Here is a quick example of the Google Colab notebook
Contributing Guide
To learn more about contributing and the contribution guidelines, please click here
Reporting Issues
If you find any issues, feel free to report them here with a clear description of the issue.
Source Distribution
Hashes for DXC-Industrialized-AI-Starter-1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 42c933207149b672ad58bad070cdb5073e18d4b6539098ce885f96058a814dcc
MD5 | f489eba7038f63e71661d7a874dd450a
BLAKE2b-256 | efb1d174548ef805997a42cce8ecadf68dfacd8ce01c3a3038dcb0dd8c63a4a3
Built Distribution
Hashes for DXC_Industrialized_AI_Starter-1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 3258850a0a072b5baacedc7c219e668e6be1dad7202594b6697131a3fa8bf7cb
MD5 | 28dbd03635d9c580d3846a40fb815bcb
BLAKE2b-256 | 11b8c8e49d6dcff732faeb40dccfd0f4418f51249e25ca8cdeea8ba537ffc469
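The published digests can be used to check a downloaded file. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the local file path in the usage comment are hypothetical.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == published_hex.strip().lower()


# Usage (hypothetical local path; the digest is the SHA256 listed for the .tar.gz):
# matches_published_digest(
#     "DXC-Industrialized-AI-Starter-1.0.tar.gz",
#     "42c933207149b672ad58bad070cdb5073e18d4b6539098ce885f96058a814dcc",
# )
```

A result of `False` means the file differs from the published release and should not be installed.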