Datomize python client

Welcome to Datomize Python SDK

Datomize is a data-driven solution for machine learning. It augments source data with high-quality synthetic data and can be used to generate synthetic replicas, optimize training data by making it more balanced and richer, and address data bias.

Getting Started

Getting your application user & password

To use the Datomize Python SDK client, you first need to register for the Datomize solution. Once registered, you will be provided with a username and password, which are passed to datomizer.Datomizer() when starting your application.

Please register for the Datomize solution on the Datomize Registration page.
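
One way to keep the credentials out of source code is to read them from environment variables before constructing the client. The sketch below is a minimal illustration and not part of the SDK: the helper name load_credentials and the variable names DATOMIZE_USERNAME and DATOMIZE_PASSWORD are assumptions made for this example.

```python
import os


def load_credentials(env=None):
    """Return Datomizer keyword arguments read from environment variables.

    The variable names below are hypothetical; use whatever fits your setup.
    """
    env = os.environ if env is None else env
    required = ("DATOMIZE_USERNAME", "DATOMIZE_PASSWORD")
    # Collect every missing variable so the error names all of them at once.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "username": env["DATOMIZE_USERNAME"],
        "password": env["DATOMIZE_PASSWORD"],
    }
```

The returned dictionary matches the username and password keyword arguments used in the usage example, so it could be unpacked as Datomizer(**load_credentials()).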

Installation

pip install datomize
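
Note that the package is installed under the distribution name datomize, while the usage example imports the module datomizer. As a quick check that the installation succeeded, the following sketch looks up the installed version through the standard library (Python 3.8 or later). The helper name installed_version is an assumption for this example.

```python
from importlib import metadata


def installed_version(dist_name="datomize"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string, or None when the package is not installed.
    print(installed_version())
```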

Important links

Documentation

Usage Example

# Import relevant packages
from datomizer import Datomizer, DatoMapper, DatoTrainer, DatoGenerator, DatoEnhancer
from sklearn.datasets import load_iris
import pandas as pd

# Load input data:
data = load_iris(return_X_y=False, as_frame=True)
df = pd.concat([data.data, data.target], axis=1)

# Create a Datomizer with your credentials
# (username and password are the values you received at registration):
datomizer = Datomizer(username=username, password=password)

# Create a DatoMapper and analyze the data structure:
mapper = DatoMapper(datomizer)
mapper.discover(df_map={"df1": df})

# Create a DatoTrainer and train the generative model:
trainer = DatoTrainer(mapper)
trainer.train()

# Create a DatoGenerator and generate a synthetic replica:
generator = DatoGenerator(trainer)
generator.generate(output_ratio=5)  # 5 times the original number of records will be created
synth_data = pd.read_csv(generator.get_generated_data_csv())

# Create a DatoEnhancer for a specific prediction task and generate balanced, augmented data to enhance your training data:
enhancer = DatoEnhancer(mapper)
enhancer.generate(target_column="target")
train_enhanced = pd.read_csv(enhancer.get_generated_data_csv())
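
To see whether the enhanced data is in fact more balanced than the input, the class proportions of the target column can be compared before and after. The sketch below is an illustration using only the standard library; the helper name class_balance is an assumption and not part of the SDK.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of records per class, keyed by label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    # Toy labels standing in for a target column.
    print(class_balance([0, 0, 0, 1]))  # {0: 0.75, 1: 0.25}
```

Since the function accepts any iterable of labels, it can be called on a DataFrame column, for example class_balance(df["target"]) and class_balance(train_enhanced["target"]).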

Async Usage Example

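import pandas as pd
from datomizer import Datomizer, DatoMapper, DatoTrainer, DatoGenerator

# username, password and df are defined as in the usage example above
datomizer = Datomizer(username=username, password=password)

# Start discovery without blocking, then wait for it to finish:
mapper = DatoMapper(datomizer)
mapper.discover(df_map={"df1": df}, title="Some Title", wait=False)
# ... do something else here ...
mapper.wait()

# Start training without blocking, then wait for it to finish:
trainer = DatoTrainer(mapper)
trainer.train(wait=False)
# ... do something else here ...
trainer.wait()

# Start generation without blocking, then wait for it to finish:
generator = DatoGenerator(trainer)
generator.generate(wait=False)
# ... do something else here ...
generator.wait()

dato_df = pd.read_csv(generator.get_generated_data_csv())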

Misc examples

# Import relevant packages
from datomizer import Datomizer, DatoMapper, DatoTrainer, DatoGenerator, DatoEnhancer
from datomizer.helpers.wrapper.schema_wrapper import SchemaWrapper
import pandas as pd
from sqlalchemy import create_engine

# Load input data:
db_connection_str = 'mysql+pymysql://guest:relational@relational.fit.cvut.cz/financial'
db_connection = create_engine(db_connection_str)
account = pd.read_sql('SELECT * FROM account', con=db_connection)
loan = pd.read_sql('SELECT * FROM loan', con=db_connection)

# Create a Datomizer with your credentials:
datomizer = Datomizer(username=username, password=password)

# Create a DatoMapper and analyze the data structure:
mapper = DatoMapper(datomizer)
mapper.discover(df_map={"account": account, "loan": loan})

# Declare a one-to-many relation between account and loan, then apply the schema:
schema: SchemaWrapper = mapper.get_schema()
schema.add_relation_one2many(table_one="account", column_one="account_id",
                             table_many="loan", column_many="account_id")
mapper.schema = schema
mapper.set_schema()

# Create a DatoTrainer and train the generative model:
trainer = DatoTrainer(mapper)
trainer.train()

# Create a DatoGenerator and generate a synthetic replica:
generator = DatoGenerator(trainer)
generator.generate(output_ratio=1)  # the same number of records as the original will be created
synth_account = pd.read_csv(generator.get_generated_data_csv("account"))
synth_loan = pd.read_csv(generator.get_generated_data_csv("loan"))
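
Because the example declares a one-to-many relation between account and loan, it can be useful to check whether the generated tables respect it, that is, whether every account_id in the synthetic loan table also appears in the synthetic account table. The sketch below is an illustrative check using plain Python sets. The helper name orphaned_keys is an assumption, and the snippet makes no claim about what the SDK guarantees.

```python
def orphaned_keys(parent_keys, child_keys):
    """Return child key values that have no matching parent key."""
    return set(child_keys) - set(parent_keys)


if __name__ == "__main__":
    # Toy key columns standing in for account.account_id and loan.account_id.
    accounts = [1, 2, 3]
    loans = [2, 3, 3, 5]
    print(orphaned_keys(accounts, loans))  # {5}
```

With the DataFrames from the example, the call would be orphaned_keys(synth_account["account_id"], synth_loan["account_id"]); an empty set means every synthetic loan refers to a synthetic account.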

Release history

Version	Release date
2.3.20 (this version)	Nov 22, 2023
2.3.19	Aug 16, 2023
2.3.11	Jun 13, 2023
2.3.10	Jun 11, 2023
2.3.9	Jun 6, 2023
2.3.7	Jun 5, 2023
2.1.7	Jan 12, 2023
2.1.4	Jan 11, 2023
2.0.28	Dec 29, 2022
2.0.26	Dec 28, 2022
2.0.23	Dec 22, 2022
2.0.19	Dec 14, 2022
2.0.17	Dec 6, 2022
2.0.15	Dec 5, 2022
2.0.12	Nov 27, 2022
0.0.2013	Jul 24, 2023
0.0.2006	Jul 23, 2023
0.0.1958	Jun 5, 2023
0.0.1469	Oct 26, 2022
0.0.1392	Sep 13, 2022
0.0.1275	Jul 26, 2022
0.0.1274	Jul 26, 2022
0.0.1087	May 19, 2022
0.0.1086	May 19, 2022
0.0.1085	May 19, 2022
0.0.1077	May 17, 2022
0.0.1076	May 17, 2022
0.0.1075	May 17, 2022
0.0.1069	May 15, 2022
0.0.1059	May 10, 2022
0.0.1032	May 1, 2022
0.0.1031	May 1, 2022
0.0.997	Apr 19, 2022
0.0.992	Apr 18, 2022
0.0.988	Apr 17, 2022
0.0.987	Apr 17, 2022
0.0.986	Apr 17, 2022
0.0.980	Apr 13, 2022
0.0.941	Mar 31, 2022
0.0.940	Mar 31, 2022
0.0.936	Mar 30, 2022
0.0.935	Mar 30, 2022
0.0.934	Mar 29, 2022
0.0.933	Mar 29, 2022
0.0.932	Mar 29, 2022
0.0.931	Mar 29, 2022
0.0.930	Mar 29, 2022
0.0.929	Mar 29, 2022
0.0.928	Mar 29, 2022
0.0.927	Mar 29, 2022
0.0.924	Mar 29, 2022
0.0.923	Mar 29, 2022
0.0.922	Mar 29, 2022
0.0.920	Mar 29, 2022
0.0.909	Mar 24, 2022
0.0.908	Mar 24, 2022

Download files

Source Distribution

datomize-2.3.20.tar.gz (38.1 kB)

Uploaded Nov 22, 2023 Source

File details

Details for the file datomize-2.3.20.tar.gz.

File metadata

Download URL: datomize-2.3.20.tar.gz
Upload date: Nov 22, 2023
Size: 38.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.7.17

File hashes

Hashes for datomize-2.3.20.tar.gz
Algorithm	Hash digest
SHA256	`6b8910d9caab37b513be598396a9ab10c23641d9a2738e1c893e555a7fa17f63`
MD5	`7ec1fdcc194a8d7a2127069a372affaf`
BLAKE2b-256	`c927fa9565e354841d508a96b69fb22206f3f234a90061252575ac4299aa32cf`
