Ditat ETL
A collection of tools and utilities for ETL pipelines and related tasks.
Utils
time_it
Decorator to time functions and class methods. Additional text can be added to the output.
from ditat_etl.utils import time_it

@time_it()
def f():
    '''Do something'''

f()
# f time: 0.1
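For intuition, a time_it-style decorator can be sketched in plain Python. This is a minimal sketch, not the library's actual implementation; the `text` parameter is an assumption based on "Additional text can be added":

```python
import time
from functools import wraps

def time_it(text=''):
    # Hypothetical sketch of a timing decorator; ditat_etl's real one may differ.
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            # Print the optional prefix text, the function name, and elapsed seconds.
            print(f"{text}{func.__name__} time: {round(elapsed, 4)}")
            return result
        return wrapper
    return decorator

@time_it()
def f():
    '''Do something'''
    time.sleep(0.1)

f()  # prints the elapsed time, e.g. "f time: 0.1..."
```

Using a decorator factory (the outer `time_it()` call) is what allows the extra text argument while keeping the bare `@time_it()` usage shown above.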
Url
Extension of the requests/urllib3 modules for proxy usage and bulk usage.
High-level usage
from ditat_etl import url
response = url.get('https://google.com')
# You can pass the same parameters as the library requests and other special parameters.
# Check low level usage for more details.
Low-level usage
from ditat_etl.url import Url
u = Url()
The logging module is used, with the level set to 'DEBUG' by default. You can change this parameter to any allowed logging level:
u = Url(debug_level='WARNING') # Just an example
Manage your proxies
u.add_proxies(n=3) # Added 3 new proxies (not necessarily valid) to self.proxies
u.clean_proxies() # Multithreaded to validate and keep only valid proxies.
print(u.proxies)
# You can also set u.proxies = [...] manually, but this is not recommended.
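For intuition, the multithreaded validation behind clean_proxies can be sketched with the standard concurrent.futures module. The check function here is a stub for illustration only, not the library's implementation (a real check would attempt a request through the proxy):

```python
from concurrent.futures import ThreadPoolExecutor

def clean_proxies(proxies, check):
    # Run the validity check for every proxy in a thread pool and
    # keep only the proxies that pass (mirrors the idea of u.clean_proxies()).
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(check, proxies))
    return [proxy for proxy, ok in zip(proxies, results) if ok]

# Stub check: treat anything with a host:port shape as "valid".
valid = clean_proxies(['1.2.3.4:80', 'bad-proxy'], check=lambda p: ':' in p)
# valid == ['1.2.3.4:80']
```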
Main functionality
def request(
    queue: str or list,
    expected_status_code: int=200,
    n_times: int=1,
    max_retries: int=None,
    use_proxy=False,
    _raise=True,
    **kwargs
):
Examples
result = u.request('https://google.com')
result = u.request(queue=['https://google.com', 'https://facebook.com'], use_proxy=True)
# You can also pass any optional parameter valid for a requests "Request".
import json
result = u.request(queue='https://example.com', method='post', data=json.dumps({'hello': 'world'}))
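The expected_status_code / max_retries parameters suggest retry-on-failure semantics. As a rough sketch of that pattern with assumed semantics (this is not the library's code; fetch_with_retries and the stub fetch callable are hypothetical names):

```python
def fetch_with_retries(fetch, url, expected_status_code=200, max_retries=3, _raise=True):
    # fetch is any callable returning an object with a status_code attribute.
    # Retry until the expected status code is seen or attempts run out.
    for attempt in range(max_retries):
        response = fetch(url)
        if response.status_code == expected_status_code:
            return response
    if _raise:
        raise RuntimeError(f"{url}: no {expected_status_code} after {max_retries} tries")
    return None
```

The _raise flag mirrors the signature above: when failures are expected (e.g. probing flaky proxies), returning None can be more convenient than raising.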
Databases
Useful wrappers for databases and methods to execute queries.
Postgres
It is compatible with pandas.DataFrame, both reading query results as dataframes and pushing dataframes to the database.
from ditat_etl.databases import Postgres
config = {
"database": "xxxx",
"user": "xxxx",
"password": "xxxx",
"host": "xxxxx",
"port": "xxxx"
}
p = Postgres(config)
The main base function is query.
p.query(
    query_statement: list or str,
    df: bool=False,
    as_dict: bool=False,
    commit: bool=True,
    returning: bool=True,
    mogrify: bool=False,
    mogrify_tuple: tuple or list=None,
    verbose=False
)
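The df / as_dict flags control the shape of returned rows. The as_dict idea can be illustrated with the standard sqlite3 module (illustration only, not the Postgres implementation; query_as_dicts is a hypothetical helper name):

```python
import sqlite3

def query_as_dicts(conn, statement):
    # Map each result row to {column_name: value}, like an as_dict=True result.
    cur = conn.execute(statement)
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a')")
rows = query_as_dicts(conn, "SELECT * FROM t")
# rows == [{'id': 1, 'name': 'a'}]
```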
This function is a workaround for pandas.DataFrame.to_sql(), which drops the table before inserting. It effectively works like an upsert, giving you the option to do nothing or to update on the column(s) constraint.
p.insert_df_to_sql(
    df: pd.DataFrame,
    tablename: str,
    commit=True,
    conflict_on: list=None,
    do_update_columns: bool or list=False,
    verbose=False
)
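The upsert behavior described above maps to PostgreSQL's INSERT ... ON CONFLICT clause. A sketch of the kind of statement this implies (build_upsert is a hypothetical helper, not the library's code):

```python
def build_upsert(tablename, columns, conflict_on=None, do_update_columns=False):
    # Build an INSERT with %s placeholders; on conflict either do nothing
    # or update the given columns (mirrors conflict_on / do_update_columns).
    cols = ', '.join(columns)
    placeholders = ', '.join(['%s'] * len(columns))
    sql = f"INSERT INTO {tablename} ({cols}) VALUES ({placeholders})"
    if conflict_on:
        sql += f" ON CONFLICT ({', '.join(conflict_on)})"
        if do_update_columns:
            updates = ', '.join(f"{c} = EXCLUDED.{c}" for c in do_update_columns)
            sql += f" DO UPDATE SET {updates}"
        else:
            sql += " DO NOTHING"
    return sql

build_upsert('t', ['id', 'v'], conflict_on=['id'], do_update_columns=['v'])
# "INSERT INTO t (id, v) VALUES (%s, %s) ON CONFLICT (id) DO UPDATE SET v = EXCLUDED.v"
```

DO NOTHING skips conflicting rows silently, while DO UPDATE overwrites only the listed columns, which is what makes this behave like an upsert rather than a drop-and-reload.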
This one is similar; it lets you "upsert" without necessarily having a primary key or constraint. Ideally, use the previous method.
p.update_df_to_sql(
    df: pd.DataFrame,
    tablename: str,
    on_columns: str or list,
    insert_new=True,
    commit=True,
    verbose=False
)