Python tools to assist with standardized data ingestion workflows for the OS-Climate project
osc-ingest-tools
Python tools to assist with standardized data ingestion workflows
Install from PyPI
pip install osc-ingest-tools
Examples
>>> from osc_ingest_trino import *
>>> import pandas as pd
>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()
>>> df
First Name Age In Years
0 tom 10
1 nick 15
2 juli 14
>>> enforce_sql_column_names(df)
first_name age_in_years
0 tom 10
1 nick 15
2 juli 14
>>> enforce_sql_column_names(df, inplace=True)
>>> df
first_name age_in_years
0 tom 10
1 nick 15
2 juli 14
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 first_name 3 non-null string
1 age_in_years 3 non-null Int64
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes
>>> p = create_table_schema_pairs(df)
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
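The session above can be approximated without the library to show the general idea. This is a minimal sketch, not the package's implementation: `to_sql_name` and `build_create_table` are hypothetical helpers, the table name `demo_schema.people` is made up, and the normalization rule (lowercase, underscores for anything non-alphanumeric) is an assumption inferred from the output shown.

```python
import re


def to_sql_name(name: str) -> str:
    """Approximate a SQL-friendly column name (assumed rule, not the library's)."""
    # Replace each run of non-alphanumeric characters with a single underscore.
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    # Drop leading/trailing underscores and lowercase the result.
    return cleaned.strip("_").lower()


def build_create_table(table: str, schema_pairs: str) -> str:
    """Wrap a 'name type, name type' string in a CREATE TABLE statement."""
    return f"create table if not exists {table} (\n{schema_pairs}\n)"


# Column names and SQL types taken from the example session above.
columns = [to_sql_name(c) for c in ["First Name", "Age In Years"]]
sql_types = ["varchar", "bigint"]

pairs = ",\n".join(f"    {col} {typ}" for col, typ in zip(columns, sql_types))
ddl = build_create_table("demo_schema.people", pairs)  # hypothetical table name
print(ddl)
```

In practice the string returned by create_table_schema_pairs would take the place of `pairs` here.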
Adding custom type mappings to create_table_schema_pairs
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])
>>> enforce_sql_column_names(df, inplace=True)
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 first_name 3 non-null object
1 age_in_years 3 non-null int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
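One way to picture the typemap argument is as a dictionary merged over a set of defaults. The sketch below is an illustration under stated assumptions, not the library's code: `DEFAULT_TYPEMAP` and `schema_pairs` are hypothetical names, the defaults list only the mappings visible in the examples above, and raising `ValueError` for an unmapped dtype is an assumed behavior.

```python
# Default dtype-to-SQL mappings visible in the examples above (assumed subset).
DEFAULT_TYPEMAP = {
    "string": "varchar",
    "Int64": "bigint",
    "int64": "bigint",
}


def schema_pairs(dtypes, typemap=None):
    """Build 'column sqltype' pairs from a {column: dtype name} mapping."""
    # Caller-supplied entries extend or override the defaults.
    merged = {**DEFAULT_TYPEMAP, **(typemap or {})}
    lines = []
    for column, dtype in dtypes.items():
        if dtype not in merged:
            # Assumed behavior: fail loudly rather than guess a SQL type.
            raise ValueError(f"no SQL type mapping for dtype {dtype!r}")
        lines.append(f"{column} {merged[dtype]}")
    return ",\n".join(lines)


# Without convert_dtypes(), text columns report the 'object' dtype,
# so a custom mapping is supplied for it.
dtypes = {"first_name": "object", "age_in_years": "int64"}
print(schema_pairs(dtypes, typemap={"object": "varchar"}))
```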
Build and upload a new release
- update all occurrences of __version__
python3 setup.py clean
python3 setup.py sdist
twine check dist/*
twine upload dist/*
- push latest to repo
- create new release on github
upload test or release candidate:
- twine upload --repository-url https://test.pypi.org/legacy/ dist/*
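For the "update all occurrences of __version__" step, a small script can list where the string appears before a release. This is a rough sketch: `find_version_lines` is a hypothetical helper, it scans only `*.py` files, and the demo uses a throwaway directory rather than the real repository layout.

```python
import tempfile
from pathlib import Path


def find_version_lines(root):
    """Return (file name, line number, text) for each line mentioning __version__."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if "__version__" in line:
                hits.append((path.name, lineno, line.strip()))
    return hits


# Demo on a temporary directory with made-up file contents.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "setup.py").write_text('__version__ = "0.3.0"\n', encoding="utf-8")
    Path(tmp, "other.py").write_text("import os\n", encoding="utf-8")
    hits = find_version_lines(tmp)

for name, lineno, text in hits:
    print(f"{name}:{lineno}: {text}")
```

Running `find_version_lines(".")` from the repository root would list every Python file that still needs its version string updated.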
python packaging resources
Download files
Source Distribution
osc-ingest-tools-0.3.0.tar.gz (9.4 kB)
Built Distribution
Hashes for osc_ingest_tools-0.3.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 487d82f04f57078d24d0f32ba66cea2bc507d3cedbc1deda7469c470abcc7aa6
MD5 | 6c2d8e145ae88f9d470c814cf78ad98b
BLAKE2b-256 | c64a4daa3a0fe3af5141233fc2474a6727ec5c68b671c2ba04044dc17158bbce