OS-Climate Python tools to assist with standardized data ingestion workflows

osc-ingest-tools

Python tools to assist with standardized data ingestion workflows.

Installation, Usage, and Release Management

Install from PyPI

pip install osc-ingest-tools

Examples

>>> from osc_ingest_trino import *

>>> import pandas as pd

>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]

>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()

>>> df
  First Name  Age In Years
0        tom            10
1       nick            15
2       juli            14

>>> enforce_sql_column_names(df)
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14

>>> enforce_sql_column_names(df, inplace=True)

>>> df
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
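
The renaming shown above can be approximated in plain Python. This is only a minimal sketch under stated assumptions: `to_sql_name` is a hypothetical helper, not the library's implementation, and the actual rules applied by `enforce_sql_column_names` may differ (for example in how punctuation or long names are handled).

```python
import re


def to_sql_name(name):
    """Hypothetical helper: turn a column label into a SQL-friendly identifier."""
    s = str(name).strip().casefold()
    # Replace each run of characters other than a-z, 0-9, or "_" with one underscore
    s = re.sub(r"[^0-9a-z_]+", "_", s)
    # Collapse repeated underscores and trim them from both ends
    return re.sub(r"_+", "_", s).strip("_")


columns = ["First Name", "Age In Years"]
normalized = [to_sql_name(c) for c in columns]
print(normalized)  # ['first_name', 'age_in_years']
```

For the two labels used in this example, the sketch yields the same names as the output above.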

>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      string
 1   age_in_years  3 non-null      Int64
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes

>>> p = create_table_schema_pairs(df)

>>> print(p)
    first_name varchar,
    age_in_years bigint


Adding custom type mappings to create_table_schema_pairs

>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])

>>> enforce_sql_column_names(df, inplace=True)

>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      object
 1   age_in_years  3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes

>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})

>>> print(p)
    first_name varchar,
    age_in_years bigint


Development

Patches may be contributed via pull requests to https://github.com/os-climate/osc-ingest-tools.

All changes must pass the automated test suite, along with various static checks.

Black code style and isort import ordering are enforced.

Enabling automatic formatting via pre-commit is recommended:

pip install black isort pre-commit
pre-commit install

To ensure compliance with the static check tools, developers may wish to run:

pip install black isort
# auto-sort imports
isort .
# auto-format code
black .

Code can then be tested using tox:

# run static checks and tests
tox
# run only tests
tox -e py3
# run only static checks
tox -e static
# run tests and produce a code coverage report
tox -e cov

Releasing

To release a new version of this library, authorized developers should:

  • Prepare a signed release commit that updates the version in setup.py
  • Tag the commit using Semantic Versioning, prefixed with "v"
  • Push the tag

For example:

git commit -sm "Release v0.3.4"
git tag v0.3.4
git push --follow-tags

A GitHub workflow will then automatically release the version to PyPI.
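
Before pushing, it can help to check that a tag follows the "v" plus Semantic Versioning convention described above. The following is a simplified sketch: `TAG_PATTERN` and `parse_release_tag` are hypothetical names, and the pattern covers only plain MAJOR.MINOR.PATCH versions, not the pre-release or build metadata suffixes that the full Semantic Versioning grammar allows.

```python
import re

# "v" followed by MAJOR.MINOR.PATCH, each part without leading zeros
TAG_PATTERN = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_release_tag(tag):
    """Hypothetical helper: return (major, minor, patch) for a tag such as 'v0.3.4'."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a valid release tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


print(parse_release_tag("v0.3.4"))  # (0, 3, 4)
```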
