delta_table_utils

Delta table utilities.

This library is intended for Databricks users who want to upsert data into Delta tables using Auto Loader.

Basic usage:

from delta_table.delta_table_utils import DeltaTableColumn, DeltaTable

schema_name = 'my_schema'
table_name = 'my_table'

# Define the delta table schema
column_list = [
    DeltaTableColumn('id', data_type='STRING', nulls_allowed=False, is_unique_id=True),
    DeltaTableColumn('col1', data_type='STRING', nulls_allowed=False),
    DeltaTableColumn('col2', data_type='DOUBLE'),
    DeltaTableColumn('col3', data_type='DOUBLE'),
    DeltaTableColumn('col4', data_type='DOUBLE'),
    DeltaTableColumn('created_at', data_type='TIMESTAMP'),
    DeltaTableColumn('updated_at', data_type='TIMESTAMP')
]

# Create the DeltaTable object
delta_table = DeltaTable(
    schema_name=schema_name,
    table_name=table_name,
    upload_path="<location_of_data_in_s3>",
    column_list=column_list,
)

# Create the table if it does not already exist, then start the stream.
# `sqlContext` and `spark` are predefined in Databricks notebooks.
delta_table.create_if_not_exists(sqlContext)
delta_table.stream(spark, cloudFiles_format='csv')
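For readers unfamiliar with upserts, the following sketch shows what an upsert on a Delta table usually amounts to: a MERGE INTO statement that updates rows whose key already exists and inserts the rest. It is not taken from this library, whose internals are not shown here. The helper name build_merge_sql and the source view name updates are hypothetical, and it assumes the column flagged is_unique_id=True acts as the match key.

```python
def build_merge_sql(target, key, columns, source="updates"):
    """Build a Delta Lake MERGE INTO statement that upserts `source` into `target`.

    Hypothetical helper for illustration; not part of delta_table_utils.
    """
    if key not in columns:
        raise ValueError(f"key column {key!r} must be listed in columns")
    non_key = [c for c in columns if c != key]
    if not non_key:
        raise ValueError("at least one non-key column is needed for the UPDATE clause")

    # Matched rows: overwrite every non-key column with the incoming value.
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in non_key)
    # Unmatched rows: insert the full incoming row.
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)

    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Same column names as the schema in the usage example.
columns = ["id", "col1", "col2", "col3", "col4", "created_at", "updated_at"]
print(build_merge_sql("my_schema.my_table", "id", columns))
```

Matching on a single unique key keeps the merge condition simple; a table with a composite key would need the ON clause extended accordingly.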

Additional notes

By default, the stream method stops as soon as no new data is detected. This is useful if you would rather update your Delta tables on a schedule than keep a cluster running all the time.
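As an illustration of this stop-when-idle pattern, here is a minimal sketch of how it is commonly achieved with plain Spark Structured Streaming and Auto Loader. It is an assumption about the general technique, not this library's implementation. The function name run_scheduled_upsert, its parameters, and the updates view name are hypothetical; the code also assumes a runtime where trigger(availableNow=True) and DataFrame.sparkSession are available (PySpark 3.3 or later).

```python
def run_scheduled_upsert(spark, source_path, checkpoint_path, merge_sql, file_format="csv"):
    """Process all files currently available, upsert them, then stop.

    Hypothetical helper for illustration; not part of delta_table_utils.
    `merge_sql` is a MERGE INTO statement that reads from a view named `updates`.
    """

    def upsert_batch(batch_df, batch_id):
        # Expose the micro-batch as a temporary view, then merge it into the target.
        batch_df.createOrReplaceTempView("updates")
        batch_df.sparkSession.sql(merge_sql)

    # Auto Loader source: incrementally discovers new files under source_path.
    stream_df = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", file_format)
        .option("cloudFiles.schemaLocation", checkpoint_path)
        .load(source_path)
    )

    query = (
        stream_df.writeStream
        .foreachBatch(upsert_batch)
        .option("checkpointLocation", checkpoint_path)
        # Consume everything that is available now, then terminate the query.
        # Older runtimes would use trigger(once=True) instead.
        .trigger(availableNow=True)
        .start()
    )
    query.awaitTermination()
    return query
```

Because the checkpoint records which files have already been ingested, a scheduled job can call this repeatedly and each run picks up only the files that arrived since the previous one.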
