A package that provides utilities for data wrangling with AWS S3, pandas, and geojson.

softwrdwrangler

softwrdwrangler is a Python package that simplifies common operations on AWS resources, with a primary focus on S3. It provides utilities for reading and writing several file formats (pickle, JSON, CSV, and GeoJSON) and for managing S3 objects.

Pre-requisites

You need to have AWS CLI installed and configured with the necessary permissions to interact with AWS services.

Install AWS CLI

Follow the official AWS CLI installation guide.

Once installed, configure it by running:

aws configure

Installation

Install the package via pip:

pip install softwrdwrangler

Usage

S3 Operations

S3 Read and Write Pickle

import softwrdwrangler as swr

# Reading a pickle file from S3
data = swr.read_pickle('s3://bucket-name/path/to/file.pkl')

# Writing a DataFrame to a pickle file in S3
swr.write_pickle(data, 's3://bucket-name/path/to/file.pkl')

S3 Read and Write JSON

import softwrdwrangler as swr

# Writing a dictionary to a JSON file in S3
data = {'a': 1, 'b': 2}
swr.write_json(data, 's3://bucket-name/path/to/file.json')

# Reading a JSON file from S3
print(swr.read_json('s3://bucket-name/path/to/file.json'))

S3 Read and Write CSV

import softwrdwrangler as swr

# Reading a CSV file from S3
data = swr.read_csv('s3://bucket-name/path/to/file.csv')

# Writing a DataFrame to a CSV file in S3
swr.write_csv(data, 's3://bucket-name/path/to/file.csv')

S3 Read and Write GeoJSON

import softwrdwrangler as swr

# Reading a GeoJSON file from S3
data = swr.read_geojson('s3://bucket-name/path/to/file.geojson')

# Writing a dictionary to a GeoJSON file in S3
swr.write_geojson(data, 's3://bucket-name/path/to/file.geojson')
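
For reference, here is a minimal sketch of what a GeoJSON dictionary looks like, using only the standard library. It assumes, as the comment above suggests, that `write_geojson` accepts a plain dictionary; the helper `looks_like_feature_collection` is hypothetical and not part of softwrdwrangler. Coordinates follow the GeoJSON convention of longitude first, then latitude.

```python
import json

# A minimal GeoJSON FeatureCollection with a single point feature.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [90.4125, 23.8103]},
            "properties": {"name": "Dhaka"},
        }
    ],
}


def looks_like_feature_collection(obj) -> bool:
    """Hypothetical sanity check: is obj shaped like a FeatureCollection?"""
    if not isinstance(obj, dict) or obj.get("type") != "FeatureCollection":
        return False
    features = obj.get("features")
    if not isinstance(features, list):
        return False
    return all(isinstance(f, dict) and f.get("type") == "Feature" for f in features)


# The dictionary survives a JSON round trip unchanged.
round_tripped = json.loads(json.dumps(feature_collection))
print(looks_like_feature_collection(round_tripped))
```

A check like this can be run before uploading, so that malformed data is caught locally rather than after it has been written to S3.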

Additional S3 Utilities

List Files

List all files in a specified S3 bucket and prefix.

import softwrdwrangler as swr

files = swr.list_files('s3://bucket-name/path/to/folder/')
print(files)
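
All functions in this package take paths of the form `s3://bucket-name/key`. The sketch below shows how such a URI breaks down into a bucket and a key (or prefix). The helper `split_s3_uri` is hypothetical and only illustrates the URI layout; it is not how softwrdwrangler necessarily parses paths internally.

```python
def split_s3_uri(uri: str) -> tuple:
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the key or prefix.
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"Missing bucket name in: {uri!r}")
    return bucket, key


bucket, prefix = split_s3_uri("s3://bucket-name/path/to/folder/")
print(bucket)  # bucket-name
print(prefix)  # path/to/folder/
```

Note that the trailing slash is part of the prefix. S3 has no real folders, so a prefix of `path/to/folder` would also match keys such as `path/to/folder-old/file.csv`.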

Delete File

Delete a specific file from S3.

import softwrdwrangler as swr

swr.delete_file('s3://bucket-name/path/to/file')

Upload Local File to S3

Upload a file from local storage to S3.

import softwrdwrangler as swr

swr.upload_file('/local/path/to/file', 's3://bucket-name/path/to/file')

Download S3 File to Local Storage

Download a file from S3 to local storage.

import softwrdwrangler as swr

swr.download_file('s3://bucket-name/path/to/file', '/local/path/to/file')

Copy File within S3

Copy a file from one S3 location to another.

import softwrdwrangler as swr

swr.copy_file('s3://source-bucket/path/to/file', 's3://destination-bucket/path/to/file')

Generate Pre-signed URL

Generate a pre-signed URL to grant temporary access to an S3 object.

import softwrdwrangler as swr

url = swr.generate_presigned_url('s3://bucket-name/path/to/file', expiration=3600)
print(url)
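
The sketch below illustrates what the `expiration` value means in practice. It assumes that `expiration` is given in seconds, as is the case for the `ExpiresIn` parameter of boto3's `generate_presigned_url`; the README does not state the unit explicitly. The function `presigned_url_expiry` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def presigned_url_expiry(issued_at: datetime, expiration: int = 3600) -> datetime:
    """Return the moment a URL issued at `issued_at` stops working.

    Assumes `expiration` is a number of seconds (hypothetical helper).
    """
    return issued_at + timedelta(seconds=expiration)


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(presigned_url_expiry(issued).isoformat())  # 2024-01-01T13:00:00+00:00
```

Under that assumption, `expiration=3600` grants access for one hour from the time the URL is generated.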

Advanced Operations

Get Latest Date-stamped Folder (ds_nodash)

Retrieve the latest date-stamped folder (e.g., 20221201) from a given S3 prefix, with an optional skip count.

import softwrdwrangler as swr

latest_date = swr.get_latest_ds_nodash('s3://bucket-name/path', skip=1)
if latest_date:
    print(f"Latest ds_nodash after skipping 1: {latest_date}")
else:
    print("No valid date folders found or error occurred.")
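
To make the `skip` behaviour concrete, here is a minimal sketch of the selection logic on a plain list of folder names. It assumes that `skip=1` means "ignore the newest folder and return the second newest" and that folder names which are not valid `YYYYMMDD` dates are ignored. The function `pick_latest_ds_nodash` is hypothetical and may differ from the package's actual implementation.

```python
from datetime import datetime
from typing import Iterable, Optional


def pick_latest_ds_nodash(folder_names: Iterable[str], skip: int = 0) -> Optional[str]:
    """Return the newest YYYYMMDD folder name after skipping `skip` newer ones."""
    valid = set()
    for name in folder_names:
        name = name.strip("/")
        # Require exactly eight ASCII digits before attempting to parse.
        if len(name) != 8 or not (name.isascii() and name.isdigit()):
            continue
        try:
            datetime.strptime(name, "%Y%m%d")  # rejects dates such as 20221345
        except ValueError:
            continue
        valid.add(name)
    # YYYYMMDD strings sort chronologically, so newest first is reverse order.
    ordered = sorted(valid, reverse=True)
    return ordered[skip] if 0 <= skip < len(ordered) else None


folders = ["20221129/", "20221201/", "20221130/", "logs/", "20221345/"]
print(pick_latest_ds_nodash(folders))          # 20221201
print(pick_latest_ds_nodash(folders, skip=1))  # 20221130
```

Skipping the newest folder is useful when the most recent partition may still be incomplete, for example while a daily job is writing to it.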

License

softwrdwrangler is licensed under the Apache License, Version 2.0 (January 2004): http://www.apache.org/licenses/
