
Shimoku Tangram

A Python library for data handling, storage operations, and logging.

Installation

pip install shimoku-tangram

Quick Start

from shimoku_tangram.storage import s3
from shimoku_tangram.reporting.logging import init_logger

# Initialize logging
logger = init_logger("MyLogger")

# Use S3 operations
s3.put_json_object("my-bucket", "path/to/file.json", {"key": "value"})

Components

Storage (s3)

All available methods in the S3 module:

Basic Bucket Operations

bucket_exists(bucket: str) -> bool
clear_path(bucket: str, prefix: str = "") -> bool

Object Listing

list_objects_metadata(bucket: str, prefix: str = "") -> list
list_objects_key(bucket: str, prefix: str = "") -> list
list_single_object_key(bucket: str, prefix: str) -> str
list_multiple_objects_keys(bucket: str, prefix: str) -> list[str]
list_objects_key_between_dates(bucket: str, prefix: str, start_date: datetime, end_date: datetime) -> list[str]
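The date-ranged listing above can be illustrated with a small local sketch. This is a hypothetical filter, not the library's implementation: it assumes keys embed an ISO `YYYY-MM-DD` date stamp in the file name, which may not match the real key layout.

```python
from datetime import datetime

def keys_between_dates(keys, start_date, end_date):
    """Keep keys whose embedded ISO date falls within [start_date, end_date].

    Hypothetical sketch: assumes file names like 'data/2024-01-15.json';
    keys without a parseable date are skipped.
    """
    selected = []
    for key in keys:
        stem = key.rsplit("/", 1)[-1].split(".")[0]  # e.g. '2024-01-15'
        try:
            day = datetime.strptime(stem, "%Y-%m-%d")
        except ValueError:
            continue  # not a dated key
        if start_date <= day <= end_date:
            selected.append(key)
    return selected

keys = ["data/2024-01-15.json", "data/2024-06-01.json", "data/readme.txt"]
print(keys_between_dates(keys, datetime(2024, 1, 1), datetime(2024, 3, 31)))
# → ['data/2024-01-15.json']
```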

Basic Object Operations

get_object(bucket: str, key: str, compressed: bool = True) -> bytes
put_object(bucket: str, key: str, body: bytes, compress: bool = True) -> bool
delete_object(bucket: str, key: str) -> bool
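To show what the `compress`/`compressed` flags likely mean, here is an in-memory stand-in for the S3 client. The gzip round-trip is an assumption based on the library's "automatic compression (gzip)" feature; the real functions talk to S3 via boto3.

```python
import gzip

class FakeS3:
    """In-memory stand-in for the S3-backed put/get pair, used only to
    illustrate the assumed semantics of the compress/compressed flags."""

    def __init__(self):
        self._objects = {}

    def put_object(self, bucket, key, body, compress=True):
        # Optionally gzip the payload before "uploading" it.
        data = gzip.compress(body) if compress else body
        self._objects[(bucket, key)] = data
        return True

    def get_object(self, bucket, key, compressed=True):
        # Optionally gunzip the payload after "downloading" it.
        data = self._objects[(bucket, key)]
        return gzip.decompress(data) if compressed else data

s3 = FakeS3()
s3.put_object("my-bucket", "a/b.bin", b"payload")
print(s3.get_object("my-bucket", "a/b.bin"))  # → b'payload'
```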

Text Operations

get_text_object(bucket: str, key: str, encoding: str = "utf-8", compressed: bool = True) -> str
put_text_object(bucket: str, key: str, body: str, encoding: str = "utf-8", compress: bool = True) -> bool

JSON Operations

get_json_object(bucket: str, key: str, compressed: bool = True) -> dict
put_json_object(bucket: str, key: str, body: dict, compress: bool = True) -> bool
get_single_json_object(bucket: str, prefix: str) -> dict
put_single_json_object(bucket: str, prefix: str, body: dict) -> str
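A plausible encode/decode pair for the JSON operations, under the same gzip assumption as above (this mirrors what `put_json_object`/`get_json_object` presumably do around the S3 transfer, but is not the library's code):

```python
import gzip
import json

def encode_json(body, compress=True):
    """Serialize a dict to UTF-8 JSON bytes, optionally gzipped."""
    raw = json.dumps(body).encode("utf-8")
    return gzip.compress(raw) if compress else raw

def decode_json(data, compressed=True):
    """Inverse of encode_json: gunzip if needed, then parse JSON."""
    raw = gzip.decompress(data) if compressed else data
    return json.loads(raw.decode("utf-8"))

blob = encode_json({"key": "value"})
print(decode_json(blob))  # → {'key': 'value'}
```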

Pickle Operations

get_pkl_object(bucket: str, key: str, compressed: bool = True) -> dict
put_pkl_object(bucket: str, key: str, body, compress: bool = True) -> bool
get_single_pkl_object(bucket: str, prefix: str)
put_single_pkl_object(bucket: str, prefix: str, body) -> str
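The pickle operations likely follow the same pattern with `pickle` in place of `json`, which is why `put_pkl_object` accepts an untyped `body`: any picklable Python object round-trips. A hedged sketch (not the library's code):

```python
import gzip
import pickle

def encode_pkl(obj, compress=True):
    """Serialize any picklable object to bytes, optionally gzipped."""
    raw = pickle.dumps(obj)
    return gzip.compress(raw) if compress else raw

def decode_pkl(data, compressed=True):
    """Inverse of encode_pkl: gunzip if needed, then unpickle."""
    raw = gzip.decompress(data) if compressed else data
    return pickle.loads(raw)

blob = encode_pkl({"model": [1, 2, 3]})
print(decode_pkl(blob))  # → {'model': [1, 2, 3]}
```

Note that, unlike JSON, unpickling executes arbitrary code paths, so pickled objects should only be read from buckets you control.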

CSV/DataFrame Operations

get_multiple_csv_objects(bucket: str, prefix: str) -> pd.DataFrame
put_multiple_csv_objects(bucket: str, prefix: str, body: pd.DataFrame, size_max_mb: float = 100) -> list[str]
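The `size_max_mb` parameter suggests the DataFrame is split into multiple CSV objects capped at roughly that size. The chunking idea can be sketched with the standard-library `csv` module (pure Python here to stay self-contained; the real function works on a `pd.DataFrame` and the exact splitting rule is an assumption):

```python
import csv
import io

def split_rows_by_size(header, rows, max_bytes):
    """Write rows into CSV chunks, starting a new chunk (with a repeated
    header) once the current one reaches max_bytes. Sketch only: the real
    library's splitting rule may differ."""
    chunks = []
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    rows_in_chunk = 0
    for row in rows:
        writer.writerow(row)
        rows_in_chunk += 1
        if buf.tell() >= max_bytes:
            chunks.append(buf.getvalue())
            buf = io.StringIO()
            writer = csv.writer(buf)
            writer.writerow(header)
            rows_in_chunk = 0
    if rows_in_chunk:  # flush the last partial chunk
        chunks.append(buf.getvalue())
    return chunks

header = ["id", "value"]
rows = [[i, "x" * 10] for i in range(100)]
chunks = split_rows_by_size(header, rows, max_bytes=500)
print(len(chunks) > 1)  # → True (100 rows don't fit in one 500-byte chunk)
```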

Threaded Operations

get_multiple_csv_objects_threaded(bucket: str, prefixes: list[str], logger: logging.Logger | None = None) -> pd.DataFrame
put_multiple_csv_objects_threaded(bucket: str, dfs: dict[str, pd.DataFrame], size_max_mb: float = 100, logger: logging.Logger | None = None) -> None
get_multiple_csv_objects_between_dates_threaded(bucket: str, prefix: str, start_date: datetime, end_date: datetime) -> pd.DataFrame
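The threaded variants presumably fan per-prefix reads out over a thread pool, which pays off because S3 reads are I/O-bound. A minimal sketch with `concurrent.futures` and a stubbed fetch function (`fetch` stands in for the per-prefix S3 read; the real functions return a concatenated `pd.DataFrame`):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(prefix):
    """Stand-in for a per-prefix S3 read; returns fake rows."""
    return [{"prefix": prefix, "n": i} for i in range(3)]

def fetch_all_threaded(prefixes, max_workers=4):
    """Fetch every prefix concurrently and concatenate the results.
    executor.map preserves the input order of prefixes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fetch, prefixes)
    rows = []
    for part in results:
        rows.extend(part)
    return rows

rows = fetch_all_threaded(["a", "b"])
print(len(rows))  # → 6
```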

Utility Functions

get_extension(key: str, compressed: bool = True) -> str
is_compressed(key: str) -> bool
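Given the gzip convention elsewhere in the library, these helpers plausibly key off a trailing `.gz` suffix. A hedged reimplementation of that assumed behavior (not the library's code):

```python
def is_compressed(key):
    """True if the key carries a trailing .gz suffix (assumed convention)."""
    return key.endswith(".gz")

def get_extension(key, compressed=True):
    """Return the content extension, ignoring a trailing .gz when present,
    so 'data/file.json.gz' reports 'json' rather than 'gz'."""
    name = key.rsplit("/", 1)[-1]
    if compressed and name.endswith(".gz"):
        name = name[:-3]
    return name.rsplit(".", 1)[-1] if "." in name else ""

print(get_extension("data/file.json.gz"))  # → json
print(is_compressed("data/file.json.gz"))  # → True
```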

Metadata

get_last_timestamp(bucket: str, prefix: str) -> str
set_last_timestamp(bucket: str, prefix: str) -> str

Logging

Initialize with custom name and level:

import logging

from shimoku_tangram.reporting.logging import init_logger

logger = init_logger("MyLogger", logging.INFO)
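A factory like `init_logger` typically wraps `logging.getLogger` with a handler and formatter. This is a plausible sketch of such a factory, not the library's actual implementation; the format string is an assumption:

```python
import logging
import sys

def init_logger(name, level=logging.INFO):
    """Hypothetical sketch of a logger factory: attaches one stream handler
    with a timestamped format, guarding against duplicate handlers when
    called more than once for the same name."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # don't stack handlers on repeated init
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger

logger = init_logger("MyLogger", logging.DEBUG)
logger.debug("logger ready")
```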

Features

  • Automatic compression (gzip)
  • Thread support for large operations
  • Pandas DataFrame integration
  • S3 path management
  • Structured logging
  • Error handling

Example

from shimoku_tangram.storage import s3
from datetime import datetime

# Get data between dates
df = s3.get_multiple_csv_objects_between_dates_threaded(
    bucket="my-bucket",
    prefix="data/path",
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 12, 31)
)

# Store processed data
s3.put_multiple_csv_objects(
    bucket="my-bucket",
    prefix="output/path",
    body=df,
    size_max_mb=50
)

License

MIT
