
Configuration of storage connections for mara


Mara Storage


Mini package for configuring and accessing multiple storages in a single project. Decouples the use of storages and their configuration by using "aliases" for storages.

The file mara_storage/storages.py contains abstract storage configurations for local disk and cloud storages. The storage connections of a project are configured by overwriting the storages function in mara_storage/config.py:

import pathlib
import mara_storage.config
import mara_storage.storages

## configure storage connections for different aliases
mara_storage.config.storages = lambda: {
    'data': mara_storage.storages.LocalStorage(base_path=pathlib.Path('data')),
    'gcs-bucket-1': mara_storage.storages.GoogleCloudStorage(bucket_name='my_data_lake_bucket_1', project_id='my_awesome_project')
}

## access individual storage configurations with `storages.storage`:
print(mara_storage.storages.storage('data'))
# -> <LocalStorage: base_path=data>
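The alias mechanism can be illustrated without the package itself. The following is a minimal sketch, not mara_storage's actual implementation: the two dataclasses and the functions storages and storage are hypothetical stand-ins that only mirror the names used above. It shows the idea that calling code refers to an alias, while the concrete configuration is looked up at call time.

```python
import pathlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalStorage:
    """Hypothetical stand-in for a local disk storage configuration."""
    base_path: pathlib.Path


@dataclass(frozen=True)
class GoogleCloudStorage:
    """Hypothetical stand-in for a cloud bucket configuration."""
    bucket_name: str
    project_id: str


def storages() -> dict:
    """Project-level configuration: maps aliases to storage configurations."""
    return {
        'data': LocalStorage(base_path=pathlib.Path('data')),
        'gcs-bucket-1': GoogleCloudStorage(bucket_name='my_data_lake_bucket_1',
                                           project_id='my_awesome_project'),
    }


def storage(alias: str):
    """Look up an alias when called, so a changed configuration is picked up."""
    configured = storages()
    if alias not in configured:
        raise KeyError(f'storage alias "{alias}" not configured')
    return configured[alias]


print(storage('data').base_path)
# -> data
```

Because the lookup happens inside storage, code that uses the 'data' alias does not need to change when the alias is later pointed at a different path or bucket.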

This package makes it possible to configure, manage, and access multiple storages in mara.

 

Batch processing: Accessing storages with shell commands

The file mara_storage/shell.py contains functions that create commands for accessing storage files via their command line clients.

For example, the read_file_command function creates a shell command that reads a file from a storage and returns its content to stdout:

import mara_storage.shell

file = 'my_domain.com/logs/2020/11/15/nginx.node-1.error.log'

print(mara_storage.shell.read_file_command('data', file_name=file))
# -> cat /mara/data/my_domain.com/logs/2020/11/15/nginx.node-1.error.log

print(mara_storage.shell.read_file_command('gcs-bucket-1', file_name=file))
# -> gsutil cat gs://my_data_lake_bucket_1/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
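One way such a command builder could pick the right client per storage type is dispatch on the configuration class. The sketch below is an assumption about the general technique, not the code in mara_storage/shell.py: the dataclasses are hypothetical stand-ins, the function takes a configuration object rather than an alias, and the use of functools.singledispatch and shlex.quote is this example's own choice.

```python
import functools
import pathlib
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalStorage:
    """Hypothetical stand-in for a local disk storage configuration."""
    base_path: pathlib.PurePosixPath


@dataclass(frozen=True)
class GoogleCloudStorage:
    """Hypothetical stand-in for a cloud bucket configuration."""
    bucket_name: str


@functools.singledispatch
def read_file_command(storage, file_name: str) -> str:
    """Fallback for storage types without a registered implementation."""
    raise NotImplementedError(f'no read command for {type(storage).__name__}')


@read_file_command.register
def _(storage: LocalStorage, file_name: str) -> str:
    # Local files are read with plain `cat`; quote the path for the shell
    return 'cat ' + shlex.quote(str(storage.base_path / file_name))


@read_file_command.register
def _(storage: GoogleCloudStorage, file_name: str) -> str:
    # Bucket objects are read with `gsutil cat gs://<bucket>/<object>`
    return 'gsutil cat ' + shlex.quote(f'gs://{storage.bucket_name}/{file_name}')


file = 'my_domain.com/logs/2020/11/15/nginx.node-1.error.log'

print(read_file_command(LocalStorage(pathlib.PurePosixPath('/mara/data')), file))
# -> cat /mara/data/my_domain.com/logs/2020/11/15/nginx.node-1.error.log

print(read_file_command(GoogleCloudStorage('my_data_lake_bucket_1'), file))
# -> gsutil cat gs://my_data_lake_bucket_1/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
```

Supporting another storage type in this design only requires registering one more function; callers stay unchanged.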

The function write_file_command creates a shell command that receives data on stdin and writes it to the storage:

import mara_storage.shell

command = 'echo "Hello World!"'
command += ' | '
command += mara_storage.shell.write_file_command('data', file_name='hello-world.txt')

print(command)
# -> echo "Hello World!" | cat - > /mara/data/hello-world.txt

Finally, delete_file_command creates a shell command that deletes a file from the local storage:

import mara_storage.shell

print(mara_storage.shell.delete_file_command('data', file_name='hello-world.txt'))
# -> rm -f /mara/data/hello-world.txt
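The generated strings are ordinary shell command lines, so they can be executed with any process runner. The following sketch assumes a Unix shell with echo, cat and rm available. To stay self-contained it does not call mara_storage.shell: the command strings are written by hand in the shape shown in the examples above and point at a temporary directory, and the helper run is hypothetical.

```python
import pathlib
import shlex
import subprocess
import tempfile


def run(command: str) -> str:
    """Run a shell command line and return its stdout; raise on a non-zero exit."""
    result = subprocess.run(command, shell=True, check=True,
                            capture_output=True, text=True)
    return result.stdout


with tempfile.TemporaryDirectory() as directory:
    target = pathlib.Path(directory) / 'hello-world.txt'
    path = shlex.quote(str(target))

    run(f'echo "Hello World!" | cat - > {path}')  # write: stdin -> file
    content = run(f'cat {path}')                  # read: file -> stdout
    run(f'rm -f {path}')                          # delete the file again
    still_exists = target.exists()

print(repr(content))
# -> 'Hello World!\n'
print(still_exists)
# -> False
```

Passing check=True makes a failing client command raise an exception instead of silently producing empty output, which is usually what a batch job wants.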

 

The following command line clients are used to access the various storages:

| Storage | Client binary | Comments |
| --- | --- | --- |
| Local storage | unix shell | Included in standard distributions. |
| SFTP storage | sftp, curl | |
| Google Cloud Storage | gsutil | From https://cloud.google.com/storage/docs/gsutil_install. |
| Azure Storage | azcopy, curl | From https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10 |
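Since the generated commands fail at run time when a client is not installed, it can be useful to check for the binaries up front. The following is a small sketch using only the standard library. The mapping CLIENT_BINARIES and the function missing_clients are hypothetical helpers that restate the table above; they are not part of mara-storage.

```python
import shutil

# Client binaries per storage type, as listed in the table above.
# Local storage relies on the unix shell itself, so nothing extra is checked.
CLIENT_BINARIES = {
    'Local storage': [],
    'SFTP storage': ['sftp', 'curl'],
    'Google Cloud Storage': ['gsutil'],
    'Azure Storage': ['azcopy', 'curl'],
}


def missing_clients(storage_type: str) -> list:
    """Return the client binaries for a storage type that are not on PATH."""
    return [binary for binary in CLIENT_BINARIES[storage_type]
            if shutil.which(binary) is None]


for storage_type in CLIENT_BINARIES:
    missing = missing_clients(storage_type)
    status = 'ok' if not missing else 'missing ' + ', '.join(missing)
    print(f'{storage_type}: {status}')
```

The output depends on the machine the check runs on, so no expected output is shown here.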

 

Installation

To use the library directly, use pip:

pip install mara-storage

or

pip install git+https://github.com/mara/mara-storage.git


