Configuration of storage connections for mara

Mara Storage

Mini package for configuring and accessing multiple storages in a single project. Decouples the use of storages and their configuration by using "aliases" for storages.

The file mara_storage/storages.py contains abstract storage configurations for local disk and cloud storages. The storage connections of a project are configured by overwriting the storages function in mara_storage/config.py:

import pathlib
import mara_storage.config
import mara_storage.storages

## configure storage connections for different aliases
mara_storage.config.storages = lambda: {
    'data': mara_storage.storages.LocalStorage(base_path=pathlib.Path('data')),
    'gcs-bucket-1': mara_storage.storages.GoogleCloudStorage(bucket_name='my_data_lake_bucket_1', project_id='my_awesome_project')
}

## access individual storage configurations with `storages.storage`:
print(mara_storage.storages.storage('data'))
# -> <LocalStorage: base_path=data>

This package makes it possible to configure, manage and access multiple storages in mara.
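The alias mechanism described above can be pictured with a small stand-alone sketch. It does not import mara_storage; the classes `LocalStorageSketch` and `BucketStorageSketch` and both functions are hypothetical stand-ins written for illustration, and the real package may implement the lookup differently. The point it shows is that calling code only ever names an alias, while the configuration function decides what that alias resolves to.

```python
import pathlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalStorageSketch:
    """Hypothetical stand-in for a local disk storage configuration."""
    base_path: pathlib.Path


@dataclass(frozen=True)
class BucketStorageSketch:
    """Hypothetical stand-in for a cloud bucket storage configuration."""
    bucket_name: str
    project_id: str


def storages() -> dict:
    """Project-level configuration: maps each alias to a storage configuration."""
    return {
        'data': LocalStorageSketch(base_path=pathlib.Path('data')),
        'gcs-bucket-1': BucketStorageSketch(bucket_name='my_data_lake_bucket_1',
                                            project_id='my_awesome_project'),
    }


def storage(alias: str):
    """Looks up a storage configuration by alias, failing loudly on unknown aliases."""
    configured = storages()
    if alias not in configured:
        raise KeyError(f'storage alias "{alias}" not configured')
    return configured[alias]


print(storage('data').base_path)
# -> data
```

Because `storage` calls `storages()` at lookup time, replacing the configuration function changes what every alias resolves to without touching the code that uses the alias.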

 

Batch processing: Accessing storages with shell commands

The file mara_storage/shell.py contains functions that create commands for accessing storage files via their command line clients.

For example, the read_file_command function creates a shell command that reads a file from a storage and returns its content to stdout:

import mara_storage.shell

file = 'my_domain.com/logs/2020/11/15/nginx.node-1.error.log'

print(mara_storage.shell.read_file_command('data', file_name=file))
# -> cat /mara/data/my_domain.com/logs/2020/11/15/nginx.node-1.error.log

print(mara_storage.shell.read_file_command('gcs-bucket-1', file_name=file))
# -> gsutil cat gs://my_data_lake_bucket_1/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
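One way to picture how a single function can yield a different shell command per storage type is dispatch on the configuration class. The sketch below is an assumption about the general technique, not the code in mara_storage/shell.py: `read_file_command_sketch`, `LocalStorageSketch` and `GcsStorageSketch` are hypothetical names, and the local base path `/mara/data` is taken from the example output above.

```python
import functools
import pathlib
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalStorageSketch:
    """Hypothetical local storage configuration."""
    base_path: pathlib.PurePosixPath


@dataclass(frozen=True)
class GcsStorageSketch:
    """Hypothetical Google Cloud Storage configuration."""
    bucket_name: str


@functools.singledispatch
def read_file_command_sketch(storage, file_name: str) -> str:
    """Fallback for storage types without a registered command builder."""
    raise NotImplementedError(f'no read command for {type(storage).__name__}')


@read_file_command_sketch.register
def _(storage: LocalStorageSketch, file_name: str) -> str:
    # Local files are read with plain `cat`; the path is shell-quoted
    return 'cat ' + shlex.quote(str(storage.base_path / file_name))


@read_file_command_sketch.register
def _(storage: GcsStorageSketch, file_name: str) -> str:
    # Bucket objects are read through the gsutil client
    return 'gsutil cat ' + shlex.quote(f'gs://{storage.bucket_name}/{file_name}')


file = 'my_domain.com/logs/2020/11/15/nginx.node-1.error.log'

local = LocalStorageSketch(base_path=pathlib.PurePosixPath('/mara/data'))
print(read_file_command_sketch(local, file))
# -> cat /mara/data/my_domain.com/logs/2020/11/15/nginx.node-1.error.log

bucket = GcsStorageSketch(bucket_name='my_data_lake_bucket_1')
print(read_file_command_sketch(bucket, file))
# -> gsutil cat gs://my_data_lake_bucket_1/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
```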

The function write_file_command creates a shell command that receives data on stdin and writes it to the storage:

import mara_storage.shell

command = 'echo "Hello World!"'
command += ' | '
command += mara_storage.shell.write_file_command('data', file_name='hello-world.txt')

print(command)
# -> echo "Hello World!" | cat - > /mara/data/hello-world.txt
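The composed command is only a string; something still has to hand it to a shell. A minimal sketch of that step, assuming a unix shell is available: the write command is built by hand here to mirror the output shown above (it does not call mara_storage), and a temporary directory stands in for the storage base path.

```python
import pathlib
import shlex
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    target = pathlib.Path(base_dir) / 'hello-world.txt'

    # Hand-built equivalent of the composed command shown above
    command = 'echo "Hello World!"'
    command += ' | '
    command += 'cat - > ' + shlex.quote(str(target))

    # check=True raises CalledProcessError if the pipeline's exit status is non-zero
    subprocess.run(command, shell=True, check=True)

    # Read the file back before the temporary directory is cleaned up
    written = target.read_text()

print(repr(written))
# -> 'Hello World!\n'
```

Note that a shell pipeline reports the exit status of its last command, so a failure in an earlier stage may go unnoticed unless the shell is configured otherwise.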

Finally, delete_file_command creates a shell command that deletes a file from the local storage:

import mara_storage.shell

print(mara_storage.shell.delete_file_command('data', file_name='hello-world.txt'))
# -> rm -f /mara/data/hello-world.txt

 

The following command line clients are used to access the various storages:

| Storage | Client binary | Comments |
| --- | --- | --- |
| Local storage | unix shell | Included in standard distributions. |
| SFTP storage | sftp, curl | |
| Google Cloud Storage | gsutil | From https://cloud.google.com/storage/docs/gsutil_install. |
| Azure Storage | azcopy, curl | From https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10 |
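Since the generated commands depend on these clients being installed, it can help to check for them before running a pipeline. The following is a small sketch using only the standard library; the mapping restates the table above, and `missing_clients` is a hypothetical helper, not part of mara-storage. Local storage is left out because it only needs the unix shell itself.

```python
import shutil

# Client binaries per storage type, restated from the table above
CLIENT_BINARIES = {
    'SFTP storage': ['sftp', 'curl'],
    'Google Cloud Storage': ['gsutil'],
    'Azure Storage': ['azcopy', 'curl'],
}


def missing_clients(storage_types):
    """Returns the sorted client binaries that are not on PATH for the given storage types."""
    needed = {binary
              for storage_type in storage_types
              for binary in CLIENT_BINARIES[storage_type]}
    return sorted(binary for binary in needed if shutil.which(binary) is None)


# Lists whichever clients are absent on this machine
print(missing_clients(['Google Cloud Storage', 'Azure Storage']))
```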

 

Installation

To use the library directly, use pip:

pip install mara-storage

or

pip install git+https://github.com/mara/mara-storage.git
