Configuration of storage connections for mara
Mara Storage
Mini package for configuring and accessing multiple storages in a single project. Decouples the use of storages and their configuration by using "aliases" for storages.
The file mara_storage/storages.py contains abstract storage configurations for local disk and cloud storages. The storage connections of a project are configured by overwriting the storages function in mara_storage/config.py:
import pathlib
import mara_storage.config
import mara_storage.storages
## configure storage connections for different aliases
mara_storage.config.storages = lambda: {
    'data': mara_storage.storages.LocalStorage(base_path=pathlib.Path('data')),
    'gcs-bucket-1': mara_storage.storages.GoogleCloudStorage(bucket_name='my_data_lake_bucket_1', project_id='my_awesome_project')
}
## access individual storage configurations with `storages.storage`:
print(mara_storage.storages.storage('data'))
# -> <LocalStorage: base_path=data>
This package makes it possible to configure, manage and access multiple storages in mara.
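The alias mechanism can be illustrated without the package itself. The following is a minimal sketch, assuming the lookup function calls the configuration function each time it is used. The two classes are hypothetical stand-ins that only carry the fields shown in the snippet above; they are not the real mara_storage.storages classes, and the real lookup may behave differently.

```python
import pathlib
from dataclasses import dataclass


# Hypothetical stand-ins, NOT the real mara_storage.storages classes.
@dataclass(frozen=True)
class LocalStorage:
    base_path: pathlib.Path


@dataclass(frozen=True)
class GoogleCloudStorage:
    bucket_name: str
    project_id: str


def storages() -> dict:
    """Default configuration: empty until a project overrides it."""
    return {}


def storage(alias: str):
    """Look up a storage by alias; storages() is evaluated at lookup time."""
    configured = storages()
    if alias not in configured:
        raise KeyError(f'storage alias "{alias}" is not configured')
    return configured[alias]


# A project replaces the configuration function, as in the snippet above.
storages = lambda: {
    'data': LocalStorage(base_path=pathlib.Path('data')),
    'gcs-bucket-1': GoogleCloudStorage(bucket_name='my_data_lake_bucket_1',
                                       project_id='my_awesome_project'),
}

print(storage('data').base_path)
# -> data
```

Because code only refers to the alias 'data', the storage behind it can be swapped in the configuration without touching the code that uses it.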
Batch processing: Accessing storages with shell commands
The file mara_storage/shell.py contains functions that create commands for accessing storage files via their command line clients.
For example, the read_file_command function creates a shell command that reads a file from a storage and returns its content to stdout:
import mara_storage.shell
file = 'my_domain.com/logs/2020/11/15/nginx.node-1.error.log'
print(mara_storage.shell.read_file_command('data', file_name=file))
# -> cat /mara/data/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
print(mara_storage.shell.read_file_command('gcs-bucket-1', file_name=file))
# -> gsutil cat gs://my_data_lake_bucket_1/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
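One way such a function could choose the right client per storage type is dispatch on the storage object. The sketch below assumes that approach and uses functools.singledispatch with hypothetical stand-in classes. It is not the code in mara_storage/shell.py, and unlike the calls above it takes a storage object rather than an alias.

```python
import pathlib
import shlex
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class LocalStorage:  # hypothetical stand-in
    base_path: pathlib.PurePosixPath


@dataclass(frozen=True)
class GoogleCloudStorage:  # hypothetical stand-in
    bucket_name: str


@singledispatch
def read_file_command(storage, file_name: str) -> str:
    """Fallback for storage types without a registered implementation."""
    raise NotImplementedError(f'no read command for {type(storage).__name__}')


@read_file_command.register
def _(storage: LocalStorage, file_name: str) -> str:
    # Local files are read with cat; the path is quoted for the shell.
    return 'cat ' + shlex.quote(str(storage.base_path / file_name))


@read_file_command.register
def _(storage: GoogleCloudStorage, file_name: str) -> str:
    # Objects in a bucket are read with the gsutil client.
    return 'gsutil cat ' + shlex.quote(f'gs://{storage.bucket_name}/{file_name}')


file = 'my_domain.com/logs/2020/11/15/nginx.node-1.error.log'
print(read_file_command(LocalStorage(pathlib.PurePosixPath('/mara/data')), file))
# -> cat /mara/data/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
print(read_file_command(GoogleCloudStorage('my_data_lake_bucket_1'), file))
# -> gsutil cat gs://my_data_lake_bucket_1/my_domain.com/logs/2020/11/15/nginx.node-1.error.log
```

Quoting with shlex.quote is a choice made for this sketch; it leaves the paths above unchanged and only adds quotes when a file name contains characters such as spaces.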
The function write_file_command creates a shell command that receives data on stdin and writes it to the storage:
import mara_storage.shell
command = 'echo "Hello World!"'
command += ' | '
command += mara_storage.shell.write_file_command('data', file_name='hello-world.txt')
print(command)
# -> echo "Hello World!" | cat - > /mara/data/hello-world.txt
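The generated strings are ordinary shell pipelines, so they can be handed to a shell for execution. The sketch below assumes a unix shell and builds the write command by hand in the same shape as the output above instead of calling mara_storage. The temporary directory is an illustration only.

```python
import pathlib
import shlex
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = pathlib.Path(tmp_dir) / 'hello-world.txt'

    # Hand-built stand-in for the string returned by write_file_command.
    write_command = 'cat - > ' + shlex.quote(str(target))
    command = 'echo "Hello World!" | ' + write_command

    # shell=True is needed because the command contains a pipe and a redirect.
    subprocess.run(command, shell=True, check=True)
    content = target.read_text()

print(repr(content))
# -> 'Hello World!\n'
```

With check=True, a non-zero exit status of the pipeline raises subprocess.CalledProcessError instead of passing silently.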
Finally, delete_file_command creates a shell command that deletes a file from the local storage:
import mara_storage.shell
print(mara_storage.shell.delete_file_command('data', file_name='hello-world.txt'))
# -> rm -f /mara/data/hello-world.txt
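Because the generated command uses rm -f, deleting a file that is already gone is not an error. The sketch below checks this on a temporary file, assuming a unix shell. The command string is built by hand in the shape shown above rather than by calling mara_storage.

```python
import pathlib
import shlex
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = pathlib.Path(tmp_dir) / 'hello-world.txt'
    target.write_text('Hello World!\n')

    # Hand-built stand-in for the string returned by delete_file_command.
    delete_command = 'rm -f ' + shlex.quote(str(target))

    first = subprocess.run(delete_command, shell=True)   # removes the file
    second = subprocess.run(delete_command, shell=True)  # file is already gone
    still_exists = target.exists()

print(first.returncode, second.returncode, still_exists)
# -> 0 0 False
```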
The following command line clients are used to access the various storages:
Storage | Client binary | Comments |
---|---|---|
Local storage | unix shell | Included in standard distributions. |
Google Cloud Storage | gsutil | From https://cloud.google.com/storage/docs/gsutil_install. |
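Since the generated commands only work when the client binaries are installed, a quick check before running a pipeline can help. The helper below is a hypothetical sketch and not part of mara_storage. It looks for cat and rm (used in the local examples above) and gsutil on the PATH.

```python
import shutil


def missing_clients(binaries):
    """Return the names in `binaries` that are not found on the PATH."""
    return [name for name in binaries if shutil.which(name) is None]


missing = missing_clients(['cat', 'rm', 'gsutil'])
if missing:
    print('not found on PATH: ' + ', '.join(missing))
else:
    print('all client binaries found')
```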
Installation
To use the library directly, use pip:
pip install mara-storage
or
pip install git+https://github.com/mara/mara-storage.git
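To confirm that the installation succeeded, the installed version can be read from the package metadata. This is a minimal sketch using importlib.metadata from the standard library (Python 3.8 or later); the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if mara-storage is installed, otherwise None.
print(installed_version('mara-storage'))
```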