A Python wrapper to make file backends seamless
CS Python File Wrapper
Quick Start
Install
Pip
pip install
To resolve packages using pip, add the following to ~/.pip/pip.conf:
[global]
index-url = https://<USERNAME>:<PASSWORD>@bergsalex.jfrog.io/artifactory/api/pypi/bergsalex-pypi/simple
If credentials are required, embed them in the URL as shown above. Then install the package:
pip install "jax-cs-storage"
Poetry
In your project's pyproject.toml file, add:
[[tool.poetry.source]]
name = "jfrogbergsalex"
url = "https://bergsalex.jfrog.io/artifactory/api/pypi/bergsalex-pypi/simple"
If the repository is password protected, configure your credentials by running:
poetry config http-basic.jfrogbergsalex <YOUR-EMAIL>
You will be prompted for your password:
Password:
Then you can run:
poetry add jax-cs-storage
Docker
Pip
Add the following to the top of your requirements.txt file:
--extra-index-url "https://bergsalex.jfrog.io/artifactory/api/pypi/bergsalex-pypi/simple"
Poetry
The [[tool.poetry.source]] configuration above should handle resolving to the right repository.
Access restricted repositories
TBD
Setup
The file wrapper uses pydantic to get configuration from a .env file by default. The steps to customize this behavior are discussed in more detail in later sections.
For now, you need at least the following in your .env file:
JAX_CS_STORAGE_GCS_BUCKET = 'YOUR_BUCKET_NAME'
Usage
Create a file from in memory content
from jax.cs.storage import StorageObjectIngest
# You could also configure the package with `jax.cs.storage.init` in a namespace of your choice
# from .files import StorageObject, StorageObjectIngest
content = 'This is some in memory content'
filename = 'SomeGeneratedName.txt'
wrapped = StorageObjectIngest(content, from_memory=True).ingest(filename=filename)
Allow a user to create a file in Google Cloud Storage
Google Cloud Storage lets you generate a signed URL that allows a user to upload to a specific location in object storage, without giving that user explicit credentials to write to the bucket.
from jax.cs.storage import StorageObject, ObjectSvcGCS
desired_gs_url = 'gs://some_gs_url'
wrapped = StorageObject(desired_gs_url, file_svc=ObjectSvcGCS)
user_upload_url = wrapped.user_upload_url(content_type='image/png')
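The idea behind a signed URL can be sketched with the standard library alone. This is a simplified illustration and not the signing scheme Google Cloud Storage actually uses: the secret key, the query parameter names, the example host, and the `sign_upload_url` and `is_valid` helpers are all hypothetical. It only shows the principle that the server signs a path plus an expiry time, so whoever holds the URL can upload without ever seeing the credentials. The 24-hour default echoes the `GCS_URL_EXPIRE_HOURS` default shown in the settings class.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"server-side-secret"  # hypothetical key; it never leaves the server


def _signature(path: str, expires: int) -> str:
    """Sign the path together with its expiry timestamp."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_upload_url(base: str, path: str, now: int, ttl_hours: int = 24) -> str:
    """Build a URL that carries its own expiry and signature."""
    expires = now + ttl_hours * 3600
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{base}{path}?{query}"


def is_valid(url: str, now: int) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    return hmac.compare_digest(signature, _signature(parts.path, expires))


url = sign_upload_url("https://storage.example.com", "/bucket/image.png", now=1_000_000)
print(is_valid(url, now=1_000_000 + 3600))       # checked one hour after signing
print(is_valid(url, now=1_000_000 + 25 * 3600))  # checked after the 24 hour window
```

Changing the path in the URL invalidates the signature, which is what restricts the upload to one specific location.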
Look up a file by name
# `ObjectSvcGCS` is where `jax.cs.storage` stores files in the default configuration;
# you should use whichever service you use with your StorageObjectIngest
from jax.cs.storage.object.services.gcs import ObjectSvcGCS
filename = 'SomeGeneratedName.txt'
all_files = ObjectSvcGCS.all()
found = filename in all_files
Cache by unique file name
from jax.cs.storage import StorageObject, StorageObjectIngest
# `ObjectSvcGCS` is where `jax.cs.storage` stores files in the default configuration;
# you should use whichever service you use with your StorageObjectIngest
from jax.cs.storage.object.services.gcs import ObjectSvcGCS
def check_if_exists():
    filename = 'SomeGeneratedName.txt'
    all_files = ObjectSvcGCS.all()
    found = filename in all_files
    if found:
        wrapped = StorageObject(filename)
    else:
        content = 'some_generated_content'
        wrapped = StorageObjectIngest(content, from_memory=True).ingest(filename=filename)
    return wrapped.user_url
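The same get-or-create pattern can be tried without any storage backend. In this minimal sketch a plain dictionary stands in for the bucket, and `get_or_create` and `expensive` are hypothetical names used only for illustration. The point is that content is generated once and every later request for the same file name is served from storage.

```python
from typing import Callable, Dict, List

store: Dict[str, str] = {}  # in-memory stand-in for a storage bucket
calls: List[str] = []       # records how often content is generated


def get_or_create(filename: str, generate: Callable[[], str]) -> str:
    """Return stored content, generating and storing it only on a miss."""
    if filename not in store:
        store[filename] = generate()
    return store[filename]


def expensive() -> str:
    """Pretend to do costly work to produce the file content."""
    calls.append("generated")
    return "some_generated_content"


first = get_or_create("SomeGeneratedName.txt", expensive)
second = get_or_create("SomeGeneratedName.txt", expensive)
print(first == second, len(calls))
```

This only works as a cache if the file name uniquely identifies the content, for example a name derived from the inputs that produced it.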
Get a wrapped version of an existing file
from jax.cs.storage import StorageObject
def get_file_for_user(known_file_location: str):
    # Get the wrapped file
    wrapped = StorageObject(known_file_location)
    # Return information about the file to a user
    return {'name': wrapped.user_name, 'location': wrapped.user_url}
Configuration
Settings
A pydantic Settings class is used to configure the library. By default, its values are populated automatically from a .env config file.
The .env file can hold both your application configuration and the library configuration side by side:
LOG_LEVEL = 'DEBUG'
JAX_CS_STORAGE_GCS_BUCKET = 'fake-bucket'
JAX_CS_STORAGE_GCS_PREFIX_DIR = None
JAX_CS_STORAGE_GCS_CHECK_BLOB_EXISTS = True
JAX_CS_STORAGE_R2_BASE_URL = 'https://r2.jax.org'
# You can commingle application and library config variables
APPLICATION_SPECIFIC_CONFIG = 'something'
APPLICATION_SPECIFIC_CONFIG_2 = 'something_else'
JAX_CS_STORAGE_IO_FILE_ROOT = '/path/to/local/storage/root'
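To make the naming convention concrete, the following minimal sketch mimics how prefixed variables are separated from application variables. It is an illustration only: the real loading is done by pydantic, and `parse_env_text` and `library_settings` are hypothetical helpers that are not part of `jax.cs.storage`.

```python
PREFIX = "JAX_CS_STORAGE_"  # the prefix used by the library variables in the .env example


def parse_env_text(text: str) -> dict:
    """Parse simple KEY = 'value' lines, skipping blank lines and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def library_settings(env: dict) -> dict:
    """Keep only the prefixed keys and drop the prefix from each name."""
    return {k[len(PREFIX):]: v for k, v in env.items() if k.startswith(PREFIX)}


example = """
LOG_LEVEL = 'DEBUG'
JAX_CS_STORAGE_GCS_BUCKET = 'fake-bucket'
# application settings can sit next to library settings
APPLICATION_SPECIFIC_CONFIG = 'something'
JAX_CS_STORAGE_IO_FILE_ROOT = '/path/to/local/storage/root'
"""

settings = library_settings(parse_env_text(example))
print(settings)
```

Variables without the prefix, such as `LOG_LEVEL` and `APPLICATION_SPECIFIC_CONFIG` here, are left for the application to read.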
If you need to set the configuration of the library manually, you can do so with the special init entrypoint method.
from jax.cs.storage import init
from jax.cs.storage.config import StorageConfig
StorageObject, StorageObjectIngest = init(
    StorageConfig.parse_obj({
        'GCS_BUCKET': '1',
        'R2_BASE_URL': 'https://r2.jax.org/test',
        'GCS_PREFIX_DIR': '3',
        'GCS_CHECK_BLOB_EXISTS': False,
        'IO_FILE_ROOT': '/path/to/local/storage/root'
    })
)
You can use standard pydantic methods to create the settings instance, for example from a dictionary:
from jax.cs.storage import init
from jax.cs.storage.config import StorageConfig
my_dict_config = {
    'GCS_BUCKET': '1',
    'R2_BASE_URL': 'https://r2.jax.org/test',
    'GCS_PREFIX_DIR': '3',
    'GCS_CHECK_BLOB_EXISTS': False,
}
StorageObject, StorageObjectIngest = init(StorageConfig.parse_obj(my_dict_config))
The underlying settings class looks like:
from typing import Optional
from pydantic import AnyHttpUrl, BaseSettings
class StorageConfig(BaseSettings):
    """The pydantic configuration class definition for the jax.cs.storage package."""
    LOG_LEVEL: str = 'DEBUG'
    GCS_BUCKET: Optional[str] = None
    GCS_PREFIX_DIR: Optional[str] = None
    GCS_URL_EXPIRE_HOURS: int = 24
    GCS_CHECK_BLOB_EXISTS: bool = True
    R2_BASE_URL: AnyHttpUrl = 'https://r2.jax.org'
    IO_FILE_ROOT: Optional[str] = '/'
    class Config:
        """The config class for pydantic object.
        Used here to configure the default means of determining settings for the package.
        """
        env_prefix = 'JAX_CS_STORAGE_'
        case_sensitive = True
        env_file = ".env"
Wrapper Config
Default Service
The default service is the concrete ObjectSvc implementation that the StorageObject will fall back to if no other service can validate the file. To set the default wrapper service, set the default_svc argument on the init call:
from jax.cs.storage import init, ObjectSvcGCS
StorageObject, StorageObjectIngest = init(default_svc=ObjectSvcGCS)
Available Services
The available services are an ordered list of concrete ObjectSvc implementations in which the order of the list is the order of precedence of the services. To set the list of available services, use the services argument on the init call:
from jax.cs.storage import init, ObjectSvcGCS, ObjectSvcR2, ObjectSvcIO
StorageObject, StorageObjectIngest = init(
    services=[ObjectSvcGCS, ObjectSvcIO, ObjectSvcR2])
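How an ordered service list and a default service interact can be sketched as follows. This is not the library's implementation: the `Service` classes, the prefix-based `validate` method, and the `resolve` function are hypothetical stand-ins, written on the assumption that each service has some way to say whether it can handle a given location.

```python
from typing import List, Type


class Service:
    """Hypothetical stand-in for an ObjectSvc implementation."""

    prefix = ""

    @classmethod
    def validate(cls, location: str) -> bool:
        """Claim a location when it starts with this service's prefix."""
        return location.startswith(cls.prefix)


class GCSLike(Service):
    prefix = "gs://"


class R2Like(Service):
    prefix = "https://r2."


class IOLike(Service):
    prefix = "/"


def resolve(location: str, services: List[Type[Service]],
            default: Type[Service]) -> Type[Service]:
    """Return the first service, in list order, that validates the location."""
    for svc in services:
        if svc.validate(location):
            return svc
    return default  # nothing validated, so fall back to the default service


services = [GCSLike, IOLike, R2Like]
print(resolve("gs://bucket/file.txt", services, default=GCSLike).__name__)
print(resolve("/tmp/file.txt", services, default=GCSLike).__name__)
print(resolve("SomeGeneratedName.txt", services, default=GCSLike).__name__)
```

Because the first match wins, list order matters whenever two services could both accept the same location.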
Ingestion Service
To set how files are ingested using the StorageObjectIngest, pass an alternate concrete implementation of the ObjectIngestSvc abstract class as the ingestion_svc argument on the init call:
from jax.cs.storage import init, ObjectIngestSvcGCS
StorageObject, StorageObjectIngest = init(ingestion_svc=ObjectIngestSvcGCS)
Configure app instance to use a specific Ingestion service
This example shows how you could dynamically configure the library to use different file ingestion services in different scenarios.
from jax.cs.storage import init
from jax.cs.storage.object.services.io import ObjectIngestSvcIO
from jax.cs.storage.object.services.gcs import ObjectIngestSvcGCS
# This could be a boolean value taken from your app config
use_gcs = False
ingestion_svc = ObjectIngestSvcGCS if use_gcs else ObjectIngestSvcIO
StorageObject, StorageObjectIngest = init(
    ingestion_svc=ingestion_svc
)
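One way to obtain such a boolean is to read it from an environment variable. The sketch below assumes a hypothetical `APP_USE_GCS` variable and uses two placeholder classes in place of the real ingestion services, so it runs without the library installed. `env_flag` and `pick_ingestion_service` are likewise illustrative names.

```python
import os
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


class IngestToDisk:
    """Hypothetical stand-in for a local-disk ingestion service."""


class IngestToCloud:
    """Hypothetical stand-in for a cloud ingestion service."""


def env_flag(name: str, environ: Optional[Mapping[str, str]] = None,
             default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def pick_ingestion_service(environ: Optional[Mapping[str, str]] = None) -> type:
    """Choose the cloud service only when the flag is explicitly enabled."""
    return IngestToCloud if env_flag("APP_USE_GCS", environ) else IngestToDisk


print(pick_ingestion_service({"APP_USE_GCS": "true"}).__name__)
print(pick_ingestion_service({}).__name__)
```

Defaulting to the local service when the flag is absent keeps development environments from writing to a bucket by accident.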
Contributing
Static Checkers
Python syntax and style: flake8
Flake8 is pre-configured with the .flake8 file from this repository. Run the following:
python -m flake8 src/jax
Docstring existence and format: pydocstyle
Because service implementations inherit their docstrings from their abstract class, we ignore D102 for the service.py files.
Pydocstyle is configured in the pyproject.toml file at the root of this repository.
Check non-service files
pydocstyle src/jax/cs/storage/
Check services
pydocstyle src/jax/cs/storage/object/services --match='service.py' --add-ignore=D102
Known security vulnerabilities: bandit
bandit -r src/jax/cs/storage
e.g.
$ bandit -r src/jax/cs/storage
[main] INFO profile include tests: None
[main] INFO profile exclude tests: None
[main] INFO cli include tests: None
[main] INFO cli exclude tests: None
[main] INFO running on Python 3.7.12
Run started:2022-01-21 14:02:01.153764
Test results:
No issues identified.
Code scanned:
Total lines of code: 1143
Total lines skipped (#nosec): 0
Run metrics:
Total issues (by severity):
Undefined: 0.0
Low: 0.0
Medium: 0.0
High: 0.0
Total issues (by confidence):
Undefined: 0.0
Low: 0.0
Medium: 0.0
High: 0.0
Files skipped (0):
Dependency license analysis: liccheck
Liccheck is configured in the [tool.liccheck] section of the pyproject.toml file.
liccheck
e.g.
$ liccheck
gathering licenses...
8 packages and dependencies.
check authorized packages...
8 packages.