Modelling static folder structures for Python applications
Static Folders
The premise for this library is simple: you want a statically typed way to refer to a catalogue of data.
Why would you want this?
Let's start with some alternatives I've seen for managing this problem:
from pathlib import Path
data_dir = Path("d:/projects/<project>/data")  # person A
# data_dir = Path("/mnt/data/projects/<project>/data")  # person B
config = data_dir / "config.json"
regression_inputs_csv = data_dir / "regression_model" / "input.csv"
There are a few pain points here:
- Code you need to swap out depending on which machine things are run on, which often ends up spread across multiple notebooks (this is not something this library solves, but having a reusable folder structure encourages one to think about a more centralised mechanism for managing these kinds of things, rather than commented-out code in scripts)
- You can end up with a lot of variables quickly if you have lots of files
- If you want to compose pieces with constants you end up with even more variables
- No autocompletion from an editor
- If you ever need to move a file, you're reliant on find+replace / regex to make sure you update all the usages (which becomes more error-prone as you try to reuse path pieces, as there are more text replacement variants)
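As a rough illustration of the "more centralised mechanism" mentioned in the first point, one option is to read the machine-specific root from an environment variable. This is only a sketch and not something this library provides: the variable name `PROJECT_DATA_DIR` and the helper `get_data_dir` are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical variable name; each machine sets it once, outside the code.
DATA_DIR_ENV_VAR = "PROJECT_DATA_DIR"


def get_data_dir(default: Optional[Path] = None) -> Path:
    """Return the data root from the environment, or fall back to `default`."""
    raw = os.environ.get(DATA_DIR_ENV_VAR)
    if raw:
        return Path(raw)
    if default is not None:
        return default
    raise RuntimeError(f"Set {DATA_DIR_ENV_VAR} to the project data directory")


# Notebooks and scripts share one lookup instead of commented-out paths.
config = get_data_dir(default=Path("data")) / "config.json"
```

Person A and person B then run identical code and differ only in how their environment is configured.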
A solution to this is some kind of wrapper class:
from pathlib import Path
from attrs import define  # assuming attrs, which provides @define

@define
class ApplicationPaths:
    root: Path

    def get_config(self) -> Path:
        ...
This works fine, but you get no structure around the internals, only an external API which is statically typed. The same issues persist around changing your data representation, and if you want to specify detail within a nested subpath, you end up with a lot of methods, which can become unwieldy.
Static Folders provides a type-checked interface to represent a folder tree:
from pathlib import Path
from static_folders import Folder

class RegressionModelData(Folder):
    input_csv: Path = Path("input.csv")

class ApplicationData(Folder):
    regression_model_data: RegressionModelData = RegressionModelData("regression_model_data")
    config: Path = Path("config.json")

app_root = ApplicationData("d:/projects/<project>/data")
print(app_root.regression_model_data.input_csv.read_text())
We also provide some convenience inference based on type annotations, to reduce the amount of boilerplate being written - we could equivalently write the above class as:
class ApplicationData(Folder):
    regression_model_data: RegressionModelData
    config: Path = Path("config.json")
for the same result.
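To give a feel for how this kind of annotation-based inference could work, here is a toy sketch. It is not the static_folders implementation: `SketchFolder` is a hypothetical stand-in, and the rule that a missing default falls back to the attribute name is an assumption based on the example above.

```python
from pathlib import Path
from typing import get_type_hints


class SketchFolder:
    """Toy stand-in for a Folder base class (hypothetical, for illustration)."""

    def __init__(self, location):
        self.location = Path(location)
        # Walk the class annotations and resolve each entry against this folder.
        for name, annotation in get_type_hints(type(self)).items():
            default = getattr(type(self), name, None)
            if isinstance(annotation, type) and issubclass(annotation, SketchFolder):
                # Assumption: with no default, the sub-folder is named after the attribute.
                relative = default.location if default is not None else Path(name)
                setattr(self, name, annotation(self.location / relative))
            elif annotation is Path:
                relative = default if default is not None else Path(name)
                setattr(self, name, self.location / relative)


class RegressionModelData(SketchFolder):
    input_csv: Path = Path("input.csv")


class ApplicationData(SketchFolder):
    regression_model_data: RegressionModelData  # name inferred from the attribute
    config: Path = Path("config.json")


app_root = ApplicationData("data")
print(app_root.regression_model_data.input_csv)
```

The point of the sketch is only that the annotation carries enough information to build the nested folder, so the explicit default becomes optional when the folder name matches the attribute name.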
Why are you hard coding file paths, shouldn't your code be more modular?
Sometimes it's quite useful to be able to explicitly refer to a catalog of data inputs e.g.
- Data science / analytics or processing pipelines
- GIS data transformations
- For complicated applications with quite specific input requirements e.g. transport models
That doesn't mean you should write your code coupled to Folder instances; you can (and probably should) still separate business logic from data representation. There's a similar concept well explained in the cattrs documentation: the serialisation of your data model should be a decoupled concern from the data model itself. In the same way, static folders deals with the data representation of your inputs, which should be separate from how your code actually processes data.
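A minimal sketch of that separation might look like the following. The function names and the column name `"x"` are hypothetical; the idea is simply that the processing code accepts plain rows or a plain `Path`, and only the entry point knows about the folder catalogue.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable, List


def mean_of_column(rows: Iterable[Dict[str, str]], column: str) -> float:
    """Business logic: works on parsed rows and knows nothing about folder layout."""
    values = [float(row[column]) for row in rows]
    return sum(values) / len(values)


def load_rows(csv_path: Path) -> List[Dict[str, str]]:
    """I/O boundary: the only function here that touches the file system."""
    with csv_path.open(newline="") as f:
        return list(csv.DictReader(f))


# Only the entry point would refer to the folder structure, for example:
# rows = load_rows(app_root.regression_model_data.input_csv)
# result = mean_of_column(rows, "x")
```

Because `mean_of_column` never sees a folder object, it can be tested with in-memory rows and reused if the data later moves somewhere else.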
But what if I have data on the cloud / stored in some way that doesn't fit this model?
- If your data is sufficiently small, you might be able to mirror the data (this can be quite useful working in a transport modelling ecosystem where inspecting and checking input and output files independently of code can be very valuable, and having an explicit mirror makes things easy to find)
- If not, or you're intrinsically coupled to cloud storage, then static folders probably isn't a good fit.