
Define named directory structure using placeholders


pathtrees


Define your path structure at the top, then just fill in the variables later.

Install

pip install pathtrees

Usage

import pathtrees as pt



# define your file structure.

# a simple ML experiment structure
paths = pt.Paths.define('./logs', {
    '{log_id}': {
        'model.h5': 'model',
        'model_spec.pkl': 'model_spec',
        'plots': {
            'epoch_{step:04d}': {
                '{plot_name}.png': 'plot',
                '': 'plot_dir',
            }
        },
        # a path join hack that gives you: log_dir > ./logs/{log_id}
        '': 'log_dir',
    }
})
paths.update(log_id='test1', step=-1)



# for example, a keras callback that saves a matplotlib plot every epoch
import matplotlib.pyplot as plt
from keras.callbacks import Callback

class MyCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        # creates a copy of the path tree that has step=epoch
        epoch_paths = paths.specify(step=epoch)
        ...
        # save one plot
        plt.savefig(epoch_paths.plot.specify(plot_name='confusion_matrix'))
        ...
        # save another plot
        plt.savefig(epoch_paths.plot.specify(plot_name='auc'))

# you can glob over any field that has no value (e.g. an unset step => '*')
# with step unset, equivalent to: glob("./logs/test1/plots/epoch_*/auc.png")
for path in paths.plot.specify(plot_name='auc').glob():
    print(path)
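
To make the nested definition above easier to picture, here is a minimal sketch of how such a tree could be flattened into a mapping from each name to its full path template. It uses only the standard library and is not pathtrees' own implementation; `flatten_tree` and `templates` are hypothetical names chosen for illustration.

```python
import posixpath


def flatten_tree(root, tree):
    """Map each name in a nested {segment: name-or-subtree} dict to its path template."""
    named = {}
    for segment, value in tree.items():
        # An empty segment names the enclosing directory itself.
        path = posixpath.join(root, segment) if segment else root
        if isinstance(value, dict):
            # A dict value is a subdirectory: recurse with the longer prefix.
            named.update(flatten_tree(path, value))
        else:
            # A string value is the name given to this path.
            named[value] = path
    return named


templates = flatten_tree('./logs', {
    '{log_id}': {
        'model.h5': 'model',
        'plots': {
            'epoch_{step:04d}': {
                '{plot_name}.png': 'plot',
                '': 'plot_dir',
            },
        },
        '': 'log_dir',
    },
})

print(templates['log_dir'])
# ./logs/{log_id}
print(templates['plot'].format(log_id='test1', step=3, plot_name='auc'))
# ./logs/test1/plots/epoch_0003/auc.png
```

The empty-string key is what lets a directory carry a name of its own alongside the files inside it.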

Path Formatting

path = pt.Path('data/{sensor_id}/raw/{date}/temperature_{file_id:04d}.csv')
path.update(sensor_id='aaa')

try:
    path.format()
except KeyError: 
    print("oops gotta provide more data!")

assert path.partial_format() == 'data/aaa/raw/{date}/temperature_{file_id:04d}.csv'
assert path.glob_format() == 'data/aaa/raw/*/temperature_*.csv'

try:
    path.format(date='111')
except KeyError: 
    print("oops gotta provide more data!")

assert path.partial_format(date='111') == 'data/aaa/raw/111/temperature_{file_id:04d}.csv'
assert path.glob_format(date='111') == 'data/aaa/raw/111/temperature_*.csv'

# fully specified path - all data provided
assert path.format(date='111', file_id=2) == 'data/aaa/raw/111/temperature_0002.csv'
assert path.partial_format(date='111', file_id=2) == 'data/aaa/raw/111/temperature_0002.csv'
assert path.glob_format(date='111', file_id=2) == 'data/aaa/raw/111/temperature_0002.csv'

# passing arguments to format() doesn't update the original object.

# you can either create a copy of the path and update its data
path2 = path.specify(date='111')
# or you can update the data in place using update()
path2.update(date='222', file_id=2)

# and now you don't need to pass that info to format()

import os

assert os.fspath(path2) == path2.format()
assert str(path2) == path2.partial_format()
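
The three formatting modes shown above can be approximated with the standard library alone. The sketch below is an illustration under stated assumptions, not how pathtrees implements them: it assumes plain field names and no escaped braces in the template, and `partial_format`, `glob_format` and `TemplatePath` are hypothetical stand-ins defined here.

```python
import os
import string


def _render(template, data, placeholder):
    """Format the fields found in data; pass the others through placeholder()."""
    parts = []
    for literal, name, spec, conv in string.Formatter().parse(template):
        parts.append(literal)
        if name is None:  # trailing literal text with no field after it
            continue
        # Rebuild the original "{name!conv:spec}" field text.
        field = '{' + name + ('!' + conv if conv else '') + (':' + spec if spec else '') + '}'
        parts.append(field.format(**data) if name in data else placeholder(field))
    return ''.join(parts)


def partial_format(template, **data):
    # Unknown fields stay in the result as placeholders.
    return _render(template, data, lambda field: field)


def glob_format(template, **data):
    # Unknown fields become a glob wildcard.
    return _render(template, data, lambda field: '*')


class TemplatePath(os.PathLike):
    """Hypothetical path object: strict under os.fspath(), lenient under str()."""

    def __init__(self, template, **data):
        self.template = template
        self.data = data

    def __fspath__(self):
        # Raises KeyError when a field has no value yet.
        return self.template.format(**self.data)

    def __str__(self):
        return partial_format(self.template, **self.data)


template = 'data/{sensor_id}/raw/{date}/temperature_{file_id:04d}.csv'

print(partial_format(template, sensor_id='aaa'))
# data/aaa/raw/{date}/temperature_{file_id:04d}.csv
print(glob_format(template, sensor_id='aaa'))
# data/aaa/raw/*/temperature_*.csv
print(os.fspath(TemplatePath(template, sensor_id='aaa', date='111', file_id=2)))
# data/aaa/raw/111/temperature_0002.csv
```

Keeping `__fspath__` strict means a half-specified path fails loudly when handed to `open()` or similar, while `str()` stays safe for logging.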

TODO:

  • docstrings and examples !!!
  • decide what I want to do about format_path, partial_format_path, etc. (too verbose)
  • publish RTD

