from hansel import Crumb to find your file path.

hansel
======

Flexible parametric file paths for making queries, building folder trees and
accessing folder structures in a smart way.

|PyPI| |Build Status| |Coverage Status| |PyPI Downloads| |Code Health| |Scrutinizer|

Usage
=====

Quick Intro
-----------

Imagine this folder tree:

::

    data
    └── raw
        ├── 0040000
        │   └── session_1
        │       ├── anat_1
        │       └── rest_1
        ├── 0040001
        │   └── session_1
        │       ├── anat_1
        │       └── rest_1
        ├── 0040002
        │   └── session_1
        │       ├── anat_1
        │       └── rest_1
        ├── 0040003
        │   └── session_1
        │       ├── anat_1
        │       └── rest_1
        ├── 0040004
        │   └── session_1
        │       ├── anat_1
        │       └── rest_1


.. code:: python

    from hansel import Crumb

    # create the crumb
    crumb = Crumb("{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}/{image}")

    # set the base_dir path
    crumb = crumb.replace(base_dir='/home/hansel')

    assert str(crumb) == "/home/hansel/data/raw/{subject_id}/{session_id}/{image_type}/{image}"

    # get the ids of the subjects
    subj_ids = crumb['subject_id']

    # the listing continues with any further subject folders found on disk
    assert subj_ids[:5] == ['0040000', '0040001', '0040002', '0040003', '0040004']

    # get the paths to the subject folders; the output can be strings or crumbs,
    # chosen with the make_crumbs boolean argument
    subj_paths = crumb.ls('subject_id', make_crumbs=True)

    # set the image_type
    anat_crumb = crumb.replace(image_type='anat_1')

    # get the paths to the images in the anat_1 folders
    anat_paths = anat_crumb.ls('image')
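To try the snippets above without a real dataset, the example tree can be recreated in a temporary directory. This is only a sketch using the standard library; ``build_example_tree`` is a hypothetical helper and not part of ``hansel``.

```python
import tempfile
from pathlib import Path


def build_example_tree(base_dir, n_subjects=5):
    """Create data/raw/<subject>/session_1/{anat_1,rest_1} under base_dir."""
    raw_dir = Path(base_dir) / "data" / "raw"
    for i in range(n_subjects):
        subject_id = "{:07d}".format(40000 + i)  # 0040000, 0040001, ...
        for image_type in ("anat_1", "rest_1"):
            (raw_dir / subject_id / "session_1" / image_type).mkdir(parents=True)
    return raw_dir


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = build_example_tree(tmp)
    subject_dirs = sorted(p.name for p in raw_dir.iterdir())

print(subject_dirs)
```

Passing the temporary directory as ``base_dir`` to a crumb should then give the queries something to find.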


Long Intro
----------

I often find myself working with structured folder paths, such as the
one shown above.

I have tried many ways of solving these situations: loops, dictionaries,
configuration files, etc. I always end up doing a different thing for the same
problem over and over again.

This week I grew tired of it and decided to represent a structured folder
tree as a string and access it in the easiest possible way.

If you look at the folder structure above I have:

- the root directory from where it is hanging: ``...data/raw``,
- many identifiers (in this case a subject identification), e.g.,
  ``0040000``,
- session identification, ``session_1`` and
- a data type (in this case an image type), ``anat_1`` and ``rest_1``.

With ``hansel`` I can represent this folder structure like this:

.. code:: python

    from hansel import Crumb

    crumb = Crumb("{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}")
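The crumb path uses the same ``{name}`` placeholder syntax as Python format strings, so the argument names can be read with the standard library. The sketch below only illustrates the idea; ``crumb_arguments`` is a hypothetical helper, and this is not necessarily how ``hansel`` parses its paths.

```python
from string import Formatter


def crumb_arguments(template):
    """Return the names of the {placeholders} in template, in order."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for a trailing piece of literal text.
    return [name for _, name, _, _ in Formatter().parse(template) if name]


template = "{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}"
arg_names = crumb_arguments(template)
print(arg_names)
```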


Let's say we have the structure above hanging from a base directory like ``/home/hansel/``.

I can use the ``replace`` function to set the ``base_dir``
parameter:

.. code:: python

    crumb = crumb.replace(base_dir='/home/hansel')

    assert str(crumb) == "/home/hansel/data/raw/{subject_id}/{session_id}/{image_type}"
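Conceptually, replacing an argument fills in one placeholder and leaves the others open for later queries. A minimal sketch on plain strings, where ``replace_arguments`` is a hypothetical stand-in and not the ``Crumb.replace`` implementation:

```python
def replace_arguments(template, **values):
    """Fill in the given placeholders and leave all the others untouched."""
    for name, value in values.items():
        placeholder = "{" + name + "}"
        if placeholder not in template:
            raise KeyError("No argument named {!r} in {!r}".format(name, template))
        template = template.replace(placeholder, value)
    return template


template = "{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}"
partial = replace_arguments(template, base_dir="/home/hansel")
print(partial)
```

Note that ``str.format`` would not work here, because it raises ``KeyError`` for every placeholder that is not given a value.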

If you don't need a copy of ``crumb``, you can use the ``[]`` operator:

.. code:: python

    crumb['base_dir'] = '/home/hansel'


Now that the root path of my dataset is set, I can start querying my
crumb path.

If I want to know the path to the existing ``subject_ids`` folders:

.. code:: python

    subject_paths = crumb.ls('subject_id')

The output of ``ls`` can be ``str`` or ``Crumb`` objects.
You can choose this using the ``make_crumbs`` argument.
Earlier releases returned ``pathlib.Path`` objects when no crumb arguments
were left in the crumb path; according to the changelog, that conversion was
removed in version 0.3.0.

.. code:: python

    subject_paths = crumb.ls('subject_id', make_crumbs=True)

If I want to know which ``subject_ids`` exist:

.. code:: python

    subject_ids = crumb.ls('subject_id', fullpath=False)

or

.. code:: python

    subject_ids = crumb['subject_id']
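One way to picture such a query is as a glob over the folder tree: every placeholder up to the requested one becomes a wildcard, and the names found at that level are the values. The sketch below assumes that each argument fills a whole path component; ``list_argument`` is a hypothetical helper and not how ``hansel`` is implemented.

```python
import re
import tempfile
from pathlib import Path


def list_argument(base_dir, template, arg_name):
    """Return the sorted, unique names found at the position of {arg_name}."""
    parts = template.split("/")
    index = parts.index("{" + arg_name + "}")
    # Every placeholder up to and including the target becomes a wildcard.
    pattern = "/".join(re.sub(r"\{\w+\}", "*", part) for part in parts[:index + 1])
    return sorted({path.name for path in Path(base_dir).glob(pattern)})


template = "data/raw/{subject_id}/{session_id}/{image_type}"

with tempfile.TemporaryDirectory() as tmp:
    for subject_id in ("0040000", "0040001", "0040002"):
        Path(tmp, "data", "raw", subject_id, "session_1", "anat_1").mkdir(parents=True)

    subject_ids = list_argument(tmp, template, "subject_id")
    session_ids = list_argument(tmp, template, "session_id")

print(subject_ids)
print(session_ids)
```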

Now, if I wanted to get the path to all the ``anat_1`` images, I could
do this:

.. code:: python

    anat_crumb = crumb.replace(image_type='anat_1')

    anat_paths = anat_crumb.ls('image')

or

.. code:: python

    crumb['image_type'] = 'anat_1'

    anat_paths = crumb.ls('image')
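Fixing one argument and listing another can be thought of as building a glob pattern in which the fixed arguments are filled in and the open ones become wildcards. A rough standard-library sketch under that assumption; ``to_glob`` and the file name ``image.nii.gz`` are hypothetical and only serve the illustration.

```python
import re
import tempfile
from pathlib import Path


def to_glob(template, **fixed):
    """Fill in the fixed arguments and turn every remaining one into '*'."""
    for name, value in fixed.items():
        template = template.replace("{" + name + "}", value)
    return re.sub(r"\{\w+\}", "*", template)


template = "data/raw/{subject_id}/{session_id}/{image_type}/{image}"
pattern = to_glob(template, image_type="anat_1")

with tempfile.TemporaryDirectory() as tmp:
    for subject_id in ("0040000", "0040001"):
        for image_type in ("anat_1", "rest_1"):
            folder = Path(tmp, "data", "raw", subject_id, "session_1", image_type)
            folder.mkdir(parents=True)
            (folder / "image.nii.gz").touch()  # hypothetical file name

    # Only the files below the anat_1 folders match the pattern.
    anat_paths = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).glob(pattern))

print(pattern)
print(anat_paths)
```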


More features
-------------

There are more possibilities such as:

- creating folder trees with a map of values for the crumbs:

.. code:: python

    from hansel import Crumb, mktree, ParameterGrid

    crumb = Crumb("/home/hansel/raw/{subject_id}/{session_id}/{modality}/{image}")

    values_map = {'session_id': ['session_' + str(i) for i in range(2)],
                  'subject_id': ['subj_' + str(i) for i in range(3)]}

    mktree(crumb, list(ParameterGrid(values_map)))


- check the feasibility of a crumb path:

.. code:: python

    crumb = Crumb("/home/hansel/raw/{subject_id}/{session_id}/{modality}/{image}")

    # ask if there is any subject with the image 'lollipop.png'.
    crumb['image'] = 'lollipop.png'
    assert crumb.exists()


- check which subjects have 'jujube.png' and 'toffee.png' files:

.. code:: python

    crumb = Crumb("/home/hansel/raw/{subject_id}/{session_id}/{modality}/{image}")

    toffee_crumb = crumb.replace(image='toffee.png')
    jujube_crumb = crumb.replace(image='jujube.png')

    # using sets functionality
    set(toffee_crumb['subject_id']).intersection(set(jujube_crumb['subject_id']))


- unfold the whole crumb path to get the whole filetree in a list of paths:

.. code:: python

    crumb = Crumb("/home/hansel/raw/{subject_id}/{session_id}/{modality}/{image}")
    crumbs = crumb.unfold()

    # and you can ask for the value of the crumb argument in each element
    crumbs[0]['subject_id']
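To make the folder-tree and unfolding items above more concrete, here is a standard-library sketch of the underlying idea. It assumes that the parameter grid expands the values map into every combination of values (as the class of the same name in scikit-learn does); ``expand_grid`` is a hypothetical helper, not the ``hansel`` API.

```python
import tempfile
from itertools import product
from pathlib import Path


def expand_grid(values_map):
    """Return one dict per combination of the values in values_map."""
    names = sorted(values_map)
    combos = product(*(values_map[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


values_map = {'session_id': ['session_' + str(i) for i in range(2)],
              'subject_id': ['subj_' + str(i) for i in range(3)]}
grid = expand_grid(values_map)

with tempfile.TemporaryDirectory() as tmp:
    # Create one folder per combination, as a tree-building helper would.
    for values in grid:
        Path(tmp, "raw", values['subject_id'], values['session_id']).mkdir(parents=True)

    # "Unfold" the tree: list every subject/session folder that now exists.
    unfolded = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).glob("raw/*/*"))

print(len(grid))
print(unfolded)
```

Two sessions times three subjects give six combinations, so six folders are created and listed.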


More functionalities, ideas and comments are welcome.


Dependencies
============

Please see the requirements.txt file. Before installing this package,
install its dependencies with::

    pip install -r requirements.txt


Install
=======

I am only testing this tool on Python 3.4 and 3.5.
It may also work on Python 2.7 with ``six`` and ``pathlib2`` installed.

This package uses setuptools. You can install it by running::

    python setup.py install

If you already have the dependencies listed in requirements.txt
installed, to install in your home directory, use::

    python setup.py install --user

To install for all users on Unix/Linux::

    python setup.py build
    sudo python setup.py install

You can also install it in development mode with::

    python setup.py develop


Development
===========

Code
----

Github
~~~~~~

You can check the latest sources with the command::

    git clone https://www.github.com/alexsavio/hansel.git

or if you have write privileges::

    git clone git@github.com:alexsavio/hansel.git

If you are going to create patches for this project, create a branch
for it from the master branch.

We tag stable releases in the repository with the version number.

Testing
-------

We use `py.test <http://pytest.org/>`__ for testing.

You can run the tests by executing::

    python setup.py test

or::

    py.test


.. |PyPI| image:: https://img.shields.io/pypi/v/hansel.svg
   :target: https://pypi.python.org/pypi/hansel

.. |Build Status| image:: https://travis-ci.org/alexsavio/hansel.svg?branch=master
   :target: https://travis-ci.org/alexsavio/hansel

.. |Coverage Status| image:: https://coveralls.io/repos/alexsavio/hansel/badge.svg?branch=master&service=github
   :target: https://coveralls.io/github/alexsavio/hansel?branch=master

.. |PyPI Downloads| image:: https://img.shields.io/pypi/dm/hansel.svg
   :target: https://pypi.python.org/pypi/hansel

.. |Code Health| image:: https://landscape.io/github/alexsavio/hansel/master/landscape.svg?style=flat
   :target: https://landscape.io/github/alexsavio/hansel/master
   :alt: Code Health

.. |Scrutinizer| image:: https://img.shields.io/scrutinizer/g/alexsavio/hansel.svg
   :target: https://scrutinizer-ci.com/g/alexsavio/hansel/?branch=master
   :alt: Scrutinizer Code Quality

=========
Changelog
=========


Version 0.4.0
==============

- Fill CHANGES.rst
- All outputs from `Crumb.ls` function will be sorted.
- Add regular expressions or `fnmatch` option for crumb arguments.
- Change `exists` behaviour. Now the empty crumb arguments will return False when `exists()` is called.
- Code clean up.
- Fix bugs


Version 0.3.1
==============

- Fix README
- Code clean up.


Version 0.3.0
==============

- Add `_argval` member, a dict which stores crumb arguments replacements.
- Add tests.
- Remove `rm_dups` option in `Crumb.ls` function.
- Remove conversion to `Paths` when `Crumb` has no crumb arguments in `Crumb.ls`.


Version 0.2.0
==============

- Add `ignore_list` parameter in `Crumb` constructor.


Version 0.1.1
==============

- Add `Crumb.unfold` function.
- Move `mktree` out of `Crumb` class.


Version 0.1.0
==============

- Simplify code.
- Increase test coverage.
- Add `exist_check` to `Crumb.ls` function.
- Fix bugs.
