Libraries that make it easier to manipulate files in a directory tree

treecrawl

Usage

This project makes it a little easier to edit directory trees and to test those edits.

This example uses the Transformer class to rewrite the contents of every targeted file in a directory tree as upper case text. Always override is_target() and transform(). You should almost always write and use your own alternative to Transformer.write_string_to_output(): treating everything as a string causes problems when editing and testing files that contain any Unicode. It is really only meant for a simple example.

import os

# Assumption: this import path is a guess; import Transformer from
# wherever it lives in your installed version of treecrawl.
from treecrawl.transformer import Transformer
from treecrawl.utility import file_to_string


class MakeUpper(Transformer):
    """Convert the contents of targeted text files to upper case"""

    def __init__(self, input, output, dry_run=False):
        super().__init__(input=input, output=output, dry_run=dry_run)

    def is_target(self, i_file):
        """
        I generally use opt-in targeting to avoid corrupting files I don't
        want to target when I override Transformer.is_target().  I use
        extensions where that's adequate, but if I need something more
        robust, I might use python-magic.
        """
        included_extensions = [".txt"]

        # skip anything that is not a regular file
        if not os.path.isfile(i_file):
            return False

        # regardless of extension, exclude files inside a .git directory
        if ".git" in i_file.split(os.path.sep):
            return False

        # now target only files ending in ".txt"
        _, ext = os.path.splitext(i_file)
        return ext in included_extensions

    def transform(self, source_file, destination_file):
        contents = file_to_string(source_file)
        contents = contents.upper()
        self.write_string_to_output(contents, destination_file)

** CAUTION!! ** treecrawl doesn’t protect you from mistreating your files, for example by corrupting a binary file because you transformed it like a text file. In fact, utility.file_to_string() decodes file contents as UTF-8 and ignores errors, so it will help you wreck your files.
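One way to reduce the risk described above is to check that a file really looks like text before targeting it, and to read and write it with an explicit encoding rather than ignoring errors. The following is a minimal sketch, not part of treecrawl: looks_like_text() and upper_case_file() are hypothetical helper names, and the check (no NUL bytes, strict UTF-8 decode of a leading sample) is only a heuristic.

```python
import codecs


def looks_like_text(path, sample_size=8192):
    """Heuristic text check (hypothetical helper, not a treecrawl API).

    Returns True when the leading sample of the file has no NUL bytes
    and decodes as strict UTF-8.
    """
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character that was
    # cut in half at the end of the sample, but still rejects bad bytes.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def upper_case_file(source_file, destination_file):
    """Read strict UTF-8, upper-case the text, and write UTF-8 back.

    newline="" keeps the original line endings untouched.
    """
    with open(source_file, "r", encoding="utf-8", newline="") as src:
        contents = src.read()
    with open(destination_file, "w", encoding="utf-8", newline="") as dst:
        dst.write(contents.upper())
```

A check like looks_like_text() could be called from an is_target() override after the extension test, and upper_case_file() could stand in for the file_to_string() / write_string_to_output() pair inside transform(). Because the decode is strict, a file that is not valid UTF-8 raises an error instead of being silently rewritten.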

This project also helps me test transformations using golden files. The following example shows how to enable pytest --update_golden to update the golden files automatically.

First I need to set up conftest.py for the pytest flag:

import pytest
from treecrawl.utility import locate_subdir


def pytest_addoption(parser):
    parser.addoption(
        "--update_golden",
        action="store_true",
        help="Update golden files before running tests",
    )


@pytest.fixture
def update_golden(request):
    return request.config.getoption("--update_golden")


@pytest.fixture(scope="session", autouse=True)
def testdata():
    return locate_subdir("testdata")

Next I create a parameterized test case for MakeUpper. I have to create the input test data manually. Refer to tests/testdata/test_make_upper for an example.

import pytest

# CaseHelper comes from treecrawl and MakeUpper is the class defined above;
# import both from wherever they live in your project.


@pytest.mark.parametrize(
    "test_case",
    ["pets", "cities"],
)
def test_make_upper(test_case, tmp_path, request, testdata, update_golden):
    c = CaseHelper(
        testdata,
        request.node.originalname,
        test_case,
        str(tmp_path),
        update_golden=update_golden,
    )

    # When update_golden is set by running pytest --update_golden,
    # the project golden files are deleted. This step generates new ones
    # from the function under test.
    if update_golden:
        MakeUpper(c.input, c.golden).run()

    m = MakeUpper(c.input, c.actual)
    m.run()
    for succeeded, compared in c.compare():
        # include the compared paths in the failure message
        assert succeeded, "input: {}\nactual: {}\nexpected: {}".format(*compared)

It may also be important to override CaseHelper.compare().
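The signature and return shape of CaseHelper.compare() are not documented here, so the following is a standalone sketch of the kind of comparison an override might perform, not an actual override. compare_trees() is a hypothetical function: it walks a golden directory, compares each file byte for byte with its counterpart in the actual directory, and yields a (succeeded, paths) pair per file. Comparing bytes instead of strings avoids the encoding problems mentioned earlier.

```python
import os


def compare_trees(actual_dir, golden_dir):
    """Yield (succeeded, (actual_path, golden_path)) for each golden file.

    Hypothetical helper, not a treecrawl API. A file that is missing
    from actual_dir counts as a failed comparison.
    """
    for root, dirs, files in os.walk(golden_dir):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            golden_path = os.path.join(root, name)
            relative = os.path.relpath(golden_path, golden_dir)
            actual_path = os.path.join(actual_dir, relative)
            if not os.path.isfile(actual_path):
                yield False, (actual_path, golden_path)
                continue
            with open(actual_path, "rb") as a, open(golden_path, "rb") as g:
                yield a.read() == g.read(), (actual_path, golden_path)
```

In a test, the results can be consumed the same way as in the loop above, for example `for succeeded, paths in compare_trees(actual, golden): assert succeeded, paths`. Note that this sketch only reports files present in the golden tree; extra files in the actual tree are not flagged.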

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Build Notes

Set up dev venv

python -m venv .treecrawl.venv
source .treecrawl.venv/bin/activate
pip install -r requirements-dev.txt

Tests

I use pyenv to provide multiple Python versions for nox testing. In my case:

pyenv install 3.6.8
pyenv install 3.7.8
# in the project directory
pyenv local 3.6.8 3.7.8
make test

If other versions are flagged as missing or are skipped, you can just pyenv install them and add them to the project directory with pyenv local.

Run ‘make test’ to run all the tests; nox then runs the full matrix of tests across the installed Python versions.

Always run ‘make lint’.

History

0.1.3 (2020-07-17)

  • First release on PyPI.

0.1.4 (2020-07-17)

  • Reorganized modules and updated documentation
