Provides an abstraction layer for creating and parsing paths in a programmatic way via templates.

Andar Package

Caminante, no hay camino, se hace camino al andar. (Traveler, there is no path; the path is made by walking.)

Antonio Machado

Andar is a Python package that provides an abstraction layer for managing path structures. It helps you create and parse paths programmatically through templated file paths.

Install Package

With pip:

pip install andar

Quick start:

Simple PathModel definition using default field configurations:

from andar import PathModel

simple_path_model = PathModel(
    template="/{base_folder}/{subfolder}/{base_name}__{suffix}.{extension}"
)

Generate a path:

result_path = simple_path_model.get_path(
    base_folder="parent_folder",
    subfolder="other_folder",
    base_name="mydata",
    suffix="2000-01-01",
    extension="csv",
)
print(result_path)
# /parent_folder/other_folder/mydata__2000-01-01.csv

Parse a path:

file_path = "/data/reports/summary__2025-12-31.csv"
parsed_fields = simple_path_model.parse_path(file_path)
print(parsed_fields)
{
    'base_folder': 'data', 
    'subfolder': 'reports', 
    'base_name': 'summary', 
    'suffix': '2025-12-31', 
    'extension': 'csv',
}
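
To make the idea of a templated path more concrete, here is a minimal standard-library sketch of how a template could be turned into a parser. It is not Andar's implementation: the helper `template_to_regex` and the default field pattern `[^/_.]+` are assumptions made up for this illustration.

```python
import re


def template_to_regex(template: str, default_pattern: str = r"[^/_.]+") -> re.Pattern:
    """Turn "{field}" placeholders into named regex groups (hypothetical helper)."""
    # re.split with one capture group alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", template)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)  # literal separators such as "/", "__" and "."
        else:
            regex += f"(?P<{part}>{default_pattern})"  # one named group per field
    return re.compile(f"^{regex}$")


TEMPLATE = "/{base_folder}/{subfolder}/{base_name}__{suffix}.{extension}"
pattern = template_to_regex(TEMPLATE)

file_path = "/data/reports/summary__2025-12-31.csv"
match = pattern.match(file_path)
fields = match.groupdict() if match else None
print(fields)

# Generating a path is the reverse operation: fill the template with the fields.
rebuilt_path = TEMPLATE.format(**fields)
print(rebuilt_path)
```

The round trip (parse, then format) returning the original path is the property that a shared template is meant to guarantee.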

Examples

How to create a path generator / parser for a date tree structure

Define a PathModel that follows a date tree folder structure with a datetime suffix, using the following template and fields:

from andar import FieldConf, PathModel, SafePatterns

date_archived_pm = PathModel(
    template="/{base_path}/{subfolder}/{date_path}/{date_prefix}_{name}_{datetime_suffix}.{ext}",
    fields={
        "base_path": FieldConf(pattern=SafePatterns.DIRPATH),
        "subfolder": FieldConf(pattern=SafePatterns.NAME),
        "date_path": FieldConf(pattern=r"\d{4}/\d{2}/\d{2}", date_format="%Y/%m/%d"),
        "date_prefix": FieldConf(pattern=r"\d{4}-\d{2}-\d{2}", date_format="%Y-%m-%d"),
        "name": FieldConf(pattern=SafePatterns.FIELD),
        "datetime_suffix": FieldConf(pattern=r"\d{8}_\d{6}", datetime_format="%Y%m%d_%H%M%S"),
        "ext": FieldConf(pattern=SafePatterns.EXTENSION),
    },
)

Then, to generate the paths, iterate over the dates:

import datetime as dt

base_path = "company/reports"
subfolder = "finance"
report_name = "revenue"
extension = "xls"
start_date = dt.date(2025, 12, 1)
report_date_list = [start_date + dt.timedelta(days=d) for d in range(10)]

for report_date in report_date_list:
    creation_datetime = dt.datetime.now()
    report_path = date_archived_pm.get_path(
        base_path=base_path,
        subfolder=subfolder,
        date_path=report_date,
        date_prefix=report_date,
        name=report_name,
        datetime_suffix=creation_datetime,
        ext=extension,
    )
    print(report_path)
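
As a rough check of what the loop above should produce, the following sketch renders the date fields by hand with `strftime`. It assumes that each date or datetime field is formatted with the format string configured in its FieldConf; the fixed `creation_datetime` value is chosen here only to make the output reproducible.

```python
import datetime as dt
import re

report_date = dt.date(2025, 12, 1)
creation_datetime = dt.datetime(2025, 12, 2, 8, 30, 0)  # fixed value for illustration

# Same format strings as in the FieldConf definitions above.
date_path = report_date.strftime("%Y/%m/%d")
date_prefix = report_date.strftime("%Y-%m-%d")
datetime_suffix = creation_datetime.strftime("%Y%m%d_%H%M%S")

# Each rendered value should also match the regex pattern declared for its field.
assert re.fullmatch(r"\d{4}/\d{2}/\d{2}", date_path)
assert re.fullmatch(r"\d{4}-\d{2}-\d{2}", date_prefix)
assert re.fullmatch(r"\d{8}_\d{6}", datetime_suffix)

expected_shape = (
    f"/company/reports/finance/{date_path}/{date_prefix}_revenue_{datetime_suffix}.xls"
)
print(expected_shape)
# /company/reports/finance/2025/12/01/2025-12-01_revenue_20251202_083000.xls
```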

To parse existing paths, use a library that supports recursive search (e.g. pathlib, glob, os) to produce the full path of each file:

import pathlib

base_path = "/company/reports"
search_folder = pathlib.Path(base_path)
path_list = [str(i) for i in search_folder.rglob("*") if i.is_file()]

for file_path in path_list:
    parsed_fields = date_archived_pm.parse_path(file_path)
    print(parsed_fields)
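
The `is_file()` filter matters because `rglob("*")` also yields every intermediate folder of the date tree. The sketch below shows this on a throwaway directory; the folder and file names are made up for the example and no Andar call is involved.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    root = pathlib.Path(tmp_dir)

    # Build one report inside a small date tree (illustrative names).
    report = root / "finance" / "2025" / "12" / "01" / "2025-12-01_revenue_20251202_083000.xls"
    report.parent.mkdir(parents=True)
    report.touch()

    # rglob("*") returns the four folders as well as the file itself.
    all_entries = sorted(root.rglob("*"))
    file_paths = [p.relative_to(root).as_posix() for p in all_entries if p.is_file()]

print(len(all_entries))  # 5
print(file_paths)  # ['finance/2025/12/01/2025-12-01_revenue_20251202_083000.xls']
```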

How to define path conventions for a datalake

For example, Data Mesh proposes conventions for separating data into domains, layers and products. These could be implemented with the following PathModel template and fields:

from andar import FieldConf, PathModel, SafePatterns

data_mesh_pm = PathModel(
    template="/{domain}/{layer}/{product}/{aggregation}/{date}_{product}.{ext}",
    fields={
        "domain": FieldConf(pattern=SafePatterns.NAME),  # sales, marketing, HR, finance, etc
        "layer": FieldConf(pattern=SafePatterns.NAME),  # raw, intermediate, mart, etc
        "product": FieldConf(pattern=SafePatterns.NAME),  # orders, revenues, taxes, campaigns, etc
        "aggregation": FieldConf(pattern=SafePatterns.NAME),  # daily, weekly, monthly, etc
        "date": FieldConf(pattern=r"\d{8}", datetime_format="%Y%m%d"),  # product date
        "ext": FieldConf(pattern=SafePatterns.EXTENSION),  # csv, xls, parquet, etc
    },
)

To improve traceability, it is good practice to also include the run datetime (i.e. the generation datetime) as a simple versioning scheme:

from andar import FieldConf, PathModel, SafePatterns

data_mesh_pm = PathModel(
    template="/{domain}/{layer}/{product}/{aggregation}/{product_date}_{product}_{run_datetime}.{ext}",
    fields={
        "domain": FieldConf(pattern=SafePatterns.NAME),  # sales, marketing, HR, finance, etc
        "layer": FieldConf(pattern=SafePatterns.NAME),  # raw, intermediate, mart, etc
        "product": FieldConf(pattern=SafePatterns.NAME),  # orders, revenues, taxes, campaigns, etc
        "aggregation": FieldConf(pattern=SafePatterns.NAME),  # daily, weekly, monthly, etc
        "product_date": FieldConf(pattern=r"\d{8}", datetime_format="%Y%m%d"),  # product target date
        "run_datetime": FieldConf(pattern=r"\d{8}_\d{6}", datetime_format="%Y%m%d_%H%M%S"),  # generation datetime
        "ext": FieldConf(pattern=SafePatterns.EXTENSION),  # csv, xls, parquet, etc
    },
)
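
One benefit of this convention is that the most recent run for each product date can be selected by comparing the run datetime strings, since zero-padded `%Y%m%d_%H%M%S` values sort chronologically. The sketch below uses only the standard library; `FILENAME_RE` and the sample paths are assumptions written for this example, not part of Andar.

```python
import re

# Hypothetical regex mirroring the file name part of the template above.
FILENAME_RE = re.compile(
    r"(?P<product_date>\d{8})_(?P<product>[a-z]+)_(?P<run_datetime>\d{8}_\d{6})\.(?P<ext>\w+)$"
)

paths = [
    "/sales/mart/orders/daily/20251201_orders_20251202_060000.parquet",
    "/sales/mart/orders/daily/20251201_orders_20251203_060000.parquet",
    "/sales/mart/orders/daily/20251202_orders_20251203_060000.parquet",
]

latest_by_date = {}  # product_date -> (run_datetime, path)
for path in paths:
    match = FILENAME_RE.search(path)
    if match is None:
        continue  # skip files that do not follow the convention
    product_date = match["product_date"]
    run_datetime = match["run_datetime"]
    current = latest_by_date.get(product_date)
    # String comparison is enough because the timestamp is fixed width.
    if current is None or run_datetime > current[0]:
        latest_by_date[product_date] = (run_datetime, path)

for product_date, (run_datetime, path) in sorted(latest_by_date.items()):
    print(product_date, run_datetime, path)
```

In this sample, the file generated on 2025-12-03 supersedes the one generated on 2025-12-02 for product date 20251201.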

Documentation

See the official documentation to learn more.
