Skip to main content

Python library for storing and working with monthly-period data.

Project description

monthpack

monthpack is a Python library for organizing period-based data sources, such as bank statements, income statements, and similar records.

Starting with 0.2.0, monthpack uses a single implicit storage configuration per source.config.json.

Current Layout

monthpack/
  data/
  src/
    monthpack/
  pyproject.toml
  README.md

Quick Example

from monthpack import Source

source = Source.from_path("data/source/source.config.json")
metadata = source.resolve_metastate(202401)

print(metadata.period)
print(metadata.year)
print(metadata.month)
print(metadata.inpath)
print(metadata["reader"])

data = source.read((202401, 202406), skip_error=True)

Processor Registration

Source now accepts singular processors:

source = Source.from_path(
    "data/source/source.config.json",
    admin_user=True,
    preprocessor=preprocess_main,
    postprocessor=postprocess_main,
)

Or after initialization:

source.set_preprocessor(preprocess_main)
source.set_postprocessor(postprocess_main)

postprocessor_kwargs are passed to the postprocessor during read(...) and read_one(...).

Config Writers

The package provides independent helpers for creating starter configs:

from monthpack import write_dataframe_config
from monthpack import write_pickle_config
from monthpack import write_series_config

write_dataframe_config("data/sample/source_dataframe.config.json")
write_series_config("data/sample/source_series.config.json")
write_pickle_config("data/sample/source_pickle.config.json")

source.config.json Schema (v0.2.0)

A config is now flat (no storage container):

{
    "name": "main",
    "input": "|input",
    "output": "|output",
    "writer": "pandas",
    "pandas_type": "dataframe",
    "collection": "concat",
    "concat_axis": 0,
    "period_label": "period",
    "persistence": true,
    "metadata": [
        {
            "inpath": "**/{period}_*.csv",
            "reader": "csv",
            "outpath": "{period.year}/{period}_{name}.bin"
        },
        {
            "period": 202507,
            "inpath": "**/{period}_*.xlsx",
            "reader": "excel"
        }
    ]
}

Field overview:

  • name: optional, used for template interpolation ({name}).
  • writer: supports pandas and pickle.
  • pandas_type: required when writer = "pandas"; use dataframe or series.
  • collection: one of list, dict, or concat.
  • concat_axis: axis for concat collection.
  • period_label: optional label for period annotation in pandas collections.
  • period_as_index: optional (false by default). When true and period_label is set, period labeling replaces the existing index in pandas outputs instead of adding a column (DataFrame) or an outer MultiIndex level (Series).
  • persistence: when true, periodic metadata rules behave as persistence anchors.
  • metadata: unified global metadata rules (base, periodic, and temporary).
  • input/output: optional paths.

Metadata rule behavior:

  • entries without period are base values
  • entries with period apply from that period onward
  • entries with temporary: true apply only for the exact period

Read Behavior

  • source.read(period, ...) reads one period.
  • source.read(None, ...) reads the atemporal/base case.
  • source.read([period1, period2, ...], ...) respects list order.
  • source.read((start, end), ...) expands an inclusive monthly range.
  • source.read_one(period, ...) is the single-period helper used internally.

skip_error=True returns None for missing-read cases such as a missing processed file or persistence anchor. With skip_error=False, those cases raise FileNotFoundError.

User Mode

source.set_user()
data = source.read(202401)

In user mode:

  • read(...) only returns already processed data.
  • missing processed files are not regenerated from raw inputs.
  • save(...) is not available.

Metadata Module

monthpack includes an independent metadata resolver module:

from monthpack.metadata import Metadata

metadata = Metadata.from_entries(
    [
        {"reader": "csv", "path": "raw"},
        {"period": 202501, "reader": "excel"},
        {"period": 202502, "temporary": True, "reader": "parquet"},
    ]
)

base = metadata.resolve(None)
current = metadata.resolve(202502)

print(base.reader)
print(current.reader)
print(current.year)
print(current.month)

Period Module

monthpack also includes an independent Period class for monthly values represented as YYYYMM.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

monthpack-0.2.3.tar.gz (14.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

monthpack-0.2.3-py3-none-any.whl (14.3 kB view details)

Uploaded Python 3

File details

Details for the file monthpack-0.2.3.tar.gz.

File metadata

  • Download URL: monthpack-0.2.3.tar.gz
  • Upload date:
  • Size: 14.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for monthpack-0.2.3.tar.gz
Algorithm Hash digest
SHA256 1742c45e4731229ac6565c98f5c9d193176d53cf17f70fbabed68becf43b1826
MD5 27a589227e7e794fe625a1ef5f4d6c34
BLAKE2b-256 393f44ad63cb1c103e990c963f973a5e771dc46ff7269715d79c8151e496476e

See more details on using hashes here.

File details

Details for the file monthpack-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: monthpack-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 14.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for monthpack-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 bc16dbbd9b44cc4fddbba49a547e400274bd6c441a0e901055a92f810b6486ae
MD5 06fce46ebfcba9c68659c8ddcfe0dd4c
BLAKE2b-256 9330415424e25a00d8785175e1da3f6c1cbe22393d528abea6e22269f2bbd074

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page