A suite of tools to help with data analysis in Python.

data-analysis

Python utilities for doing data analysis

Currently includes three utilities: FinancialYear, ExcelTable and DataOutput.

FinancialYear

Represents a financial year, created from a string in the format "2020-21".

Usage:

from caradoc import FinancialYear

fy = FinancialYear("2020-21")

fy + 1  # FinancialYear("2021-22")
fy - 1  # FinancialYear("2019-20")

str(fy)  # "2020-21"
int(fy)  # 2020
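The label format and the arithmetic above can be illustrated without the library. The following is a minimal standard-library sketch, not caradoc's implementation; the helper names parse_fy, format_fy and shift_fy are hypothetical and exist only for this illustration.

```python
import re


def parse_fy(label: str) -> int:
    """Return the starting calendar year of a label such as "2020-21"."""
    match = re.fullmatch(r"(\d{4})-(\d{2})", label)
    if match is None:
        raise ValueError(f"expected a label like '2020-21', got {label!r}")
    start, suffix = int(match.group(1)), int(match.group(2))
    # The two-digit suffix must be the year after the start year (2020 -> 21).
    if (start + 1) % 100 != suffix:
        raise ValueError(f"{label!r} does not span two consecutive years")
    return start


def format_fy(start: int) -> str:
    """Format a starting calendar year as a label such as "2020-21"."""
    return f"{start}-{(start + 1) % 100:02d}"


def shift_fy(label: str, years: int) -> str:
    """Move a label forwards or backwards by a whole number of years."""
    return format_fy(parse_fy(label) + years)


print(shift_fy("2020-21", 1))   # 2021-22
print(shift_fy("2020-21", -1))  # 2019-20
print(parse_fy("2020-21"))      # 2020
```

Storing only the starting year keeps addition and subtraction trivial, and the label is rebuilt on demand.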

Create from date or year

from datetime import date
from caradoc import FinancialYear

fy = FinancialYear.from_date(date(2020, 1, 1))
str(fy)  # "2019-20"

fy = FinancialYear.from_int(2020)
str(fy)  # "2020-21"
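The from_date result above follows from the year boundary: 1 January 2020 falls before April, so it belongs to the year that started in 2019. A rough sketch of that rule, assuming years start on 1 April (the function fy_label_from_date and its start_month parameter are hypothetical, not part of caradoc):

```python
from datetime import date


def fy_label_from_date(d: date, start_month: int = 4) -> str:
    """Return the financial year label containing d, assuming years start
    on the first day of start_month (April by default)."""
    # Dates before the start month belong to the year that began in the
    # previous calendar year.
    start = d.year if d.month >= start_month else d.year - 1
    return f"{start}-{(start + 1) % 100:02d}"


print(fy_label_from_date(date(2020, 1, 1)))   # 2019-20
print(fy_label_from_date(date(2020, 3, 31)))  # 2019-20
print(fy_label_from_date(date(2020, 4, 1)))   # 2020-21
```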

Useful utilities:

from datetime import date
from caradoc import FinancialYear

fy = FinancialYear("2020-21")

fy.previous_n_years(4)  # [
#    FinancialYear("2016-17"),
#    FinancialYear("2017-18"),
#    FinancialYear("2018-19"),
#    FinancialYear("2019-20"),
#    FinancialYear("2020-21")
# ]

FinancialYear.range("2018-19", "2020-21")  # [
#    FinancialYear("2018-19"),
#    FinancialYear("2019-20"),
#    FinancialYear("2020-21"),
# ]

d = date(2021, 6, 1)
d in FinancialYear("2021-22")  # True
d in FinancialYear("2020-21")  # False

Financial years are currently hardcoded to end on 31 March (so each year runs from 1 April to 31 March), but this will be changed.
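To make the 31 March boundary concrete, here is a small standard-library sketch of the date bounds, the membership test and an inclusive range of labels. It is an illustration under the assumption of a 1 April start, not the library's code; fy_bounds, in_fy and fy_labels are hypothetical names.

```python
from datetime import date, timedelta


def fy_bounds(start_year: int, start_month: int = 4) -> tuple[date, date]:
    """Return the first and last day of the year starting in start_year."""
    first = date(start_year, start_month, 1)
    # The last day is the day before the next year starts.
    last = date(start_year + 1, start_month, 1) - timedelta(days=1)
    return first, last


def in_fy(d: date, start_year: int) -> bool:
    """Check whether d falls inside the year starting in start_year."""
    first, last = fy_bounds(start_year)
    return first <= d <= last


def fy_labels(first_start: int, last_start: int) -> list[str]:
    """Return labels for every year from first_start to last_start inclusive."""
    return [f"{y}-{(y + 1) % 100:02d}" for y in range(first_start, last_start + 1)]


print(fy_bounds(2021))                  # (datetime.date(2021, 4, 1), datetime.date(2022, 3, 31))
print(in_fy(date(2021, 6, 1), 2021))    # True
print(in_fy(date(2021, 6, 1), 2020))    # False
print(fy_labels(2018, 2020))            # ['2018-19', '2019-20', '2020-21']
```

Taking the start month as a parameter is one way the hardcoded year end could be relaxed.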

ExcelTable

Represents a table in an Excel workbook.

The table itself is a pandas DataFrame. The DataFrame index is not written to the Excel file.

Allows for specifying a title, summary and notes for the table.

Parameters

  • df: pandas DataFrame
  • title: Optional title for the table
  • summary: Optional summary for the table
  • notes: Optional notes for the table

Methods

  • to_excel_table(): Writes just the data table (df) to an Excel file as a Table (with filters)
  • to_excel(): Writes the table to an Excel file as a Table, with the title and summary as a header and the notes as a footer.

Usage

from caradoc import ExcelTable
import pandas as pd

df = pd.DataFrame({"alice": [1, 2, 3], "bob": [4, 5, 6]})
et = ExcelTable(
    df,
    title="Test Table"
)
with pd.ExcelWriter("test_file.xlsx", engine="auto") as writer:
    et.to_excel(writer, "test_sheet")

Output looks something like:

     A            B
1    Test Table
2
3    Alice        Bob
4    1            4
5    2            5
6    3            6

You can also include a summary (underneath the title) and notes (underneath the table) using the summary= and notes= arguments.
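The row order described above (title, then summary, then the table, then notes) can be pictured with a plain-Python sketch. This is not how caradoc builds the sheet; layout_rows is a hypothetical helper, and the placement of blank spacer rows around the summary and notes is an assumption extrapolated from the sample output.

```python
def layout_rows(columns, records, title=None, summary=None, notes=None):
    """Return the sheet as a list of rows: optional title and summary,
    a blank spacer, the header row, the data rows, then optional notes."""
    rows = []
    if title is not None:
        rows.append([title])
    if summary is not None:
        rows.append([summary])
    if rows:
        rows.append([])  # assumed blank row between the header text and the table
    rows.append(list(columns))
    rows.extend(list(record) for record in records)
    if notes is not None:
        rows.append([])  # assumed blank row between the table and the notes
        rows.append([notes])
    return rows


for row in layout_rows(
    ["alice", "bob"],
    [(1, 4), (2, 5), (3, 6)],
    title="Test Table",
    summary="Example summary",
    notes="Example notes",
):
    print(row)
```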

DataOutput

Represents a collection of ExcelTables to be written to an Excel file.

Methods

  • add_table(): Adds a table to the DataOutput
  • write(): Writes the DataOutput to an Excel file

Usage

from caradoc import DataOutput, ExcelTable
import pandas as pd

output = DataOutput()

df1 = pd.DataFrame({"alice": [1, 2, 3], "bob": [4, 5, 6]})
table1 = ExcelTable(
    df1,
    title="Test Table"
)

output.add_table("test_sheet", table1)

df2 = pd.DataFrame({"alice": [1, 2, 3], "bob": [4, 5, 6]})
output.add_table("test_sheet", df2, title="Test Table 2")

output.write("test_file.xlsx")

The resulting test_file.xlsx is an Excel workbook with a sheet called "test_sheet". The sheet contains both tables, each with its own title, with spacing between them.
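Placing several tables on one sheet comes down to choosing a starting row for each. The sketch below shows one way to compute those offsets; start_rows and its gap parameter are hypothetical and are not taken from caradoc. Offsets like these could be passed to the startrow argument of pandas' DataFrame.to_excel.

```python
def start_rows(heights, gap=1):
    """Return the zero-based first row of each block when blocks of the
    given heights are stacked vertically with gap blank rows between them."""
    rows = []
    current = 0
    for height in heights:
        rows.append(current)
        current += height + gap
    return rows


# Two blocks of six rows each (title, blank, header, three data rows),
# separated by two blank rows.
print(start_rows([6, 6], gap=2))  # [0, 8]
```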

Development

Run tests

Tests use pytest and are run through hatch:

hatch run test

Test coverage

hatch run cov-html

Run typing checks

hatch run lint:typing

Linting

Black and ruff should be run before committing any changes.

hatch run lint:style

Run all checks at once

hatch run lint:all

Publish to pypi

python -m build
twine upload dist/*
git tag v<VERSION_NUMBER>
git push origin v<VERSION_NUMBER>

Install development version

The development requirements are installed using pip install -r dev-requirements.txt.

Any additional requirements for the module itself must be added to install_requires in setup.py. You should then generate a new requirements.txt using pip-tools (pip-compile). You can then run pip-sync to install the requirements.

Any additional development requirements must be added to dev-requirements.in and then the dev-requirements.txt should be generated using pip-compile dev-requirements.in. You can then install the development requirements using pip-sync dev-requirements.txt.
