Skip to main content

A framework to break down Excel Formula strings into a parsable and modifiable JSON data structure.

Project description

ExcelFormulaParser

Description

ExcelFormulaParser extends openpyxl to enhance formula parsing capabilities in Excel spreadsheets. While openpyxl handles spreadsheet management well, its formula parsing features are limited. This project aims to build a robust structure for mapping and manipulating Excel formulas through a comprehensive, user-friendly data structure. Features

 - Formula Parsing: Decompose formulas into a structured format, making them easy to manipulate and analyze.

 - Formula Manipulation: Implement methods like formula.translate() to adjust formulas based on cell repositioning.

 - Analytics: Analyze formulas to extract components count, such as the number of functions used.

Okay... but why?

Strings are inherently chaotic. They give us all the info we need about a formula at the surface level but not much beyond that. If you need to systematically modify a formula in a way that's not easy to program in rules for, this will help a ton. I presume for most people this library will be overkill for their spreadsheet modification needs. But I still think it's better to have a robust overkill solution to a problem than having to spend 3 hours writing complex regex string matching, grouping, and substitution algorithms to do this instead of just treating the formula like a dictionary and iterating through a dictionary.

This also allows for easier transmission, encoding and decoding, of these formulas via APIs (god forbid you'd need to do that), and not to mention allows you to store these structures to files as json files. As well as adding any other metadata fields you need to the structure before it gets saved. So I'm imagining a lot of benefits here.

Progress and Contributions

The project is in active development, aiming for significant functionality completion by end of May 2024. Contributions are welcome! If you'd like to help:

Fork the repository and create a pull request with your additions.
Ensure your code is accompanied by updated or new tests that validate your changes.

Current Work

- Developing features to provide analytical insights into formula components.
- Need to work on implementing absolute cell reference support into our structure. Shouldn't be too hard but will take time as everything relies on this class. 

installation

pip install ExcelFormulaParser

Usage

To use ExcelFormulaParser, start by importing the module and initializing a Formula object with an Excel formula string. Here’s how you can parse a formula and manipulate it:

import ExcelFormulaParser as efp

# Initialize with a formula
f = efp.Formula("=SUM(A1, A2)")

# Access the parsed formula
parsed_formula = f.parsed_formula

# To reflect changes back to a formula string after modifications:
f_str = str(f)  # reconstructs back into an excel friendly string

# dictionary structure of this formula: 
# {
#     "function': '=SUM(A1, A2)
#     "components": {
#         "name": "SUM"
#         "arguments": [
#             {"reference": "A1"
#              "components": {
#                  "sheet_name": None,
#                  "column_letter": "A",
#                  "column_number": 1,
#                  "row_number": 1}},
#
#             {"reference": "A2"
#              "components": {
#                  "sheet_name": None,
#                  "column_letter": "A",
#                  "column_number": 1,
#                  "row_number": 2}},
#          ]
#     }
# }

Translating formulas

import ExcelFormulaParser as efp

# Initialize with a formula
f = efp.Formula("=SUM(A1, A2)")
f.translate(input_cell="A1", output_cell="C3")
str(f) # would now return: "=SUM(C3, C4)"

# so essentially translating is just taking every cell reference in 
# the formula and calculating the offset in rows and columns from the 
# input and output cell shift amounts, then applying that to every cell reference.

# I plan to add more functionality to this, but for now this is just a blanket translation to all cel refs.

Iterating through cell ranges

# you can already iterate through a function's arguments since function_instance.args is a list
# and same goes for expression components. 
# but to iterate through a range of cells or check if a cell ref string would be inside a bigger range, you can do this syntax: 
import ExcelFormulaParser as efp

# Instantiate a Range object
cell_range = efp.Range("A1:B2")

# Check if a cell reference is within the range
print("B1" in cell_range)  # Outputs: True
print("B3" in cell_range)  # Outputs: False

# Iterate through all cells in the range
for cell in cell_range:
    print(cell)  # Outputs: 'A1', 'B1', 'A2', 'B2'

# You can also iterate row-wise if needed:
for row in cell_range.get_cells_in_range():
    for cell in row:
        print(cell)  # Each cell within each row of the range is printed

Project Structure

Explore the src/Models directory for class details and src/Tests for unit tests corresponding to each class. The primary interface is through the Formula class, which serves as a facade for the underlying Parser class handling the detailed parsing logic.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ExcelFormulaParser-1.0.2.tar.gz (17.4 kB view details)

Uploaded Source

Built Distribution

ExcelFormulaParser-1.0.2-py3-none-any.whl (22.7 kB view details)

Uploaded Python 3

File details

Details for the file ExcelFormulaParser-1.0.2.tar.gz.

File metadata

  • Download URL: ExcelFormulaParser-1.0.2.tar.gz
  • Upload date:
  • Size: 17.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.12

File hashes

Hashes for ExcelFormulaParser-1.0.2.tar.gz
Algorithm Hash digest
SHA256 5995c0bfc0a554ff96b64821e9f7c8b025b8790892b70eeb3e6afbd4ccb1be08
MD5 54353bcb94345b0970f19bac5bedd36a
BLAKE2b-256 65b7fbe1d48dff4392d6559554557398f749f9d8173a9e227052e670a1e20fac

See more details on using hashes here.

File details

Details for the file ExcelFormulaParser-1.0.2-py3-none-any.whl.

File metadata

File hashes

Hashes for ExcelFormulaParser-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 015ed3247540e1442b0275de64985d0a80a8e2ba2af40ad160d18771018b598b
MD5 9183d82203a341fa923cab7be7b21b57
BLAKE2b-256 a3200c9093ee99744d4fe15607d759e63f3c823cfc6b6a152353ae7ddbca6c4a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page