Skip to main content

A library for compiling excel spreadsheets to python code & visualizing them as a graph

Project description

Pycel

Build Status Code Coverage Requirements Status

Latest Release pypi-pyversions Repo Size Code Size

Pycel is a small python library that can translate an Excel spreadsheet into executable python code which can be run independently of Excel.

The python code is based on a graph and uses caching & lazy evaluation to ensure (relatively) fast execution. The graph can be exported and analyzed using tools like Gephi. See the contained example for an illustration.

Required python libraries:

dateutil, networkx, numpy, openpyxl, ruamel.yaml, and optionally: matplotlib, pydot

The full motivation behind pycel including some examples & screenshots is described in this blog post.

Usage

Download the library and run the example file.

Quick start: You can use binder to see and explore the tool quickly and interactively in the browser: Open Notebook

The good:

All the main mathematical functions (sin, cos, atan2, …) and operators (+,/,^, …) are supported as are ranges (A5:D7), and functions like MIN, MAX, INDEX, LOOKUP, and LINEST.

The codebase is small, relatively fast and should be easy to understand and extend.

I have tested it extensively on spreadsheets with 10 sheets & more than 10000 formulae. In that case calculation of the equations takes about 50ms and agrees with Excel up to 5 decimal places.

The bad:

My development is driven by the particular spreadsheets I need to handle so I have only added support for functions that I need. However, it is should be straightforward to add support for others.

The code does currently not support cell references so a function like OFFSET would take some more work to implement. Not inherently difficult, its just that I have had no need for references yet. Also, for obvious reasons, any VBA code is not compiled but needs to be re-implemented manually on the python side.

The Ugly:

The resulting graph-based code is fast enough for my purposes but to make it truly fast you would probably replace the graph with a dependency tracker based on sparse matrices or something similar.

Excel Addin

It’s possible to run pycel as an excel addin using PyXLL. Simply place pyxll.xll and pyxll.py in the lib directory and add the xll file to the Excel Addins list as explained in the pyxll documentation.

Acknowledgements

This code was originally made possible thanks to the python port of Eric Bachtal’s Excel formula parsing code by Robin Macharg.

The code currently uses a tokenizer of similar origin from the openpyxl library.

Change Log

1.0b27 (Unreleased)

  • Greetings from Customs.

1.0b26 (Unreleased)

Major Changes

  • Update to Python 3.9, drop support for Python 3.5

Changes

  • Add bitwise functions: bitand, bitor, bitxor, bitlshift and bitrshift (Thanks, bogdan-oprescu-nxp)

  • Add PV function (Thanks, estandiaa-marain)

Fixes

  • Fix Openpyxl > 3.0.4 (Thanks, ckp95)

  • Fix HLOOKUP row_index_num validation to use num rows (Thanks, nanaposo)

  • Fix #86, tokenize.TokenError: (‘EOF in multi-line statement’,

  • Fix #88, Handle calcPR in workbook (Thanks, andreif)

  • Fix #89, NPV function fails when passed range of cashflows (Thanks, jpp-0)

  • Fix #93, AssertionError during set_value(), by adding a better error message

  • Fix #99, Pycel raises NotImplementedError on rectangular ranges (Thanks, rmorel)

  • Fix #103, build_operator_operand_fixup() throws #VALUE error when concatenating AddressCell objects (Thanks, nboukraa)

  • Fix #104, Insufficient coverage and testing after recent merges

  • Fix #105, Incorrect RPN for expressions with consecutive negations (Thanks, victorjmarin)

  • Fix #109, String concatenation fails for particular cases (Thanks, bogdan-oprescu-nxp)

  • Fix issue in =IF() when comparing to numpy result

  • Fix MID() and REPLACE() and LEN() in a CSE context

  • Fix INDEX() error handling

1.0b23-1.0b25

Skipped

1.0b22 (2019-10-17)

Fixes

  • Fix #80, incompatible w/ networkx 2.4

1.0b21 (2019-10-13)

Changes

  • Speed up compile

  • Implement defined names in multicolon ranges

  • Tokenize ‘:’ when adjoining functions as infix operator

  • Various changes in prep to improve references, including

  • Add reference expansion to function helpers

  • Add sheet to indirect() and ref_param=0 to offset()

  • Implement is_address() helper

  • Implement intersection and union for AddressCell

Fixes

  • Fix #77, empty arg in IFERROR()

  • Fix #78, None compare and cleanup error handling for various IFS() funcs

1.0b20 (2019-09-22)

Changes

  • Implement multi colon ranges

  • Add support for missing (empty) function parameters

Fixes

  • Fix threading issue in iterative evaluator

  • Fix range intersection with null result for ROW and COLUMN

  • Fix #74 - Count not working for ranges

1.0b19 (2019-09-12)

Changes

  • Implement INDIRECT & OFFSET

  • Implement SMALL, LARGE & ROUNDDOWN (Thanks, nanaposo)

  • Add error message for unhandled missing function parameter

Fixes

  • Fix threading issue w/ CSE evaluator

1.0b18 (2019-09-07)

Changes

  • Implement CEILING_MATH, CEILING_PRECISION, FLOOR_MATH & FLOOR_PRECISION

  • Implement FACT & FACTDOUBLE

  • Implement AVERAGEIF, MAXIFS, MINIFS

  • Implement ODD, EVEN, ISODD, ISEVEN, SIGN

Fixes

  • Fix #67 - Evaluation with unbounded range

  • Fix bugs w/ single cells for xIFS functions

1.0b17 (2019-09-02)

Changes

  • Add Formula Support for Multi Area Ranges from defined names

  • Allow ExcelCompiler init from openpyxl workbook

  • Implement LOWER(), REPLACE(), TRIM() & UPPER()

  • Implement DATEVALUE(), IFS() and ISERR() (Thanks, int128t)

  • Reorganized time and time utils and text functions

  • Add excelutil.AddressMultiAreaRange.

  • Add abs_coordinate() property to AddressRange and AddressCell

  • Cleanup import statements

Fixes

  • Resolved tox version issue on travis

  • Fix defined names with Multi Area Range

1.0b16 (2019-07-07)

Changes

  • Add twelve date and time functions

  • Serialize workbook filename and use it instead of the serialization filename (Thanks, nanaposo)

1.0b15 (2019-06-30)

Changes

  • Implement AVERAGEIFS()

  • Take Iterative Calc Parameter defaults from workbook

Bug Fixes

  • #60, Binder Notebook Example not Working

1.0b14 (2019-06-16)

Changes

  • Added method to evaluate the conditional format (formulas) for a cell or cells

  • Added ExcelCompiler(…, cycles=True) to allow Excel iterative calculations

1.0b13 (2019-05-10)

Changes

  • Implement VALUE()

  • Improve compile performance reversion from CSE work

Bug Fixes

  • #54, In normalize_year(), month % 12 can be 0 -> IllegalMonthError

1.0b12 (2019-04-22)

Changes

  • Add library plugin support

  • Improve evaluate of unbounded row/col (ie: A:B)

  • Fix some regressions from 1.0b11

1.0b11 (2019-04-21)

Major Changes

  • Refactored ExcelWrapper, ExcelFormula & ExcelCompiler to allow…

  • Added considerable extensions to CSE Array Formula Support

  • Refactored function_helpers to add decorators for excelizing library functions

  • Added a dozen new functions

  • Improved various messages and exceptions in validate_calcs() and trim_graph()

Changes

  • Implement LEFT()

  • Implement ISERROR()

  • Implement FIND()

  • Implement ISNUMBER()

  • Implement SUMPRODUCT()

  • Implement CEILING()

  • Implement TRUNC() and FLOOR()

  • Add support for LOG()

  • Improve ABS(), INT() and ROUND()

  • Add quoted_address() method to AddressRange and AddressCell

  • Add public interface to get list of formula_cells()

  • Improve Some NotImplementedError() messages

  • Add NotImplementedError for “linked” sheet names

  • Fix function info for paired functions from same line

  • Add reference URL to function info

  • Only build compiler eval context once

  • Address Range Union and Intersection need sheet_name

CSE Array Changes

  • Add CSE Array handling to excelformula and excelcompiler

  • Change Row, Column & Index to rectangular arrays only

  • Add in_array_formula_context

  • Add cse_array_wrapper() to allow calling functions in array context

  • Add error_string_wrapper() to check for excel errors

  • Move math_wrap() to function_helpers.

  • Handle Direct CSE Array in cell

  • Reorganize CSE Array Formula handling in excelwrapper

  • For CSE Arrays that are smaller than target fill w/ None

  • Trim oversize array results to fit target range

  • Improve needed addresses parser from python code

  • Improve _coerce_to_number() and _numerics() for CSE arrays

  • Remove formulas from excelwrapper._OpxRange()

Bug Fixes

  • Fix Range Intersection

  • Fix Unary Minus on Empty cell

  • Fix ISNA()

  • Fix AddressCell create from tuple

  • Power(0,-1) now returns DIV0

  • Cleanup index()

1.0b8 (2019-03-20)

Changes

  • Implement operators for Array Formulas

  • Implement concatenate and concat

  • Implement subtotal

  • Add support for expanding array formulas

  • Fix column and row for array formulas

  • Add support for table relative references

  • Add function information methods

  • Improve messages for validate_calcs and not implemented functions

1.0b7 (2019-03-10)

Changes

  • Implement Array (CSE) Formulas

Bug Fixes

  • Fix #45 - Unbounded Range Addresses (ie: A:B or 1:2) broken

1.0b6 (2019-03-03)

Bug Fixes

  • Fix #42 - ‘ReadOnlyWorksheet’ object has no attribute ‘iter_cols’

  • Fix #43 - Fix error with leading/trailing whitespace

1.0b5 (2019-02-24)

Changes

  • Implement XOR(), NOT(), TRUE(), FALSE()

  • Improve error handling for AND(), OR()

  • Implement POWER() function

1.0b4 (2019-02-17)

Changes

  • Move to openpyxl 2.6+

  • Remove support for Python 3.4

1.0b3 (2019-02-02)

Changes

  • Work around openpyxl returning datetimes

  • Pin to openpyxl 2.5.12 to avoid bug in 2.5.14 (fixed in PR #315)

1.0b2 (2019-01-05)

Changes

  • Much work to better match Excel error processing

  • Extend validate_calcs() to allow testing entire workbook

  • Improvements to match(), including wildcard support

  • Finished implementing match(), lookup(), vlookup() and hlookup()

  • Implement COLUMN() and ROW()

  • Implement % operator

  • Implement len()

  • Implement binary base number Excel functions (hex2dec, etc.)

  • Fix PI()

1.0b0 (2018-12-25)

Major changes

  • Converted to Python 3.4+

  • Removed Windows Excel COM driver (openpyxl is used for all xlsx reading)

  • Add support for defined names

  • Add support for structured references

  • Fix support for relative formulas

  • set_value() and evaluate() support ranges and lists

  • Add several more library functions

  • Add AddressRange and AddressCell classes to encapsulate address calcs

  • Add validate_calcs() to aid debugging excellib functions

  • Add build feature which can limit recompile to only when excel file changes

  • Improved handling for #DIV0! and #VALUE!

Compatibility

  • Tests run on Python 3.4, 3.5, 3.6, 3.7 (via tox)

  • Python 2 no longer supported

Other Changes

  • Heavily refactored ExcelCompiler

  • Moved all formula evaluation, parsing, etc, code to ExcelFormula class

  • Convert to using openpyxl tokenizer

  • Converted prints to logging calls

  • Convert to using pytest

  • Add support for travis and codecov.io

  • 100% unit test coverage (mostly)

  • Add debuggable formula evaluation

  • Cleanup generated Python code to make easier to read

  • Add a text format (yaml or json) serialization format

  • flake8 (pep8) checks added

  • pip now handles which Python versions can be used

  • Release to PyPI

  • Docs updated

Bugfixes

  • Numerous

0.0.1 (unreleased)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pycel-fixed-1.0b27.tar.gz (102.8 kB view details)

Uploaded Source

Built Distribution

pycel_fixed-1.0b27-py3-none-any.whl (98.8 kB view details)

Uploaded Python 3

File details

Details for the file pycel-fixed-1.0b27.tar.gz.

File metadata

  • Download URL: pycel-fixed-1.0b27.tar.gz
  • Upload date:
  • Size: 102.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5

File hashes

Hashes for pycel-fixed-1.0b27.tar.gz
Algorithm Hash digest
SHA256 b13485fa5836556523e0eca73858ccd212c8a83e8040a184ef489a67ee03a3f3
MD5 c026213b1854d302030c48c099b521a4
BLAKE2b-256 f63caf0f9b2517c6220ffcc1221d77c29bd895311b858c73ae74a0488f067cda

See more details on using hashes here.

File details

Details for the file pycel_fixed-1.0b27-py3-none-any.whl.

File metadata

  • Download URL: pycel_fixed-1.0b27-py3-none-any.whl
  • Upload date:
  • Size: 98.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5

File hashes

Hashes for pycel_fixed-1.0b27-py3-none-any.whl
Algorithm Hash digest
SHA256 db1309e46881dc4a8a9abbe5c300cfdcb3080f05f42075ca594b2772789d8f2b
MD5 bad63242257773ed214c40247078a6e6
BLAKE2b-256 eed3b339a009fd80e7ec986fede765a68cf921dec5c370edc57f0fa9c29cefd8

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page