A library for compiling excel spreadsheets to python code & visualizing them as a graph
Project description
Pycel
Pycel is a small python library that can translate an Excel spreadsheet into executable python code which can be run independently of Excel.
The python code is based on a graph and uses caching & lazy evaluation to ensure (relatively) fast execution. The graph can be exported and analyzed using tools like Gephi. See the contained example for an illustration.
- Required python libraries:
dateutil, networkx, numpy, openpyxl, ruamel.yaml, and optionally: matplotlib, pydot
The full motivation behind pycel including some examples & screenshots is described in this blog post.
Usage
Download the library and run the example file.
Quick start: You can use binder to see and explore the tool quickly and interactively in the browser:
The good:
All the main mathematical functions (sin, cos, atan2, …) and operators (+,/,^, …) are supported as are ranges (A5:D7), and functions like MIN, MAX, INDEX, LOOKUP, and LINEST.
The codebase is small, relatively fast and should be easy to understand and extend.
I have tested it extensively on spreadsheets with 10 sheets & more than 10000 formulae. In that case calculation of the equations takes about 50ms and agrees with Excel up to 5 decimal places.
The bad:
My development is driven by the particular spreadsheets I need to handle so I have only added support for functions that I need. However, it is should be straightforward to add support for others.
The code does currently not support cell references so a function like OFFSET would take some more work to implement. Not inherently difficult, its just that I have had no need for references yet. Also, for obvious reasons, any VBA code is not compiled but needs to be re-implemented manually on the python side.
The Ugly:
The resulting graph-based code is fast enough for my purposes but to make it truly fast you would probably replace the graph with a dependency tracker based on sparse matrices or something similar.
Excel Addin
It’s possible to run pycel as an excel addin using PyXLL. Simply place pyxll.xll and pyxll.py in the lib directory and add the xll file to the Excel Addins list as explained in the pyxll documentation.
Acknowledgements
This code was originally made possible thanks to the python port of Eric Bachtal’s Excel formula parsing code by Robin Macharg.
The code currently uses a tokenizer of similar origin from the openpyxl library.
Change Log
1.0b27 (Unreleased)
Greetings from Customs.
1.0b26 (Unreleased)
Major Changes
Update to Python 3.9, drop support for Python 3.5
Changes
Add bitwise functions: bitand, bitor, bitxor, bitlshift and bitrshift (Thanks, bogdan-oprescu-nxp)
Add PV function (Thanks, estandiaa-marain)
Fixes
Fix Openpyxl > 3.0.4 (Thanks, ckp95)
Fix HLOOKUP row_index_num validation to use num rows (Thanks, nanaposo)
Fix #86, tokenize.TokenError: (‘EOF in multi-line statement’,
Fix #88, Handle calcPR in workbook (Thanks, andreif)
Fix #89, NPV function fails when passed range of cashflows (Thanks, jpp-0)
Fix #93, AssertionError during set_value(), by adding a better error message
Fix #99, Pycel raises NotImplementedError on rectangular ranges (Thanks, rmorel)
Fix #103, build_operator_operand_fixup() throws #VALUE error when concatenating AddressCell objects (Thanks, nboukraa)
Fix #104, Insufficient coverage and testing after recent merges
Fix #105, Incorrect RPN for expressions with consecutive negations (Thanks, victorjmarin)
Fix #109, String concatenation fails for particular cases (Thanks, bogdan-oprescu-nxp)
Fix issue in =IF() when comparing to numpy result
Fix MID() and REPLACE() and LEN() in a CSE context
Fix INDEX() error handling
1.0b23-1.0b25
Skipped
1.0b22 (2019-10-17)
Fixes
Fix #80, incompatible w/ networkx 2.4
1.0b21 (2019-10-13)
Changes
Speed up compile
Implement defined names in multicolon ranges
Tokenize ‘:’ when adjoining functions as infix operator
Various changes in prep to improve references, including
Add reference expansion to function helpers
Add sheet to indirect() and ref_param=0 to offset()
Implement is_address() helper
Implement intersection and union for AddressCell
Fixes
Fix #77, empty arg in IFERROR()
Fix #78, None compare and cleanup error handling for various IFS() funcs
1.0b20 (2019-09-22)
Changes
Implement multi colon ranges
Add support for missing (empty) function parameters
Fixes
Fix threading issue in iterative evaluator
Fix range intersection with null result for ROW and COLUMN
Fix #74 - Count not working for ranges
1.0b19 (2019-09-12)
Changes
Implement INDIRECT & OFFSET
Implement SMALL, LARGE & ROUNDDOWN (Thanks, nanaposo)
Add error message for unhandled missing function parameter
Fixes
Fix threading issue w/ CSE evaluator
1.0b18 (2019-09-07)
Changes
Implement CEILING_MATH, CEILING_PRECISION, FLOOR_MATH & FLOOR_PRECISION
Implement FACT & FACTDOUBLE
Implement AVERAGEIF, MAXIFS, MINIFS
Implement ODD, EVEN, ISODD, ISEVEN, SIGN
Fixes
Fix #67 - Evaluation with unbounded range
Fix bugs w/ single cells for xIFS functions
1.0b17 (2019-09-02)
Changes
Add Formula Support for Multi Area Ranges from defined names
Allow ExcelCompiler init from openpyxl workbook
Implement LOWER(), REPLACE(), TRIM() & UPPER()
Implement DATEVALUE(), IFS() and ISERR() (Thanks, int128t)
Reorganized time and time utils and text functions
Add excelutil.AddressMultiAreaRange.
Add abs_coordinate() property to AddressRange and AddressCell
Cleanup import statements
Fixes
Resolved tox version issue on travis
Fix defined names with Multi Area Range
1.0b16 (2019-07-07)
Changes
Add twelve date and time functions
Serialize workbook filename and use it instead of the serialization filename (Thanks, nanaposo)
1.0b15 (2019-06-30)
Changes
Implement AVERAGEIFS()
Take Iterative Calc Parameter defaults from workbook
Bug Fixes
#60, Binder Notebook Example not Working
1.0b14 (2019-06-16)
Changes
Added method to evaluate the conditional format (formulas) for a cell or cells
Added ExcelCompiler(…, cycles=True) to allow Excel iterative calculations
1.0b13 (2019-05-10)
Changes
Implement VALUE()
Improve compile performance reversion from CSE work
Bug Fixes
#54, In normalize_year(), month % 12 can be 0 -> IllegalMonthError
1.0b12 (2019-04-22)
Changes
Add library plugin support
Improve evaluate of unbounded row/col (ie: A:B)
Fix some regressions from 1.0b11
1.0b11 (2019-04-21)
Major Changes
Refactored ExcelWrapper, ExcelFormula & ExcelCompiler to allow…
Added considerable extensions to CSE Array Formula Support
Refactored function_helpers to add decorators for excelizing library functions
Added a dozen new functions
Improved various messages and exceptions in validate_calcs() and trim_graph()
Changes
Implement LEFT()
Implement ISERROR()
Implement FIND()
Implement ISNUMBER()
Implement SUMPRODUCT()
Implement CEILING()
Implement TRUNC() and FLOOR()
Add support for LOG()
Improve ABS(), INT() and ROUND()
Add quoted_address() method to AddressRange and AddressCell
Add public interface to get list of formula_cells()
Improve Some NotImplementedError() messages
Add NotImplementedError for “linked” sheet names
Fix function info for paired functions from same line
Add reference URL to function info
Only build compiler eval context once
Address Range Union and Intersection need sheet_name
CSE Array Changes
Add CSE Array handling to excelformula and excelcompiler
Change Row, Column & Index to rectangular arrays only
Add in_array_formula_context
Add cse_array_wrapper() to allow calling functions in array context
Add error_string_wrapper() to check for excel errors
Move math_wrap() to function_helpers.
Handle Direct CSE Array in cell
Reorganize CSE Array Formula handling in excelwrapper
For CSE Arrays that are smaller than target fill w/ None
Trim oversize array results to fit target range
Improve needed addresses parser from python code
Improve _coerce_to_number() and _numerics() for CSE arrays
Remove formulas from excelwrapper._OpxRange()
Bug Fixes
Fix Range Intersection
Fix Unary Minus on Empty cell
Fix ISNA()
Fix AddressCell create from tuple
Power(0,-1) now returns DIV0
Cleanup index()
1.0b8 (2019-03-20)
Changes
Implement operators for Array Formulas
Implement concatenate and concat
Implement subtotal
Add support for expanding array formulas
Fix column and row for array formulas
Add support for table relative references
Add function information methods
Improve messages for validate_calcs and not implemented functions
1.0b7 (2019-03-10)
Changes
Implement Array (CSE) Formulas
Bug Fixes
Fix #45 - Unbounded Range Addresses (ie: A:B or 1:2) broken
1.0b6 (2019-03-03)
Bug Fixes
Fix #42 - ‘ReadOnlyWorksheet’ object has no attribute ‘iter_cols’
Fix #43 - Fix error with leading/trailing whitespace
1.0b5 (2019-02-24)
Changes
Implement XOR(), NOT(), TRUE(), FALSE()
Improve error handling for AND(), OR()
Implement POWER() function
1.0b4 (2019-02-17)
Changes
Move to openpyxl 2.6+
Remove support for Python 3.4
1.0b3 (2019-02-02)
Changes
Work around openpyxl returning datetimes
Pin to openpyxl 2.5.12 to avoid bug in 2.5.14 (fixed in PR #315)
1.0b2 (2019-01-05)
Changes
Much work to better match Excel error processing
Extend validate_calcs() to allow testing entire workbook
Improvements to match(), including wildcard support
Finished implementing match(), lookup(), vlookup() and hlookup()
Implement COLUMN() and ROW()
Implement % operator
Implement len()
Implement binary base number Excel functions (hex2dec, etc.)
Fix PI()
1.0b0 (2018-12-25)
Major changes
Converted to Python 3.4+
Removed Windows Excel COM driver (openpyxl is used for all xlsx reading)
Add support for defined names
Add support for structured references
Fix support for relative formulas
set_value() and evaluate() support ranges and lists
Add several more library functions
Add AddressRange and AddressCell classes to encapsulate address calcs
Add validate_calcs() to aid debugging excellib functions
Add build feature which can limit recompile to only when excel file changes
Improved handling for #DIV0! and #VALUE!
Compatibility
Tests run on Python 3.4, 3.5, 3.6, 3.7 (via tox)
Python 2 no longer supported
Other Changes
Heavily refactored ExcelCompiler
Moved all formula evaluation, parsing, etc, code to ExcelFormula class
Convert to using openpyxl tokenizer
Converted prints to logging calls
Convert to using pytest
Add support for travis and codecov.io
100% unit test coverage (mostly)
Add debuggable formula evaluation
Cleanup generated Python code to make easier to read
Add a text format (yaml or json) serialization format
flake8 (pep8) checks added
pip now handles which Python versions can be used
Release to PyPI
Docs updated
Bugfixes
Numerous
0.0.1 (unreleased)
Original version available from Dirk Ggorissen’s Pycel Github Page.
Supports Python 2
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pycel-fixed-1.0b27.tar.gz
.
File metadata
- Download URL: pycel-fixed-1.0b27.tar.gz
- Upload date:
- Size: 102.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | b13485fa5836556523e0eca73858ccd212c8a83e8040a184ef489a67ee03a3f3 |
|
MD5 | c026213b1854d302030c48c099b521a4 |
|
BLAKE2b-256 | f63caf0f9b2517c6220ffcc1221d77c29bd895311b858c73ae74a0488f067cda |
File details
Details for the file pycel_fixed-1.0b27-py3-none-any.whl
.
File metadata
- Download URL: pycel_fixed-1.0b27-py3-none-any.whl
- Upload date:
- Size: 98.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | db1309e46881dc4a8a9abbe5c300cfdcb3080f05f42075ca594b2772789d8f2b |
|
MD5 | bad63242257773ed214c40247078a6e6 |
|
BLAKE2b-256 | eed3b339a009fd80e7ec986fede765a68cf921dec5c370edc57f0fa9c29cefd8 |