A Python module that extracts all the content of an Excel document and enables calculation without Excel

Project description

Koala

Koala converts any Excel workbook into a Python object that enables on-the-fly calculation without the need for Excel.

Koala parses an Excel workbook and creates a network of all the cells with their dependencies. It is then possible to change any value of a node and recompute all the depending cells.
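To make the idea of a cell network concrete, here is a minimal sketch that does not use Koala at all. It assumes a hypothetical toy workbook in which formulas are plain Python arithmetic over cell names (not real Excel syntax), and the names `cells`, `build_dependencies` and `recompute` are made up for this illustration. It relies on `graphlib`, which ships with Python 3.9 and later.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy workbook: numbers are constants, strings are formulas
# written as Python arithmetic over cell names (not Excel syntax).
cells = {"A1": 10, "B1": 5, "C1": "A1 + B1", "D1": "C1 * 2"}

CELL_REF = re.compile(r"\b[A-Z]+[0-9]+\b")


def build_dependencies(cells):
    """Map each cell to the set of cells its formula reads."""
    return {
        name: set(CELL_REF.findall(content)) if isinstance(content, str) else set()
        for name, content in cells.items()
    }


def recompute(cells):
    """Evaluate every cell in dependency order and return all values."""
    values = {}
    order = TopologicalSorter(build_dependencies(cells)).static_order()
    for name in order:
        content = cells[name]
        if isinstance(content, str):
            # eval is tolerable here only because the formulas are our own toy strings.
            values[name] = eval(content, {"__builtins__": {}}, values)
        else:
            values[name] = content
    return values


before = recompute(cells)   # D1 is (10 + 5) * 2
cells["A1"] = 20            # change the value of one node
after = recompute(cells)    # C1 and D1 pick up the new value
```

This toy version recomputes every cell after a change; the point is only to show how dependencies extracted from formulas determine the evaluation order.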

Get started

Installation

Koala is available on PyPI, so you can just:

pip install koala2

Alternatively, you can download and install the latest version from GitHub:

git clone https://github.com/anthill/koala.git
cd koala
python setup.py install

Basic

Koala is still in the early stages of development, so feel free to open an issue when you encounter a problem.

Graph generation

The first thing you need is to convert your workbook into a graph. This operation may take some time depending on the size of your workbook (we've used Koala on workbooks containing more than 100 000 intricate formulas).

# Spreadsheet is the class used throughout this README
from koala.Spreadsheet import Spreadsheet

sp = Spreadsheet("examples/basic.xlsx")

If this step fails, ensure that your Excel file is recent and in standalone mode (open it with Excel and save it; Excel should rewrite the file, and the resulting file should be three or four times larger).

Graph Serialization

As the previous conversion can take a long time on big graphs, it is often useful to dump the graph to a file:

sp.dump('file.gzip')

which can be reloaded later with:

sp = Spreadsheet.load('file.gzip')
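A common way to combine the two calls is to build the graph only when no dump exists yet. The helper below is a generic sketch of that pattern, not part of Koala: `load_or_build` and its parameters are hypothetical names, and it only assumes that the built object has a `dump(path)` method.

```python
import os


def load_or_build(cache_path, build, load):
    """Return the cached object if cache_path exists, else build and dump it.

    build: zero-argument callable returning an object with a dump(path) method
    load:  one-argument callable that restores the object from a path
    """
    if os.path.exists(cache_path):
        return load(cache_path)
    obj = build()
    obj.dump(cache_path)
    return obj
```

With the calls shown above, this could be used as load_or_build('file.gzip', lambda: Spreadsheet("examples/basic.xlsx"), Spreadsheet.load). Remember to delete the dump whenever the workbook changes, since the helper does not detect stale files.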

Graph Evaluation

You can read the value of a cell with cell_evaluate. The calculation is rerun only if one of the cell's parents has been modified with cell_set_value; otherwise the stored value is returned.

sp.cell_set_value('Sheet1!A1', 10)
sp.cell_evaluate('Sheet1!D1')
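The evaluate-only-when-needed behaviour can be illustrated with a small stand-alone sketch. It is not Koala's implementation: `LazyCells` is a hypothetical toy class, formulas are Python callables instead of Excel formulas, and the `evaluations` counter exists only to make the caching visible.

```python
class LazyCells:
    """Toy cell store with cached values and stale-marking on change."""

    def __init__(self):
        self.values = {}      # last known value of each cell
        self.formulas = {}    # cell -> (function, list of parent cells)
        self.children = {}    # cell -> set of cells that read it
        self.dirty = set()    # formula cells whose cached value is stale
        self.evaluations = 0  # number of real recomputations

    def add_value(self, name, value):
        self.values[name] = value

    def add_formula(self, name, func, parents):
        self.formulas[name] = (func, parents)
        for parent in parents:
            self.children.setdefault(parent, set()).add(name)
        self.dirty.add(name)

    def set_value(self, name, value):
        self.values[name] = value
        # Mark every downstream cell as stale instead of recomputing it now.
        stack, seen = [name], set()
        while stack:
            for child in self.children.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    self.dirty.add(child)
                    stack.append(child)

    def evaluate(self, name):
        if name in self.dirty:
            func, parents = self.formulas[name]
            self.values[name] = func(*(self.evaluate(p) for p in parents))
            self.dirty.discard(name)
            self.evaluations += 1
        return self.values[name]


sheet = LazyCells()
sheet.add_value("A1", 1)
sheet.add_formula("D1", lambda a: a * 3, ["A1"])
first = sheet.evaluate("D1")    # computed on first access
again = sheet.evaluate("D1")    # served from the cache
sheet.set_value("A1", 10)
updated = sheet.evaluate("D1")  # recomputed because a parent changed
```

After these calls the counter shows two recomputations for three reads, which is the effect described above.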

Named cells or range

If your Excel file has names defined, you can use them freely:

sp.cell_set_value('myNamedCell', 0)

Advanced

Compiler options

You can pass ignore_sheets to ignore a list of Sheets, and ignore_hidden to ignore all hidden cells:

sp = Spreadsheet(file, ignore_sheets=['Sheet2'], ignore_hidden=True)

In case you have very big files, you might want to reduce the size of the output graph. Here are a few methods.

Volatiles

Volatiles are functions that might output a reference to a Cell rather than a specific value, which forces a reevaluation every time. Typical examples are INDEX and OFFSET.

After having created the graph, you can use clean_pointers to fix the value of the pointers to their initial values, which reduces the graph size and decreases the evaluation times:

sp.clean_pointers()

Warning: this implies that Cells affected by these functions will be fixed permanently. If you evaluate a cell that is separated from its modified parents by a pointer, you may encounter errors. WIP: we are working on automatic detection of the required pointers.

Outputs

You can specify the outputs you need. In this case, all Cells that do not take part in the calculation of these output Cells will be discarded, and your graph size will be reduced.

sp = sp.gen_graph(inputs=['Sheet1!A1'], outputs=['Sheet1!D1', 'Sheet1!D2'])

Pruning inputs

In this case, all Cells not impacted by input Cells will be discarded, and your graph size will be reduced.

sp = sp.prune_graph()
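Both reductions can be read as reachability questions on the cell graph: which cells feed the outputs, and which cells are affected by the inputs. The sketch below shows that reading on a hypothetical toy graph; `reachable`, `invert` and the cell layout are assumptions made for this illustration and say nothing about how Koala prunes internally.

```python
def reachable(graph, starts):
    """Return every node reachable from starts by following graph edges."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def invert(graph):
    """Turn a cell -> parents mapping into a cell -> children mapping."""
    inverted = {}
    for node, targets in graph.items():
        for target in targets:
            inverted.setdefault(target, set()).add(node)
    return inverted


# Hypothetical toy graph: parents[cell] lists the cells its formula reads.
parents = {
    "Sheet1!D1": {"Sheet1!A1", "Sheet1!B1"},
    "Sheet1!D2": {"Sheet1!D1"},
    "Sheet1!E1": {"Sheet1!C1"},  # unrelated to D1 and D2
}
children = invert(parents)

# Cells that take part in computing the outputs (walk towards parents).
needed_for_outputs = reachable(parents, ["Sheet1!D1", "Sheet1!D2"])

# Cells that can change when the input changes (walk towards children).
impacted_by_inputs = reachable(children, ["Sheet1!A1"])
```

In this toy graph, E1 and C1 fall outside the first set, and B1, C1 and E1 fall outside the second, so they are the candidates a pruning step could drop or freeze.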

Fix and free Cells

You might need to fix a Cell, so that its value is not reevaluated. You can do that with:

sp.cell_fix('Sheet1!D1')

By default, all Cells on which you use sp.cell_set_value() will be fixed.

You can free your fixed cells with:

sp.cell_free('Sheet1!D1') # frees a single Cell
sp.cell_free() # frees all fixed Cells

When you free a Cell, it is automatically reevaluated.
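The fix and free semantics described above can be modelled in a few lines. This is only a toy sketch under stated assumptions: `FixableSheet` is a hypothetical class, formulas are Python callables rather than Excel formulas, and there is no caching, so every unfixed formula is recomputed on each read.

```python
class FixableSheet:
    """Toy model of fixed and free cells."""

    def __init__(self):
        self.values = {}
        self.formulas = {}  # cell -> (function, list of parent cells)
        self.fixed = set()

    def set_value(self, name, value):
        self.values[name] = value
        self.fixed.add(name)  # cells set by hand are fixed by default

    def set_formula(self, name, func, parents):
        self.formulas[name] = (func, parents)

    def fix(self, name):
        self.values[name] = self.evaluate(name)  # freeze the current value
        self.fixed.add(name)

    def free(self, name=None):
        # Free one cell, or every fixed cell when no name is given.
        names = [name] if name is not None else list(self.fixed)
        for cell in names:
            self.fixed.discard(cell)
            if cell in self.formulas:
                self.evaluate(cell)  # a freed formula cell is reevaluated

    def evaluate(self, name):
        if name in self.fixed or name not in self.formulas:
            return self.values[name]
        func, parents = self.formulas[name]
        self.values[name] = func(*(self.evaluate(p) for p in parents))
        return self.values[name]


sheet = FixableSheet()
sheet.set_value("A1", 2)
sheet.set_formula("D1", lambda a: a * 1000, ["A1"])
sheet.fix("D1")                  # D1 is frozen at its current value
sheet.set_value("A1", 5)
frozen = sheet.evaluate("D1")    # unchanged, because D1 is fixed
sheet.free("D1")
released = sheet.evaluate("D1")  # follows A1 again
```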

Set formula

If you need to change a Cell's formula, you can use:

sp.cell_set_formula('Sheet1!D1', 'Sheet1!A1 * 1000')

The string you pass as argument needs to be written with Excel syntax.

You will find more examples and sample Excel files in the examples directory.

Detect alive

To check whether you have "alive pointers", i.e., pointer functions that have one of your inputs as an argument, you can use:

sp.detect_alive(inputs = [...], outputs = [...])

This will also change the Spreadsheet.pointers_to_reset list, so that only alive pointers are reset on cell_set_value().

Create from scratch

The graph can also be created from scratch (not by using a file).

sp_scratch = Spreadsheet()

sp_scratch.cell_add('Sheet1!A1', value=1)
sp_scratch.cell_add('Sheet1!A2', value=2)
sp_scratch.cell_add('Sheet1!A3', formula='=SUM(Sheet1!A1, Sheet1!A2)')

sp_scratch.cell_evaluate('Sheet1!A3')

Origins

This project is a "double fork" of two awesome projects:

Pycel, a Python module that generates an AST graph from a workbook
OpenPyXL, a full API able to read/write/manipulate Excel 2010 files.

Most of our work went into adapting the Pycel algorithm to more complex cases than it could originally handle. This ended up modifying some core parts of the library, especially with the introduction of Range objects.

As for OpenPyXL, we only took tiny bits, mainly concerning the reading part. Most of what we took from it is left unchanged in the openpyxl folder, with references to the original scripts on BitBucket.

This module has been enriched by Ants, but it is part of a larger project at the Engie company, and in particular at its Center of Expertise in Modelling and Economics Studies.

Licence

GPL

Project details

Release history

Version	Released
0.0.35 (this version)	Jun 19, 2019
0.0.34	Jun 19, 2019
0.0.33	May 13, 2019
0.0.32	May 13, 2019
0.0.31	Apr 8, 2019
0.0.30	Mar 26, 2019
0.0.29	Feb 11, 2019
0.0.28	Oct 30, 2018
0.0.27	Oct 23, 2018
0.0.26	Oct 12, 2018
0.0.25	Sep 12, 2018
0.0.24	Sep 12, 2018
0.0.23	Aug 16, 2018
0.0.22	Aug 14, 2018
0.0.21	Aug 3, 2018
0.0.20	May 15, 2018
0.0.19	May 14, 2018
0.0.18	Apr 29, 2018
0.0.17	Nov 22, 2017
0.0.16	Nov 21, 2017
0.0.15	Nov 16, 2017
0.0.14	Aug 29, 2016
0.0.13	Aug 25, 2016
0.0.12	Aug 19, 2016
0.0.11	Aug 19, 2016
0.0.10	Aug 18, 2016
0.0.8	Jul 29, 2016
0.0.7	Jul 28, 2016
0.0.6	Jul 27, 2016
0.0.5	Jul 27, 2016
0.0.4	Jul 20, 2016
0.0.3	Jul 20, 2016
0.0.2	Jul 20, 2016
0.0.1	Jul 20, 2016

