Skip to main content

XML to XLSX converter

Project description

Creating xlsx files from xml template using openpyxl.

Target

This project is intended to create xlsx files from xml api to openpyxl, supposedly generated by other tamplate engines (i.e. django, jinja).

This is a merely an xml parser translating mostly linearly to worksheet, rows and finally cells of the Excel workbook.

Example

An xml file like this one

<workbook>
    <worksheet title="test">
        <row><cell>This</cell><cell>is</cell><cell>a TEST</cell></row>
        <row><cell>Nice, isn't it?</cell></row>
    </worksheet>
</workbook>

can be parsed to create a neat Excel workbook with two rows of data in one worksheet. Parsing can be done using command line:

python xml2xlsx.py < input.xml > output.xml

or as a library call

from xml2xlsx import xml2xlsx
# TODO

This is mainly intended (and was developed for this purpose) to parse files generated by other templating engines, like django template system. One can generate an excel workbook from template like this:

{% for e in list %}
    <row><cell>{{ e.name }}</cell></row>
{% endfor %}

Features

Basic features of the library include creating multiple, named sheets within one workbook and creating rows of cells in these sheets. However, there are more possibiliteis to create complex excel based reports.

Cell formatting

Each cell can be specified to use one of the types:

  • string (default)

  • number

  • date

Type is defined in type cell attribute. The cell value is converted appropriately to the type specified. If you insert a number in the cell value and do not specify type="number" attribute, you will find Excel complaining about storing nubers as text.

Since there are more date formats than countries, you have to be aware of current locale. The simplest way to be i18n compatible is to specify date format in date-fmt attribute and pass compatible (possibily non localized) date in the cell value, as in the following example

...
<row><cell type="date" date-fmt="%Y-%m-%d">2016-10-01</cell></row>
<row><cell type="date" date-fmt="%d.%m.%Y">01.10.2016</cell></row>
...

Generated excel file will have two rows with the same date (1st of October 2016) with date formatted according to Excel defaults (and current locale).

Formulas

xml2xls can effectively create cells with formulas in them. The only limitation (as with openpyxl) is using English names of the functions.

For example:

...
<row><cell>=SUM(A1:A5)</cell></row>
...

Cell referencing

The parser can store positions of the cell in a dictionary-like structure. It then can be referenced to create complex formulas. Each value of the cell is preprocessed using string format with stored values. This means that these values can be referenced using { and } brackets.

Current row and column

There are two basic values that can always be used, i.e. row and col which return current row number and column name.

<workbook>
    <sheet>
        <row><cell>{col}{row}</cell></row>
    </sheet>
</workbook>
...

would create a workbook with a text “A1” included in the A1 cell of the worksheet. Using template languages, you can create more complicated constructs, like (using django template system):

...
{% for e in list %}
<row>
    <cell type="date" date-fmt="%Y-%m-%d">{{ e|date:"Y-m-d" }}</cell>
    <cell>=TEXT(A{row}, "ddd")</cell>
</row>
{% endfor %}
...

would create a list of rows with a date in the first column and weekday names for these dates in the second column (provided list context variable contains a list of dates).

Specified cell

It is also possible to store cell possible to store names of specified cells in a pseudo-variable (as in a dictionary). One has to use ref-id attribute of the cell tag and then reuse the value of this attribute in the remainder of the xml input. This is very useful in formulas. A simple example would be referencing another cell in a formula like this:

...
<row><cell ref-id="mycell">This is just a test</cell></row>
...
<row><cell>={mycell}</cell></row>
...

which would create an excel formula referencing a cell with “this is just a test” text, whatever this cell address was.

A more complex example using django template engine to create summaries can look like this:

...
{% for e in list %}
    <row>
        <cell ref-id="{% if forloop.first %}start{% elsif forloop.last %}end{% endif %}">
            {{ e }}
        </cell>
    </row>
{% endfor %}
<row>
    <cell>Summary</cell>
    <cell>=SUM({start}:{end})</cell>
</row>
...

List of cells

Referencing a single cell can be harsh when dealing with complex reports. Especially when creating summaries of irregularly sheet-distributed data. xml2xlsx can append a cell to a variable-like list, as in ref-id attribute, to reuse it as a comma concatenated value. Instead of ref-id, one has to use ref-append attribute.

This is a simple example to demonstrate the feature:

...
<sheet>
    <row>
        <cell ref-append="mylist">1</cell>
        <cell ref-append="mylist">2</cell>
    </row>
    <row><cell ref-append="mylist">3</cell></row>
    <row><cell>=SUM({mylist})</cell></row>
</sheet>

This will generate an Excel sheet with A3 cell containing formula to sum A1, B1 and A2 cells (=SUM(A1, B1, A2)).

Cell fromatting

The cell format can be specified using various attributes of the cell tag. Only font formatting can be specifed for now.

Font format

A font format is specified in in font attribute. It is a semicolon separated dict like list of font formats as specified in font class of openpyxl library.

An example to create a cell with bold 10px font:

...
<cell font="bold: True; size: 10px;">Cell formatted</cell>
...

Planned features

Here is the (probably incomplete) wishlist for the project

  • Global font and cell styles

  • Row widths and column heights

  • Horizontal and vertical cell merging

  • XML validation with XSD to quickly raise an error if parsing wrong xml

XML Schema Reference

Parsed xml should be enclosed in a workbook tag. Each workbook tag can have multiple sheet. The hierarchy continues to row and cell tags.

Here is a complete list of available attributes of these tags.

workbook

No attributes for now.

sheet

Attribute:

name

Usage:

Specifies the name of the sheet

row

No attributes for now

cell

Attribute:

type

Usage:

Specifies the resulting type of the excel cell.

Type:

One of unicode, date, number

Default:

unicode

Attribute:

date-fmt

Usage:

Specifies the format of the date parsed as in strftime and strptime functions of datetime standard python library.

Remarks:

Parsed only if type="date".

Attribute:

font

Usage:

Sepcifies font formatting for a single cell.

Type:

List of semicolon separated dict-like values in form of key: value; key: value;

Remarks:

Key and values are arguments of Font clas in openpyxl.

Release History

0.2

  • Added documentation

  • Added cell referencing

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xml2xlsx-0.2a0.zip (15.6 kB view details)

Uploaded Source

File details

Details for the file xml2xlsx-0.2a0.zip.

File metadata

  • Download URL: xml2xlsx-0.2a0.zip
  • Upload date:
  • Size: 15.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for xml2xlsx-0.2a0.zip
Algorithm Hash digest
SHA256 b7b9939087628f515a6589e9ce20ca108da8a5d77033903c14a040d5a1de3b29
MD5 ccb2388419605bab5115a147d0990fbb
BLAKE2b-256 c0244e7d1f34a56aae302b42d6d46e0d4d826f82ccdea5a479507df22e9868ae

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page