Read and write Apple Numbers spreadsheets

These details have not been verified by PyPI

Project links

Project description

numbers-parser

numbers-parser is a Python module for parsing Apple Numbers.numbers files. It supports Numbers files generated by Numbers version 10.3, and up with the latest tested version being 14.1 (current as of June 2024).

It supports and is tested against Python versions from 3.9 onwards. It is not compatible with earlier versions of Python.

Installation

python3 -m pip install numbers-parser

A pre-requisite for this package is python-snappy which will be installed by Python automatically, but python-snappy also requires that the binary libraries for snappy compression are present.

The most straightforward way to install the binary dependencies is to use Homebrew and source Python from Homebrew rather than from macOS as described in the python-snappy github:

For Intel Macs:

brew install snappy python3
CPPFLAGS="-I/usr/local/include -L/usr/local/lib" \
python3 -m pip install python-snappy

For Apple Silicon Macs:

brew install snappy python3
CPPFLAGS="-I/opt/homebrew/include -L/opt/homebrew/lib" \
python3 -m pip install python-snappy

For Linux (your package manager may be different):

sudo apt-get -y install libsnappy-dev

On Windows, you will need to either arrange for snappy to be found for VSC++ or you can install python binary libraries compiled by Christoph Gohlke. You must select the correct python version for your installation. For example for python 3.11:

pip install python_snappy-0.6.1-cp311-cp311-win_amd64.whl

Quick Start

Reading documents:

>>> from numbers_parser import Document
>>> doc = Document("mydoc.numbers")
>>> sheets = doc.sheets
>>> tables = sheets[0].tables
>>> rows = tables[0].rows()

Sheets and tables are iterables that can be indexed using either an integer index or using the name of the sheet/table:

>>> doc.sheets[0].name
'Sheet 1'
>>> doc.sheets["Sheet 1"].name
'Sheet 1'
>>> doc.sheets[0].tables[0].name
'Table 1'
>>> doc.sheets[0].tables["Table 1"].name
'Table 1'

Table objects have a rows method which contains a nested list with an entry for each row of the table. Each row is itself a list of the column values.

>>> data = sheets["Sheet 1"].tables["Table 1"].rows()
>>> data[0][0]
<numbers_parser.cell.EmptyCell object at 0x1022b5710>
>>> data[1][0]
<numbers_parser.cell.TextCell object at 0x101eb6790>
>>> data[1][0].value
'Debit'

Cell Data

Cells are objects with a common base class of Cell. All cell types have a property value which returns the contents of the cell as a python datatype. numbers-parser uses pendulum instead of python’s builtin types. Available cell types are:

Cell type	value type	Additional properties
NumberCell	`float`
TextCell	`str`
RichTextCell	`str`	See Rich text
EmptyCell	`None`
BoolCell	`bool`
DateCell	`pendulum.datetime`
DurationCell	`pendulum.duration`
ErrorCell	`None`
MergedCell	`None`	See Merged cells

Cell references can be either zero-offset row/column integers or an Excel/Numbers A1 notation. Where cell values are not None the property formatted_value returns the cell value as a str as displayed in Numbers. Cells that have no values in a table are represented as EmptyCell and cells containing evaluation errors of any kind ErrorCell.

>>> table.cell(1,0)
<numbers_parser.cell.TextCell object at 0x1019ade50>
>>> table.cell(1,0).value
'Debit'
>>> table.cell("B2")
<numbers_parser.cell.NumberCell object at 0x103a99790>
>>> table.cell("B2").value
1234.5
>>> table.cell("B2").formatted_value
'£1,234.50'

Pandas Support

Since the return value of rows() is a list of lists, you can pass this directly to pandas. Assuming you have a Numbers table with a single header which contains the names of the pandas series you want to create you can construct a pandas dataframe using:

import pandas as pd

doc = Document("simple.numbers")
sheets = doc.sheets
tables = sheets[0].tables
data = tables[0].rows(values_only=True)
df = pd.DataFrame(data[1:], columns=data[0])

Writing Numbers Documents

Whilst support for writing numbers files has been stable since version 3.4.0, you are highly recommended not to overwrite working Numbers files and instead save data to a new file.

Cell values are written using Table.write() and numbers-parser will automatically create empty rows and columns for any cell references that are out of range of the current table.

doc = Document("write.numbers")
sheets = doc.sheets
tables = sheets[0].tables
table = tables[0]
table.write(1, 1, "This is new text")
table.write("B7", datetime(2020, 12, 25))
doc.save("new-sheet.numbers")

Additional tables and worksheets can be added to a Document before saving using Document.add_sheet() and Sheet.add_table() respectively:

doc = Document()
doc.add_sheet("New Sheet", "New Table")
sheet = doc.sheets["New Sheet"]
table = sheet.tables["New Table"]
table.write(1, 1, 1000)
table.write(1, 2, 2000)
table.write(1, 3, 3000)
doc.save("sheet.numbers")

Styles

numbers_parser currently only supports paragraph styles and cell styles. The following styles are supported:

font attributes: bold, italic, underline, strikethrough
font selection and size
text foreground color
horizontal and vertical alignment
cell background color
cell background images
cell indents (first line, left, right, and text inset)

Numbers conflates style attributes that can be stored in paragraph styles (the style menu in the text panel) with the settings that are available on the Style tab of the Text panel. Some attributes in Numbers are not applied to new cells when a style is applied.

To keep the API simple, numbers-parser packs all styling into a single Style object. When a document is saved, the attributes not stored in a paragraph style are applied to each cell that includes it.

Styles are read from cells using the Cell.style property and you can add new styles with Document.add_style.

red_text = doc.add_style(
    name="Red Text",
    font_name="Lucida Grande",
    font_color=RGB(230, 25, 25),
    font_size=14.0,
    bold=True,
    italic=True,
    alignment=Alignment("right", "top"),
)
table.write("B2", "Red", style=red_text)
table.set_cell_style("C2", red_text)

Cell Data Formatting

Numbers has two different cell formatting types: data formats and custom formats.

Data formats are presented in Numbers in the Cell tab of the Format pane and are applied to individual cells. Like Numbers, numbers-parsers caches formatting information that is identical across multiple cells. You do not need to take any action for this to happen; this is handled internally by the package. Changing a data format for cell has no impact on any other cells.

Cell formats are changed using Table.set_cell_formatting():

table.set_cell_formatting(
   "C1",
   "date",
   date_time_format="EEEE, d MMMM yyyy"
)
table.set_cell_formatting(
   0,
   4,
   "number",
   decimal_places=3,
   negative_style=NegativeNumberStyle.RED
)

Custom formats are shared across a Document and can be applied to multiple cells in multiple tables. Editing a custom format changes the appearance of data in all cells that share that format. You must first add a custom format to the document using Document.add_custom_format() before assigning it to cells using Table.set_cell_formatting():

long_date = doc.add_custom_format(
   name="Long Date",
   type="date",
   date_time_format="EEEE, d MMMM yyyy"
)
table.set_cell_formatting("C1", "custom", format=long_date)

A limited number of currencies are formatted using symbolic notation rather than an ISO code. These are defined in numbers_parser.currencies and match the ones chosen by Numbers. For example, US dollars are referred to as US$ whereas Euros and British Pounds are referred to using their symbols of € and £ respectively.

Borders

numbers-parser supports reading and writing cell borders, though the interface for each differs. Individual cells can have each of their four borders tested, but when drawing new borders, these are set for the table to allow for drawing borders across multiple cells. Setting the border of merged cells is not possible unless the edge of the cells is at the end of the merged region.

Borders are represented using the Border class that can be initialized with line width, color and line style. The current state of a cell border is read using the Cell.border property and Table.set_cell_border() sets the border for a cell edge or a range of cells.

API

For more examples and details of all available classes and methods, see the full API docs.

Command-line scripts

When installed from PyPI, a command-like script cat-numbers is installed in Python’s scripts folder. This script dumps Numbers spreadsheets into Excel-compatible CSV format, iterating through all the spreadsheets passed on the command-line.

usage: cat-numbers [-h] [-T | -S | -b] [-V] [--formulas] [--formatting]
                  [-s SHEET] [-t TABLE] [--debug]
                  [document ...]

Export data from Apple Numbers spreadsheet tables

positional arguments:
  document              Document(s) to export

options:
  -h, --help            show this help message and exit
  -T, --list-tables     List the names of tables and exit
  -S, --list-sheets     List the names of sheets and exit
  -b, --brief           Don't prefix data rows with name of sheet/table
                        (default: false)
  -V, --version
  --formulas            Dump formulas instead of formula results
  --formatting          Dump formatted cells (durations) as they appear
                        in Numbers
  -s SHEET, --sheet SHEET
                        Names of sheet(s) to include in export
  -t TABLE, --table TABLE
                        Names of table(s) to include in export
  --debug               Enable debug logging

Note: --formatting will return different capitalization for 12-hour times due to differences between Numbers’ representation of these dates and datetime.strftime. Numbers in English locales displays 12-hour times with ‘am’ and ‘pm’, but datetime.strftime on macOS at least cannot return lower-case versions of AM/PM.

Limitations

Current known limitations of numbers-parser which may be implemented in the future are:

Table styles that allow new tables to adopt a style across the whole table are not suppported
Creating cells of type BulletedTextCell is not supported
New tables are inserted with a fixed offset below the last table in a worksheet which does not take into account title or caption size
Formulas cannot be written to a document
Pivot tables are unsupported and saving a document with a pivot table issues a UnsupportedWarning.

The following limitations are expected to always remain:

New sheets insert tables with formats copied from the first table in the previous sheet rather than default table formats
Due to a limitation in Python’s ZipFile, Python versions older than 3.11 do not support image filenames with UTF-8 characters Cell.add_style.bg_image() returns None for such files and issues a RuntimeWarning (see issue 69 for details).

License

All code in this repository is licensed under the MIT License.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

4.14.1

Oct 13, 2024

4.13.3

Oct 12, 2024

4.13.2

Sep 20, 2024

4.13.1

Aug 30, 2024

4.12.2

Jul 3, 2024

4.12.1

Jul 2, 2024

4.12.0

Jun 29, 2024

This version

4.11.5

Jun 19, 2024

4.11.4

Jun 18, 2024

4.11.3

Jun 17, 2024

4.11.2

Jun 14, 2024

4.10.6

Jun 8, 2024

4.10.5

May 18, 2024

4.10.4

Mar 23, 2024

4.10.3

Mar 14, 2024

4.10.2

Mar 14, 2024

4.10.1

Mar 2, 2024

4.10.0

Feb 29, 2024

4.9.4

Feb 25, 2024

4.9.3

Feb 25, 2024

4.9.2

Feb 25, 2024

4.9.1

Feb 25, 2024

4.9.0

Feb 25, 2024

4.8.1

Feb 17, 2024

4.8.0

Feb 7, 2024

4.7.1

Jan 3, 2024

4.7.0

Jan 3, 2024

4.6.0

Jan 2, 2024

4.5.0

Jan 1, 2024

4.4.8

Dec 25, 2023

4.4.7

Dec 25, 2023

4.4.6

Sep 22, 2023

4.4.5

Sep 21, 2023

4.4.4

Sep 19, 2023

4.4.2

Aug 5, 2023

4.4.1

Aug 5, 2023

4.4.0

Aug 4, 2023

4.3.0

Aug 2, 2023

4.2.0

Jul 28, 2023

4.1.2

Jul 20, 2023

4.1.1

Jul 16, 2023

4.1.0

Jul 16, 2023

4.0.0

Jul 14, 2023

3.11.0

Jun 29, 2023

3.10.4

Jun 28, 2023

3.10.3

Jun 19, 2023

3.10.2

Jun 13, 2023

3.10.1

Jun 6, 2023

3.9.7

Apr 21, 2023

3.9.6

Apr 16, 2023

3.9.5

Jan 15, 2023

3.9.3

Nov 18, 2022

3.9.1

Nov 16, 2022

3.9.0

Nov 16, 2022

3.8.0

Nov 11, 2022

3.7.1

Oct 27, 2022

3.7.0

Sep 25, 2022

3.6.1

Sep 25, 2022

3.6.0

Sep 25, 2022

3.5.2

Sep 24, 2022

3.4.2

Sep 19, 2022

3.4.1

Sep 18, 2022

3.4.0

Sep 18, 2022

3.3.1

Sep 12, 2022

3.3.0

Sep 11, 2022

3.2.0

Sep 11, 2022

3.1.0

Sep 10, 2022

3.0.0

Jul 18, 2022

2.3.13

Jul 13, 2022

2.3.12

Jul 7, 2022

2.3.11

Jun 23, 2022

2.3.10

May 12, 2022

2.3.9

Apr 19, 2022

2.3.8

Mar 26, 2022

2.3.7

Mar 23, 2022

2.3.6

Feb 28, 2022

2.3.5

Feb 28, 2022

2.3.4

Feb 27, 2022

2.3.3

Feb 1, 2022

2.3.2

Jan 31, 2022

2.3.1

Jan 30, 2022

2.3.0

Jan 30, 2022

2.2.0

Jan 9, 2022

2.1.1

Dec 12, 2021

2.1

Nov 20, 2021

2.0.2

Nov 20, 2021

2.0

Jul 1, 2021

1.2.1

May 15, 2021

1.1

May 11, 2021

1.0.2

May 9, 2021

1.0.1

May 9, 2021

1.0

May 8, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

numbers_parser-4.11.5.tar.gz (280.7 kB view details)

Uploaded Jun 19, 2024 Source

Built Distribution

numbers_parser-4.11.5-py3-none-any.whl (304.8 kB view details)

Uploaded Jun 19, 2024 Python 3

File details

Details for the file numbers_parser-4.11.5.tar.gz.

File metadata

Download URL: numbers_parser-4.11.5.tar.gz
Upload date: Jun 19, 2024
Size: 280.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.7.1 CPython/3.11.7 Darwin/23.5.0

File hashes

Hashes for numbers_parser-4.11.5.tar.gz
Algorithm	Hash digest
SHA256	`c0f05a35af1391c08abe3a8c8cc9fbd826e7055bea56498c2afc5ce5835725b5`
MD5	`18a815641bb07008c2e2b1d1e2a9da2d`
BLAKE2b-256	`fce00705ea23efbdcbe537ab97395f5870c225840ff0a26ea9e581f222106a24`

See more details on using hashes here.

File details

Details for the file numbers_parser-4.11.5-py3-none-any.whl.

File metadata

Download URL: numbers_parser-4.11.5-py3-none-any.whl
Upload date: Jun 19, 2024
Size: 304.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.7.1 CPython/3.11.7 Darwin/23.5.0

File hashes

Hashes for numbers_parser-4.11.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d1300001db4b835317e02dbfc01cab33806a8c4f32db6e2975da53657cd71da5`
MD5	`42b67569aba25e16a2cfa9ca48790813`
BLAKE2b-256	`760e54a13569c84dda694ec81c4fb80b67b2eadc7c80b061f07efa2d6eedd9eb`