A tool for manipulating spreadsheets and tables in Python, based on ProPublica's TableFu
Project description
Python TableFu is a tool for manipulating spreadsheet-like tables in Python. It began as a Python implementation of ProPublica's [TableFu](http://propublica.github.com/table-fu/), though new methods have been added. TableFu allows filtering, faceting and manipulating of data. Going forward, the project aims to create something akin to an ORM for spreadsheets.
Usage:
------
>>> from table_fu import TableFu
>>> table = TableFu.from_file('tests/test.csv')
>>> table.columns
['Author', 'Best Book', 'Number of Pages', 'Style']
# get all authors
>>> table.values('Author')
['Samuel Beckett', 'James Joyce', 'Nicholson Baker', 'Vladimir Sorokin']
# total a column
>>> table.total('Number of Pages')
1177.0
# filtering a table returns a new instance
>>> t2 = table.filter(Style='Modernism')
>>> list(t2)
[<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
<Row: James Joyce, Ulysses, 644, Modernism>]
# each TableFu instance acts like a list of rows
>>> table[0]
<Row: Samuel Beckett, Malone Muert, 120, Modernism>
list(table.rows)
[<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
<Row: James Joyce, Ulysses, 644, Modernism>,
<Row: Nicholson Baker, Mezannine, 150, Minimalism>,
<Row: Vladimir Sorokin, The Queue, 263, Satire>]
# rows, in turn, act like dictionaries
>>> row = table[1]
>>> print row['Author']
James Joyce
# transpose a table
>>> t2 = table.transpose()
>>> list(t2)
[<Row: Best Book, Malone Muert, Ulysses, Mezannine, The Queue>,
<Row: Number of Pages, 120, 644, 150, 263>,
<Row: Style, Modernism, Modernism, Minimalism, Satire>]
>>> t2.columns
['Author',
'Samuel Beckett',
'James Joyce',
'Nicholson Baker',
'Vladimir Sorokin']
# sort rows
>>> table.sort('Author')
>>> table.rows
[<Row: James Joyce, Ulysses, 644, Modernism>,
<Row: Nicholson Baker, Mezannine, 150, Minimalism>,
<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
<Row: Vladimir Sorokin, The Queue, 263, Satire>]
# sorting is stored
table.options['sorted_by']
{'Author': {'reverse': False}}
# which is handy because...
# tables can also be faceted (and options are copied to new tables)
>>> for t in table.facet_by('Style'):
... print t.faceted_on
... t.table
Minimalism
[['Nicholson Baker', 'Mezannine', '150', 'Minimalism']]
Modernism
[['Samuel Beckett', 'Malone Muert', '120', 'Modernism'],
['James Joyce', 'Ulysses', '644', 'Modernism']]
Satire
[['Vladimir Sorokin', 'The Queue', '263', 'Satire']]
Here's an [advanced example](https://gist.github.com/765321) that uses faceting and filtering to produce aggregates from [this spreadsheet](https://spreadsheets.google.com/ccc?key=0AprNP7zjIYS1dG5wbVJpWTVacWpUaUh5VHUxMk1wTEE&hl=en&authkey=CJfB5MYP) (extracted from the New York Times Congress API).
Formatting
----------
Filters are just functions that take a value and some number of positional arguments.
New filters can be registered with the included Formatter class.
>>> from table_fu.formatting import Formatter
>>> format = Formatter()
>>> def capitalize(value, *args):
... return str(value).capitalize()
>>> format.register(capitalize)
>>> print format('foo', 'capitalize')
Foo
Cells can be formatted according to rules of the table (which carry over if the table is faceted):
>>> table = TableFu(open('tests/sites.csv'))
>>> table.columns
['Name', 'URL', 'About']
>>> table.formatting = {
... 'Name': {'filter': 'link', 'args': ['URL']}
... }
>>> print table[0]['Name']
<a href="http://www.chrisamico.com" title="ChrisAmico.com">ChrisAmico.com</a>
HTML Output
-----------
TableFu can output an HTML table, using formatting you specify:
>>> table = TableFu(open('tests/sites.csv'))
>>> table.columns
['Name', 'URL', 'About']
>>> table.formatting = {'Name': {'filter: 'link', 'args': ['URL']}}
>>> table.columns = 'Name', 'About'
>>> print table.html()
<table>
<thead>
<tr><th>Name</th><th>About</th></tr>
</thead>
<tbody>
<tr id="row0" class="row even"><td class="datum"><a href="http://www.chrisamico.com" title="ChrisAmico.com">ChrisAmico.com</a></td><td class="datum">My personal site and blog</td></tr>
<tr id="row1" class="row odd"><td class="datum"><a href="http://www.propublica.org" title="ProPublica">ProPublica</a></td><td class="datum">Builders of the Ruby version of this library</td></tr>
<tr id="row2" class="row even"><td class="datum"><a href="http://www.pbs.org/newshour" title="PBS NewsHour">PBS NewsHour</a></td><td class="datum">Where I spend my days</td></tr>
</tbody>
</table>
Usage:
------
>>> from table_fu import TableFu
>>> table = TableFu.from_file('tests/test.csv')
>>> table.columns
['Author', 'Best Book', 'Number of Pages', 'Style']
# get all authors
>>> table.values('Author')
['Samuel Beckett', 'James Joyce', 'Nicholson Baker', 'Vladimir Sorokin']
# total a column
>>> table.total('Number of Pages')
1177.0
# filtering a table returns a new instance
>>> t2 = table.filter(Style='Modernism')
>>> list(t2)
[<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
<Row: James Joyce, Ulysses, 644, Modernism>]
# each TableFu instance acts like a list of rows
>>> table[0]
<Row: Samuel Beckett, Malone Muert, 120, Modernism>
list(table.rows)
[<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
<Row: James Joyce, Ulysses, 644, Modernism>,
<Row: Nicholson Baker, Mezannine, 150, Minimalism>,
<Row: Vladimir Sorokin, The Queue, 263, Satire>]
# rows, in turn, act like dictionaries
>>> row = table[1]
>>> print row['Author']
James Joyce
# transpose a table
>>> t2 = table.transpose()
>>> list(t2)
[<Row: Best Book, Malone Muert, Ulysses, Mezannine, The Queue>,
<Row: Number of Pages, 120, 644, 150, 263>,
<Row: Style, Modernism, Modernism, Minimalism, Satire>]
>>> t2.columns
['Author',
'Samuel Beckett',
'James Joyce',
'Nicholson Baker',
'Vladimir Sorokin']
# sort rows
>>> table.sort('Author')
>>> table.rows
[<Row: James Joyce, Ulysses, 644, Modernism>,
<Row: Nicholson Baker, Mezannine, 150, Minimalism>,
<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
<Row: Vladimir Sorokin, The Queue, 263, Satire>]
# sorting is stored
table.options['sorted_by']
{'Author': {'reverse': False}}
# which is handy because...
# tables can also be faceted (and options are copied to new tables)
>>> for t in table.facet_by('Style'):
... print t.faceted_on
... t.table
Minimalism
[['Nicholson Baker', 'Mezannine', '150', 'Minimalism']]
Modernism
[['Samuel Beckett', 'Malone Muert', '120', 'Modernism'],
['James Joyce', 'Ulysses', '644', 'Modernism']]
Satire
[['Vladimir Sorokin', 'The Queue', '263', 'Satire']]
Here's an [advanced example](https://gist.github.com/765321) that uses faceting and filtering to produce aggregates from [this spreadsheet](https://spreadsheets.google.com/ccc?key=0AprNP7zjIYS1dG5wbVJpWTVacWpUaUh5VHUxMk1wTEE&hl=en&authkey=CJfB5MYP) (extracted from the New York Times Congress API).
Formatting
----------
Filters are just functions that take a value and some number of positional arguments.
New filters can be registered with the included Formatter class.
>>> from table_fu.formatting import Formatter
>>> format = Formatter()
>>> def capitalize(value, *args):
... return str(value).capitalize()
>>> format.register(capitalize)
>>> print format('foo', 'capitalize')
Foo
Cells can be formatted according to rules of the table (which carry over if the table is faceted):
>>> table = TableFu(open('tests/sites.csv'))
>>> table.columns
['Name', 'URL', 'About']
>>> table.formatting = {
... 'Name': {'filter': 'link', 'args': ['URL']}
... }
>>> print table[0]['Name']
<a href="http://www.chrisamico.com" title="ChrisAmico.com">ChrisAmico.com</a>
HTML Output
-----------
TableFu can output an HTML table, using formatting you specify:
>>> table = TableFu(open('tests/sites.csv'))
>>> table.columns
['Name', 'URL', 'About']
>>> table.formatting = {'Name': {'filter: 'link', 'args': ['URL']}}
>>> table.columns = 'Name', 'About'
>>> print table.html()
<table>
<thead>
<tr><th>Name</th><th>About</th></tr>
</thead>
<tbody>
<tr id="row0" class="row even"><td class="datum"><a href="http://www.chrisamico.com" title="ChrisAmico.com">ChrisAmico.com</a></td><td class="datum">My personal site and blog</td></tr>
<tr id="row1" class="row odd"><td class="datum"><a href="http://www.propublica.org" title="ProPublica">ProPublica</a></td><td class="datum">Builders of the Ruby version of this library</td></tr>
<tr id="row2" class="row even"><td class="datum"><a href="http://www.pbs.org/newshour" title="PBS NewsHour">PBS NewsHour</a></td><td class="datum">Where I spend my days</td></tr>
</tbody>
</table>
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
python-tablefu-0.4.2.zip
(252.4 kB
view details)
python-tablefu-0.4.2.tar.gz
(247.4 kB
view details)
File details
Details for the file python-tablefu-0.4.2.zip
.
File metadata
- Download URL: python-tablefu-0.4.2.zip
- Upload date:
- Size: 252.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | e67c51438ea862a19bb91f62e30a5304864fa613daecc45ad79946b803a7f9ff |
|
MD5 | 82a6b29ddd91b33b0fdece9415e679ab |
|
BLAKE2b-256 | fea377dfe0b23c09f5a87b34997c0eb415327efc59832c87090c0d3242cd7664 |
File details
Details for the file python-tablefu-0.4.2.tar.gz
.
File metadata
- Download URL: python-tablefu-0.4.2.tar.gz
- Upload date:
- Size: 247.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8de8091d031633fdc3c2cb1a5c986501116ad681a7347f305f1fec7b45803719 |
|
MD5 | bf9017dca1d095cec15324a8ccc4c025 |
|
BLAKE2b-256 | 21377932aa140eb6a2d9d1e8ac7ba2430d93c847235ecef317d7ea371f5f3d3c |