A library for cleaning tabular data.

These details have not been verified by PyPI

Project links

Homepage

Project description

Tablite

Overview

We're all tired of reinventing the wheel when we need to process a bit of data.

Pandas has a huge memory overhead when the datatypes are messy (hint: they are!).
NumPy has become a language of its own. It just doesn't seem Pythonic anymore.
Arrow isn't ready.
SQLite is great but just too slow, particularly on disk.
Protobuf is just overkill for storing data when I still need to implement all the analytics after that.

So what do we do? We write a custom-built class for the problem at hand and discover that we've just spent 3 hours doing something that should have taken 20 minutes. No more, please!

Solution: Tablite

A Python library for tables that does everything you need in 200 kB.

Install: pip install tablite
Usage: >>> from tablite import Table

It handles all datatypes: str, float, bool, int, date, datetime and time. Type checking is automatic when you append or replace values.
Move fluently between disk and RAM using t.use_disk = True/False. For 10,000,000 integers Python will use 4.2 MB RAM instead of 133.7 MB.
It can import csv*, tsv, txt, xls, xlsx, xlsm, ods, zip and log files using Table.from_file(...).
file_reader is a generator of tables, so it doesn't take up memory until the tables are consumed.
Iterate over rows or columns with for row in table.rows or for column in table.columns.
Create a multikey index, sort, and use filter, any and all to select rows.
Lookup between tables using custom functions.
Perform multikey joins with other tables.
Perform groupby and reorganise data as a pivot table with max, min, sum, first, last, count, unique, average, standard deviation, median and mode.
Update tables with +=, which automatically sorts out the columns, even if they're not in perfect order.
Calculate out-of-memory summaries using += on groupby, e.g. groupby += t1.
You can select:
- all rows in a column as table['A']
- rows across all columns as table[4:8]
- or a slice as list(table.filter('A', 'B', slice(4,8))).
You can update a value with table['A'][2] = new_value.
You can store or send data using JSON by:
- dumping to JSON: json_str = table.to_json(), or
- loading it with Table.from_json(json_str).
It automatically deduplicates header names that are already in use.
You can add any type of metadata to the table as table(some_key='some_value') or as table.metadata['some key'] = 'some value'.
You can check whether a column exists with column_xyz in Table.columns.
Load from files with tables = list(Table.from_file('this.csv')), which has automatic datatype detection.
Perform inner, outer and left SQL joins between tables as simply as table_1.inner_join(table2, keys=['A', 'B']).
Summarise using table.groupby( ... ).
Create pivot tables using groupby.pivot( ... ).
Perform multi-criteria lookups in tables using table1.lookup(table2, criteria=...).
And everything else a Python list can do, plus data type checking.
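
To make the multikey join in the list above more concrete, here is a minimal pure-Python sketch of an inner join on two key columns. It does not import Tablite and is not Tablite's implementation: the function name inner_join_sketch and the "dict of equal-length lists" table layout are assumptions made purely for illustration.

```python
from collections import defaultdict


def inner_join_sketch(left, right, keys):
    """Inner-join two column-oriented tables (dicts of equal-length lists).

    Hypothetical helper for illustration only; not part of Tablite.
    Assumes both tables have at least one column and that non-key
    column names do not clash between the two tables.
    """
    # Index the right table: key tuple -> row numbers holding that key.
    index = defaultdict(list)
    n_right = len(next(iter(right.values())))
    for j in range(n_right):
        index[tuple(right[k][j] for k in keys)].append(j)

    # Output columns: every left column, then the right non-key columns.
    right_extra = [name for name in right if name not in keys]
    result = {name: [] for name in list(left) + right_extra}

    n_left = len(next(iter(left.values())))
    for i in range(n_left):
        key = tuple(left[k][i] for k in keys)
        # Left rows without a matching key are dropped (inner join).
        for j in index.get(key, []):
            for name in left:
                result[name].append(left[name][i])
            for name in right_extra:
                result[name].append(right[name][j])
    return result


orders = {"A": [1, 1, 2, 3], "B": ["x", "y", "x", "z"], "qty": [10, 20, 30, 40]}
prices = {"A": [1, 2, 2], "B": ["x", "x", "q"], "price": [1.5, 2.5, 9.9]}
joined = inner_join_sketch(orders, prices, keys=["A", "B"])
```

Only the rows whose (A, B) pair occurs in both tables survive. Building the index on one side first means each left row is matched with a dictionary lookup rather than a scan of the other table.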

Tutorial

To learn more see tutorial.ipynb

API

To read the detailed documentation see tablite

Credits

Martynas Kaunas - GroupBy functionality.
Audrius Kulikajevas - Edge case testing / various bugs.

Release history

2023.11.6

May 10, 2024

2023.11.5

Apr 22, 2024

2023.11.4

Apr 17, 2024

2023.11.3

Apr 12, 2024

2023.11.2

Apr 10, 2024

2023.11.1

Apr 8, 2024

2023.11.0

Apr 5, 2024

2023.10.15

Apr 4, 2024

2023.10.14

Mar 27, 2024

2023.10.13

Mar 20, 2024

2023.10.12

Mar 18, 2024

2023.10.11

Mar 15, 2024

2023.10.10

Mar 15, 2024

2023.10.9

Mar 14, 2024

2023.10.8

Mar 8, 2024

2023.10.7

Mar 8, 2024

2023.10.6

Mar 7, 2024

2023.10.5

Mar 6, 2024

2023.10.4

Mar 6, 2024

2023.10.3

Mar 4, 2024

2023.10.2

Feb 28, 2024

2023.10.1

Feb 26, 2024

2023.10.0

Feb 22, 2024

2023.9.8

Feb 6, 2024

2023.9.7

Feb 1, 2024

2023.9.6

Jan 31, 2024

2023.9.5

Jan 30, 2024

2023.9.4

Jan 29, 2024

2023.9.3

Jan 26, 2024

2023.9.2

Jan 26, 2024

2023.9.1

Jan 25, 2024

2023.9.0

Jan 25, 2024

2023.8.11

Nov 24, 2023

2023.8.10

Nov 16, 2023

2023.8.9

Nov 15, 2023

2023.8.8

Nov 14, 2023

2023.8.7

Nov 8, 2023

2023.8.6

Nov 8, 2023

2023.8.5

Nov 8, 2023

2023.8.4

Nov 7, 2023

2023.8.3

Oct 26, 2023

2023.8.2

Oct 25, 2023

2023.8.1

Oct 24, 2023

2023.8.0

Oct 23, 2023

2023.8.dev72 pre-release

Nov 8, 2023

2023.8.dev7 pre-release

Oct 17, 2023

2023.8.dev6 pre-release

Oct 12, 2023

2023.8.dev5 pre-release

Oct 10, 2023

2023.8.dev4 pre-release

Oct 6, 2023

2023.8.dev3 pre-release

Oct 5, 2023

2023.8.dev2 pre-release

Oct 5, 2023

2023.8.dev1 pre-release

Oct 4, 2023

2023.8.dev0 pre-release

Oct 4, 2023

2023.7.dev6 pre-release

Sep 26, 2023

2023.7.dev5 pre-release

Sep 25, 2023

2023.7.dev4 pre-release

Aug 31, 2023

2023.7.dev3 pre-release

Aug 30, 2023

2023.7.dev2 pre-release

Aug 28, 2023

2023.7.dev1 pre-release

Aug 25, 2023

2023.7.dev0 pre-release

Aug 23, 2023

2023.6.5

Aug 18, 2023

2023.6.4

Aug 16, 2023

2023.6.3

Aug 14, 2023

2023.6.2

Aug 10, 2023

2023.6.1

Aug 1, 2023

2023.6.dev14 pre-release

Jul 13, 2023

2023.6.dev13 pre-release

Jul 11, 2023

2023.6.dev12 pre-release

Jul 3, 2023

2023.6.dev11 pre-release

Jul 3, 2023

2023.6.dev10 pre-release

Jun 27, 2023

2023.6.dev9 pre-release

Jun 22, 2023

2023.6.dev8 pre-release

Jun 19, 2023

2023.6.dev7 pre-release

Jun 19, 2023

2023.6.dev6 pre-release

Jun 16, 2023

2023.6.dev5 pre-release

Jun 13, 2023

2023.6.dev4 pre-release

Jun 13, 2023

2023.6.dev3 pre-release

Jun 12, 2023

2023.6.dev2 pre-release

Jun 9, 2023

2023.6.dev1 pre-release

Jun 6, 2023

2022.11.19

May 15, 2023

2022.11.18

May 8, 2023

2022.11.17

Apr 14, 2023

2022.11.16

Apr 7, 2023

2022.11.15

Apr 6, 2023

2022.11.14

Mar 31, 2023

2022.11.13

Mar 29, 2023

2022.11.12

Mar 17, 2023

2022.11.11

Mar 16, 2023

2022.11.10

Mar 13, 2023

2022.11.9

Mar 10, 2023

2022.11.8

Mar 9, 2023

2022.11.7

Mar 8, 2023

2022.11.6

Feb 28, 2023

2022.11.5

Feb 20, 2023

2022.11.4

Jan 26, 2023

2022.11.3

Dec 4, 2022

2022.11.2

Nov 28, 2022

2022.11.1

Nov 28, 2022

2022.11.0

Nov 23, 2022

2022.11.dev6 pre-release

Nov 18, 2022

2022.11.dev5 pre-release

Nov 14, 2022

2022.11.dev4 pre-release

Nov 9, 2022

2022.11.dev3 pre-release

Nov 7, 2022

2022.11.dev2 pre-release

Nov 5, 2022

2022.11.dev1 pre-release

Nov 5, 2022

2022.10.12

Oct 30, 2022

2022.10.11

Oct 20, 2022

2022.10.10

Oct 19, 2022

2022.10.9

Oct 18, 2022

2022.10.8

Oct 10, 2022

2022.10.7

Sep 8, 2022

2022.10.6

Sep 7, 2022

2022.10.5

Sep 5, 2022

2022.10.4

Aug 30, 2022

2022.10.3

Aug 21, 2022

2022.10.2

Aug 21, 2022

2022.10.1

Aug 21, 2022

2022.10.0

Aug 21, 2022

2022.9.3

Aug 19, 2022

2022.9.1

Aug 18, 2022

2022.9.0

Aug 16, 2022

2022.8.0

Aug 7, 2022

2022.7.9

Aug 5, 2022

2022.7.8

Aug 4, 2022

2022.7.7

Aug 3, 2022

2022.7.6

Jul 26, 2022

2022.7.5

Jul 26, 2022

2022.7.4

Jul 25, 2022

2022.7.3

Jul 25, 2022

2022.7.2

Jul 21, 2022

2022.7.1

Jul 21, 2022

2022.7.0

Jul 14, 2022

2022.7.dev5 pre-release

Jul 12, 2022

2022.7.dev4 pre-release

Jul 12, 2022

2022.7.dev2 pre-release

Jul 8, 2022

This version

2022.7.dev0 pre-release

Jul 13, 2022

2022.2.14.79350

Feb 14, 2022

2022.2.5.67057

Feb 5, 2022

2022.1.26.39915

Jan 26, 2022

2022.1.26.31981

Jan 26, 2022

2022.1.25.68738

Jan 25, 2022

2022.1.24.56156

Jan 24, 2022

2021.11.5.66041

Nov 5, 2021

2021.11.3.62708

Nov 3, 2021

2021.6.15.38091

Jun 15, 2021

2021.5.21.47274

May 21, 2021

2021.5.20.64155

May 20, 2021

2021.3.11.45804

Mar 11, 2021

2021.3.11.33688

Mar 11, 2021

2021.3.4.65414

Mar 4, 2021

2021.3.2.36410

Mar 2, 2021

2021.2.18.60263

Feb 18, 2021

2021.2.18.54360

Feb 18, 2021

2021.2.15.44662

Feb 15, 2021

2021.2.10.52756

Feb 10, 2021

2020.12.21.68845

Dec 21, 2020

2020.11.3.62707

Nov 3, 2020

2020.11.3.61944

Nov 3, 2020

2020.11.3.59813

Nov 3, 2020

2020.11.3.53696

Nov 3, 2020

2020.10.30.46577

Oct 30, 2020

2020.10.29.44766

Oct 29, 2020

2020.10.28.59904

Oct 28, 2020

2020.10.28.59727

Oct 28, 2020

2020.10.28.59455

Oct 28, 2020

2020.9.30.51757

Sep 30, 2020

2020.7.17.40404

Jul 17, 2020

2020.7.16.58401

Jul 16, 2020

2020.6.30.66481

Jun 30, 2020

2020.6.30.57694

Jun 30, 2020

2020.6.28.62006

Jun 28, 2020

2020.6.28.56572

Jun 28, 2020

2020.6.28.55011

Jun 28, 2020

2020.6.27.58703

Jun 27, 2020

2020.6.27.54477

Jun 27, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tablite-2022.7.dev0.tar.gz (50.3 kB)

Uploaded Jul 13, 2022 Source

File details

Details for the file tablite-2022.7.dev0.tar.gz.

File metadata

Download URL: tablite-2022.7.dev0.tar.gz
Upload date: Jul 13, 2022
Size: 50.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.10.5

File hashes

Hashes for tablite-2022.7.dev0.tar.gz
Algorithm	Hash digest
SHA256	`0f1498e671afa36f327c06b7942b484d8e8d92e1d50f096a39f3f080369b4175`
MD5	`fe770418056ac9a6e7634223acb601ec`
BLAKE2b-256	`9c9cf60cffd7579cf74f880f33f02d6501468f6219870fdda97b36e45aa6706b`

See more details on using hashes here.
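
One way to use the digests listed above is to hash the downloaded archive locally and compare the result with the published SHA256 value. The following is a minimal sketch using only Python's standard hashlib module; the helper names sha256_of_file and matches_published_hash are hypothetical and chosen for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`.

    The file is read in chunks so large archives need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest.

    Surrounding whitespace and letter case in `expected_hex` are ignored.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming the archive has been saved to the current directory, matches_published_hash("tablite-2022.7.dev0.tar.gz", "0f1498e671afa36f327c06b7942b484d8e8d92e1d50f096a39f3f080369b4175") should return True for an intact download.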
