Skip to main content

extract insights from 2d tables

Project description

sheetcalc

Extract insights from two dimentional tables. Load data into table one by one, or in bulk using import features. Use grouping features to aggregate by one or more columns similar to SQL.

Import Package

from sheetcalc import Table

Create Table Instance

tbl = Table ('table-name', 'optional table description')

Create Columns

tbl.add_column ()
tbl.add_column ('Amt')

When no column name is supplied, a name will be autogenerated as 'Column_n' where 'n' is the Column number. Column names do not need to be pre-defined. Default names get generated dynamically when loading rows when a new column is detected.

Add Rows into Table

Add rows one at a time

tbl.add_row (['John', 100, 'Business'])
tbl.add_row (['Mary', 35,  'Personal'])
tbl.add_row (['Rob' , 2,   'Business'])
tbl.add_row (['John', 15,  'Personal'])
tbl.add_row (['Mary', 10,  'Personal'])
tbl.add_row (['Ken',  10])
tbl.add_row (['Ken',  10,  'Personal'])

or add rows from comma delimitted file with a header row

tbl.load_from_delimited_file ('/home/user/data.csv')

or from file delimitted by any char, e.g tab

tbl.load_from_delimited_file ('/home/user/data.csv', '\t')

or from a file without a header row

tbl.load_from_delimited_file ('/home/user/data.csv', ',', False)

Reordering Columns in the table

Reorder or remove columns in the table. The returned table contains the requested changes. Original Table remains untouched.

tab = Table ('test')
tab.add_column ('RollNo')
tab.add_column ('Name')
tab.add_row (['A100', 'Abagail'])
tab.add_row (['A101', 'Boris'])
tab.add_row (['A102', 'Betty'])
tab.add_row (['A103', 'Charlie'])
print (1, tab.rows)
print (2, tab.shape_shift(['Name', 'RollNo']).rows)
print (3, tab.shape_shift(['RollNo']).rows)

Output:

1 [['A100', 'Abagail'], ['A101', 'Boris'], ['A102', 'Betty'], ['A103', 'Charlie']]
2 [['Abagail', 'A100'], ['Boris', 'A101'], ['Betty', 'A102'], ['Charlie', 'A103']]
3 [['A100'], ['A101'], ['A102'], ['A103']]

Filtering Data

Filter data by values in one column using logical operations

tab = Table ('test')
tab.add_column ('RollNo')
tab.add_column ('Name')
tab.add_row (['A100', 'Abagail'])
tab.add_row (['A101', 'Boris'])
tab.add_row (['A102', 'Betty'])
tab.add_row (['A103', 'Charlie'])
print (1, tab.filter_equals ('RollNo', 'A101').rows)
print (2, tab.filter_not_equals ('RollNo', 'A101').rows)
print (3, tab.filter_starts_with ('Name', 'B').rows)
print (4, tab.filter_ends_with ('Name', 'e').rows)
print (5, tab.filter_greater_than ('RollNo', 'A102').rows)
print (6, tab.filter_greater_than_or_equals ('RollNo', 'A102').rows)
print (7, tab.filter_less_than ('RollNo', 'A102').rows)
print (8, tab.filter_less_than_or_equals ('RollNo', 'A102').rows)
print (9, tab.filter_in ('RollNo', ['A102','A101']).rows)
print (10, tab.filter_not_in ('RollNo', ['A102','A101']).rows)

Output:

1 [['A101', 'Boris']]
2 [['A100', 'Abagail'], ['A102', 'Betty'], ['A103', 'Charlie']]
3 [['A101', 'Boris'], ['A102', 'Betty']]
4 [['A103', 'Charlie']]
5 [['A103', 'Charlie']]
6 [['A102', 'Betty'], ['A103', 'Charlie']]
7 [['A100', 'Abagail'], ['A101', 'Boris']]
8 [['A100', 'Abagail'], ['A101', 'Boris'], ['A102', 'Betty']]
9 [['A101', 'Boris'], ['A102', 'Betty']]
10 [['A100', 'Abagail'], ['A103', 'Charlie']]

Grouping Data and Aggregating Results

Group values by one or more columns

keymap, valuemap, skipped = tbl.group_by (
	'Column_1, Column_3',
	'Sum (Amt), min(Amt), count (Amt)'
)

Return values:

  • keymap Hashmap from system Generated hashkey to Grouping Values
  • valuemap Hashmap from system Generated hashkey to Aggretate results
  • skipped - Rows not included in the groping operation. This includes failed rows.

You can correlate the values between the two returned Hashmaps 'keymap' and 'valuemap' using their hashkeys. The hashkeys are simply comma separated values of data in Grouping Columns. Therefore, in case of Single Column Grouping (and other situations when a comma separated list is sufficient) you can safely discard the returned 'keymap' and use only 'valuemap'.

Supported Aggregation Functions

  • count
  • min
  • max
  • sum

Print Results and Statistics

Raw Print including Data

print (values)
or
print (json.dumps (valuemap, default=lambda o: o.__dict__, sort_keys=True, indent=4))

Print Processing Runtimes and other Statistics

print (tbl.stats)
or
print (json.dumps (tbl.stats, default=lambda o: o.__dict__, sort_keys=True, indent=4))

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sheetcalc-0.1.8.tar.gz (6.0 kB view hashes)

Uploaded Source

Built Distribution

sheetcalc-0.1.8-py3-none-any.whl (7.0 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page