Skip to main content

Data library focusing on pure python data structures and Excel interaction

Project description

## Managing tabular data shouldn’t be complicated.

If values are stored in a matrix, it should’t be any harder to iterate or modify than a normal list. One enhancement, however, would be to have row values accessible by column names instead of by integer indeces,eg

for row in m:

row.header

for row in m:

row[17] (did i count those columns correctly?)

#### Two naive solutions for this are:

  1. Convert rows to dictionaries

    Using duplicate dictionary instances for every row has a high memory footprint, and makes accessing values by index more complicated, eg

    [{‘col_a’: 1.0, ‘col_b’: ‘b’, ‘col_c’: ‘c’},

    {‘col_a’: 1.0, ‘col_b’: ‘b’, ‘col_c’: ‘c’}]

  2. Convert rows to namedtuples

    Named tuples do not have per-instance dictionaries, so they are lightweight and require no more memory than regular tuples, but their values are read-only (which makes this kinda a dealbreaker)

Another possibility would be to store the values in column-major order, like in a database. This has a further advantage in that all values in the same column are usually of the same data type, allowing them to be stored more efficiently

row-major order:
[[‘coi_a’, ‘col_b’, ‘col_c’],

[1.0, ‘b’, ‘c’], [1.0, ‘b’, ‘c’], [1.0, ‘b’, ‘c’]]

column-major order:
{‘col_a’: [1.0, 1.0, 1.0],

‘col_a’: [‘b’, ‘b’, ‘b’], ‘col_a’: [‘c’, ‘c’, ‘c’]}

This is essentially what a pandas DataFrame is. The drawback to this is a major conceptual overhead. - Intuitively, each row is some entity, each column is a property of that row

  • DataFrames have some great features, but also require specialized syntax that can get very awkward and requires a lot of memorization

The flux_cls attempts to balance ease-of-use and performance. It has the following attributes: - row-major iteration - named attributes on rows - value mutability on rows - light memory footprint - efficient updates and modifications

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vengeance-1.0.14.tar.gz (35.6 kB view details)

Uploaded Source

File details

Details for the file vengeance-1.0.14.tar.gz.

File metadata

  • Download URL: vengeance-1.0.14.tar.gz
  • Upload date:
  • Size: 35.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.9.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.6

File hashes

Hashes for vengeance-1.0.14.tar.gz
Algorithm Hash digest
SHA256 0768bb8d8326cb341908042aa039cfb9045b2db07aba52f38d0f83c4f63d45e4
MD5 3c5c5d6c59c5591e05792857003b7116
BLAKE2b-256 1c128a1eef1e6f4a379f24d69bb2b1610c3013c76ff6982193188c32f573671b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page