Facilities for mappings and objects associated with mappings.

Project description

Latest release 20200130: New dicts_to_namedtuples function to yield namedtuples from an iterable of dicts.

Facilities for mappings and objects associated with mappings.

In particular, it provides named_row_tuple(column_names), a function returning a factory for namedtuple subclasses derived from the supplied column names, and named_column_tuples(rows), a function returning a namedtuple factory and an iterable of instances containing the row data. These are used by the csv_import and xl_import functions from cs.csvutils.

Class AttributableList(builtins.list)

An AttributableList maps unimplemented attributes onto the list members and returns you a new AttributableList with the results, ready for a further dereference.

Example:

>>> class C(object):
...   def __init__(self, i):
...     self.i = i
>>> Cs = [ C(1), C(2), C(3) ]
>>> AL = AttributableList( Cs )
>>> print(AL.i)
[1, 2, 3]

Method AttributableList.__init__(self, initlist=None, strict=False)

Initialise the list.

The optional parameter initlist initialises the list as for a normal list.

The optional parameter strict, if true, causes list elements lacking the attribute to raise an AttributeError. If false, list elements without the attribute are omitted from the results.
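The attribute mapping and the effect of strict can be illustrated with a small stand-in class. This is a minimal sketch under assumptions, not the cs.mappings implementation: the name AttrList and the rule that underscore-prefixed names are never mapped are both hypothetical.

```python
class AttrList(list):
    """Hypothetical stand-in for AttributableList; not the cs.mappings code."""

    def __init__(self, initlist=None, strict=False):
        super().__init__(initlist if initlist is not None else ())
        self.strict = strict

    def __getattr__(self, attr):
        # __getattr__ is only consulted when normal attribute lookup fails.
        # Assumption: private and dunder names are never mapped onto members.
        if attr.startswith('_'):
            raise AttributeError(attr)
        if self.strict:
            # getattr raises AttributeError for the first member lacking attr.
            values = [getattr(item, attr) for item in self]
        else:
            # Members without the attribute are omitted from the results.
            values = [getattr(item, attr) for item in self if hasattr(item, attr)]
        return type(self)(values, strict=self.strict)


class C:
    def __init__(self, i=None):
        if i is not None:
            self.i = i


items = [C(1), C(), C(3)]
lenient = AttrList(items).i
print(lenient)  # [1, 3]
```

With strict=True the same lookup raises AttributeError, because the second member has no i attribute.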

Function dicts_to_namedtuples(dicts, class_name, keys=None)

Scan an iterable of dicts, yield a sequence of namedtuples derived from them.

Parameters:

  • dicts: the dicts to scan and convert, an iterable
  • class_name: the name for the new namedtuple class
  • keys: optional iterable of dict keys of interest; if omitted then the dicts are scanned in order to learn the keys

Note that if keys is not specified this generator prescans the dicts in order to learn their keys. As a consequence, all the dicts will be kept in memory and no namedtuples will be yielded until after that prescan completes.
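The memory cost of the prescan is easier to see in code. The following is a rough sketch built on collections.namedtuple rather than the library's own factory; the name dicts_to_rows, the use of None for missing keys, and the requirement that keys be valid identifiers are assumptions made for illustration.

```python
from collections import namedtuple


def dicts_to_rows(dicts, class_name, keys=None):
    """Hypothetical sketch of the prescan behaviour; not the cs.mappings code."""
    if keys is None:
        # Prescan: every dict must be held in memory to learn the keys.
        dicts = list(dicts)
        keys = []
        seen = set()
        for d in dicts:
            for key in d:
                if key not in seen:
                    seen.add(key)
                    keys.append(key)
    else:
        # Keys supplied: the dicts can be consumed lazily, one at a time.
        keys = list(keys)
    factory = namedtuple(class_name, keys)
    for d in dicts:
        # Assumption: keys absent from a dict become None.
        yield factory(*[d.get(key) for key in keys])


rows = list(dicts_to_rows([{'a': 1, 'b': 2}, {'a': 3, 'c': 4}], 'Row'))
print(rows[0])  # Row(a=1, b=2, c=None)
```

Passing keys explicitly skips the prescan, which matters when the dicts come from a large or unbounded iterator.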

Class FallbackDict(collections.defaultdict,builtins.dict)

A dictlike object that inherits from another dictlike object; this is a convenience subclass of defaultdict.

Method FallbackDict.__init__(self, otherdict)
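The description above does not spell out the lookup rule. One plausible reading is that keys missing from the dict itself are looked up in otherdict. The sketch below shows that reading using a plain dict subclass and the __missing__ hook; the class name Fallback is hypothetical, and the real class derives from defaultdict.

```python
class Fallback(dict):
    """Hypothetical sketch of a dict that falls back to another mapping."""

    def __init__(self, otherdict):
        super().__init__()
        self.otherdict = otherdict

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ when the key is absent locally.
        return self.otherdict[key]


defaults = {'colour': 'red', 'size': 1}
settings = Fallback(defaults)
settings['size'] = 2          # local override
print(settings['size'])       # 2
print(settings['colour'])     # red
```

Note that in this sketch only item lookup falls back: membership tests and iteration still see just the local keys.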

Class MappingChain

A mapping interface to a sequence of mappings.

It does not support __setitem__ at present; that is expected to be managed via the backing mappings.

Method MappingChain.__init__(self, mappings=None, get_mappings=None)

Initialise the MappingChain.

Parameters:

  • mappings: initial sequence of mappings, default None.
  • get_mappings: callable to obtain the initial sequence of mappings

Exactly one of mappings or get_mappings must be provided.
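As an illustration of the constructor contract, here is a minimal read-only sketch. It is not the cs.mappings code: the class name Chain is hypothetical, and the rule that the first mapping containing a key wins is an assumption (it matches the standard library's collections.ChainMap).

```python
class Chain:
    """Hypothetical stand-in for MappingChain; not the cs.mappings code."""

    def __init__(self, mappings=None, get_mappings=None):
        # Exactly one of the two parameters must be supplied.
        if (mappings is None) == (get_mappings is None):
            raise ValueError("provide exactly one of mappings or get_mappings")
        if mappings is not None:
            fixed = list(mappings)
            get_mappings = lambda: fixed
        self.get_mappings = get_mappings

    def __getitem__(self, key):
        # Assumption: the first mapping containing the key wins.
        for mapping in self.get_mappings():
            if key in mapping:
                return mapping[key]
        raise KeyError(key)

    def __contains__(self, key):
        return any(key in mapping for mapping in self.get_mappings())


overrides = {'colour': 'blue'}
defaults = {'colour': 'red', 'size': 1}
chain = Chain(mappings=[overrides, defaults])
print(chain['colour'])  # blue
print(chain['size'])    # 1
```

Because there is no __setitem__, changes are made on the backing mappings and become visible through the chain on the next lookup.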

Class MethodicalList(AttributableList,builtins.list)

A MethodicalList subclasses a list and maps unimplemented attributes onto a callable which calls the corresponding method on each list member and returns a new MethodicalList with the results, ready for a further dereference.

Example:

>>> n = 1
>>> class C(object):
...   def __init__(self):
...     global n
...     self.n = n
...     n += 1
...   def x(self):
...     return self.n
...
>>> Cs=[ C(), C(), C() ]
>>> ML = MethodicalList( Cs )
>>> print(ML.x())
[1, 2, 3]

Method MethodicalList.__init__(self, initlist=None, strict=False)

Initialise the list.

The optional parameter initlist initialises the list as for a normal list.

The optional parameter strict, if true, causes list elements lacking the attribute to raise an AttributeError. If false, list elements without the attribute are omitted from the results.
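The method-call variant differs from plain attribute mapping in that the attribute lookup returns a callable, and the members are only touched when that callable is invoked. A minimal sketch, assuming a hypothetical class name MethodList and the same underscore rule as an illustrative guard:

```python
class MethodList(list):
    """Hypothetical stand-in for MethodicalList; not the cs.mappings code."""

    def __init__(self, initlist=None, strict=False):
        super().__init__(initlist if initlist is not None else ())
        self.strict = strict

    def __getattr__(self, attr):
        # Assumption: private and dunder names are never mapped onto members.
        if attr.startswith('_'):
            raise AttributeError(attr)

        def call_each(*args, **kwargs):
            results = []
            for item in self:
                try:
                    method = getattr(item, attr)
                except AttributeError:
                    if self.strict:
                        raise
                    continue  # non-strict: skip members lacking the method
                results.append(method(*args, **kwargs))
            return type(self)(results, strict=self.strict)

        return call_each


words = MethodList(['a', 'bc', 1])
print(words.upper())  # ['A', 'BC']
```

The integer member has no upper method, so it is omitted in non-strict mode; with strict=True the call raises AttributeError instead.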

Function named_column_tuples(rows, class_name=None, column_names=None, computed=None, preprocess=None, mixin=None)

Process an iterable of data rows, usually with the first row being column names. Return a generated namedtuple factory and an iterable of instances of the namedtuples for each row.

Parameters:

  • rows: an iterable of rows, each an iterable of data values.
  • class_name: optional class name for the namedtuple class
  • column_names: optional iterable of column names used as the basis for the namedtuple. If this is not provided then the first row from rows is taken to be the column names.
  • computed: optional mapping of str to functions of self
  • preprocess: optional callable to modify CSV rows before they are converted into the namedtuple. It receives a context object and the data row. It should return the row (possibly modified), or None to drop the row.
  • mixin: an optional mixin class for the generated namedtuple subclass to provide extra methods or properties

The context object passed to preprocess has the following attributes:

  • .cls: attribute with the generated namedtuple subclass; this is useful for obtaining things like the column names or column indices; this is None when preprocessing the header row, if any
  • .index: attribute with the row's enumeration, which counts from 0
  • .previous: the previously accepted row's namedtuple, or None if there is no previous row
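A preprocess callable following the contract above might look like the sketch below. The function name is hypothetical, rows are assumed to be flat sequences rather than mappings, and the context objects are simulated with types.SimpleNamespace so that the example runs without the library.

```python
from types import SimpleNamespace


def skip_blank_and_strip(context, row):
    """Hypothetical preprocess callable: drop blank rows, strip strings."""
    if context.cls is None:
        # Header row: the namedtuple class does not exist yet, pass through.
        return row
    if all(value in ('', None) for value in row):
        return None  # returning None drops the row
    return [value.strip() if isinstance(value, str) else value for value in row]


# Simulated contexts carrying the three documented attributes.
header_ctx = SimpleNamespace(cls=None, index=0, previous=None)
data_ctx = SimpleNamespace(cls=tuple, index=1, previous=None)

print(skip_blank_and_strip(header_ctx, ['a', 'b']))   # ['a', 'b']
print(skip_blank_and_strip(data_ctx, [' one ', 1]))   # ['one', 1]
print(skip_blank_and_strip(data_ctx, ['', None]))     # None
```

In real use the function would be passed as preprocess=skip_blank_and_strip to named_column_tuples, which supplies the context itself.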

Rows may be flat iterables in the same order as the column names or mappings keyed on the column names.

If the column names contain empty strings they are dropped and the corresponding data row entries are also dropped. This is very common with spreadsheet exports with unused padding columns.

Typical human readable column headings, also common in spreadsheet exports, are lowercased and have runs of whitespace or punctuation turned into single underscores; leading and trailing underscores are then dropped.
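The two rules above (dropping unnamed columns and normalising headings) can be sketched with the re module. The helper names are hypothetical, and the choice to drop any column whose normalised name is empty, not only those whose heading is the empty string, is an assumption.

```python
import re


def heading_to_attr(heading):
    """Turn a human readable heading into an attribute-style name."""
    # Runs of non-word characters become one underscore, then lowercase
    # and strip underscores from both ends.
    return re.sub(r'\W+', '_', heading).lower().strip('_')


def drop_unnamed_columns(headings, row):
    """Return (attribute names, values) with unnamed columns removed."""
    attrs = [heading_to_attr(heading) for heading in headings]
    # Assumption: a column is dropped when its normalised name is empty.
    keep = [i for i, attr in enumerate(attrs) if attr]
    return [attrs[i] for i in keep], [row[i] for i in keep if i < len(row)]


print(heading_to_attr('Value Found'))  # value_found
print(drop_unnamed_columns(['a', 'b', 'c', '', ''],
                           [3, 11, 'three', '', 'dropped']))
# (['a', 'b', 'c'], [3, 11, 'three'])
```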

Basic example:

>>> data1 = [
...   ('a', 'b', 'c'),
...   (1, 11, "one"),
...   (2, 22, "two"),
... ]
>>> cls, rows = named_column_tuples(data1)
>>> print(list(rows))
[NamedRow(a=1, b=11, c='one'), NamedRow(a=2, b=22, c='two')]

Human readable column headings:

>>> data1 = [
...   ('Index', 'Value Found', 'Descriptive Text'),
...   (1, 11, "one"),
...   (2, 22, "two"),
... ]
>>> cls, rows = named_column_tuples(data1)
>>> print(list(rows))
[NamedRow(index=1, value_found=11, descriptive_text='one'), NamedRow(index=2, value_found=22, descriptive_text='two')]

Rows which are mappings:

>>> data1 = [
...   ('a', 'b', 'c'),
...   (1, 11, "one"),
...   {'a': 2, 'c': "two", 'b': 22},
... ]
>>> cls, rows = named_column_tuples(data1)
>>> print(list(rows))
[NamedRow(a=1, b=11, c='one'), NamedRow(a=2, b=22, c='two')]

CSV export with unused padding columns:

>>> data1 = [
...   ('a', 'b', 'c', '', ''),
...   (1, 11, "one"),
...   {'a': 2, 'c': "two", 'b': 22},
...   [3, 11, "three", '', 'dropped'],
... ]
>>> cls, rows = named_column_tuples(data1, 'CSV_Row')
>>> print(list(rows))
[CSV_Row(a=1, b=11, c='one'), CSV_Row(a=2, b=22, c='two'), CSV_Row(a=3, b=11, c='three')]

A mixin class providing a test1 method and a test2 property:

>>> class Mixin(object):
...   def test1(self):
...     return "test1"
...   @property
...   def test2(self):
...     return "test2"
>>> data1 = [
...   ('a', 'b', 'c'),
...   (1, 11, "one"),
...   {'a': 2, 'c': "two", 'b': 22},
... ]
>>> cls, rows = named_column_tuples(data1, mixin=Mixin)
>>> rows = list(rows)
>>> rows[0].test1()
'test1'
>>> rows[0].test2
'test2'

Function named_row_tuple(*column_names, **kw)

Return a namedtuple subclass factory derived from column_names.

Parameters:

  • column_names: an iterable of str, such as the heading columns of a CSV export
  • class_name: optional keyword parameter specifying the class name
  • computed: optional keyword parameter providing a mapping of str to functions of self; these strings are available via __getitem__
  • mixin: an optional mixin class for the generated namedtuple subclass to provide extra methods or properties

The tuple's attributes are computed by converting all runs of nonalphanumerics (as defined by the re module's "\W" sequence) to an underscore, lowercasing and then stripping leading and trailing underscores.

In addition to the normal numeric indices, the tuple may also be indexed by the attribute names or the column names.

The new class has the following additional attributes:

  • attributes_: the attribute names of each tuple in order
  • names_: the originating name strings
  • name_attributes_: the computed attribute names corresponding to the names; there may be empty strings in this list
  • attr_of_: a mapping of column name to attribute name
  • name_of_: a mapping of attribute name to column name
  • index_of_: a mapping of column names and attributes to their tuple indices
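Indexing by attribute name or column name can be pictured as a namedtuple subclass whose __getitem__ consults a mapping like index_of_. This is a rough sketch rather than the library's implementation: make_row_class is a hypothetical helper, and it assumes every column name normalises to a distinct, non-empty identifier.

```python
import re
from collections import namedtuple


def make_row_class(class_name, *column_names):
    """Hypothetical sketch of a namedtuple indexable by name."""
    attrs = [re.sub(r'\W+', '_', name).lower().strip('_')
             for name in column_names]
    index_of = {}
    for i, (name, attr) in enumerate(zip(column_names, attrs)):
        index_of[name] = i   # original column name
        index_of[attr] = i   # computed attribute name
    base = namedtuple(class_name + 'Base', attrs)

    class Row(base):
        __slots__ = ()
        index_of_ = index_of

        def __getitem__(self, key):
            # Strings are translated to a position; ints and slices pass through.
            if isinstance(key, str):
                key = self.index_of_[key]
            return super().__getitem__(key)

    Row.__name__ = class_name
    return Row


Example = make_row_class('Example', 'Column 1', 'Value Found')
row = Example('val1', 2)
print(row[0], row['Column 1'], row['value_found'], row.value_found)
# val1 val1 2 2
```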

Examples:

>>> T = named_row_tuple('Column 1', '', 'Column 3', ' Column 4', 'Column 5 ', '', '', class_name='Example')
>>> T.attributes_
['column_1', 'column_3', 'column_4', 'column_5']
>>> row = T('val1', 'dropped', 'val3', 4, 5, 6, 7)
>>> row
Example(column_1='val1', column_3='val3', column_4=4, column_5=5)

Class SeenSet

A set-like collection with optional backing store file.

Class SeqMapUC_Attrs

A wrapper for a mapping from keys (matching the regular expression ^[A-Z][A-Z_0-9]*$) to tuples.

An attribute matching such a key returns the first element of the sequence (and requires the sequence to have exactly one element). An attribute FOOs or FOOes (ending in a literal 's' or 'es', a plural) returns the sequence (FOO must be a key of the mapping).
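The singular and plural attribute rules can be sketched as follows. The class name UCAttrs is hypothetical, and raising AttributeError when a singular lookup finds zero or several elements is an assumption, since the description does not say which exception is used.

```python
import re


class UCAttrs:
    """Hypothetical stand-in for SeqMapUC_Attrs; not the cs.mappings code."""

    _KEY_RE = re.compile(r'^[A-Z][A-Z_0-9]*$')

    def __init__(self, mapping):
        self._mapping = mapping

    def __getattr__(self, attr):
        if self._KEY_RE.match(attr) and attr in self._mapping:
            values = self._mapping[attr]
            # Singular form: exactly one element is required (assumed error type).
            if len(values) != 1:
                raise AttributeError(
                    "%s: expected exactly 1 element, found %d" % (attr, len(values)))
            return values[0]
        # Plural forms: FOOs or FOOes return the whole sequence for key FOO.
        for suffix in ('es', 's'):
            if attr.endswith(suffix):
                key = attr[:-len(suffix)]
                if self._KEY_RE.match(key) and key in self._mapping:
                    return self._mapping[key]
        raise AttributeError(attr)


node = UCAttrs({'HOST': ('alpha',), 'BOX': ('b1', 'b2')})
print(node.HOST)   # alpha
print(node.HOSTs)  # ('alpha',)
print(node.BOXes)  # ('b1', 'b2')
```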

Class StackableValues

A collection of named stackable values with the latest value available as an attribute.

Note that names conflicting with methods are not available as attributes and must be accessed via __getitem__. As a matter of practice, in addition to the mapping methods, avoid names which are verbs or which begin with an underscore.

Example:

>>> S = StackableValues()
>>> print(S)
StackableValues()
>>> S.push('x', 1)
>>> print(S)
StackableValues(x=1)
>>> print(S.x)
1
>>> S.push('x', 2)
1
>>> print(S.x)
2
>>> S.x = 3
>>> print(S.x)
3
>>> S.pop('x')
3
>>> print(S.x)
1
>>> with S.stack(x=4):
...   print(S.x)
...
4
>>> print(S.x)
1
>>> S.update(x=5)
{'x': 1}
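The push, pop and stack behaviour shown in this example can be modelled as one list per name. The sketch below is illustrative only: the class name ValueStacks is hypothetical, and it exposes values through __getitem__ alone, leaving out the attribute access and the update method of the real class.

```python
from contextlib import contextmanager


class ValueStacks:
    """Hypothetical sketch of named value stacks; not the cs.mappings code."""

    def __init__(self):
        self._stacks = {}

    def push(self, name, value):
        # Return the previous top value, or None if there was none.
        stack = self._stacks.setdefault(name, [])
        previous = stack[-1] if stack else None
        stack.append(value)
        return previous

    def pop(self, name):
        stack = self._stacks[name]
        value = stack.pop()
        if not stack:
            # Remove the entry once its stack becomes empty.
            del self._stacks[name]
        return value

    def __getitem__(self, name):
        return self._stacks[name][-1]

    @contextmanager
    def stack(self, **values):
        # Push the values for the duration of the with block, then restore.
        for name, value in values.items():
            self.push(name, value)
        try:
            yield self
        finally:
            for name in values:
                self.pop(name)


S = ValueStacks()
S.push('x', 1)
with S.stack(x=4):
    print(S['x'])  # 4
print(S['x'])      # 1
```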

Class UC_Sequence(builtins.list)

A tuple-of-nodes on which .ATTRs indirection can be done, yielding another tuple-of-nodes or tuple-of-values.

Method UC_Sequence.__init__(self, Ns)

Initialise from an iterable sequence.

Release Log

Release 20200130: New dicts_to_namedtuples function to yield namedtuples from an iterable of dicts.

Release 20191120: named_row_tuple: support None in a column name, as from Excel unfilled heading row entries

Release 20190617: StackableValues.push now returns the previous value. StackableValues.update has a signature like dict.update. StackableValues.pop removes entries when their stack becomes empty. StackableValues.stack: clean implementation of save/restore. StackableValues: avoid infinite recursion through ._fallback. StackableValues.keys now returns a list of the nonempty keys. Update doctests.

Release 20190103: Documentation update.

Release 20181231: Bugfix for mapping of column names to row indices. New subclass._fallback method for when a stack is empty.

Release 20180720: Initial PyPI release specifically for named_column_tuple and named_column_tuples.
