Mutable variants of tuple and collections.namedtuple, which support assignments

# RECORDCLASS

**recordclass** is an [MIT-licensed](http://opensource.org/licenses/MIT) Python library.
It started with the type `memoryslots` and the factory function `recordclass`,
which create record-like classes: mutable variants of `collections.namedtuple`.
A more memory-efficient variant, `structclass`, was added later.

* `memoryslots` is a `tuple`-like type that supports assignment operations.
* `recordclass` is a factory function that creates a "mutable" analog of
  `collections.namedtuple`. It produces a subclass of `memoryslots`. Attribute
  access is provided by specially defined descriptors (`memoryslots.getsetitem`).
* `structclass` is an analog of `recordclass`. It produces a class with a smaller memory footprint
  (the same as instances of classes with `__slots__`) and a `namedtuple`-like API. By default its instances have no `__dict__`
  or `__weakref__` and do not support cyclic garbage collection, but `structclass` can enable
  any of them.
* `arrayclass` is a factory function. It also produces a class whose instances have the same memory footprint as
  instances of classes with `__slots__`. It implements an array of objects.

This library started as a "proof of concept" for the problem of a fast "mutable"
alternative to `namedtuple` (see this [question](https://stackoverflow.com/questions/29290359/existence-of-mutable-named-tuple-in-python) on Stack Overflow).

The main repository for `recordclass`
is on [bitbucket](https://bitbucket.org/intellimath/recordclass).

There is also a simple [example](http://nbviewer.ipython.org/urls/bitbucket.org/intellimath/recordclass/raw/default/examples/what_is_recordclass.ipynb).

## Quick start:

First, import the factory function and the base class:

    >>> from recordclass import recordclass, RecordClass

Simple example with `recordclass`:

    >>> Point = recordclass('Point', 'x y')
    >>> p = Point(1,2)
    >>> print(p)
    Point(1, 2)
    >>> print(p.x, p.y)
    1 2
    >>> p.x, p.y = 10, 20
    >>> print(p)
    Point(10, 20)

Simple example with `RecordClass` and type hints:

    class Point(RecordClass):
        x: int
        y: int

    >>> p = Point(1, 2)
    >>> print(p)
    Point(1, 2)
    >>> print(p.x, p.y)
    1 2
    >>> p.x, p.y = 10, 20
    >>> print(p)
    Point(10, 20)

## Recordclass

Recordclass was created as an answer to a [question](https://stackoverflow.com/questions/29290359/existence-of-mutable-named-tuple-in-python/29419745#29419745) on `stackoverflow.com`.

`Recordclass` was designed and implemented as a type that would match `namedtuple` in API, memory footprint, and speed, with one difference: it supports assignments (`__setitem__` / `__setslice__`) that replace any element in place, without creating a new instance as `namedtuple` requires.
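
For comparison, the following sketch uses only the standard library to show the `namedtuple` behaviour that motivates this design: changing a field means building a new instance with `_replace`, and direct assignment is rejected. The `Point` type here is just an illustration and does not use `recordclass`.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(1, 2)

# "Changing" a field builds a new tuple; the original is left untouched.
q = p._replace(x=10)
print(p, q, q is p)

# In-place assignment is rejected, both by attribute and by index.
errors = []
try:
    p.x = 10
except AttributeError as exc:
    errors.append(type(exc).__name__)
try:
    p[0] = 10
except TypeError as exc:
    errors.append(type(exc).__name__)
print(errors)
```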

The efficiency of `namedtuple` rests on the efficiency of the `tuple` type in Python. To achieve the same efficiency, we had to create the type `memoryslots`. Its structure (`PyMemorySlotsObject`) is identical to the structure of `tuple` (`PyTupleObject`), so it occupies the same amount of memory as `tuple`.

`Recordclass` is defined on top of `memoryslots` in the same way that `namedtuple` is defined on top of `tuple`. Attributes are accessed via a descriptor (`itemgetset`), which provides fast access and assignment by attribute index.

The class generated by `recordclass` looks like:

``` python
from recordclass import memoryslots, itemgetset

class C(memoryslots):
    __slots__ = ()

    _fields = ('attr_1',...,'attr_m')

    attr_1 = itemgetset(0)
    ...
    attr_m = itemgetset(m-1)

    def __new__(cls, attr_1, ..., attr_m):
        'Create new instance of {typename}({arg_list})'
        return memoryslots.__new__(cls, attr_1, ..., attr_m)
```

etc. following the definition scheme of `namedtuple`.

As a result, a `recordclass` instance takes up as much memory as a `namedtuple` instance and supports fast access both by `__getitem__` / `__setitem__` and by attribute name through the descriptor protocol.
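
The index-based descriptor idea can be sketched in pure Python. This is only an illustration of the technique, not the library's C implementation: `IndexAccessor` is a hypothetical stand-in for `itemgetset`, and the built-in `list` stands in for `memoryslots`.

```python
class IndexAccessor:
    """Hypothetical descriptor: maps an attribute to a fixed item index."""

    def __init__(self, index):
        self.index = index

    def __get__(self, obj, objtype=None):
        if obj is None:          # accessed on the class itself
            return self
        return obj[self.index]

    def __set__(self, obj, value):
        obj[self.index] = value  # in-place update, no new instance


class Point(list):               # `list` plays the role of a mutable tuple
    __slots__ = ()               # no per-instance __dict__

    _fields = ('x', 'y')
    x = IndexAccessor(0)
    y = IndexAccessor(1)

    def __init__(self, x, y):
        super().__init__((x, y))


p = Point(1, 2)
p.x = 10      # assignment by attribute name
p[1] = 20     # assignment by index
print(p.x, p.y, list(p))
```

Both access paths read and write the same underlying item, which is why attribute assignment and item assignment stay consistent.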

## Recordclass2

In the discussion, it was correctly noted that instances of classes with `__slots__` also support fast access to object fields and take up less memory than `tuple` and instances of classes created with the factory function `recordclass`. This is because instances of classes with `__slots__` do not store the number of elements in each instance, as `tuple` and other variable-size objects (`PyVarObject`) do; instead, the number of elements and the list of attributes are stored in their type (`PyHeapTypeObject`).
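
The size difference described above can be observed with the standard library alone. The sketch below assumes CPython; `SlotsPoint` and `TuplePoint` are illustrative names. Exact byte counts depend on the interpreter version and platform, so only the ordering is of interest.

```python
import sys
from collections import namedtuple


class SlotsPoint:
    __slots__ = ('x', 'y', 'z')

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


TuplePoint = namedtuple('TuplePoint', 'x y z')

slots_size = sys.getsizeof(SlotsPoint(1, 2, 3))
tuple_size = sys.getsizeof(TuplePoint(1, 2, 3))

# The tuple-based instance carries a per-instance item count;
# the __slots__-based instance does not.
print(slots_size, tuple_size, tuple_size - slots_size)
```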

Therefore, a special class prototype was created. Using the special metaclass `arrayclasstype`, it can produce classes whose instances occupy as much memory as instances of classes with `__slots__` but do not use `__slots__` at all. On this basis, the factory function `recordclass2` creates classes whose instances behave like those created with `recordclass` but take up less memory.

The class generated by `recordclass2` looks like:

``` python
from recordclass.arrayclass import ArrayClass, ArrayClassGetSet, arrayclasstype

class C(ArrayClass):
    __metaclass__ = arrayclasstype

    _fields = ('attr_1',...,'attr_m')

    attr_1 = ArrayClassGetSet(0)
    ...
    attr_m = ArrayClassGetSet(m-1)

    def __new__(cls, attr_1, ..., attr_m):
        'Create new instance of {typename}({arg_list})'
        return ArrayClass.__new__(cls, attr_1, ..., attr_m)
```
etc. following the definition scheme of `recordclass`.

As a result, objects created with `recordclass2` take up as much memory as `__slots__`-based instances and have the same functionality as instances created with `recordclass`.

## Comparisons

The following table compares the memory footprints of `namedtuple`, `__slots__`-based, `recordclass`-based, and `structclass`-based objects:

| namedtuple | class + \_\_slots\_\_ | recordclass | structclass |
| ------------- | ----------------- | -------------- | ------------- |
| `b+s+n*p` | `b+n*p` | `b+s+n*p` | `b+n*p-g` |

where:

* b = sizeof(`PyObject`)
* s = sizeof(`Py_ssize_t`)
* n = number of items
* p = sizeof(`PyObject*`)
* g = sizeof(`PyGC_Head`)
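
As a rough worked example, the formulas in the table can be evaluated to see which terms account for the savings. The sketch uses only the standard library; the value of `g` is an assumption taken from the 24-byte saving this README reports for `gc=False`, and `footprint` is a hypothetical helper written for this illustration.

```python
import struct

b = object.__basicsize__    # sizeof(PyObject) on this interpreter
s = struct.calcsize('n')    # sizeof(Py_ssize_t)
p = struct.calcsize('P')    # sizeof(PyObject*)
g = 24                      # assumption: sizeof(PyGC_Head), from the 24 bytes quoted in this README


def footprint(kind, n):
    """Evaluate the table's formula for `kind` with n items."""
    formulas = {
        'namedtuple': b + s + n * p,
        'class + __slots__': b + n * p,
        'recordclass': b + s + n * p,
        'structclass': b + n * p - g,
    }
    return formulas[kind]


n = 3
# Dropping the per-instance item count saves s bytes, whatever n is.
saved_count = footprint('namedtuple', n) - footprint('class + __slots__', n)
# Dropping cyclic GC support saves a further g bytes.
saved_gc = footprint('class + __slots__', n) - footprint('structclass', n)
print(saved_count, saved_gc)
```

Only the differences between columns are meaningful here, since they do not depend on `b` or `n`.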

The `gc` option of `structclass` (`gc=False` by default) disables support for cyclic
garbage collection.
This is useful when you are absolutely sure that reference cycles are impossible,
for example when all field values are instances of atomic types.
As a result, the size of each instance decreases by 24 bytes:

``` python
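import sys
from recordclass import structclass

# Reference point: an ordinary class with __slots__
class S:
    __slots__ = ('a','b','c')
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

R_gc = structclass('R_gc', 'a b c', gc=True)   # cyclic GC support enabled
R_nogc = structclass('R_nogc', 'a b c')        # gc=False by default

s = S(1,2,3)
r_gc = R_gc(1,2,3)
r_nogc = R_nogc(1,2,3)
for o in (s, r_gc, r_nogc):
    print(sys.getsizeof(o))
# Sizes reported in the original README: 64 64 40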
```
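
The reasoning about atomic field values can be checked with the standard `gc` module. The sketch below does not use `structclass`; it only shows, on CPython, that atomic values are not tracked by the collector and that a cycle arises only when a field refers back to the instance.

```python
import gc


class S:
    __slots__ = ('a', 'b', 'c')

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


s = S(1, 2.5, 'text')

# Atomic values hold no references to other objects, so the collector ignores them.
atomic_tracked = [gc.is_tracked(v) for v in (s.a, s.b, s.c)]
# The instance itself is tracked, because its fields could refer back to it.
instance_tracked = gc.is_tracked(s)
print(atomic_tracked, instance_tracked)

# A cycle appears only once a field refers to the instance itself.
gc.collect()           # clear unrelated garbage first
s.a = s
del s
found = gc.collect()   # number of unreachable objects found
print(found >= 1)
```

An instance created without cyclic GC support would not be reclaimed in the last step, which is why the option is safe only when such cycles cannot occur.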


### Changes:

#### 0.8

* Add `structclass` factory function. It is an analog of `recordclass` whose instances have a smaller memory
  footprint (the same as instances of classes with `__slots__`) in comparison
  with `recordclass` and `namedtuple`
  (it is currently implemented with `Cython`).
* Add `arrayclass` factory function, which produces a class for creating fixed-size arrays.
  The benefit of this approach is also a smaller memory footprint
  (it is currently implemented with `Cython`).
* The `structclass` factory now has a `gc` argument. If `gc=False` (the default), support for cyclic garbage collection
  is switched off for instances of the created class.
* Add function `join(C1, C2)` to join two `structclass`-based classes C1 and C2.
* Add `sequenceproxy` function for creating an immutable and hashable proxy object from class instances
  that implement access by index
  (it is currently implemented with `Cython`).
* Add support for accessing recordclass object attributes with the idiom `ob['attrname']` (Issue #5).
* Add argument `readonly` to the recordclass factory to produce an immutable namedtuple.
  In contrast to `collections.namedtuple`, it uses the same descriptors as
  regular recordclasses for better performance.

#### 0.7

* Make creation of memoryslots objects faster. As a side effect, when the number of fields is >= 8,
  recordclass instance creation time is no greater than the creation time of instances of
  dataclasses with `__slots__`.
* The recordclass factory function now creates new recordclass classes in the same way as namedtuple in Python 3.7
  (the generated Python source of the class is not compiled).

#### 0.6

* Add support for default values in the recordclass factory function, matching
  the same addition to namedtuple in Python 3.7.

#### 0.5

* Change version to 0.5

#### 0.4.4

* Add support for default values in RecordClass (patches from Pedro von Hertwig)
* Add tests for RecordClass (adapted from the Python tests for NamedTuple)

#### 0.4.3

* Add support for typing for python 3.6 (patches from Vladimir Bolshakov).
* Resolve memory leak issue.

#### 0.4.2

* Fix memory leak in property getter/setter
