Mutable variants of tuple (mutabletuple) and collections.namedtuple (recordclass), which support assignments and more memory saving variants (dataobject, litelist, ...).
Recordclass library
What is it all about?
Recordclass is an MIT-licensed Python library.
It implements the type mutabletuple and the factory function recordclass, which create record-like classes: a mutable variant of collections.namedtuple with the same API. More memory-saving variants were added later.
- mutabletuple is a mutable variant of tuple that supports assignment operations.
- recordclass is a factory function that creates a "mutable" analog of collections.namedtuple. It produces a subclass of mutabletuple with a namedtuple-like API.
- structclass is an analog of recordclass. It produces a class with a smaller memory footprint (smaller than both recordclass-based instances and instances of classes with __slots__) and a namedtuple-like API. Its instances have no __dict__ or __weakref__ and by default don't support cyclic garbage collection (only reference counting). However, structclass-created classes can support any of them.
- arrayclass is a factory function. It produces a class whose instances have the same memory footprint as structclass-created class instances. It implements an array of objects. By default the created class has no __dict__ or __weakref__ and doesn't support cyclic garbage collection, but support for any of them can be added.
Since 0.10
- dataobject is a new base class for creating subclasses that have the following properties by default: 1) no __dict__ and __weakref__; 2) cyclic GC support is disabled; 3) instances use less memory than instances of classes with __slots__.
- make_class is a factory function for creating the dataobject subclasses described above.
The dataobject-based classes do not follow a namedtuple-like API, but an attrs/dataclasses-like API.
By default, subclasses of dataobject don't support cyclic GC, only reference counting.
As a result, an instance of such a class needs less memory.
The difference is equal to the size of PyGC_Head.
Subclasses of dataobject are a reasonable choice when reference cycles cannot occur.
For example, when all fields have values of atomic types (integers, floats, strings, dates and times, etc.).
A field's value may also be an instance of a subclass of dataobject (i.e. without GC support).
As an exception, the value of a field can be any object, provided that our instance is not contained in this object or in its sub-objects.
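The trade-off described above can be observed with the standard library alone. The following is a minimal sketch that assumes CPython and does not use recordclass itself; Node, gc_overhead and cycle_survives_refcounting are illustrative names, not part of the library. It shows that sys.getsizeof adds a per-object GC overhead only for GC-tracked objects, and that a reference cycle is not freed by reference counting alone.

```python
import gc
import sys
import weakref


class Node:
    """Plain class used only for illustration (not part of recordclass)."""
    __slots__ = ("other", "__weakref__")

    def __init__(self):
        self.other = None


def gc_overhead(obj):
    """Bytes that sys.getsizeof adds on top of obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


def cycle_survives_refcounting():
    """Build a two-node cycle; report whether it is alive before and after gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the collector from running on its own
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a       # a -> b -> a: a reference cycle
        probe = weakref.ref(a)
        del a, b                      # reference counts never reach zero
        alive_before = probe() is not None
        gc.collect()                  # the cyclic collector frees the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


print(gc.is_tracked(Node()), gc.is_tracked(42))
print(gc_overhead(42), gc_overhead(Node()))
print(cycle_survives_refcounting())
```

An object that can never take part in a cycle, like the integer here, carries no GC overhead. That per-object saving is what a class without cyclic GC support aims for, at the price that cycles such as the one built above would not be reclaimed.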
The recordclass library was started as a "proof of concept" for the problem of a fast "mutable" alternative to namedtuple (see the question on stackoverflow). It evolved further in order to provide more memory-saving, fast and flexible types for representing data objects.
The main repository for recordclass is on bitbucket.
There is also a simple example.
Quick start:
Quick start with recordclass
First load inventory:
>>> from recordclass import recordclass, RecordClass
Example with recordclass:
>>> Point = recordclass('Point', 'x y')
>>> p = Point(1,2)
>>> print(p)
Point(1, 2)
>>> print(p.x, p.y)
1 2
>>> p.x, p.y = 10, 20
>>> print(p)
Point(10, 20)
Example with RecordClass and typehints::
class Point(RecordClass):
    x: int
    y: int
>>> print(Point.__annotations__)
{'x': <class 'int'>, 'y': <class 'int'>}
>>> p = Point(1, 2)
>>> print(p)
Point(1, 2)
>>> print(p.x, p.y)
1 2
>>> p.x, p.y = 10, 20
>>> print(p)
Point(10, 20)
Quick start with dataobject
First load inventory::
>>> from recordclass import dataobject, asdict
>>> import sys
class Point(dataobject):
    x: int
    y: int
>>> print(Point.__annotations__)
{'x': <class 'int'>, 'y': <class 'int'>}
>>> p = Point(1,2)
>>> print(p)
Point(x=1, y=2)
>>> sys.getsizeof(p) # the output below is for 64bit python
32
>>> p.__sizeof__() == sys.getsizeof(p) # no additional space used by GC
True
>>> p.x, p.y = 10, 20
>>> print(p)
Point(x=10, y=20)
>>> print(list(iter(p)))
[10, 20]
>>> asdict(p)
{'x': 10, 'y': 20}
Another way is the factory function make_dataclass:
>>> from recordclass import make_dataclass
>>> Point = make_dataclass("Point", [("x",int), ("y",int)])
Default values are also supported::
class CPoint(dataobject):
    x: int
    y: int
    color: str = 'white'
or
>>> CPoint = make_dataclass("CPoint", [("x",int), ("y",int), ("color",str)], defaults=("white",))
>>> p = CPoint(1,2)
>>> print(p.x, p.y, p.color)
1 2 white
>>> print(p)
CPoint(x=1, y=2, color='white')
Recordclasses and dataobject-based classes may be cached in order to reuse them without duplication::
>>> from recordclass import RecordclassStorage
>>> rs = RecordclassStorage()
>>> A = rs.recordclass("A", "x y")
>>> B = rs.recordclass("A", ["x", "y"])
>>> A is B
True
>>> from recordclass import DataclassStorage
>>> ds = DataclassStorage()
>>> A = ds.make_dataclass("A", "x y")
>>> B = ds.make_dataclass("A", ["x", "y"])
>>> A is B
True
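The idea behind such a storage can be sketched with the standard library. The example below is an assumption-laden illustration, not the library's implementation: ClassStorage and its make method are hypothetical names, and collections.namedtuple stands in for the recordclass and make_dataclass factories. Generated classes are kept in a dictionary keyed by type name and field names, so an equivalent request returns the same class object.

```python
from collections import namedtuple


class ClassStorage:
    """Hypothetical cache of generated classes, keyed by name and fields."""

    def __init__(self):
        self._cache = {}

    def make(self, typename, fields):
        # Accept both "x y" and ["x", "y"] so that equivalent
        # field specifications map to the same cache key.
        if isinstance(fields, str):
            fields = fields.replace(",", " ").split()
        key = (typename, tuple(fields))
        cls = self._cache.get(key)
        if cls is None:
            # namedtuple stands in for recordclass / make_dataclass here.
            cls = self._cache[key] = namedtuple(typename, key[1])
        return cls


storage = ClassStorage()
A = storage.make("A", "x y")
B = storage.make("A", ["x", "y"])
C = storage.make("A", "x y z")
print(A is B, A is C)
```

Normalizing the field specification before building the key is the design choice that makes "x y" and ["x", "y"] resolve to one class.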
Recordclass
Recordclass was created as an answer to a question on stackoverflow.com.
Recordclass was designed and implemented as a type that, by API, memory footprint, and speed, would be almost identical to namedtuple, except that it supports assignments that replace any element without creating a new instance, as namedtuple requires (it supports __setitem__ / __setslice__).
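For contrast, the behaviour of the standard namedtuple that motivated recordclass can be checked directly. This sketch uses only collections.namedtuple; the helper try_assign is an illustrative name. Both attribute and item assignment fail, and the only way to "change" a field is _replace, which builds a new instance.

```python
from collections import namedtuple

Point = namedtuple("Point", "x y")
p = Point(1, 2)


def try_assign(obj):
    """Return the names of the exceptions raised by the two assignment forms."""
    errors = []
    try:
        obj.x = 10            # attribute assignment
    except AttributeError as exc:
        errors.append(type(exc).__name__)
    try:
        obj[0] = 10           # item assignment
    except TypeError as exc:
        errors.append(type(exc).__name__)
    return errors


print(try_assign(p))
q = p._replace(x=10)          # creates a new instance; p is unchanged
print(p, q, q is p)
```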
The effectiveness of a namedtuple is based on the effectiveness of the tuple type in Python. To achieve the same efficiency, the type mutabletuple was created. Its structure (PymutabletupleObject) is identical to the structure of tuple (PyTupleObject) and therefore occupies the same amount of memory as tuple.
Recordclass is defined on top of mutabletuple in the same way as namedtuple is defined on top of tuple. Attributes are accessed via a descriptor (itemgetset), which provides quick access and assignment by attribute index.
The class generated by recordclass looks like:

    from recordclass import mutabletuple, itemgetset

    class C(mutabletuple, metaclass=recordobject):
        __fields__ = ('attr_1',...,'attr_m')
        attr_1 = itemgetset(0)
        ...
        attr_m = itemgetset(m-1)

        def __new__(cls, attr_1, ..., attr_m):
            'Create new instance of C(attr_1, ..., attr_m)'
            return mutabletuple.__new__(cls, attr_1, ..., attr_m)

etc., following the definition scheme of namedtuple.
As a result, recordclass takes up as much memory as namedtuple, and supports fast access by __getitem__ / __setitem__ and by the name of the attribute through the descriptor protocol.
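The descriptor mechanism can be illustrated in pure Python. The sketch below is not the library's C implementation: IndexAccessor is a hypothetical stand-in for itemgetset, and a list subclass plays the role of mutabletuple. Each attribute name is bound to a fixed index, so attribute access and item access read and write the same storage.

```python
class IndexAccessor:
    """Illustrative stand-in for itemgetset: maps an attribute to an index."""

    def __init__(self, index):
        self.index = index

    def __get__(self, obj, objtype=None):
        if obj is None:               # accessed on the class itself
            return self
        return obj[self.index]

    def __set__(self, obj, value):
        obj[self.index] = value


class Point(list):
    """A list plays the role of mutabletuple in this sketch."""
    __slots__ = ()                    # no per-instance __dict__
    x = IndexAccessor(0)
    y = IndexAccessor(1)

    def __init__(self, x, y):
        super().__init__((x, y))

    def __repr__(self):
        return "Point(%r, %r)" % (self.x, self.y)


p = Point(1, 2)
p.x = 10                              # goes through IndexAccessor.__set__
p[1] = 20                             # plain item assignment
print(p, p.x == p[0], p.y == p[1])
```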
Structclass
In the discussions, it was correctly noted that instances of classes with __slots__ also support fast access to the object fields and take up less memory than tuple and instances of classes created using the factory function recordclass. This happens because instances of classes with __slots__ do not store the number of elements, unlike tuple and other variable-size objects (PyVarObject); instead, the number of elements and the list of attributes are stored in their type (PyHeapTypeObject).
Therefore, a special class prototype was created from which, using a special metaclass structclasstype, classes can be created whose instances occupy as much memory as instances of classes with __slots__, but do not use __slots__ at all. Based on this, the factory function structclass can create classes whose instances are similar to instances created using recordclass, but take up less memory.
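The size difference between tuple-based and __slots__-based instances can be measured with the standard library. This is a minimal sketch assuming CPython; PointT and PointS are illustrative names and recordclass itself is not used. No byte counts are hard-coded, because the exact values depend on the Python version and platform.

```python
import sys
from collections import namedtuple

# Tuple-based record: stores its length in every instance.
PointT = namedtuple("PointT", "a b c")


class PointS:
    """Slot-based record: the layout is described once, in the type."""
    __slots__ = ("a", "b", "c")

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


t = PointT(1, 2, 3)
s = PointS(1, 2, 3)
print(sys.getsizeof(t), sys.getsizeof(s))
print(sys.getsizeof(t) - sys.getsizeof(s))   # extra bytes used by the tuple-based instance
```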
The class generated by structclass looks like:

    from recordclass import recordobjectgetset, structclasstype

    class C(recordobject, metaclass=structclasstype):
        __attrs__ = ('attr_1',...,'attr_m')
        attr_1 = recordobjectgetset(0)
        ...
        attr_m = recordobjectgetset(m-1)

        def __new__(cls, attr_1, ..., attr_m):
            'Create new instance of C(attr_1, ..., attr_m)'
            return recordobject.__new__(cls, attr_1, ..., attr_m)

etc., following the definition scheme of recordclass.
As a result, structclass-based objects take up as much memory as __slots__-based instances and have the same functionality as recordclass-created instances.
Comparisons
The following table explains the memory footprints of recordclass- and structclass-based objects, compared with namedtuple and classes with __slots__:
namedtuple | class/__slots__ | recordclass | structclass |
---|---|---|---|
b+s+n*p | b+n*p | b+s+n*p | b+n*p-g |
where:
- b = sizeof(PyObject)
- s = sizeof(Py_ssize_t)
- n = number of items
- p = sizeof(PyObject*)
- g = sizeof(PyGC_Head)
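As a worked instance of these formulas, the sketch below evaluates them for n = 3 fields. The numeric values are assumptions, not measurements: a 64-bit build with 8-byte pointers and Py_ssize_t, a 24-byte PyGC_Head (the figure quoted in this section), and b taken as the 16-byte object header plus the GC header. That reading of b is chosen here only because it reproduces the 64 and 40 bytes printed by the size example in this section. The function name footprints is illustrative.

```python
def footprints(n, b, s, p, g):
    """Evaluate the four formulas from the table for n fields."""
    return {
        "namedtuple": b + s + n * p,
        "class/__slots__": b + n * p,
        "recordclass": b + s + n * p,
        "structclass": b + n * p - g,
    }


# Assumed sizes for a 64-bit CPython build (not measured here).
P = 8           # sizeof(PyObject*)
S = 8           # sizeof(Py_ssize_t)
G = 24          # sizeof(PyGC_Head), as quoted in this section
B = 2 * P + G   # assumption: refcount + type pointer, plus the GC header

sizes = footprints(n=3, b=B, s=S, p=P, g=G)
for name, size in sizes.items():
    print(name, size)
```

Under these assumptions the __slots__ column gives 64 bytes and the structclass column gives 40 bytes, a difference of exactly g.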
The special option cyclic_gc=False (the default) of structclass disables support for cyclic garbage collection.
This is useful when you are absolutely sure that a reference cycle isn't possible, for example when all field values are instances of atomic types.
As a result, the size of the instance is decreased by 24 bytes:

    import sys
    from recordclass import structclass

    class S:
        __slots__ = ('a','b','c')
        def __init__(self, a, b, c):
            self.a = a
            self.b = b
            self.c = c

    R_gc = structclass('R_gc', 'a b c', cyclic_gc=True)
    R_nogc = structclass('R_nogc', 'a b c')

    s = S(1,2,3)
    r_gc = R_gc(1,2,3)
    r_nogc = R_nogc(1,2,3)
    for o in (s, r_gc, r_nogc):
        print(sys.getsizeof(o))  # prints 64, 64 and 40 in turn

Here is also a table with some performance counters:
| | namedtuple | class/__slots__ | recordclass | structclass |
|---|---|---|---|---|
| new | 739±24 ns | 915±35 ns | 763±21 ns | 889±34 ns |
| getattr | 84.0±1.7 ns | 42.8±1.5 ns | 39.5±1.0 ns | 41.7±1.1 ns |
| setattr | | 50.5±1.7 ns | 50.9±1.5 ns | 48.8±1.0 ns |
Changes:
0.11:
- Rename memoryslots to mutabletuple.
- mutabletuple and immutabletuple don't participate in cyclic garbage collection.
- Add litelist type for list-like objects, which doesn't participate in cyclic garbage collection.
0.10.3:
- Introduce DataclassStorage and RecordclassStorage. They cache classes so that they can be reused without creating new ones.
- Add iterable decorator and argument. Now a dataobject with fields isn't iterable by default.
- Move astuple to dataobject.c.
0.10.2
- Fix error with dataobject's __copy__.
- Fix error with pickling of recordclasses and structclasses, which had been present since 0.8.5 (thanks to Connor Wolf).
0.10.1
- By default, the sequence protocol is no longer supported if a dataobject has fields, but iteration is supported.
- By default argsonly=False for usability reasons.
0.10
- Invent new factory function make_class for creating different kinds of dataobject classes, without GC support by default.
- Invent new metaclass datatype and new base class dataobject for creating dataobject classes using the class statement. GC support is disabled, but can be enabled with the decorator dataobject.enable_gc. It supports type hints (for Python >= 3.6) and default values. The sequence of field names in __fields__ may be omitted when type hints are applied to all data attributes (for Python >= 3.6).
- Now recordclass-based classes may also not support cyclic garbage collection. This reduces the memory footprint by the size of PyGC_Head. By default, recordclass-based classes now don't support cyclic garbage collection.
0.9
- Change version to 0.9 to indicate a step forward.
- Cleanup dataobject.__cinit__.
0.8.5
- Make arrayclass-based objects support setitem/getitem, and make structclass-based objects able to not support them. By default, as before, structclass-based objects support the setitem/getitem protocol.
- Now only instances of dataobject are comparable to arrayclass-based and structclass-based instances.
- Now generated classes can be hashable.
0.8.4
- Improve support for readonly mode for structclass and arrayclass.
- Add tests for arrayclass.
0.8.3
- Add typehints support to structclass-based classes.
0.8.2
- Remove usedict, gc, weaklist from the class __dict__.
0.8.1
- Remove Cython dependence by default for building recordclass from the sources [Issue #7].
0.8
- Add structclass factory function. It's an analog of recordclass but with a smaller memory footprint for its instances (same as for instances of classes with __slots__) in comparison with recordclass and namedtuple (it is currently implemented with Cython).
- Add arrayclass factory function, which produces a class for creating fixed-size arrays. The benefit of this approach is also a smaller memory footprint (it is currently implemented with Cython).
- structclass factory now has the argument gc. If gc=False (the default), support for cyclic garbage collection is switched off for instances of the created class.
- Add function join(C1, C2) to join two structclass-based classes C1 and C2.
- Add sequenceproxy function for creating an immutable and hashable proxy object from class instances that implement access by index (it is currently implemented with Cython).
- Add support for access to recordclass object attributes by the idiom ob['attrname'] (Issue #5).
- Add argument readonly to the recordclass factory to produce an immutable namedtuple. In contrast to collections.namedtuple, it uses the same descriptors as regular recordclasses for better performance.
0.7
- Make creation of mutabletuple objects faster. As a side effect, when the number of fields is >= 8, recordclass instance creation time is not bigger than the creation time of instances of dataclasses with __slots__.
- The recordclass factory function now creates new recordclass classes in the same way as namedtuple in 3.7 (there is no compilation of generated Python source of the class).
0.6
- Add support for default values in recordclass factory function in correspondence to same addition to namedtuple in python 3.7.
0.5
- Change version to 0.5
0.4.4
- Add support for default values in RecordClass (patches from Pedro von Hertwig)
- Add tests for RecordClass (adapted from Python tests for NamedTuple)
0.4.3
- Add support for typing for python 3.6 (patches from Vladimir Bolshakov).
- Resolve memory leak issue.
0.4.2
- Fix memory leak in property getter/setter