Faster reimplementation of stdlib collections.namedtuple
Project description
Cheap namedtuple implementation (Python 2.7)
Namedtuples are a neat feature of Python, but they have one big caveat. To be precise with the type definition, the namedtuple() factory function fills in a string template with the given arguments and compiles the result with exec. This compilation is computationally expensive, and in practice it can be avoided when performance matters.
An issue was opened years ago in the Python bug tracker, but it was rejected on the grounds that the official implementation is clearer and more maintainable. This is true, but there are cases where performance is key and the current exec-based implementation is just not an option.
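To make the cost concrete, here is a minimal sketch of the exec-template idea. It is a stripped-down illustration, not the stdlib source: the names `_TEMPLATE` and `exec_namedtuple` are hypothetical, and field accessors, validation and docstrings are left out.

```python
# Illustrative sketch only; the real stdlib template is much larger.
_TEMPLATE = """
class {typename}(tuple):
    __slots__ = ()
    _fields = {fields!r}
    def __new__(cls, {args}):
        return tuple.__new__(cls, ({args},))
"""


def exec_namedtuple(typename, field_names):
    # Fill in the class source as a string...
    source = _TEMPLATE.format(
        typename=typename,
        fields=tuple(field_names),
        args=", ".join(field_names),
    )
    namespace = {}
    # ...then compile and run it. This step is paid on every factory call.
    exec(source, namespace)
    return namespace[typename]


Pair = exec_namedtuple("Pair", ["left", "right"])
pair = Pair(1, 2)
```

Every call to the factory parses and compiles a fresh class body, which is why defining many types dynamically gets expensive.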
If you need to define new namedtuple types dynamically, and you have high performance constraints, this is for you.
There are multiple alternative versions out there, using metaclasses or ABCs.
This version is simpler: it just defines a new class closed over by the factory function.
The unit tests from the CPython 2.7 implementation are copied here to assert that the same expected behaviour is honoured.
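The closure-based approach described above can be sketched roughly as follows. This is an assumption about the general technique, not the actual cheapnamedtuple source: `closure_namedtuple` and `_Record` are hypothetical names, and methods such as `_asdict()` and `_replace()` are omitted for brevity.

```python
from operator import itemgetter


def closure_namedtuple(typename, field_names):
    """Hypothetical minimal factory; no exec, no string template."""
    fields = tuple(field_names)

    class _Record(tuple):
        __slots__ = ()
        _fields = fields

        def __new__(cls, *args, **kwargs):
            # Positional values first, then keywords in field order.
            values = list(args)
            for name in fields[len(args):]:
                if name not in kwargs:
                    raise TypeError("missing field %r" % name)
                values.append(kwargs.pop(name))
            if kwargs or len(values) != len(fields):
                raise TypeError("unexpected arguments for %s" % typename)
            return tuple.__new__(cls, values)

        def __repr__(self):
            body = ", ".join("%s=%r" % pair for pair in zip(fields, self))
            return "%s(%s)" % (typename, body)

    # One read-only property per field, indexing into the tuple.
    for index, name in enumerate(fields):
        setattr(_Record, name, property(itemgetter(index)))

    _Record.__name__ = typename
    return _Record


Point = closure_namedtuple("Point", ["x", "y"])
p = Point(11, y=22)
```

The class body is compiled once, when the module defining the factory is imported; each factory call only builds a new class object, which is much cheaper than compiling source text.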
Install
pip install cheapnamedtuple
Usage
For the purist, the namedtuple implementation is 3.75x faster than Python’s implementation while honouring 100% of its behaviour, including docstrings and everything else (corroborated by unit tests):
>>> from cheapnamedtuple import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> Point.__doc__ # docstring for the new class
'Point(x, y)'
>>> p = Point(11, y=22) # instantiate with positional args or keywords
>>> p[0] + p[1] # indexable like a plain tuple
33
>>> x, y = p # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y # fields also accessible by name
33
>>> d = p._asdict() # convert to a dictionary
>>> d['x']
11
>>> Point(**d) # convert from a dictionary
Point(x=11, y=22)
>>> p._replace(x=100) # _replace() is like str.replace() but targets named fields
Point(x=100, y=22)
For the practical, the cheapnamedtuple implementation is 26.6x faster than Python’s implementation while still honouring all the public behaviour (corroborated by doctests) and still supporting copy and pickle. The only caveats identified so far are:
- The docstring of the type generated by the cheapnamedtuple factory cannot be generated ad hoc for it
- Some type checking has been traded for performance (see the test_name_fixer test case)
>>> from cheapnamedtuple import cheapnamedtuple
>>> Point = cheapnamedtuple('Point', ['x', 'y'])
>>> Point.__doc__ # docstring for the new class
'Point(x, y)'
>>> p = Point(11, y=22) # instantiate with positional args or keywords
>>> p[0] + p[1] # indexable like a plain tuple
33
>>> x, y = p # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y # fields also accessible by name
33
>>> d = p._asdict() # convert to a dictionary
>>> d['x']
11
>>> Point(**d) # convert from a dictionary
Point(x=11, y=22)
>>> p._replace(x=100) # _replace() is like str.replace() but targets named fields
Point(x=100, y=22)
Compatibility
Currently tested only on Python 2.7.
Benchmarking
Python’s stdlib implementation:
$ python -m timeit -vvvv "from collections import namedtuple" "A = namedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00922394 secs
100 loops -> 0.0595999 secs
1000 loops -> 0.350676 secs
raw times: 0.328964 0.33169 0.327519
1000 loops, best of 3: 327.519 usec per loop
namedtuple implementation:
$ python -m timeit -vvvv "from cheapnamedtuple import namedtuple" "A = namedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00332594 secs
100 loops -> 0.01106 secs
1000 loops -> 0.09164 secs
10000 loops -> 0.955008 secs
raw times: 0.929455 0.872804 0.904877
10000 loops, best of 3: 87.2804 usec per loop
cheapnamedtuple implementation:
$ python -m timeit -vvvv "from cheapnamedtuple import cheapnamedtuple" "A = cheapnamedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00241184 secs
100 loops -> 0.00281 secs
1000 loops -> 0.0245481 secs
10000 loops -> 0.156194 secs
100000 loops -> 1.25612 secs
raw times: 1.23459 1.23159 1.23687
100000 loops, best of 3: 12.3159 usec per loop
Using a metaclass-based version (imported below as metanamedtuple):
$ python -m timeit -vvvv "from metanamedtuple import namedtuple" "A = namedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00334907 secs
100 loops -> 0.0108609 secs
1000 loops -> 0.088969 secs
10000 loops -> 1.25756 secs
raw times: 1.2868 1.24004 1.25383
10000 loops, best of 3: 124.004 usec per loop
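The speedup figures quoted in the Usage section follow directly from the best-of-3 timings above. The short script below redoes that arithmetic; the dictionary keys are descriptive labels chosen for this example, and the numbers are copied from the runs shown.

```python
# Best-of-3 time per loop, in microseconds, copied from the runs above.
timings_usec = {
    "stdlib namedtuple": 327.519,
    "cheapnamedtuple.namedtuple": 87.2804,
    "cheapnamedtuple.cheapnamedtuple": 12.3159,
    "metaclass namedtuple": 124.004,
}

baseline = timings_usec["stdlib namedtuple"]

# Speedup relative to the stdlib implementation (higher is faster).
speedups = dict(
    (name, baseline / usec) for name, usec in timings_usec.items()
)

for name in sorted(speedups, key=speedups.get, reverse=True):
    print("%-34s %6.2fx" % (name, speedups[name]))
```

This gives roughly 26.6x for cheapnamedtuple, 3.75x for the drop-in namedtuple, and 2.64x for the metaclass version.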
Source Distribution
File details
Details for the file cheapnamedtuple-1.1.2.tar.gz.
File metadata
- Download URL: cheapnamedtuple-1.1.2.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 669f7b8ba56a3a01503fd36eaf8a923f2cc2d209b8ec9a7dadf9c7c2a97e21c9
MD5 | db9051fa627bc31fac3885c6e537bdde
BLAKE2b-256 | b57b90a3658845ce21e25b02a45625983fc14017b1066a55140194e0a6ddff05