Faster reimplementation of stdlib collections.namedtuple

Project description

Cheap namedtuple implementation (Python 2.7)

Namedtuples are a neat feature of Python, but they have one big caveat. To define the type precisely, the namedtuple() factory function formats a string template with the given arguments and compiles the result with exec. This compilation is computationally expensive, and it can be avoided when performance requires it.
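To make the cost concrete, here is a heavily simplified sketch of the template-plus-exec technique. It is not the stdlib source: the names _TEMPLATE and exec_namedtuple are hypothetical, the template covers only construction, and it assumes at least one field name.

```python
# Simplified illustration of the exec-based approach; NOT the stdlib source.
# _TEMPLATE and exec_namedtuple are hypothetical names used for this sketch.
_TEMPLATE = '''
class {typename}(tuple):
    __slots__ = ()
    _fields = {fields!r}

    def __new__(cls, {args}):
        return tuple.__new__(cls, ({args},))
'''


def exec_namedtuple(typename, field_names):
    """Build a tuple subclass by formatting and compiling source text."""
    fields = tuple(field_names)
    source = _TEMPLATE.format(
        typename=typename,
        fields=fields,
        args=', '.join(fields),
    )
    namespace = {}
    # The costly step: the source is parsed and compiled on every call.
    exec(source, namespace)
    return namespace[typename]


Point = exec_namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)
```

Because the generated __new__ has a real signature with the field names as parameters, positional and keyword arguments work without any hand-written argument handling. That precision is what the exec step buys, at the price of a compile per type.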

An issue was opened years ago in the Python bug tracker, but it was rejected on the grounds that the official implementation is clearer and more maintainable. That is true, but there are cases where performance is key and the current exec-based implementation is simply not an option.

If you need to define new namedtuple types dynamically, and you have high performance constraints, this is for you.

There are multiple versions out there, using metaclasses or ABCs.

This version is simpler. It just defines a new class that is closed over by the factory function.
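The closure idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's actual code: closure_namedtuple and _Tuple are hypothetical names, and features such as _asdict(), _replace(), pickling support and field-name validation are left out.

```python
from operator import itemgetter


def closure_namedtuple(typename, field_names):
    """Hypothetical sketch: build the class directly, without exec."""
    fields = tuple(field_names)
    nfields = len(fields)

    class _Tuple(tuple):
        __slots__ = ()
        _fields = fields

        def __new__(cls, *args, **kwargs):
            # Take positional values first, then look up the rest by name.
            values = list(args)
            for name in fields[len(args):]:
                if name not in kwargs:
                    raise TypeError('%s() missing field %r' % (typename, name))
                values.append(kwargs.pop(name))
            # Leftover keywords or too many positionals are both errors.
            if kwargs or len(values) != nfields:
                raise TypeError('%s() takes exactly %d arguments'
                                % (typename, nfields))
            return tuple.__new__(cls, values)

        def __repr__(self):
            pairs = ', '.join('%s=%r' % item for item in zip(fields, self))
            return '%s(%s)' % (typename, pairs)

    # Expose each field as a read-only property backed by its tuple index.
    for index, name in enumerate(fields):
        setattr(_Tuple, name, property(itemgetter(index)))

    _Tuple.__name__ = typename
    return _Tuple


Point = closure_namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)
```

Nothing is parsed or compiled at call time here. The class body and its methods are compiled once with the module, and each call to the factory only creates a new class object whose methods read typename and fields from the enclosing scope.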

The unit tests from the CPython 2.7 implementation are copied here to assert that the same expected behaviour is honoured.

Install

pip install cheapnamedtuple

Usage

For purists, the namedtuple implementation is 3.75x faster than Python's stdlib implementation while honouring 100% of its behaviour, including docstrings and everything else (corroborated by unit tests):

>>> from cheapnamedtuple import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> Point.__doc__                   # docstring for the new class
'Point(x, y)'
>>> p = Point(11, y=22)             # instantiate with positional args or keywords
>>> p[0] + p[1]                     # indexable like a plain tuple
33
>>> x, y = p                        # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y                       # fields also accessible by name
33
>>> d = p._asdict()                 # convert to a dictionary
>>> d['x']
11
>>> Point(**d)                      # convert from a dictionary
Point(x=11, y=22)
>>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
Point(x=100, y=22)

For the pragmatic, the cheapnamedtuple implementation is 26.6x faster than Python's stdlib implementation while still honouring all the public behaviour (corroborated by doctests) and still supporting copy and pickle. The only caveats identified so far are:

  • The docstring of the type generated by the cheapnamedtuple factory cannot be generated ad hoc for it

  • Some type checking has been traded for performance (see the test_name_fixer test case)

>>> from cheapnamedtuple import cheapnamedtuple
>>> Point = cheapnamedtuple('Point', ['x', 'y'])
>>> Point.__doc__                   # docstring for the new class
'Point(x, y)'
>>> p = Point(11, y=22)             # instantiate with positional args or keywords
>>> p[0] + p[1]                     # indexable like a plain tuple
33
>>> x, y = p                        # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y                       # fields also accessible by name
33
>>> d = p._asdict()                 # convert to a dictionary
>>> d['x']
11
>>> Point(**d)                      # convert from a dictionary
Point(x=11, y=22)
>>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
Point(x=100, y=22)

Compatibility

Currently tested only on Python 2.7.

Benchmarking

Python’s stdlib implementation:

$ python -m timeit -vvvv "from collections import namedtuple" "A = namedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00922394 secs
100 loops -> 0.0595999 secs
1000 loops -> 0.350676 secs
raw times: 0.328964 0.33169 0.327519
1000 loops, best of 3: 327.519 usec per loop

namedtuple implementation:

$ python -m timeit -vvvv "from cheapnamedtuple import namedtuple" "A = namedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00332594 secs
100 loops -> 0.01106 secs
1000 loops -> 0.09164 secs
10000 loops -> 0.955008 secs
raw times: 0.929455 0.872804 0.904877
10000 loops, best of 3: 87.2804 usec per loop

cheapnamedtuple implementation:

$ python -m timeit -vvvv "from cheapnamedtuple import cheapnamedtuple" "A = cheapnamedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00241184 secs
100 loops -> 0.00281 secs
1000 loops -> 0.0245481 secs
10000 loops -> 0.156194 secs
100000 loops -> 1.25612 secs
raw times: 1.23459 1.23159 1.23687
100000 loops, best of 3: 12.3159 usec per loop

Metaclass-based version (imported here as metanamedtuple):

$ python -m timeit -vvvv "from metanamedtuple import namedtuple" "A = namedtuple('A', ['foo', 'bar', 'foobar'])" "a = A(1, 2, 3)" "a.bar"
10 loops -> 0.00334907 secs
100 loops -> 0.0108609 secs
1000 loops -> 0.088969 secs
10000 loops -> 1.25756 secs
raw times: 1.2868 1.24004 1.25383
10000 loops, best of 3: 124.004 usec per loop
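The speedup factors quoted in the Usage section follow from the best-of-3 figures above. A small sketch of the arithmetic, with the timings copied by hand from the runs (the dictionary keys are just labels chosen for this example):

```python
# Best-of-3 timings in microseconds per loop, copied from the runs above.
timings_usec = {
    'collections.namedtuple': 327.519,
    'cheapnamedtuple.namedtuple': 87.2804,
    'cheapnamedtuple.cheapnamedtuple': 12.3159,
    'metanamedtuple.namedtuple': 124.004,
}

# Speedup of each implementation relative to the stdlib baseline.
baseline = timings_usec['collections.namedtuple']
speedups = {name: baseline / usec for name, usec in timings_usec.items()}

for name in sorted(speedups, key=speedups.get, reverse=True):
    print('%-32s %6.2fx' % (name, speedups[name]))
```

This gives roughly 26.6x for cheapnamedtuple, 3.75x for the compatible namedtuple, and 2.64x for the metaclass version. Each timed statement includes creating the type, instantiating it once and reading one attribute, so the ratios mostly reflect the cost of type creation.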
