duper
🚧 There's no publicly available source code yet. Stay tuned.
A library for building fast, reusable copying factories for Python objects.
It aims to fill the gaps in performance and obscurity between copy, pickle, json, and other serialization libraries, and to become the go-to library for copying objects within the same Python process.
Why?
It is challenging and fun, of course.
But if I'm being serious, deepcopy is extremely slow and there's no alternative that is both faster and can replace deepcopy in all cases.
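To illustrate why the usual alternatives cannot stand in for deepcopy everywhere, here is a minimal sketch using only the standard library. The sample data is made up for the illustration; it shows that a json round-trip changes or rejects some types, and that pickle refuses some objects deepcopy accepts.

```python
import copy
import json
import pickle

data = {"point": (1, 2), "tags": {"a", "b"}}

# copy.deepcopy preserves types and accepts arbitrary objects.
deep = copy.deepcopy(data)

# A pickle round-trip also preserves these types...
pickled = pickle.loads(pickle.dumps(data))

# ...but pickle rejects objects it cannot serialize, such as lambdas.
try:
    pickle.dumps({"callback": lambda: None})
    pickle_ok = True
except (pickle.PicklingError, AttributeError, TypeError):
    pickle_ok = False

# json cannot serialize sets at all.
try:
    json.dumps(data)
    json_ok = True
except TypeError:
    json_ok = False

# json also silently turns tuples into lists.
json_roundtrip = json.loads(json.dumps({"point": (1, 2)}))
```

Both round-trips also pay for encoding to and decoding from bytes or text, which is unnecessary work when the copy never leaves the process.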
Key points
- Generates a cookbook of instructions for reconstructing a given object
- On subsequent calls, follows these optimized instructions to produce a new object
- Handles immutable types and flat collections much faster
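The key points above can be sketched in plain Python. This is not duper's implementation (its source is not public); `build_factory` is a hypothetical name, and the sketch only covers exact `list`, `tuple`, and `dict` instances with a deepcopy fallback. It also ignores shared references and cycles, which deepcopy tracks.

```python
import copy

# Types whose instances are immutable and can be shared between copies.
_ATOMIC_TYPES = frozenset({int, float, complex, bool, str, bytes, type(None)})


def _is_flat(items):
    """True if every item is an immutable leaf."""
    return all(type(item) in _ATOMIC_TYPES for item in items)


def build_factory(obj):
    """Inspect obj once and return a zero-argument callable producing copies."""
    kind = type(obj)
    if kind in _ATOMIC_TYPES:
        # Immutable leaf: every copy can share the same object.
        return lambda: obj
    if kind is list:
        if _is_flat(obj):
            # Flat list: a shallow copy of a private snapshot is enough.
            snapshot = list(obj)
            return snapshot.copy
        parts = [build_factory(item) for item in obj]
        return lambda: [make() for make in parts]
    if kind is tuple:
        if _is_flat(obj):
            # A tuple of immutables can be shared as-is.
            return lambda: obj
        parts = [build_factory(item) for item in obj]
        return lambda: tuple(make() for make in parts)
    if kind is dict:
        # Keys are hashable and assumed immutable, so they are shared.
        items = [(key, build_factory(value)) for key, value in obj.items()]
        return lambda: {key: make() for key, make in items}
    # Unknown type: fall back to deepcopy of a private snapshot.
    snapshot = copy.deepcopy(obj)
    return lambda: copy.deepcopy(snapshot)


make_copy = build_factory({"a": 1, "b": [(1, 2, 3), (4, 5, 6)], "c": []})
first = make_copy()
second = make_copy()
```

The type inspection happens once in `build_factory`; each later call only runs the precomputed closures, which is where a factory-based approach can save time over re-inspecting the object on every copy.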
How fast?
Generally 20-50 times faster than copy.deepcopy() on nested objects.
import duper
import copy
from timesup import timesup

@timesup(number=100000, repeats=3)
def reconstruction():
    x = {"a": 1, "b": [(1, 2, 3), (4, 5, 6)], "c": []}
    copy.deepcopy(x)      # ~0.00605 ms (deepcopy)
    dup = duper.Duper(x)  # ~0.00009 ms (duper_init):
    dup.deep()            # ~0.00014 ms (duper_dup): 42.22 times faster than deepcopy
Real use case
Pydantic
Models definition
from datetime import datetime
from functools import wraps

import duper
from pydantic import BaseModel, Field
from pydantic.fields import FieldInfo


class User(BaseModel):
    id: int
    name: str = "John Doe"
    signup_ts: datetime | None = None
    friends: list[int] = []
    skills: dict[str, int] = {
        "foo": {"count": 4, "size": None},
        "bars": [
            {"apple": "x1", "banana": "y"},
            {"apple": "x2", "banana": "y"},
        ],
    }


@wraps(Field)
def FastField(default, *args, **kwargs):
    """
    Overrides the fields that need to be copied to have default_factories
    """
    default_factory = duper.Duper(default, prepare=True).deep
    field_info: FieldInfo = Field(*args, default_factory=default_factory, **kwargs)
    return field_info


class FastUser(BaseModel):
    id: int
    name: str = FastField("John Doe")
    signup_ts: datetime | None = FastField(None)
    friends: list[int] = FastField([])
    skills: dict[str, int] = FastField(
        {
            "foo": {"count": 4, "size": None},
            "bars": [
                {"apple": "x1", "banana": "y"},
                {"apple": "x2", "banana": "y"},
            ],
        }
    )
@timesup(number=100000, repeats=3)
def pydantic_defaults():
    User(id=42)        # ~0.00935 ms (with_deepcopy)
    FastUser(id=1337)  # ~0.00292 ms (with_duper): 3.20 times faster than with_deepcopy
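The FastField pattern relies on the default factory being called once per instance. Since duper is not yet available, here is a minimal sketch of the same mechanism using only the standard library: dataclasses stand in for Pydantic models and copy.deepcopy stands in for the duper factory. The helper name `copied_default` and the `Profile` class are hypothetical.

```python
import copy
from dataclasses import dataclass, field
from functools import partial


def copied_default(default):
    """Hypothetical helper: give every instance its own copy of default."""
    # The factory is built once here and called on each instantiation.
    return field(default_factory=partial(copy.deepcopy, default))


@dataclass
class Profile:
    id: int
    friends: list = copied_default([])
    skills: dict = copied_default({"foo": {"count": 4, "size": None}})


a = Profile(id=1)
b = Profile(id=2)
a.friends.append(3)  # does not affect b.friends
```

Replacing the `partial(copy.deepcopy, default)` factory with a prebuilt, faster copying callable is where a speed-up like the one measured above would come from.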
🚧 Status
Though the library is in an early development stage, it already outperforms all other solutions I've found when copying objects.
I am completing the implementation, exploring new ideas, and validating existing ones to improve performance.
My current priority is to speed up the initial build of the copying factory. It is currently slightly slower than deepcopy in most cases.
If you're interested in this project, you can contact me via bobronium@gmail.com or Telegram.