Provides a char-based, prefixable ID field for your Django models. Supports cuid, ksuid, ulid, et al.

django-charid-field

Provides a char-based, prefixable CharIDField for your Django models.

It can utilise cuid, ksuid, ulid or any other string-based UID generation systems.

It can be used as the primary key, or simply as another key on your models.

⛲ Feature set

  • Ability to work with the UID generation spec of your choice.
  • Support for prefixing the ID on a per-model basis à la Stripe, e.g. cus_ => cus_cjld2cjxh0000qzrmn831i7rn
  • Support for all database backends that support the CharField.
  • Support for Python 3.9 & above only.

🤷 Why?

To get us a global namespace of collision-resistant IDs that:

  • are URL-safe
  • can be represented in a visual-space-efficient manner
  • are collision-resistant to allow for client side generation
  • are usable today, whereas UUID v6, v7 and v8 are still in RFC draft and not ready

cuid, ksuid, ulid & many others offer this now, and prefixing gets us the global namespace.

Why not use integers?

  • Auto-incrementing integers are easily enumerable and give away collection count.

  • You can solve that with HashID, but then you either have to store the HashID as another column or deal with constant conversion between the values shown in your UI and the raw values in your database.

  • Most importantly: relying on your database to generate IDs means sequential writes. Your clients are not free to generate their own IDs without a round trip to the database.

Why not use UUIDs?

They solve the collision problem so why not?

  • The text formats use hex, which is not visually space-efficient.
  • UUIDv4 (the one usually recommended) is completely random and thus has no meaningful sort order. This has the knock-on effect of making databases work harder when looking up/indexing, as binary search goes out the window.
  • Optional hyphenation when representing the hex. This nuance results in more code.
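
To put a rough number on the space argument, the sketch below compares the usual text forms of a UUID with a base62 rendering of the same 128 bits. The to_base62 helper is a hypothetical illustration only; it is not part of this package or of any of the ID libraries mentioned in this document.

```python
import string
import uuid

# 0-9, A-Z, a-z: a URL-safe alphabet with 62 symbols.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62(number: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


value = uuid.uuid4()
hyphenated = str(value)         # 32 hex digits plus 4 hyphens
bare_hex = value.hex            # 32 hex digits, no hyphens
compact = to_base62(value.int)  # the same 128 bits in base62

print(len(hyphenated), len(bare_hex), len(compact))
```

The hyphenated form is 36 characters and the bare hex form 32, while 128 bits never need more than 22 base62 characters.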

Why prefix?

Because global flat namespaces are powerful. An ID now represents the instance and its type, which means you can have powerful lookup abilities with just the identifier alone. No more guessing whether 802302 is a Dog or a Cat.

📗 Install

Install using your favourite Python dependency manager, or straight with pip:

pip install django-charid-field

You'll also need to install your ID-generation library of choice (or bring your own).

For example:

UID Spec   Python Library            What could it look like? (with a prefix dev_)
cuid       cuid.py: GH / PyPI        dev_ckpffbliw000001mi3fw42vsn
ksuid      cyksuid: GH / PyPI        dev_1tOMP4onidzvnUFuTww2UeamY39
ulid       python-ulid: GH / PyPI    dev_01F769XGM83VR75H86ZPHKK595

✨ Usage

from charidfield import CharIDField

We recommend using functools.partial to create your own field for your codebase; this will allow you to specify your chosen ID generation and set the max_length parameter once, and then have an importable field you can use across all your models.

Here's an example using the cuid spec and cuid.py:

# Locate this somewhere importable
from functools import partial

from cuid import cuid
from charidfield import CharIDField

CuidField = partial(
    CharIDField,
    default=cuid,
    max_length=30,
    help_text="cuid-format identifier for this entity."
)

# models.py
from django.db import models

from wherever_you_put_it import CuidField

class Dog(models.Model):
    id = CuidField(primary_key=True, prefix="dog_")
    # The max_length value here is illustrative.
    name = models.CharField(max_length=100)

# shell
>>> dog = Dog(name="Ronnie")
>>> dog.id
"dog_ckpffbliw000001mi3fw42vsn"

Parameters

Param        Type             Required  Default
default      Callable         No        -
max_length   int              Yes       Set it
primary_key  boolean          No        False
prefix       str or Callable  No        ""
unique       boolean          No        True

Notes:

  • default: This should be a callable which generates a UID in whatever system you chose. Your callable does not have to handle prefixing; the prefix will be applied onto the front of whatever string your default callable generates. Technically not required, but without it you must handle ID generation yourself.
  • max_length: Controls the maximum length of the stored strings. Provide your own to match whatever ID system you pick, remembering to take into account the length of any prefixes you have configured. Also note that there is no perf/storage impact for modern Postgres, so for that backend it is effectively an arbitrary char limit.
  • primary_key: Set to True to replace Django's default AutoField that gets used as the primary key, else the field will be an additional ID field available to the model.
  • prefix: If provided, the ID strings generated as the field's default value will be prefixed. This provides a way to have a per-model prefix, which can be helpful in providing a global namespace for your ID system. The prefix can be provided as a string literal (e.g. cus_), or as a Callable which is run when the field is attached to the model class and can allow for more dynamic prefixing needs. For more, see below.
  • unique: Whether the field should be treated as unique across the dataset; the field provides a sane default of True so that a database index is set up to protect you against collisions (whether due to chance or, more likely, a bug/human error). To turn the index off, simply pass False.

All other django.db.models.fields.CharField keyword arguments should work as expected. See the Django docs.
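
As a rough aid for picking max_length, the sketch below adds the longest prefix you plan to use to the length of a sample ID from your generator. required_max_length is a hypothetical helper, not part of charidfield; the sample is the 25-character cuid shown earlier in this document.

```python
from typing import Iterable


def required_max_length(sample_id: str, prefixes: Iterable[str]) -> int:
    """Return the smallest max_length that fits the sample ID behind any of the prefixes."""
    longest_prefix = max((len(prefix) for prefix in prefixes), default=0)
    return longest_prefix + len(sample_id)


sample = "ckpffbliw000001mi3fw42vsn"  # 25 characters
print(required_max_length(sample, ["dog_"]))                   # 29
print(required_max_length(sample, ["dog_", "user_profile_"]))  # 38
```

By this measure, a max_length of 30 fits a four-character prefix such as dog_ but would be too small for a longer prefix such as user_profile_.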

Usage as the Primary Key

This will replace Django's AutoField and the generated ID will become the primary key for the entity, thus removing the default database-generated incremental integer ID.

# models/some_model.py or models.py
from django.db import models

from charidfield import CharIDField

class SomeModel(models.Model):
    # your_id_generator is a placeholder for your chosen UID callable.
    id = CharIDField(primary_key=True, default=your_id_generator)

>>> some_model = SomeModel.objects.create()
>>> some_model.id
"ckp9jm3qn001001mrg5hw3sk4"
>>> some_model.pk
"ckp9jm3qn001001mrg5hw3sk4"

Setting up prefixing

What?

Prefixing allows per-entity ID namespacing, e.g.:

cus_ckp9mdxpd000i01ld6gzjgyl4 (reference a specific customer)
usr_ckp9me8zy000p01lda5579o3q (reference a specific user)
org_ckp9mek2d000s01ld8ffhhvd3 (reference a specific organisation)

Why?

By prefixing your entities' IDs you can create a global namespace for your ID system, which has numerous advantages:

  • when displaying an ID you can immediately tell what type of object it represents from the prefix alone. Most identifiers only say which instance is represented; without information about the type it is machine-impossible to tell whether ID 123 belongs to the Dog or the Cat model, whereas cat_123 and dog_123 make that clear.

  • by having a global system of prefixing, you can speed up internal processes (think: support) with backoffice features such as "quick find", which lets you paste in the ID in question and be taken straight to the page for that specific instance of that type of object.

This may sound familiar, as it's how Stripe handle their public IDs - everything is referenceable.
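
A "quick find" of this kind could be sketched as below: look for a registered prefix at the start of the ID and return the matching model name. The registry contents and the resolve_model_name helper are hypothetical; in a real project the mapping would presumably be built from your own models.

```python
from typing import Optional

# Hypothetical registry mapping ID prefixes to model names.
PREFIX_REGISTRY = {
    "cus_": "Customer",
    "usr_": "User",
    "org_": "Organisation",
    "user_profile_": "UserProfile",
}


def resolve_model_name(prefixed_id: str) -> Optional[str]:
    """Return the model name whose prefix starts the ID, or None if no prefix matches."""
    # Check longer prefixes first so that overlapping prefixes resolve predictably.
    for prefix in sorted(PREFIX_REGISTRY, key=len, reverse=True):
        if prefixed_id.startswith(prefix):
            return PREFIX_REGISTRY[prefix]
    return None


print(resolve_model_name("cus_ckp9mdxpd000i01ld6gzjgyl4"))  # Customer
print(resolve_model_name("802302"))                         # None
```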

How?

Two options.

First is to set a string literal during field instantiation. E.g.:

# models.py

class User(models.Model):
    public_id = CharIDField(prefix="usr_", ...)

>>> user = User.objects.create()
>>> user.public_id
"usr_ckp9me8zy000p01lda5579o3q"

Second is to pass a callable which is executed after field initialisation and during its addition to the model itself (Field.contribute_to_class). This allows for dynamic generation of the prefix at runtime, which is especially helpful if you've defined the field on an Abstract Django model class, as the prefix generator will be called once for every concrete model that inherits the abstract.

The callable should accept three keyword arguments: model_class: type[models.Model] (the model class the field is being added to), field_instance: django.db.models.Field (the field instance being added) & field_name: str (the name of the field on the model). E.g.:

# models.py
import re

from django.db import models
from django.db.models import Field

from charidfield import CharIDField


def get_prefix_from_class_name(
    *,
    model_class: type[models.Model],
    field_instance: Field,
    field_name: str,
) -> str:
    """Return the Model's name in snake_case for use as a cuid prefix."""
    name = model_class.__name__
    # CamelCase to snake_case
    name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', name).lower() + "_"


class UserProfile(models.Model):
    public_id = CharIDField(prefix=get_prefix_from_class_name, ...)

>>> user_profile = UserProfile.objects.create()
>>> user_profile.public_id
"user_profile_ckp9me8zy000p01lda5579o3q"

See the tests for more common usage patterns.
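
Because the prefix callable above only reads model_class.__name__, its output can be checked without a database. The sketch below repeats the function and calls it with plain stand-in classes (hypothetical names, not Django models), passing None for the field instance purely for illustration.

```python
import re


def get_prefix_from_class_name(*, model_class, field_instance, field_name) -> str:
    """Return the class name in snake_case, followed by an underscore."""
    name = model_class.__name__
    # CamelCase to snake_case, keeping runs of capitals (acronyms) together.
    name = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower() + "_"


# Plain classes standing in for concrete models.
class Dog: ...
class UserProfile: ...
class HTTPRequestLog: ...


def prefix_for(cls) -> str:
    """Call the prefix function the way the example above describes its arguments."""
    return get_prefix_from_class_name(
        model_class=cls, field_instance=None, field_name="public_id"
    )


for cls in (Dog, UserProfile, HTTPRequestLog):
    print(f"{cls.__name__} -> {prefix_for(cls)}")
```

This prints dog_, user_profile_ and http_request_log_ for the three classes, which is also a quick way to see how long your prefixes get before settling on max_length.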

👩‍💻 Development

🏗️ Local environment

The local environment is handled with poetry, so install that first then:

$ poetry install

🧪 Running tests

The tests themselves use pytest as the test runner.

After setting up the environment, run them using:

$ poetry run pytest

The full CI suite is controlled by tox, which contains a set of environments that will format (fmt), lint, and test against all supported Python + Django version combinations.

$ tox

⚙️ CI

Uses GitHub Actions, see .github/workflows.
