Pydastic is an elasticsearch python ORM based on Pydantic.

Pydastic

💾 Installation

Pip:

pip install pydastic

Poetry:

poetry add pydastic

🚀 Core Features

  • Simple CRUD operations supported
  • Sessions for simplifying bulk operations (a la SQLAlchemy)
  • Dynamic index support when committing operations

📋 Usage

Defining Models

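from datetime import datetime
from typing import Optional

from pydantic import Field
from pydastic import ESModel

class User(ESModel):
    name: str
    phone: Optional[str] = None  # optional field, defaults to None
    last_login: datetime = Field(default_factory=datetime.now)

    class Meta:
        index = "user"  # default index used when none is passed explicitly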

Establishing Connection

An Elasticsearch connection can be set up with the connect function. This function adopts the same signature as the elasticsearch.Elasticsearch client and supports editor autocomplete. Make sure to call it only once: no protection is in place against multiple calls, and repeated calls might affect performance negatively.

from pydastic import connect

connect(hosts="localhost:9200")
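Since nothing guards against repeated calls, an application can add its own guard. The following is a minimal sketch, not part of Pydastic: once is a hypothetical helper, and fake_connect is a stand-in for connect so the example runs without a cluster.

```python
import functools
import threading
from typing import Any, Callable


def once(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap func so only the first call runs; later calls return the first result."""
    lock = threading.Lock()
    called = False
    result: Any = None

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        nonlocal called, result
        with lock:  # keeps the guard safe if several threads start up at once
            if not called:
                result = func(*args, **kwargs)
                called = True
        return result

    return wrapper


# Stand-in for pydastic.connect (assumption: used here only to make the sketch runnable).
calls = []


def fake_connect(hosts: str) -> str:
    calls.append(hosts)
    return f"client for {hosts}"


connect_once = once(fake_connect)
connect_once(hosts="localhost:9200")
connect_once(hosts="localhost:9200")  # no second connection is created
```

In an application, the same wrapper could be applied to the real function, for example connect_once = once(connect), and only connect_once would be called from startup code.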

CRUD: Create, Update

# Create and save doc
user = User(name="John")
user.save(wait_for=True)  # wait_for explained below

# The id is assigned once the document has been saved
assert user.id is not None

# Update doc
user.name = "Sam"
user.save(wait_for=True)

CRUD: Read Document

got = User.get(id=user.id)
assert got == user

CRUD: Delete

user = User(name="Marie")
user.save(wait_for=True)

user.delete(wait_for=True)

Sessions

Sessions are inspired by SQLAlchemy's sessions and simplify bulk operations with the Elasticsearch client. From what I've seen, the ES client makes the bulk API fairly hard to use directly, which is why it ships bulk helpers (which in turn have incomplete or wrong docs).

from pydastic import Session

john = User(name="John")
sarah = User(name="Sarah")

with Session() as session:
    session.save(john)
    session.save(sarah)
    session.commit()
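To illustrate the idea behind a session, here is a minimal sketch of queuing operations and handing them over as one batch. It is not Pydastic's implementation: MiniSession and its methods are hypothetical, and the only external fact it relies on is the action format accepted by the elasticsearch bulk helpers (_op_type, _index, _id, _source).

```python
from typing import Any, Dict, List


class MiniSession:
    """Hypothetical stand-in for a session: queues actions and returns them as one batch."""

    def __init__(self, default_index: str) -> None:
        self.default_index = default_index
        self._pending: List[Dict[str, Any]] = []

    def save(self, doc_id: str, source: Dict[str, Any], index: str = "") -> None:
        # Queue an "index" action; nothing is sent yet.
        self._pending.append(
            {
                "_op_type": "index",
                "_index": index or self.default_index,
                "_id": doc_id,
                "_source": source,
            }
        )

    def delete(self, doc_id: str, index: str = "") -> None:
        # Queue a "delete" action; delete actions carry no document body.
        self._pending.append(
            {
                "_op_type": "delete",
                "_index": index or self.default_index,
                "_id": doc_id,
            }
        )

    def commit(self) -> List[Dict[str, Any]]:
        # Hand over everything queued so far and start with an empty queue.
        actions, self._pending = self._pending, []
        return actions


session = MiniSession(default_index="user")
session.save("1", {"name": "John"})
session.delete("2", index="my-user")
actions = session.commit()
```

With a real client, a list like actions is the kind of input that elasticsearch.helpers.bulk(client, actions) accepts, so a single commit maps to a single bulk request.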

With an ORM, bulk operations can be exposed neatly through a simple API. Pydastic also offers more informative errors on issues encountered during bulk operations. This is possible by suppressing the built-in elastic client errors and extracting more verbose ones instead.

Example error:

pydastic.error.BulkError: [
    {
        "update": {
            "_index": "user",
            "_type": "_doc",
            "_id": "test",
            "status": 404,
            "error": {
                "type": "document_missing_exception",
                "reason": "[_doc][test]: document missing",
                "index_uuid": "cKD0254aQRWF-E2TMxHa4Q",
                "shard": "0",
                "index": "user"
            }
        }
    },
    {
        "update": {
            "_index": "user",
            "_type": "_doc",
            "_id": "test2",
            "status": 404,
            "error": {
                "type": "document_missing_exception",
                "reason": "[_doc][test2]: document missing",
                "index_uuid": "cKD0254aQRWF-E2TMxHa4Q",
                "shard": "0",
                "index": "user"
            }
        }
    }
]
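The entries in an error like the one above follow the shape of the items list in an Elasticsearch bulk response, where each item is keyed by its action name. As a rough sketch of how such entries could be extracted (failed_items is a hypothetical helper, not Pydastic's code, and the response below is hand-written sample data):

```python
import json
from typing import Any, Dict, List


def failed_items(bulk_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the per-action entries of a bulk response that carry an "error" field."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action: index, create, update or delete.
        for action, result in item.items():
            if "error" in result:
                failures.append({action: result})
    return failures


# Hand-written sample: one successful action and one failed update.
response = {
    "errors": True,
    "items": [
        {"index": {"_index": "user", "_id": "ok", "status": 201}},
        {
            "update": {
                "_index": "user",
                "_id": "test",
                "status": 404,
                "error": {
                    "type": "document_missing_exception",
                    "reason": "[_doc][test]: document missing",
                },
            }
        },
    ],
}

failures = failed_items(response)
print(json.dumps(failures, indent=4))
```

Only the failed update is kept, which is what makes the resulting message more useful than a generic "bulk request failed" error.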

The sessions API will also be available through a context manager before the v1.0 release.

Dynamic Index Support

Pydastic also supports dynamic index specification. The index defined in the model's Meta class is still mandatory, but if an index is passed when performing an operation, that index is used instead. The Meta index is therefore technically a fallback, although most users will probably use a single index per model. Some users need multiple indices per model (for example, one user index per company).

user = User(name="Marie")
user.save(index="my-user", wait_for=True)

user.delete(index="my-user", wait_for=True)
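The fallback rule can be summarised in a few lines. This is a sketch of the behaviour described above, not Pydastic's code: resolve_index and company_index are hypothetical helpers, and the "user-<company>" naming scheme is only an illustration of the one-index-per-company case.

```python
from typing import Optional


def resolve_index(meta_index: str, index: Optional[str] = None) -> str:
    """Use the explicitly passed index if there is one, else fall back to the Meta index."""
    return index if index else meta_index


def company_index(base: str, company: str) -> str:
    """Build a per-company index name (hypothetical naming convention)."""
    return f"{base}-{company.lower()}"


default = resolve_index("user")
explicit = resolve_index("user", index=company_index("user", "Acme"))
```

Here default is the Meta index and explicit is the per-company index, which would then be passed as the index argument to save or delete.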

Notes on testing

When writing tests with Pydastic (this also applies to tests written against the elasticsearch client), remember to pass wait_for=True when executing operations. Without it, the test keeps executing even if Elasticsearch hasn't propagated the change to all nodes yet, which gives you weird results.

For example, if you save a document and then try to get it directly afterwards, you'll get a document not found error. This is solved by using the wait_for argument in Pydastic (equivalent to refresh="wait_for" in Elasticsearch).

Here is a reference to where this argument is listed in the docs.

It's also supported in the bulk helpers even though it's not mentioned in their docs, but you wouldn't figure that out unless you dug into their source and traced back several function calls where *args and **kwargs are just being forwarded across calls. :)
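As a small illustration of the equivalence mentioned above, the flag can be thought of as translating into the refresh keyword that is forwarded to the client. refresh_kwargs is a hypothetical helper written for this note, not a Pydastic function.

```python
from typing import Any, Dict


def refresh_kwargs(wait_for: bool = False) -> Dict[str, Any]:
    """Translate a wait_for flag into the keyword arguments forwarded to the client."""
    # refresh="wait_for" asks Elasticsearch to respond only after the change is visible.
    return {"refresh": "wait_for"} if wait_for else {}


with_wait = refresh_kwargs(wait_for=True)
without_wait = refresh_kwargs()
```

In a test, the first form is the one to use; the second leaves the client's default refresh behaviour untouched.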

Supported Elasticsearch Versions

The build runs the tests as a matrix against Elasticsearch 7.12.0 and 8.1.2, in each case using the matching server and Python client versions. This ensures support for multiple versions.

📈 Releases

None yet.

You can see the list of available releases on the GitHub Releases page.

We follow Semantic Versions specification.

We use Release Drafter. As pull requests are merged, a draft release is kept up-to-date listing the changes, ready to publish when you’re ready. With the categories option, you can categorize pull requests in release notes using labels.

🛡 License

This project is licensed under the terms of the MIT license. See LICENSE for more details.

📃 Citation

@misc{pydastic,
  author = {Rami Awar},
  title = {Pydastic is an elasticsearch python ORM based on Pydantic.},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/ramiawar/pydastic}}
}
