A MongoDB aggregation generator for Mongoengine

Aggify

Aggify is a Python library for generating MongoDB aggregation pipelines, designed to work with Mongoengine. It lets you build complex queries and aggregations by chaining method calls on a document model instead of hand-writing pipeline dictionaries.

Features

  • Programmatically build MongoDB aggregation pipelines.
  • Filter, project, group, and perform various aggregation operations with ease.
  • Supports querying nested documents and relationships defined using Mongoengine.
  • Encapsulates aggregation stages for a more organized and maintainable codebase.
  • Designed to simplify the process of constructing complex MongoDB queries.

TODO

  • $match: Filters the documents to include only those that match a specified condition.
  • $project: Reshapes and selects specific fields from the documents.
  • $group: Groups documents by a specified field and performs aggregation operations within each group.
  • $unwind: Deconstructs arrays within documents, creating multiple documents for each array element.
  • $limit: Limits the number of documents in the result.
  • $skip: Skips a specified number of documents in the result.
  • $lookup: Performs a left outer join to combine documents from two collections.
  • $sort: Sorts the documents in the aggregation pipeline.
  • $conf:
  • $out: Writes the result of the aggregation pipeline to a new collection.
  • $group (with accumulators): Performs various aggregation operations like counting, summing, averaging, and more.
  • $addFields: Adds new fields to the documents in the pipeline.
  • $replaceRoot: Replaces the document structure with a new one.
  • $project (with expressions): Allows you to use expressions to reshape and calculate values.
  • $redact: Controls document inclusion during the aggregation pipeline.
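
To make a few of the stage descriptions above concrete, here is a toy in-memory interpreter for $match, $sort, $skip, and $limit. It is plain Python operating on a list of dicts; it is neither MongoDB nor Aggify, and the names run_pipeline and posts are hypothetical. It only handles equality matching, so treat it as a sketch of the semantics rather than a reference implementation.

```python
from typing import Any

Doc = dict[str, Any]


def run_pipeline(docs: list[Doc], pipeline: list[dict[str, Any]]) -> list[Doc]:
    """Apply a small subset of aggregation stages to a list of dicts."""
    result = list(docs)
    for stage in pipeline:
        if len(stage) != 1:
            raise ValueError("each stage must contain exactly one operator")
        (op, arg), = stage.items()
        if op == "$match":
            # Equality-only matching; the real $match supports many more operators.
            result = [d for d in result if all(d.get(k) == v for k, v in arg.items())]
        elif op == "$sort":
            # Sort by the last key first so that earlier keys take precedence
            # (Python's sort is stable).
            for key, direction in reversed(list(arg.items())):
                result = sorted(result, key=lambda d, k=key: d[k], reverse=(direction == -1))
        elif op == "$skip":
            result = result[arg:]
        elif op == "$limit":
            result = result[:arg]
        else:
            raise NotImplementedError(f"stage {op} is not part of this sketch")
    return result


# Hypothetical sample data, loosely modelled on a post collection.
posts = [
    {"_id": 1, "caption": "hello", "likes": 5},
    {"_id": 2, "caption": "bye", "likes": 9},
    {"_id": 3, "caption": "hello", "likes": 2},
    {"_id": 4, "caption": "hello", "likes": 7},
]

pipeline = [
    {"$match": {"caption": "hello"}},  # keep posts 1, 3, 4
    {"$sort": {"likes": -1}},          # order: 4, 1, 3
    {"$skip": 1},                      # drop post 4
    {"$limit": 1},                     # keep post 1
]

top = run_pipeline(posts, pipeline)
print(top)
```

Each stage receives the output of the previous one, which is why the order of stages in the list matters.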

Installation

You can install Aggify using pip:

pip install aggify

Sample Usage

Here's a code snippet that demonstrates how to use Aggify to construct a MongoDB aggregation pipeline:

from aggify import Aggify, Q, F
from mongoengine import Document, fields
from pprint import pprint


class AccountDocument(Document):
    username = fields.StringField()
    display_name = fields.StringField()
    phone = fields.StringField()
    is_verified = fields.BooleanField()
    disabled_at = fields.LongField()
    deleted_at = fields.LongField()
    banned_at = fields.LongField()

    meta = {
        'collection': 'account',
        'ordering': ['-_id'],
        'indexes': [
            'username', 'phone', 'display_name',
            'deleted_at', 'disabled_at', 'banned_at'
        ],
    }


class PostStat(fields.EmbeddedDocument):
    like_count = fields.IntField(default=0)
    view_count = fields.IntField(default=0)
    comment_count = fields.IntField(default=0)

    meta = {'allow_inheritance': True}


class PostDocument(Document):
    owner = fields.ReferenceField('AccountDocument', db_field='owner_id')
    caption = fields.StringField()
    location = fields.StringField()
    comment_disabled = fields.BooleanField()
    stat_disabled = fields.BooleanField()
    hashtags = fields.ListField()
    archived_at = fields.LongField()
    deleted_at = fields.LongField()
    stat = fields.EmbeddedDocumentField(PostStat)


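# Create an Aggify instance with the base model (e.g., PostDocument).
# Each output below shows the pipeline for that example on its own; if stages
# accumulate on a shared instance, start each example from a fresh Aggify(PostDocument).
query = Aggify(PostDocument)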

pprint(query.filter(caption__contains="hello").pipelines)
# output :
#    [{'$match': {'caption': {'$options': 'i', '$regex': '.*hello.*'}}}]


pprint(query.filter(caption__contains="hello", owner__deleted_at=None).pipelines)
# output :
#         [{'$match': {'caption': {'$options': 'i', '$regex': '.*hello.*'}}},
#          {'$lookup': {'as': 'owner',
#                       'foreignField': '_id',
#                       'from': 'account',
#                       'localField': 'owner_id'}},
#          {'$unwind': {'path': '$owner', 'preserveNullAndEmptyArrays': True}},
#          {'$match': {'owner.deleted_at': None}}]


pprint(
    query.filter(caption__contains="hello").project(caption=1, deleted_at=0).pipelines
)

# output :
#         [{'$match': {'caption': {'$options': 'i', '$regex': '.*hello.*'}}},
#          {'$project': {'caption': 1, 'deleted_at': 0}}]

pprint(
    query.filter(
        (Q(caption__contains='hello') | Q(location__contains='test')) & Q(deleted_at=None)
    ).pipelines
)

# output :
# [{'$match': {'$and': [{'$or': [{'caption': {'$options': 'i',
#                                             '$regex': '.*hello.*'}},
#                                {'location': {'$options': 'i',
#                                              '$regex': '.*test.*'}}]},
#                       {'deleted_at': None}]}}]

pprint(
    query.filter(caption='hello')[3:10].pipelines
)

# output:
#         [{'$match': {'caption': 'hello'}}, {'$skip': 3}, {'$limit': 7}]

pprint(
    query.filter(caption='hello').order_by('-_id').pipelines
)

# output:
#         [{'$match': {'caption': 'hello'}}, {'$sort': {'_id': -1}}]

pprint(
    query.add_fields({
        "new_field_1": "some_string",
        "new_field_2": F("existing_field") + 10,
        "new_field_3": F("field_a") * F("field_b")}
    ).pipelines
)
# output :
#         [{'$addFields': {'new_field_1': {'$literal': 'some_string'},
#                  'new_field_2': {'$add': ['$existing_field', 10]},
#                  'new_field_3': {'$multiply': ['$field_a', '$field_b']}}}]


pprint(
    query.lookup(from_collection=AccountDocument, let=['owner'],
                 query=[
                     Q(_id__exact='owner') & Q(username__ne='seyed'),
                 ],
                 as_name="posts").filter(posts__ne=[]).pipelines
)
# output: 
#         [{'$lookup': {'as': 'posts',
#               'from': 'account',
#               'let': {'owner': '$owner_id'},
#               'pipeline': [{'$match': {'$expr': {'$and': [{'$eq': ['$_id',
#                                                                    '$$owner']},
#                                                           {'$ne': ['$username',
#                                                                    'seyed']}]}}}]}},
#  {'$match': {'posts': {'$ne': []}}}]

pprint(query.replace_root(embedded_field='stat').pipelines)
# output:
#  [{'$replaceRoot': {'$newRoot': '$stat'}}]

pprint(query.replace_with(embedded_field='stat').pipelines)
# output:
#  [{'$replaceWith': '$stat'}]

pprint(query.replace_with(embedded_field='stat', merge={
    "like_count": 0,
    "view_count": 0,
    "comment_count": 0
}).pipelines)
# output:
#  [{'$replaceWith': {'$mergeObjects': [{'comment_count': 0,
#                                       'like_count': 0,
#                                       'view_count': 0},
#                                      '$stat']}}]

The sample usage above shows how Aggify lets you chain filters, projections, and other operations to build an aggregation pipeline. Each pprint(....pipelines) call prints the pipeline generated so far, which is useful for debugging or analysis.
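
As a rough illustration of the translations visible in the outputs above, the sketch below maps a "contains" lookup to a case-insensitive $regex match and a slice such as [3:10] to $skip and $limit stages. The helper names contains_stage and slice_stages are hypothetical and are not part of Aggify; the use of re.escape is an addition of this sketch, and whether Aggify escapes the value is not stated here.

```python
import re
from typing import Any


def contains_stage(field: str, value: str) -> dict[str, Any]:
    """Build a case-insensitive substring $match stage (hypothetical helper)."""
    # re.escape makes characters such as '.' or '[' match literally.
    pattern = f".*{re.escape(value)}.*"
    return {"$match": {field: {"$regex": pattern, "$options": "i"}}}


def slice_stages(window: slice) -> list[dict[str, int]]:
    """Translate a forward, non-negative slice into $skip/$limit stages."""
    start = window.start or 0
    if window.step not in (None, 1):
        raise ValueError("slice steps are not supported in this sketch")
    if start < 0 or (window.stop is not None and window.stop <= start):
        raise ValueError("slice must be non-negative and non-empty")
    stages: list[dict[str, int]] = []
    if start:
        stages.append({"$skip": start})
    if window.stop is not None:
        # The limit is the window length, not the stop index.
        stages.append({"$limit": window.stop - start})
    return stages


match_stage = contains_stage("caption", "hello")
paging = slice_stages(slice(3, 10))
print([match_stage, *paging])
```

The slice translation explains why [3:10] produces a limit of 7: the limit is the number of documents in the window, computed as stop minus start.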

For more details and examples, please refer to the documentation and codebase.
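
As a supplement to the add_fields example in the sample usage, the following sketch shows one way operator overloading can turn Python arithmetic into aggregation expressions such as $add and $multiply. The Expr class is a hypothetical stand-in written for this illustration; it is not Aggify's F class and makes no claim about how F is implemented.

```python
from __future__ import annotations

from typing import Any


class Expr:
    """Hypothetical field-expression wrapper (not Aggify's F)."""

    def __init__(self, value: Any) -> None:
        self.value = value

    @classmethod
    def field(cls, name: str) -> Expr:
        # Field paths in aggregation expressions are prefixed with '$'.
        return cls(f"${name}")

    def _combine(self, operator: str, other: Any) -> Expr:
        # Unwrap nested expressions; pass plain values through unchanged.
        operand = other.value if isinstance(other, Expr) else other
        return Expr({operator: [self.value, operand]})

    def __add__(self, other: Any) -> Expr:
        return self._combine("$add", other)

    def __sub__(self, other: Any) -> Expr:
        return self._combine("$subtract", other)

    def __mul__(self, other: Any) -> Expr:
        return self._combine("$multiply", other)


plus_ten = (Expr.field("existing_field") + 10).value
product = (Expr.field("field_a") * Expr.field("field_b")).value
print({"$addFields": {"new_field_2": plus_ten, "new_field_3": product}})
```

Because every operator returns another Expr, expressions compose: (Expr.field("a") + 1) * 2 nests an $add inside a $multiply.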
