
Django Aggregation


Get DB aggregations using Django ORM for efficient data analysis and reporting. This repository provides tools to perform complex aggregations with ease, leveraging the power of Django's ORM.

Key Features

  • Perform multiple aggregations simultaneously
  • Calculate percentiles (PostgreSQL) and percentages with minimal additional setup
  • Group data by multiple fields
  • Generate time series data (PostgreSQL, MySQL)
  • Flexible result limiting and pagination
  • Custom aggregation types support

Installation

pip install drf-aggregation

Usage

With DRF

First, add the AggregationMixin to your viewset:

from drf_aggregation import AggregationMixin
from rest_framework.viewsets import GenericViewSet

class TicketViewSet(AggregationMixin, GenericViewSet):
    queryset = Ticket.objects.all()
    serializer_class = TicketSerializer

Then, wire the aggregation action into your URL configuration:

from django.urls import path

urlpatterns = [
    path("aggregation/ticket", TicketViewSet.as_view({"post": "aggregation"})),
]

Once set up, you can make requests like the following:

POST /aggregation/ticket
Content-Type: application/json
{
    "group_by": "service",
    "limit": 5,
    "order_by": "-total_tasks",
    "aggregations": {
        "total_tasks": {
            "type": "count"
        },
        "average_execution_time": {
            "type": "average",
            "field": "execution_time"
        }
    }
}
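The endpoint responds with the aggregated groups. An illustrative (not verbatim) response shape for the request above, with invented service names and numbers:

```json
[
    {"service": "Email", "total_tasks": 42, "average_execution_time": 3600.5},
    {"service": "VPN", "total_tasks": 17, "average_execution_time": 1250.0}
]
```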

Direct Usage in Code

You can also use the aggregation function directly:

from drf_aggregation import get_aggregations

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    }
)

Parameters

  • aggregations: A dictionary specifying the aggregations to perform.

    • key: The name under which the aggregation result will be returned.
    • value: A dictionary with aggregation settings.
      • type: The type of aggregation (e.g., count, sum).
      • index_by_group: Index for sorting by a specific field.
      • field: Required for sum, average, minimum, maximum, percentile.
      • percentile: A value from 0 to 1, required for percentile calculations.
      • additional_filter: A filter in drf-complex-filter format; required for percent aggregations.
  • group_by: List of fields to group the results by.

  • order_by: List of fields to sort the results.

  • limit: Number of groups to return or a dictionary with settings:

    • limit: Number of groups to return.
    • offset: Offset for the start of returned groups.
    • by_group: Field to limit the result by, defaults to the first grouping field.
    • by_aggregation: Aggregation to limit the result by, defaults to the first declared aggregation.
    • show_other: Return remaining records as an additional group.
    • other_label: Label for the additional group.
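As a sketch, several of these parameters can be combined into a single request body; all field names below are hypothetical:

```python
# Request body combining grouping, ordering, and group pagination.
# All field names here are hypothetical.
payload = {
    "group_by": ["service", "priority"],
    "order_by": ["-total_tasks"],
    "limit": {
        "limit": 10,          # return at most 10 groups
        "offset": 0,          # start from the first group
        "by_aggregation": "total_tasks",
        "show_other": True,   # collect remaining records into one extra group
        "other_label": "Other",
    },
    "aggregations": {
        "total_tasks": {"type": "count"},
        "longest_run": {"type": "maximum", "field": "execution_time"},
    },
}
```

The same values can be passed as the individual aggregations, group_by, order_by, and limit arguments of get_aggregations when using it directly in code.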

Supported Field Types

  • IntegerField
  • FloatField
  • DateField (min/max only)
  • DateTimeField (min/max only)
  • DurationField
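For example, since date and datetime fields support only minimum and maximum, an aggregation spec for them is limited to these two types (field name hypothetical):

```python
# DateTimeField supports only minimum/maximum aggregations.
# The field name "created_at" is hypothetical.
aggregations = {
    "first_created": {"type": "minimum", "field": "created_at"},
    "last_created": {"type": "maximum", "field": "created_at"},
}
```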

Extending Aggregation Types

By default, the following aggregations are enabled: count, distinct, sum, average, minimum, maximum.

To enable additional aggregations like percent and percentile, modify your settings.py:

# in settings.py
DRF_AGGREGATION_SETTINGS = {
    "AGGREGATION_CLASSES": [
        "drf_aggregation.aggregations.common.CommonAggregations",
        # Requires additional package "drf-complex-filter"
        "drf_aggregation.aggregations.percent.PercentAggregation",
        # Works only on PostgreSQL
        "drf_aggregation.aggregations.percentile.PercentileAggregation",
    ],
}
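With those classes enabled, percentile and percent aggregations can be declared as below. The field names are hypothetical, and the additional_filter body is a sketch of drf-complex-filter's operator format; check that package's documentation for the exact syntax:

```python
# Hypothetical aggregation specs for the optional aggregation types.
aggregations = {
    # 95th percentile of execution_time (PostgreSQL only).
    "p95_execution_time": {
        "type": "percentile",
        "field": "execution_time",
        "percentile": 0.95,
    },
    # Share of records matching additional_filter (needs drf-complex-filter);
    # the filter body below is an assumption about its operator format.
    "closed_share": {
        "type": "percent",
        "additional_filter": {
            "type": "operator",
            "data": {"attribute": "status", "operator": "=", "value": "closed"},
        },
    },
}
```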

Custom Aggregations

Create a class with static methods for custom aggregation types:

class MyAggregations:
    @staticmethod
    def my_aggregation(aggregation, queryset):
        name = aggregation.get("name")
        return {f"{name}": models.Count("id")}

# in settings.py
DRF_AGGREGATION_SETTINGS = {
    "AGGREGATION_CLASSES": [
        "drf_aggregation.aggregations.common.CommonAggregations",
        "path.to.MyAggregations",
    ],
}

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "value": {
            "type": "my_aggregation"
        }
    }
)

Usage Examples

Grouping Results

Group results by a list of fields:

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    },
    group_by=["field1", "field2"]
)

Sorting Results

Sort results by specified fields:

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    },
    group_by="field1",
    order_by="field1"
)

Use aggregations as sorting keys:

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    },
    group_by="field1",
    order_by="-total_tasks"
)

Limiting Displayed Groups

Limit the number of displayed groups:

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    },
    group_by="field1",
    order_by="-total_tasks",
    limit=2
)

Display remaining groups as an additional category:

result = get_aggregations(
    queryset=Ticket.objects.all(),
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    },
    group_by="field1",
    order_by="-total_tasks",
    limit={
        "limit": 2,
        "show_other": true
    }
)

Time Series

Note: Time series aggregations are not supported on SQLite.

To perform time series aggregations, annotate your queryset with a truncated date field:

from drf_aggregation import truncate_date

truncate_rules = {"created_at": "day"}
queryset = truncate_date(Ticket.objects.all(), truncate_rules)

result = get_aggregations(
    queryset=queryset,
    aggregations={
        "total_tasks": {
            "type": "count"
        }
    },
    group_by="created_at__trunc__day"
)

When using the AggregationMixin, pass truncate_rules in the request body instead:

POST /aggregation/ticket
Content-Type: application/json
{
    "truncate_rules": { "created_at": "day" },
    "group_by": "created_at__trunc__day",
    "aggregations": {
        "total_tasks": {
            "type": "count"
        }
    }
}

Available truncation periods: year, quarter, month, week, day, hour, minute, second
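The annotated alias used in group_by follows the pattern field__trunc__period, so truncation rules map directly to group names (the closed_at field below is hypothetical):

```python
# Map truncation rules to the "<field>__trunc__<period>" aliases used in
# group_by. The field names are hypothetical.
truncate_rules = {"created_at": "month", "closed_at": "day"}
group_by = [f"{field}__trunc__{period}" for field, period in truncate_rules.items()]
# group_by == ["created_at__trunc__month", "closed_at__trunc__day"]
```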

For more details on date truncation, see the Django documentation.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License.
