MongoDB aggregation pipelines made easy. Joins, grouping, counting and much more...

Overview

Monggregate is a library that aims to simplify the use of MongoDB aggregation pipelines in Python. It is built on pymongo, the official MongoDB Python driver, and on pydantic.

Features

  • Provides an object-oriented (OOP) interface to the aggregation pipeline.
  • Lets you focus on your requirements rather than on MongoDB syntax.
  • Integrates all the MongoDB documentation, so you can refer to it quickly without having to navigate to the website.
  • Enables autocompletion for the various MongoDB features.
  • Offers a pandas-style way to chain operations on data.

Requirements

This package requires Python > 3.10 and pydantic > 1.8.0.

Installation

The package is available on PyPI:

pip install monggregate
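
After installing, it can be useful to confirm which versions ended up in your environment. The sketch below uses only the standard library (importlib.metadata); the helper name installed_version is ours for illustration and is not part of monggregate.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the interpreter version and the packages this README relies on.
    print("python     :", ".".join(map(str, sys.version_info[:3])))
    for name in ("monggregate", "pymongo", "pydantic"):
        print(f"{name:<11}:", installed_version(name) or "not installed")
```

Compare the printed versions with the Requirements section above.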

Usage

The examples below use the MongoDB sample_mflix database.

Basic Pipeline usage

import os

from dotenv import load_dotenv 
import pymongo
from monggregate import Pipeline

# Creating the connection string securely
# You need to create a .env file that defines MONGODB_PASSWORD
load_dotenv(verbose=True)
PWD = os.environ["MONGODB_PASSWORD"] 

MONGODB_URI = f"mongodb+srv://dev:{PWD}@myserver.xciie.mongodb.net/?retryWrites=true&w=majority"

# Connect to your MongoDB cluster:
client = pymongo.MongoClient(MONGODB_URI)

# Get a reference to the "sample_mflix" database:
db = client["sample_mflix"]

# Creating the pipeline
pipeline = Pipeline()

# The pipeline below will return the most recent movie with the title "A Star Is Born"
pipeline.match(
    title="A Star Is Born"
).sort(
    by="year",
    descending=True  # Most recent year first
).limit(
    value=1
)

# Executing the pipeline
cursor = db["movies"].aggregate(pipeline.export())

# Printing the results
results = list(cursor)
#print(results) # Uncomment to see the results
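
For readers who know the raw MongoDB syntax, the intent stated above (the most recent movie with a given title) can also be written by hand as plain pymongo stages. This is a sketch for comparison only: latest_movie_stages is a hypothetical helper, and the exact structure returned by pipeline.export() is not shown in this README and may differ.

```python
from typing import Any


def latest_movie_stages(title: str) -> list[dict[str, Any]]:
    """Hand-written raw stages: match on title, newest year first, keep one."""
    return [
        {"$match": {"title": title}},
        {"$sort": {"year": -1}},  # -1 means descending, so the newest comes first
        {"$limit": 1},
    ]


stages = latest_movie_stages("A Star Is Born")
# With a live connection, this list could be run as:
# results = list(db["movies"].aggregate(stages))
```

Writing the stages by hand works, but every operator name and nesting level has to be remembered; the chained calls above are meant to remove that burden.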

Advanced usage, with MongoDB operators

import os

from dotenv import load_dotenv 
import pymongo
from monggregate import Pipeline, S


# Creating the connection string securely
load_dotenv(verbose=True)
PWD = os.environ["MONGODB_PASSWORD"]
MONGODB_URI = f"mongodb+srv://dev:{PWD}@myserver.xciie.mongodb.net/?retryWrites=true&w=majority"


# Connect to your MongoDB cluster:
client = pymongo.MongoClient(MONGODB_URI)

# Get a reference to the "sample_mflix" database:
db = client["sample_mflix"]


# Creating the pipeline
pipeline = Pipeline()
pipeline.match(
    year=S.type_("number") # Filtering out documents where the year field is not a number
).group(
    by="year",
    query={
        "movie_count": S.sum(1),  # Aggregating the movies per year
        "movie_titles": S.push("$title")
    }
).sort(
    by="_id",
    descending=True
).limit(10)

# Executing the pipeline
cursor = db["movies"].aggregate(pipeline.export())

# Printing the results
results = list(cursor)
#print(results)
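
To make the shape of the result concrete, here is a sketch of the same intent in two forms: hand-written raw stages, and a pure-Python imitation that runs on an in-memory list without a database. Both are assumptions made for illustration: the raw stages are our reading of the pipeline above, not the verified output of pipeline.export(), and the sample documents are invented rather than taken from sample_mflix.

```python
from collections import defaultdict
from typing import Any

# Hand-written raw stages for the same intent (assumed equivalent).
RAW_STAGES = [
    {"$match": {"year": {"$type": "number"}}},
    {"$group": {
        "_id": "$year",
        "movie_count": {"$sum": 1},
        "movie_titles": {"$push": "$title"},
    }},
    {"$sort": {"_id": -1}},
    {"$limit": 10},
]


def group_by_year(docs: list[dict[str, Any]], limit: int = 10) -> list[dict[str, Any]]:
    """Pure-Python imitation of the stages above, for illustration only."""
    titles_by_year = defaultdict(list)
    for doc in docs:
        year = doc.get("year")
        # Keep numeric years only (bool is excluded, although it subclasses int).
        if isinstance(year, (int, float)) and not isinstance(year, bool):
            titles_by_year[year].append(doc["title"])
    rows = [
        {"_id": year, "movie_count": len(titles), "movie_titles": titles}
        for year, titles in titles_by_year.items()
    ]
    rows.sort(key=lambda row: row["_id"], reverse=True)  # newest year first
    return rows[:limit]


# Invented documents; the last one has a non-numeric year and is filtered out.
sample = [
    {"title": "A", "year": 1999},
    {"title": "B", "year": 2001},
    {"title": "C", "year": 1999},
    {"title": "D", "year": "unknown"},
]
rows = group_by_year(sample)
```

Each row holds one year in _id, the number of movies for that year, and the list of their titles.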

Even more advanced usage with Expressions

import os

from dotenv import load_dotenv 
import pymongo
from monggregate import Pipeline, S, Expression

# Creating the connection string securely
load_dotenv(verbose=True)
PWD = os.environ["MONGODB_PASSWORD"]
MONGODB_URI = f"mongodb+srv://dev:{PWD}@myserver.xciie.mongodb.net/?retryWrites=true&w=majority"


# Connect to your MongoDB cluster:
client = pymongo.MongoClient(MONGODB_URI)

# Get a reference to the "sample_mflix" database:
db = client["sample_mflix"]

# Using expressions
comments_count = Expression.field("comments").size()


# Creating the pipeline
pipeline = Pipeline()
pipeline.lookup(
    right="comments",
    right_on="movie_id",
    left_on="_id",
    name="comments"
).add_fields(
    comments_count=comments_count
).match(
    expression=comments_count > 2
).limit(1)

# Executing the pipeline
cursor = db["movies"].aggregate(pipeline.export())

# Printing the results
results = list(cursor)
#print(results)
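
For comparison, the join above can be sketched as hand-written raw stages. The mapping of right, left_on, right_on and name to the $lookup fields from, localField, foreignField and as is our reading of the parameter names, not something this README states, and movies_with_comments_stages is a hypothetical helper.

```python
from typing import Any


def movies_with_comments_stages(min_comments: int) -> list[dict[str, Any]]:
    """Hand-written raw stages: join comments, count them, keep busy movies."""
    # Defined once and reused, mirroring how the expression is reused above.
    comments_count = {"$size": "$comments"}
    return [
        {"$lookup": {
            "from": "comments",          # collection to join with
            "localField": "_id",         # field on the movies side
            "foreignField": "movie_id",  # field on the comments side
            "as": "comments",            # name of the resulting array field
        }},
        {"$addFields": {"comments_count": comments_count}},
        # $expr allows an aggregation expression inside $match.
        {"$match": {"$expr": {"$gt": [comments_count, min_comments]}}},
        {"$limit": 1},
    ]


stages = movies_with_comments_stages(2)
# With a live connection, this list could be run as:
# results = list(db["movies"].aggregate(stages))
```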

Download files

Source Distribution

monggregate-0.17.0.tar.gz (101.6 kB)

Uploaded Source

Built Distribution

monggregate-0.17.0-py3-none-any.whl (150.6 kB)

Uploaded Python 3

File details

Details for the file monggregate-0.17.0.tar.gz.

File metadata

  • Download URL: monggregate-0.17.0.tar.gz
  • Upload date:
  • Size: 101.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for monggregate-0.17.0.tar.gz
Algorithm    Hash digest
SHA256       5b5ec83ee0a954c8dece7e4e34741d290c8f4b61592a31691f4bd15ba19e5423
MD5          9ca4e6f68d2961811f6d6fe369da0713
BLAKE2b-256  7b9550d2b69594f3e071573eb6c1d1592b1684c21d2173d0f60cb31515f69d69

File details

Details for the file monggregate-0.17.0-py3-none-any.whl.

File metadata

  • Download URL: monggregate-0.17.0-py3-none-any.whl
  • Upload date:
  • Size: 150.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for monggregate-0.17.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       de0b689741e0b8c53ffce098f011b93dbfcf2010eef6290a537431efb4225f39
MD5          55afe66b4aa88c6aa7f5afd503d31c4f
BLAKE2b-256  a8ca8ff624aed7352a3b331b691d0bdf65dcf5b5b2c8a80191e026d6785a5a77
