Basic MongoDB wrapper for object-oriented collection handling
Python Basic Utilities - Mongo pbumongo
Available on PyPI
Table of Contents
- Installation
- Usage
- Classes
- AbstractMongoStore - abstract class for handling MongoDB collection access
- MongoConnection - a helper class to assist with creating multiple store instances
- AbstractMongoDocument - abstract class for wrapping MongoDB BSON documents
- ProgressUpdater - a collection of classes to help with updating job progress
- Archives
Installation
Install via pip:
pip install pbumongo
Usage
It is good practice to associate a sub-class of AbstractMongoDocument with a sub-class of AbstractMongoStore. This is
done through the deserialised_class parameter in the super() constructor call of the store class. Every method that
queries documents uses that class to deserialise the BSON document into an instance of the provided class, which
should extend AbstractMongoDocument.
Example: let's say we want to implement access to a collection containing user documents. We'll define a class User
that extends AbstractMongoDocument and a class UserStore that extends AbstractMongoStore.
# main imports
from pbumongo import AbstractMongoDocument, AbstractMongoStore
# supporting imports
import crypt
from typing import List, Optional
from time import time

# this is an example of a minimum viable class
class User(AbstractMongoDocument):
    def __init__(self):
        super().__init__()
        # define attributes with meaningful defaults
        self.username: str = None
        self.password: str = None
        self.permissions: List[str] = []
        self.last_login: int = 0

    def get_attribute_mapping(self) -> dict:
        # the values are what is used inside MongoDB documents
        return {
            "username": "username",
            "password": "password",
            "permissions": "permissions",
            "last_login": "lastLogin"
        }

    @staticmethod
    def from_json(json: dict):
        user = User()
        user.extract_system_fields(json)
        return user

class UserStore(AbstractMongoStore):
    def __init__(self, mongo_url, mongo_db, collection_name):
        super().__init__(mongo_url, mongo_db, collection_name, deserialised_class=User, data_model_version=1)

    def login(self, username, password) -> Optional[User]:
        # encrypt the password!
        pw_encrypted = crypt.crypt(password, crypt.METHOD_MD5)
        user: Optional[User] = self.query_one({"username": username, "password": pw_encrypted})
        if user is not None:
            # update last_login attribute and save it in database as well
            user.last_login = round(time())
            self.update_one(AbstractMongoStore.id_query(user.id),
                            AbstractMongoStore.set_update("lastLogin", user.last_login))
        return user

    def create_user(self, username, password) -> User:
        # check if this user already exists
        existing = self.query_one({"username": username})
        if existing is not None:
            raise ValueError(f"User with username '{username}' already exists.")
        # create new user object
        user = User()
        user.username = username
        user.password = crypt.crypt(password, crypt.METHOD_MD5)
        # store in database and return document
        user_id = self.create(user)
        return self.get(user_id)
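A caveat on the password handling in the example above: when `crypt.crypt` receives a `crypt.METHOD_*` constant as its second argument, it generates a fresh random salt on every call, so the hash computed in `login` will not equal the one stored by `create_user`. The `crypt` module is also deprecated since Python 3.11 and removed in Python 3.13. The following is a minimal stdlib-only sketch of one alternative, using `hashlib.pbkdf2_hmac` with a salt stored next to the hash. The helper names `hash_password` and `verify_password`, the iteration count and the `salt$hash` storage format are assumptions for illustration and are not part of pbumongo.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 100_000  # assumed work factor, tune for your environment


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Return 'salt_hex$hash_hex', suitable for storing in the user document."""
    if salt is None:
        salt = os.urandom(16)  # new random salt for every new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    salt_hex, _ = stored.split("$", 1)
    candidate = hash_password(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, stored)


stored = hash_password("mypassword")
print(verify_password("mypassword", stored))
print(verify_password("wrong", stored))
```

With a scheme like this, `login` would look the user up by username only, e.g. `self.query_one({"username": username})`, and then call `verify_password(password, user.password)` instead of putting the hash into the query.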
MongoConnection
To use these classes in your application, you can use the MongoConnection helper or create the UserStore instance
directly. The MongoConnection helper is useful when you have a lot of collections and don't want to repeat the Mongo
connection URL and DB name for every constructor.
from pbumongo import MongoConnection
from mypackage import UserStore # see implementation above
con = MongoConnection("mongodb://localhost:27017", "myDbName")
user_store = con.create_store(store_class=UserStore, collection_name="users")
user = user_store.login(username="admin", password="mypassword")
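The idea behind the helper can be sketched without a database: one object remembers the URL and DB name and hands them to each store constructor. `FakeStore` and `FakeConnection` below are hypothetical stand-ins, not pbumongo classes, and it is an assumption that `create_store` forwards its arguments to the store constructor in this way.

```python
class FakeStore:
    """Stand-in for a store sub-class; it only records its constructor arguments."""

    def __init__(self, mongo_url, mongo_db, collection_name):
        self.mongo_url = mongo_url
        self.mongo_db = mongo_db
        self.collection_name = collection_name


class FakeConnection:
    """Holds the connection details once and reuses them for every store."""

    def __init__(self, mongo_url, mongo_db):
        self.mongo_url = mongo_url
        self.mongo_db = mongo_db

    def create_store(self, store_class, collection_name):
        # assumed behaviour: forward URL, DB name and collection name to the store
        return store_class(self.mongo_url, self.mongo_db, collection_name)


con = FakeConnection("mongodb://localhost:27017", "myDbName")
users = con.create_store(store_class=FakeStore, collection_name="users")
invoices = con.create_store(store_class=FakeStore, collection_name="invoices")
print(users.collection_name, invoices.collection_name, invoices.mongo_db)
```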
Classes
AbstractMongoStore
This is an abstract class and cannot be instantiated directly. Instead, define a class that extends this class.
Constructor
__init__(mongo_url, mongo_db, collection_name, deserialised_class, data_model_version=1, archive_store)
- mongo_url - this is the Mongo connection URL containing the host, port and optional username, password
- mongo_db - this is the Mongo DB name - the one you provide when using use <dbname> on the Mongo shell
- collection_name - the name of the collection - e.g. myCollection for db.myCollection.find({}) on the Mongo shell
- deserialised_class - used for all the query methods to deserialise the BSON document into a class with attributes for easier access
- data_model_version - a number that can be used for database migration as an app develops over time
Methods
- get(doc_id: str) - fetches a single document with a matching doc_id == document["_id"]
- get_all() - fetches the entire collection content and deserialises every document. Careful, this is not an iterator, but returns a list of all the documents and can consume quite a bit of compute and memory.
- create(document) - creates a new document and returns the _id of the newly created BSON document as string. The document can be either dict or an instance of the deserialised_class provided in the super().__init__(..) call.
  - Since version 1.0.1 a new parameter is available: create(document, return_doc=True) will return the entire document/object instead of just the _id of the newly created document.
- query_one(query: dict) - fetches a single document and deserialises it or returns None if no document can be found
- query(query: dict, sorting, paging) - fetches multiple documents and deserialises them. sorting can be an attribute name (as provided in the BSON) or a dictionary with the sort order. paging is an instance of pbumongo.PagingInformation.
- update_one(query: dict, update: dict) - proxies the db.collection.updateOne(..) function from the Mongo shell
- update(query: dict, update: dict) - same as update_one, but will update multiple documents, if the query matches
- update_full(document) - shortcut for updating the entire document with an updated version, the query will be constructed from the id/_id provided by the document.
- delete(doc_id) - deletes a single document with the provided document ID
- delete_many(query: dict) - deletes multiple documents matching the query.
- set_archive_store(archive_store: AbstractMongoStore) - pass another store instance used for backups/archives, should also be used to create indexes in the main store - see Archives
- run_archive(options: Optional) - can be implemented by the sub-class, by default does nothing. Options can be anything the implementing class wants
Static Methods
- AbstractMongoStore.id_query(string_id: str) - creates a query { "_id": ObjectId(string_id) }, which can be used to query the database
- AbstractMongoStore.set_update(keys, values) - creates a $set update statement. If only a single attribute is updated, you can pass them directly as parameters, e.g. updating a key "checked": True, can be done by .set_update("checked", True). If you update multiple attributes provide them as list in the matching order.
- AbstractMongoStore.unset_update(keys) - creates an $unset update statement with the attributes listed as keys. Similarly to .set_update, you can provide a single key without a list for ease of use.
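To make the shape of these update statements concrete, here is a minimal sketch of stand-alone functions that behave as the descriptions of set_update and unset_update suggest. They are illustrations only, not the library's implementation. Using an empty string as the `$unset` value is an assumption (MongoDB ignores that value), and so is the error raised on mismatched list lengths.

```python
from typing import Any, List, Union


def set_update(keys: Union[str, List[str]], values: Any) -> dict:
    """Build a $set statement from a single key/value or two matching lists."""
    if isinstance(keys, str):
        # single attribute: wrap key and value so both cases share one code path
        keys, values = [keys], [values]
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return {"$set": dict(zip(keys, values))}


def unset_update(keys: Union[str, List[str]]) -> dict:
    """Build an $unset statement for a single key or a list of keys."""
    if isinstance(keys, str):
        keys = [keys]
    return {"$unset": {key: "" for key in keys}}


print(set_update("checked", True))
print(set_update(["checked", "lastLogin"], [True, 1700000000]))
print(unset_update("password"))
```

The resulting dictionaries have the form expected as the update argument of update_one and update.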
AbstractMongoDocument
This is an abstract class and cannot be instantiated directly. Instead, define a class that extends this class.
Constructor
__init__(doc_id=None, data_model_version=None)
The parameters are entirely optional. Generally it is recommended to use the static method from_json(json: dict) to
create objects from BSON documents you've loaded from the database instead of calling the constructor. For new
documents, you would not provide the _id, as the store class handles that.
Methods
For methods and static methods please see the documentation of JsonDocument from pbu. AbstractMongoDocument
extends that class.
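As an illustration of what an attribute mapping like the one returned by get_attribute_mapping() is for, the sketch below translates between Python attribute names and document keys using plain functions. `UserSketch`, `to_document` and `from_document` are hypothetical names, and this is not how pbu's JsonDocument is implemented. It only shows the idea of a mapping whose keys are attribute names and whose values are the keys used inside MongoDB documents.

```python
from typing import Any, Dict


class UserSketch:
    """Plain class with the same kind of mapping as the User example."""

    def __init__(self):
        self.username = None
        self.last_login = 0

    def get_attribute_mapping(self) -> Dict[str, str]:
        # attribute name -> key used inside the MongoDB document
        return {"username": "username", "last_login": "lastLogin"}


def to_document(obj) -> Dict[str, Any]:
    """Serialise mapped attributes under their document keys."""
    return {key: getattr(obj, attr) for attr, key in obj.get_attribute_mapping().items()}


def from_document(obj, doc: Dict[str, Any]):
    """Copy values from the document onto the mapped attributes, if present."""
    for attr, key in obj.get_attribute_mapping().items():
        if key in doc:
            setattr(obj, attr, doc[key])
    return obj


user = from_document(UserSketch(), {"username": "admin", "lastLogin": 1700000000})
print(user.username, user.last_login)
print(to_document(user))
```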
ProgressUpdater
The ProgressUpdater class is part of a set of classes that assist with keeping track of job progress. The classes
are:
- ProgressObject: a database object with fields for a status (see pbu > JobStatus), start and end timestamp, total count, processed count, a list of errors and a main error.
- ProgressObjectStore: an abstract class that provides store methods to update status, progress and errors of a ProgressObject
- ProgressError: a JSON document containing an error message as well as a dictionary for data related to the error. These objects will be appended to a ProgressObject's errors list.
- ProgressUpdater: an object to pass into a processor, which holds references to the progress store and progress object and provides methods for updating progress and handling errors.
Both ProgressObject and ProgressObjectStore are abstract classes and should be extended with the remaining attributes
of a process / job definition (like a name/label, extra configuration, etc.). ProgressObject is an
AbstractMongoDocument and ProgressObjectStore is an AbstractMongoStore.
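The division of labour between these classes can be sketched with in-memory stand-ins. Everything below is hypothetical: the class names `ProgressSketch` and `UpdaterSketch`, the method names, the attribute names and the status labels are assumptions and do not reflect the actual pbumongo API. A real updater would also persist each change through the progress store, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from time import time
from typing import List, Optional


@dataclass
class ProgressSketch:
    """Holds the fields listed for a progress object (names are assumed)."""
    status: str = "PENDING"
    start_ts: Optional[int] = None
    end_ts: Optional[int] = None
    total_count: int = 0
    processed_count: int = 0
    errors: List[dict] = field(default_factory=list)
    main_error: Optional[str] = None


class UpdaterSketch:
    """Object handed to a processor so it can report progress and errors."""

    def __init__(self, progress: ProgressSketch):
        self.progress = progress

    def start(self, total_count: int) -> None:
        self.progress.status = "RUNNING"
        self.progress.start_ts = round(time())
        self.progress.total_count = total_count

    def processed(self, count: int = 1) -> None:
        self.progress.processed_count += count

    def add_error(self, message: str, data: Optional[dict] = None) -> None:
        # error message plus a dictionary of related data
        self.progress.errors.append({"message": message, "data": data or {}})

    def finish(self) -> None:
        self.progress.end_ts = round(time())
        self.progress.status = "FINISHED"


def process(items, updater: UpdaterSketch) -> None:
    """Example processor: records an error for negative numbers, keeps going."""
    updater.start(len(items))
    for item in items:
        if item < 0:
            updater.add_error("negative value", {"item": item})
        updater.processed()
    updater.finish()


progress = ProgressSketch()
process([1, -2, 3], UpdaterSketch(progress))
print(progress.status, progress.processed_count, len(progress.errors))
```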
Archives
Since 1.3.0 each AbstractMongoStore provides an interface for archives/backups with the following goals in mind:
- In some cases, when a collection contains lots of documents and you have a few indexes for faster queries running, MongoDB's memory consumption can get quite high. So it can make sense to archive older documents in a separate store/collection that is identical, but doesn't have these indexes.
- The main store should have access to its own archive.
- It's possible to provide a different store class as archive store.
Usage:
from pbumongo import MongoConnection
from my_stores import InvoiceStore

con = MongoConnection("mongodb://localhost:27017", "myDbName")
invoice_store = con.create_store(
    store_class=InvoiceStore,
    collection_name="invoices"
).set_archive_store(
    con.create_store(
        store_class=InvoiceStore,
        collection_name="invoicesArchive"
    )
)
This creates 2 instances of InvoiceStore, each with their own collection name. The invoice_store (the main store)
knows about its archive store and can access it as self.archive_store.
The second instance of InvoiceStore (the archive store) can detect whether it is the archive by checking:
if self.archive_store is None.
The set_archive_store method is only called for the main store, which makes it the best place to create indexes
instead of doing this in the constructor.
class InvoiceStore(AbstractMongoStore):
    def set_archive_store(self, archive_store):
        # create/ensure your indexes
        self.collection.create_index(...)
        self.collection.create_index(...)
        self.collection.create_index(...)
        # ensure to call super() - this will set self.archive_store
        return super().set_archive_store(archive_store)
You can pass a different class as archive_store. I would not recommend doing this, as it complicates things if you
also want to use the archive for lookups when your main store does not return any results (e.g. for a start/end date
query, see example below). This can however be mitigated by the archive store translating its own document structure
into the structure of the main store.
By default no other method will use the archive store. It is purely there for convenience, and so is the run_archive() method.
Archive lookups can be expensive and slow down a system. Decide deliberately which queries may fall back to the archive store, and allow for longer query times when they do, as the archive shouldn't have the same indexes as the main store.
Example of query that uses the archive store:
class MyStore(AbstractMongoStore):
    def find_by_dates(self, start: int, end: int):
        query = {"timestamp": {"$gte": start, "$lte": end}}
        result = self.query(query)
        if len(result) == 0 and self.archive_store is not None:
            # regular query did not return anything, proxy the query to the archive store
            return self.archive_store.find_by_dates(start, end)
        return result
You can also combine archive results with regular results, provided they map to the same object. The start parameter
in the above example is well suited for this: check whether start < last_archive_date to decide if the archive needs
to be queried at all.
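A minimal sketch of that combination, using an in-memory stand-in instead of a real store so it runs without MongoDB. `InMemoryStore`, `query_range` and the way `last_archive_date` is passed in are assumptions for illustration. How the last archive date is tracked is left to the application.

```python
from typing import List, Optional


class InMemoryStore:
    """Stand-in for a store; documents are plain dicts with a 'timestamp' key."""

    def __init__(self, documents: List[dict], archive_store: Optional["InMemoryStore"] = None):
        self.documents = documents
        self.archive_store = archive_store
        self.lookups = 0  # counts queries, to show when the archive is touched

    def query_range(self, start: int, end: int) -> List[dict]:
        self.lookups += 1
        return [d for d in self.documents if start <= d["timestamp"] <= end]

    def find_by_dates(self, start: int, end: int, last_archive_date: int) -> List[dict]:
        result = self.query_range(start, end)
        # only consult the archive if the requested range reaches back into it
        if self.archive_store is not None and start < last_archive_date:
            result = self.archive_store.query_range(start, end) + result
        return result


archive = InMemoryStore([{"timestamp": 10}, {"timestamp": 20}])
main = InMemoryStore([{"timestamp": 30}, {"timestamp": 40}], archive_store=archive)

mixed = main.find_by_dates(15, 35, last_archive_date=25)
recent = main.find_by_dates(28, 45, last_archive_date=25)
print([d["timestamp"] for d in mixed])
print([d["timestamp"] for d in recent])
```

Skipping the archive when start is not earlier than the last archive date avoids the slow, unindexed lookup for queries that only concern recent documents.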