MongoDB Schema Object - An ODM generated from MongoDB JSON validation schemas

MSO (Mongo Schema Object Library)

MSO is a lightweight Object-Document Mapper (ODM) for MongoDB that allows Python developers to interact with MongoDB collections in an intuitive and Pythonic way. It offers the flexibility of a schema-less database with the convenience of strongly-typed classes, enabling seamless operations on MongoDB collections using familiar Python patterns.


🚀 Key Features:

  • Dynamic Model Generation: Automatically generates Python classes from your MongoDB collection's $jsonSchema.
  • Pythonic API: Use common patterns like save(), find_one(), update_one(), etc.
  • Deeply Nested Models: Supports arbitrarily nested schemas, including arrays of objects.
  • Auto-validation: Ensures types, enums, and structure match your schema.
  • Recursive Object Serialization: Works out-of-the-box with nested documents and arrays.
  • Developer Tools: Includes tree views, schema printers, and class introspection.
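
To make the first and fourth features more concrete, here is a minimal, standard-library-only sketch of how a class could be generated from a $jsonSchema fragment and validate assignments against it. It is not MSO's implementation: BSON_TO_PY, make_model, and PERSON_SCHEMA are hypothetical names chosen for illustration, and only a few BSON types are covered.

```python
# Illustrative sketch only: this is NOT how MSO is implemented.
# BSON_TO_PY, make_model and PERSON_SCHEMA are hypothetical names.

BSON_TO_PY = {"string": str, "int": int, "bool": bool, "double": float}

PERSON_SCHEMA = {
    "bsonType": "object",
    "properties": {
        "name": {"bsonType": "string"},
        "age": {"bsonType": "int"},
        "gender": {"enum": ["Male", "Female", "Other"]},
    },
}


def make_model(class_name, schema):
    """Build a class whose attribute assignments are checked against `schema`."""
    properties = schema.get("properties", {})

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)  # goes through the validating __setattr__

    def __setattr__(self, key, value):
        if key not in properties:
            raise AttributeError(f"{key!r} is not defined in the schema")
        rule = properties[key]
        if "enum" in rule and value not in rule["enum"]:
            raise ValueError(f"{key!r} must be one of {rule['enum']}")
        bson_type = rule.get("bsonType")
        # Only single, known type names are checked in this sketch.
        expected = BSON_TO_PY.get(bson_type) if isinstance(bson_type, str) else None
        if expected is not None and not isinstance(value, expected):
            raise TypeError(f"{key!r} expects {expected.__name__}")
        object.__setattr__(self, key, value)

    def to_dict(self):
        return dict(vars(self))

    namespace = {"__init__": __init__, "__setattr__": __setattr__, "to_dict": to_dict}
    return type(class_name, (), namespace)


Person = make_model("Person", PERSON_SCHEMA)
person = Person(name="Tony Pajama", age=34, gender="Male")
print(person.to_dict())
```

Assigning a value of the wrong type, such as person.age = "34", raises TypeError in this sketch; a value outside the enum raises ValueError.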

📦 Requirements

  • Python 3.12+
  • MongoDB with $jsonSchema validation on your collections

🔧 Installation

pip install mso

Recommended MongoDB Validation Schema Format

{
  $jsonSchema: {
    bsonType: 'object',
    properties: {
      _id: {
        bsonType: 'objectId'
      },
      // ADD YOUR FIELDS HERE
      last_modified: {
        bsonType: [
          'date',
          'null'
        ]
      },
      created_at: {
        bsonType: [
          'date',
          'null'
        ]
      }
    },
    additionalProperties: false
  }
}
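
If you prefer to attach this validator from Python rather than the MongoDB shell, the sketch below shows one possible way. build_validator and apply_validator are hypothetical helpers, not part of MSO; the db argument is assumed to be a pymongo Database, and the only pymongo calls used are list_collection_names(), create_collection(), and command() with collMod.

```python
# Illustrative sketch: build_validator and apply_validator are hypothetical
# helpers, not part of MSO. `db` is assumed to be a pymongo Database.


def build_validator(extra_properties):
    """Return the recommended validator with your own fields merged in."""
    properties = {"_id": {"bsonType": "objectId"}}
    properties.update(extra_properties)
    properties["last_modified"] = {"bsonType": ["date", "null"]}
    properties["created_at"] = {"bsonType": ["date", "null"]}
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "properties": properties,
            "additionalProperties": False,
        }
    }


def apply_validator(db, collection_name, validator):
    """Create the collection with the validator, or update an existing one."""
    if collection_name in db.list_collection_names():
        db.command("collMod", collection_name, validator=validator)
    else:
        db.create_collection(collection_name, validator=validator)


people_validator = build_validator(
    {"name": {"bsonType": "string"}, "age": {"bsonType": "int"}}
)
print(sorted(people_validator["$jsonSchema"]["properties"]))
```

With a live connection, apply_validator(db, "people", people_validator) would attach the schema to the collection.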

๐Ÿ› ๏ธ Basic Usage

In this basic example we have already created a $jsonSchema validator for the "People" collection in MongoDB. We create a new person, update some information and save the person MongoDB.

from pymongo import MongoClient
from mso.generator import get_model

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]

# Generate a model based on the "people" collection's schema
People = get_model(db, "people")

# Create a new person
person = People(name="Tony Pajama", age=34)

# Add nested data
person.health.primary_physician.name = "Dr. Strange"
person.addresses.add(type="Home", street="123 Elm", city="NY", state="NY", zip="10001")

# Save to the database
person.save()

🧪 View Your Class Tree

People.print_nested_class_tree()

Output

Tree View:
└── people
    ├── name: str
    ├── age: int
    ├── email: str
    ├── gender: enum [Male, Female, Other]
    ├── addresses: List[addresses_item]
    │   ├── type: enum [Home, Business, Other]
    │   ├── street: str
    │   ├── city: str
    │   ├── state: str
    │   └── zip: str
    └── health: Object
        ├── medical_history: Object
        │   ├── conditions: List[conditions_item]
        │   │   ├── name: str
        │   │   ├── diagnosed: str
        │   │   └── medications: List[medications_item]
        │   │       ├── name: str
        │   │       ├── dose: str
        │   │       └── frequency: str
        │   └── allergies: List
        └── primary_physician: Object
            ├── name: str
            └── contact: Object
                ├── phone: str
                └── address: Object
                    ├── street: str
                    ├── city: str
                    ├── state: str
                    └── zip: str
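
A tree like the one above can be produced by walking the schema recursively. The following is a minimal sketch of that idea, not MSO's own printer: render_tree and SCHEMA are hypothetical, and the raw bsonType names are printed instead of Python type names.

```python
# Illustrative sketch only: a recursive walk over a $jsonSchema fragment.
# render_tree and SCHEMA are hypothetical; MSO's own printer may differ.

SCHEMA = {
    "bsonType": "object",
    "properties": {
        "name": {"bsonType": "string"},
        "addresses": {
            "bsonType": "array",
            "items": {
                "bsonType": "object",
                "properties": {"city": {"bsonType": "string"}},
            },
        },
        "health": {
            "bsonType": "object",
            "properties": {"allergies": {"bsonType": "array"}},
        },
    },
}


def render_tree(schema, prefix=""):
    """Return the tree lines for the properties of an object schema."""
    lines = []
    items = list(schema.get("properties", {}).items())
    for index, (name, rule) in enumerate(items):
        last = index == len(items) - 1
        branch = "└── " if last else "├── "
        child_prefix = prefix + ("    " if last else "│   ")
        bson_type = rule.get("bsonType")
        if bson_type == "object":
            lines.append(f"{prefix}{branch}{name}: Object")
            lines.extend(render_tree(rule, child_prefix))
        elif bson_type == "array":
            item_schema = rule.get("items", {})
            if item_schema.get("bsonType") == "object":
                # Arrays of objects get a nested subtree, like addresses above.
                lines.append(f"{prefix}{branch}{name}: List[{name}_item]")
                lines.extend(render_tree(item_schema, child_prefix))
            else:
                lines.append(f"{prefix}{branch}{name}: List")
        else:
            lines.append(f"{prefix}{branch}{name}: {bson_type}")
    return lines


tree_lines = render_tree(SCHEMA)
print("\n".join(tree_lines))
```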

๐Ÿ” Querying the Database

# Find one
person = People.find_one({"name": "Tony Pajama"})

# Find many
person_list = People.find_many(sort=[("created_at", -1)], limit=10)

Document Manipulation

# Delete
person.delete()

# Clone
new_person = person.clone()

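📊 Data Summary & Analysis

MSO includes a .summarize() method for quickly exploring a MongoDB collection. It produces a field-level summary: each field's type, the counts of present and missing values, and the number of unique values, plus the most frequent values for strings and the min, max, mean, median, and standard deviation for numbers.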

โš™๏ธ Options

sample_size: Limit the number of documents to analyze (defaults to all)

top: Number of top strings to return (default: 5)

๐Ÿ” Example

import os
from pymongo import MongoClient
from mso.generator import get_model

# Connect to MongoDB
MONGO_URI = os.getenv("MONGO_URI", "mongodb://localhost:27017")
MONGO_DB = os.getenv("MONGO_DB", "mydb")

client = MongoClient(MONGO_URI)
db = client[MONGO_DB]

# Get the model for the "people" collection
People = get_model(db, "people")

print(People.summarize(top=10))

🧠 Example Output

{
  "sample_size": 1000,
  "fields": {
    "name": {
      "type": "str",
      "count": 1000,
      "missing": 0,
      "unique": 993,
      "top_5": [
        {
          "value": "Tony Pajama",
          "count": 7,
          "percent": 0.007
        },
        ...
      ]
    },
    "age": {
      "type": "int",
      "count": 978,
      "missing": 22,
      "unique": 43,
      "min": 1,
      "max": 99,
      "mean": 38.6,
      "median": 34,
      "stdev": 19.2
    },
    "health.primary_physician.name": {
      "type": "str",
      "count": 1000,
      "missing": 0,
      "unique": 12,
      "top_5": [
        {
          "value": "Dr. Strange",
          "count": 46,
          "percent": 0.046
        },
        ...
      ]
    }
  }
}
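
To clarify how the numbers in such a summary relate to the data, here is a small standard-library sketch that computes similar per-field statistics over a list of flat dictionaries. It is an approximation for illustration, not MSO's implementation: the summarize function below, its "top" key, and the sample documents are all hypothetical, and nested fields are not handled.

```python
# Illustrative sketch only: approximates a field-level summary for flat
# documents. This summarize() is hypothetical, not MSO's .summarize().
from collections import Counter
from statistics import mean, median


def summarize(docs, top=5):
    """Summarize each field across `docs` (a list of flat dicts)."""
    fields = {}
    names = sorted({key for doc in docs for key in doc})
    for name in names:
        # A field counts as missing when it is absent or set to None.
        values = [doc[name] for doc in docs if doc.get(name) is not None]
        info = {
            "count": len(values),
            "missing": len(docs) - len(values),
            "unique": len(set(values)),
        }
        if values and all(isinstance(v, (int, float)) for v in values):
            info.update(
                min=min(values),
                max=max(values),
                mean=mean(values),
                median=median(values),
            )
        else:
            info["top"] = [
                {"value": value, "count": count, "percent": count / len(docs)}
                for value, count in Counter(values).most_common(top)
            ]
        fields[name] = info
    return {"sample_size": len(docs), "fields": fields}


docs = [
    {"name": "Tony Pajama", "age": 34},
    {"name": "Tony Pajama", "age": 40},
    {"name": "Jane Doe"},
]
summary = summarize(docs, top=1)
print(summary)
```

Here percent is the value's count divided by the sample size, which matches the ratios in the example output (7 of 1000 documents gives 0.007).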

๐Ÿ” Document Comparison

MSO makes it easy to compare two MongoDB documentsโ€”either as model instances or dictionariesโ€”using the powerful Model.diff() method. It supports:

  • Deep recursive comparison of nested objects and arrays
  • Detection of value and type changes
  • Flat or nested output formatting
  • Optional strict mode (type-sensitive)
  • Filtering for specific fields or changes

Basic Example

from mso.generator import get_model
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]
People = get_model(db, "people")

# Create a valid model instance
person1 = People(name="Alice", age=30, gender="Female")

# Use a dictionary with type mismatch (age as string)
person2 = {
    "name": "Alice",
    "age": "30",  # string instead of int
    "gender": "Female"
}

diff = People.diff(person1, person2, strict=True)

from pprint import pprint

pprint(diff)

Example Output

{
  'age': {
    'old': 30,
    'new': '30',
    'type_changed': True
  }
}
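
The effect of strict mode is easiest to see when two values compare equal but have different types. The sketch below is a minimal recursive comparison over plain dictionaries that reproduces the shape of the output above. It is not MSO's implementation: this standalone diff function and its path parameter are hypothetical, arrays are compared as whole values, and a missing key is treated the same as None.

```python
# Illustrative sketch only: a minimal recursive dictionary diff.
# This standalone diff() is hypothetical, not MSO's Model.diff().


def diff(old, new, strict=False, path=""):
    """Return a flat {dotted.path: change} mapping between two dicts."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        full_key = f"{path}.{key}" if path else key
        a, b = old.get(key), new.get(key)  # a missing key is treated as None
        if isinstance(a, dict) and isinstance(b, dict):
            changes.update(diff(a, b, strict, full_key))
            continue
        type_changed = type(a) is not type(b)
        # In strict mode, equal values of different types still count.
        if a != b or (strict and type_changed):
            changes[full_key] = {"old": a, "new": b, "type_changed": type_changed}
    return changes


person1 = {"name": "Alice", "age": 30, "stats": {"score": 1}}
person2 = {"name": "Alice", "age": "30", "stats": {"score": 1.0}}

loose = diff(person1, person2)
strict = diff(person1, person2, strict=True)
print(loose)
print(strict)
```

In this sketch the age change is reported in both modes because 30 != "30", while stats.score (1 versus 1.0) is reported only when strict=True.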

Convert to and from dictionary

person_dict = person.to_dict()

โฑ Automatic Timestamps

By default, models automatically include created_at and updated_at fields to track when a document is created or modified. These are managed internally and do not need to be defined in your schema.

🔧 How it works

created_at is set once, when the document is first saved.

updated_at is updated every time the document is modified and saved.

Both are stored as UTC datetime.datetime objects.
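
The rules above can be expressed in a few lines. This is a sketch of the described behavior, assuming a plain dictionary as the document. The stamp helper and its now parameter are hypothetical and are not part of MSO, which manages these fields internally.

```python
# Illustrative sketch of the timestamp rules described above.
# stamp() is a hypothetical helper, not part of MSO.
from datetime import datetime, timezone


def stamp(document, now=None):
    """Return a copy of `document` with created_at / updated_at applied."""
    now = now or datetime.now(timezone.utc)  # timezone-aware UTC datetime
    stamped = dict(document)
    stamped.setdefault("created_at", now)  # set once, on the first save
    stamped["updated_at"] = now  # refreshed on every save
    return stamped


first_save = stamp({"name": "Jane Doe"}, now=datetime(2025, 1, 1, tzinfo=timezone.utc))
second_save = stamp(first_save, now=datetime(2025, 1, 2, tzinfo=timezone.utc))

print(second_save["created_at"].isoformat())
print(second_save["updated_at"].isoformat())
```

After the second call, created_at still holds the first timestamp while updated_at holds the later one.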

🚫 Disabling timestamps

Timestamps are enabled by default. To disable them for a model, set its timestamps_enabled attribute to False, as in the following example.

import os
from pymongo import MongoClient
from mso.generator import get_model

MONGO_URI = os.getenv("MONGO_URI", "mongodb://localhost:27017")
MONGO_DB = os.getenv("MONGO_DB", "mydb")

client = MongoClient(MONGO_URI)
db = client[MONGO_DB]

# Get the model for the collection
People = get_model(db, "people")

# Disable timestamps for a specific model or instance
People.timestamps_enabled = False

🧩 Lifecycle Hooks

You can use decorators like @pre_save, @post_save, @pre_delete, and @post_delete to hook into model lifecycle events. This is useful for setting defaults, cleaning up, triggering logs, or validating conditions.

Example: Automatically output a message when a document is saved

from mso.base_model import MongoModel, pre_save, post_save
from mso.generator import get_model
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]


# Define the model hooks you would like to use
class People(MongoModel):
    @post_save  # This method will be called after the document is saved
    def confirm_save(self):
        print(f"[+] Document saved: {self.name}")


People = get_model(db, "people")

person = People(name="Jane Doe")
person.save()

# Output:
# [+] Document saved: Jane Doe
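
For readers curious how decorator-based hooks can work in general, here is a self-contained sketch that needs no database. It is not MSO's implementation: the pre_save and post_save decorators, the HookedModel class, and the _hook attribute below are hypothetical stand-ins defined only for this example.

```python
# Illustrative sketch only: one generic way to implement lifecycle hooks.
# These decorators and HookedModel are hypothetical, not MSO's own.


def pre_save(func):
    func._hook = "pre_save"  # tag the method so save() can find it
    return func


def post_save(func):
    func._hook = "post_save"
    return func


class HookedModel:
    def __init__(self, **fields):
        self.__dict__.update(fields)
        self.saved = False

    def _run_hooks(self, event):
        # Look through the class hierarchy for methods tagged with `event`.
        for klass in type(self).__mro__:
            for attr in vars(klass).values():
                if getattr(attr, "_hook", None) == event:
                    attr(self)

    def save(self):
        self._run_hooks("pre_save")
        self.saved = True  # a real model would write to MongoDB here
        self._run_hooks("post_save")


events = []


class Person(HookedModel):
    @pre_save
    def normalize(self):
        self.name = self.name.strip()
        events.append("pre_save")

    @post_save
    def confirm_save(self):
        events.append(f"saved {self.name}")


person = Person(name=" Jane Doe ")
person.save()
print(events)
```

The pre_save hook runs before the write and can still change the document, which is why the post_save message sees the trimmed name.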

🔗 Community & Links

PyPI: https://pypi.org/project/MSO/

Reddit: https://www.reddit.com/r/MSO_Mongo_Python_ORM/

GitHub: https://github.com/chuckbeyor101/MSO-Mongo-Schema-Object-Library.git

🛡 LICENSE & COPYRIGHT WARNING

MSO Copyright (c) 2025 by Charles L Beyor
is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.
To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

See the License for the specific language governing permissions and limitations under the License.
