"Tools for using NumPy, Pandas, Polars, and PyArrow with MongoDB"

PyMongoArrow

PyMongoArrow is a companion library to PyMongo that contains tools for loading MongoDB query result sets as Apache Arrow tables, Pandas DataFrames or NumPy arrays.

>>> from pymongoarrow.monkey import patch_all
>>> patch_all()  # adds the find_*_all methods to PyMongo collections
>>> from pymongoarrow.api import Schema
>>> schema = Schema({"_id": int, "qty": float})
>>> from pymongo import MongoClient
>>> client = MongoClient()
>>> result = client.db.data.insert_many(
...     [{"_id": 1, "qty": 25.4}, {"_id": 2, "qty": 16.9}, {"_id": 3, "qty": 2.3}]
... )
>>> data_frame = client.db.data.find_pandas_all({}, schema=schema)
>>> data_frame
   _id   qty
0    1  25.4
1    2  16.9
2    3   2.3
>>> arrow_table = client.db.data.find_arrow_all({}, schema=schema)
>>> # The schema may also be omitted
>>> arrow_table = client.db.data.find_arrow_all({})
>>> arrow_table
pyarrow.Table
_id: int64
qty: double
>>> ndarrays = client.db.data.find_numpy_all({}, schema=schema)
>>> ndarrays
{'_id': array([1, 2, 3]), 'qty': array([25.4, 16.9,  2.3])}

PyMongoArrow is the recommended way to materialize MongoDB query result sets as contiguous-in-memory, typed arrays suited for in-memory analytical processing applications.
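
To make "contiguous-in-memory, typed arrays" concrete, here is a minimal pure-Python sketch that pivots row-oriented documents into one typed column per field. It uses only the standard library `array` module. The helper name `to_columns` and the `TYPE_CODES` mapping are assumptions made for illustration; this is not how PyMongoArrow is implemented.

```python
from array import array

# Hypothetical mapping from Python types to array type codes:
# "q" holds signed integers (64-bit on common platforms), "d" holds 64-bit floats.
TYPE_CODES = {int: "q", float: "d"}


def to_columns(documents, schema):
    """Pivot a list of dict documents into one typed array per field."""
    columns = {name: array(TYPE_CODES[typ]) for name, typ in schema.items()}
    for doc in documents:
        for name, typ in schema.items():
            # Coerce each value to the declared type before appending.
            columns[name].append(typ(doc[name]))
    return columns


docs = [{"_id": 1, "qty": 25.4}, {"_id": 2, "qty": 16.9}, {"_id": 3, "qty": 2.3}]
columns = to_columns(docs, {"_id": int, "qty": float})
print(columns["_id"].tolist())
print(columns["qty"].tolist())
```

Each column ends up as a single homogeneous buffer rather than a list of per-document dictionaries, which is the layout that analytical tools generally prefer.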

Installing PyMongoArrow

PyMongoArrow is available on PyPI:

python -m pip install pymongoarrow

To use PyMongoArrow with MongoDB Atlas' mongodb+srv:// URIs, you also need to install PyMongo with the srv extra:

python -m pip install 'pymongo[srv]' pymongoarrow
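
As a rough pre-flight check, the sketch below tests whether a connection string uses the mongodb+srv:// scheme and therefore calls for the srv extra. The helper name `needs_srv_extra` and the example host names are hypothetical; only the standard library is used.

```python
from urllib.parse import urlsplit


def needs_srv_extra(uri):
    """Return True when the URI uses the mongodb+srv:// scheme."""
    return urlsplit(uri).scheme.lower() == "mongodb+srv"


# Hypothetical connection strings, used only to exercise the helper.
print(needs_srv_extra("mongodb+srv://cluster0.example.mongodb.net/db"))
print(needs_srv_extra("mongodb://localhost:27017/db"))
```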

To use the PyMongoArrow APIs that return query result sets as pandas DataFrame instances, you also need the pandas package:

python -m pip install pandas
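
One way to confirm that pandas is importable before calling the DataFrame-returning APIs is sketched below. The helper name `pandas_available` is hypothetical; the check relies only on the standard library.

```python
import importlib.util


def pandas_available():
    """Return True when the pandas package can be found on the import path."""
    return importlib.util.find_spec("pandas") is not None


if not pandas_available():
    print("pandas is missing; run: python -m pip install pandas")
```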

Note: pymongoarrow is not supported or tested on big-endian systems (e.g. Linux s390x).
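
To find out whether the current interpreter runs on a little-endian platform, a quick standard-library check such as the following may help. The helper name `is_little_endian` is an assumption for illustration.

```python
import sys


def is_little_endian():
    """Return True when the host byte order is little-endian."""
    return sys.byteorder == "little"


if not is_little_endian():
    print("Big-endian platform detected; pymongoarrow is not supported or tested here.")
```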

Development Install

See the instructions on Read the Docs.

Documentation

Full documentation is available on Read the Docs.

Download files

Source Distribution

pymongoarrow-1.3.0.tar.gz (54.5 kB)

Built Distributions

pymongoarrow-1.3.0-cp312-cp312-win_amd64.whl (231.8 kB)
pymongoarrow-1.3.0-cp311-cp311-win_amd64.whl (236.4 kB)
pymongoarrow-1.3.0-cp310-cp310-win_amd64.whl (236.1 kB)
pymongoarrow-1.3.0-cp39-cp39-win_amd64.whl (236.2 kB)
pymongoarrow-1.3.0-cp38-cp38-win_amd64.whl (237.4 kB)