"Tools for using NumPy, Pandas and PyArrow with MongoDB"

Project description

Info: A companion library to PyMongo that makes it easy to move data between MongoDB and Apache Arrow. See GitHub for the latest source.
Documentation: Available at mongo-arrow.readthedocs.io.
Author: Prashant Mital

PyMongoArrow is a companion library to PyMongo that contains tools for loading MongoDB query result sets as Apache Arrow tables, Pandas DataFrames or NumPy arrays.

>>> from pymongoarrow.monkey import patch_all
>>> patch_all()
>>> from pymongoarrow.api import Schema
>>> schema = Schema({'_id': int, 'qty': float})
>>> from pymongo import MongoClient
>>> client = MongoClient()
>>> client.db.test.insert_many([{'_id': 1, 'qty': 25.4}, {'_id': 2, 'qty': 16.9}, {'_id': 3, 'qty': 2.3}])
>>> data_frame = client.db.test.find_pandas_all({}, schema=schema)
>>> data_frame
   _id   qty
0    1  25.4
1    2  16.9
2    3   2.3
>>> arrow_table = client.db.test.find_arrow_all({}, schema=schema)
>>> arrow_table
pyarrow.Table
_id: int64
qty: double
>>> ndarrays = client.db.test.find_numpy_all({}, schema=schema)
>>> ndarrays
{'_id': array([1, 2, 3]), 'qty': array([25.4, 16.9,  2.3])}

PyMongoArrow is the recommended way to materialize MongoDB query result sets as contiguous-in-memory, typed arrays suited for in-memory analytical processing applications.
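
To illustrate what "contiguous-in-memory, typed arrays" means, here is a minimal conceptual sketch that uses only the Python standard library. It is not how PyMongoArrow is implemented. The helper documents_to_columns and the TYPECODES mapping are hypothetical names introduced for this example. It turns row-oriented documents into one typed column per schema field.

```python
from array import array

# Hypothetical mapping from Python types to array typecodes (an assumption
# for this sketch): 'q' is a signed integer of at least 64 bits, 'd' is a double.
TYPECODES = {int: "q", float: "d"}


def documents_to_columns(documents, schema):
    """Convert row-oriented documents into one typed array per schema field."""
    columns = {name: array(TYPECODES[py_type]) for name, py_type in schema.items()}
    for document in documents:
        for name, py_type in schema.items():
            # Coerce each value to the declared type before storing it.
            columns[name].append(py_type(document[name]))
    return columns


documents = [
    {"_id": 1, "qty": 25.4},
    {"_id": 2, "qty": 16.9},
    {"_id": 3, "qty": 2.3},
]
columns = documents_to_columns(documents, {"_id": int, "qty": float})
```

Each column stores its values in a single typed buffer rather than as separate Python objects scattered across dictionaries, which is the layout that analytical libraries such as NumPy, pandas and PyArrow operate on efficiently.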

Installing PyMongoArrow

PyMongoArrow is available on PyPI:

$ python -m pip install pymongoarrow

To use PyMongoArrow with MongoDB Atlas’ mongodb+srv:// URIs, you will also need to install PyMongo with the srv extra:

$ python -m pip install 'pymongo[srv]' pymongoarrow
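
As a rough sketch, an application could check up front whether a connection string needs the srv extra and whether that dependency is importable. The helper names and the example hostname below are hypothetical. The check assumes the srv extra provides the dnspython package, which is imported as dns.

```python
from importlib.util import find_spec


def needs_srv_extra(uri):
    """Return True if the connection string uses the mongodb+srv:// scheme."""
    return uri.lower().startswith("mongodb+srv://")


def srv_support_installed():
    """Return True if the dns module (dnspython) can be imported."""
    return find_spec("dns") is not None


# Hypothetical Atlas-style URI, used only for illustration.
uri = "mongodb+srv://cluster0.example.mongodb.net/"
if needs_srv_extra(uri) and not srv_support_installed():
    print("Install the srv extra: python -m pip install 'pymongo[srv]'")
```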

To use PyMongoArrow APIs that return query result sets as pandas DataFrame instances, you will also need to have the pandas package installed:

$ python -m pip install pandas

Development Install

See the instructions on Read the Docs.

Documentation

Full documentation is available on Read the Docs.

Download files

Source Distribution

pymongoarrow-0.4.0.tar.gz (30.3 kB)

Built Distributions

pymongoarrow-0.4.0-cp310-cp310-win_amd64.whl (7.6 MB)
pymongoarrow-0.4.0-cp39-cp39-win_amd64.whl (7.6 MB)
pymongoarrow-0.4.0-cp38-cp38-win_amd64.whl (7.6 MB)