Transfer data between pandas dataframes and MongoDB
Overview
This package allows you to read/write pandas dataframes in MongoDB in the simplest way possible.
Free software: MIT license
Quick Start
Install pdmongo:
pip install pdmongo
Write a pandas DataFrame to a MongoDB collection:
    import pandas as pd
    import pdmongo as pdm

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
    df.to_mongo("MyCollection", "mongodb://localhost:27017/mydb")
Read a MongoDB collection into a pandas DataFrame:
    import pdmongo as pdm

    df = pdm.read_mongo("MyCollection", [], "mongodb://localhost:27017/mydb")
    print(df)
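Both snippets above pass a connection string whose last path segment (mydb) appears to select the database. If the server requires credentials, the username and password have to be percent-encoded before they go into the URI. The helper below is a minimal sketch using only the standard library; build_mongo_uri and its parameters are hypothetical names for illustration and are not part of pdmongo.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port, database, username=None, password=None):
    """Return a mongodb:// URI for the given host, port and database.

    Hypothetical helper for illustration; it is not part of pdmongo.
    """
    credentials = ""
    if username is not None and password is not None:
        # Reserved characters such as '@', ':' and '/' must be percent-encoded.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return f"mongodb://{credentials}{host}:{port}/{database}"


uri = build_mongo_uri("localhost", 27017, "mydb")
print(uri)  # mongodb://localhost:27017/mydb
```

The returned string can be used wherever the examples pass "mongodb://localhost:27017/mydb".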
Examples / use cases
Reading a MongoDB collection into a pandas data frame (aggregation query)
You can use an aggregation query to filter or transform data in MongoDB before fetching it into a data frame. This delegates the slow operation to MongoDB, so only the filtered results are loaded into pandas.
Reading a collection from MongoDB into a pandas DataFrame by using an aggregation query:
    import pandas as pd
    import pdmongo as pdm

    # First generate some data and write it to MongoDB
    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
    df.to_mongo("MyCollection", "mongodb://localhost:27017/mydb")

    # Filter with an aggregation query and load the results into a data frame
    query = [{"$match": {'A': 1}}]
    df = pdm.read_mongo("MyCollection", query, "mongodb://localhost:27017/mydb")
    print(df)  # Only rows where A == 1 are returned
The query argument takes an aggregation pipeline in the same form as the one passed to the aggregate method of the pymongo package.
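Because the query is an ordinary list of stage dictionaries, it can be assembled in plain Python before it is handed to read_mongo. The sketch below builds a pipeline that keeps documents where A is greater than a threshold and drops the _id field. $match, $gt and $project are standard MongoDB aggregation operators; build_pipeline and load_filtered are hypothetical helper names, and the pdmongo import is deferred so the pipeline can be inspected without a running database.

```python
def build_pipeline(min_a):
    """Build an aggregation pipeline keeping documents with A > min_a."""
    return [
        # $match filters documents on the server before they are transferred.
        {"$match": {"A": {"$gt": min_a}}},
        # $project with 0 excludes the _id field from the results.
        {"$project": {"_id": 0}},
    ]


def load_filtered(collection, uri, min_a):
    """Run the pipeline through pdmongo (assumes a reachable MongoDB server)."""
    import pdmongo as pdm  # deferred: only needed when actually querying

    return pdm.read_mongo(collection, build_pipeline(min_a), uri)


print(build_pipeline(1))
```

With a server running, load_filtered("MyCollection", "mongodb://localhost:27017/mydb", 1) would be expected to return only the rows where A is greater than 1.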
Write a MongoDB collection to a PostgreSQL table
You can write a MongoDB collection to a PostgreSQL table:
    import pandas as pd
    import pdmongo as pdm
    from sqlalchemy import create_engine

    # Generate some data and write it to MongoDB
    df = pd.DataFrame({'A': [1, 2, 3]})
    df.to_mongo("MyCollection", "mongodb://localhost:27017/mydb")

    # Read the data back from MongoDB and write it to PostgreSQL
    new_df = pdm.read_mongo("MyCollection", [], "mongodb://localhost:27017/mydb")
    # SQLAlchemy 1.4+ only accepts the "postgresql://" scheme, not "postgres://"
    engine = create_engine('postgresql://postgres:postgres@localhost:5432', echo=False)
    new_df[["A"]].to_sql("APostgresTable", engine)
Plot data retrieved from a MongoDB Collection
You can plot a collection retrieved from MongoDB:

    import numpy as np
    import pandas as pd
    import pdmongo as pdm
    import matplotlib.pyplot as plt

    # Generate data and write it to MongoDB
    df = pd.DataFrame({'Value': np.random.randn(1000)})
    df.to_mongo('TimeSeries', 'mongodb://localhost:27017/mydb')

    # Read the collection from MongoDB and plot the data
    new_df = pdm.read_mongo("TimeSeries", [], "mongodb://localhost:27017/mydb")
    new_df.plot()
    plt.show()
Installation
pip install pdmongo
You can also install the in-development version with:
pip install https://github.com/pakallis/python-pandas-mongo/archive/master.zip
Documentation
You can find the documentation at:
https://python-pandas-mongo.readthedocs.io/
Development
To run all the tests, run:
tox
Note, to combine the coverage data from all the tox environments run:
Platform | Commands |
---|---|
Windows | set PYTEST_ADDOPTS=--cov-append (then run tox as a separate command) |
Other | PYTEST_ADDOPTS=--cov-append tox |
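The same effect can be had from a small Python wrapper that sets the variable for the child process only, which sidesteps the shell differences above. This is a sketch using only the standard library; cov_append_env and run_tox are hypothetical names, and run_tox assumes that tox is available on the PATH.

```python
import os
import subprocess


def cov_append_env(base_env=None):
    """Return a copy of the environment with PYTEST_ADDOPTS=--cov-append."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTEST_ADDOPTS"] = "--cov-append"
    return env


def run_tox():
    """Run tox so that coverage data is appended across environments."""
    # check=False: return tox's exit code instead of raising on failure.
    return subprocess.run(["tox"], env=cov_append_env(), check=False).returncode


# Show the value the child process would see.
print(cov_append_env({})["PYTEST_ADDOPTS"])
```

Calling run_tox() from a script then behaves the same on Windows and on other platforms, because the variable is passed directly to the child process rather than through the shell.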
Changelog
0.3.4 (2022-11-17)
Support for Python 3.7-3.10
Fix wrong version of Python in CI
0.3.3 (2022-11-17)
Restrict pandas to >=0.20,<1.6
Restrict pymongo to >=13,<4.4
Remove hypothesis
Run tests with tox in CI
Add flake8 checks in CI
0.2.3 (2022-11-12)
Add prepare release script
0.2.2 (2022-11-12)
Fix lint offenses
0.2.1 (2022-11-12)
Minor changes
0.2.0 (2022-11-12)
Add compatibility for pymongo 4+
0.1.0 (2020-05-05)
Added static typing
Added mypy to Travis CI
Removed unnecessary params
0.0.2 (2020-05-04)
Dropped support for pypy3
0.0.1 (2020-04-30)
Added read_mongo and basic support for reading MongoDB collections into pandas dataframes
Added to_mongo and basic support for writing pandas dataframes in MongoDB collections
0.0.0 (2020-03-22)
First release on PyPI.