Pandas-NoSQL
Pandas-NoSQL adds read and write capabilities to pandas for several NoSQL databases.
Pandas read and write methods for:
- MongoDB
- Elasticsearch
- Redis
- Apache Cassandra
Installation
$ pip install pandas-nosql
Documentation
MongoDB
# Read a MongoDB collection into a DataFrame
# Defaults: host='localhost' port=27017
# For nested collections use normalize=True
import pandas as pd
import pandas_nosql
df = pd.read_mongo(database = 'test_db', collection = 'test_col', normalize = True)
# Write a DataFrame to a MongoDB collection
# The mode parameter helps prevent accidental overwrites ('a' appends, 'w' overwrites)
df.to_mongo(database = 'test_db', collection = 'test_col2', mode = 'a')
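The normalize=True option presumably flattens nested documents in the style of pandas' own json_normalize. A minimal sketch of that flattening behavior, using hypothetical documents rather than a live MongoDB collection:

```python
import pandas as pd

# Hypothetical nested documents, like those a MongoDB collection might hold
docs = [
    {"name": "alice", "address": {"city": "Oslo", "zip": "0150"}},
    {"name": "bob", "address": {"city": "Bergen", "zip": "5003"}},
]

# pd.json_normalize flattens nested fields into dotted column names
df = pd.json_normalize(docs)
print(list(df.columns))  # ['name', 'address.city', 'address.zip']
```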
Elasticsearch
import pandas as pd
import pandas_nosql
# Read an Elasticsearch index into a DataFrame
# To access localhost:
# * hosts='https://localhost:9200'
# * verify_certs=False
# If "xpack.security.enabled: false" is set, use http instead of https
# To split out _source use split_source=True
elastic_cols = ('make', 'model', 'purchase_date', 'miles')
df = pd.read_elastic(
hosts = 'https://localhost:9200',
username = 'elastic',
password = 'strong_password',
index = 'test_index',
fields = elastic_cols,
verify_certs = False,
split_source = True
)
# Write a DataFrame to an Elasticsearch index
df.to_elastic(
hosts = 'https://localhost:9200',
username = 'elastic',
password = 'strong_password',
index = 'test_index',
mode = 'w'
)
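Raw Elasticsearch hits keep their document fields nested under `_source`; splitting it out presumably means flattening that dictionary into top-level columns. A sketch of that idea with hypothetical hits (not the library's actual implementation):

```python
import pandas as pd

# Hypothetical raw Elasticsearch hits; real responses nest fields under '_source'
hits = [
    {"_id": "1", "_source": {"make": "Toyota", "model": "Corolla", "miles": 42000}},
    {"_id": "2", "_source": {"make": "Honda", "model": "Civic", "miles": 15000}},
]

# Flatten '_source' into '_source.make', '_source.model', ... then strip the prefix
df = pd.json_normalize(hits)
df.columns = [c.replace("_source.", "") for c in df.columns]
print(list(df.columns))  # ['_id', 'make', 'model', 'miles']
```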
Redis
import pandas as pd
import pandas_nosql
# Read a DataFrame that was written to Redis with the to_redis method
# A DataFrame stored by other means is not guaranteed to read back correctly
# To access localhost:
# * host='localhost'
# To persist the DataFrame in Redis indefinitely use expire_seconds=None
# To set an expiration, pass an integer number of seconds to expire_seconds
df = pd.read_redis(host = 'localhost', port = 6379, redis_key = 'test_key', expire_seconds = None)
# Write a DataFrame to Redis
df.to_redis(host = 'localhost', port = 6379, redis_key = 'test_key')
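Storing a DataFrame in Redis requires serializing it to a string or bytes, which is why only frames written by to_redis are guaranteed to read back correctly. A plausible sketch of such a round trip using a JSON payload (the library's actual wire format is not documented here):

```python
import pandas as pd
from io import StringIO

df = pd.DataFrame({"make": ["Toyota", "Honda"], "miles": [42000, 15000]})

# Serialize to a JSON string, as to_redis might before a Redis SET
payload = df.to_json(orient="records")

# Deserialize, as read_redis might after a Redis GET
restored = pd.read_json(StringIO(payload), orient="records")
assert restored.equals(df)
```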
Apache Cassandra
import pandas as pd
import pandas_nosql
# Read an Apache Cassandra table into a pandas DataFrame
# To access localhost:
# * contact_points=['localhost']
# * contact_points must be a list
df = pd.read_cassandra(
contact_points = ['localhost'],
port = 9042,
keyspace = 'test_keyspace',
table = 'test_table'
)
# Append a DataFrame to an Apache Cassandra table
# DataFrame columns must match the table's columns
df.to_cassandra(
contact_points = ['localhost'],
port = 9042,
keyspace = 'test_keyspace',
table = 'test_table'
)
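Since to_cassandra appends and the DataFrame columns must match the table columns, it can be worth validating the frame before writing. A minimal sketch, where table_columns is a hypothetical schema that in practice would come from Cassandra's system tables:

```python
import pandas as pd

# Hypothetical schema for test_keyspace.test_table
table_columns = ["make", "model", "purchase_date", "miles"]

df = pd.DataFrame({
    "make": ["Toyota"],
    "model": ["Corolla"],
    "purchase_date": ["2020-01-15"],
    "miles": [42000],
})

# Check for mismatches before attempting the append
missing = set(table_columns) - set(df.columns)
extra = set(df.columns) - set(table_columns)
if missing or extra:
    raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
```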