A module that adds read and write capabilities to pandas for several NoSQL databases
Pandas NoSQL
Pandas-NoSQL adds read and write capabilities to pandas for several NoSQL databases.
Import the pandas_nosql module to add read and write methods for:
- MongoDB
- Elasticsearch
- Redis
- Apache Cassandra
MongoDB
# Read a MongoDB collection into a DataFrame
# Defaults: host='localhost', port=27017
# For collections with nested documents, use normalize=True
import pandas as pd
import pandas_nosql
df = pd.read_mongo(database='test_db', collection='test_col', normalize=True)
# Write DataFrame to MongoDB collection
# The modify_collection parameter helps prevent accidental overwrites
df.to_mongo(database='test_db', collection='test_col2', modify_collection=True)
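The README does not spell out what normalize=True does to nested documents. A minimal sketch of the usual idea, assuming it behaves like flattening nested fields into dotted column names (similar in spirit to pandas.json_normalize), is shown below. The `flatten` helper and the sample documents are made up for illustration and are not part of pandas_nosql.

```python
def flatten(doc, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty sub-documents
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical documents, shaped like what a MongoDB collection might hold
docs = [
    {"_id": 1, "owner": {"name": "Ann", "address": {"city": "Oslo"}}, "miles": 1200},
    {"_id": 2, "owner": {"name": "Bob"}, "miles": 800},
]
rows = [flatten(d) for d in docs]
```

Each entry in `rows` is a flat dict, so a list like this can be passed straight to pd.DataFrame; keys missing from a document simply become missing values in that row.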
Elasticsearch
import pandas as pd
import pandas_nosql
# Read an Elastic index into a DataFrame
# Defaults:
# * hosts='https://localhost:9200'
# * verify_certs=False
# To split out _source use normalize=True
elastic_cols = ('make', 'model', 'purchase_date', 'miles')
df = pd.read_elastic(
    hosts='https://localhost:9200',
    username='elastic',
    password='strong_password',
    index='test_index',
    fields=elastic_cols,
    verify_certs=False,
    normalize=True
)
# Write a DataFrame to an Elastic index
df.to_elastic(
    hosts='https://localhost:9200',
    username='elastic',
    password='strong_password',
    index='test_index',
    create_index=True
)
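To make "split out _source" concrete: an Elasticsearch search response nests each document under hits.hits[]._source. The sketch below shows one way such hits could be turned into flat rows restricted to a set of fields. It is an illustration only; `rows_from_hits` and the sample response are hypothetical, and how pandas_nosql does this internally is not documented here.

```python
def rows_from_hits(response, fields=None):
    """Turn the hits of a search response into flat row dicts."""
    rows = []
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        if fields is not None:
            # Keep only the requested fields; absent ones become None
            source = {name: source.get(name) for name in fields}
        rows.append({"_id": hit["_id"], **source})
    return rows


# Made-up response in the shape returned by the Elasticsearch search API
sample_response = {
    "hits": {
        "hits": [
            {"_index": "test_index", "_id": "1",
             "_source": {"make": "Volvo", "model": "V60", "miles": 1200}},
            {"_index": "test_index", "_id": "2",
             "_source": {"make": "Saab", "model": "900", "miles": 88000, "color": "red"}},
        ]
    }
}
elastic_cols = ("make", "model", "purchase_date", "miles")
rows = rows_from_hits(sample_response, fields=elastic_cols)
```

The resulting list of dicts has one key per requested field, which maps directly onto DataFrame columns.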
Redis
import pandas as pd
import pandas_nosql
# Read a DataFrame that was written to Redis with the to_redis method
# Values that were not written with to_redis are not guaranteed to read back correctly
# Defaults:
# * host='localhost'
# * expire_seconds=None
# To persist the DataFrame in Redis use expire_seconds=None
# To set an expiration for the DataFrame pass an integer to expire_seconds
df = pd.read_redis(host='localhost', port=6379, redis_key='test_key', expire_seconds=None)
# Write a DataFrame to Redis
df.to_redis(host='localhost', port=6379, redis_key='test_key')
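The README does not say how to_redis stores a DataFrame. One plausible reason for the caveat above is that the reader and writer have to agree on a serialization format. The sketch below illustrates that idea with a tagged JSON payload. The format tag and the `encode_frame` / `decode_frame` helpers are assumptions made for this example, not the library's actual format.

```python
import json

FORMAT_TAG = "dataframe-split-v1"  # hypothetical marker, not from pandas_nosql


def encode_frame(columns, rows):
    """Pack column names and row values into one JSON byte string."""
    payload = {
        "format": FORMAT_TAG,
        "columns": list(columns),
        "data": [list(row) for row in rows],
    }
    return json.dumps(payload).encode("utf-8")


def decode_frame(raw):
    """Unpack bytes written by encode_frame; reject anything else."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("value is not a JSON payload") from exc
    if not isinstance(payload, dict) or payload.get("format") != FORMAT_TAG:
        raise ValueError("value was not written by encode_frame")
    return payload["columns"], payload["data"]


raw = encode_frame(["make", "miles"], [("Volvo", 1200), ("Saab", 88000)])
columns, data = decode_frame(raw)
```

A value stored under the same key by some other tool would fail the tag check instead of being silently misread, which is the kind of mismatch the caveat warns about.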
Apache Cassandra
import pandas as pd
import pandas_nosql
# Read an Apache Cassandra table into a pandas DataFrame
# contact_points must be a list
df = pd.read_cassandra(
    contact_points=['localhost'],
    port=9042,
    keyspace='test_keyspace',
    table='test_table'
)
# Append a DataFrame to Apache Cassandra
# DataFrame columns must match the table's columns
df.to_cassandra(
    contact_points=['localhost'],
    port=9042,
    keyspace='test_keyspace',
    table='test_table'
)
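Because the DataFrame's columns have to match the table's columns, it can help to check them before appending. The sketch below is one way to do that; `check_columns` is a hypothetical helper, not part of pandas_nosql, and it assumes "match" means the same set of names regardless of order.

```python
def check_columns(frame_columns, table_columns):
    """Raise ValueError when the two sets of column names differ."""
    frame_set, table_set = set(frame_columns), set(table_columns)
    missing = sorted(table_set - frame_set)   # in the table, not in the frame
    extra = sorted(frame_set - table_set)     # in the frame, not in the table
    if missing or extra:
        raise ValueError(
            f"column mismatch: missing={missing}, unexpected={extra}"
        )


# Same names in a different order pass the check
check_columns(["id", "make", "miles"], ["make", "id", "miles"])
```

With a real DataFrame this would be called as check_columns(df.columns, table_columns), where table_columns is the list of column names of the target table, obtained however suits your setup.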