A module that adds read and write capabilities to pandas for several NoSQL databases.

Pandas NoSQL

Pandas-NoSQL adds read and write capabilities to pandas for several NoSQL databases.


Import the pandas_nosql module to add read and write methods for:

  • MongoDB
  • Elasticsearch
  • Redis
  • Apache Cassandra
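The sketch below shows one way an import can add methods to pandas: the imported module assigns functions onto the `pd` namespace and the `DataFrame` class. Whether pandas_nosql does this is an assumption, since the source does not describe its internals. The names `read_demo` and `to_demo` are hypothetical and are not part of pandas_nosql.

```python
import pandas as pd


# Hypothetical reader, for illustration only (not part of pandas_nosql).
def read_demo(records):
    """Build a DataFrame from a list of dicts."""
    return pd.DataFrame(records)


# Hypothetical writer, for illustration only (not part of pandas_nosql).
def to_demo(self):
    """Return the DataFrame's rows as a list of dicts."""
    return self.to_dict(orient="records")


# Attaching the functions is what makes pd.read_demo and df.to_demo callable.
pd.read_demo = read_demo
pd.DataFrame.to_demo = to_demo

df = pd.read_demo([{"make": "Ford", "miles": 1200}])
rows = df.to_demo()
```

With this pattern the new methods exist only after the patching code has run, which would explain why the import is needed even if the module name is never used directly.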

MongoDB

# Read a MongoDB collection into a DataFrame
# Defaults: host='localhost', port=27017
# For collections with nested documents, use normalize=True

import pandas as pd
import pandas_nosql

df = pd.read_mongo(database='test_db', collection='test_col', normalize=True)

# Write a DataFrame to a MongoDB collection
# The modify_collection parameter helps prevent accidental overwrites

df.to_mongo(database='test_db', collection='test_col2', modify_collection=True)
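To show what flattening nested documents means in practice, here is a minimal sketch using `pd.json_normalize` on hand-written documents. It assumes `normalize=True` behaves similarly, which the source does not confirm. The sample documents are hypothetical and no MongoDB server is needed.

```python
import pandas as pd

# Hypothetical documents shaped like the results of a MongoDB query.
docs = [
    {"_id": 1, "car": {"make": "Ford", "model": "Focus"}, "miles": 1200},
    {"_id": 2, "car": {"make": "Honda", "model": "Civic"}, "miles": 800},
]

# Without flattening, the 'car' column holds one dict per row.
nested = pd.DataFrame(docs)

# With flattening, each nested key becomes its own dotted column name.
flat = pd.json_normalize(docs)
```

Here `nested` has a single `car` column of dicts, while `flat` has separate `car.make` and `car.model` columns that can be filtered and aggregated directly.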

Elasticsearch

import pandas as pd
import pandas_nosql

# Read an Elastic index into a DataFrame
# Defaults: 
#   * hosts='https://localhost:9200'
#   * verify_certs=False
# To split out _source use normalize=True

elastic_cols = ('make', 'model', 'purchase_date', 'miles')

df = pd.read_elastic(
    hosts='https://localhost:9200',
    username='elastic',
    password='strong_password',
    index='test_index',
    fields=elastic_cols,
    verify_certs=False,
    normalize=True
    )

# Write a DataFrame to an Elastic index
df.to_elastic(
    hosts='https://localhost:9200',
    username='elastic',
    password='strong_password',
    index='test_index',
    create_index=True
    )
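The following sketch illustrates what splitting out `_source` could look like for search hits. It uses `pd.json_normalize` on hypothetical hit dictionaries and keeps only the requested fields. The column naming (`_source.make`) is what pandas produces here; the names pandas_nosql produces are not stated in the source and may differ.

```python
import pandas as pd

# Hypothetical hits in the general shape of an Elasticsearch search response.
hits = [
    {"_index": "test_index", "_id": "a1", "_source": {"make": "Ford", "miles": 1200}},
    {"_index": "test_index", "_id": "a2", "_source": {"make": "Honda", "miles": 800}},
]

# Flatten each hit so the keys inside _source become separate columns.
flat = pd.json_normalize(hits)

# Keep only the requested fields, similar in spirit to the fields argument.
wanted = ("make", "miles")
subset = flat[[f"_source.{name}" for name in wanted]]
```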

Redis

import pandas as pd
import pandas_nosql

# Read a DataFrame that was written to Redis with the to_redis method
# Data written to Redis by any other means is not guaranteed to be read correctly

# Defaults:
#   * host='localhost'
#   * expire_seconds=None
# To persist the DataFrame in Redis use expire_seconds=None
# To set an expiration for the DataFrame pass an integer to expire_seconds

df = pd.read_redis(host='localhost', port=6379, redis_key='test_key', expire_seconds=None)

# Write a DataFrame to Redis
df.to_redis(host='localhost', port=6379, redis_key='test_key')
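The caveat about reading only what `to_redis` wrote can be illustrated with a round trip through bytes. This sketch assumes the DataFrame is serialized before storage. It uses `pickle` and a plain dict as a stand-in for a Redis server. The actual serialization format used by pandas_nosql is not stated in the source, and the helpers `write_frame` and `read_frame` are hypothetical.

```python
import pickle

import pandas as pd

fake_redis = {}  # stand-in for a Redis server: maps keys to bytes


def write_frame(store, key, frame):
    """Serialize the DataFrame to bytes and store it under the key."""
    store[key] = pickle.dumps(frame)


def read_frame(store, key):
    """Load the bytes stored under the key and decode them to a DataFrame."""
    return pickle.loads(store[key])


df = pd.DataFrame({"make": ["Ford", "Honda"], "miles": [1200, 800]})
write_frame(fake_redis, "test_key", df)
restored = read_frame(fake_redis, "test_key")

# A value written by some other tool is not in the expected format.
fake_redis["other_key"] = b"not a pickle"
try:
    read_frame(fake_redis, "other_key")
    readable = True
except Exception:
    readable = False
```

The reader can only decode bytes in the format the writer produced, which is why values stored by other tools may fail to load. Unpickling data from an untrusted source is unsafe, so treat this as an illustration of the round trip rather than a recommended storage format.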

Apache Cassandra

import pandas as pd
import pandas_nosql

# Read an Apache Cassandra table into a pandas DataFrame
# contact_points must be a list

df = pd.read_cassandra(
    contact_points=['localhost'],
    port=9042,
    keyspace='test_keyspace',
    table='test_table'
    )

# Append a DataFrame to an Apache Cassandra table
# The DataFrame columns must match the table columns
df.to_cassandra(
    contact_points=['localhost'],
    port=9042,
    keyspace='test_keyspace',
    table='test_table'
)
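Because the DataFrame columns must match the table columns, it can help to check for mismatches before appending. The sketch below is a plain pandas helper. The name `check_columns` and the table schema are hypothetical, and it assumes "match" means the same set of column names, which the source does not spell out.

```python
import pandas as pd


def check_columns(frame, table_columns):
    """Raise ValueError if the DataFrame's columns differ from the table's."""
    missing = sorted(set(table_columns) - set(frame.columns))
    extra = sorted(set(frame.columns) - set(table_columns))
    if missing or extra:
        raise ValueError(f"missing columns: {missing}; unexpected columns: {extra}")


table_columns = ["id", "make", "miles"]  # hypothetical table schema
df = pd.DataFrame({"id": [1], "make": ["Ford"], "mileage": [1200]})

try:
    check_columns(df, table_columns)
    problem = None
except ValueError as exc:
    problem = str(exc)

# Renaming the mismatched column makes the check pass.
fixed = df.rename(columns={"mileage": "miles"})
check_columns(fixed, table_columns)
```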

Download files

Source Distribution

pandas-nosql-0.0.1.tar.gz (7.6 kB)

Uploaded Source

Built Distribution

pandas_nosql-0.0.1-py3-none-any.whl (7.8 kB)

Uploaded Python 3