A module that adds read and write capabilities to pandas for several NoSQL databases

Project description

Pandas-NoSQL

Pandas-NoSQL adds read and write capabilities to pandas for several NoSQL databases.


Pandas read and write methods for:

  • MongoDB
  • Elasticsearch
  • Redis
  • Apache Cassandra

These functions depend on the following client libraries:

  • pymongo by MongoDB
  • elasticsearch by Elastic
  • redis-py by Redis
  • cassandra-driver by DataStax

Installation

$ pip install pandas-nosql

Documentation

MongoDB

# Read a MongoDB collection into a DataFrame
# Defaults: host='localhost' port=27017
# For nested collections use normalize=True

import pandas as pd
import pandas_nosql

df = pd.read_mongo(
    database = 'test_db',
    collection = 'test_col',
    normalize = True,
    host = 'localhost',
    port = 27017
    )

# Write DataFrame to MongoDB collection
# The modify_collection parameter helps prevent accidental overwrites

df.to_mongo(
    database = 'test_db',
    collection = 'test_col2',
    mode = 'a',
    host = 'localhost',
    port = 27017
    )
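
To illustrate what flattening a nested collection can look like, here is a minimal sketch that uses pandas' own json_normalize on hypothetical documents. It is an assumption that normalize=True behaves similarly; the source does not say how read_mongo implements it, and the sample documents and field names are made up for illustration.

```python
import pandas as pd

# Hypothetical documents, shaped like what a nested MongoDB collection might return
docs = [
    {"_id": 1, "make": "Toyota", "specs": {"doors": 4, "engine": {"cyl": 4}}},
    {"_id": 2, "make": "Ford", "specs": {"doors": 2, "engine": {"cyl": 8}}},
]

# Plain construction keeps each nested dict inside a single "specs" column
flat = pd.DataFrame(docs)

# json_normalize expands nested keys into dotted column names
# (assumed to resemble normalize=True; not confirmed by the source)
normalized = pd.json_normalize(docs)

print(list(flat.columns))
print(list(normalized.columns))
```

With nested data, the normalized form is usually easier to filter and aggregate because every leaf value gets its own column.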

Elasticsearch

import pandas as pd
import pandas_nosql

# Read an Elastic index into a DataFrame
# To Access localhost: 
#   * hosts='https://localhost:9200'
#   * verify_certs=False
# If "xpack.security.enabled: false" use http instead
# To split out _source use split_source=True

elastic_cols = ('make', 'model', 'purchase_date', 'miles')

df = pd.read_elastic(
    hosts = 'https://localhost:9200',
    username = 'elastic',
    password = 'strong_password',
    index = 'test_index',
    fields = elastic_cols,
    verify_certs = False,
    split_source = True
    )

# Write DataFrame to Elastic Index
df.to_elastic(
    hosts = 'https://localhost:9200',
    username = 'elastic',
    password = 'strong_password',
    index = 'test_index',
    mode = 'w'
    )
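
The effect of splitting out _source can be sketched without a running cluster. The hits below are hypothetical and only mimic the shape of an Elasticsearch search response; whether split_source=True produces exactly this layout is an assumption, not something the source states.

```python
import pandas as pd

# Hypothetical search hits, shaped like the "hits" list of an Elasticsearch response
hits = [
    {"_index": "test_index", "_id": "1", "_score": 1.0,
     "_source": {"make": "Toyota", "model": "Corolla", "miles": 42000}},
    {"_index": "test_index", "_id": "2", "_score": 1.0,
     "_source": {"make": "Ford", "model": "F-150", "miles": 88000}},
]

# Without splitting: each document stays as one dict in the "_source" column
unsplit = pd.DataFrame(hits)

# With splitting (assumed behavior): every _source field becomes its own column,
# and the document _id is used as the index
split = pd.DataFrame(
    [hit["_source"] for hit in hits],
    index=[hit["_id"] for hit in hits],
)

print(list(unsplit.columns))
print(list(split.columns))
```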

Redis

import pandas as pd
import pandas_nosql

# Read a DataFrame that was sent to Redis using the to_redis method
# A DataFrame not sent to Redis using the to_redis method is not guaranteed to be read properly

# To Access localhost:
#   * host='localhost'
# To persist the DataFrame in Redis use expire_seconds=None
# To set an expiration for the DataFrame pass an integer to expire_seconds

df = pd.read_redis(
    host = 'localhost',
    port = 6379,
    redis_key = 'test_key'
    )

# Write a DataFrame to Redis
df.to_redis(
    host = 'localhost',
    port = 6379,
    redis_key = 'test_key',
    expire_seconds = None
    )
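
The note about reading only DataFrames written with to_redis suggests that the reader expects a specific serialized format. The sketch below shows that idea with a plain dict standing in for Redis and pickle as the format. Both are assumptions for illustration: the source does not say which serialization pandas-nosql actually uses.

```python
import pickle

import pandas as pd

df = pd.DataFrame({"make": ["Toyota", "Ford"], "miles": [42000, 88000]})

# Stand-in for a Redis server: maps keys to byte strings (hypothetical)
fake_redis = {}

# "Write": serialize the DataFrame to bytes under a key
fake_redis["test_key"] = pickle.dumps(df)

# "Read": the reader must use the same format the writer used
restored = pickle.loads(fake_redis["test_key"])

print(restored.equals(df))
```

Bytes stored under a key by some other tool would not match the format the reader expects, which is one plausible reason for the caveat above. Unpickle only data from sources you trust.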

Apache Cassandra

import pandas as pd
import pandas_nosql

# Read an Apache Cassandra table into a pandas DataFrame
# To Access localhost:
#   * contact_points=['localhost']
#   * contact_points must be a list

df = pd.read_cassandra(
    contact_points = ['localhost'],
    port = 9042,
    keyspace = 'test_keyspace',
    table = 'test_table'
    )

# Append a DataFrame to Apache Cassandra
# DataFrame Columns must match table Columns
replication = {
    'class' : 'SimpleStrategy',
    'replication_factor' : 1
    }
df.to_cassandra(
    contact_points = ['localhost'],
    port = 9042,
    keyspace = 'test_keyspace',
    table = 'test_table',
    mode = 'w',
    replication = replication
)
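
The replication dictionary has the same shape as the replication map in a CQL CREATE KEYSPACE statement. The helper below is hypothetical and only builds that statement as a string; how to_cassandra actually uses the replication argument is an assumption, not confirmed by the source.

```python
def create_keyspace_cql(keyspace: str, replication: dict) -> str:
    """Build a CREATE KEYSPACE statement from a replication dict (illustrative helper)."""
    # repr() quotes string values with single quotes and leaves integers bare,
    # which matches how CQL writes map literals for these simple values
    options = ", ".join(f"'{key}': {value!r}" for key, value in replication.items())
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{options}}}"


replication = {
    "class": "SimpleStrategy",
    "replication_factor": 1,
}

statement = create_keyspace_cql("test_keyspace", replication)
print(statement)
```

SimpleStrategy with a replication factor of 1 suits a single-node test cluster; multi-datacenter deployments typically use NetworkTopologyStrategy instead.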

Download files

Source Distribution

pandas-nosql-1.2.0.tar.gz (9.2 kB view details)

Uploaded Source

Built Distribution

pandas_nosql-1.2.0-py3-none-any.whl (9.5 kB view details)

Uploaded Python 3

File details

Details for the file pandas-nosql-1.2.0.tar.gz.

File metadata

  • Download URL: pandas-nosql-1.2.0.tar.gz
  • Size: 9.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.6

File hashes

Hashes for pandas-nosql-1.2.0.tar.gz
Algorithm     Hash digest
SHA256        2249f89c73bd920638eaba42dcecee9389531b7435c6955a1532b7585c6e8287
MD5           34320245cb1c7019d2259b7f9badfb5e
BLAKE2b-256   b757a14715acda52ef0516cbf45bfcab0e4cdfd0817994b6c8f2bd77aa0b1d8e

File details

Details for the file pandas_nosql-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: pandas_nosql-1.2.0-py3-none-any.whl
  • Size: 9.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.6

File hashes

Hashes for pandas_nosql-1.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        2870a4289335d3e56093237dad591f5653e721de430a1324f404aaeb064ff3fa
MD5           84d77bc65f2b41d6391fa6d5bf82f8e6
BLAKE2b-256   067269a9f839aa11e29e3bb926dba688b276a78daf80d0407aceb2c0f5598197
