Read, write and update large-scale pandas DataFrames with Elasticsearch
Project description
es_pandas
Requirements
This package should work on Python 3 (>=3.4), and Elasticsearch should be version 5.x, 6.x or 7.x.
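As a rough illustration, the version constraints above can be checked up front. The helper name `check_versions` and the idea of passing the Elasticsearch version as a string are assumptions made for this sketch; they are not part of es_pandas.

```python
import sys

MIN_PYTHON = (3, 4)              # minimum Python version stated above
SUPPORTED_ES_MAJORS = (5, 6, 7)  # Elasticsearch major versions stated above


def check_versions(es_version, python_version=None):
    """Return True if both versions satisfy the stated requirements.

    es_version is a string such as '7.10.2' (hypothetical input; how it is
    obtained from the cluster is left to the caller).
    python_version is a (major, minor) tuple; defaults to the running interpreter.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    es_major = int(es_version.split('.')[0])
    return tuple(python_version) >= MIN_PYTHON and es_major in SUPPORTED_ES_MAJORS
```

For example, `check_versions('7.10.2', (3, 8))` passes, while an 8.x cluster or a Python 3.3 interpreter does not.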
Installation
The package is hosted on PyPI and can be installed with pip:
pip install es_pandas
Usage
import pandas as pd
from es_pandas import es_pandas
# Connection details for the es cluster
es_host = 'localhost:9200'
index = 'demo'
# Create an es_pandas instance
ep = es_pandas(es_host)
# Example data frame
df = pd.DataFrame({'Num': range(100000)})
df['Alpha'] = 'Hello'
# pd.datetime was removed from recent pandas releases; pd.Timestamp.now() is the equivalent
df['Date'] = pd.Timestamp.now()
# Optionally initialize an es template from the DataFrame
doc_type = 'demo'
ep.init_es_tmpl(df, doc_type)
# Write data to es, using the template created above
ep.to_es(df, index, doc_type=doc_type)
# set use_index=True if you want to use DataFrame index as records' _id
ep.to_es(df, index, doc_type=doc_type, use_index=True)
# Delete records from es
ep.to_es(df.iloc[5000:], index, doc_type=doc_type, _op_type='delete')
# Update doc by doc _id
df.iloc[:1000, 1] = 'Bye'
df.iloc[:1000, 2] = pd.Timestamp.now()
ep.to_es(df.iloc[:1000, 1:], index, doc_type=doc_type, _op_type='update')
# Read data from es
df = ep.to_pandas(index)
print(df.head())
# Return only certain fields from es
heads = ['Num', 'Date']
df = ep.to_pandas(index, heads=heads)
print(df.head())
# Set the dtype of certain columns
dtype = {'Num': 'float', 'Alpha': object}
df = ep.to_pandas(index, dtype=dtype)
print(df.dtypes)
# infer dtype from es template
df = ep.to_pandas(index, infer_dtype=True)
print(df.dtypes)
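The write, update and delete calls above all go through `to_es` with a different `_op_type`. As a minimal sketch of what such operations could look like, the function below builds action dictionaries in the format accepted by `elasticsearch.helpers.bulk`. It is an illustration under stated assumptions, not es_pandas's actual implementation: `build_actions` is a hypothetical name, and the records are plain dictionaries keyed by `_id` (for a DataFrame, `df.to_dict('index')` produces this shape).

```python
def build_actions(records, index, op_type='index'):
    """Yield one bulk action per record.

    records: dict mapping document _id -> dict of field values
    index:   target index name
    op_type: 'index', 'update' or 'delete'
    """
    if op_type not in ('index', 'update', 'delete'):
        raise ValueError('unsupported op_type: %r' % op_type)
    for doc_id, fields in records.items():
        action = {'_op_type': op_type, '_index': index, '_id': doc_id}
        if op_type == 'index':
            action['_source'] = fields  # full document body
        elif op_type == 'update':
            action['doc'] = fields      # partial document: only the changed fields
        # 'delete' carries no body, only the _id
        yield action


# Two small records standing in for DataFrame rows
records = {0: {'Num': 0, 'Alpha': 'Hello'}, 1: {'Num': 1, 'Alpha': 'Hello'}}
index_actions = list(build_actions(records, 'demo'))
update_actions = list(build_actions({0: {'Alpha': 'Bye'}}, 'demo', op_type='update'))
delete_actions = list(build_actions({1: {}}, 'demo', op_type='delete'))
```

The resulting actions could then be passed to `elasticsearch.helpers.bulk(client, actions)`; that call needs a running cluster and is left out here. Update and delete actions address documents by `_id`, which is why the record keys matter.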
Download files
Source Distribution
es_pandas-0.0.14.tar.gz (4.6 kB)
Built Distribution
Hashes for es_pandas-0.0.14-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5243d02cf958319fa08ce646d1e047fd8d334ff8fbb47e2b0791dc5c3730c4db
MD5 | f5f633b6f7a556b09319d08722300ead
BLAKE2b-256 | bccbef75233d2ac12e369135f7bec1760815ba7b45f5e261608883f473d39829