tsstore - Fast and simple timeseries storage
Copyright (C) 2018 Jorge M. Faleiro Jr.
See LICENSE for important licensing information.
Installation
The installation depends on which underlying storage you plan on using. If you plan on using fastparquet:
pip install jfaleiro.tsstore[fastparquet]
Or dask:
pip install jfaleiro.tsstore[dask]
Use
Everything starts with a root node, which you retrieve by giving a root path and a type of storage:
from jfaleiro_tsstore import root
r = root('~/mydata', type_='fastparquet')
From that root you can define and retrieve a store. You do that by giving a number of attributes that describe the store.
Say, for example, that you want the "open/high/low/close" prices of all tech stocks from the source quandl, unadjusted, in intervals of one day, in a single store. You would use attributes along these lines:
s = r.get_store(type='stock',
                sector='tech',
                source='quandl/wiki',
                serie='OHLC',
                adjusted=False,
                interval='day',
                interval_size=1)
You should be able to use any reasonable dictionary to define the attributes.
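As a small illustration of what "using a dictionary" can look like in Python, the sketch below unpacks a dictionary into keyword arguments. The `get_store` function here is a hypothetical stand-in written only for this example, not the tsstore implementation, and the `attrs` name is likewise an assumption.

```python
# Hypothetical stand-in for r.get_store (NOT the tsstore implementation):
# it simply returns the attributes it receives, so both call styles can be compared.
def get_store(**attributes):
    return attributes


# The same attributes used in the call above, kept in a plain dictionary.
attrs = {
    'type': 'stock',
    'sector': 'tech',
    'source': 'quandl/wiki',
    'serie': 'OHLC',
    'adjusted': False,
    'interval': 'day',
    'interval_size': 1,
}

# Attributes spelled out as keyword arguments.
explicit = get_store(type='stock', sector='tech', source='quandl/wiki',
                     serie='OHLC', adjusted=False, interval='day',
                     interval_size=1)

# Attributes taken from the dictionary with ** unpacking.
unpacked = get_store(**attrs)

print(explicit == unpacked)
```

Assuming the real `get_store` accepts arbitrary keyword attributes as in the call above, the same unpacking should carry over as `r.get_store(**attrs)`.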
You can now store your series, associating each one with an individual symbol:
import quandl
s.put('GOOG', quandl.get('GOOG/WIKI', ...))
s.put('IRBT', quandl.get('IRBT/WIKI', ...))
Series must be instances of pandas DataFrames. For each symbol you apply basic operations put, get, append, prepend, and delete. These operations are simple and self-explanatory. For example:
import pandas as pd
from pandas.testing import assert_frame_equal
s.put('ABC', df) # set series df in ABC
df1 = s.get('ABC') # retrieve series ABC
assert_frame_equal(df, df1) # OK
s.append('ABC', df2) # append df2 to ABC
s.prepend('ABC', df3) # prepend df3 to ABC
assert_frame_equal(s.get('ABC'), pd.concat([df3, df, df2])) # OK
s.delete('ABC') # delete series
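To make the ordering of append and prepend concrete without any storage installed, here is a minimal sketch that runs the same sequence against a plain in-memory stand-in. The `InMemoryStore` class, the `frame` helper, and the price values are all hypothetical and exist only for this example. The stand-in mirrors the behaviour implied by the assertion above and is not how tsstore is implemented.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


class InMemoryStore:
    """Hypothetical stand-in that keeps each series in a dict (not tsstore code)."""

    def __init__(self):
        self._series = {}

    def put(self, symbol, df):
        # Set (or overwrite) the series stored under the symbol.
        self._series[symbol] = df.copy()

    def get(self, symbol):
        # Return a copy so callers cannot mutate the stored frame.
        return self._series[symbol].copy()

    def append(self, symbol, df):
        # New rows go after the rows already stored.
        self._series[symbol] = pd.concat([self._series[symbol], df])

    def prepend(self, symbol, df):
        # New rows go before the rows already stored.
        self._series[symbol] = pd.concat([df, self._series[symbol]])

    def delete(self, symbol):
        del self._series[symbol]


def frame(dates, closes):
    """Build a small frame of made-up closing prices indexed by date."""
    return pd.DataFrame({'close': closes}, index=pd.to_datetime(dates))


df3 = frame(['2018-01-02', '2018-01-03'], [10.0, 10.5])  # earliest rows
df = frame(['2018-01-04', '2018-01-05'], [11.0, 11.2])   # initial series
df2 = frame(['2018-01-08'], [11.4])                      # latest rows

s = InMemoryStore()
s.put('ABC', df)
s.append('ABC', df2)
s.prepend('ABC', df3)
result = s.get('ABC')

# Same comparison as in the example above.
assert_frame_equal(result, pd.concat([df3, df, df2]))
s.delete('ABC')
```

Appending and prepending only concatenate here. Whether the real storage sorts or de-duplicates overlapping dates is not stated in this README, so the sketch uses non-overlapping, already ordered frames.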
Traversal
A few traversal operations are available. For example, to list all stores under a root, along with the number of entries in the time series of each symbol in each store:
# lists all symbols and the size of each series under root 'r'
for s in r.all_stores:
    for j in s.all_symbols:
        print(f'symbol {j} has {len(s.get(j).index)} items')
Final Notes
I am a practitioner in computational finance. On any given day I must deal with lots of time series of all kinds: series of prices, measurements of risk, buy/sell/hold signals, and many others. Series can be related to any window of time: seconds, minutes, weeks, and so forth. And these can be grouped in intervals of arbitrary length (every 5 seconds, every 3 days, etc.).
To do all of that, I wanted something simple, fast, and reliable.
This library is the result of these requirements.
If you are in the same boat, you might benefit from this as well. If you need additional storage alternatives, the implementation is very simple and straightforward. Feel free to fork this repository and contribute back your work when you are done.