Quantclean is a program that reformats every financial dataset to US Equity TradeBar
Quantclean 🧹
"Make it cleaner, make it leaner"
Already used by several people working in the quant and finance industries, Quantclean is the all-in-one tool that will help you to reformat your dataset and clean it.
Quantclean is a program that reformats any financial dataset to the US Equity TradeBar format (the QuantConnect format).
We have all faced the problem of reformatting our data to a standard. Manual data cleaning is boring and takes time. Quantclean is here to help you and to make your life easier as a quant.
Works great with data from Quandl, Algoseek, Alpha Vantage, yfinance, and many more.
A few things you may want to know before getting started 🍉
- Even if your dataset is missing an open, close, volume, high, low, or date column, quantclean will create a blank column for it. No problem!
- The generated dataframe will look like this if you have a date column and a time column (or if both are in the same column):
Date | Open | High | Low | Close | Volume |
---|---|---|---|---|---|
20131001 09:00 | 6448000 | 6448000 | 6448000 | 6448000 | 90 |
- Date - String date "YYYYMMDD HH:MM" in the timezone of the data format.
- Open - Deci-cents Open Price for TradeBar.
- High - Deci-cents High Price for TradeBar.
- Low - Deci-cents Low Price for TradeBar.
- Close - Deci-cents Close Price for TradeBar.
- Volume - Number of shares traded in this TradeBar.
- You can also get something like this if you use the `sweeper_dash` function instead of `sweeper`:
Date | Open | High | Low | Close | Volume |
---|---|---|---|---|---|
2013-10-01 09:00:00 | 6448000 | 6448000 | 6448000 | 6448000 | 90 |
As you can see, the date format is now YYYY-MM-DD instead of YYYYMMDD.
- If you only have a date column (e.g. something like YYYY-MM-DD), it will look like this:
Date | Open | High | Low | Close | Volume |
---|---|---|---|---|---|
20131001 | 6448000 | 6448000 | 6448000 | 6448000 | 90 |
You can also use the `sweeper_dash` function here.
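The column conventions above can be sketched in plain Python. This is only an illustration, not quantclean's actual implementation: the helper names `to_deci_cents` and `to_tradebar_row` are hypothetical, and the scale factor assumes that "deci-cents" means the price multiplied by 10,000 (so 644.80 becomes 6448000, as in the tables above).

```python
from datetime import datetime
from decimal import Decimal

# Column order of the TradeBar layout described above.
TRADEBAR_COLUMNS = ["Date", "Open", "High", "Low", "Close", "Volume"]

# Assumption: "deci-cents" means price x 10,000, so 644.80 -> 6448000.
SCALE = 10_000


def to_deci_cents(price):
    """Scale a price to an integer; Decimal avoids float rounding surprises."""
    return int(Decimal(str(price)) * SCALE)


def to_tradebar_row(raw):
    """Map one raw record (a dict) onto the six TradeBar columns.

    Fields missing from the record become empty strings, mirroring the
    "blank column" behaviour described above.
    """
    row = {}
    for column in TRADEBAR_COLUMNS:
        value = raw.get(column, "")
        if value == "":
            row[column] = ""  # blank placeholder for a missing field
        elif column == "Date":
            row[column] = value.strftime("%Y%m%d %H:%M")
        elif column == "Volume":
            row[column] = int(value)
        else:
            row[column] = to_deci_cents(value)
    return row


# A record that only has a timestamp, a close price, and a volume.
raw_record = {"Date": datetime(2013, 10, 1, 9, 0), "Close": 644.80, "Volume": 90}
print(to_tradebar_row(raw_record))
```

The record above has no open, high, or low price, so those three fields come back blank, while the close price and the timestamp are converted to the formats shown in the first table.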
How to use it? 🚀
First, download the quantclean.py file into the folder where you are working.
Note: I took this data from Quandl. Your dataset doesn't have to look like this one; quantclean adapts to your dataset as well as possible.
import pandas as pd
from quantclean import sweeper, sweeper_dash
df = pd.read_csv('AS-N100.csv')
df
_df = sweeper(df)
_df
Now, you may not be happy with this date column, which is presented in the YYYYMMDD format, and you may prefer YYYY-MM-DD.
In that case, do:
df_dash = sweeper_dash(df)
df_dash
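To make the difference between the two date styles concrete, here is a small standalone sketch that converts the compact strings into the dashed ones using only the standard library. It does not call quantclean, and `compact_to_dashed` is a hypothetical helper. The date-only result (YYYY-MM-DD without a time part) is an assumption, since only the date-and-time case is shown above.

```python
from datetime import datetime


def compact_to_dashed(stamp):
    """Convert 'YYYYMMDD HH:MM' or 'YYYYMMDD' into the dashed date style."""
    if " " in stamp:
        # Date and time: 20131001 09:00 -> 2013-10-01 09:00:00
        parsed = datetime.strptime(stamp, "%Y%m%d %H:%M")
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    # Date only (assumed output format): 20131001 -> 2013-10-01
    return datetime.strptime(stamp, "%Y%m%d").strftime("%Y-%m-%d")


print(compact_to_dashed("20131001 09:00"))  # 2013-10-01 09:00:00
print(compact_to_dashed("20131001"))        # 2013-10-01
```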
Contribution
If you have suggestions or improvements, don't hesitate to create an issue or make a pull request. Any help is welcome!