Quantclean is a program that reformats every financial dataset to US Equity TradeBar

Project description

Quantclean 🧹

"Make it cleaner, make it leaner"

Already used by several people working in the quant and finance industries, Quantclean is an all-in-one tool that helps you reformat and clean your dataset.

Quantclean is a program that reformats financial datasets to the US Equity TradeBar layout (QuantConnect format).

We have all faced the problem of reformatting our data to a standard. Manual data cleaning is boring and takes time. Quantclean is here to help and to make your life easier as a quant.

Works great with data from Quandl, Algoseek, Alpha Vantage, yfinance, and many more.

A few things you may want to know before getting started 🍉

  1. Even if you don't have an open, close, volume, high, low, date column, quantclean will create a blank column for it. No problem!

  2. If you have a date column and a time column (or both in a single column), the generated dataframe will look like this:

Date            Open     High     Low      Close    Volume
20131001 09:00  6448000  6448000  6448000  6448000  90

  • Date - String date "YYYYMMDD HH:MM" in the timezone of the data format.
  • Open - Deci-cents Open Price for TradeBar.
  • High - Deci-cents High Price for TradeBar.
  • Low - Deci-cents Low Price for TradeBar.
  • Close - Deci-cents Close Price for TradeBar.
  • Volume - Number of shares traded in this TradeBar.

  3. If you use the sweeper_dash function instead of sweeper, you get something like this:

Date                 Open     High     Low      Close    Volume
2013-10-01 09:00:00  6448000  6448000  6448000  6448000  90

As you can see, the date format is now YYYY-MM-DD rather than YYYYMMDD.

  4. If you only have a date column (e.g. something like YYYY-MM-DD), it will look like this:

Date      Open     High     Low      Close    Volume
20131001  6448000  6448000  6448000  6448000  90

You can also use the sweeper_dash function here.
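
To get a feel for the price columns described above, here is a small sketch that converts the sample row back into dollars. It does not call quantclean. It assumes that one deci-cent is 1/10,000 of a dollar, and the constant name `DECI_CENTS_PER_DOLLAR` is made up for this illustration.

```python
import pandas as pd

# The sample row from the tables above, typed in by hand.
bar = pd.DataFrame(
    {
        "Date": ["20131001 09:00"],
        "Open": [6448000],
        "High": [6448000],
        "Low": [6448000],
        "Close": [6448000],
        "Volume": [90],
    }
)

# Assumption: one deci-cent is 1/10,000 of a dollar.
DECI_CENTS_PER_DOLLAR = 10_000

price_cols = ["Open", "High", "Low", "Close"]

# Divide only the price columns; Volume is a share count and stays as-is.
dollars = bar[price_cols] / DECI_CENTS_PER_DOLLAR

print(dollars.iloc[0].to_dict())
```

Under that assumption, 6448000 in the table corresponds to a price of 644.80.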

How to use it? 🚀

First, download the quantclean.py file into the folder where you are working.

Note: I took this data from Quandl. Your dataset doesn't necessarily have to look like this one; quantclean adapts to your dataset as well as possible.

import pandas as pd
from quantclean import sweeper

df = pd.read_csv('AS-N100.csv')  # raw dataset (here, taken from Quandl)
df
_df = sweeper(df)  # reformatted to the TradeBar layout
_df

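As noted earlier, quantclean creates a blank column for any of the expected columns your dataset lacks. The sketch below shows one way to picture that step with plain pandas. It is not quantclean's own code: the helper `add_missing_columns` is hypothetical, and "blank" is assumed here to mean NaN.

```python
import pandas as pd

# Column order of the TradeBar layout described above.
EXPECTED = ["Date", "Open", "High", "Low", "Close", "Volume"]


def add_missing_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return df with exactly the EXPECTED columns.

    reindex keeps the expected columns that already exist, adds the
    absent ones filled with NaN, and drops any column not in EXPECTED.
    """
    return df.reindex(columns=EXPECTED)


# A dataset that only has a date and a closing price.
raw = pd.DataFrame({"Date": ["20131001"], "Close": [6448000]})
full = add_missing_columns(raw)

print(full.columns.tolist())
print(full.isna().sum().to_dict())  # number of blank cells per column
```

Printing the per-column blank count before calling sweeper is a quick way to see which columns your raw file is missing.
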

Now, you may not be happy with this date column, which is in the YYYYMMDD format, and may prefer YYYY-MM-DD.

In that case, do:

from quantclean import sweeper_dash

df_dash = sweeper_dash(df)  # same layout, with YYYY-MM-DD dates
df_dash


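If you already have dates in the compact YYYYMMDD form and only want to see what the dashed form looks like, the conversion can be sketched with pandas alone. This is an illustration of the two date formats, not a description of how sweeper_dash works internally; the variable names are made up for the example.

```python
import pandas as pd

# Dates in the compact "YYYYMMDD HH:MM" form.
compact = pd.Series(["20131001 09:00", "20131002 09:00"])

# Parse with an explicit format so pandas does not have to guess.
parsed = pd.to_datetime(compact, format="%Y%m%d %H:%M")

# Render as "YYYY-MM-DD HH:MM:SS" strings.
dashed = parsed.dt.strftime("%Y-%m-%d %H:%M:%S")

print(dashed.tolist())
# ['2013-10-01 09:00:00', '2013-10-02 09:00:00']
```
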
Contribution

If you have suggestions or improvements, don't hesitate to create an issue or make a pull request. Any help is welcome!
