USEP price prediction

nextstep

Introduction

Nextstep integrates popular machine learning algorithms into an all-in-one package for data scientists. It also lowers the programming barrier by extracting key hyper-parameters into a configuration dictionary, so that less experienced Python users can explore machine learning.

Nextstep was originally developed for a data science challenge that involved price prediction, so it has a dedicated module for obtaining data (oil and weather) via open APIs or web scraping. It has since evolved into a machine learning prediction toolkit.

Installation

First time installation

pip install nextstep

Upgrade to the latest version

pip install nextstep --upgrade

Quick Tutorial

getData module

generate oil prices

from nextstep.getData.oil import *
oil_prices.process()

brent_daily.csv and wti_daily.csv are generated in the current directory. They contain historical oil prices up to the most recent available day.
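
To check what was written, the generated files can be inspected with the standard library. This is a minimal sketch: read_price_csv is a hypothetical helper (not part of nextstep), and it makes no assumption about column names beyond the first row being a header.

```python
import csv
from pathlib import Path


def read_price_csv(path):
    """Return (header, rows) from a CSV file; each row is a list of strings."""
    with Path(path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])               # first row is assumed to be the header
        rows = [row for row in reader if row]   # skip blank lines
    return header, rows


if __name__ == "__main__":
    for name in ("brent_daily.csv", "wti_daily.csv"):
        if Path(name).exists():
            header, rows = read_price_csv(name)
            print(name, header, len(rows), "rows; last:", rows[-1] if rows else None)
        else:
            print(name, "not found; run oil_prices.process() first")
```

Printing the last row is a quick way to confirm that the file reaches the most recent day.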

generate weather data

This function requires an API key from worldweatheronline. As of 27/3/2020, a key is free for 60 days. The function generates CSV data files in the current directory.

from nextstep.getData.weather import weather
config = {
    'frequency': 1,
    'start_date': '01-Jan-2020',
    'end_date': '31-Jan-2020',
    'api_key': 'your api key here',
    'location_list': ['singapore'],
    'location_label': False
}
weather(config).get_weather_data()
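
When the date range is computed rather than typed by hand, the config can be built from date objects. The sketch below is an assumption-laden helper, not part of nextstep: make_weather_config is a hypothetical name, and the DD-Mon-YYYY date format is inferred only from the example values above.

```python
from datetime import date

# English month abbreviations, spelled out to avoid locale-dependent formatting.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_date(day):
    """Format a date as DD-Mon-YYYY, the style used in the example config."""
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year}"


def make_weather_config(start, end, api_key, locations,
                        frequency=1, location_label=False):
    """Build a config dict with the same keys as the example above."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return {
        'frequency': frequency,
        'start_date': format_date(start),
        'end_date': format_date(end),
        'api_key': api_key,
        'location_list': list(locations),
        'location_label': location_label,
    }


config = make_weather_config(date(2020, 1, 1), date(2020, 1, 31),
                             'your api key here', ['singapore'])
```

The resulting dictionary can then be passed to weather(config) as in the example.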

model module

Each ML model takes its own config dictionary. Fill in the keys that the model expects.

random forest

# example values; adjust them to your project scope
from nextstep.model.random_forest import random_forest
config = {
    'label_column': 'USEP',
    'train_size': 0.9,
    'seed': 66,
    'n_estimators': 10,
    'bootstrap': True,
    'criterion': 'mse',
    'max_features': 'sqrt'
}
random_forest_shell = random_forest(config)

random_forest_shell.build_model(data) # build model
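
Because all settings travel in one dictionary, a typo in a key can go unnoticed until the model is built. The following is a rough sketch of a pre-flight check, assuming the key names from the example config; check_random_forest_config is a hypothetical helper, and the value ranges it enforces (for instance, train_size as a fraction between 0 and 1) are assumptions rather than documented nextstep behaviour.

```python
# Keys copied from the random forest example config; the checks are assumptions.
REQUIRED_KEYS = ("label_column", "train_size", "seed", "n_estimators",
                 "bootstrap", "criterion", "max_features")


def _is_number(value):
    """True for int or float, but not bool (which is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_random_forest_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]

    train_size = config.get("train_size")
    if train_size is not None:
        if not _is_number(train_size):
            problems.append("train_size should be a number")
        elif not 0 < train_size < 1:
            problems.append("train_size should be a fraction between 0 and 1")

    n_estimators = config.get("n_estimators")
    if n_estimators is not None:
        if not isinstance(n_estimators, int) or isinstance(n_estimators, bool) \
                or n_estimators < 1:
            problems.append("n_estimators should be a positive integer")

    return problems
```

Calling the helper on a config before passing it to random_forest(config) returns the list of issues, which can be printed or turned into an exception as the project prefers.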

arima

from nextstep.model.arima import arima
config = {
    'lag': 7,
    'differencing': 0,
    'window_size': 2,
    'label_column': 'USEP',
    'train_size': 0.8,
    'seed': 33
}
arima_shell = arima(config)

arima_shell.autocorrelation(data) # plot autocorrelation (ACF) to help choose q, the moving average window size
arima_shell.partial_autocorrelation(data) # plot partial autocorrelation (PACF) to help choose p, the lag order
arima_shell.build_model(data) # build model

# residual plot to check model performance
arima_shell.residual_plot()
arima_shell.residual_density_plot()
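
For readers new to these diagnostics, an autocorrelation plot shows, for each lag, how strongly the series correlates with a shifted copy of itself. The sketch below computes the standard sample autocorrelation in plain Python; it illustrates the quantity being plotted and is not nextstep's own implementation.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag (capped at n - 1)."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    top_lag = min(max_lag, n - 1)
    return [
        # Sum of products of deviations k steps apart, scaled by the total variance.
        sum(deviations[t] * deviations[t + k] for t in range(n - k)) / denominator
        for k in range(top_lag + 1)
    ]


acf = autocorrelation([1, 2, 3, 4, 5], max_lag=2)
```

The value at lag 0 is always 1, and values near zero at higher lags indicate little linear dependence on observations that far back.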

Contributing

Pull requests are welcome.

Author

yuesong YANG

bolin ZHU

Ziyue Yang
