A package for clustering financial time series data that takes realized volatility into account.

Ftsa

Ftsa is a package for clustering financial time series data that takes realized volatility into account. It provides a method for clustering daily financial time series. The objective is to find transaction dates that show unusual financial behavior or large volatility.

The clustering consists of two stages.

Stage 1: Clustering based on the shape of the time series. This stage can be viewed as ordinary time series clustering.

Stage 2: Clustering based on the realized volatility. For each cluster found in the first stage, the algorithm further divides the series within it into groups according to their realized volatility.

As a result, the final partition takes both the shape and the volatility of the series into account.
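
To make the two-stage idea concrete, here is a minimal sketch using only the standard library. It is not Ftsa's implementation. It assumes realized volatility is computed as the square root of the sum of squared log returns within a day, it takes the stage 1 shape labels as given, and it replaces the stage 2 clustering with a simple median split. The names realized_volatility and two_stage_partition, the labels "up" and "down", and the prices are all hypothetical.

```python
import math
from statistics import median


def realized_volatility(prices):
    """Square root of the sum of squared log returns of one daily series."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return math.sqrt(sum(r * r for r in log_returns))


def two_stage_partition(series_by_date, shape_labels):
    """Nest a volatility split inside each shape cluster.

    shape_labels stands in for the result of stage 1 (assumed given).
    The median split below is a stand-in for stage 2, not Ftsa's method.
    """
    stage1 = {}
    for date, label in shape_labels.items():
        stage1.setdefault(label, []).append(date)

    partition = {}
    for label, dates in sorted(stage1.items()):
        rv = {d: realized_volatility(series_by_date[d]) for d in dates}
        cut = median(rv.values())
        partition[label] = {
            "low RV": sorted(d for d in dates if rv[d] <= cut),
            "high RV": sorted(d for d in dates if rv[d] > cut),
        }
    return partition


# Hypothetical intraday prices for four trading days.
series_by_date = {
    "2015-01-05": [100.0, 100.1, 100.2, 100.3],
    "2015-01-06": [100.0, 102.0, 99.0, 101.0],
    "2015-01-07": [100.0, 99.9, 99.8, 99.7],
    "2015-01-08": [100.0, 97.0, 101.0, 98.0],
}
# Hypothetical stage 1 labels: two days drift up, two drift down.
shape_labels = {
    "2015-01-05": "up",
    "2015-01-06": "up",
    "2015-01-07": "down",
    "2015-01-08": "down",
}

partition = two_stage_partition(series_by_date, shape_labels)
print(partition["up"]["high RV"])  # ['2015-01-06']
```

Days with the same overall direction land in the same shape cluster, and the volatility split then separates the calm day from the turbulent one inside each cluster.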

How to use Ftsa

1. Import the package

from Ftsa import ftsworker

2. Initialize a Ftsa worker.

The following example shows the necessary parameters:

  • The path of the data (CSV file);

  • A dictionary that maps 'time' to the name of the time column and 'val' to the name of the corresponding value column. In this case, the column names are 'Date' and 'Close', respectively.

worker = ftsworker.worker('SH000001.csv', 
	{
	'time':'Date', 
	'val':'Close'
	})
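
Before constructing the worker, it can be useful to confirm that the CSV header actually contains the columns named in the dictionary. The sketch below uses only the standard library and does not call Ftsa. The helper name missing_columns and the sample rows are hypothetical, the timestamp format shown is a guess, and Ftsa's own parsing may differ.

```python
import csv
import io


def missing_columns(csv_file, column_map):
    """Return the mapped column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    return [name for name in column_map.values() if name not in header]


# Hypothetical file contents standing in for 'SH000001.csv'.
sample = io.StringIO(
    "Date,Close\n"
    "2015-01-05 09:35,3258.63\n"
    "2015-01-05 09:40,3262.10\n"
)
column_map = {"time": "Date", "val": "Close"}

print(missing_columns(sample, column_map))  # prints []
```

For a file on disk, the same helper can be called inside `with open(path, newline='') as f:`.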

3. Load the data.

df = worker.load()

4. Conduct the clustering.

The parameters are:

  • The first parameter refers to the data, i.e., the loaded dataframe.

  • h: The window size for the smoothing function. Default: 5.

  • K: The number of clusters in the first stage. Default: 2.

  • rv_K: The number of clusters in the second stage. Default: 2.

  • op: Whether to output dendrograms. 1: output; 0: do not output. Default: 0.

  • mop: Whether to run the second stage. If mop is set to 1, both stages are run. If mop is set to 0, only the first stage (the clustering based on the shape of the time series) is run. Default: 1.

label = worker.cluster(df, K=3, rv_K=2, mop=1, op=0, h=5)

The returned label describes the final partition of all daily time series as a dictionary. The following example shows the format of the output. Each date refers to one daily time series.

{
	'changepoint cluster 1': 
		{
			'RV cluster 1': ['2015-01-01', '2015-03-02', ...],
			'RV cluster 2': ['2015-06-30', ...]
		},
	'changepoint cluster 2': 
		{
			'RV cluster 1': ['2015-10-11', '2015-10-12', ...],
			'RV cluster 2': ['2015-12-01', ...]
		}
	...
}
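
The nested dictionary can be flattened into a per-date lookup, which makes it easier to join the result back onto other data or to spot small groups that may contain unusual dates. This is a minimal sketch assuming the two-stage format shown above (mop=1); the helper name flatten_partition is hypothetical and the sample contains only the dates listed in the example, with the elided entries left out.

```python
def flatten_partition(label):
    """Map each date to its (first-stage cluster, RV cluster) pair."""
    flat = {}
    for shape_cluster, rv_clusters in label.items():
        for rv_cluster, dates in rv_clusters.items():
            for date in dates:
                flat[date] = (shape_cluster, rv_cluster)
    return flat


# Sample result in the documented format (elided dates omitted).
label = {
    'changepoint cluster 1': {
        'RV cluster 1': ['2015-01-01', '2015-03-02'],
        'RV cluster 2': ['2015-06-30'],
    },
    'changepoint cluster 2': {
        'RV cluster 1': ['2015-10-11', '2015-10-12'],
        'RV cluster 2': ['2015-12-01'],
    },
}

flat = flatten_partition(label)
print(flat['2015-06-30'])  # ('changepoint cluster 1', 'RV cluster 2')

# Group sizes, keyed by (first-stage cluster, RV cluster).
sizes = {}
for pair in flat.values():
    sizes[pair] = sizes.get(pair, 0) + 1
print(sizes[('changepoint cluster 2', 'RV cluster 1')])  # 2
```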

Source distribution: Ftsa-0.0.4.tar.gz (6.1 kB)
