Advanced Econometric Analysis Tools
EconKit
econkit is a Python library that provides various statistical and econometric analysis tools, including descriptive statistics, correlation matrices, and tests for stationarity and autocorrelation.
Installation
Ensure you have the required packages installed:
pip install pandas numpy scipy statsmodels tabulate yfinance requests
Functions
Descriptive Statistics
descriptives(data)
Computes descriptive statistics for each numeric column in a DataFrame.
Parameters:
- data: pandas.DataFrame containing the data to be analyzed.
Returns:
- None. Prints a summary table of the descriptive statistics.
Example Usage:
import pandas as pd
from econkit import econometrics as ec
df = pd.read_csv('your_data.csv')
ec.descriptives(df)
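To make concrete what a per-column summary of this kind typically contains, here is a minimal sketch built directly on pandas. It is not econkit's implementation: the helper name describe_numeric, the sample data, and the choice of statistics are assumptions made for illustration.

```python
import pandas as pd


def describe_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return summary statistics for each numeric column (hypothetical helper)."""
    numeric = df.select_dtypes(include="number")  # non-numeric columns are skipped
    return pd.DataFrame({
        "count": numeric.count(),
        "mean": numeric.mean(),
        "std": numeric.std(),        # sample standard deviation (ddof=1)
        "min": numeric.min(),
        "median": numeric.median(),
        "max": numeric.max(),
        "skew": numeric.skew(),
        "kurtosis": numeric.kurt(),  # excess kurtosis
    })


# Made-up sample data; the "label" column is ignored because it is not numeric.
sample = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "label": list("abcde"),
})
summary = describe_numeric(sample)
print(summary)
```

Unlike descriptives, which prints its table and returns None, this sketch returns a DataFrame so the numbers can be reused.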
Correlation Matrix
correlation(df, method='Pearson', p=False)
Calculates and prints the correlation matrix, and optionally the p-value matrix, for numeric columns in the provided DataFrame. Supports Pearson, Spearman, and Kendall correlation methods.
Parameters:
- df: pandas.DataFrame containing the data to be analyzed.
- method: str (optional). Method of correlation ('Pearson', 'Spearman', or 'Kendall'). Default is 'Pearson'.
- p: bool (optional). If True, p-value matrix is also printed; if False, only the correlation matrix is printed. Default is False.
Returns:
- None. Prints the correlation matrix and optionally the p-value matrix.
Example Usage:
import pandas as pd
from econkit import econometrics as ec
df = pd.read_csv('your_data.csv')
ec.correlation(df, method='Spearman', p=True)
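For readers who want the correlation and p-value matrices as objects rather than printed output, the following sketch computes both with scipy.stats. It is an illustration under assumptions, not econkit's code: corr_with_pvalues and the sample data are hypothetical, and rows with missing values are simply dropped.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Map the method names used above onto the corresponding scipy.stats functions.
_METHODS = {
    "Pearson": stats.pearsonr,
    "Spearman": stats.spearmanr,
    "Kendall": stats.kendalltau,
}


def corr_with_pvalues(df: pd.DataFrame, method: str = "Pearson"):
    """Return (correlation matrix, p-value matrix) for the numeric columns."""
    func = _METHODS[method]
    numeric = df.select_dtypes(include="number").dropna()
    cols = list(numeric.columns)
    corr = pd.DataFrame(np.eye(len(cols)), index=cols, columns=cols)
    pval = pd.DataFrame(np.zeros((len(cols), len(cols))), index=cols, columns=cols)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            r, p = func(numeric[a], numeric[b])
            corr.loc[a, b] = corr.loc[b, a] = r  # matrices are symmetric
            pval.loc[a, b] = pval.loc[b, a] = p
    return corr, pval


# Made-up sample data.
sample = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 1, 4, 3, 6, 5],
    "z": [6, 5, 4, 3, 2, 1],
})
corr, pval = corr_with_pvalues(sample, method="Spearman")
print(corr.round(3))
print(pval.round(3))
```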
Augmented Dickey-Fuller (ADF) Test
adf(dataframe, maxlag=None, regression='c', autolag='AIC', handle_na='drop')
Performs the ADF test on each column in the DataFrame and prints a summary table.
Parameters:
- dataframe: pandas.DataFrame containing the data to be tested.
- maxlag: int (optional). Maximum number of lags to use. Default is None.
- regression: str {'c', 'ct', 'ctt', 'nc'} (optional). Type of regression trend. Default is 'c'.
- autolag: str (optional). Method to use when automatically determining the lag length ('AIC', 'BIC', 't-stat'). Default is 'AIC'.
- handle_na: str {'drop', 'fill'} (optional). How to handle missing values. Default is 'drop'.
Returns:
- None. Prints a summary table of the ADF test results.
Example Usage:
import pandas as pd
from econkit import econometrics as ec
df = pd.read_csv('your_data.csv')
ec.adf(df, regression='ct', autolag='BIC')
KPSS Test
kpss(dataframe, regression='c', nlags='auto', handle_na='drop')
Performs the KPSS test on each column in the DataFrame and prints a summary table.
Parameters:
- dataframe: pandas.DataFrame containing the data to be tested.
- regression: str {'c', 'ct'} (optional). Type of regression trend. Default is 'c'.
- nlags: str or int (optional). Number of lags to use. Default is 'auto'.
- handle_na: str {'drop', 'fill'} (optional). How to handle missing values. Default is 'drop'.
Returns:
- None. Prints a summary table of the KPSS test results.
Example Usage:
import pandas as pd
from econkit import econometrics as ec
df = pd.read_csv('your_data.csv')
ec.kpss(df, regression='ct', nlags='auto')
Durbin-Watson Test
dw(data)
Performs the Durbin-Watson autocorrelation test and Ljung-Box test for each column of the dataset.
Parameters:
- data: pandas.DataFrame where each column is a time series.
Returns:
- None. Prints a summary table of the Durbin-Watson and Ljung-Box test results.
Example Usage:
import pandas as pd
from econkit import econometrics as ec
df = pd.read_csv('your_data.csv')
ec.dw(df)
Financial Data Retrieval
data(ticker_symbol, start_date, end_date, interval)
Downloads financial data from Yahoo Finance and calculates daily returns.
Parameters:
- ticker_symbol: str. The stock ticker symbol.
- start_date: str. Start date in 'dd-mm-yyyy' format.
- end_date: str. End date in 'dd-mm-yyyy' format.
- interval: str. Data interval (e.g., '1d', '1wk', '1mo').
Returns:
- pandas.DataFrame containing the stock data and calculated returns.
Example Usage:
from econkit import finance as f
start = '01-06-2024'
end = '07-06-2024'
interval = '1m'  # named `interval` so the built-in `int` is not shadowed
SP500 = f.data('^GSPC', start, end, interval)
SP500.head()
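Two details of this function are easy to get wrong: the 'dd-mm-yyyy' date format and what "returns" means. The offline sketch below illustrates both with the standard library and pandas. It makes no network call and is not econkit's code: parse_ddmmyyyy, the column names, and the prices are made up, and simple percentage returns are an assumption (the library may compute returns differently).

```python
from datetime import datetime

import pandas as pd


def parse_ddmmyyyy(text: str) -> datetime:
    """Parse a 'dd-mm-yyyy' string (hypothetical helper)."""
    return datetime.strptime(text, "%d-%m-%Y")


start = parse_ddmmyyyy("01-06-2024")
end = parse_ddmmyyyy("07-06-2024")
print(start.date(), end.date())

# Made-up closing prices, used only to show the arithmetic.
prices = pd.DataFrame(
    {"Close": [100.0, 110.0, 99.0]},
    index=pd.to_datetime(["2024-06-03", "2024-06-04", "2024-06-05"]),
)
# Simple return: price_t / price_{t-1} - 1. The first row has no previous price.
prices["Returns"] = prices["Close"].pct_change()
print(prices)
```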
Usage Notes
- Ensure your data is clean and properly formatted before using these functions.
- Some functions handle missing values; specify your preferred method using the handle_na parameter.
- For time series analysis, ensure your data is indexed by date.
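The notes above can be sketched in pandas as follows. The CSV content and column names are made up, and forward filling is only one plausible reading of handle_na='fill'; check the function docstrings for what the library actually does.

```python
import io

import pandas as pd

# Made-up CSV with a date column and some missing values.
csv_text = """date,gdp,cpi
2024-01-31,100.0,2.1
2024-02-29,,2.3
2024-03-31,102.5,
2024-04-30,103.0,2.6
"""

# Parse the date column and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

dropped = df.dropna()  # keep only rows with no missing values
filled = df.ffill()    # assumption: carry the last observed value forward

print(dropped)
print(filled)
```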
For more details, refer to the function docstrings or the examples provided above.