
The `DataFrame Statistical Analyzer` package provides a utility for statistical analysis of data in a Pandas DataFrame.


DataFrame Statistical Analyzer Utility 📊

The DataFrameAnalyzer project provides a robust and extensible tool for analyzing and visualizing data stored in a Pandas DataFrame. The tool encapsulates various data analysis functionalities, including summary statistics, percentage change computation, outlier detection, trend analysis, moving average calculation, correlation analysis, and seasonal pattern interpretation. The project is designed following the SOLID principles and incorporates design patterns to ensure maintainability and ease of use. 🚀

Features 🌟

  • Summary Statistics: Statistical summary of the DataFrame. 📈
  • Month-to-Month Percentage Changes: Percentage changes between consecutive months. 🔄
  • Outliers Detection (Z-score > 3): DataFrame segments identified as outliers based on Z-score. 🚨
  • Outliers Detection (MAD): DataFrame segments identified as outliers based on Median Absolute Deviation. 📉
  • Trend Analysis (Linear Regression): Slope and intercept of linear trends for numeric columns. 📈
  • Moving Average (3 months window): Moving average values for numeric columns over a 3-month window. 📊
  • Calculating DIPS: DataFrame segments identified as dips below certain thresholds. 📉
  • Calculating Increases: DataFrame segments identified as increases above certain thresholds. 📈
  • Seasonal Patterns: Monthly seasonal patterns identified using Holt-Winters exponential smoothing. 🌿
  • Correlation Analysis: Correlation matrix between numeric columns. 🔗
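Several of these features correspond to short pandas operations. The following is a minimal sketch of what such computations typically look like, not the package's actual implementation. The sample standard deviation, the 0.6745 scaling constant, and the 3.5 cutoff for the MAD-based score are common conventions assumed here; the package may use different definitions.

```python
import pandas as pd

# Sample data matching the Usage section of this README.
df = pd.DataFrame({
    "month": ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"],
    "stock_price": [50.0, 51.5, 49.8, 52.0, 53.2, 54.0,
                    55.0, 56.0, 57.5, 59.0, 60.0, 61.0],
})
prices = df["stock_price"]

# Month-to-month percentage change. The first row has no predecessor, so it is NaN.
pct_change = prices.pct_change() * 100

# 3-month moving average. The first two rows are NaN because the window is incomplete.
moving_avg = prices.rolling(window=3).mean()

# Z-score outliers: values more than 3 sample standard deviations from the mean.
z_scores = (prices - prices.mean()) / prices.std()
z_outliers = df[z_scores.abs() > 3]

# MAD outliers via the modified Z-score (assumed convention: 0.6745 scaling, 3.5 cutoff).
# Note: if more than half the values are identical, the MAD is 0 and this division breaks down.
median = prices.median()
mad = (prices - median).abs().median()
modified_z = 0.6745 * (prices - median) / mad
mad_outliers = df[modified_z.abs() > 3.5]

print(pct_change.round(2).tolist())
print(moving_avg.round(2).tolist())
print(len(z_outliers), len(mad_outliers))
```

For this small, steadily rising series both outlier filters come back empty, which is the expected result: no single month deviates far from the rest.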

Installation 🛠️

  1. Install the package:
    pip install dataframe-statistical-analyzer
    

Usage 🖥️

  1. Import the necessary modules:

    import pandas as pd
    from dataframe_statistical_analyzer import DataFrameAnalyzer
    
  2. Prepare your DataFrame:

    data = {
        "month": ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'],
        "stock_price": [50.0, 51.5, 49.8, 52.0, 53.2, 54.0, 55.0, 56.0, 57.5, 59.0, 60.0, 61.0]
    }
    
    df = pd.DataFrame(data)
    
  3. Initialize the DataFrameAnalyzer with the DataFrame:

    analyzer = DataFrameAnalyzer(df)
    
  4. Perform the analysis:

    analyzer.analyze()
    
  5. Expected outputs: calling the analyze() method of DataFrameAnalyzer produces the following results:

    • Summary Statistics: Statistical summary of the DataFrame.
    • Month-to-Month Percentage Changes: Percentage changes between consecutive months.
    • Outliers Detection (Z-score > 3): DataFrame segments identified as outliers based on Z-score.
    • Outliers Detection (MAD): DataFrame segments identified as outliers based on Median Absolute Deviation.
    • Trend Analysis (Linear Regression): Slope and intercept of linear trends for numeric columns.
    • Moving Average (3 months window): Moving average values for numeric columns over a 3-month window.
    • Calculating DIPS: DataFrame segments identified as dips below certain thresholds.
    • Calculating Increases: DataFrame segments identified as increases above certain thresholds.
    • Seasonal Patterns: Monthly seasonal patterns identified using Holt-Winters exponential smoothing.
    • Correlation Analysis: Correlation matrix between numeric columns.
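To sanity-check a few of these outputs independently, the trend, correlation, and dip/increase results can be approximated with NumPy and pandas alone. This is a hedged sketch, not the package's code: the 2% threshold is a hypothetical value (the README does not state the thresholds used), and the `month_index` column is added only so that the correlation matrix has two numeric columns to compare.

```python
import numpy as np
import pandas as pd

prices = pd.Series(
    [50.0, 51.5, 49.8, 52.0, 53.2, 54.0, 55.0, 56.0, 57.5, 59.0, 60.0, 61.0],
    name="stock_price",
)

# Linear trend: least-squares fit of price = slope * t + intercept,
# where t = 0, 1, ..., 11 is the month position.
t = np.arange(len(prices))
slope, intercept = np.polyfit(t, prices.to_numpy(), deg=1)

# Correlation matrix between numeric columns (Pearson by default).
# month_index is a stand-in second column, assumed for illustration.
numeric = pd.DataFrame({"month_index": t, "stock_price": prices})
corr = numeric.corr()

# Dips and increases: month-to-month changes beyond an assumed 2% threshold.
threshold_pct = 2.0
pct_change = prices.pct_change() * 100
dips = pct_change[pct_change < -threshold_pct]
increases = pct_change[pct_change > threshold_pct]

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(corr.round(3))
print("dips at rows:", list(dips.index))
print("increases at rows:", list(increases.index))
```

With the sample data, the fitted slope is positive (roughly one price unit per month), and the only dip beyond the assumed threshold is the February-to-March drop. Seasonal decomposition with Holt-Winters needs Statsmodels and more than one year of data to estimate a yearly cycle, so it is left out of this sketch.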

Contributing 🤝

We welcome contributions to the DataFrameAnalyzer project. Please fork the repository and submit a pull request with your changes. Ensure your code adheres to the existing style and includes appropriate tests.

License 📜

This project is licensed under the MIT License. See the LICENSE file for more details.

Acknowledgments 🙏

This project utilizes several open-source libraries, including Pandas, Matplotlib, Scipy, Scikit-learn, and Statsmodels. We thank the developers and maintainers of these libraries for their invaluable contributions to the open-source community.
