A Python package for analyzing and detecting stock market crashes using TDA and ML.

Stocks Miner

Stocks Miner is a Python-based tool for financial data analysis, leveraging libraries like yfinance, pandas, matplotlib, seaborn, tqdm, scikit-learn, ripser, and persim to process stock market data. It provides command-line interfaces (CLIs) for four main functionalities:

  1. Analyze SENSEX and NIFTY indices with statistical metrics (CAGR, daily returns, correlations) and visualizations (price trends, returns plots, heatmaps).
  2. Analyze NSE stocks with company-wise metrics (CAGR) and visualizations of top performers.
  3. Analyze randomly selected stocks or sectors with CAGR rankings.
  4. Detect market crashes using Topological Data Analysis (TDA) on user-provided stock data via Takens' embedding, persistent homology, and bottleneck distances.

Installation

You can install the package locally by navigating to the Stocks_Miner directory and running:

pip install .

Or, using the legacy setup script (deprecated for most workflows; prefer pip):

python setup.py install

Note: Run commands from the parent Stocks_Miner/ directory (not inside stocks_miner/) using python -m stocks_miner.cli <command> to handle relative imports.

Usage

  1. Analyze Market Indices

    python -m stocks_miner.cli indices --start_date 2024-01-01 --end_date 2025-09-21
    
  2. Analyze NSE Stocks

    python -m stocks_miner.cli nse --num_tickers 10 --top_x 5 --start_date 2024-01-01 --end_date 2025-09-21
    
  3. Analyze Random Stocks/Sectors

    python -m stocks_miner.cli random --k 5 --selection_type companies --start_date 2024-01-01 --end_date 2025-09-21
    

Stocks Miner

Stocks Miner is a Python tool for financial data analysis. It uses libraries such as yfinance, pandas, matplotlib, seaborn, tqdm, scikit-learn, ripser, and persim to process stock market data. The package exposes a CLI with the following capabilities:

  • Analyze market indices (SENSEX, NIFTY) with metrics and visualizations (CAGR, daily returns, correlations, price trends, heatmaps).
  • Analyze NSE stocks with company-wise metrics and top-performer visualizations.
  • Analyze randomly selected stocks or sectors and rank by CAGR.
  • Detect market crashes using Topological Data Analysis (TDA) on stock time series via Takens' embedding, persistent homology, and bottleneck distances.
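The CAGR and daily-return metrics mentioned above can be sketched in a few lines. This is a minimal illustration, not the package's own implementation: the function names cagr and daily_returns, the 365.25-day year, and the sample prices are all assumptions made for the example.

```python
from datetime import date


def cagr(start_value, end_value, start_date, end_date):
    """Compound annual growth rate between two dated values (hypothetical helper)."""
    # Assumption: a year is 365.25 calendar days.
    years = (end_date - start_date).days / 365.25
    if years <= 0 or start_value <= 0:
        raise ValueError("need a positive start value and end_date after start_date")
    return (end_value / start_value) ** (1 / years) - 1


def daily_returns(closes):
    """Simple day-over-day returns: r_t = P_t / P_(t-1) - 1 (hypothetical helper)."""
    return [curr / prev - 1 for prev, curr in zip(closes, closes[1:])]


# Made-up values, for illustration only.
growth = cagr(100.0, 121.0, date(2020, 1, 1), date(2022, 1, 1))
print(f"CAGR: {growth:.2%}")
print(daily_returns([100.0, 110.0, 99.0]))
```

With real data, the closing prices would typically come from a pandas Series downloaded through yfinance, where Series.pct_change() gives the same daily returns.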

Prerequisites

  • Python 3.8+ is recommended (set this in setup.py with python_requires if you want to enforce it).
  • A virtual environment is strongly recommended to keep dependencies isolated.

Installation

From the repository root (recommended):

# Create and activate a virtual environment (PowerShell)
python -m venv venv
.\venv\Scripts\Activate.ps1

# Install in editable/development mode
python -m pip install -e .

# Or install only the dependencies from requirements.txt
python -m pip install -r requirements.txt

Notes:

  • Prefer python -m pip install -e . or python -m pip install . over python setup.py install, which is deprecated for most workflows.
  • If you publish this package, pin dependency versions in requirements.txt or use setup.cfg/pyproject.toml to manage them.

Usage (examples)

Syntax for VS Code

Analyze market indices (example dates):

python -m stocks_miner.cli indices --start_date 2024-01-01 --end_date 2025-09-21

Analyze NSE stocks:

python -m stocks_miner.cli nse --num_tickers 10 --top_x 5 --start_date 2024-01-01 --end_date 2025-09-21

Analyze random stocks/sectors:

python -m stocks_miner.cli random --k 5 --selection_type companies --start_date 2024-01-01 --end_date 2025-09-21

TDA crash detection for a ticker:

python -m stocks_miner.cli tda --ticker <TICKER> --start_date 2024-01-01 --end_date 2025-09-21

For full help and options:

python -m stocks_miner.cli --help
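For readers curious how a subcommand layout like the one above is usually wired, here is a rough argparse sketch. It only mirrors the commands and flags shown in the examples; the real stocks_miner/cli.py may be organized differently, and the defaults, the choices list, and the build_parser name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented subcommands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="stocks_miner.cli")
    sub = parser.add_subparsers(dest="command", required=True)

    def add_dates(p):
        # Every documented command takes the same date range flags.
        p.add_argument("--start_date", required=True)
        p.add_argument("--end_date", required=True)

    indices = sub.add_parser("indices", help="analyze SENSEX and NIFTY")
    add_dates(indices)

    nse = sub.add_parser("nse", help="analyze NSE stocks")
    nse.add_argument("--num_tickers", type=int, default=10)  # assumed default
    nse.add_argument("--top_x", type=int, default=5)  # assumed default
    add_dates(nse)

    rnd = sub.add_parser("random", help="analyze random stocks or sectors")
    rnd.add_argument("--k", type=int, default=5)  # assumed default
    rnd.add_argument("--selection_type", choices=["companies", "sectors"],
                     default="companies")
    add_dates(rnd)

    tda = sub.add_parser("tda", help="TDA crash detection")
    tda.add_argument("--ticker", required=True)
    add_dates(tda)
    return parser


args = build_parser().parse_args(
    ["nse", "--num_tickers", "10", "--top_x", "5",
     "--start_date", "2024-01-01", "--end_date", "2025-09-21"]
)
print(args.command, args.num_tickers, args.top_x)
```

Sharing the date flags through one helper keeps the four commands consistent, which matches how the examples above all accept --start_date and --end_date.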

Syntax for notebook

Cell 1: Setup and Import

import sys
import os

# Add stocks_miner to path
sys.path.insert(0, os.path.join(os.getcwd(), 'stocks_miner'))

# Import the helper
from notebook_helper import setup_stocks_miner

# Setup Stocks Miner and get the modules
sm = setup_stocks_miner()

# Optional: Also load random_stocks
from stocks_miner import random_stocks
sm.random_stocks = random_stocks

print("✓ All modules loaded!")

Cell 2: Analyze Market Indices

# Analyze major market indices (NIFTY 50, SENSEX, etc.)
sm.market_indices.analyze_market_indices(
    start_date="2025-01-01",
    end_date="2025-09-21",
)

Cell 3: Analyze NSE Stocks

# Analyze top NSE stocks
sm.nse_stocks.analyze_nse_stocks(
    num_tickers=10,  # Number of stocks to analyze
    top_x=5,         # Top performers to identify
    start_date="2025-01-01",
    end_date="2025-09-21",
)
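To make the num_tickers and top_x parameters concrete, the following sketch ranks a handful of tickers by CAGR and keeps the best few. It is only an illustration of the idea: top_performers is a hypothetical helper, the tickers and prices are made up, and annualizing with 252 trading days per year is an assumption rather than something the package documents.

```python
def top_performers(price_history, top_x=5, trading_days_per_year=252):
    """Rank tickers by CAGR and return the best top_x as (ticker, cagr) pairs.

    price_history maps a ticker to its closing prices in date order.
    """
    scores = {}
    for ticker, closes in price_history.items():
        if len(closes) < 2 or closes[0] <= 0:
            continue  # not enough data to compute a growth rate
        years = (len(closes) - 1) / trading_days_per_year
        scores[ticker] = (closes[-1] / closes[0]) ** (1 / years) - 1
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_x]


# Made-up one-year histories (253 closes = 252 trading intervals).
history = {
    "AAA": [100.0] * 252 + [120.0],
    "BBB": [100.0] * 252 + [90.0],
    "CCC": [100.0] * 252 + [150.0],
}
for ticker, growth in top_performers(history, top_x=2):
    print(f"{ticker}: {growth:.1%}")
```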

Cell 4: Analyze Random Stocks or Sectors

# Analyze random selection of stocks or sectors
sm.random_stocks.analyze_random_stocks_or_sectors(
    k=5,                         # Number to select
    selection_type='companies',  # 'companies' or 'sectors'
    start_date="2025-01-01",
    end_date="2025-09-21",
)

Cell 5: TDA Crash Detection (Advanced)

# Topological Data Analysis for crash detection
sm.tda_crash.process_stock_data(
    ticker='RELIANCE.NS',
    start_date='2020-01-01',
    end_date='2024-12-31',
    window_size=50,
    embedding_dim=3,
    time_delay=1,
)
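The window_size, embedding_dim, and time_delay arguments correspond to a sliding-window pipeline that can be sketched roughly as follows. This is an assumption about the general technique, not a copy of what process_stock_data does internally: takens_embedding, safe_bottleneck, and window_distances are hypothetical names, and comparing H1 diagrams of consecutive windows is one common choice among several.

```python
import numpy as np


def takens_embedding(series, dim=3, delay=1):
    """Delay-coordinate (Takens') embedding of a 1-D series.

    Row i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    """
    x = np.asarray(series, dtype=float)
    n_points = len(x) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("series too short for this dim/delay")
    return np.column_stack(
        [x[j * delay: j * delay + n_points] for j in range(dim)]
    )


def safe_bottleneck(dgm_a, dgm_b):
    """Bottleneck distance that also handles empty diagrams."""
    from persim import bottleneck  # lazy import: needs the TDA stack

    if len(dgm_a) == 0 and len(dgm_b) == 0:
        return 0.0
    if len(dgm_a) == 0 or len(dgm_b) == 0:
        other = dgm_a if len(dgm_a) else dgm_b
        # Every point is matched to the diagonal at cost (death - birth) / 2.
        return float(np.max((other[:, 1] - other[:, 0]) / 2.0))
    return float(bottleneck(dgm_a, dgm_b))


def window_distances(series, window_size=50, dim=3, delay=1):
    """Bottleneck distances between H1 diagrams of consecutive windows."""
    from ripser import ripser  # lazy import: needs the TDA stack

    x = np.asarray(series, dtype=float)
    diagrams = []
    for start in range(len(x) - window_size + 1):
        cloud = takens_embedding(x[start:start + window_size], dim, delay)
        diagrams.append(ripser(cloud, maxdim=1)["dgms"][1])
    return [safe_bottleneck(a, b) for a, b in zip(diagrams, diagrams[1:])]


# The embedding alone needs only NumPy.
print(takens_embedding([1, 2, 3, 4, 5], dim=3, delay=1))
```

Large jumps in the resulting distance series indicate that the shape of the embedded price dynamics changed abruptly between windows, which is the kind of signal a crash detector would threshold. Sliding one step at a time is expensive for long series, so a larger stride may be preferable in practice.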

Directory structure

Stocks_Miner/
├── data/                   # User-provided CSV/XLSX files for TDA
├── examples/               # Example scripts or notebooks (optional)
├── stocks_miner/           # Main Python package
│   ├── __init__.py
│   ├── cli.py
│   ├── market_indices.py
│   ├── nse_stocks.py
│   ├── random_stocks.py
│   ├── tda_crash_detection.py
│   └── utils.py
├── tests/                  # Unit tests (optional)
├── LICENSE
├── README.md
├── requirements.txt
└── setup.py

Dependencies

Major dependencies include: pandas, numpy, yfinance, matplotlib, seaborn, tqdm, scikit-learn, ripser, persim, and yahooquery.

Install them via the requirements.txt file as shown above.

Notes & recommendations

  • Grammar/wording: "Takens' embedding" is a clearer form than "Takens embedding".
  • Git: avoid committing generated artifacts such as virtual environments and Python bytecode. Add a .gitignore (example below) and remove tracked artifacts from the repo if present.

Example .gitignore snippet to add at the repo root:

# Virtual envs
venv/
.venv/

# Byte-compiled / caches
__pycache__/
*.py[cod]
*$py.class

# Packaging
dist/
build/
*.egg-info/

Tests

If you have tests under tests/, run them with your test runner (e.g., pytest) after activating the virtual environment:

pytest -q

Authors

Soumyadip Das and Rajdeep Chatterjee

Organization

AmygdalaAI-India Lab [https://amygdalaaiindia.github.io/]

License

This project is licensed under the MIT License. See the LICENSE file for details.


Download files

Source Distribution

stocks_miner-0.1.0.tar.gz (15.3 kB view details)

Uploaded Source

Built Distribution

stocks_miner-0.1.0-py3-none-any.whl (15.6 kB view details)

Uploaded Python 3

File details

Details for the file stocks_miner-0.1.0.tar.gz.

File metadata

  • Download URL: stocks_miner-0.1.0.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.23

File hashes

Hashes for stocks_miner-0.1.0.tar.gz
Algorithm Hash digest
SHA256 99f3a74e1ec1a17af177433c60775c30c567664b70f4c46e3571929fd28f518f
MD5 ec094cc44cda89e0495ded6377f2143d
BLAKE2b-256 7a88fffd7760194b4f2a4f6369bd40dbffe6ba1158818f77227ac96c4195b3e8

File details

Details for the file stocks_miner-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: stocks_miner-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 15.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.23

File hashes

Hashes for stocks_miner-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 11ddf46747642a8672057d4c10fdc787eb086645c6a84d5b6483e5c8e41189e3
MD5 84fb7be22b35a95b4ca410de7e83a35f
BLAKE2b-256 538bc1783ad0971b7556f359d29bd03857c206ebebe6085b5b5f550f8409ead1
