A Python package for analyzing and detecting stock market crashes using TDA and ML.
Stocks Miner
Stocks Miner is a Python-based tool for financial data analysis, leveraging libraries like yfinance, pandas, matplotlib, seaborn, tqdm, scikit-learn, ripser, and persim to process stock market data. It provides command-line interfaces (CLIs) for four main functionalities:
- Analyze SENSEX and NIFTY indices with statistical metrics (CAGR, daily returns, correlations) and visualizations (price trends, returns plots, heatmaps).
- Analyze NSE stocks with company-wise metrics (CAGR) and visualizations of top performers.
- Analyze randomly selected stocks or sectors with CAGR rankings.
- Detect market crashes using Topological Data Analysis (TDA) on user-provided stock data via Takens' embedding, persistent homology, and bottleneck distances.
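The statistical metrics named in the list above (CAGR, daily returns, correlations) can be sketched in a few lines. This is a minimal illustration using only the standard library so the formulas stay visible; the package itself lists pandas as a dependency and may compute these differently. The function names and the price values are made up for the example.

```python
from datetime import date
from math import sqrt


def cagr(start_value, end_value, start_date, end_date):
    """Compound annual growth rate between two dated values."""
    years = (end_date - start_date).days / 365.25
    if years <= 0 or start_value <= 0:
        raise ValueError("need a positive start value and end_date after start_date")
    return (end_value / start_value) ** (1 / years) - 1


def daily_returns(prices):
    """Simple returns: price[t] / price[t-1] - 1 for consecutive prices."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up closing prices for two hypothetical indices.
index_a = [100.0, 102.0, 101.0, 105.0, 110.0]
index_b = [200.0, 203.0, 202.0, 208.0, 215.0]

growth = cagr(100.0, 121.0, date(2023, 1, 1), date(2025, 1, 1))
corr = pearson(daily_returns(index_a), daily_returns(index_b))

print(f"CAGR: {growth:.2%}")
print(f"Correlation of daily returns: {corr:.3f}")
```

With real data, the price lists would come from a downloaded closing-price column rather than hard-coded values.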
Installation
You can install the package locally by navigating to the Stocks_Miner directory and running:
pip install .
Alternatively, use the legacy setup.py command (deprecated for most workflows; prefer pip):
python setup.py install
Note: Run commands from the parent Stocks_Miner/ directory (not from inside stocks_miner/) as python -m stocks_miner.cli <command> so that the package's relative imports resolve.
Usage
- Analyze Market Indices
  python -m stocks_miner.cli indices --start_date 2024-01-01 --end_date 2025-09-21
- Analyze NSE Stocks
  python -m stocks_miner.cli nse --num_tickers 10 --top_x 5 --start_date 2024-01-01 --end_date 2025-09-21
- Analyze Random Stocks/Sectors
  python -m stocks_miner.cli random --k 5 --selection_type companies --start_date 2024-01-01 --end_date 2025-09-21
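To make the random command's idea concrete, here is a small sketch of picking k tickers at random and ranking them by CAGR. It is an assumption about the general approach, not the package's actual code: the function name `pick_and_rank`, the ticker symbols, and the CAGR values are all hypothetical.

```python
import random


def pick_and_rank(cagr_by_ticker, k, seed=None):
    """Sample k tickers at random and return (ticker, cagr) pairs, best first."""
    rng = random.Random(seed)  # a seed makes the selection reproducible
    chosen = rng.sample(sorted(cagr_by_ticker), k)
    return sorted(
        ((ticker, cagr_by_ticker[ticker]) for ticker in chosen),
        key=lambda item: item[1],
        reverse=True,
    )


# Hypothetical tickers with made-up CAGR values.
universe = {
    "AAA.NS": 0.12,
    "BBB.NS": -0.03,
    "CCC.NS": 0.25,
    "DDD.NS": 0.07,
    "EEE.NS": 0.18,
    "FFF.NS": 0.01,
}

ranking = pick_and_rank(universe, k=3, seed=42)
for ticker, growth in ranking:
    print(f"{ticker}: {growth:.1%}")
```

Passing a seed is a design choice for reproducible runs; omitting it gives a different selection each time.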
Prerequisites
- Python 3.8+ is recommended (set this in setup.py with python_requires if you want to enforce it).
- A virtual environment is strongly recommended to keep dependencies isolated.
Installation
From the repository root (recommended):
# Create and activate a virtual environment (PowerShell)
python -m venv venv
.\venv\Scripts\Activate.ps1
# Install in editable/development mode
python -m pip install -e .
# Or install dependencies from requirements.txt
python -m pip install -r requirements.txt
Notes:
- Prefer python -m pip install -e . or python -m pip install . over python setup.py install, which is deprecated for most workflows.
- If you publish this package, pin dependency versions in requirements.txt or use setup.cfg/pyproject.toml to manage them.
Usage (examples)
Syntax for VS Code
Analyze market indices (example dates):
python -m stocks_miner.cli indices --start_date 2024-01-01 --end_date 2025-09-21
Analyze NSE stocks:
python -m stocks_miner.cli nse --num_tickers 10 --top_x 5 --start_date 2024-01-01 --end_date 2025-09-21
Analyze random stocks/sectors:
python -m stocks_miner.cli random --k 5 --selection_type companies --start_date 2024-01-01 --end_date 2025-09-21
TDA crash detection for a ticker:
python -m stocks_miner.cli tda --ticker <TICKER> --start_date 2024-01-01 --end_date 2025-09-21
For full help and options:
python -m stocks_miner.cli --help
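For readers curious how a CLI with these subcommands could be wired, the following is a sketch using argparse from the standard library. It mirrors only the subcommand and flag names shown in the commands above; whether the real cli.py uses argparse, which flags are required, and the default values are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser with the four subcommands shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="stocks_miner.cli")
    subparsers = parser.add_subparsers(dest="command", required=True)

    def add_date_range(subparser):
        # Dates are kept as strings here; treating them as required is an assumption.
        subparser.add_argument("--start_date", required=True)
        subparser.add_argument("--end_date", required=True)

    indices = subparsers.add_parser("indices", help="Analyze market indices")
    add_date_range(indices)

    nse = subparsers.add_parser("nse", help="Analyze NSE stocks")
    nse.add_argument("--num_tickers", type=int, default=10)  # assumed default
    nse.add_argument("--top_x", type=int, default=5)  # assumed default
    add_date_range(nse)

    rnd = subparsers.add_parser("random", help="Analyze random stocks or sectors")
    rnd.add_argument("--k", type=int, default=5)  # assumed default
    rnd.add_argument("--selection_type", choices=["companies", "sectors"],
                     default="companies")
    add_date_range(rnd)

    tda = subparsers.add_parser("tda", help="TDA crash detection for a ticker")
    tda.add_argument("--ticker", required=True)
    add_date_range(tda)

    return parser


# Parse the same arguments as the NSE example command.
args = build_parser().parse_args(
    ["nse", "--num_tickers", "10", "--top_x", "5",
     "--start_date", "2024-01-01", "--end_date", "2025-09-21"]
)
print(args.command, args.num_tickers, args.top_x, args.start_date, args.end_date)
```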
Syntax for notebook
Cell 1: Setup and Import
import sys
import os
# Add stocks_miner to path
sys.path.insert(0, os.path.join(os.getcwd(), 'stocks_miner'))
# Import the helper
from notebook_helper import setup_stocks_miner
# Set up Stocks Miner and get the modules
sm = setup_stocks_miner()
# Optional: also load random_stocks
from stocks_miner import random_stocks
sm.random_stocks = random_stocks
print("✓ All modules loaded!")
Cell 2: Analyze Market Indices
# Analyze major market indices (NIFTY 50, SENSEX, etc.)
sm.market_indices.analyze_market_indices(
    start_date="2025-01-01",
    end_date="2025-09-21"
)
Cell 3: Analyze NSE Stocks
# Analyze top NSE stocks
sm.nse_stocks.analyze_nse_stocks(
    num_tickers=10,  # Number of stocks to analyze
    top_x=5,  # Top performers to identify
    start_date="2025-01-01",
    end_date="2025-09-21"
)
Cell 4: Analyze Random Stocks or Sectors
# Analyze random selection of stocks or sectors
sm.random_stocks.analyze_random_stocks_or_sectors(
    k=5,  # Number to select
    selection_type='companies',  # 'companies' or 'sectors'
    start_date="2025-01-01",
    end_date="2025-09-21"
)
Cell 5: TDA Crash Detection (Advanced)
# Topological Data Analysis for crash detection
sm.tda_crash.process_stock_data(
    ticker='RELIANCE.NS',
    start_date='2020-01-01',
    end_date='2024-12-31',
    window_size=50,
    embedding_dim=3,
    time_delay=1
)
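The embedding_dim, time_delay, and window_size parameters in the call above correspond to the first stage of the TDA pipeline: turning a one-dimensional price series into point clouds. The sketch below shows that stage only, in plain Python, under the assumption that windows slide over the embedded points; the package may window the raw series instead. The helper names are hypothetical. Persistent homology and bottleneck distances (for which the package lists ripser and persim) would then be computed on each window and are not shown here.

```python
def takens_embedding(series, embedding_dim=3, time_delay=1):
    """Map a 1-D series to delay vectors (x[t], x[t+d], ..., x[t+(m-1)d])."""
    n_points = len(series) - (embedding_dim - 1) * time_delay
    if n_points <= 0:
        raise ValueError("series too short for this embedding_dim and time_delay")
    return [
        tuple(series[i + j * time_delay] for j in range(embedding_dim))
        for i in range(n_points)
    ]


def sliding_windows(points, window_size):
    """Return every run of window_size consecutive points, stepping by one."""
    return [
        points[i:i + window_size]
        for i in range(len(points) - window_size + 1)
    ]


# A made-up series standing in for closing prices or log returns.
series = [float(x) for x in range(10)]

cloud = takens_embedding(series, embedding_dim=3, time_delay=2)
windows = sliding_windows(cloud, window_size=4)

print(f"{len(cloud)} embedded points, first one: {cloud[0]}")
print(f"{len(windows)} windows of {len(windows[0])} points each")
```

Each window is a small point cloud in embedding_dim dimensions; comparing the persistence diagrams of consecutive windows is what would flag abrupt changes in the series.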
Directory structure
Stocks_Miner/
├── data/ # User-provided CSV/XLSX files for TDA
├── examples/ # Example scripts or notebooks (optional)
├── stocks_miner/ # Main Python package
│ ├── __init__.py
│ ├── cli.py
│ ├── market_indices.py
│ ├── nse_stocks.py
│ ├── random_stocks.py
│ ├── tda_crash_detection.py
│ └── utils.py
├── tests/ # Unit tests (optional)
├── LICENSE
├── README.md
├── requirements.txt
└── setup.py
Dependencies
Major dependencies include: pandas, numpy, yfinance, matplotlib, seaborn, tqdm, scikit-learn, ripser, persim, and yahooquery.
Install them via the requirements.txt file as shown above.
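After installing, a quick way to confirm that the dependencies are importable is to look each one up without importing it. This is a small optional sketch using the standard library; the list uses import names, which is why scikit-learn appears as sklearn, and the helper name is hypothetical.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


# Import names of the dependencies listed above (scikit-learn imports as "sklearn").
REQUIRED = [
    "pandas", "numpy", "yfinance", "matplotlib", "seaborn",
    "tqdm", "sklearn", "ripser", "persim", "yahooquery",
]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```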
Notes & recommendations
- Grammar/wording: "Takens' embedding" is a clearer form than "Takens embedding".
- Git: avoid committing generated artifacts such as virtual environments and Python bytecode. Add a .gitignore (example below) and remove tracked artifacts from the repo if present.
Example .gitignore snippet to add at the repo root:
# Virtual envs
venv/
.venv/
# Byte-compiled / caches
__pycache__/
*.py[cod]
*$py.class
# Packaging
dist/
build/
*.egg-info/
Tests
If you have tests under tests/, run them with your test runner (e.g., pytest) after activating the virtual environment:
pytest -q
Authors
Soumyadip Das and Rajdeep Chatterjee
Organization
AmygdalaAI-India Lab [https://amygdalaaiindia.github.io/]
License
This project is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file stocks_miner-0.1.0.tar.gz.
File metadata
- Download URL: stocks_miner-0.1.0.tar.gz
- Upload date:
- Size: 15.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 99f3a74e1ec1a17af177433c60775c30c567664b70f4c46e3571929fd28f518f |
| MD5 | ec094cc44cda89e0495ded6377f2143d |
| BLAKE2b-256 | 7a88fffd7760194b4f2a4f6369bd40dbffe6ba1158818f77227ac96c4195b3e8 |
File details
Details for the file stocks_miner-0.1.0-py3-none-any.whl.
File metadata
- Download URL: stocks_miner-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 11ddf46747642a8672057d4c10fdc787eb086645c6a84d5b6483e5c8e41189e3 |
| MD5 | 84fb7be22b35a95b4ca410de7e83a35f |
| BLAKE2b-256 | 538bc1783ad0971b7556f359d29bd03857c206ebebe6085b5b5f550f8409ead1 |