Fast, robots.txt-respecting NSE India market data collector for swing trading, quant research, and backtesting

nsefast

Fast NSE India data collector for swing trading, quant research, AI training, backtesting, and market intelligence.

⚠️ Ethics & Compliance: nsefast only uses publicly downloadable NSE reports and pages allowed by NSE's robots.txt. It does not bypass logins, captchas, Cloudflare, anti-bot systems, or rate limits. Add appropriate delays and use responsibly. You are responsible for complying with NSE's terms of service.
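As an illustration of the kind of robots.txt check described above, here is a minimal sketch using only the Python standard library. It is not nsefast's own robots.py; the helper name `is_allowed`, the sample rules, and the example.com URLs are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits fetching `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only (not NSE's actual robots.txt).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/reports/daily.csv"))  # True
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/data.csv"))   # False
```

In a real client the robots.txt text would be fetched once per host and cached, and a disallowed URL would simply be skipped rather than retried.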

Features

  • Polite, retrying HTTP client with robots.txt checks
  • Modular collectors for equity, derivatives, corporate, deals, indices, surveillance, calendar, and master data
  • Smart-signals layer (processing.technicals + swing.scoring / relative_strength / breakouts): full indicator pack (SMA / EMA / RSI / MACD / Bollinger / Donchian / SuperTrend / ADX / OBV), multi-timeframe Z-scores, sector-relative strength, 52-week and Donchian breakouts, VCP-style consolidation
  • Risk engine (swing.risk + swing.portfolio): per-trade ATR / chandelier / trailing / swing-low stops, vol-targeted sizing, and a portfolio-level water-fill allocator with single-name, sector, and correlation-cluster caps
  • Polars for fast dataframe processing
  • Parquet primary storage, partitioned by dataset/date
  • DuckDB local analytics layer
  • Optional PostgreSQL storage
  • Optional Rust core (rust-core/) for hashing / dedup / large parsing
  • Typer-based CLI
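To make "polite, retrying" concrete, the sketch below shows one common way to retry a call with exponential backoff. It is an illustration under assumptions, not the code in nsefast's http_client.py: the function name `retry_with_backoff`, its parameters, and the delay schedule are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn() up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
    raise ValueError("attempts must be at least 1")


# Usage: a fake request that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays: list[float] = []  # record delays instead of actually sleeping
result = retry_with_backoff(flaky, attempts=4, base_delay=1.0, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the example fast to run and easy to test; a production client would use the default `time.sleep`.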

Install

pip install nsefast

Optional extras:

pip install "nsefast[pandas]"      # pandas export helpers
pip install "nsefast[postgres]"    # PostgreSQL sink
pip install "nsefast[api]"         # FastAPI server scaffold
pip install "nsefast[dev]"         # pytest, ruff, build, twine

For development:

git clone https://github.com/nikhilshinde/nsefast
cd nsefast
pip install -e ".[dev]"
pytest -q

Quick start

# Discover all downloadable report links from NSE public pages
nsefast collect-reports

# Run the full scaffold
nsefast collect-all

# Equity bhavcopy for a date
nsefast collect equity-bhavcopy --date 2026-05-07

# Corporate announcements range
nsefast collect corporate-announcements --start 2026-05-01 --end 2026-05-07

# Build swing-trading features
nsefast features swing --date 2026-05-07

# Export a dataset to Parquet
nsefast export parquet --dataset daily_bhavcopy

In Python:

from nsefast.collectors.report_links import collect_report_links
from nsefast.storage.parquet_store import save_parquet

df = collect_report_links()  # polars DataFrame
save_parquet(df, dataset="report_links")

Project layout

nsefast/
├── pyproject.toml
├── requirements.txt
├── main.py
├── README.md
│
├── nsefast/
│   ├── config.py          # URLs, headers, paths
│   ├── http_client.py     # session + retries
│   ├── robots.py          # robots.txt checker
│   ├── collectors/        # one module per data domain
│   ├── processing/        # normalize, features, adjustments, technicals
│   ├── swing/             # filters, scoring, RS, breakouts, risk, portfolio,
│   │                      # scanner, watchlist, backtest
│   ├── master/            # symbol survivorship, index constituents
│   ├── storage/           # parquet, duckdb, postgres
│   └── cli.py             # Typer CLI
│
└── rust-core/             # optional pyo3 module
    ├── Cargo.toml
    └── src/lib.rs

Storage zones

  • data/raw/ — raw downloads exactly as fetched
  • data/clean/ — normalized intermediate files
  • data/parquet/ — partitioned Parquet, the canonical store

Rust core (optional)

The rust-core/ crate exposes an nsefast_core Python module via PyO3 for CPU-bound work (SHA-256 hashing, dedup, fast CSV normalization). HTTP scraping stays in Python because it is I/O-bound.
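For readers unfamiliar with hash-based dedup, the following pure-Python sketch shows the idea the Rust module accelerates. It does not use nsefast_core; the function names `sha256_hex` and `dedup_by_hash` and the sample rows are assumptions for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def dedup_by_hash(payloads: list[bytes]) -> list[bytes]:
    """Keep the first occurrence of each payload, comparing by SHA-256 digest."""
    seen: set[str] = set()
    unique: list[bytes] = []
    for payload in payloads:
        digest = sha256_hex(payload)
        if digest not in seen:
            seen.add(digest)
            unique.append(payload)
    return unique


# Made-up rows: the third is a byte-for-byte duplicate of the first.
rows = [b"TCS,3500", b"INFY,1500", b"TCS,3500"]
unique_rows = dedup_by_hash(rows)
print(unique_rows)
```

Storing digests rather than full payloads keeps the `seen` set small when individual rows or files are large.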

Build with maturin:

cd rust-core
maturin develop --release

Verify your install

pip install nsefast
nsefast verify              # offline checks: imports, parquet, duckdb
nsefast verify --network    # also pings NSE warm-up + robots.txt
nsefast version

Cache, logging, partitioning

# Cache (5-min TTL by default; collectors opt in via cached_get())
nsefast cache stats
nsefast cache clear

# Structured JSON logs (for production / log shippers)
NSEFAST_LOG_FORMAT=json NSEFAST_LOG_LEVEL=INFO nsefast collect bulk-deals --start 2026-04-01 --end 2026-05-07

# Hive-partitioned parquet writes
from nsefast.storage.parquet_store import (
    save_parquet_partitioned, read_parquet_partitioned, derive_date_partitions,
)
df = derive_date_partitions(df, "trade_date", parts=("year", "month"))
save_parquet_partitioned(df, dataset="daily_bhavcopy", by=["year", "month"])
# -> data/parquet/daily_bhavcopy/year=2026/month=05/*.parquet

q1 = read_parquet_partitioned("daily_bhavcopy",
                              filters=[("year","==",2026), ("month",">=",4)])

# DuckDB analytics
from nsefast.storage.duckdb_store import (
    connect, register_all, top_gainers, sector_leaderboard,
)
con = connect()
register_all(con)
top_gainers(con, dataset="all_indices", n=10)
sector_leaderboard(con, dataset="sector_strength")
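The Hive-style directory layout shown in the partitioning example (year=2026/month=05) can be derived from a trade date as sketched below. This is a standalone illustration, not nsefast's implementation; `hive_partition_dir` is a hypothetical helper name.

```python
from datetime import date
from pathlib import PurePosixPath


def hive_partition_dir(root: str, dataset: str, trade_date: date) -> PurePosixPath:
    """Build <root>/<dataset>/year=YYYY/month=MM for a given trade date."""
    return (
        PurePosixPath(root)
        / dataset
        / f"year={trade_date.year}"
        / f"month={trade_date.month:02d}"  # zero-padded so directories sort correctly
    )


path = hive_partition_dir("data/parquet", "daily_bhavcopy", date(2026, 5, 7))
print(path)  # data/parquet/daily_bhavcopy/year=2026/month=05
```

Because the partition values live in the directory names, a reader that filters on year or month can skip whole directories without opening any files.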

Swing-trading research (nsefast.swing)

from nsefast.collectors.equity   import daily_bhavcopy, delivery_data
from nsefast.collectors.indices  import sector_strength
from nsefast.collectors.deals    import bulk_deals
from nsefast.collectors.corporate import corporate_announcements
from nsefast.processing.features import add_volume_breakout
from nsefast.swing import (
    top_upside, top_downside, avoid_list, sector_leaders,
    delivery_breakout, volume_breakout,
    bulk_block_watchlist, corporate_announcement_watchlist, combined_watchlist,
)

bhav = daily_bhavcopy("2026-05-07")
bhav = add_volume_breakout(bhav)            # adds avg_volume_20

# Long candidates (filtered, scored, ranked)
top_upside(bhav, n=20, min_turnover=1e7)

# Weakest names (short candidates)
top_downside(bhav, n=20)

# What to skip (surveillance + extreme-move list)
avoid_list(bhav, max_volatility_pct=15.0)

# Sector rotation
sector_leaders(sector_strength(), n=5)

# Sticky-money & spike scans
delivery_breakout(delivery_data("2026-05-07"), min_delivery_pct=70)
volume_breakout(bhav, min_ratio=2.0)

# Smart-money & event watchlists
bulk_block_watchlist(bulk_deals("2026-04-01", "2026-05-07"), min_qty=10_000)
corporate_announcement_watchlist(corporate_announcements("2026-04-01", "2026-05-07"))
combined_watchlist(deals_df=..., ann_df=...)

Smart signals (technicals, Z-scores, RS, breakouts)

from nsefast.processing.technicals import (
    add_sma, add_ema, add_rsi, add_macd, add_bollinger,
    add_donchian, add_supertrend, add_adx, add_obv,
    add_all_technicals,
)
from nsefast.swing import (
    add_multi_timeframe_zscores,
    add_relative_strength, add_rs_score,
    near_52w_high, donchian_breakout, consolidation_breakout,
    add_gap_pct, add_range_atr,
)

# Full indicator pack on a multi-symbol panel (per-symbol via .over("symbol"))
panel = add_all_technicals(history_df)

# 5/20/60-day momentum & volume Z-scores
panel = add_multi_timeframe_zscores(panel)        # mom_z_5/20/60, vol_z_5/20/60

# Sector-relative strength vs a benchmark (e.g. NIFTY)
panel = add_relative_strength(panel, benchmark_df, lookback=20)
panel = add_rs_score(panel, rs_col="rs_20")       # cross-sectional 0-100 percentile

# Breakout filters
near_52w_high(panel, within_pct=2.0)              # within 2% of 52w high
donchian_breakout(panel, n=20)                    # close > prior 20-bar high (today excluded)
consolidation_breakout(panel)                     # VCP-style range expansion

panel = add_gap_pct(panel)                        # uses prev_close if present
panel = add_range_atr(panel, period=14)           # > 1 = expansion day
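The "close > prior 20-bar high (today excluded)" rule is worth spelling out, since including today's bar makes a breakout impossible (a close can never exceed its own bar's high). The pure-Python sketch below illustrates the rule on plain lists; it is not nsefast's donchian_breakout, and the function name and sample prices are assumptions.

```python
def donchian_breakout_flags(
    closes: list[float], highs: list[float], n: int = 20
) -> list[bool]:
    """Flag bars whose close exceeds the highest high of the previous n bars."""
    flags: list[bool] = []
    for i in range(len(closes)):
        if i < n:
            flags.append(False)  # not enough history yet
            continue
        prior_high = max(highs[i - n:i])  # slice ends before bar i: today excluded
        flags.append(closes[i] > prior_high)
    return flags


# Made-up prices with a 3-bar lookback to keep the example small.
highs = [10.0, 11.0, 12.0, 11.0, 13.0]
closes = [9.0, 10.0, 11.0, 10.0, 12.5]
flags = donchian_breakout_flags(closes, highs, n=3)
print(flags)
```

On the last bar the prior three highs are 11, 12, and 11, so the close of 12.5 clears the channel even though it sits below that bar's own high of 13.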

Per-trade sizing & stops (swing.risk)

from nsefast.swing.risk import (
    position_size, add_atr, add_atr_stop, add_position_size,
    add_chandelier_stop, add_trailing_atr_stop, add_swing_low_stop,
    add_vol_target_size,
)

# Classic equal-rupee risk sizing
qty = position_size(capital=500_000, entry=120, stop=115, risk_pct=1.0)

# Stop variants — pick one to fit the setup
sized = (bhav
         .pipe(add_atr)
         .pipe(add_chandelier_stop, period=22, mult=3.0)   # rolling-high anchored
         .pipe(add_trailing_atr_stop, period=14, mult=3.0) # never moves down
         .pipe(add_swing_low_stop,   window=10))           # pure price-action

# Vol-targeted sizing: each position contributes equal daily rupee P&L vol
# (qty = capital * target_daily_vol_pct / 100 / atr)
sized = sized.pipe(add_vol_target_size, capital=500_000,
                   target_daily_vol_pct=0.5)
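The arithmetic behind the two sizing calls above can be worked through by hand. The sketch below re-implements it in plain Python under stated assumptions: the function names are hypothetical, rounding down to whole shares is an assumption, and the ATR of 5.0 is a made-up value. The vol-target formula is the one quoted in the comment above.

```python
import math


def fixed_risk_qty(capital: float, entry: float, stop: float, risk_pct: float) -> int:
    """Shares such that hitting the stop loses about risk_pct of capital."""
    risk_rupees = capital * risk_pct / 100   # rupees put at risk on this trade
    per_share_risk = abs(entry - stop)       # rupees lost per share at the stop
    if per_share_risk == 0:
        return 0
    return math.floor(risk_rupees / per_share_risk)  # assumption: round down


def vol_target_qty(capital: float, target_daily_vol_pct: float, atr: float) -> int:
    """qty = capital * target_daily_vol_pct / 100 / atr, rounded down."""
    if atr <= 0:
        return 0
    return math.floor(capital * target_daily_vol_pct / 100 / atr)


# 1% of 500,000 is 5,000 rupees at risk; 5 rupees per share gives 1,000 shares.
qty_fixed = fixed_risk_qty(capital=500_000, entry=120, stop=115, risk_pct=1.0)

# 0.5% of 500,000 is 2,500 rupees of daily vol; an ATR of 5 gives 500 shares.
qty_vol = vol_target_qty(capital=500_000, target_daily_vol_pct=0.5, atr=5.0)

print(qty_fixed, qty_vol)
```

The two methods answer different questions: the first fixes the loss at the stop, the second fixes the expected daily swing regardless of where the stop sits.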

Portfolio-level allocation (swing.portfolio)

from nsefast.swing import (
    portfolio_size, correlation_clusters,
    add_relative_strength, add_rs_score,
)

# 1. Rank by relative strength and take the top names
ranked = (history_df
          .pipe(add_relative_strength, benchmark_df, lookback=20)
          .pipe(add_rs_score, rs_col="rs_20"))
picks  = (ranked.sort("rs_20_score", descending=True)
                .head(15)
                .join(symbol_meta, on="symbol", how="left"))   # adds 'sector'

# 2. Cluster correlated names so TCS/INFY/WIPRO is one bet, not three
clusters = correlation_clusters(history_df,
                                symbols=picks["symbol"].to_list(),
                                lookback=60, threshold=0.7)
picks    = picks.join(clusters, on="symbol", how="left")

# 3. Water-fill allocation under single / sector / cluster caps.
#    Residual is redistributed to names with headroom — no avoidable cash drag.
book = portfolio_size(
    picks, capital=10_00_000,
    max_positions=10,
    max_single_pct=10.0,        # no name > 10% of capital
    max_sector_pct=30.0,        # no sector > 30%
    max_cluster_pct=20.0,       # no correlation cluster > 20%
)
# Output columns: symbol, weight, allocated_pct, allocated_rs, capped_by
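To show what "water-fill" means here, the sketch below allocates capital in proportion to a score, caps each name, and hands the residual to names that still have headroom. It is a simplified illustration, not portfolio_size itself: it handles only the single-name cap (sector and cluster caps are omitted), assumes strictly positive scores, and the function name and inputs are hypothetical.

```python
def water_fill(
    scores: dict[str, float], max_single_pct: float, total_pct: float = 100.0
) -> dict[str, float]:
    """Allocate total_pct proportionally to scores, capping each name.

    Assumes every score is strictly positive.
    """
    alloc = {name: 0.0 for name in scores}
    active = sorted(scores)  # names that still have headroom
    remaining = total_pct
    while active and remaining > 1e-9:
        score_sum = sum(scores[name] for name in active)
        handed_out = 0.0
        still_open = []
        for name in active:
            share = remaining * scores[name] / score_sum
            room = max_single_pct - alloc[name]
            give = min(share, room)
            alloc[name] += give
            handed_out += give
            if room - give > 1e-9:
                still_open.append(name)
        remaining -= handed_out
        if len(still_open) == len(active):
            break  # nobody hit a cap this round, so everything was placed
        active = still_open  # redistribute the residual among uncapped names
    return alloc


# Made-up scores: A would get 60% unconstrained but is capped at 40%.
weights = water_fill({"A": 3.0, "B": 1.0, "C": 1.0}, max_single_pct=40.0)
print(weights)
```

If every name hits its cap before the capital is used up, the remainder stays in cash; that is the unavoidable part of the cash drag.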

Walk-forward backtest

# Minimal walk-forward backtest (full engine + ML lands in v0.3.0)
from nsefast.swing.backtest import run_backtest, summary_stats
trades = run_backtest(history_df, signal_fn=lambda d: d["close"] > d["close"].shift(20),
                      holding_days=5)
summary_stats(trades)
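For intuition about what a fixed-holding-period backtest computes, here is a single-symbol sketch in plain Python. It is not nsefast's run_backtest or summary_stats: the function names, the entry-at-close and exit-at-close convention, the absence of costs, and the sample prices are all assumptions for illustration.

```python
def fixed_hold_trades(
    closes: list[float], signals: list[bool], holding_days: int = 5
) -> list[dict]:
    """Enter at the close on each signal bar, exit at the close holding_days later."""
    trades = []
    for i, fired in enumerate(signals):
        exit_i = i + holding_days
        if fired and exit_i < len(closes):  # skip signals too close to the end
            trades.append({
                "entry_idx": i,
                "exit_idx": exit_i,
                "return_pct": (closes[exit_i] / closes[i] - 1) * 100,
            })
    return trades


def summarize(trades: list[dict]) -> dict:
    """Trade count, win rate, and mean return across trades."""
    if not trades:
        return {"n_trades": 0, "win_rate": 0.0, "avg_return_pct": 0.0}
    returns = [t["return_pct"] for t in trades]
    return {
        "n_trades": len(returns),
        "win_rate": sum(r > 0 for r in returns) / len(returns),
        "avg_return_pct": sum(returns) / len(returns),
    }


# Made-up closes and signals, held for 2 bars.
closes = [100.0, 102.0, 104.0, 103.0, 110.0, 99.0]
signals = [True, False, True, False, False, False]
trades = fixed_hold_trades(closes, signals, holding_days=2)
stats = summarize(trades)
print(stats)
```

Overlapping trades are allowed in this sketch; a stricter version would ignore new signals while a position is open.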

Documentation

Failure semantics

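On any failure (invalid input, network error, malformed payload, Polars error, or robots.txt block), every public collector returns a Polars DataFrame with its canonical schema instead of raising. Because collectors never raise, a failed fetch will not crash a pipeline, but downstream code should inspect the returned frame before relying on it.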
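The "never raise, return a schema-shaped fallback" pattern can be sketched with a small decorator. This is an illustration only, not nsefast's code: the names `never_raise`, `empty_bhavcopy`, and `fetch_bhavcopy` are hypothetical, and a plain dict of empty columns stands in for the Polars DataFrame to keep the example dependency-free.

```python
import functools
from typing import Any, Callable


def never_raise(empty_factory: Callable[[], Any]) -> Callable:
    """Decorator: on any exception, return empty_factory() instead of raising."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                # A real collector would likely also log the error here.
                return empty_factory()
        return wrapper
    return decorator


def empty_bhavcopy() -> dict[str, list]:
    """Stand-in for an empty frame that still carries the canonical columns."""
    return {"symbol": [], "close": []}


@never_raise(empty_bhavcopy)
def fetch_bhavcopy(date_str: str) -> dict[str, list]:
    if len(date_str) != 10:
        raise ValueError("expected YYYY-MM-DD")
    return {"symbol": ["TCS"], "close": [3500.0]}  # made-up row


ok = fetch_bhavcopy("2026-05-07")
failed = fetch_bhavcopy("bad-date")
print(ok, failed)
```

The trade-off of this design is that a failure and a genuinely empty result can look the same to the caller, so checking logs or row counts becomes the caller's job.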

Tests

pytest -q     # 213 unit tests, no network calls

License

MIT — see LICENSE
