Async Python client for Indian court data — eCourts, HC Services, Supreme Court

bharat-courts

Async Python SDK for Indian court data — search cases, download orders, and access cause lists from eCourts and the Supreme Court.

What is this?

India's eCourts platform holds millions of case records across 25+ High Courts, 700+ District Courts, and the Supreme Court — but there's no official API. Checking case status means navigating clunky portals, solving CAPTCHAs by hand, and copy-pasting results one at a time.

bharat-courts fixes that. It gives you — and your AI assistant — direct programmatic access to:

Find any judgment with one call — Judgments().find(judge=..., year=..., text=..., cnr=...) routes to the right backend (archive vs live) and returns a uniform result. The SDK does the heavy lifting; you describe the data, not where it lives.
Track matters — search by case number, party name, or advocate across any High Court or District Court
Download orders & judgments — get PDFs for all orders in a case with one call
Monitor cause lists — see which cases are listed before which bench, every day
Pull recent Supreme Court judgments — scrape the homepage's "Latest Judgements / Orders" feed and download the PDFs
Query the historical archive — instant search across SCI judgments from 1950 onwards and judgments from 25 High Courts (CC-BY-4.0 AWS Open Data, no CAPTCHA, no rate limits)
Access District Courts — dynamically discover courts across 36 states/UTs and search 700+ court complexes
Bulk download judgments — paginate through results, batch-download PDFs with automatic session management
Automate CAPTCHA handling — built-in OCR solver, ONNX solver, or plug in your own

Works standalone as a Python library, as a CLI tool, or as an AI agent skill — install it into Claude Code, GitHub Copilot, or any MCP-compatible assistant and ask questions in plain English.

Built for practicing lawyers, litigation teams, legal researchers, legal aid organizations, and legal tech builders.

Installation

pip install bharat-courts

# Quote the extras so shells such as zsh do not treat the brackets as a glob pattern.

# With automatic CAPTCHA solving (recommended)
pip install 'bharat-courts[ocr]'

# With lightweight ONNX CAPTCHA solver (alternative to ddddocr)
pip install 'bharat-courts[onnx]'

# With CLI
pip install 'bharat-courts[cli]'

# With historical-archive support (DuckDB over AWS Open Data buckets)
pip install 'bharat-courts[archive]'

# Everything (OCR + ONNX + CLI + archive + dev tools)
pip install 'bharat-courts[all]'

Requires Python 3.11+

Quick Start

Find a judgment without picking a backend

The Judgments facade is the recommended entry point for "find a judgment matching some criteria" — it owns both the archive and live clients, picks the right one per query, and returns a uniform Judgment list. Use this unless you specifically need a portal-only feature (cause list, live case status, district-court drill-down).

import asyncio
from bharat_courts import Judgments

async def main():
    async with Judgments() as j:
        # Structured filters → archive (no CAPTCHA, partition-pruned)
        for r in await j.find(judge="chandrachud", year=(2018, 2024), court="sci", limit=10):
            print(f"{r.decision_date}  {r.case_id}  {r.title}")

        # Free-text → live (only the live portal does full-text)
        for r in await j.find(text="right to privacy", limit=5):
            print(f"{r.decision_date}  {r.title}  [{r.source}]")

        # CNR alone — prefix routes to the right bucket, no full scan
        result = await j.find(cnr="DLHC010230802020")
        pdf = await j.fetch_pdf(result[0])
        with open("judgment.pdf", "wb") as f:
            f.write(pdf)

        # Force a specific backend if you need to
        archive_only = await j.find(text="bail", source="archive", limit=5)

asyncio.run(main())

Routing rules (see API reference for details):

Filter shape	Backend
`cnr=`	archive (prefix → court → partition)
`text=` only	live (only it does full-text body search)
structured only (judge/party/year/court/citation)	archive
`text=` + structured	archive — `text` folds into a title-substring match

Each returned Judgment carries a source field ("archive" or "live") so consumers can still tell where it came from when that matters.

Find all pending matters for your client

import asyncio
from bharat_courts import get_court, HCServicesClient
from bharat_courts.captcha.ocr import OCRCaptchaSolver

async def main():
    delhi = get_court("delhi")
    solver = OCRCaptchaSolver()

    async with HCServicesClient(captcha_solver=solver) as client:
        cases = await client.case_status_by_party(
            delhi,
            party_name="Reliance Industries",
            year="2024",
            status_filter="Pending",
        )
        for case in cases:
            print(f"{case.case_number}: {case.petitioner} v {case.respondent}")
            print(f"  CNR: {case.cnr_number}")

asyncio.run(main())

Check case status and download orders

async with HCServicesClient(captcha_solver=solver) as client:
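    # Look up a specific writ petition. `case_type` is the numeric code from
    # list_case_types() for the court being queried ("134" is W.P.(C) on Delhi HC;
    # confirm the code for Bombay HC with list_case_types() before relying on it).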
    cases = await client.case_status(
        get_court("bombay"),
        case_type="134",
        case_number="4520",
        year="2023",
    )
    # case_type on the result is now a label like "W.P.(C)" (from the
    # portal's type_name field). The showRecords endpoint does not return
    # case status, so case.status is always empty.
    print(f"{cases[0].case_type} {cases[0].case_number} — CNR: {cases[0].cnr_number}")

    # Download all orders for the case
    orders = await client.court_orders(
        get_court("bombay"),
        case_type="134",
        case_number="4520",
        year="2023",
    )
    for order in orders:
        print(f"{order.order_date} — {order.order_type} by {order.judge}")
        # download_order_pdf raises RuntimeError if the portal hands back its
        # 30-byte BOM+error string instead of a real PDF.
        pdf = await client.download_order_pdf(order.pdf_url)
        with open(f"order_{order.order_date}.pdf", "wb") as f:
            f.write(pdf)

Get tomorrow's cause list before court

pdfs = await client.cause_list(
    get_court("delhi"),
    civil=True,
    causelist_date="03-03-2026",   # DD-MM-YYYY
)
for pdf in pdfs:
    print(f"{pdf.bench} — {pdf.cause_list_type}")
    print(f"  Download: {pdf.pdf_url}")

Search District Court cases

from bharat_courts import DistrictCourtClient
from bharat_courts.districtcourts.parser import parse_complex_value

async with DistrictCourtClient(captcha_solver=solver) as client:
    # Discover the court hierarchy
    districts = await client.list_districts("8")        # Bihar
    complexes = await client.list_complexes("8", "1")   # Patna district

    # Parse complex value to get code + establishment info
    complex_val = list(complexes.keys())[-1]            # e.g. "1080010@2,3,4@Y"
    code, ests, needs_est = parse_complex_value(complex_val)
    est = ests[0] if needs_est else ""

    # Look up case types — the portal returns codes as "<case_type>^<est>"
    # compound strings (e.g. "89^2"); pass them back verbatim.
    case_types = await client.list_case_types("8", "1", code, est)
    # {"89^2": "ADMINISTRATIVE SUITE", "152^2": "Anticipatory Bail - ABP", ...}

    cases = await client.case_status(
        state_code="8", dist_code="1",
        court_complex_code=code, est_code=est,
        case_type="89^2",       # full compound code, not just "89"
        case_number="100", year="2024",
    )
    for case in cases:
        print(f"{case.case_number}: {case.petitioner} v {case.respondent}")

List recent Supreme Court judgments

from bharat_courts import SCIClient

# www.sci.gov.in surfaces the 50 most recent items inline on the homepage.
# No CAPTCHA, no search form — just scrape the feed.
async with SCIClient() as client:
    recent = await client.list_recent_judgments(limit=10)
    for j in recent:
        print(f"{j.judgment_date}: {j.title}")
        print(f"  Diary: {j.source_id}  {j.case_number}")
        # Download via /sci-get-pdf/?diary_no=... (portal viewer URL).
        await client.download_pdf(j)
        if j.pdf_bytes:
            with open(f"sci_{j.source_id}.pdf", "wb") as f:
                f.write(j.pdf_bytes)

(Date-range / party-name search against the legacy main.sci.gov.in host is no longer functional — that host is permanently 503 and the live www.sci.gov.in portal gates those flows behind a CAPTCHA-protected case-no/diary-no form that the SDK does not yet wire up. search_by_year and search_by_party raise NotImplementedError.)

Query the historical archive (no CAPTCHA, no rate limits)

For research workloads — "find every judgment by Justice X", "all 2020 Delhi HC writ petitions", bulk PDF retrieval — use the ArchiveClient, which reads the public AWS Open Data buckets maintained by Dattam Labs: SCI judgments from 1950 onwards and 25 High Courts.

from bharat_courts import ArchiveClient

async with ArchiveClient() as client:
    # Substring match on judge, year range, partition-pruned in DuckDB.
    results = await client.search(
        court="sci", judge="chandrachud", year=(2018, 2024), limit=20,
    )
    for j in results:
        print(f"{j.decision_date}  {j.case_id}  {j.title}")
        print(f"  {j.citation}  outcome: {j.disposal_nature}")

    # Stream every Delhi 2020 judgment (~18k) without holding them all in memory.
    async for j in client.iter_judgments(court="delhi", year=2020, batch_size=500):
        process(j)

    # Fetch the PDF — CNR alone is enough; the SDK infers the source.
    pdf_bytes = await client.fetch_pdf("DLHC010230802020")
    # SCI judgments default to English; pass language="hindi" / "tamil" / etc.
    sci_pdf = await client.fetch_pdf("ESCR010000301950", language="english")
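
Both CNRs above begin with a four-letter prefix (`DLHC`, `ESCR`), which is what allows a CNR-only lookup to go to one source instead of scanning everything. The sketch below illustrates the idea only. `PREFIX_TO_COURT` and `court_for_cnr` are assumptions made for illustration and cover just the two prefixes that appear in this README; the SDK's real registry is not shown here.

```python
# Illustrative table only: the two prefixes seen in this README's examples.
PREFIX_TO_COURT = {
    "DLHC": "delhi",  # e.g. DLHC010230802020
    "ESCR": "sci",    # e.g. ESCR010000301950
}


def court_for_cnr(cnr: str) -> str:
    """Map a CNR to a court code using its first four letters."""
    cleaned = cnr.strip().upper()
    prefix = cleaned[:4]
    if len(prefix) < 4 or not prefix.isalpha():
        raise ValueError(f"CNR does not start with a 4-letter prefix: {cnr!r}")
    if prefix not in PREFIX_TO_COURT:
        raise ValueError(f"unknown CNR prefix {prefix!r}")
    return PREFIX_TO_COURT[prefix]


if __name__ == "__main__":
    print(court_for_cnr("DLHC010230802020"))
    print(court_for_cnr("ESCR010000301950"))
```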

Notes:

Freshness gap: the buckets update bi-monthly (SCI) and quarterly (HC). For judgments delivered in the last 2–3 months, fall back to JudgmentSearchClient.
Cache: PDFs and parquet shards cache under ~/.cache/bharat-courts/archive/. Default cap is 5 GiB (BHARAT_COURTS_ARCHIVE_CACHE_MAX_GB); metadata TTL is 30 days (BHARAT_COURTS_ARCHIVE_METADATA_TTL_DAYS).
License: data is CC-BY-4.0 — attribute Dattam Labs / the eCourts platform when redistributing.

Use with AI agents (Claude Code, Copilot, etc.)

Install the bundled skill so your AI assistant can look up court data for you in natural language:

bharat-courts install-skills

Then just ask your AI agent:

"Find all pending writ petitions for Tata Motors in Delhi High Court from 2024"

"Download the latest order in WP(C) 4520/2023 before the Bombay High Court"

"What's on the cause list for Karnataka High Court tomorrow?"

"Search for cases filed by State of Bihar in Patna district court in 2024"

"Show me the most recent Supreme Court judgments from this week"

The agent uses bharat-courts under the hood and handles CAPTCHAs, sessions, and parsing automatically.

JSON serialization

All models support to_dict() and to_json() — pipe results into spreadsheets, dashboards, or case management tools:

import json

cases = await client.case_status_by_party(delhi, party_name="HDFC", year="2024")
# Export to JSON for your case tracker
with open("matters.json", "w") as f:
    json.dump([c.to_dict(exclude_none=True) for c in cases], f, indent=2)

[
  {
    "case_number": "3/2024",
    "case_type": "W.P.(C)",
    "cnr_number": "DLHC010582482024",
    "filing_number": "213400000032024",
    "registration_number": "3",
    "petitioner": "HDFC BANK LTD.",
    "respondent": "UNION OF INDIA & ORS.",
    "court_name": "Delhi High Court",
    "judges": []
  }
]

Note: status, registration_date, judges, and next_hearing_date are not returned by the live showRecords endpoint and stay empty/null. They live behind the case-history endpoint which the SDK does not call yet.
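
For the spreadsheet case, dictionaries like the one above can be written out with the standard library's `csv` module. This is a small sketch, assuming each record is a flat dict shaped like the sample. `rows_to_csv` is a hypothetical helper, and joining list-valued fields such as `judges` with semicolons is an arbitrary choice made for illustration.

```python
import csv
import io


def rows_to_csv(rows: list[dict]) -> str:
    """Render a list of flat dicts as CSV text, one column per key."""
    # Collect column names in first-seen order so records with differing
    # keys (for example after exclude_none=True) still line up.
    columns: list[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Lists cannot live in a single cell as-is, so join them into one string.
        flat = {
            key: "; ".join(map(str, value)) if isinstance(value, list) else value
            for key, value in row.items()
        }
        writer.writerow(flat)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {
            "case_number": "3/2024",
            "case_type": "W.P.(C)",
            "petitioner": "HDFC BANK LTD.",
            "judges": [],
        }
    ]
    print(rows_to_csv(sample))
```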

Supported Portals

Source	Client	Status
Federated (archive + live)	`Judgments`	Recommended entry point — one `find()` call routes to archive vs live by query shape
HC Services	`HCServicesClient`	Fully working
District Courts	`DistrictCourtClient`	Case status, orders, cause lists across 700+ courts
Judgment Search	`JudgmentSearchClient`	Search, pagination, bulk PDF download
Supreme Court	`SCIClient`	Recent judgments feed + PDF download (case-no search not yet implemented)
Calcutta High Court	`CalcuttaHCClient`	Order/judgment search + PDF download (direct from HC website)
SCI Archive (S3, CC-BY-4.0)	`ArchiveClient`	DuckDB metadata search + PDF retrieval, 1950–present; SCI bi-monthly + 25 HCs quarterly
HC Archive (S3, CC-BY-4.0)	`ArchiveClient`	Same as above; routed via the unified `ArchiveClient`

API Reference

`Judgments` (federated facade)

The recommended entry point for finding judgments. Owns an ArchiveClient and a JudgmentSearchClient internally (lazy-initialised), picks the right backend per query, and returns a uniform list[Judgment]. Reach for the portal-specific clients only when you need something the facade doesn't expose (cause lists, case status, district-court drill-down).

from bharat_courts import Judgments

async with Judgments() as j:
    ...

No separate install extra is needed; the facade uses whichever backends are installed. For best behaviour, install both:

pip install 'bharat-courts[archive,ocr]'

`find(*, text=None, court=None, year=None, judge=None, party=None, citation=None, cnr=None, source="auto", limit=50) -> list[Judgment]`

Run a search, routing transparently between the archive and the live portal.

Parameter	Type	Description
`text`	`str \| None`	Free-text keyword search. Routes to live (only the live judgments portal does full-body search).
`court`	`Court \| str \| None`	Court object or code (`"sci"`, `"delhi"`).
`year`	`int \| tuple[int, int] \| None`	Single year or inclusive range. Drives archive partition pruning — strongly recommended for non-CNR queries.
`judge`	`str \| None`	Substring on the judge field (archive).
`party`	`str \| None`	Substring on petitioner/respondent (SCI) or title (HC).
`citation`	`str \| None`	Citation substring (archive, SCI only).
`cnr`	`str \| None`	Exact CNR match. Auto-routes via the 4-letter prefix when `source="auto"`.
`source`	`"auto" \| "archive" \| "live"`	Override the automatic routing.
`limit`	`int`	Total results to return (default 50).

Returns: list[Judgment] with source set to "archive" or "live" per item.

Routing (in source="auto" mode):

filter shape	backend
`cnr=` set	archive (prefix-routed; no scan)
`text=` set, no structured filters	live
structured filters only (court/year/judge/party/citation)	archive
`text=` + structured filters	archive — `text` folds into a title-substring match (party slot)
nothing	raises `ValueError`

The decision is logged at INFO level (logging.getLogger("bharat_courts.facade")).
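
The routing table can be read as a small decision function. The sketch below is an illustration of the table and is not the facade's source code. `choose_backend` is a hypothetical name, and its handling of an explicit `source` override combined with no filters is an assumption.

```python
def choose_backend(*, text=None, court=None, year=None, judge=None,
                   party=None, citation=None, cnr=None, source="auto") -> str:
    """Return "archive" or "live" for a query, following the routing table."""
    if source not in ("auto", "archive", "live"):
        raise ValueError(f"unknown source {source!r}")

    structured = any(v is not None for v in (court, year, judge, party, citation))
    if text is None and cnr is None and not structured:
        # The "nothing" row of the table.
        raise ValueError("at least one filter is required")

    if source != "auto":
        return source          # explicit override wins
    if cnr is not None:
        return "archive"       # prefix-routed, no scan
    if text is not None and not structured:
        return "live"          # only the live portal searches full text
    return "archive"           # structured filters, with or without text


if __name__ == "__main__":
    print(choose_backend(text="right to privacy"))
    print(choose_backend(judge="chandrachud", year=(2018, 2024), court="sci"))
```

To see which backend the SDK itself picked for a real query, enable INFO logging for the logger named above.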

`fetch_pdf(judgment_or_cnr, *, language="english") -> bytes`

Convenience wrapper that delegates to ArchiveClient.fetch_pdf. Accepts a Judgment instance or a CNR string. Raises NotImplementedError if you pass a Judgment with source="live" — the live download path needs the original JudgmentResult (for session continuity), so call JudgmentSearchClient.download_pdf(judgment_result, court_type) directly for that case.

`live_to_judgment(jr: JudgmentResult) -> Judgment`

Public helper if you're doing routing yourself: normalises a JudgmentResult (returned by JudgmentSearchClient.search) into the unified Judgment shape. Maps source_id → cnr, resolves court_name via the registry, pulls disposal_nature / registration_date out of metadata.

`HCServicesClient`

Primary client for High Court case data via hcservices.ecourts.gov.in.

from bharat_courts import HCServicesClient

client = HCServicesClient(
    config=None,            # BharatCourtsConfig | None — uses global config singleton if None
    captcha_solver=None,    # CaptchaSolver | None — defaults to OCRCaptchaSolver if ddddocr installed
    http_client=None,       # RateLimitedClient | None — creates one internally if None
)

Use as an async context manager (no solver needed if bharat-courts[ocr] is installed):

async with HCServicesClient() as client:
    ...

`list_benches(court) -> dict[str, str]`

Get available benches for a High Court. No CAPTCHA required.

Parameter	Type	Required	Description
`court`	`Court`	Yes	Court object from `get_court()`

Returns: dict[str, str] — mapping of bench code to bench name.

delhi = get_court("delhi")
benches = await client.list_benches(delhi)
# {'1': 'Principal Bench at Delhi'}

bombay = get_court("bombay")
benches = await client.list_benches(bombay)
# {'1': 'Principal Seat at Bombay', '2': 'Nagpur Bench', '3': 'Aurangabad Bench', '4': 'Goa Bench'}

`list_case_types(court, *, bench_code="1") -> dict[str, str]`

Get available case type codes for a court bench. No CAPTCHA required.

Parameter	Type	Required	Default	Description
`court`	`Court`	Yes	—	Court object
`bench_code`	`str`	No	`"1"`	Bench code from `list_benches()`

Returns: dict[str, str] — mapping of case type code to name.

case_types = await client.list_case_types(delhi)
# {'134': 'W.P.(C)(CIVIL WRITS)-134', '27': 'W.P.(CRL)-27', '3': 'EL.PET.-3', ...}

`case_status(court, *, case_type, case_number, year, bench_code="1") -> list[CaseInfo]`

Look up case status by case number. CAPTCHA required (auto-retried, default 5 attempts).

Parameter	Type	Required	Default	Description
`court`	`Court`	Yes	—	Court object
`case_type`	`str`	Yes	—	Numeric case type code (use `list_case_types()` to discover)
`case_number`	`str`	Yes	—	Case number without type/year
`year`	`str`	Yes	—	Registration year, e.g. `"2024"`
`bench_code`	`str`	No	`"1"`	Bench code from `list_benches()`

Returns: list[CaseInfo] — matching cases. Notable field semantics:

case_type on the result is a label like "W.P.(C)" (sourced from the portal's type_name field), not the numeric code you passed in.
registration_number is populated from the portal's case_no2 field.
status is always empty — the live showRecords endpoint doesn't return Pending/Disposed (that data lives behind o_civil_case_history.php, which the SDK doesn't call yet). Same for registration_date, judges, and next_hearing_date.

cases = await client.case_status(
    delhi,
    case_type="134",      # numeric code from list_case_types()
    case_number="1",
    year="2024",
)
for case in cases:
    print(f"{case.case_type} {case.case_number}  CNR: {case.cnr_number}")
    print(f"  {case.petitioner} v {case.respondent}")

`case_status_by_party(court, *, party_name, year, bench_code="1", status_filter="Both") -> list[CaseInfo]`

Search cases by party name. CAPTCHA required (auto-retried, default 5 attempts).

Parameter	Type	Required	Default	Description
`court`	`Court`	Yes	—	Court object
`party_name`	`str`	Yes	—	Petitioner or respondent name (min 3 characters)
`year`	`str`	Yes	—	Registration year — mandatory, server returns error if empty
`bench_code`	`str`	No	`"1"`	Bench code
`status_filter`	`str`	No	`"Both"`	`"Pending"`, `"Disposed"`, or `"Both"` (forwarded to the portal — but the response carries no status field, so filtering happens server-side and the returned `CaseInfo.status` is still empty)

Returns: list[CaseInfo] — matching cases. Same field-population caveats as case_status above. Wide queries can return tens of thousands of records in a single response with no pagination — see issue tracker.

cases = await client.case_status_by_party(
    delhi,
    party_name="state",
    year="2024",
    status_filter="Pending",
)
for case in cases:
    print(f"{case.case_number}: {case.petitioner} v {case.respondent}")

`court_orders(court, *, case_type, case_number, year, bench_code="1") -> list[CaseOrder]`

Get court orders for a case. CAPTCHA required (auto-retried).

Parameter	Type	Required	Default	Description
`court`	`Court`	Yes	—	Court object
`case_type`	`str`	Yes	—	Numeric case type code
`case_number`	`str`	Yes	—	Case number
`year`	`str`	Yes	—	Registration year
`bench_code`	`str`	No	`"1"`	Bench code

Returns: list[CaseOrder] — orders with dates, types, judges, and PDF URLs.

orders = await client.court_orders(
    delhi,
    case_type="134",
    case_number="1",
    year="2024",
)
for order in orders:
    print(f"{order.order_date}: {order.order_type} by {order.judge}")
    if order.pdf_url:
        pdf_bytes = await client.download_order_pdf(order.pdf_url)
        with open(f"order_{order.order_date}.pdf", "wb") as f:
            f.write(pdf_bytes)

`cause_list(court, *, civil=True, bench_code="1", causelist_date="") -> list[CauseListPDF]`

Get cause list PDFs for a court. CAPTCHA required (auto-retried).

Parameter	Type	Required	Default	Description
`court`	`Court`	Yes	—	Court object
`civil`	`bool`	No	`True`	`True` for civil, `False` for criminal
`bench_code`	`str`	No	`"1"`	Bench code
`causelist_date`	`str`	No	`""` (today)	Date in `DD-MM-YYYY` format

Returns: list[CauseListPDF] — one entry per bench with bench name, list type, and PDF URL.

pdfs = await client.cause_list(delhi, civil=True)
for pdf in pdfs:
    print(f"#{pdf.serial_number} {pdf.bench} — {pdf.cause_list_type}")
    print(f"  PDF: {pdf.pdf_url}")

# Criminal cause list for a specific date
criminal_pdfs = await client.cause_list(
    delhi,
    civil=False,
    causelist_date="15-01-2025",
)

`download_order_pdf(pdf_url) -> bytes`

Download an order or judgment PDF. No CAPTCHA required.

Parameter	Type	Required	Description
`pdf_url`	`str`	Yes	URL from `CaseOrder.pdf_url` or `CauseListPDF.pdf_url`

Returns: bytes — raw PDF file content.

Raises: RuntimeError if the response doesn't start with the %PDF magic bytes. The HC Services portal sometimes hands back a 30-byte BOM-prefixed "Unable to connect to server" string with HTTP 200; this method now refuses to silently return that as a PDF.

pdf_bytes = await client.download_order_pdf(order.pdf_url)
with open("order.pdf", "wb") as f:
    f.write(pdf_bytes)
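
The same magic-byte check is useful when PDFs reach your code by another route, for example from a cache or a queue. A minimal sketch, assuming only what is described above (a real PDF starts with `%PDF`, and the portal's error body is a BOM-prefixed text string); `ensure_pdf` is a hypothetical helper, not an SDK function.

```python
PDF_MAGIC = b"%PDF"


def ensure_pdf(payload: bytes) -> bytes:
    """Return the payload unchanged if it looks like a PDF, else raise RuntimeError."""
    if not payload.startswith(PDF_MAGIC):
        # "utf-8-sig" drops a leading byte-order mark so the preview is readable.
        preview = payload[:60].decode("utf-8-sig", errors="replace")
        raise RuntimeError(f"expected a PDF, got {len(payload)} bytes: {preview!r}")
    return payload


if __name__ == "__main__":
    error_body = b"\xef\xbb\xbfUnable to connect to server"
    try:
        ensure_pdf(error_body)
    except RuntimeError as exc:
        print(exc)
```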

`DistrictCourtClient`

Client for District Courts across India via services.ecourts.gov.in. Covers 700+ court complexes across 36 states/UTs.

Unlike High Courts (which use static get_court() codes), district courts require dynamic discovery of the 4-level hierarchy: State → District → Court Complex → Establishment.

from bharat_courts import DistrictCourtClient

client = DistrictCourtClient(
    config=None,            # BharatCourtsConfig | None
    captcha_solver=None,    # CaptchaSolver | None — defaults to OCRCaptchaSolver if ddddocr installed
    http_client=None,       # RateLimitedClient | None
)

Use as an async context manager:

async with DistrictCourtClient() as client:
    ...

Court Discovery Methods (No CAPTCHA)

These methods discover the court hierarchy dynamically.

`list_states() -> dict[str, str]`

Returns all 36 states/UTs with their codes. Static data, no network call.

states = await client.list_states()
# {"8": "Bihar", "7": "Delhi", "27": "Maharashtra", ...}

`list_districts(state_code) -> dict[str, str]`

Get districts for a state.

districts = await client.list_districts("8")  # Bihar
# {"1": "Patna", "35": "Gaya", "38": "Muzaffarpur", ...}

`list_complexes(state_code, dist_code) -> dict[str, str]`

Get court complexes for a district. The dictionary keys (the portal's option values) are in code@ests@flag format; the dictionary values are the display names.

complexes = await client.list_complexes("8", "1")  # Bihar, Patna
# {"1080010@2,3,4@Y": "Civil Court, Patna Sadar", ...}

# Parse the value to extract the code and check if establishment selection is needed
from bharat_courts.districtcourts.parser import parse_complex_value
code, est_codes, needs_est = parse_complex_value("1080010@2,3,4@Y")
# code="1080010", est_codes=["2","3","4"], needs_est=True
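
To make the code@ests@flag layout concrete, here is a rough standalone equivalent of that parsing step. It is a sketch under assumptions, not the library's `parse_complex_value`: the name `split_complex_value` is hypothetical, and treating any flag other than "Y" as "no establishment needed" is a guess based on the single example above.

```python
def split_complex_value(value: str) -> tuple[str, list[str], bool]:
    """Split "code@est1,est2@flag" into (code, establishment codes, needs_est)."""
    parts = value.split("@")
    if len(parts) != 3:
        raise ValueError(f"expected code@ests@flag, got {value!r}")
    code, ests, flag = parts
    est_codes = [e for e in ests.split(",") if e]  # tolerate an empty ests field
    return code, est_codes, flag == "Y"            # assumption: only "Y" means True


if __name__ == "__main__":
    print(split_complex_value("1080010@2,3,4@Y"))
```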

`list_establishments(state_code, dist_code, court_complex_code) -> dict[str, str]`

Get establishments for a court complex. Only needed when needs_est is True.

establishments = await client.list_establishments("8", "1", "1080010")
# {"2": "DJ Div. Patna Sadar", "3": "CJM Div. Patna Sadar", ...}

`list_case_types(state_code, dist_code, court_complex_code, est_code) -> dict[str, str]`

Get available case types for a court. Codes are returned in the portal's compound "<case_type>^<est_code>" format — pass them back verbatim to case_status / court_orders; do not strip the ^N suffix.

case_types = await client.list_case_types("8", "1", "1080010", "2")
# {"89^2": "ADMINISTRATIVE SUITE", "152^2": "Anticipatory Bail - ABP", ...}
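
Because the compound codes are opaque, it is usually easier to look them up by label than to hard-code them. A small sketch that works on any mapping shaped like the one above; `find_case_type_codes` is a hypothetical helper, not an SDK method.

```python
def find_case_type_codes(case_types: dict[str, str], needle: str) -> list[str]:
    """Return every compound code whose label contains `needle` (case-insensitive)."""
    wanted = needle.casefold()
    return [code for code, label in case_types.items() if wanted in label.casefold()]


if __name__ == "__main__":
    case_types = {
        "89^2": "ADMINISTRATIVE SUITE",
        "152^2": "Anticipatory Bail - ABP",
    }
    # The matching codes can be passed back verbatim as case_type=...
    print(find_case_type_codes(case_types, "bail"))
```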

`list_cause_list_courts(state_code, dist_code, court_complex_code, est_code="") -> dict[str, str]`

Get the courts dropdown for cause-list lookup. Returns a mapping of court_no (e.g. "1@2") to court display name (e.g. "District & Sessions Judge - DJ Div. Patna Sadar"). The cause-list form requires both values. Pass court_no to cause_list(); if you leave court_name blank, it looks up the matching name automatically.

courts = await client.list_cause_list_courts("8", "1", "1080010", "2")
# {"1@2": "District & Sessions Judge - DJ Div. Patna Sadar", ...}

Search Methods (CAPTCHA Required)

All search methods take the 4-level court identifiers as keyword arguments.

`case_status(*, state_code, dist_code, court_complex_code, est_code, case_type, case_number, year) -> list[CaseInfo]`

Search by case number. case_type must be the full compound "<code>^<est>" string from list_case_types().

cases = await client.case_status(
    state_code="8", dist_code="1",
    court_complex_code="1080010", est_code="2",
    case_type="89^2",      # full compound code, not just "89"
    case_number="100", year="2024",
)

`case_status_by_party(*, state_code, dist_code, court_complex_code, est_code, party_name, year, status_filter="Both") -> list[CaseInfo]`

Search by party name (min 3 characters). year is mandatory.

cases = await client.case_status_by_party(
    state_code="8", dist_code="1",
    court_complex_code="1080010", est_code="2",
    party_name="kumar", year="2024",
    status_filter="Pending",   # "Pending", "Disposed", or "Both"
)

`court_orders(*, state_code, dist_code, court_complex_code, est_code, case_type, case_number, year) -> list[CaseOrder]`

Get court orders for a case.

orders = await client.court_orders(
    state_code="8", dist_code="1",
    court_complex_code="1080010", est_code="2",
    case_type="89^2", case_number="100", year="2024",   # full compound code, as for case_status
)

`cause_list(*, state_code, dist_code, court_complex_code, est_code, court_no, court_name="", causelist_date="", civil=True) -> list[CauseListEntry]`

Get cause list entries. court_no is now required — discover the available codes via list_cause_list_courts(). court_name is the option's display label; the portal validates against it (sending an empty court_name_txt triggers a "Court Name is required" error). If you leave court_name blank, this method calls list_cause_list_courts() once and looks up the matching label for court_no.

entries = await client.cause_list(
    state_code="8", dist_code="1",
    court_complex_code="1080010", est_code="2",
    court_no="1@2",                # required, from list_cause_list_courts()
    civil=True,
    causelist_date="20-03-2026",   # DD-MM-YYYY, defaults to today
)
for e in entries:
    print(f"#{e.serial_number} {e.case_number} — {e.petitioner} v {e.respondent}")

`JudgmentSearchClient`

Client for the eCourts judgment search portal (judgments.ecourts.gov.in).

from bharat_courts import JudgmentSearchClient

async with JudgmentSearchClient(captcha_solver=solver) as client:
    ...

`search(search_text, *, page=1, page_size=10, search_opt="PHRASE", court_type="2", max_captcha_attempts=5) -> SearchResult`

Search for judgments by keyword. CAPTCHA required.

Parameter	Type	Required	Default	Description
`search_text`	`str`	Yes	—	Search query text
`page`	`int`	No	`1`	Page number (1-indexed)
`page_size`	`int`	No	`10`	Rows per page (portal supports 10/25/50/100/1000)
`search_opt`	`str`	No	`"PHRASE"`	`"PHRASE"`, `"ANY"`, or `"ALL"`
`court_type`	`str`	No	`"2"`	`"2"` for High Courts, `"3"` for SCR
`max_captcha_attempts`	`int`	No	`5`	Max CAPTCHA retry attempts

Returns: SearchResult — contains items: list[JudgmentResult], total_count, pagination info. Each JudgmentResult includes parsed metadata (CNR number, disposal nature, registration date) and source_id (CNR) when available.

Raises: CaptchaError if the CAPTCHA solver couldn't authenticate within max_captcha_attempts tries. Empty results now mean "the portal returned zero rows" — they no longer mask a silent CAPTCHA failure (older versions returned SearchResult() with no signal).

from bharat_courts import JudgmentSearchClient
from bharat_courts.captcha.ocr import OCRCaptchaSolver

async with JudgmentSearchClient(captcha_solver=OCRCaptchaSolver()) as client:
    results = await client.search("right to privacy")
    print(f"Found {results.total_count} results")
    for judgment in results.items:
        print(f"{judgment.title}")
        print(f"  Court: {judgment.court_name}, Date: {judgment.judgment_date}")
        print(f"  CNR: {judgment.source_id}")
        print(f"  Metadata: {judgment.metadata}")

`search_all(search_text, *, page_size=25, search_opt="PHRASE", court_type="2", max_captcha_attempts=5) -> AsyncIterator[SearchResult]`

Iterate through all pages of search results. Yields one SearchResult per page, automatically handling pagination, token rotation, and session expiry (re-authenticates mid-walk if the portal session lapses).

async with JudgmentSearchClient(captcha_solver=solver) as client:
    async for page in client.search_all("land acquisition"):
        for judgment in page.items:
            print(f"{judgment.title} ({judgment.judgment_date})")

`download_pdf(judgment, *, court_type="2") -> JudgmentResult`

Download the PDF for a judgment result.

Important: judgment.pdf_url is not a directly-fetchable URL — it's the row's relative path from the portal's open_pdf(...) JS handler. This method does the openpdfcaptcha resolution dance to obtain a per-session outputfile URL, then GETs the actual PDF bytes. Each row's pdf_val (also stashed by the parser inside judgment.metadata) is forwarded automatically; without it the portal serves the first row's PDF for every subsequent call within the same session.

Parameter	Type	Required	Default	Description
`judgment`	`JudgmentResult`	Yes	—	A result from `search()`
`court_type`	`str`	No	`"2"`	Same `"2"` / `"3"` as on `search()`

Returns: the same JudgmentResult mutated in place — pdf_bytes is set on success.

Raises: RuntimeError if the response is empty, non-JSON, or doesn't start with %PDF.

judgment = results.items[0]
await client.download_pdf(judgment)
if judgment.pdf_bytes:
    with open("judgment.pdf", "wb") as f:
        f.write(judgment.pdf_bytes)

`download_pdfs(judgments, *, court_type="2", stop_on_error=False) -> list[JudgmentResult]`

Bulk-download PDFs for multiple judgments. Skips entries that already have pdf_bytes set. Failed downloads are logged at WARNING level by default; pass stop_on_error=True to raise on the first failure instead.

Parameter	Type	Required	Default	Description
`judgments`	`list[JudgmentResult]`	Yes	—	Judgments to download PDFs for
`court_type`	`str`	No	`"2"`	Forwarded to each `download_pdf` call
`stop_on_error`	`bool`	No	`False`	Re-raise the first download exception instead of logging it

Returns: the same list, with pdf_bytes populated where successful.

async with JudgmentSearchClient(captcha_solver=solver) as client:
    results = await client.search("constitution")
    await client.download_pdfs(results.items)
    for j in results.items:
        if j.pdf_bytes:
            with open(f"{j.case_number}.pdf", "wb") as f:
                f.write(j.pdf_bytes)
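
Case numbers frequently contain characters that are awkward in file names; the Supreme Court example in this README, "C.A. No. 6677/2026", includes a slash that would be read as a directory separator. The exact `case_number` format returned by judgment search is not shown here, so sanitising it before calling `open()` is a cautious default. `safe_filename` is a hypothetical helper, not part of the SDK.

```python
import re


def safe_filename(name: str, suffix: str = ".pdf", fallback: str = "judgment") -> str:
    """Turn an arbitrary case number into a flat, filesystem-friendly file name."""
    # Replace every run of characters outside a conservative allow-list.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", name)
    # Drop leading/trailing dots and underscores (avoids names like "..").
    cleaned = cleaned.strip("._")
    return (cleaned or fallback) + suffix


if __name__ == "__main__":
    print(safe_filename("C.A. No. 6677/2026"))
```

When a CNR is available in `source_id`, using it as the file name instead may be simpler, since the CNRs shown in this README are plain alphanumeric strings.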

`SCIClient`

Client for the Supreme Court of India (www.sci.gov.in). No CAPTCHA required.

The legacy host (main.sci.gov.in) that older versions of this SDK targeted has been in long-term maintenance for years and now returns HTTP 503 to every path. The live site is www.sci.gov.in (WordPress); SCIClient was rewritten against it.

from bharat_courts import SCIClient

# Note: no captcha_solver parameter — the homepage feed doesn't use CAPTCHAs
async with SCIClient() as client:
    ...

`list_recent_judgments(*, limit=50) -> list[JudgmentResult]`

Scrape the homepage's "Latest Judgements / Orders" feed. Returns up to limit of the most recent items; the portal surfaces at most 50 inline, so a larger limit has no effect and a smaller one truncates the list.

Parameter	Type	Required	Default	Description
`limit`	`int`	No	`50`	Max items to return

Returns: list[JudgmentResult] — each carries:

title — "PETITIONER VS. RESPONDENT"
case_number — e.g. "C.A. No. 6677/2026"
judgment_date — parsed from the row's "DD-MMM-YYYY" tail
source_id — diary number (the portal's primary key)
pdf_url — the /sci-get-pdf/?diary_no=... URL the in-page viewer iframe uses
source_url — the matching /view-pdf/?diary_no=... URL (for opening in a browser)
metadata["petitioner"], metadata["respondent"], metadata["type"] ("j" = judgment, "o" = order)

async with SCIClient() as client:
    recent = await client.list_recent_judgments(limit=10)
    for j in recent:
        print(f"{j.judgment_date}: {j.title}  [diary {j.source_id}]")
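
To make the "DD-MMM-YYYY" tail concrete, here is a minimal sketch of how such a date could be parsed with the standard library. `parse_row_date` is a hypothetical helper and the sample row text is invented; the SDK's parser may work differently. The `%b` directive relies on the default (C/English) locale for month abbreviations.

```python
from datetime import date, datetime
from typing import Optional


def parse_row_date(text: str) -> Optional[date]:
    """Parse a trailing DD-MMM-YYYY token such as '16-May-2026'; None if absent."""
    tail = text.strip().rsplit(" ", 1)[-1]
    try:
        return datetime.strptime(tail, "%d-%b-%Y").date()
    except ValueError:
        return None
```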

`download_pdf(judgment) -> JudgmentResult`

Download the PDF bytes for a Supreme Court judgment via the /sci-get-pdf/?diary_no=... endpoint.

Parameter	Type	Required	Description
`judgment`	`JudgmentResult`	Yes	An item from `list_recent_judgments()`

Returns: the same JudgmentResult with pdf_bytes populated.

Raises: RuntimeError if the response doesn't start with %PDF.

`search_by_year(year, month=None)` and `search_by_party(party_name)` — not implemented

Both methods now raise NotImplementedError. The legacy main.sci.gov.in form they hit is permanently 503; the equivalent flow on www.sci.gov.in is gated behind a CAPTCHA-protected case-no/diary-no/party-name form (/judgements-case-no/) that the SDK does not yet wire up. Use list_recent_judgments() for the most recent items.

`CalcuttaHCClient`

Client for Calcutta High Court's own website (calcuttahighcourt.gov.in). Provides order/judgment search with PDF download for cases from September 2020 onwards (CIS system). Has better PDF coverage than the eCourts portal for Calcutta HC cases.

from bharat_courts import CalcuttaHCClient

async with CalcuttaHCClient() as client:
    ...

`search_orders(*, case_type, case_number, year, establishment="appellate", max_captcha_attempts=5) -> tuple[CaseInfo | None, list[CaseOrder]]`

Search for orders/judgments by case number. CAPTCHA required (auto-retried, default 5 attempts).

Parameter	Type	Required	Default	Description
`case_type`	`str`	Yes	—	Numeric case type code (e.g. `"12"` for WPA)
`case_number`	`str`	Yes	—	Case registration number
`year`	`str`	Yes	—	Case year
`establishment`	`str`	No	`"appellate"`	`"appellate"`, `"original"`, `"jalpaiguri"`, or `"portblair"`
`max_captcha_attempts`	`int`	No	`5`	Max CAPTCHA retries

Returns: tuple[CaseInfo | None, list[CaseOrder]]. The CaseInfo carries the case-level metadata the portal returns alongside the order rows (parties, CNR, full case number, side); previous versions silently dropped this. Returns (None, []) when nothing matched.

case_info, orders = await client.search_orders(
    case_type="12",        # WPA
    case_number="12886",
    year="2024",
    establishment="appellate",
)
if case_info:
    print(f"{case_info.case_number}  CNR: {case_info.cnr_number}")
    print(f"  {case_info.petitioner} v {case_info.respondent}")
for order in orders:
    print(f"{order.order_date}: {order.order_type} by {order.judge}")
    print(f"  Neutral Citation: {order.neutral_citation}")
    if order.pdf_url:
        pdf = await client.download_order_pdf(order.pdf_url)

`download_order_pdf(pdf_url) -> bytes`

Download an order/judgment PDF. No CAPTCHA required.

Raises: RuntimeError if the response doesn't start with the %PDF magic bytes.

pdf_bytes = await client.download_order_pdf(order.pdf_url)

`ArchiveClient`

Read-only access to the public AWS Open Data judgment archives (no CAPTCHA, no rate limits, no accounts). Requires the archive extra:

pip install 'bharat-courts[archive]'

from bharat_courts import ArchiveClient

async with ArchiveClient(
    cache_dir=None,          # str | None — defaults to ~/.cache/bharat-courts/archive/
    cache_max_bytes=None,    # int | None — defaults to 5 GiB (or env override)
    metadata_cache=True,     # bool — disable to skip the local parquet mirror
) as client:
    ...

DuckDB runs the metadata queries against partitioned parquet files; PDFs are served via direct HTTP GET (HC, one file per judgment) or random-access tar extraction (SCI, one tar per year). Both layers cache on disk.
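
The tar-extraction idea can be sketched with the standard library alone: once a year's tar is available locally, a single member is read by name without unpacking the rest. This is an illustration under assumptions, not the SDK's storage layer; `build_tar`, `extract_member`, and the member names are hypothetical.

```python
import io
import tarfile
from typing import Dict


def build_tar(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory tar archive, standing in for a cached year bundle."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def extract_member(tar_bytes: bytes, name: str) -> bytes:
    """Read one member by name; tarfile raises KeyError when it is missing."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        member = tar.extractfile(name)
        if member is None:  # directories and other non-regular entries
            raise KeyError(name)
        return member.read()
```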

`search(*, court=None, year=None, judge=None, party=None, citation=None, cnr=None, limit=50) -> list[Judgment]`

Search both archives in one call. When only cnr= is given, the query is routed by the CNR's 4-letter prefix, so court= is not needed; fetch_pdf("DLHC...") relies on the same routing.

Parameter	Type	Required	Default	Description
`court`	`Court | str | None`	No	`None`	Court object, code string (`"sci"`, `"delhi"`), or `None` to query both buckets
`year`	`int | tuple[int, int] | None`	No	`None`	Single year (`2020`) or inclusive range (`(2018, 2024)`). Drives partition pruning; strongly recommended for non-CNR queries
`judge`	`str | None`	No	`None`	Case-insensitive substring on the judge field
`party`	`str | None`	No	`None`	SCI: searches petitioner/respondent/title. HC: title only (HC parquet has no party columns)
`citation`	`str | None`	No	`None`	SCI only; silently ignored for HC
`cnr`	`str | None`	No	`None`	Exact CNR match. Auto-resolves source from the 4-letter prefix when `court` isn't given
`limit`	`int`	No	`50`	Total results across sources

Returns: list[Judgment] sorted by decision_date DESC.
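
The year filter accepts either a single year or an inclusive range. The sketch below shows how such a value might be normalised and turned into the year=YYYY partitions a query would touch. `year_bounds` and `year_partitions` are hypothetical names; the SDK's actual pruning logic lives in its DuckDB query layer and may differ.

```python
from typing import List, Optional, Tuple, Union

YearFilter = Union[int, Tuple[int, int], None]


def year_bounds(year: YearFilter) -> Optional[Tuple[int, int]]:
    """Normalise a single year or an inclusive (low, high) pair."""
    if year is None:
        return None
    if isinstance(year, int):
        return (year, year)
    low, high = year
    if low > high:
        raise ValueError("year range must be (low, high)")
    return (low, high)


def year_partitions(year: YearFilter) -> List[str]:
    """List the Hive-style year partitions covered; empty means no pruning."""
    bounds = year_bounds(year)
    if bounds is None:
        return []
    low, high = bounds
    return [f"year={y}" for y in range(low, high + 1)]
```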

`iter_judgments(*, court=None, year=None, judge=None, party=None, citation=None, cnr=None, batch_size=500, max_results=None) -> AsyncIterator[Judgment]`

Stream judgments page-by-page via LIMIT/OFFSET with a stable sort (decision_date DESC, cnr). Use this for bulk pulls — "all 18k Delhi 2020 judgments" — without materialising everything in memory.

Sources are streamed sequentially (SCI first, then HC) with no cross-source date merge.

count = 0
async for j in client.iter_judgments(court="delhi", year=2020, batch_size=500):
    count += 1
    # process(j)
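
LIMIT/OFFSET paging is only safe when the sort order is total, which is why cnr acts as a tie-breaker after decision_date. The sketch below demonstrates the pattern with sqlite3 from the standard library, swapped in for DuckDB so it runs without extra dependencies. The table layout and the `iter_rows` / `demo` names are assumptions for illustration.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_rows(conn: sqlite3.Connection, batch_size: int = 500) -> Iterator[Tuple[str, str]]:
    """Yield rows page by page using a stable ORDER BY with a tie-breaker."""
    offset = 0
    while True:
        page = conn.execute(
            "SELECT cnr, decision_date FROM judgments "
            "ORDER BY decision_date DESC, cnr LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not page:
            return
        yield from page
        offset += len(page)


def demo() -> List[Tuple[str, str]]:
    """Page through four invented rows, two at a time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE judgments (cnr TEXT, decision_date TEXT)")
    conn.executemany(
        "INSERT INTO judgments VALUES (?, ?)",
        [("C3", "2020-01-05"), ("C1", "2020-03-01"),
         ("C2", "2020-03-01"), ("C4", "2019-12-31")],
    )
    return list(iter_rows(conn, batch_size=2))
```

Without the cnr tie-breaker, the two rows sharing 2020-03-01 could swap places between pages, so one might be seen twice and the other skipped.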

`fetch_pdf(judgment_or_cnr, *, language="english") -> bytes`

Fetch a judgment PDF. Pass a Judgment (preferred — avoids a metadata lookup) or a CNR string. SCI judgments support language="hindi" | "tamil" | "gujarati" | … (see bharat_courts.archive.endpoints.SCI_LANGUAGE_MAP); HC PDFs are English-only in the archive.

data = await client.fetch_pdf("DLHC010230802020")        # ~250 KB direct GET
data = await client.fetch_pdf("ESCR010000301950", language="english")
# First SCI fetch in a year downloads the year tar (~40–500 MB); subsequent
# fetches for that year are tar-extraction-fast.

Raises: ArchivePdfError for missing files, missing metadata fields, or HTTP failures.

`prefetch_sci_year(year, language="english") -> str`

Pre-warm the SCI tar cache for a year. Useful before a batch of related fetches.

`count(*, court=None, year=None) -> dict[str, int]`

Per-bucket row counts, e.g. {"sci": 571} for a single SCI year.

`cache_info() -> dict`

Snapshot: {"cache_dir": ..., "files": ..., "bytes": ..., "max_bytes": ...}.
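
The bytes and max_bytes values relate to the on-disk cache limit set through cache_max_bytes. As a rough picture of least-recently-used eviction, the sketch below picks which files to drop until the total fits the limit. `CachedFile` and `plan_eviction` are hypothetical names, and the SDK's storage layer may track usage differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedFile:
    name: str
    size: int          # bytes on disk
    last_used: float   # e.g. a timestamp; smaller means older


def plan_eviction(files: List[CachedFile], max_bytes: int) -> List[str]:
    """Return the names to delete, oldest first, until the total fits max_bytes."""
    total = sum(f.size for f in files)
    evicted: List[str] = []
    for f in sorted(files, key=lambda item: item.last_used):
        if total <= max_bytes:
            break
        evicted.append(f.name)
        total -= f.size
    return evicted
```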

Helpers

from bharat_courts import infer_court_from_cnr

infer_court_from_cnr("DLHC010230802020")  # → Court(code="delhi", ...)
infer_court_from_cnr("ESCR010000301950")  # → SUPREME_COURT
infer_court_from_cnr("ZZZZ012345")        # → None

Use this if you're routing CNRs yourself (e.g. into the live JudgmentSearchClient).
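
For intuition, prefix routing amounts to a dictionary lookup on the first four characters of the CNR. The sketch below returns court code strings rather than Court objects and covers only the four prefixes this README names. `court_code_from_cnr` and the table constant are hypothetical; the real mapping is the one in the SDK.

```python
from typing import Optional

# Partial, illustrative mapping: only the prefixes mentioned in this README
CNR_PREFIX_TO_CODE = {
    "DLHC": "delhi",
    "ESCR": "sci",
    "HCBM": "bombay",
    "WBCH": "calcutta",
}


def court_code_from_cnr(cnr: str) -> Optional[str]:
    """Map a CNR to a court code via its 4-letter prefix; None if unknown."""
    return CNR_PREFIX_TO_CODE.get(cnr.strip().upper()[:4])
```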

Court Registry Functions

from bharat_courts import get_court, get_court_by_name, list_high_courts, list_all_courts
from bharat_courts.courts import get_court_by_judgment_code

`get_court(code) -> Court | None`

Look up a court by its code. Case-insensitive.

get_court("delhi")          # Delhi High Court
get_court("bombay-nagpur")  # Bombay HC, Nagpur Bench
get_court("sci")            # Supreme Court of India
get_court("nonexistent")    # None

`get_court_by_name(name) -> Court | None`

Look up a court by its full name. Case-insensitive exact match.

get_court_by_name("Delhi High Court")  # Court(name="Delhi High Court", ...)

`get_court_by_judgment_code(judgment_code) -> Court | None`

Look up a court by its judgments.ecourts.gov.in code. Returns the main court (not bench variants).

get_court_by_judgment_code("7")   # Delhi High Court
get_court_by_judgment_code("27")  # Bombay High Court (main, not bench)

`list_high_courts() -> list[Court]`

Returns all 29 High Court entries (25 HCs + bench-specific entries for Bombay and Allahabad).

`list_all_courts() -> list[Court]`

Returns all 30 courts (Supreme Court + all High Courts).

Module-level constants

from bharat_courts import ALL_COURTS, SUPREME_COURT

SUPREME_COURT  # Court(name="Supreme Court of India", code="sci", state_code="0")
ALL_COURTS     # list of all 30 Court objects

Data Models

All models are Python dataclasses with to_dict() and to_json() serialization methods.

# Available on all models
model.to_dict(exclude_none=False)   # -> dict (dates become ISO strings, enums become values)
model.to_json(indent=None, exclude_none=False)  # -> JSON string
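
The serialization rules (dates to ISO strings, enums to values, binary fields left out) can be sketched on a toy dataclass. `ExampleModel` and `DocType` are invented for this illustration and are not SDK classes; the SDK's implementation may differ in detail.

```python
import json
from dataclasses import dataclass, fields
from datetime import date
from enum import Enum
from typing import Optional


class DocType(str, Enum):
    JUDGMENT = "judgment"
    ORDER = "order"


@dataclass
class ExampleModel:
    title: str
    doc_type: DocType
    decided_on: Optional[date] = None
    pdf_bytes: Optional[bytes] = None  # binary payload, never serialized

    def to_dict(self, exclude_none: bool = False) -> dict:
        out = {}
        for f in fields(self):
            if f.name == "pdf_bytes":
                continue  # skip binary data entirely
            value = getattr(self, f.name)
            if isinstance(value, Enum):
                value = value.value
            elif isinstance(value, date):
                value = value.isoformat()
            if value is None and exclude_none:
                continue
            out[f.name] = value
        return out

    def to_json(self, indent: Optional[int] = None, exclude_none: bool = False) -> str:
        return json.dumps(self.to_dict(exclude_none=exclude_none), indent=indent)
```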

`Court`

@dataclass(frozen=True)
class Court:
    name: str                   # "Delhi High Court"
    code: str                   # "delhi"
    state_code: str             # "26" (hcservices.ecourts.gov.in)
    court_type: CourtType       # CourtType.HIGH_COURT
    bench: str | None = None    # "Lucknow Bench" (for bench-specific entries)
    judgment_code: str = ""     # "7" (judgments.ecourts.gov.in)

    @property
    def slug(self) -> str: ...                    # code lowercased, spaces replaced with hyphens
    @property
    def judgment_compound_code(self) -> str: ...  # "{judgment_code}~{state_code}", e.g. "7~26"
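
The two derived properties are simple string transformations. The sketch below reproduces them on a cut-down stand-in class so the behaviour described in the comments can be tried in isolation. `MiniCourt` is hypothetical and omits most of the real Court fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniCourt:
    code: str
    state_code: str
    judgment_code: str = ""

    @property
    def slug(self) -> str:
        # Lowercase the code and replace spaces with hyphens
        return self.code.lower().replace(" ", "-")

    @property
    def judgment_compound_code(self) -> str:
        # "{judgment_code}~{state_code}"
        return f"{self.judgment_code}~{self.state_code}"
```

With the Delhi values from the Supported Courts table (state code 26, judgment code 7) the compound code is "7~26".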

`CourtType`

class CourtType(str, Enum):
    SUPREME_COURT  = "supreme_court"
    HIGH_COURT     = "high_court"
    DISTRICT_COURT = "district_court"
    TRIBUNAL       = "tribunal"

`CaseInfo`

Returned by case_status() and case_status_by_party(), and as the first element of the tuple from CalcuttaHCClient.search_orders().

@dataclass
class CaseInfo:
    case_number: str                        # "3/2024"
    case_type: str                          # Case type label, e.g. "W.P.(C)"
    cnr_number: str = ""                    # "DLHC010582482024"
    filing_number: str = ""
    registration_number: str = ""
    registration_date: date | None = None
    petitioner: str = ""
    respondent: str = ""
    status: str = ""                        # empty for HC Services (showRecords doesn't return it)
    court_name: str = ""
    judges: list[str] = field(default_factory=list)
    next_hearing_date: date | None = None

`CaseOrder`

Returned by court_orders(), and as the second element of the tuple from CalcuttaHCClient.search_orders().

@dataclass
class CaseOrder:
    order_date: date
    order_type: str             # "Judgment" | "Order" | "Interim Order"
    judge: str = ""
    pdf_url: str = ""
    pdf_bytes: bytes | None = None   # populated by download_order_pdf(); excluded from serialization
    order_text: str = ""
    neutral_citation: str = ""  # e.g. "2024:CHC-AS:1277" (Calcutta HC)

`CauseListPDF`

Returned by cause_list().

@dataclass
class CauseListPDF:
    serial_number: int
    bench: str                  # "Division Bench"
    cause_list_type: str = ""   # "COMPLETE CAUSE LIST"
    pdf_url: str = ""
    pdf_bytes: bytes | None = None   # excluded from serialization

`JudgmentResult`

Produced by JudgmentSearchClient.search() (as the items of its SearchResult) / search_all() and by SCIClient.list_recent_judgments().

@dataclass
class JudgmentResult:
    title: str
    court_name: str
    case_number: str = ""
    judgment_date: date | None = None
    judges: list[str] = field(default_factory=list)
    pdf_url: str = ""
    pdf_bytes: bytes | None = None   # populated by download_pdf(); excluded from serialization
    citation: str = ""
    bench_type: str = ""             # "Division Bench" | "Single Bench" | "Full Bench"
    source_url: str = ""
    source_id: str = ""
    metadata: dict = field(default_factory=dict)

`SearchResult`

Returned by JudgmentSearchClient.search().

@dataclass
class SearchResult:
    items: list[CaseInfo | JudgmentResult | CauseListEntry] = field(default_factory=list)
    total_count: int = 0
    page: int = 1
    page_size: int = 10
    has_next: bool = False

    @property
    def total_pages(self) -> int: ...   # ceil(total_count / page_size)
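
The page count is the ceiling of total_count divided by page_size. A standalone sketch of that arithmetic follows; the function form and the guard against a non-positive page size are illustrative choices, not the SDK's code.

```python
import math


def total_pages(total_count: int, page_size: int) -> int:
    """Number of pages needed to hold total_count items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_count / page_size)
```

For instance, 25 results at 10 per page need 3 pages.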

`CauseListEntry`

Structured cause list data (for parsed cause list entries).

@dataclass
class CauseListEntry:
    serial_number: int
    case_number: str
    case_type: str = ""
    petitioner: str = ""
    respondent: str = ""
    advocate_petitioner: str = ""
    advocate_respondent: str = ""
    court_number: str = ""
    judge: str = ""
    listing_date: date | None = None
    item_number: str = ""

CAPTCHA Solvers

All solvers implement the CaptchaSolver abstract base class:

from bharat_courts.captcha.base import CaptchaSolver

class CaptchaSolver(ABC):
    @abstractmethod
    async def solve(self, image_bytes: bytes) -> str:
        """Given raw CAPTCHA image bytes, return the solved text."""

`OCRCaptchaSolver`

Automatic CAPTCHA solving using ddddocr. Requires pip install bharat-courts[ocr].

from bharat_courts.captcha.ocr import OCRCaptchaSolver

solver = OCRCaptchaSolver(
    preprocess=False,    # Apply image binarization + median filter before OCR
    threshold=128,       # Binarization threshold (0-255), used if preprocess=True
)

~75% accuracy on the judgments portal in our measurements; failed attempts are automatically retried with fresh sessions (default 5 retries — P(all fail) ≈ 0.1%). Outputs that aren't exactly 6 alphanumeric characters are rejected before being submitted, so the portal's "captcha must be 6 chars" envelope no longer burns a retry.
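
The 0.1% figure follows from treating each attempt as independent, which is an assumption since real CAPTCHA failures may be correlated. With 75% accuracy, every attempt fails with probability 0.25, so five failures in a row have probability 0.25^5. The helper name below is hypothetical.

```python
def all_attempts_fail(accuracy: float, attempts: int) -> float:
    """Probability that every one of `attempts` independent tries fails."""
    return (1.0 - accuracy) ** attempts


# 0.25 ** 5 = 1 / 1024, i.e. roughly 0.1%
p_fail = all_attempts_fail(0.75, 5)
```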

`ONNXCaptchaSolver`

Lightweight CAPTCHA solver using ONNX Runtime. Requires pip install bharat-courts[onnx]. Uses a pre-trained model from HuggingFace (captchabreaker), downloaded to ~/.cache/bharat-courts/ at init time.

Requires HF_TOKEN: The HuggingFace model repo requires authentication. Set export HF_TOKEN=hf_... (get a token at https://huggingface.co/settings/tokens). If you don't have a token, use OCRCaptchaSolver instead.

from bharat_courts.captcha.onnx import ONNXCaptchaSolver

solver = ONNXCaptchaSolver()

# Or with a custom model file
solver = ONNXCaptchaSolver(model_path="/path/to/custom_model.onnx")

Parameter	Type	Required	Default	Description
`model_path`	`str | Path | None`	No	`None`	Path to a custom ONNX model. If `None`, downloads the default captchabreaker model.

Validates that decoded text is exactly 6 characters — returns empty string on wrong length to trigger client retry.
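
A length check of this kind fits in a few lines. The sketch below also requires the characters to be alphanumeric, matching the rule described for the OCR solver; whether the ONNX solver applies the same character test is an assumption. `normalise_guess` is a hypothetical name.

```python
import re

_SIX_ALNUM = re.compile(r"[A-Za-z0-9]{6}")


def normalise_guess(text: str) -> str:
    """Return the guess if it is exactly 6 alphanumeric characters, else ''."""
    candidate = text.strip()
    return candidate if _SIX_ALNUM.fullmatch(candidate) else ""
```

Returning an empty string lets the caller treat a malformed guess as "try again" without spending a request on it.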

`ManualCaptchaSolver`

Interactive solver that saves the CAPTCHA image and prompts the user.

from bharat_courts.captcha.manual import ManualCaptchaSolver

# Prompt on stdin (saves image to /tmp/*.png for viewing)
solver = ManualCaptchaSolver()

# Or provide a custom callback (sync or async)
solver = ManualCaptchaSolver(callback=my_captcha_handler)

Parameter	Type	Required	Default	Description
`callback`	`Callable[[bytes], str | Awaitable[str]] | None`	No	`None`	Custom handler. Receives image bytes, returns solved text. If `None`, prompts on stdin.
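
Accepting a callback that is either sync or async usually means calling it and awaiting the result only when it is awaitable. The sketch below shows that pattern with the standard library. `run_handler` and the two sample handlers are hypothetical and do not reflect how ManualCaptchaSolver is written internally.

```python
import asyncio
import inspect
from typing import Awaitable, Callable, Union

Handler = Callable[[bytes], Union[str, Awaitable[str]]]


async def run_handler(handler: Handler, image_bytes: bytes) -> str:
    """Call the handler and await its result only if it is awaitable."""
    result = handler(image_bytes)
    if inspect.isawaitable(result):
        result = await result
    return result


def sync_handler(image_bytes: bytes) -> str:
    return "abc123"


async def async_handler(image_bytes: bytes) -> str:
    await asyncio.sleep(0)  # stand-in for real async work
    return "xyz789"
```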

Custom Solver

Implement CaptchaSolver for your own solving strategy:

from bharat_courts.captcha.base import CaptchaSolver

class MyCaptchaSolver(CaptchaSolver):
    async def solve(self, image_bytes: bytes) -> str:
        # Send to a CAPTCHA solving service, ML model, etc.
        return "solved_text"

async with HCServicesClient(captcha_solver=MyCaptchaSolver()) as client:
    ...

CLI

The CLI is organised into one command group per portal, matching the SDK module layout:

bharat-courts version
bharat-courts courts [--type all|hc|sc]
bharat-courts find             [--text | --judge | --party | --citation | --cnr | --court | --year | --source]
bharat-courts hcservices       benches | case-types | search | search-by-party | orders | cause-list
bharat-courts districtcourts   states | districts | complexes | establishments | case-types | courts | search | search-by-party | orders | cause-list
bharat-courts calcuttahc       search
bharat-courts judgments        search | search-all
bharat-courts sci              recent
bharat-courts archive          query | get | download | count | cache
bharat-courts install-skills

The top-level find command is the federated entry point: it routes between archive and live based on which filters you pass. Use it as the default for "find a judgment"; reach for the portal-specific groups when you need a portal-only feature.

Global flags (apply to every subcommand):

Flag	Description
`--json`	Emit machine-readable JSON instead of formatted text. Lists return arrays; single dataclasses return objects; `calcuttahc search` returns `{"case_info": ..., "orders": [...]}`.
`--captcha-attempts N`	Override the default CAPTCHA retry budget (5). Currently honoured by `judgments` and `calcuttahc`; `hcservices` and `districtcourts` use a fixed internal budget.
`--verbose` / `-v`	Enable INFO-level SDK logging on stderr.

Every PDF-producing command takes --download DIR to save PDFs alongside the printed output. Filenames are <case_or_title>_<date>.pdf, sanitised.

Examples

# Print version, list available courts
bharat-courts version
bharat-courts courts --type hc

# Federated find — routes between archive and live for you
bharat-courts find --judge "chandrachud" --year 2022 --court sci --limit 5
bharat-courts find --cnr DLHC010230802020              # auto-routed via CNR prefix
bharat-courts find --text "right to privacy" --limit 5  # → live (full-text)
bharat-courts find --text "asian hotels" --court delhi --year 2020  # mixed → archive
bharat-courts find --party "tata motors" --year 2020 --source archive  # force backend
bharat-courts --json find --judge "bobde" --year 2020 --court sci  # JSON output

# HC Services — discover bench / case-type codes, then search
bharat-courts hcservices benches delhi
bharat-courts hcservices case-types delhi --bench 1
bharat-courts hcservices search delhi --case-type 134 --case-number 1 --year 2024
bharat-courts hcservices orders delhi --case-type 134 --case-number 1 --year 2024 --download ./orders/
bharat-courts hcservices cause-list delhi --date 24-04-2026

# District Courts — drill down state -> district -> complex -> establishment
bharat-courts districtcourts states
bharat-courts districtcourts districts --state 8
bharat-courts districtcourts complexes --state 8 --dist 1
bharat-courts districtcourts case-types --state 8 --dist 1 --complex 1080010 --est 2
bharat-courts districtcourts search \
    --state 8 --dist 1 --complex 1080010 --est 2 \
    --case-type "89^2" --case-number 100 --year 2024
bharat-courts districtcourts cause-list \
    --state 8 --dist 1 --complex 1080010 --est 2 \
    --court-no "1@2"   # --court-name auto-resolves if blank

# Calcutta HC (returns case_info + orders)
bharat-courts calcuttahc search --case-type 12 --case-number 12886 --year 2024

# Judgments portal
bharat-courts judgments search --text "right to privacy" --page-size 25
bharat-courts judgments search-all --text "land acquisition" --max-pages 5 --download ./pdfs/

# Supreme Court — homepage feed
bharat-courts sci recent --limit 10
bharat-courts sci recent --limit 5 --download ./sci-pdfs/

# Historical archive (AWS Open Data buckets — needs `pip install bharat-courts[archive]`)
bharat-courts archive query --court sci --judge "chandrachud" --year 2022 --limit 5
bharat-courts archive query --court delhi --year 2020 --judge endlaw --limit 3
bharat-courts archive get --cnr DLHC010230802020 --pdf --out ./judgment.pdf
bharat-courts archive download --court sci --year 2020   # pre-warm the year tar
bharat-courts archive count --court sci --year 2020      # → "sci: 571"
bharat-courts archive cache                              # show disk usage
bharat-courts archive cache --clear                      # wipe local cache

# JSON output for piping to jq / spreadsheets
bharat-courts --json courts --type sc | jq '.[].name'
bharat-courts --json hcservices benches bombay
bharat-courts --json archive query --court sci --year 2020 --judge bobde --limit 10

# Install the AI agent skill bundle (Claude Code, Copilot, etc.)
bharat-courts install-skills

Configuration

Environment variables with BHARAT_COURTS_ prefix:

Variable	Default	Description
`BHARAT_COURTS_REQUEST_DELAY`	`1.0`	Seconds between requests
`BHARAT_COURTS_TIMEOUT`	`60`	Request timeout (seconds). Wide District Courts party-name searches genuinely take 30-60s on the portal — the previous default of 30 was too tight and triggered timeouts against endpoints that were about to respond.
`BHARAT_COURTS_MAX_RETRIES`	`3`	Retry count on failure (only applied to 5xx and connect/read timeouts; 4xx responses propagate immediately).
`BHARAT_COURTS_LOG_LEVEL`	`INFO`	Logging level

Or use a .env file. See .env.example.
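
As a rough picture of how these variables map to settings, the sketch below reads them from a mapping with the documented defaults and encodes the retry rule from the table (5xx and timeouts only). It uses only the standard library, whereas the SDK itself uses Pydantic Settings. `Settings`, `load_settings`, and `should_retry` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "BHARAT_COURTS_"


@dataclass(frozen=True)
class Settings:
    request_delay: float = 1.0
    timeout: float = 60.0
    max_retries: int = 3
    log_level: str = "INFO"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from the environment, falling back to the defaults."""
    source = os.environ if env is None else env
    return Settings(
        request_delay=float(source.get(PREFIX + "REQUEST_DELAY", "1.0")),
        timeout=float(source.get(PREFIX + "TIMEOUT", "60")),
        max_retries=int(source.get(PREFIX + "MAX_RETRIES", "3")),
        log_level=source.get(PREFIX + "LOG_LEVEL", "INFO"),
    )


def should_retry(status_code: Optional[int]) -> bool:
    """Retry on timeouts (modelled here as None) and 5xx; never on 4xx."""
    if status_code is None:
        return True
    return 500 <= status_code <= 599
```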

Supported Courts

All 25 High Courts with verified eCourts state codes and judgment portal codes:

Court	Code	State Code	Judgment Code
Allahabad HC	`allahabad`	13	9
Andhra Pradesh HC	`andhra`	2	28
Bombay HC	`bombay`	1	27
Calcutta HC	`calcutta`	16	19
Chhattisgarh HC	`chhattisgarh`	18	22
Delhi HC	`delhi`	26	7
Gauhati HC	`gauhati`	6	18
Gujarat HC	`gujarat`	17	24
Himachal Pradesh HC	`himachal`	5	2
J&K HC	`jammu`	12	1
Jharkhand HC	`jharkhand`	7	20
Karnataka HC	`karnataka`	3	29
Kerala HC	`kerala`	4	32
Madhya Pradesh HC	`mp`	23	23
Madras HC	`madras`	10	33
Manipur HC	`manipur`	25	14
Meghalaya HC	`meghalaya`	21	17
Orissa HC	`orissa`	11	21
Patna HC	`patna`	8	10
Punjab & Haryana HC	`punjab`	22	3
Rajasthan HC	`rajasthan`	9	8
Sikkim HC	`sikkim`	24	11
Telangana HC	`telangana`	29	36
Tripura HC	`tripura`	20	16
Uttarakhand HC	`uttarakhand`	15	5
Supreme Court	`sci`	0	—

Bombay and Allahabad HCs also have bench-specific entries (e.g., bombay-nagpur, allahabad-lucknow).

Contributing

Contributions are welcome! Here's how to get set up.

Prerequisites

Python 3.11+ — check with python3 --version
git

Dev environment setup

# 1. Fork and clone
git clone https://github.com/<your-username>/bharat-courts.git
cd bharat-courts

# 2. Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows

# 3. Install with all extras (OCR, CLI, dev tools)
pip install -e ".[all]"

# 4. Verify everything works
pytest                                    # 250 unit tests, no network needed
ruff check . && ruff format --check .     # lint + format check

Running tests

# Unit tests (fast, offline)
pytest

# Single test file
pytest tests/test_hcservices_parser.py

# Single test
pytest tests/test_hcservices_parser.py::test_parse_case_status_json

# With verbose output
pytest -v

# Live integration tests against real eCourts portals (requires ddddocr + network)
python tests/integration/hcservices.py    # live HC Services + CAPTCHA solver
python tests/integration/archive.py       # archive + facade against real S3
# see tests/integration/README.md for the full list

Code style

The project uses ruff for linting and formatting:

# Check for issues
ruff check .

# Auto-fix what's possible
ruff check --fix .

# Format code
ruff format .

Config is in pyproject.toml — Python 3.11 target, 100-char line length, rules: E/F/I/N/W.

Project structure

src/bharat_courts/
├── __init__.py          # Public API exports
├── models.py            # Dataclasses: CaseInfo, CaseOrder, CauseListPDF, etc.
├── config.py            # Pydantic Settings (BHARAT_COURTS_ env prefix)
├── http.py              # Rate-limited async HTTP client (httpx)
├── courts.py            # Registry of 25+ HCs with eCourts codes
├── captcha/
│   ├── base.py          # CaptchaSolver ABC
│   ├── manual.py        # Stdin/callback solver
│   ├── ocr.py           # ddddocr-based solver
│   └── onnx.py          # ONNX Runtime solver (captchabreaker)
├── hcservices/          # HC Services portal (primary, fully working)
│   ├── client.py        # HCServicesClient
│   ├── endpoints.py     # URL + form builders
│   └── parser.py        # JSON + HTML response parsers
├── districtcourts/      # District Courts portal (700+ courts)
│   ├── client.py        # DistrictCourtClient
│   ├── endpoints.py     # URL + form builders + state codes
│   └── parser.py        # HTML response parsers
├── calcuttahc/          # Calcutta High Court (direct website)
│   ├── client.py        # CalcuttaHCClient
│   ├── endpoints.py     # URL + form builders
│   └── parser.py        # JSON + HTML response parsers
├── judgments/           # Judgment Search portal (basic)
│   ├── client.py
│   ├── endpoints.py
│   └── parser.py
├── sci/                 # Supreme Court (basic)
│   ├── client.py
│   └── parser.py
├── archive/             # AWS Open Data archive (opt-in via [archive] extra)
│   ├── client.py        # ArchiveClient — async facade
│   ├── endpoints.py     # Bucket URIs + SCI language map
│   ├── metadata.py      # DuckDB query layer over partitioned parquet
│   ├── metadata_cache.py # Local mirror of parquet shards (TTL-invalidated)
│   ├── schema.py        # Row → Judgment mapping (handles SCI + HC schemas)
│   └── storage.py       # PDF cache (per-tar SCI, per-file HC) + LRU eviction
├── facade.py            # Judgments — federated find()/fetch_pdf, routes archive vs live
└── cli.py               # Click CLI entry point

Areas where help is needed

Better CAPTCHA solving — ddddocr is ~75% accurate on the judgments portal; the ONNX solver is an alternative, but a fine-tuned model would help further
District court search reliability — case_status, court_orders, and cause_list were rewired this cycle to send the right portal field names; broader coverage testing would surface remaining edge cases (and case_status_by_party still has no pagination)
Supreme Court case search — SCIClient.search_by_year / search_by_party are stubbed; the live www.sci.gov.in portal has a CAPTCHA-protected case-no/diary-no/party-name form that needs wiring up
HC Services case history — case_status doesn't return Pending/Disposed (or registration date / next hearing) because the SDK hits showRecords only; calling o_civil_case_history.php afterwards would fill in the rest
More High Court coverage — test the client against courts beyond Delhi/Bombay/Allahabad
Documentation — more examples, tutorials

Submitting changes

Fork the repo and create a branch (git checkout -b my-feature)
Make your changes
Run pytest and ruff check . to ensure tests pass and code is clean
Commit with a descriptive message
Open a pull request

How it works

HC Services Portal

The eCourts HC Services portal (hcservices.ecourts.gov.in) uses a PHP backend with:

Session cookies — GET main.php establishes HCSERVICES_SESSID
Securimage CAPTCHAs — pinned to the session (same image within one session)
AJAX POST requests — cases_qry/index_qry.php with action_code parameter
JSON responses — {"con": ["[{...}]"], "totRecords": N, "Error": ""}
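
Note that the rows in con arrive as JSON strings nested inside the outer JSON object, so they need a second decoding pass. The sketch below unpacks that shape. `parse_records` is a hypothetical helper, the row field names in the usage are invented, and treating a non-empty Error as a failure is an assumption.

```python
import json
from typing import List


def parse_records(payload: str) -> List[dict]:
    """Decode the outer envelope, then each JSON-encoded chunk inside 'con'."""
    outer = json.loads(payload)
    if outer.get("Error"):
        raise RuntimeError(outer["Error"])
    rows: List[dict] = []
    for chunk in outer.get("con") or []:
        rows.extend(json.loads(chunk))
    return rows


# Invented sample in the documented envelope shape
sample = json.dumps(
    {"con": [json.dumps([{"case_no": "1"}, {"case_no": "2"}])],
     "totRecords": 2, "Error": ""}
)
records = parse_records(sample)
```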

District Courts Portal

The District Courts portal (services.ecourts.gov.in/ecourtindia_v6/) uses a similar PHP backend with key differences:

Session cookies — SERVICES_SESSID (established on page load)
Rotating app_token — every AJAX response returns a new token that must be sent with the next request
MVC-style AJAX — /?p=controller/action URL pattern (e.g., /?p=casestatus/submitCaseNo)
HTML responses — search results are pre-rendered HTML tables (not JSON)
4-level court hierarchy — State → District → Court Complex → Establishment (discovered dynamically)

Both portals are handled transparently — session management, token rotation, CAPTCHA solving with retry, request/response parsing, and rate limiting.
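
Token rotation on the District Courts portal boils down to "send the latest token, then replace it with the one in the response". The sketch below models that bookkeeping with plain dictionaries. `TokenSession` is hypothetical, the form field names are invented, and the assumption that the response carries the token under an app_token key is for illustration only.

```python
from typing import Dict


class TokenSession:
    """Keeps the most recent app_token and attaches it to outgoing forms."""

    def __init__(self, initial_token: str) -> None:
        self.app_token = initial_token

    def build_form(self, **fields: str) -> Dict[str, str]:
        # Every request carries the token from the previous response
        return {**fields, "app_token": self.app_token}

    def absorb(self, response: dict) -> None:
        # Adopt the new token if the response includes one
        new_token = response.get("app_token")
        if new_token:
            self.app_token = new_token
```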

Historical Archive (AWS Open Data)

Two public S3 buckets in ap-south-1, CC-BY-4.0, maintained by Dattam Labs:

s3://indian-supreme-court-judgments/ — SCI judgments 1950–present, bi-monthly updates
s3://indian-high-court-judgments/ — 25 High Courts, quarterly updates

Both partition metadata as Hive-style parquet (SCI: year=YYYY/; HC: year=YYYY/court=<archive_id>_<state_code>/bench=<slug>/) and ship PDFs in per-year tar bundles. The HC bucket additionally exposes individual PDFs at data/pdf/year=…/court=…/bench=…/<basename>, so single-PDF fetches don't need to download the whole tar.
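
Hive-style paths encode column values as key=value directory segments, which is what lets a query engine skip whole directories. The sketch below splits such a path back into its keys. `parse_partition` is a hypothetical helper, and the sample values in the usage (court and bench) are made up rather than taken from the buckets.

```python
from typing import Dict


def parse_partition(path: str) -> Dict[str, str]:
    """Extract key=value segments from a Hive-style partition path."""
    parts: Dict[str, str] = {}
    for segment in path.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        if sep:  # ignore segments without '=' such as file names
            parts[key] = value
    return parts


# Made-up example values, for illustration only
example = parse_partition("year=2020/court=7_26/bench=example/")
```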

ArchiveClient reads anonymously via DuckDB's httpfs extension (no AWS account needed), translates rows through row_to_judgment() into the unified Judgment shape, and serves PDFs with on-disk LRU caching. CNR-only queries auto-route via the 4-letter prefix — DLHC* → Delhi, ESCR* → SCI, HCBM* → Bombay, WBCH* → Calcutta, etc. (the full mapping is in courts._CNR_PREFIX_TO_COURT_CODE, verified against a 2020 sample of every HC partition).

The archive is complementary to the live clients, not a replacement: it only contains delivered judgments and lags by 2–3 months, so case status, cause lists, and in-progress orders still need HCServicesClient / DistrictCourtClient.

License

MIT

Release history

0.3.0 (this version): May 16, 2026
0.2.1: Apr 25, 2026
0.2.0: Mar 19, 2026
0.1.2: Mar 2, 2026
0.1.1: Feb 28, 2026
0.1.0: Feb 28, 2026
