Upstox Instrument Query

Badges: PyPI version · Python 3.9+ · MIT License · Tests passing · Coverage 93% · Conventional Commits

A Python package for efficiently querying large Upstox instruments JSON files (~60 MB) by loading them into SQLite for fast lookups.

Features

  • Memory Efficient: Parses JSON as a stream, keeping the memory footprint small
  • High Performance: Uses SQLite with optimized indexes
  • Flexible: Query by instrument key, exchange, instrument type, or custom criteria
  • Caching: Implements LRU caching for frequently accessed queries
  • CLI Support: Command-line tools for database initialization and updates
  • URL Support: Direct loading from Upstox API URLs with gzip handling

Installation

pip install upstox-instrument-query

Usage

Initialize the Database

From a local JSON file:

upstox-query init /path/to/instruments.json /path/to/database.db

From the Upstox URL:

upstox-query init https://assets.upstox.com/market-quote/instruments/exchange/complete.json.gz /path/to/database.db --url
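Under the hood, `--url` mode presumably fetches the gzip archive and decompresses it before parsing. The stdlib-only sketch below shows the general idea; the helper name and the in-memory payload are invented for illustration and are not the package's actual code:

```python
import gzip
import io
import json

def load_gzipped_json(fileobj):
    """Decompress a gzip stream and parse the JSON payload (hypothetical helper)."""
    with gzip.GzipFile(fileobj=fileobj) as fh:
        return json.load(fh)

# In real use you would pass urllib.request.urlopen(url) as fileobj;
# here a small in-memory payload stands in for the download.
payload = gzip.compress(json.dumps([{"instrument_key": "NSE_EQ|INE002A01018"}]).encode())
data = load_gzipped_json(io.BytesIO(payload))
```

Wrapping the response in `gzip.GzipFile` decompresses lazily as `json.load` reads, so the raw archive never has to be written to disk first.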

Query the Data

from upstox_instrument_query import InstrumentQuery

# Initialize query interface
query = InstrumentQuery('/path/to/database.db')

# Get instrument by key
instrument = query.get_by_instrument_key('NSE_EQ|INE002A01018')
print(instrument)

# Filter by exchange
nse_instruments = query.filter_by_exchange('NSE')
print(f"Found {len(nse_instruments)} NSE instruments")

# Filter by instrument type
equity_instruments = query.filter_by_instrument_type('EQUITY')
print(f"Found {len(equity_instruments)} EQUITY instruments")

# Search by name (regex)
reliance_instruments = query.search_by_name('RELIANCE')
print(f"Found {len(reliance_instruments)} RELIANCE instruments:")
for instr in reliance_instruments[:3]:  # Print first 3
    print(f"- {instr['trading_symbol']} ({instr['instrument_type']})")

# Custom query
futures = query.custom_query('instrument_type = ? AND expiry > ?', ('FUTURES', '2025-01-01'))
print(f"Found {len(futures)} futures expiring after 2025-01-01")

CLI Commands

The package provides a command-line interface for common operations:

# Initialize database from file
upstox-query init /path/to/instruments.json /path/to/database.db

# Initialize database from URL
upstox-query init https://assets.upstox.com/market-quote/instruments/exchange/complete.json.gz /path/to/database.db --url

# Update existing database from file
upstox-query update /path/to/instruments.json /path/to/database.db

# Update existing database from URL
upstox-query update https://assets.upstox.com/market-quote/instruments/exchange/complete.json.gz /path/to/database.db --url

Advanced Usage

Case-Sensitive Name Searching

# Default is case-insensitive
case_insensitive = query.search_by_name('reliance')

# Enable case sensitivity
case_sensitive = query.search_by_name('RELIANCE', case_sensitive=True)
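A regex name search like this could plausibly be built on SQLite's `REGEXP` hook. The sketch below is one generic way to wire it up with the stdlib, not necessarily what upstox-instrument-query does internally; the table and rows are invented for illustration:

```python
import re
import sqlite3

def add_regexp(conn, case_sensitive=False):
    """Register a REGEXP function so `column REGEXP ?` works in queries."""
    flags = 0 if case_sensitive else re.IGNORECASE

    def regexp(pattern, value):
        # SQLite rewrites `X REGEXP Y` as regexp(Y, X): pattern first, value second.
        return value is not None and re.search(pattern, value, flags) is not None

    conn.create_function("REGEXP", 2, regexp)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instruments (name TEXT)")
conn.executemany(
    "INSERT INTO instruments VALUES (?)",
    [("Reliance Industries",), ("Tata Consultancy Services",)],
)
add_regexp(conn)  # case-insensitive by default
rows = conn.execute(
    "SELECT name FROM instruments WHERE name REGEXP ?", ("reliance",)
).fetchall()
```

SQLite ships no built-in `REGEXP` implementation, so registering a user function is the standard way to make the operator usable.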

Complex Custom Queries

# Find all NSE futures expiring in 2025 with a lot size greater than 500
complex_query = query.custom_query(
    'exchange = ? AND instrument_type = ? AND expiry LIKE ? AND lot_size > ?',
    ('NSE', 'FUTURES', '2025-%', 500)
)

Performance Notes

  • SQLite Storage: Uses a disk-based SQLite database, keeping memory usage low
  • Optimized Indexes: Indexes on instrument_key, exchange, instrument_type, and name speed up common queries
  • Result Caching: Caches repeated queries with lru_cache
  • Streaming Parser: Parses JSON as a stream from both local files and URLs, so large files never need to fit in memory
  • Compression Support: Decompresses gzip-encoded JSON from URLs on the fly
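To illustrate the streaming-parser idea, here is a stdlib-only sketch that yields objects from a top-level JSON array chunk by chunk and loads them into an indexed table. The function, schema, and sample data are invented for illustration and are not the package's actual internals:

```python
import io
import json
import sqlite3

def stream_array(fp, chunk_size=65536):
    """Yield objects from a top-level JSON array without loading the whole file."""
    decoder = json.JSONDecoder()
    buf = fp.read(chunk_size)
    idx = buf.index("[") + 1  # assumes the opening '[' is in the first chunk
    while True:
        while True:
            while idx < len(buf) and buf[idx] in " \t\r\n,":
                idx += 1  # skip separators between array elements
            if idx < len(buf) and buf[idx] == "]":
                return  # end of array
            try:
                obj, idx = decoder.raw_decode(buf, idx)
            except ValueError:
                break  # object may be split across chunks; read more
            yield obj
        chunk = fp.read(chunk_size)
        if not chunk:
            return
        buf = buf[idx:] + chunk  # drop consumed text, append fresh data
        idx = 0

# Load into an indexed SQLite table (illustrative schema)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instruments (instrument_key TEXT, exchange TEXT)")
conn.execute("CREATE INDEX idx_exchange ON instruments (exchange)")

sample = io.StringIO('[{"instrument_key": "NSE_EQ|X", "exchange": "NSE"}]')
conn.executemany(
    "INSERT INTO instruments VALUES (:instrument_key, :exchange)",
    stream_array(sample, chunk_size=16),
)
```

Because `executemany` consumes the generator lazily, only one buffered chunk plus one decoded object is in memory at a time, regardless of file size.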

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

  1. Clone the repository:

    git clone https://github.com/jinto-ag/upstox_instrument_query.git
    cd upstox_instrument_query
    
  2. Install development dependencies:

    pip install -e ".[dev]"
    # or
    pip install -e . pytest pytest-cov pre-commit
    
  3. Install pre-commit hooks:

    pre-commit install --hook-type pre-commit --hook-type commit-msg
    

Running Tests

Run the test suite using pytest:

pytest

To generate a coverage report:

pytest --cov=upstox_instrument_query --cov-report=html

The coverage report will be available in the htmlcov directory. Open htmlcov/index.html in your browser to view it.

Test-Driven Development (TDD)

This project follows Test-Driven Development practices for all new features. When contributing or adding new functionality, please follow these TDD principles:

  1. Write tests first: Before implementing any new feature, write tests that define the expected behavior
  2. Run the tests: Verify that the tests fail (since the feature doesn't exist yet)
  3. Implement the feature: Write the minimal code needed to make the tests pass
  4. Refactor: Clean up the code while ensuring tests continue to pass
  5. Repeat: Continue this cycle for each new piece of functionality

Example TDD workflow:

# 1. Create a test file for the new feature
touch tests/test_new_feature.py

# 2. Write tests for the expected behavior
# Edit tests/test_new_feature.py with test cases

# 3. Run tests to verify they fail appropriately
pytest tests/test_new_feature.py -v

# 4. Implement the feature
# Edit the relevant files in upstox_instrument_query/

# 5. Run tests again to see if they pass
pytest tests/test_new_feature.py -v

# 6. Run coverage to ensure proper test coverage
pytest --cov=upstox_instrument_query --cov-report=term-missing
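As a concrete illustration of the cycle, a new test file might look like the following. The `normalize_symbol` helper is invented for this example, not part of the package; in real TDD it would live under upstox_instrument_query/ and be imported, but it is inlined here to keep the sketch self-contained:

```python
# tests/test_new_feature.py -- hypothetical example
def normalize_symbol(symbol: str) -> str:
    """Invented helper: canonicalize a trading symbol for lookups."""
    return symbol.strip().upper()

def test_normalize_symbol_strips_and_uppercases():
    assert normalize_symbol("  reliance ") == "RELIANCE"

def test_normalize_symbol_is_idempotent():
    once = normalize_symbol("tcs")
    assert normalize_symbol(once) == once
```

Writing these tests first (step 2) and watching them fail (step 3) pins down the expected behavior before any implementation exists.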

Always aim to maintain high test coverage (90%+) when adding new features. The project CI pipeline will automatically check test coverage on pull requests.

Conventional Commits

This project follows the Conventional Commits specification for commit messages. This leads to more readable messages that are easy to follow when looking through the project history.

Commit messages should be structured as follows:

<type>[optional scope]: <description>

[optional body]

[optional footer(s)]

Types include:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that do not affect the meaning of the code
  • refactor: A code change that neither fixes a bug nor adds a feature
  • perf: A code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • chore: Changes to the build process or auxiliary tools

Examples:

feat: add option chain retrieval functionality
fix: correct database connection leak
docs: update README with new API methods

The pre-commit hooks enforce this convention when committing changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.
