Complete financial data scraper with CLI, GUI, and API interfaces
FinPull - Financial Data Scraper
FinPull is a comprehensive financial data scraping tool providing multiple interfaces for accessing financial market data. The package includes API, command-line, and graphical user interfaces, making it suitable for various use cases from automated trading systems to interactive data analysis.
Installation
pip install finpull
For API-only usage (lightweight):
pip install finpull-core
Quick Start
GUI Mode (Default)
finpull
Command Line Interface
# Interactive mode
finpull --interactive
# Direct commands
finpull add AAPL GOOGL MSFT
finpull show AAPL --full
finpull export portfolio.xlsx --xlsx
finpull refresh
Programmatic API
from finpull import FinancialDataAPI
api = FinancialDataAPI()
result = api.add_ticker("AAPL")
data = api.get_data("AAPL")
if data['success']:
    stock_info = data['data']
    print(f"Company: {stock_info['company_name']}")
    print(f"Price: ${stock_info['price']}")
    print(f"Market Cap: {stock_info['market_cap']}")
Performance
Benchmarks on typical hardware, with 10 test runs each:
| Metric | Core Package | Full Package | Difference |
|---|---|---|---|
| Package Size | 21.9 KB | 27.2 KB | +5.3 KB (+24.2%) |
| Installed Size | 134 KB | 188 KB | +54 KB (+40.3%) |
| Import Time (cached) | 0.0002s | 0.0002s | No difference |
| Dependencies | 3 packages | 4 packages | +openpyxl |
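The Difference column follows directly from the two size columns. A quick check of that arithmetic, using only the figures in the table (the helper name `pct_increase` is illustrative):

```python
def pct_increase(base: float, new: float) -> float:
    """Relative increase of `new` over `base`, in percent."""
    return (new - base) / base * 100

# Figures copied from the table above (sizes in KB).
package_delta = pct_increase(21.9, 27.2)
installed_delta = pct_increase(134, 188)

print(f"Package size:   +{27.2 - 21.9:.1f} KB (+{package_delta:.1f}%)")
print(f"Installed size: +{188 - 134} KB (+{installed_delta:.1f}%)")
```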
Key Features
- Multiple Interfaces: GUI, CLI, and API access
- Export Options: JSON, CSV, and Excel formats
- Progress Tracking: Real-time operation status
- Cross-Platform: Windows, macOS, and Linux support
Interfaces
Graphical User Interface (GUI)
Launch: finpull or finpull --gui
Features:
- Data Grid: Complete view of all 27 financial metrics with horizontal/vertical scrolling
- Multi-Selection: Select and manage multiple tickers simultaneously (Ctrl+Click, Shift+Click)
- Smart Sorting: Click column headers to sort by any metric with data type awareness
- Real-time Updates: Progress indicators showing 🔄 loading, ✅ success, ❌ error states
- Export Dialog: Save data to JSON, CSV, or Excel with file browser integration
- Status Bar: Current operation status and ticker count display
- Responsive Layout: Adapts to different screen sizes and resolutions
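How the GUI implements type-aware sorting is not documented here. As a rough sketch of the idea, a sort key can convert numeric-looking strings to numbers and fall back to text for everything else. The value formats handled below (`$`, `%`, and `K`/`M`/`B`/`T` suffixes) and the name `sort_key` are assumptions for illustration, not the package's code.

```python
import re

# Multipliers for magnitude suffixes often seen in market data (assumed formats).
_SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}

def sort_key(value: str):
    """Sort numeric-looking strings by value; place other strings after them, alphabetically."""
    text = value.strip().replace(",", "").lstrip("$")
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)([KMBT%]?)", text)
    if match is None:
        return (1, 0.0, value.lower())  # non-numeric: after all numbers
    number, suffix = float(match.group(1)), match.group(2)
    # "%" has no multiplier, so percentages compare by their plain number.
    return (0, number * _SUFFIXES.get(suffix, 1.0), "")

caps = ["850.5B", "2.9T", "N/A", "12.4M"]
print(sorted(caps, key=sort_key))
```

A plain string sort would order these values by their leading characters; the key above orders them by magnitude and keeps placeholders such as "N/A" at the end.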
Command Line Interface (CLI)
Launch: finpull --interactive or direct commands
Available Commands:
- add <tickers> - Add ticker symbols for tracking
- remove <tickers> - Remove tickers from tracking
- show [ticker] [--full] - Display ticker information
- refresh [ticker] - Update data from sources
- export <filename> [--json] [--csv] [--xlsx] - Save data to files
- stats - Show system statistics and health
- clear - Remove all tracked data (with confirmation)
Features:
- Interactive Mode: Shell-like interface (finpull> prompt) for exploration
- Direct Commands: Single-command operations perfect for scripting and automation
- Batch Operations: Process multiple tickers efficiently in one command
- Formatted Output: Beautiful ASCII tables with aligned columns and borders
- Progress Indicators: Real-time status updates for long-running operations
- Auto-completion: Tab completion for commands and options (in interactive mode)
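Because direct commands are plain argument lists, they are easy to drive from a script. The sketch below builds the argument vector for the `export` command using only the flags listed above; the helper name `export_command` is hypothetical, and the snippet prints the command rather than running it.

```python
import shlex
from typing import List, Sequence

def export_command(filename: str, formats: Sequence[str]) -> List[str]:
    """Build the argv for `finpull export <filename> [--json] [--csv] [--xlsx]`."""
    allowed = {"json", "csv", "xlsx"}
    unknown = [f for f in formats if f not in allowed]
    if unknown:
        raise ValueError(f"unsupported export format(s): {unknown}")
    return ["finpull", "export", filename] + [f"--{f}" for f in formats]

argv = export_command("portfolio", ["json", "csv"])
print(shlex.join(argv))
```

To execute the command, the same list can be passed to `subprocess.run(argv, check=True)`.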
API Interface
Import: from finpull import FinancialDataAPI, FinancialDataScraper
Classes:
- FinancialDataAPI: High-level interface with error handling and validation
- FinancialDataScraper: Low-level scraper for direct data access and control
- FinancialData: Data model with 27+ financial attributes
Features:
- Consistent Responses: All methods return responses in the same standardized, JSON-style structure
- Comprehensive Error Handling: Detailed error codes and descriptive messages
- Type Hints: Full type annotation support for better IDE integration
- Validation: Built-in ticker format validation and data sanitization
- Callback Support: Progress callbacks for batch operations and real-time updates
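The examples in this README read `success`, `data`, and `error` keys from API responses. Assuming that shape holds for every method, a small helper can turn a failed response into an exception so calling code does not repeat the check. `FinPullError` and `unwrap` are hypothetical names, not part of the package, and the responses below are hand-written stand-ins.

```python
from typing import Any, Dict

class FinPullError(RuntimeError):
    """Hypothetical exception type; not provided by the package."""

def unwrap(response: Dict[str, Any]) -> Any:
    """Return response['data'] on success, otherwise raise with the reported error."""
    if response.get("success"):
        return response.get("data")
    raise FinPullError(response.get("error", "unknown error"))

# Stand-in responses shaped like the ones used in this README's examples.
ok = {"success": True, "data": {"ticker": "AAPL"}}
failed = {"success": False, "error": "Invalid ticker format"}

print(unwrap(ok)["ticker"])
try:
    unwrap(failed)
except FinPullError as exc:
    print(f"failed: {exc}")
```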
Data Coverage
FinPull provides 27 financial metrics per ticker across seven categories: basic info, valuation ratios, earnings, profitability, growth, financial position, and market data. Data comes from Finviz and Yahoo Finance, with automatic failover between the two sources for reliability.
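The failover logic itself is not shown in this README. A minimal sketch of the general pattern, assuming each source is a callable that either returns data or raises, looks like this. The fetcher functions are hypothetical stand-ins and perform no network access.

```python
from typing import Callable, Dict, Sequence

Fetcher = Callable[[str], Dict[str, str]]

def fetch_with_failover(ticker: str, sources: Sequence[Fetcher]) -> Dict[str, str]:
    """Try each source in order and return the first successful result."""
    errors = []
    for source in sources:
        try:
            return source(ticker)
        except Exception as exc:  # a real scraper would catch narrower errors
            errors.append(f"{source.__name__}: {exc}")
    raise RuntimeError(f"all sources failed for {ticker}: " + "; ".join(errors))

# Hypothetical stand-ins for the two data sources.
def primary_source(ticker: str) -> Dict[str, str]:
    raise ConnectionError("simulated outage")

def secondary_source(ticker: str) -> Dict[str, str]:
    return {"ticker": ticker, "source": "secondary"}

print(fetch_with_failover("AAPL", [primary_source, secondary_source]))
```

Collecting the per-source errors means the final exception explains why every source failed, not just the last one.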
Configuration
Environment Variables
# Custom storage location
export FINPULL_STORAGE_FILE="/path/to/custom/storage.json"
# Rate limiting (seconds between requests)
export FINPULL_RATE_LIMIT="2"
# Logging level (DEBUG, INFO, WARNING, ERROR)
export FINPULL_LOG_LEVEL="INFO"
# GUI theme (if supported)
export FINPULL_GUI_THEME="default"
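For scripts that need the same settings, the variables can be read with the standard library. This is a sketch of one way to do it, not the package's own loader; the `Settings` class, the `load_settings` function, and the fallback values for the log level and theme are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping

@dataclass
class Settings:
    storage_file: str
    rate_limit: float  # seconds between requests
    log_level: str
    gui_theme: str

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the FINPULL_* variables, falling back to assumed defaults when unset."""
    return Settings(
        storage_file=env.get("FINPULL_STORAGE_FILE", os.path.expanduser("~/.finpull/data.json")),
        rate_limit=float(env.get("FINPULL_RATE_LIMIT", "1")),
        log_level=env.get("FINPULL_LOG_LEVEL", "INFO").upper(),
        gui_theme=env.get("FINPULL_GUI_THEME", "default"),
    )

# Pass a plain dict instead of os.environ to try the function without changing the shell.
settings = load_settings({"FINPULL_RATE_LIMIT": "2", "FINPULL_LOG_LEVEL": "debug"})
print(settings.rate_limit, settings.log_level)
```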
Storage
Data is persisted locally in JSON format with automatic backups:
- Linux/macOS: ~/.finpull/data.json
- Windows: %USERPROFILE%\.finpull\data.json
- Backup: Automatic backup before major operations
- Format: Human-readable JSON with proper indentation
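The backup-then-write behavior described above can be sketched with the standard library. The `.bak` suffix, the function name `save_with_backup`, and the `tickers` key are assumptions for illustration; the package's actual backup naming and file layout are not documented here.

```python
import json
import shutil
import tempfile
from pathlib import Path

def save_with_backup(path: Path, data: dict) -> None:
    """Copy the existing file to `<name>.bak`, then write `data` as indented JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")

# Demonstrate in a temporary directory rather than the real ~/.finpull location.
with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp) / ".finpull" / "data.json"
    save_with_backup(storage, {"tickers": ["AAPL"]})
    save_with_backup(storage, {"tickers": ["AAPL", "MSFT"]})
    backup = json.loads(storage.with_name("data.json.bak").read_text(encoding="utf-8"))
    current = json.loads(storage.read_text(encoding="utf-8"))
    print(backup["tickers"], current["tickers"])
```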
Rate Limiting
Built-in intelligent rate limiting prevents API blocks:
- Default: 1 request per second (configurable)
- Adaptive: Automatically increases delays if rate limits detected
- Burst Protection: Prevents accidental rapid-fire requests
- Source-Specific: Different limits for different data sources
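As an illustration of the adaptive behavior listed above, the sketch below enforces a minimum delay between requests and doubles it whenever the caller reports being throttled. The class name, the doubling policy, and the 60-second cap are assumptions, not the package's implementation.

```python
import time

class AdaptiveRateLimiter:
    """Enforce a minimum delay between calls and back off when throttled."""

    def __init__(self, delay=1.0, max_delay=60.0, clock=time.monotonic, sleep=time.sleep):
        self.base_delay = delay
        self.delay = delay
        self.max_delay = max_delay
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until at least `delay` seconds have passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

    def throttled(self):
        """Double the delay (up to max_delay) after a rate-limit response."""
        self.delay = min(self.delay * 2, self.max_delay)

    def succeeded(self):
        """Return to the base delay after a normal response."""
        self.delay = self.base_delay

limiter = AdaptiveRateLimiter(delay=1.0)
limiter.throttled()
limiter.throttled()
print(limiter.delay)  # two back-offs from 1.0 second
```

A caller would invoke `wait()` before each request, then `throttled()` or `succeeded()` depending on the response. Source-specific limits could be modeled by keeping one limiter per data source.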
Documentation
- Interface Guide - Complete interface documentation
- API Reference - API methods and responses
- Data Format - JSON schema and field descriptions
Web Integration
Supports browser integration via Pyodide and Node.js via CLI commands. See main repository README for complete integration examples and code samples.
Package Architecture
finpull/
├── Core Package (finpull-core)
│   ├── API interface
│   ├── Data scraping engine
│   ├── Storage management
│   └── Utility functions
└── Interface Extensions
    ├── Command-line interface
    ├── Graphical user interface
    └── Excel export functionality
The full package depends on finpull-core for core functionality, ensuring:
- No code duplication between packages
- Consistent API across all interfaces
- Modular installation options
- Streamlined maintenance and updates
Examples
Portfolio Management
from finpull import FinancialDataAPI
api = FinancialDataAPI()
# Build a technology portfolio
tech_stocks = ["AAPL", "GOOGL", "MSFT", "AMZN", "TSLA", "META", "NVDA"]
results = api.batch_add_tickers(tech_stocks)
print(f"Successfully added {results['summary']['added_count']} stocks")
# Analyze portfolio performance
portfolio_data = api.get_data()
for stock in portfolio_data['data']:
    print(f"{stock['ticker']}: ${stock['price']} | P/E: {stock['pe_ratio']} | Cap: {stock['market_cap']}")
Automated CLI Workflow
#!/bin/bash
# Daily portfolio update script
echo "Updating portfolio..."
# Add new stocks if needed
finpull add AAPL GOOGL MSFT
# Refresh all data
finpull refresh
# Generate reports
finpull export "reports/portfolio_$(date +%Y%m%d)" --json --csv --xlsx
# Show summary
finpull show --full
echo "Portfolio update complete"
Real-time Monitoring
import time
from finpull import FinancialDataAPI
api = FinancialDataAPI()
# Add watchlist
watchlist = ["AAPL", "GOOGL", "MSFT", "TSLA"]
api.batch_add_tickers(watchlist)
while True:
    # Refresh data every 5 minutes
    api.refresh_data()
    # Check for significant changes
    data = api.get_data()
    for stock in data['data']:
        change_5y = float(stock['change_5y'].replace('%', ''))
        if abs(change_5y) > 10:  # More than 10% change
            print(f"Alert: {stock['ticker']} changed {change_5y}% over 5 years")
    time.sleep(300)  # 5 minutes
GUI Automation
import threading
from finpull import FinancialDataGUI
def setup_automated_gui():
    gui = FinancialDataGUI()
    # Pre-populate with data
    gui.scraper.add_ticker("AAPL")
    gui.scraper.add_ticker("GOOGL")
    gui.refresh_display()
    # Run GUI in separate thread
    gui_thread = threading.Thread(target=gui.run)
    gui_thread.start()
    return gui
# Launch automated GUI session
gui = setup_automated_gui()
Error Handling & Logging
import logging
from finpull import FinancialDataAPI
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
api = FinancialDataAPI()
def safe_operation(ticker):
    try:
        result = api.add_ticker(ticker)
        if result['success']:
            logger.info(f"Successfully added {ticker}")
            return True
        else:
            logger.warning(f"Failed to add {ticker}: {result.get('error')}")
            return False
    except Exception as e:
        logger.error(f"Exception for {ticker}: {e}")
        return False
# Safe batch processing
tickers = ["AAPL", "INVALID", "GOOGL", "MSFT"]
successful = [t for t in tickers if safe_operation(t)]
print(f"Successfully processed {len(successful)}/{len(tickers)} tickers")
License
MIT License - see LICENSE file for details.
Links
- Core Package - Lightweight API-only version
- Source Code - GitHub repository
- Issues - Bug reports and feature requests
File details
Details for the file finpull-1.1.0.tar.gz.
File metadata
- Download URL: finpull-1.1.0.tar.gz
- Upload date:
- Size: 27.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 45e09189379327164a0eeaf15f15a4a0db50d349d409d951813f579c13527a1d |
| MD5 | 61242e04e0e43dba49e8a1d5c1b24aaf |
| BLAKE2b-256 | c5b0f9dd791a8c902a0f867fcfa3d96aaee0f5ca8403e8158b7c6909724896d8 |
File details
Details for the file finpull-1.1.0-py3-none-any.whl.
File metadata
- Download URL: finpull-1.1.0-py3-none-any.whl
- Upload date:
- Size: 24.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 240db33126b6a3ce87a7909704ff4c0af141aebe278bdf479cf95877ef75ed49 |
| MD5 | 88d087a814bbd8363b605460668e99ea |
| BLAKE2b-256 | 1c1bda2b75b366f45ba30d55387494c0244f334bee7d959a0f40c79634f4084f |