A tool for scraping historical forex data from Yahoo Finance
Forex Data Extractor
A robust, type-safe Python package for extracting historical forex data from Yahoo Finance. Built with enterprise-grade architecture using Pydantic validation, async/await support, and comprehensive error handling.
Table of Contents
- Features
- Quick Start
- Installation
- Requirements
- Usage Examples
- Use Cases
- API Reference
- Screenshots
- Configuration
- License
- Contributing
- Acknowledgments
- Contact
Features
✨ Professional-Grade Architecture
- Type-safe Pydantic models with comprehensive validation
- Async/await support for high-performance data extraction
- Robust error handling and detailed logging
📊 Flexible Data Export
- Multiple output formats: CSV, JSON, or both simultaneously
- Smart file operations with append/overwrite capabilities
- Configurable output directories and naming conventions
🎯 Precision Financial Data
- Decimal precision for accurate price handling
- Comprehensive date validation and constraint checking
- Yahoo Finance integration with reliable scraping
⚡ Developer Experience
- Interactive CLI with guided prompts
- Command-line interface for automation and scripting
- Extensive configuration options for customization
🔧 Enterprise Features
- Playwright-based browser automation for reliability
- Resource optimization with selective content blocking
- Comprehensive metadata tracking and export statistics
Quick Start
Get up and running in 30 seconds:
# Install the package
pip install forex-data-extractor
# Extract USD/EUR data for the last year (interactive mode)
forex-scraper --interactive
# Or use direct command-line
forex-scraper USDEUR "Jan 01, 2024" "Dec 31, 2023" csv
That's it! Your forex data will be saved to ./Extracted_Data/USDEUR_historical_data.csv
Installation
From PyPI (Recommended)
pip install forex-data-extractor
From GitHub (Latest Development)
pip install git+https://github.com/yungKnight/forex_data_extractor.git
For Development
git clone https://github.com/yungKnight/forex_data_extractor.git
cd forex_data_extractor
pip install -e .
Requirements
- Python: >= 3.8
- Operating System: Cross-platform (Windows, macOS, Linux)
- Dependencies: Automatically installed with the package
  - Playwright (browser automation)
  - Scrapy (web scraping framework)
  - Pydantic (data validation)
  - Additional dependencies listed in requirements.txt
Usage Examples
Command Line Interface
Basic Usage
# Extract EUR/JPY data for Q1 2024
forex-scraper EURJPY "Mar 31, 2024" "Jan 01, 2024" csv
# Get GBP/USD data in JSON format
forex-scraper GBPUSD "Dec 31, 2023" "Jan 01, 2023" json
# Export both CSV and JSON formats
forex-scraper USDCAD "Jun 30, 2024" "Jan 01, 2024" both
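When the same range is needed for several pairs, the command lines above can be generated rather than typed by hand. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper, and it assumes the CLI accepts dates in the "Mar 31, 2024" form shown in the examples, with arguments in the same order.

```python
from datetime import date
from typing import List


def build_command(pair: str, start: date, end: date, fmt: str = "csv") -> List[str]:
    """Build the argument list for one forex-scraper call (hypothetical helper)."""
    # "%b %d, %Y" renders e.g. "Mar 31, 2024"; %b depends on the locale,
    # so this assumes an English (or default C) locale.
    date_format = "%b %d, %Y"
    return [
        "forex-scraper",
        pair,
        start.strftime(date_format),
        end.strftime(date_format),
        fmt,
    ]


# The same Q1 2024 range for several pairs, mirroring the examples above.
commands = [
    build_command(pair, date(2024, 3, 31), date(2024, 1, 1), "csv")
    for pair in ("EURJPY", "GBPUSD", "USDCAD")
]

for command in commands:
    print(" ".join(command))
    # To actually run a command: subprocess.run(command, check=True)
```

Passing the arguments as a list to `subprocess.run` avoids shell quoting issues with the spaces and commas in the date strings.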
Interactive Mode
forex-scraper --interactive
# Follow the guided prompts for currency pair, dates, and format selection
Advanced CLI Options
# Show help and all available options
forex-scraper --help
# Check version
forex-scraper --version
Programmatic Usage
Basic Extraction
from forex_data_extractor import fetch_forex_data
from datetime import datetime
# Simple synchronous extraction
result = fetch_forex_data(
    currency_pair="USDEUR",
    start_date=datetime(2024, 3, 31),
    end_date=datetime(2024, 1, 1),
    output_format="json"
)
if result.success:
    print(f"Extracted {len(result.data_points)} data points")
    for point in result.data_points[:5]:  # Show first 5 points
        print(f"{point.date_string}: {point.close_price}")
else:
    print(f"Extraction failed: {result.error_message}")
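The success/failure check above extends naturally to several currency pairs. This is a rough sketch under assumptions: `extract_many` is a hypothetical helper, not a package function, and it relies only on the `success` and `error_message` attributes used in the example. The fetch function is passed in as a parameter, so a stand-in can be used here and the sketch runs without network access.

```python
from datetime import datetime
from types import SimpleNamespace


def extract_many(pairs, start_date, end_date, fetch, output_format="csv"):
    """Call `fetch` once per pair and split outcomes into successes and failures."""
    succeeded, failed = {}, {}
    for pair in pairs:
        result = fetch(
            currency_pair=pair,
            start_date=start_date,
            end_date=end_date,
            output_format=output_format,
        )
        if result.success:
            succeeded[pair] = result
        else:
            failed[pair] = result.error_message
    return succeeded, failed


# Stand-in for fetch_forex_data so the sketch runs offline (made-up behaviour).
def fake_fetch(currency_pair, start_date, end_date, output_format):
    if currency_pair == "XXXXXX":
        return SimpleNamespace(success=False, data_points=[], error_message="unknown pair")
    return SimpleNamespace(success=True, data_points=[1, 2, 3], error_message=None)


succeeded, failed = extract_many(
    ["USDEUR", "XXXXXX"],
    datetime(2024, 3, 31),
    datetime(2024, 1, 1),
    fetch=fake_fetch,
)
print(sorted(succeeded), failed)
```

In real use, `fetch=fetch_forex_data` would replace the stand-in, assuming it accepts the keyword arguments shown in the example above.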
Advanced Async Usage
import asyncio
from forex_data_extractor import ForexDataExtractor, create_extraction_request
from datetime import datetime
async def advanced_extraction():
    # Create a structured request
    request = create_extraction_request(
        currency_pair="GBPUSD",
        start_date=datetime(2024, 12, 31),
        end_date=datetime(2024, 1, 1),
        output_file="gbp_usd_2024.json",
        output_format="both"
    )
    # Use the extractor class for full control
    extractor = ForexDataExtractor()
    result = await extractor.extract_forex_data(request)
    # Access comprehensive metadata
    print(f"URL accessed: {result.metadata.url_accessed}")
    print(f"Headers found: {result.metadata.headers_found}")
    print(f"Extraction time: {result.metadata.extraction_timestamp}")
    return result
# Run the async extraction
result = asyncio.run(advanced_extraction())
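One reason to use the async API is to run several extractions concurrently. The sketch below shows one way this might be done with `asyncio.gather` and a semaphore that caps how many run at once. It is an illustration only: `extract_concurrently` is a hypothetical helper, the stand-in coroutine replaces `extractor.extract_forex_data`, and whether the package is safe to call concurrently is an assumption that should be checked before relying on it.

```python
import asyncio
from types import SimpleNamespace


async def extract_concurrently(requests, extract, max_concurrent=2):
    """Await `extract(request)` for every request, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(request):
        async with semaphore:
            return await extract(request)

    # gather() returns results in the same order as the requests.
    return await asyncio.gather(*(run_one(request) for request in requests))


# Stand-in coroutine so the sketch runs offline (made-up behaviour).
async def fake_extract(request):
    await asyncio.sleep(0)
    return SimpleNamespace(success=True, request=request)


results = asyncio.run(
    extract_concurrently(["GBPUSD", "EURUSD", "USDCAD"], fake_extract)
)
print([r.request for r in results])
```

A low `max_concurrent` is a deliberately cautious default, since each extraction drives a browser session.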
Data Processing and Analysis
import asyncio
from forex_data_extractor import get_forex_data
import pandas as pd
from datetime import datetime
async def analyze_forex_data():
    # Extract data
    result = await get_forex_data("EURUSD", datetime(2024, 6, 30), datetime(2024, 1, 1))
    # Convert to DataFrame for analysis
    data = [(point.date, float(point.close_price)) for point in result.data_points]
    df = pd.DataFrame(data, columns=['Date', 'Close'])
    df['Date'] = pd.to_datetime(df['Date'])
    # Basic analytics
    print(f"Average rate: {df['Close'].mean():.4f}")
    print(f"Min rate: {df['Close'].min():.4f}")
    print(f"Max rate: {df['Close'].max():.4f}")
    print(f"Volatility (std): {df['Close'].std():.4f}")
    return df
# Usage
df = asyncio.run(analyze_forex_data())
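For a quick look at day-to-day changes without pandas, the closing prices can be kept as `Decimal` values and turned into simple returns. This is a minimal sketch with made-up sample data: `daily_returns` is a hypothetical helper, and it assumes each data point exposes `date` and `close_price` as in the example above. The order in which the package returns points is not documented here, so the sketch sorts by date first.

```python
from datetime import date
from decimal import Decimal


def daily_returns(points):
    """Return (date, fractional change vs. the previous day) for (date, close) pairs."""
    ordered = sorted(points, key=lambda point: point[0])  # oldest first
    return [
        (current_date, (current_close - previous_close) / previous_close)
        for (_, previous_close), (current_date, current_close) in zip(ordered, ordered[1:])
    ]


# Made-up sample data, deliberately out of order.
sample = [
    (date(2024, 1, 3), Decimal("1.10")),
    (date(2024, 1, 1), Decimal("1.00")),
    (date(2024, 1, 2), Decimal("1.05")),
]

returns = daily_returns(sample)
for day, change in returns:
    print(day, f"{change:.4%}")
```

With real results, the input could be built as `[(p.date, p.close_price) for p in result.data_points]`, assuming `close_price` is a `Decimal`.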
Use Cases
📈 Financial Research & Analysis
- Historical exchange rate analysis for academic research
- Currency trend analysis and statistical modeling
- Risk assessment and volatility calculations
- Economic indicator correlation studies
🏦 Fintech Development
- Building financial dashboards and applications
- Currency conversion service data feeds
- Algorithmic trading strategy backtesting
- Financial data pipeline integration
💼 Business Intelligence
- Multi-currency business performance analysis
- International trade impact assessment
- Foreign exchange exposure reporting
- Economic forecasting and planning
🔬 Data Science Projects
- Machine learning model training data
- Time series forecasting experiments
- Financial data preprocessing pipelines
- Cross-currency correlation analysis
API Reference
Key Functions
fetch_forex_data(currency_pair, start_date, end_date, output_format="csv")
Synchronous data extraction function - ideal for simple use cases.
get_forex_data(currency_pair, start_date, end_date, output_format="csv")
Async data extraction function - use for high-performance applications.
create_extraction_request(currency_pair, start_date, end_date, **kwargs)
Factory function for creating validated extraction requests.
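To give a feel for what a "validated request" can mean, here is a stdlib-only sketch of the kind of checks such a request might perform. It is not the package's implementation: the real `ExtractionRequest` is a Pydantic model, `RequestSketch` is a hypothetical stand-in, and the specific rules (six uppercase letters, the three output formats, `start_date` not earlier than `end_date` as in the examples in this README) are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RequestSketch:
    """Hypothetical stand-in for a validated extraction request."""

    currency_pair: str
    start_date: datetime
    end_date: datetime
    output_format: str = "csv"

    def __post_init__(self):
        pair = self.currency_pair
        if len(pair) != 6 or not pair.isalpha() or not pair.isupper():
            raise ValueError(f"currency_pair must be six uppercase letters, got {pair!r}")
        if self.output_format not in ("csv", "json", "both"):
            raise ValueError(f"unsupported output_format: {self.output_format!r}")
        # Assumption: the examples pass the more recent date as start_date.
        if self.start_date < self.end_date:
            raise ValueError("start_date must not be earlier than end_date")


request = RequestSketch("GBPUSD", datetime(2024, 12, 31), datetime(2024, 1, 1), "both")
print(request)

try:
    RequestSketch("gbp/usd", datetime(2024, 12, 31), datetime(2024, 1, 1))
except ValueError as error:
    print(f"Rejected: {error}")
```

Validating at construction time means a bad pair or format is rejected before any browser session starts.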
Core Classes
ForexDataExtractor
Main extraction engine with async support and comprehensive error handling.
ForexDataExporter
Handles file operations and supports multiple output formats with metadata.
ExtractionRequest
Pydantic model for type-safe request validation and parameter handling.
ForexExtractionResult
Comprehensive result container with data points, metadata, and operation status.
Data Models
All data models use Pydantic for runtime validation and type safety:
- PriceDataPoint - Individual forex price with date and decimal precision
- ExtractionMetadata - Complete extraction context and statistics
- FileOperationResult - File save operation results and diagnostics
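The reason decimal precision matters for prices can be shown with the standard library alone. The snippet below is a general illustration of `decimal.Decimal` versus `float`, not code from the package, and the price values are made up.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent most decimal fractions exactly.
float_total = 0.1 + 0.2
decimal_total = Decimal("0.1") + Decimal("0.2")

print(float_total == 0.3)               # floats: the comparison fails
print(decimal_total == Decimal("0.3"))  # decimals: the comparison holds

# Building a Decimal from the scraped string keeps every digit as written.
raw_price = "1.08425"  # made-up quote
price = Decimal(raw_price)

# Round to four decimal places with an explicit rounding rule.
rounded = price.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
print(price, rounded)
```

Constructing `Decimal` from a string rather than from a float is the important detail, because `Decimal(0.1)` would carry the float's representation error along with it.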
Screenshots
Interactive CLI Mode And Data Output Examples
The interactive CLI guides users through currency pair selection, date ranges, and output format choices
Command Line Usage
Direct command-line execution with comprehensive help and error messages
Configuration
For Developers: Customization Options
The package offers extensive configuration through the config module:
from forex_data_extractor.config import config
# Customize scraping behavior
config.scraper.BROWSER_HEADLESS = False # Show browser during scraping
config.scraper.PAGE_WAIT_DELAY = 10 # Wait longer for page loads
# Modify output settings
config.files.DEFAULT_OUTPUT_DIR = "/custom/path/data"
config.files.JSON_INDENT = 4
# Adjust date constraints
from datetime import datetime
config.dates.MIN_END_DATE = datetime(2010, 1, 1) # Allow older data
# CLI customization
config.cli.DEFAULT_OUTPUT_FORMAT = "json"
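Because these settings are changed by plain attribute assignment, an override stays in effect for the rest of the process. When a change is only needed for one run, it can be wrapped in a small context manager that restores the previous value afterwards. This is a sketch, not a package feature: `temporary_setting` is a hypothetical helper, and it assumes the config sections are ordinary mutable objects, as the assignments above suggest. A `SimpleNamespace` stands in for `config.scraper` so the example runs on its own.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_setting(section, name, value):
    """Set `section.name` to `value` inside the with-block, then restore it."""
    original = getattr(section, name)
    setattr(section, name, value)
    try:
        yield section
    finally:
        # Restore the previous value even if the block raises.
        setattr(section, name, original)


# Stand-in for config.scraper (made-up default value).
scraper = SimpleNamespace(BROWSER_HEADLESS=True)

with temporary_setting(scraper, "BROWSER_HEADLESS", False):
    inside = scraper.BROWSER_HEADLESS  # overridden while debugging

after = scraper.BROWSER_HEADLESS  # back to the original value
print(inside, after)
```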
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Development Setup
git clone https://github.com/yungKnight/forex_data_extractor.git
cd forex_data_extractor
pip install -e ".[dev]"
Contributing Guidelines
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Your_feature_description')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Please ensure all tests pass and code follows the project's style guidelines.
Acknowledgments
- Yahoo Finance for providing reliable financial data
- Playwright Team for robust browser automation capabilities
- Pydantic for excellent data validation and serialization
- Scrapy for the powerful and flexible scraping framework
- Open Source Community for inspiration and continuous improvement
Contact
Developer: kennery
Email: badoknight1@gmail.com
GitHub: @yungKnight
Project: forex_data_extractor
Support & Issues
- 📖 Documentation: Project Wiki
- ⭐ Show Support: Star the repository if you find it helpful!
- Report Issues: Contact the developer to report bugs or suggest features.
Built with ❤️ for the financial data community. Happy trading! 📈