Cryptocurrency exchange announcement news crawler for major crypto exchanges
Cryptocurrency Exchange News Crawler | Bybit Binance Bitget Announcement Scraper
A comprehensive Scrapy-based web crawler for cryptocurrency exchange announcements. This crypto news scraper automatically collects trading announcements, listing news, and platform updates from major crypto exchanges.
🚀 Features
- Multi-Exchange Support: Crawls announcements from major cryptocurrency exchanges
- Real-time Data: Extracts latest announcements with timestamps
- Structured Output: Clean JSON format for easy integration
- Scalable Architecture: Easy to extend for additional exchanges
- Rate Limiting: Respectful crawling with configurable delays
- Proxy Support: Built-in proxy rotation capabilities
📊 Supported Cryptocurrency Exchanges
| Exchange | Status | Announcement Types |
|---|---|---|
| Binance ✅ | Active | New coin listings, trading pairs, system updates |
| OKX ✅ | Active | Trading updates, new assets, platform changes |
| Bybit ✅ | Active | Trading announcements, new listings, platform updates |
| Bitget ✅ | Active | Futures listings, spot trading, platform news |
| XT Exchange ✅ | Active | Token listings, trading announcements |
| Bitfinex ✅ | Active | Trading updates, new assets, platform changes |
Project Description
This project crawls announcement news from cryptocurrency exchanges to help users stay updated with the latest developments, updates, and announcements from major crypto trading platforms. The crawler extracts key information including:
- News title and description
- Publication timestamp
- News URL
- Exchange source
- Unique news ID
- News categories (where available)
Currently Supported Exchanges
- Binance
- OKX
- Bybit
- Bitget
- XT
- Bitfinex
🎯 Use Cases
- Trading Bots: Feed announcement data to automated trading systems
- Market Research: Analyze exchange listing patterns and trends
- News Aggregation: Build crypto news platforms and alert systems
- Academic Research: Study cryptocurrency market announcements
- Investment Tools: Track new token listings across exchanges
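As a rough illustration of the listing-tracking use case, the sketch below filters crawled items by title keywords. The keyword tuple, the function name `find_listing_announcements`, and the sample items are all hypothetical; real exchanges word their listing announcements differently, so treat the keywords as a starting point only.

```python
# Hypothetical keyword list; tune it to the exchanges you follow.
LISTING_KEYWORDS = ("will list", "new listing", "listing of")


def find_listing_announcements(items, keywords=LISTING_KEYWORDS):
    """Return the items whose title contains any of the given keywords."""
    matches = []
    for item in items:
        # Titles may be missing or None, so fall back to an empty string.
        title = (item.get("title") or "").lower()
        if any(keyword in title for keyword in keywords):
            matches.append(item)
    return matches


# Made-up sample items using the field names from this README.
sample_items = [
    {"exchange": "binance", "title": "Binance Will List Example Token (EXM)"},
    {"exchange": "bybit", "title": "Scheduled system maintenance"},
]
listing_hits = find_listing_announcements(sample_items)
```

A plain substring match keeps the example dependency-free; a production alert system would likely need per-exchange rules.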
🔧 Quick Start
Installation
- Clone the repository:
git clone https://github.com/lowweihong/crypto-exchange-news-crawler.git
cd crypto_exchange_news
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Install Playwright browsers (required for Bitget spider):
playwright install chromium
Running Specific Exchange Crawlers
Bybit Announcements:
scrapy crawl bybit -o bybit_announcements.json
Binance News:
scrapy crawl binance -o binance_news.json
Bitget Updates:
scrapy crawl bitget -o bitget_updates.json
📋 Data Schema
Each announcement contains structured data perfect for analysis:
{
"news_id": "Unique identifier from the exchange",
"title": "News headline",
"desc": "News description/content",
"url": "Full URL to the news article",
"category_str": "News category (detailed for Bitget)",
"exchange": "Exchange name ('bitfinex', 'bitget', 'xt', 'bybit', or 'binance')",
"announced_at_timestamp": "Original publication timestamp (Unix)",
"timestamp": "Crawl timestamp (Unix)"
}
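A minimal sketch of reading a crawl output file and turning the Unix timestamps into datetime objects. It assumes the file is a JSON array as written by `scrapy crawl ... -o file.json`. The schema does not say whether timestamps are in seconds or milliseconds, so the helper guesses: values above 1e11 are treated as milliseconds. The function names and the `announced_at` key are illustrative and not part of the project.

```python
import json
from datetime import datetime, timezone


def to_datetime(ts):
    """Convert a Unix timestamp to a timezone-aware UTC datetime.

    Assumption: values above 1e11 are milliseconds, anything else is seconds.
    """
    value = float(ts)
    if value > 1e11:
        value /= 1000.0
    return datetime.fromtimestamp(value, tz=timezone.utc)


def load_announcements(path):
    """Load crawled items and sort them newest first by publication time."""
    with open(path, encoding="utf-8") as fh:
        items = json.load(fh)
    for item in items:
        # "announced_at" is a hypothetical extra key added for convenience.
        item["announced_at"] = to_datetime(item["announced_at_timestamp"])
    return sorted(items, key=lambda item: item["announced_at"], reverse=True)
```

For example, `load_announcements("bybit_announcements.json")` would return the Bybit items with the most recent announcement first.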
⚙️ Configuration
Key settings in settings.py:
- MAX_PAGE: Maximum number of pages to crawl (default: 2)
- DOWNLOAD_DELAY: Delay between requests in seconds (default: 3)
- CONCURRENT_REQUESTS: Number of concurrent requests (default: 8)
- USER_AGENT: List of user agents for rotation
- PROXY_LIST: List of proxies to rotate through. To enable the proxy middleware, also uncomment the DOWNLOADER_MIDDLEWARES section.
- PLAYWRIGHT_LAUNCH_OPTIONS: Browser configuration for Playwright spiders
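The fragment below sketches how these settings might look in settings.py. The three numeric values are the documented defaults; the user agent strings, the proxy URL, and the `headless` launch option are placeholders chosen for illustration, not values taken from the project.

```python
# settings.py (fragment): illustrative values only

MAX_PAGE = 2             # pages to crawl per spider (documented default)
DOWNLOAD_DELAY = 3       # seconds between requests (documented default)
CONCURRENT_REQUESTS = 8  # concurrent requests (documented default)

# This project treats USER_AGENT as a list to rotate through.
# The strings below are placeholders, not real browser user agents.
USER_AGENT = [
    "Mozilla/5.0 (placeholder user agent 1)",
    "Mozilla/5.0 (placeholder user agent 2)",
]

# Placeholder proxy URL. The proxy middleware also has to be enabled by
# uncommenting the DOWNLOADER_MIDDLEWARES section of the real settings.py.
PROXY_LIST = [
    "http://user:password@proxy1.example.com:8000",
]

# Options passed to the Playwright browser launch; headless mode is assumed.
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
```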
Custom Settings
You can override settings from the command line:
scrapy crawl bitget -s MAX_PAGE=5 -s DOWNLOAD_DELAY=2
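When the crawler is driven from a scheduler or another script, the same command line can be assembled in Python. This is a sketch only: `build_crawl_command` and `run_crawl` are hypothetical helpers, and they rely on nothing beyond the `scrapy crawl`, `-o`, and `-s` options shown above.

```python
import subprocess


def build_crawl_command(spider, output_path, **settings):
    """Build the argument list for `scrapy crawl` with -s overrides."""
    cmd = ["scrapy", "crawl", spider, "-o", output_path]
    for name, value in settings.items():
        cmd += ["-s", f"{name}={value}"]
    return cmd


def run_crawl(spider, output_path, **settings):
    """Run the crawl from the project directory; raises if scrapy fails."""
    subprocess.run(build_crawl_command(spider, output_path, **settings), check=True)


# Equivalent to the command-line example above, plus an output file.
cmd = build_crawl_command(
    "bitget", "bitget_updates.json", MAX_PAGE=5, DOWNLOAD_DELAY=2
)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems in setting values.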
🔧 Technical Requirements
- Python 3.8+ (Scrapy 2.11 no longer supports Python 3.7)
- Scrapy 2.11.0+
- Playwright (for Bitget spider)
- Chromium browser (installed with the playwright install chromium step above)
🌐 Exchange URLs
Direct links to announcement pages:
⚖️ Legal & Ethical Usage
This crawler is designed for educational and research purposes. Please ensure you comply with:
- Each exchange's Terms of Service
- Rate limiting and robots.txt policies
- Applicable data protection laws
- Fair use guidelines
Always use the crawler responsibly and consider the impact on the target servers.
🤝 Contributing
Contributions welcome! Areas for improvement:
- Add support for more exchanges (Huobi, KuCoin, Gate.io, etc.)
- Implement real-time WebSocket feeds
- Add Telegram/Discord notification integrations
- Improve data parsing and categorization
Support
For issues, questions, or contributions, please create an issue in the repository.