
Cryptocurrency exchange announcement news crawler for major crypto exchanges

Crypto Exchange News Crawler 🚀

A powerful and easy-to-use Python package for scraping cryptocurrency exchange announcements from major exchanges.

🎯 Features

  • Multi-Exchange Support: Scrape from 13 major crypto exchanges
  • Multiple Output Formats: JSON, CSV, and XML support
  • Structured Data: Clean, standardized output format
  • Rate Limiting: Built-in delays to respect exchange servers
  • Extensible: Easy to add new exchanges

📦 Installation Options

Option 1: Direct Usage

git clone https://github.com/lowweihong/crypto-exchange-news-crawler.git
cd crypto-exchange-news-crawler
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
playwright install

scrapy crawl bybit -o output.json

Option 2: Install from PyPI

pip install crypto-exchange-news-crawler
playwright install

# Basic crawl
crypto-news crawl binance -o binance.json

# Crawl through proxies: enable the proxy middleware and pass a proxy list on the command line
crypto-news crawl bybit -s DOWNLOADER_MIDDLEWARES='{"crypto_exchange_news.middlewares.MyProxyMiddleware": 610}' -s PROXY_LIST="http://proxy1:port,http://proxy2:port"

Supported Exchanges

Exchange Status
Bybit
Binance
OKX
Bitget
BingX
Kraken
Bitfinex
XT
Crypto.com
MEXC
Deepcoin
Kucoin
Upbit
Available options: ["bybit", "binance", "okx", "bitget", "bitfinex", "xt", "bingx", "kraken", "cryptocom", "mexc", "deepcoin", "kucoin", "upbit"]

📊 Output Format

Each scraped announcement includes:

{
    "news_id": "unique_identifier",
    "title": "Announcement title",
    "desc": "Announcement description",
    "url": "Full URL to announcement",
    "category_str": "Category (e.g., latest_activities, new_crypto)",
    "exchange": "Exchange name",
    "announced_at_timestamp": 1749235200,
    "timestamp": 1749232733
}
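
As a rough illustration of how this output might be consumed, the sketch below loads a JSON feed, drops repeated announcements, and attaches a timezone-aware datetime to each item. It assumes the file was written with `-o output.json` (a JSON array) and that `announced_at_timestamp` is a Unix timestamp in seconds, which the sample values suggest but the README does not state. The helper names `load_announcements` and `normalize` and the added `announced_at` key are hypothetical and not part of the package.

```python
import json
from datetime import datetime, timezone


def load_announcements(path):
    """Read a JSON feed file and return its items as a list of dicts."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def normalize(items):
    """Drop duplicates by (exchange, news_id) and add a UTC datetime.

    Assumption: announced_at_timestamp is in Unix seconds.
    """
    seen = set()
    result = []
    for item in items:
        key = (item.get("exchange"), item.get("news_id"))
        if key in seen:
            continue  # same announcement already collected
        seen.add(key)
        enriched = dict(item)  # copy so the input list is left untouched
        enriched["announced_at"] = datetime.fromtimestamp(
            item["announced_at_timestamp"], tz=timezone.utc
        )
        result.append(enriched)
    return result
```

A typical call would be `normalize(load_announcements("output.json"))`. Keying on the exchange as well as `news_id` is a cautious choice, since nothing in the output format promises that identifiers are unique across exchanges.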

⚙️ Configuration

Key settings in settings.py:

  • MAX_PAGE: Maximum number of pages to crawl (default: 2)
  • DOWNLOAD_DELAY: Delay between requests in seconds (default: 3)
  • CONCURRENT_REQUESTS: Number of concurrent requests (default: 8)
  • USER_AGENT: List of user agents for rotation
  • PROXY_LIST: List of proxies to use; to enable them, also uncomment the DOWNLOADER_MIDDLEWARES section so the proxy middleware is active
  • PLAYWRIGHT_LAUNCH_OPTIONS: Browser configuration for Playwright spiders
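
For orientation, a settings.py fragment using these names might look like the sketch below. The numeric values mirror the defaults listed above, and the middleware path and priority are copied from the proxy command in the installation section. The proxy URLs are placeholders, the list form of PROXY_LIST is an assumption (the command-line example passes a comma-separated string), and the `headless` option is only an illustrative Playwright launch option, not necessarily what the project ships.

```python
# settings.py (sketch, not the project's actual file)

MAX_PAGE = 2              # maximum number of pages to crawl per spider
DOWNLOAD_DELAY = 3        # seconds to wait between requests
CONCURRENT_REQUESTS = 8   # upper bound on parallel requests

# Assumption: placeholder proxies; replace with real host:port values.
PROXY_LIST = [
    "http://proxy1:port",
    "http://proxy2:port",
]

# Uncommented so that the proxy middleware is active.
DOWNLOADER_MIDDLEWARES = {
    "crypto_exchange_news.middlewares.MyProxyMiddleware": 610,
}

# Assumption: illustrative browser option for the Playwright-based spiders.
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
```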

Custom Settings

You can override settings from the command line:

scrapy crawl bitget -s MAX_PAGE=5 -s DOWNLOAD_DELAY=2
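
When crawls are launched from a script rather than typed by hand, the same overrides can be assembled programmatically. The sketch below builds the argument list for the command above and rejects exchange names that are not in the supported list. `build_crawl_command` and `SUPPORTED` are hypothetical names introduced for illustration; the function only constructs the command and does not run it.

```python
# Exchange names copied from the "Available options" list in this README.
SUPPORTED = (
    "bybit", "binance", "okx", "bitget", "bitfinex", "xt", "bingx",
    "kraken", "cryptocom", "mexc", "deepcoin", "kucoin", "upbit",
)


def build_crawl_command(exchange, output, **settings):
    """Return the argv list for `scrapy crawl` with `-s NAME=VALUE` overrides."""
    if exchange not in SUPPORTED:
        raise ValueError(f"unsupported exchange: {exchange!r}")
    cmd = ["scrapy", "crawl", exchange, "-o", output]
    for name, value in settings.items():
        # Each keyword argument becomes one -s override, in the order given.
        cmd += ["-s", f"{name}={value}"]
    return cmd


if __name__ == "__main__":
    cmd = build_crawl_command("bitget", "bitget.json", MAX_PAGE=5, DOWNLOAD_DELAY=2)
    print(" ".join(cmd))
    # prints: scrapy crawl bitget -o bitget.json -s MAX_PAGE=5 -s DOWNLOAD_DELAY=2
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the project directory. Passing a list instead of a single string avoids shell quoting problems with values such as the JSON-valued DOWNLOADER_MIDDLEWARES setting.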

🔧 Technical Requirements

  • Python 3.7+
  • Scrapy 2.11.0+
  • Playwright (for Bitget spider)
  • Chromium browser (automatically installed with Playwright)

🌐 Exchange URLs

Direct links to announcement pages:

Exchange Announcement URL
Binance https://www.binance.com/en/support/announcement
OKX https://www.okx.com/help/category/announcements
Bybit https://announcements.bybit.com/en/?category=&page=1
Bitget https://www.bitget.com/support/sections/12508313443483
BingX https://bingx.com/en/support/notice-center/
Kraken https://blog.kraken.com/category/product
XT https://xtsupport.zendesk.com/hc/en-us/categories/10304894611993-Important-Announcements
Bitfinex https://www.bitfinex.com/posts/
Crypto.com https://crypto.com/exchange/announcements
MEXC https://www.mexc.com/support/categories/360000254192
Deepcoin https://support.deepcoin.online/hc/en-001/categories/360003875752-Important-Announcements
Kucoin https://www.kucoin.com/announcement
Upbit https://sg.upbit.com/service_center/notice

⚖️ Legal & Ethical Usage

This crawler is designed for educational and research purposes. Please ensure you comply with:

  • Applicable data protection laws
  • Fair use guidelines

Always use the crawler responsibly and consider the impact on the target servers.

🤝 Contributing

Contributions welcome! Areas for improvement:

  • Add support for more exchanges (Huobi, Gate.io, etc.)
  • Implement real-time WebSocket feeds
  • Add Telegram/Discord notification integrations
  • Improve data parsing and categorization

Support

For issues, questions, or contributions, please create an issue in the repository.
