
Crawl stock historical data.

Project description

pystock-crawler is a utility for crawling historical data of US stocks, including:

  • Ticker symbols
  • Daily prices
  • Fundamentals

Example Output

NYSE ticker symbols:

DDD   3D Systems Corporation
MMM   3M Company
WBAI  500.com Limited

Apple’s daily prices:


Google’s fundamentals:




Installation

Prerequisites:

  • Python 2.7

pystock-crawler is based on Scrapy, so you will also need to install prerequisites such as lxml and libffi for Scrapy and its dependencies. See Scrapy’s installation guide for more details.

Install with virtualenv (recommended):

pip install pystock-crawler

Or do system-wide installation:

sudo pip install pystock-crawler


Example 1. Google’s and Yahoo’s daily prices ordered by date:

pystock-crawler prices GOOG,YHOO -o out.csv --sort

Example 2. Daily prices of all companies listed in ./symbols.txt:

pystock-crawler prices ./symbols.txt -o out.csv

Example 3. Facebook’s fundamentals during 2013:

pystock-crawler reports FB -o out.csv -s 20130101 -e 20131231

Example 4. Fundamentals of all companies in ./nyse.txt, with the logs directed to ./crawling.log:

pystock-crawler reports ./nyse.txt -o out.csv -l ./crawling.log

Example 5. All ticker symbols in NYSE and NASDAQ:

pystock-crawler symbols NYSE,NASDAQ -o out.txt


Type pystock-crawler -h to see command help:

  pystock-crawler symbols <exchanges> (-o OUTPUT) [-l LOGFILE] [--sort]
  pystock-crawler prices <symbols> (-o OUTPUT) [-s YYYYMMDD] [-e YYYYMMDD] [-l LOGFILE] [--sort]
  pystock-crawler reports <symbols> (-o OUTPUT) [-s YYYYMMDD] [-e YYYYMMDD]  [-l LOGFILE] [--sort]
  pystock-crawler (-h | --help)
  pystock-crawler (-v | --version)

  -h --help     Show this screen
  -o OUTPUT     Output file
  -s YYYYMMDD   Start date [default: ]
  -e YYYYMMDD   End date [default: ]
  -l LOGFILE    Log output [default: ]
  --sort        Sort the result
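
If you drive the crawler from a Python script, the usage summary above can be turned into an argument list. The following is a minimal sketch and not part of pystock-crawler: build_command is a hypothetical helper that uses only the options listed in the help text, and running its result assumes the pystock-crawler executable is on your PATH.

```python
import datetime


def build_command(command, target, output, start=None, end=None,
                  logfile=None, sort=False):
    """Build an argument list for the pystock-crawler CLI (hypothetical helper)."""
    if command not in ("symbols", "prices", "reports"):
        raise ValueError("unknown command: %s" % command)
    args = ["pystock-crawler", command, target, "-o", output]
    # The usage text lists -s and -e only for the prices and reports commands.
    if command != "symbols":
        if start is not None:
            args += ["-s", start.strftime("%Y%m%d")]
        if end is not None:
            args += ["-e", end.strftime("%Y%m%d")]
    if logfile is not None:
        args += ["-l", logfile]
    if sort:
        args.append("--sort")
    return args


# Rebuilds the command line of Example 3 (Facebook's fundamentals during 2013).
cmd = build_command("reports", "FB", "out.csv",
                    start=datetime.date(2013, 1, 1),
                    end=datetime.date(2013, 12, 31))
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.call(cmd) to launch the crawler.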

There are three commands available:

  • pystock-crawler symbols grabs ticker symbol lists
  • pystock-crawler prices grabs daily prices
  • pystock-crawler reports grabs fundamentals

<exchanges> is a comma-separated string that specifies the stock exchanges you want to include. Only NYSE and NASDAQ are supported.

The output file of pystock-crawler symbols can be used as the <symbols> argument of the pystock-crawler prices and pystock-crawler reports commands.

<symbols> can be an inline comma-separated string, such as AAPL,GOOG,FB, or a text file that lists one symbol per line. The text file may look like this:

# This line is comment
AAPL    Put anything you want here
GOOG    Since the text here is ignored
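
To reuse such a file in your own scripts, the format can be read roughly as follows. This is an illustrative sketch and not the crawler's actual parser: parse_symbols is a hypothetical function, and it assumes that comment lines start with # and that the symbol is the first whitespace-delimited token on each line.

```python
def parse_symbols(text):
    """Return ticker symbols from symbols-file text (illustrative parser)."""
    symbols = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        # The first whitespace-delimited token is the symbol; the rest is ignored.
        symbols.append(line.split()[0])
    return symbols


sample = (
    "# This line is comment\n"
    "AAPL    Put anything you want here\n"
    "GOOG    Since the text here is ignored\n"
)
print(",".join(parse_symbols(sample)))
```

Joining the result with commas yields the inline form of <symbols>.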

Use -o to specify the output file. For the pystock-crawler symbols command, the output is a plain text file. For pystock-crawler prices and pystock-crawler reports, the output is CSV.

-l specifies the file that receives the crawling logs. If it is not specified, the logs go to stdout.

The rows in the output CSV file are in arbitrary order by default. Use --sort to sort them by symbol and date. Avoid --sort for large output files, because sorting is slow and uses a lot of memory.
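
If you crawl without --sort, the rows can also be ordered afterwards in a separate script. The sketch below is written for Python 3 and rests on assumptions: the column names symbol and date are hypothetical (check the header of your own output file), dates are assumed to sort correctly as text, and the sample rows are made-up placeholders. It sorts in memory, so the same caveat about large files applies.

```python
import csv
import io


def sort_rows(csv_text, keys=("symbol", "date")):
    """Sort CSV text by the given columns; the column names are assumptions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Compare rows by a tuple of the key columns, e.g. (symbol, date).
    rows = sorted(reader, key=lambda row: tuple(row[k] for k in keys))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Placeholder rows for illustration only, not real prices.
sample_csv = (
    "symbol,date,close\n"
    "GOOG,2014-01-02,2.0\n"
    "AAPL,2014-01-03,1.5\n"
    "AAPL,2014-01-02,1.0\n"
)
print(sort_rows(sample_csv))
```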

NOTE: The crawler stores its HTTP cache in a directory named .scrapy under your current working directory. The cache speeds up crawling the next time you fetch the same web pages. The cache can grow quite large. If you don’t need it, just delete the .scrapy directory after you have finished crawling.
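
Cleaning up the cache can be scripted as well. The following sketch uses only the Python standard library; remove_http_cache is a hypothetical helper name, and the only detail taken from the note above is the .scrapy directory name.

```python
import os
import shutil


def remove_http_cache(workdir="."):
    """Delete the .scrapy directory under workdir; return True if it was removed."""
    cache_dir = os.path.join(workdir, ".scrapy")
    if os.path.isdir(cache_dir):
        # rmtree deletes the directory and everything inside it.
        shutil.rmtree(cache_dir)
        return True
    return False
```

Call remove_http_cache() from the directory where you ran the crawler, or pass that directory as workdir.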

Developer Guide

Installing Dependencies

pip install -r requirements.txt

Running Tests

Install pytest, pytest-cov, and requests if you don’t have them:

pip install pytest pytest-cov requests

Then run the test:


This downloads the test data from SEC EDGAR on the fly, so it will take some time and disk space. If you want to delete the test data, just delete the pystock_crawler/tests/sample_data directory.

Download files

pystock-crawler-0.5.0.tar.gz (18.1 kB), source distribution, uploaded May 12, 2014
