
A tool to scrape data about fires from Twitter.

Project description

What is this?

This is a Python Twitter "Fire event" scraper/listener.

It is an application that listens for or scrapes data relating to house fires (specifically in Chicago) in order to analyze how people use Twitter as a platform to report and discuss disasters.

How will this use Twitter data?

This application allows one to analyze, collect, and collate data about house fires and other disasters on Twitter.

How do I install this?

Dependencies

Steps

pip install twitter-fire-scraper

If it is already installed and a newer version is available, you can update with:

pip install twitter-fire-scraper --upgrade
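To check which version (if any) is currently installed, a small standard-library sketch (the distribution name is taken from the pip command above; the helper name is ours):

```python
from importlib import metadata

def installed_version(dist: str = "twitter-fire-scraper"):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None

print(installed_version())
```

This uses `importlib.metadata`, available in the standard library since Python 3.8.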

Notes

This README assumes all commands take place in the same folder as this README file.

Examples

Examples of how to use this package can be found in the examples folder and in our internal test suites.

These should give you a good idea of how to use our scraper, and can be considered a 'living standard' of how our code works.

Setting up your secrets

This secrets file is only used for the demos. When using this library, it is up to you to manage how you store and retrieve your API keys.

More specifically, if the Scraper object is not initialized with a TwitterAuthentication object, it falls back to reading API keys from a file called ~/.twitterfirescraper/secrets.json.

This fallback exists to make the demonstrations work; it is not the recommended way to use the library.
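As a sketch of that fallback behavior (the path is taken from this README; the helper name is hypothetical, not part of the library's API):

```python
import json
from pathlib import Path

# Fallback location checked when no TwitterAuthentication object is given,
# per this README. load_fallback_secrets is illustrative, not a library function.
SECRETS_PATH = Path.home() / ".twitterfirescraper" / "secrets.json"

def load_fallback_secrets(path: Path = SECRETS_PATH):
    """Return the demo API keys from secrets.json, or None if the file is missing."""
    if not path.is_file():
        return None  # pass credentials explicitly instead
    with open(path) as f:
        return json.load(f)
```

In real applications, construct a TwitterAuthentication object with your own credential management instead of relying on this file.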

A note: These are called 'secrets' for a reason. Don't ever stage or commit secrets.json, please.

Twitter secrets

You will need:

  • A Twitter developer account & API key

    • A consumer API key (goes into "consumer_key")
    • A consumer API secret key (goes into "consumer_secret")
    • An access token (goes into "access_token")
    • An access secret (goes into "access_token_secret")
  • A Twitter handle you're authorized to make queries on behalf of

Put these into a file called secrets.json in your home folder under .twitterfirescraper/ (for example, mine is C:/Users/henryfbp/.twitterfirescraper/secrets.json).

An example file, secrets.example.json, is provided for you to base your file on.
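For reference, a secrets.json holding the four keys listed above might look like this (placeholder values, not real credentials):

```json
{
  "consumer_key": "YOUR-CONSUMER-API-KEY",
  "consumer_secret": "YOUR-CONSUMER-API-SECRET-KEY",
  "access_token": "YOUR-ACCESS-TOKEN",
  "access_token_secret": "YOUR-ACCESS-TOKEN-SECRET"
}
```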

MongoDB secrets

The demos in our code connect to the following MongoDB address:

mongodb://localhost:27017/
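That is, the demos assume a MongoDB instance on the local machine at the default port. A quick standard-library check of what the URI points at (the parsing below is only illustrative; a MongoDB driver would accept the URI string directly):

```python
from urllib.parse import urlsplit

MONGO_URI = "mongodb://localhost:27017/"  # address the demos connect to

parts = urlsplit(MONGO_URI)
print(parts.hostname, parts.port)  # localhost 27017
```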

Setting up a database

For the database, we have chosen MongoDB, since Twitter data comes as JSON and MongoDB is well-suited to storing JSON documents.

Follow this tutorial on how to install MongoDB.

Developer dependencies

  • Same as above.
  • Ruby, used for running scripts to build and test Python wheels.

Setting up Pipenv

You can install Pipenv by executing

pip install pipenv

You can then install all packages (including dev packages like twisted) from this folder's ./Pipfile by executing

pipenv install --dev

Then, you can run tests by executing

pipenv run python /src/twitter-fire-scraper/tests/test/__main__.py

Running a functional demo

Inside this folder, there are two files called Run-Demo.bat and Run-Demo.ps1. You can run either of those to start a demo intended for presentation purposes.

Starting the Web API

There is a web API that is included with the twitter-fire-scraper package.

It exposes functions of the twitter-fire-scraper over HTTP.

From source

You can run the web API from the live source code with pipenv run python twitter_fire_scraper/app.py.

Using PyPI

You can run the Web API after installing it with pip by typing

python -m twitter_fire_scraper.app

Running tests

You can execute pipenv run python fire-scraper/tests/<TESTNAME>.py to run a test.

To run all tests, execute pipenv run python fire-scraper/tests/test/__init__.py.

Alternatively, if you have this package installed, run

python -m twitter_fire_scraper.tests.test

to run the package's test module.

What was this adapted from?

This was adapted from a movie sentiment analysis project by Raul; the repository is here and a live site is here.

Commit 2fb844e8c081c1dc31cfb4760e3a80cefb6a0eee was used.

There's got to be a better way to run this than from the command line!

There is! Use an IDE (like PyCharm, which I use) that integrates with Python and shows you import errors, syntax errors, etc. Google "Python IDE" and pick one you like.

Adding the location of Venv to your IDE

In order to run our tests through an IDE, we need to let our IDE know where the venv was installed. I will explain this through PyCharm, but the method should be similar for any IDE.

If running python in windows powershell runs Python 3 (or you only have Python 3 installed), run python -m pipenv --venv

This will yield the location of the Python 3 virtual environment (it should be something like C:\Users\Your Name\...\.virtualenvs\...). Copy this path and open PyCharm.

Go into File -> Settings and expand Project: fire-scraper-twitter. In the drop-down, go into Project Interpreter. Click the gear at the top and select Add, as we will be adding a new interpreter.

Select Existing environment and click the three dots to the right. Paste your path at the top, then OK everything.

There! Done! Now we can run our tests from inside our IDE.

Generating/uploading distribution archives

If you want to distribute this source code as a Python Wheel, follow this guide.

There is a series of Ruby scripts (cross-platform!) that handle building, cleaning, and uploading.

Make sure you have the twine package installed for Python.

Building

ruby build.rb

Cleaning

ruby clean.rb

Uploading

You'll need to bump the version in ./VERSION when uploading a new version.
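A small sketch of that bump, assuming ./VERSION holds a plain MAJOR.MINOR.PATCH string such as 2.2.0 (the helper below is hypothetical, not one of the Ruby scripts):

```python
def bump_patch(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = version.split(".")
    return f"{major}.{minor}.{int(patch) + 1}"

print(bump_patch("2.2.0"))  # 2.2.1
```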

To the test site (test.pypi.org)

ruby upload.rb --test

To the real site (pypi.org)

ruby upload.rb --deploy

Testing download and install

There are a couple ways for you to test how a user would experience installing this package.

There are three Ruby scripts here, each doing what its name suggests.

test-localwheel-install.rb will install and test the latest WHL file generated by build.rb.

test-testpypi-install.rb will install and test the TEST PyPI's twitter-fire-scraper package.

test-realpypi-install.rb will install and test the official PyPI's twitter-fire-scraper package.

Project details


Download files

Download the file for your platform.

Source Distribution

twitter_fire_scraper-2.2.0.tar.gz (34.8 kB)


Built Distribution

twitter_fire_scraper-2.2.0-py3-none-any.whl (46.4 kB)


File details

Details for the file twitter_fire_scraper-2.2.0.tar.gz.

File metadata

  • Download URL: twitter_fire_scraper-2.2.0.tar.gz
  • Upload date:
  • Size: 34.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3

File hashes

Hashes for twitter_fire_scraper-2.2.0.tar.gz:

  • SHA256: 5d286fdf2f9ec66781de6403c1667b346021c280a77f9b5f99d35dd6a0dd73dd
  • MD5: af040344c89e9747eefba5fdd7546145
  • BLAKE2b-256: b571fcf0b753124b0567ad9c44af1af2e4885abfd6835e1f7e8c6eed97a5c9a7

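To verify a downloaded file against the hashes listed here, a standard-library sketch (substitute the path of the file you actually downloaded):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 16) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the SHA256 value listed above, e.g.:
# sha256_of("twitter_fire_scraper-2.2.0.tar.gz") == "5d286fdf2f9e..."
```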

File details

Details for the file twitter_fire_scraper-2.2.0-py3-none-any.whl.

File metadata

  • Download URL: twitter_fire_scraper-2.2.0-py3-none-any.whl
  • Upload date:
  • Size: 46.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3

File hashes

Hashes for twitter_fire_scraper-2.2.0-py3-none-any.whl:

  • SHA256: 933abe0391616d1b5842af34992777abb93d8a7128f6345f1b302d98e132a03d
  • MD5: 8fa19e9ce9e7379d7af96b0d869dda02
  • BLAKE2b-256: b5ae45accdaf9263ff5ed8ae32f3772e3d11bbba0f68705d187e1ce086a17aaa

