A tool to scrape data about fires from Twitter.
What is this?
This is a Python Twitter "Fire event" scraper/listener.
It is an application that listens for or scrapes data relating to house fires (specifically in Chicago) in order to analyze how people use Twitter as a platform to report and talk about disasters.
How will this use Twitter data?
This application allows one to collect, collate, and analyze data about house fires and other disasters on Twitter.
How do I install this?
Dependencies
Steps
pip install twitter-fire-scraper
If it is already installed and a newer version is available, you can update with:
pip install twitter-fire-scraper --upgrade
Notes
This README assumes all commands take place in the same folder as this README file.
Examples
Examples of how to use this package can be found in the examples folder and in our internal test suites.
These should give you a good idea of how to use our scraper, and can be considered a 'living standard' of how our code works.
Setting up your secrets
This secrets file is only used for the demos. When using this library, it is up to you to manage how you store and retrieve your API keys.
More specifically, if the Scraper object is not initialized with a TwitterAuthentication object, it will fall back to searching for API keys in a file called ~/.twitterfirescraper/secrets.json.
This fallback exists to make the demonstrations work, and is not the recommended usage when using the library.
A note: these are called 'secrets' for a reason. Please don't ever stage or commit secrets.json.
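To make the fallback behavior concrete, here is a minimal sketch of that lookup in plain Python. The helper name load_fallback_secrets is an assumption for illustration; the real lookup lives inside Scraper.

```python
import json
from pathlib import Path

def load_fallback_secrets(home: Path = Path.home()) -> dict:
    """Read API keys from ~/.twitterfirescraper/secrets.json, mirroring
    the library's fallback when no TwitterAuthentication is supplied."""
    secrets_path = home / ".twitterfirescraper" / "secrets.json"
    if not secrets_path.exists():
        raise FileNotFoundError(
            f"No secrets file at {secrets_path}; "
            "pass a TwitterAuthentication object explicitly instead."
        )
    with open(secrets_path) as f:
        return json.load(f)
```
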
Twitter secrets
You will need:
- A Twitter developer account & API key
  - A consumer API key (goes into "consumer_key")
  - A consumer API secret key (goes into "consumer_secret")
  - An access token (goes into "access_token")
  - An access secret (goes into "access_token_secret")
- A Twitter handle you're authorized to make queries on behalf of
Put these into a file called secrets.json in your home folder under .twitterfirescraper/. (For example, mine is C:/Users/henryfbp/.twitterfirescraper/secrets.json.)
An example file, secrets.example.json, is provided for you to base your file on.
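A secrets.json containing the four keys above might look like this (all values are placeholders, not real credentials):

```json
{
    "consumer_key": "YOUR_CONSUMER_API_KEY",
    "consumer_secret": "YOUR_CONSUMER_API_SECRET_KEY",
    "access_token": "YOUR_ACCESS_TOKEN",
    "access_token_secret": "YOUR_ACCESS_SECRET"
}
```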
MongoDB secrets
The demos in our code connect to the following MongoDB address:
mongodb://localhost:27017/
Setting up a database
For the database, we have chosen MongoDB, since Twitter data is stored as JSON and MongoDB is well suited to storing JSON documents.
Follow this tutorial on how to install MongoDB.
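As a sketch of what storing scraped tweets in that database might look like (the database and collection names "fire_scraper" and "tweets" are illustrative assumptions, and a local mongod must be running for the __main__ block):

```python
def store_tweet(collection, tweet: dict) -> str:
    """Insert one tweet's JSON document into a MongoDB collection
    and return the inserted document's id as a string."""
    result = collection.insert_one(tweet)
    return str(result.inserted_id)

if __name__ == "__main__":
    from pymongo import MongoClient  # pip install pymongo
    client = MongoClient("mongodb://localhost:27017/")  # the address the demos use
    tweets = client["fire_scraper"]["tweets"]
    print(store_tweet(tweets, {"id_str": "123", "text": "House fire reported..."}))
```
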
Developer dependencies
- Same as above.
- Ruby, used for running scripts to build and test Python wheels.
Setting up Pipenv
You can install Pipenv by executing
pip install pipenv
You can then install all packages (including dev packages like twisted) listed in this folder's Pipfile by executing
pipenv install --dev
Then, you can run tests by executing
pipenv run python /src/twitter-fire-scraper/tests/test/__main__.py
Running a functional demo
Inside this folder there are two files, Run-Demo.bat and Run-Demo.ps1. You can run either of those to start a demo intended for presentation purposes.
Starting the Web API
There is a web API included with the twitter-fire-scraper package.
It exposes functions of twitter-fire-scraper over HTTP.
From source
You can run the web API from the live source code with pipenv run python twitter_fire_scraper/app.py.
Using PyPI
You can run the web API after installing it with pip by typing
python -m twitter_fire_scraper.app
Running tests
You can execute pipenv run python fire-scraper/tests/<TESTNAME>.py to run a single test.
To run all tests, execute pipenv run python fire-scraper/tests/test/__init__.py.
Alternatively, if you have this package installed, run
python -m twitter_fire_scraper.tests.test
to run the package's test module.
What was this adapted from?
A movie sentiment analysis project by Raul; the repository is here and a live site is here.
Commit 2fb844e8c081c1dc31cfb4760e3a80cefb6a0eee was used.
There's got to be a better way to run this than from the command line!
There is! Use an IDE (like PyCharm, which I use) that integrates with Python to show you import errors, syntax errors, etc. Google "Python IDE" and pick one you like.
Adding the location of Venv to your IDE
In order to run our tests through an IDE, we need to let our IDE know where the venv was installed. I will explain this through PyCharm, but the method should be similar for any IDE.
If running python in Windows PowerShell runs Python 3 (or you only have Python 3 installed), run python -m pipenv --venv.
This will yield the location of the Python 3 virtual environment (it should be something like C:\Users\Your Name\...\.virtualenvs\...). Copy this path and open PyCharm.
Go into File -> Settings and expand Project: fire-scraper-twitter. In the drop-down, go into Project Interpreter. Go to the top, click the gear, and select Add, as we will be adding a new interpreter.
Select Existing environment and click the three dots to the right. Paste your path at the top, then OK everything.
There! Done! Now we can run our tests from inside our IDE.
Generating/uploading distribution archives
If you want to distribute this source code as a Python Wheel, follow this guide.
There are a series of Ruby scripts (cross-platform!) that handle building, cleaning, and uploading.
Make sure you have the twine
package installed for Python.
Building
ruby build.rb
Cleaning
ruby clean.rb
Uploading
You'll need to bump the version in ./VERSION
when uploading a new version.
To the test site (test.pypi.org)
ruby upload.rb --test
To the real site (pypi.org)
ruby upload.rb --deploy
Testing download and install
There are a couple ways for you to test how a user would experience installing this package.
There are three Ruby scripts here, each doing what its name suggests.
test-localwheel-install.rb will install and test the latest WHL file generated by build.rb.
test-testpypi-install.rb will install and test the test PyPI's twitter-fire-scraper package.
test-realpypi-install.rb will install and test the official PyPI's twitter-fire-scraper package.
File details
Details for the file twitter_fire_scraper-2.2.0.tar.gz.
File metadata
- Download URL: twitter_fire_scraper-2.2.0.tar.gz
- Upload date:
- Size: 34.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5d286fdf2f9ec66781de6403c1667b346021c280a77f9b5f99d35dd6a0dd73dd
MD5 | af040344c89e9747eefba5fdd7546145
BLAKE2b-256 | b571fcf0b753124b0567ad9c44af1af2e4885abfd6835e1f7e8c6eed97a5c9a7
File details
Details for the file twitter_fire_scraper-2.2.0-py3-none-any.whl.
File metadata
- Download URL: twitter_fire_scraper-2.2.0-py3-none-any.whl
- Upload date:
- Size: 46.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 933abe0391616d1b5842af34992777abb93d8a7128f6345f1b302d98e132a03d
MD5 | 8fa19e9ce9e7379d7af96b0d869dda02
BLAKE2b-256 | b5ae45accdaf9263ff5ed8ae32f3772e3d11bbba0f68705d187e1ce086a17aaa