Automate Twitter Stream data collection
Project description
Twistream: Twitter Stream API data collection
Twistream helps you automatically collect and store data from the Twitter Stream API.
Installation
Latest stable release:
pip install twistream
From source:
git clone https://github.com/guillermo-carrasco/twistream.git
cd twistream
pip install .
Setting up
Twitter credentials
You need Twitter credentials to use the Twitter API. To get them, create a Twitter application. Once it is created, save the credentials so you can configure twistream.
Create a configuration file
You can use the command twistream init to help you create a correctly formatted configuration file for your collections.
Once created, you will have a file that looks like this:
~> cat ~/.twistream/twistream.yml
twitter:
  consumer_key: your_consumer_key
  consumer_secret: your_consumer_secret
  access_token_key: your_access_token_key
  access_token_secret: your_access_token_secret
backend: backend_name
backend_params:
  username: db_username
  password: db_password
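As a rough illustration of the layout above, the sketch below checks that a configuration already parsed into a Python dictionary (for example by a YAML parser) contains the expected sections. The function name `find_config_problems` and the checks themselves are assumptions made for this example; they are not part of twistream.

```python
# Keys expected under the "twitter" section, taken from the sample file above.
REQUIRED_TWITTER_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token_key",
    "access_token_secret",
)


def find_config_problems(config):
    """Return a list of problems; an empty list means the layout looks right.

    Hypothetical helper for illustration only, not twistream's own validation.
    """
    problems = []
    twitter = config.get("twitter")
    if not isinstance(twitter, dict):
        problems.append("missing 'twitter' section")
    else:
        for key in REQUIRED_TWITTER_KEYS:
            if not twitter.get(key):
                problems.append(f"missing twitter.{key}")
    if not config.get("backend"):
        problems.append("missing 'backend'")
    if not isinstance(config.get("backend_params"), dict):
        problems.append("missing 'backend_params' section")
    return problems


# The sample configuration from above, written as a dictionary.
example_config = {
    "twitter": {
        "consumer_key": "your_consumer_key",
        "consumer_secret": "your_consumer_secret",
        "access_token_key": "your_access_token_key",
        "access_token_secret": "your_access_token_secret",
    },
    "backend": "backend_name",
    "backend_params": {"username": "db_username", "password": "db_password"},
}

print(find_config_problems(example_config))
```

Returning a list of problems rather than raising on the first one lets you report every missing key in a single pass.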
Usage
Remember that --help is always an available option.
Once you have created a configuration file, start collecting tweets:
twistream collect --tracks tracks,to,follow config.yaml
Refer to the Twitter documentation to learn what tracks are. In short:
A comma-separated list of phrases which will be used to determine what Tweets will be delivered on the stream. A phrase may be one or more terms separated by spaces, and a phrase will match if all of the terms in the phrase are present in the Tweet, regardless of order and ignoring case. By this model, you can think of commas as logical ORs, while spaces are equivalent to logical ANDs (e.g. ‘the twitter’ is the AND twitter, and ‘the,twitter’ is the OR twitter).
If you want to follow hashtags, don't forget to include the # character.
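The matching rule quoted above (commas as OR, spaces as AND, case ignored) can be approximated locally to sanity-check a track list before starting a collection. This is a simplified sketch: the functions `tokenize` and `matches_tracks` are hypothetical, the tokenisation is an assumption, and Twitter's real matching has more rules than shown here.

```python
import re


def tokenize(text):
    """Lowercase the text and return its terms as a set.

    Assumption for this sketch: a token such as '#tag' or '@name' also
    yields the bare word, so the track 'tag' matches '#tag' but the
    track '#tag' does not match a plain 'tag'.
    """
    terms = set()
    for token in re.findall(r"[#@]?\w+", text.lower()):
        terms.add(token)
        terms.add(token.lstrip("#@"))
    return terms


def matches_tracks(tracks, tweet_text):
    """Return True if any comma-separated phrase has all its terms in the tweet."""
    tweet_terms = tokenize(tweet_text)
    for phrase in tracks.split(","):  # commas act as logical OR
        phrase_terms = phrase.lower().split()  # spaces act as logical AND
        if phrase_terms and all(term in tweet_terms for term in phrase_terms):
            return True
    return False


print(matches_tracks("the twitter", "Twitter is the best"))
print(matches_tracks("the,twitter", "I love Twitter"))
print(matches_tracks("#python", "Learning python today"))
```

The last call shows why the # matters: in this model a hashtag track only matches tweets that contain the hashtag itself.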
Supported backends
From version 0.1.3, twistream supports two backends: a relational database (SQLite) and a NoSQL database (MongoDB).
Note that the SQLite backend only saves a couple of tweet fields, whilst the MongoDB backend saves the whole blob. It is a trade-off between information and storage space.
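To make the trade-off concrete, the sketch below stores the same tweet in two ways: as a few fixed columns in an in-memory SQLite table, and as a full JSON blob, which is roughly how a document store keeps it. The tweet payload, the table name and the chosen columns are all assumptions for illustration; the fields twistream actually saves are not listed here.

```python
import json
import sqlite3

# Hypothetical, heavily trimmed tweet payload used only for this example.
tweet = {
    "id": 1,
    "created_at": "Mon Feb 10 12:00:00 +0000 2020",
    "text": "Collecting #python tweets",
    "user": {"id": 42, "screen_name": "example_user", "followers_count": 10},
    "entities": {"hashtags": [{"text": "python"}]},
}

# Relational style: keep a small, fixed set of columns (column choice is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at TEXT, text TEXT)")
conn.execute(
    "INSERT INTO tweets VALUES (?, ?, ?)",
    (tweet["id"], tweet["created_at"], tweet["text"]),
)
row = conn.execute("SELECT id, created_at, text FROM tweets").fetchone()

# Document style: keep the whole payload, here serialised as JSON.
blob = json.dumps(tweet)

# Rough size comparison in characters: the blob keeps more, and costs more.
row_size = sum(len(str(value)) for value in row)
blob_size = len(blob)
print(row_size, blob_size)
```

The row is smaller but has lost the user and hashtag details, while the blob keeps everything at the cost of extra storage.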
Backend params format
SQLite
backend: sqlite
backend_params:
  db_path: /path/to/your/db
MongoDB
backend: mongodb
backend_params:
  db_string: database_connection_string
(See database connection string documentation)
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file twistream-0.1.4.tar.gz.
File metadata
- Download URL: twistream-0.1.4.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0.post20200210 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.8.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5aa177d3c9a508c922bb93088acc73fa36e21da363bf52a4c4452c3fee3ccbdc
MD5 | f8d2f76e00074ca92c618f15fe0d91c2
BLAKE2b-256 | 14532974f1b78a605a87aa61ca953e672736776f585d592f7f4a42a9db35304f
File details
Details for the file twistream-0.1.4-py3-none-any.whl.
File metadata
- Download URL: twistream-0.1.4-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0.post20200210 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.8.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | fa51902641bd6aa7b896e71dba7c689e938f9b772b238a2087acc338a88832e2
MD5 | 3fc77a3137a04b22a6a875f7ac179130
BLAKE2b-256 | 471da0d43e5563b64cc168be439917117df5040dea0242939bc7cdcc2bbae9c2