NASTY Advanced Search Tweet Yielder

NASTY is a tool/library for retrieving Tweets via the Twitter Web UI. Instead of using the Twitter Developer API it works by acting like a normal web browser accessing Twitter. That is, it sends AJAX requests and parses Twitter’s JSON responses. This approach makes it substantially different from the other popular crawlers and allows for the following features:

  • Search for tweets by keyword (and filter by latest/top/photos/videos, date of authorship, and language).

  • Retrieve all direct replies to a Tweet.

  • Retrieve all Tweets threaded under a Tweet.

  • Return fully-hydrated JSON objects of Tweets that exactly match the extended mode of the developer API.

  • Operate in batch mode to run a large set of requests, abort at any time, and rerun both uncompleted and failed requests.

  • Written in tested and fully type-checked Python code.

Installation

Python 3.6+ is required. Install via:

pip install nasty

Command Line Interface

To get help for the command line interface use the --help option:

$ nasty --help
usage: nasty [-h] [-v] [search|replies|thread|executor] ...

NASTY Advanced Search Tweet Yielder.

Commands:
  <COMMAND>
    search (s)         Retrieve Tweets using the Twitter advanced search.
    replies (r)        Retrieve all directly replying Tweets to a Tweet.
    thread (t)         Retrieve all Tweets threaded under a Tweet.
    executor (e)       Execute previously submitted requests.

General Arguments:
  -h, --help           Show this help message and exit.
  -v, --version        Show program's version number and exit.
  --log-level <LEVEL>  Logging level (DEBUG, INFO, WARN, ERROR.)

You can also get help for the individual subcommands, for example via nasty search --help.

Replies

You can fetch all direct replies to the Tweet with ID 332308211321425920:

$ nasty replies --tweet-id 332308211321425920

Thread

You can fetch all Tweets threaded under the Tweet with ID 332308211321425920:

$ nasty thread --tweet-id 332308211321425920

Executor

NASTY also supports writing requests to a jobs file to be executed in batch mode later. The benefits include being able to track the progress of a large set of requests, abort at any time, and rerun both uncompleted and failed requests. In NASTY, this mechanism is called the executor.

To write a request to a jobs file, use the --to-executor argument on any of the above requests, for example:

$ nasty search --query "climate change" --to-executor jobs.jsonl

To run all requests stored in a jobs file and write the output to the directory out:

$ nasty executor --executor-file jobs.jsonl --out-dir out/
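
The general idea behind such a batch mode can be sketched in plain Python. This is an illustrative sketch only, not NASTY's implementation and not its actual jobs-file format: the field names id and query, the done.txt marker file, and the run_job callback are all assumptions made for this example. It appends requests as JSON Lines and, on each run, skips jobs already recorded as completed, so an aborted run can be resumed and failed jobs are retried.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def submit(jobs_file: Path, job_id: str, query: str) -> None:
    # Append one request per line (JSON Lines); the fields are hypothetical.
    with jobs_file.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"id": job_id, "query": query}) + "\n")


def execute(
    jobs_file: Path, out_dir: Path, run_job: Callable[[dict], str]
) -> List[str]:
    # Run every job without a recorded completion; return the IDs run now.
    out_dir.mkdir(parents=True, exist_ok=True)
    done_file = out_dir / "done.txt"
    done = set()
    if done_file.exists():
        done = set(done_file.read_text(encoding="utf-8").split())
    ran = []
    with jobs_file.open(encoding="utf-8") as f:
        for line in f:
            job = json.loads(line)
            if job["id"] in done:
                continue  # Completed in an earlier run.
            try:
                result = run_job(job)
            except Exception:
                continue  # Failed jobs stay unmarked and are retried next run.
            (out_dir / f"{job['id']}.txt").write_text(result, encoding="utf-8")
            with done_file.open("a", encoding="utf-8") as d:
                d.write(job["id"] + "\n")
            ran.append(job["id"])
    return ran


with tempfile.TemporaryDirectory() as tmp:
    jobs = Path(tmp) / "jobs.jsonl"
    submit(jobs, "a", "climate change")
    submit(jobs, "b", "weather")

    def shout(job: dict) -> str:
        # Stand-in for real work; a real job would perform a request.
        return job["query"].upper()

    first_run = execute(jobs, Path(tmp) / "out", shout)
    second_run = execute(jobs, Path(tmp) / "out", shout)
    print(first_run, second_run)
```

Recording completion only after the output has been written means an interruption between the two steps causes a job to be repeated rather than lost.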

Python API

To fetch all German-language Tweets about “climate change” written before 14 January 2019:

import nasty
from datetime import datetime

tweet_stream = nasty.Search("climate change",
                            until=datetime(2019, 1, 14),
                            lang="de").request()
for tweet in tweet_stream:
    print(tweet.created_at, tweet.text)

Similar functionality is available in the nasty.Replies and nasty.Thread classes. The returned tweet_stream is an Iterable of nasty.Tweets. The executor functionality is available in the nasty.RequestExecutor class.
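
Because the returned stream is an ordinary iterable, standard-library tools can consume it. The following is a minimal sketch, not part of NASTY itself, that takes only the first few items using itertools.islice. The FakeTweet class and fake_tweet_stream generator are hypothetical stand-ins so the example runs without network access; with NASTY installed, the result of nasty.Search(...).request() would be passed to take instead. Whether NASTY produces its stream lazily is not stated above, so any savings from stopping early are an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, NamedTuple


class FakeTweet(NamedTuple):
    # Hypothetical stand-in exposing the two attributes used in the example above.
    created_at: str
    text: str


def fake_tweet_stream() -> Iterator[FakeTweet]:
    # Hypothetical stand-in for the iterable returned by a NASTY request.
    for i in range(1000):
        yield FakeTweet(created_at=f"day {i}", text=f"tweet number {i}")


def take(stream: Iterable[FakeTweet], limit: int) -> List[FakeTweet]:
    # islice stops after `limit` items, so the remainder is never consumed.
    return list(islice(stream, limit))


first_three = take(fake_tweet_stream(), 3)
for tweet in first_three:
    print(tweet.created_at, tweet.text)
```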

Comprehensive Python API documentation is planned. Until then, the code should be easy to understand.

Contributing

Please feel free to submit bug reports and pull requests!

Pipenv is used for managing the Python environment and tracking dependencies. Once it is installed, you can use the Makefile helpers to run the auxiliary development tools:

  • make devenv to create a new virtual environment for Python and install all development dependencies.

  • make test to run all tests and report test coverage.

  • make test-tox to run all tests against all supported Python versions and run linters.

  • make check to run linters and perform static type-checking.

  • make format to format all source code according to the project guidelines.

  • make publish to build the source and binary distributions and upload to TestPyPI.

  • make clean to remove all generated files.

Acknowledgements

License

Copyright 2019 Lukas Schmelzeisen. Licensed under the Apache License, Version 2.0.
