
Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing, and saves the data.


Wayback Tweets


Retrieves CDX data for archived tweets from the Wayback Machine, parses it (see Field Options), and saves the results in HTML (for easy viewing of the tweets through the iframe tag), CSV, and JSON formats.
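To illustrate the idea behind the HTML output, here is a minimal sketch that wraps archived tweet URLs in iframe tags using only the standard library. The `render_iframes` helper, its default dimensions, and the sample URL are assumptions made for this example; the HTML that Wayback Tweets actually generates may look different.

```python
from html import escape


def render_iframes(archived_urls, width=600, height=400):
    """Return a small HTML page with one iframe per archived URL (hypothetical helper)."""
    frames = [
        # escape() protects the attribute value in case a URL contains quotes or ampersands
        f'<iframe src="{escape(url)}" width="{width}" height="{height}"></iframe>'
        for url in archived_urls
    ]
    body = "\n".join(frames)
    return f"<!DOCTYPE html>\n<html>\n<body>\n{body}\n</body>\n</html>"


# Sample archived URL in the usual Wayback Machine form: /web/<timestamp>/<original URL>
page = render_iframes(
    ["https://web.archive.org/web/20150101120000/https://twitter.com/jack/status/20"]
)
print(page)
```

Opening the resulting page in a browser would load each archived capture inside its own frame.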

Installation

pip install waybacktweets

Quickstart

Using Wayback Tweets as a standalone command-line tool

waybacktweets [OPTIONS] USERNAME

waybacktweets --from 20150101 --to 20191231 --limit 250 jack
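For orientation, the date range and limit in the command above correspond to the `from`, `to`, and `limit` parameters of the Wayback Machine CDX server API. The sketch below only builds such a query URL with the standard library and sends no request. The `build_cdx_url` helper and the `twitter.com/<username>/status/*` URL pattern are assumptions made for illustration, not necessarily how Wayback Tweets constructs its requests.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(username, timestamp_from=None, timestamp_to=None, limit=None):
    """Build a CDX query URL for a user's archived tweets (hypothetical helper)."""
    params = {
        # Assumed URL pattern for tweet pages; the trailing * asks for a prefix match
        "url": f"twitter.com/{username}/status/*",
        "output": "json",
    }
    # Optional filters are added only when a value is provided
    if timestamp_from:
        params["from"] = timestamp_from
    if timestamp_to:
        params["to"] = timestamp_to
    if limit:
        params["limit"] = str(limit)
    # Keep "/" and "*" unescaped so the URL stays readable
    return f"{CDX_ENDPOINT}?{urlencode(params, safe='/*')}"


url = build_cdx_url("jack", "20150101", "20191231", 250)
print(url)
```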

Using Wayback Tweets as a Web App

Open the application, a prototype written in Python with the Streamlit framework and hosted on Streamlit Cloud.

Using Wayback Tweets as a Python Module

from waybacktweets import WaybackTweets, TweetsParser, TweetsExporter

USERNAME = "jack"

# Request the archived tweets CDX data for this user
api = WaybackTweets(USERNAME)
archived_tweets = api.get()

# Continue only if the request returned data
if archived_tweets:
    # Fields to include in the parsed output (see Field Options)
    field_options = [
        "archived_timestamp",
        "original_tweet_url",
        "archived_tweet_url",
        "archived_statuscode",
    ]

    parser = TweetsParser(archived_tweets, USERNAME, field_options)
    parsed_tweets = parser.parse()

    # Save the parsed tweets to a CSV file
    exporter = TweetsExporter(parsed_tweets, USERNAME, field_options)
    exporter.save_to_csv()
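To give a rough idea of what the parsing step involves, the sketch below maps rows in the shape the CDX server returns with `output=json` (a header row followed by captures) onto the four field options used above, then serializes them to CSV and JSON in memory. The sample values, the `parse_rows` helper, and the way each field is derived are assumptions for illustration; the library's own parser may work differently.

```python
import csv
import io
import json

# Sample CDX response: the first row is the header, the rest are captures.
# The capture values here are made up for the example.
sample_cdx = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,twitter)/jack/status/20", "20150101120000",
     "https://twitter.com/jack/status/20", "text/html", "200", "EXAMPLEDIGEST", "1234"],
]


def parse_rows(cdx_rows):
    """Map raw CDX rows to dictionaries keyed by field option names (hypothetical helper)."""
    header, *captures = cdx_rows
    parsed = []
    for capture in captures:
        row = dict(zip(header, capture))
        parsed.append({
            "archived_timestamp": row["timestamp"],
            "original_tweet_url": row["original"],
            # Wayback Machine capture URLs have the form /web/<timestamp>/<original URL>
            "archived_tweet_url": f"https://web.archive.org/web/{row['timestamp']}/{row['original']}",
            "archived_statuscode": row["statuscode"],
        })
    return parsed


tweets = parse_rows(sample_cdx)

# Serialize in memory instead of writing files
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(tweets[0]))
writer.writeheader()
writer.writerows(tweets)
csv_text = buffer.getvalue()
json_text = json.dumps(tweets, indent=2)
print(csv_text)
```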

Documentation

Acknowledgements

  • Tristan Lee (Bellingcat's Data Scientist) for the idea of the application.
  • Jessica Smith (Snowflake's Marketing Specialist) and Streamlit/Snowflake teams for the additional server resources on Streamlit Cloud.
  • OSINT Community for recommending the application.

Note: If the Streamlit application is down, please check the Streamlit Cloud Status.


