Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing, and saves the data.

Wayback Tweets

Retrieves archived tweets CDX data from the Wayback Machine, performs the necessary parsing (see Field Options), and saves the data in CSV, JSON, and HTML formats. The HTML output embeds the tweets in iframe tags for easy viewing.

Installation

pip install waybacktweets

Quickstart

Using Wayback Tweets as a standalone command line tool

waybacktweets [OPTIONS] USERNAME

waybacktweets --from 20150101 --to 20191231 --limit 250 jack
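The --from and --to values in the example above read as YYYYMMDD dates, and --limit caps the number of results. As a rough illustration of the kind of request involved, the sketch below builds a query URL for the Wayback Machine CDX server with the same date range and limit. It is not taken from the Wayback Tweets source: the helper name build_cdx_url and the twitter.com/<username>/status/* URL pattern are assumptions made for this example.

```python
from urllib.parse import urlencode

# Public Wayback Machine CDX server endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(username, date_from=None, date_to=None, limit=None):
    """Hypothetical helper: build a CDX query URL for a user's archived tweets."""
    params = {
        # Assumed URL pattern; the trailing * asks for every capture under the path.
        "url": f"twitter.com/{username}/status/*",
        "output": "json",
    }
    # Optional filters are only added when a value was given.
    if date_from is not None:
        params["from"] = date_from
    if date_to is not None:
        params["to"] = date_to
    if limit is not None:
        params["limit"] = limit
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


# Same range and limit as the command line example above.
url = build_cdx_url("jack", "20150101", "20191231", 250)
print(url)
```

The function only builds the URL and makes no network request, so it can be inspected safely before any data is fetched.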

Using Wayback Tweets as a Web App

Open the application, a prototype written in Python with the Streamlit framework and hosted on Streamlit Cloud.

Using Wayback Tweets as a Python Module

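from waybacktweets import WaybackTweets, TweetsParser, TweetsExporter

USERNAME = "jack"

# Request the archived tweets CDX data for the user
api = WaybackTweets(USERNAME)
archived_tweets = api.get()

# Continue only if the request returned data
if archived_tweets:
    # Fields to include in the parsed output (see Field Options)
    field_options = [
        "archived_timestamp",
        "original_tweet_url",
        "archived_tweet_url",
        "archived_statuscode",
    ]

    # Parse the CDX data into the selected fields
    parser = TweetsParser(archived_tweets, USERNAME, field_options)
    parsed_tweets = parser.parse()

    # Save the parsed tweets to a CSV file
    exporter = TweetsExporter(parsed_tweets, USERNAME, field_options)
    exporter.save_to_csv()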
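
The field names in field_options hint at how each record is put together. A minimal sketch, assuming the usual Wayback Machine conventions (a 14-digit YYYYMMDDhhmmss capture timestamp and replay URLs of the form https://web.archive.org/web/<timestamp>/<original URL>), shows how such fields could be derived from a single capture. The function build_record and the sample timestamp are hypothetical, and this is not the package's actual parser.

```python
from datetime import datetime


def build_record(timestamp, original_url, statuscode):
    """Hypothetical helper: map one CDX capture to the four fields used above."""
    # Validate the timestamp; strptime raises ValueError if it is malformed.
    datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return {
        "archived_timestamp": timestamp,
        "original_tweet_url": original_url,
        # Replay URL: Wayback prefix, then the timestamp, then the original URL.
        "archived_tweet_url": f"https://web.archive.org/web/{timestamp}/{original_url}",
        "archived_statuscode": statuscode,
    }


# Made-up sample capture, for illustration only.
record = build_record("20150101123000", "https://twitter.com/jack/status/20", "200")
print(record["archived_tweet_url"])
```

Keeping the original URL and the archived URL as separate fields makes it possible to check a capture against the live page, when that page still exists.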

Documentation

Acknowledgements

  • Tristan Lee (Bellingcat's Data Scientist) for the idea of the application.
  • Jessica Smith (Snowflake's Community Growth Specialist) and the Streamlit/Snowflake team for the additional server resources on Streamlit Cloud.
  • The OSINT community for recommending the application.

[!NOTE] If the Streamlit application is down, please check the Streamlit Cloud Status.
