Archive tweets from the command line

Project description

twarc

twarc is a command line tool and Python library for collecting and archiving Twitter JSON data via the Twitter API. It has separate commands (twarc and twarc2) for working with the older v1.1 API and the newer v2 API and Academic Access (respectively). It also has an ecosystem of plugins for doing things with the collected data.

See the twarc documentation for details on running the commands: twarc2 for the v2 API and twarc1 for the v1.1 API. If you aren't sure which one to use, start with twarc2, since the v1.1 API is scheduled to be retired.

Install

If you have Python installed, you can install twarc from a terminal (such as the Windows Command Prompt, available in the "Start" menu, or the macOS Terminal application):

pip3 install twarc

Once installed, you should be able to use the twarc and twarc2 command line utilities, or use twarc as a Python library; see the twarc examples for the latter.
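As a rough sketch of library use, the following assumes the Twarc2 client class and its search_recent method from the twarc package, and a bearer token stored in a BEARER_TOKEN environment variable (that variable name is an assumption made for this example). The collect_tweets helper is hypothetical and not part of twarc; it accepts any client whose search_recent yields response pages, so it can be tried without network access.

```python
import itertools
import os


def collect_tweets(client, query, max_pages=1):
    """Return tweet objects from the first `max_pages` response pages.

    `client` is assumed to expose search_recent(query), yielding one dict
    per API response page with the tweets listed under the "data" key.
    """
    tweets = []
    for page in itertools.islice(client.search_recent(query), max_pages):
        # A page with no results may lack "data", so default to an empty list.
        tweets.extend(page.get("data", []))
    return tweets


def main():
    # Imported here so collect_tweets can be used without twarc installed.
    from twarc import Twarc2

    # BEARER_TOKEN is an assumed environment variable name.
    client = Twarc2(bearer_token=os.environ["BEARER_TOKEN"])
    for tweet in collect_tweets(client, "twarc"):
        print(tweet["id"], tweet["text"])
```

Calling main() requires twarc to be installed and valid API credentials; collect_tweets on its own only needs an object with a compatible search_recent method.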

Other Tools

Twarc is purpose-built for working with the Twitter API to archive and study digital trace data. It is not built as a general purpose API library for Twitter. While its primary use is academic, it works just as well with the "Standard" v2 API and the "Premium" v1.1 APIs.

For a list of general purpose Twitter libraries in different languages, see the Twitter documentation. For Python, TwitterAPI and tweepy are both up to date and maintained. They also support the v2 API, though the format of the data they return, including expansions, may differ from twarc's. Twitter also provides a reference implementation of the v2 Academic Access Search and v1.1 Premium Search. The v2 version of that script is compatible with twarc.

For R there is academictwitteR. Unlike twarc, it focuses solely on querying the Twitter Academic Research Product Track v2 API endpoint. Data gathered in twarc can be imported into R for analysis as a dataframe if you export the data into CSV using twarc-csv.
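To illustrate the kind of JSON-to-CSV conversion involved, here is a minimal standard-library sketch. It is not twarc-csv and does not reproduce its output columns. It assumes each input line holds one API response page with tweets under a "data" key; the function name tweets_to_csv and the default field list are chosen only for this example.

```python
import csv
import json


def tweets_to_csv(jsonl_lines, out, fields=("id", "created_at", "text")):
    """Write one CSV row per tweet and return the number of rows written.

    Each non-empty line is assumed to be a JSON response page whose
    "data" value is a list of tweets (or a single tweet object).
    """
    writer = csv.DictWriter(out, fieldnames=list(fields))
    writer.writeheader()
    count = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        data = json.loads(line).get("data", [])
        if isinstance(data, dict):
            data = [data]  # a single-tweet response
        for tweet in data:
            # Only top-level fields are copied; missing ones become "".
            writer.writerow({name: tweet.get(name, "") for name in fields})
            count += 1
    return count
```

Nested structures such as expansions are ignored here, which is the part a dedicated exporter like twarc-csv is meant to handle.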

Getting Help

Check out the tutorial to get started, or follow along with this recorded stream introducing twarc. You can also find additional resources linked from resources. If you run into trouble, feel free to make a post on the Twarc Repository or on the Twitter Developer Forums.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

twarc-2.14.0.tar.gz (57.8 kB)

Uploaded Source

Built Distribution

twarc-2.14.0-py3-none-any.whl (60.2 kB)

Uploaded Python 3

File details

Details for the file twarc-2.14.0.tar.gz.

File metadata

  • Download URL: twarc-2.14.0.tar.gz
  • Size: 57.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/6.1.0 pkginfo/1.7.0 requests/2.28.2 requests-toolbelt/0.9.1 tqdm/4.65.0 CPython/3.7.5

File hashes

Hashes for twarc-2.14.0.tar.gz
Algorithm Hash digest
SHA256 fa8ee3052d8b9678231bea95d1bdcbabb3968d35c56a8d1fcedc8982e8c66a66
MD5 f01d32a8601642957f27bf9f98a2455e
BLAKE2b-256 8aedac80b24ece6ee552f6deb39be34f01491cff4018cca8c5602c901dc08ecf

See more details on using hashes here.
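One way to check a downloaded file against a published SHA256 digest is sketched below using Python's hashlib. The function names sha256_hex and matches_expected are illustrative, and the file is assumed to already be on local disk.

```python
import hashlib


def sha256_hex(path, chunk_size=65536):
    """Return the SHA256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with a published hex digest."""
    return sha256_hex(path) == expected_hex.strip().lower()
```

For example, matches_expected("twarc-2.14.0.tar.gz", "fa8ee3052d8b9678231bea95d1bdcbabb3968d35c56a8d1fcedc8982e8c66a66") should return True for an intact copy of the source distribution listed above.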

File details

Details for the file twarc-2.14.0-py3-none-any.whl.

File metadata

  • Download URL: twarc-2.14.0-py3-none-any.whl
  • Size: 60.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for twarc-2.14.0-py3-none-any.whl
Algorithm Hash digest
SHA256 25f7de77d0d5c82eb2d0aacf9315f1f6212a0643a6b1a8f6ab401e83b3173925
MD5 af34dd8a225b6a089c64e3e3385498da
BLAKE2b-256 82118210b2b049a37b01c3b583dec9af07b3251e280bd4bd1f29bcad4b852f1b