Twitter GraphQL and Search API implementation with SNScrape data models
Project description
twscrape
twscrape is an async Python library and CLI for X/Twitter Search and GraphQL endpoints. It runs on your own account pool, keeps sessions in SQLite, rotates accounts when an endpoint is rate-limited, and returns either parsed SNScrape-style models or raw API responses.
Install
pip install twscrape
httpx is the default HTTP backend. For browser-like TLS fingerprinting, install the optional curl-cffi backend:
pip install "twscrape[curl]"
TWS_HTTP_BACKEND=curl twscrape user_by_login xdevelopers
Features
- Search and GraphQL X/Twitter API methods
- Async/await API for running multiple scrapers concurrently
- Login flow with optional email verification code retrieval
- Cookie-based account setup
- Saved account sessions and per-account proxies
- Raw Twitter API responses and parsed SNScrape-compatible models
- Automatic account switching across rate-limited operations
Start With Cookies
twscrape requires authorized X/Twitter accounts. The most stable setup is to add an account from browser cookies containing auth_token and ct0.
twscrape add_cookie my_account "auth_token=xxx; ct0=yyy"
twscrape accounts
twscrape search "from:xdevelopers lang:en" --limit=20
Or let the CLI prompt for the cookie value:
twscrape add_cookie my_account
Cookie accounts that include ct0 are activated immediately; no login_accounts step is needed.
To get cookies: open x.com -> DevTools (F12) -> Application -> Cookies -> copy auth_token and ct0 values.
Ready-to-use cookie accounts are available from this provider. Proxy users can bring their own proxies or use this provider. These are referral links.
X/Twitter's Terms of Service discourage using multiple accounts. Use this project responsibly and at your own discretion.
Python API
import asyncio
from twscrape import API, gather
async def main():
api = API() # or API("accounts.db")
# Add once; the session is stored in the account database.
await api.pool.add_account_cookies("my_account", "auth_token=xxx; ct0=yyy")
user = await api.user_by_login("xdevelopers")
print(user.id, user.username, user.followersCount)
tweets = await gather(api.search("from:xdevelopers lang:en", limit=20))
for tweet in tweets:
print(tweet.id, tweet.user.username, tweet.rawContent)
if __name__ == "__main__":
asyncio.run(main())
gather() is a convenience helper. You can stream results directly:
async for tweet in api.search("open source lang:en", limit=100):
print(tweet.id, tweet.rawContent)
Search defaults to the Latest tab. Pass kv={"product": "Top"} or kv={"product": "Media"} to use another search product:
tweets = await gather(api.search("python", limit=20, kv={"product": "Top"}))
Every parsed method has a _raw version for the original response wrapper:
async for rep in api.search_raw("from:xdevelopers", limit=20):
print(rep.status_code, rep.json())
When breaking out of an async generator early, close it with contextlib.aclosing so the account lock is released promptly:
from contextlib import aclosing
async with aclosing(api.search("elon musk")) as gen:
async for tweet in gen:
if tweet.id < 200:
break
API Surface
Search:
await gather(api.search("elon musk", limit=20)) # list[Tweet]
await gather(api.search("elon musk", limit=20, kv={"product": "Top"})) # Top tab
await gather(api.search_user("openai", limit=20)) # list[User]
await gather(api.search_trend("python", limit=20)) # list[Trend]
Tweets:
tweet_id = 20
await api.tweet_details(tweet_id) # Tweet
await gather(api.tweet_replies(tweet_id, limit=20)) # list[Tweet]
await gather(api.tweet_thread(tweet_id, limit=20)) # list[Tweet]
await gather(api.retweeters(tweet_id, limit=20)) # list[User]
await gather(api.bookmarks(limit=20)) # list[Tweet]
Users and timelines:
user_login = "xdevelopers"
user_id = 2244994945
await api.user_by_login(user_login) # User
await api.user_about(user_login) # AccountAbout
await gather(api.following(user_id, limit=20)) # list[User]
await gather(api.followers(user_id, limit=20)) # list[User]
await gather(api.verified_followers(user_id, limit=20)) # list[User]
await gather(api.subscriptions(user_id, limit=20)) # list[User]
await gather(api.user_tweets(user_id, limit=20)) # list[Tweet]
await gather(api.user_tweets_and_replies(user_id, limit=20)) # list[Tweet]
await gather(api.user_media(user_id, limit=20)) # list[Tweet]
Lists:
list_id = 123456789
await gather(api.list_timeline(list_id, limit=20)) # list[Tweet]
await gather(api.list_members(list_id, limit=20)) # list[User]
Communities:
community_id = 1501272736215322629
await api.community_info(community_id) # Community
await gather(api.community_members(community_id, limit=20)) # list[User]
await gather(api.community_moderators(community_id, limit=20)) # list[User]
await gather(api.community_tweets(community_id, limit=20)) # list[Tweet]
Trends:
await gather(api.trends("news")) # list[Trend]
await gather(api.trends("sport")) # list[Trend]
await gather(api.trends("entertainment")) # list[Trend]
await gather(api.trends("VGltZWxpbmU6DAC2CwABAAAACHRyZW5kaW5nAAA")) # list[Trend]
Parsed Tweet, User, Community, and trend objects can be converted with .dict() or .json().
CLI
twscrape
twscrape search --help
Commands:
twscrape search "QUERY" --limit=20
twscrape tweet_details TWEET_ID
twscrape tweet_replies TWEET_ID --limit=20
twscrape tweet_thread TWEET_ID --limit=20
twscrape retweeters TWEET_ID --limit=20
twscrape user_by_login USERNAME
twscrape user_about USERNAME
twscrape user_media USER_ID --limit=20
twscrape following USER_ID --limit=20
twscrape followers USER_ID --limit=20
twscrape verified_followers USER_ID --limit=20
twscrape subscriptions USER_ID --limit=20
twscrape user_tweets USER_ID --limit=20
twscrape user_tweets_and_replies USER_ID --limit=20
twscrape list_timeline LIST_ID --limit=20
twscrape list_members LIST_ID --limit=20
twscrape community_info COMMUNITY_ID
twscrape community_members COMMUNITY_ID --limit=20
twscrape community_moderators COMMUNITY_ID --limit=20
twscrape community_tweets COMMUNITY_ID --limit=20
twscrape trends sport
CLI output is JSON Lines: one document per line.
twscrape search "elon musk lang:es" --limit=20 > tweets.jsonl
twscrape search "elon musk lang:es" --limit=20 --raw
Use a separate account database when you need isolated account pools:
twscrape --db research.db search "python lang:en" --limit=100
Accounts
Add username/password accounts from a file:
twscrape add_accounts ./accounts.txt username:password:email:email_password
twscrape login_accounts
twscrape login_accounts starts the login flow for each inactive account. If X asks for email verification and email_password is available, twscrape tries to read the code through IMAP and saves the resulting cookies for later use.
line_format describes how each line is split. Supported tokens:
username- requiredpassword- requiredemail- requiredemail_password- used to fetch email verification codes through IMAPcookies- cookie string, JSON, base64, or another format accepted by the parser_- skip column
Example account file:
username:password:email:email password:user_agent:cookies
Command:
twscrape add_accounts ./accounts.txt username:password:email:email_password:_:cookies
If IMAP is unavailable, enter verification codes manually:
twscrape login_accounts --manual
twscrape relogin user1 user2 --manual
twscrape relogin_failed --manual
Inspect and maintain the pool:
twscrape accounts
twscrape stats
twscrape relogin user1 user2
twscrape relogin_failed
twscrape reset_locks
twscrape delete_inactive
twscrape del_accounts user1 user2
twscrape accounts prints the current account state:
username logged_in active last_used total_req error_msg
user1 True True 2023-05-20 03:20:40 100 None
user2 True True 2023-05-20 03:25:45 120 None
user3 False False None 120 Login error
Limits
limit is the target number of parsed objects, not a page size. X/Twitter controls page size per endpoint, so a call can return fewer or more objects than requested.
Rate limits are tracked per account and per endpoint. When an account is limited for one operation, twscrape locks it for that operation until the reset time and tries another active account.
user_tweets and user_tweets_and_replies are limited by X/Twitter to about 3200 tweets.
Proxy
Use a proxy per account, per API instance, or for CLI commands:
await api.pool.add_account(
"user1",
"pass1",
"user1@example.com",
"email_pass1",
proxy="http://login:pass@example.com:8080",
)
api = API(proxy="http://login:pass@example.com:8080")
doc = await api.user_by_login("xdevelopers")
TWS_PROXY=socks5://user:pass@127.0.0.1:1080 twscrape user_by_login xdevelopers
Proxy priority:
api.proxyTWS_PROXY- account proxy
Do not set api.proxy or TWS_PROXY when you want per-account proxies to be used.
Environment
TWS_PROXY- global proxy for all accountsTWS_WAIT_EMAIL_CODE- email verification timeout in seconds, default30TWS_RAISE_WHEN_NO_ACCOUNT- raiseNoAccountErrorinstead of waiting; acceptsfalse,0,true,1TWS_HTTP_BACKEND-httpxorcurlTWS_LOG_LEVEL- logger level, defaultINFOTWS_TELEMETRY=0- disable anonymous telemetryDO_NOT_TRACK=1- disable anonymous telemetry
Telemetry
twscrape collects anonymous, aggregated telemetry about GraphQL operation names and the selected HTTP backend.
Included:
- GraphQL operation name
- HTTP method and backend
- twscrape version
- Python version
- platform
Not included:
- account usernames
- cookies
- proxies
- query variables
- request URLs
- response bodies
- search text
Disable telemetry with TWS_TELEMETRY=0 or DO_NOT_TRACK=1.
See Also
- twitter-advanced-search - guide on search filters
- TweeterPy - another X client
- twitter-api-client - implementation of Twitter's v1, v2, and GraphQL APIs
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file twscrape-0.19.0.tar.gz.
File metadata
- Download URL: twscrape-0.19.0.tar.gz
- Upload date:
- Size: 949.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
db246d836e99d04e4c30f66dcc52825cf2ca913b10a55b75a993a5f01227f540
|
|
| MD5 |
fcdaed59a0045d9607a4f37b8c2facbe
|
|
| BLAKE2b-256 |
9836a2ecf1eb8d419925ada3dbab2dcc52184b1db664afcb5397d5fed8346ac9
|
Provenance
The following attestation bundles were made for twscrape-0.19.0.tar.gz:
Publisher:
ci.yml on vladkens/twscrape
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
twscrape-0.19.0.tar.gz -
Subject digest:
db246d836e99d04e4c30f66dcc52825cf2ca913b10a55b75a993a5f01227f540 - Sigstore transparency entry: 1830347142
- Sigstore integration time:
-
Permalink:
vladkens/twscrape@57c18bf9444e6842493ffbfa64eca462a711d242 -
Branch / Tag:
refs/tags/v0.19.0 - Owner: https://github.com/vladkens
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
ci.yml@57c18bf9444e6842493ffbfa64eca462a711d242 -
Trigger Event:
push
-
Statement type:
File details
Details for the file twscrape-0.19.0-py3-none-any.whl.
File metadata
- Download URL: twscrape-0.19.0-py3-none-any.whl
- Upload date:
- Size: 47.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b21e1cd1196e6328868af1b4541d86b727b906c27f97c1d5b35f1a7333f4290f
|
|
| MD5 |
313e7d76d42beeabe334cbe86d6e6232
|
|
| BLAKE2b-256 |
42b4f4cf65d88b86df3569f0a9eb163f4b34d9be6c32bcf2344117e6c367f66a
|
Provenance
The following attestation bundles were made for twscrape-0.19.0-py3-none-any.whl:
Publisher:
ci.yml on vladkens/twscrape
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
twscrape-0.19.0-py3-none-any.whl -
Subject digest:
b21e1cd1196e6328868af1b4541d86b727b906c27f97c1d5b35f1a7333f4290f - Sigstore transparency entry: 1830347274
- Sigstore integration time:
-
Permalink:
vladkens/twscrape@57c18bf9444e6842493ffbfa64eca462a711d242 -
Branch / Tag:
refs/tags/v0.19.0 - Owner: https://github.com/vladkens
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
ci.yml@57c18bf9444e6842493ffbfa64eca462a711d242 -
Trigger Event:
push
-
Statement type: