X-CrawlFox 🦊
A free, high-anonymity, human-like X (Twitter) scraping CLI tool built on Camoufox, for personal use.
🌐 English | 中文
🚀 Key Features
Free, highly customizable, and incremental, with built-in human-like behavior to evade bot detection.
- Human-like Interaction: Integrates Camoufox fingerprint obfuscation to simulate real human scrolling, random delays, and typing interactions, significantly reducing the risk of detection.
- Timeline Scraping: Supports crawling "Following" and "For you" feeds with configurable item limits.
- Deep News Scraping: Automatically scrapes the "Today's News" sidebar, with support for clicking into details to extract Grok summaries and related popular posts.
- Keyword Search: Simulates real keyboard input for search queries to bypass anti-bot detection.
- Incremental Account Monitoring: Supports multi-account monitoring with automatic tracking of the last crawled tweet ID to only fetch new content.
- One-click Composite Tasks: Launch composite tasks (Timeline, News, Monitoring, Search) via a unified JSON configuration file.
- Automatic State Management: Automatically saves login sessions (Cookie) and crawling progress (Crawler State).
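The human-like behavior above comes down to randomized scroll distances and irregular pauses rather than fixed-interval requests. A minimal sketch of such a scroll plan (the helper name and parameters are hypothetical, not part of x-crawlfox's actual API):

```python
import random

def human_scroll_plan(n_steps, base_px=600, jitter=0.4):
    """Produce (scroll_px, pause_s) pairs that mimic uneven human scrolling.

    Hypothetical illustration only -- not x-crawlfox's internal code.
    Each step scrolls a jittered distance, then pauses for an
    irregular "reading" interval.
    """
    plan = []
    for _ in range(n_steps):
        # Vary the scroll distance by +/- jitter around the base distance.
        px = int(base_px * random.uniform(1 - jitter, 1 + jitter))
        # Pause between steps, like a human skimming the feed.
        pause = round(random.uniform(0.8, 2.5), 2)
        plan.append((px, pause))
    return plan
```

In a Playwright-style browser session (which Camoufox exposes), each pair would drive something like `page.mouse.wheel(0, px)` followed by a sleep of `pause` seconds.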
📦 Quick Start
Installation
- Install from PyPI:
  pip install x-crawlfox
- Build from source (this project uses uv for package management):
  git clone https://github.com/Jiutwo/x-crawlfox.git
  cd x-crawlfox
  uv sync
How to Use
1. Initialize Config Directory
Before first use, run the following command to generate the .x-crawlfox configuration folder and default settings in the current directory:
x-crawlfox init
# To save the configuration to the user home directory (Global Mode):
x-crawlfox init --global
2. Account Login or Cookie Export (Required)
You must have a logged-in session (Cookie) before scraping.
Note: Scraping immediately with a newly registered account is risky; it is recommended to use the account normally for a while first.
Method 1: Export via Cookie Editor Extension (Recommended)
Use the browser extension Cookie Editor to export your current session cookies as JSON and save them to .x-crawlfox/x_cookies.json.
The .x-crawlfox folder can be located in the current directory or the user home directory. X-CrawlFox will automatically recognize and convert the Cookie Editor format to the required internal format upon loading.
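Cookie Editor's JSON export uses slightly different field names than the cookie shape Playwright-based tools consume (for example `expirationDate` vs. `expires`). X-CrawlFox performs this conversion for you on load; a hedged sketch of what such a conversion involves (field defaults here are assumptions, not the tool's exact logic):

```python
import json

def cookie_editor_to_playwright(raw):
    """Convert a Cookie Editor JSON export (a list of cookie dicts) into the
    cookie shape Playwright's storage state expects.

    Hypothetical converter for illustration; x-crawlfox does an equivalent
    conversion internally.
    """
    same_site_map = {
        "lax": "Lax",
        "strict": "Strict",
        "none": "None",
        "no_restriction": "None",  # Cookie Editor's spelling for SameSite=None
    }
    out = []
    for c in raw:
        out.append({
            "name": c["name"],
            "value": c["value"],
            "domain": c.get("domain", ".x.com"),
            "path": c.get("path", "/"),
            # Cookie Editor: 'expirationDate'; Playwright: 'expires' (-1 = session cookie)
            "expires": c.get("expirationDate", -1),
            "httpOnly": c.get("httpOnly", False),
            "secure": c.get("secure", True),
            "sameSite": same_site_map.get(str(c.get("sameSite", "lax")).lower(), "Lax"),
        })
    return out
```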
Method 2: Command Line Login
x-crawlfox x login
Complete the login in the popup browser window, then return to the terminal and press Enter to save the state. The login state will be automatically saved to .x-crawlfox/x_cookies.json.
If X blocks the login as a "suspicious attempt," please switch to Method 1.
3. Scrape Personal Timeline
# Scrape the first 20 items from the Following feed
# Add --no-headless to visualize the process
x-crawlfox x timeline --type Following --max-items 20
# Scrape the For You feed
x-crawlfox x timeline --type "For you" --max-items 50
4. Scrape Today's News
# Scrape sidebar list only
x-crawlfox x news
# Deep scraping: Enter details to get summaries and related posts
x-crawlfox x news --detail --max-items 3
5. Scrape/Monitor Specific User
# Fetch the latest 20 tweets from a specific user
x-crawlfox x user elonmusk --max-tweets 20
# Incremental fetch: Only get new content since the last run
x-crawlfox x user elonmusk --only-new
Run multi-account monitoring independently (reads x.monitor from crawl_config.json):
x-crawlfox x monitor
You can also specify a custom config file (flat list format):
x-crawlfox x monitor --config my_accounts.json
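The flat list format is presumably just an array of monitor entries. A plausible my_accounts.json, with field names inferred from the x.monitor entries shown in the crawl_config.json example below:

```json
[
  { "username": "elonmusk", "only_new": true, "max_tweets": 10 },
  { "username": "OpenAI", "only_new": true, "max_tweets": 10 }
]
```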
6. One-click Composite Tasks
Edit .x-crawlfox/crawl_config.json, then run:
x-crawlfox x all
You can also specify a different config file path via --config:
x-crawlfox x all --config /path/to/crawl_config.json
Example crawl_config.json format:
{
"global": {
"output_dir": "output",
"headless": true
},
"x": {
"timeline": [
{ "type": "For you", "max_scrolls": 2, "max_items": 10 },
{ "type": "Following", "max_scrolls": 3, "max_items": 10 }
],
"news": {
"enabled": true,
"detail": true,
"max_items": 5
},
"monitor": [
{ "username": "elonmusk", "only_new": true, "max_tweets": 10 },
{ "username": "OpenAI", "only_new": true, "max_tweets": 10 }
]
}
}
📂 Storage & Configuration (.x-crawlfox)
To protect privacy and support persistence, X-CrawlFox uses the .x-crawlfox folder to store sensitive data:
- Storage Location:
  - Local Mode: The program first checks whether .x-crawlfox exists in the current working directory. If found, all data is stored there (ideal for account isolation).
  - Global Mode: If the local directory does not exist, it falls back to ~/.x-crawlfox in the user home directory (Windows: %USERPROFILE%\.x-crawlfox).
- Stored Content:
  - x_cookies.json: Stores X login cookies and auth tokens. Do not share this file.
  - crawl_config.json: Unified configuration file for the all and monitor commands.
  - x_crawl_state.json: Stores the last tweet ID fetched for each monitored account to enable incremental fetching.
- Output Location: All scraping results are saved in .jsonl format in the output/ directory for easy analysis or database import.
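Incremental fetching via x_crawl_state.json can be pictured as a per-account ID watermark: remember the newest tweet ID seen, and on the next run keep only tweets above it. A hedged sketch (the state layout and the tweet `id` field are assumptions, not the tool's documented schema):

```python
import json
from pathlib import Path

# Path taken from the storage layout described above.
STATE_FILE = Path(".x-crawlfox/x_crawl_state.json")

def load_state():
    """Load the per-account watermark map, e.g. {"elonmusk": "1234567890"}."""
    return json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}

def filter_new_tweets(username, tweets, state):
    """Keep only tweets newer than the last recorded ID, then advance the
    watermark. `tweets` is assumed newest-first, each with a numeric-string
    'id' (X tweet IDs increase over time).

    Hypothetical sketch of incremental fetching, not x-crawlfox internals.
    """
    last_id = int(state.get(username, 0))
    new = [t for t in tweets if int(t["id"]) > last_id]
    if new:
        state[username] = new[0]["id"]  # newest tweet becomes the new watermark
    return new
```

After a run, the updated state map would be written back to STATE_FILE so the next invocation only fetches content published since.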
🙏 Acknowledgments
This project is deeply inspired by the open-source community and integrates excellent open-source projects such as Camoufox. Sincere thanks to all the open-source libraries and developers who provide foundational support for this project.
⚠️ Disclaimer
This tool is for educational and research purposes only. Please comply with the X (Twitter) Terms of Service. The developers are not responsible for any account restrictions or legal issues resulting from the use of this tool.