Skip to main content

Arcjet Python SDK. Bot detection, rate limiting, email validation, attack protection for Python applications.

Project description

Arcjet Logo

Arcjet - Python SDK

PyPI badge

Arcjet helps developers protect their apps in just a few lines of code. Bot detection. Rate limiting. Email validation. Attack protection. A developer-first approach to security.

This is the monorepo containing various Arcjet open source packages for Python.

Features

Arcjet security features for protecting Python apps:

Get help

Join our Discord server or reach out for support.

Installation

Install from PyPI with uv:

# With a uv project
uv add arcjet

# With an existing pip managed project
uv pip install arcjet

Or with pip:

pip install arcjet

Usage

Read the docs at docs.arcjet.com

Quick start example

This example implements Arcjet bot protection, rate limiting, email validation, and Shield WAF in a FastAPI application. Requests from bots not in the allow list will be blocked with a 403 Forbidden response.

The example email is invalid so an error will be returned - change the email to see different results.

FastAPI

An asynchronous example using FastAPI with the Arcjet async client.

# main.py
import os
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

from arcjet import (
    arcjet,
    shield,
    detect_bot,
    token_bucket,
    Mode,
    BotCategory,
)

app = FastAPI()

arcjet_key = os.getenv("ARCJET_KEY")
if not arcjet_key:
    raise RuntimeError(
        "ARCJET_KEY is required. Get one at https://app.arcjet.com")

aj = arcjet(
    key=arcjet_key,  # Get your key from https://app.arcjet.com
    rules=[
        # Shield protects your app from common attacks e.g. SQL injection
        shield(mode=Mode.LIVE),
        # Create a bot detection rule
        detect_bot(
            mode=Mode.LIVE, allow=[
                BotCategory.SEARCH_ENGINE,  # Google, Bing, etc
                # Uncomment to allow these other common bot categories
                # See the full list at https://docs.arcjet.com/bot-protection/identifying-bots
                # BotCategory.MONITOR, # Uptime monitoring services
                # BotCategory.PREVIEW, # Link previews e.g. Slack, Discord
            ]
        ),
        # Create a token bucket rate limit. Other algorithms are supported
        token_bucket(
            # Tracked by IP address by default, but this can be customized
            # See https://docs.arcjet.com/fingerprints
            # characteristics=["ip.src"],
            mode=Mode.LIVE,
            refill_rate=5,  # Refill 5 tokens per interval
            interval=10,  # Refill every 10 seconds
            capacity=10,  # Bucket capacity of 10 tokens
        ),
    ],
)


@app.get("/")
async def hello(request: Request):
    # Call protect() to evaluate the request against the rules
    decision = await aj.protect(
        request, requested=5  # Deduct 5 tokens from the bucket
    )

    # Handle denied requests
    if decision.is_denied():
        status = 429 if decision.reason_v2.type == "RATE_LIMIT" else 403
        return JSONResponse(
            {"error": "Denied", "reason": decision.reason_v2},
            status_code=status,
        )

    # Check IP metadata (VPNs, hosting, geolocation, etc)
    if decision.ip.is_hosting():
        # Requests from hosting IPs are likely from bots, so they can usually be
        # blocked. However, consider your use case - if this is an API endpoint
        # then hosting IPs might be legitimate.
        # https://docs.arcjet.com/blueprints/vpn-proxy-detection

        return JSONResponse(
            {"error": "Denied from hosting IP"},
            status_code=403,
        )

    ip = decision.ip_details
    if ip and ip.city and ip.country_name:
        print(f"Request from {ip.city}, {ip.country_name}")

    return {"message": "Hello world", "decision": decision.to_dict()}

Flask

A synchronous example using Flask with the sync client.

# main.py
from flask import Flask, request, jsonify
import os

from arcjet import (
  arcjet_sync,
  shield,
  detect_bot,
  token_bucket,
  validate_email,
  is_spoofed_bot,
  Mode,
  BotCategory,
  EmailType,
)

app = Flask(__name__)

arcjet_key = os.getenv("ARCJET_KEY")
if not arcjet_key:
    raise RuntimeError(
        "ARCJET_KEY is required. Get one at https://app.arcjet.com")

aj = arcjet_sync(
    key=arcjet_key,
    rules=[
        shield(mode=Mode.LIVE),
        detect_bot(
            mode=Mode.LIVE, allow=[BotCategory.SEARCH_ENGINE, "OPENAI_CRAWLER_SEARCH"]
        ),
        token_bucket(mode=Mode.LIVE, refill_rate=5, interval=10, capacity=10),
        validate_email(
            mode=Mode.LIVE,
            deny=[EmailType.DISPOSABLE, EmailType.INVALID, EmailType.NO_MX_RECORDS],
        ),
    ],
)

@app.route("/")
def hello():
    # requested is optional; only relevant for token bucket rules (default: 1)
    # email is only required if validate_email() is configured
    decision = aj.protect(request, requested=1, email="example@arcjet.com")

    if decision.is_denied():
        status = 429 if decision.reason_v2.type == "RATE_LIMIT" else 403
        return jsonify(error="Denied", reason=decision.reason_v2), status

    if decision.ip.is_hosting():
        return jsonify(error="Hosting IP blocked"), 403

    ip = decision.ip_details
    if ip and ip.city and ip.country_name:
        print(f"Request from {ip.city}, {ip.country_name}")

    if any(is_spoofed_bot(r) for r in decision.results):
        return jsonify(error="Spoofed bot"), 403

    return jsonify(message="Hello world", decision=decision.to_dict())

if __name__ == "__main__":
    app.run(debug=True)

Identifying bots

Arcjet allows you to configure a list of bots to allow or deny. To construct the list, you can specify individual bots and/or use categories to allow or deny all bots in a category.

If you specify a list of bots to allow, then all other bots will be denied. An empty allow list means all bots are denied. The opposite applies for deny lists, if you specify bots to deny then all other bots will be allowed.

Bot categories

Bots can be configured by category and/or by specific bot name. For example, to allow all search engines and OpenAI crawler bots, but deny all other bots:

from arcjet import arcjet, Mode, BotCategory, detect_bot

aj = arcjet(
    key=arcjet_key,
    rules=[
        detect_bot(
            mode=Mode.LIVE, 
            allow=[
                BotCategory.SEARCH_ENGINE, 
                "OPENAI_CRAWLER_SEARCH",
            ]
        ),
    ],
)

The identifiers on the bot list are generated from a collection of known bots which includes details of their owner and any variations.

If a bot is detected but cannot be identified as a known bot, it will be labeled as UNKNOWN_BOT. This is separate from the CATEGORY:UNKNOWN category, which is for bots that cannot be classified into any category but can still be identified as a specific bot. You can see a list of these named, but unclassified bots in the bot list.

Detections returned as UNKNOWN_BOT happen if the bot is new or hides itself. It’s a bot with no name. Arcjet uses various techniques to detect these bots, including analyzing request patterns and tracking IP addresses.

Custom characteristics

Each client is tracked by IP address by default. To customize client fingerprinting you can configure custom characteristics:

# main.py
from flask import Flask, request, jsonify
import os
import logging

from arcjet import (
    arcjet_sync,
    shield,
    detect_bot,
    token_bucket,
    Mode,
    BotCategory,
    EmailType,
)

app = Flask(__name__)

arcjet_key = os.getenv("ARCJET_KEY")
if not arcjet_key:
    raise RuntimeError(
        "ARCJET_KEY is required. Get one at https://app.arcjet.com")

aj = arcjet_sync(
    key=arcjet_key,  # Get your key from https://app.arcjet.com
    rules=[
        # Shield protects your app from common attacks e.g. SQL injection
        shield(mode=Mode.LIVE),
        # Create a bot detection rule
        detect_bot(
            mode=Mode.LIVE,
            allow=[
                BotCategory.SEARCH_ENGINE,  # Google, Bing, etc
                # Uncomment to allow these other common bot categories
                # See the full list at https://docs.arcjet.com/bot-protection/identifying-bots
                # BotCategory.MONITOR, # Uptime monitoring services
                # BotCategory.PREVIEW, # Link previews e.g. Slack, Discord
            ],
        ),
        # Create a token bucket rate limit. Other algorithms are supported
        token_bucket(
            # Pass a custom characteristics to track requests
            characteristics=["userId"],
            mode=Mode.LIVE,
            refill_rate=5,  # Refill 5 tokens per interval
            interval=10,  # Refill every 10 seconds
            capacity=10,  # Bucket capacity of 10 tokens
        ),
    ],
)


@app.route("/")
def hello():
     # Replace with actual user ID from the user session
    userId = "your_user_id"

    # Call protect() to evaluate the request against the rules
    decision = aj.protect(
        request,
        # Deduct 5 tokens from the bucket
        requested=5,
        # Identify the user to track the limit against
        characteristics={"userId": userId},
    )

    # Handle denied requests
    if decision.is_denied():
        status = 429 if decision.reason_v2.type == "RATE_LIMIT" else 403
        return jsonify(error="Denied", reason=decision.reason_v2), status

    # Check IP metadata (VPNs, hosting, geolocation, etc)
    if decision.ip.is_hosting():
        # Requests from hosting IPs are likely from bots, so they can usually be
        # blocked. However, consider your use case - if this is an API endpoint
        # then hosting IPs might be legitimate.
        # https://docs.arcjet.com/blueprints/vpn-proxy-detection

        return jsonify(error="Hosting IP blocked"), 403

    ip = decision.ip_details
    if ip and ip.city and ip.country_name:
        app.logger.info("Request from %s, %s", ip.city, ip.country_name)

    return jsonify(message="Hello world", decision=decision.to_dict())


if __name__ == "__main__":
    app.run(debug=True)

Sensitive information detection

Detect and optionally block sensitive information (PII) in request content such as email addresses, phone numbers, IP addresses, and credit card numbers.

from arcjet import arcjet, detect_sensitive_info, SensitiveInfoEntityType, Mode

aj = arcjet(
    key=arcjet_key,
    rules=[
        detect_sensitive_info(
            mode=Mode.LIVE,
            deny=[
                SensitiveInfoEntityType.EMAIL,
                SensitiveInfoEntityType.CREDIT_CARD_NUMBER,
            ],
        ),
    ],
)

# Pass the content to scan on each protect() call
decision = await aj.protect(request, sensitive_info_value="User input to scan")

You can also provide a custom detect callback to supplement the built-in detectors:

def my_detect(tokens: list[str]) -> list[str | None]:
    return ["CUSTOM_PII" if "secret" in t.lower() else None for t in tokens]

rules = [
    detect_sensitive_info(
        mode=Mode.LIVE,
        deny=["CUSTOM_PII"],
        detect=my_detect,
    ),
]

Request filters

Filter requests using expression-based rules against request properties (IP, headers, path, method, etc.).

from arcjet import arcjet, filter_request, Mode

aj = arcjet(
    key=arcjet_key,
    rules=[
        filter_request(
            mode=Mode.LIVE,
            deny=['ip.src == "1.2.3.4"'],
        ),
    ],
)

You can also pass local fields for use in filter expressions:

decision = await aj.protect(
    request,
    filter_local={"userId": current_user.id},
)

These are then available as local.userId in expressions.

Trusted proxies

When your app runs behind one or more reverse proxies or a load balancer, pass their IPs or CIDR ranges so Arcjet can correctly resolve the real client IP from X-Forwarded-For and similar headers.

from arcjet import arcjet

aj = arcjet(
    key=arcjet_key,
    rules=[...],
    proxies=["10.0.0.0/8", "192.168.0.1"],
)

Only globally routable IPs are accepted for client identification; private, loopback, link-local, and addresses matching proxies are ignored during IP extraction.

Overriding automatic IP detection

By default, Arcjet automatically detects the client IP from the request using X-Forwarded-For. We recommend leaving this enabled in most cases and configuring trusted proxies as needed (see above).

[!WARNING] Disabling automatic IP detection is not recommended unless you have written your own IP detection logic that considers the correct parsing of IP headers. Accepting client IPs from untrusted sources can expose your application to IP spoofing attacks. See the MDN documentation for further guidance.

To disable automatic IP detection (for example, if you have your own custom logic to extract the client IP), set disable_automatic_ip_detection=True when creating the Arcjet client, and then provide the ip_src parameter to .protect(...).

from arcjet import arcjet
aj = arcjet(
    key=arcjet_key,
    rules=[...],
    disable_automatic_ip_detection=True,
)

# ...

decision = await aj.protect(
    request,
    ip_src="8.8.8.8",  # provide the client IP here
)

Logging

Enable debug logging to troubleshoot issues with Arcjet integration.

import logging
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s %(message)s"
)

Arcjet logging can be controlled directly by setting the ARCJET_LOG_LEVEL environment variable e.g. export ARCJET_LOG_LEVEL=debug.

Accessing decision details

Arcjet returns per-rule rule_results and a top-level decision.reason_v2. To make a simple decision about allowing or denying a request you can check if decision.is_denied():. For more details, inspect the rule results.

Getting bot detection details

To find out which bots were detected (if any):

if decision.reason_v2.type == "BOT":
   denied = decision.reason_v2.denied

   print("Denied bots:", ", ".join(denied) if denied else "none")

Verified vs spoofed bots

Bots claiming to be certain well-known bots (e.g. Googlebot) are verified by checking their IP address against the known IP ranges for that bot. If a bot claims to be a certain bot but fails verification, it is labeled as a spoofed bot. You can check for spoofed bots with the is_spoofed_bot() helper:

from arcjet import is_spoofed_bot

# ... after calling aj.protect() and getting a decision

if any(is_spoofed_bot(r) for r in decision.results):
    return jsonify(error="Spoofed bot"), 403

The decision reason will also indicate whether a bot was verified or spoofed:

if decision.reason_v2.type == "BOT":
    print("Spoofed:", decision.reason_v2.spoofed)
    print("Verified:", decision.reason_v2.verified)

    # Example policy decisions
    if decision.reason_v2.spoofed:
        return jsonify(error="Spoofed bot"), 403

    if decision.reason_v2.verified:
        print("Known bot verified by Arcjet")

If you want to inspect bot results at the per-rule level, iterate through decision.results and read reason.spoofed / reason.verified on BOT reasons:

for result in decision.results:
    reason = result.reason
    if reason.type != "BOT":
        continue

    if reason.spoofed:
        return jsonify(error="Spoofed bot"), 403

    if reason.verified:
        print("Verified bot traffic")

IP analysis

Arcjet returns an ip_details object as part of a Decision from aj.protect(...). There are several ways to inspect that data:

  1. high-level helpers for common reputation checks.
  2. typed fields via Decision.ip_details.
  3. raw fields via Decision.to_dict().

IP analysis helpers

For common checks (is this IP a VPN, proxy, Tor exit node, or a hosting provider) use the IpInfo helpers exposed at decision.ip:

# high level booleans
if decision.ip.is_hosting():
    # likely a cloud / hosting provider — often suspicious for bots
    do_block()

if decision.ip.is_vpn() or decision.ip.is_proxy() or decision.ip.is_tor():
    # treat according to your policy
    do_something_else()

IP analysis fields

Use decision.ip_details for typed field access:

ip = decision.ip_details
if ip:
    lat = ip.latitude
    lon = ip.longitude
    asn = ip.asn
    asn_name = ip.asn_name
    service = ip.service  # str | None
else:
    # ip details not present

Decision.to_dict() also includes ip_details as a raw dictionary shape.

These are the available fields, although not all may be present for every IP:

  • Geolocation: latitude, longitude, accuracy_radius, timezone, postal_code, city, region, country, country_name, continent, continent_name
  • ASN / network: asn, asn_name, asn_domain, asn_type (isp, hosting, business, education), asn_country
  • Reputation / service: service name (when present) and boolean indicators for is_vpn, is_proxy, is_tor, is_hosting, is_relay

Support

This repository follows the Arcjet Support Policy.

Security

This repository follows the Arcjet Security Policy.

Compatibility

Packages maintained in this repository are compatible with Python 3.10 and above.

License

Licensed under the Apache License, Version 2.0.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arcjet-0.6.0.tar.gz (1.0 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

arcjet-0.6.0-py3-none-any.whl (1.0 MB view details)

Uploaded Python 3

File details

Details for the file arcjet-0.6.0.tar.gz.

File metadata

  • Download URL: arcjet-0.6.0.tar.gz
  • Upload date:
  • Size: 1.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.1 {"installer":{"name":"uv","version":"0.11.1","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for arcjet-0.6.0.tar.gz
Algorithm Hash digest
SHA256 a39c012baa5f7a02cf0615287b7da8a908e2bdb8c5e2ba47c5bbc58161275811
MD5 36b590938e5d46c38d9ea33552e94dab
BLAKE2b-256 01f81fbcffb82af1c4f8228f87bc5662ae17f7117471653eeff2a2a021832b59

See more details on using hashes here.

File details

Details for the file arcjet-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: arcjet-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.1 {"installer":{"name":"uv","version":"0.11.1","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for arcjet-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5dd107f38afb05a4bb8373218561dc73ab138fdd3f7e3dd39e6d277b83480060
MD5 420690da48f3270d8ec29d8b31cde32c
BLAKE2b-256 34ce80cf2d810a4b6c6d2451fe9fc7064f2b6b7390f1b2af7acaa387539aea6f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page