A lightweight anti-crawler app for Django.

Django-Anti-Crawler

A lightweight anti-crawler Django app that blocks IP addresses which send too many hits to your application. You decide how many hits are allowed per IP address within a defined time window.

Installation

  1. Install via pip::

    pip install django-anti-crawler
    

Quick start

  1. Add django_anti_crawler to your INSTALLED_APPS setting like this::

    INSTALLED_APPS = [
        ...
        'django_anti_crawler',
    ]
    
  2. Add 'DjangoAntiCrawlerMiddleware' to your middleware classes in settings.py file::

    MIDDLEWARE = [
        'django_anti_crawler.middlewares.DjangoAntiCrawlerMiddleware',
        'django.middleware.security.SecurityMiddleware',
        'django.contrib.sessions.middleware.SessionMiddleware',
        'django.middleware.common.CommonMiddleware',
        'django.middleware.csrf.CsrfViewMiddleware',
        'django.contrib.auth.middleware.AuthenticationMiddleware',
        'django.contrib.messages.middleware.MessageMiddleware',
        'django.middleware.clickjacking.XFrameOptionsMiddleware',
    ]
    

    DjangoAntiCrawlerMiddleware should be the first middleware, so that the IP check runs before any other request processing.

  3. If your settings.py file does not define cache settings, add the following lines to it::

    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
            'LOCATION': 'cache_table',
        }
    }
    

    Make sure your database settings are configured, then run the following command to create cache_table in the database::

    python manage.py createcachetable
    

    You may choose whatever cache backend you want to use.

  4. (optional) Set MAX_ALLOWED_HITS_PER_IP and IP_HITS_TIMEOUT in settings.py file::

    MAX_ALLOWED_HITS_PER_IP = 2000  # max hits allowed from one IP per IP_HITS_TIMEOUT window. Default 2000.
    IP_HITS_TIMEOUT = 60  # timeout in seconds for an IP's entry in the cache. Default 60.

  5. (optional) Set ANTI_CRAWLER_WHITELIST_BOTS and IP_HITS_TIMEOUT in settings.py file to whitelist specific bots or IPs.

    To test on a local system, set these values very low, e.g. IP_HITS_TIMEOUT = 30 and MAX_ALLOWED_HITS_PER_IP = 2. Restart the server and send requests in quick succession. After two requests you will start receiving 403 errors. If these settings are not defined in the settings file, the default values are used.
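
To make the interaction between MAX_ALLOWED_HITS_PER_IP and IP_HITS_TIMEOUT concrete, here is a minimal in-memory sketch of a per-IP hit counter. It is an illustration only and not the package's actual code: the ``HitCounter`` class, its ``allow`` method, and the fixed-window behaviour are assumptions, and the real middleware stores its counts in the Django cache rather than in a dict.

```python
import time


class HitCounter:
    """Hypothetical fixed-window hit counter keyed by IP address."""

    def __init__(self, max_hits=2000, timeout=60, clock=time.monotonic):
        self.max_hits = max_hits  # plays the role of MAX_ALLOWED_HITS_PER_IP
        self.timeout = timeout    # plays the role of IP_HITS_TIMEOUT (seconds)
        self.clock = clock        # injectable so the window can be tested
        self._hits = {}           # ip -> (hit count, window expiry time)

    def allow(self, ip):
        """Record one hit from ip; return False once the limit is exceeded."""
        now = self.clock()
        count, expiry = self._hits.get(ip, (0, now + self.timeout))
        if now >= expiry:
            # The window has expired, so start counting from zero again.
            count, expiry = 0, now + self.timeout
        count += 1
        self._hits[ip] = (count, expiry)
        return count <= self.max_hits
```

In this sketch a ``False`` return corresponds to the point where the middleware would answer with a 403. Each IP is counted separately, so one noisy client does not affect the others.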

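The local test described above can be scripted instead of done by hand. The following is a rough sketch using only the standard library; the helper names ``status_of`` and ``probe`` are made up for this example, and the URL assumes the development server is running on Django's default address.

```python
import urllib.error
import urllib.request


def status_of(url):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return exc.code


def probe(url, attempts, fetch=status_of):
    """Send `attempts` requests in a row and collect the status codes."""
    return [fetch(url) for _ in range(attempts)]


if __name__ == "__main__":
    # Assumed address of a locally running development server.
    print(probe("http://127.0.0.1:8000/", 5))
```

With the low test values set, the later entries in the printed list should turn into 403 once the limit is passed.
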
Authors

Licensing

The project is MIT licensed.
