A link crawler and permission testing tool for websites

Project description

The Link Crab

A simple CLI tool that crawls your website to catch broken links and checks user permissions for specific pages on your website.

Work mode: Link gathering:

In this mode, you provide a starting URL, and the Link Crab crawls the starting page and every page that is reachable through links from it and is in the same domain as the starting page. The program exports the gathered links to a txt file, then exercises them, gathering the response time and status code of each, and exports these results to a csv file.
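
As a rough illustration of this mode (not the Link Crab's own code), the crawl-and-exercise idea could be sketched in Python with the requests and BeautifulSoup libraries, using the sample starting URL from the config section below:

    # Sketch of the link-gathering idea: crawl same-domain links from a starting
    # URL, then GET each one and record its status code and response time.
    # Illustration only; not the Link Crab's implementation.
    from urllib.parse import urljoin, urlparse
    import requests
    from bs4 import BeautifulSoup

    def gather_links(start_url):
        domain = urlparse(start_url).netloc
        to_visit, seen = [start_url], set()
        while to_visit:
            url = to_visit.pop()
            if url in seen:
                continue
            seen.add(url)
            try:
                resp = requests.get(url, timeout=10)
            except requests.RequestException:
                continue
            for a in BeautifulSoup(resp.text, "html.parser").find_all("a", href=True):
                link = urljoin(url, a["href"])
                if urlparse(link).netloc == domain:
                    to_visit.append(link)
        return seen

    def exercise(links):
        # GET every gathered link and print its status code and response time
        for link in sorted(links):
            resp = requests.get(link, timeout=10)
            print(link, resp.status_code, resp.elapsed.total_seconds())

    if __name__ == "__main__":
        exercise(gather_links("http://127.0.0.1:5000"))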

Work mode: Link access permission checking:

In this mode, you provide a csv file with the links to check and whether each of those links should be accessible (see the sample csv below). The Link Crab checks every link in the list, determines whether it is actually accessible, and then asserts the expected accessibility against the actual one. A link is considered accessible if the HTTP response for a GET request on the link has a status code under 400 and, after all redirects, the final URL equals the requested URL. (Most websites either give you a 404 for inaccessible pages or redirect to the sign-in page.) Following the redirects may be unnecessary, but I considered it safer.
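
A permissions csv could look like this (the link and should-access column names follow the path_to_link_perms description below; the URLs and the True/False notation are only placeholders):

    link,should-access
    http://127.0.0.1:5000/,True
    http://127.0.0.1:5000/member-page,True
    http://127.0.0.1:5000/admin-page,False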

Session management:

In both work modes, you can provide login information. The Link Crab opens a Chrome browser with the Selenium WebDriver, logs in with the provided credentials, and uses the resulting authenticated session for the subsequent checks.

All reports are saved in the reports folder, under a folder named after the domain. The configuration is done through a yaml config file.

Installation

soon...

Usage:

Simply use the command link-crab/link-crab.py path/to/your/config.yaml in the Link Crab's directory. All the configuration is done in the config file, whose keys are explained below. If you want to use the sample Flask mock app for testing, provide the -t flag.
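
For example (the config path is a placeholder, and invoking the script through the python interpreter is an assumption):

    python link-crab/link-crab.py path/to/your/config.yaml
    python link-crab/link-crab.py -t path/to/your/config.yaml    # also use the sample Flask mock app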

Usable config keys:

starting_url

starting_url: http://127.0.0.1:5000

Gathers the reachable links on the starting_url's page and all of its subpages. After collecting all the links, the link exerciser loads every in-domain URL with a GET request and measures the status code, response time, response URL after all redirects, and accessibility based on the status code and response URL.
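
For illustration, a row of the exercised-links report could contain values like these (the exact column names and layout are assumptions; the measured fields are the ones listed above):

    url,status_code,response_time,response_url,accessible
    http://127.0.0.1:5000/member-page,200,0.034,http://127.0.0.1:5000/member-page,True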

path_to_link_perms

path_to_link_perms: testapp_member_access.csv

Tests the accessibility of the provided links. The csv should have a link and a should-access column; the tool asserts that each link's actual accessibility equals the provided should-access value. A link is accessible if the response status code is < 400 and, after all redirects, the response URL equals the requested URL (some frameworks give a 404 for inaccessible pages or redirect to the sign-in page).
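
The rule can be sketched in Python like this (illustration only, not the Link Crab's own code; the exact URL comparison the tool performs may differ, e.g. regarding trailing slashes):

    # A link counts as accessible if the GET response has a status code < 400
    # and the final URL after redirects equals the requested URL.
    import requests

    def is_accessible(url):
        resp = requests.get(url, allow_redirects=True, timeout=10)
        return resp.status_code < 400 and resp.url == url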

User

user:
    email: member@example.com
    email_locator_id: email
    login_url: http://127.0.0.1:5000/user/sign-in
    password: Password1
    password_locator_id: password

Logs in with the help of the Selenium WebDriver (chromedriver). You need to provide the URL of the login form and the IDs of the email (or username) and password fields.
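
With the sample user config above, the login step could be sketched like this (illustration only; how the form is submitted, here by pressing Enter in the password field, is an assumption):

    # Log in through the login form using Selenium (Chrome), then read the
    # authenticated session's cookies for the later link checks.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Chrome()
    driver.get("http://127.0.0.1:5000/user/sign-in")                     # login_url
    driver.find_element(By.ID, "email").send_keys("member@example.com")  # email_locator_id, email
    password_field = driver.find_element(By.ID, "password")              # password_locator_id
    password_field.send_keys("Password1")                                # password
    password_field.send_keys(Keys.RETURN)                                # submit the form (assumption)
    cookies = driver.get_cookies()                                       # session cookies for the link checks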
