Abstractions of web interactions

Web Traversal Library

The Web Traversal Library (WTL) is a Python library for abstracting web interactions on top of a base execution layer such as Selenium.

Installation

Run pip install webtraversallibrary. That's it.

Usage example

Glossary

You will find more information in the API docs. As a high-level overview, common terms in the documentation are:

  • Workflow: The main orchestrating class, which runs the main "event loop". The specification of a particular workflow is sometimes also called a "schema".

  • View: A static snapshot of the website currently open in a tab, with metadata associated with the page and its elements, possibly augmented by certain ML classifiers.

  • Policy: WTL is based on principles of reinforcement learning, where a policy is a function from the current state (here, the snapshots of the currently open tabs) to a set of actions.

  • Classifier: Classifiers, along with preload_callbacks and postload_callbacks, are arbitrary code that is executed on each workflow iteration. A classifier takes a set of elements and returns either a subset or a mapping from elements to numeric scores.

  • Config: A helper class containing string, numeric, or boolean values for a number of configurations related to WTL. Some are pregrouped under umbrella names, such as desktop (run as a desktop browser; the default is mobile emulation), but all values can be set individually. See the documentation for the Config class for more information.
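
To make the Classifier and Policy terms more concrete, here is a minimal sketch in plain Python. It deliberately does not use WTL's own classes: Element, visible_filter, size_score, and click_largest_policy are hypothetical names invented for this illustration, and actions are represented as plain strings. The real signatures are in the API docs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a page element in a view (not WTL's class)."""
    tag: str
    width: int
    height: int
    visible: bool = True


def visible_filter(elements: List[Element]) -> List[Element]:
    """Classifier, first form: return a subset of the input elements."""
    return [e for e in elements if e.visible]


def size_score(elements: List[Element]) -> Dict[Element, float]:
    """Classifier, second form: map each element to a numeric score (its area)."""
    return {e: float(e.width * e.height) for e in elements}


def click_largest_policy(elements: List[Element]) -> List[str]:
    """Policy: map the current state to a list of actions (plain strings here)."""
    candidates = visible_filter(elements)
    if not candidates:
        return []
    scores = size_score(candidates)
    best = max(candidates, key=lambda e: scores[e])
    return [f"click:{best.tag}"]


# A toy "snapshot" of three elements, one of them hidden.
snapshot = [
    Element("a", 100, 20),
    Element("button", 120, 40),
    Element("div", 800, 600, visible=False),
]
actions = click_largest_policy(snapshot)
```

In this toy snapshot the hidden div is filtered out before scoring, so the policy picks the largest visible element.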

Getting started

See the documentation at webtraversallibrary.readthedocs.io!

Also watch "Machine Learning to Auto-Navigate Websites" given at PyCon SE 2020 for an introduction and examples.

General architecture

The flow in a workflow is as follows:

  1. Initialize the workflow with given config
  2. Navigate to given URLs
  3. Snapshot the pages
  4. Run all classifiers
  5. Check whether the goal is fulfilled; if so, exit
  6. Call policy with the current view(s)
  7. Execute the returned action(s)
  8. Go to step 3
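
The steps above can be sketched as a small loop in plain Python. This is an illustration of the control flow only, under simplifying assumptions (a single tab, an iteration cap taken from a hypothetical max_steps setting, and pages modelled as a dictionary of links). None of the names below (run_loop, PAGES, mark_first_link, and so on) come from WTL.

```python
from typing import Callable, Dict, List

View = Dict[str, object]  # hypothetical snapshot type, not WTL's View class


def run_loop(
    config: Dict[str, int],
    url: str,
    snapshot: Callable[[str], View],
    classifiers: List[Callable[[View], None]],
    goal: Callable[[View], bool],
    policy: Callable[[View], List[str]],
    execute: Callable[[str, str], str],
) -> List[str]:
    """Run the flow for a single tab and return the list of visited URLs."""
    max_steps = config.get("max_steps", 10)  # step 1: initialize from config
    current = url                            # step 2: navigate to the URL
    visited = [current]
    for _ in range(max_steps):               # step 8: each pass restarts at step 3
        view = snapshot(current)             # step 3: snapshot the page
        for classify in classifiers:         # step 4: run all classifiers
            classify(view)
        if goal(view):                       # step 5: exit if the goal is met
            break
        for action in policy(view):          # step 6: call the policy
            current = execute(current, action)  # step 7: execute the action
            visited.append(current)
    return visited


# Toy site: each page lists the links it contains.
PAGES = {"/": ["/a"], "/a": ["/b"], "/b": []}


def snapshot(url: str) -> View:
    return {"url": url, "links": list(PAGES[url])}


def mark_first_link(view: View) -> None:
    # Classifier-like step: annotate the view with the first link, if any.
    links = view["links"]
    view["target"] = links[0] if links else None


def goal(view: View) -> bool:
    return view["url"] == "/b"


def policy(view: View) -> List[str]:
    return [f"goto:{view['target']}"] if view["target"] else []


def execute(current: str, action: str) -> str:
    # The only action in this toy model is "goto:<url>".
    return action.split(":", 1)[1]


visited = run_loop({"max_steps": 5}, "/", snapshot, [mark_first_link], goal, policy, execute)
```

The toy run follows the first link on each page until the goal page "/b" is reached, so visited ends up holding the three pages in order.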

For more examples and usage, please run make docs and look at docs/build/html/index.html.

Development setup

All development requirements are listed in requirements.txt; install them with pip. Helper commands are available to create and update a virtual environment: make env-create and make env-update.

To lint the JavaScript files (not required unless you're editing them), you need jshint, which is available from npm.

How to contribute

See our guide on contributing.

Release History

See our changelog.

License

Copyright © 2020 Klarna Bank AB

For license details, see the LICENSE file in the root of this project.
