Library of web-related functions

Project description

Overview

This is a Python library of web-related functions, such as:

  • remove comments or tags from HTML snippets

  • extract base url from HTML snippets

  • translate entities in HTML strings

  • convert raw HTTP headers to dicts and vice-versa

  • construct HTTP auth header

  • convert HTML pages to unicode

  • sanitize urls (like browsers do)

  • extract arguments from urls

Requirements

Python 2.7 or Python 3.3+

Install

pip install w3lib

Documentation

See http://w3lib.readthedocs.org/

License

The w3lib library is licensed under the BSD license.

Download files

Source Distribution

w3lib-1.14.3.tar.gz (42.3 kB)

Uploaded Source

Built Distribution

w3lib-1.14.3-py2.py3-none-any.whl (16.2 kB)

Uploaded Python 2, Python 3

File details

Details for the file w3lib-1.14.3.tar.gz.

File metadata

  • Download URL: w3lib-1.14.3.tar.gz
  • Upload date:
  • Size: 42.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for w3lib-1.14.3.tar.gz
Algorithm Hash digest
SHA256 5bf68984ef300b3a7c05a7dee3f022a5094dd4bffd2a1a8b6373bc885ad37713
MD5 da7743a338eea7d81dd69992da1e03bd
BLAKE2b-256 fe76a276e5baa09d2474b079222fb2da76c0d2cd2989684bb371ab0b6b9c2fc7

File details

Details for the file w3lib-1.14.3-py2.py3-none-any.whl.

File hashes

Hashes for w3lib-1.14.3-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 8123d690c6906543b3455c1ee36a4ad51a9be274f44e47c9d5d5bcc5b380a914
MD5 c31ac92aa0f912f5ccc9a118a1daf1dc
BLAKE2b-256 bf4d4556b2d6902125609646ed0ac99dc32c195f6d31ef39823570b8a70f953c
