A library to scrape text from websites and web pages

Project description

A library to parse and scrape text from websites

This library provides three ways to scrape text from a website:

  • The first method scrapes all text from a single webpage.
  • The second method scrapes text from an entire website, sitemaps included.
  • The third method scrapes text from a specified list of pages. You can also specify a target element (by CSS selector) to scrape only the intended parts of a webpage.
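
The idea behind the first method, combined with an optional target element, can be illustrated with a minimal standard-library sketch. This is not the text_thief API, which is not documented on this page: `TextExtractor` and `extract_text` are hypothetical names, and the selector support is an assumption limited to a bare tag name, `#id`, or `.class` rather than full CSS selectors.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose contents are not visible text.
SKIPPED_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect visible text, optionally only inside
    elements matching a simple selector (tag, #id, or .class)."""

    def __init__(self, selector=None):
        super().__init__()
        self.selector = selector
        self.chunks = []
        self._target_depth = 0  # > 0 while inside a matching element
        self._skip_depth = 0    # > 0 while inside <script> or <style>

    def _matches(self, tag, attrs):
        attrs = dict(attrs)
        if self.selector.startswith("#"):
            return attrs.get("id") == self.selector[1:]
        if self.selector.startswith("."):
            return self.selector[1:] in (attrs.get("class") or "").split()
        return tag == self.selector

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        if self.selector is not None:
            if self._target_depth > 0 or self._matches(tag, attrs):
                self._target_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        if self._target_depth > 0:
            self._target_depth -= 1

    def handle_data(self, data):
        in_scope = self.selector is None or self._target_depth > 0
        text = data.strip()
        if in_scope and self._skip_depth == 0 and text:
            self.chunks.append(text)


def extract_text(html, selector=None):
    """Return the visible text of `html`, one text node per line."""
    parser = TextExtractor(selector)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling `extract_text(page_html)` returns all visible text, while `extract_text(page_html, "#main")` keeps only the text inside the element with `id="main"`. The depth counting assumes well-formed markup with explicit closing tags; a real scraper would more likely rely on a full HTML parser with proper CSS selector support.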

Download files


Source Distribution

text_thief-0.0.1.tar.gz (12.5 MB)

Uploaded Source

Built Distribution


text_thief-0.0.1-py3-none-any.whl (15.6 kB)

Uploaded Python 3

File details

Details for the file text_thief-0.0.1.tar.gz.

File metadata

  • Download URL: text_thief-0.0.1.tar.gz
  • Upload date:
  • Size: 12.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for text_thief-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       59c6e3d3a8a95cefa1b54727ec0e0662c688e6450545174f1ce90c978495fb14
MD5          b8011e19707e3b8de71e22b6c97a8f44
BLAKE2b-256  9469c0252e70c5357f2fd8d6afaa5588a9db4defa2c19c0af4bca382291b1a65
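
A downloaded file can be checked against its published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("text_thief-0.0.1.tar.gz", "59c6e3d3a8a95cefa1b54727ec0e0662c688e6450545174f1ce90c978495fb14")` should return `True` for an intact copy of the source distribution saved under that name in the current directory.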


File details

Details for the file text_thief-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: text_thief-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 15.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for text_thief-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       7e3e46f495f56f84bd0ee88f97d66b13e4f9f2b88a1be61f54f67d5222451e7b
MD5          3be806d0946fdf4679fb3d4dd8f1bec9
BLAKE2b-256  dfbac0d66879cf650620dde5fc4e57a23e7e6f5e3cd1d13cbc60b0f92b6aeed4
