Powerful and flexible search engine for BeautifulSoup
Powerful and flexible web scraping Search Engine
About
With many web scraping libraries available, each with its own interface and conventions, soupsavvy provides a consistent and easy way to build selection workflows.
With soupsavvy, developers can focus on data extraction workflows instead of wrestling with library-specific quirks and inconsistencies. Eliminate complexity and introduce scalability and maintainability to your web scraping projects.
Key Features
soupsavvy introduces the concept of a Selector: a declarative, consistent, and reusable search procedure based on the following principles:
- Decoupling: Selection logic is abstracted away from DOM node and traversal implementations.
- Framework-Agnostic: Operates consistently with any supported library.
- Flexible & Extensible: Lightweight, reusable components used to build complex selection workflows.
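To make these principles concrete, here is a minimal sketch of the idea using only the Python standard library. It is not soupsavvy's API: `Node`, `ETreeNode`, `Selector`, `by_name`, and `by_attr` are hypothetical names invented for this illustration. It only shows how selection logic can depend on a small node interface rather than on a specific parser, and how small selectors can be combined into larger ones.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Protocol


class Node(Protocol):
    """Hypothetical minimal node interface that selectors depend on."""

    @property
    def name(self) -> str: ...

    def attr(self, key: str) -> Optional[str]: ...

    def descendants(self) -> Iterator["Node"]: ...


@dataclass(frozen=True)
class ETreeNode:
    """Adapter exposing an ElementTree element through the Node interface."""

    element: ET.Element

    @property
    def name(self) -> str:
        return self.element.tag

    def attr(self, key: str) -> Optional[str]:
        return self.element.get(key)

    def descendants(self) -> Iterator["ETreeNode"]:
        # Element.iter() yields the element itself first, so skip it.
        for child in self.element.iter():
            if child is not self.element:
                yield ETreeNode(child)


@dataclass(frozen=True)
class Selector:
    """Declarative, reusable search: a predicate applied to every descendant."""

    predicate: Callable[[Node], bool]

    def find_all(self, root: Node) -> List[Node]:
        return [node for node in root.descendants() if self.predicate(node)]

    def __and__(self, other: "Selector") -> "Selector":
        # Combine two selectors: a node must satisfy both.
        return Selector(lambda n: self.predicate(n) and other.predicate(n))


def by_name(name: str) -> Selector:
    return Selector(lambda n: n.name == name)


def by_attr(key: str, value: str) -> Selector:
    return Selector(lambda n: n.attr(key) == value)


# The selector is defined once, independently of any document or parser.
price = by_name("span") & by_attr("class", "price")

root = ETreeNode(ET.fromstring(
    '<div><span class="price">10</span><span class="name">Tea</span>'
    '<p><span class="price">12</span></p></div>'
))
prices = [node.element.text for node in price.find_all(root)]
print(prices)  # ['10', '12']
```

Supporting another parsing library in this sketch would mean writing one more adapter like `ETreeNode`; the `price` selector itself would not change. For soupsavvy's actual selector classes and method names, refer to the project documentation.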
Installation
soupsavvy is published on PyPI and can be installed via pip:
pip install soupsavvy
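One way to confirm that the installation succeeded is to ask Python for the installed distribution's version. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption made for this example and is not part of soupsavvy.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("soupsavvy")
if found is None:
    print("soupsavvy is not installed in this environment")
else:
    print(f"soupsavvy {found} is installed")
```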
Documentation
Full documentation is available on the project's documentation site.
Demos
For more information about the package, its concepts, and its usage, read the Demos section of the documentation. It is a step-by-step guide to the package's most important features.
Contributing
If you'd like to contribute to soupsavvy, check out the GitHub repository and submit pull requests to one of the development branches. Feedback, bug reports, and feature requests are all welcome. If in doubt, follow the Contribution Guidelines.
License
soupsavvy's license allows both personal and commercial use. See the LICENSE file for more information.
Happy scraping! ✨
File details
Details for the file soupsavvy-1.0.0.tar.gz.
File metadata
- Download URL: soupsavvy-1.0.0.tar.gz
- Upload date:
- Size: 56.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.0.1 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b9fa22abcdd8f8964ddf49acb51c72dda62795d51d9e429a5551b5abaf697c67 |
| MD5 | ee6eb8a64008739adf1afbf1daa77161 |
| BLAKE2b-256 | b2e84364dc8bf9c7f98c83358ef0600c239a3d4b380e77330e83581e8c0591c0 |
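A downloaded file can be checked against the published SHA256 digest before it is installed. The following is a minimal sketch using the standard library's `hashlib`; the helper names `sha256_of` and `matches_expected` and the local file path in the usage comment are assumptions for illustration, while the expected digest is the one listed above for soupsavvy-1.0.0.tar.gz.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for soupsavvy-1.0.0.tar.gz.
EXPECTED_SHA256 = "b9fa22abcdd8f8964ddf49acb51c72dda62795d51d9e429a5551b5abaf697c67"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


# Usage, assuming the archive was saved to the current directory:
# print(matches_expected("soupsavvy-1.0.0.tar.gz"))
```

pip can perform a comparable check itself when a requirements file pins hashes and is installed with `--require-hashes`.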
Provenance
The following attestation bundles were made for soupsavvy-1.0.0.tar.gz:
Publisher: production_release.yml on sewcio543/soupsavvy
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: soupsavvy-1.0.0.tar.gz
  - Subject digest: b9fa22abcdd8f8964ddf49acb51c72dda62795d51d9e429a5551b5abaf697c67
- Sigstore transparency entry: 157382677
- Sigstore integration time:
- Permalink: sewcio543/soupsavvy@dc3bf608b0034787b66851c3a6b2b5345fbe4576
- Branch / Tag: refs/heads/main
- Owner: https://github.com/sewcio543
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: production_release.yml@dc3bf608b0034787b66851c3a6b2b5345fbe4576
- Trigger Event: workflow_dispatch
File details
Details for the file soupsavvy-1.0.0-py3-none-any.whl.
File metadata
- Download URL: soupsavvy-1.0.0-py3-none-any.whl
- Upload date:
- Size: 76.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.0.1 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd94e0a60545df69adf4d7275406c0b5d943664c322d2d994f1c60878b974f76 |
| MD5 | 7d9f47a7f82b0ca22f64c1617ec9b7db |
| BLAKE2b-256 | 4ea1658f0aae9009cb51587a80f003e6f3cfd5f61c1caf000c86b6314f20e77a |
Provenance
The following attestation bundles were made for soupsavvy-1.0.0-py3-none-any.whl:
Publisher: production_release.yml on sewcio543/soupsavvy
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: soupsavvy-1.0.0-py3-none-any.whl
  - Subject digest: bd94e0a60545df69adf4d7275406c0b5d943664c322d2d994f1c60878b974f76
- Sigstore transparency entry: 157382678
- Sigstore integration time:
- Permalink: sewcio543/soupsavvy@dc3bf608b0034787b66851c3a6b2b5345fbe4576
- Branch / Tag: refs/heads/main
- Owner: https://github.com/sewcio543
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: production_release.yml@dc3bf608b0034787b66851c3a6b2b5345fbe4576
- Trigger Event: workflow_dispatch