osn-requests simplifies web scraping and requests in Python. It provides easy-to-use functions for fetching HTML, finding web elements using XPath, managing proxies, and generating random user agents.

Project description

osn-requests: Simplified Web Scraping and Requests

osn-requests is a lightweight Python library designed to simplify common web scraping and request tasks. It builds upon popular libraries like requests, lxml, and BeautifulSoup, providing a cleaner and more convenient interface for fetching and extracting data from websites.

Key Features:

  • Easy HTML Parsing: Quickly parse HTML content using get_html, which returns an lxml etree object ready for XPath queries.
  • Simplified Element Finding: Locate specific web elements using find_web_element and find_web_elements, abstracting away the complexities of XPath handling.
  • Integrated Proxy Support: Seamlessly integrate proxies into your requests using the proxies parameter in get_html and get_json.
  • Dynamic User-Agent Generation: Obtain random user agents using get_random_user_agent to reduce the chance of being blocked by websites. This function can generate roughly 5 × 10^777 distinct user-agent strings.
  • Free Proxy List Retrieval: Fetch a list of free proxies with get_free_proxies, filtering by protocol if desired.
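
The parsing and element-finding features above come down to evaluating XPath expressions against a parsed tree. The sketch below illustrates that idea only; it is not the library's implementation. It uses the standard library's xml.etree.ElementTree as a stand-in so that it runs without osn-requests or lxml installed. ElementTree supports just a subset of XPath and needs well-formed markup, whereas lxml handles full XPath 1.0 and real-world HTML. The helper names find_first and find_all and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed sample markup (assumption: stands in for a fetched page).
SAMPLE = """<html>
  <head><title>Example Domain</title></head>
  <body>
    <a href="/one">One</a>
    <a href="/two">Two</a>
  </body>
</html>"""


def find_all(root, path):
    """Return every element matching an ElementTree-style XPath subset."""
    return root.findall(path)


def find_first(root, path):
    """Return the first matching element, or None when nothing matches."""
    return root.find(path)


root = ET.fromstring(SAMPLE)
title = find_first(root, ".//title")
links = [a.get("href") for a in find_all(root, ".//a")]
missing = find_first(root, ".//h1")
```

Returning None for a missing element, rather than raising, is what makes the `is not None` check in typical calling code meaningful.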

Installation:

  • With pip:

    pip install osn-requests
    
  • With git:

    pip install git+https://github.com/oddshellnick/osn-requests.git
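
After either install command, one way to confirm that the package is present is to ask the standard library for its installed version. This is a minimal sketch; the helper name installed_version is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    print(installed_version("osn-requests"))
```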
    

Example Usage:

from osn_requests import find_web_element, get_req, get_html
from osn_requests.user_agents import generate_random_user_agent
from osn_requests.proxies import get_free_proxies

# Generate a random User-Agent string to send with each request.
user_agent = generate_random_user_agent()
print(f"Using User-Agent: {user_agent}")

# Fetch the list of free proxies, filtered to the HTTP protocol.
http_proxies = get_free_proxies("http")
print(f"Found {len(http_proxies)} HTTP proxies")

# Download the page through the proxies and parse it for XPath queries.
html = get_html("https://www.example.com", headers={"User-Agent": user_agent}, proxies=http_proxies)

# Look up the <title> element; skip printing if it is absent.
title_element = find_web_element(html, "//title")
if title_element is not None:
    print(f"Page Title: {title_element.text}")

# Request a JSON endpoint and decode the response body.
json_data = get_req("https://api.example.com/data", headers={"User-Agent": user_agent}).json()
print(f"JSON Data: {json_data}")
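
The example passes the result of get_free_proxies straight to get_html, and the exact shape that parameter expects is not spelled out here. For comparison, the underlying requests library takes a mapping from URL scheme to proxy URL. The sketch below builds such a mapping and picks one proxy at random. It assumes proxies arrive as "host:port" strings, which is an assumption and not something this page confirms. The helper names to_requests_proxies and pick_proxy are hypothetical, and the addresses are placeholders.

```python
import random


def to_requests_proxies(address, scheme="http"):
    """Build a requests-style proxies mapping from a "host:port" string."""
    # Leave the address alone if it already carries a scheme.
    url = address if "://" in address else f"{scheme}://{address}"
    return {"http": url, "https": url}


def pick_proxy(addresses, rng=random):
    """Pick one proxy mapping at random, or None when the list is empty."""
    if not addresses:
        return None
    return to_requests_proxies(rng.choice(addresses))


# Documentation-range addresses (RFC 5737), used purely as placeholders.
candidates = ["192.0.2.10:8080", "192.0.2.11:3128"]
chosen = pick_proxy(candidates, random.Random(0))
```

Free proxies tend to be short-lived, so selecting a fresh one per request and handling the empty-list case explicitly keeps a scraper from failing on the first dead entry.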

Future Notes

osn-requests is continually being developed and improved. Future plans include adding support for more advanced scraping techniques, expanding proxy management features, and incorporating additional utilities for handling various web data formats. Contributions and feature requests are welcome!

Download files

Source Distribution

osn_requests-1.0.0.tar.gz (13.0 kB)

Uploaded Source

Built Distribution

osn_requests-1.0.0-py3-none-any.whl (12.4 kB)

Uploaded Python 3

File details

Details for the file osn_requests-1.0.0.tar.gz.

File metadata

  • Download URL: osn_requests-1.0.0.tar.gz
  • Upload date:
  • Size: 13.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for osn_requests-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       9d4ac9118650cf2a270cbfc09d819627f59570ad0b41ab17061cf1ed67c10a00
MD5          a1a6b3aa2b812860a035bbea4aad915f
BLAKE2b-256  a6125b0bcc0e9492172e9f9f3dbcebaf919cf26cd2fc2afc543fbd9a14853123
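
A downloaded file can be checked against a published digest by hashing it locally and comparing the result. The following is a minimal sketch using only the standard library; the helper names sha256_of_file and matches_sha256 are hypothetical, and the file is assumed to be already on disk.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

For the source distribution, the call would be matches_sha256("osn_requests-1.0.0.tar.gz", "9d4ac9118650cf2a270cbfc09d819627f59570ad0b41ab17061cf1ed67c10a00"), using the SHA256 value listed for that file.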


Provenance

The following attestation bundles were made for osn_requests-1.0.0.tar.gz:

Publisher: python-publish.yml on oddshellnick/osn-requests

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file osn_requests-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: osn_requests-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 12.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for osn_requests-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       841fb63f2ae8589d894f3234abde99bc471a5fef3129503ad938f6ec74c67322
MD5          44e111e4fc273d35d0511fbf2ed7f982
BLAKE2b-256  2ef1157c7afe16ad63dc63608da6bddeb45822925f5f1d892eec9ebc7d95235e


Provenance

The following attestation bundles were made for osn_requests-1.0.0-py3-none-any.whl:

Publisher: python-publish.yml on oddshellnick/osn-requests

