Universal content extraction library with tiered fetching strategies and anti-bot bypass

These details have not been verified by PyPI

Project description

OmniFetch Python Library

Python implementation of OmniFetch - a universal content extraction library.

Features

Universal Extraction: Fetches content from any URL, handling standard sites, SPAs, and paywalls.
Tiered System:
1. Light Fetch: Fast, standard HTTP request.
2. Headless Browser: Handles dynamic, JS-heavy sites (requires a deployed Netlify endpoint; see Headless Browser Support).
3. Search Fallback: Finds alternative sources for paywalled or blocked content.
Smart Parsing: Converts HTML to clean Markdown or JSON.
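
The tier order above amounts to a fallback chain: try the cheapest strategy first and move on only when it fails. The sketch below illustrates that pattern in isolation. It is not OmniFetch's internal code; fetch_with_tiers and the three stand-in fetchers are hypothetical names that simulate a blocked request and an unavailable headless endpoint.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher returns content, or None when it cannot retrieve anything.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_tiers(url: str, tiers: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each tier in order; return (tier_name, content) from the first success."""
    for name, fetcher in tiers:
        try:
            content = fetcher(url)
        except Exception:
            # Any failure inside a tier means "move on to the next one".
            content = None
        if content:
            return name, content
    raise RuntimeError(f"all tiers failed for {url}")


# Hypothetical stand-ins for the three tiers (no network access involved).
def light_fetch(url: str) -> Optional[str]:
    return None  # simulate a blocked plain HTTP request


def headless_fetch(url: str) -> Optional[str]:
    raise TimeoutError("headless endpoint unavailable")  # simulate a Tier 2 failure


def search_fallback(url: str) -> Optional[str]:
    return f"content found via search for {url}"


tier_used, content = fetch_with_tiers(
    "https://example.com",
    [("light", light_fetch), ("headless", headless_fetch), ("search", search_fallback)],
)
print(tier_used)  # search
```

Keeping each tier behind the same callable shape means a tier can be skipped by leaving it out of the list, which mirrors what skip_headless and skip_search do at the API level.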

Installation

pip install omnifetch-lib

Quick Start

from omnifetch import omni_fetch

# Text extraction (Markdown)
result = omni_fetch('https://example.com', mode='TEXT')
print(result.content)

# JSON extraction (Structured Data)
json_result = omni_fetch('https://example.com', mode='JSON')
print(json_result.content['title'])

Configuration

def omni_fetch(
    url: str,
    mode: str = 'TEXT',           # 'JSON' for structured, 'TEXT' for markdown
    timeout: int = 30,            # Request timeout in seconds
    netlify_endpoint: str = None, # Headless browser endpoint (Tier 2)
    headers: dict = None,         # Custom headers
    skip_headless: bool = False,  # Skip Tier 2
    skip_search: bool = False,    # Skip Tier 3
    force_title: str = None       # Override title for search fallback
) -> OmniFetchResult
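
When these options come from user input or a config file, it can help to validate them before calling omni_fetch. The helper below is a minimal sketch under that assumption. build_fetch_kwargs is a hypothetical name, not part of the library, and whether omni_fetch performs similar checks itself is not stated here.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("TEXT", "JSON")  # the two modes named in the signature above


def build_fetch_kwargs(
    mode: str = "TEXT",
    timeout: int = 30,
    headers: Optional[Dict[str, str]] = None,
    skip_search: bool = False,
) -> Dict[str, Any]:
    """Validate options and return keyword arguments for omni_fetch."""
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    kwargs: Dict[str, Any] = {"mode": mode, "timeout": timeout, "skip_search": skip_search}
    if headers:
        kwargs["headers"] = dict(headers)  # copy so the caller's dict is not shared
    return kwargs


kwargs = build_fetch_kwargs(mode="json", timeout=10, headers={"User-Agent": "my-app/1.0"})
print(kwargs)
# With the library installed: omni_fetch("https://example.com", **kwargs)
```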

Advanced Usage

Handling Blocked Domains (e.g., X/Twitter)

Some domains block direct scraping. OmniFetch handles this automatically by falling back to search (Tier 3). For opaque URLs, where the address carries no readable hint of the content, pass force_title to improve the search results.

result = omni_fetch(
    'https://x.com/someuser/status/12345',
    mode='TEXT',
    force_title='Specific Tweet Content Title' # Helps find the content via search
)
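
To see why an opaque URL benefits from force_title, consider what a search step could infer from the URL alone. The sketch below is purely illustrative. title_hint is a hypothetical function, and how OmniFetch actually builds its search query is not documented here.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def title_hint(url: str, force_title: Optional[str] = None) -> str:
    """Return a search phrase: the explicit title if given, else words from the URL path."""
    if force_title and force_title.strip():
        return force_title.strip()
    path = urlparse(url).path
    # Keep alphabetic words only; numeric IDs carry no searchable meaning.
    words = [w for w in re.split(r"[^A-Za-z]+", path) if len(w) > 2]
    return " ".join(words)


# A descriptive slug yields a usable phrase.
print(title_hint("https://example.com/blog/how-tiered-fetching-works"))
# An opaque status URL yields almost nothing useful.
print(title_hint("https://x.com/someuser/status/12345"))
# An explicit title replaces the guesswork.
print(title_hint("https://x.com/someuser/status/12345", force_title="Specific Tweet Content Title"))
```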

Headless Browser Support

To enable Tier 2 (Headless Browser) for dynamic sites, deploy the provided Netlify function and pass its URL as netlify_endpoint.

result = omni_fetch(
    'https://dynamic-site.com',
    netlify_endpoint='https://your-site.netlify.app/.netlify/functions/headless-fetch'
)
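
Because the endpoint is specific to each deployment, one option is to read it from the environment and skip Tier 2 when it is missing. The sketch below assumes an environment variable named OMNIFETCH_NETLIFY_ENDPOINT. That name and the headless_options helper are hypothetical; the library itself does not define them.

```python
import os
from typing import Dict, Mapping, Optional
from urllib.parse import urlparse

ENV_VAR = "OMNIFETCH_NETLIFY_ENDPOINT"  # hypothetical name, not defined by the library


def headless_options(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Return omni_fetch keyword arguments that either enable or skip Tier 2."""
    env = os.environ if env is None else env
    endpoint = env.get(ENV_VAR, "").strip()
    parsed = urlparse(endpoint)
    if parsed.scheme == "https" and parsed.netloc:
        return {"netlify_endpoint": endpoint}
    # No usable endpoint configured: skip the headless tier explicitly.
    return {"skip_headless": True}


options = headless_options(
    {ENV_VAR: "https://your-site.netlify.app/.netlify/functions/headless-fetch"}
)
print(options)
# With the library installed: omni_fetch("https://dynamic-site.com", **options)
```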

Development Installation

pip install -e .

Running Tests

pip install -e ".[dev]"
pytest

See the main README.md for full documentation.

Release history

1.3.0: Mar 9, 2026
1.2.1: Feb 24, 2026
1.2.0: Feb 24, 2026
1.1.0: Feb 8, 2026 (this version)

Download files

Source Distribution

omnifetch_lib-1.1.0.tar.gz (15.6 kB)

Uploaded Feb 8, 2026 Source

Built Distribution

omnifetch_lib-1.1.0-py3-none-any.whl (17.9 kB)

Uploaded Feb 8, 2026 Python 3

File details

Details for the file omnifetch_lib-1.1.0.tar.gz.

File metadata

Download URL: omnifetch_lib-1.1.0.tar.gz
Upload date: Feb 8, 2026
Size: 15.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for omnifetch_lib-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`5810b33a96c434abdb538c8887dec9dce46c9e9a51269b7daa39fc79be628f32`
MD5	`a8527986601f0027a4adce525956376c`
BLAKE2b-256	`8c3a9cf13246efbf4f4abeaac6408014fe9f2924e49e3aefd9ed670f67b485d8`
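
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below assumes the source distribution was saved in the current directory under its original file name. sha256_of and matches_published_hash are illustrative helper names.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with the published value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# SHA256 listed above for omnifetch_lib-1.1.0.tar.gz
EXPECTED_SDIST_SHA256 = "5810b33a96c434abdb538c8887dec9dce46c9e9a51269b7daa39fc79be628f32"

sdist = Path("omnifetch_lib-1.1.0.tar.gz")  # assumed download location
if sdist.exists():
    print("hash ok" if matches_published_hash(sdist, EXPECTED_SDIST_SHA256) else "hash mismatch")
else:
    print(f"{sdist} not found; download it first")
```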

File details

Details for the file omnifetch_lib-1.1.0-py3-none-any.whl.

File metadata

Download URL: omnifetch_lib-1.1.0-py3-none-any.whl
Upload date: Feb 8, 2026
Size: 17.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for omnifetch_lib-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f3f31f21fc49190b51a3cecb5df424aad2a2cba50c5d047e4ebe81c2f659e723`
MD5	`822d79df45ae85ad82e85225cf9f2ded`
BLAKE2b-256	`674a397bbdb4e9842e786524156d833fbbed8ccdb52f28c5d5ca0d3475f8b516`

omnifetch-lib 1.1.0
