Python library wrapping Jina AI Reader: convert URLs, search results, and files into LLM-friendly Markdown.

jina-curl

A small Python library wrapping the Jina AI Reader APIs to turn URLs, web searches, and local files into LLM-friendly Markdown (or text / HTML / JSON). It adds consistent error handling, automatic retries (honouring Retry-After), quota monitoring, and layered configuration on top of the raw HTTP API.

Sync (JinaReader) and async (AsyncJinaReader) clients expose the same surface.

Install

uv add jina-curl        # or: pip install jina-curl

Requires Python ≥ 3.10. An API key is optional — calls fall back to anonymous (rate-limited) access — but set one for higher limits (see Configuration).

Quick start

from jina_curl import JinaReader

with JinaReader() as r:
    resp = r.read("https://example.com")   # r.jina.ai  — URL → Markdown
    print(resp.content)

    results = r.search("jina ai reader")   # s.jina.ai  — web search
    facts = r.ground("The Eiffel Tower is in Paris")  # g.jina.ai — fact-check

print(resp.title, resp.url, resp.usage.tokens_used)

Every call returns a ReaderResponse (.content, .url, .title, .format, .usage, .timestamp, .to_dict()).

Async

import asyncio
from jina_curl import AsyncJinaReader

async def main() -> None:
    async with AsyncJinaReader() as r:
        results = await asyncio.gather(
            r.read("https://example.com"),
            r.read("https://example.org"),
        )

asyncio.run(main())

Converting local content

Besides fetching URLs, both clients can POST local content to r.jina.ai for conversion:

with JinaReader() as r:
    # Raw HTML string (url is optional; helps resolve relative links)
    r.read_html("<h1>Hi</h1><p>...</p>", url="https://example.com")

    # Local file — dispatched by extension
    r.read_file("page.html")     # .html / .htm  → sent as HTML text
    r.read_file("report.pdf")    # .pdf          → base64
    r.read_file("deck.pptx")     # Office docs   → base64 (converted server-side)

read_file supports .html, .htm, .pdf, and MS Office documents (.docx, .doc, .xlsx, .xls, .pptx, .ppt); other extensions raise ValueError.
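The extension rules above can be sketched as a small standalone function. This is an illustration only, not jina-curl's internal code: the name payload_kind, the "html" / "base64" labels, and the case-insensitive matching are all assumptions made for the example.

```python
from pathlib import Path

# Extension groups copied from the list above. The labels returned below
# ("html" / "base64") are illustrative names, not jina-curl identifiers.
HTML_EXTS = {".html", ".htm"}
BINARY_EXTS = {".pdf", ".docx", ".doc", ".xlsx", ".xls", ".pptx", ".ppt"}


def payload_kind(path: str) -> str:
    """Return how a file would be sent: 'html' (as text) or 'base64'."""
    # Assumption: extensions are matched case-insensitively.
    ext = Path(path).suffix.lower()
    if ext in HTML_EXTS:
        return "html"
    if ext in BINARY_EXTS:
        return "base64"
    raise ValueError(f"unsupported file extension: {ext or '(none)'}")
```

Checking the extension before reading the file keeps the failure cheap: an unsupported path is rejected without any I/O or network traffic.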

Output formats & options

from jina_curl import JinaReader, OutputFormat, ReaderOptions

with JinaReader() as r:
    r.read("https://example.com", fmt=OutputFormat.JSON)
    r.read(
        "https://example.com",
        options=ReaderOptions(no_cache=True, with_links_summary=True),
    )

OutputFormat: MARKDOWN (default), TEXT, HTML, SCREENSHOT, PAGESHOT, JSON. ReaderOptions maps to Jina's x-* request headers (caching, selectors, link/image summaries, engine, timeout, max tokens, locale, JSON schema, …); unset fields are omitted.
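To make the "unset fields are omitted" behaviour concrete, here is a minimal sketch of an options object that turns only its set fields into request headers. SketchOptions is a hypothetical stand-in, not the real ReaderOptions, and the rule that derives header names from field names (no_cache to X-No-Cache) is an assumption about how such a mapping could work.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SketchOptions:
    """Hypothetical stand-in for ReaderOptions with three sample fields."""

    no_cache: Optional[bool] = None
    with_links_summary: Optional[bool] = None
    timeout: Optional[int] = None

    def to_headers(self) -> dict[str, str]:
        headers: dict[str, str] = {}
        for f in fields(self):
            value = getattr(self, f.name)
            if value is None:
                continue  # unset field: emit no header at all
            # Assumed naming rule: no_cache -> X-No-Cache
            name = "X-" + "-".join(p.capitalize() for p in f.name.split("_"))
            # Booleans are rendered as lowercase "true"/"false"
            text = str(value).lower() if isinstance(value, bool) else str(value)
            headers[name] = text
        return headers


headers = SketchOptions(no_cache=True, with_links_summary=True).to_headers()
print(headers)
```

Using None as the "unset" marker lets an explicit False still be sent as a header, which a plain truthiness check would drop.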

Configuration

API key resolution (highest priority first):

  1. JinaReader(api_key=...) argument
  2. JINA_API_KEY environment variable
  3. ~/.config/jina-curl/config.toml
  4. anonymous (fallback)
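The resolution order above can be sketched as a plain function. This is not the library's code: resolve_api_key is a hypothetical helper, the config file is passed in as an already-parsed mapping, and the api_key entry name inside config.toml is an assumption (the source does not show the file's layout).

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
    config: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first key found, or None for anonymous access."""
    env = os.environ if environ is None else environ
    if explicit:                      # 1. constructor argument
        return explicit
    if env.get("JINA_API_KEY"):       # 2. environment variable
        return env["JINA_API_KEY"]
    if config and config.get("api_key"):  # 3. parsed config file (assumed key name)
        return config["api_key"]
    return None                       # 4. anonymous fallback
```

Passing the environment and config in as arguments keeps the function easy to test; on Python 3.11+ the config mapping could come from tomllib.load on the file listed above.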

Errors

All errors are subclasses of JinaError: AuthError (401/403), RateLimitError (429, carries retry_after / quota_remaining), ApiError (other 4xx/5xx), and ConfigError. Retries cover RateLimitError, 5xx ApiError, and transport errors; 4xx responses other than 429 are never retried.
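The retry rules can be summarised in two small functions. This is a sketch under stated assumptions, not jina-curl's implementation: should_retry and retry_delay are hypothetical names, a status of None stands in for a transport error, and the exponential backoff used when no Retry-After value is present (base 0.5 s, capped at 30 s) is invented for illustration.

```python
from typing import Optional


def should_retry(status: Optional[int]) -> bool:
    """Decide whether a failed call is retried.

    status is the HTTP status code, or None for a transport error
    (no HTTP response was received).
    """
    if status is None:
        return True               # transport errors are retried
    if status == 429:
        return True               # rate limited
    return 500 <= status <= 599   # server errors; other 4xx are not retried


def retry_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return max(0.0, retry_after)   # honour the server's Retry-After
    # Assumed fallback: capped exponential backoff
    return min(cap, base * 2 ** attempt)
```

A server-supplied Retry-After takes precedence over the computed backoff, which matches the "honouring Retry-After" behaviour described at the top of this page.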

License

MIT
