
Project description

ainfo

gather structured information from any website - ready for LLMs

Architecture

The project separates concerns into distinct modules:

  • fetching – obtain raw data from a source
  • parsing – transform raw data into a structured form
  • extraction – pull relevant information from the parsed data
  • output – handle presentation of the extracted results
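The four stages above can be sketched as a simple pipeline. This is an illustrative sketch only: the function names and the stub bodies are assumptions for demonstration, not ainfo's actual internals.

```python
import re

def fetch(url: str) -> str:
    # fetching: obtain raw data from a source
    # (stubbed with static HTML so the sketch runs offline)
    return "<html><body>Contact: info@example.com</body></html>"

def parse(raw: str) -> str:
    # parsing: transform raw data into a structured form
    # (here simply stripping tags down to text)
    return re.sub(r"<[^>]+>", " ", raw)

def extract(text: str) -> dict:
    # extraction: pull relevant information from the parsed data
    emails = re.findall(r"[\w.+-]+@[\w.-]+\w", text)
    return {"emails": emails}

def output(results: dict) -> None:
    # output: handle presentation of the extracted results
    for key, values in results.items():
        print(f"{key}: {', '.join(values)}")

output(extract(parse(fetch("https://example.com"))))
```

Because each stage only depends on the previous stage's output, any one of them can be swapped out (for example, replacing the fetcher with a headless browser) without touching the others.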

Usage

Install the project and run the CLI against a URL:

pip install -e .
ainfo run https://example.com

The command fetches the page, parses its content, and prints any emails, phone numbers, or addresses that were detected.
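The kind of pattern matching behind this detection can be illustrated with standard-library regular expressions. The patterns below are deliberately simplified stand-ins, not ainfo's actual ones, and address detection is omitted:

```python
import re

TEXT = "Reach us at +1 (555) 123-4567 or sales@example.com."

# Simplified email pattern: local part, "@", domain ending in a word character.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w.-]+\w")
# Simplified phone pattern: optional "+", then digits with common separators.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

emails = EMAIL_RE.findall(TEXT)
phones = [p.strip() for p in PHONE_RE.findall(TEXT)]
print(emails, phones)
```

Real-world contact extraction needs considerably more robust patterns (international phone formats, obfuscated emails), which is one reason the project also offers LLM-based extraction.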

To delegate information extraction or summarization to an LLM, provide an OpenRouter API key via the OPENROUTER_API_KEY environment variable and pass --use-llm or --summarize:

export OPENROUTER_API_KEY=your_key
ainfo run https://example.com --use-llm --summarize

If the target site relies on client-side JavaScript, enable rendering with a headless browser:

ainfo run https://example.com --render-js

To crawl multiple pages starting from a URL and extract contact details from each one:

ainfo crawl https://example.com --depth 2

The crawler visits pages breadth-first up to the specified depth and prints results for every page encountered.

Both commands accept --render-js to execute JavaScript before scraping; this uses Playwright, so installing its browser drivers may require running playwright install.

Environment configuration

Copy .env.example to .env and populate it with your OpenRouter credentials to enable LLM-powered features.
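Assuming .env.example follows the conventional KEY=value dotenv format, the populated .env would contain the variable named earlier (the value shown is a placeholder):

```
OPENROUTER_API_KEY=your_key_here
```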

Limitations

  • Crawling retrieves each page twice: once for discovery and once for extraction, which may impact performance on large sites.
  • Extraction focuses on basic contact details; more extractors can be added.

Project details


Download files

Download the file for your platform.

Source Distribution

ainfo-0.1.0.tar.gz (10.6 kB)


Built Distribution


ainfo-0.1.0-py3-none-any.whl (14.4 kB)


File details

Details for the file ainfo-0.1.0.tar.gz.

File metadata

  • Download URL: ainfo-0.1.0.tar.gz
  • Upload date:
  • Size: 10.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ainfo-0.1.0.tar.gz:

  • SHA256: 8c9d014157b6dcb6e5abfc60e009918f9a6479c4be4abe818a14e3a38a1cc4d9
  • MD5: 8f8d4e5d50e007435328e391554dbf78
  • BLAKE2b-256: 4b7fc75268323b697913d31813a7d99ae050d5109d9bbb756e5256fa34abafda
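Verifying a downloaded archive against a published digest can be done with the standard library. The sketch below streams the file in chunks and is demonstrated on a throwaway temporary file; for the real check, point it at the downloaded ainfo-0.1.0.tar.gz and compare against the SHA256 value listed above:

```python
import hashlib
import tempfile

def sha256_of(path: str, chunk_size: int = 8192) -> str:
    # Stream the file in chunks so large archives never load fully into memory.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example archive bytes")

print(sha256_of(tmp.name))
```

If the computed hex digest matches the published one, the download was not corrupted or tampered with in transit.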


Provenance

The following attestation bundles were made for ainfo-0.1.0.tar.gz:

Publisher: python-publish.yml on MisterXY89/ainfo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ainfo-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ainfo-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 14.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ainfo-0.1.0-py3-none-any.whl:

  • SHA256: 23ab5ad69b7cf1e4e72a0ebc13d7490dde53d6f840fd47a774b32a81a64c0b7e
  • MD5: 4d818915d90a25374890030b02f09e90
  • BLAKE2b-256: 82deb98bdb8a5e010da42e913f4c3af92c0165a35fad0d0e4ef29beb2177760e


Provenance

The following attestation bundles were made for ainfo-0.1.0-py3-none-any.whl:

Publisher: python-publish.yml on MisterXY89/ainfo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
