Deep Web Research Tool with Structural Positional Search - AI-powered synthesis using Ordinal Distance algorithm

Project description

ACHEM - Deep Web Research Tool

ACHEM (Arabic: آشم) is a deep web research tool that searches 100+ sources, scrapes the full text of each article, filters out irrelevant content, and generates AI-powered conclusions.

Features

  • 100+ Sources: Retrieves up to 100 DuckDuckGo results per query, plus Wikipedia sources unless --no-wikipedia is set
  • Full Content Extraction: Scrapes full article text using Trafilatura
  • Smart Content Filtering: Removes ads/boilerplate, keeps only relevant sentences
  • AI Conclusions: Generates synthesized final verdicts with probability predictions
  • Multiple AI Providers: OpenRouter (free), Groq, Gemini, Ollama
  • Markdown Export: Saves complete reports with all sources to ~/Documents/ACHEM/
  • Multi-language: Supports English, French, and Arabic
  • Rate Limit Retry: Automatically retries requests that fail with HTTP 429 (Too Many Requests)
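
The rate limit retry feature can be pictured with a minimal sketch. This is not ACHEM's actual code: the function name `call_with_retry`, the attempt count, and the exponential backoff schedule are all assumptions chosen for illustration. The request is passed in as a callable so the sketch stays independent of any HTTP library.

```python
import random
import time
from typing import Callable, Tuple


def call_with_retry(
    request: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,       # assumed value, not taken from ACHEM
    base_delay: float = 1.0,     # assumed value, not taken from ACHEM
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `request` again while it returns HTTP 429, up to `max_attempts` calls."""
    status, body = request()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Exponential backoff (1s, 2s, 4s, ...) plus a little random jitter.
        sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
        status, body = request()
    return status, body
```

If every attempt returns 429, the last response is handed back to the caller, which can then decide whether to fall back to another provider.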

Installation

Prerequisites

  • Python 3.10 or higher
  • uv package manager (recommended)

Quick Install

git clone https://github.com/sarok-exe/achem.git
cd achem
uv venv .venv && source .venv/bin/activate
uv pip install -e .

API Configuration

Create a config file at either ~/.ACHEM/api.env or ~/Documents/ACHEM/api.env:

# OpenRouter (free, recommended)
OPENROUTER_API_KEY=your_openrouter_key_here
OPENROUTER_MODEL=google/gemma-4-31b-it:free

# Ollama (local AI)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2
OLLAMA_PRIMARY=false

Get an OpenRouter API key at https://openrouter.ai/settings
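
To show how a file in this format can be read, here is a small sketch that parses `KEY=VALUE` lines and checks the two documented locations. The helper names `parse_env` and `load_config` are hypothetical, and the lookup order (home directory first, then Documents) is an assumption; ACHEM may resolve the file differently.

```python
from pathlib import Path
from typing import Dict, Iterable

# The two locations named above; the order they are tried in is an assumption.
CONFIG_CANDIDATES = (
    Path.home() / ".ACHEM" / "api.env",
    Path.home() / "Documents" / "ACHEM" / "api.env",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def load_config(candidates: Iterable[Path] = CONFIG_CANDIDATES) -> Dict[str, str]:
    """Return settings from the first candidate file that exists, else {}."""
    for path in candidates:
        if path.is_file():
            return parse_env(path.read_text(encoding="utf-8"))
    return {}
```

Values are returned as strings, so a flag such as `OLLAMA_PRIMARY=false` still needs an explicit comparison like `value.lower() == "true"`.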

Usage

Command Line

achem "your research query" --ddg-limit 100

Options

--ddg-limit N      Number of DuckDuckGo results (default: 100)
--mode ai          Use AI for conclusions (default)
--mode local       Use local TF-IDF (no API needed)
--lang en/fr/ar    Response language
--no-wikipedia     Skip Wikipedia sources
--no-cache         Skip cache
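
For readers who want to see how these flags fit together, the sketch below mirrors them with Python's argparse. It illustrates the documented interface only and is not ACHEM's real parser: `build_parser` is a hypothetical name, and the `en` default for `--lang` is an assumption, since the options list does not state one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented achem options."""
    parser = argparse.ArgumentParser(prog="achem")
    parser.add_argument("query", help="research query")
    parser.add_argument("--ddg-limit", type=int, default=100, metavar="N",
                        help="number of DuckDuckGo results")
    parser.add_argument("--mode", choices=["ai", "local"], default="ai",
                        help="ai: AI conclusions, local: TF-IDF without an API")
    # The default language is an assumption; the docs do not specify one.
    parser.add_argument("--lang", choices=["en", "fr", "ar"], default="en",
                        help="response language")
    parser.add_argument("--no-wikipedia", action="store_true",
                        help="skip Wikipedia sources")
    parser.add_argument("--no-cache", action="store_true", help="skip cache")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```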

How It Works

┌─────────────────────────────────────────────────────┐
│ 1. SEARCH (100+ sources)                            │
│    • DuckDuckGo web search                          │
│    • Prioritizes relevant content                   │
├─────────────────────────────────────────────────────┤
│ 2. SCRAPE (Full article text)                       │
│    • Extracts full content from URLs                │
│    • Uses Trafilatura for clean text                │
│    • Scrapes up to 100 pages concurrently           │
├─────────────────────────────────────────────────────┤
│ 3. FILTER (Relevant content only)                   │
│    • Removes boilerplate and ads                    │
│    • Keeps sentences matching keywords              │
│    • Deduplicates similar content                   │
├─────────────────────────────────────────────────────┤
│ 4. AI CONCLUSION                                    │
│    • Analyzes all content                           │
│    • Generates final prediction                     │
│    • Includes probability percentages               │
│    • Provides key reasons                           │
└─────────────────────────────────────────────────────┘
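
Step 3 of the pipeline above can be sketched with the standard library alone. The function names, the regex-based sentence splitter, and the 0.9 similarity threshold are assumptions made for illustration; ACHEM's own filter may tokenize and deduplicate differently.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def filter_relevant(text: str, keywords: Iterable[str],
                    similarity: float = 0.9) -> List[str]:
    """Keep sentences that contain a keyword, dropping near-duplicates."""
    wanted = {k.lower() for k in keywords}
    kept: List[str] = []
    for sentence in split_sentences(text):
        words = set(re.findall(r"\w+", sentence.lower()))
        if not words & wanted:
            continue  # no keyword match: treat as boilerplate
        if any(SequenceMatcher(None, sentence.lower(), seen.lower()).ratio()
               >= similarity for seen in kept):
            continue  # too similar to a sentence that was already kept
        kept.append(sentence)
    return kept
```

Comparing every candidate against every kept sentence is quadratic, which is acceptable for a sketch but would likely need hashing or shingling when 100 full articles are combined.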

Output

Reports saved to ~/Documents/ACHEM/ include:

  • AI Conclusion: Synthesized final prediction
  • All Articles: Full extracted content from each source
  • Keywords: Identified topics
  • Extracted Web Content: Combined filtered content
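
As an illustration of the report contents listed above, the following sketch assembles the four parts into Markdown and writes them to a directory. The section order, heading levels, default file name, and the `render_report` and `save_report` helpers are all hypothetical; the actual files written by ACHEM may be laid out differently.

```python
from pathlib import Path
from typing import Dict, List


def render_report(query: str, conclusion: str, keywords: List[str],
                  articles: List[Dict[str, str]], combined: str) -> str:
    """Assemble a Markdown report with the four documented parts."""
    lines = [f"# {query}", "",
             "## AI Conclusion", "", conclusion, "",
             "## Keywords", "", ", ".join(keywords), "",
             "## All Articles", ""]
    for article in articles:
        lines += [f"### {article['title']}", "",
                  f"Source: {article['url']}", "",
                  article["text"], ""]
    lines += ["## Extracted Web Content", "", combined, ""]
    return "\n".join(lines)


def save_report(markdown: str, directory: Path, name: str = "report.md") -> Path:
    """Write the report, creating the target directory if it is missing."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / name
    path.write_text(markdown, encoding="utf-8")
    return path
```

Passing `Path.home() / "Documents" / "ACHEM"` as `directory` would target the location the tool documents for its reports.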

License

MIT License - see LICENSE file

Download files

Source Distribution

achem-1.1.0.tar.gz (48.7 kB)

Uploaded Source

Built Distribution

achem-1.1.0-py3-none-any.whl (59.1 kB)

Uploaded Python 3

File details

Details for the file achem-1.1.0.tar.gz.

File metadata

  • Download URL: achem-1.1.0.tar.gz
  • Size: 48.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for achem-1.1.0.tar.gz
Algorithm    Hash digest
SHA256       5d2ac4630f07adb0ee0b99b2d5b3dd6e1df4663bc8e59bd67107e8e6b0e959fb
MD5          bab5c8618553049db4d1db087a352946
BLAKE2b-256  30be3a183283871182a2d9cf58feaf87290214a32e4dddc19c743c442cd022b8
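
A downloaded archive can be checked against the published SHA256 digest with a few lines of Python. This is a generic sketch using the standard hashlib module; the helper names `sha256_of` and `verify` are illustrative, and the file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path
from typing import Union

# SHA256 digest published above for achem-1.1.0.tar.gz.
EXPECTED_SHA256 = "5d2ac4630f07adb0ee0b99b2d5b3dd6e1df4663bc8e59bd67107e8e6b0e959fb"


def sha256_of(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Union[str, Path], expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected digest."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("achem-1.1.0.tar.gz")` in the download directory returns `False` if the archive was altered or truncated.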

Provenance

The following attestation bundles were made for achem-1.1.0.tar.gz:

Publisher: release.yml on sarok-exe/achem

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file achem-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: achem-1.1.0-py3-none-any.whl
  • Size: 59.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for achem-1.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       e76027bbabd0ebf6e017149be21637843d1df7398bb8d7652f341567a2d828be
MD5          816f6670dc045d3e5d7184fd8f3f51ad
BLAKE2b-256  418d324fc72c2b68e721613537db45318a08843beeb2b09168a7fec0049380ad

Provenance

The following attestation bundles were made for achem-1.1.0-py3-none-any.whl:

Publisher: release.yml on sarok-exe/achem

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
