A module to enrich search queries using LLM intelligence

CASQE - Context-Aware Search Query Enrichment

A generic, domain-agnostic search query enrichment system that expands a single base query into a scored set of search variations, using the context you provide and an LLM.

Note: This module does not perform the search itself. It enriches queries to be used with search engines or search APIs later on.

🎯 What Problem Does CASQE Solve?

When researching any topic (a product, company, person, or technology), most people start with a basic query and struggle to think of the other effective ways to search. CASQE automates the expertise of skilled researchers by systematically generating targeted, context-aware search variations.

✨ Key Features

  • Universal Applicability: Works for any search target (products, companies, people, technologies, research topics)
  • Dual Enrichment Strategies:
    • Basic Enrichment: Combinatorial approach using identifiers, platforms, and entities
    • Advanced Enrichment: AI-powered natural language query generation
  • Contextual Intelligence: Uses provided context to generate relevant, high-value search variations
  • Scoring System: Each enriched query gets a relevance score and explanation
  • Flexible Configuration: Customizable enrichment strategies and output limits

🚀 Quick Start

Installation

pip install casqe

To work from a clone of the repository instead, install its dependencies:

pip install -r requirements.txt

Basic Usage

from casqe.query_enricher import SearchQueryEnricher
from casqe.schemes import SearchQueryEnrichmentRequestObject

# Initialize the enricher
enricher = SearchQueryEnricher()

# Create a request object
request = SearchQueryEnrichmentRequestObject(
    query="Tesla Model Y",
    identifier_context="Electric SUV manufactured by Tesla, competitor to BMW iX",
    search_reason_context="Find detailed technical specifications and reviews",
    how_many=10,
    use_basic_enrichment=True,
    use_advanced_enrichment=True
)

# Generate enriched queries
enriched_queries = enricher.enrich(request)

# Print results
for query in enriched_queries:
    print(f"{query.query} (score: {query.score:.3f}) - {query.explanation}")
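
Because CASQE only produces queries, the results still have to be handed to a search backend. The sketch below shows one way to do that hand-off. It is not part of CASQE: `run_searches` and `fake_search` are hypothetical names, and `fake_search` stands in for whatever search engine or API client you actually use.

```python
from typing import Callable, Dict, Iterable, List


def run_searches(
    queries: Iterable[str],
    search_fn: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Run each distinct query once and collect the results per query."""
    results: Dict[str, List[str]] = {}
    for query in queries:
        if query not in results:  # skip exact repeats
            results[query] = search_fn(query)
    return results


def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for a real search client."""
    return [f"result for: {query}"]


# With CASQE this list would come from [c.query for c in enriched_queries].
queries = ["Tesla Model Y range test", "Tesla Model Y vs BMW iX"]
hits = run_searches(queries, fake_search)
for query, results in hits.items():
    print(query, "->", results)
```

Passing the search function in as an argument keeps the enrichment step independent of any particular search provider.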

📖 Usage Examples

Product Research

request = SearchQueryEnrichmentRequestObject(
    query="iPhone 15 Pro",
    identifier_context="Apple's flagship smartphone, A17 Pro chip, titanium build",
    search_reason_context="Compare with competitors and find detailed reviews",
    how_many=8,
    use_advanced_enrichment=True
)

Sample Output:

  • iPhone 15 Pro vs Samsung Galaxy S24 comparison (score: 0.92)
  • iPhone 15 Pro titanium durability test (score: 0.87)
  • A17 Pro chip benchmarks performance (score: 0.85)

Company Research

request = SearchQueryEnrichmentRequestObject(
    query="Anthropic",
    identifier_context="AI safety company founded 2021, creates Claude AI assistant",
    search_reason_context="Research business model and competitive landscape",
    how_many=10,
    use_basic_enrichment=True,
    use_advanced_enrichment=True
)

Sample Output:

  • Anthropic funding Series C investment (score: 0.89)
  • Claude AI vs ChatGPT comparison (score: 0.86)
  • Anthropic constitutional AI research papers (score: 0.83)

Technology Research

request = SearchQueryEnrichmentRequestObject(
    query="Kubernetes",
    identifier_context="Container orchestration platform, originally by Google, CNCF project",
    search_reason_context="Find best practices for production deployment",
    how_many=12,
    use_advanced_enrichment=True
)

Sample Output:

  • Kubernetes production best practices 2024 (score: 0.91)
  • Google Kubernetes Engine deployment guide (score: 0.88)
  • CNCF Kubernetes security checklist (score: 0.85)

🔧 API Reference

SearchQueryEnrichmentRequestObject

Parameter               | Type | Description                                      | Default
query                   | str  | The base search query                            | Required
identifier_context      | str  | Additional context about the search target       | None
search_reason_context   | str  | What you're trying to accomplish with the search | None
text_rules              | str  | Custom rules for query generation                | None
how_many                | int  | Maximum number of enriched queries to return     | 10
use_basic_enrichment    | bool | Enable combinatorial enrichment                  | False
use_advanced_enrichment | bool | Enable AI-powered enrichment                     | False
use_thinking            | bool | Use advanced reasoning model (o3)                | False

UnifiedQueryCandidate

Each enriched query is returned as a UnifiedQueryCandidate with the following fields:

  • query: The enriched search string
  • score: Relevance score (0.0-1.0)
  • explanation: Description of what information this query targets
  • origin: Whether from "basic" or "advanced" enrichment
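
As an illustration of working with these fields, the sketch below filters candidates by score and origin. The `Candidate` dataclass is a hypothetical stand-in that only mirrors the four fields listed above; it is not the real `UnifiedQueryCandidate` class, and `select` is an illustrative helper, not a CASQE function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical stand-in mirroring the documented fields."""
    query: str
    score: float
    explanation: str
    origin: str  # "basic" or "advanced"


def select(
    candidates: List[Candidate],
    min_score: float = 0.0,
    origin: Optional[str] = None,
) -> List[Candidate]:
    """Keep candidates at or above min_score, optionally from one origin."""
    kept = [
        c for c in candidates
        if c.score >= min_score and (origin is None or c.origin == origin)
    ]
    return sorted(kept, key=lambda c: c.score, reverse=True)


sample = [
    Candidate("Anthropic funding", 0.89, "Funding history", "advanced"),
    Candidate("Anthropic linkedin", 0.60, "Company profile", "basic"),
    Candidate("Claude AI vs ChatGPT comparison", 0.86, "Competitors", "advanced"),
]
for c in select(sample, min_score=0.8, origin="advanced"):
    print(f"{c.query} ({c.score:.2f}, {c.origin})")
```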

🏗️ Architecture

Basic Enrichment

  1. Extract Components: AI identifies identifiers, platforms, and entities from context
  2. Generate Combinations: Creates systematic combinations (identifier + platform, identifier + entity, etc.)
  3. Score & Filter: Assigns relevance scores and filters low-quality combinations
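
The combination step can be sketched with `itertools.product`. This is a minimal illustration under assumptions, not CASQE's actual code: the function name `combine` is hypothetical, the component lists are given by hand rather than extracted by an LLM, and scoring is left out because the method is not described here.

```python
from itertools import product
from typing import List


def combine(
    identifiers: List[str],
    platforms: List[str],
    entities: List[str],
) -> List[str]:
    """Build identifier + platform, identifier + entity, and triple queries."""
    combos: List[str] = []
    # Pairs: each identifier with each platform, then with each entity.
    for identifier, extra in product(identifiers, list(platforms) + list(entities)):
        combos.append(f"{identifier} {extra}")
    # Triples: identifier + platform + entity.
    for identifier, platform, entity in product(identifiers, platforms, entities):
        combos.append(f"{identifier} {platform} {entity}")
    # Drop exact duplicates while keeping first-seen order.
    return list(dict.fromkeys(combos))


for q in combine(["Tesla Model Y"], ["reddit", "youtube"], ["BMW iX"]):
    print(q)
```

The number of combinations grows multiplicatively with the list sizes, which is one reason a score-and-filter step is needed afterwards.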

Advanced Enrichment

  1. Context Analysis: AI analyzes the search goal and available context
  2. Natural Query Generation: Creates human-like search variations
  3. Quality Assessment: Each query gets explanation and relevance scoring
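
To make the advanced flow more concrete, here is a hypothetical prompt builder that turns the request fields into instructions for an LLM. The wording of the prompt and the name `build_prompt` are assumptions for illustration; the prompts CASQE actually sends are not documented here.

```python
from typing import Optional


def build_prompt(
    query: str,
    identifier_context: Optional[str] = None,
    search_reason_context: Optional[str] = None,
    text_rules: Optional[str] = None,
    how_many: int = 10,
) -> str:
    """Assemble an LLM prompt from the request fields (illustrative only)."""
    parts = [f"Base query: {query}"]
    # Optional context lines are added only when they were provided.
    if identifier_context:
        parts.append(f"About the target: {identifier_context}")
    if search_reason_context:
        parts.append(f"Search goal: {search_reason_context}")
    if text_rules:
        parts.append(f"Rules: {text_rules}")
    parts.append(
        f"Write up to {how_many} natural search queries. "
        "For each, give a relevance score between 0.0 and 1.0 "
        "and a one-sentence explanation."
    )
    return "\n".join(parts)


print(build_prompt(
    "Kubernetes",
    identifier_context="Container orchestration platform, CNCF project",
    search_reason_context="Find best practices for production deployment",
    how_many=3,
))
```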

Merge & Rank

  • Combines results from both enrichment strategies
  • Deduplicates similar queries, keeping the highest-scoring copy
  • Returns top N results sorted by relevance score
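
The merge step above can be sketched as follows. This is a simplified illustration, not CASQE's implementation: `Candidate` is a hypothetical stand-in for `UnifiedQueryCandidate`, and "similar" is assumed to mean identical after lowercasing and collapsing whitespace, which may be stricter than the real similarity check.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Candidate:
    """Hypothetical stand-in for UnifiedQueryCandidate."""
    query: str
    score: float
    explanation: str
    origin: str


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so near-identical queries match."""
    return " ".join(query.lower().split())


def merge_and_rank(
    basic: Iterable[Candidate],
    advanced: Iterable[Candidate],
    how_many: int = 10,
) -> List[Candidate]:
    """Merge both result sets, keep the best copy of each query, return top N."""
    best: Dict[str, Candidate] = {}
    for cand in list(basic) + list(advanced):
        key = normalize(cand.query)
        if key not in best or cand.score > best[key].score:
            best[key] = cand
    ranked = sorted(best.values(), key=lambda c: c.score, reverse=True)
    return ranked[:how_many]


basic = [Candidate("Kubernetes Security", 0.70, "combo", "basic")]
advanced = [
    Candidate("kubernetes  security", 0.85, "security checklist", "advanced"),
    Candidate("Kubernetes production best practices", 0.91, "ops guide", "advanced"),
]
for c in merge_and_rank(basic, advanced, how_many=2):
    print(f"{c.query} (score: {c.score:.2f}, origin: {c.origin})")
```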

⚙️ Configuration

Environment Setup

Create a .env file with your LLM service configuration:

OPENAI_API_KEY=your_api_key_here
# Add other model provider keys as needed
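
A missing or placeholder key is an easy mistake to make, so it can help to check for it before calling the enricher. The sketch below assumes the values from .env have already been loaded into the process environment (for example with python-dotenv; whether CASQE loads the file itself is not stated here). `require_api_key` is a hypothetical helper, not part of CASQE.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    name: str = "OPENAI_API_KEY",
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the key from the environment or fail with a clear message."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    # Reject both a missing key and the unedited template placeholder.
    if not value or value == "your_api_key_here":
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Example with an explicit mapping instead of the real environment.
print(require_api_key(environ={"OPENAI_API_KEY": "sk-example"}))
```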

Custom Models

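from casqe.myllmservice import MyLLMService
from casqe.query_enricher import SearchQueryEnricher

# Pass your own LLM service instance to the enricher
llm_service = MyLLMService()
enricher = SearchQueryEnricher(llm=llm_service)

# For requests requiring advanced reasoning, set use_thinking on a
# SearchQueryEnrichmentRequestObject (built as in Basic Usage)
request.use_thinking = True  # Uses o3 model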

🎯 Use Cases

  • Competitive Intelligence: Comprehensive competitor research
  • Product Research: Technical specifications, reviews, comparisons
  • Academic Research: Literature reviews, expert finding
  • OSINT: Open source intelligence gathering
  • Market Research: Industry analysis, trend identification
  • Due Diligence: Background research for business decisions
  • Content Research: Finding diverse sources for content creation

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙋‍♀️ Support

If you encounter any issues or have questions:

  • Open an issue on GitHub
  • Check the documentation for detailed guides
  • Review the example scripts in the examples/ directory

CASQE - Making comprehensive research accessible to everyone.
