structuredxtract

Extract structured information from unstructured text with pattern-matching precision.

A Python package that simplifies structured data extraction from plain text inputs using a language model with pattern-matching capabilities. Ideal for surveys, feedback analysis, and report generation where consistent, well-formatted outputs are required.


🚀 Features

  • Pattern-based extraction: Uses regex patterns to enforce structured output formats.
  • Flexible LLM integration: Works with default ChatLLM7 or any LangChain-compatible model.
  • Text-only input: Accepts plain text only; images, audio, and other media are not supported.
  • Consistent formatting: Ensures responses match expected schemas (tables, summaries, key-value pairs).
  • Easy customization: Replace default LLM with OpenAI, Anthropic, Google, or any other LangChain model.
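To make the pattern-based extraction idea concrete, here is a minimal sketch of how a regex can enforce a structured reply format. The `<record>` tag convention, `RECORD_PATTERN`, and both helper functions are assumptions made for illustration; the package's actual patterns are not documented here.

```python
import re
from typing import List

# Hypothetical convention: the model is asked to wrap each record in <record> tags.
RECORD_PATTERN = re.compile(r"<record>(.*?)</record>", re.DOTALL)

def extract_records(llm_text: str) -> List[str]:
    """Return the stripped contents of every <record> block in the reply."""
    return [match.strip() for match in RECORD_PATTERN.findall(llm_text)]

def is_well_formed(llm_text: str) -> bool:
    """True when the reply contains at least one block matching the pattern."""
    return bool(extract_records(llm_text))

reply = "Sure!\n<record>Name: John Doe</record>\n<record>Age: 30</record>"
print(extract_records(reply))  # ['Name: John Doe', 'Age: 30']
```

A reply that fails `is_well_formed` could be rejected or retried, which is one way a pattern can keep outputs consistent.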

📦 Installation

pip install structuredxtract

🔧 Usage

Basic Usage (Default LLM7)

from structuredxtract import structuredxtract

user_input = """
Name: John Doe
Age: 30
Occupation: Software Engineer
"""

response = structuredxtract(user_input)
print(response)  # Structured output based on predefined patterns

Custom LLM Integration

Replace the default ChatLLM7 with your preferred model:

OpenAI

from langchain_openai import ChatOpenAI
from structuredxtract import structuredxtract

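# ChatOpenAI reads OPENAI_API_KEY from the environment.
llm = ChatOpenAI()
# user_input is the text defined in the basic usage example above.
response = structuredxtract(user_input, llm=llm)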

Anthropic

from langchain_anthropic import ChatAnthropic
from structuredxtract import structuredxtract

# ChatAnthropic requires a model name; this one is only an example.
llm = ChatAnthropic(model="claude-3-5-haiku-latest")
response = structuredxtract(user_input, llm=llm)

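Google Generative AI (Gemini)

from langchain_google_genai import ChatGoogleGenerativeAI
from structuredxtract import structuredxtract

# ChatGoogleGenerativeAI requires a model name; this one is only an example.
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash")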
response = structuredxtract(user_input, llm=llm)

🔑 API Key

  • Default: Uses LLM7_API_KEY from environment variables.
  • Manual override: Pass via api_key parameter or set LLM7_API_KEY before importing.
    import os
    os.environ["LLM7_API_KEY"] = "your_api_key_here"
    

Get a free API key at LLM7 Token.
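The key lookup described above can be sketched as follows. This assumes an explicitly passed key takes precedence over the environment variable; `resolve_api_key` is a hypothetical helper for illustration, not a function exported by the package.

```python
import os
from typing import Optional

def resolve_api_key(api_key: Optional[str] = None) -> Optional[str]:
    """Prefer an explicit key; otherwise fall back to LLM7_API_KEY (assumed precedence)."""
    # An empty string counts as "not provided" and also falls back to the environment.
    return api_key or os.environ.get("LLM7_API_KEY")

os.environ["LLM7_API_KEY"] = "env_key"  # placeholder value
print(resolve_api_key())                # env_key
print(resolve_api_key("explicit_key"))  # explicit_key
```

The resolved value could then be passed to structuredxtract through the api_key parameter.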


📜 Parameters

| Parameter | Type | Description |
|---|---|---|
| user_input | str | Plain text input to extract structured data from. |
| api_key | Optional[str] | LLM7 API key (optional if using environment variable). |
| llm | Optional[BaseChatModel] | Custom LangChain LLM (e.g., ChatOpenAI, ChatAnthropic). Defaults to ChatLLM7. |

📊 Output

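The declared return type is List[str], with one entry per extracted item matching the predefined patterns. The example below displays records as dictionaries, which does not match that type, so inspect the actual return value before indexing by key. Example: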

[
    {"Name": "John Doe", "Age": "30", "Occupation": "Software Engineer"},
    {"Key1": "Value1", "Key2": "Value2"}
]
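If the returned items turn out to be plain strings of "Key: Value" lines, they can be converted into dictionaries like the ones shown above. The sketch below assumes that string layout; `parse_key_values` is a hypothetical helper and not part of the package.

```python
from typing import Dict, List

def parse_key_values(item: str) -> Dict[str, str]:
    """Turn 'Key: Value' lines into a dict, skipping lines without a colon or key."""
    record: Dict[str, str] = {}
    for line in item.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip()] = value.strip()
    return record

# Assumed shape of one returned item: several 'Key: Value' lines in one string.
items: List[str] = ["Name: John Doe\nAge: 30\nOccupation: Software Engineer"]
records = [parse_key_values(item) for item in items]
print(records)
```

Only the first colon on each line is treated as the separator, so values that contain colons (for example timestamps) stay intact.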

🔄 Rate Limits

  • LLM7 free tier: The default rate limits are sufficient for most use cases.
  • Higher limits: Supply your own LLM7 API key via the api_key parameter or the LLM7_API_KEY environment variable.
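When a rate limit is reached, one common approach is to retry with exponential backoff. The wrapper below is a generic sketch: `call_with_backoff` is a hypothetical helper, and because the exception the package raises on a rate-limit error is not documented here, it catches any `Exception` as a stated assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on failure with delays of base_delay * 2**attempt seconds."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:  # assumption: the real rate-limit exception type is unknown
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("retries must be at least 1")
```

A call could then be wrapped as call_with_backoff(lambda: structuredxtract(user_input)); narrowing the caught exception type is advisable once the actual error class is known.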

📝 License

MIT


📢 Support & Issues

For bugs or feature requests, open an issue on GitHub.


👤 Author

Eugene Evstafev 📧 hi@euegne.plus 🔗 GitHub: chigwell
