
Convert unstructured text into structured, queryable knowledge with llmatch-messages. Extract and organize key details for fast, reliable access—ideal for teams and researchers.


texttoknowledge


texttoknowledge is a lightweight Python package that transforms unstructured document text into structured, queryable knowledge. It uses the llmatch-messages library together with a large language model (LLM) to extract key information and organize it into predefined formats, so critical details are easy to retrieve and keep up to date.

Features

  • Simple API – Call a single function with your raw text.
  • Customizable LLM – Use the default ChatLLM7 or provide any LangChain‑compatible LLM (OpenAI, Anthropic, Google, etc.).
  • Regex-driven output – Extracted data is checked against a predefined regex pattern, so results arrive in a consistent format.
  • No boilerplate – Handles LLM initialization, API key resolution, and error handling for you.

Installation

pip install texttoknowledge

Quick Start

from texttoknowledge import texttoknowledge

# Your raw document text
raw_text = """
Project Alpha:
- Owner: Alice
- Deadline: 2025-03-15
- Status: In progress
"""

# Extract structured knowledge
structured_data = texttoknowledge(user_input=raw_text)

print(structured_data)
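
Because the call above goes out to an LLM, scripts may want to guard it. The following is a minimal sketch, assuming that failures surface as ordinary Python exceptions (the README does not say which ones). The names safe_extract and fake_extractor are hypothetical and exist only for this illustration; fake_extractor is an offline stand-in so the snippet runs without network access.

```python
from typing import Callable, List


def safe_extract(text: str, extractor: Callable[..., List[str]]) -> List[str]:
    """Run extractor(user_input=text), returning [] for blank input or on error."""
    if not text.strip():
        # Nothing to extract; skip the LLM round trip entirely.
        return []
    try:
        return list(extractor(user_input=text))
    except Exception as exc:  # assumption: failures are raised as exceptions
        print(f"Extraction failed: {exc}")
        return []


def fake_extractor(user_input: str) -> List[str]:
    """Offline stand-in for texttoknowledge: returns the non-empty lines."""
    return [line.strip() for line in user_input.splitlines() if line.strip()]


items = safe_extract("Owner: Alice\nStatus: In progress", fake_extractor)
print(items)
```

In real use, pass texttoknowledge itself in place of fake_extractor.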

API Reference

texttoknowledge(user_input: str, api_key: Optional[str] = None, llm: Optional[BaseChatModel] = None) -> List[str]

Parameters:

  • user_input (str) – The raw text from which knowledge will be extracted.
  • llm (Optional[BaseChatModel]) – A LangChain LLM instance. If omitted, the function creates a ChatLLM7 instance automatically.
  • api_key (Optional[str]) – API key for the default ChatLLM7. If omitted, the function reads the environment variable LLM7_API_KEY.

Returns: List[str] – Extracted pieces of knowledge that match the predefined regex pattern.
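
To illustrate what matching a regex pattern can look like in practice, here is a sketch that uses only the standard re module. The pattern and the <item> tag format are assumptions made for this example; the pattern the package actually uses is not documented here, and extract_matches is a hypothetical helper, not part of the package.

```python
import re
from typing import List

# Hypothetical pattern: each piece of knowledge is wrapped in <item>...</item>.
ITEM_PATTERN = re.compile(r"<item>(.*?)</item>", re.DOTALL)


def extract_matches(llm_output: str) -> List[str]:
    """Return the stripped contents of every <item> block in the LLM output."""
    return [match.strip() for match in ITEM_PATTERN.findall(llm_output)]


sample = "Sure! <item>Owner: Alice</item>\n<item> Deadline: 2025-03-15 </item> Done."
print(extract_matches(sample))
```

Text outside the tags is ignored, which is how a regex filter can return a clean List[str] even when the model adds extra commentary.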

Using a Custom LLM

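You can pass any LangChain chat model that subclasses BaseChatModel. Each provider integration is a separate package that must be installed first (for example, pip install langchain-openai). Below are a few examples: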

OpenAI

from langchain_openai import ChatOpenAI
from texttoknowledge import texttoknowledge

llm = ChatOpenAI()  # Reads OPENAI_API_KEY from the environment; set model, temperature, etc. as needed
response = texttoknowledge(user_input="Your document text here", llm=llm)

Anthropic

from langchain_anthropic import ChatAnthropic
from texttoknowledge import texttoknowledge

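# ChatAnthropic requires a model argument; the name below is only an example.
llm = ChatAnthropic(model="claude-sonnet-4-5")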
response = texttoknowledge(user_input="Your document text here", llm=llm)

Google Generative AI

from langchain_google_genai import ChatGoogleGenerativeAI
from texttoknowledge import texttoknowledge

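# ChatGoogleGenerativeAI requires a model argument; the name below is only an example.
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash")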
response = texttoknowledge(user_input="Your document text here", llm=llm)
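
When several documents need processing, it can be convenient to build the LLM once and reuse it for every call. The sketch below relies only on the documented user_input and llm keyword arguments. The names extract_many and stub_extractor are hypothetical; stub_extractor is an offline stand-in so the snippet runs without any provider installed.

```python
from typing import Any, Callable, Dict, List, Optional


def extract_many(
    docs: Dict[str, str],
    extractor: Callable[..., List[str]],
    llm: Optional[Any] = None,
) -> Dict[str, List[str]]:
    """Run the extractor over each named document, reusing one llm instance."""
    results: Dict[str, List[str]] = {}
    for name, text in docs.items():
        # Forward llm only when one was supplied, so the default LLM is used otherwise.
        kwargs = {"llm": llm} if llm is not None else {}
        results[name] = extractor(user_input=text, **kwargs)
    return results


def stub_extractor(user_input: str, llm: Optional[Any] = None) -> List[str]:
    """Offline stand-in for texttoknowledge: echoes the upper-cased text."""
    return [user_input.upper()]


print(extract_many({"a": "alpha", "b": "beta"}, stub_extractor))
```

In real use, pass texttoknowledge as the extractor and one of the LLM instances shown above as llm.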

Default LLM – ChatLLM7

If you do not provide an LLM, texttoknowledge automatically uses ChatLLM7 from the langchain_llm7 package:

from langchain_llm7 import ChatLLM7

The LLM7 free tier has rate limits that suit most use cases. For higher limits, pass your own API key:

response = texttoknowledge(user_input="...", api_key="YOUR_LLM7_API_KEY")

You can obtain a free API key by registering at https://token.llm7.io/.

Environment Variables

  • LLM7_API_KEY – If set, the package will use this key for the default ChatLLM7 instance.
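
The key lookup described in this README (explicit api_key argument first, then the LLM7_API_KEY environment variable) can be sketched as follows. This is an illustration of that precedence, not the package's actual code. resolve_api_key is a hypothetical helper, and returning None when no key is found is an assumption.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> Optional[str]:
    """Prefer an explicit key, then LLM7_API_KEY, else None (assumed fallback)."""
    if api_key:
        return api_key
    # An empty environment value is treated the same as an unset one.
    return os.environ.get("LLM7_API_KEY") or None


os.environ["LLM7_API_KEY"] = "key-from-env"  # demo value only
print(resolve_api_key())            # falls back to the environment variable
print(resolve_api_key("explicit"))  # explicit argument takes precedence
```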

Contributing & Issues

If you encounter bugs or have feature requests, please open an issue:

GitHub Issues: https://github....

License

This project is licensed under the MIT License.

Happy structuring! 🎉
