Transforms raw text into clean, readable articles by removing ads, navigation, and sidebars while preserving the core narrative in structured output.
Text-Digestor
Transform raw text into a clean, readable article format with Text-Digestor.
Overview
Text-Digestor is a package that extracts the main content from unstructured text, such as web content or documents, and processes it to remove unnecessary elements like advertisements, navigation links, and sidebars. It focuses on preserving the core narrative, making it ideal for applications that require distraction-free reading experiences.
Features
- Extracts main content from unstructured text
- Removes unnecessary elements like advertisements, navigation links, and sidebars
- Preserves core narrative
- Produces well-organized, easy-to-consume output
Installation
pip install text_digestor
Example Usage
from text_digestor import text_digestor
user_input = "Unstructured text to process..."
response = text_digestor(user_input)
print(response)
Input Parameters
- user_input: str: the user input text to process
- llm: Optional[BaseChatModel]: the LangChain LLM instance to use, defaults to ChatLLM7 from langchain_llm7 if not provided
- api_key: Optional[str]: the API key for LLM7, defaults to an empty string (api_key is None) if not provided
Note that you can safely pass your own LLM instance if you want to use another LLM. For example, to use OpenAI's LLM, you can pass it like this:
from langchain_openai import ChatOpenAI
from text_digestor import text_digestor
llm = ChatOpenAI()
response = text_digestor(user_input, llm=llm)
Similarly, you can use Anthropic's LLM:
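from langchain_anthropic import ChatAnthropic
from text_digestor import text_digestor
# ChatAnthropic requires a model name; the one below is only an example, substitute any model available to you.
llm = ChatAnthropic(model="claude-3-5-sonnet-latest")
response = text_digestor(user_input, llm=llm)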
Or Google's LLM:
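from langchain_google_genai import ChatGoogleGenerativeAI
from text_digestor import text_digestor
# ChatGoogleGenerativeAI requires a model name; the one below is only an example, substitute any model available to you.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
response = text_digestor(user_input, llm=llm)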
Rate Limits
The default rate limits of the LLM7 free tier are sufficient for most use cases of this package. If you need higher rate limits, you can supply your own API key through the LLM7_API_KEY environment variable or pass it directly:
response = text_digestor(user_input, api_key="your_api_key")
You can get a free API key by registering at Token.LLM7.IO.
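The two ways of supplying a key can be combined in a small helper. The following is a minimal sketch and not part of the package: `resolve_api_key` is a hypothetical helper name, and the precedence it applies (an explicitly passed key wins over the `LLM7_API_KEY` environment variable) is an assumption made for illustration.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> Optional[str]:
    """Return the explicit key if given, else the LLM7_API_KEY environment variable.

    Hypothetical helper for illustration; the precedence order is an assumption.
    """
    if explicit_key:
        return explicit_key
    # os.environ.get returns None when the variable is not set.
    return os.environ.get("LLM7_API_KEY")


if __name__ == "__main__":
    key = resolve_api_key()
    if key is None:
        print("No LLM7 key configured; calls rely on the free-tier defaults.")
    else:
        print("LLM7 key found; it can be passed as api_key=key to text_digestor.")
```

With a key resolved this way, the call shown above becomes `text_digestor(user_input, api_key=key)`.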
Issues
Report any issues to our GitHub Issues Page
Author
Eugene Evstafev eugene.evstafev@hi@euegne.plus
License
MIT License
File details
Details for the file text_digestor-2025.12.22080445.tar.gz.
File metadata
- Download URL: text_digestor-2025.12.22080445.tar.gz
- Upload date:
- Size: 5.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8ecb78600f52b34d963c0cab95bf78e7e667ffa8f7563ff2ee4666d2ad58b081 |
| MD5 | fd6ae6026bcbee1ed093646649c59e07 |
| BLAKE2b-256 | c3ccf803e3a5b0aa7b6ecb8951f28058c549ca0bebdb3225e10c1c96515462e3 |
File details
Details for the file text_digestor-2025.12.22080445-py3-none-any.whl.
File metadata
- Download URL: text_digestor-2025.12.22080445-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 140d61c5212254abc47d879ec76033e52f269460eba10634df1c6ee28bc942be |
| MD5 | 0f06b36eb16969badbe566480cb13fba |
| BLAKE2b-256 | 7abe24914af73cd6da39a635bd18a9a9fda938d8bf57e0291858e1926c402117 |
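The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_published_digest` are hypothetical and chosen for illustration, and the local file path is whatever location you downloaded the archive to.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest(Path("text_digestor-2025.12.22080445.tar.gz"), "8ecb78600f52b34d963c0cab95bf78e7e667ffa8f7563ff2ee4666d2ad58b081")` should return True for an unmodified copy of the source distribution.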