Project description
LLM Summary
Use an LLM to summarize text.
The library summarizes large texts, capturing the core information in a short summary.
Input: Paragraphs of text
Output: Summary
Install and use library
```shell
pip install llm_summary
```
```python
from llm_summary.inference_api.summary_inference import SummaryInferenceProvider

text = "us business leaders ...."
inference_provider = SummaryInferenceProvider(model="llama3.2")
summary = inference_provider.summarize(text)
print(summary)
```
Output:
```
Summary inference completed in 13.31 seconds
summary="Business leaders are donating large sums to Donald Trump's second inaugural fund, with predicted total donations exceeding $107m"
```
Environment setup
Local LLM - Ollama
Run a local LLM using Ollama.

- Download and install Ollama: https://ollama.com/download
- Download the LLM model:

```shell
ollama run gemma3:12b
```
Example
Model: gemma3:27b
```python
from llm_summary.inference_api.summary_inference import SummaryInferenceProvider

text = "us business leaders ...."

summarizer = SummaryInferenceProvider(
    model="gemma3:27b",
    strategy="ollama",
    ollama_host="http://localhost:11434",
)
result = summarizer.summarize(text=text)
summary_text = result.summary
```
Prompt management
Prompts can vary between models and providers. Version-controlling and annotating the prompts enables faster evaluation.
MLflow provides a prompt registry and is used to fetch the user prompt and the system prompt.
E.g. prompts for news article summaries:
```shell
-e MLFLOW_SYSTEM_PROMPT_ID=SUMMARY_System/@article \
-e MLFLOW_USER_PROMPT_ID=SUMMARY_User/@article \
```
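The prompt IDs above appear to follow a `<name>/@<alias>` pattern (a prompt name plus an alias). A minimal sketch of resolving these IDs from the environment — `parse_prompt_id` is a hypothetical helper for illustration, not part of the library:

```python
import os

def parse_prompt_id(prompt_id: str) -> tuple[str, str]:
    """Split an ID of the assumed form '<name>/@<alias>' into (name, alias)."""
    name, _, alias = prompt_id.partition("/@")
    return name, alias

# Defaults mirror the -e flags shown above.
system_id = os.environ.get("MLFLOW_SYSTEM_PROMPT_ID", "SUMMARY_System/@article")
user_id = os.environ.get("MLFLOW_USER_PROMPT_ID", "SUMMARY_User/@article")

print(parse_prompt_id("SUMMARY_System/@article"))  # ('SUMMARY_System', 'article')
```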
Data pipelines - Consumers
Two data pipeline examples show how existing MongoDB documents can be decorated with a summary:
- Decorate text with a summary

E.g.

MongoDB Summary Decorator

In this example, text is read from a MongoDB collection and a summary is generated; the entry in MongoDB is then updated with the summary. The container reads the collection in batches and generates summaries.
```shell
docker run -e OLLAMA_MODEL=gemma3:12b \
  -e OLLAMA_HOST=http://ollama:11434 \
  -e MONGODB_CONNECTION_STRING=mongodb://mongodb:27017 \
  -e MLFLOW_SYSTEM_PROMPT_ID=SUMMARY_System/@article \
  -e MLFLOW_USER_PROMPT_ID=SUMMARY_User/@article \
  -e MLFLOW_TRACKING_HOST=http://mlflow:5000 \
  --network development_network \
  --rm \
  --name summarizer \
  --hostname summarizer \
  mongodb-summarizer
```
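The decorator's core step can be sketched as a pure function. The field names (`text`, `summary`) and the idempotency filter are assumptions for illustration, not the container's actual schema:

```python
from typing import Callable

def decorate_with_summary(doc: dict, summarize: Callable[[str], str],
                          text_field: str = "text") -> dict:
    """Return a copy of a MongoDB document with a 'summary' field added."""
    out = dict(doc)
    out["summary"] = summarize(doc[text_field])
    return out

def decorate_batch(docs: list[dict], summarize: Callable[[str], str]) -> list[dict]:
    # Skip documents that already carry a summary, so re-runs are idempotent.
    return [decorate_with_summary(d, summarize) for d in docs if "summary" not in d]

batch = [{"_id": 1, "text": "hello world"}, {"_id": 2, "text": "x", "summary": "s"}]
decorated = decorate_batch(batch, lambda t: t.upper())  # stand-in for the LLM call
```

A real pipeline would read the batch with `collection.find(...)` and write each decorated document back with `collection.update_one(...)`.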
Dependencies
Ollama should be running in a Docker container, or a network route to an existing Ollama instance should be available.
Run the following for a CPU-only Ollama instance:
```shell
docker run -d --rm \
  -v /usr/share/ollama/.ollama:/root/.ollama \
  -p 11434:11434 \
  --network development_network \
  --name ollama \
  --hostname ollama \
  ollama/ollama
```
Note: replace the path `/usr/share/ollama/.ollama` with the host's Ollama model path.
Evaluation
Due to the sheer number of model options and inference providers, systematic evaluation is necessary.
Evaluation allows methodical selection of hyperparameters:
- User Prompt (based on model, text source and model provider)
- System Prompt (based on model, text source and model provider)
- Model
Evaluation criteria
The summary should contain the 'key information' extracted from the larger text; what counts as key information can be subjective due to the following:
- The context (i.e. what the subject matter is, and how the summary is used)
- The definition of 'key information'
- Writing style of the summary
LLM as a judge
Create an LLM judge; the judge scores the generated summaries.
Evaluate the LLM as a judge
The LLM judge is key when evaluating models and prompts.
Therefore the LLM judge itself should achieve a strong score. The LLM judge is evaluated using the following metrics:
LLM Judge F1-Score:
- True positives (TP): the summary is "similar" to the original
- False positives (FP): the summary contains additional irrelevant information
- True negatives (TN): the summary does not include trivial information
- False negatives (FN): the summary is missing significant information
Precision
Proportion of summaries judged correct out of all summaries produced.
High precision means that when the model summarizes, the summary is likely to be correct.
$$ \begin{aligned} \text{Precision} &= \frac{TP}{TP + FP} \end{aligned} $$
Where:
- $TP$ = True Positives
- $FP$ = False Positives
Recall
Ability to capture the significant information in the text.
High recall means the model has captured the significant information from the text.
$$ \begin{aligned} \text{Recall} &= \frac{TP}{TP + FN} \end{aligned} $$
Where:
- $TP$ = True Positives
- $FN$ = False Negatives
F1 Score
$$ \begin{aligned} F1 &= 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \end{aligned} $$
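The three metrics above can be checked with a small worked example; the judge counts below are made up purely for illustration:

```python
def precision(tp: int, fp: int) -> float:
    # TP / (TP + FP)
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # TP / (TP + FN)
    return tp / (tp + fn)

def f1_score(p: float, r: float) -> float:
    # Harmonic mean of precision and recall
    return 2 * p * r / (p + r)

# Hypothetical judge counts over a batch of summaries.
tp, fp, fn = 80, 10, 20
p, r = precision(tp, fp), recall(tp, fn)
print(round(p, 3), round(r, 3), round(f1_score(p, r), 3))  # 0.889 0.8 0.842
```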
Experiments
Publish the evaluation metrics for each "experiment".
MLflow is a good tool for this analysis.
Metrics
Below are some evaluations; the evaluations are based on the following metrics:
- LLM Model
- Quality of summarization
- Local vs. Cloud
- Estimate costs
Conclude the evaluation
Pick the most significant metric, then short-list the models and do a cost analysis.
To combine several metrics, a weight can be assigned to each metric.
E.g. load your metrics and use a ranker to evaluate
(e.g. Accuracy: 1, Percentage: 2, Distance: 3 and Latency: 1)
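One way to sketch such a weighted ranker — the metric names, values, and weights below are illustrative only, not real evaluation results:

```python
def weighted_score(metrics: dict, weights: dict) -> float:
    """Weighted average of normalized metrics (0..1, higher is better)."""
    total = sum(weights.values())
    return sum(weights[name] * metrics[name] for name in weights) / total

# Hypothetical per-model metrics, already normalized to 0..1.
models = {
    "gemma3:12b": {"accuracy": 0.90, "latency": 0.60},
    "llama3.2":   {"accuracy": 0.85, "latency": 0.90},
}
weights = {"accuracy": 3, "latency": 1}

ranked = sorted(models, key=lambda m: weighted_score(models[m], weights), reverse=True)
print(ranked)  # ['llama3.2', 'gemma3:12b']
```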
License
Copyright (C) 2026 Paul Eger
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
Download files
File details
Details for the file llm_summary-0.6.0.tar.gz.
File metadata
- Download URL: llm_summary-0.6.0.tar.gz
- Upload date:
- Size: 30.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `30b6c5c4e6b536eafeacf9d31a003be35d7887c0ef35346e64aaeda0f8ccd471` |
| MD5 | `98695fb8ca501a54f6c2cfdcfde84ef1` |
| BLAKE2b-256 | `530e54d047d1570efd3c286fa9edd643baa64186b68f788a86f05a2ccb7f6d56` |
File details
Details for the file llm_summary-0.6.0-py3-none-any.whl.
File metadata
- Download URL: llm_summary-0.6.0-py3-none-any.whl
- Upload date:
- Size: 38.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `bca2d23cdfca5d1f2f539bb6cb8909ba6781bd9be2db40ce2e5c1d82ff6e21df` |
| MD5 | `2f6b68eced9cceb87bf50d6cf11d0b8a` |
| BLAKE2b-256 | `d9ef0f6ca2c5721420d7bc94392588f9a156e347eebfffb89a50ce8d5580157a` |