
LLM Summary

Use an LLM to summarize text.

The purpose of this library is to summarize large texts so that the core information is captured in the summary.

Input: Paragraphs of text

Output: Summary

Install and use library

pip install llm_summary
from llm_summary.inference_api.summary_inference import SummaryInferenceProvider

text = "us business leaders ...."

inference_provider = SummaryInferenceProvider(model="llama3.2") 
summary = inference_provider.summarize(text)

print(summary)

Output:

Summary inference completed in 13.31 seconds

summary="Business leaders are donating large sums to Donald Trump's second inaugural fund, with predicted total donations exceeding $107m"

Environment setup

Local LLM - Ollama

Run a local LLM using Ollama:

ollama run gemma3:12b

Example

Model: gemma3:27b

import logging
import os
from llm_summary.inference_api.summary_inference import SummaryInferenceProvider
from llm_summary.inference_api.mlflow_config import MlFlowConfig

summarizer = SummaryInferenceProvider(
    model="gemma3:27b",
    strategy="ollama",
    ollama_host="http://localhost:11434",
)

result = summarizer.summarize(text=text)
summary_text = result.summary

Prompt management

Prompts can vary between models/providers. Version-controlling the prompts and annotating them helps with faster evaluation.

MLflow provides a prompt registry, which is used to fetch the user prompt and the system prompt.

E.g. prompts for news article summaries

-e MLFLOW_SYSTEM_PROMPT_ID=SUMMARY_System/@article \
-e MLFLOW_USER_PROMPT_ID=SUMMARY_User/@article \
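The prompt IDs above follow the pattern NAME/@alias. As a minimal illustration (not the library's actual resolution code), the sketch below shows how such an ID could be parsed and resolved; the FAKE_REGISTRY dict is a hypothetical stand-in for the MLflow prompt registry:

```python
import os

# Hypothetical stand-in for the MLflow prompt registry: maps a prompt
# name and alias (e.g. "SUMMARY_User" + "article") to a prompt template.
FAKE_REGISTRY = {
    ("SUMMARY_System", "article"): "You are a news summarizer. Be concise.",
    ("SUMMARY_User", "article"): "Summarize the following article:\n{text}",
}

def parse_prompt_id(prompt_id: str) -> tuple[str, str]:
    """Split an ID of the form 'NAME/@alias' into (name, alias)."""
    name, _, alias = prompt_id.partition("/@")
    return name, alias

def load_prompt(prompt_id: str, registry=FAKE_REGISTRY) -> str:
    """Resolve a prompt ID against the registry."""
    return registry[parse_prompt_id(prompt_id)]

# The container reads the IDs from the environment, as in the docker example.
os.environ["MLFLOW_USER_PROMPT_ID"] = "SUMMARY_User/@article"
user_prompt = load_prompt(os.environ["MLFLOW_USER_PROMPT_ID"])
print(user_prompt.format(text="US business leaders ..."))
```

In the real pipeline the registry lookup would go to the MLflow tracking server configured via MLFLOW_TRACKING_HOST.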

Data pipelines - Consumers

The following data pipeline example shows how existing MongoDB documents can be decorated with a summary:

  • Decorate text with a summary

E.g.

Mongo DB Summary Decorator

In this example, text is read from a MongoDB collection and a summary is generated. The entry in MongoDB is then updated with the summary.

The container reads the collection in batches and generates summaries.

docker run -e OLLAMA_MODEL=gemma3:12b \
           -e OLLAMA_HOST=http://ollama:11434 \
           -e MONGODB_CONNECTION_STRING=mongodb://mongodb:27017 \
           -e MLFLOW_SYSTEM_PROMPT_ID=SUMMARY_System/@article \
           -e MLFLOW_USER_PROMPT_ID=SUMMARY_User/@article \
           -e MLFLOW_TRACKING_HOST=http://mlflow:5000 \
           --network development_network \
           --rm \
           --name summarizer \
           --hostname summarizer \
           mongodb-summarizer
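The decorator loop inside such a container can be sketched as follows. This is a hypothetical illustration, not the container's actual code: `collection` is anything exposing pymongo-style `find`/`update_one`, and `summarize` is any text-to-summary callable (e.g. backed by SummaryInferenceProvider):

```python
from typing import Callable

def decorate_with_summaries(collection, summarize: Callable[[str], str],
                            batch_size: int = 10) -> int:
    """Read documents that lack a summary, generate one, write it back.

    Returns the number of documents updated. Sketch only: error handling
    and retry logic are omitted.
    """
    updated = 0
    # Fetch only documents that have no summary yet, in batches.
    cursor = collection.find({"summary": {"$exists": False}},
                             batch_size=batch_size)
    for doc in cursor:
        summary = summarize(doc["text"])
        # Decorate the existing document with the generated summary.
        collection.update_one({"_id": doc["_id"]},
                              {"$set": {"summary": summary}})
        updated += 1
    return updated
```

With pymongo, `collection` would be e.g. `MongoClient(conn_str)["db"]["articles"]`.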

Dependencies

Ollama should be running in a Docker container on the same network, or a network route to an Ollama instance should be available.

Run the following for a CPU-only Ollama instance:

docker run -d --rm \
    -v /usr/share/ollama/.ollama:/root/.ollama \
    -p 11434:11434 \
    --network development_network \
    --name ollama \
    --hostname ollama  \
    ollama/ollama

Note: replace the path /usr/share/ollama/.ollama with the host's Ollama model path.

Evaluation

Given the sheer number of model options and inference providers, systematic evaluation is necessary.

Evaluation allows methodical selection of hyperparameters:

  • User Prompt (based on model, text source and model provider)
  • System Prompt (based on model, text source and model provider)
  • Model

Evaluation criteria

The summary should contain 'key information' extracted from the larger text; what counts as key information can be subjective due to the following:

  • The context (i.e. the subject matter, and how the summary is used)
  • The definition of 'key information'
  • The writing style of the summary

LLM as a judge

Create an LLM judge that scores the generated summaries.
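A minimal sketch of what such a judge could look like: a judge prompt template plus a parser for the judge's reply. Both the prompt wording and the SCORE format are assumptions for illustration, not part of the llm_summary library:

```python
import re

# Hypothetical judge prompt; the real prompt would live in the registry.
JUDGE_PROMPT = """You are an impartial judge. Compare the summary to the
original text and reply with 'SCORE: <0-10>', where 10 means the summary
captures all key information with nothing irrelevant added.

Original:
{original}

Summary:
{summary}
"""

def parse_score(judge_output: str) -> int:
    """Extract the numeric score from the judge's reply."""
    match = re.search(r"SCORE:\s*(\d+)", judge_output)
    if match is None:
        raise ValueError("judge reply contained no score")
    return int(match.group(1))

# The filled prompt would be sent to the judge model via any inference
# provider (e.g. Ollama); here we only show the parsing step.
print(parse_score("SCORE: 8"))  # → 8
```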

Evaluate the LLM as a judge

The LLM judge is central when evaluating models/prompts.

Therefore the LLM judge itself should score strongly. The LLM judge is evaluated using the following definitions:

LLM Judge F1-Score:

  • True positives (TP): the summary is "similar" to the original
  • False positives (FP): the summary contains additional irrelevant information
  • True negatives (TN): the summary does not include trivial information
  • False negatives (FN): the summary is missing significant information

Precision

The proportion of summarized content that is correct.

High precision means that when the model summarizes, it is likely to be correct.

$$ \begin{aligned} \text{Precision} &= \frac{TP}{TP + FP} \end{aligned} $$

Where:

  • ( TP ) = True Positives
  • ( FP ) = False Positives

Recall

The ability to cover the significant information in the text.

High recall means the model has summarized significant information from the text.

$$ \begin{aligned} \text{Recall} &= \frac{TP}{TP + FN} \end{aligned} $$

Where:

  • ( TP ) = True Positives
  • ( FN ) = False Negatives

F1 Score

$$ \begin{aligned} F1 &= 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \end{aligned} $$
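A worked numeric example of the three formulas above, with illustrative counts:

```python
def f1_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from judge counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# E.g. a judge that marked 8 summaries correct, 2 as containing
# irrelevant additions (FP) and 2 as missing key information (FN):
precision, recall, f1 = f1_metrics(tp=8, fp=2, fn=2)
print(precision, recall, round(f1, 2))  # → 0.8 0.8 0.8
```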

Experiments

Publish the evaluation metrics for each "experiment".

MLflow is a good tool for this analysis.

Metrics

The evaluations are based on the following dimensions:

  • LLM model
  • Quality of summarization
  • Local vs. cloud inference
  • Estimated costs

Conclude the evaluation

Pick the most significant metrics, then short-list the models and do a cost analysis.

To combine several metrics, a weight can be given to each one.

E.g. load your metrics and use a weighted ranker to evaluate (e.g. Accuracy: 1, Percentage: 2, Distance: 3 and Latency: 1).
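A weighted ranker of this kind can be sketched as follows. The candidate models, metric values and weights are hypothetical; each metric is assumed to be normalized to [0, 1] with higher being better (latency and cost would be inverted before normalization):

```python
def weighted_score(metrics: dict[str, float],
                   weights: dict[str, float]) -> float:
    """Combine normalized metric values into one score using weights."""
    total = sum(weights.values())
    return sum(weights[name] * metrics[name] for name in weights) / total

# Hypothetical per-model metrics, not real evaluation results.
candidates = {
    "gemma3:12b": {"accuracy": 0.78, "percentage": 0.90,
                   "distance": 0.70, "latency": 0.85},
    "gemma3:27b": {"accuracy": 0.88, "percentage": 0.92,
                   "distance": 0.80, "latency": 0.60},
}
weights = {"accuracy": 1, "percentage": 2, "distance": 3, "latency": 1}

# Rank models by their weighted score, best first.
ranked = sorted(candidates,
                key=lambda m: weighted_score(candidates[m], weights),
                reverse=True)
print(ranked[0])  # → gemma3:27b
```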

License

Copyright (C) 2026 Paul Eger

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
