paralabelgen
paralabelgen is a Python library for generating discrete multi-label
annotations for text paragraphs from concept extraction, graph communities,
and interpretable assignment rules.
- PyPI distribution: paralabelgen
- Python import package: labelgen
- Repository: https://github.com/HuRuilizhen/labelgen
Install
pip install paralabelgen
python -m spacy download en_core_web_sm
en_core_web_sm is the recommended default model. If you already use another
compatible English spaCy pipeline, you can point spacy_model_name at that
installed model instead.
Quick Start
from labelgen import LabelGenerator, LabelGeneratorConfig

paragraphs = [
    "OpenAI builds language models for developers.",
    "Developers use language models in production systems.",
]

generator = LabelGenerator(LabelGeneratorConfig())
result = generator.fit_transform(paragraphs)

print("Concepts:")
for concept in result.concepts:
    print(concept.normalized, concept.kind, concept.document_frequency, sep=" | ")

print("Labels:")
for assignment in result.paragraph_labels:
    print(assignment.paragraph_id, assignment.label_ids, assignment.label_scores)
The default public pipeline uses spaCy extraction and Leiden community detection. Install the recommended spaCy model before running the example.
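Because each assignment's label_ids and label_scores run in parallel, a common follow-up step is inverting the per-paragraph assignments into a label-to-paragraphs index. The sketch below uses plain dicts in place of the library's result objects; the dict shape is an assumption for illustration, not part of paralabelgen's API:

```python
from collections import defaultdict

def index_by_label(assignments):
    """Invert paragraph -> labels assignments into a label -> paragraphs
    index, keeping the score each paragraph received for that label."""
    index = defaultdict(list)
    for a in assignments:
        for label_id, score in zip(a["label_ids"], a["label_scores"]):
            index[label_id].append((a["paragraph_id"], score))
    return dict(index)

# Illustrative stand-ins for ParagraphLabels objects.
assignments = [
    {"paragraph_id": 0, "label_ids": [3, 7], "label_scores": [0.9, 0.4]},
    {"paragraph_id": 1, "label_ids": [3], "label_scores": [0.8]},
]
print(index_by_label(assignments))
# {3: [(0, 0.9), (1, 0.8)], 7: [(0, 0.4)]}
```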
Public API
The main public entrypoints are:
- LabelGenerator
- LabelGeneratorConfig
- Paragraph, Concept, ConceptMention, Community, ParagraphLabels
- dump_result() and load_result()
Detailed API notes are available in docs/public_api.md.
Examples
Runnable examples are available in examples/.
Configuration Notes
- fit() learns concept communities from a corpus.
- transform() applies previously learned communities to new paragraphs.
- fit_transform() learns and labels the same input in one pass.
- The default pipeline uses spaCy extraction and Leiden community detection.
- The default NLP path requires the configured spaCy model to be installed; en_core_web_sm is the recommended default model name.
- If the configured model is missing, the library raises an explicit runtime error.
- Set use_nlp_extractor=False to switch to the deterministic heuristic extractor.
- Set use_graph_community_detection=False to switch to deterministic connected-components community detection.
- The heuristic extractor uses capitalized spans as lightweight entities and non-stopword spans as candidate noun phrases.
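The heuristic fallback described above can be approximated in a few lines of plain Python. This is an illustrative sketch only, not the library's implementation; the tokenizer and stopword list here are placeholder assumptions:

```python
import re

# Placeholder stopword list for illustration; the library's actual list is unknown.
STOPWORDS = {"a", "an", "the", "in", "of", "and", "to", "for"}

def heuristic_concepts(paragraph):
    """Approximate the heuristic extractor: capitalized tokens become
    lightweight entities, remaining non-stopword tokens become
    candidate noun phrases."""
    tokens = re.findall(r"[A-Za-z]+", paragraph)
    entities = [t for t in tokens if t[0].isupper()]
    candidates = [t.lower() for t in tokens
                  if not t[0].isupper() and t.lower() not in STOPWORDS]
    return {"entities": entities, "noun_phrase_candidates": candidates}

print(heuristic_concepts("OpenAI builds language models for developers."))
# {'entities': ['OpenAI'], 'noun_phrase_candidates': ['builds', 'language', 'models', 'developers']}
```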
Opt Out Of Enhanced Implementations
from labelgen import LabelGenerator, LabelGeneratorConfig
config = LabelGeneratorConfig(
    use_nlp_extractor=False,
    use_graph_community_detection=False,
)
generator = LabelGenerator(config)
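With use_graph_community_detection=False, communities are the connected components of the concept graph rather than Leiden partitions. A self-contained sketch of that idea, assuming co-occurrence edges between concepts (the graph construction here is an assumption for illustration, not paralabelgen's internals):

```python
from collections import defaultdict

def connected_components(edges):
    """Group concept nodes into communities by graph connectivity,
    using an iterative depth-first search."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            comp.add(n)
            stack.extend(adj[n] - seen)
        components.append(comp)
    return components

# Illustrative edges between concepts that co-occur in the same paragraph.
edges = [("openai", "language models"),
         ("language models", "developers"),
         ("developers", "production systems")]
print(connected_components(edges))
```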
Use A Different spaCy Model
from labelgen import LabelGenerator, LabelGeneratorConfig
config = LabelGeneratorConfig()
config.extraction.spacy_model_name = "en_core_web_md"
generator = LabelGenerator(config)
Download files
Source Distribution
Built Distribution
File details
Details for the file paralabelgen-0.1.1.tar.gz.
File metadata
- Download URL: paralabelgen-0.1.1.tar.gz
- Upload date:
- Size: 39.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 629a4dd3757d043386b3f87e7fc3f1cf1be75912fe7bae3b355ba1fe976bd819 |
| MD5 | 599f893ad00040698fabd38accdf0250 |
| BLAKE2b-256 | 770cb2428f3d51e5739117785064f41f60b06f1525385bdc88d6cf05bd5b3045 |
File details
Details for the file paralabelgen-0.1.1-py3-none-any.whl.
File metadata
- Download URL: paralabelgen-0.1.1-py3-none-any.whl
- Upload date:
- Size: 28.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 29217b29f52b60a0b9405ab865a0502de70793f9aa38f8d969c2c75b4be2ca71 |
| MD5 | 0676204e7a09c9414b3c1ed8a0651295 |
| BLAKE2b-256 | c77548bed63b0f3f2d303d6553f96dc5ac6dd956ab805319930f7d1bf8864ae7 |