rapid_textrank

License: MIT | Python 3.9+ | Rust

High-performance TextRank implementation in Rust with Python bindings.

Extract keywords and key phrases from text up to 10-100x faster than pure Python implementations (depending on document size and tokenization), with support for multiple algorithm variants and 18 languages.

Features

  • Fast: Up to 10-100x faster than pure Python implementations (see benchmarks)
  • Multiple algorithms: TextRank, PositionRank, BiasedTextRank, TopicRank, SingleRank, TopicalPageRank, and MultipartiteRank variants
  • Unicode-aware: Proper handling of CJK and other scripts (emoji are ignored by the built-in tokenizer)
  • Multi-language: Stopword support for 18 languages
  • Dual API: Native Python classes + JSON interface for batch processing
  • Rust core: Computation happens in Rust (the Python GIL is currently held during extraction)

Quick Start

pip install rapid_textrank

from rapid_textrank import extract_keywords

text = """
Machine learning is a subset of artificial intelligence that enables
systems to learn and improve from experience. Deep learning, a type of
machine learning, uses neural networks with many layers.
"""

keywords = extract_keywords(text, top_n=5, language="en")
for phrase in keywords:
    print(f"{phrase.text}: {phrase.score:.4f}")

Output:

machine learning: 0.2341
deep learning: 0.1872
artificial intelligence: 0.1654
neural networks: 0.1432
systems: 0.0891

How TextRank Works

TextRank is a graph-based ranking algorithm for keyword extraction, inspired by Google's PageRank.

The Algorithm

  1. Build a co-occurrence graph: Words become nodes. An edge connects two words if they appear within a sliding window (default: 4 words).

  2. Run PageRank: The algorithm iteratively distributes "importance" through the graph. Words connected to many important words become important themselves.

  3. Extract phrases: High-scoring words are grouped into noun chunks (POS-filtered) to form key phrases. Word scores are aggregated into a phrase score (sum, mean, max, or RMS).
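The first two steps above can be sketched in plain Python. This is an illustrative toy, not the library's Rust implementation: the helper names build_graph and pagerank are hypothetical, the word list is assumed to be already stopword-filtered, and a window of 2 is taken to mean "adjacent words only".

```python
from collections import defaultdict


def build_graph(words, window=2):
    """Undirected co-occurrence graph: link words at most window-1 positions apart."""
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window, len(words))):
            if words[j] != word:
                neighbors[word].add(words[j])
                neighbors[words[j]].add(word)
    return neighbors


def pagerank(neighbors, damping=0.85, max_iterations=100, tol=1e-6):
    """Plain PageRank on an unweighted, undirected graph."""
    nodes = sorted(neighbors)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(max_iterations):
        new_scores = {}
        for node in nodes:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(scores[m] / len(neighbors[m]) for m in neighbors[node])
            new_scores[node] = (1 - damping) / n + damping * incoming
        delta = sum(abs(new_scores[x] - scores[x]) for x in nodes)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Stopwords ("to", "from") are assumed to be removed already.
words = ["machine", "learning", "enables", "systems", "learn", "data"]
graph = build_graph(words, window=2)
scores = pagerank(graph)

for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.3f}")
```

With a window of 2 the graph is a simple chain, so interior words end up with higher scores than the two endpoints. Larger windows add more edges and spread importance more evenly.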

Text: "Machine learning enables systems to learn from data"

Co-occurrence graph (window=2):
    machine ←→ learning ←→ enables ←→ systems ←→ learn ←→ data
                              ↓
                            PageRank
                              ↓
    Scores: machine(0.23) learning(0.31) enables(0.12) ...
                              ↓
                        Phrase extraction
                              ↓
    "machine learning" (0.54), "systems" (0.18), ...

Algorithm Variants

Variant          | Best For                      | Description
-----------------|-------------------------------|------------
BaseTextRank     | General text                  | Standard TextRank implementation
PositionRank     | Academic papers, news         | Favors words appearing early in the document
BiasedTextRank   | Topic-focused extraction      | Biases results toward specified focus terms
TopicRank        | Multi-topic documents         | Clusters similar phrases into topics and ranks the topics
SingleRank       | Longer documents              | Uses weighted co-occurrence edges and cross-sentence windowing
TopicalPageRank  | Topic-model-guided extraction | Biases SingleRank towards topically important words via personalized PageRank
MultipartiteRank | Multi-topic documents         | Builds a k-partite graph removing intra-topic edges; boosts first-occurring variants

PositionRank

Weights words by their position—earlier appearances score higher. Useful for documents where key information appears in titles, abstracts, or opening paragraphs.

from rapid_textrank import PositionRank

extractor = PositionRank(top_n=10)
result = extractor.extract_keywords("""
Quantum Computing Advances in 2024

Researchers have made significant breakthroughs in quantum error correction.
The quantum computing field continues to evolve rapidly...
""")

# "quantum computing" and "quantum" will rank higher due to early position
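As a rough illustration of the position weighting, the PositionRank paper gives each word a weight equal to the sum of 1/position over all of its occurrences, then normalizes the weights and uses them as PageRank's teleport distribution. The sketch below follows that formula; position_weights is a hypothetical helper, and whether rapid_textrank computes the weights in exactly this way is an assumption.

```python
from collections import defaultdict


def position_weights(words):
    """Sum 1/position over every occurrence (1-indexed), then normalize to sum to 1."""
    raw = defaultdict(float)
    for position, word in enumerate(words, start=1):
        raw[word] += 1.0 / position
    total = sum(raw.values())
    return {word: value / total for word, value in raw.items()}


# Content words in document order (stopwords assumed removed).
words = ["quantum", "computing", "advances", "researchers",
         "quantum", "error", "correction"]
weights = position_weights(words)

for word, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{word}: {weight:.3f}")
```

A word that appears early and repeats ("quantum") accumulates the largest weight, while words that only appear late receive small ones.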

BiasedTextRank

Steers extraction toward specific topics using focus terms. The bias_weight parameter controls how strongly results favor the focus terms.

from rapid_textrank import BiasedTextRank

extractor = BiasedTextRank(
    focus_terms=["security", "privacy"],
    bias_weight=5.0,  # Higher = stronger bias
    top_n=10
)

result = extractor.extract_keywords("""
Modern web applications must balance user experience with security.
Privacy regulations require careful data handling. Performance
optimizations should not compromise security measures.
""")

# Results will favor security/privacy-related phrases
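One way to picture the effect of bias_weight is as a non-uniform teleport distribution in PageRank. The sketch below is an assumption about the general mechanism, not a description of the library's internals: focus terms get a raw teleport weight of bias_weight, every other word gets 1.0, and the vector is normalized. The function names and the toy graph are hypothetical.

```python
def focus_teleport(nodes, focus_terms, bias_weight):
    """Teleport distribution: focus terms weigh bias_weight, everything else 1.0."""
    raw = {node: (bias_weight if node in focus_terms else 1.0) for node in nodes}
    total = sum(raw.values())
    return {node: value / total for node, value in raw.items()}


def personalized_pagerank(neighbors, teleport, damping=0.85, iterations=50):
    """PageRank whose random jumps follow `teleport` instead of a uniform distribution."""
    scores = dict(teleport)
    for _ in range(iterations):
        scores = {
            node: (1 - damping) * teleport[node]
            + damping * sum(scores[m] / len(neighbors[m]) for m in neighbors[node])
            for node in neighbors
        }
    return scores


# Toy graph: a ring of four words, so plain PageRank scores them all equally.
neighbors = {
    "security": ["experience", "privacy"],
    "privacy": ["security", "performance"],
    "performance": ["privacy", "experience"],
    "experience": ["performance", "security"],
}

uniform = personalized_pagerank(neighbors, focus_teleport(neighbors, set(), 5.0))
biased = personalized_pagerank(
    neighbors, focus_teleport(neighbors, {"security", "privacy"}, 5.0)
)
```

On this symmetric ring the unbiased scores are all 0.25. With the bias applied, the two focus terms rise above the other two words even though the graph itself is unchanged.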

TopicRank

TopicRank clusters similar candidate phrases into topics, then ranks the topics. It is exposed via the JSON interface (useful for spaCy-tokenized input).

import json
import spacy
from rapid_textrank import extract_from_json

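nlp = spacy.load("en_core_web_sm")
text = "Machine learning is a subset of artificial intelligence. Deep learning uses neural networks."  # any input document
doc = nlp(text)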

tokens = []
for sent_idx, sent in enumerate(doc.sents):
    for token in sent:
        tokens.append({
            "text": token.text,
            "lemma": token.lemma_,
            "pos": token.pos_,
            "start": token.idx,
            "end": token.idx + len(token.text),
            "sentence_idx": sent_idx,
            "token_idx": token.i,
            "is_stopword": token.is_stop,
        })

payload = {
    "tokens": tokens,
    "variant": "topic_rank",
    "config": {
        "top_n": 10,
        "language": "en",
        "topic_similarity_threshold": 0.25,
        "topic_edge_weight": 1.0,
    },
}

result = json.loads(extract_from_json(json.dumps(payload)))
for phrase in result["phrases"][:10]:
    print(phrase["text"], phrase["score"])
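To show what a similarity threshold such as topic_similarity_threshold controls, here is a simplified clustering sketch. It uses word-set Jaccard overlap and a greedy single pass, which is a simplification chosen for brevity: the TopicRank paper uses hierarchical agglomerative clustering over stems, and the library's exact procedure is not shown here. The names jaccard and cluster_candidates are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between the word sets of two phrases."""
    words_a, words_b = set(a.split()), set(b.split())
    return len(words_a & words_b) / len(words_a | words_b)


def cluster_candidates(candidates, threshold=0.25):
    """Greedy single pass: join the first topic holding a similar-enough phrase."""
    topics = []
    for candidate in candidates:
        for topic in topics:
            if any(jaccard(candidate, member) >= threshold for member in topic):
                topic.append(candidate)
                break
        else:
            # No existing topic was similar enough, so start a new one.
            topics.append([candidate])
    return topics


candidates = [
    "neural network",
    "deep neural network",
    "image recognition",
    "recognition task",
    "data",
]
topics = cluster_candidates(candidates)
print(topics)
```

Raising the threshold splits candidates into more, tighter topics; lowering it merges loosely related phrases into fewer topics.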

SingleRank

SingleRank (Wan & Xiao, 2008) extends TextRank in two ways: edges are weighted by co-occurrence count (repeated neighbors get stronger connections), and the sliding window ignores sentence boundaries so that terms at the end of one sentence connect to terms at the start of the next.

from rapid_textrank import SingleRank

extractor = SingleRank(top_n=10)
result = extractor.extract_keywords("""
Machine learning is a powerful tool. Deep learning is a subset of
machine learning. Neural networks power deep learning systems.
""")

# Cross-sentence co-occurrences strengthen "machine learning" edges
for phrase in result.phrases:
    print(f"{phrase.text}: {phrase.score:.4f}")

SingleRank is also available via the JSON interface with variant="single_rank".

When to use SingleRank over BaseTextRank: SingleRank works well on longer documents where important terms co-occur across sentence boundaries. The weighted edges amplify frequently co-occurring pairs, giving a clearer signal than the binary edges used by BaseTextRank.
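The two differences described above (count-weighted edges and a window that ignores sentence boundaries) can be sketched as follows. The helper weighted_edges is hypothetical and the token lists are assumed to be pre-filtered; it only illustrates the edge-building idea, not the library's implementation.

```python
from collections import Counter


def weighted_edges(sentences, window=3):
    """Count co-occurrences over the flattened token stream (sentence boundaries ignored)."""
    tokens = [token for sentence in sentences for token in sentence]
    weights = Counter()
    for i, first in enumerate(tokens):
        for second in tokens[i + 1 : i + window]:
            if first != second:
                # Store each undirected edge under a canonical (sorted) key.
                weights[tuple(sorted((first, second)))] += 1
    return weights


sentences = [
    ["machine", "learning", "tool"],
    ["deep", "learning", "subset", "machine", "learning"],
]
edges = weighted_edges(sentences, window=2)

for (a, b), count in edges.most_common():
    print(f"{a} <-> {b}: {count}")
```

The pair "machine"/"learning" gets weight 2 because it occurs twice, and "tool"/"deep" are linked even though they sit on opposite sides of a sentence boundary.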

Topical PageRank

Topical PageRank (Sterckx et al., 2015) extends SingleRank by biasing the random walk towards topically important words. Instead of uniform teleportation, PageRank uses a personalization vector derived from per-word topic weights.

Users supply pre-computed topic weights as a {lemma: weight} dictionary. These typically come from a topic model (e.g., LDA via gensim or sklearn), but any source of word importance scores works. Words absent from the dictionary receive a configurable minimum weight (min_weight, default 0.0 — matching PKE's OOV behavior).

from rapid_textrank import TopicalPageRank

# Topic weights from an external topic model or manual assignment
topic_weights = {
    "neural": 0.9,
    "network": 0.8,
    "learning": 0.7,
    "deep": 0.6,
}

extractor = TopicalPageRank(
    topic_weights=topic_weights,
    min_weight=0.01,  # Floor for out-of-vocabulary words
    top_n=10
)

result = extractor.extract_keywords("""
Deep learning is a subset of machine learning that uses artificial neural
networks. Neural networks with many layers can learn complex patterns.
Convolutional neural networks excel at image recognition tasks.
""")

for phrase in result.phrases:
    print(f"{phrase.text}: {phrase.score:.4f}")

# Update topic weights for a different document/topic
result = extractor.extract_keywords(
    "Machine learning enables data-driven decisions...",
    topic_weights={"machine": 0.9, "data": 0.8}
)
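A minimal sketch of how a {lemma: weight} dictionary could become a personalization vector, including the effect of min_weight on words missing from the dictionary. The function teleport_vector is hypothetical, and the uniform fallback when every weight is zero is an assumption made so the sketch never divides by zero.

```python
def teleport_vector(lemmas, topic_weights, min_weight=0.0):
    """Look up each lemma's topic weight (min_weight if absent) and normalize."""
    raw = {lemma: topic_weights.get(lemma, min_weight) for lemma in lemmas}
    total = sum(raw.values())
    if total == 0:
        # Assumption: fall back to a uniform distribution if nothing has weight.
        return {lemma: 1.0 / len(raw) for lemma in raw}
    return {lemma: weight / total for lemma, weight in raw.items()}


lemmas = ["neural", "network", "layer", "pattern"]
topic_weights = {"neural": 0.9, "network": 0.8}

strict = teleport_vector(lemmas, topic_weights)                    # min_weight=0.0
floored = teleport_vector(lemmas, topic_weights, min_weight=0.01)  # small floor
```

With min_weight=0.0 the random walk never teleports to out-of-vocabulary words such as "layer", so they can only gain score through graph edges. A small positive floor keeps them reachable by teleportation.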

TopicalPageRank is also available via the JSON interface with variant="topical_pagerank" (aliases: "tpr", "single_tpr"). Set topic_weights and optionally topic_min_weight in the JSON config:

import json
from rapid_textrank import extract_from_json

payload = {
    "tokens": tokens,  # Pre-tokenized (e.g., from spaCy)
    "variant": "topical_pagerank",
    "config": {
        "top_n": 10,
        "topic_weights": {"neural": 0.9, "network": 0.8, "learning": 0.7},
        "topic_min_weight": 0.01,
    },
}

result = json.loads(extract_from_json(json.dumps(payload)))

Computing topic weights from LDA

The topic_weights_from_lda helper computes per-lemma weights from a trained gensim LDA model, so you can go from corpus to keywords in a few lines:

pip install "rapid_textrank[topic]"   # installs gensim

from gensim.corpora import Dictionary
from gensim.models import LdaModel
from rapid_textrank import TopicalPageRank, topic_weights_from_lda

# 1. Train (or load) an LDA model
# `corpus` is your own list of document strings (not defined in this snippet)
texts = [doc.split() for doc in corpus]      # list of token lists
dictionary = Dictionary(texts)
bow_corpus = [dictionary.doc2bow(t) for t in texts]
lda = LdaModel(bow_corpus, num_topics=10, id2word=dictionary)

# 2. Compute topic weights for a single document
weights = topic_weights_from_lda(lda, bow_corpus[0], dictionary)

# 3. Extract keywords using those weights
raw_text = corpus[0]  # the same document the weights were computed for
extractor = TopicalPageRank(topic_weights=weights, top_n=10)
result = extractor.extract_keywords(raw_text)
for phrase in result.phrases:
    print(f"{phrase.text}: {phrase.score:.4f}")

topic_weights_from_lda accepts an optional aggregation parameter ("max" or "mean") and top_n_words to control how many words per topic are considered. See the docstring for details.

TopicalPageRank vs BiasedTextRank: Both bias extraction towards specific terms, but they differ in how:

  • BiasedTextRank takes a list of focus terms and a single bias weight. It's manual and direct — good when you know exactly which terms matter.
  • TopicalPageRank takes per-word weights, typically from a topic model. It's data-driven — good when you want the topic distribution to guide extraction automatically.

Topic modeling is optional. You can supply any word-importance scores: TF-IDF weights, embedding similarities, domain relevance scores, or hand-picked values.

MultipartiteRank

MultipartiteRank (Boudin, NAACL 2018) extends TopicRank by keeping individual candidates as graph nodes instead of collapsing topics into single representatives. It removes intra-topic edges to form a k-partite graph and applies an alpha weight adjustment that boosts the first-occurring variant in each topic cluster, encoding positional preference directly into edge weights.

from rapid_textrank import MultipartiteRank

extractor = MultipartiteRank(
    similarity_threshold=0.26,  # Jaccard threshold for topic clustering
    alpha=1.1,                  # Position boost strength (0 = disabled)
    top_n=10
)

result = extractor.extract_keywords("""
Machine learning is a powerful tool for data analysis. Deep learning
is a subset of machine learning. Neural networks power deep learning
systems. Convolutional neural networks excel at image recognition.
""")

for phrase in result.phrases:
    print(f"{phrase.text}: {phrase.score:.4f}")

MultipartiteRank is also available via the JSON interface with variant="multipartite_rank" (aliases: "multipartiterank", "multipartite", "mpr"). Set multipartite_alpha and multipartite_similarity_threshold in the JSON config:

import json
from rapid_textrank import extract_from_json

payload = {
    "tokens": tokens,  # Pre-tokenized (e.g., from spaCy)
    "variant": "multipartite_rank",
    "config": {
        "top_n": 10,
        "multipartite_alpha": 1.1,
        "multipartite_similarity_threshold": 0.26,
    },
}

result = json.loads(extract_from_json(json.dumps(payload)))

MultipartiteRank vs TopicRank: Both cluster candidates into topics, but they differ in graph construction:

  • TopicRank collapses each topic into a single node and ranks topics as a whole, then picks the best representative from each.
  • MultipartiteRank keeps every candidate as its own node but removes edges within the same topic. This preserves fine-grained candidate distinctions while still preventing intra-topic competition.
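The graph-construction difference can be illustrated with a small sketch that keeps every candidate as a node and only connects candidates from different topics. It is deliberately reduced: edges are unweighted here (the paper weights them by positional distance) and the alpha adjustment is omitted. The helper multipartite_edges is hypothetical.

```python
from itertools import combinations


def multipartite_edges(topics):
    """Connect every pair of candidates except those that share a topic."""
    topic_of = {
        candidate: index
        for index, topic in enumerate(topics)
        for candidate in topic
    }
    return [
        (a, b)
        for a, b in combinations(sorted(topic_of), 2)
        if topic_of[a] != topic_of[b]
    ]


topics = [
    ["neural network", "deep neural network"],  # one topic, two variants
    ["image recognition"],
    ["data"],
]
edges = multipartite_edges(topics)

for a, b in edges:
    print(f"{a} <-> {b}")
```

Four candidates would give six possible pairs; the single intra-topic pair ("deep neural network", "neural network") is dropped, leaving five edges.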

API Reference

Convenience Function

The simplest way to extract keywords:

from rapid_textrank import extract_keywords

phrases = extract_keywords(
    text,           # Input text
    top_n=10,       # Number of keywords to return
    language="en"   # Language for stopword filtering
)

Class-Based API

For more control, use the extractor classes:

from rapid_textrank import BaseTextRank, PositionRank, BiasedTextRank, SingleRank, TopicalPageRank, MultipartiteRank

# Standard TextRank
extractor = BaseTextRank(top_n=10, language="en")
result = extractor.extract_keywords(text)

# Position-weighted
extractor = PositionRank(top_n=10, language="en")
result = extractor.extract_keywords(text)

# Topic-biased
extractor = BiasedTextRank(
    focus_terms=["machine", "learning"],
    bias_weight=5.0,
    top_n=10,
    language="en"
)
result = extractor.extract_keywords(text)

# You can also pass focus_terms per-call
result = extractor.extract_keywords(text, focus_terms=["neural", "network"])

# SingleRank: weighted edges + cross-sentence windowing
extractor = SingleRank(top_n=10, language="en")
result = extractor.extract_keywords(text)

# Topical PageRank: topic-weight-biased extraction
extractor = TopicalPageRank(
    topic_weights={"neural": 0.9, "network": 0.8},
    min_weight=0.01,
    top_n=10,
    language="en"
)
result = extractor.extract_keywords(text)

# MultipartiteRank: k-partite graph with position boost
extractor = MultipartiteRank(
    similarity_threshold=0.26,
    alpha=1.1,
    top_n=10,
    language="en"
)
result = extractor.extract_keywords(text)

TopicRank is available via the JSON interface using variant="topic_rank" (see below).

Configuration

Fine-tune the algorithm with TextRankConfig:

from rapid_textrank import TextRankConfig, BaseTextRank

config = TextRankConfig(
    damping=0.85,              # PageRank damping factor (0-1)
    max_iterations=100,        # Maximum PageRank iterations
    convergence_threshold=1e-6,# Convergence threshold
    window_size=3,             # Co-occurrence window size
    top_n=10,                  # Number of results
    min_phrase_length=1,       # Minimum words in a phrase
    max_phrase_length=4,       # Maximum words in a phrase
    score_aggregation="sum",   # How to combine word scores: "sum", "mean", "max", "rms"
    language="en",             # Language for stopwords
    include_pos=["NOUN","ADJ","PROPN","VERB"],  # POS tags to include in the graph
    use_pos_in_nodes=True,     # If True, graph nodes are lemma+POS
    phrase_grouping="scrubbed_text",   # "lemma" or "scrubbed_text"
    stopwords=["custom", "terms"]  # Additional stopwords (extends built-in list)
)

extractor = BaseTextRank(config=config)
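To make the score_aggregation options concrete, the sketch below combines per-word scores into a phrase score under each of the four listed modes. The aggregate function is a hypothetical stand-in, and reading "rms" as root mean square is an assumption. The word scores are the illustrative values from the diagram earlier in this document.

```python
import math


def aggregate(word_scores, method="sum"):
    """Combine per-word scores into one phrase score."""
    if method == "sum":
        return sum(word_scores)
    if method == "mean":
        return sum(word_scores) / len(word_scores)
    if method == "max":
        return max(word_scores)
    if method == "rms":
        # Root mean square: square, average, then take the square root.
        return math.sqrt(sum(s * s for s in word_scores) / len(word_scores))
    raise ValueError(f"unknown aggregation method: {method}")


# "machine" (0.23) + "learning" (0.31) from the earlier diagram
phrase_scores = [0.23, 0.31]
for method in ("sum", "mean", "max", "rms"):
    print(f"{method}: {aggregate(phrase_scores, method):.4f}")
```

"sum" tends to favor longer phrases because every extra word adds score, while "mean", "max", and "rms" keep phrases of different lengths on a comparable scale.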

Result Objects

result = extractor.extract_keywords(text)

# TextRankResult attributes
result.phrases      # List of Phrase objects
result.converged    # Whether PageRank converged
result.iterations   # Number of iterations run

# Phrase attributes
for phrase in result.phrases:
    phrase.text     # The phrase text (e.g., "machine learning")
    phrase.lemma    # Lemmatized form
    phrase.score    # TextRank score
    phrase.count    # Occurrences in text
    phrase.rank     # 1-indexed rank

# Convenience method
tuples = result.as_tuples()  # [(text, score), ...]

JSON Interface

For processing large documents or integrating with spaCy, use the JSON interface. This accepts pre-tokenized data to avoid re-tokenizing in Rust. Stopword handling can use each token's is_stopword field and/or a config.language plus config.stopwords (additional words that extend the built-in list). Language codes follow the Supported Languages table below.

from rapid_textrank import extract_from_json, extract_batch_from_json
import json

# Single document
doc = {
    "tokens": [
        {
            "text": "Machine",
            "lemma": "machine",
            "pos": "NOUN",
            "start": 0,
            "end": 7,
            "sentence_idx": 0,
            "token_idx": 0,
            "is_stopword": False
        },
        # ... more tokens
    ],
    "variant": "textrank",
    "config": {
        "top_n": 10,
        "language": "en",
        "stopwords": ["nlp", "transformers"]
    }
}

result_json = extract_from_json(json.dumps(doc))
result = json.loads(result_json)

# Batch processing (Rust core; per-document processing is sequential)
# doc1, doc2, doc3 are payload dicts shaped like `doc` above
docs = [doc1, doc2, doc3]
results_json = extract_batch_from_json(json.dumps(docs))
results = json.loads(results_json)

variant can be one of the following:

  • "textrank" (default)
  • "position_rank"
  • "biased_textrank": set focus_terms and bias_weight in the JSON config.
  • "topic_rank": set topic_similarity_threshold and topic_edge_weight in the JSON config.
  • "single_rank"
  • "topical_pagerank" (aliases: "tpr", "single_tpr"): set topic_weights and optionally topic_min_weight in the JSON config.
  • "multipartite_rank" (aliases: "multipartiterank", "multipartite", "mpr"): set multipartite_alpha and multipartite_similarity_threshold in the JSON config.
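If spaCy is not available, a payload with the same token fields can be assembled by hand. The sketch below uses a naive regex tokenizer; simple_tokens is a hypothetical helper, the lemma is just the lowercased text, and every token is tagged NOUN as a placeholder, so results from real input would likely differ from spaCy-tagged input. Unlike the spaCy example, token_idx here counts only the kept word tokens.

```python
import json
import re


def simple_tokens(text, stopwords=frozenset()):
    """Naive tokenizer producing the token fields used by the JSON interface."""
    tokens = []
    sentence_idx = 0
    for match in re.finditer(r"\w+|[.!?]", text):
        word = match.group()
        if word in {".", "!", "?"}:
            sentence_idx += 1  # sentence-ending punctuation starts a new sentence
            continue
        tokens.append({
            "text": word,
            "lemma": word.lower(),        # placeholder: not a true lemma
            "pos": "NOUN",                # placeholder: no real POS tagging
            "start": match.start(),       # character offsets into `text`
            "end": match.end(),
            "sentence_idx": sentence_idx,
            "token_idx": len(tokens),
            "is_stopword": word.lower() in stopwords,
        })
    return tokens


payload = {
    "tokens": simple_tokens("Machine learning works. Deep learning too.", {"too"}),
    "variant": "textrank",
    "config": {"top_n": 5, "language": "en"},
}
payload_json = json.dumps(payload)
```

The resulting payload_json string has the same shape as the document passed to extract_from_json above.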

Supported Languages

Stopword filtering is available for 18 languages. Use these codes for the language parameter in all APIs (including JSON config):

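Code | Language  | Code | Language | Code | Language
-----|-----------|------|----------|------|-----------
en   | English   | de   | German   | fr   | French
es   | Spanish   | it   | Italian  | pt   | Portuguese
nl   | Dutch     | ru   | Russian  | sv   | Swedish
no   | Norwegian | da   | Danish   | fi   | Finnish
hu   | Hungarian | tr   | Turkish  | pl   | Polish
ar   | Arabic    | zh   | Chinese  | ja   | Japanese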

You can inspect the built-in stopword list with:

import rapid_textrank as rt
rt.get_stopwords("en")

Performance

rapid_textrank achieves significant speedups through Rust's performance characteristics and careful algorithm implementation.

Benchmark Script

Run this script to compare performance on your hardware:

"""
Benchmark: rapid_textrank vs pytextrank

Prerequisites:
    pip install rapid_textrank pytextrank spacy
    python -m spacy download en_core_web_sm
"""

import time
import statistics

# Sample texts of varying sizes
TEXTS = {
    "small": """
        Machine learning is a subset of artificial intelligence.
        Deep learning uses neural networks with many layers.
    """,

    "medium": """
        Natural language processing (NLP) is a field of artificial intelligence
        that focuses on the interaction between computers and humans through
        natural language. The ultimate goal of NLP is to enable computers to
        understand, interpret, and generate human language in a valuable way.

        Machine learning approaches have transformed NLP in recent years.
        Deep learning models, particularly transformers, have achieved
        state-of-the-art results on many NLP tasks including translation,
        summarization, and question answering.

        Key applications include sentiment analysis, named entity recognition,
        machine translation, and text classification. These technologies
        power virtual assistants, search engines, and content recommendation
        systems used by millions of people daily.
    """,

    "large": """
        Artificial intelligence has evolved dramatically since its inception in
        the mid-20th century. Early AI systems relied on symbolic reasoning and
        expert systems, where human knowledge was manually encoded into rules.

        The machine learning revolution changed everything. Instead of explicit
        programming, systems learn patterns from data. Supervised learning uses
        labeled examples, unsupervised learning finds hidden structures, and
        reinforcement learning optimizes through trial and error.

        Deep learning, powered by neural networks with multiple layers, has
        achieved remarkable success. Convolutional neural networks excel at
        image recognition. Recurrent neural networks and transformers handle
        sequential data like text and speech. Generative adversarial networks
        create realistic synthetic content.

        Natural language processing has been transformed by these advances.
        Word embeddings capture semantic relationships. Attention mechanisms
        allow models to focus on relevant context. Large language models
        demonstrate emergent capabilities in reasoning and generation.

        Computer vision applications include object detection, facial recognition,
        medical image analysis, and autonomous vehicle perception. These systems
        process visual information with superhuman accuracy in many domains.

        The ethical implications of AI are significant. Bias in training data
        can lead to unfair outcomes. Privacy concerns arise from data collection.
        Job displacement affects workers across industries. Regulation and
        governance frameworks are being developed worldwide.

        Future directions include neuromorphic computing, quantum machine learning,
        and artificial general intelligence. Researchers continue to push
        boundaries while addressing safety and alignment challenges.
    """ * 3  # repeated 3x to simulate a longer document
}


def benchmark_rapid_textrank(text: str, runs: int = 10) -> dict:
    """Benchmark rapid_textrank."""
    from rapid_textrank import BaseTextRank

    extractor = BaseTextRank(top_n=10, language="en")

    # Warmup
    extractor.extract_keywords(text)

    times = []
    for _ in range(runs):
        start = time.perf_counter()
        result = extractor.extract_keywords(text)
        elapsed = time.perf_counter() - start
        times.append(elapsed * 1000)  # Convert to ms

    return {
        "min": min(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "std": statistics.stdev(times) if len(times) > 1 else 0,
        "phrases": len(result.phrases)
    }


def benchmark_pytextrank(text: str, runs: int = 10) -> dict:
    """Benchmark pytextrank with spaCy."""
    import spacy
    import pytextrank

    nlp = spacy.load("en_core_web_sm")
    nlp.add_pipe("textrank")

    # Warmup
    doc = nlp(text)

    times = []
    for _ in range(runs):
        start = time.perf_counter()
        doc = nlp(text)
        phrases = list(doc._.phrases[:10])
        elapsed = time.perf_counter() - start
        times.append(elapsed * 1000)

    return {
        "min": min(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "std": statistics.stdev(times) if len(times) > 1 else 0,
        "phrases": len(phrases)
    }


def main():
    print("=" * 70)
    print("TextRank Performance Benchmark")
    print("=" * 70)

    for size, text in TEXTS.items():
        word_count = len(text.split())
        print(f"\n{size.upper()} TEXT (~{word_count} words)")
        print("-" * 50)

        # Benchmark rapid_textrank
        rust_results = benchmark_rapid_textrank(text)
        print(f"rapid_textrank:  {rust_results['mean']:>8.2f} ms (±{rust_results['std']:.2f})")

        # Benchmark pytextrank
        try:
            py_results = benchmark_pytextrank(text)
            print(f"pytextrank:     {py_results['mean']:>8.2f} ms (±{py_results['std']:.2f})")

            speedup = py_results['mean'] / rust_results['mean']
            print(f"Speedup:        {speedup:>8.1f}x faster")
        except Exception as e:
            print(f"pytextrank:     (not available: {e})")

    print("\n" + "=" * 70)
    print("Note: pytextrank times include spaCy tokenization.")
    print("For fair comparison with pre-tokenized input, use rapid_textrank's JSON API.")
    print("=" * 70)


if __name__ == "__main__":
    main()

Why Rust is Fast

The performance advantage comes from several factors:

  1. CSR Graph Format: The co-occurrence graph uses Compressed Sparse Row format, enabling cache-friendly memory access during PageRank iteration.

  2. String Interning: Repeated words share a single allocation via StringPool, reducing memory usage 10-100x for typical documents.

  3. Parallel Processing: Rayon provides data parallelism in internal graph construction without explicit thread management.

  4. Link-Time Optimization (LTO): Release builds use full LTO with single codegen unit for maximum inlining.

  5. Rust core: Most computation happens in Rust, minimizing Python-level overhead.

  6. FxHash: Fast non-cryptographic hashing for internal hash maps.
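For readers unfamiliar with the Compressed Sparse Row layout mentioned in the first item, here is a small Python sketch of the idea. The function names and the toy edge list are hypothetical; the real graph lives in Rust, and this only shows why a node's neighbors end up contiguous in memory.

```python
def to_csr(num_nodes, edges):
    """Convert undirected weighted edges (u, v, w) into three flat CSR arrays."""
    rows = [[] for _ in range(num_nodes)]
    for u, v, w in edges:
        rows[u].append((v, w))
        rows[v].append((u, w))
    indptr, indices, data = [0], [], []
    for row in rows:
        for v, w in sorted(row):
            indices.append(v)   # neighbor id
            data.append(w)      # edge weight
        indptr.append(len(indices))  # where the next node's neighbors start
    return indptr, indices, data


def neighbors(csr, node):
    """Neighbors of `node` are one contiguous slice of the flat arrays."""
    indptr, indices, data = csr
    start, end = indptr[node], indptr[node + 1]
    return list(zip(indices[start:end], data[start:end]))


csr = to_csr(4, [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 1.0)])
print(csr[0])             # row offsets
print(neighbors(csr, 1))  # neighbors of node 1 with their weights
```

Because each PageRank iteration walks every node's neighbor slice in order, the flat arrays are read sequentially rather than through scattered pointers.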

Installation

From PyPI

pip install rapid_textrank

Import name is rapid_textrank.

With spaCy Support

pip install "rapid_textrank[spacy]"

import spacy
import rapid_textrank.spacy_component  # registers the pipeline factory

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("rapid_textrank")

doc = nlp("Machine learning is a subset of artificial intelligence.")
for phrase in doc._.phrases[:5]:
    print(f"{phrase.text}: {phrase.score:.4f}")

From Source

Requirements: Rust 1.70+, Python 3.9+

git clone https://github.com/xang1234/rapid-textrank
cd rapid-textrank
pip install maturin
maturin develop --release

Development Setup

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run Rust tests
cargo test

Publishing

Publishing is automated with GitHub Actions using Trusted Publishing (OIDC), so no API tokens are stored.

TestPyPI release (push a tag):

git tag -a test-0.1.0 -m "TestPyPI 0.1.0"
git push origin test-0.1.0

Tag pattern: test-*

PyPI release (push a tag):

git tag -a v0.1.0 -m "Release 0.1.0"
git push origin v0.1.0

Tag pattern: v*

Wheel builds

GitHub Actions builds wheels for Python 3.9–3.12 on Linux, macOS, and Windows.

Before the first publish, add Trusted Publishers on TestPyPI and PyPI:

  • Repo: xang1234/rapid-textrank
  • Workflows: .github/workflows/publish-testpypi.yml and .github/workflows/publish-pypi.yml
  • Environments: testpypi and pypi

You can also trigger either workflow manually via GitHub Actions if needed.

License

MIT License - see LICENSE for details.

Citation

If you use rapid_textrank in research, please cite the original TextRank paper:

@inproceedings{mihalcea-tarau-2004-textrank,
    title = "{T}ext{R}ank: Bringing Order into Text",
    author = "Mihalcea, Rada and Tarau, Paul",
    booktitle = "Proceedings of EMNLP 2004",
    year = "2004",
    publisher = "Association for Computational Linguistics",
}

For PositionRank:

@inproceedings{florescu-caragea-2017-positionrank,
    title = "{P}osition{R}ank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents",
    author = "Florescu, Corina and Caragea, Cornelia",
    booktitle = "Proceedings of ACL 2017",
    year = "2017",
}

For SingleRank:

@inproceedings{wan-xiao-2008-singlerank,
    title = "Single Document Keyphrase Extraction Using Neighborhood Knowledge",
    author = "Wan, Xiaojun and Xiao, Jianguo",
    booktitle = "Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI 2008)",
    year = "2008",
    pages = "855--860",
}

For TopicRank:

@inproceedings{bougouin-boudin-daille-2013-topicrank,
    title = "{T}opic{R}ank: Graph-Based Topic Ranking for Keyphrase Extraction",
    author = "Bougouin, Adrien and Boudin, Florian and Daille, B{\'e}atrice",
    booktitle = "Proceedings of the Sixth International Joint Conference on Natural Language Processing",
    year = "2013",
    pages = "543--551",
    publisher = "Asian Federation of Natural Language Processing",
}

For MultipartiteRank:

@inproceedings{boudin-2018-multipartiterank,
    title = "Unsupervised Keyphrase Extraction with Multipartite Graphs",
    author = "Boudin, Florian",
    booktitle = "Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018)",
    year = "2018",
    pages = "667--672",
}

For Topical PageRank:

@inproceedings{sterckx-etal-2015-topical,
    title = "Topical Word Importance for Fast Keyphrase Extraction",
    author = "Sterckx, Lucas and Demeester, Thomas and Deleu, Johannes and Develder, Chris",
    booktitle = "Proceedings of the 24th International Conference on World Wide Web (Companion Volume)",
    year = "2015",
    pages = "121--122",
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rapid_textrank-0.1.4.tar.gz (249.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

rapid_textrank-0.1.4-cp312-cp312-win_amd64.whl (471.6 kB view details)

Uploaded CPython 3.12Windows x86-64

rapid_textrank-0.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (593.5 kB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

rapid_textrank-0.1.4-cp312-cp312-macosx_11_0_arm64.whl (552.1 kB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

rapid_textrank-0.1.4-cp311-cp311-win_amd64.whl (470.9 kB view details)

Uploaded CPython 3.11Windows x86-64

rapid_textrank-0.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (594.2 kB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

rapid_textrank-0.1.4-cp311-cp311-macosx_11_0_arm64.whl (551.2 kB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

rapid_textrank-0.1.4-cp310-cp310-win_amd64.whl (471.1 kB view details)

Uploaded CPython 3.10Windows x86-64

rapid_textrank-0.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (594.2 kB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

rapid_textrank-0.1.4-cp310-cp310-macosx_11_0_arm64.whl (551.4 kB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

rapid_textrank-0.1.4-cp39-cp39-win_amd64.whl (471.3 kB view details)

Uploaded CPython 3.9Windows x86-64

rapid_textrank-0.1.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (594.7 kB view details)

Uploaded CPython 3.9manylinux: glibc 2.17+ x86-64

rapid_textrank-0.1.4-cp39-cp39-macosx_11_0_arm64.whl (551.7 kB view details)

Uploaded CPython 3.9macOS 11.0+ ARM64

File details

Details for the file rapid_textrank-0.1.4.tar.gz.

File metadata

  • Download URL: rapid_textrank-0.1.4.tar.gz
  • Upload date:
  • Size: 249.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rapid_textrank-0.1.4.tar.gz
Algorithm Hash digest
SHA256 db1f18c17cc8292806f194a3ac1f29aac496f917aa02491d44a90f80484e8a01
MD5 94730aa9a50145e81866d40ede789ff4
BLAKE2b-256 c0448296f24a051730ea8122f81f781961f9377967200fd40424601ed23468dd

See more details on using hashes here.

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4.tar.gz:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rapid_textrank-0.1.4-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for rapid_textrank-0.1.4-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 649ec50461a618dc24e7f07efe3ccba0293a3be2d59889e473b17adda6bc9746
MD5 bca48c0bce1706eb042119308b09fa42
BLAKE2b-256 6f7139867c369daa021e5daa3da38eda6ff69f4c441b66ed38b33643d8b44179

See more details on using hashes here.

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp312-cp312-win_amd64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rapid_textrank-0.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for rapid_textrank-0.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7e69a3da07f090181c5d723c18fbcd260a00785f22145827f1e040b361e7b436
MD5 fe022b682113c7d8ab7bf2089b7ded0e
BLAKE2b-256 6e547ac47a3cf84fdb4d773a841641496d2d0cd03731b75d6a44a7cfe538abfa

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 735e0dadfedf2a602ac9c69a18fdc8e2fd1238f6c7d161e3343db0f19c25895f
MD5 c9dcb89bd11b29245ffbccae90e0cb58
BLAKE2b-256 a1fee6497455031c5bae653a777cfb71f563cd1d99dd890df7078a953ec7e3b4

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp311-cp311-win_amd64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 436757693d143f99e21c15617c65a866a4a0f659735e5798054254d352ac6418
MD5 b1abc73e38cc6ca3c516b59448d9e54b
BLAKE2b-256 e98ee1c994cff4fb41d602c766641d10da40583aa55c8f5c9ae837c0edcef582

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp311-cp311-win_amd64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 769996ef2e45ae9978d90cb09a7721ae107283c9a866d092987cf90e66de32fd
MD5 fa71930651bba11066a183c0212ac0f0
BLAKE2b-256 1ceaa475cdf00789631c616d405c3452ae07de4b95dfc7068df3054590df56a0

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b03fdccc3a527d57f6c030f2c6ac347a0f1a5c01d0a394f85748a38f236c8b53
MD5 51b2765e6b45b2d2c33658f029bfcc49
BLAKE2b-256 924d57e6459ec65c1f056043dca242905a867a6cc280a722fdd87b93cf4e36b6

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp310-cp310-win_amd64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 9d97e92977f190d14051f7d5013b5f42f46815802a5ad09229780fd4bd97f7da
MD5 b15823af019faaaa420242d603b55a92
BLAKE2b-256 552fbe59e5d50683c96f4927c37935e14829598ec72984d9ea92dbd4679484aa

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp310-cp310-win_amd64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a38339c6667107c71e87d684401e4d41e576d0cd2990c705873820972912cb83
MD5 66d2ae5582673daccbd90eb0b778a5a7
BLAKE2b-256 e4b3352aafe58e5973c3a3bf1619667cd714d5a4c0fd384da01f497d9443f39e

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 99b178c7249312ea9aae73932c6720b06907dc52cd3641d13345328ec86c98b9
MD5 54024fbbc6db161531f40c37ebbbebd9
BLAKE2b-256 ec4c23d9523cb287d9fe7afc814de45a7118092221012062f4d67bb6a0bee977

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp39-cp39-win_amd64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 04f777fc0d7675bf77f72afd380fd5e3d8051fba45b1ff3d0157bbd9137629a8
MD5 5540e395a246213e701cf0fde21a61b4
BLAKE2b-256 5b805bc8f4428150526a7e6c4bdd84d6b579d4d9be17ce7342dfa9ec31f25833

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp39-cp39-win_amd64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e621364d86ca466fbe57402964b85bc8f81cfb6bdddc65db392486f168470bb6
MD5 e073fe3e34176e22515af8d7b308af55
BLAKE2b-256 1a0aa3b18298d53b04f975780992bfbeaced6356f4eba45b1dfece4802ef3415

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank

File details

Details for the file rapid_textrank-0.1.4-cp39-cp39-macosx_11_0_arm64.whl.

File hashes

Hashes for rapid_textrank-0.1.4-cp39-cp39-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 25ac33e2daa2f1a12358522bc4ce70e21ce49b4fa5f5caaf6cb384a262034009
MD5 bd5ae5a3d7c4e7396917fcfc1096d813
BLAKE2b-256 5d9114a597179ba5037e97483f998954247cfc5e29ff6f656ded42ee54ba66fa

Provenance

The following attestation bundles were made for rapid_textrank-0.1.4-cp39-cp39-macosx_11_0_arm64.whl:

Publisher: publish-pypi.yml on xang1234/rapid-textrank
