A streamlined and approximate implementation of the LexRank algorithm for rapid text summarization.

FastLexRank

LexRank for large-scale data

The original implementation of LexRank uses the power method to compute the eigenvector associated with the eigenvalue 1. In the foundational paper, Erkan and Radev [1] show that the normalized similarity matrix is a stochastic matrix, which is why the power iteration converges.
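To make the power method concrete, here is a minimal sketch of the iteration on a small, hand-written similarity matrix. The function name `lexrank_power_method`, the tolerance, and the example values are illustrative assumptions; this is not the package's API, and it omits refinements such as thresholding or damping.

```python
import numpy as np


def lexrank_power_method(similarity, tol=1e-8, max_iter=1000):
    """Approximate the stationary distribution of a row-normalized similarity matrix.

    Hypothetical helper for illustration only.
    """
    similarity = np.asarray(similarity, dtype=float)
    # Row-normalize so that every row sums to 1 (a row-stochastic matrix).
    transition = similarity / similarity.sum(axis=1, keepdims=True)
    n = transition.shape[0]
    # Start from the uniform distribution over sentences.
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # One power-method step: multiply by the transposed transition matrix.
        updated = transition.T @ scores
        # Stop once the scores barely change between iterations.
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


# Toy symmetric similarity matrix for three sentences (made-up values).
similarity = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])
scores = lexrank_power_method(similarity)
```

Each iteration costs one matrix-vector product over an n × n matrix, so both the matrix construction and the repeated products grow quadratically with the number of sentences.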

However, a key challenge with the original LexRank algorithm is its dependence on the power method, which typically needs many iterations to converge. On a large corpus, the repeated matrix multiplication becomes a bottleneck and slows the computation considerably.

To address this issue, we introduce an approximate approach that computes a score for each sentence efficiently while preserving the sentences' relative centrality. The modified method is considerably faster than the standard LexRank computation and still delivers reliable results.
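The description above does not spell out the approximation, so the following is only a sketch of one plausible shortcut, not necessarily what the package implements. It relies on a known property: for a symmetric, non-negative similarity matrix, the stationary distribution of the row-normalized matrix is proportional to the row sums. With unit-normalized sentence embeddings `X` and dot-product similarity, those row sums equal `X @ X.sum(axis=0)`, so the n × n matrix never needs to be built. The function name `approximate_centrality` and the toy embeddings are assumptions made for illustration.

```python
import numpy as np


def approximate_centrality(embeddings):
    """Score sentences by the row sums of the cosine-similarity matrix.

    Hypothetical helper: computes (X X^T) 1 as X (X^T 1) without forming X X^T.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    # Normalize each sentence embedding to unit length so dot products are cosines.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    # Sum all embeddings once: cost O(n * d) instead of O(n^2 * d).
    summed = unit.sum(axis=0)
    # Each score is the sentence's total similarity to every sentence.
    return unit @ summed


# Toy 2-dimensional "embeddings" for three sentences (made-up values).
embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
])
scores = approximate_centrality(embeddings)
ranking = np.argsort(-scores)  # indices of sentences, most central first
```

The scores are unnormalized, but only their relative order matters when picking the most central sentences. The shortcut assumes similarities are non-negative; with embeddings that produce negative cosines, the link to the stationary distribution no longer holds exactly.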

Reference

[1] Erkan, G., & Radev, D. R. (2004). LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22, 457-479.
