A fast PageRank and Personalized PageRank implementation

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Fast Personalized PageRank Implementation

I needed a fast PageRank for the Wikisim project. It had to be fast enough to run in real time on relatively large graphs. NetworkX was the obvious library to use; however, it required translating back and forth between my graph representation (a fairly standard CSR matrix) and its internal graph data structure. These translations slowed down the process.

I implemented two versions of the algorithm in Python (an exact solution and a power method), both inspired by the sparse fast solutions given in Cleve Moler's book, Experiments with MATLAB. The power method is much faster and is precise enough for our task.
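To make the power method concrete, here is a minimal sketch of power iteration on a CSR adjacency matrix. It is an illustration under stated assumptions, not the package's actual code: the function name `pagerank_power_sketch`, the `max_iter` parameter, the L1 stopping rule, and the handling of dangling nodes (their mass is spread uniformly) are all my own choices for this example.

```python
import numpy as np
from scipy import sparse


def pagerank_power_sketch(G, p=0.85, tol=1e-6, max_iter=1000):
    """Illustrative power iteration (hypothetical helper, not the package API)."""
    G = sparse.csr_matrix(G, dtype=float)
    n = G.shape[0]
    # Total outgoing weight per node; nodes with zero out-weight are "dangling".
    out_weight = np.asarray(G.sum(axis=1)).ravel()
    dangling = out_weight == 0
    inv_out = np.zeros(n)
    inv_out[~dangling] = 1.0 / out_weight[~dangling]
    # Transition matrix: M[j, i] = G[i, j] / out_weight[i].
    M = (sparse.diags(inv_out) @ G).T.tocsr()
    teleport = np.full(n, 1.0 / n)  # uniform teleportation vector
    x = teleport.copy()
    for _ in range(max_iter):
        # Follow links with probability p, teleport otherwise;
        # mass sitting on dangling nodes is redistributed via `teleport`.
        x_next = p * (M @ x + x[dangling].sum() * teleport) + (1 - p) * teleport
        if np.abs(x_next - x).sum() < tol:
            return x_next
        x = x_next
    return x


# The four-node graph used in the Example section (A=0, B=1, C=2, D=3).
edges = np.array([[0, 1], [0, 2], [1, 2], [2, 0], [3, 2]])
G = sparse.csr_matrix(
    (np.ones(len(edges)), (edges[:, 0], edges[:, 1])), shape=(4, 4)
)
pr = pagerank_power_sketch(G, p=0.85, tol=1e-10)
print(pr)
```

Each iteration costs one sparse matrix-vector product, which is why this approach scales to large graphs when only moderate precision is needed.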

Personalized PageRank

I modified the algorithm slightly so that it can also calculate personalized PageRank.
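One common way to express personalization is to replace the uniform teleportation vector with a normalized preference vector. The sketch below does this for an exact (linear-solve) formulation. It is an assumption-laden illustration, not the package's code: the name `personalized_pagerank_exact_sketch` is hypothetical, and the solve-then-normalize step is simply one standard way to write the computation.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def personalized_pagerank_exact_sketch(G, p=0.85, personalize=None):
    """Illustrative exact solve (hypothetical helper, not the package API)."""
    G = sparse.csr_matrix(G, dtype=float)
    n = G.shape[0]
    out_weight = np.asarray(G.sum(axis=1)).ravel()
    # 1 / out_weight, with zeros left in place for dangling nodes.
    inv_out = np.divide(1.0, out_weight, out=np.zeros(n), where=out_weight != 0)
    M = (sparse.diags(inv_out) @ G).T  # M[j, i] = G[i, j] / out_weight[i]
    if personalize is None:
        v = np.full(n, 1.0 / n)        # plain PageRank: uniform teleportation
    else:
        v = np.asarray(personalize, dtype=float)
        v = v / v.sum()                # normalize preferences to sum to 1
    # Solve (I - p M) y = v, then rescale so the scores sum to 1.
    lhs = (sparse.identity(n, format="csc") - p * M).tocsc()
    y = spsolve(lhs, v)
    return y / y.sum()


# The four-node graph used in the Example section (A=0, B=1, C=2, D=3).
edges = np.array([[0, 1], [0, 2], [1, 2], [2, 0], [3, 2]])
G = sparse.csr_matrix(
    (np.ones(len(edges)), (edges[:, 0], edges[:, 1])), shape=(4, 4)
)
ppr = personalized_pagerank_exact_sketch(G, personalize=[0.4, 0.2, 0.2, 0.4])
print(ppr)
```

Passing `personalize=None` falls back to the uniform vector, so the same function covers both plain and personalized PageRank in this sketch.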

Comparison with Popular Python Implementations: NetworkX and iGraph

Both implementations (exact solution and power method) are much faster than their corresponding methods in NetworkX. The power method is also faster than the native iGraph implementation, which is also an eigenvector-based solution. Benchmarking was done on an ml.t3.2xlarge SageMaker instance.

What is the major drawback of NetworkX PageRank?

I gave up using NetworkX for one simple reason: I had to calculate PageRank several times, and my internal representation of a graph was a simple sparse matrix. Every time I wanted to calculate PageRank, I had to translate it to the graph representation of NetworkX, which was slow. My benchmarking shows that NetworkX has a pretty fast implementation of PageRank (networkx.pagerank_numpy and networkx.pagerank_scipy), but translating from its own graph data structure to a CSR matrix before doing the actual calculations is exactly what slows down the whole algorithm.

Note: I didn't count the time spent on nx.from_scipy_sparse_matrix (converting a CSR matrix before passing it to NetworkX PageRank) in my benchmarking, but I could have, because that was another bottleneck for me, as it is in many other cases where one starts from a CSR adjacency matrix.

Python Implementation

The Python package is hosted at https://github.com/asajadi/fast-pagerank and you can find the installation guide in the README.md file. You can also find a detailed analysis in the Jupyter notebook or this blog post.

Usage

Installation:

pip install fast-pagerank

Example

Let's take Example 1 from https://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm

Assuming A=0, B=1, C=2, D=3:

>>> import numpy as np
>>> from scipy import sparse
>>> from fast_pagerank import pagerank
>>> from fast_pagerank import pagerank_power
>>> # Edge list as (source, target) pairs: A->B, A->C, B->C, C->A, D->C
>>> A = np.array([[0, 1], [0, 2], [1, 2], [2, 0], [3, 2]])
>>> weights = [1, 1, 1, 1, 1]
>>> G = sparse.csr_matrix((weights, (A[:, 0], A[:, 1])), shape=(4, 4))
>>> pr = pagerank(G, p=0.85)
>>> pr
array([0.37252685, 0.19582391, 0.39414924, 0.0375    ])

The output elements are essentially the same numbers as those written on the nodes, but normalized (multiply the vector by 4 and you will get the same numbers).

We can add personalization, or use the power method:
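As a quick check of that rescaling, the snippet below multiplies the output vector shown above by the number of nodes. It uses only the values printed in this example; the variable names are my own.

```python
import numpy as np

# PageRank vector copied from the output above (sums to 1).
pr = np.array([0.37252685, 0.19582391, 0.39414924, 0.0375])
n_nodes = 4

# Rescale so the scores sum to the number of nodes instead of 1.
scaled = pr * n_nodes
print(np.round(scaled, 2))
```

The rounded values can then be compared directly with the numbers on the nodes in the linked example.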

>>> personalize = np.array([0.4, 0.2, 0.2, 0.4])
>>> pr = pagerank_power(G, p=0.85, personalize=personalize, tol=1e-6)
>>> pr
array([0.37817981, 0.18572635, 0.38609383, 0.05      ])

Release history

- 1.0.0 (this version): Jul 2, 2023
- 0.0.4: Jun 27, 2019
- 0.0.3: Jun 3, 2019

Download files

Source Distribution

fast-pagerank-1.0.0.tar.gz (4.7 kB view details)

Uploaded Jul 2, 2023 Source

Built Distribution

fast_pagerank-1.0.0-py3-none-any.whl (5.3 kB view details)

Uploaded Jul 2, 2023 Python 3

File details

Details for the file fast-pagerank-1.0.0.tar.gz.

File metadata

Download URL: fast-pagerank-1.0.0.tar.gz
Upload date: Jul 2, 2023
Size: 4.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for fast-pagerank-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`9d6ad2d4ce9eb556bc1cdc08fa2d9dd73dd478120b64d48b7f46efdb05ed835f`
MD5	`c07855d527f75d27e804900e9481b20c`
BLAKE2b-256	`c88e0618b4ca14b12515627305c14c43884798ae95481d05f293b5313ae4edd7`

File details

Details for the file fast_pagerank-1.0.0-py3-none-any.whl.

File metadata

Download URL: fast_pagerank-1.0.0-py3-none-any.whl
Upload date: Jul 2, 2023
Size: 5.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for fast_pagerank-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`87396b1b2de7c972784af98cd2dbba986cc8df22069ce0111bd17aeddba399d0`
MD5	`bb19606f1324b0f20f1b705bcf472c99`
BLAKE2b-256	`27cf194e4502944bc186457f3e456e17b4eb380ec0560a023aa58538ed05e1ff`
