Calculation of the Ripley K (spatial statistics) value in python
RipleyK
Calculation of the Ripley K (spatial statistics) value in Python. This project is still being developed and currently only supports 'circle' based bounding regions for automated boundary correction. It can support 'rectangle' based bounding regions if you do not require boundary corrections. This package allows quick calculation (using kd-trees) of the Ripley K values for 1D-3D systems.
Installation
You can install the RipleyK package using the following pip command:
pip install ripleyk
To get started from source quickly, follow these steps:
- Clone or download this repository.
- Make a Python environment to run in. We recommend the use of virtual environments. This package is tested on Python 3.9, 3.10, 3.11, 3.12 and 3.13.
- Install the requirements from the requirements.txt file into your new Python environment:
pip install -r requirements.txt
Theory
The mathematical equations for the calculated Ripley K value and the normalised L value in each dimension are as follows:
1D Equations:
$$K(r) = D \frac{\sum_{i=1}^{n} \sum_{j\ne i} I[D(i,j)\leq r]}{\omega n^{2}}$$
$$L(r) = D \frac{\sum_{i=1}^{n} \sum_{j\ne i} I[D(i,j)\leq r]}{\omega n^{2}} - 2r$$
2D Equations:
$$K(r) = A \frac{\sum_{i=1}^{n} \sum_{j\ne i} I[D(i,j)\leq r]}{\omega n^{2}}$$
$$L(r) = A \frac{\sum_{i=1}^{n} \sum_{j\ne i} I[D(i,j)\leq r]}{\omega n^{2}} - \pi r^{2}$$
3D Equations:
$$K(r) = V \frac{\sum_{i=1}^{n} \sum_{j\ne i} I[D(i,j)\leq r]}{\omega n^{2}}$$
$$L(r) = V \frac{\sum_{i=1}^{n} \sum_{j\ne i} I[D(i,j)\leq r]}{\omega n^{2}} - \frac{4}{3} \pi r^{3}$$
Note that the term "region" is used interchangeably for distance, area and volume in the 1D, 2D and 3D descriptions respectively. r is the line (1D) or radius (2D/3D) of the search region. ω is the fraction of the search region that overlaps the whole sample region: it is 1 if the search region lies entirely within the sample region and less than 1 if part of the search region lies outside it. If the whole search region lies outside the sample region, ω is 0 and the Ripley value is not calculated. I is the indicator function, which is 1 if the condition D(i,j)≤r is true and 0 otherwise, where D(i,j) is the Euclidean distance between points i and j. The size of the sample region is given as a distance (D), area (A) or volume (V) in the 1D, 2D and 3D equations respectively.
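To make the 2D equations concrete, here is a minimal brute-force sketch of them. It is not the package's kd-tree implementation: it assumes ω = 1 (no boundary correction), and the function names naive_ripley_k_2d and naive_ripley_l_2d are hypothetical names chosen for illustration.

```python
import math


def naive_ripley_k_2d(points, r, region_area):
    """Brute-force 2D Ripley K with omega fixed at 1 (no boundary correction)."""
    n = len(points)
    count = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            # Indicator function: count ordered pairs (i, j), i != j, within r.
            if i != j and math.hypot(xi - points[j][0], yi - points[j][1]) <= r:
                count += 1
    return region_area * count / (n ** 2)


def naive_ripley_l_2d(points, r, region_area):
    """Normalised value: K(r) minus the pi * r^2 term from the 2D equation."""
    return naive_ripley_k_2d(points, r, region_area) - math.pi * r ** 2


# Four corners of a unit square, treated as a sample region with A = 1.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k = naive_ripley_k_2d(square, 1.0, region_area=1.0)
print(k)  # each corner has 2 neighbours within r = 1, so K = 1 * 8 / 16 = 0.5
```

The double loop costs O(n²) distance evaluations, which is why a spatial index such as a kd-tree is attractive for larger point sets.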
Code Example
For these examples, let's create a random set of points in a sphere of radius 1:
import random
import numpy as np

xs = []
ys = []
zs = []
radius = 1
random.seed(0)
for i in range(0, 10000):
    positioned = False
    # Rejection sampling: draw from the enclosing cube until the point falls inside the sphere.
    while positioned is False:
        x = random.uniform(-radius, radius)
        y = random.uniform(-radius, radius)
        z = random.uniform(-radius, radius)
        if (x**2)+(y**2)+(z**2) < radius**2:
            xs.append(x)
            ys.append(y)
            zs.append(z)
            positioned = True

# Convert the coordinate lists to numpy arrays, one per dimension.
xs = np.array(xs)
ys = np.array(ys)
zs = np.array(zs)
Now we can use our points in the RipleyK calculation method. Note that each list of coordinates is a numpy array.
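The rejection sampling loop above can also be written in vectorised form. The sketch below is an optional alternative, not part of the package: sample_points_in_sphere is a hypothetical helper name, and it uses numpy's own random generator, so it produces different points from the random.seed(0) loop and will not reproduce the numerical results quoted in this document.

```python
import numpy as np


def sample_points_in_sphere(n_points, radius=1.0, seed=0):
    """Draw n_points uniformly inside a sphere by vectorised rejection sampling."""
    rng = np.random.default_rng(seed)
    accepted = np.empty((0, 3))
    while len(accepted) < n_points:
        # Draw a batch from the enclosing cube and keep only points inside the sphere.
        candidates = rng.uniform(-radius, radius, size=(n_points, 3))
        inside = (candidates ** 2).sum(axis=1) < radius ** 2
        accepted = np.vstack([accepted, candidates[inside]])
    accepted = accepted[:n_points]
    return accepted[:, 0], accepted[:, 1], accepted[:, 2]


# Different variable names are used so the arrays from the loop above are not overwritten.
xs_v, ys_v, zs_v = sample_points_in_sphere(10000)
print(len(xs_v))
```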
Single radius calculation (2D)
Let us start simply. We shall calculate the Ripley K value for the circle (r=1) bounding space. As it is a circle, we only require two dimensions (d1 and d2). The calculate_ripley method should be given the radius and bounding space as its first two inputs. The points should be given as one numpy array per dimension, where the spatial location of point[0] would be xs[0], ys[0].
import ripleyk
radius = 0.5
bounding_radius = 1
k = ripleyk.calculate_ripley(radius, bounding_radius, d1=xs, d2=ys)
print(k)
You should get a value of 0.7573 as k.
By default this does not include any boundary correction. To enable it, add boundary_correct=True to the above function call. This should always increase the value. Now we get a value of 0.8646 as k.
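As an illustration of the overlap ratio ω that boundary correction relies on, the sketch below computes the fraction of a search circle that lies inside a circular bounding region, using the standard circle-circle intersection (lens) area. This is a geometric sketch under stated assumptions, not necessarily how the package computes ω internally, and circle_overlap_ratio is a hypothetical name.

```python
import math


def circle_overlap_ratio(d, r, bounding_radius):
    """Fraction of a search circle (radius r) lying inside the bounding circle.

    d is the distance between the centre of the bounding circle and the
    centre of the search circle.
    """
    R = bounding_radius
    if d >= r + R:
        return 0.0  # search circle entirely outside the bounding circle
    if d <= abs(R - r):
        # One circle contains the other; the ratio is 1 whenever r <= R.
        return min(r, R) ** 2 / r ** 2

    def clamp(x):
        # Guard acos against tiny floating point overshoots.
        return max(-1.0, min(1.0, x))

    part_r = r * r * math.acos(clamp((d * d + r * r - R * R) / (2 * d * r)))
    part_R = R * R * math.acos(clamp((d * d + R * R - r * r) / (2 * d * R)))
    kite = 0.5 * math.sqrt(max(0.0, (-d + r + R) * (d + r - R) * (d - r + R) * (d + r + R)))
    return (part_r + part_R - kite) / (math.pi * r * r)


print(circle_overlap_ratio(0.0, 0.5, 1.0))  # search circle fully inside
print(circle_overlap_ratio(1.0, 0.5, 1.0))  # search circle centred on the boundary
```

Because this ratio never exceeds 1 for a search circle no larger than the bounding circle, dividing by it cannot decrease the K value, which matches the behaviour described above.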
Finally, there is a normalisation which can be applied to evaluate the clustering indicated by the k value in comparison to a completely spatially random (CSR) distribution. If the distribution is close to CSR, the value should tend to 0. Below is the same calculation, but now including both boundary correction and normalisation:
import ripleyk
radius = 0.5
bounding_radius = 1
k = ripleyk.calculate_ripley(radius, bounding_radius, d1=xs, d2=ys, boundary_correct=True, CSR_Normalise=True)
print(k)
As we generated the points randomly, we see that this normalisation does make the value much closer to 0: here we got 0.0792. For non-randomly distributed points, a positive value indicates clustering and a negative value indicates dispersion.
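The normalisation can be checked by hand against the equations in the Theory section, which subtract 2r, πr² or (4/3)πr³ from K depending on the dimension. The sketch below applies that subtraction to the boundary-corrected value quoted above; csr_term and normalise are hypothetical helper names, not functions of the package.

```python
import math


def csr_term(r, dimensions):
    """Term subtracted from K(r) in the L(r) equations for 1D, 2D and 3D."""
    if dimensions == 1:
        return 2 * r
    if dimensions == 2:
        return math.pi * r ** 2
    if dimensions == 3:
        return 4.0 / 3.0 * math.pi * r ** 3
    raise ValueError("dimensions must be 1, 2 or 3")


def normalise(k, r, dimensions):
    """Normalised value: K(r) minus the corresponding CSR term."""
    return k - csr_term(r, dimensions)


# Boundary-corrected 2D value from the example above, at radius 0.5.
print(round(normalise(0.8646, 0.5, 2), 4))  # 0.0792
```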
Multiple radii calculation (3D)
It is also possible to use this function to evaluate multiple radii for their Ripley K values. This is the predominant way to evaluate the distribution, by looking over a range of radii and plotting the k value or normalised k value against radius. Below is an example of doing this for the 3D dataset we generated:
import ripleyk
radii = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
k = ripleyk.calculate_ripley(radii, 1, d1=xs, d2=ys, d3=zs, boundary_correct=True, CSR_Normalise=True)
print(k)
You should get the following results:
[-9.29e-06, -0.000105, -0.000391, -0.000728, -0.001386, -0.002302, -0.002896, -0.004294, -0.005930, -0.007915]
These values can be plotted against the input radii as shown below (this requires matplotlib to be installed):
import ripleyk
import matplotlib.pyplot as plt
radii = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
k = ripleyk.calculate_ripley(radii, 1, d1=xs, d2=ys, d3=zs, boundary_correct=True, CSR_Normalise=True)
plt.plot(radii, k, 'bo')
plt.show()
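The normalised 3D values listed above grow in magnitude with the radius, which can make them hard to judge on their own. One possible way to put them in context, sketched below as an assumption rather than a feature of the package, is to divide each value by the (4/3)πr³ term from the 3D equation, giving a deviation relative to the CSR expectation. The numbers are copied from the results quoted above.

```python
import math

radii = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
# Normalised values quoted in the multiple radii example.
normalised = [-9.29e-06, -0.000105, -0.000391, -0.000728, -0.001386,
              -0.002302, -0.002896, -0.004294, -0.005930, -0.007915]


def sphere_volume(r):
    """The (4/3) * pi * r^3 term subtracted in the 3D L(r) equation."""
    return 4.0 / 3.0 * math.pi * r ** 3


# Deviation of each normalised value relative to the CSR term at that radius.
relative = [value / sphere_volume(r) for r, value in zip(radii, normalised)]
for r, rel in zip(radii, relative):
    print(f"r={r:.1f}  relative deviation={rel:+.2%}")
```

For these values every relative deviation stays below one percent in magnitude, consistent with points that were generated randomly.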
Dependencies
Contributing
Contributions are welcome! To contribute, please follow these steps:
- Fork the repository.
- Create a new branch with a descriptive name.
- Make your changes and add tests for them.
- Run the tests to ensure everything is working correctly:
pytest
- Create a pull request.