Some features may not work without JavaScript. Please try enabling it if you encounter problems.

Discover colocation patterns from spatial datasets

These details have not been verified by PyPI

Project links

Homepage

Project description

General Co-location Miner

This repository contains the code for the spatial-colocation package, which implements the main algorithm from:

Shekhar, S., Huang, Y. (2001). Discovering Spatial Co-location Patterns: A Summary of Results. In: Jensen, C.S., Schneider, M., Seeger, B., Tsotras, V.J. (eds) Advances in Spatial and Temporal Databases. SSTD 2001. Lecture Notes in Computer Science, vol 2121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47724-1_13

The Co-location Package

Installation

pip install general-colocation

Usage

Data Pre-Processing

The input to the general function is a pandas DataFrame with, at a minimum, columns for:

position: By default, items in this column must be a GeoPandas datatype (Point, Polygon, Line) so that distance calculations may be performed. It may be possible to define a custom spatial relation function that works with other data types, but none are included in the relations.py file of the package.
class: Labels to separate features by.
id: Unique identifiers for each instance of a class.

Optional Configurations

There are several optional parameters for the general function that modify the spatial relation, prevalence, and conditional probability thresholds as well as the format of the output data. See the explanation of key concepts section for more details on terms like prevalence and event-centric conditional probability.

k: The largest size co-location to find. All co-locations of size k=1,...,k will be returned as items in the output list T.
theta: The prevalence threshold. Co-locations with a participation index below theta will be pruned from the output.
alpha: The conditional probability threshold. Association rules with an event-centric conditional probability below alpha will be pruned from the output.
relation: The spatial relation function to use when determining an item's neighborhood. The relation is a minimum distance in meters by default, but can be changed to a unit distance with the string "unit", or to a custom function by passing a function name to this parameter.
threshold: The distance threshold for the chosen relation function in the basic cases. Some other type of threshold may be used in user-defined relation functions.
plot: If True, a scatter plot of all instances of prevalent co-locations will be shown. This only works when using a backend that will produce graphics or in an interactive environment like Jupyter Notebook.
shape_file: A path to a directory with a shape object to plot co-locations on top of.
out_plot: A path to a directory in which to store a scatter plot of co-locations for each value k=1,...,k
out_csv: A path to a directory in which to store a .csv file with one row for each co-location instance of size k

General Co-location

from general_colocation import colocation
T,R = colocation.general(data, position_column, class_column, id_column, k=3, theta=0.6, alpha=0.5, 
                        relation='meter', threshold=100, plot=False, shape_file=None, out_plot=None, out_csv=None):

Configurable to:

show a plot of all k-colocations with a participation index of theta or higher
store the plot of each set of co-locations for k=1,...,k as a .png file
store the items in the set of prevalent k-colocations as a .csv file

Returns:

a list of DataFrames T, one for each k=1,...,k
the set of association rules R with conditional probability of alpha or higher

Toy Example

To verify that the code is working correctly, execute the following code snippet:

    d = {'x':[1,2,2,3,4,6],'y':[3,1,5,3,5,1],'class':['solid_sq','empty_ci','empty_ci','solid_ci','dotted_sq','dotted_sq'],'id':[1,1,2,1,1,2]}
    data = pd.DataFrame(data=d)
    data['pos'] = gpd.points_from_xy(data.x,data.y)

    T,R = general(data, 'pos','class','id',relation='unit',threshold=2.3)
    
    print('\n',T[-1],'\n')
    
    for r in R:
        print(r)

By running the command:

    colocation

The terminal output should look something like this (timing may vary slightly):

    |C1| = 4, |P1| = 4, |R1| = 0, Rows in T1 = 6, Elapsed Time: 0:00:00.002245
    |C2| = 6, |P2| = 3, |R2| = 6, Rows in T2 = 7, Elapsed Time: 0:00:00.028943
    |C3| = 1, |P3| = 1, |R3| = 3, Rows in T3 = 2, Elapsed Time: 0:00:00.046490

            cat1  id1      cat2  id2      cat3  id3                     pos3
    1  empty_ci    1  solid_ci    1  solid_sq    1  POINT (1.00000 3.00000)
    5  empty_ci    2  solid_ci    1  solid_sq    1  POINT (1.00000 3.00000) 

    {solid_sq} => empty_ci (1.0, 1.0)
    {empty_ci} => solid_sq (1.0, 1.0)
    {empty_ci} => solid_ci (1.0, 1.0)
    {solid_sq} => solid_ci (1.0, 1.0)
    {empty_ci, solid_sq} => solid_ci (1.0, 1.0)
    {solid_ci} => empty_ci (1.0, 1.0)
    {empty_ci, solid_ci} => solid_sq (1.0, 1.0)
    {solid_ci, solid_sq} => empty_ci (1.0, 1.0)
    {solid_ci} => solid_sq (1.0, 1.0)

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

1.0.5

May 3, 2022

This version

1.0.4

May 3, 2022

1.0.3

May 3, 2022

1.0.2

May 3, 2022

1.0.1

May 3, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

general-colocation-1.0.4.tar.gz (7.6 kB view details)

Uploaded May 3, 2022 Source

File details

Details for the file general-colocation-1.0.4.tar.gz.

File metadata

Download URL: general-colocation-1.0.4.tar.gz
Upload date: May 3, 2022
Size: 7.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.0 CPython/3.7.0

File hashes

Hashes for general-colocation-1.0.4.tar.gz
Algorithm	Hash digest
SHA256	`fd71ed53a2b1a64c0005317b464efe9b86ed3f0561bbd2fa7fb405071f203d1b`
MD5	`b331e3ab0c04d0036cbe55c5164c403c`
BLAKE2b-256	`ffc2d33a4009218260f540e8a9a2449937dfe10422fc179985d6265c40002980`

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor

Datadog Monitoring

Depot Continuous Integration

Google Download Analytics

Pingdom Monitoring

Sentry Error logging

StatusPage Status page