
!!! This site is under construction !!!

AABPL-toolkit-python

(c) Gabriel M. Ahlfeldt, Thilo N. H. Albers, Kristian Behrens, Max von Mylius, Version 0.1.0, 2024-10

About

This repository is part of the Toolkit of Prime Locations (AABPL). It contains a Python version of the prime locations delineation algorithm developed by Ahlfeldt, Albers, and Behrens (2024). It is designed to be more readily accessible than the C++/Stata hybrid version used by Ahlfeldt, Albers, and Behrens (2024). The algorithm uses arbitrary spatial point patterns as input and returns a gridded version of the data along with polygons of the delineated spatial clusters as outputs.

Note that while this implementation of the algorithm follows the same basic steps as the one used by Ahlfeldt, Albers, and Behrens (2024), it will not necessarily generate exactly the same results. The Python package is designed to enhance usability. There are subtle differences in the way counterfactual distributions are generated, establishments are assigned to grid cells, clusters are aggregated, and convex hulls are generated. Importantly, the current version of the algorithm samples from a bounding box built around the establishments input into the algorithm, whereas Ahlfeldt, Albers, and Behrens (2024) condition on the presence of employment. Therefore, the parameter values that need to be defined in the program syntax cannot be directly transferred from Ahlfeldt, Albers, and Behrens (2024).

We recommend that users find their own preferred values depending on the context and purpose of the clustering. We aim to allow for a user-specified sampling area so that users can, akin to Ahlfeldt, Albers, and Behrens (2024), exclude arbitrary areas when generating counterfactual establishment distributions. For replication of the results reported in Ahlfeldt, Albers, and Behrens (2024), we refer to the official replication directory.
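The bounding-box sampling described above can be sketched as follows (a simplified illustration with uniform draws; `draw_counterfactual_points` is a hypothetical name for this sketch, not the package's API):

```python
import random

def draw_counterfactual_points(lons, lats, n, seed=0):
    """Draw n random points uniformly from the bounding box spanned
    by the input coordinates (simplified sketch of the counterfactual
    sampling described above)."""
    rng = random.Random(seed)
    min_lon, max_lon = min(lons), max(lons)
    min_lat, max_lat = min(lats), max(lats)
    return [
        (rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
        for _ in range(n)
    ]

# usage: three establishments spanning a small box
pts = draw_counterfactual_points([13.3, 13.5, 13.4], [52.4, 52.6, 52.5], n=1000)
```

Fixing the seed makes the counterfactual draws reproducible, mirroring the `random_seed` argument of the package.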

When using the algorithm in your work, please cite:

Ahlfeldt, Albers, Behrens (2024): Prime locations. American Economic Review: Insights, forthcoming.

Installation

To install the aabpl package of the AABPL toolkit, run the following command in your Python environment from the terminal.

pip install aabpl

Alternatively, you can install it from within your Python script:

import subprocess, sys
subprocess.check_call([sys.executable, "-m", "pip", "install", 'aabpl'])
In case an error occurs during installation

If you see an error message like 'metadata-generation-failed', it is likely caused by incompatible versions of setuptools and packaging. This can be fixed by upgrading setuptools and packaging to compatible versions:

pip install --upgrade "setuptools>=74.1.1"
pip install --upgrade "packaging>=22.0"

Or by downgrading setuptools:

pip install --upgrade setuptools==70.0.0

Usage

You may then load the package by running:

import aabpl

Alternatively, import the functions explicitly:

# imports 
from pandas import read_csv
from aabpl.main import radius_search, detect_cluster_pts, detect_cluster_cells

Program syntax

Explain the syntax with its arguments here

Examples

Example 1:

path_to_your_csv = 'input_data/prime_points_weighted_79.txt'
crs_of_your_csv =  "EPSG:4326"
pts = read_csv(path_to_your_csv, sep=",", header=None)
pts.columns = ["eid", "employment", "industry", "lat","lon","moved"]

grid = detect_cluster_cells(
    pts=pts,
    crs=crs_of_your_csv,
    r=750,
    columns=['employment'],
    exclude_pt_itself=True,
    distance_thresholds=2500,
    k_th_percentiles=[99.5],
    n_random_points=int(1e5),
    make_convex=True,
    random_seed=0,
    silent=True,
)
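The `k_th_percentiles` argument sets the significance cutoff: broadly, a point counts as clustered when its radius sum exceeds the k-th percentile of radius sums in the counterfactual random distribution (see random_distribution.py below). A minimal sketch of such a percentile cutoff (illustrative only; `kth_percentile_cutoff` is a hypothetical helper, not part of the package):

```python
import random

def kth_percentile_cutoff(values, k):
    """Return the k-th percentile of a list of (counterfactual) radius
    sums, using linear interpolation between order statistics. Points
    whose observed radius sum exceeds this cutoff would be flagged as
    clustered (simplified sketch, not the package code)."""
    s = sorted(values)
    rank = (k / 100) * (len(s) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(s) - 1)
    frac = rank - lo
    return s[lo] + (s[hi] - s[lo]) * frac

# usage: 99.5th percentile of simulated radius sums
rng = random.Random(0)
sums = [rng.expovariate(1.0) for _ in range(100_000)]
cutoff = kth_percentile_cutoff(sums, 99.5)
```

A higher percentile (e.g. 99.5 rather than 95) makes the delineation more conservative, flagging fewer points as clustered.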

## Save DataFrames with radius sums and clusters
# Using all the save options below is most likely excessive;
# saving the shapefiles via save_cell_clusters and save_sparse_grid is
# most likely sufficient.

# save files as needed
# save only the clusters, including their geometry, aggregate values, area, and id
grid.save_cell_clusters(filename=output_gis_folder+'clusters', file_format='shp')
grid.save_cell_clusters(filename=output_data_folder+'clusters', file_format='csv')

# save sparse grid including only those cells that contain at least one point
grid.save_sparse_grid(filename=output_gis_folder+'sparse_grid', file_format='shp')
grid.save_sparse_grid(filename=output_data_folder+'sparse_grid', file_format='csv')

# save full grid including cells that contain no points (the many empty cells will occupy unnecessary disk space)
# grid.save_full_grid(filename=output_gis_folder+'full_grid', file_format='shp')
# grid.save_full_grid(filename=output_data_folder+'full_grid', file_format='csv')

pts.to_csv(output_data_folder+'pts_df_w_clusters.csv')

# CREATE PLOTS
grid.plot.clusters(filename=output_maps_folder+'clusters_employment_750m_995th')
grid.plot.vars(filename=output_maps_folder+'employment_vars')
grid.plot.cluster_pts(filename=output_maps_folder+'employment_cluster_pts')
grid.plot.rand_dist(filename=output_maps_folder+'rand_dist_employment')

Ready-to-use script

If you are new to Python, you may find it useful to execute the Example.py (or Example.ipynb) script saved in this folder. It will install the package, load the testing data set, delineate clusters, and save the outputs to your working directory. It should be straightforward to adapt the script to your data and preferred parameter values.

Inputs

The compulsory input into the algorithm is a file containing spatial point pattern data. In the application by Ahlfeldt, Albers, and Behrens (2024), spatial points are establishments. However, these could also be individuals, buildings, or any other subjects or objects whose location can be referenced by geographic coordinates. The data file should contain geographic coordinates in standard decimal degrees and a variable that defines the importance of a subject or object. In the application by Ahlfeldt, Albers, and Behrens (2024), the importance is represented by the employment of an establishment. However, it could also be the productivity of a worker, the height of a building, or any weight that summarizes the importance of a data point. Of course, equal importance will be reflected by a uniform value.

In case you wish to use the above Example.py script without making any adjustments (except for setting your root directory), you should create a comma-separated file with exactly the same name and structure as the plants.txt file provided in this repository (this is just the renamed 'prime_points_weighted_79.txt' file from the AABPL-toolkit). Note that this example input file does not include variable names. It includes variables in the following order (separated by commas):

  • identifier variable: In our case, this is an establishment identifier. If you do not need this, you can set all values to 1.
  • importance weight: In our case, this is predicted employment. If you want to use equal weights, you can set all values to 1.
  • category identifier: In our case, this is the type of establishment (e.g., accounting, consulting, etc.). If you do not care, you can set all values to 1.
  • latitude: Given in decimal degrees in the standard WGS1984 geographic coordinate system.
  • longitude: Given in decimal degrees in the standard WGS1984 geographic coordinate system.
  • placeholder for another variable: You can ignore it.

Variable names will then be assigned by the script. Of course, with some adjustments to the 'Example.py' script, you can also import data sets that already contain variable names. Just make sure that latitudes and longitudes are defined by variables named lat and lon. You can define the name of the variable representing your importance weights in the program syntax.
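A minimal sketch of creating such a headerless input file, with the six columns in the order listed above (the values below are made up for illustration):

```python
import csv

# Write a headerless, comma-separated input file with the six columns
# described above: identifier, importance weight, category identifier,
# latitude, longitude, placeholder. Values are hypothetical.
rows = [
    [1, 120.0, 1, 52.5200, 13.4050, 0],
    [2,  35.5, 2, 52.5105, 13.3989, 0],
    [3,   1.0, 1, 52.4934, 13.4233, 0],
]
with open("plants.txt", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

Setting the identifier, weight, or category columns to a uniform value of 1 reproduces the "if you do not need this" cases described above.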

For future versions of the package, we aim to allow for a shapefile that defines the sampling area of the counterfactual distribution as an optional input. This shapefile must be projected within the WGS1984 geographic coordinate system. Ahlfeldt, Albers, and Behrens (2024) exclude residential and undevelopable areas. Such a shapefile could also restrict the sampling area for counterfactual spatial distributions to inhabitable areas or to areas zoned for the development of tall buildings.
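Conceptually, restricting the sampling area amounts to rejection sampling: draw from the bounding box and keep only points inside the user-provided geometry. A stand-alone sketch using a ray-casting point-in-polygon test (illustrative only; `sample_in_polygon` is not the package's planned API):

```python
import random

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon,
    given as a list of (lon, lat) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def sample_in_polygon(polygon, n, seed=0):
    """Rejection sampling: draw from the polygon's bounding box and
    keep only points inside the polygon (conceptual sketch of the
    planned sampling-area restriction)."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    pts = []
    while len(pts) < n:
        lon = rng.uniform(min(xs), max(xs))
        lat = rng.uniform(min(ys), max(ys))
        if point_in_polygon(lon, lat, polygon):
            pts.append((lon, lat))
    return pts

# usage: a unit-square sampling area
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
inside_pts = sample_in_polygon(square, 500)
```

For very irregular sampling areas the rejection rate rises, which is one reason grid- or geometry-aware sampling is preferable in practice.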

Outputs

The package will create a number of folders in your working directory into which the outputs are saved. File names are those specified in the Example.py file (you may choose different names).

| Folder | File | Description |
| --- | --- | --- |
| output_data | clusters.csv | CSV file containing information on the final delineated clusters, including geographic coordinates in decimal degrees, a cluster id corresponding to the rank in the distribution of total mass within the cluster (in our case employment), the number of cells within the cluster, and the total area of the cluster (in square meters). |
| output_data | grid_clusters.csv | CSV file containing a gridded version of the data set, including groups of clustered grid cells identified by the cluster id, geographic coordinates in decimal degrees, and the total mass in the grid cell (in our case employment). |
| output_data | pts_df_w_clusters.csv | CSV file containing the plants with the input data and, in addition, an identifier for the cluster to which a plant belongs. |
| output_gis | grid_clusters.* | Shapefile of the gridded data set including the same information as grid_clusters.csv. |
| output_gis | clusters.* | Shapefile of the final output, i.e. aggregated clusters (in our case prime locations), along with the same information as clusters.csv. |
| output_maps | clusters_employment_750m_995th.png | Map showing the boundaries of the final output, i.e. clusters after aggregation (in our case to prime locations), with the density of the selected importance weight (in our case employment) in the background. |
| output_maps | employment_cluster_pts.png | Map showing the plants and how clustered they are. |
| output_maps | rand_dist_employment.png | Technical output to inform the choice of the p-value. |

In each case, you may choose a different file name in the 'Example.py' script.

Other outputs can be generated by activating the respective lines (by removing the '#') in the 'Example.py' script.

Folder Structure and Files (OUTDATED)

| Folder | Name | Description |
| --- | --- | --- |
| aabpl | main.py | Contains the main user-facing functions: radius_search and detect_clusters |
| aabpl | disk_search.py | Performs the radius search in multiple steps. (1)(a) Assigns each point to a grid cell and (b) stores point ids per grid cell and precalculates sums per grid cell. (2) Divides each cell into cell regions that define which of the surrounding cells are fully included in the search radius and which are only partly overlapped by it; assigns each point to such a relative search region, avoiding unnecessary checks on cells (via methods from 2d_weak_ordering). (3) Loops over all search source points sorted by cell id and cell region and (a) sums the precalculated sums of non-empty cells fully within the search radius (or reuses this sum from the last source point if the same cells were relevant), (b) retrieves all search target points from partly overlapped cells (or reuses them from the last source point if the same cells were relevant), and (c) checks the bilateral distance from the source point to the target points and sums the values of target points within the search radius |
| aabpl | grid_class.py | Mostly implemented. Creates a class for the grid |
| aabpl | 2d_weak_ordering.py | Complex logic that reduces the number of cells that need to be checked for overlap with the search radius. Relative to the origin cell (0,0), it creates a hierarchical weak ordering of the surrounding cells. E.g., cell (1,1) is always closer to any point within cell (0,0) than cell (2,2); but whether a point within cell (0,0) is closer to cell (1,0) or to cell (0,1) is unclear |
| aabpl | random_distribution.py | Functions to draw random points and obtain the cutoff value for the k-th percentile |
| aabpl | valid_area.py | Not implemented yet. Will include functions that allow the user to provide an (in)valid area, either as a geometry or as a list of (in)valid cell ids |
| aabpl | distances_to_cell.py | Helper functions to calculate the smallest/largest distance from a cell to (1) another cell, (2) a triangle, and (3) points. Also contains other functions related to cell distance checks |
| aabpl | general.py | Small helper functions unrelated to the radius search methods |
| aabpl/illustrations | *.py | Methods for illustrating the algorithm (mainly used for testing, but may remain in the final version to allow users to illustrate the algorithm) |
| plots | opt_grid | Created in the first days of the project to get a feeling for the importance of the grid spacing relative to the search radius |
| aabpl | optimal_grid_spacing | Not fully implemented. Automatically chooses the optimal grid spacing to execute the radius search as fast as possible |
| aabpl | nested_search | Not fully implemented. Nesting grid cells improves scaling for very dense (relative to the search radius) point data sets |
| aabpl/documentation | docstring.py | Not implemented yet. Will include repetitive help text for functions |
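For contrast with the grid-based search in disk_search.py, a naive radius search simply checks every pairwise distance, which is exactly the quadratic cost the cell-region bookkeeping above is designed to avoid. A brute-force sketch in planar coordinates (illustrative only, not the package's implementation):

```python
import math

def naive_radius_sums(points, weights, r):
    """Brute-force O(n^2) radius search: for each point, sum the
    weights of all other points within distance r (planar
    coordinates). The grid-based search avoids most of these
    pairwise distance checks."""
    n = len(points)
    sums = [0.0] * n
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if i == j:  # mirrors exclude_pt_itself=True
                continue
            xj, yj = points[j]
            if math.hypot(xi - xj, yi - yj) <= r:
                sums[i] += weights[j]
    return sums

# usage: three points on a line, radius 1.5
sums = naive_radius_sums([(0, 0), (1, 0), (3, 0)], [10, 20, 30], 1.5)
# → [20.0, 10.0, 0.0]
```

The precalculated per-cell sums described in step (1)(b) let the real algorithm replace most of the inner loop with a handful of cell-level additions.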

Selected Files

| Folder | Name | Description |
| --- | --- | --- |
| - | AABPL-Codebook.pdf | Codebook laying out the structure of the delineation algorithm in pseudo code |

References

Ahlfeldt, Albers, Behrens (2024): Prime locations. American Economic Review: Insights, forthcoming.
