Visualizing demographic evolution using geographically inconsistent census data

Piccard

Introduction

piccard is a Python package which provides an alternative framework to traditional harmonization techniques for combining spatial data with inconsistent geographic units across multiple years. It uses a network representation containing nodes and edges to retain all information available in the data. Nodes are used to represent all the geographic areas (e.g., census tracts, dissemination areas) for each year. An edge connects two nodes when the geographic area corresponding to the tail node has at least a 5% area overlap with the geographic area corresponding to the head node in the previous available year.
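As a rough illustration of the idea above (not piccard's actual implementation), the sketch below builds edges between areas in consecutive years whenever their overlap reaches a threshold. Everything in it is an assumption made for the example: the toy rectangles standing in for real geometries, the helper names `rect_area`, `overlap_share`, and `build_edges`, the choice to measure overlap as a share of the later area, and the choice to store each edge as an (earlier node, later node) pair.

```python
from itertools import product

# Hypothetical toy data: each area is an axis-aligned rectangle
# (xmin, ymin, xmax, ymax), keyed by an area id and grouped by census year.
areas_by_year = {
    2011: {"A": (0, 0, 2, 2), "B": (2, 0, 4, 2)},
    2016: {"C": (0, 0, 2.05, 2), "D": (2.05, 0, 4, 2)},
}


def rect_area(rect):
    """Area of a rectangle; degenerate (inverted) rectangles count as 0."""
    xmin, ymin, xmax, ymax = rect
    return max(0, xmax - xmin) * max(0, ymax - ymin)


def overlap_share(earlier, later):
    """Intersection area divided by the later area (an assumed convention)."""
    intersection = (
        max(earlier[0], later[0]),
        max(earlier[1], later[1]),
        min(earlier[2], later[2]),
        min(earlier[3], later[3]),
    )
    return rect_area(intersection) / rect_area(later)


def build_edges(areas_by_year, threshold=0.05):
    """Link nodes (year, area_id) in consecutive years when overlap >= threshold."""
    years = sorted(areas_by_year)
    edges = []
    for prev_year, next_year in zip(years, years[1:]):
        pairs = product(
            areas_by_year[prev_year].items(), areas_by_year[next_year].items()
        )
        for (prev_id, prev_rect), (next_id, next_rect) in pairs:
            if overlap_share(prev_rect, next_rect) >= threshold:
                edges.append(((prev_year, prev_id), (next_year, next_id)))
    return edges


edges = build_edges(areas_by_year)
```

In this toy data, area B overlaps area C only by a thin sliver (about 2.4% of C), so that pair falls below the 5% threshold and gets no edge, while A and C, and B and D, are connected.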

Research

The method behind this package can be found in the following research paper:
      Dias, F., & Silver, D. (2018). Visualizing demographic evolution using geographically inconsistent census data. California Digital Library (CDL). https://doi.org/10.31235/osf.io/a3gtd

Installation

The latest released version is available from the Python Package Index (PyPI):

pip install piccard

Importing the package

from piccard import piccard as pc 

Useful Functions

piccard.preprocessing(ct_data, year, id)
      Returns a cleaned GeoDataFrame of the input data with a new column showing the area of each census tract.

piccard.create_network(census_dfs, years, id, threshold=0.05)
      Creates a network representation of the temporal connections present in census_dfs over years. Two areas in consecutive years are connected when their area overlap is at least threshold (the default of 0.05 corresponds to the 5% overlap described in the introduction).

piccard.create_network_table(census_dfs, years, id, threshold=0.05)
      Returns the final network table with all the temporal connections present in census_dfs over years. Two areas in consecutive years are connected when their area overlap is at least threshold (the default of 0.05 corresponds to the 5% overlap described in the introduction).

piccard.draw_subnetwork(network_table, G, sample_pct=0.005)
      Draws a subgraph of the network representation G from a sample of the paths in the network table; sample_pct sets the share of paths sampled.

Note: Further explanation of the parameters and example code for all the above functions can be found in the documentation.
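To make the ideas of a network table and a path sample more concrete, here is a minimal, self-contained sketch. It is not piccard's code, and the table layout that create_network_table actually produces may differ. The assumptions are: edges are (earlier node, later node) pairs, each table row is one path that runs from the first year to the last, paths that end before the last year are dropped, and `sample_pct` is treated as a fraction of rows. The helper names `enumerate_paths`, `to_table`, and `sample_rows` are hypothetical.

```python
import random

# Hypothetical edges between (year, area_id) nodes, as (earlier, later) pairs.
edges = [
    ((2006, "P"), (2011, "A")),
    ((2006, "Q"), (2011, "B")),
    ((2011, "A"), (2016, "C")),
    ((2011, "B"), (2016, "C")),
    ((2011, "B"), (2016, "D")),
]
years = [2006, 2011, 2016]


def enumerate_paths(edges, years):
    """List every path that starts in the first year and reaches the last year."""
    successors = {}
    for earlier, later in edges:
        successors.setdefault(earlier, []).append(later)

    paths = []

    def walk(path):
        node = path[-1]
        if node[0] == years[-1]:
            paths.append(path)
            return
        for nxt in successors.get(node, []):
            walk(path + [nxt])

    starts = sorted({earlier for earlier, _ in edges if earlier[0] == years[0]})
    for start in starts:
        walk([start])
    return paths


def to_table(paths):
    """Turn each path into a row mapping year -> area id."""
    return [{year: area_id for year, area_id in path} for path in paths]


def sample_rows(rows, sample_pct, seed=0):
    """Pick a reproducible random sample of rows (at least one row)."""
    k = max(1, round(len(rows) * sample_pct))
    return random.Random(seed).sample(rows, k)


table = to_table(enumerate_paths(edges, years))
subset = sample_rows(table, sample_pct=0.5)
```

Here `table` holds three rows, one per path through the three years, and `subset` holds two of them. Sampling whole paths rather than individual nodes keeps each area's full history across years intact in the drawn subgraph.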

Dependencies

GeoPandas - Allows spatial operations in Python, making it easier to work with geospatial data
Matplotlib - A comprehensive library for creating visualizations
NetworkX - Adds support for analyzing networks represented by nodes and edges
NumPy - Adds support for large, multi-dimensional arrays and matrices, with functions to operate on these arrays
pandas - Offers data structures and operations for manipulating numerical tables

Authors

Maliha Lodi, Fernando Calderon Figueroa, Daniel Silver

