A project developed by the City of Cleveland Office of Urban Analytics and Innovation, built to simplify civic data processing for the public.
cledatatoolkit
...is a library of tools published by the City of Cleveland's Office of Urban Analytics & Innovation (Urban AI). The package is meant for civic data analysis using local, county and regional data sources. Many city datasets can be found using the City of Cleveland Open Data Portal.
This package is divided into a series of modules. These modules can be used to perform a variety of functions, but are not necessarily related to one another. To learn more about what each module does, please visit the Documentation section.
Table of Contents
Installation
Overview
Documentation
Additional Resources
Installation
You can use pip in your terminal to install the package.
If you have difficulty installing on Linux (primarily Ubuntu) or macOS due to issues with the arcgis package (a dependency of this toolkit), you will likely need to install extra dependencies. For those operating systems, try first installing the Kerberos library (on Ubuntu, via sudo apt install libkrb5-dev) before using pip to install this package.
pip install cle-data-toolkit
This will also install the following dependencies:
geopandas
pandas
arcgis
numpy
We recommend installing into a virtual environment so that you do not modify your base Python installation.
Overview
This package contains several modules that perform a variety of functions including, but not limited to:
ArcGIS Online API Helper Functions
- Extracting GeoDataFrames from ArcGIS Online FeatureLayers.
- Managing fields, features, and metadata within ArcGIS Online items.
Spatial Helper Functions
- Apportioning: Allocating Census data to local boundaries that don't align
- Custom functions built on top of geopandas for more advanced spatial joins
  - Largest overlap
County Property Data Enhancement
- Extracting insights from property owner names
- Standardizing property types and ownership
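The apportioning idea listed under Spatial Helper Functions can be illustrated with plain area weighting. This is a minimal sketch, not this package's implementation: apportion_by_area and its inputs are hypothetical names, and it assumes the value being allocated is spread evenly across the source geography.

```python
def apportion_by_area(source_value, source_area, overlap_areas):
    """Split a source value across targets in proportion to overlap area.

    overlap_areas maps a target boundary name to the area it shares with
    the source geography (same units as source_area).
    """
    if source_area <= 0:
        raise ValueError("source_area must be positive")
    return {
        target: source_value * overlap / source_area
        for target, overlap in overlap_areas.items()
    }


# A hypothetical census tract of 2.0 sq km with 1,000 residents straddles two wards.
allocated = apportion_by_area(1000, 2.0, {"Ward A": 1.5, "Ward B": 0.5})
print(allocated)  # {'Ward A': 750.0, 'Ward B': 250.0}
```

In practice the overlap areas would come from a spatial overlay of the two boundary layers in a projected coordinate reference system.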
Documentation
Table of Contents
cledatatoolkit.ago_helpers module
cledatatoolkit.census module
cledatatoolkit.property module
cledatatoolkit.spatial module
cledatatoolkit.ago_helpers module
cledatatoolkit.ago_helpers.FLCWrapper(layer_id, container_id, gis)
FLCWrapper stands for FeatureLayerCollection Wrapper. This is a class that contains various "quality of life" functions for working with the ArcGIS Online API, specifically FeatureLayerCollections. FeatureLayerCollections are FeatureServices that contain one or more FeatureLayers.
Parameters:
- container_id (string): The ArcGIS Online ID of the FeatureLayerCollection on which the FLCWrapper instance is based.
- gis (arcgis.gis.GIS): The GIS connection object to the ArcGIS REST API. This determines the context in which data can be retrieved from ArcGIS Online.
Properties:
- FLCWrapper.esriLookup (dictionary): A dictionary that maps commonly used column types to specialized Esri field types. These field types are defined in the Service Definition of a FeatureService. This property is used in audit_schema() in the FLWrapper class.
- FLCWrapper.container (arcgis.features.FeatureLayerCollection): The FeatureLayerCollection object from the ArcGIS Online REST API.
- FLCWrapper.container_id (string): The ArcGIS Online ID of the FLCWrapper.container object.
- FLCWrapper.container_item (arcgis.gis.Item): The Item object from the ArcGIS Online REST API, which contains the FLCWrapper.container object.
- FLCWrapper.gis (arcgis.gis.GIS): The GIS connection object to the ArcGIS REST API. This determines the context in which data can be retrieved from ArcGIS Online.
- FLCWrapper.sqlLookup (dictionary): A dictionary that maps commonly used column types to specialized SQL field types. These field types are defined in the Service Definition of a FeatureService. This property is used in audit_schema() in the FLWrapper class.
cledatatoolkit.ago_helpers.FLCWrapper.add(schema, type='layer')
Add a new FeatureLayer to the FeatureLayerCollection.
Parameters:
- schema (dictionary): A dictionary of Service Definition properties that define the Layer or Table.
- type (string): Either "layer" or "table". This determines the type of the FeatureLayer. Defaults to "layer".
Raises:
- Exception: If the type parameter is neither "layer" nor "table".
Returns:
- arcgis.features.FeatureLayer: An ArcGIS Online reference to the newly created Layer if type is set to "layer".
- arcgis.features.Table: An ArcGIS Online reference to the newly created Table if type is set to "table".
cledatatoolkit.ago_helpers.FLCWrapper.delete(id, type='layer')
Delete a FeatureLayer from the FeatureLayerCollection.
Parameters:
- id (integer): ID of the Layer or Table to delete (e.g., 0, 1, 2).
- type (string): Either "layer" or "table". This determines the type of the FeatureLayer that is being deleted. Defaults to "layer".
Raises:
- Exception: If the type parameter is neither "layer" nor "table".
Returns:
None
cledatatoolkit.ago_helpers.FLCWrapper.get_layer(id)
Retrieve a FeatureLayer from within the FeatureLayerCollection.
Parameters:
- id (integer): ID of the FeatureLayer within the FeatureLayerCollection (e.g., 0, 1, 2).
Returns:
- arcgis.features.FeatureLayer: The ArcGIS Online reference to the FeatureLayer.
cledatatoolkit.ago_helpers.FLCWrapper.get_layer_index(name)
Get the index of a FeatureLayer within a FeatureLayerCollection.
Parameters:
- name (string): The name of the FeatureLayer as defined in the Service Definition.
Raises:
- Exception: If no results are found, an exception is raised.
Returns:
- integer: If only one result is found, the numeric ID of the FeatureLayer that matches the name argument.
- list: If multiple results are found, a list of numeric IDs corresponding to all FeatureLayers that match the name argument.
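Because the return value is an integer for a single match and a list for several, calling code may want one consistent shape. A minimal sketch, assuming only the return behavior described above; as_index_list is a hypothetical helper and not part of this package:

```python
def as_index_list(result):
    """Normalize an int-or-list lookup result to a list of ints."""
    if isinstance(result, int):
        return [result]
    return list(result)


print(as_index_list(2))       # [2]
print(as_index_list([0, 3]))  # [0, 3]
```

The same pattern applies to any lookup that returns either one ID or a list of IDs.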
cledatatoolkit.ago_helpers.FLCWrapper.get_table(id)
Retrieve a Table from within the FeatureLayerCollection.
Parameters:
- id (integer): ID of the Table within the FeatureLayerCollection (e.g., 0, 1, 2).
Returns:
- arcgis.features.FeatureLayer: The ArcGIS Online reference to the Table.
cledatatoolkit.ago_helpers.FLCWrapper.get_table_index(name)
Get the index of a Table within a FeatureLayerCollection.
Parameters:
- name (string): The name of the Table as defined in the Service Definition.
Raises:
- Exception: If no results are found, an exception is raised.
Returns:
- integer: If only one result is found, the numeric ID of the Table that matches the name argument.
- list: If multiple results are found, a list of numeric IDs corresponding to all Tables that match the name argument.
cledatatoolkit.ago_helpers.FLCWrapper.paste(schema_id, layer_index)
This function copies a Service Definition from a pre-existing ArcGIS Online FeatureLayer and appends it to the FeatureLayerCollection as a new FeatureLayer without any Features. It only works for spatial Layers; Tables are not currently supported.
Parameters:
- schema_id (string): The ArcGIS Online ID of the FeatureLayerCollection that contains the FeatureLayer you want to paste.
- layer_index (integer): The numeric index of the FeatureLayer within the FeatureLayerCollection referenced in schema_id (e.g., 0, 1, 2).
Returns:
- arcgis.features.FeatureLayer: An ArcGIS Online reference to the newly created FeatureLayer.
cledatatoolkit.ago_helpers.FLCWrapper.update_container()
Refresh the connection to the FeatureLayerCollection.
Returns:
None
cledatatoolkit.ago_helpers.FLWrapper(layer_id, container_id, gis, how='layer')
FLWrapper stands for FeatureLayer Wrapper. This is a class that contains various "quality of life" functions for working with the ArcGIS Online API, specifically FeatureLayers. FeatureLayers are individual layers contained within a FeatureLayerCollection. FeatureLayers can be either spatial 'layers' or nonspatial 'tables'. This wrapper supports both types.
Extends cledatatoolkit.ago_helpers.FLCWrapper
Parameters:
- layer_id (integer): The ID of the Layer or Table within the FeatureLayerCollection. This is a number that often corresponds to the sequence of FeatureLayers within the collection.
- container_id (string): The ArcGIS Online ID of the FeatureLayerCollection on which the FLWrapper instance is based.
- gis (arcgis.gis.GIS): The GIS connection object to the ArcGIS REST API. This determines the context in which data can be retrieved from ArcGIS Online.
- how (string): The type of FeatureLayer, either 'layer' or 'table'. Defaults to 'layer'.
Properties:
- All properties contained in cledatatoolkit.ago_helpers.FLCWrapper.
- FLWrapper.crs (integer): The EPSG ID of the FeatureLayer's coordinate reference system. This property defaults to None until the FLWrapper.spatialize() method is executed.
- FLWrapper.fs (arcgis.features.FeatureSet): An ArcGIS FeatureSet of the FeatureLayer. This property defaults to None until the FLWrapper.spatialize() method is executed.
- FLWrapper.layer (arcgis.features.FeatureLayer or arcgis.features.Table): A reference to the ArcGIS Online FeatureLayer object.
- FLWrapper.layer_id (integer): The numeric index of the FeatureLayer within the containing FeatureLayerCollection.
- FLWrapper.gdf (geopandas.GeoDataFrame): A GeoDataFrame based on the FeatureSet defined in FLWrapper.fs. This property defaults to None until the FLWrapper.spatialize() method is executed.
- FLWrapper.sdf (pandas.DataFrame): A Spatially Enabled Pandas DataFrame based on the FeatureSet defined in FLWrapper.fs. This property defaults to None until the FLWrapper.spatialize() method is executed.
cledatatoolkit.ago_helpers.FLWrapper.add_field(field_dict)
Add a new field to the FeatureLayer.
Parameters:
- field_dict (dictionary): A dictionary of properties that define the field in the Service Definition, as outlined in the ArcGIS Online REST API.
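As an illustration of what such a dictionary might look like, the sketch below builds a text field definition using key names from the ArcGIS REST API field object. make_text_field is a hypothetical helper, not part of this package, and the field name and length are made-up values.

```python
def make_text_field(name, alias=None, length=255):
    """Build a Service Definition entry for a string field (hypothetical helper)."""
    return {
        "name": name,
        "alias": alias or name,
        "type": "esriFieldTypeString",  # Esri field type for text
        "length": length,
        "nullable": True,
        "editable": True,
    }


field_dict = make_text_field("owner_name", alias="Owner Name", length=100)
print(field_dict)

# With an existing FLWrapper instance named wrapper, the call would then be:
# wrapper.add_field(field_dict)
```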
Returns:
None
cledatatoolkit.ago_helpers.FLWrapper.audit_fields(columns)
Compares an inputted list of column names to field names in the FeatureLayer. This is useful for comparing a DataFrame's column names to fields in the Service Definition.
Parameters:
- columns (list): A list of field names. Could be from a pandas or PySpark DataFrame.
Returns:
- dictionary: The key 'Only in FL' contains a list of fields that are only in the FeatureLayer; the key 'Not in FL' contains a list of fields that are only in the inputted list of column names.
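The comparison described here amounts to two set differences. A minimal sketch of that logic, assuming plain lists of names on both sides; compare_fields is a hypothetical stand-in and may differ from how the package computes its result (for example in ordering):

```python
def compare_fields(layer_fields, columns):
    """Report names found on only one side of the comparison."""
    layer_set, column_set = set(layer_fields), set(columns)
    return {
        "Only in FL": sorted(layer_set - column_set),
        "Not in FL": sorted(column_set - layer_set),
    }


report = compare_fields(
    ["OBJECTID", "parcel_id", "owner"],  # fields in the FeatureLayer
    ["parcel_id", "owner", "ward"],      # columns in a DataFrame
)
print(report)  # {'Only in FL': ['OBJECTID'], 'Not in FL': ['ward']}
```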
cledatatoolkit.ago_helpers.FLWrapper.audit_schema(dtypes)
Compares the FeatureLayer's Service Definition field schema to a list of data type tuples, usually sourced from a pandas or PySpark DataFrame using the dtypes attribute. This function compares field schemas across the following three categories: name, type, and order.
Parameters:
- dtypes (list): A list of tuples, where the first element of the tuple is a field name and the second is the data type.
Returns:
- boolean: If the order, types, and names of the dtypes parameter all match those of the FeatureLayer, True is returned. Otherwise False is returned.
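A check across name, type, and order can be sketched as a single ordered comparison after translating DataFrame types to Esri field types. This is an assumption-laden illustration: schemas_match is hypothetical, and the lookup below is a made-up stand-in for esriLookup, whose actual contents may differ.

```python
def schemas_match(layer_schema, dtypes, type_lookup):
    """Return True only if names, types, and order all agree."""
    translated = [(name, type_lookup.get(dtype)) for name, dtype in dtypes]
    return translated == list(layer_schema)


# Hypothetical lookup from DataFrame type names to Esri field types.
lookup = {
    "string": "esriFieldTypeString",
    "int": "esriFieldTypeInteger",
    "double": "esriFieldTypeDouble",
}
layer_schema = [
    ("parcel_id", "esriFieldTypeString"),
    ("units", "esriFieldTypeInteger"),
]

same = schemas_match(layer_schema, [("parcel_id", "string"), ("units", "int")], lookup)
reordered = schemas_match(layer_schema, [("units", "int"), ("parcel_id", "string")], lookup)
print(same, reordered)  # True False
```

Comparing ordered lists of tuples covers all three categories at once: a renamed field, a changed type, or a reordered column each makes the lists unequal.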
cledatatoolkit.ago_helpers.FLWrapper.delete_field(field_name)
Delete a field from the FeatureLayer.
Parameters:
- field_name (string): The name of the field to delete.
Returns:
None
cledatatoolkit.ago_helpers.FLWrapper.spatialize(clause=None)
Query features from the FeatureLayer. This will initialize the Spatially Enabled DataFrame (FLWrapper.sdf) and FeatureSet (FLWrapper.fs). This function will also extract the Coordinate Reference System (FLWrapper.crs) and build a GeoDataFrame of the features (FLWrapper.gdf).
Parameters:
- clause (string): A SQL clause for filtering features. If None is passed, the entire FeatureLayer is queried. Defaults to None.
Returns:
None
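A typical workflow might wrap a layer, call spatialize(), and then read the populated properties. The sketch below follows the signatures documented above; load_layer_gdf is a hypothetical helper, and it assumes the layer is public so that an anonymous GIS() connection is enough. Imports are placed inside the function so the definition can be loaded even where arcgis is not installed.

```python
def load_layer_gdf(container_id, layer_id=0, clause=None):
    """Return a GeoDataFrame for one layer of a FeatureLayerCollection."""
    from arcgis.gis import GIS
    from cledatatoolkit.ago_helpers import FLWrapper

    gis = GIS()  # anonymous connection; assumes the item is publicly shared
    wrapper = FLWrapper(layer_id, container_id, gis, how="layer")
    wrapper.spatialize(clause)  # populates fs, sdf, crs, and gdf
    return wrapper.gdf


# Example call; replace the placeholder with a real ArcGIS Online item ID:
# gdf = load_layer_gdf("<container_id>", layer_id=0, clause="1=1")
```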
cledatatoolkit.ago_helpers.FLWrapper.update(update_dict)
Update the FeatureLayer Service Definition.
Parameters:
- update_dict (dictionary): Dictionary of parameters to update in the definition.
Returns:
None
cledatatoolkit.ago_helpers.FLWrapper.upsert(fs, id_field, batch_size=0)
This function will upsert features to the FeatureLayer based on a FeatureSet. This means new features will be added or existing features will be updated depending on whether or not the feature is already in the FeatureLayer.
Parameters:
- fs (arcgis.features.FeatureSet): A FeatureSet containing features to add and/or update.
- id_field (string): The field on which the upsert is performed. This field is used to compare features from the inputted FeatureSet to features within the FeatureLayer.
- batch_size (integer): Recommended for larger datasets. The number of features to upsert per batch. After every batch the system will sleep for one second to avoid a timeout error. If zero, the entire dataset will be uploaded in a single batch. Defaults to 0.
Returns:
None
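The batching behavior described for batch_size can be sketched independently of ArcGIS. This is an illustration of the pattern only, not the package's code: iter_batches, upload_in_batches, and the send callback are hypothetical names.

```python
import time


def iter_batches(features, batch_size=0):
    """Yield features in chunks; a batch_size of 0 yields everything at once."""
    features = list(features)
    if batch_size <= 0:
        yield features
        return
    for start in range(0, len(features), batch_size):
        yield features[start:start + batch_size]


def upload_in_batches(features, send, batch_size=0, pause=1.0):
    """Send each batch, pausing after every batch to avoid timeouts."""
    sent = 0
    for batch in iter_batches(features, batch_size):
        send(batch)
        sent += 1
        time.sleep(pause)
    return sent


uploaded = []
count = upload_in_batches(range(5), uploaded.append, batch_size=2, pause=0)
print(count, uploaded)  # 3 [[0, 1], [2, 3], [4]]
```

The demonstration uses pause=0 so it runs instantly; a one-second pause would match the behavior described above.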
cledatatoolkit.census module
cledatatoolkit.census.calc_moe(array, how='sum')
Helper function for deriving margins of error (MOEs) for aggregations of sample estimates. This is recommended when you are summing multiple ACS estimates or taking a proportion of them. This function implements the American Community Survey's documented methodology for calculating margins of error. To better understand how this process works, see the Census Bureau's ACS guidance on calculating margins of error.
Parameters:
- array (list-like): A list of margins of error to propagate over. If how = 'proportion', the arrays must be inputted as a 2-D array containing lists in the following order:
  - The denominators of the proportion.
  - The proportions themselves.
  - The margins of error of the numerator.
  - The margins of error of the denominator.
- how (string): Either 'sum' or 'proportion'. The aggregation methodology used for calculating the MOE. Defaults to 'sum'.
Raises:
- Exception: If the how argument is neither 'sum' nor 'proportion', an exception is raised.
Returns:
- float: The aggregated margin of error for the inputted array if how = 'sum'.
- numpy.array: The aggregated margins of error for the inputted array(s) if how = 'proportion'.
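For reference, the Census Bureau's published formulas for these two cases can be written in a few lines. This is a scalar sketch of the ACS methodology, not the package's implementation; moe_sum and moe_proportion are hypothetical names, and the argument order of moe_proportion mirrors the list order described above.

```python
import math


def moe_sum(moes):
    """MOE of a sum of estimates: square root of the sum of squared MOEs."""
    return math.sqrt(sum(m ** 2 for m in moes))


def moe_proportion(denominator, proportion, moe_num, moe_den):
    """MOE of a proportion p = numerator / denominator."""
    radicand = moe_num ** 2 - (proportion ** 2) * (moe_den ** 2)
    if radicand < 0:
        # ACS guidance: if the value under the root is negative,
        # use the ratio formula, which adds the terms instead.
        radicand = moe_num ** 2 + (proportion ** 2) * (moe_den ** 2)
    return math.sqrt(radicand) / denominator


print(moe_sum([3, 4]))  # 5.0
print(round(moe_proportion(1000, 0.2, 30, 50), 4))  # 0.0283
```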
cledatatoolkit.property module
cledatatoolkit.spatial module
Additional Resources
Guide
See our tutorial notebook repo, open-data-examples, for curated tutorials of how you might use this package with Cleveland civic data sources!
File details
Details for the file cle-data-toolkit-1.0.0.tar.gz.
File metadata
- Download URL: cle-data-toolkit-1.0.0.tar.gz
- Upload date:
- Size: 13.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 190b3c55fc39f01f3b8288329f32abd6c012fa15763efa5bed6e41db5eef78ee
MD5 | 6a0c36bb3f14647d0fee7848e2a82d51
BLAKE2b-256 | 37e620e6647eb3c3fe70cd1fd36471147893567beb813100284cb324bf862f18
File details
Details for the file cle_data_toolkit-1.0.0-py3-none-any.whl.
File metadata
- Download URL: cle_data_toolkit-1.0.0-py3-none-any.whl
- Upload date:
- Size: 15.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 18ad43ce99768ad30f6a0b5b0d2280b65f1bd9913381cab6166919f3093d7928
MD5 | d2a08cd62b15cd82de19bff9cf446e89
BLAKE2b-256 | c88a755c24fb2958fe7ab24f7d181f010647e1769d4e55ffe04812de514ce573