
Implementation of locality splitting metrics for political redistricting plans

Project description

import pandas as pd
import metrics

Calculating metrics of locality splitting in political districts

In order to calculate population-based splitting metrics, we need to know, for every census block, which district it is in. Much of this repository is devoted to generating these data as so-called "block equivalency files." Here is an example of such a data set.

PA_block_eq_df = pd.read_csv('clean_data/PA/PA_classifications.csv')
PA_block_eq_df.head()
GEOID10 pop sldl_2000 cd_2013 cd_2018 sldu_2000 sldl_2012 sldl_2018 cd_2003 cd_2010 sldu_2014 sldl_2010 sldl_2014 sldu_2010
0 420350307003003 57 076 5 12 34 76 76 5 5 25 76 76 35
1 420350302001056 0 076 5 12 34 76 76 5 5 25 76 76 35
2 420350301001322 0 076 5 12 34 76 76 5 5 25 76 76 35
3 420350301002207 0 076 5 12 34 76 76 5 5 25 76 76 35
4 420350301001013 0 076 5 12 34 76 76 5 5 25 76 76 35

This DataFrame has one column for every plan in the state since 2000 (cd = congressional district, sldu = state legislative district upper, sldl = state legislative district lower). If a year is missing, it means the district plan provided to the Census Bureau was identical to the previous year.
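Since every plan column follows the same `prefix_year` naming scheme, the plans in a block equivalency DataFrame can be enumerated programmatically. A minimal sketch on toy data (the column names mirror the table above, but the values are illustrative):

```python
import pandas as pd

# Toy block equivalency DataFrame with the naming scheme described above.
df = pd.DataFrame({
    'GEOID10': ['420350307003003'],
    'pop': [57],
    'cd_2018': [12],
    'sldu_2014': [25],
    'sldl_2018': [76],
})

# Pick out plan columns by their chamber prefix.
plan_cols = [c for c in df.columns
             if c.split('_')[0] in ('cd', 'sldu', 'sldl')]
# plan_cols -> ['cd_2018', 'sldu_2014', 'sldl_2018']
```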

Note that in many applications, generating the block equivalency files will not be necessary. For example, the Census Bureau published a national block equivalency file for congressional districts (https://www.census.gov/geographies/mapping-files/2019/dec/rdo/116-congressional-district-bef.html) and for state legislative districts (https://www.census.gov/geographies/mapping-files/2018/dec/rdo/2018-state-legislative-bef.html). Furthermore, states often provide block equivalency files of proposed maps as part of the redistricting process. We wrote code for generating block equivalency files so that we could score districting plans going back to 2000.
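One caveat when reading any block equivalency file: GEOIDs for states with FIPS codes below 10 start with a zero, so they should be parsed as strings, not integers. A short sketch with an invented two-row file (the column names here are hypothetical, not the Census Bureau's):

```python
import io
import pandas as pd

# Hypothetical BEF contents; the first GEOID has a leading zero
# (state FIPS '01') that integer parsing would silently drop.
csv_text = "BLOCKID,CD\n010010201001000,2\n020130001001000,1\n"
bef = pd.read_csv(io.StringIO(csv_text), dtype={'BLOCKID': str})
# bef['BLOCKID'].iloc[0] -> '010010201001000', leading zero intact
```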

In order to determine locality splitting, we also need a block equivalency file for the localities. When the localities are counties, this is easy to generate, because the county FIPS code is embedded in each block's GEOID.

df_county = pd.DataFrame(PA_block_eq_df['GEOID10'])
df_county['county_fips'] = df_county['GEOID10'].astype(str).apply(lambda x: x[2:5])
df_county.head()
GEOID10 county_fips
0 420350307003003 035
1 420350302001056 035
2 420350301001322 035
3 420350301002207 035
4 420350301001013 035
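The slice `x[2:5]` above works because a 15-character census block GEOID10 decomposes as state FIPS (2 digits) + county FIPS (3) + tract (6) + block (4):

```python
# Decompose a block GEOID10 from the table above into its components.
geoid = '420350307003003'
state_fips = geoid[0:2]   # '42' = Pennsylvania
county_fips = geoid[2:5]  # '035'
tract = geoid[5:11]       # '030700'
block = geoid[11:15]      # '3003'
```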

Once we merge the two block equivalency files, we can use a function from metrics.py to calculate a whole ensemble of locality splitting metrics for a plan. Note that the census block populations must be in a column labeled "pop."

input_df = pd.merge(PA_block_eq_df, df_county, on='GEOID10')
splitting_metrics = metrics.calculate_all_metrics(input_df, 'cd_2018', lclty_str='county_fips')
splitting_metrics
{'plan': 'cd_2018',
 'splits_all': 13,
 'splits_pop': 13,
 'intersections_all': 17,
 'intersections_pop': 17,
 'split_pairs': 0.35155708843835665,
 'conditional_entropy': 0.4732218666363808,
 'sqrt_entropy': 1.2259489228698355,
 'effective_splits': 16.854108898754916,
 'split_pairs_sym': 0.8315438136166731,
 'conditional_entropy_sym': 1.9181791252873452,
 'sqrt_entropy_sym': 3.095251349839012,
 'effective_splits_sym': 1370.9984050936714}
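To build intuition for the simplest metric in this output, here is a toy sketch of what a split count measures, as we understand it: the number of localities that intersect more than one district. The data below are invented and the computation is our own illustration, not the package's implementation:

```python
import pandas as pd

# Five toy blocks: county '001' is assigned to two different
# congressional districts, so it counts as one split county.
toy = pd.DataFrame({
    'county_fips': ['001', '001', '003', '003', '005'],
    'cd_2018':     [1,     2,     2,     2,     3],
})
n_splits = int((toy.groupby('county_fips')['cd_2018'].nunique() > 1).sum())
# n_splits -> 1
```

The population-weighted variants and entropy-based scores in the dictionary above refine this idea by accounting for how many people live in each split piece.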
