Skip to main content
Python Software Foundation 20th Year Anniversary Fundraiser  Donate today!

Code Tutorials

Project description

Data Handling Handbook

The one stop shop to learn about data intake, processing, and visualization.

The Dataplay Handbook uses techniques covered in the Datalabs Guidebook.

Binder Binder Binder Open Source Love svg3

NPM License Active Python Versions GitHub last commit No Maintenance Intended

GitHub stars GitHub watchers GitHub forks GitHub followers

Tweet Twitter Follow

Install

The code is on PyPI so you can just run:

pip install dataplay geopandas

From the terminal to install the code and its dependencies

How to use

  1. Import the installed module into your code:
from dataplay.acsDownload import retrieve_acs_data 
  1. use it
retrieve_acs_data(state, county, tract, tableId, year, saveAcs)

Heres another one

from dataplay.merge import mergeDatasets
mergeDatasets(left_ds=False, right_ds=False, crosswalk_ds=False,  use_crosswalk = True, left_col=False, right_col=False, crosswalk_left_col = False, crosswalk_right_col = False, merge_how=False, interactive=True)

Examples

Import your modules

%%capture 
from dataplay.acsDownload import retrieve_acs_data
from dataplay.merge import mergeDatasets
import pandas as pd
from dataplay.geoms import readInGeometryData
from dataplay.geoms import workWithGeometryData

Read in some data

Define our download parameters.

More information on these parameters can be found in the tutorials!

tract = '*'
county = '510'
state = '24'
tableId = 'B19001'
year = '17'
saveAcs = False
df = retrieve_acs_data(state, county, tract, tableId, year, saveAcs)
Number of Columns 17
#hide_input
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
B19001_001E_Total B19001_002E_Total_Less_than_$10_000 B19001_003E_Total_$10_000_to_$14_999 ... state county tract
NAME
Census Tract 1901 796 237 76 ... 24 510 190100
Census Tract 1902 695 63 87 ... 24 510 190200
Census Tract 2201 2208 137 229 ... 24 510 220100
Census Tract 2303 632 3 20 ... 24 510 230300
Census Tract 2502.07 836 102 28 ... 24 510 250207

5 rows × 20 columns

Here we can import and display a dataset

#hide
# This dataset is taken from the public database provided by BNIAJFI hosted by Esri / ArcGIS
# BNIA ArcGIS Homepage: https://data-bniajfi.opendata.arcgis.com/
csa_gdf_url = "https://services1.arcgis.com/mVFRs7NF4iFitgbY/ArcGIS/rest/services/Hhchpov/FeatureServer/0/query?where=1%3D1&outFields=*&returnGeometry=true&f=pgeojson"
csa_gdf = readInGeometryData(url=csa_gdf_url, porg=False, geom='geometry', lat=False, lng=False, revgeocode=False,  save=False, in_crs=2248, out_crs=False)
csa_gdf.plot(column='hhchpov18')
RECIEVED url: https://services1.arcgis.com/mVFRs7NF4iFitgbY/ArcGIS/rest/services/Hhchpov/FeatureServer/0/query?where=1%3D1&outFields=*&returnGeometry=true&f=pgeojson, 
 porg: g, 
 geom: geometry, 
 lat: False, 
 lng: False, 
 revgeocode: False, 
 in_crs: 2248, 
 out_crs: 2248





<matplotlib.axes._subplots.AxesSubplot at 0x7f8840335048>

png

Now in this example we will load in a bunch of coorinates

geoloom_gdf_url = "https://services1.arcgis.com/mVFRs7NF4iFitgbY/ArcGIS/rest/services/Geoloom_Crowd/FeatureServer/0/query?where=1%3D1&outFields=*&returnGeometry=true&f=pgeojson"
geoloom_gdf = readInGeometryData(url=geoloom_gdf_url, porg=False, geom='geometry', lat=False, lng=False, revgeocode=False,  save=False, in_crs=4326, out_crs=False)
geoloom_gdf = geoloom_gdf.dropna(subset=['geometry']) 
RECIEVED url: https://services1.arcgis.com/mVFRs7NF4iFitgbY/ArcGIS/rest/services/Geoloom_Crowd/FeatureServer/0/query?where=1%3D1&outFields=*&returnGeometry=true&f=pgeojson, 
 porg: g, 
 geom: geometry, 
 lat: False, 
 lng: False, 
 revgeocode: False, 
 in_crs: 4326, 
 out_crs: 4326

And then use the dataset we retrieved just before it to attach community labels on which the points sit

geoloom_w_csas = workWithGeometryData(method='pinp', df=geoloom_gdf, polys=csa_gdf, ptsCoordCol='geometry', polygonsCoordCol='geometry', polyColorCol='hhchpov18', polygonsLabel='CSA2010', pntsClr='red', polysClr='white')
Total Points:  61.0
Total Points in Polygons:  50
Prcnt Points in Polygons:  0.819672131147541


/usr/local/lib/python3.6/dist-packages/dataplay/geoms.py:112: FutureWarning:     You are passing non-geometry data to the GeoSeries constructor. Currently,
    it falls back to returning a pandas Series. But in the future, we will start
    to raise a TypeError instead.
  polygons['pointsinpolygon'] = gpd.GeoSeries(pts_in_polys)

Lets map it after inspecting it a bit!

type(geoloom_w_csas)
geopandas.geodataframe.GeoDataFrame
geoloom_w_csas.plot(column='pointsinpolygon')
<matplotlib.axes._subplots.AxesSubplot at 0x7f88403b8710>

png

Legal

Disclaimer

Views Expressed: All views expressed in this tutorial are the authors own and do not represent the opinions of any entity whatsover with which they have been, are now, or will be affiliated.

Responsibility, Errors and Ommissions: The author makes no assurance about the reliability of the information. The author makes takes no responsibility for updating the tutorial nor maintaining it porformant status. Under no circumstances shall the Author or its affiliates be liable for any indirect incedental, consequential, or special and or exemplary damages arising out of or in connection with this tutorial. Information is provided 'as is' with distinct plausability of errors and ommitions. Information found within the contents is attached with an MIT license. Please refer to the License for more information.

Use at Risk: Any action you take upon the information on this Tutorial is strictly at your own risk, and the author will not be liable for any losses and damages in connection with the use of this tutorial and subsequent products.

Fair Use this site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. While no intention is made to unlawfully use copyrighted work, circumstanes may arise in which such material is made available in effort to advance scientific literacy. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Titile 17 U.S.C. Section 108, the material on this tutorial is distributed without profit to those who have expressed a prior interest in receiving the included information for research and education purposes.

for more information go to: http://www.law.cornell.edu/uscode/17/107.shtml. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.

License

Copyright © 2019 BNIA-JFI

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for dataplay, version 0.0.27
Filename, size File type Python version Upload date Hashes
Filename, size dataplay-0.0.27-py3-none-any.whl (33.8 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size dataplay-0.0.27.tar.gz (2.8 MB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page