Spatio temporal analysis for inference of statistical causality using XGenESeSS
# Spatio temporal analysis for inference of statistical causality
@author zed.uchicago.edu
CLASSES
- spatioTemporal
- uNetworkModels
- class spatioTemporal
Utilities for spatio temporal analysis
@author zed.uchicago.edu

Attributes:
log_store (Pickle): Pickle storage of class data & dataframes
log_file (string): path to CSV of legacy dataframe
ts_store (string): path to CSV containing most recent ts export
DATE (string):
EVENT (string): column label for category filter
coord1 (string): first coordinate level type; is column name
coord2 (string): second coordinate level type; is column name
coord3 (string): third coordinate level type (z coordinate)
end_date (datetime.date): upper bound of date range
freq (string): timeseries increments; e.g. D for day
columns (list): list of column names to use; requires at least 2 coordinates and an event type
types (list of strings): event type list of filters
value_limits (tuple): boundaries (magnitude of event; above threshold)
grid (dictionary or list of lists): coordinate dictionary with respective ranges and EPS value OR custom list of lists of custom grid tiles as [coord1_start, coord1_stop, coord2_start, coord2_stop]
grid_type (string): parameter to determine if grid should be built up from a coordinate start/stop range ('auto') or be built from custom tile coordinates ('custom')
threshold (float): significance threshold

Methods defined here:

- __init__(self, log_store='log.p', log_file=None, ts_store=None, DATE='Date', year=None, month=None, day=None, EVENT='Primary Type', coord1='Latitude', coord2='Longitude', coord3=None, init_date=None, end_date=None, freq=None, columns=None, types=None, value_limits=None, grid=None, threshold=None)

- fit(self, grid=None, INIT=None, END=None, THRESHOLD=None, csvPREF='TS')
Fit dataproc with specified grid parameters and create timeseries for date boundaries specified by INIT, THRESHOLD, and END, or an input list of custom coordinate boundaries which do NOT have to match the arguments first input to the dataproc
Inputs:
grid (dictionary or list of lists): coordinate dictionary with respective ranges and EPS value OR custom list of lists of custom grid tiles as [coord1_start, coord1_stop, coord2_start, coord2_stop]
INIT (datetime.date): starting timeseries date
END (datetime.date): ending timeseries date
THRESHOLD (float): significance threshold
Outputs:
(None)

- getTS(self, _types=None, tile=None)
Given location tile boundaries and type category filter, creates the corresponding timeseries as a pandas DataFrame
(Note: can reassign type filter, does not have to be the same one as the one initialized to the dataproc)
Inputs:
_types (list of strings): list of category filters
tile (list of floats): location boundaries for tile
Outputs:
pd.DataFrame of timeseries data for the corresponding grid tile; the pd.DF index is the stringified LAT/LON boundaries with the type filter included

- pull(self, domain='data.cityofchicago.org', dataset_id='crimes', token=None, store=True, out_fname='pull_df.p', pull_all=False)
Pulls new entries from datasource
NOTE: should make flexible but for now use city of Chicago data
Input -
domain (string): Socrata database domain hosting data
dataset_id (string): dataset ID to pull
token (string): Socrata token for increased pull capacity; Note: Requires Socrata account
store (boolean): whether or not to write out new dataset
pull_all (boolean): pull complete dataset instead of just updating
Output -
None (writes out files if store is True and modifies inplace)

- timeseries(self, LAT=None, LON=None, EPS=None, _types=None, CSVfile='TS.csv', THRESHOLD=None, tiles=None)
Creates DataFrame of location tiles and their respective timeseries from input datasource with significance threshold THRESHOLD and latitude, longitude coordinate boundaries given by LAT, LON and EPS, or the custom boundaries given by tiles
Calls on getTS for each individual tile, then concats them together
Input:
LAT (float):
LON (float):
EPS (float): coordinate increment
_types (list): event type filter; accepted event type list
CSVfile (string): path to output file
Output:
(None): grid pd.DataFrame written out as CSV file to path specified

- class uNetworkModels
Utilities for storing and manipulating XPFSA models inferred by XGenESeSS
@author zed.uchicago.edu

Attributes:
jsonFile (string): path to json file containing models

Methods defined here:

- __init__(self, jsonFILE)

- augmentDistance(self)
Calculates the distance between all models and stores them under the distance key of each model
No I/O

- select(self, var='gamma', n=None, reverse=False, store=None)
Selects the N top models as ranked by var specified value (in reverse order if reverse is True)
Inputs -
var (string): model parameter to rank by
n (int): number of models to return
reverse (boolean): return in ascending order (True) or descending (False) order
store (string): name of file to store selection json
Returns -
(dictionary): top n models as ranked by var in ascending/descending order

- to_json(outFile)
Writes out updated models json to file
Input -
outFile (string): name of outfile to write json to
Output -
None

Data descriptors defined here:
models
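The ranking that select describes can be illustrated with a minimal sketch. It is not the package's implementation: the function name select_top is hypothetical, and the layout of the models dictionary (model name mapped to a dictionary of parameters that contains the var key) is an assumption made only for this example.

```python
def select_top(models, var="gamma", n=None, reverse=False):
    """Return the top n models ranked by models[name][var].

    Hypothetical stand-in for uNetworkModels.select; the layout of
    `models` (name -> parameter dict) is an assumption.
    reverse=False ranks in descending order, reverse=True in ascending
    order, matching the documented meaning of the flag.
    """
    ranked = sorted(
        models.items(),
        key=lambda item: item[1][var],
        reverse=not reverse,
    )
    if n is not None:
        ranked = ranked[:n]
    # dict preserves insertion order, so the ranking is kept
    return dict(ranked)


models = {
    "a": {"gamma": 0.1},
    "b": {"gamma": 0.9},
    "c": {"gamma": 0.5},
}
top_two = select_top(models, var="gamma", n=2)
lowest = select_top(models, var="gamma", n=1, reverse=True)
```

Here top_two keeps the two models with the largest gamma, and lowest keeps the single model with the smallest.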
FUNCTIONS
- draw_screen_poly(lats, lons, m, ax, val, cmap, ALPHA=0.6)
utility function to draw polygons on basemap
- getalpha(arr, index, F=0.9)
utility function to normalize transparency of quiver
- readTS(TSfile, csvNAME='TS1', BEG=None, END=None)
- Reads in output TS logfile into pd.DF
and then outputs necessary CSV files in XgenESeSS-friendly format
- Input -
TSfile (string): filename input TS to read
csvNAME (string)
BEG (string): start datetime
END (string): end datetime
- Returns -
dfts (pandas.DataFrame)
- showGlobalPlot(coords, ts=None, fsize=[14, 14], cmap='jet', m=None, figname='fig', F=2)
plot global distribution of events within time period specified
- Inputs -
coords (string): filename with coord list as lat1#lat2#lon1#lon2
ts (string): time series filename with data in rows, space separated
fsize (list):
cmap (string):
m (mpl.mpl_toolkits.Basemap): mpl instance for plotting
figname (string): Name of the Plot
- Returns -
- m (mpl.mpl_toolkits.Basemap): mpl instance of heat map of
crimes from fitted data
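The lat1#lat2#lon1#lon2 coordinate format named above can be parsed with a few lines of Python. This is only a sketch: parse_tile is a hypothetical helper, and the assumption that each entry holds exactly four '#'-separated numbers comes from the format string in the description, not from the package source.

```python
def parse_tile(label):
    """Split a 'lat1#lat2#lon1#lon2' label into four floats.

    Hypothetical helper; assumes exactly four '#'-separated numbers.
    """
    parts = label.strip().split("#")
    if len(parts) != 4:
        raise ValueError("expected lat1#lat2#lon1#lon2, got %r" % label)
    lat1, lat2, lon1, lon2 = (float(part) for part in parts)
    return lat1, lat2, lon1, lon2


tile = parse_tile("41.5#41.6#-87.7#-87.6")
```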
- splitTS(TSfile, csvNAME='TS1', dirname='./', prefix='@', BEG=None, END=None)
Writes out each row of the pd.DataFrame as a separate CSV file for the XgenESeSS binary
No I/O
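The one-file-per-row idea behind splitTS can be sketched with the standard library alone. The helper split_rows is hypothetical, the rows are passed as a plain dictionary instead of a pandas DataFrame, and the space-separated file layout is an assumption borrowed from the showGlobalPlot description; the format the XgenESeSS binary actually expects may differ.

```python
import os
import tempfile


def split_rows(rows, dirname="./", prefix="@"):
    """Write each named row to its own file, dirname/prefix+name.

    Hypothetical stand-in for splitTS. Values are written on one line,
    space separated (an assumed layout).
    """
    paths = []
    for name, values in rows.items():
        path = os.path.join(dirname, prefix + name)
        with open(path, "w") as handle:
            handle.write(" ".join(str(value) for value in values) + "\n")
        paths.append(path)
    return paths


# Usage: write two rows into a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    written = split_rows(
        {"tile_a": [0, 1, 0, 2], "tile_b": [1, 1, 0, 0]},
        dirname=tmp,
    )
    contents = {}
    for path in written:
        with open(path) as handle:
            contents[os.path.basename(path)] = handle.read()
```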
- stringify(List)
Utility function @author zed.uchicago.edu
- Converts list into string separated by dashes
or empty string if input list is not list or is empty
- Input:
List (list): input list to be converted
- Output:
(string)
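The documented behavior of stringify is simple enough to restate as a short sketch. The name stringify_sketch is hypothetical and the body is a reconstruction from the description above, not the package's code.

```python
def stringify_sketch(items):
    """Join list elements with dashes.

    Returns an empty string when the input is not a list or is empty,
    as the stringify description states. Hypothetical reconstruction.
    """
    if not isinstance(items, list) or not items:
        return ""
    return "-".join(str(item) for item in items)


joined = stringify_sketch(["BURGLARY", "THEFT"])
empty = stringify_sketch([])
not_a_list = stringify_sketch("BURGLARY")
```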
- to_json(pydict, outFile)
Writes dictionary json to file @author zed.uchicago.edu
- Input -
pydict (dict): dictionary to store
outFile (string): name of outfile to write json to
- Returns -
None
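Writing a dictionary to a JSON file, as to_json is described as doing, might look like the sketch below. The helper write_json and the sample dictionary are hypothetical; the indent and sort_keys options are choices made for this example, and the package may serialize differently.

```python
import json
import os
import tempfile


def write_json(pydict, out_file):
    """Serialize pydict to out_file as JSON (illustrative stand-in)."""
    with open(out_file, "w") as handle:
        json.dump(pydict, handle, indent=2, sort_keys=True)


# Usage: round-trip a small dictionary through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "models.json")
    original = {"model_1": {"gamma": 0.42}}
    write_json(original, path)
    with open(path) as handle:
        restored = json.load(handle)
```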
- viz(unet, jsonfile=False, colormap='autumn', res='c', drawpoly=False, figname='fig')
utility function to visualize spatio temporal interaction networks @author zed.uchicago.edu
- Inputs -
unet (string): json filename
unet (python dict):
jsonfile (bool): True if unet is string specifying json filename
colormap (string): colormap
res (string): 'c' or 'f'
drawpoly (bool): if True draws transparent patch showing srcs
figname (string): prefix of pdf image file
- Returns -
m (Basemap handle)
fig (figure handle)
ax (axis handle)
cax (colorbar handle)
DATA
DEBUG = False
version = '1.0.7'
VERSION
1.0.7