Updating AGOL feature services with data from external tables.

agrc/palletjack

A library of classes and methods for automatically updating ArcGIS Online (AGOL) feature services with data from several different types of external sources. Client apps (sometimes called 'skids') can reuse these classes for common use cases. The code modules are oriented around each step in the extract, transform, and load process.

palletjack works with pandas DataFrames (either regular for tabular data or Esri's spatially-enabled dataframes for spatial data). The extract and transform methods return dataframes and the load methods consume dataframes as their source data.

The documentation includes a user guide along with an API description of the available classes and methods.

Pallet jack: forklift's little brother.

Dependencies

palletjack relies on the dependencies listed in setup.py. These are all available on PyPI and can be installed in most environments, including Google Cloud Functions.

The arcgis library does all the heavy lifting for spatial data. If the arcpy library is not available (such as in a cloud function), it relies on shapely for its geometry engine.
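To see which geometry library is present in a given environment, a minimal check along these lines may help. It only reports what is importable; it is not how arcgis itself selects its engine, and the function name is hypothetical.

```python
import importlib.util


def available_geometry_engine():
    """Return 'arcpy' or 'shapely' (whichever is found first), or None."""
    # Hypothetical helper for illustration; not part of palletjack or arcgis.
    for module_name in ("arcpy", "shapely"):
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    return None


print(f"Geometry library found: {available_geometry_engine()}")
```

A result of None suggests neither library is installed, in which case spatial operations are likely to fail.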

Installation

1. Activate your application's environment
2. pip install ugrc-palletjack
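To confirm the install landed in the active environment, something like the following sketch can be used. It relies only on the standard library's importlib.metadata; the helper name is hypothetical.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    # Hypothetical helper for illustration; not part of palletjack.
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("ugrc-palletjack"))
```

If this prints None, the package was probably installed into a different environment than the one currently active.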

Quick start

1. Import the desired modules
2. Use a class in extract to load a dataframe from an external source
3. Transform your dataframe as desired with helper methods from transform
4. Use the dataframe to update a hosted feature service using the methods in load

#: Importing arcgis registers the .spatial accessor on pandas DataFrames
import arcgis
import pandas as pd
from palletjack import extract, transform, load

#: Load the data from a Google Sheet
gsheet_extractor = extract.GSheetLoader(path_to_service_account_json)
sheet_df = gsheet_extractor.load_specific_worksheet_into_dataframe(sheet_id, 'title of desired sheet', by_title=True)

#: Convert the data to points using lat/long fields, clean for uploading
spatial_df = pd.DataFrame.spatial.from_xy(sheet_df, x_column='longitude', y_column='latitude')
renamed_df = transform.DataCleaning.rename_dataframe_columns_for_agol(spatial_df)
cleaned_df = transform.DataCleaning.switch_to_nullable_int(renamed_df, ['an_int_field_with_null_values'])

#: Truncate the existing feature service data and load the new data
gis = arcgis.gis.GIS('my_agol_org_url', 'username', 'super-duper-secure-password')
updates = load.FeatureServiceUpdater.truncate_and_load_features(
    gis, 'feature_service_item_id', cleaned_df, r'c:\directory\to\save\truncated\data\in\case\of\error'
)
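The nullable-int step exists because a single missing value turns an integer column into floats in pandas. The sketch below shows the idea with plain pandas only; it is not palletjack's implementation of switch_to_nullable_int, and the helper name and sample data are made up for illustration.

```python
import pandas as pd


def to_nullable_int(dataframe, columns):
    """Return a copy with the given columns cast to pandas' nullable Int64 dtype."""
    # Hypothetical helper for illustration; not palletjack's own method.
    converted = dataframe.copy()
    for column in columns:
        # "Int64" (capital I) is pandas' nullable integer dtype; missing values become <NA>
        converted[column] = converted[column].astype("Int64")
    return converted


#: The None in "visits" makes pandas store the column as floats
raw_df = pd.DataFrame({"site": ["a", "b", "c"], "visits": [10, None, 32]})
clean_df = to_nullable_int(raw_df, ["visits"])
print(raw_df["visits"].dtype, "->", clean_df["visits"].dtype)
```

Note that the cast raises an error if a column holds non-integral values such as 1.5, so it is best applied only to fields that are truly integers.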

Development

1. Create a conda environment with Python 3.9
   - conda create -n palletjack python=3.9
   - conda activate palletjack
2. Clone the repo
3. Install in dev mode with development dependencies
   - pip install -e .[tests]

Troubleshooting Weird Append Errors

If a FeatureLayer.append() call (within a load.FeatureServiceUpdater method) fails with an "Unknown Error: 500" or a similarly vague error, you can query the append job's status to get more info. The debug log will include the HTTP call for the job, something like the following: https://services1.arcgis.com:443 POST /<unique string>/arcgis/rest/services/<feature layer name>/FeatureServer/<layer id>/append/jobs/<job guid>?f=json token=<crazy long token string>

You can use this and a token from an AGOL tab to build a new job status url. To get the token, log into AGOL in a browser and open a private hosted feature layer item. Click the layer, and then open the developer console. With the Network tab of the console open, click on the "View" link for the service URL. You should see a document in the list whose name includes "?token=". Copy the name and then copy out the token string.
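Copying the token out of the document name by hand is error-prone with a string that long. A minimal sketch using the standard library, assuming the copied name looks like a path followed by a query string containing token=, might be:

```python
from urllib.parse import parse_qs, urlsplit


def extract_token(copied_name):
    """Return the value of the 'token' query parameter, or None if absent."""
    # Hypothetical helper for illustration; not part of palletjack.
    query = urlsplit(copied_name).query
    tokens = parse_qs(query).get("token", [])
    return tokens[0] if tokens else None


#: The name and token here are placeholders, not real values
print(extract_token("0?token=example-token-value"))
```

parse_qs decodes percent-encoded characters, so the returned value is the raw token rather than its URL-encoded form.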

Now that you've got the token string, you can build the status query: https://services1.arcgis.com/<unique string>/arcgis/rest/services/<feature layer name>/FeatureServer/<layer id>/append/jobs/<job guid>?f=json&token=<token from agol>

Calling this URL in a browser should return a message that will hopefully give you more info as to why it failed.
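If you need to do this more than once, the URL can be assembled in code. This is a sketch only: the function and parameter names are hypothetical, the host defaults to the one in the example log line, and it assumes the token is passed as a token query parameter as in the logged call.

```python
from urllib.parse import quote, urlencode


def build_append_job_status_url(
    unique_string, layer_name, layer_id, job_guid, token, host="services1.arcgis.com"
):
    """Assemble the append job status URL from the pieces found in the debug log."""
    # Hypothetical helper for illustration; not part of palletjack.
    path = "/".join(
        [
            quote(unique_string, safe=""),
            "arcgis/rest/services",
            quote(layer_name, safe=""),
            "FeatureServer",
            str(layer_id),
            "append/jobs",
            quote(job_guid, safe=""),
        ]
    )
    #: urlencode escapes any characters in the token that are not URL-safe
    query = urlencode({"f": "json", "token": token})
    return f"https://{host}/{path}?{query}"


#: All argument values below are placeholders
print(build_append_job_status_url("abc123", "my_layer", 0, "1111-2222", "example-token"))
```

Paste the printed URL into a browser, or request it with any HTTP client, to see the job's status message.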

Updating Docs

palletjack uses pdoc3 to generate HTML docs in docs/palletjack from the docstrings within the code itself. These are then served via GitHub Pages.

GitHub Pages serves the site from the gh-pages branch. After you make edits to the code and update the docstrings, rebase this branch onto the updated main branch. To prevent GitHub Pages from trying to generate a site from the contents of docs/palletjack with Jekyll, add a .nojekyll file to docs/palletjack.

To generate the docs, run pdoc --html -o docs\ c:\palletjack\repo\src\palletjack --force. The code's docstrings should be Google-style docstrings with proper indentation to ensure the argument lists, return values, etc., are parsed and displayed correctly.
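For reference, a Google-style docstring has indented Args, Returns, and Raises sections. The function below is a made-up example and not part of palletjack; it is only meant to show the layout.

```python
def rename_columns(columns, prefix=""):
    """Lowercase column names and replace spaces with underscores.

    Args:
        columns (list[str]): Original column names.
        prefix (str, optional): Text to put in front of each new name. Defaults to "".

    Returns:
        dict[str, str]: Mapping of each original name to its cleaned name.

    Raises:
        ValueError: If two columns would end up with the same cleaned name.
    """
    renamed = {}
    for column in columns:
        cleaned = prefix + column.strip().lower().replace(" ", "_")
        if cleaned in renamed.values():
            raise ValueError(f"Duplicate cleaned column name: {cleaned}")
        renamed[column] = cleaned
    return renamed


print(rename_columns(["Site Name", "Visits"]))
```

The section headers sit at the same indentation as the summary line, and each entry is indented one level further.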

docs/README.md is included at the top package level by adding the line .. include:: ../../docs/README.md in __init__.py's docstring. This tells pdoc to insert that markdown into the HTML generated for that docstring, and the include directive can be used for more in-depth documentation anywhere else as well. Note that pdoc tries to create links for anything surrounded by backticks, which are also used for code blocks. You may need to manually edit the HTML to remove the links if they change the content of your code blocks (such as in the example import statement).

Once the contents of docs/palletjack look correct, force push the gh-pages branch to GitHub. This will trigger the action to republish the site. The docs are then accessible at agrc.github.io/palletjack/palletjack/index.html.

Download files

Source Distribution

ugrc-palletjack-4.2.0.tar.gz (41.1 kB)

Uploaded Oct 24, 2023 Source

Built Distribution

ugrc_palletjack-4.2.0-py3-none-any.whl (41.1 kB)

Uploaded Oct 24, 2023 Python 3

Hashes for ugrc-palletjack-4.2.0.tar.gz

Algorithm	Hash digest
SHA256	`17179eecf192fc90e6058e8a2e18c5246a74e94973a177365d3f140482ba3edc`
MD5	`6a6090b1a2124d1d00d224644c0de074`
BLAKE2b-256	`a428f8bad6e41f7b1a7eb18f2ae9b0091f507e60462679db30d1262f68a89c61`

Hashes for ugrc_palletjack-4.2.0-py3-none-any.whl

Algorithm	Hash digest
SHA256	`19dd8d18224f4b52276b470977a27fc3bc0f3d28fececb38bb29b5aa3b5f0fad`
MD5	`3f0a68778b75dc9e289f70a6e16c134e`
BLAKE2b-256	`5ca9b67a4eb32ca286901a01d1521e799ec4f0eccf473e5a48924870372122b3`