CLI tool for building and querying ArcGIS item dependency graphs
ArcGIS Item Dependency Management
Overview
This tool builds and maintains an organization-wide ArcGIS item dependency graph, showing what Web Maps, Dashboards, Feature Services, and other items depend on each other. You can query the graph by item ID or portal search string and receive CSV, Excel, interactive HTML, and GML outputs — making it safe to audit, migrate, and clean up portal content without breaking downstream items.
Quick Start
1. Install
Standard Python / Mac / Linux:
pip install arcgis-item-graph
ArcGIS Pro (Windows) — uses Pro's bundled Python:
"%PROGRAMFILES%\ArcGIS\Pro\bin\Python\Scripts\pip.exe" install arcgis-item-graph
Windows one-click installer (end-user deployment):
Your GIS admin provides a tool folder containing install.bat, install.ps1, launch_query.py, and a pre-configured config/config.yaml. Double-click install.bat to install and launch.
What it does automatically:
- Detects conda (Miniconda / Anaconda / ArcGIS Pro)
- Downloads and installs Miniconda silently if conda is not found (~75 MB, one-time)
- Creates an arcgis-graph conda environment with Python 3.11
- Installs the latest ArcGIS API for Python and arcgis-item-graph
- Launches the interactive query tool
Subsequent runs are fast — existing environments and packages are reused, and pip upgrades to the latest version automatically.
2. Configure
arcgis-graph setup
The wizard prompts for your portal URL, authentication method (named profile or username/password), and output preferences. Your credentials are never stored in config.yaml — they go to a gitignored .env file.
3. Build the graph (run once)
arcgis-graph create
This crawls your portal and saves a dependency graph locally. For large organizations (5,000+ items) it can take 30–90 minutes.
4. Query
arcgis-graph query --item-id abc123
arcgis-graph query --search "owner:jsmith type:Dashboard"
Prerequisites
- Python 3.9 or later
- ArcGIS API for Python 2.4.0 or later (arcgis>=2.4.0)
Setup
See Quick Start above for installation and configuration.
For development setup, see For Contributors below.
Configuration
config/config.yaml controls authentication and all run-time settings. Two auth options are available:
Option 1 — Named ArcGIS profile (recommended for GIS admins)
Set the auth.profile key to the name of a saved ArcGIS credential profile:
auth:
  profile: "my_portal_profile"  # created via arcgis.gis.GIS(profile=...)
  verify_cert: true
Run python -c "from arcgis.gis import GIS; GIS(profile='my_portal_profile')" to verify the profile name is correct.
Option 2 — Environment variables
Leave auth.profile blank and create a .env file in the project root:
ARCGIS_URL=https://your-portal/portal
ARCGIS_USER=your_username
ARCGIS_PASSWORD=your_password
The CLI loads .env automatically when a profile is not set.
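The two options amount to a simple precedence rule: a named profile wins, otherwise the three environment variables are used. The following is a minimal sketch of that rule, not the tool's actual code. The helper name `resolve_auth_kwargs` is hypothetical; it only builds keyword arguments in the shape accepted by `arcgis.gis.GIS` and does not connect to a portal.

```python
import os
from typing import Mapping, Optional


def resolve_auth_kwargs(auth_cfg: Mapping[str, object],
                        env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: choose GIS() keyword arguments from config or env."""
    env = os.environ if env is None else env
    verify_cert = bool(auth_cfg.get("verify_cert", True))
    profile = auth_cfg.get("profile")
    if profile:
        # Option 1: a saved ArcGIS credential profile.
        return {"profile": profile, "verify_cert": verify_cert}
    # Option 2: fall back to the variables documented for the .env file.
    required = ("ARCGIS_URL", "ARCGIS_USER", "ARCGIS_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "url": env["ARCGIS_URL"],
        "username": env["ARCGIS_USER"],
        "password": env["ARCGIS_PASSWORD"],
        "verify_cert": verify_cert,
    }
```

The returned dictionary could then be passed as `GIS(**kwargs)`; failing early on missing variables keeps a misconfigured .env file from surfacing later as an opaque login error.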
Other settings
| Key | Default | Description |
|---|---|---|
| paths.output_dir | outputs/ | Where all output files are written |
| paths.gml_file | outputs/graph.gml | Persistent graph file |
| create.max_items | 10000 | Upper limit on items indexed |
| update.max_retries | 5 | Retries on transient API errors |
| query.output_formats | excel, html, gml | Default outputs for each query (excel, csv, html, gml) |
| query.traversal_direction | upstream | Which graph edges are followed. upstream: items that reference X (what breaks if X is removed). downstream: items X depends on. both: union of both without cross-contamination |
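As an illustration of how the documented defaults might combine with a user's overrides, here is a small sketch. It assumes the configuration has already been parsed into nested dictionaries; `merge_config` and `DEFAULTS` are hypothetical names, and the tool's real loader may behave differently.

```python
import copy

# Defaults copied from the settings table; the nesting mirrors the dotted keys.
DEFAULTS = {
    "paths": {"output_dir": "outputs/", "gml_file": "outputs/graph.gml"},
    "create": {"max_items": 10000},
    "update": {"max_retries": 5},
    "query": {
        "output_formats": ["excel", "html", "gml"],
        "traversal_direction": "upstream",
    },
}
VALID_DIRECTIONS = {"upstream", "downstream", "both"}


def merge_config(overrides: dict, defaults: dict = DEFAULTS) -> dict:
    """Hypothetical helper: overlay user settings on the documented defaults."""
    merged = copy.deepcopy(defaults)  # never mutate the shared defaults
    for section, values in overrides.items():
        if isinstance(values, dict) and isinstance(merged.get(section), dict):
            merged[section].update(values)
        else:
            merged[section] = copy.deepcopy(values)
    direction = merged["query"]["traversal_direction"]
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"Unknown traversal_direction: {direction!r}")
    return merged
```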
Usage
All commands are run via the unified CLI entry point:
python -m cli [--config /path/to/config.yaml] {create,update,query} [options]
Build the graph (run once)
Crawls the entire portal and saves a GML snapshot. For large organizations (5,000+ items) this can take 30–90 minutes.
python -m cli create
Keep the graph current (run on a schedule)
Finds items modified since the last run and merges changes into the existing GML. Designed for a daily cron job.
python -m cli update
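The incremental idea behind update can be sketched as a filter on modification time followed by a merge. This is a simplified illustration under stated assumptions: the graph is modeled as a plain dictionary keyed by item ID rather than a GML file, each record carries a `modified` value in epoch milliseconds, and `merge_modified` is a hypothetical name.

```python
def merge_modified(graph: dict, items: list, last_run_ms: int) -> list:
    """Merge items modified after last_run_ms into graph; return their IDs."""
    changed = [item for item in items if item["modified"] > last_run_ms]
    for item in changed:
        # Adds new nodes and overwrites stale ones in place.
        graph[item["id"]] = item
    return [item["id"] for item in changed]
```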
Query the graph
# Query by item ID
python -m cli query --item-id abc123
# Query by portal search string
python -m cli query --search "owner:jsmith type:Dashboard"
# Request specific output formats for a single run
python -m cli query --item-id abc123 --format excel
python -m cli query --item-id abc123 --format csv --format html
# Use a different config file
python -m cli --config /path/to/other/config.yaml query --item-id abc123
Interactive dashboard (live server mode)
Add --serve to any query command to start a local HTTP server and open the dashboard in your browser automatically:
arcgis-graph query --item-id abc123 --serve
arcgis-graph query --search "owner:jsmith" --serve
# Use a different port if 8765 is taken
arcgis-graph query --item-id abc123 --serve --port 9000
The server runs at http://localhost:8765/ by default. It exposes:
- GET /: the interactive HTML dashboard
- GET /query?id=<item_id>: live re-query from inside the dashboard (click any node)
- GET /export/excel?ids=<id1>,<id2>: download an Excel report for selected items
Press Ctrl+C in the terminal to stop the server.
Note: Opening the saved .html file directly (file://...) will not work for node re-queries or Excel exports because those features require the live server. Always use --serve for the full interactive experience.
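While the server is running, the endpoints it exposes can also be addressed from a script. The sketch below only builds the URLs in the documented shapes; `dashboard_urls` is a hypothetical helper, and the response formats of these endpoints are not described here, so nothing is fetched or parsed.

```python
from urllib.parse import urlencode


def dashboard_urls(item_ids, host="localhost", port=8765):
    """Build URLs for the documented endpoints (8765 is the default port)."""
    base = f"http://{host}:{port}"
    return {
        "dashboard": f"{base}/",
        # One re-query URL per item ID.
        "query": [f"{base}/query?{urlencode({'id': i})}" for i in item_ids],
        # The export endpoint takes a comma-separated list of IDs.
        "export_excel": f"{base}/export/excel?ids={','.join(item_ids)}",
    }
```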
Run python -m cli --help or python -m cli <command> --help for the full list of options and overrides.
Triage (migration planning)
Identify the highest-traffic consumer items in your portal and classify the services they depend on — prioritized by view count and dependency breadth. Designed for migration planning and portal housekeeping.
arcgis-graph triage # rank top 50 items (config default)
arcgis-graph triage --top-n 20 # rank top 20
arcgis-graph triage --min-dependents 2 # only items with 2+ service dependencies
arcgis-graph triage --deep # Tier 3 layer introspection (slower, more accurate)
arcgis-graph triage --no-usage-stats # rank by dependency count only (skip portal API)
arcgis-graph triage --force-refresh # bypass the triage_cache_hours window and re-run
Outputs to outputs/reports/triage/<timestamp>/:
| File | Contents |
|---|---|
| triage_report.xlsx | 5-sheet workbook (see below) |
| triage_manifest.json | Machine-readable version of all triage data |
Excel workbook sheets:
| Sheet | Description |
|---|---|
| High Traffic Items | Ranked consumer items (Web Maps, Dashboards, Apps) by composite score (view count + dependency breadth) |
| Service Inventory | All map/feature services those items consume, with data_source_type (egdb / hosted / fgdb / external) and combined_view_impact |
| Dependency Matrix | One row per item × service pair — shows which item uses which service |
| Migration Hotspots | Services referenced by 2+ items (configurable), sorted by combined view impact — highest-risk services to touch during a migration |
| Consumer Chain | Items in the graph that depend on each ranked item — useful for understanding blast radius before deprecating or migrating a service |
Note: data_source_type classification uses URL pattern matching (Tier 1), service JSON inspection (Tier 2), and optionally layer-level inspection (Tier 3 with --deep). Enterprise ArcGIS Server services backed by an Enterprise Geodatabase are classified as egdb; hosted services as hosted_relational; file-based data as fgdb.
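To make the Migration Hotspots sheet concrete, the sketch below derives a comparable ranking from item × service pairs like those in the Dependency Matrix. It assumes that combined view impact is the sum of the view counts of the consuming items, which this README does not spell out; `migration_hotspots` and the row keys are hypothetical.

```python
from collections import defaultdict


def migration_hotspots(pairs, views, min_items=2):
    """pairs: iterable of (item_id, service_id); views: item_id -> view count."""
    consumers = defaultdict(set)
    for item_id, service_id in pairs:
        consumers[service_id].add(item_id)
    rows = [
        {
            "service": service_id,
            "item_count": len(items),
            # Assumption: impact is the summed view count of consuming items.
            "combined_view_impact": sum(views.get(i, 0) for i in items),
        }
        for service_id, items in consumers.items()
        if len(items) >= min_items
    ]
    return sorted(rows, key=lambda r: r["combined_view_impact"], reverse=True)
```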
Shared Deployment (Team Use)
For team environments, point paths.gml_file and paths.output_dir at a UNC share
so all users read from the same graph without running create individually.
1. Admin: initial setup
# On the admin machine, configure config.yaml to point at the share:
# paths.gml_file: "\\\\server\\share\\arcgis-graph\\graph.gml"
# paths.output_dir: "\\\\server\\share\\arcgis-graph\\outputs"
arcgis-graph create # one-time full crawl (30–90 minutes for large organizations)
2. Automation: scheduled updates
Windows Task Scheduler (hourly):
arcgis-graph update --config \\server\share\arcgis-graph\config.yaml --skip-if-fresh
Linux/macOS cron (hourly):
0 * * * * arcgis-graph update --config /mnt/share/arcgis-graph/config.yaml --skip-if-fresh
--skip-if-fresh prevents double-runs if automation fires while a manual update is in progress.
3. Users: install and run
Option A — Windows installer (no Python required):
Distribute the tool folder (install.bat, install.ps1, launch_query.py, config/) to users.
They double-click install.bat. The installer handles everything: conda, packages, and launch.
The tool folder can live on a UNC share — users can run it directly from there:
\\server\share\arcgis-graph\install.bat
Option B — CLI (Python already installed):
Users point their local config.yaml at the share paths and run:
arcgis-graph query --item-id <id>
If the same item was queried within 24 hours, the cached outputs are returned instantly.
Use --force-refresh to bypass the cache and re-run the query.
Freshness thresholds (configurable)
cache:
  update_warn_hours: 24  # Warn in query if graph is older than this (24 = daily, the default)
  query_cache_hours: 24  # Reuse cached query outputs within this window
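Both thresholds reduce to the same comparison: is a stored timestamp still inside a window measured in hours? A minimal sketch, assuming timestamps are kept in epoch milliseconds; `is_within_window` is a hypothetical helper, not a function from the package.

```python
import time

MS_PER_HOUR = 3_600_000


def is_within_window(timestamp_ms, window_hours, now_ms=None):
    """Return True if timestamp_ms is no older than window_hours."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return (now_ms - timestamp_ms) <= window_hours * MS_PER_HOUR
```

With update_warn_hours, a False result would trigger the staleness warning; with query_cache_hours, a True result would allow cached outputs to be reused.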
Output files
All output files land in the directory set by paths.output_dir (default: outputs/).
| Command | Output files |
|---|---|
| create | graph.gml, graph.timestamp |
| update | Updates graph.gml in place |
| query | dependency_report_<timestamp>.csv (tabular summary); dependency_report_<timestamp>.xlsx (3-sheet Excel workbook: All Items, Dependency Edges, Broken Dependencies); dependency_graph_<timestamp>.html (interactive visualization); query_subgraph_<timestamp>.gml (sub-graph for further analysis) |
Project structure
arcgis_item_graph/ Core library: creator, updater, query, reporter, visualizer, utils, console
cli/ Unified CLI entry point (python -m cli ...)
config/ config.example.yaml template — copy to config.yaml and fill in credentials
docs/ Documentation and design plans
lib/ Vendored frontend assets (cytoscape.js, dagre, cytoscape-dagre) for offline HTML output
outputs/ Generated output files (gitignored)
tests/ Unit and integration tests (pytest)
The CLI uses Rich for terminal output. Progress bars, error panels, and completion summaries all go through arcgis_item_graph/console.py — the single file in the project that imports Rich. Library modules (creator, updater, triage, etc.) remain UI-free and communicate with the CLI via on_progress/on_warning callback kwargs.
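The callback arrangement can be illustrated with a small UI-free function. This is a sketch of the pattern only: the function name `crawl`, the `fetch` parameter, and the callback signatures (`on_progress(done, total)` and `on_warning(message)`) are assumptions, not the library's actual interface.

```python
from typing import Callable, Iterable, Optional


def crawl(item_ids: Iterable[str],
          fetch: Callable[[str], dict],
          on_progress: Optional[Callable[[int, int], None]] = None,
          on_warning: Optional[Callable[[str], None]] = None) -> dict:
    """Fetch each item, reporting through callbacks instead of printing."""
    ids = list(item_ids)
    results = {}
    for done, item_id in enumerate(ids, start=1):
        try:
            results[item_id] = fetch(item_id)
        except Exception as exc:  # report the failure and keep crawling
            if on_warning:
                on_warning(f"{item_id}: {exc}")
        if on_progress:
            on_progress(done, len(ids))
    return results
```

A CLI layer would pass callbacks that drive a progress bar or warning panel, while tests and other callers can omit them entirely.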
For Contributors
1. Clone the repository
git clone https://github.com/your-org/ArcGIS-Item-Dependency-Management.git
cd ArcGIS-Item-Dependency-Management
2. Install in editable mode with dev dependencies
pip install -e ".[dev]"
3. Activate the commit-message hook
git config core.hooksPath .githooks
4. Create your configuration file
cp config/config.example.yaml config/config.yaml
# or just run: arcgis-graph setup
Running tests
pytest tests/ -v
Performance & Architecture Notes
Graph Traversal
The query BFS uses collections.deque for O(1) popleft (O(V+E) total).
Seed items not found in the cached GML file are fetched live in parallel via
ThreadPoolExecutor (default 10 workers, configurable via fetch_workers
on ItemGraphQuery).
Traversal direction is controlled by query.traversal_direction in config:
- upstream (default): follows contained_by() edges and finds items that reference the queried item. Answers "what breaks if X is removed?", which makes it the correct mode for migration impact analysis.
- downstream: follows contains() edges and finds items the queried item depends on. Answers "what does X need to function?"
- both: runs separate upstream and downstream passes. No cross-directional contamination (forward deps of upstream-reached nodes are not included).
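The three directions can be sketched with a deque-based BFS over a plain adjacency mapping. This is an illustrative model, not the ItemGraphQuery implementation: `contains` here is a hypothetical dictionary of item ID to the IDs it depends on, and the reverse (contained-by) mapping is derived from it.

```python
from collections import deque


def bfs(seeds, neighbors):
    """Breadth-first search; deque.popleft() is O(1), so the pass is O(V+E)."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def traverse(seeds, contains, direction="upstream"):
    """contains: item_id -> list of item IDs that item depends on."""
    contained_by = {}
    for source, targets in contains.items():
        for target in targets:
            contained_by.setdefault(target, []).append(source)
    if direction == "upstream":
        return bfs(seeds, contained_by)
    if direction == "downstream":
        return bfs(seeds, contains)
    if direction == "both":
        # Two independent passes, so dependencies of upstream-reached
        # nodes never leak into the result.
        return bfs(seeds, contained_by) | bfs(seeds, contains)
    raise ValueError(f"Unknown direction: {direction!r}")
```

For example, if a dashboard uses both a web map and an unrelated service, querying the web map with both returns the dashboard and the map's own services, but not the dashboard's unrelated service.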
Update Hydration
ItemGraphUpdater hydrates all cached graph nodes concurrently using
ThreadPoolExecutor (default 10 workers, configurable via hydration_workers).
Graph mutations (node removal) happen serially on the main thread after all
fetches complete. The modified-items search enforces a max_items cap (defaults
to create.max_items from config) and warns when results may be truncated.
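The fetch-concurrently, mutate-serially pattern described above can be sketched as follows. It is a simplified model rather than the ItemGraphUpdater code: the graph is a plain dictionary, and the hypothetical `fetch` callable returns a record, or None when the item no longer exists on the portal.

```python
from concurrent.futures import ThreadPoolExecutor


def hydrate(graph: dict, fetch, workers: int = 10) -> list:
    """Refresh every node concurrently, then apply changes on the main thread."""
    ids = list(graph)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with ids.
        fetched = list(pool.map(fetch, ids))
    removed = []
    # Mutations run serially, only after every fetch has completed.
    for item_id, record in zip(ids, fetched):
        if record is None:
            del graph[item_id]
            removed.append(item_id)
        else:
            graph[item_id] = record
    return removed
```

Keeping the mutation step out of the worker threads means the graph needs no locking, at the cost of holding all fetched records in memory until the pool finishes.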
Timestamps
All timestamps are stored as integer milliseconds since the Unix epoch, which preserves sub-second precision
(int(t.timestamp() * 1000)).
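A short sketch of the conversion in both directions, using the expression quoted above. The helper names `to_epoch_ms` and `from_epoch_ms` are hypothetical, and interpreting stored values as UTC on the way back is an assumption.

```python
from datetime import datetime, timezone


def to_epoch_ms(t: datetime) -> int:
    """Convert a datetime to integer milliseconds since the Unix epoch."""
    return int(t.timestamp() * 1000)


def from_epoch_ms(ms: int) -> datetime:
    """Convert epoch milliseconds back to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```

Using timezone-aware datetimes avoids the local-time interpretation that `timestamp()` applies to naive values.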
Excel Reports
ItemGraphReporter.to_excel() builds all three sheets from a single pass
through to_dataframe() — node.contains() is called once per node.
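The single-pass idea can be shown without pandas or an Excel writer. In the sketch below, `build_sheets` and `StubNode` are hypothetical, the row fields are invented for illustration, and a "broken" dependency is assumed to mean a target that is not present in the graph; the real reporter's columns and rules may differ.

```python
class StubNode:
    """Stand-in for a graph node; counts how often contains() is called."""

    def __init__(self, deps):
        self._deps = list(deps)
        self.calls = 0

    def contains(self):
        self.calls += 1
        return self._deps


def build_sheets(nodes: dict) -> dict:
    """Collect rows for all three sheets in one loop over the nodes."""
    items, edges, broken = [], [], []
    for node_id, node in nodes.items():
        deps = node.contains()  # called exactly once per node
        items.append({"id": node_id, "dependency_count": len(deps)})
        for dep in deps:
            edges.append({"source": node_id, "target": dep})
            if dep not in nodes:  # assumption: missing target means broken
                broken.append({"source": node_id, "missing": dep})
    return {
        "All Items": items,
        "Dependency Edges": edges,
        "Broken Dependencies": broken,
    }
```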