AGIFT Graph

Australian Government Interactive Functions Thesaurus (AGIFT) as a Neo4j knowledge graph with embeddings and dual edge types.

What it does

Fetches the full AGIFT vocabulary from the TemaTres API, builds a Neo4j graph with structural hierarchy edges, generates embeddings (locally for free, or through the Isaacus API), then creates semantic similarity edges between related terms.

TemaTres API ──► Neo4j Graph ──► Embeddings ──► Semantic Edges
  (AGIFT)        (PARENT_OF)     (384/512/768d)  (SIMILAR_TO)
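
The last stage of the pipeline can be pictured as a pairwise comparison of term embeddings. The sketch below is illustrative only and is not taken from the project's code: the function names, the placeholder term labels, the toy 3-dimensional vectors, and the 0.8 threshold are all assumptions.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as unrelated
    return dot / (norm_a * norm_b)


def similar_pairs(embeddings, threshold):
    """Yield (term_a, term_b, score) for every pair scoring above the threshold."""
    for (term_a, vec_a), (term_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score > threshold:
            yield term_a, term_b, score


# Placeholder labels and toy vectors; real embeddings have hundreds of dimensions.
toy_embeddings = {
    "Health care": [0.9, 0.1, 0.0],
    "Hospital services": [0.8, 0.2, 0.1],
    "Road transport": [0.0, 0.1, 0.9],
}

# Each resulting tuple would become one SIMILAR_TO edge in the graph.
edges = list(similar_pairs(toy_embeddings, threshold=0.8))
```

Comparing every pair is quadratic in the number of terms, which is workable for a vocabulary of modest size; a larger one would likely need an approximate nearest-neighbour index instead.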

Graph model

The graph uses two edge types with different weights, so queries can favour structural or semantic links:

Edge        Type        Weight  Description
PARENT_OF   structural  1.0     AGIFT hierarchy (L1 → L2 → L3)
SIMILAR_TO  semantic    0.5     Cosine similarity above threshold
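
The README does not say how the two weights are combined at query time. One plausible reading, sketched here as an assumption, is to score a path by multiplying the weights of its edges, so structural hops preserve the score and each semantic hop halves it. The function name and scoring rule are hypothetical.

```python
# Weights as listed above; the multiplication rule is an assumption.
EDGE_WEIGHTS = {"PARENT_OF": 1.0, "SIMILAR_TO": 0.5}


def path_score(edge_types, weights=EDGE_WEIGHTS):
    """Multiply the weights of the edges along a path (hypothetical scoring rule)."""
    score = 1.0
    for edge_type in edge_types:
        score *= weights[edge_type]
    return score


structural_only = path_score(["PARENT_OF", "PARENT_OF"])
one_semantic_hop = path_score(["PARENT_OF", "SIMILAR_TO"])
two_semantic_hops = path_score(["SIMILAR_TO", "SIMILAR_TO"])
```

Under this rule a query could rank results by score and drop anything below a cut-off, which keeps hierarchy-based matches ahead of purely semantic ones.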

Nodes carry DCAT-AP theme mappings for interoperability with European open data standards.

Quick start

docker compose -f docker-compose.agift.yml up -d --build

Then open the dashboard at http://localhost:5050 and click "Full Pipeline" or "Graph Only".

Embedding providers

Provider                       Cost  Dimensions  Setup
local (sentence-transformers)  Free  384, 768    Nothing; runs on CPU
isaacus (kanon-2-embedder)     Paid  256–1792    Set API key in dashboard

The local provider uses all-MiniLM-L6-v2 (384d) or all-mpnet-base-v2 (768d). Models are downloaded on first run and cached in a Docker volume.
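
As a rough illustration of how a dimension might map to a local model, the sketch below uses the two model names given above with the sentence-transformers `SentenceTransformer` class. The helper names and the default of 384 are assumptions, not the project's actual code.

```python
# Model names and dimensions come from the README; the helpers are illustrative.
LOCAL_MODELS = {
    384: "all-MiniLM-L6-v2",
    768: "all-mpnet-base-v2",
}


def local_model_name(dimension):
    """Return the sentence-transformers model name for a supported dimension."""
    try:
        return LOCAL_MODELS[dimension]
    except KeyError:
        supported = ", ".join(str(d) for d in sorted(LOCAL_MODELS))
        raise ValueError(
            f"local provider supports {supported} dimensions, got {dimension}"
        ) from None


def embed_locally(texts, dimension=384):
    """Embed texts on CPU; the model is downloaded and cached on first use."""
    # Imported lazily so the mapping above works without the package installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(local_model_name(dimension), device="cpu")
    return model.encode(list(texts))
```

Calling `embed_locally(["Health care"], dimension=768)` would return one 768-dimensional vector, after a one-off model download.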

Configuration

Copy .env.example to .env and edit:

cp agift/.env.example .env

Variable         Default   Description
NEO4J_PASSWORD   changeme  Neo4j database password
ISAACUS_API_KEY  (empty)   Isaacus API key (optional)

All other settings (dimension, provider, similarity threshold, semantic edge weight) are configured via the dashboard UI and stored in Neo4j.
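
For scripts that sit alongside the pipeline, the two environment variables could be read as follows. This is a minimal sketch: `load_settings` is a hypothetical helper, and treating an empty API key as "not configured" is an assumption.

```python
import os


def load_settings(environ=None):
    """Read the two documented variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    api_key = env.get("ISAACUS_API_KEY", "").strip()
    return {
        "neo4j_password": env.get("NEO4J_PASSWORD", "changeme"),
        # Assumption: an empty key is treated here as "not configured".
        "isaacus_api_key": api_key or None,
    }


# Passing a dict makes the helper easy to exercise without touching os.environ.
settings = load_settings({"NEO4J_PASSWORD": "s3cret"})
```

The `changeme` default is convenient for local trials, but it should be replaced before the Neo4j ports are exposed beyond localhost.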

Services

Service        Port  Description
Neo4j Browser  7474  Graph database UI
Neo4j Bolt     7687  Database protocol
Dashboard      5050  Config, run controls, logs

CLI usage

# Full pipeline (fetch + graph + embed + semantic edges)
docker exec agift-worker python import_agift.py

# Graph only (no embeddings)
docker exec agift-worker python import_agift.py --skip-embed --skip-semantic

# Local embeddings, 384 dimensions
docker exec agift-worker python import_agift.py --provider local --dimension 384

# Force re-embed all terms
docker exec agift-worker python import_agift.py --force-embed

# Dry run (fetch from API, no writes)
docker exec agift-worker python import_agift.py --dry-run
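
To show how the flags above relate to the four pipeline stages (fetch, graph, embed, link), here is a hypothetical `argparse` reconstruction. The flag names are the ones shown in the commands; the parser layout, the absence of defaults, and the `planned_stages` helper are assumptions and may differ from what `import_agift.py` actually does.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="import_agift.py")
    parser.add_argument("--provider", choices=("local", "isaacus"))
    parser.add_argument("--dimension", type=int)
    parser.add_argument("--skip-embed", action="store_true")
    parser.add_argument("--skip-semantic", action="store_true")
    parser.add_argument("--force-embed", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def planned_stages(args):
    """Return the stages a run would execute, under the assumptions above."""
    if args.dry_run:
        return ["fetch"]  # fetch from the API, write nothing
    stages = ["fetch", "graph"]
    if not args.skip_embed:
        stages.append("embed")
    if not args.skip_semantic:
        stages.append("link")
    return stages


full_run = planned_stages(build_parser().parse_args([]))
graph_only = planned_stages(
    build_parser().parse_args(["--skip-embed", "--skip-semantic"])
)
```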

Docker Hub (no source code needed)

docker compose -f docker-compose.agift.hub.yml up -d

Project structure

agift/
├── import_agift.py          # 4-stage pipeline (fetch/graph/embed/link)
├── dashboard/
│   ├── Dockerfile
│   ├── app.py               # Flask dashboard + run controls
│   └── templates/
│       └── index.html
├── worker/
│   ├── Dockerfile
│   └── entrypoint.sh        # Cron scheduler + manual trigger
├── .env.example
├── LICENSE                   # Apache 2.0
└── README.md

Data source

AGIFT is maintained by the National Archives of Australia and published via TemaTres at https://vocabularyserver.com/agift/

License

Apache 2.0 — see LICENSE.
