AGIFT Graph
Australian Government Interactive Functions Thesaurus (AGIFT) as a Neo4j knowledge graph with embeddings and dual edge types.
What it does
Fetches the full AGIFT vocabulary from the TemaTres API, builds a Neo4j graph with structural hierarchy edges, generates embeddings (free local or Isaacus API), then creates semantic similarity edges between related terms.
TemaTres API ──► Neo4j Graph ──► Embeddings     ──► Semantic Edges
(AGIFT)          (PARENT_OF)     (384/512/768d)     (SIMILAR_TO)
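The last stage of the pipeline above can be sketched in plain Python. This is a minimal illustration of thresholded cosine similarity, not the code in import_agift.py: the function names, the 0.8 default threshold, and the toy vectors are all assumptions made for the example.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pairs(embeddings, threshold=0.8):
    """Return (term_a, term_b, score) for each pair scoring strictly above threshold.

    `embeddings` maps a term label to its vector. The 0.8 default is an
    illustrative value, not the project's setting.
    """
    pairs = []
    # Sort by label so the output order is deterministic.
    for (term_a, vec_a), (term_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score > threshold:
            pairs.append((term_a, term_b, score))
    return pairs


if __name__ == "__main__":
    # Toy 2-d vectors with placeholder labels; real embeddings have hundreds of dimensions.
    toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
    for term_a, term_b, score in similar_pairs(toy):
        print(f"{term_a} -SIMILAR_TO-> {term_b} ({score:.3f})")
```

Comparing every pair is quadratic in the number of terms, which is workable for a vocabulary of thesaurus size but would need an index for much larger graphs.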
Graph model
Two edge types with different weights for query-time flexibility:
| Edge | Type | Weight | Description |
|---|---|---|---|
| `PARENT_OF` | structural | 1.0 | AGIFT hierarchy (L1 → L2 → L3) |
| `SIMILAR_TO` | semantic | 0.5 | Cosine similarity above threshold |
Nodes carry DCAT-AP theme mappings for interoperability with European open data standards.
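To illustrate how the two edge weights might be used at query time, here is a small in-memory sketch that ranks the one-hop neighbours of a term by the strongest edge that reaches them. The edge tuples, the `rank_neighbours` helper, and the placeholder term names are hypothetical; in the real project the edges live in Neo4j and would be queried there.

```python
# Weights taken from the table above; the semantic weight is configurable in the dashboard.
EDGE_WEIGHTS = {"PARENT_OF": 1.0, "SIMILAR_TO": 0.5}


def rank_neighbours(edges, start, weights=EDGE_WEIGHTS):
    """Return (neighbour, weight) pairs one hop from `start`, strongest edge first.

    `edges` is an iterable of (source, relationship_type, target) tuples and is
    treated as directed. Relationship types missing from `weights` are ignored.
    """
    best = {}
    for source, rel_type, target in edges:
        if source != start:
            continue
        weight = weights.get(rel_type, 0.0)
        # Keep only the highest-weighted edge when two edges reach the same target.
        if weight > best.get(target, 0.0):
            best[target] = weight
    # Sort by descending weight, then by label for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    edges = [
        ("A", "PARENT_OF", "B"),
        ("A", "SIMILAR_TO", "C"),
        ("A", "SIMILAR_TO", "B"),
    ]
    print(rank_neighbours(edges, "A"))
```

Passing a different `weights` mapping, for example one that raises `SIMILAR_TO`, changes the ranking without touching the stored edges, which is the kind of flexibility separate edge types allow.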
Quick start
docker compose -f docker-compose.agift.yml up -d --build
Then open the dashboard at http://localhost:5050 and click "Full Pipeline" or "Graph Only".
Embedding providers
| Provider | Cost | Dimensions | Setup |
|---|---|---|---|
| local (sentence-transformers) | Free | 384, 768 | Nothing — runs on CPU |
| isaacus (kanon-2-embedder) | Paid | 256–1792 | Set API key in dashboard |
The local provider uses all-MiniLM-L6-v2 (384d) or all-mpnet-base-v2 (768d). Models are downloaded on first run and cached in a Docker volume.
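As a rough sketch of how the local provider could pick a model from the requested dimension, the snippet below maps 384 and 768 to the two model names given above and encodes text with sentence-transformers. The helper names `local_model_for` and `embed_locally` are hypothetical, and the project may load or cache models differently.

```python
# Dimension-to-model mapping taken from the paragraph above.
LOCAL_MODELS = {384: "all-MiniLM-L6-v2", 768: "all-mpnet-base-v2"}


def local_model_for(dimension):
    """Return the sentence-transformers model name for a supported dimension."""
    try:
        return LOCAL_MODELS[dimension]
    except KeyError:
        supported = ", ".join(str(d) for d in sorted(LOCAL_MODELS))
        raise ValueError(
            f"local provider supports {supported} dimensions, got {dimension}"
        ) from None


def embed_locally(texts, dimension=384):
    """Encode `texts` on CPU; needs `pip install sentence-transformers`.

    The model is downloaded on first use, so the first call can be slow.
    """
    # Imported lazily because the dependency is heavy and optional.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(local_model_for(dimension))
    # Unit-length vectors make cosine similarity a plain dot product.
    return model.encode(texts, normalize_embeddings=True)
```

Calling `embed_locally(["example term"], dimension=768)` would return one 768-dimensional vector per input string.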
Configuration
Copy .env.example to .env and edit:
cp agift/.env.example .env
| Variable | Default | Description |
|---|---|---|
| `NEO4J_PASSWORD` | `changeme` | Neo4j database password |
| `ISAACUS_API_KEY` | (empty) | Isaacus API key (optional) |
All other settings (dimension, provider, similarity threshold, semantic edge weight) are configured via the dashboard UI and stored in Neo4j.
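For reference, the two environment variables and their defaults could be read in Python roughly as follows. The `Settings` dataclass and `load_settings` function are hypothetical names for this sketch and are not part of the package.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    neo4j_password: str
    isaacus_api_key: str


def load_settings(environ=None):
    """Read the two documented variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    return Settings(
        neo4j_password=env.get("NEO4J_PASSWORD", "changeme"),
        # An empty key means the Isaacus provider is simply not configured.
        isaacus_api_key=env.get("ISAACUS_API_KEY", ""),
    )


if __name__ == "__main__":
    settings = load_settings()
    print("Isaacus configured:", bool(settings.isaacus_api_key))
```

The `changeme` default is only suitable for local experiments; set a real password in .env before exposing Neo4j on a network.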
Services
| Service | Port | Description |
|---|---|---|
| Neo4j Browser | 7474 | Graph database UI |
| Neo4j Bolt | 7687 | Database protocol |
| Dashboard | 5050 | Config, run controls, logs |
CLI usage
# Full pipeline (fetch + graph + embed + semantic edges)
docker exec agift-worker python import_agift.py
# Graph only (no embeddings)
docker exec agift-worker python import_agift.py --skip-embed --skip-semantic
# Local embeddings, 384 dimensions
docker exec agift-worker python import_agift.py --provider local --dimension 384
# Force re-embed all terms
docker exec agift-worker python import_agift.py --force-embed
# Dry run (fetch from API, no writes)
docker exec agift-worker python import_agift.py --dry-run
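The flags used above could be parsed with argparse along the following lines. This is a sketch built only from the commands shown here: the defaults, the `choices` list, and the way flags map to stages in `planned_stages` are assumptions, and the real script may behave differently.

```python
import argparse


def build_parser():
    """Build a parser for the flags that appear in the CLI examples."""
    parser = argparse.ArgumentParser(prog="import_agift.py")
    parser.add_argument("--provider", choices=["local", "isaacus"], default=None)
    parser.add_argument("--dimension", type=int, default=None)
    parser.add_argument("--skip-embed", action="store_true")
    parser.add_argument("--skip-semantic", action="store_true")
    parser.add_argument("--force-embed", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def planned_stages(args):
    """Return the stages that would run, assuming each skip flag drops one stage."""
    stages = ["fetch"]
    if args.dry_run:
        # Assumed: a dry run only fetches from the API and writes nothing.
        return stages
    stages.append("graph")
    if not args.skip_embed:
        stages.append("embed")
    if not args.skip_semantic:
        stages.append("link")
    return stages


if __name__ == "__main__":
    args = build_parser().parse_args(["--skip-embed", "--skip-semantic"])
    print(planned_stages(args))
```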
Docker Hub (no source code needed)
docker compose -f docker-compose.agift.hub.yml up -d
Project structure
agift/
├── import_agift.py        # 4-stage pipeline (fetch/graph/embed/link)
├── dashboard/
│   ├── Dockerfile
│   ├── app.py             # Flask dashboard + run controls
│   └── templates/
│       └── index.html
├── worker/
│   ├── Dockerfile
│   └── entrypoint.sh      # Cron scheduler + manual trigger
├── .env.example
├── LICENSE                # Apache 2.0
└── README.md
Data source
AGIFT is maintained by the National Archives of Australia and published via TemaTres at https://vocabularyserver.com/agift/
License
Apache 2.0 — see LICENSE.