Labeled property graph transformer for followthemoney data.
followthemoney-graph
The followthemoney-graph (ftmg) tool transforms and loads FollowTheMoney entity data into a Neo4j property graph database. The tool provides flexible data transformation capabilities including filtering, reification of entity properties into graph nodes, and graph optimization.
Features
- Load FollowTheMoney entities into Neo4j with configurable schema mappings
- Reify entity properties (names, addresses, identifiers, etc.) as graph nodes to reveal shared values
- Automatically create unique constraints and indexes for optimal query performance
- Prune single-reference reified nodes to optimize graph structure
- Support for custom label mappings and schema filtering
- Handles out-of-sequence data (nodes defined after edges that reference them)
Installation
Requirements
- Python 3.10 or higher
- Neo4j 5.0 or higher (running and accessible)
Install from source
git clone https://github.com/opensanctions/followthemoney-graph.git
cd followthemoney-graph
pip install -e .
Install for development
pip install -e ".[dev]"
This includes additional tools like mypy, pytest, and coverage for development.
Configuration
Create a config.yml file to configure database connection and transformation settings:
# Database connection settings
db:
  url: bolt://localhost:7687
  username: neo4j
  password: your_password
# Node configuration
nodes:
  # Schema-specific settings
  schemata:
    Position:
      ignore: true # Skip this entity type
    Address:
      ignore: true # Don't create Address entity nodes
    Person:
      label: "Human" # Use custom label instead of "Person"
  # Property type reification
  types:
    address:
      reify: true # Create separate nodes for address values
    identifier:
      reify: true # Create separate nodes for identifiers
    phone:
      reify: true
    email:
      reify: true
    url:
      reify: true
  # Topic-based labeling
  topics:
    "sanction":
      label: "Sanctioned"
    "role.pep":
      label: "Politician"
    "poi":
      label: "PersonOfInterest"
    "gov.national":
      ignore: true # Skip entities with this topic
# Edge configuration
edges:
  schemata:
    Occupancy:
      ignore: true # Skip this relationship type
Configuration Options
Database (db)
- url: Neo4j connection URL (bolt:// or neo4j://)
- username: Database username
- password: Database password
Nodes (nodes)
Schemata Configuration (nodes.schemata)
Configure how FollowTheMoney entity schemas are mapped:
- ignore: true: Skip entities of this schema type
- label: "CustomLabel": Use a custom Neo4j label instead of the schema name
Type Reification (nodes.types)
Specify which property types should be reified as separate nodes:
- reify: true: Create a separate node for this property type
- When reified, properties like addresses or emails become nodes that can be shared between entities
Topic Labels (nodes.topics)
Map FollowTheMoney topics to Neo4j labels:
- label: "CustomLabel": Apply this label to entities with the topic
- ignore: true: Skip entities with this topic
Edges (edges)
Schemata Configuration (edges.schemata)
Configure which relationship types to include:
- ignore: true: Skip relationships of this type
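To make the interaction between the schema and topic settings concrete, here is a minimal Python sketch of how labels might be resolved for one entity. It is an illustration only, not ftmg's actual code: the `NODES_CONFIG` dictionary and the `resolve_labels` function are hypothetical names, and the rule that an ignored schema or topic skips the whole entity is taken from the option descriptions above.

```python
from typing import Optional

# Hypothetical in-memory form of part of the `nodes` section of config.yml.
NODES_CONFIG = {
    "schemata": {
        "Position": {"ignore": True},
        "Person": {"label": "Human"},
    },
    "topics": {
        "sanction": {"label": "Sanctioned"},
        "role.pep": {"label": "Politician"},
        "gov.national": {"ignore": True},
    },
}


def resolve_labels(
    nodes_config: dict, schema: str, topics: list[str]
) -> Optional[list[str]]:
    """Return the labels for an entity, or None if it should be skipped."""
    schema_cfg = nodes_config.get("schemata", {}).get(schema, {})
    if schema_cfg.get("ignore"):
        return None
    # Base label plus the schema label (or its configured replacement).
    labels = ["Entity", schema_cfg.get("label", schema)]
    for topic in topics:
        topic_cfg = nodes_config.get("topics", {}).get(topic, {})
        if topic_cfg.get("ignore"):
            return None
        label = topic_cfg.get("label")
        if label and label not in labels:
            labels.append(label)
    return labels


print(resolve_labels(NODES_CONFIG, "Person", ["sanction"]))
print(resolve_labels(NODES_CONFIG, "Company", ["gov.national"]))
```

Under these assumptions the first call yields the base label, the custom schema label, and the topic label, while the second returns None because the topic is ignored.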
Usage
The ftmg command-line tool provides several commands for managing your graph database:
Check Configuration
Validate and display the expanded configuration:
ftmg check-config config.yml
This parses your configuration file and outputs the complete configuration including defaults.
Load Data
Load FollowTheMoney entities from a JSON Lines file into Neo4j:
ftmg load config.yml --source entities.ftm.json
This command:
- Creates unique constraints and indexes for all node types
- Reads entities from the source file (JSON Lines format)
- Transforms and loads them into Neo4j according to your configuration
- Handles out-of-sequence data automatically
Source file format: Each line should contain a single FollowTheMoney entity as JSON.
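As an illustration of the source format, the following Python sketch reads entities from a JSON Lines stream using only the standard library. The sample records are invented, and `iter_entities` is a hypothetical helper rather than part of ftmg; the sketch assumes the usual FollowTheMoney entity shape of an `id`, a `schema`, and a `properties` object whose values are lists.

```python
import io
import json
from typing import Iterator, TextIO

# Invented sample data: one entity per line, blank lines tolerated.
SAMPLE = (
    '{"id": "p1", "schema": "Person", '
    '"properties": {"name": ["Jane Doe"], "email": ["jane@example.com"]}}\n'
    "\n"
    '{"id": "c1", "schema": "Company", "properties": {"name": ["Acme Ltd"]}}\n'
)


def iter_entities(stream: TextIO) -> Iterator[dict]:
    """Yield one entity dict per non-empty line of a JSON Lines stream."""
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        entity = json.loads(line)
        if "id" not in entity or "schema" not in entity:
            raise ValueError(f"line {line_no}: entity needs 'id' and 'schema'")
        yield entity


for entity in iter_entities(io.StringIO(SAMPLE)):
    print(entity["id"], entity["schema"])
```

A quick pass like this can catch malformed lines before a load is attempted.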
Prune Graph
Remove reified value nodes that are only referenced by a single entity:
ftmg prune config.yml
This optimization command:
- Identifies reified nodes (addresses, emails, identifiers, etc.)
- Counts unique entities referencing each reified node
- Deletes nodes referenced by fewer than 2 entities
- Reports the number of nodes pruned per type
Why prune? Reified nodes are most valuable when they reveal shared values between multiple entities. Single-reference reified nodes don't add structural value to the graph.
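The pruning rule can be sketched in plain Python over an in-memory list of references. This is not how ftmg talks to Neo4j; it only illustrates the counting logic described above. The `REFERENCES` data, the `type:value` node ids, and the `find_prunable` function are all assumptions made for the example.

```python
from collections import defaultdict

# Invented (entity_id, value_node_id) pairs standing in for graph edges.
REFERENCES = [
    ("p1", "email:jane@example.com"),
    ("p2", "email:jane@example.com"),
    ("p1", "phone:+15550100"),
    ("p1", "phone:+15550100"),  # repeated reference from the same entity
]


def find_prunable(
    references: list[tuple[str, str]], min_entities: int = 2
) -> list[str]:
    """Return value nodes referenced by fewer than `min_entities` entities."""
    referrers: dict[str, set[str]] = defaultdict(set)
    for entity_id, value_id in references:
        # A set counts each referencing entity once, however many edges exist.
        referrers[value_id].add(entity_id)
    return sorted(
        value_id
        for value_id, entities in referrers.items()
        if len(entities) < min_entities
    )


print(find_prunable(REFERENCES))
```

Here the phone node is prunable because both of its references come from the same entity, while the email node is kept because two distinct entities share it.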
Delete All Data
Completely wipe the database (use with caution):
ftmg trash config.yml
This command requires confirmation and will delete all nodes and relationships.
Examples
Basic Workflow
# 1. Validate your configuration
ftmg check-config config.yml
# 2. Load your data
ftmg load config.yml --source my-entities.ftm.json
# 3. Optimize the graph by removing single-reference reified nodes
ftmg prune config.yml
Starting Fresh
# Clear the database
ftmg trash config.yml
# Load new data
ftmg load config.yml --source entities.ftm.json
Graph Structure
Entity Nodes
Entities are loaded as nodes with:
- Base label: Entity
- Schema label: e.g., Person, Company, Asset
- Topic labels: e.g., Sanctioned, Politician (if configured)
- Properties: All entity properties as node properties
- Special property: id (unique constraint enforced)
Reified Value Nodes
When property types are marked for reification:
- Each unique value becomes a separate node
- Relationships connect entities to value nodes
- Value nodes can be shared between entities
- Labels: e.g., address, identifier, email
- Special property: id (unique constraint enforced)
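A rough Python sketch of reification, assuming a hand-written mapping from property names to property types (the real tool would presumably derive this from the FollowTheMoney model). The `PROPERTY_TYPES` table, the `type:value` id scheme, and the `reify` function are hypothetical and exist only to show how two entities end up pointing at the same value node.

```python
# Hypothetical lookup from property name to property type.
PROPERTY_TYPES = {"email": "email", "registrationNumber": "identifier"}
# Property types marked with `reify: true` in this example.
REIFIED_TYPES = {"email", "identifier"}

# Invented sample entities sharing one email address.
ENTITIES = [
    {"id": "p1", "schema": "Person",
     "properties": {"name": ["Jane Doe"], "email": ["shared@example.com"]}},
    {"id": "c1", "schema": "Company",
     "properties": {"name": ["Acme Ltd"], "email": ["shared@example.com"]}},
]


def reify(entities: list[dict]) -> tuple[dict[str, dict], list[tuple[str, str, str]]]:
    """Return (value_nodes keyed by id, edges as (entity_id, prop, node_id))."""
    value_nodes: dict[str, dict] = {}
    edges: list[tuple[str, str, str]] = []
    for entity in entities:
        for prop, values in entity["properties"].items():
            type_name = PROPERTY_TYPES.get(prop)
            if type_name not in REIFIED_TYPES:
                continue  # property stays on the entity node
            for value in values:
                node_id = f"{type_name}:{value}"
                # Each unique value becomes exactly one node.
                value_nodes.setdefault(
                    node_id, {"id": node_id, "label": type_name, "value": value}
                )
                edges.append((entity["id"], prop, node_id))
    return value_nodes, edges


nodes, edges = reify(ENTITIES)
print(len(nodes), len(edges))
```

With the sample data, one value node is created and both entities link to it, which is the kind of shared value that reification is meant to surface.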
Relationships
Entity relationships are preserved as graph edges with:
- Relationship type based on the FollowTheMoney schema
- Properties from the relationship entity
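Because relationships refer to entities by id, an edge can arrive before the nodes it connects. One common way to cope with this, shown below as a toy in-memory sketch, is to create empty placeholder nodes for unseen endpoints and fill them in later. Whether ftmg works this way internally is an assumption; the `GraphBuffer` class and the `OWNERSHIP` relationship type are hypothetical.

```python
class GraphBuffer:
    """Toy in-memory graph; a stand-in for the database, not ftmg's code."""

    def __init__(self) -> None:
        self.nodes: dict[str, dict] = {}
        self.edges: list[tuple[str, str, str]] = []

    def upsert_node(self, node_id: str, **properties) -> None:
        # Create the node if it is missing, then merge in the properties.
        self.nodes.setdefault(node_id, {}).update(properties)

    def add_edge(self, source: str, rel_type: str, target: str) -> None:
        # Endpoints that have not been seen yet become empty placeholders.
        self.upsert_node(source)
        self.upsert_node(target)
        self.edges.append((source, rel_type, target))


graph = GraphBuffer()
graph.add_edge("p1", "OWNERSHIP", "c1")   # edge arrives first
graph.upsert_node("c1", name="Acme Ltd")  # node definition arrives later
print(graph.nodes)
```

The end state is the same regardless of arrival order, since creating a node and updating it are both idempotent on the node id.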
Development
Running Tests
pytest
Type Checking
mypy ftmg
Code Coverage
pytest --cov=ftmg --cov-report=html
Releasing
Releases to PyPI are published automatically by the build GitHub Actions workflow when a version tag is pushed, using PyPI Trusted Publishing (OIDC); no API token is stored in the repository.
To cut a release:
bump2version patch # or: minor / major — creates a commit and a vX.Y.Z tag
git push --follow-tags
The tag push runs the test/lint/type-check job, builds the wheel + sdist, attaches a build-provenance attestation, and publishes to PyPI.
License
MIT License - see LICENSE file for details.