stix2nx
Convert STIX cyber threat intelligence bundles to NetworkX graphs.
Installation
pip install stix2nx
Quick Start
from stix2nx import stix_to_graph
# Convert a STIX bundle file to a NetworkX graph
G = stix_to_graph("enterprise-attack.json")
print(f"{len(G.nodes):,} nodes, {len(G.edges):,} edges")
# → 15,058 nodes, 25,383 edges
Before / After
Before (without stix2nx):
import json
import networkx as nx

with open("enterprise-attack.json") as f:
    bundle = json.load(f)

G = nx.MultiDiGraph()
for obj in bundle["objects"]:
    if obj["type"] == "relationship":
        # Relationships become edges; remaining properties are kept as edge attributes.
        # relationship_type is excluded from the dict because it is already passed
        # explicitly (passing it twice raises a TypeError).
        G.add_edge(
            obj["source_ref"],
            obj["target_ref"],
            relationship_type=obj["relationship_type"],
            id=obj["id"],
            **{k: v for k, v in obj.items()
               if k not in ("type", "source_ref", "target_ref",
                            "relationship_type", "id")},
        )
    elif obj["type"] == "sighting":
        # Sightings become a node plus one edge per reference.
        G.add_node(obj["id"], **obj)
        if "sighting_of_ref" in obj:
            G.add_edge(obj["id"], obj["sighting_of_ref"],
                       relationship_type="sighting_of")
        for ref in obj.get("where_sighted_refs", []):
            G.add_edge(obj["id"], ref, relationship_type="seen_by")
        for ref in obj.get("observed_data_refs", []):
            G.add_edge(obj["id"], ref, relationship_type="observed")
    elif obj["type"] in ("marking-definition", "language-content"):
        continue
    else:
        G.add_node(obj["id"], **obj)
        # Don't forget STIX 2.0 embedded SCOs in observed-data...
        if obj["type"] == "observed-data" and "objects" in obj:
            for key, embedded in obj["objects"].items():
                synthetic_id = f"{obj['id']}--embedded-{key}"
                G.add_node(synthetic_id, **embedded, id=synthetic_id)
After (with stix2nx):
from stix2nx import stix_to_graph
G = stix_to_graph("enterprise-attack.json")
API Reference
stix_to_graph(source, graph_type="multidigraph", include_scos=True)
Parameters:
- source: str | list[str] | list[dict]
  - File path (str ending in .json or pointing to an existing file): reads and parses a single STIX bundle file.
  - Directory path (str pointing to an existing directory): globs all *.json files in the directory and merges them into one graph.
  - list[str]: each string is parsed as a full STIX bundle in JSON form.
  - list[dict]: each dict is treated as an already-parsed STIX bundle.
- graph_type: "multidigraph" (default) | "digraph"
  - MultiDiGraph allows multiple edges between the same pair of nodes, which is technically correct for STIX (a threat actor can have both a "uses" and an "attributed-to" relationship to the same malware). However, some NetworkX algorithms (such as certain centrality measures) are not implemented for multigraphs and require a simple DiGraph. Choose based on your use case.
- include_scos: bool (default True)
  - When True, STIX Cyber-observable Objects (IP addresses, domain names, file hashes, etc.) become nodes. When False, only SDOs and relationships are included.
Returns:
nx.MultiDiGraph or nx.DiGraph, depending on graph_type
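To make the accepted forms of source concrete, here is a minimal sketch of how such an argument could be normalized into a list of parsed bundles. The helper name load_bundles is hypothetical and is not part of stix2nx; the library's own dispatch logic may differ (for example in how it validates paths).

```python
import json
from pathlib import Path


def load_bundles(source):
    """Normalize `source` into a list of parsed bundle dicts.

    Hypothetical helper for illustration only; not stix2nx's implementation.
    """
    if isinstance(source, str):
        path = Path(source)
        if path.is_dir():
            # Directory: one bundle per *.json file, sorted for a stable order.
            files = sorted(path.glob("*.json"))
        else:
            # Assumption: any other string is the path of a single bundle file.
            files = [path]
        return [json.loads(f.read_text(encoding="utf-8")) for f in files]

    bundles = []
    for item in source:
        # list[str] carries raw JSON text; list[dict] carries parsed bundles.
        bundles.append(json.loads(item) if isinstance(item, str) else item)
    return bundles


# Usage with in-memory inputs only, so no files are required.
raw = '{"type": "bundle", "id": "bundle--1", "objects": []}'
parsed = {"type": "bundle", "id": "bundle--2", "objects": []}
bundles = load_bundles([raw, parsed])
print([b["id"] for b in bundles])
# → ['bundle--1', 'bundle--2']
```

Note that a top-level string is always read as a path, while strings inside a list are read as JSON text, which matches the parameter description above.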
Graph Structure
STIX objects map to graph elements as follows:
| STIX Object | Graph Element | Details |
|---|---|---|
| SDOs (threat-actor, malware, etc.) | Nodes | All properties become node attributes |
| SCOs (ipv4-addr, file, etc.) | Nodes | When include_scos=True |
| Relationships | Directed edges | relationship_type, start_time, stop_time, confidence as attributes |
| Sightings | Nodes + edges | sighting_of, seen_by, observed edges |
| Marking definitions | Skipped | Not added to the graph |
| Language content | Skipped | Not added to the graph |
All STIX properties on objects are preserved as NetworkX attributes. List-valued properties remain Python lists.
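The mapping in the table can be read as a small decision procedure. The sketch below encodes it as a function that says what role a STIX object would play in the graph. The function name graph_role and the constant names are hypothetical; the SCO type set is the list of cyber-observable types from the STIX 2.1 specification, and stix2nx may detect SCOs differently.

```python
# STIX 2.1 Cyber-observable Object types (assumed basis for the include_scos filter).
SCO_TYPES = {
    "artifact", "autonomous-system", "directory", "domain-name", "email-addr",
    "email-message", "file", "ipv4-addr", "ipv6-addr", "mac-addr", "mutex",
    "network-traffic", "process", "software", "url", "user-account",
    "windows-registry-key", "x509-certificate",
}
SKIPPED_TYPES = {"marking-definition", "language-content"}


def graph_role(obj, include_scos=True):
    """Return 'node', 'edge', 'node+edges', or 'skip' for one STIX object."""
    obj_type = obj["type"]
    if obj_type in SKIPPED_TYPES:
        return "skip"
    if obj_type == "relationship":
        return "edge"
    if obj_type == "sighting":
        return "node+edges"
    if obj_type in SCO_TYPES and not include_scos:
        return "skip"
    # Everything else (SDOs, and SCOs when include_scos=True) becomes a node.
    return "node"


objects = [
    {"type": "threat-actor", "id": "threat-actor--example"},
    {"type": "ipv4-addr", "id": "ipv4-addr--example"},
    {"type": "relationship", "id": "relationship--example"},
    {"type": "sighting", "id": "sighting--example"},
    {"type": "marking-definition", "id": "marking-definition--example"},
]
print([graph_role(o, include_scos=False) for o in objects])
# → ['node', 'skip', 'edge', 'node+edges', 'skip']
```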
Working with the Graph
from stix2nx import stix_to_graph
G = stix_to_graph("enterprise-attack.json")
Find all techniques used by APT28:
apt28 = [n for n, d in G.nodes(data=True) if d.get("name") == "APT28"][0]
techniques = [
    G.nodes[target]["name"]
    for _, target, data in G.edges(apt28, data=True)
    if data.get("relationship_type") == "uses"
    and G.nodes[target].get("type") == "attack-pattern"
]
Most connected threat actors (by degree):
actors = [(n, G.degree(n)) for n, d in G.nodes(data=True) if d.get("type") == "intrusion-set"]
top_actors = sorted(actors, key=lambda x: x[1], reverse=True)[:10]
Merge multiple bundles:
G = stix_to_graph("/path/to/stix-bundles/") # all .json files in directory
Use a DiGraph for algorithm compatibility:
import networkx as nx
G = stix_to_graph("enterprise-attack.json", graph_type="digraph")
centrality = nx.betweenness_centrality(G)
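A simple DiGraph holds at most one edge per ordered node pair, so parallel STIX relationships cannot all survive as separate edges. This README does not say how graph_type="digraph" resolves them. If you need to keep every relationship type, one option is to build the default MultiDiGraph and collapse it yourself. The sketch below does that on a toy graph with made-up IDs; collapse_parallel_edges and the relationship_types attribute are hypothetical names, not part of stix2nx.

```python
import networkx as nx


def collapse_parallel_edges(multi):
    """Build a DiGraph whose edges list every relationship_type seen for a pair."""
    simple = nx.DiGraph()
    simple.add_nodes_from(multi.nodes(data=True))
    for u, v, data in multi.edges(data=True):
        rel = data.get("relationship_type")
        if simple.has_edge(u, v):
            # Pair already present: record the extra relationship type.
            simple[u][v]["relationship_types"].append(rel)
        else:
            simple.add_edge(u, v, relationship_types=[rel])
    return simple


# Toy multigraph standing in for the result of stix_to_graph().
M = nx.MultiDiGraph()
M.add_edge("threat-actor--a", "malware--m", relationship_type="uses")
M.add_edge("threat-actor--a", "malware--m", relationship_type="attributed-to")

D = collapse_parallel_edges(M)
print(D.number_of_edges(), D["threat-actor--a"]["malware--m"]["relationship_types"])
# → 1 ['uses', 'attributed-to']
```

The collapsed graph can then be passed to algorithms that reject multigraphs, while the edge attribute still records which relationships were merged.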
Visualization
The example figure is generated with examples/visualize_attack.py, which extracts a 2-hop neighborhood around APT28 from the ATT&CK dataset and color-codes nodes by type.
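The script itself is not reproduced here, so the following is only a sketch of one way to get a 2-hop neighborhood and per-type colors with NetworkX's ego_graph. The toy graph, its IDs, and the COLOR_BY_TYPE palette are made up for illustration; the real script may take a different approach.

```python
import networkx as nx

# Toy graph with made-up IDs; a real graph would come from stix_to_graph().
G = nx.DiGraph()
G.add_node("intrusion-set--apt28", type="intrusion-set", name="APT28")
G.add_node("attack-pattern--t1", type="attack-pattern", name="Example Technique")
G.add_node("course-of-action--c1", type="course-of-action", name="Example Mitigation")
G.add_node("malware--far", type="malware", name="Three Hops Away")
G.add_edge("intrusion-set--apt28", "attack-pattern--t1", relationship_type="uses")
G.add_edge("course-of-action--c1", "attack-pattern--t1", relationship_type="mitigates")
G.add_edge("malware--far", "course-of-action--c1", relationship_type="related-to")

# undirected=True follows edges in both directions, so objects that point
# at the center's neighbors (such as mitigations) are included as well.
neighborhood = nx.ego_graph(G, "intrusion-set--apt28", radius=2, undirected=True)

# Hypothetical palette: one color per STIX type, gray for anything unlisted.
COLOR_BY_TYPE = {
    "intrusion-set": "red",
    "attack-pattern": "orange",
    "course-of-action": "green",
}
node_colors = [
    COLOR_BY_TYPE.get(data.get("type"), "gray")
    for _, data in neighborhood.nodes(data=True)
]

print(sorted(neighborhood.nodes))
# → ['attack-pattern--t1', 'course-of-action--c1', 'intrusion-set--apt28']
```

With matplotlib installed, the subgraph could then be drawn with nx.draw(neighborhood, node_color=node_colors).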
Running Tests
# Install dev dependencies
pip install -e ".[dev]"
# Run all tests (uses curated ATT&CK subset, no network needed)
pytest
# Run integration test with full live ATT&CK bundle (~30MB download)
STIX2NX_LIVE_ATTACK=true pytest tests/test_attack.py -v
# Regenerate the curated subset from latest ATT&CK (requires network)
python tests/data/build_subset.py
STIX Version Support
Supports both STIX 2.0 and STIX 2.1 bundles. STIX 2.0 observed-data objects with embedded observables are automatically extracted as standalone nodes when include_scos=True.
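In STIX 2.0, observables live inside an observed-data object under an objects dictionary keyed by short strings, so they have no IDs of their own. The sketch below shows one way to lift them into standalone nodes, reusing the synthetic ID scheme from the "Before" snippet above. The function name extract_embedded_scos is hypothetical, and the README does not state that stix2nx uses this exact ID format internally.

```python
def extract_embedded_scos(observed_data):
    """Return (node_id, attributes) pairs for STIX 2.0 embedded observables."""
    nodes = []
    for key, embedded in observed_data.get("objects", {}).items():
        # Embedded observables carry no ID, so derive one from the parent.
        synthetic_id = f"{observed_data['id']}--embedded-{key}"
        nodes.append((synthetic_id, {**embedded, "id": synthetic_id}))
    return nodes


# Made-up STIX 2.0 style object for illustration.
observed = {
    "type": "observed-data",
    "id": "observed-data--example",
    "objects": {
        "0": {"type": "ipv4-addr", "value": "198.51.100.7"},
        "1": {"type": "domain-name", "value": "example.com"},
    },
}

for node_id, attrs in extract_embedded_scos(observed):
    print(node_id, attrs["type"])
# → observed-data--example--embedded-0 ipv4-addr
# → observed-data--example--embedded-1 domain-name
```

A STIX 2.1 observed-data object without an objects key simply yields an empty list.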
License
MIT