Extract threat indicators (IOCs) from unstructured text — IPs, domains, URLs, hashes, CVEs, MITRE techniques, threat actors, and malware families. Layer 1 of an IOC-lifecycle toolkit.
iocflow
Pull indicators of compromise out of unstructured text (threat-intel reports, advisories, emails, tickets) in one call. iocflow extracts IPs, domains, URLs, filenames, file hashes, CVEs, email addresses, MITRE ATT&CK technique IDs, threat actors, and malware families, with the false-positive defenses you'd otherwise write by hand: a Public Suffix List domain validator, benign-domain/IP allowlists, hash de-duplication across MD5/SHA1/SHA256, and re-fanging of defanged IOCs.
from iocflow import extract
text = """
APT28 (a.k.a. Fancy Bear) staged Cobalt Strike from evil-domain[.]ru and
185.220.101.5, dropping install.ps1 (MD5 a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4).
Exploited CVE-2021-44228 via T1190. Contact: ops@evil-domain[.]ru.
"""
entities = extract(text)
print(entities.summary())
# 1 IPs, 1 domains, 1 filenames, 1 hashes, 1 CVEs, 1 emails, 1 threat actors, 1 MITRE techniques
for ind in entities.iter_indicators():
    print(ind.kind, ind.value)
# ip 185.220.101.5
# domain evil-domain.ru
# ...
The defanged evil-domain[.]ru and ops@evil-domain[.]ru are re-fanged
automatically; 185.220.101.5 is kept while private/benign IPs are dropped.
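To make the re-fanging step concrete, here is a minimal standalone sketch using only the standard library. It is not iocflow's implementation: the function name refang_sketch and the rule list are assumptions that cover a few common defanging conventions ([.], (.), [dot], [at], [:], hxxp), and the real refang_text may handle more or fewer forms.

```python
import re

# Hypothetical rule set for illustration; iocflow's own rules may differ.
# Order matters: "[:]" is restored before the hxxp rule looks for "://".
_REFANG_RULES = [
    (re.compile(r"\[\.\]|\(\.\)|\[dot\]", re.IGNORECASE), "."),
    (re.compile(r"\[@\]|\[at\]", re.IGNORECASE), "@"),
    (re.compile(r"\[:\]"), ":"),
    (re.compile(r"\bhxxp(s?)://", re.IGNORECASE), r"http\1://"),
]


def refang_sketch(text: str) -> str:
    """Apply each substitution rule in turn and return the re-fanged text."""
    for pattern, replacement in _REFANG_RULES:
        text = pattern.sub(replacement, text)
    return text


print(refang_sketch("hxxps[:]//evil-domain[.]ru/a and ops@evil-domain[.]ru"))
# https://evil-domain.ru/a and ops@evil-domain.ru
```

Running the substitutions before any extraction regex means the extractors only ever need to recognize the normal, fanged forms.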
Install
pip install iocflow # core — one dependency (tldextract)
pip install "iocflow[mitre]" # + a ready-made MITRE ATT&CK malware-name source
What it extracts
extract(text) returns an ExtractedEntities with:
- ips: public IPv4, excluding private ranges, benign IPs, and version-number-like values
- domains: validated against the Mozilla Public Suffix List via tldextract
- urls: both https://… and bare host/path forms (so package-registry paths survive)
- filenames: suspicious script/executable/macro/archive filenames
- hashes: {"md5": [...], "sha1": [...], "sha256": [...]}, de-duplicated across lengths
- cves: CVE-YYYY-NNNN+, normalized to uppercase
- emails: email addresses
- mitre_techniques: T1059, T1059.001, …
- threat_actors (+ threat_actors_enriched): APT/UNC/FIN/TA/DEV/STORM designators, a curated well-known list, and the "<Name> ransomware" pattern
- malware_families: populated when you supply a malware-name source (see below)
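As an illustration of the hashes shape listed above, the following standalone sketch buckets hex digests by length and de-duplicates them. It is an assumption about one reasonable approach, not iocflow's code; bucket_hashes and its constants are hypothetical names.

```python
import re

# Match a whole hex run bounded by non-word characters, so the first 32
# characters of a SHA256 digest are never reported as a separate MD5.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{32,64}\b")
KIND_BY_LENGTH = {32: "md5", 40: "sha1", 64: "sha256"}


def bucket_hashes(text: str) -> dict:
    """Return {"md5": [...], "sha1": [...], "sha256": [...]} with no repeats."""
    buckets = {"md5": [], "sha1": [], "sha256": []}
    for token in HEX_RUN.findall(text):
        kind = KIND_BY_LENGTH.get(len(token))  # other lengths are ignored
        value = token.lower()                  # case-insensitive de-duplication
        if kind and value not in buckets[kind]:
            buckets[kind].append(value)
    return buckets


sample = "MD5 a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4, repeated as A1B2C3D4E5F6A1B2C3D4E5F6A1B2C3D4"
print(bucket_hashes(sample)["md5"])
# ['a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4']
```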
Each individual extractor is also importable and composable:
from iocflow import extract_ips, extract_hashes, refang_text
extract_ips(refang_text("c2 at 185[.]220[.]101[.]5")) # ['185.220.101.5']
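For readers curious what "public IPv4, excluding private ranges and benign IPs" can look like in practice, here is a rough standard-library sketch. It does not reproduce extract_ips; the function public_ipv4s, the regex, and the tiny benign set are all assumptions made for illustration.

```python
import ipaddress
import re

# Four dotted groups that are not part of a longer dotted number
# (so a version string such as "1.2.3.4.5" is skipped entirely).
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")

# Hypothetical allowlist; a real deployment would supply its own.
BENIGN_IPS = frozenset({"8.8.8.8", "1.1.1.1"})


def public_ipv4s(text: str, benign=BENIGN_IPS) -> list:
    """Return globally routable IPv4 addresses in first-seen order."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            addr = ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # e.g. an octet above 255
        if not addr.is_global or candidate in benign:
            continue  # private, loopback, reserved, or allowlisted
        if candidate not in found:
            found.append(candidate)
    return found


print(public_ipv4s("c2 at 185.220.101.5, lan 10.0.0.7, dns 8.8.8.8"))
# ['185.220.101.5']
```

Delegating the range checks to ipaddress avoids hand-maintaining a list of private and reserved networks.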
Pluggable name sources
The core has no external-data dependency. Two enrichment sources are optional and supplied by you, so iocflow drops cleanly into any environment — plug in your own feeds, or use the bundled MITRE extra.
Malware families. Give extract a MalwareNames and it matches families
(with alias-to-canonical normalization) behind a three-layer false-positive
defense. Build one from your own list, from MITRE-shaped records, or from the
optional extra:
from iocflow import extract, MalwareNames
# Your own list:
names = MalwareNames.from_names(["Cobalt Strike", "Emotet", "Qakbot"])
entities = extract(report_text, malware_names=names)
# Or the bundled MITRE ATT&CK source (needs: pip install "iocflow[mitre]"):
from iocflow.mitre import mitre_malware_names
entities = extract(report_text, malware_names=mitre_malware_names())
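The alias-to-canonical idea can be sketched independently of iocflow. The example below is only an assumption about one plausible matching strategy (whole-word, case-insensitive, longest alias first); it says nothing about what the three layers inside MalwareNames actually are, and compile_name_matcher is a hypothetical helper.

```python
import re


def compile_name_matcher(alias_to_canonical: dict):
    """Build a finder that maps any alias in the text to its canonical name."""
    # Longest aliases first so a longer name wins over a name it contains.
    aliases = sorted(alias_to_canonical, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(a) for a in aliases) + r")(?!\w)",
        re.IGNORECASE,
    )
    lowered = {alias.lower(): canon for alias, canon in alias_to_canonical.items()}

    def find(text: str) -> list:
        seen = []
        for match in pattern.finditer(text):
            canonical = lowered[match.group(0).lower()]
            if canonical not in seen:
                seen.append(canonical)
        return seen

    return find


find = compile_name_matcher({"Qakbot": "Qakbot", "QBot": "Qakbot", "Emotet": "Emotet"})
print(find("QBot and qakbot were seen; the word emotets is not a match"))
# ['Qakbot']
```

The lookarounds keep a family name from matching inside a longer word, which is one of the simpler ways to cut false positives for short names.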
Threat-actor aliases. Give extract an ActorAliases to match a custom
name set and enrich actors with common_name / region / all_names. Without
it, actors are still found by pattern and curated list:
from iocflow import extract, ActorAliases
aliases = ActorAliases.from_index({
    "apt28": {"common_name": "APT28", "region": "Russia",
              "all_names": ["Fancy Bear", "Sofacy", "Sednit"]},
})
entities = extract(report_text, actor_aliases=aliases)
entities.threat_actors_enriched[0].region # "Russia"
entities.threat_actors_enriched[0].aliases_display() # "Fancy Bear, Sofacy, Sednit"
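If you maintain your own actor index, a small reverse lookup is a handy way to sanity-check it before handing it to ActorAliases.from_index. This sketch only assumes the record shape shown above (common_name, region, all_names); build_alias_lookup is a hypothetical helper and is not part of iocflow.

```python
def build_alias_lookup(index: dict) -> dict:
    """Map every known name (lowercased) to its record's common_name."""
    lookup = {}
    for key, record in index.items():
        names = [key, record["common_name"], *record.get("all_names", [])]
        for name in names:
            lookup[name.lower()] = record["common_name"]
    return lookup


index = {
    "apt28": {"common_name": "APT28", "region": "Russia",
              "all_names": ["Fancy Bear", "Sofacy", "Sednit"]},
}
lookup = build_alias_lookup(index)
print(lookup["fancy bear"])
# APT28
```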
Command line
iocflow "APT28 used 185.220.101.5 and evil[.]example[.]com"
echo "report text…" | iocflow --json
iocflow --mitre "Emotet dropped Cobalt Strike" # needs iocflow[mitre]
Where this is going
iocflow is Layer 1 of an IOC-lifecycle toolkit. The plan is to grow it in
independently useful layers, each behind its own pip extra: enrichment
(VirusTotal, Recorded Future, AbuseIPDB, Shodan, abuse.ch), AI commentary,
suggested hunts, and optional perimeter blocking — each configured by plugging
in your own API keys. ExtractedEntities (and its iter_indicators() view) is
the stable hand-off type those layers consume.
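A downstream layer only needs the kind and value attributes shown in the quick-start loop. The sketch below groups indicators by kind; the Indicator dataclass is a stand-in defined here so the example runs without iocflow installed, and the real objects yielded by iter_indicators() may carry more fields.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """Stand-in for illustration; only .kind and .value are assumed."""
    kind: str
    value: str


def group_by_kind(indicators) -> dict:
    """Collect indicator values under their kind, preserving input order."""
    grouped = defaultdict(list)
    for ind in indicators:
        grouped[ind.kind].append(ind.value)
    return dict(grouped)


demo = [Indicator("ip", "185.220.101.5"), Indicator("domain", "evil-domain.ru")]
print(group_by_kind(demo))
# {'ip': ['185.220.101.5'], 'domain': ['evil-domain.ru']}
```

With iocflow installed, the same function should accept entities.iter_indicators() directly, provided the yielded objects expose kind and value as in the example near the top.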
License
MIT