Static IOC extraction engine for binaries, text, and logs.

malx‑ioc‑extractor

Static IOC extraction for binaries, text, and artifacts — fast, safe, and open‑source.

malx‑ioc‑extractor is a lightweight, extensible engine for extracting Indicators of Compromise (IOCs) using pure static analysis. No execution. No sandboxing. No risk. Built for DFIR workflows, SOC automation, and large‑scale threat analysis.

It’s designed to be:

Safe — never executes untrusted code
Fast — built for automation and pipelines
Extensible — plug in your own regexes, parsers, and rules
Developer‑friendly — clean API, CLI, and examples
Open‑source — the extraction engine is free; enrichment lives in the MalX cloud platform

This project is the foundation of the MalX Labs ecosystem for scalable, modern threat‑analysis tooling.

Why malx‑ioc‑extractor?

malx‑ioc‑extractor is designed for environments where safety, determinism, and automation matter. While many IOC extractors operate only on raw text, malx‑ioc‑extractor includes binary‑aware static analysis and an extensible rule system, making it suitable for DFIR pipelines, CI systems, and high‑volume threat‑intel processing.

Key advantages

Static‑only design — no execution, no sandboxing, and no risk of running untrusted code
Binary parsing — extracts indicators from Windows PE files in addition to raw text
Deterministic behaviour — stable output and predictable performance, ideal for automated workflows
Extensible rule engine — plug in custom detectors, parsers, and enrichment logic
Consistent JSON schema — uniform output that integrates cleanly with SIEM, SOAR, and log pipelines
Low dependency footprint — minimal attack surface and safe for enterprise environments
Designed for pipelines — fast start‑up, fast throughput, and no heavyweight runtime requirements

Use Cases

malx‑ioc‑extractor fits naturally into DFIR, security automation, and threat‑intelligence workflows. Typical usage patterns include:

SOC & Incident Response

Extract indicators from suspicious emails, alerts, or analyst clipboard text
Parse IOCs from incident reports and triage notes into structured JSON
Safely inspect malware samples statically without executing anything

Threat Intelligence Processing

Normalize indicators from threat‑intel feeds
Batch‑process dumps of unstructured text into machine‑readable IOC sets
Build enrichment pipelines on top of the deterministic output format

CI/CD & DevSecOps

Scan new binaries for embedded indicators before publishing artifacts
Integrate IOC extraction into automated security checks
Detect accidental inclusion of URLs or addresses during build steps

Bulk Automation & Scripting

Pipe logs, artifacts, or telemetry through malx‑ioc‑extractor to extract actionable indicators
Use the Python API for batch workflows, ETL pipelines, or custom tooling
Combine with rule extensions to tailor detection to internal patterns or datasets

v0.2.0 — High‑Reliability IP Detection in Hostile Data

Version 0.2.0 significantly improves IPv4/IPv6 extraction in noisy, malformed, mixed-content environments — the kind often seen in:

SIEM log lines
network captures
DFIR corpus samples
pasted analyst dumps

Real CLI Output (Chaos Corpus Sample)

$ iocx chaos_corpus.json
{
  "file": "examples/samples/structured/chaos_corpus.json",
  "type": "text",
  "iocs": {
    "urls": [
      "http://[2001:db8::1]:443"
    ],
    "domains": [],
    "ips": [
      "2001:db8::1",
      "2001:db8::1:443",
      "10.0.0.1",
      "192.168.1.10",
      "fe80::dead:beef%eth0",
      "1.2.3.4",
      "fe80::1%eth0",
      "192.168.1.110",
      "fe80::1%eth0fe80",
      "::2%eth1",
      "2001:db8::"
    ],
    "hashes": [],
    "emails": [],
    "filepaths": [],
    "base64": []
  },
  "metadata": {}
}

Chaos Corpus: Input → Extracted Output → Explanation

Input	Extracted Output	Explanation
fe80::dead:beef%eth0/garbage	fe80::dead:beef%eth0	Salvaged valid IPv6, junk ignored.
xxx192.168.1.10yyy	192.168.1.10	IPv4 inside junk text.
DROP:client=10.0.0.1;;;ERR	10.0.0.1	IPv4 from noisy log field.
[2001:db8::1]::::443	2001:db8::1, 2001:db8::1:443	IPv6 and IPv6+port extracted.
GET http://[2001:db8::1]:443/index	http://[2001:db8::1]:443	URL with IPv6 parsed correctly.
udp://[fe80::1%eth0]::::53	fe80::1%eth0	Zone-scoped IPv6 salvaged from a malformed bracketed URI.
192.168.1.110.0.0.1	192.168.1.110	Combined IP segment salvaged.
fe80::1%eth0fe80::2%eth1	fe80::1%eth0fe80, ::2%eth1	Concatenated IPv6 split up.
2001:db8::12001:db8::2	2001:db8::	Longest valid IPv6 prefix found.
256.256.256:256	—	Invalid indicator ignored.
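
The IPv4 rows in the table can be approximated with the standard library. The sketch below is not the engine's implementation. It assumes a loose regex to find candidates, followed by strict validation with `ipaddress`; the function name `salvage_ipv4` is hypothetical.

```python
import ipaddress
import re
from typing import List

# Loose candidate pattern: four dot-separated groups of 1-3 digits that are
# not glued to further digits on either side.
_IPV4_CANDIDATE = re.compile(r"(?<!\d)\d{1,3}(?:\.\d{1,3}){3}(?!\d)")


def salvage_ipv4(text: str) -> List[str]:
    """Return valid IPv4 addresses found inside noisy text, in order."""
    found = []
    for match in _IPV4_CANDIDATE.finditer(text):
        candidate = match.group(0)
        try:
            # Strict validation rejects out-of-range octets such as 256.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        found.append(candidate)
    return found


print(salvage_ipv4("xxx192.168.1.10yyy"))          # ['192.168.1.10']
print(salvage_ipv4("DROP:client=10.0.0.1;;;ERR"))  # ['10.0.0.1']
print(salvage_ipv4("192.168.1.110.0.0.1"))         # ['192.168.1.110']
print(salvage_ipv4("256.256.256:256"))             # []
```

Separating candidate discovery from validation keeps the regex simple and leaves range checks to a well-tested parser.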

Performance Benchmarks (v0.2.0)

All measurements from the latest performance suite:

Sample Type	Time
1 MB mixed‑content sample	0.0053s
Pathological IPv6 blob	0.0055s
100 KB sample	0.0006s
300 KB sample	0.0017s
600 KB sample	0.0031s
1 MB sample	0.0055s

Throughput: ~190 MB/s (1 MB in 0.0053s)
Worst‑case IPv6 blob: ~5.5 ms
Linear scaling: near‑linear from 100 KB to 1 MB

Performance Benchmarks (v0.3.0)

All measurements from the latest performance suite:

Detector	Sample Type	Time
IP	1 MB mixed‑content sample	0.0070s
IP	Pathological IPv6 blob	0.0004s
IP	100 KB sample	0.0008s
IP	300 KB sample	0.0021s
IP	600 KB sample	0.0038s
IP	1 MB sample	0.0068s
Filepath	1 MB mixed‑content sample	0.0040s
Filepath	Pathological deep unix path	0.0237s
Filepath	300 KB sample	0.0011s
Filepath	600 KB sample	0.0022s
Filepath	1000 KB sample	0.0038s
Filepath	1500 KB sample	0.0055s
Crypto	1 MB mixed‑content sample	0.0021s
Crypto	Pathological ETH-like blob	0.0012s
Crypto	300 KB sample	0.0006s
Crypto	600 KB sample	0.0012s
Crypto	1000 KB sample	0.0020s
Crypto	1500 KB sample	0.0031s

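Throughput (1 MB mixed‑content sample): ~140 MB/s for IP, ~250 MB/s for filepath, ~480 MB/s for crypto
Worst‑case IPv6 blob: ~0.4 ms
Worst‑case filepath blob: ~24 ms
Worst‑case crypto blob: ~1.2 ms
Linear scaling: near‑linear across the measured sizes (100 KB to 1.5 MB)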
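
Throughput figures follow directly from the timing tables: sample size divided by elapsed time. The sketch below shows that arithmetic together with a small timing helper. It is an illustration only, not the project's performance suite; the helper names are hypothetical, and 1 MB is taken as 1,000,000 bytes.

```python
import time
from typing import Callable


def throughput_mb_per_s(size_bytes: int, seconds: float) -> float:
    """Convert a (size, elapsed time) pair into megabytes per second."""
    return size_bytes / seconds / 1_000_000


def time_call(func: Callable[[bytes], object], payload: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time of func(payload) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(payload)
        best = min(best, time.perf_counter() - start)
    return best


# The v0.2.0 mixed-content figure: 1 MB in 0.0053 s.
print(round(throughput_mb_per_s(1_000_000, 0.0053), 1))  # 188.7
```

Taking the best of several runs reduces noise from the operating system scheduler, which matters at sub-millisecond timings.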

Features

IOC Extraction

Windows PE files (.exe, .dll)
Raw text
Extracted strings from binaries
Caching for increased performance

Detections

URLs
Domains
IPv4 / IPv6 addresses
File paths
Hashes (MD5 / SHA1 / SHA256 / SHA512 / Generic Hex)
Email addresses
Base64
Crypto wallets (Ethereum, Bitcoin) (new in v0.3.0)
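
The hash families listed above differ in digest length, so a hex string can be bucketed by length alone. The sketch below shows one possible classification. It is an assumption about how such bucketing could work, not the engine's actual logic; the name `classify_hash` and the label strings are hypothetical.

```python
import re
from typing import Optional

_HEX_RE = re.compile(r"[0-9a-fA-F]+")

# Hex digest lengths of the hash families listed in the Detections section.
_LENGTH_TO_TYPE = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def classify_hash(value: str) -> Optional[str]:
    """Bucket a hex string by length; None if it is not pure hex."""
    if not _HEX_RE.fullmatch(value):
        return None
    return _LENGTH_TO_TYPE.get(len(value), "generic_hex")


print(classify_hash("d41d8cd98f00b204e9800998ecf8427e"))  # md5
print(classify_hash("abcdef12"))                          # generic_hex
```

Length is only a heuristic: other algorithms share these digest sizes, so the label is a best guess rather than proof of the algorithm used.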

Static PE Parsing

Imports
Sections
Resources
Metadata
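
Static PE parsing means reading header fields straight from the bytes. The sketch below illustrates the idea by reading the section count from the COFF header with `struct`. It is not the project's parser; the function name `pe_section_count` is hypothetical, and the offsets follow the published PE format.

```python
import struct
from typing import Optional


def pe_section_count(blob: bytes) -> Optional[int]:
    """Return NumberOfSections from a PE file's COFF header, or None."""
    # DOS header: starts with "MZ"; the 4-byte field at 0x3C (e_lfanew)
    # holds the file offset of the PE signature.
    if len(blob) < 0x40 or blob[:2] != b"MZ":
        return None
    (pe_offset,) = struct.unpack_from("<I", blob, 0x3C)
    # Signature (4 bytes) + Machine (2) + NumberOfSections (2) must fit.
    if pe_offset + 8 > len(blob) or blob[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return None
    _machine, section_count = struct.unpack_from("<HH", blob, pe_offset + 4)
    return section_count


# Build a tiny synthetic header (not a loadable executable) to exercise it.
fake_pe = (
    b"MZ" + b"\x00" * 58             # DOS header up to offset 0x3C
    + struct.pack("<I", 0x40)        # e_lfanew -> PE signature at 0x40
    + b"PE\x00\x00"                  # PE signature
    + struct.pack("<HH", 0x8664, 3)  # Machine = x86-64, NumberOfSections = 3
)
print(pe_section_count(fake_pe))  # 3
```

Every read is bounds-checked before unpacking, which is the property that makes this style of parsing safe on truncated or hostile input.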

Developer‑Friendly

Clean JSON output
CLI + Python API
Modular, extensible rule system
Minimal dependency footprint

Security‑First

Zero malware execution
Safe for untrusted input
Deterministic behaviour

Why Static Only?

Static analysis ensures safety, determinism, and CI‑friendly operation. No sandboxing, no execution, and no risk of triggering malware behaviour.
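
To make "static only" concrete: the file is treated as an inert byte string and searched for patterns. The sketch below pulls printable ASCII runs out of a blob, similar in spirit to the Unix `strings` tool. It is an illustration under that assumption, not the engine's string extractor; `ascii_strings` is a hypothetical name.

```python
import re
from typing import List


def ascii_strings(blob: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(blob)]


sample = b"\x00\x01http://bad.example.com\x00ab\x00cmd.exe\xff"
print(ascii_strings(sample))  # ['http://bad.example.com', 'cmd.exe']
```

Nothing in the blob is loaded, mapped, or executed; the output can then be passed to text-based detectors.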

Quickstart

Install

pip install iocx

Extract IOCs from a file

iocx suspicious.exe

Extract from text

echo "Visit http://bad.example.com" | iocx -

Extract from a log file

iocx alerts.log
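
For scripting around the CLI, the command can be wrapped in a small helper. The sketch below assumes that `iocx <path>` prints a single JSON document on stdout, as in the sample CLI output earlier, and that a non-zero exit code signals failure. The `run_iocx` helper and its injectable `run` parameter are hypothetical.

```python
import json
import subprocess
from typing import Callable


def run_iocx(path: str, run: Callable = subprocess.run) -> dict:
    """Invoke `iocx <path>` and parse its stdout as JSON."""
    # check=True raises CalledProcessError on a non-zero exit code
    # (assumed here to indicate failure).
    completed = run(["iocx", path], capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Usage (requires iocx to be installed and on PATH):
#   result = run_iocx("alerts.log")
#   print(result["iocs"]["ips"])
```

Passing the arguments as a list avoids shell interpretation of file names, which matters when paths come from untrusted sources.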

Python API

from iocx import extract

results = extract("suspicious.exe")
print(results)

Example JSON Output

{
  "file": "suspicious.exe",
  "type": "PE",
  "iocs": {
    "urls": ["http://malicious.example.com"],
    "domains": ["malicious.example.com"],
    "ips": ["45.77.12.34"],
    "hashes": ["d41d8cd98f00b204e9800998ecf8427e"],
    "emails": ["attacker@example.com"],
    "filepaths": [
      "c:\\windows\\system32\\cmd.exe",
      "d:\\temp\\payload.bin"
    ],
    "base64": []
  },
  "metadata": {
    "file_type": "PE",
    "imports": [
      "KERNEL32.dll",
      "msvcrt.dll"
    ],
    "sections": [
      ".text",
      ".data",
      ".rdata",
      ".pdata",
      ".xdata",
      ".bss",
      ".idata",
      ".CRT",
      ".tls",
      ".reloc"
    ],
    "resource_strings": [
      "C:\\Windows\\System32\\cmd.exe",
      "\\\\SERVER01\\share\\dropper.exe",
      "/home/alice/.config/evil.sh@%APPDATA%\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\evil.lnk"
    ]
  }
}
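
Downstream tools such as SIEM or CSV exports often want one row per indicator rather than nested lists. The sketch below flattens a result into `(type, value)` pairs. It assumes the result is a plain dict shaped like the JSON above; whether `extract()` returns exactly this structure is an assumption, and `iter_indicators` is a hypothetical helper.

```python
from typing import Iterator, Tuple


def iter_indicators(result: dict) -> Iterator[Tuple[str, str]]:
    """Yield (ioc_type, value) pairs from a result shaped like the JSON above."""
    for ioc_type, values in result.get("iocs", {}).items():
        for value in values:
            yield ioc_type, value


# A trimmed copy of the example output, used as stand-in data.
sample = {
    "file": "suspicious.exe",
    "type": "PE",
    "iocs": {
        "urls": ["http://malicious.example.com"],
        "ips": ["45.77.12.34"],
        "base64": [],
    },
}

for ioc_type, value in iter_indicators(sample):
    print(f"{ioc_type}\t{value}")
```

Empty categories such as `base64` produce no rows, so consumers do not need to special-case them.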

Architecture

malx-ioc-extractor/
│
├── examples/        # Sample files + generators
├── tests/           # Unit and integration tests
└── iocx/
    ├── detectors/   # Regex-based IOC detectors
    ├── parsers/     # PE parsing, string extraction
    └── cli/         # Command-line interface

The engine is intentionally modular so components can be extended or replaced easily.

Extending the Engine

You can add custom:

Regex detectors
File parsers
Normalisation logic

Register a custom detector

The second argument is a detector function (a callable that receives the input and returns extracted values):

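from iocx.detectors import register_detector

# Named extract_wallets to avoid shadowing iocx.extract from the Python API.
def extract_wallets(data):
    # Custom extraction logic goes here; return a list of extracted values.
    return ["wallet123"]

# Register the callable under the detector name "crypto_wallet".
register_detector("crypto_wallet", extract_wallets)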
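
A more realistic detector body is usually a compiled regex plus normalisation. The sketch below extracts CVE identifiers. It assumes the detector receives either `str` or `bytes` and returns a list of strings; the input type is not confirmed by this README, and `detect_cve_ids` and the `"cve"` name are hypothetical.

```python
import re

_CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def detect_cve_ids(data):
    """Return unique CVE identifiers, upper-cased, in first-seen order."""
    if isinstance(data, bytes):
        # Assumption: binary input is decoded leniently before matching.
        data = data.decode("utf-8", errors="replace")
    seen = {}
    for match in _CVE_RE.finditer(data):
        # dict keys keep insertion order and drop duplicates.
        seen.setdefault(match.group(0).upper(), None)
    return list(seen)


print(detect_cve_ids("see CVE-2021-44228 and cve-2021-44228"))  # ['CVE-2021-44228']

# Registration would then follow the pattern shown above:
#   register_detector("cve", detect_cve_ids)
```

Normalising case and de-duplicating inside the detector keeps the output stable across runs.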

Safe Testing (No Malware Required)

All test samples are:

Synthetic
Benign
Publicly safe (EICAR, GTUBE)
Designed to avoid accidental malware handling

Contributing

We welcome:

New IOC detectors
Parser improvements
Bug reports
Documentation updates
Synthetic test samples

See CONTRIBUTING.md for full guidelines.

Security

If you discover a security issue, do not open a GitHub issue. Please follow the instructions in SECURITY.md.

Related Projects (MalX Labs)

malx-core — foundational primitives
malx-utils — shared utilities
malx-sandbox — dynamic analysis environment
malx-forge — adversarial payload tooling
malx-archive — research + PoCs

License

Licensed under the MIT License. See LICENSE for details.

Release history

0.7.2

May 1, 2026

0.7.1

May 1, 2026

0.7.0

Apr 17, 2026

0.6.0

Apr 14, 2026

0.5.1

Apr 10, 2026

0.5.0

Apr 8, 2026

0.4.0

Mar 30, 2026

This version

0.3.0

Mar 23, 2026

0.2.0

Mar 11, 2026

0.1.0

Mar 1, 2026

0.0.1

Feb 13, 2026

Download files

Source Distribution

iocx-0.3.0.tar.gz (21.7 kB)

Uploaded Mar 23, 2026 Source

Built Distribution

iocx-0.3.0-py3-none-any.whl (23.0 kB)

Uploaded Mar 23, 2026 Python 3

File details

Details for the file iocx-0.3.0.tar.gz.

File metadata

Download URL: iocx-0.3.0.tar.gz
Upload date: Mar 23, 2026
Size: 21.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for iocx-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`8d73200f1cd2b451a8d0db35f76c730e8b214540c0e571b1ca82c2cfc16dda74`
MD5	`fc148a3d532030629041d95665281f21`
BLAKE2b-256	`3625f5af46326f26a1ab70e8756ee907d570ccf6e705d7723f50df28f303f91a`

File details

Details for the file iocx-0.3.0-py3-none-any.whl.

File metadata

Download URL: iocx-0.3.0-py3-none-any.whl
Upload date: Mar 23, 2026
Size: 23.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for iocx-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`34f5bdd57f752a576a26b3b4c272453843e2fa9ad758764cd5858997f67c0548`
MD5	`d7e4516d0c0b9045dae928888c187eb5`
BLAKE2b-256	`b7ebb8ab81d8963cdb9df28cc2d34a44efc471b6b96b9f5d930d4aab5aa0638e`
