
conf-spl2-converter

A CLI tool for converting Splunk .conf configurations (props.conf / transforms.conf) to SPL2 pipeline templates, and generating expected test outputs from Splunk field extractions or CIM field annotations.

Alpha — this project is under active development. APIs and output format may change.

Installation

Requires Python 3.9+.

pip install conf-spl2-converter

To use the test generation pipeline (generate-expected), install with the testing extra:

pip install conf-spl2-converter[testing]
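
Note: some shells (notably zsh) treat the square brackets as a glob pattern, so quote the requirement there:

pip install 'conf-spl2-converter[testing]'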

Quick start

# 1. Generate SPL2 pipeline files from a TA
conf-spl2-converter generate /path/to/ta

# 2a. Generate expected test outputs from CIM fields (no Docker needed)
conf-spl2-converter generate-expected-cim /path/to/ta

# 2b. Or generate expected test outputs via Splunk (requires Docker)
conf-spl2-converter generate-expected /path/to/ta

Commands

generate — Create SPL2 pipeline templates

Reads the TA's props.conf and transforms.conf and generates SPL2 pipeline files.

# Auto-discover all sourcetypes from props.conf (no config file needed)
conf-spl2-converter generate /path/to/ta

# Use a config file to control which sourcetypes are processed and how
conf-spl2-converter generate /path/to/ta -c field_extraction_config.json

# Write output to a custom directory
conf-spl2-converter generate /path/to/ta -o /tmp/my-output

# Export parsed template data as JSON (useful for debugging / integration)
conf-spl2-converter generate /path/to/ta -o /tmp/my-output -f json

# Combine all options with verbose logging
conf-spl2-converter generate /path/to/ta -c config.json -o ./out -f spl2 -v

Config file

When --config / -c is not provided, the tool looks for field_extraction_config.json inside the TA directory. If found, it is used automatically. If not found, sourcetypes are auto-discovered from props.conf.

When --config / -c is provided, the specified config file is used instead (overrides the default lookup in the TA directory).

The config file controls which sourcetypes are processed along with extra settings like lookups, fields to trim, kv_mode overrides, etc.

Single config for all commands — The same field_extraction_config.json is used by generate, knowledge-build, and the generate-expected family. Use one config file and the same TA path; all commands share the same resolution order (explicit path, then TA directory, then auto-discovery) and the same keys (sourcetype names, source, slug, etc.). Example:

CONFIG=path/to/field_extraction_config.json
TA_PATH=/path/to/Splunk_TA_windows

conf-spl2-converter generate    $TA_PATH -c $CONFIG -o out/gen
conf-spl2-converter knowledge-build $TA_PATH -c $CONFIG -o out/kb -k ta

Config file format (field_extraction_config.json)

The config file is a JSON object whose top-level keys are sourcetype names (as in props.conf). Each value is an object that can contain the following keys. All keys are optional unless noted.

  • slug (string; both): Filesystem-safe identifier for output paths (e.g. pan_firewall). If omitted, derived from the sourcetype name.
  • addon_name (string; converter): Add-on identifier (e.g. splunk-add-on-for-palo-alto-networks).
  • version (string; converter): Add-on version (metadata in the generated pipeline).
  • label (string; converter): Add-on label (e.g. Splunk_TA_paloalto_networks).
  • human_readable_name (string; converter): Human-readable add-on name (metadata).
  • splunk_base_url (string; converter): URL to Splunkbase or docs (metadata).
  • template_version (string; converter): Template format version (metadata).
  • sample_files (array of strings; KB): Sample file names for the sourcetype (Knowledge Builder).
  • lb_rule (string; KB): Line-breaking rule (e.g. "\n") for the sourcetype (Knowledge Builder).
  • source (array of strings; both): List of source names (e.g. ["WinEventLog:Security", "WinEventLog:Application"]). When set, the converter/KB generates one pipeline branch per source. Values are the part after source:: in props.conf stanza names.
  • sub_sourcetypes (array of strings; both): Sub-sourcetype stanzas to process under this sourcetype (e.g. ["pan:threat", "pan:traffic"]). Generates one pipeline branch per sub-sourcetype.
  • system_default_extractions (array of strings; converter): Default stanza names to use for extractions.
  • lookups_with_empty_values (array of strings; converter): Lookup table names that accept empty values.
  • fields_to_trim (array of strings; converter): Field names to trim whitespace from in the pipeline.
  • fields_to_trim_newlines (array of strings; converter): Field names to trim newlines from.
  • fields_to_trim_quotes (array of strings; converter): Field names to trim surrounding quotes from.
  • remove_duplicate_fields_case_insensitive (array of strings; converter): Field names to deduplicate case-insensitively.
  • convert_string_to_array (array of strings; converter): Field names to convert from string to array.
  • kv_mode (string; both): Override KV mode: auto, none, json, xml, or other values supported by the TA. Defaults to auto when omitted.
  • template_name (string; converter): Override the @template annotation name. Default: derived from the add-on name and sourcetype.
  • template_description (string; converter): Override the @template annotation description.
  • template_runtime (array of strings; converter): Override the @template runtime list. Default: ["ingestProcessor", "edgeProcessor"].
  • template_sourcetype (object; converter): Override @template sourcetype matching (field, operator, values). Supports the EQUAL and MATCH operators.
  • template_events (array of objects; converter): Sample events for the @template annotation (each with host, sourcetype, source, _raw).

Minimal example (one sourcetype, no sources or sub-sourcetypes):

{
  "mysourcetype": {
    "slug": "mysourcetype",
    "addon_name": "my-addon"
  }
}

Example with sources and options (converter + Knowledge Builder):

{
  "pan:firewall": {
    "addon_name": "splunk-add-on-for-palo-alto-networks",
    "slug": "pan_firewall",
    "version": "3.0.0",
    "label": "Splunk_TA_paloalto_networks",
    "human_readable_name": "Splunk Add-on for Palo Alto Networks",
    "sample_files": ["pan_firewall.samples"],
    "lb_rule": "\n",
    "sub_sourcetypes": ["pan:threat", "pan:traffic", "pan:system"],
    "fields_to_trim": ["threat_name", "signature"],
    "remove_duplicate_fields_case_insensitive": ["action", "rule"],
    "convert_string_to_array": ["flags"]
  }
}
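
Neither example above uses the source key. A minimal hypothetical sketch (the sourcetype and slug names here are illustrative) for a props.conf that also contains source::WinEventLog:Security and source::WinEventLog:Application stanzas:

{
  "XmlWinEventLog": {
    "slug": "xmlwineventlog",
    "source": ["WinEventLog:Security", "WinEventLog:Application"]
  }
}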

When no config file is provided (and none is found in the TA), the tool auto-discovers all sourcetypes from props.conf and uses default behaviour: no source/sub_sourcetypes, no field trimming or lookups, and slug is derived from the sourcetype name.

generate options

  • --config / -c: Path to a field_extraction_config.json. When omitted, looks for it in the TA directory; falls back to auto-discovery from props.conf.
  • --output / -o: Output directory for generated files. Defaults to <ta_path>/default/data/spl2/.
  • --format / -f: Output format: spl2 (default) or json.
  • --no-annotation: Disable @template annotation generation in SPL2 output.
  • --verbose / -v: Enable debug logging.

Output formats

  • spl2 (default) — renders .spl2 pipeline files ready for use in Splunk.
  • json — writes a structured JSON file per sourcetype containing the parsed template data (extractions, evals, lookups, etc.).
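The JSON schema is not spelled out in this README; as a rough, hypothetical sketch (key names are illustrative, grounded only in the categories listed above), a per-sourcetype file might look like:

{
  "sourcetype": "mysourcetype",
  "extractions": [],
  "evals": [],
  "lookups": []
}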

@template annotation

By default, every generated SPL2 pipeline includes a @template annotation just before the $pipeline statement. This annotation provides metadata for Data Orchestrator, including the template name, sourcetype matching rules, and sample events.

Example output:

@template("Palo Alto Networks: Firewall events field extractions", sourcetype: {field: "sourcetype", operator: "MATCH", values: ["/(pan_log|pan:[^:]+)(?!(?::|_)cloud)/i"]}, events: [{...}]);

The annotation fields are derived automatically from the TA metadata (add-on name, sourcetype) but can be overridden in field_extraction_config.json using the template config keys below.

To disable annotation generation entirely, use --no-annotation:

conf-spl2-converter generate /path/to/ta --no-annotation

Template config keys

These optional keys in field_extraction_config.json control the @template annotation content:

  • template_name (string): Override the annotation name. Default: "<addon_name>: <sourcetype> field extractions".
  • template_description (string): Override the description. Default: auto-generated from the pipeline context.
  • template_runtime (array of strings): Override the runtime list. Default: ["ingestProcessor", "edgeProcessor"].
  • template_sourcetype (object): Override sourcetype matching. Object with field, operator (EQUAL or MATCH), and values. Default: EQUAL with the sourcetype name(s).
  • template_events (array of objects): Sample events to embed. Each object should contain host, sourcetype, source, and _raw.

Example with MATCH operator and sample events:

{
  "pan:firewall": {
    "slug": "pan_firewall",
    "template_name": "Palo Alto Networks: Firewall events field extractions",
    "template_sourcetype": {
      "field": "sourcetype",
      "operator": "MATCH",
      "values": ["/(pan_log|pan:[^:]+)(?!(?::|_)cloud)/i"]
    },
    "template_events": [
      {
        "host": "so1",
        "sourcetype": "pan:traffic",
        "source": "pan:traffic",
        "_raw": "May 14 12:03:13 gateway ..."
      }
    ]
  }
}
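
When template_sourcetype is omitted, the default is an EQUAL match on the sourcetype name(s) (see the table above); written out explicitly, that default would look like:

{
  "pan:firewall": {
    "template_sourcetype": {
      "field": "sourcetype",
      "operator": "EQUAL",
      "values": ["pan:firewall"]
    }
  }
}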

generate-expected — Generate expected test outputs

Runs the full test generation pipeline in a single command. Requires Docker.

The pipeline:

  1. Starts a Splunk Docker container with the TA installed.
  2. Collects test samples from the TA's tests/knowledge/samples/ directory (XML/log files).
  3. Sends each sample event to Splunk via HEC.
  4. Retrieves Splunk's extracted fields via the Splunk SDK.
  5. Generates module.test.json files containing the expected field extractions.
  6. Stops the Splunk container (unless --keep-running is used).

# Basic usage — starts Docker, runs pipeline, stops Docker
conf-spl2-converter generate-expected /path/to/ta

# With a config file and custom output directory
conf-spl2-converter generate-expected /path/to/ta -c config.json -o ./out

# Skip Docker management (assumes Splunk is already running on localhost)
conf-spl2-converter generate-expected /path/to/ta --skip-docker

# Leave the Splunk container running after completion (useful for iterating)
conf-spl2-converter generate-expected /path/to/ta --keep-running

# Verbose logging
conf-spl2-converter generate-expected /path/to/ta -v

Options

  • --config / -c: Path to a field_extraction_config.json. When omitted, looks for it in the TA directory; falls back to auto-discovery from props.conf.
  • --output / -o: Output directory for generated files. Defaults to <ta_path>/default/data/spl2/.
  • --skip-docker: Skip Docker container management; assume Splunk is already running.
  • --keep-running: Leave the Splunk container running after completion.
  • --verbose / -v: Enable debug logging.

Generated files

For each sourcetype, the pipeline produces:

<output_dir>/<sourcetype_slug>/
    <sourcetype_slug>.samples      # JSONL file with collected sample events
    module.test.json               # Expected field extractions for each sample
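
The .samples file is JSONL: one JSON object per line. The exact per-event keys are not pinned down in this README; a hypothetical line, reusing the event fields that appear elsewhere in this document (host, sourcetype, source, _raw), might look like:

{"host": "so1", "sourcetype": "pan:traffic", "source": "pan:traffic", "_raw": "May 14 12:03:13 gateway ..."}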

Environment variables

Splunk connection settings can be overridden via environment variables:

  • SPL2_TF_SPLUNK_INSTANCE_IP (default: 127.0.0.1): Splunk host address.
  • SPL2_TF_SPLUNK_INSTANCE_PORT (default: 8088): HEC port.
  • SPL2_TF_SPLUNK_INSTANCE_API_PORT (default: 8089): Splunk management API port.
  • SPL2_TF_SPLUNK_INSTANCE_USERNAME (default: admin): Splunk admin username.
  • SPL2_TF_SPLUNK_INSTANCE_PASSWORD (default: newPassword): Splunk admin password.
  • SPL2_TF_SPLUNK_INSTANCE_INDEX (default: cov_test): Index used for test events.
  • SPL2_TF_SPLUNK_INSTANCE_HEC_TOKEN (default: cc7f4d5e-...): HEC authentication token.
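
For example, to point the pipeline at a Splunk instance that is already running elsewhere (the address and credentials below are placeholders, not defaults):

export SPL2_TF_SPLUNK_INSTANCE_IP=10.0.0.5
export SPL2_TF_SPLUNK_INSTANCE_PORT=8088
export SPL2_TF_SPLUNK_INSTANCE_USERNAME=admin
export SPL2_TF_SPLUNK_INSTANCE_PASSWORD=changeme
conf-spl2-converter generate-expected /path/to/ta --skip-docker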

generate-expected-all — Splunk + CIM in a single command

Runs both pipelines sequentially: first generate-expected (Splunk-based) to populate expected_destination_result, then generate-expected-cim to add expected_cim_fields. The result is a module.test.json with both full Splunk extraction results and CIM field expectations.

conf-spl2-converter generate-expected-all /path/to/ta

# Skip Docker management if Splunk is already running
conf-spl2-converter generate-expected-all /path/to/ta --skip-docker -v

Accepts the same options as generate-expected (--config, --output, --skip-docker, --keep-running, --verbose).

generate-expected-cim — Generate expected test outputs from CIM fields

Offline alternative to generate-expected. Instead of running events through a Splunk instance, this command reads CIM field annotations already present in the TA's XML sample files and writes them as expected_cim_fields in module.test.json. No Docker or Splunk required.

Each XML event can contain a <cim> element with <cim_fields>, <models>, and <missing_recommended_fields>. This command extracts the CIM field name/value pairs and:

  • If a module.test.json already exists (e.g. from a prior generate-expected run), it merges the CIM data into matching test entries (matched by _raw) as an expected_cim_fields section, preserving the existing expected_destination_result.
  • If no prior test exists for a sample, it creates a new entry with an empty expected_destination_result and the expected_cim_fields section.
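
As a hypothetical sketch of a merged entry (only the keys expected_destination_result and expected_cim_fields and the matching on _raw are documented here; the surrounding structure and field values are illustrative):

{
  "_raw": "May 14 12:03:13 gateway ...",
  "expected_destination_result": {
    "action": "allowed"
  },
  "expected_cim_fields": {
    "src": "10.1.1.1",
    "dest": "10.2.2.2"
  }
}
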
# Basic usage
conf-spl2-converter generate-expected-cim /path/to/ta

# With a config file and custom output directory
conf-spl2-converter generate-expected-cim /path/to/ta -c config.json -o ./out

# Verbose logging
conf-spl2-converter generate-expected-cim /path/to/ta -v

Options

  • --config / -c: Path to a field_extraction_config.json. When omitted, looks for it in the TA directory; falls back to auto-discovery from props.conf.
  • --output / -o: Output directory for generated files. Defaults to <ta_path>/default/data/spl2/.
  • --verbose / -v: Enable debug logging.

Note: Events without CIM field annotations (<cim/> or missing <cim>) are skipped.

Full workflow example

Generate SPL2 pipelines, expected test data, and run tests for a TA:

# Step 1: Generate SPL2 pipeline templates
conf-spl2-converter generate /path/to/ta

# Step 2: Generate expected test outputs — pick one:
#   Option A: From CIM fields in XML samples (fast, no Docker)
conf-spl2-converter generate-expected-cim /path/to/ta
#   Option B: Via Splunk field extraction (full fidelity, requires Docker)
conf-spl2-converter generate-expected /path/to/ta
#   Option C: Both Splunk + CIM in one command (requires Docker)
conf-spl2-converter generate-expected-all /path/to/ta

# Step 3: Run tests with spl2-testing-framework (see below)
cd <ta>/default/data/spl2
spl2_tests_run cli -v --ignore_additional_fields_in_actual --ignore_empty_strings

All commands write to <ta_path>/default/data/spl2/ by default, producing:

<ta>/default/data/spl2/<sourcetype_slug>/
    pipeline_<sourcetype_slug>.spl2    # SPL2 pipeline (from generate)
    <sourcetype_slug>.samples          # Sample events (from generate-expected)
    module.test.json                   # Expected outputs (from generate-expected)

Running tests with spl2-testing-framework

Use spl2-testing-framework to verify that the generated SPL2 pipelines produce the expected field extractions.

pip install spl2-testing-framework

cd <ta>/default/data/spl2
spl2_tests_run cli -v --ignore_additional_fields_in_actual --ignore_empty_strings

Knowledge Builder integration

The Knowledge Builder builds knowledge bases from a TA (and optionally security content) and generates SPL2 noise-reduction pipelines. It uses the same input config behaviour as the generate command.

Command

# TA path required; optional config and output
conf-spl2-converter knowledge-build <ta_path> [-c CONFIG] [-o OUTPUT_DIR] [-k KNOWLEDGE_SOURCE] [-v]

# Examples
conf-spl2-converter knowledge-build /path/to/Splunk_TA_cisco-asa
conf-spl2-converter knowledge-build /path/to/ta -o ./out/kb-cisco -k ta -v

Options

  • ta_path (required): Path to the TA package directory (must contain default/props.conf).
  • --config / -c: Path to field_extraction_config.json. When omitted, looks for it in the TA directory; if not found, sourcetypes are auto-discovered from props.conf.
  • --output / -o: Output directory for SPL2 templates and knowledge bases. If omitted, uses paths from the package config.
  • --knowledge-source / -k: Knowledge source: ta (TA only) or security_content. Use ta when the security_content repo is not available.
  • --verbose / -v: Enable verbose/debug logging.

Input config (same as generate)

  • If a config is used (passed with -c, or a field_extraction_config.json found in the TA directory): it defines which sourcetypes (and optional sources) are processed.
  • If no config is used: sourcetypes are auto-discovered from the TA's default/props.conf (all stanzas except source:: stanzas).

Output when no config (auto-discovery)

When no config file is used, the Knowledge Builder generates:

  1. One combined template for all discovered sourcetypes:

    • Path: <output_dir>/all_sourcetypes/all_sourcetypes_noisereduce.spl2
  2. One template per sourcetype, in a directory per sourcetype (slug):

    • Path: <output_dir>/<slug>/<slug>_noisereduce.spl2
    • Example: out/kb-cisco/cisco_asa/cisco_asa_noisereduce.spl2, out/kb-cisco/syslog/syslog_noisereduce.spl2

Naming convention: All generated SPL2 templates use the pattern {sourcetype}_noisereduce.spl2, where the sourcetype is represented by its slug (filesystem-safe, e.g. cisco:asa becomes cisco_asa). The combined template uses the slug all_sourcetypes.

Knowledge base JSON files and field-regex mappings are written under <output_dir>/knowledge_bases/.

Generated SPL2 templates include a comment header (metadata table, disclaimer, overview, purpose). Metadata is taken from the config when available (human_readable_name, version, splunk_base_url, template_version); missing values are replaced with placeholders (e.g. [SPLUNKBASE_URL]) for you to fill in.
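
As a purely illustrative sketch of such a header (not verbatim tool output; assuming SPL2's C-style block comments):

/*
 * Add-on:           Splunk Add-on for Cisco ASA
 * Version:          3.0.0
 * Splunkbase:       [SPLUNKBASE_URL]
 * Template version: 1.0.0
 *
 * Disclaimer, overview, and purpose text follow here.
 */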

Output when config is provided

When a config file is used, a single combined SPL2 file is written to the legacy path (e.g. <output_dir>/<sourcetype1>_<sourcetype2>_..._noise_reduction.spl2), and the per-sourcetype directory layout above is not used.

Development

Prerequisites

  • Python 3.9+
  • uv (recommended): install with pip install uv, or see the uv installation documentation.

Running from the repository

To run the CLI from the repo (without installing the package globally):

  1. One-time setup — from the repo root, create the environment and install the project:

    uv sync --group dev
    

    This creates a .venv in the repo and installs the project in editable mode with all dependencies.

  2. Run the CLI — use uv run so the command uses the project’s environment:

    uv run conf-spl2-converter knowledge-build /path/to/Splunk_TA_cisco-asa -o ./out/kb-cisco -k ta -v
    uv run conf-spl2-converter generate /path/to/ta -o ./out
    

    You can run these from anywhere inside the repository; uv run resolves the project by searching upward from the current working directory (pass --project /path/to/repo to run from outside it).

    Alternative: Activate the virtual environment and run the script directly:

    source .venv/bin/activate   # macOS/Linux
    # or:  .venv\Scripts\activate  on Windows
    conf-spl2-converter knowledge-build /path/to/ta -o ./out/kb -k ta
    

    After activation, conf-spl2-converter is on PATH because the project is installed in .venv.

Tests, lint, format

# Run tests
uv run pytest

# Lint and format
uv run ruff check .
uv run ruff format .

# Install pre-commit hooks
uv run pre-commit install

License

Copyright (C) 2026 Splunk Inc. All Rights Reserved. See LICENSE for details.
