Convert Splunk SPL queries into PySpark code

Overview

spl_transpiler is a Rust + Python port of Databricks Labs' spl_transpiler. The goal is to provide a high-performance, highly portable, convenient tool for adapting common SPL code into PySpark code when possible, making it easy to migrate from Splunk to other data platforms for log processing.

Installation

pip install spl_transpiler

Usage

from spl_transpiler import convert_spl_to_pyspark

print(convert_spl_to_pyspark(r"""multisearch
[index=regionA | fields +country, orders]
[index=regionB | fields +country, orders]"""))

# spark.table("regionA").select(F.col("country"), F.col("orders")).unionByName(
#     spark.table("regionB").select(F.col("country"), F.col("orders")),
#     allowMissingColumns=True,
# )
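
When migrating many saved searches at once, it can help to convert them in bulk and set aside the ones the transpiler cannot handle yet. The following is a minimal sketch, not part of the package: `convert_all` and its `convert` parameter are hypothetical names, and it assumes only that the converter takes an SPL string and returns PySpark source as a string, as in the example above. It also assumes that a failed conversion raises an exception; the exact exception type is not documented here, so the sketch catches `Exception` broadly.

```python
from typing import Callable, Dict, Iterable, Tuple


def convert_all(
    queries: Iterable[str],
    convert: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Convert each SPL query, separating successes from failures.

    `convert` is expected to behave like convert_spl_to_pyspark:
    SPL string in, PySpark source string out. That it raises on
    unsupported input is an assumption of this sketch.
    """
    converted: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for spl in queries:
        try:
            converted[spl] = convert(spl)
        except Exception as exc:  # exact exception type is an assumption
            failed[spl] = str(exc)
    return converted, failed
```

With the package installed, this would be called as `convert_all(my_queries, convert_spl_to_pyspark)`; the `failed` mapping then doubles as a to-do list of queries that need manual porting.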

Interactive CLI

For demonstration purposes and ease of use, an interactive CLI is also provided.

pip install spl_transpiler[cli]
python -m spl_transpiler

This provides an in-terminal user interface (built with Textual) where you can type an SPL query and see the converted PySpark code in real time, alongside a visual representation of how the transpiler interprets your query.

Why?

Why transpile SPL into Spark? Because a huge amount of domain knowledge is locked up in the Splunk ecosystem, but Splunk is not always the optimal place to store and analyze data. Transpiling existing queries can make it easier to migrate analysts and their analytics iteratively onto other platforms. SPL is also a highly focused language for certain kinds of analytics, and in most cases it is far more concise than other languages (PySpark or SQL) for log-processing tasks. Therefore, it may be preferable to continue writing queries in SPL and use a transpiler layer to make that syntax viable on various platforms.

Why rewrite the Databricks Labs transpiler? A few reasons:

  1. The original transpiler is written in Scala and assumes access to a Spark environment. That requires a JVM to execute and possibly a whole ecosystem of software (maybe even a running Spark cluster) to be available. This transpiler stands alone and compiles natively to any platform.
  2. While Scala is a common language in the Spark ecosystem, Spark isn't the only ecosystem that would benefit from having an SPL transpiler. By providing a transpiler that's both easy to use in Python and directly linkable at a system level, it becomes easy to embed and adapt the transpiler for any other use case too.
  3. Speed. Scala's plenty fast, to be honest, but Rust is mind-bogglingly fast. This transpiler can parse SPL queries and generate equivalent Python code in a fraction of a millisecond. This makes it viable to treat the transpiler as a real-time component, for example embedding it in a UI and recomputing the converted results after every keystroke.
  4. Maintainability. Rust's type system helps keep things unambiguous as data passes through parsers and converters, and built-in unit testing makes it easy to adapt and grow the transpiler without risk of breaking existing features. While Rust is undoubtedly a language with a learning curve, the resulting code is very hard to break without noticing. This makes it much easier to maintain than a similarly complicated system would be in Python.
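
The per-keystroke use mentioned in point 3 can be sketched as a small wrapper around the converter. This is an illustration only: `make_live_converter` and `on_change` are hypothetical names, and the sketch assumes that a half-typed query makes the converter raise an exception, in which case the last successful output is shown instead of an error.

```python
from typing import Callable


def make_live_converter(convert: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a converter so it can be called on every keystroke.

    `convert` is expected to behave like convert_spl_to_pyspark.
    Incomplete queries are assumed to raise; when that happens the
    most recent successful result is returned unchanged.
    """
    last_good = ""

    def on_change(text: str) -> str:
        nonlocal last_good
        try:
            last_good = convert(text)
        except Exception:  # exact exception type is an assumption
            pass  # keep showing the previous successful conversion
        return last_good

    return on_change
```

A UI would call the returned `on_change` function from its text-changed handler and display the result. No caching is added, since conversion is described above as taking a fraction of a millisecond.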

Contributing

This project is in early development. While it parses most common SPL queries and can convert a non-trivial variety of them to PySpark, it is still very limited and not yet ready for serious use. However, it lays a solid foundation for the whole process and is modular enough that features can be added incrementally.

Ways to contribute:

  • Add SPL queries and what the equivalent PySpark code would be. These test cases can drive development and prioritize the most commonly used features.
  • Add support for additional functions and commands. While the SPL parser works for most commands, many do not yet have the ability to render back out to PySpark. Please see /src/pyspark/transpiler/command/eval_fns.rs and /convert_fns.rs to add support for more in-command functions, and /src/pyspark/transpiler/command/mod.rs to add support for more top-level commands.
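
One possible shape for a contributed test case is a pair of an SPL query and the PySpark it should produce. The sketch below is only an illustration and the project's actual test layout may differ: `CASES`, `normalize`, and `failing_cases` are hypothetical names, and the single example pair reuses the multisearch query from the Usage section. The comparison ignores all whitespace so that formatting differences do not cause false failures; this is a simplification that would also ignore whitespace inside string literals.

```python
import re
from typing import Callable, List, Tuple

# Each case: (SPL query, expected PySpark source).
CASES: List[Tuple[str, str]] = [
    (
        r"""multisearch
[index=regionA | fields +country, orders]
[index=regionB | fields +country, orders]""",
        """spark.table("regionA").select(F.col("country"), F.col("orders")).unionByName(
    spark.table("regionB").select(F.col("country"), F.col("orders")),
    allowMissingColumns=True,
)""",
    ),
]


def normalize(code: str) -> str:
    """Strip all whitespace so only the tokens are compared."""
    return re.sub(r"\s+", "", code)


def failing_cases(
    convert: Callable[[str], str],
    cases: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (spl, expected, actual) for every case whose output differs."""
    failures = []
    for spl, expected in cases:
        actual = convert(spl)
        if normalize(actual) != normalize(expected):
            failures.append((spl, expected, actual))
    return failures
```

Passing `convert_spl_to_pyspark` as `convert` would run the cases against the real transpiler; an empty result means every case matched.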

Future

  • Add UI (with textual) for interactive demonstration/use
