SBSV: Square Brackets Separated Values
A flexible, schema-based structured log data format.
Install
python3 -m pip install sbsv
Use
You can read this log-like data:
[meta-data] [id 1] [format string]
[meta-data] [id 2] [format token]
[data] [string] [id 1] [actual some long string...]
[data] [token] [id 2] [actual [some] [multiple] [tokens]]
[stat] [rows 2]
import sbsv
parser = sbsv.parser()
parser.add_schema("[meta-data] [id: int] [format: str]")
parser.add_schema("[data] [string] [id: int] [actual: str]")
parser.add_schema("[data] [token] [id: int] [actual: list[str]]")
parser.add_schema("[stat] [rows: int]")
with open("testfile.sbsv", "r") as f:
    result = parser.load(f)
The result would look like:
{
    "meta-data": [{"id": 1, "format": "string"}, {"id": 2, "format": "token"}],
    "data": {
        "string": [{"id": 1, "actual": "some long string..."}],
        "token": [{"id": 2, "actual": ["some", "multiple", "tokens"]}]
    },
    "stat": [{"rows": 2}]
}
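To make the line format above more concrete, here is a minimal sketch of how one line can be split into its top-level bracketed elements while keeping nested brackets intact. The function name `split_top_level` is hypothetical and is not part of sbsv; the sketch ignores escape sequences and malformed input, and it is not the library's actual implementation.

```python
def split_top_level(line: str) -> list[str]:
    """Return the contents of each top-level [...] element in a line."""
    elements = []
    depth = 0  # current bracket nesting depth
    start = 0  # index where the current top-level element's content begins
    for i, ch in enumerate(line):
        if ch == "[":
            if depth == 0:
                start = i + 1  # content begins after the opening bracket
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                elements.append(line[start:i])
    return elements


line = "[data] [token] [id 2] [actual [some] [multiple] [tokens]]"
print(split_top_level(line))
# ['data', 'token', 'id 2', 'actual [some] [multiple] [tokens]']
```

Applying the same function to the content of the last element would yield the individual tokens of the list.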
Details
Basic schema
A schema consists of a schema name followed by variable names with type annotations.
[schema-name] [var-name: type]
You can use [A-Za-z0-9-_] for names.
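The allowed character set can be checked with a regular expression. This is a small sketch using only the standard library; `is_valid_name` is a hypothetical helper, not an sbsv function, and it assumes a name must be non-empty. Tags such as `node$0` contain `$` and would need separate handling.

```python
import re

# Letters, digits, hyphen and underscore, one or more characters
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("meta-data"))  # True
print(is_valid_name("bad name"))   # False (contains a space)
```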
Sub schema
[my-schema] [sub-schema] [some: int] [other: str] [data: bool]
You can add any number of sub schemas. However, once a schema name has a sub schema, you cannot add another schema with the same name that has no sub schema.
[my-schema] [no: int] [sub: str] [schema: str]
# this will cause error
Ignore
- Not available yet
[2024-03-04 13:22:56] [DEBUG] [necessary] [from] [this part]
Regular log files may contain unnecessary data. You can tell the parser to ignore the [2024-03-04 13:22:56] [DEBUG] part.
parser.add_schema("[$ignore] [$ignore] [necessary] [from] [this: str]")
Duplicating names
Sometimes you may want to use the same name multiple times. You can distinguish the occurrences with additional tags.
[my-schema] [node 1] [node 2] [node 3]
A tag is appended to the name after a $, as in node$some-tag. The data itself should not contain tags; they are used only in the schema.
parser.add_schema("[my-schema] [node$0: int] [node$1: int] [node$2: int]")
result = parser.loads("[my-schema] [node 1] [node 2] [node 3]\n")
result["my-schema"][0]["node$0"] == 1
Name matching
If the data contains additional elements that are not in the schema, they are ignored. The order of the names in the data must match the order in the schema.
parser.add_schema("[my-schema] [node: int] [value: int]")
data = "[my-schema] [node 1] [unknown element] [value 3]\n"
result = parser.loads(data)
result["my-schema"][0] == { "node": 1, "value": 3 }
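The matching rule above (skip unknown elements, keep the schema order) can be sketched in plain Python. `match_in_order` is a hypothetical helper, not sbsv's implementation; it keeps values as strings instead of converting them to the annotated types, and returning `None` on a failed match is an assumption made for this sketch.

```python
def match_in_order(schema_names, elements):
    """Map each schema name to its value, skipping unknown elements.

    `elements` are top-level element strings such as "node 1".
    Returns None if the names do not all appear in schema order.
    """
    result = {}
    pos = 0
    for name in schema_names:
        while pos < len(elements):
            key, _, value = elements[pos].partition(" ")
            pos += 1
            if key == name:
                result[name] = value
                break
        else:
            # Ran out of elements before finding this name
            return None
    return result


print(match_in_order(["node", "value"], ["node 1", "unknown element", "value 3"]))
# {'node': '1', 'value': '3'}
print(match_in_order(["node", "value"], ["value 3", "node 1"]))
# None
```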
Ordering
You may need the parsed lines in their original global order, across all schemas.
parser.add_schema("[data] [string] [id: int] [actual: str]")
parser.add_schema("[data] [token] [id: int] [actual: list[str]]")
result = parser.load(f)
# This returns all elements in order
elems_all = parser.get_result_in_order()
# This returns elements matching names in order
# If it contains sub-schema, use $
# For example, [data] [string] [id: int] -> "data$string"
elems = parser.get_result_in_order(["[data] [string]", "[data] [token]"])
# You can also use ["data$string", "data$token"]
Alternatively, you can build the schema id (data$string and data$token) like this:
sbsv.get_schema_id("node") == "node"
sbsv.get_schema_id("data", "string") == "data$string"
# this is equal to
sbsv.get_schema_id("data", "string") == '$'.join(["data", "string"])
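The two accepted spellings, "[data] [string]" and "data$string", correspond to each other in a simple way. The following sketch converts the bracketed form to the `$`-joined form; `schema_id_from_prefix` is a hypothetical helper written for illustration and is not part of sbsv.

```python
import re


def schema_id_from_prefix(prefix: str) -> str:
    """Turn a bracketed prefix like "[data] [string]" into "data$string"."""
    names = re.findall(r"\[([A-Za-z0-9_-]+)\]", prefix)
    return "$".join(names)


print(schema_id_from_prefix("[data] [string]"))  # data$string
print(schema_id_from_prefix("[node]"))           # node
```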
Group
[data] [begin]
[block] [data 1]
[block] [data 2]
[data] [end]
[data] [begin]
[block] [data 3]
[block] [data 4]
[data] [end]
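You can group the lines between each [data] [begin] and [data] [end] pair, so that blocks 1 and 2 form one group and blocks 3 and 4 form another.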
# First, add all to schema
parser.add_schema("[data] [begin]")
parser.add_schema("[data] [end]")
parser.add_schema("[block] [data: int]")
# Second, add group name, group start, group end
parser.add_group("data", "[data] [begin]", "[data] [end]")
parser.load(sbsv_file)
# Iterate groups
for block in parser.iter_group("data"):
    print("group start")
    for block_data in block:
        if block_data.schema_name == "block":
            print(block_data["data"])
# Or, use index
block_indices = parser.get_group_index("data")
for index in block_indices:
    print("use index")
    for block in parser.get_result_by_index("[block]", index):
        print(block["data"])
Output:
group start
1
2
group start
3
4
use index
1
2
use index
3
4
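The grouping idea itself can be illustrated without the library. The sketch below collects the raw lines that fall between a start marker and an end marker; `group_lines` is a hypothetical function, and leaving the marker lines out of each group is a simplification for this sketch, not a statement about what sbsv returns.

```python
def group_lines(lines, start, end):
    """Collect the lines strictly between each start and end marker."""
    groups = []
    current = None  # None means we are outside any group
    for line in lines:
        if line == start:
            current = []
        elif line == end and current is not None:
            groups.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return groups


lines = [
    "[data] [begin]",
    "[block] [data 1]",
    "[block] [data 2]",
    "[data] [end]",
    "[data] [begin]",
    "[block] [data 3]",
    "[block] [data 4]",
    "[data] [end]",
]
for group in group_lines(lines, "[data] [begin]", "[data] [end]"):
    print(group)
# ['[block] [data 1]', '[block] [data 2]']
# ['[block] [data 3]', '[block] [data 4]']
```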
You can also define a group without a closing schema. In that case each group start line also ends the previous group.
[group-wo-closing] [new-group a]
[some] [data 9]
[some] [data 8]
[some] [data 7]
[group-wo-closing] [new-group b]
[some] [data 6]
[some] [data 5]
[group-wo-closing] [new-group c]
[some] [data 4]
# First, add all to schema
parser.add_schema("[group-wo-closing] [new-group: str]")
parser.add_schema("[some] [data: int]")
# Second, add group name, group start == group end
parser.add_group("new-group", "[group-wo-closing]", "[group-wo-closing]")
parser.load(sbsv_file)
# Iterate groups
for block in parser.iter_group("new-group"):
    print("group start")
    for block_data in block:
        if block_data.schema_name == "some":
            print(block_data["data"])
# Or, use index
block_indices = parser.get_group_index("new-group")
for index in block_indices:
    print("use index")
    for block in parser.get_result_by_index("[some]", index):
        print(block["data"])
Output:
group start
9
8
7
group start
6
5
group start
4
use index
9
8
7
use index
6
5
use index
4
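When the start and end markers are the same, grouping reduces to starting a new group at every marker line. The following is a library-independent sketch of that behaviour; `group_by_opener` is a hypothetical name, and keeping the opener line as the first item of each group is an assumption made here for illustration.

```python
def group_by_opener(lines, opener_prefix):
    """Start a new group at every line that begins with opener_prefix."""
    groups = []
    for line in lines:
        if line.startswith(opener_prefix):
            groups.append([line])  # a new group begins, closing the previous one
        elif groups:
            groups[-1].append(line)  # lines before the first opener are dropped
    return groups


lines = [
    "[group-wo-closing] [new-group a]",
    "[some] [data 9]",
    "[some] [data 8]",
    "[some] [data 7]",
    "[group-wo-closing] [new-group b]",
    "[some] [data 6]",
    "[some] [data 5]",
    "[group-wo-closing] [new-group c]",
    "[some] [data 4]",
]
for group in group_by_opener(lines, "[group-wo-closing]"):
    print(len(group) - 1, "data lines in", group[0])
# 3 data lines in [group-wo-closing] [new-group a]
# 2 data lines in [group-wo-closing] [new-group b]
# 1 data lines in [group-wo-closing] [new-group c]
```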
Primitive types
Primitive types are str, int, float, bool, null.
Complex types
nullable
[car] [id 1] [speed 100] [power 2] [price]
[car] [id] [speed 120] [power 3] [price 33000]
parser.add_schema("[car] [id?: int] [data: obj[speed: int, power: int, price?: int]]")
- Nullable types are not available yet
list
[data] [token] [id 2] [actual [some] [multiple] [tokens]]
parser.add_schema("[data] [token] [id: int] [actual: list[str]]")
obj
[car] [id 1] [data [speed 100] [power 2] [price 20000]]
parser.add_schema("[car] [id: int] [data: obj[speed: int, power: int, price: int]]")
map
[map-example] [mymap [id: 1, name: alice, email: wd@email.com]]
parser.add_schema("[map-example] [mymap: map]")
Escape sequences for string
[car] [id 1] [name "\[name with square bracket\]"]
f"[car] [id {id}] [name {sbsv.escape_str('[name with square bracket]')}]"
Use sbsv.escape_str() to escape a string before writing it, and sbsv.unescape_str() to recover the original string from an escaped one.
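To illustrate why escaping is needed and how a round trip can work, here is a minimal backslash-escaping sketch. The functions `escape_brackets` and `unescape_brackets` are hypothetical; the exact output of sbsv.escape_str() (for example, whether it adds surrounding quotes) is not specified here, so this should not be read as the library's escaping scheme.

```python
def escape_brackets(text: str) -> str:
    """Backslash-escape backslashes and square brackets."""
    # Escape the backslash first so the ones added later are not doubled
    return text.replace("\\", "\\\\").replace("[", "\\[").replace("]", "\\]")


def unescape_brackets(text: str) -> str:
    """Reverse escape_brackets by dropping each escaping backslash."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])  # keep the escaped character as-is
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


original = "[name with square bracket]"
escaped = escape_brackets(original)
print(escaped)  # \[name with square bracket\]
print(unescape_brackets(escaped) == original)  # True
```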
Contribute
python3 -m pip install --upgrade pip
python3 -m pip install black
Run the black formatter before committing.
python3 -m black .
Before implementing new features or fixing bugs, add new tests in tests/.
python3 -m unittest
Build and update
python3 -m build
python3 -m twine upload dist/*