sdata_core
Structured data format with modern metadata system, Dublin Core and JSON-LD support.
Features
- Type-safe Metadata: Dataclass-based Attribute and Metadata classes
- Dublin Core Support: Built-in vocabulary mapping for scientific data
- JSON-LD Export: Semantic web compatible output
- DataFrame Integration: Seamless pandas DataFrame conversion
- SUUID: Semantic UUIDs for reproducible identification
- JSON Schema: Auto-generated validation schemas
Installation
pip install sdata_core
Or with uv:
uv add sdata_core
Quick Start
Basic Metadata
from sdata_core import Attribute, Metadata, DType
# Create attributes
attr = Attribute(
    name="temperature",
    value=293.15,
    dtype=DType.FLOAT,
    unit="K",
    description="Sample temperature",
)
# Create metadata container
meta = Metadata(name="Experiment 001")
meta.set_attr("force", 5000.0, unit="N", dtype=DType.FLOAT)
meta.set_attr("material", "DP800 Steel")
meta.set_attr("valid", True, dtype=DType.BOOL)
# Access attributes
print(meta["force"].value) # 5000.0
print(meta.keys()) # ['force', 'material', 'valid']
Serialization
# JSON export (NaN-safe)
json_str = meta.to_json()
# DataFrame export
df = meta.to_dataframe()
print(df)
#               name        value unit  dtype
# key
# force        force       5000.0    N  float
# material  material  DP800 Steel    -    str
# valid        valid         True    -   bool
# Round-trip
meta2 = Metadata.from_json(json_str)
meta3 = Metadata.from_dataframe(df)
Dublin Core Integration
from sdata_core import Metadata, DublinCore, add_dc_attribute
meta = Metadata(name="Research Dataset")
# Add Dublin Core metadata
add_dc_attribute(meta, "title", "Tensile Test Results")
add_dc_attribute(meta, "creator", "Dr. Jane Smith")
add_dc_attribute(meta, "identifier", "doi:10.1234/example")
# Get Dublin Core representation
dc_dict = DublinCore.to_dc_dict(meta)
print(dc_dict)
# {'dc:title': 'Tensile Test Results', 'dc:creator': 'Dr. Jane Smith', ...}
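To make the `dc:` prefixed keys above more concrete, here is a minimal standalone sketch of such a mapping. It assumes the 15 elements of the Dublin Core Metadata Element Set 1.1 and a plain dictionary as input. `DC_ELEMENTS` and `to_dc_keys` are hypothetical names for illustration and do not describe how sdata_core implements the mapping.

```python
# The 15 elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def to_dc_keys(attributes):
    """Return only the Dublin Core terms, with keys prefixed by 'dc:'."""
    return {
        f"dc:{name}": value
        for name, value in attributes.items()
        if name in DC_ELEMENTS
    }


attributes = {
    "title": "Tensile Test Results",
    "creator": "Dr. Jane Smith",
    "identifier": "doi:10.1234/example",
    "force": 5000.0,  # not a Dublin Core element, so it is skipped
}
dc_view = to_dc_keys(attributes)
print(dc_view)
```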
JSON-LD Export
import json
jsonld = meta.to_jsonld()
print(json.dumps(jsonld, indent=2))
# {
#   "@context": {
#     "@vocab": "https://schema.org/",
#     "dc": "http://purl.org/dc/elements/1.1/",
#     ...
#   },
#   "@type": "sdata_core:Metadata",
#   ...
# }
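For readers new to JSON-LD, the sketch below builds a comparable document by hand with the standard library. It reuses only the two context entries shown above. The `@type` value `Dataset` and the helper `build_jsonld` are assumptions chosen for illustration, not what `to_jsonld()` actually emits.

```python
import json


def build_jsonld(name, dc_terms):
    """Assemble a small JSON-LD document with a schema.org default vocabulary."""
    document = {
        "@context": {
            "@vocab": "https://schema.org/",
            "dc": "http://purl.org/dc/elements/1.1/",
        },
        "@type": "Dataset",  # assumption: schema.org type picked for this example
        "name": name,
    }
    # Dublin Core terms are stored as compact IRIs using the "dc" prefix.
    for term, value in dc_terms.items():
        document[f"dc:{term}"] = value
    return document


doc = build_jsonld(
    "Research Dataset",
    {"title": "Tensile Test Results", "creator": "Dr. Jane Smith"},
)
print(json.dumps(doc, indent=2))
```

Because `@vocab` is set, unprefixed keys such as `name` resolve against schema.org, while `dc:title` expands to `http://purl.org/dc/elements/1.1/title`.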
Type-Annotated Fields
from typing import Annotated
from sdata_core import FieldMeta, create_attribute_from_annotated
# Define typed field with metadata
Temperature = Annotated[float, FieldMeta(
    unit="K",
    description="Temperature measurement",
    ontology="http://purl.obolibrary.org/obo/PATO_0000146",
)]
# Create attribute from annotated type
attr = create_attribute_from_annotated("sample_temp", 293.15, Temperature)
print(attr.unit) # "K"
print(attr.ontology) # "http://purl.obolibrary.org/obo/PATO_0000146"
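The pattern above relies on `typing.Annotated` carrying extra metadata alongside the base type. As a minimal sketch of how such metadata can be read back with the standard library alone, the example below uses a hypothetical `UnitInfo` marker as a stand-in for `FieldMeta` and a hypothetical `extract_marker` helper. Neither is part of sdata_core.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin


@dataclass(frozen=True)
class UnitInfo:
    """Hypothetical stand-in for a metadata marker such as FieldMeta."""
    unit: str
    description: str


Temperature = Annotated[
    float, UnitInfo(unit="K", description="Temperature measurement")
]


def extract_marker(annotated_type, marker_class):
    """Return (base type, first marker of marker_class), or (type, None)."""
    if get_origin(annotated_type) is not Annotated:
        return annotated_type, None
    base, *extras = get_args(annotated_type)
    for extra in extras:
        if isinstance(extra, marker_class):
            return base, extra
    return base, None


base_type, info = extract_marker(Temperature, UnitInfo)
print(base_type.__name__, info.unit)
```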
Semantic UUIDs (SUUID)
from sdata_core import SUUID
# Create deterministic SUUID from name
sid = SUUID.from_name(class_name="Experiment", name="Test 001")
print(sid.sname) # "Experiment__test_001__<uuid>"
print(sid.did) # "did:sdata_core-suuid:Experiment__test_001__<uuid>"
# Random SUUID
sid2 = SUUID(class_name="Data", name="sample")
print(sid2.huuid) # Random 32-char hex string
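One common way to obtain a reproducible identifier from a name is a name-based UUID (version 5). The sketch below mirrors the `Class__name__<uuid>` shape shown above, but the namespace, the seed format, and the helpers `slugify` and `deterministic_id` are assumptions for illustration. It does not reproduce the values that `SUUID.from_name` returns.

```python
import re
import uuid


def slugify(text):
    """Lowercase the text and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def deterministic_id(class_name, name):
    """Build a readable identifier whose UUID part depends only on the inputs."""
    seed = f"{class_name}:{name}"  # assumption: seed format chosen for this sketch
    digest = uuid.uuid5(uuid.NAMESPACE_URL, seed).hex
    return f"{class_name}__{slugify(name)}__{digest}"


first = deterministic_id("Experiment", "Test 001")
second = deterministic_id("Experiment", "Test 001")
print(first == second)  # the same inputs always yield the same identifier
```

A random identifier would instead use `uuid.uuid4()`, which differs on every call.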
JSON Schema Generation
schema = Metadata.get_schema()
print(schema["title"]) # "sdata_core Metadata Schema"
# Validate with jsonschema library
import jsonschema
data = meta.to_dict()
jsonschema.validate(instance=data, schema=schema)
Supported Data Types
| DType | Python Type | Description |
|---|---|---|
| DType.FLOAT | float | Floating point numbers |
| DType.INT | int | Integers |
| DType.STR | str | Strings |
| DType.BOOL | bool | Booleans |
| DType.TIMESTAMP | datetime | ISO 8601 timestamps |
| DType.LIST | list[str] | List of strings |
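The table maps each type tag to a Python type. As an illustration of how such a tag could be inferred from a value, here is a standalone sketch. `ValueKind` and `infer_kind` are hypothetical names that mimic the table and are not the library's `DType` or its inference logic.

```python
from datetime import datetime
from enum import Enum


class ValueKind(Enum):
    """Hypothetical type tags mirroring the rows of the table above."""
    FLOAT = "float"
    INT = "int"
    STR = "str"
    BOOL = "bool"
    TIMESTAMP = "timestamp"
    LIST = "list"


def infer_kind(value):
    """Pick a type tag for a value, raising TypeError for unsupported types."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return ValueKind.BOOL
    if isinstance(value, int):
        return ValueKind.INT
    if isinstance(value, float):
        return ValueKind.FLOAT
    if isinstance(value, datetime):
        return ValueKind.TIMESTAMP
    if isinstance(value, list):
        return ValueKind.LIST
    if isinstance(value, str):
        return ValueKind.STR
    raise TypeError(f"unsupported value type: {type(value).__name__}")


print(infer_kind(True), infer_kind(5000.0), infer_kind("DP800 Steel"))
```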
Export Formats
- JSON: to_json()/from_json()
- DataFrame: to_dataframe()/from_dataframe()
- CSV: to_csv()/from_csv()
- JSON-LD: to_jsonld()/from_jsonld()
- Dict: to_dict()/from_dict()
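Each format above comes as an export/import pair. The sketch below shows what a CSV round trip of attribute rows involves, using only the standard library. The column names follow the DataFrame output shown earlier, while `rows_to_csv` and `csv_to_rows` are hypothetical helpers and not the library's `to_csv()`/`from_csv()`.

```python
import csv
import io

FIELDS = ["name", "value", "unit", "dtype"]


def rows_to_csv(rows):
    """Write attribute rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Read CSV text back into a list of row dictionaries (all values are strings)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


rows = [
    {"name": "force", "value": "5000.0", "unit": "N", "dtype": "float"},
    {"name": "material", "value": "DP800 Steel", "unit": "-", "dtype": "str"},
]
csv_text = rows_to_csv(rows)
restored = csv_to_rows(csv_text)
print(restored == rows)
```

CSV stores every value as text, so a column such as `dtype` is what allows an importer to convert values back to their original types.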
License
MIT License