
Deterministic ingestion and normalization for MetaSPN


metaspn-io

metaspn-io is the ingestion and normalization layer for MetaSPN. It converts raw external records into canonical signal envelopes with deterministic IDs and ordering.

Quick Demo (6 lines)

python -m metaspn_io io ingest \
  --adapter social_jsonl_v1 \
  --source tests/fixtures/social \
  --date 2026-02-05 \
  --out /tmp/social-signals \
  --stats

v0.1 Adapters

  • social_jsonl_v1 (MUST): browser-extension social JSONL (post_seen, profile_seen)
  • outcomes_jsonl_v1 (SHOULD): manual outcomes JSONL (message_sent, reply_received, meeting_booked, revenue_event)
  • solana_rpc_v1: token/platform JSONL (trade, holder_change, supply_change, liquidity_event, metadata_update, reward_update)
  • pumpfun_v1 (experimental): pump.fun token JSONL (same canonical token event mapping)

Schema Mapping

| Input adapter | Input type | Output payload |
| --- | --- | --- |
| social_jsonl_v1 | post_seen | SocialPostSeen |
| social_jsonl_v1 | profile_seen | ProfileSnapshotSeen |
| outcomes_jsonl_v1 | message_sent | MessageSent |
| outcomes_jsonl_v1 | reply_received | ReplyReceived |
| outcomes_jsonl_v1 | meeting_booked | MeetingBooked |
| outcomes_jsonl_v1 | revenue_event | RevenueEvent |
| solana_rpc_v1 / pumpfun_v1 | trade | TokenTradeSeen |
| solana_rpc_v1 / pumpfun_v1 | holder_change | HolderChangeSeen |
| solana_rpc_v1 / pumpfun_v1 | supply_change | SupplyChangeSeen |
| solana_rpc_v1 / pumpfun_v1 | liquidity_event | LiquidityEventSeen |
| solana_rpc_v1 / pumpfun_v1 | metadata_update | TokenMetadataUpdated |
| solana_rpc_v1 / pumpfun_v1 | reward_update | RewardUpdated |
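
If you need the same mapping in your own tooling, it can be written as a plain lookup table. This is only a sketch built from the rows above: PAYLOAD_TYPES and payload_type_for are hypothetical names, not part of the metaspn_io API.

```python
# Token events share one mapping across both token adapters.
TOKEN_EVENTS = {
    "trade": "TokenTradeSeen",
    "holder_change": "HolderChangeSeen",
    "supply_change": "SupplyChangeSeen",
    "liquidity_event": "LiquidityEventSeen",
    "metadata_update": "TokenMetadataUpdated",
    "reward_update": "RewardUpdated",
}

# (adapter, input type) -> output payload type, copied from the table.
PAYLOAD_TYPES = {
    ("social_jsonl_v1", "post_seen"): "SocialPostSeen",
    ("social_jsonl_v1", "profile_seen"): "ProfileSnapshotSeen",
    ("outcomes_jsonl_v1", "message_sent"): "MessageSent",
    ("outcomes_jsonl_v1", "reply_received"): "ReplyReceived",
    ("outcomes_jsonl_v1", "meeting_booked"): "MeetingBooked",
    ("outcomes_jsonl_v1", "revenue_event"): "RevenueEvent",
}
for adapter in ("solana_rpc_v1", "pumpfun_v1"):
    for input_type, payload_type in TOKEN_EVENTS.items():
        PAYLOAD_TYPES[(adapter, input_type)] = payload_type


def payload_type_for(adapter: str, input_type: str) -> str:
    """Hypothetical helper: look up the payload type for one input record."""
    try:
        return PAYLOAD_TYPES[(adapter, input_type)]
    except KeyError:
        raise ValueError(f"no mapping for {adapter!r} / {input_type!r}") from None
```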

Output Envelope

JSONL lines are emitted as canonical envelopes:

{
  "schema_version": "0.1",
  "signal_id": "s_4e9b5c8417d3af2ef9baf8d1",
  "timestamp": "2026-02-05T12:00:00Z",
  "source": "twitter",
  "payload_type": "SocialPostSeen",
  "payload": {
    "platform": "twitter",
    "author_handle": "alice",
    "post_url": "https://x.com/alice/status/1",
    "text": "hello world",
    "action": "seen"
  },
  "entity_refs": [
    {
      "kind": "platform_identifier",
      "platform": "twitter",
      "identifier": "alice"
    }
  ],
  "trace": {
    "ingested_at": "2026-02-06T00:00:00Z",
    "input_file": "raw/social/2026-02-05.jsonl",
    "input_line_number": 1,
    "adapter_name": "social_jsonl_v1",
    "adapter_version": "0.1",
    "raw_id": null,
    "original_timezone": "UTC"
  }
}
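
A downstream consumer might read these envelopes roughly as follows. This is a minimal sketch using only the standard library: parse_envelope and read_envelopes are hypothetical helpers, and treating every top-level key from the example as required is an assumption, not a documented rule.

```python
import json

# Top-level keys taken from the example envelope; treating all of them
# as required is an assumption of this sketch.
REQUIRED_KEYS = {
    "schema_version", "signal_id", "timestamp", "source",
    "payload_type", "payload", "entity_refs", "trace",
}


def parse_envelope(line: str) -> dict:
    """Parse one JSONL line and check the top-level envelope keys."""
    envelope = json.loads(line)
    missing = REQUIRED_KEYS - envelope.keys()
    if missing:
        raise ValueError(f"envelope is missing keys: {sorted(missing)}")
    return envelope


def read_envelopes(lines):
    """Yield envelopes from an iterable of JSONL lines, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_envelope(line)
```

Because read_envelopes accepts any iterable of strings, an open file handle for a signals JSONL file can be passed to it directly.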

CLI

Primary command:

metaspn io ingest --adapter social_jsonl_v1 --source raw/social --out workspace/store/signals/2026-02-05.jsonl

Supported flags:

  • --source file or directory
  • --out output JSONL path or directory (with --date, writes <out>/<date>.jsonl)
  • --store optional store root (writes to <store>/signals/YYYY-MM-DD.jsonl)
  • --date one-day UTC ingest window (YYYY-MM-DD)
  • --since ISO timestamp lower bound
  • --until ISO timestamp upper bound
  • --dry-run
  • --stats
  • --lenient
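
The --date, --out, and --store descriptions can be illustrated with a small sketch. Both functions below are hypothetical and are not the CLI's actual code. The half-open day window, the rule that an --out value ending in .jsonl is treated as a file, and --store taking precedence over --out are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def day_window(date_str: str):
    """Return (start, end) UTC datetimes covering one calendar day.

    The half-open [start, end) convention is an assumption of this sketch.
    """
    start = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return start, start + timedelta(days=1)


def resolve_output(out=None, store=None, date=None) -> Path:
    """Hypothetical helper mirroring the documented output layout."""
    if store is not None:
        # Documented layout: <store>/signals/YYYY-MM-DD.jsonl
        if date is None:
            raise ValueError("this sketch needs a date to build the store path")
        return Path(store) / "signals" / f"{date}.jsonl"
    if out is None:
        raise ValueError("either out or store is required")
    if date is not None and not str(out).endswith(".jsonl"):
        # Documented layout: <out>/<date>.jsonl when out is a directory
        return Path(out) / f"{date}.jsonl"
    return Path(out)
```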

Demo orchestrator invocation:

metaspn io ingest --adapter social_jsonl_v1 --source raw/social --date 2026-02-05 --out workspace/store/signals

Default mode is strict: bad records are skipped and logged to workspace/logs/ingest_errors.jsonl unless overridden.

Determinism Rules

  • Stable IDs via stable_signal_id(source, timestamp, key)
  • Timestamps normalized to UTC
  • Deterministic sort: timestamp, then canonical key
  • JSON output uses sorted keys
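
The four rules above can be sketched together in a few lines. The README does not document the hashing recipe behind stable_signal_id, so the version here is a stand-in: it assumes a SHA-256 digest truncated to 24 hex characters, chosen only to match the shape of the example ID. to_utc_iso and emit_lines are hypothetical helpers.

```python
import hashlib
import json
from datetime import datetime, timezone


def stable_signal_id(source: str, timestamp: str, key: str) -> str:
    """Stand-in only: assumes SHA-256 over the three fields, truncated
    to 24 hex characters ("s_" + 24 hex digits, like the example ID)."""
    digest = hashlib.sha256(f"{source}|{timestamp}|{key}".encode("utf-8")).hexdigest()
    return "s_" + digest[:24]


def to_utc_iso(value: str) -> str:
    """Normalize an ISO 8601 timestamp to UTC with a Z suffix."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def emit_lines(records):
    """records: iterable of (source, timestamp, key, payload) tuples."""
    rows = []
    for source, timestamp, key, payload in records:
        ts = to_utc_iso(timestamp)
        rows.append((ts, key, {
            "signal_id": stable_signal_id(source, ts, key),
            "timestamp": ts,
            "source": source,
            "payload": payload,
        }))
    # Deterministic order: normalized timestamp first, then canonical key.
    rows.sort(key=lambda row: (row[0], row[1]))
    # sort_keys=True keeps the serialized bytes stable across runs.
    return [json.dumps(row[2], sort_keys=True) for row in rows]
```

With these rules in place, feeding the same records in a different order produces byte-identical output, which is what makes re-ingestion safe to diff.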

Add A New Adapter (<50 lines)

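from dataclasses import dataclass, field
from pathlib import Path
from metaspn_io.adapters.base import AdapterOptions

# NOTE: iter_jsonl_records, ParseIssue, and convert_to_signal are not imported
# in this snippet; bring them in from wherever your code defines them.

@dataclass
class MyAdapter:
    name: str = "my_adapter_v1"
    version: str = "0.1"
    # Parse problems collected while iterating (one entry per bad record).
    issues: list = field(default_factory=list)

    def iter_signals(self, source_path: Path, options: AdapterOptions | None = None):
        for raw in iter_jsonl_records(source_path):
            if isinstance(raw, ParseIssue):
                # Record the problem and move on instead of aborting the run.
                self.issues.append(raw)
                continue
            # Convert the raw record into a canonical signal envelope.
            signal = convert_to_signal(raw)
            yield signal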

Register it in metaspn_io.adapters.default_registry().

Tests

python3 -m pytest -q

Publishing

publish.yml publishes to PyPI when you push a version tag:

git tag -a v0.1.3 -m "v0.1.3"
git push origin v0.1.3

Configure PyPI trusted publishing for this GitHub repository; the workflow will then upload dist/* automatically. Before tagging, make sure CI (.github/workflows/ci.yml) is green. CI runs python3 -m pytest -q and validates the package build artifacts.
