Deterministic ingestion and normalization for MetaSPN

metaspn-io

metaspn-io is the ingestion and normalization layer for MetaSPN. It converts raw external records into canonical signal envelopes with deterministic IDs and ordering.

Quick Demo

python -m metaspn_io io ingest \
  --adapter social_jsonl_v1 \
  --source tests/fixtures/social \
  --date 2026-02-05 \
  --out /tmp/social-signals \
  --stats

v0.1 Adapters

  • social_jsonl_v1 (MUST): browser-extension social JSONL (post_seen, profile_seen)
  • outcomes_jsonl_v1 (SHOULD): manual outcomes JSONL (message_sent, reply_received, meeting_booked, revenue_event)
  • solana_rpc_v1: token/platform JSONL (trade, holder_change, supply_change, liquidity_event, metadata_update, reward_update, metatowel_volume_window, reward_pool_funding)
  • pumpfun_v1 (experimental): pump.fun token JSONL (same canonical token event mapping)
  • season1_onchain_jsonl_v1: Season 1 chain JSONL (season_init, game_create, distribute, stake, end, claim)

Schema Mapping

Input adapter              | Input type              | Output payload
social_jsonl_v1            | post_seen               | SocialPostSeen
social_jsonl_v1            | profile_seen            | ProfileSnapshotSeen
outcomes_jsonl_v1          | message_sent            | MessageSent
outcomes_jsonl_v1          | reply_received          | ReplyReceived
outcomes_jsonl_v1          | meeting_booked          | MeetingBooked
outcomes_jsonl_v1          | revenue_event           | RevenueEvent
solana_rpc_v1 / pumpfun_v1 | trade                   | TokenTradeSeen
solana_rpc_v1 / pumpfun_v1 | holder_change           | HolderChangeSeen
solana_rpc_v1 / pumpfun_v1 | supply_change           | SupplyChangeSeen
solana_rpc_v1 / pumpfun_v1 | liquidity_event         | LiquidityEventSeen
solana_rpc_v1 / pumpfun_v1 | metadata_update         | TokenMetadataUpdated
solana_rpc_v1 / pumpfun_v1 | reward_update           | RewardUpdated
solana_rpc_v1              | metatowel_volume_window | MetatowelVolumeWindowSeen
solana_rpc_v1              | reward_pool_funding     | RewardPoolFundingSeen
season1_onchain_jsonl_v1   | season_init             | SeasonInitialized
season1_onchain_jsonl_v1   | game_create             | SeasonGameCreated
season1_onchain_jsonl_v1   | distribute              | SeasonRewardDistributed
season1_onchain_jsonl_v1   | stake                   | SeasonStakeRecorded
season1_onchain_jsonl_v1   | end                     | SeasonEnded
season1_onchain_jsonl_v1   | claim                   | SeasonRewardClaimed
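
The mapping above can be read as a lookup keyed on (adapter, input type). The following is a minimal sketch of that idea, not the package's own code: PAYLOAD_TYPES and payload_type_for are hypothetical names introduced for illustration, and only a handful of rows are reproduced.

```python
# Hypothetical lookup table mirroring a few rows of the schema mapping.
# Neither name below is part of metaspn-io; they exist only for this sketch.
PAYLOAD_TYPES = {
    ("social_jsonl_v1", "post_seen"): "SocialPostSeen",
    ("social_jsonl_v1", "profile_seen"): "ProfileSnapshotSeen",
    ("outcomes_jsonl_v1", "message_sent"): "MessageSent",
    ("solana_rpc_v1", "trade"): "TokenTradeSeen",
    ("pumpfun_v1", "trade"): "TokenTradeSeen",
    ("solana_rpc_v1", "metatowel_volume_window"): "MetatowelVolumeWindowSeen",
    ("season1_onchain_jsonl_v1", "claim"): "SeasonRewardClaimed",
}


def payload_type_for(adapter: str, input_type: str) -> str:
    """Return the output payload type for an adapter/input-type pair."""
    try:
        return PAYLOAD_TYPES[(adapter, input_type)]
    except KeyError:
        raise ValueError(f"no mapping for {adapter!r} / {input_type!r}") from None
```

Keying on the pair rather than on the input type alone keeps adapter-specific rows (such as metatowel_volume_window, which the table lists only for solana_rpc_v1) from leaking into other adapters.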

Output Envelope

Each output JSONL line is one canonical envelope (pretty-printed here for readability):

{
  "schema_version": "0.1",
  "signal_id": "s_4e9b5c8417d3af2ef9baf8d1",
  "timestamp": "2026-02-05T12:00:00Z",
  "source": "twitter",
  "payload_type": "SocialPostSeen",
  "payload": {
    "platform": "twitter",
    "author_handle": "alice",
    "post_url": "https://x.com/alice/status/1",
    "text": "hello world",
    "action": "seen"
  },
  "entity_refs": [
    {
      "kind": "platform_identifier",
      "platform": "twitter",
      "identifier": "alice"
    }
  ],
  "trace": {
    "ingested_at": "2026-02-06T00:00:00Z",
    "input_file": "raw/social/2026-02-05.jsonl",
    "input_line_number": 1,
    "adapter_name": "social_jsonl_v1",
    "adapter_version": "0.1",
    "raw_id": null,
    "original_timezone": "UTC"
  }
}
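
A downstream consumer might read these envelopes back with the standard library alone. The sketch below assumes that every top-level key shown in the example is always present; that is an assumption drawn from the sample, not a documented guarantee. REQUIRED_KEYS, parse_envelope, and envelope_time are hypothetical names, not part of metaspn-io.

```python
import json
from datetime import datetime

# Assumption: the top-level keys in the example envelope are all mandatory.
REQUIRED_KEYS = {
    "schema_version", "signal_id", "timestamp", "source",
    "payload_type", "payload", "entity_refs", "trace",
}


def parse_envelope(line: str) -> dict:
    """Parse one JSONL line and check that the expected top-level keys exist."""
    envelope = json.loads(line)
    missing = REQUIRED_KEYS - envelope.keys()
    if missing:
        raise ValueError(f"missing envelope keys: {sorted(missing)}")
    return envelope


def envelope_time(envelope: dict) -> datetime:
    """Return the envelope timestamp as a timezone-aware datetime."""
    ts = envelope["timestamp"]
    # Older Python versions reject a trailing "Z" in fromisoformat.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    return datetime.fromisoformat(ts)
```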

CLI

Primary command:

metaspn io ingest --adapter social_jsonl_v1 --source raw/social --out workspace/store/signals/2026-02-05.jsonl

Supported flags:

  • --source file or directory
  • --out output JSONL path or directory (with --date, writes <out>/<date>.jsonl)
  • --store optional store root (writes to <store>/signals/YYYY-MM-DD.jsonl)
  • --date one-day UTC ingest window (YYYY-MM-DD)
  • --since ISO timestamp lower bound
  • --until ISO timestamp upper bound
  • --dry-run
  • --stats
  • --lenient
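
To make the --date, --out, and --store descriptions concrete, here is a minimal sketch of how a one-day UTC window and the dated output path could be derived. It is an illustration under stated assumptions, not the CLI's implementation: the window is assumed to be half-open (start inclusive, end exclusive), the precedence between --out and --store is not documented here so the sketch requires exactly one, and day_window and resolve_output are hypothetical names.

```python
from datetime import date, datetime, time, timedelta, timezone
from pathlib import Path
from typing import Optional


def day_window(day: str) -> tuple[datetime, datetime]:
    """Return (start, end) of the UTC day; the end bound is treated as exclusive."""
    start = datetime.combine(date.fromisoformat(day), time.min, tzinfo=timezone.utc)
    return start, start + timedelta(days=1)


def resolve_output(day: str, out: Optional[str] = None, store: Optional[str] = None) -> Path:
    """Build the dated JSONL path for either an --out directory or a --store root."""
    # Assumption: exactly one destination is given; real precedence is unknown.
    if (out is None) == (store is None):
        raise ValueError("pass exactly one of out or store")
    if store is not None:
        return Path(store) / "signals" / f"{day}.jsonl"
    return Path(out) / f"{day}.jsonl"
```

For example, resolve_output("2026-02-05", out="workspace/store/signals") yields the path workspace/store/signals/2026-02-05.jsonl, matching the <out>/<date>.jsonl layout described above.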

Demo orchestrator invocation:

metaspn io ingest --adapter social_jsonl_v1 --source raw/social --date 2026-02-05 --out workspace/store/signals

Daily Season 1 run examples:

metaspn io ingest \
  --adapter season1_onchain_jsonl_v1 \
  --source raw/chain/season1 \
  --date 2026-02-07 \
  --out workspace/store/season1-signals \
  --stats

metaspn io ingest \
  --adapter solana_rpc_v1 \
  --source raw/tokens/season1 \
  --date 2026-02-07 \
  --out workspace/store/token-signals \
  --stats

Default mode is strict: bad records are skipped and logged to workspace/logs/ingest_errors.jsonl unless overridden.

Determinism Rules

  • Stable IDs via stable_signal_id(source, timestamp, key)
  • Timestamps normalized to UTC
  • Deterministic sort: timestamp, then canonical key
  • JSON output uses sorted keys
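
The rules above can be sketched with the standard library. This is an illustration only: the actual stable_signal_id implementation is not shown in this README, so the hashing scheme below (SHA-256 truncated to 24 hex characters with an "s_" prefix, chosen to resemble the sample ID's shape) is an assumption, as are the names sketch_signal_id, to_utc_iso, and dump_sorted, the "key" field, and the choice to treat naive timestamps as UTC.

```python
import hashlib
import json
from datetime import datetime, timezone


def to_utc_iso(ts: str) -> str:
    """Normalize an ISO 8601 timestamp to UTC with a trailing 'Z'."""
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    parsed = datetime.fromisoformat(ts)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def sketch_signal_id(source: str, timestamp: str, key: str) -> str:
    """Hypothetical stand-in for stable_signal_id: same inputs, same ID."""
    joined = "\x1f".join([source, timestamp, key])
    return "s_" + hashlib.sha256(joined.encode("utf-8")).hexdigest()[:24]


def dump_sorted(signals: list[dict]) -> list[str]:
    """Sort by timestamp, then key, and serialize each record with sorted keys."""
    ordered = sorted(signals, key=lambda s: (s["timestamp"], s["key"]))
    return [json.dumps(s, sort_keys=True) for s in ordered]
```

Because every timestamp is first normalized to the same UTC string format, sorting the strings lexicographically gives the same order as sorting the instants they represent.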

Add A New Adapter (<50 lines)

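from dataclasses import dataclass, field
from pathlib import Path
from metaspn_io.adapters.base import AdapterOptions

# NOTE: iter_jsonl_records, ParseIssue, and convert_to_signal are used below
# but their module paths are not shown in this README; import or define them
# for your adapter before running this.

@dataclass
class MyAdapter:
    name: str = "my_adapter_v1"
    version: str = "0.1"
    # Parse problems collected while iterating; one fresh list per instance.
    issues: list = field(default_factory=list)

    def iter_signals(self, source_path: Path, options: AdapterOptions | None = None):
        for raw in iter_jsonl_records(source_path):
            # Record malformed lines and keep going instead of aborting.
            if isinstance(raw, ParseIssue):
                self.issues.append(raw)
                continue
            # Map the raw record onto a canonical signal envelope.
            signal = convert_to_signal(raw)
            yield signal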

Register it in metaspn_io.adapters.default_registry().

Tests

python3 -m pytest -q

Publishing

publish.yml publishes to PyPI when you push a version tag:

git tag -a v0.1.3 -m "v0.1.3"
git push origin v0.1.3

Configure PyPI trusted publishing for this GitHub repository, then the workflow will upload dist/* automatically. Before tagging, ensure CI (.github/workflows/ci.yml) is green, which validates python3 -m pytest -q and package build artifacts.
