python interface for interacting with flashbots mempool dumpster

Project description

absorb 🧽

the sovereign dataset manager

absorb makes it easy to 1) collect, 2) query, 3) manage, and 4) customize datasets from nearly any data source

features

  • limitless dataset library: access to millions of datasets across 16 diverse data sources
  • intuitive cli+python interfaces: collect or query any dataset in a single line of code
  • maximal modularity: built on open standards for frictionless integration with other tools
  • easy extensibility: add new datasets or data sources with just a few lines of code

Contents

  1. Installation
  2. Example Usage
     i. Command Line
     ii. Python
  3. Supported Data Sources
  4. Output Format
  5. Configuration

Installation

uv pip install paradigm_absorb

Example Usage

Example Command Line Usage

# collect dataset and save as local files
absorb collect kalshi

# list datasets that are collected or available
absorb ls

# show the schema of a dataset
absorb schema kalshi

# create new custom dataset
absorb new custom_dataset

# upload custom dataset
absorb upload custom_dataset

Example Python Usage

import absorb

# collect dataset and save as local files
absorb.collect('kalshi')

# list datasets that are collected or available
datasets = absorb.list()

# get the schema of a dataset
schema = absorb.schema('kalshi')

# load dataset as polars DataFrame
df = absorb.load('kalshi')

# scan dataset as polars LazyFrame
lf = absorb.scan('kalshi')

# create new custom dataset
absorb.new('custom_dataset')

# upload custom dataset
absorb.upload('custom_dataset')

Supported Data Sources

absorb collects data from each of these sources:

To list all available datasets and data sources, type absorb ls on the command line.

Output Format

To display information about the schema and other metadata of a dataset, type absorb help <DATASET> on the command line.

absorb stores each dataset as a collection of parquet files.

Datasets can be stored in any location on your disks, and absorb will use symlinks to organize those files in the TRUCK_ROOT tree.
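
The symlink arrangement described above can be sketched with the standard library alone. This is a minimal illustration, not absorb's actual implementation: the helper name link_into_tree, the "markets" datatype, and all paths are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def link_into_tree(data_file: Path, tree_dir: Path) -> Path:
    """Expose data_file inside tree_dir through a symlink (hypothetical helper)."""
    tree_dir.mkdir(parents=True, exist_ok=True)
    link = tree_dir / data_file.name
    if link.is_symlink():
        link.unlink()  # replace a stale link so the call is repeatable
    # a regular file at this path is left alone: os.symlink raises FileExistsError
    os.symlink(data_file.resolve(), link)
    return link


# demo in a throwaway directory: the file lives in one place, the link in another
with tempfile.TemporaryDirectory() as tmp:
    external = Path(tmp) / "external_disk" / "part_0.parquet"
    external.parent.mkdir(parents=True)
    external.write_bytes(b"placeholder bytes, not a real parquet file")

    root = Path(tmp) / "truck_root"  # stands in for TRUCK_ROOT
    link = link_into_tree(
        external, root / "datasets" / "kalshi" / "tables" / "markets"
    )
    demo_is_symlink = link.is_symlink()
    demo_same_file = link.resolve() == external.resolve()
```

The data stays wherever it was written, and only the link lives inside the organized tree, so moving a dataset to another disk would only require recreating the link.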

The TRUCK_ROOT directory is organized as:

{TRUCK_ROOT}/
    datasets/
        <source>/
            tables/
                <datatype>/
                    {filename}.parquet
                table_metadata.json
            repos/
                {repo_name}/
    absorb_config.json
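
The layout above maps directly onto path arithmetic. The sketch below assumes only the tree as shown. The function names table_dir and list_parquet_files are hypothetical and are not part of absorb's API.

```python
from pathlib import Path


def table_dir(root: Path, source: str, datatype: str) -> Path:
    """Directory expected to hold the parquet files of one table."""
    return root / "datasets" / source / "tables" / datatype


def list_parquet_files(root: Path, source: str, datatype: str) -> list[Path]:
    """Sorted parquet files of one table, or an empty list if none exist."""
    directory = table_dir(root, source, datatype)
    if not directory.is_dir():
        return []
    return sorted(directory.glob("*.parquet"))
```

For example, table_dir(Path("/data"), "kalshi", "markets") resolves to /data/datasets/kalshi/tables/markets, where "markets" is a made-up datatype name used only for illustration.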

Configuration

absorb uses a config file to specify which datasets to track.

Schema of absorb_config.json:

{
    "tracked_tables": list[TrackedTable]
}

Schema of dataset_config.json:

{
    "name": str,
    "definition": str,
    "parameters": dict[str, Any],
    "repos": list[str]
}
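
The dataset_config.json schema above can be written as a typed dictionary with a small validator. This is a sketch under assumptions: the names DatasetConfig and parse_dataset_config are hypothetical, and absorb itself may load and validate its config differently.

```python
import json
from typing import Any, TypedDict


class DatasetConfig(TypedDict):
    """The four fields listed in the dataset_config.json schema."""

    name: str
    definition: str
    parameters: dict[str, Any]
    repos: list[str]


def parse_dataset_config(text: str) -> DatasetConfig:
    """Parse JSON text and check it against the four documented fields."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("dataset config must be a JSON object")
    expected = {"name": str, "definition": str, "parameters": dict, "repos": list}
    for key, expected_type in expected.items():
        if key not in raw:
            raise ValueError(f"missing field: {key}")
        if not isinstance(raw[key], expected_type):
            raise ValueError(f"field {key!r} should be {expected_type.__name__}")
    if not all(isinstance(repo, str) for repo in raw["repos"]):
        raise ValueError("every entry of 'repos' should be a string")
    return DatasetConfig(
        name=raw["name"],
        definition=raw["definition"],
        parameters=raw["parameters"],
        repos=raw["repos"],
    )
```

Failing early with a ValueError that names the offending field keeps a malformed config from surfacing later as an unrelated error.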

Download files

Source Distribution

paradigm_absorb-0.1.0.tar.gz (46.0 kB view details)

Uploaded Source

Built Distribution

paradigm_absorb-0.1.0-py3-none-any.whl (68.2 kB view details)

Uploaded Python 3

File details

Details for the file paradigm_absorb-0.1.0.tar.gz.

File metadata

  • Download URL: paradigm_absorb-0.1.0.tar.gz
  • Size: 46.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.32.3

File hashes

Hashes for paradigm_absorb-0.1.0.tar.gz
Algorithm Hash digest
SHA256 e9c901220706ce798c65fcf9f328db90f1ddafc705ec8638a83790d438219dcb
MD5 960e0e4dbc6567b1766e85ca41e97720
BLAKE2b-256 f1fb9f5996bcdfb599e3d5b426fc84b118ed17078fc1a7d7f45591e0553213ba

File details

Details for the file paradigm_absorb-0.1.0-py3-none-any.whl.

File hashes

Hashes for paradigm_absorb-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7f0269a79bb4cec9c56222db6d2b57dbb019f32f39793c286dc9b922604164af
MD5 d68ba27557e4e5644201c3a00a3af8a2
BLAKE2b-256 3dcab037d77fccc44ef389677faa2ee1fd22123143e35c9698701cbf951d5ec2
