Tagging system

Overview

The tagging system consists of the following major components:

  • Preprocessors: preprocess the input data objects before tagging.
  • Tag ID strategies: independent strategies that identify tags from input data objects.
  • Aggregators: post-process and aggregate the tagging results from the tag ID pipelines configured for each asset.
  • Handlers: assemble the preprocessors, tag ID pipelines, and result aggregation logic for each type of data source.

Input of the tagging system

Currently, the system receives batches of data objects from various data sources, e.g., Twitter (implemented), Discord (TBI), Medium (TBI), Reddit (TBI), etc.

Preprocessors

Preprocessors are used to preprocess the input data objects before tagging.
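
As an illustration only, a preprocessor could look like the sketch below. The function name and the cleaning rules are hypothetical and are not taken from the TaggingSystem code base; the sketch assumes each data object is a dict with a "content" string, as in the usage example of this README.

```python
import re
from typing import Dict, List

# Hypothetical preprocessor: NOT part of the TaggingSystem API.
# Assumes each data object is a dict carrying its text under "content".
URL_PATTERN = re.compile(r"https?://\S+")


def strip_urls_and_whitespace(data_objects: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return copies of the data objects with URLs removed and whitespace collapsed."""
    cleaned = []
    for obj in data_objects:
        text = URL_PATTERN.sub(" ", obj.get("content", ""))
        text = " ".join(text.split())  # collapse runs of whitespace
        cleaned.append({**obj, "content": text})
    return cleaned


batch = [{"content": "gm   ETH to the moon https://example.com/x"}]
print(strip_urls_and_whitespace(batch))
# [{'content': 'gm ETH to the moon'}]
```

Returning copies rather than mutating the input keeps the original data objects available to later stages.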

Strategies

Strategies are used to identify tags from input (pre-processed) data objects. Strategies are expected to work independently of one another and are applied per asset.
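
A minimal sketch of what a per-asset strategy might do is shown below. The config keys ("ticker", "keywords") and the function name are assumptions made for illustration; the real ticker config schema and strategy interface may differ.

```python
import re
from typing import Dict, List

# Hypothetical per-asset config; the real ticker config schema may differ.
ETH_CONFIG = {"ticker": "ETH", "keywords": ["ETH", "Ethereum"]}


def keyword_strategy(asset_config: Dict, data_objects: List[Dict[str, str]]) -> List[float]:
    """Score each data object 1.0 if any keyword appears as a whole word, else 0.0."""
    alternatives = "|".join(re.escape(k) for k in asset_config["keywords"])
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return [1.0 if pattern.search(obj.get("content", "")) else 0.0 for obj in data_objects]


batch = [{"content": "test ETH"}, {"content": "something else"}]
print(keyword_strategy(ETH_CONFIG, batch))
# [1.0, 0.0]
```

The word boundaries matter here: "something" contains the letters "eth" but is not tagged. The strategy only looks at one asset's config, which is what lets strategies run independently.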

Aggregators

Aggregators are used to post-process and aggregate the tagging results from the tag ID pipelines configured for each asset. Cross-asset tagging strategies should also be implemented here. Aggregators are expected to have access to all outputs from the tagging strategies (asset-wise) and to the input data objects.
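
The sketch below shows one way an aggregator could combine asset-wise outputs. It assumes, hypothetically, that each strategy pipeline yields one score per data object and that the final result is one dict of ticker scores per data object; the function name and the threshold parameter are illustrative, not part of the library.

```python
from typing import Dict, List


def merge_asset_scores(
    asset_scores: Dict[str, List[float]],
    data_objects: List[Dict],
    threshold: float = 0.5,
) -> List[Dict[str, float]]:
    """Build one {ticker: score} dict per data object, keeping scores >= threshold."""
    results = []
    for index in range(len(data_objects)):
        tags = {
            ticker: scores[index]
            for ticker, scores in asset_scores.items()
            if scores[index] >= threshold
        }
        results.append(tags)
    return results


# Hypothetical asset-wise outputs for a batch of two data objects.
scores = {"BTC": [1.0, 0.0], "ETH": [0.2, 0.9]}
batch = [{"content": "test BTC"}, {"content": "test ETH"}]
print(merge_asset_scores(scores, batch))
# [{'BTC': 1.0}, {'ETH': 0.9}]
```

Because the aggregator sees every asset's scores at once, a cross-asset rule (for example, resolving two tickers that share a keyword) would fit inside the same loop.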

Handlers

Handlers control the flow of the actual tagging process for each data source. The handler reads the preprocessing pipeline and the aggregation pipeline from data_source_configs in the global config, and reads the tag identification pipelines from the ticker-specific configurations.
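
The flow described above can be sketched roughly as follows. Only the data_source_configs key comes from this README; the other config keys ("preprocessing_pipeline", "aggregation_pipeline", "tag_id_pipeline"), the registries, and the SketchHandler class are hypothetical stand-ins, not the library's actual implementation.

```python
from typing import Dict, List


def lowercase(objs: List[Dict]) -> List[Dict]:
    return [{**o, "content": o["content"].lower()} for o in objs]


def substring(cfg: Dict, objs: List[Dict]) -> List[float]:
    return [1.0 if cfg["ticker"].lower() in o["content"] else 0.0 for o in objs]


def drop_zero(results: List[Dict[str, float]], objs: List[Dict]) -> List[Dict[str, float]]:
    return [{t: s for t, s in r.items() if s > 0} for r in results]


# Hypothetical registries mapping names used in the configs to callables.
PREPROCESSORS = {"lowercase": lowercase}
STRATEGIES = {"substring": substring}
AGGREGATORS = {"drop_zero": drop_zero}


class SketchHandler:
    """Illustrative handler: global config drives pre/post steps, ticker configs drive tagging."""

    def __init__(self, config: Dict, ticker_config_list: List[Dict], data_source: str):
        source_cfg = config["data_source_configs"][data_source]
        self.preprocessors = [PREPROCESSORS[n] for n in source_cfg["preprocessing_pipeline"]]
        self.aggregators = [AGGREGATORS[n] for n in source_cfg["aggregation_pipeline"]]
        self.ticker_configs = ticker_config_list

    def __call__(self, data) -> List[Dict[str, float]]:
        objs = data if isinstance(data, list) else [data]
        for preprocess in self.preprocessors:
            objs = preprocess(objs)
        results: List[Dict[str, float]] = [{} for _ in objs]
        for cfg in self.ticker_configs:  # strategies run per asset
            for name in cfg["tag_id_pipeline"]:
                for tags, score in zip(results, STRATEGIES[name](cfg, objs)):
                    tags[cfg["ticker"]] = max(tags.get(cfg["ticker"], 0.0), score)
        for aggregate in self.aggregators:
            results = aggregate(results, objs)
        return results


config = {"data_source_configs": {"discord": {
    "preprocessing_pipeline": ["lowercase"], "aggregation_pipeline": ["drop_zero"]}}}
ticker_configs = [{"ticker": "BTC", "tag_id_pipeline": ["substring"]},
                  {"ticker": "ETH", "tag_id_pipeline": ["substring"]}]
handler = SketchHandler(config, ticker_configs, "discord")
print(handler({"content": "test BTC"}))
# [{'BTC': 1.0}]
```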

Applying TaggingSystem in downstream logic

  1. Prepare the tagger config in a JSON file, e.g., config.json. A sample is available at https://github.com/MetaSearch-IO/TaggingSystem/blob/master/sample_configs/config.json . This file contains the global settings for all tickers, including the preprocessing pipeline and the aggregation pipeline for ticker identification results. Read this config as:
import json

with open('config.json') as f:
    config = json.load(f)
  2. Prepare a list of ticker-specific configs, e.g., ticker_configs.json. Many prepared configs are available at https://github.com/MetaSearch-IO/TickerConfigs . Read this config as:
# In this example we only read one ticker config
with open('ticker_configs/curated_tickers/Chains/ETH.json') as f:
    ticker_config_list = json.load(f)
  3. Initialize a Handler object, and tag your data object(s) by calling it on the data object(s):
from TaggingSystem.handler.DiscordHandler import DiscordHandler

# Init handler
crypto_ticker_tagger = DiscordHandler(config=config, ticker_config_list=ticker_config_list)

# Sample data, could also be a list of dicts
processed_data = {"content": "test BTC"}

# Apply Tagging Logic
crypto_tickers = crypto_ticker_tagger(processed_data)

# crypto_tickers = [{'BTC': 1.0}]
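
Downstream code will usually need to act on the returned tags. The sketch below assumes, based only on the sample output above, that the result is a list with one dict per data object mapping tickers to scores; the top_ticker helper and its min_score parameter are hypothetical and not part of TaggingSystem.

```python
from typing import Dict, List, Optional


def top_ticker(tags: Dict[str, float], min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring ticker, or None if there is none above min_score."""
    if not tags:
        return None
    ticker, score = max(tags.items(), key=lambda item: item[1])
    return ticker if score >= min_score else None


# Hand-written results in the assumed output shape (not produced by the library).
crypto_tickers: List[Dict[str, float]] = [{"BTC": 1.0}, {"ETH": 0.4, "BTC": 0.7}, {}]
print([top_ticker(tags) for tags in crypto_tickers])
# ['BTC', 'BTC', None]
```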

Download files

Source Distribution

tagging_system-0.1.0.tar.gz (16.4 kB)

Uploaded Source

Built Distribution

tagging_system-0.1.0-py3-none-any.whl (24.7 kB)

Uploaded Python 3

File details

Details for the file tagging_system-0.1.0.tar.gz.

File metadata

  • Download URL: tagging_system-0.1.0.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.2.1 CPython/3.9.14 Darwin/21.6.0

File hashes

Hashes for tagging_system-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6e0dedc42ac5d37262ac1b7a1d11bae63f5f0c8796cc63a02e899166f6b872f5
MD5 b5573851eb7f9aec5d250f198e43d4e0
BLAKE2b-256 ef8db1769bc1cd62be6cd0e9bfe42790427f9bea73656cd69e375cc25cf240d7

File details

Details for the file tagging_system-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: tagging_system-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 24.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.2.1 CPython/3.9.14 Darwin/21.6.0

File hashes

Hashes for tagging_system-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 56277b751f976c6824d73c52e59286c938f85700108dad9a20b261125ab8983a
MD5 ad0b61895e6e396444da1e59a09fea61
BLAKE2b-256 aaedbfa799aed21e44ba66f9bc45be92ed6f7316c3ba8aa9b2c0b5ac5a0151fe
