A small example package

Redis Naming Convention

Redis Key

TYPE:NAME:DIMENSION:ID:TIMESTAMP:METRIC
Examples:

  • JSON -> J::P:001::
  • TS -> TS:5MINUTES:S:001::UD
from enum import Enum, IntEnum, unique

# Position of each field inside a key
@unique
class RedisNC(IntEnum):
    TYPE = 0
    NAME = 1
    DIMENSION = 2
    RECORD_ID = 3
    TS = 4
    METRIC = 5
@unique
class Type(Enum):
    STREAM = 'ST'
    HASH = 'H'
    JSON = 'J'
    INDEX = 'I'
    TIMESERIES = 'TS'
    BLOOM = 'B'
    SORTEDSET = 'SS'
    SET = 'S'
    LIST = 'L'
    CHANNEL = 'C'
    CMS = 'CMS'
    HLL = 'HLL'
NAME = "custom_dev_choice"
@unique
class Dimension(Enum):
    WEBSITE = 'W'
    SECTION = 'S'
    PAGE = 'P'
    DEVICE = 'D'
    AUDIO = 'A'
    VIDEO = 'V'
    PODCAST = 'PC'
    METRIC = 'M'
ID = "unique_key_identifier" # hash
TIMESTAMP = "timestamp_key" # int
@unique
class Metric(Enum):
    PAGEVIEWS = 'PG'
    DEVICES = 'D'
    UNIQUE_DEVICES = 'UD'
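
As an illustration of the convention above, keys could be built and parsed as in the following minimal sketch. The helper names build_key and parse_key are hypothetical and are not part of the package; the sketch also assumes that no field value contains the ":" separator.

```python
from enum import IntEnum, unique


@unique
class RedisNC(IntEnum):
    """Position of each field inside a key (same order as the convention)."""
    TYPE = 0
    NAME = 1
    DIMENSION = 2
    RECORD_ID = 3
    TS = 4
    METRIC = 5


SEPARATOR = ":"


def build_key(type_, name="", dimension="", record_id="", ts="", metric=""):
    """Join the six fields in RedisNC order; unused fields stay empty."""
    parts = [""] * len(RedisNC)
    parts[RedisNC.TYPE] = type_
    parts[RedisNC.NAME] = name
    parts[RedisNC.DIMENSION] = dimension
    parts[RedisNC.RECORD_ID] = record_id
    parts[RedisNC.TS] = ts
    parts[RedisNC.METRIC] = metric
    return SEPARATOR.join(str(part) for part in parts)


def parse_key(key):
    """Split a key back into a dict keyed by RedisNC member name."""
    parts = key.split(SEPARATOR)
    if len(parts) != len(RedisNC):
        raise ValueError(f"expected {len(RedisNC)} fields, got {len(parts)}")
    return {field.name: parts[field] for field in RedisNC}


json_key = build_key("J", dimension="P", record_id="001")
ts_key = build_key("TS", name="5MINUTES", dimension="S", record_id="001", metric="UD")
print(json_key)
print(ts_key)
print(parse_key(ts_key))
```

Keeping empty fields in place means every key has exactly six fields, so a field can always be read by its RedisNC position.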

Redis JSON

Website

{
    "id": "28be7102962bea2626e3dd6c71026b3102211029ae09", (string - hash)
    "name": "RTL", (string - uppercase)
    "last_visited": 17102562426214, (numeric - timestamp)
    "sections": [
        {
            "id": "c102c621102bbe062bd4410d5b625dd10293fd631028", (string - hash)
            "pretty_name": "actu/monde", (string) -TODO-> "name": "actu/monde"
            "last_visited": 1710241020110225, (numeric - timestamp)
        }
    ], (tracker Section object)
    "pages": [
        {
            "id": "f0362c3463d7e27102db1f3102e9a4d102628a5eebd", (string - hash)
            "url": "https://5minutes.rtl.lu/actu/monde/a/110210234.html", (string - url without query params)
            "article_id": 110210234, (numeric)
            "last_visited": 1710241020110225, (numeric - timestamp)
        }
    ] (tracker Page object)
}

Section

{
    "id": "", (string - hash)
    "pretty_name": "", (string)
    "level_0": "", (string) -> "levels": {"level_0": "", "level_n": "n"}
    "level_1": "", (string)
    "level_2": "", (string)
    "level_3": "", (string)
    "level_4": "", (string)
    "last_visited": 1710256271021022, (numeric - timestamp)
    "website": {
        "id": "", (string - hash)
        "name": "", (string)
    },
    "pages": [
        {
            "id": "f0362c3463d7e27102db1f3102e9a4d102628a5eebd", (string - hash)
            "url": "https://5minutes.rtl.lu/actu/monde/a/110210234.html", (string - url without query params)
            "article_id": 110210234, (numeric)
            "last_visited": 1710241020110225, (numeric - timestamp)
        }
    ]
}

Page

{
    "id": "", (string - hash)
    "url": "", (string - url without query params)
    "article_id": "", (numeric)
    "last_visited": 17102626236220, (numeric - timestamp)
    "metadata": {
        "title": "", (string)
        "kicker": "", (string)
        "display_data": 17102626236220
    },
    "website": {
        "id": "",
        "name": ""
    },
    "section": {
        "id": "", (string - hash)
        "pretty_name": "", (string) -TODO-> "name": "" (string)
        "levels": {
            "level_0": "", (string)
            "level_1": "", (string)
            "level_2": "", (string)
            "level_3": "", (string)
            "level_4": "", (string)
        }
    }
}
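
A Page document following the layout above might be assembled like this. It is only a sketch: strip_query, make_page_doc and page_key are hypothetical helpers, and how the id hash is computed is not specified here, so the id is taken as an input.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def make_page_doc(page_id, url, article_id, last_visited, website, section, metadata=None):
    """Build a dict shaped like the Page JSON document."""
    return {
        "id": page_id,
        "url": strip_query(url),          # url without query params
        "article_id": article_id,
        "last_visited": last_visited,
        "metadata": metadata or {},
        "website": {"id": website["id"], "name": website["name"]},
        "section": {
            "id": section["id"],
            "pretty_name": section["pretty_name"],
            "levels": dict(section.get("levels", {})),
        },
    }


def page_key(page_id):
    """Key under the J::P: prefix, with TIMESTAMP and METRIC left empty."""
    return f"J::P:{page_id}::"


doc = make_page_doc(
    page_id="f0362c3463d7e27102db1f3102e9a4d102628a5eebd",
    url="https://5minutes.rtl.lu/actu/monde/a/110210234.html?utm_source=x#top",
    article_id=110210234,
    last_visited=1710241020110225,
    website={"id": "", "name": "RTL"},
    section={"id": "", "pretty_name": "actu/monde",
             "levels": {"level_0": "actu", "level_1": "monde"}},
)
print(page_key(doc["id"]))
print(doc["url"])
```

With redis-py and a server that has the RedisJSON module loaded, such a document could then be written with r.json().set(page_key(doc["id"]), "$", doc).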

Redis Index

Website on prefix = J::W: on JSON

  • id TAG as id,
  • name TAG as name,
  • last_visited NUMERIC as last_visited, -TODO-> last_visited NUMERIC as last_visited SORTABLE true,
  • sections[*].id TAG as section_id,
  • sections[*].pretty_name TAG as section_pretty_name, -TODO-> sections[*].name TAG as section_name,
  • pages[*].id TAG as page_id

Section on prefix = J::S: on JSON

  • id TAG as id,
  • pretty_name TAG as pretty_name SEPARATOR '/', -TODO-> name TAG as name SEPARATOR '/',
  • level_0 TAG as level_0, -TODO-> levels.level_0 TAG as level_0
  • level_1 TAG as level_1,
  • level_2 TAG as level_2,
  • level_3 TAG as level_3,
  • level_4 TAG as level_4,
  • last_visited NUMERIC as last_visited SORTABLE true
  • website.id TAG as website_id,
  • website.name TAG as website_name

Page on prefix = J::P: on JSON

  • id TAG as id,
  • url TEXT as url,
  • metadata.title TEXT as title,
  • metadata.kicker TEXT as kicker,
  • last_visited NUMERIC as last_visited,
  • website.id TAG as website_id,
  • website.name TAG as website_name,
  • section.id TAG as section_id,
  • section.pretty_name TAG as section_pretty_name SEPARATOR '/',
  • section.levels.level_0 TAG as section_level_0,
  • section.levels.level_1 TAG as section_level_1,
  • section.levels.level_2 TAG as section_level_2,
  • section.levels.level_3 TAG as section_level_3,
  • section.levels.level_4 TAG as section_level_4
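
The Page index above could be expressed as a raw FT.CREATE command, roughly as sketched below. The index name idx:page and the helper ft_create_args are assumptions made for illustration. Note that FT.CREATE expects each field as "path AS alias TYPE", which is a different order from the shorthand used in the lists above.

```python
def ft_create_args(index_name, prefix, fields):
    """Build the argument list of an FT.CREATE command on JSON documents."""
    args = ["FT.CREATE", index_name, "ON", "JSON", "PREFIX", "1", prefix, "SCHEMA"]
    for path, alias, field_type, options in fields:
        # JSONPath of the attribute, its alias, its type, then any options
        args += ["$." + path, "AS", alias, field_type, *options]
    return args


# (path, alias, type, options) for each attribute of the Page index
PAGE_FIELDS = [
    ("id", "id", "TAG", []),
    ("url", "url", "TEXT", []),
    ("metadata.title", "title", "TEXT", []),
    ("metadata.kicker", "kicker", "TEXT", []),
    ("last_visited", "last_visited", "NUMERIC", []),
    ("website.id", "website_id", "TAG", []),
    ("website.name", "website_name", "TAG", []),
    ("section.id", "section_id", "TAG", []),
    ("section.pretty_name", "section_pretty_name", "TAG", ["SEPARATOR", "/"]),
] + [
    (f"section.levels.level_{n}", f"section_level_{n}", "TAG", [])
    for n in range(5)
]

# "idx:page" is a hypothetical index name
args = ft_create_args("idx:page", "J::P:", PAGE_FIELDS)
print(" ".join(args))
```

The resulting list could be sent with redis-py as r.execute_command(*args), assuming the server has the search module loaded.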

Redis TimeSeries

Website

  • ts_name: 5MINUTES, 10MINUTES, etc.
  • dimension: W, S, P
  • M: PG, UD
  • website_id: dbc102621029a62c7102fb1f62c362f2ad162646c2e, ...
  • name: RTL, ...

Section

  • ts_name: 5MINUTES, 10MINUTES, etc.
  • dimension: W, S, P
  • M: PG, UD
  • section_id: 46f3112310246f1027a1029e4dbeb63f14a162e1102a, ...
  • pretty_name: meenung/carte-blanche, ... -TODO-> section_name: meenung/carte-blanche
  • website_id: dbc102621029a62c7102fb1f62c362f2ad162646c2e, ...
  • website_name: RTL, ...

Page

  • ts_name: 5MINUTES, 10MINUTES, etc.
  • dimension: W, S, P
  • M: PG, UD
  • page_id: 6271022b62cd179ca7afa1eebc710271024ed2bff102, ...
  • website_id: dbc102621029a62c7102fb1f62c362f2ad162646c2e, ...
  • website_name: RTL, ...
  • section_id: 46f3112310246f1027a1029e4dbeb63f14a162e1102a, ...
  • section_pretty_name: meenung/carte-blanche, ... -TODO-> section_name: meenung/carte-blanche
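
The labels listed above map naturally onto the LABELS clause of TS.CREATE and onto the FILTER clause of TS.MRANGE. The following sketch only builds the command arguments; the helper names are hypothetical, and the key layout reuses the naming convention from the start of this document.

```python
def ts_create_args(key, labels):
    """Build a TS.CREATE command that attaches the given labels to the series."""
    args = ["TS.CREATE", key, "LABELS"]
    for name, value in labels.items():
        args += [name, str(value)]
    return args


def ts_mrange_args(start, end, **filters):
    """Build a TS.MRANGE command selecting series by label=value filters."""
    args = ["TS.MRANGE", str(start), str(end), "FILTER"]
    args += [f"{name}={value}" for name, value in filters.items()]
    return args


page_id = "6271022b62cd179ca7afa1eebc710271024ed2bff102"
page_labels = {
    "ts_name": "5MINUTES",
    "dimension": "P",
    "M": "PG",
    "page_id": page_id,
    "website_id": "dbc102621029a62c7102fb1f62c362f2ad162646c2e",
    "website_name": "RTL",
    "section_id": "46f3112310246f1027a1029e4dbeb63f14a162e1102a",
    "section_pretty_name": "meenung/carte-blanche",
}

create = ts_create_args(f"TS:5MINUTES:P:{page_id}::PG", page_labels)
# All 5-minute pageview series of pages belonging to the RTL website
query = ts_mrange_args("-", "+", ts_name="5MINUTES", dimension="P", M="PG", website_name="RTL")
print(create)
print(query)
```

Either list could be passed to redis-py with r.execute_command(*create), assuming the TimeSeries module is loaded on the server.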

Pythonic Redis Backend

Build Python project

Change the version in pyproject.toml, delete the files in /dist, then run:

python3 -m build

Upload Python package

python3 -m twine upload --repository testpypi dist/*
python3 -m twine upload dist/*

Update Local Python Package

pip install rgtracker==0.0.1.1.102

Run RedisGears Jobs

python src/jobs/produce.py
python src/jobs/create_requirements.py

gears-cli run --host localhost --port 6379 src/jobs/bigbang.py REQUIREMENTS rgtracker==0.0.1.1.102
gears-cli run --host localhost --port 6379 src/jobs/bigbang.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas requests

gears-cli run --host localhost --port 6379 src/jobs/rotate_pageviews/rotate_pg_website.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas
gears-cli run --host localhost --port 6379 src/jobs/rotate_pageviews/rotate_pg_section.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas
gears-cli run --host localhost --port 6379 src/jobs/rotate_pageviews/rotate_pg_page.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas

gears-cli run --host localhost --port 6379 src/jobs/rotate_unique_devices/rotate_ud_website.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas
gears-cli run --host localhost --port 6379 src/jobs/rotate_unique_devices/rotate_ud_section.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas
gears-cli run --host localhost --port 6379 src/jobs/rotate_unique_devices/rotate_ud_page.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas
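
The six rotate commands above differ only in the metric folder and the dimension, so they could be generated rather than typed out. This is a sketch of one way to do that; rotate_commands is a hypothetical helper, not a script shipped with the package.

```python
import shlex


def rotate_commands(version, host="localhost", port=6379):
    """Return the gears-cli argument lists for the six rotate jobs."""
    metrics = [("rotate_pageviews", "pg"), ("rotate_unique_devices", "ud")]
    dimensions = ["website", "section", "page"]
    commands = []
    for folder, short in metrics:
        for dimension in dimensions:
            commands.append([
                "gears-cli", "run",
                "--host", host,
                "--port", str(port),
                f"src/jobs/{folder}/rotate_{short}_{dimension}.py",
                "REQUIREMENTS", f"rgtracker=={version}", "pandas",
            ])
    return commands


commands = rotate_commands("0.0.1.1.102")
for command in commands:
    # shlex.join (Python 3.8+) renders the list as a shell command line
    print(shlex.join(command))
```

To actually run the jobs, each list could be passed to subprocess.run(command, check=True) on a machine where gears-cli is installed.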

gears-cli run --host localhost --port 6379 src/jobs/enrich.py REQUIREMENTS rgtracker==0.0.1.1.102 pandas requests

Notes

https://stackoverflow.com/questions/2210210262/how-to-apply-hyperloglog-to-a-timeseries-stream
https://redis.com/blog/7-redis-worst-practices/
https://redis.com/blog/streaming-analytics-with-probabilistic-data-structures/
https://findwork.dev/blog/advanced-usage-python-requests-timeouts-retries-hooks/
https://www.peterbe.com/plog/best-practice-with-retries-with-requests


