Good Kiwi Common Library

good-common

A small set of common dependencies for Good Kiwi.

Dependency Provider

BaseProvider is a base class for creating dependency providers built on fast_depends, which makes them compatible with FastAPI and FastStream.

class APIClient:
    def __init__(self, api_key: str):
        self.api_key = api_key

    def get(self, url: str):
        return f"GET {url} with {self.api_key}"

class APIClientProvider(BaseProvider[APIClient], APIClient):
    pass


from fast_depends import inject

@inject
def some_task(
    api_client: APIClient = APIClientProvider(api_key="1234"),
):
    return api_client.get("https://example.com")

A provider can also be used without fast_depends:

client = APIClientProvider(api_key="1234").get()

Override initializer to customize how the dependency class is initialized.

import typing

class APIClientProvider(BaseProvider[APIClient], APIClient):
    def initializer(
        self,
        cls_args: typing.Tuple[typing.Any, ...],  # args passed to the Provider
        cls_kwargs: typing.Dict[str, typing.Any],  # kwargs passed to the Provider
        fn_kwargs: typing.Dict[str, typing.Any],  # kwargs passed to the function at runtime
    ):
        return cls_args, {**cls_kwargs, **fn_kwargs}  # override the api_key with the one passed to the function


@inject
def some_task(
    api_key: str,
    api_client: APIClient = APIClientProvider(),
):
    return api_client.get("https://example.com")


some_task(api_key="5678")
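
The override works because of how Python merges dictionaries: when the same key appears in both, the later mapping wins. A small standalone illustration with made-up values (the timeout key is hypothetical and only there to show that unrelated keys survive the merge):

```python
# Keyword arguments given when the provider was created.
cls_kwargs = {"api_key": "1234", "timeout": 5}
# Keyword arguments the injected function received at call time.
fn_kwargs = {"api_key": "5678"}

# Later mappings win on duplicate keys, so fn_kwargs overrides cls_kwargs.
merged = {**cls_kwargs, **fn_kwargs}
print(merged)  # {'api_key': '5678', 'timeout': 5}
```

Swapping the order to {**fn_kwargs, **cls_kwargs} would instead make the provider-level value take precedence.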

Pipeline

Overview

The Pipeline library provides a flexible and efficient way to create and execute pipelines of components in Python. It supports both synchronous and asynchronous execution, type checking, parallel processing, and error handling.

Features

  • Create pipelines with multiple components that can accept multiple inputs and produce multiple outputs
  • Typed "channels" for passing data between components
  • Support for both synchronous and asynchronous components
  • Type checking for inputs and outputs using Python type annotations
  • Parallel execution of pipeline instances
  • Error handling with Result types
  • Function mapping for flexible component integration

Quick Start

import asyncio
from typing import Annotated

from good_common.pipeline import Pipeline, Attribute

def add(a: int, b: int) -> Annotated[int, Attribute("result")]:
    return a + b

def multiply(result: int, factor: int) -> Annotated[int, Attribute("result")]:
    return result * factor

# Create a pipeline
my_pipeline = Pipeline(add, multiply)

async def main():
    # Execute the pipeline; the call must be awaited
    result = await my_pipeline(a=2, b=3, factor=4)
    print(result.result)  # Output: 20

asyncio.run(main())
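
To make the data flow concrete, here is a standard-library-only sketch of the idea behind the Quick Start: each component reads its parameters by name from a shared state, and its return value is stored under the name given in its Attribute annotation. The Attribute class and run_pipeline function below are simplified stand-ins written for this illustration; they are not good-common's implementation and skip type checking and error handling.

```python
import asyncio
import inspect
from typing import Annotated, get_type_hints


class Attribute:
    """Stand-in marker naming the output of a component (illustration only)."""

    def __init__(self, name: str):
        self.name = name


async def run_pipeline(components, **inputs):
    state = dict(inputs)
    for component in components:
        # Pick the arguments this component asks for, by parameter name.
        params = inspect.signature(component).parameters
        value = component(**{name: state[name] for name in params})
        # Support both synchronous and asynchronous components.
        if inspect.isawaitable(value):
            value = await value
        # Store the result under the name from the Annotated return type.
        return_hint = get_type_hints(component, include_extras=True)["return"]
        state[return_hint.__metadata__[0].name] = value
    return state


def add(a: int, b: int) -> Annotated[int, Attribute("result")]:
    return a + b


async def multiply(result: int, factor: int) -> Annotated[int, Attribute("result")]:
    return result * factor


state = asyncio.run(run_pipeline([add, multiply], a=2, b=3, factor=4))
print(state["result"])  # 20
```

Because both components write to "result", the second one sees the first one's output and then overwrites it, which is why the final value is (2 + 3) * 4.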

Usage

Creating a Pipeline

Use the Pipeline class to create a new pipeline:

from good_common.pipeline import Pipeline

my_pipeline = Pipeline(component1, component2, component3)

Defining Components

Components can be synchronous or asynchronous functions:

import asyncio
from typing import Annotated

from good_common.pipeline import Attribute

def sync_component(x: int) -> Annotated[int, Attribute("result")]:
    return x + 1

async def async_component(x: int) -> Annotated[int, Attribute("result")]:
    await asyncio.sleep(0.1)
    return x * 2

Executing a Pipeline

Calling a pipeline must be awaited, so execute it from within an async function:

result = await my_pipeline(x=5)
print(result.result)

Parallel Execution

Execute a pipeline with multiple inputs in parallel:

inputs = [{"a": 1, "b": 2, "factor": 2}, {"a": 2, "b": 3, "factor": 3}]
results = [result async for result in my_pipeline.execute(*inputs, max_workers=3)]

for result in results:
    if result.is_ok():
        print(result.unwrap().result)
    else:
        print(f"Error: {result.unwrap_err()}")

Error Handling

In parallel execution, an exception raised for one input is returned as an error Result for that input rather than being raised to the caller:

def faulty_component(x: int) -> Annotated[int, Attribute("result")]:
    if x == 2:
        raise ValueError("Error on purpose!")
    return x + 1

pipeline = Pipeline(faulty_component)
inputs = [{"x": 1}, {"x": 2}, {"x": 3}]
results = [result async for result in pipeline.execute(*inputs)]

for result in results:
    if result.is_ok():
        print(result.unwrap().result)
    else:
        print(f"Error: {result.unwrap_err()}")
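
The pattern behind the two sections above (bounded parallelism plus one result per input) can be sketched with asyncio alone. This is an illustration under assumptions, not the library's code: Ok, Err, and run_all are hypothetical names, and the sketch returns a list in input order rather than an async iterator.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class Ok:
    value: Any

    def is_ok(self) -> bool:
        return True


@dataclass
class Err:
    error: Exception

    def is_ok(self) -> bool:
        return False


async def run_all(fn, inputs, max_workers: int = 3):
    # The semaphore caps how many calls run at the same time.
    semaphore = asyncio.Semaphore(max_workers)

    async def run_one(kwargs):
        async with semaphore:
            try:
                return Ok(await fn(**kwargs))
            except Exception as exc:
                # Capture the failure instead of cancelling the other runs.
                return Err(exc)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(kwargs) for kwargs in inputs))


async def faulty(x: int) -> int:
    if x == 2:
        raise ValueError("Error on purpose!")
    return x + 1


results = asyncio.run(run_all(faulty, [{"x": 1}, {"x": 2}, {"x": 3}]))
for result in results:
    print(result.value if result.is_ok() else f"Error: {result.error}")
```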

Function Mapping

Use function_mapper to adjust input parameter names:

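from typing import Annotated

from good_common.pipeline import Attribute, Pipeline, function_mapper

def multiply_diff(difference: int, factor: int) -> Annotated[int, Attribute("result")]:
    return difference * factor

# `subtract` is assumed to be a component defined elsewhere
pipeline = Pipeline(subtract, function_mapper(multiply_diff, diff="difference"))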
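
Conceptually, this kind of mapping is a thin wrapper that renames keyword arguments before calling the wrapped function. The sketch below uses a hypothetical helper named rename_params and assumes the keyword is the incoming name and the value is the function's own parameter name; the real function_mapper may interpret its arguments differently.

```python
import functools


def rename_params(fn, **mapping):
    """Return a wrapper that renames incoming keyword arguments.

    mapping is incoming_name="function_parameter_name" (assumed direction).
    """

    @functools.wraps(fn)
    def wrapper(**kwargs):
        renamed = {mapping.get(name, name): value for name, value in kwargs.items()}
        return fn(**renamed)

    return wrapper


def multiply_diff(difference: int, factor: int) -> int:
    return difference * factor


mapped = rename_params(multiply_diff, diff="difference")
print(mapped(diff=3, factor=4))  # 12
```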

Advanced Features

  • Mixed synchronous and asynchronous components in a single pipeline
  • Custom output types with Attribute annotations
  • Flexible error handling in both single and parallel executions

URL Plugin System

The URL class in good-common now supports a plugin system for extending URL processing capabilities without modifying the core library.

Features

  • Extend URL canonicalization rules
  • Add custom tracking parameters to filter
  • Define domain-specific processing rules
  • Add URL classification patterns
  • Register short URL providers and bio link domains
  • Apply custom URL transformations

Built-in Plugins

Good-common includes several built-in plugins for common use cases:

ECommerceURLPlugin

Handles e-commerce website URLs (Amazon, eBay, Etsy, AliExpress, etc.)

  • Removes tracking parameters like ref, hash, _trkparms
  • Preserves product identifiers and search parameters
  • Transforms mobile URLs to desktop versions
  • Classifies product pages, search results, shopping carts

AnalyticsTrackingPlugin

Removes analytics and tracking parameters from all major platforms

  • Google Analytics (utm_*, gclid, etc.)
  • Facebook (fbclid, fb_*)
  • Microsoft/Bing (msclkid)
  • Email marketing (mc_cid, _hsenc, mkt_tok)
  • Social media tracking parameters
  • Preserves content identifiers and navigation parameters

VideoStreamingPlugin

Handles video platform URLs (YouTube, Vimeo, Twitch, etc.)

  • Removes tracking parameters like feature, ab_channel
  • Preserves video IDs, timestamps, and playlist information
  • Transforms mobile YouTube URLs to desktop
  • Classifies video pages, channels, playlists

SearchEnginePlugin

Processes search engine URLs (Google, Bing, DuckDuckGo)

  • Removes search tracking parameters (ved, ei, source)
  • Preserves search queries and result types
  • Overrides built-in disable rules for Google
  • Classifies different search types (images, videos, maps)

DocumentSharingPlugin

Handles document and cloud storage platforms (Google Drive/Docs, Dropbox, Box)

  • Removes sharing tracking parameters (usp, dl, raw)
  • Preserves document identifiers and view settings
  • Classifies different document types

Using Built-in Plugins

from good_common.types import URL
from good_common.types.builtin_plugins import load_builtin_plugins

# Load all built-in plugins
load_builtin_plugins()

# Load specific plugins only
load_builtin_plugins(["ecommerce", "analytics", "video"])

# Use enhanced URL processing
url = URL("https://www.amazon.com/dp/B123?ref=sr&utm_source=google")
canonical = url.canonicalize()  # Removes both ref and utm_source
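
For readers unfamiliar with tracking-parameter removal, the underlying operation can be sketched with urllib.parse alone. This is only an illustration of the technique: the strip_tracking helper and the short parameter lists are assumptions for the example, and the plugins' actual rules are more extensive.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative subsets only; not the plugins' real rule sets.
TRACKING_PARAMS = {"ref", "fbclid", "gclid", "msclkid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters that were kept.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://www.amazon.com/dp/B123?ref=sr&utm_source=google"))
# https://www.amazon.com/dp/B123
```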

Creating a Plugin

import re
from typing import Any, Dict, Optional, Pattern, Set

from good_common.types import URLPlugin

class MyURLPlugin(URLPlugin):
    def get_tracking_params(self) -> Set[str]:
        """Additional tracking parameters to remove during canonicalization."""
        return {"my_tracking_id", "custom_ref"}
    
    def get_canonical_params(self) -> Set[str]:
        """Parameters that should be preserved."""
        return {"article_id", "product_id"}
    
    def get_domain_rules(self) -> Dict[str, Dict[str, Any]]:
        """Domain-specific canonicalization rules."""
        return {
            r".*\.mysite\.com": {
                "canonical": {"id", "page"},
                "non_canonical": {"session", "temp"},
                "force_www": True,
            }
        }
    
    def get_short_url_providers(self) -> Set[str]:
        """Additional short URL domains."""
        return {"mylink.co", "short.link"}
    
    def get_classification_patterns(self) -> Dict[str, Pattern]:
        """Custom URL classification patterns."""
        return {
            "product_page": re.compile(r"/products?/[\w-]+"),
            "category_page": re.compile(r"/categor(y|ies)/[\w-]+"),
        }
    
    def transform_url(self, url: 'URL', config: 'UrlParseConfig') -> Optional['URL']:
        """Apply custom URL transformations."""
        from good_common.types import URL
        
        # Example: Rewrite mobile URLs to desktop
        if url.host == "m.mysite.com":
            return URL.build(
                scheme="https",
                host="www.mysite.com",
                path=url.path,
                query=url.query_params(format="plain", flat_delimiter=","),
            )
        return None
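
The classification patterns in the plugin above are ordinary compiled regular expressions. As a rough standalone sketch of how such patterns could be evaluated against a URL path (the classify_path helper is hypothetical, and the library may match against a different part of the URL):

```python
import re
from urllib.parse import urlsplit

# Same patterns as in get_classification_patterns above.
PATTERNS = {
    "product_page": re.compile(r"/products?/[\w-]+"),
    "category_page": re.compile(r"/categor(y|ies)/[\w-]+"),
}


def classify_path(url: str) -> dict:
    path = urlsplit(url).path
    # A label is True when its pattern is found anywhere in the path.
    return {label: bool(pattern.search(path)) for label, pattern in PATTERNS.items()}


print(classify_path("https://www.mysite.com/products/blue-widget"))
# {'product_page': True, 'category_page': False}
```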

Registering Plugins

Method 1: Entry Points (Recommended for Packages)

Add to your package's pyproject.toml:

[project.entry-points."good_common.url_plugins"]
my_plugin = "my_package.plugins:MyURLPlugin"
social_media = "my_package.plugins:SocialMediaPlugin"

Plugins registered via entry points are automatically loaded when the good-common module is imported.

Method 2: Direct Registration

from good_common.types import URL, URLPlugin

class MyPlugin(URLPlugin):
    # ... implementation ...
    pass

# Register at class level
URL.register_plugin(MyPlugin())

# Or use the global registry
from good_common.types import url_plugin_registry
url_plugin_registry.register(MyPlugin())

Method 3: Runtime Registration

from good_common.types import URL

# Create and register a plugin at runtime
plugin = MyURLPlugin()
URL.register_plugin(plugin)

# Use the enhanced URL functionality
url = URL("https://example.com/page?my_tracking_id=123&article_id=456")
canonical = url.canonicalize()  # my_tracking_id will be removed, article_id preserved

# Check custom classifications
classifications = url.classify()
if classifications.get("product_page"):
    print("This is a product page")

# Unregister when done
URL.unregister_plugin(plugin)

Example Plugins

The library includes example plugins in good_common.types.example_plugin:

  • SocialMediaURLPlugin: Handles social media specific parameters and transformations
  • NewsMediaURLPlugin: Manages news site tracking parameters and classifications

from good_common.types import URL
from good_common.types.example_plugin import SocialMediaURLPlugin

# Use the pre-built social media plugin
plugin = SocialMediaURLPlugin()
URL.register_plugin(plugin)

# Now URLs from social media sites will be processed with specialized rules
url = URL("https://instagram.com/p/ABC123?igshid=tracker")
canonical = url.canonicalize()  # igshid parameter will be removed

Performance Considerations

  • Plugins are designed with minimal overhead (<10% when registered)
  • Plugin data is cached for efficiency
  • Lazy loading ensures plugins only impact performance when used
  • Use entry points for automatic loading or register manually for fine control

Utilities

Various utility functions for common tasks.

See /tests/good_common/utilities for usage examples.

Project details

Version: 1.5.1

No source distribution is available for this release. Built distributions:

  • good_common-1.5.1-cp314-cp314-win_amd64.whl (894.6 kB): CPython 3.14, Windows x86-64
  • good_common-1.5.1-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (2.3 MB): CPython 3.14, manylinux (glibc 2.17+, glibc 2.28+) x86-64
  • good_common-1.5.1-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (2.3 MB): CPython 3.14, manylinux (glibc 2.17+, glibc 2.28+) ARM64
  • good_common-1.5.1-cp314-cp314-macosx_10_15_universal2.whl (1.2 MB): CPython 3.14, macOS 10.15+ universal2 (ARM64, x86-64)
  • good_common-1.5.1-cp313-cp313-win_amd64.whl (881.3 kB): CPython 3.13, Windows x86-64
  • good_common-1.5.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (2.3 MB): CPython 3.13, manylinux (glibc 2.17+, glibc 2.28+) x86-64
  • good_common-1.5.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (2.3 MB): CPython 3.13, manylinux (glibc 2.17+, glibc 2.28+) ARM64
  • good_common-1.5.1-cp313-cp313-macosx_10_13_universal2.whl (1.2 MB): CPython 3.13, macOS 10.13+ universal2 (ARM64, x86-64)
