Resource Segmentation

A Python library for intelligently grouping and segmenting resources with configurable overlap and boundary conditions.

Overview

Resource Segmentation provides a flexible way to group resources based on their properties and constraints. It supports:

  • Hierarchical segmentation: Resources can be grouped into segments based on boundary levels
  • Intelligent grouping: Groups resources with configurable maximum counts and overlap ratios
  • Streaming processing: Handles large datasets efficiently with iterator-based processing
  • Flexible boundary conditions: Supports integer-based boundary levels for segmentation control

Installation

pip install resource-segmentation

Core Concepts

Resources

Resources are the basic units that contain:

  • count: The quantity/weight of the resource
  • start_incision: The boundary level at the start (integer)
  • end_incision: The boundary level at the end (integer)
  • payload: Generic data associated with the resource

Segments

Segments are collections of resources that can be grouped together based on compatible boundary levels.

Groups

Groups are the final output containing:

  • head: Optional overlapping resources from the previous group (automatically truncated)
  • body: Main resources in this group
  • tail: Optional overlapping resources shared with the next group (automatically truncated)
  • head_remain_count/tail_remain_count: Actual count in head/tail after automatic truncation

Gap Truncation: The library automatically truncates head and tail to optimize overlap:

  • head is truncated by walking from its back to its front, so the resources closest to body are kept
  • tail is truncated by walking from its front to its back, so the resources closest to body are kept
  • This bounds the size of each group while maintaining the necessary overlap between groups
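
The truncation direction described above can be illustrated with a small standalone sketch. It is not the library's implementation: `Item`, `truncate_head`, `truncate_tail`, and the `budget` argument are hypothetical names, and the sketch assumes whole items are either kept or dropped. The real implementation may differ, for example in how it treats a resource that only partly fits.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for a resource; only `count` matters here."""
    count: int
    payload: str


def truncate_head(head: list[Item], budget: int) -> list[Item]:
    """Keep the longest suffix of `head` whose total count fits in `budget`."""
    kept: list[Item] = []
    remaining = budget
    for item in reversed(head):  # start with the item next to the body
        if item.count > remaining:
            break
        kept.append(item)
        remaining -= item.count
    kept.reverse()  # restore the original order
    return kept


def truncate_tail(tail: list[Item], budget: int) -> list[Item]:
    """Keep the longest prefix of `tail` whose total count fits in `budget`."""
    kept: list[Item] = []
    remaining = budget
    for item in tail:  # start with the item next to the body
        if item.count > remaining:
            break
        kept.append(item)
        remaining -= item.count
    return kept


items = [Item(60, "a"), Item(60, "b"), Item(60, "c")]
print([i.payload for i in truncate_head(items, 130)])  # ['b', 'c']
print([i.payload for i in truncate_tail(items, 130)])  # ['a', 'b']
```

In both cases the item furthest from the body is the first to be dropped once the budget runs out.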

Usage Examples

Basic Resource Grouping

from resource_segmentation import split, Resource

# Create sample resources
resources = [
    Resource(100, 0, 0, 0),
    Resource(100, 0, 0, 1),
    Resource(100, 0, 0, 2),
    Resource(100, 0, 0, 3),
    Resource(100, 0, 0, 4),
]

# Group resources with max 400 per group and 25% overlap
groups = list(split(
    resources=iter(resources),
    max_segment_count=400,
    border_incision=0,
    gap_rate=0.25,
    tail_rate=0.5
))

# Process groups
for i, group in enumerate(groups):
    print(f"Group {i}:")
    print(f"  Body: {len(group.body)} items, total count: {sum(item.count for item in group.body)}")
    print(f"  Head: {len(group.head)} items (count: {group.head_remain_count})")
    print(f"  Tail: {len(group.tail)} items (count: {group.tail_remain_count})")

Segment-based Grouping

from resource_segmentation import split, Resource, Segment

# Resources with different incision levels
resources = [
    Resource(100, 0, 0, 0),
    Resource(100, 0, 1, 0),
    Resource(100, 1, 1, 0),
    Resource(100, 1, 0, 0),
    Resource(100, 0, 0, 0),
]

# The middle three resources will be grouped into a segment
groups = list(split(
    resources=iter(resources),
    max_segment_count=1000,
    border_incision=0,
    gap_rate=0.0  # No overlap
))

Handling Large Resources

from resource_segmentation import split, Resource

# Mix of small and large resources
resources = [
    Resource(100, 0, 0, 0),
    Resource(300, 0, 0, 1), # Large resource
    Resource(100, 0, 0, 2),
    Resource(100, 0, 0, 3),
]

# Max 400 per group; with gap_rate=0.25 the body limit is 400 - 2 * 100 = 200, so the
# 300-count resource exceeds it on its own. split() handles this case itself.
groups = list(split(
    resources=iter(resources),
    max_segment_count=400,
    border_incision=0,
    gap_rate=0.25,
    tail_rate=0.5
))

Custom Overlap Distribution

from resource_segmentation import split, Resource

resources = [
    Resource(400, 0, 0, 0),
    Resource(200, 0, 0, 1),
    Resource(400, 0, 0, 2),
]

# Distribute overlap mostly to tail (80% tail, 20% head)
groups = list(split(
    resources=iter(resources),
    max_segment_count=400,
    border_incision=0,
    gap_rate=0.25,
    tail_rate=0.8  # 80% to tail
))

# All overlap to tail
groups = list(split(
    resources=iter(resources),
    max_segment_count=400,
    border_incision=0,
    gap_rate=0.25,
    tail_rate=1.0  # 100% to tail
))

API Reference

Main Function

split(resources, max_segment_count, border_incision, gap_rate=0.0, tail_rate=0.5)

Splits a stream of resources into groups, subject to the size and overlap constraints below.

Parameters:

  • resources (Iterator[Resource[P]]): Iterator of resources to group
  • max_segment_count (int): Maximum total count per group (head, body, and tail combined)
  • border_incision (int): Border incision level for segmentation
  • gap_rate (float, optional): Overlap ratio between groups (0.0-1.0). Default: 0.0
    • The gap (overlap) is calculated as floor(max_segment_count * gap_rate)
    • The body max count is max_segment_count - gap * 2
  • tail_rate (float, optional): Distribution ratio for overlap (0.0-1.0). Default: 0.5
    • 0.0 means all overlap goes to head, 1.0 means all overlap goes to tail
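
As a worked example of the two formulas quoted for gap_rate, the sketch below computes the gap and the body limit for the values used in the usage examples. `overlap_budget` is a hypothetical helper written for this illustration, not part of the library, and it deliberately leaves out how tail_rate divides the overlap between head and tail.

```python
import math


def overlap_budget(max_segment_count: int, gap_rate: float) -> tuple[int, int]:
    """Return (gap, body_max) using the formulas from the parameter list."""
    gap = math.floor(max_segment_count * gap_rate)
    body_max = max_segment_count - gap * 2
    return gap, body_max


gap, body_max = overlap_budget(400, 0.25)
print(gap, body_max)  # 100 200
```

With the default gap_rate of 0.0 the gap is 0, so the body may use the full max_segment_count.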

Yields:

  • Group[P]: Grouped resources with head, body, tail sections (head and tail are automatically truncated)

Data Types

Resource[P]

@dataclass
class Resource(Generic[P]):
    count: int                   # Resource quantity
    start_incision: int          # Start boundary level
    end_incision: int            # End boundary level
    payload: P                   # Associated data

Segment[P]

@dataclass
class Segment(Generic[P]):
    count: int                   # Total count of contained resources
    resources: list[Resource[P]] # List of resources in segment

Group[P]

@dataclass
class Group(Generic[P]):
    head_remain_count: int                   # Actual count in head after truncation
    tail_remain_count: int                   # Actual count in tail after truncation
    head: list[Resource[P] | Segment[P]]     # Head section (overlap, truncated)
    body: list[Resource[P] | Segment[P]]     # Main body section
    tail: list[Resource[P] | Segment[P]]     # Tail section (overlap, truncated)
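
Because head, body, and tail can each mix Resource and Segment items, consumers usually need to unpack segments before reading payloads. The following is a minimal sketch of one way to do that. `iter_resources` is a hypothetical helper, and the two dataclasses are local stand-ins that mirror the field layout documented above so the snippet runs without the library installed.

```python
from dataclasses import dataclass
from typing import Generic, Iterator, TypeVar, Union

P = TypeVar("P")


# Local stand-ins mirroring the documented fields (assumption: the real
# classes can be told apart with isinstance in the same way).
@dataclass
class Resource(Generic[P]):
    count: int
    start_incision: int
    end_incision: int
    payload: P


@dataclass
class Segment(Generic[P]):
    count: int
    resources: list[Resource[P]]


def iter_resources(
    items: list[Union[Resource[P], Segment[P]]],
) -> Iterator[Resource[P]]:
    """Yield every resource in order, unpacking segments along the way."""
    for item in items:
        if isinstance(item, Segment):
            yield from item.resources
        else:
            yield item


body = [
    Resource(100, 0, 0, "a"),
    Segment(200, [Resource(100, 0, 1, "b"), Resource(100, 1, 0, "c")]),
]
print([r.payload for r in iter_resources(body)])  # ['a', 'b', 'c']
```

The same helper could be applied to group.head and group.tail, since they share the element type of group.body.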

Boundary Levels

The library uses integer boundary levels to determine how resources can be segmented. Higher values indicate stronger boundary conditions.

Development

Setup

First, install dependencies using Poetry:

poetry install

Testing

Run the test suite:

python test.py

License

This project is licensed under the MIT License.
