Segmenting (classification + caching) for Keble positioning grid.
keble-segmenting
Segmenting, masking, cloning, aggregation, and agentic mutation runtime for Keble positioning grids.
Current package line:
- version: 0.11.12
- python: >=3.13,<3.14
Version 0.11.12 Coverage And Prompt Context Remediation
- Coverage source signatures now include current mask definitions, mask options, and preset mask type values. Changing mask semantics invalidates stale coverage even when grid mask ids stay unchanged.
- Coverage phase completion now records completed_envelope_ids, rejects duplicate completion events for the same envelope, and avoids incrementing already COMPLETE or FAILED phase rows.
- Prompt context policy fields are enum-backed while preserving their serialized values, and market-channel source URLs stay in registered references instead of prompt prose.
- Both Hatch and Poetry metadata now declare the same current package version.
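The signature rule above can be illustrated with a small sketch. It is not the package's implementation: `build_source_signature` and the payload shape are hypothetical, and the only point shown is that hashing mask semantics together with mask ids makes a semantic edit change the signature.

```python
import hashlib
import json


def build_source_signature(mask_ids, mask_definitions):
    """Hash mask ids together with their current semantics (hypothetical helper)."""
    payload = {
        "mask_ids": sorted(mask_ids),
        # Definitions, options, and preset type are part of the hashed payload,
        # so a semantic edit changes the digest even if the ids do not change.
        "masks": {mid: mask_definitions[mid] for mid in sorted(mask_definitions)},
    }
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


before = build_source_signature(
    ["m1"],
    {"m1": {"description": "premium", "options": ["yes", "no"], "preset_type": None}},
)
after = build_source_signature(
    ["m1"],
    {"m1": {"description": "luxury", "options": ["yes", "no"], "preset_type": None}},
)
print(before != after)  # same mask id, different semantics
```

A coverage row stored under `before` would no longer match once the mask description changes, which is the invalidation behavior described above.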
Version 0.11.10 Dimension Option Rebucketing
- UPDATE_DIMENSION_OPTIONS is documented as membership-preserving only. Use it for typo, copy, wording, niche, or role changes when existing item memberships remain valid.
- Price-band split, merge, and rebucketing flows must retire old option keys, create fresh buckets, and then run AUTO_SEGMENT_AND_MASK for the affected dimension.
- Deleting dimension options now preserves historical SegmentedResult rows in storage. Current aggregations ignore retired option keys, so old rows remain fetchable for audit/history without affecting current positioning cells.
Version 0.11.9 Prompt Context And References
- Preset-mask prompts keep user-facing evidence in the requested output language while allowing external tool queries in English or the target marketplace language when that gives better source coverage.
- Prompt-owned context now tells agents not to copy helper labels, dimension keys, option keys, cell signatures, or machine-style key=value statistics into evidence text.
- Prior preset and parent-cell source digests now render as readable market context, with broad reusable references kept behind exact product/cell/source evidence.
Version 0.11.8 Rare Option Coverage
- Bootstrap auto-segment planning now queues every discovered dimension for item segmentation, including custom and niche axes, so rare options receive SegmentedResult rows during the first run.
- Preset-mask work remains pricing anchored through the existing pricing+feature and pricing+scene target pairs; richer item coverage does not create extra preset-mask pair fanout.
- Dimension discovery, dimension merge, and single-dimension segmentation prompts now share context-backed option discovery guidance. Concrete market-relevant one-item options should stay STANDARD with niche=true instead of being hidden in OTHER.
Version 0.11.7 Exact-Cell Metric Estimates
- MARKET_DEMAND and POTENTIAL_SALES AI rows with numeric metric payloads must use area_type=CELLS with exactly one concrete cell. ROW and multi-cell CELLS outputs are rejected before storage expansion.
- Numeric preset prompts now require one result per exact cell and forbid copying one broad numeric range across multiple cells. Each estimate must account for the cell's own price band, feature/use-case niche, observed sample, empty-cell state, comparable markets, and seller profile.
- Normal custom masks and non-metric preset masks keep their existing ROW expansion behavior.
- Mock handler deps now carry the current prompt and queue context fields so focused regressions match the real agent dependency surface.
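The exact-cell rule can be sketched as a small validation step. The `Area` dataclass, its `cell_count` field, and `validate_metric_area` below are hypothetical stand-ins for illustration; only the ROW/CELLS distinction and the "exactly one cell" requirement come from the notes above.

```python
from dataclasses import dataclass
from enum import Enum


class AreaType(str, Enum):
    ROW = "ROW"
    CELLS = "CELLS"


@dataclass(frozen=True)
class Area:
    """Hypothetical simplified area: its type plus how many concrete cells it names."""

    area_type: AreaType
    cell_count: int


def validate_metric_area(area: Area, has_numeric_metric: bool) -> None:
    """Reject numeric metric rows that are not scoped to exactly one cell."""
    if not has_numeric_metric:
        return  # non-metric masks keep their ROW expansion behavior
    if area.area_type is not AreaType.CELLS or area.cell_count != 1:
        raise ValueError("numeric metric rows must use area_type=CELLS with exactly one cell")


# A single exact cell passes; a ROW with a numeric payload would raise ValueError.
validate_metric_area(Area(AreaType.CELLS, 1), has_numeric_metric=True)
```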
Version 0.11.6 Preset Mask Reference Placement
- Added MaskResultReferenceUsagePolicy so citation rules are owned by mask-result evidence and metric-range evidence, not generic prompt context.
- Preset MARKET_DEMAND and POTENTIAL_SALES prompts now explicitly place directly used source keys on headline result evidence and exact numeric metric evidence while keeping empty references valid when no external source directly supports the claim.
- Prompt-owned market-channel assumption sources are registered in the per-call SegmentingReferenceRegistry for preset mask prompts, so public assumptions can persist as normal web references instead of prose-only context.
- Downstream preset context now carries readable source labels from stored references and avoids raw reference keys, URLs, provider internals, enum names, and JSON in prior-demand or prior-sales evidence lines.
Version 0.11.5 References And Locale Rule
- Segmenting tool registrars now receive a per-call SegmentingReferenceRegistry. Tools register concrete references and return only ReferenceCandidate rows with reference_key, title, and snippet.
- LLM output schemas accept reference_keys and resolve them after generation. Unknown keys are logged and dropped so fabricated citations are not persisted or displayed.
- Mask results, metric ranges, relational reasonings, aggregate cells, and metric summaries now carry backward-compatible references: [] fields.
- Reference dedupe uses normalized URL, product identity, segmenting cell signature, or a type/title fallback.
- Preset MARKET_DEMAND, POTENTIAL_SALES, LIKELY_UNACHIEVABLE, and MARKET_OPPORTUNITY_HIGHLIGHT prompts include a strong output-language rule next to preset guidance so user-facing evidence follows the prompt locale.
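The "resolve after generation, drop unknown keys" rule can be sketched as follows. This is a simplified illustration, not the package's code: the registry is modeled as a plain dict, and `resolve_reference_keys` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("segmenting.references")  # hypothetical logger name


def resolve_reference_keys(reference_keys, registry):
    """Return registered references for known keys; log and drop unknown keys."""
    resolved = []
    for key in reference_keys:
        reference = registry.get(key)
        if reference is None:
            # A key the tools never registered is treated as a fabricated citation.
            logger.warning("dropping unknown reference key: %s", key)
            continue
        resolved.append(reference)
    return resolved


registry = {"ref_1": {"title": "Marketplace report", "url": "https://example.com/report"}}
resolved = resolve_reference_keys(["ref_1", "ref_made_up"], registry)
print(len(resolved))
```

Only `ref_1` survives, so nothing the model invented is persisted or displayed.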
Version 0.11.4 Auto-Mask Allowlist
- AutoSegmentAndMaskAction, SegmentingActionConfig, and SegmentingClient.aauto_segment_and_mask(...) now accept auto_mask_types: list[PresetMaskType] | None. None preserves the package default preset behavior, [] disables preset stages, and a non-empty list selects exactly those preset mask types.
- Selected preset stages always run in canonical dependency-safe order: demand, feasibility, potential sales, then opportunity.
- Opportunity context is best-effort from available demand and potential-sales rows. Feasibility remains supported and readable, but is included only when the resolved allowlist selected it and matching results exist.
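The three allowlist cases can be sketched as a small resolver. This is an illustration under assumptions: the enum below is a local stand-in for `PresetMaskType` with assumed wire values, `resolve_auto_mask_types` is a hypothetical name, and the package default is assumed here to be all four preset types.

```python
from enum import Enum


class PresetMaskType(str, Enum):
    # Local stand-in; member names follow the preset types named in this README.
    MARKET_DEMAND = "MARKET_DEMAND"
    LIKELY_UNACHIEVABLE = "LIKELY_UNACHIEVABLE"
    POTENTIAL_SALES = "POTENTIAL_SALES"
    MARKET_OPPORTUNITY_HIGHLIGHT = "MARKET_OPPORTUNITY_HIGHLIGHT"


# Canonical order: demand, feasibility, potential sales, then opportunity.
CANONICAL_ORDER = [
    PresetMaskType.MARKET_DEMAND,
    PresetMaskType.LIKELY_UNACHIEVABLE,
    PresetMaskType.POTENTIAL_SALES,
    PresetMaskType.MARKET_OPPORTUNITY_HIGHLIGHT,
]


def resolve_auto_mask_types(auto_mask_types):
    """None -> assumed default (all four); [] -> no preset stages; list -> exact selection."""
    if auto_mask_types is None:
        return list(CANONICAL_ORDER)
    selected = set(auto_mask_types)
    # Caller order is ignored; stages always come back in canonical order.
    return [mask_type for mask_type in CANONICAL_ORDER if mask_type in selected]


stages = resolve_auto_mask_types(
    [PresetMaskType.MARKET_OPPORTUNITY_HIGHLIGHT, PresetMaskType.MARKET_DEMAND]
)
print([stage.value for stage in stages])
```

Passing the opportunity type before the demand type still yields demand first, which keeps dependent stages behind the stages they read from.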
Current Branch Prompt Context Contract
- Branch fix/segmenting-prompt-context keeps package version metadata unchanged.
- SegmentingPromptContext is the typed prompt-owned contract for user prompt, language, marketplace, reference style, pricing policy, and off-scope policy. Client entry points and queued auto-segment jobs thread it through dimension discovery, mask discovery, normal mask matching, preset demand, and preset sales.
- Prompt rules now require off-scope products to route to OTHER or omission, pricing options to prefer numeric ranges, demand to mean target-market monthly demand across channels, and sales to mean launchable monthly sales for the target segment.
- Evidence should be localized, markdown-renderable, and reference ASIN, brand, title, marketplace, or public source names rather than internal ids.
- get_market_channel_assumption(...) records source names, URLs, publication dates, and access dates for supported marketplaces instead of using an unattributed fixed marketplace-share shortcut.
- Item prompts use PromptItemReference aliases such as sample_1; backend stable item keys stay in mapping code and are restored only after typed model output returns.
- Dimension discovery returns DimensionByAi rows directly. Alias restoration belongs only to item-classification outputs such as SegmentedResultsByAi and SegmentedResultsByAiWithNewOptions.
- Dimension and mask worker loops emit safe batch-level AutoSegmentProgressEvent messages while work is still running. Messages must not contain package names, stable keys, raw ids, enum tokens, or JSON payloads, and progress counts come from the runtime progress ledger rather than per-batch indexes.
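The item-alias rule in this list can be sketched as a mapping that is built before prompting and applied only to returned rows. `build_alias_map`, `restore_item_keys`, and the row shape are hypothetical; skipping unknown aliases is an assumption made for the sketch, not documented behavior.

```python
def build_alias_map(item_keys):
    """Map prompt aliases such as sample_1 to backend stable item keys."""
    return {f"sample_{index}": key for index, key in enumerate(item_keys, start=1)}


def restore_item_keys(rows, alias_map):
    """Replace aliases in model output rows with stable keys.

    Assumption for this sketch: rows with an alias the mapping does not know
    are skipped rather than raising.
    """
    restored = []
    for row in rows:
        stable_key = alias_map.get(row["item"])
        if stable_key is None:
            continue
        restored.append({**row, "item": stable_key})
    return restored


alias_map = build_alias_map(["B00ABC", "B00XYZ"])
model_rows = [
    {"item": "sample_2", "option_keys": ["mid"]},
    {"item": "sample_9", "option_keys": ["low"]},
]
restored_rows = restore_item_keys(model_rows, alias_map)
print(restored_rows)
```

The prompt only ever sees `sample_1` and `sample_2`; the stable keys stay in mapping code until typed output comes back.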
Version 0.11.3 Pair-Based Positioning Coverage
- Auto-segment queue requests now separate work_dimension_keys from coverage_scopes. One bootstrap run can segment pricing, feature, and scene together while still claiming durable coverage rows for exact visible pairs.
- Preset mask target order is pricing + feature first, then pricing + scene, so the primary seller decision table becomes useful before secondary views.
- Coverage source signatures no longer include grid.updated or infos_id, because Infos generation belongs to the same worker run and must not invalidate the coverage row before the frontend can read it.
- Coverage phase updates can filter by envelope id, so completing one pricing+feature preset job does not complete the sibling pricing+scene row.
Version 0.11.2 Update
- Added durable segmenting coverage storage for one normalized grid dimension-key scope and one exact source signature.
- Coverage uses enum-backed phase/status fields, with phase states for item segmentation, Infos generation, normal masks, and preset mask stages.
- Added normalized scope/source signature helpers and an atomic aclaim_or_get_coverage(...) path so lazy positioning queues can create or reuse one active coverage row before worker envelopes are enqueued.
- Added phase transition helpers for queued, processing, complete, and failed worker states.
- BackendAutoSegmentRunMongoObject remains the worker ledger; coverage is the public read-side state for displayed dimension combinations.
- Follow-up CRUD coverage now proves processing, completion, and failure transitions update only the targeted phase and preserve sibling phase state.
Version 0.11.1 Update
- Locked the aggregate display status rule for partially segmented visible axes: if any displayed dimension lacks terminal segmentation rows for the scoped items, synthetic cells remain PROCESSING.
- EMPTY now stays documented as the authoritative terminal state only after every displayed dimension is fully terminal and the exact cell has no matched items.
- Added two-axis aggregation regressions proving missing custom-axis rows render as processing while fully terminal no-match cells remain empty.
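The status rule for cells without matched items can be sketched in a few lines. `synthetic_cell_status` and its arguments are hypothetical; the sketch only covers synthetic cells, since item-backed cells are outside the rule stated above.

```python
def synthetic_cell_status(displayed_dimension_keys, terminal_dimension_keys):
    """Status for a displayed cell that has no matched items (a synthetic cell).

    terminal_dimension_keys: dimensions whose segmentation rows are terminal
    for every scoped item.
    """
    missing = set(displayed_dimension_keys) - set(terminal_dimension_keys)
    # EMPTY is only authoritative once every displayed axis is fully terminal.
    return "PROCESSING" if missing else "EMPTY"


print(synthetic_cell_status(["pricing", "custom_axis"], ["pricing"]))
print(synthetic_cell_status(["pricing", "custom_axis"], ["pricing", "custom_axis"]))
```

The first call reports PROCESSING because the custom axis is still missing rows; the second reports EMPTY because both axes are terminal and the cell has no items.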
Version 0.11.0 Update
- Removed Infos freshness/cache behavior. Infos no longer stores source_signature, generation context no longer stores source_revision, and shared grids update the same linked Infos row in place when generation is explicitly requested.
- Added infos_generation_enabled beside mask_generation_enabled on action, client, and queue contracts. Auto segmenting now queues dimensions first, then one shared Infos generation phase when enabled, then normal and preset mask jobs.
- Infos generation now uses dimensions, segmented option assignments, segmentable item prompt samples, and parent context. It intentionally does not load masks or mask results.
- Infos context is reused by normal/custom masks, preset masks, and prompt-facing grid strings. Missing Infos renders an explicit empty marker instead of failing mask generation.
- Renamed the public mask runtime to amask_cells_for_mask(...) and replaced preset-only Exa wiring with generic tool_registrars for discovery, segmentation, normal masks, and preset masks.
Version 0.10.4 Update
- Preset mask stage workers now load existing mask results first and return a cheap zero-upsert completion event when every exact pair cell signature is already covered.
- Auto-segment queue planning now filters preset market stages by missing exact coverage before queue insertion. Fully covered stages are skipped, and Infos refresh runs only when at least one queued preset stage needs market context.
- Preset cell context providers now receive only pending cells, so prior covered cells do not trigger repeat context building, tool work, or masking.
- No schema or storage contract changed. The active rule remains retained mask results plus exact (mask_key, cell signature) missing-coverage repair.
Version 0.10.3 Update
- Changed dimension and dimension-option mutations to retain existing mask-result rows instead of deleting or staling them. Old results remain effective when their exact (mask_key, cell signature) still matches current work.
- Kept _missing_combinations_for_mask(...) as the canonical coverage repair rule. New dimensions/options create new cell signatures, so auto mask workers generate only missing coverage.
- Dimension/option semantic edits keep previous mask judgments by stable keys. This intentionally accepts the risk that old evidence can be less precise after wording changes, in exchange for continuity and lower rerun cost.
- Dimension/option deletes still remove dependent segmented rows, while persisted mask-result rows are retained and ignored by current-grid readers when their removed references no longer fit the active grid.
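The retained-results rule amounts to a set difference over exact (mask_key, cell signature) pairs. The sketch below is not `_missing_combinations_for_mask(...)` itself: `cell_signature`, `missing_cells`, and the signature string format are hypothetical and chosen only to make the idea concrete.

```python
from itertools import product


def cell_signature(cell):
    """Order-independent signature for one combination of (dimension_key, option_key) pairs."""
    return "|".join(f"{dimension}={option}" for dimension, option in sorted(cell))


def missing_cells(mask_key, dimensions, stored_pairs):
    """Return cells whose exact (mask_key, signature) pair has no stored result."""
    keys = sorted(dimensions)
    cells = [tuple(zip(keys, combo)) for combo in product(*(dimensions[k] for k in keys))]
    return [cell for cell in cells if (mask_key, cell_signature(cell)) not in stored_pairs]


stored = {("m1", "feature=a|pricing=low")}
dims = {"pricing": ["low", "high"], "feature": ["a"]}
todo = missing_cells("m1", dims, stored)

# Adding an option creates new signatures; the stored result stays effective.
dims_after_edit = {"pricing": ["low", "high"], "feature": ["a", "b"]}
todo_after_edit = missing_cells("m1", dims_after_edit, stored)
print(len(todo), len(todo_after_edit))
```

Only the uncovered cells are returned, so a worker following this rule would regenerate nothing that already has an exact match.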
Version 0.10.2 Update
- Kept cloned grids sharing infos_id, but made Infos source signatures and generation prompts semantic-only. Dimension keys and option keys are no longer part of reusable Infos freshness or prompt context.
- Added keyless InfosDimensionContext / InfosDimensionOptionContext payloads for Infos generation, preserving names, descriptions, preset roles, selection mode, niche flags, and option type.
- Added a shared-Infos stale guard: when multiple grids reference the same Infos row and one grid's semantic context diverges, that grid creates and links a new Infos row instead of overwriting the shared row.
- Added grid infosId counting/index support and focused clone/freshness tests proving stable key remaps do not stale shared Infos, while real semantic edits still trigger regeneration.
Version 0.10.1 Update
- Preserved DimensionOptionType.OTHER and option niche when cloning grid dimensions, so cloned grids keep fallback buckets hidden and non-maskable.
- Dimension-option create, update, and delete callbacks now replace runtime dimensions from the full persisted post-normalization payload. This prevents in-memory drift when OTHER is inserted, collapsed, moved last, or preserved after deletes.
- Aggregated cell displays now expose structured metric summaries for preset demand and potential-sales rows. Prompt-facing grid strings render the customer-pool and monthly-sales ranges with evidence, so follow-up agents see the numeric basis instead of only mask labels.
- Infos context remains enforced for preset market masks because demand, potential sales, low-feasibility, and opportunity stages need the shared market story and seller methodology. Custom masks intentionally stay bounded by their explicit mask description.
- Added focused mock and IRL coverage for clone option roles, runtime dimension sync after option actions, metric aggregation output, prompt-facing metric labels, and the custom-mask no-Infos policy.
Version 0.10.0 Update
- Replaced mask-result AI output cell with schema-owned Area (ROW or CELLS) while keeping persisted MaskedResultBase.cell as one concrete stored combination row. There is no backward-compatible dimension_and_option_list alias.
- Added segmenting-owned Infos storage/generation with grid infos_id, source signatures, optional parent context, and stale refresh before direct and queued mask work.
- Added cell_contains_other(...) and excluded DimensionOptionType.OTHER from normal/custom mask cells, preset cartesian cells, missing-mask planning, and prompt-facing grid dimensions/tables.
- Changed preset market masks: MARKET_DEMAND and POTENTIAL_SALES are now single metric channels with stored numeric ranges; LIKELY_UNACHIEVABLE and MARKET_OPPORTUNITY_HIGHLIGHT remain boolean.
- Added the seller cake-theory, replacement-cake, budget-as-cake, and 3-4-year lifecycle methodology as enforced demand/sales prompt context so estimates use Infos, positionable-item context, comparable markets, and tool evidence instead of sample-only counts.
- Added preset stage groups: base demand/feasibility, dependent potential sales, then final opportunity. Dependent stages receive exact prior demand and sales evidence, and feasibility evidence only for allowlists that include LIKELY_UNACHIEVABLE.
- Dimension creates/updates and dimension-option updates now invalidate affected stored mask rows so regenerated masks use current cell semantics.
- Dimension-option deletion reuses the same option replacement helper, so the single OTHER invariant is preserved after deletes.
- IRL coverage now includes Infos generation, current-Infos preset queueing through all four market stages, and HTTPX SOCKS proxy support for live tests.
Version 0.9.1 Update
- Tightened the internal normalize_dimension_options(...) generic bound from broad BaseModel to a structural protocol exposing name, option_type, and pydantic model_copy(...).
- This keeps the Phase 1 typed OTHER behavior unchanged while removing Pylance/Pyright attribute-access diagnostics in the normalizer helper.
Version 0.9.0 Update
- Added DimensionOptionType with STANDARD and OTHER roles across AI, persisted, create, and update option schemas.
- Added normalize_dimension_options(...) as the canonical option normalizer: fallback labels such as unknown, unmentioned, not provided, German fallback labels, and Chinese 未提及/其他 labels collapse into one OTHER bucket placed last.
- AI-created and AI-updated dimension paths now ensure exactly one OTHER option. Late _asegment_dimension(...) fallback new_options map to the existing OTHER bucket instead of creating duplicate fallback rows.
- _asegment_dimension(...) now stores explicit unresolved in-scope AI unmatched_items in OTHER when the dimension has that bucket; OMITTED remains reserved for invalid model output, untouched batch items, or malformed legacy dimensions without OTHER.
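The collapse behavior can be sketched on plain option names. This is not `normalize_dimension_options(...)`: the function below is a hypothetical simplification, the label set contains only the English and Chinese examples listed above (the German labels are not spelled out in this README, so none are guessed), and the display name `Other` is an assumption.

```python
# Only the fallback labels explicitly listed above; the real set is larger.
FALLBACK_LABELS = {"unknown", "unmentioned", "not provided", "未提及", "其他"}


def normalize_options(names):
    """Collapse fallback labels into a single OTHER entry placed last."""
    normalized = []
    saw_fallback = False
    for name in names:
        if name.strip().casefold() in FALLBACK_LABELS:
            saw_fallback = True  # merged into the one OTHER bucket below
        else:
            normalized.append((name, "STANDARD"))
    if saw_fallback:
        normalized.append(("Other", "OTHER"))  # display name is an assumption
    return normalized


options = normalize_options(["Budget", "unknown", "Premium", "其他"])
print(options)
```

Two different fallback labels produce one OTHER bucket, and it lands after every STANDARD option regardless of where the labels appeared.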
Version 0.8.6 Update
- SegmentingClient.apreset_mask_stage_job(...) accepts the optional AutoSegmentQueueContext used by queued auto-segment runs and forwards it into abuild_agent_deps(...).
- Preset mask cell context providers now receive owner_type from the outer SegmentingAgentDeps.auto_segment_queue_context, matching the actual runtime ownership boundary.
- Preview flows remain unchanged: callers can still disable generated masks through mask_generation_enabled=False.
Version 0.8.5 Update
- AutoSegmentAndMaskAction now accepts optional mask_generation_enabled. None preserves the client/global default; False lets host-owned preview flows build lean dimension/cell grids without any generated mask work.
- SegmentingClient.aauto_segment_and_mask(...) forwards the general mask override through the existing canonical action path.
- The old preset-only mask flag was removed rather than kept as an alias, so preview callers must use the general mask toggle.
Version 0.8.4 Update
- DiscoverDimensionsAction can now carry optional preset_dimension_types so host flows can request only known axes such as pricing and features/functionality.
- SegmentingClient.adiscover_dimensions(...) forwards the requested preset roles through the existing canonical action runtime; normal open-ended discovery is unchanged when the list is empty.
- The LLM prompt and handler both enforce the requested preset roles, with a typed validation error if a required role is missing from new or existing dimensions.
Version 0.8.3 Update
- AutoSegmentAndMaskAction added a preview-facing mask planning override through the existing canonical action path, avoiding a second preview-only queue API.
- MARKET_OPPORTUNITY_HIGHLIGHT guidance now explicitly treats under-250 observed groups as limited sample context and asks evidence to say whether external or comparable-market evidence was used, unavailable, or intentionally not used.
Version 0.8.2 Update
- MARKET_OPPORTUNITY_HIGHLIGHT evidence must now say whether external or comparable-market evidence was used.
- Empty or small-sample cells can no longer be finalized from sample-only reasoning while a parent-owned market-evidence tool is available.
- TRUE empty-cell opportunities must name the positive analogy, willingness-to-pay clue, low-competition clue, or unserved customer-sector reason behind the highlight.
Version 0.8.0 Update
- Market-opportunity preset prompt guidance now treats observed samples as non-exhaustive evidence instead of the whole market.
- Empty observed cells can be highlighted when comparable markets, web evidence, adjacent demand, pricing willingness, or seller-profile fit support a plausible white-space opportunity.
- The prompt now asks for optimistic but selective opportunity reasoning that explains demand, competition, feasibility, incumbent barriers, and seller capability rather than rejecting cells solely because no sample product exists.
Version 0.7.1 Update
- Auto-segment progress events now derive the generic AgenticActionEvent.status from the typed progress stage. Failed progress emits FAILED, terminal success emits SUCCEEDED, and intermediate work emits STARTED or PROGRESSED so SSE diagnostics cannot mistake failures for successful action events.
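The stage-to-status derivation can be sketched as a pure function. The four status names come from the note above; the `ProgressStage` members are hypothetical, and which intermediate stage maps to STARTED versus PROGRESSED is an assumption (here the first queued stage emits STARTED).

```python
from enum import Enum


class ProgressStage(str, Enum):
    # Hypothetical stage names for illustration only.
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


def status_for_stage(stage: ProgressStage) -> str:
    """Derive the generic event status from the typed progress stage."""
    if stage is ProgressStage.FAILED:
        return "FAILED"
    if stage is ProgressStage.COMPLETED:
        return "SUCCEEDED"
    if stage is ProgressStage.QUEUED:
        return "STARTED"  # assumption: the first stage marks the start
    return "PROGRESSED"


print([status_for_stage(stage) for stage in ProgressStage])
```

Because the status is derived rather than set independently, a failed stage can never be published with a success status.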
Version 0.7.0 Update
- Added typed AUTO_SEGMENT_PROGRESS events for the auto-segment room stream. Payloads carry run/task/root/grid ids, stage, message, completion counts, normalized percent source value, and update time.
- Strengthened preset-mask prompts so search-enabled demand, opportunity, and feasibility stages must use external/comparable-market evidence instead of treating Exa-style search as optional when local context seems sufficient.
- Opportunity prompts now explicitly cover limited sample size, empty observed cells, adjacent customer sectors, unserved demand, competition, incumbent brand barriers, and entry barriers.
Version 0.6.1 Update
- Removed the unused untyped CRUDSegmentedResult.aiter_item_dimension_maps(...) helper instead of carrying a second grouping path after the option_keys hard-break.
- Cleaned CRUD/action/client tests so direct segmented-result payloads use option_keys everywhere. Dimension options and mask results still use their singular option_key fields because those are separate contracts.
- The canonical read path for grouped placements remains aggregation; callers should use aggregate/view APIs instead of ad hoc per-item dimension maps.
Version 0.6.0 Update
- Segmenting results now persist option_keys instead of the old singular option_key; this is a breaking current-line contract with no old-data fallback or migration.
- Dimensions carry selection_mode: pricing is single-select, features and scenes are multi-select by default, and custom dimensions use AI output plus schema-owned conservative inference.
- Aggregation expands matched multi-select rows into cartesian cell placements, so item_counts means placement count for the displayed cell.
item_countsmeans placement count for the displayed cell. - Preset-mask prompts can receive bounded parent Exa tools for demand, feasibility, and opportunity stages; normal/custom mask prompts remain tool-free.
Version 0.4.9 Update
- Default AUTO_SEGMENT_AND_MASK bootstrap now skips custom non-preset masks together with custom dimensions, keeping the first run display-critical only.
- Explicit lazy dimension_keys requests still queue custom non-preset masks so selected non-default axes can receive mask analysis after the user chooses them.
- This avoids heavy bootstrap work when grids already have custom masks but the positioning room has not selected those axes yet.
Version 0.4.8 Update
- AutoSegmentAndMaskAction now accepts optional dimension_keys for lazy selected-axis segmentation.
- Default auto-segment bootstrap queues only display-critical preset dimensions: pricing, features/functionality, and scene/use-case. Custom axes are left for explicit lazy room selection.
- Preset mask target planning is scoped to the selected/default dimension set, and unknown explicit dimension keys raise typed client-side errors before queue fanout.
Version 0.4.7 Update
- Aggregate reads now synthesize missing PROCESSING / EMPTY cartesian cells only when include_synthetic_status_cells=True.
- Full-grid prompt/admin reads keep the default False behavior so large grids return only item-backed and stored mask/reasoning-backed cells.
- The fallback-option ordering helper remains internal; public callers should use DimensionOptionByAi.merge_semantic_duplicates(...) or DimensionBase.build_from_dimensions_by_ai(...).
Version 0.4.6 Update
- Dimension prompts and DimensionOptionByAi normalization allow one fallback/unclear/other option only when needed, then keep that fallback last across Chinese, English, and German naming variants.
- Auto-segment dimension work now enqueues preset roles in priority order: pricing, features/functionality, scene/use-case, then custom dimensions in grid order.
- CRUDMaskedResultRelationalReasoning.aupsert_multi(...) keeps created insert-only through $setOnInsert, matching the mask-result bulk upsert pattern and avoiding Mongo update-path conflicts.
- Aggregate reads now expose aggregate_status on AggregatedCellDisplay with PROCESSING, EMPTY, AGGREGATED, and UNAVAILABLE cell states.
Version 0.4.5 Update
- Dimension option mutations now preserve the parent dimension's preset_dimension_type, including create/update/reorder option actions and auto-segment-discovered option appends.
- Preset-mask creation now links created or recovered preset mask ids into grid.masks, so queued preset-mask stage workers can rebuild runtime state and load the mask by key.
- Queued mask workers can recover a persisted mask row by (grid_id, mask_key) when runtime scope is stale, then repair the grid mask scope for future runs.
Version 0.4.4 Update
- Preset role inference now recognizes conservative Chinese/German pricing dimension wording such as price bands, budgets, costs, and German price levels.
- Removed the broad standalone Chinese 使用 scene keyword so generic usage or instruction dimensions do not become scene/use-case dimensions.
- Added focused normalizer regressions for non-English pricing and generic usage non-scene behavior.
Version 0.4.3 Update
- Dimension discovery now normalizes preset dimension roles before persistence and preserves preset_dimension_type for both existing and newly discovered dimensions.
- Preset role inference covers conservative Chinese/German scene/use-case and feature/functionality terms. Pricing remains conservative; no true pricing dimension means no pricing-anchored preset-mask work.
- Added focused regressions for preset role persistence and the no-pricing preset-mask policy.
Version 0.4.2 Update
- Added an explicit queue-stage contract regression proving AutoSegmentPresetMaskStage may share wire values with PresetMaskType while remaining a queue/progress stage enum, not the persisted semantic mask role field.
Version 0.4.0 Update
- Aggregate reads now union item-backed cell signatures with stored mask results and preset reasoning signatures, so empty analyzed preset-mask cells remain visible in read models.
- Preset-mask omission is documented as neutral unknown/unclassified work: missing model rows are completed for progress accounting, not failures, false classifications, or lowest-option outputs.
- AggregatedCellDisplay.build_empty_masked_cell(...) owns empty-cell display construction, keeping aggregation code focused on signature collection.
- Typed auto worker event exports cover dimension, mask, and preset-mask stages for backend/core/frontend room propagation.
Version 0.3.1 Update
- MaskedResultBase.build_signature(...) accepts any sequence of DimensionAndOption; callers should not copy the list before signing.
- CRUDMaskedResult.aupsert_multi(...) writes typed Mongo payloads and keeps created insert-only through $setOnInsert, avoiding bulk-write path conflicts while preserving combination-signature identity.
- Focused mock/IRL CRUD tests use the current cell contract, not removed single-pair mask result fields.
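The insert-only created pattern can be sketched with plain dicts. `build_upsert`, the `updated` field, and the filter fields are hypothetical; the MongoDB behavior relied on is that a field may not appear under both $set and $setOnInsert in one update, so created is kept out of $set.

```python
from datetime import datetime, timezone


def build_upsert(filter_doc, payload, now):
    """Build a (filter, update) pair that writes `created` only when inserting."""
    # Keep `created` out of $set so it never conflicts with $setOnInsert.
    mutable = {key: value for key, value in payload.items() if key != "created"}
    update = {
        "$set": {**mutable, "updated": now},
        "$setOnInsert": {"created": now},
    }
    return filter_doc, update


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
filter_doc, update = build_upsert(
    {"grid_id": "g1", "mask_key": "m1", "signature": "feature=a|pricing=low"},
    {"option_key": "yes", "evidence": "example evidence", "created": now},
    now,
)
print(sorted(update))
```

With pymongo, each pair could be wrapped as `UpdateOne(filter_doc, update, upsert=True)` and sent in one bulk write; existing rows keep their original created value because $setOnInsert applies only on insert.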
Version 0.1.25 Update
- Rejects blank normalized dimension and mask names before direct create/update writes and before AI discovery persistence.
- Keeps duplicate discovery merging from 0.1.24, but now treats an empty semantic name as invalid instead of allowing it to fail later in runtime name maps.
Version 0.1.23 Update
- Hard-breaks action events to canonical action_type only; old discriminator inputs and read aliases are removed from schemas and tests.
- Keeps result storage bootstrapping focused on current grid-scoped indexes without obsolete malformed-row cleanup in CRUD startup.
Version 0.1.22 Update
- Publishes the current gridless runtime and GridAgentContextRequest contract for downstream positioning/backend wheels.
- Keeps prompt context grid-only; task graph and relation enrichment remain outside this package.
MS7 Branch Note
Branch feature/room-display-chat-contract tightens the room display contract.
The frontend room consumes aggregate cell counts, aggregate item_keys, typed
cell_display.description / cell_display.images, direct mutation events, and
authoritative positioning view refetches. Result fanout remains tolerant, and
grouped followers stay metadata-only.
Why
This library helps a parent application maintain a typed positioning grid over external items.
The parent module owns the item source. keble-segmenting owns:
- grid, dimension, mask, and result storage
- typed mutation actions
- agent-facing context projection
- one unified mutation tool
- cloning and aggregation helpers
Canonical Runtime
Use these surfaces as the source of truth:
- SegmentingClient: public async client
- SegmentingClient.abuild_agent_deps(...): public grid-agnostic deps-builder for backend/runtime integration
- keble_segmenting.agent.handler.aapply_actions(...): canonical action executor
- keble_segmenting.agent.register_mutation_tools(...): singular agentic mutation tool registration
- keble_segmenting.client.cloning: grid cloning workflow
- keble_segmenting.client.aggregations: aggregated cell-display workflow
Do not build new orchestration around older client-side action engines or obsolete background mutation wrappers.
The agent runtime is explicit per grid: SegmentingGridRuntime owns the loaded
grid structure, structural indexes, Redis progress, lazy item cache, segmented
result cache, and masked result cache for one grid. ActionObjs,
active_grid_id, and active-grid proxy properties are not part of the runtime
contract. Handler and worker helpers receive the resolved runtime explicitly,
while Actions.grid_id remains the single public low-level selector for one
ordered action batch.
Action events use the shared keble_helpers.AgenticActionEvent envelope through
the package-local ActionEvent type. Segmenting still exposes EventCallbacks
for callers, but callback failures now propagate to the action executor instead
of being logged and ignored. Build callback containers with
EventCallbacks.build(...) instead of standalone normalizer helpers.
AUTO_SEGMENT_AND_MASK is now explicit queue scheduling, not inline fanout
completion. Callers that want auto work must include one terminal
AutoSegmentAndMaskAction; backend supplies the queue scheduler while
segmenting supplies typed queue requests, worker job APIs, and direct worker
completion events such as SEGMENT_DIMENSION_JOB_COMPLETED.
Queued worker execution also emits batch-level mutation events as CRUD happens.
Rooms should treat CREATE_DIMENSION_OPTIONS, UPSERT_RESULTS, and
UPSERT_MASK_RESULTS as the store-mutation events; aggregate worker completion
events are diagnostic. In 0.1.20, those mutation payloads include explicit
room indexes such as grid_id, affected dimension_keys, item_keys,
mask_keys, and option_keys so consumers do not infer routing only from row
contents.
Current room display contract keeps CellDisplay display-only:
description is compact text, images are thumbnails, and
AggregatedCellDisplay.item_keys carries the full item-key identity set for
selection/detail sidebars. Parent item adapters own how those display fields are
built; segmenting only carries the typed payload.
Downstream TypeScript consumers should use keble-core 0.1.33+
TaskWorkspaceEvent.build(...) for these direct package payloads instead of
frontend-local unknown payload normalization.
Core Concepts
- SegmentableItemProtocol
  - item interface owned by the parent module
  - items are not persisted in this package
  - the protocol provides key, prompt payloads, and representative/cell-display helpers
- SegmentedGridCreate
  - public metadata-only payload for creating a new empty grid
- GridAgentContextRequest
  - explicit prompt-context request for one grid
  - requires compare_dimension_key
  - optionally narrows visible dimensions with viewing_dimension_keys
- Actions.grid_id
  - explicit grid scope for one canonical action batch
- ActionedResults
  - ordered per-action results
  - each concrete action result carries explicit grid_id
- Dimension
  - grid-bounded categorical axis with ordered DimensionOptions
- Mask
  - grid-bounded classifier with ordered MaskOptions
- SegmentedResult
  - (grid_id, item_key, dimension_key) -> option_keys plus evidence
- MaskedResult
  - (grid_id, mask_key, dimension_and_option) -> option_key plus evidence
Storage:
- MongoDB stores grids, dimensions, masks, segmented results, and masked results
- Redis stores SegmentingProgress for action execution progress and interruption
Startup
Use the async client startup hook:
from keble_segmenting import SegmentingClient
client = SegmentingClient(
    agentic_llm_list=[...],
    async_items_loader=...,
)
await client.aensure_indexes(amongo)
Notes:
- agentic_llm_list is required for discovery and auto actions
- async_items_loader is required for discovery and auto actions
- aensure_indexes(...) is the canonical Mongo startup hook
Agent Context APIs
There are now three client context helpers:
- aget_grid_for_agent(...)
  - structured prompt-facing projection of one requested grid
  - request-based: context=GridAgentContextRequest(...)
  - includes finite compare-vs-each markdown tables built from real aggregated cells
- aget_grid_context_string(...)
  - lightweight string form of GridForAgent
- aget_agent_context_string(...)
  - canonical workspace-ready context string
  - request-based: contexts=[GridAgentContextRequest(...), ...]
  - includes schema meanings, action runtime rules, action type intentions, and one rendered grid section per request
Recommended pattern for upstream agents:
from keble_segmenting.agent import GridAgentContextRequest
context_text = await client.aget_agent_context_string(
    amongo,
    contexts=[
        GridAgentContextRequest(
            grid_id=grid_id,
            compare_dimension_key="benefit",
            viewing_dimension_keys=None,
        )
    ],
)
Important:
- refresh agent context from the latest grid before each reasoning or mutation turn
- do not keep a stale cached copy after actions mutate the grid
- one prompt may describe multiple grids, but low-level mutation still stays single-grid per Actions batch
- if you only need one rendered grid section, aget_grid_context_string(...) is the lighter helper
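The refresh rule can be illustrated with a small stand-in loop. `FakeGridClient`, `load_context`, and `apply_batch` are hypothetical and do not mirror the real client signatures; the sketch only shows context being reloaded at the start of every turn rather than cached across mutations.

```python
import asyncio


class FakeGridClient:
    """Hypothetical stand-in for a client; it only tracks a grid revision."""

    def __init__(self) -> None:
        self.revision = 0

    async def load_context(self) -> str:
        # Stands in for rendering context from the latest stored grid.
        return f"grid context at revision {self.revision}"

    async def apply_batch(self) -> None:
        # Stands in for a mutation batch that changes the grid.
        self.revision += 1


async def run_turns(client: FakeGridClient, turns: int) -> list[str]:
    """Reload context at the start of every turn instead of caching it."""
    seen: list[str] = []
    for _ in range(turns):
        context = await client.load_context()
        seen.append(context)
        await client.apply_batch()
    return seen


if __name__ == "__main__":
    print(asyncio.run(run_turns(FakeGridClient(), 2)))
```

Each turn sees the revision produced by the previous mutation, which a cached context string would miss.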
Actions
The canonical mutation contract is:
from keble_segmenting.schemas import Actions
from keble_helpers import AgenticActionWarningLevel
await client.aapply_actions(
    payload=Actions(
        message="Apply one typed segmenting batch.",
        warning_level=AgenticActionWarningLevel.SAFE,
        grid_id=grid_id,
        actions=[...],
    ),
    db_deps=db_deps,
    language=language,
)
Rules:
- Actions is strongly typed; the client does not accept raw dict payloads
- Actions.grid_id is required on the canonical action path
- nested action/input payloads do not repeat grid_id
- concrete ActionedResult payloads do carry explicit grid_id
- progress_task is optional and parent-owned on AgentDbDeps.progress_task; segmenting only emits set_message(...) updates through it
- action batches execute sequentially
- AUTO_SEGMENT_AND_MASK is canonicalized to at most one terminal batch action
- create-dimension, create-dimension-option, create-mask, create-mask-option, discover-dimensions, and discover-masks do not imply hidden auto work; callers must add AutoSegmentAndMaskAction explicitly
- direct delete actions for segmented results and mask results are not part of the canonical action surface
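One way to read the "at most one terminal batch action" rule is sketched below on plain action-type strings. `canonicalize_actions` is a hypothetical helper; whether the real runtime reorders, deduplicates, or rejects non-canonical batches is not stated here, so treat the reordering as an assumption.

```python
AUTO = "AUTO_SEGMENT_AND_MASK"


def canonicalize_actions(action_types: list[str]) -> list[str]:
    """Keep other actions in order and collapse auto actions to one final entry."""
    others = [action for action in action_types if action != AUTO]
    if AUTO in action_types:
        # Assumed behavior: a single auto action always runs last.
        others.append(AUTO)
    return others
```

A batch without any auto action stays unchanged, which matches the rule that create and discover actions do not imply hidden auto work.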
Runtime notes:
- low-level execution is still single-grid per Actions batch
- agent deps now keep lazy per-grid runtimes internally, so one session can know multiple grids without eagerly loading all result state
- SegmentingAgentDeps keeps auto_segment_queue_scheduler and auto_segment_queue_context on the deps root so explicit auto actions can queue backend worker jobs from the handler path
- prompt context tables use aggregated cell summaries only: item count, cell_display title/description, and aligned mask labels
When a parent runtime wants human-readable stage updates during discovery or auto:
from keble_helpers import ProgressTask
segmenting_task = (
    request.resources.progress_task.new_subtask()
    if request.resources.progress_task is not None
    else None
)
results = await client.aapply_actions(
    payload=payload,
    db_deps=db_deps,
    language=language,
    progress_task=segmenting_task,
)
Progress-task contract:
- pass the task or subtask explicitly through the public client argument; the built agent deps then expose it at deps.progress_task
- keble-segmenting only calls set_message(...)
- parent repos keep ownership of subtask creation and terminal success/failure
- Redis SegmentingProgress stays numeric/interruption-focused and does not store these human messages
- the human-readable message wording is intentionally varied; treat stage order and persisted state as the stable contract, not exact phrasing
Public direct client helpers map 1:1 to the canonical actions:
- grid meta: aupdate_grid_meta(...)
- dimensions: acreate_dimensions(...), aupdate_dimensions(...), adelete_dimensions(...)
- dimension options: acreate_dimension_options(...), aupdate_dimension_options(...), adelete_dimension_options(...)
- masks: acreate_masks(...), aupdate_masks(...), adelete_masks(...)
- mask options: acreate_mask_options(...), aupdate_mask_options(...), adelete_mask_options(...)
- direct rows: aupsert_results(...), aupsert_mask_results(...)
- reorder: areorder_dimensions(...), areorder_masks(...), areorder_dimension_options(...), areorder_mask_options(...)
- discovery: adiscover_dimensions(...), adiscover_masks(...)
- auto: aauto_segment_and_mask(...)
Grouped positioning followers are parent-module metadata, not segmenting rows.
When a parent loader skips follower keys to save tokens, keble-segmenting
persists results only for the explicit item_keys it receives and must not copy
main/group result rows to follower item keys. Worker dimension jobs also reject
model-returned item keys outside the current batch, so model key typos cannot
create orphan segmented rows.
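The out-of-batch guard described above can be sketched as a simple key filter. `split_model_rows` is a hypothetical helper; the real worker may fail the row or the job instead of returning the rejected keys, which this sketch does not claim to reproduce.

```python
def split_model_rows(
    batch_item_keys: list[str],
    model_rows: dict[str, list[str]],
) -> tuple[dict[str, list[str]], list[str]]:
    """Separate model output into rows for in-batch items and rejected keys."""
    allowed = set(batch_item_keys)
    accepted: dict[str, list[str]] = {}
    rejected: list[str] = []
    for item_key, option_keys in model_rows.items():
        if item_key in allowed:
            accepted[item_key] = option_keys
        else:
            # A typo or hallucinated key never reaches storage.
            rejected.append(item_key)
    return accepted, rejected
```

Follower keys that the parent loader skipped are simply absent from `batch_item_keys`, so nothing is written for them either.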
Queued worker events are emitted as direct package events. Backend/SSE transports
may fill root_id, object_id, and correlation_id, but they should not wrap
or rename the package event.
event = AutoSegmentDimensionJobEvent(
    payload=AutoSegmentDimensionJobResult(
        dimension_key="price_tier",
        upserted_result_count=12,
    )
)
assert event.source == "keble-segmenting"
assert event.action_type == "SEGMENT_DIMENSION_JOB_COMPLETED"
Example:
from keble_segmenting.schemas import CreateDimensionInput, CreateDimensionOptionInput
result = await client.acreate_dimensions(
    db_deps=db_deps,
    language=language,
    grid_id=grid_id,
    dimensions=[
        CreateDimensionInput(
            name="Price Positioning",
            niche=False,
            options=[
                CreateDimensionOptionInput(
                    name="Budget",
                    description="Low-price entry option.",
                    niche=False,
                ),
                CreateDimensionOptionInput(
                    name="Premium",
                    description="Higher-price premium option.",
                    niche=False,
                ),
            ],
        )
    ],
)
Agent Tool Registration
The canonical agent tool surface is one singular mutation tool:
from keble_segmenting.agent import SegmentingAgentDeps, register_mutation_tools
agent = Agent[SegmentingAgentDeps, Any](...)
register_mutation_tools(agent)
Registered tool:
- mutate_segmenting
  - takes payload: Actions
  - delegates directly into the unified action runtime
Deps shape:
- SegmentingAgentDeps inherits keble_db.AgentDbDeps; do not pass Mongo/Redis as separate tool args.
- Segmenting runtime state is under ctx.deps.segmenting.
- Composite parent deps should inherit SegmentingAgentDeps and provide the same .segmenting namespace instead of manually re-registering backend-owned copies of this tool.
Optional tool customization:
from keble_segmenting.agent import register_mutation_tools
from keble_helpers import AgentToolConfig
from keble_segmenting.agent.schemas import MutationToolsConfig
register_mutation_tools(
    agent,
    tools_config=MutationToolsConfig(
        mutate_segmenting=AgentToolConfig(
            name="mutate_segmenting",
            description="Apply one typed segmenting action batch.",
            requires_approval=True,
        ),
    ),
)
Grid CRUD, Cloning, and Aggregation
Grid shell APIs:
- acreate_grid(...) creates an empty metadata-only grid
- aget_grid(...), aget_grid_by_id(...), aget_grids(...) read grids
- aget_grid_detail(...) loads the grid plus ordered dimensions and masks
- adelete_grid(...) cascades through dimensions, masks, segmented results, and masked results
Grid cloning:
clone = await client.aclone_grid(
    amongo,
    grid_id=grid_id,
)
aclone_grid(...) clones the full grid and returns GridCloneResult with the new grid plus key remapping maps.
Aggregation:
cell_displays = await client.aget_aggregated_cell_displays(
    amongo,
    grid_detail=grid_detail,
    items=items,
    include_synthetic_status_cells=False,
)
AggregatedCellDisplay is the canonical aggregation output:
- full ordered cell
- aligned flattened masks_and_options
- typed mask_reasonings for preset mask decisions
- item_counts
- cell_display
- optional synthetic PROCESSING / EMPTY status cells when the caller passes explicit display dimensions and sets include_synthetic_status_cells=True
Guidance For Other Agents
When another agent or workflow consumes this package:
- load fresh context from aget_agent_context_string(...) before reasoning about the current grid
- build mutations with typed Actions, not ad hoc dicts
- use stable keys for update, delete, reorder, and direct result actions
- assume the README and the client context API should stay aligned; if one changes, update the other in the same pass
Runtime Notes
The current lazy multi-grid runtime is explicit per grid:
- SegmentingGridRuntime owns the loaded grid, dimensions, masks, indexes, optional progress, and lazy item/result caches for exactly one grid
- deps.segmenting.loaded_grid_runtimes is the only runtime map for multi-grid sessions
- deps.aget_or_load_grid_runtime(grid_id=...) loads structural context without creating execution progress
- deps.aensure_executing_grid_runtime(grid_id=...) returns the target runtime and creates Redis progress only for that executing grid
- handler/helper code receives runtime explicitly instead of reading an implicit active grid from deps
The nested deps.segmenting namespace owns only shared session state:
- client/config adapters
- loaded_grid_runtimes
- accumulated actioned_results
Removed runtime surfaces:
- do not target deps.action_objs, deps.progress, or other active-grid proxy properties in downstream tests or helpers
- do not restore active_grid_id, aactivate_grid_runtime(...), or require_active_runtime(...)
- do not restore the removed ActionObjs model; item loading is owned by SegmentingAgentDeps.aensure_all_items_loaded(runtime=...)
Prompt tables also follow two important correctness rules:
- table cells are indexed by stable option keys, not by human labels
- when multiple hidden cells collapse into one projected cell, the prompt copy is marked as merged instead of reusing one hidden cell's title/description
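The two prompt-table rules can be sketched as a projection keyed by option keys. `HiddenCell`, `ProjectedCell`, `project_cells`, and the "(merged cells)" label are hypothetical; summing item counts additionally assumes the collapsed cells hold disjoint items.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class HiddenCell:
    options: dict[str, str]  # dimension_key -> stable option_key
    title: str
    item_count: int


@dataclass
class ProjectedCell:
    title: str
    item_count: int
    merged: bool


def project_cells(
    cells: list[HiddenCell],
    visible_dimension_keys: list[str],
) -> dict[tuple[str, ...], ProjectedCell]:
    """Group full cells by their option keys on the visible dimensions only."""
    groups: dict[tuple[str, ...], list[HiddenCell]] = defaultdict(list)
    for cell in cells:
        key = tuple(cell.options[d] for d in visible_dimension_keys)
        groups[key].append(cell)
    projected: dict[tuple[str, ...], ProjectedCell] = {}
    for key, members in groups.items():
        merged = len(members) > 1
        projected[key] = ProjectedCell(
            # Never reuse one hidden cell's title for a collapsed group.
            title="(merged cells)" if merged else members[0].title,
            item_count=sum(m.item_count for m in members),
            merged=merged,
        )
    return projected
```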
Latest Diagnostic Timing Note
The IRL client suite now includes one diagnostic-only timing probe:
tests/irl/agent/test_handler/test_stage_latency_breakdown.py
It measures:
- adiscover_dimensions(...)
- aauto_segment_and_mask(...)
- discovered dimension and option counts
- persisted segmented-result counts
This probe is intentionally diagnostic, not a hard latency budget gate. Use it to see whether current runtime cost is dominated by dimension discovery or by the downstream per-item segmentation pass.
Latest Discovery Name Validation Note
On 0.1.25, AI-discovered dimensions and masks are normalized before persistence:
- same-name discovered dimensions merge into one first-name-preserving dimension
- same-name discovered masks merge into one first-name-preserving mask
- duplicate options under the merged dimension or mask keep the first display rule
- dimension niche flags are merged with any(...)
- blank normalized dimension and mask names fail before persistence
Direct create/update handlers now preflight the final grid namespace before any
write, so direct actions cannot leave a grid with duplicate normalized dimension
or mask names, including blank normalized names. build_unique_name_map(...)
remains strict and should still fail if corrupted persisted state already exists.
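The merge rules for discovered dimensions can be sketched as follows. `DiscoveredDimension`, `normalize_name`, and `merge_discovered` are hypothetical, and the normalization used here (collapse whitespace, then casefold) is an assumption, since the package's actual normalizer is not described in this section.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveredDimension:
    name: str
    niche: bool
    options: list[str] = field(default_factory=list)


def normalize_name(name: str) -> str:
    """Assumed normalization: collapse whitespace and casefold."""
    return " ".join(name.split()).casefold()


def merge_discovered(dimensions: list[DiscoveredDimension]) -> list[DiscoveredDimension]:
    """Merge same-name dimensions, keeping the first name and first options."""
    merged: dict[str, DiscoveredDimension] = {}
    for dim in dimensions:
        key = normalize_name(dim.name)
        if not key:
            raise ValueError("blank normalized dimension name")
        if key not in merged:
            merged[key] = DiscoveredDimension(dim.name, dim.niche, [])
        target = merged[key]
        target.niche = any([target.niche, dim.niche])
        seen = {normalize_name(option) for option in target.options}
        for option in dim.options:
            option_key = normalize_name(option)
            if option_key not in seen:
                seen.add(option_key)
                target.options.append(option)
    return list(merged.values())
```

The same shape would apply to masks, minus the `niche` flag.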
0.2.0 Preset Dimensions And Combination Masks
keble-segmenting 0.2.0 intentionally breaks the old single
dimension_and_option mask-result contract. Mask results now persist one full
cell plus a stable dimension_option_signature, so masks
can classify two-dimensional preset cells such as pricing by use case or pricing
by feature family.
The package stays domain-generic. Parent services may inject typed
SegmentingCellContextProvider context and preset-market scoped tool
registrars, while this package owns only segmenting schemas, CRUD, aggregation,
LLM prompts, and queue contracts. Preset mask roles cover market demand,
opportunity highlight, and low feasibility, with enhanced reasoning stored in
MaskedResultRelationalReasoningMongoObject and surfaced through aggregate
cells.
0.2.1 Preset Mask Review Follow-Up
keble-segmenting 0.2.1 closes the first review gaps on top of the breaking
combination-mask line:
- Parent-owned SegmentingToolRegistrar hooks allow backend to provide generic tools such as market evidence search without importing backend code into this package.
- SegmentingActionConfig.mask_generation_enabled controls whether package auto-segmenting plans generated mask work at all, not only whether parent cell context is available.
- Preset dimension normalization still keeps first explicit AI tags, but it now infers obvious missing PRICING, SCENE_OR_USE_CASE, and FEATURES_OR_FUNCTIONALITY roles from generated names/descriptions so preset masks do not silently skip when the AI omitted a clear tag.
0.2.2 Preset Market Tool Scope
keble-segmenting 0.2.2 narrows the parent-tool boundary introduced in the
previous review pass. SegmentingClient accepted the old market-named
registrar boundary, and the runtime passed those tools only into the
MARKET_DEMAND preset mask prompt. As of 0.6.0, the current boundary is
tool_registrars, which can supply bounded Exa tools to demand,
low-feasibility, and opportunity preset-mask prompts while keeping dimension
discovery, dimension merge, and normal/custom masks tool-free.
Preset reasoning persistence also separates fields more strictly:
reasoning stores the AI explanation from the mask result, while evidence
stores the parent-provided cell source digest and evidence lines. When no parent
source digest exists, the row says so explicitly instead of duplicating the AI
reasoning text.
0.3.0 Preset Mask Contract Hardening
keble-segmenting 0.3.0 renames the system-owned preset role contract from
display semantics to PresetMaskType. AI/user-created masks no longer accept a
preset role field; only MaskBase.build_preset_mask(...) can create masks tagged
with preset_mask_type. Normal update/delete/option/reorder actions now reject
structural mutations on preset masks, while the auto preset pipeline can still
persist combination mask results.
The cartesian preset-cell builder now lives on
DimensionAndOptionCombination.build_cartesian(...), and mask-result signatures
accept generic Sequence[DimensionAndOption] inputs.
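A rough sketch of a cartesian cell builder and an order-independent signature, using plain tuples. These free functions are illustrative only; they are not the package's `DimensionAndOptionCombination.build_cartesian(...)`, and the real signature format is not documented in this section.

```python
from collections.abc import Sequence
from itertools import product

# Hypothetical lightweight form: (dimension_key, option_key).
DimensionAndOption = tuple[str, str]


def build_cartesian(
    options_by_dimension: dict[str, list[str]],
) -> list[tuple[DimensionAndOption, ...]]:
    """Return one cell per combination of one option from each dimension."""
    dimension_keys = list(options_by_dimension)
    option_lists = [options_by_dimension[key] for key in dimension_keys]
    return [tuple(zip(dimension_keys, combo)) for combo in product(*option_lists)]


def cell_signature(cell: Sequence[DimensionAndOption]) -> str:
    """Assumed signature: sorted pairs, so input order does not matter."""
    return "|".join(f"{dimension}={option}" for dimension, option in sorted(cell))
```

Sorting before joining keeps the signature stable when the same cell is supplied with its dimensions in a different order.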
0.4.0 Empty Preset Mask Cells
keble-segmenting 0.4.0 keeps preset-mask omission neutral: when the model does
not emit a row for a cell, that cell is completed for progress accounting but
remains unknown/unclassified rather than failed, FALSE, or the lowest option.
Aggregate reads now union item-backed cells with stored mask and preset
reasoning signatures, so an empty analyzed cell can render mask overlays and
reasoning without copying item rows into that cell.
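The union described above can be pictured with cell signatures as plain strings. `cells_to_render` is a hypothetical helper and the signature strings are placeholders; the point is only that an analyzed cell without items still appears, with an item count of zero.

```python
def cells_to_render(
    item_counts_by_signature: dict[str, int],
    analyzed_signatures: set[str],
) -> dict[str, int]:
    """Union item-backed cells with cells that only have stored mask results."""
    signatures = set(item_counts_by_signature) | set(analyzed_signatures)
    # Cells known only from mask or reasoning rows get a zero item count.
    return {sig: item_counts_by_signature.get(sig, 0) for sig in sorted(signatures)}
```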
0.4.1 Preset Mask Naming Cleanup
keble-segmenting 0.4.1 keeps the 0.4.0 empty analyzed-cell behavior and
cleans the remaining display-worded preset terminology from the current schema
descriptions and docs. Public code should refer to PresetMaskType and preset
mask roles.
0.5.0 Terminal Segmentation Omissions
keble-segmenting 0.5.0 persisted AI omissions as terminal
SegmentedResultStatus.OMITTED rows. As of 0.9.0, explicit in-scope
unresolved AI matches use the canonical DimensionOptionType.OTHER bucket when
the dimension owns one. Omitted rows now mean invalid model output, untouched
batch items, out-of-scope/safety failures, or malformed legacy dimensions
without OTHER; omitted rows still do not participate in option-key
aggregation.
Matched rows required one selected option in 0.5.0; as of 0.6.0, matched
rows store non-empty option_keys. Aggregated cell reads use matched rows to
build placements and use all terminal rows to decide whether synthetic cells
are still processing or settled empty. This prevents one omitted item from
keeping a lazy axis in perpetual processing.
0.6.0 Multi-Select Segmenting Results
keble-segmenting 0.6.0 hard-breaks the persisted result contract from
singular option_key to option_keys.
- DimensionSelectionMode.SINGLE_SELECT requires exactly one matched option key.
- DimensionSelectionMode.MULTI_SELECT allows one item to belong to multiple options under the same dimension.
- SegmentedResultStatus.OMITTED rows keep option_keys=[]; omitted remains terminal unknown/unclassified and is not Other/Unclear.
- Aggregate reads expand multi-select display axes into cartesian cell placements while mask cell identity remains one DimensionAndOption per dimension.
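The selection-mode rules and the cartesian expansion can be sketched on plain strings and lists. `validate_option_keys` and `expand_placements` are hypothetical helpers, not package APIs, and the mode names are passed as strings for brevity.

```python
from itertools import product


def validate_option_keys(selection_mode: str, option_keys: list[str]) -> None:
    """Check a matched row against its dimension's selection mode."""
    if selection_mode == "SINGLE_SELECT" and len(option_keys) != 1:
        raise ValueError("single-select rows need exactly one option key")
    if selection_mode == "MULTI_SELECT" and not option_keys:
        raise ValueError("matched multi-select rows need at least one option key")


def expand_placements(
    option_keys_by_dimension: dict[str, list[str]],
) -> list[tuple[str, ...]]:
    """Return one cell placement per combination across the display dimensions."""
    return list(product(*option_keys_by_dimension.values()))
```

An item whose row on any display dimension has `option_keys=[]` yields no placements, which is consistent with omitted rows staying out of option-key aggregation.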
0.6.1 Result Contract Cleanup
keble-segmenting 0.6.1 removes the unused untyped item/dimension-map iterator
that duplicated aggregation behavior after the multi-select rewrite. Tests now
exercise direct result writes through option_keys only; singular option_key
remains valid only for dimension options, mask options, and mask result choices.
0.8.1 Opportunity Evidence Context
keble-segmenting 0.8.1 keeps the auto-segment queue context slim and carries
only one new optional owner field:
- AutoSegmentQueueContext.owner_type lets backend context providers load the correct user/org seller profile for preset masks.
- Market-opportunity prompt guidance now treats an empty observed sample as missing direct sample evidence, not as proof that demand is absent.
- When external tools are available, opportunity analysis should use comparable market and web evidence before final TRUE/FALSE decisions for empty or small-sample cells.
0.11.3 Positioning Evidence And Sample Metadata Correction
The current positioning correction keeps package version 0.11.3 but tightens
the user-facing display contract:
- CellImageDisplay may carry optional CellImageDisplayMetadata for public sample-image popovers: title, ASIN, brand, marketplace, price, rating, reviews, and monthly sold. Do not put item keys, stable keys, graph ids, or transport payloads in this metadata.
- Evidence persisted on segmented results, mask results, metric ranges, and preset relational reasoning is sanitized before storage through sanitize_user_evidence(...).
- Prompts should request JSON-only structured responses, but evidence string fields may use markdown-safe GFM prose. Do not emit markdown outside the JSON response boundary.