fishlib 🐟

A Python library for parsing, standardizing, and comparing seafood product descriptions in foodservice.

The Problem: Seafood product descriptions are messy. The same product can be described a hundred different ways. Comparing prices across distributors, suppliers, or market data requires deep domain knowledge to know if two items are actually comparable.

The Solution: fishlib parses item descriptions into structured attributes, standardizes them to common codes, and enables apples-to-apples comparisons — so you don't need to be a fish expert to work with seafood data.

What's New in 0.6.4

  • Fixed: Bare WALLEYE no longer false-matches Alaska Pollock (new dedicated walleye and pike categories)
  • Fixed: Bare-letter trim codes (A, B, C, D, E) now recognized with word-boundary safety
  • Fixed: Squid TUBES no longer misclassified as rings — new TUBE form code added
  • Added 16 new categories: trout, sea_bass, branzino, snapper, grouper, monkfish, mackerel, walleye, pike, perch, barramundi, hamachi, rockfish, roe, crawfish, conch
  • Expanded oyster: added Kumamoto, European Flat (Belon), and Olympia species
  • Cleaner schema: removed duplicate codes (SKOFF folded into SKLS, MRNTD kept only in value_added, meat-grade CLAW renamed to CLAW_MEAT to disambiguate from form)
  • Zero dependencies: pandas removed from install requirements (never actually imported in library code)

Coverage: 38 categories, 109 species/subspecies, 411 aliases.

Installation

pip install fishlib

Quick Start

import fishlib

# Parse any item description
item = fishlib.parse("SALMON FIL ATL SKON D 6OZ IVP")

print(item['category'])       # 'salmon'
print(item['subspecies'])     # 'atlantic'
print(item['form'])           # 'FIL'
print(item['skin'])           # 'SKON'
print(item['trim'])           # 'D'
print(item['size'])           # '6OZ'
print(item['pack'])           # 'IVP'

# Get a comparison key for matching
key = fishlib.comparison_key(item)
# "SALMON|ATLANTIC|FIL|SKON|D|6OZ"

# Check if two items are comparable
result = fishlib.match(
    "SALMON PRTN ATL BNLS SKLS 6 OZ CENTER CUT",
    "Portico Salmon Fillet 6 oz Boneless / Skinless",
)
print(result['is_comparable'])        # True
print(result['confidence'])           # e.g., 0.85
print(result['different_attributes']) # ['form']
print(result['recommendation'])       # human-readable summary
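
The comparison key shown above appears to join the identity-defining attributes with pipes and to leave the pack code out. The following is a minimal sketch of that idea, not fishlib's actual implementation: `build_key` and `KEY_FIELDS` are hypothetical names, and the field order is inferred only from the example key in this section.

```python
# Hypothetical sketch: this is NOT fishlib's implementation of comparison_key().
# Field order is an assumption inferred from the example key "SALMON|ATLANTIC|FIL|SKON|D|6OZ".
KEY_FIELDS = ("category", "subspecies", "form", "skin", "trim", "size")


def build_key(item: dict) -> str:
    """Join the identity-defining attributes into one pipe-delimited key."""
    parts = []
    for field in KEY_FIELDS:
        value = item.get(field)
        # A missing attribute becomes an empty slot so positions stay aligned.
        parts.append(str(value).upper() if value else "")
    return "|".join(parts)


parsed = {
    "category": "salmon",
    "subspecies": "atlantic",
    "form": "FIL",
    "skin": "SKON",
    "trim": "D",
    "size": "6OZ",
    "pack": "IVP",  # ignored: pack is not part of the key in this sketch
}
print(build_key(parsed))  # SALMON|ATLANTIC|FIL|SKON|D|6OZ
```

Leaving pack out of the key means an IVP item and an IQF item with otherwise identical attributes would share a key, which matches the example output above.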

Features

Parse Item Descriptions

Turn messy text into structured data:

fishlib.parse("POLLOCK FIL WILD ALASKA PROCESSED CHINA 6OZ IVP")
# {
#   'category': 'pollock', 'subspecies': 'alaska',
#   'form': 'FIL', 'size': '6OZ', 'pack': 'IVP',
#   'origin_harvest': 'USA', 'origin_processed': 'CHN',
#   'freeze_cycle': 'TWICE',   # finfish processed in Asia ⇒ twice frozen
#   ...
# }
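
To give a feel for what parsing involves, here is a toy token-lookup parser. It is only a sketch under stated assumptions: `toy_parse` and the tiny alias tables are hypothetical and stand in for the much larger alias set the library ships, and origin and freeze-cycle handling are left out entirely.

```python
import re

# Hypothetical, deliberately tiny alias tables for illustration only.
CATEGORY_ALIASES = {"SALMON": "salmon", "POLLOCK": "pollock"}
SUBSPECIES_ALIASES = {"ATL": "atlantic", "ATLANTIC": "atlantic", "ALASKA": "alaska"}
FORM_ALIASES = {"FIL": "FIL", "FILLET": "FIL", "PRTN": "PRTN", "PORTION": "PRTN"}
PACK_CODES = {"IVP", "IQF", "CVP", "BULK"}
SIZE_PATTERN = re.compile(r"\d+(?:\.\d+)?OZ")  # e.g. 6OZ, 4.5OZ (no space handling)


def toy_parse(description: str) -> dict:
    """Classify each whitespace-separated token; the first hit per attribute wins."""
    item = {}
    for token in description.upper().split():
        if token in CATEGORY_ALIASES:
            item.setdefault("category", CATEGORY_ALIASES[token])
        elif token in SUBSPECIES_ALIASES:
            item.setdefault("subspecies", SUBSPECIES_ALIASES[token])
        elif token in FORM_ALIASES:
            item.setdefault("form", FORM_ALIASES[token])
        elif token in PACK_CODES:
            item.setdefault("pack", token)
        elif SIZE_PATTERN.fullmatch(token):
            item.setdefault("size", token)
        # Unrecognized tokens (WILD, PROCESSED, CHINA, ...) are skipped here.
    return item


print(toy_parse("POLLOCK FIL WILD ALASKA PROCESSED CHINA 6OZ IVP"))
```

A real parser also has to handle multi-word phrases such as "6 OZ" or "PROCESSED CHINA", which single-token lookup cannot see.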

Standardized Codes

| Attribute | Codes |
|---|---|
| Form | FIL (Fillet), PRTN (Portion), LOIN, WHL (Whole), STEAK, TAIL, TUBE, RING, CLUSTER, LEG, CLAW, MEAT, H&G, PD, PUD, HLSO, EZPL, ... |
| Skin | SKON (Skin On), SKLS (Skinless) |
| Bone | BNLS (Boneless), BIN (Bone In), PBO (Pin Bone Out) |
| Trim | A, B, C, D, E, FTRIM (bare letter or TRIM-D / DTRM variants) |
| Pack | IVP, IQF, CVP, BULK, SHL, TRAY |
| Storage | FRZ (Frozen), FRSH (Fresh), RFRSH (Refreshed) |
| Harvest | WILD, FARM |
| Preparation | RAW, CKD, SMKD, CURED |
| Value-added | BRDD, STFD, MRNTD, SSNDD, POF |
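
Bare-letter trim codes are easy to confuse with letters inside other words, which is why word boundaries matter. The regex below is one possible way to recognize the variants listed for Trim. It is a sketch, not the pattern fishlib uses, and `TRIM_PATTERN` and `find_trim` are hypothetical names.

```python
import re

# Hypothetical pattern for illustration; fishlib's real matching rules may differ.
TRIM_PATTERN = re.compile(
    r"\b(?:"
    r"TRIM[- ]?(?P<after>[A-E])"      # TRIM-D, TRIM D, TRIMD
    r"|(?P<before>[A-E])[- ]?TRI?M"   # DTRM, D-TRIM, D TRIM
    r"|(?P<full>FTRIM)"               # full trim
    r"|(?P<bare>[A-E])"               # a bare letter standing alone
    r")\b"
)


def find_trim(description: str):
    """Return the first trim code found in the description, or None."""
    match = TRIM_PATTERN.search(description.upper())
    if match is None:
        return None
    return (
        match.group("after")
        or match.group("before")
        or match.group("full")
        or match.group("bare")
    )


print(find_trim("SALMON FIL ATL SKON D 6OZ IVP"))  # D
print(find_trim("SALMON FIL TRIM-D"))              # D
print(find_trim("COD LOIN BREADED"))               # None
```

The trailing `\b` is what keeps the A in ATL or the C in COD from matching. A standalone letter with another meaning, such as the A in "GRADE A", would still match in this sketch, so a production parser needs extra context checks.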

Origin & Freeze Cycle

# origin split into harvest (where caught/farmed) and processed (where cut/portioned)
item['origin_harvest']     # 'USA'  — caught in Alaska
item['origin_processed']   # 'CHN'  — portioned in China
item['freeze_cycle']       # 'TWICE' — inferred from Asian processing of finfish

# origin is also populated for legacy single-field use
item['origin']             # 'USA'
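
The freeze-cycle inference described above (finfish processed in Asia implies twice frozen) can be pictured as a simple rule. This is a sketch of that one rule only: the function name, the category subset, and the country list are assumptions made for illustration, and returning `None` when the rule does not apply is also an assumption rather than documented behavior.

```python
# Hypothetical subsets for illustration; fishlib's real lists are not shown in this README.
FINFISH_CATEGORIES = {"salmon", "cod", "pollock", "haddock"}
ASIAN_PROCESSING_COUNTRIES = {"CHN", "VNM", "THA", "IDN"}


def infer_freeze_cycle(category, origin_processed):
    """Return 'TWICE' for finfish processed in Asia, else None (assumed fallback)."""
    if category in FINFISH_CATEGORIES and origin_processed in ASIAN_PROCESSING_COUNTRIES:
        # Fish frozen at harvest, thawed for cutting overseas, then refrozen.
        return "TWICE"
    return None


print(infer_freeze_cycle("pollock", "CHN"))  # TWICE
print(infer_freeze_cycle("pollock", "USA"))  # None
```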

Species Coverage

38 categories across parent groups:

  • Finfish: salmon, tuna, mahi, swordfish, sea_bass, branzino, snapper, grouper, monkfish, mackerel, barramundi, hamachi, rockfish
  • Groundfish: cod (incl. black cod/sablefish), haddock, pollock
  • Flatfish: halibut, flounder, sole
  • Freshwater: tilapia, swai (pangasius/basa/tra), catfish, trout, walleye, pike, perch
  • Crustacean: crab, lobster, shrimp, crawfish
  • Mollusk: scallop, clam, oyster, mussel, calamari, octopus, conch
  • Roe: roe (ikura, tobiko, masago, caviar, uni)

Trim Guide (Salmon)

| Trim | Description | Skin |
|---|---|---|
| A | Backbone off, bellybone off | ON |
| B | A + backfin off, collarbone off, belly fat/fins off | ON |
| C | B + pin bone out | ON |
| D | C + back trimmed, tailpiece off, belly membrane off, nape trimmed | ON |
| E | D + skin removed | OFF |

Foodservice standard: Trim D (skin on) and Trim E (skin off).
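
Because each trim level builds on the previous one, the guide can be modeled as a chain in which every level stores only its own steps plus a pointer to its parent. The sketch below is one way to encode that. `TRIM_STEPS`, `trim_operations`, and `is_skin_on` are hypothetical names and are not the `reference` module's API.

```python
# Hypothetical encoding of the salmon trim guide: (parent level, steps added at this level).
TRIM_STEPS = {
    "A": (None, ["backbone off", "bellybone off"]),
    "B": ("A", ["backfin off", "collarbone off", "belly fat/fins off"]),
    "C": ("B", ["pin bone out"]),
    "D": ("C", ["back trimmed", "tailpiece off", "belly membrane off", "nape trimmed"]),
    "E": ("D", ["skin removed"]),
}


def trim_operations(level: str) -> list:
    """Expand a trim level into every operation it includes, earliest first."""
    parent, steps = TRIM_STEPS[level.upper()]
    inherited = trim_operations(parent) if parent else []
    return inherited + steps


def is_skin_on(level: str) -> bool:
    """A trim is skin-on unless its expanded operations include skin removal."""
    return "skin removed" not in trim_operations(level)


print(trim_operations("C"))
print(is_skin_on("D"), is_skin_on("E"))  # True False
```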

Cut Styles (Portions)

| Style | Description | Value |
|---|---|---|
| Center Cut | From center of fish only, no tails/nape | Premium |
| Bias | Cut at angle for better presentation | Premium |
| Block | Straight cuts end-to-end, includes tails | Mid |
| Random | Mixed pieces, various shapes | Value |

Match & Compare

fishlib.is_comparable(item1, item2)              # True / False
fishlib.match(item1, item2)                      # full match dict
fishlib.find_matches(target, candidates, threshold=0.8)
fishlib.explain_difference(item1, item2)         # human-readable explanation
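
Conceptually, matching comes down to comparing parsed attributes and filtering candidates by a score threshold. The sketch below illustrates that idea with a naive score (the share of attributes that agree). It is not how fishlib computes `confidence`; `compare_items`, `rank_candidates`, and `COMPARE_FIELDS` are hypothetical names.

```python
# Hypothetical attribute list and scoring; fishlib's real weighting is not documented here.
COMPARE_FIELDS = ("category", "subspecies", "form", "skin", "bone", "trim", "size")


def compare_items(a: dict, b: dict) -> dict:
    """List the attributes that differ and score the pair by share of agreement."""
    different = [f for f in COMPARE_FIELDS if a.get(f) != b.get(f)]
    score = 1 - len(different) / len(COMPARE_FIELDS)
    return {"different_attributes": different, "score": round(score, 2)}


def rank_candidates(target: dict, candidates: list, threshold: float = 0.8) -> list:
    """Keep candidates scoring at or above the threshold, best first."""
    scored = [(compare_items(target, c)["score"], c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score >= threshold]


portion = {"category": "salmon", "form": "PRTN", "skin": "SKLS", "bone": "BNLS", "size": "6OZ"}
fillet = {"category": "salmon", "form": "FIL", "skin": "SKLS", "bone": "BNLS", "size": "6OZ"}
cod = {"category": "cod", "form": "LOIN", "skin": "SKON", "bone": "BNLS", "size": "8OZ"}

print(compare_items(portion, fillet))  # {'different_attributes': ['form'], 'score': 0.86}
print(rank_candidates(portion, [cod, fillet]))
```

In this sketch two missing values count as agreement, which is a simplification. A real matcher would likely weight attributes differently, since a category mismatch matters far more than a pack mismatch.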

Reference Data

from fishlib import reference

reference.trim_levels('salmon')       # Trim A–E definitions
reference.is_trim_skin_on('D')        # True  (trim D is skin-on)
reference.cut_style('CENTER')         # {'description': ..., 'premium': True}
reference.price_tier('salmon', 'king')  # {'tier': 'ultra-premium'}

Why This Exists

In foodservice distribution, comparing prices requires knowing if products are truly comparable. A "6oz salmon fillet" from two different sources might be:

  • Center-cut bias portion at $12/lb (premium)
  • Block-cut with tail pieces at $8/lb (commodity)

Without the right attributes, price comparisons are meaningless. fishlib encodes the domain knowledge needed to make accurate comparisons — so you don't need 20 years of fish experience to work with seafood data.

Contributing

Contributions welcome. Areas of interest:

  • Additional species and regional variants
  • International market terminology
  • Packaging and processing codes
  • Price reference data

Author

Karen Morton, a seafood industry professional with 20+ years of experience in category management and procurement. fishlib grew out of years spent managing seafood categories and the realization that this knowledge should be accessible to everyone, not trapped in experts' heads.

Acknowledgments

Developed with assistance from Claude (Anthropic) for code scaffolding, refactoring, test harness construction, and documentation. All seafood domain knowledge — species classification, trim logic, cut styles, freeze-cycle inference rules, alias curation, and design decisions — comes from the author's 20+ years in foodservice category management.

License

MIT License — use it, modify it, share it. Just make seafood data better for everyone.
