
A Python library for parsing, standardizing, and comparing seafood product descriptions in foodservice


fishlib 🐟

A Python library for parsing, standardizing, and comparing seafood product descriptions in the food industry.

The Problem: Seafood product descriptions are messy. The same product can be described a hundred different ways. Comparing prices across distributors, suppliers, or market data requires deep domain knowledge to know if two items are actually comparable.

The Solution: fishlib parses item descriptions into structured attributes, standardizes them to common codes, and enables apples-to-apples comparisons—so you don't need to be a fish expert to work with seafood data.

Installation

pip install fishlib

Quick Start

import fishlib

# Parse any item description
item = fishlib.parse("SALMON FIL ATL SKON DTRM 6OZ IVP")

print(item)
# {
#     'species': 'Atlantic Salmon',
#     'form': 'FIL',
#     'skin': 'SKON',
#     'bone': 'BNLS',
#     'trim': 'D',
#     'size': '6OZ',
#     'size_bucket': '6-8OZ',
#     'pack': 'IVP',
#     'storage': 'FRZ'
# }

# Get a comparison key for matching
key = fishlib.comparison_key(item)
print(key)
# "SALMON|ATLANTIC|FIL|SKON|BNLS|D|6-8OZ"

# Check if two items are comparable
distributor_item = "SALMON PORTION ATL BNLS SKLS 6 OZ CENTER CUT"
circana_item = "Portico Salmon Fillet 6 oz Boneless / Skinless"

match = fishlib.match(distributor_item, circana_item)
print(match)
# {
#     'is_match': True,
#     'confidence': 0.85,
#     'differences': ['form: PORTION vs FIL'],
#     'recommendation': 'Comparable with caution - form differs'
# }
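To picture how a comparison key like the one above could be assembled from a parsed item, here is a minimal sketch. It is an illustration only, not fishlib's implementation: the helper name `build_comparison_key` and the rule for splitting the species string are assumptions, chosen so that the Quick Start item yields the key shown. Pack and storage are left out because the example key does not include them.

```python
def build_comparison_key(item):
    """Join the attributes that appear in the example key with '|'.

    Hypothetical helper for illustration; not part of fishlib.
    """
    # Assumption: the last word of the species name is the category and
    # everything before it is the variety, e.g. "Atlantic Salmon".
    *variety, category = item["species"].upper().split()
    parts = [
        category,
        " ".join(variety),
        item["form"],
        item["skin"],
        item["bone"],
        item["trim"],
        item["size_bucket"],
    ]
    return "|".join(parts)


item = {
    "species": "Atlantic Salmon",
    "form": "FIL",
    "skin": "SKON",
    "bone": "BNLS",
    "trim": "D",
    "size": "6OZ",
    "size_bucket": "6-8OZ",
    "pack": "IVP",
    "storage": "FRZ",
}
print(build_comparison_key(item))
# SALMON|ATLANTIC|FIL|SKON|BNLS|D|6-8OZ
```

Because the key uses the size bucket rather than the exact size, a 6OZ item and a 7OZ item would share a key under this sketch.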

Features

Parse Item Descriptions

Turn messy text into structured data:

fishlib.parse("SALMON SOCKEYE FIL WILD ALASKA SKON 8OZ IQF")
# Returns structured dict with all attributes

Origin Tracking (v0.4.0)

Separate harvest and processing countries for accurate sourcing:

fishlib.parse("POLLOCK FIL WILD ALASKA PROCESSED IN CHINA 6OZ")
# Returns:
# {
#     'origin_harvest': 'USA',
#     'origin_processed': 'CHN',
#     'freeze_cycle': 'TWICE',
#     ...
# }
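The harvest/processing split can be sketched as a two-step lookup: pull out the "PROCESSED IN ..." phrase first, then treat any remaining place name as the harvest origin. The code below is a rough illustration under stated assumptions, not fishlib's parser: `split_origins`, `PLACE_TO_COUNTRY`, and the regular expression are all hypothetical, and the place table is deliberately tiny.

```python
import re

# Hypothetical lookup table; a real one would cover many more places.
PLACE_TO_COUNTRY = {"ALASKA": "USA", "USA": "USA", "CHINA": "CHN"}

# Assumption: processing origin is written as "PROCESSED IN <PLACE>".
PROCESSED_RE = re.compile(r"\bPROCESSED IN (\w+)")


def split_origins(description):
    """Return harvest and processing countries found in a description."""
    text = description.upper()
    processed = None
    match = PROCESSED_RE.search(text)
    if match:
        processed = PLACE_TO_COUNTRY.get(match.group(1))
        # Remove the phrase so its place name is not read as the harvest origin.
        text = text[:match.start()] + text[match.end():]
    harvest = next(
        (PLACE_TO_COUNTRY[word] for word in text.split() if word in PLACE_TO_COUNTRY),
        None,
    )
    return {"origin_harvest": harvest, "origin_processed": processed}


print(split_origins("POLLOCK FIL WILD ALASKA PROCESSED IN CHINA 6OZ"))
# {'origin_harvest': 'USA', 'origin_processed': 'CHN'}
```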

Freeze Cycle Inference (v0.4.0)

Automatically determines single-frozen vs twice-frozen:

  • Finfish + Asian processing country → TWICE (twice-frozen)
  • Finfish domestic processing → SINGLE (single-frozen)
  • Crustaceans/mollusks → exempt
  • A freeze cycle mismatch is a hard block: items with different freeze cycles are never comparable
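The rules above can be sketched as a small decision function. This is a hedged illustration, not the library's code: the function names, the `category` labels, the short list of Asian country codes, and the treatment of "domestic" as `USA` are all assumptions. Cases the rules do not cover return `None` rather than a guess.

```python
# Hypothetical subset of Asian processing countries (ISO alpha-3 codes).
ASIAN_PROCESSORS = {"CHN", "VNM", "THA"}
DOMESTIC = "USA"  # assumption: "domestic" means processed in the USA


def infer_freeze_cycle(category, origin_processed):
    """Return 'TWICE', 'SINGLE', or None when exempt or undetermined."""
    if category in ("crustacean", "mollusk"):
        return None  # exempt from freeze cycle inference
    if origin_processed in ASIAN_PROCESSORS:
        return "TWICE"
    if origin_processed == DOMESTIC:
        return "SINGLE"
    return None  # the rules above do not cover this case


def freeze_cycle_blocks(cycle_a, cycle_b):
    """A mismatch between two known freeze cycles blocks comparability."""
    return cycle_a is not None and cycle_b is not None and cycle_a != cycle_b


print(infer_freeze_cycle("finfish", "CHN"))    # TWICE
print(infer_freeze_cycle("finfish", "USA"))    # SINGLE
print(freeze_cycle_blocks("TWICE", "SINGLE"))  # True
```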

Size Bucket Matching (v0.4.2)

Exact sizes and ranges map to competitive buckets for PMI comparisons:

fishlib.parse("POLLOCK FIL 2OZ")['size_bucket']    # '2-3OZ'
fishlib.parse("POLLOCK FIL 2-3OZ")['size_bucket']  # '2-3OZ'

# Now comparable!
fishlib.is_comparable("POLLOCK FIL 2OZ", "POLLOCK FIL 2-3OZ")  # True
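As a rough sketch of how an exact size and a range can land in the same bucket, the snippet below parses a size string and looks for a bucket that contains it. The bucket boundaries in `BUCKETS` are hypothetical (only `2-3OZ` and `6-8OZ` appear in this README), and `size_bucket` here is a standalone function, not the library's implementation.

```python
import re

# Hypothetical bucket boundaries in ounces.
BUCKETS = [(2, 3), (4, 5), (6, 8)]

# Matches "2OZ", "2-3OZ", or "6 OZ"; the second number is optional.
SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)(?:-(\d+(?:\.\d+)?))?\s*OZ$")


def size_bucket(size):
    """Map an exact size or a size range to a bucket label, else None."""
    match = SIZE_RE.match(size.strip().upper())
    if not match:
        return None
    low = float(match.group(1))
    high = float(match.group(2) or match.group(1))
    for bucket_low, bucket_high in BUCKETS:
        # The whole size (or range) must fit inside the bucket.
        if bucket_low <= low and high <= bucket_high:
            return f"{bucket_low}-{bucket_high}OZ"
    return None


print(size_bucket("2OZ"))    # 2-3OZ
print(size_bucket("2-3OZ"))  # 2-3OZ
print(size_bucket("6 oz"))   # 6-8OZ
```

Comparing bucket labels instead of raw sizes is what lets `2OZ` and `2-3OZ` be treated as the same size class.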

Enhanced Attribute Extraction (v0.2.0)

# Crab meat grade detection
fishlib.parse("CRAB MEAT JUMBO LUMP PASTEURIZED")
# Returns: {'meat_grade': 'JUMBO_LUMP', ...}

# Preparation status (raw, cooked, smoked, cured)
fishlib.parse("SHRIMP 16/20 P&D COOKED")
# Returns: {'preparation': 'COOKED', ...}

# Value-added detection (breaded, stuffed, marinated, etc.)
fishlib.parse("COD FIL PANKO CRUSTED 4OZ")
# Returns: {'value_added': 'BREADED', ...}
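Attribute detection of this kind is often done with ordered keyword tables. The sketch below shows one possible approach and makes no claim about how fishlib does it: `first_match` and both keyword tables are hypothetical. The point it illustrates is ordering, since "JUMBO LUMP" must be checked before plain "LUMP".

```python
import re

# Ordered (phrase, code) pairs. Longer phrases come first so that
# "JUMBO LUMP" is not reported as plain "LUMP".
MEAT_GRADES = [
    ("JUMBO LUMP", "JUMBO_LUMP"),
    ("LUMP", "LUMP"),
    ("BACKFIN", "BACKFIN"),
    ("CLAW", "CLAW"),
]

# Hypothetical trigger words; the real keyword list is not documented here.
VALUE_ADDED = [
    ("PANKO", "BREADED"),
    ("BREADED", "BREADED"),
    ("STUFFED", "STUFFED"),
    ("MARINATED", "MARINATED"),
]


def first_match(description, table):
    """Return the code of the first phrase found as whole words, else None."""
    text = description.upper()
    for phrase, code in table:
        if re.search(rf"\b{re.escape(phrase)}\b", text):
            return code
    return None


print(first_match("CRAB MEAT JUMBO LUMP PASTEURIZED", MEAT_GRADES))  # JUMBO_LUMP
print(first_match("COD FIL PANKO CRUSTED 4OZ", VALUE_ADDED))         # BREADED
```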

Standardize Codes

Consistent codes across any data source:

| Attribute | Codes |
|---|---|
| Form | FIL (Fillet), PRTN (Portion), LOIN, WHL (Whole), STEAK, etc. |
| Skin | SKON (Skin On), SKLS (Skinless), SKOFF (Skin Off) |
| Bone | BNLS (Boneless), BIN (Bone In), PBO (Pin Bone Out) |
| Trim | A, B, C, D, E (see Trim Guide) |
| Pack | IVP, IQF, CVP, BULK |
| Storage | FRZ (Frozen), FRSH (Fresh), RFRSH (Refreshed) |
| Meat Grade | JUMBO_LUMP, LUMP, BACKFIN, SPECIAL, CLAW |
| Preparation | RAW, COOKED, SMOKED, CURED |
| Value-Added | BREADED, STUFFED, MARINATED, GLAZED, BLACKENED, FORMED |
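To make the idea of standardizing concrete, here is a minimal sketch that maps the spelled-out labels from the table above to their codes. The label/code pairs come from the table; the `standardize` helper and its whitespace handling are assumptions for illustration and are not fishlib functions.

```python
# Label -> code pairs taken from the table above.
LABEL_TO_CODE = {
    "FILLET": "FIL",
    "PORTION": "PRTN",
    "WHOLE": "WHL",
    "SKIN ON": "SKON",
    "SKINLESS": "SKLS",
    "SKIN OFF": "SKOFF",
    "BONELESS": "BNLS",
    "BONE IN": "BIN",
    "PIN BONE OUT": "PBO",
    "FROZEN": "FRZ",
    "FRESH": "FRSH",
    "REFRESHED": "RFRSH",
}
KNOWN_CODES = set(LABEL_TO_CODE.values())


def standardize(term):
    """Return the standard code for a label or code, else None."""
    # Uppercase and collapse runs of whitespace before the lookup.
    cleaned = " ".join(term.upper().split())
    if cleaned in KNOWN_CODES:
        return cleaned  # already a standard code
    return LABEL_TO_CODE.get(cleaned)


print(standardize("fillet"))       # FIL
print(standardize("  skin   on"))  # SKON
print(standardize("FIL"))          # FIL
```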

Species Support

Built-in knowledge for 46 seafood categories and 90+ species:

  • Salmon: Atlantic, King/Chinook, Sockeye, Coho, Keta/Chum, Pink
  • Crab: King, Snow, Dungeness, Blue, Stone, Jonah, Soft Shell
  • Lobster: Maine, Canadian, Warm Water
  • Shrimp: White, Pink, Brown, Tiger, Rock, Royal Red
  • Groundfish: Cod (Atlantic, Pacific, Black/Sablefish, Ling), Haddock, Pollock, Rockfish
  • Flatfish: Flounder, Halibut, Sole (Dover, Petrale, Lemon, Rex, Gray)
  • Shellfish: Scallops (Sea, Bay, Calico), Clams, Oysters, Mussels
  • Snapper: Red, Yellowtail, Vermilion, Lane, Mangrove, Silk
  • Grouper: Red, Black, Gag, Yellowedge, Scamp
  • Catfish: US Farm-Raised (Domestic), Channel (Imported), Blue
  • Other Finfish: Branzino, Sea Bass (Chilean, Black, Striped), Trout, Barramundi, Wahoo, Monkfish, Mahi, Swordfish, Tuna, Anchovy, Whiting, Perch, Sardine, Herring, Mackerel, Hake, Orange Roughy, Corvina, Cobia, Hamachi, Pike
  • Other Shellfish: Crawfish, Calamari, Octopus, Langostino, Conch

Reference Data

Access industry knowledge:

# Salmon trim levels
fishlib.reference.trim_levels('salmon')
# Returns definitions for Trim A-E with skin status

# Species price tiers (relative positioning, not dollar amounts)
fishlib.species.get_price_tier('salmon', 'king')
# Returns: 'ultra-premium'

# Cut style definitions
fishlib.reference.cut_style('center_cut')
# Returns: {'description': 'Portions from center of fish only...', 'premium': True}

Match & Compare

Find comparable items across data sources:

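# Example inputs (any two item descriptions work)
item1 = "SALMON PORTION ATL BNLS SKLS 6 OZ CENTER CUT"
item2 = "Portico Salmon Fillet 6 oz Boneless / Skinless"

# Simple match
fishlib.is_comparable(item1, item2)  # Returns True/False

# Detailed match with confidence score
fishlib.match(item1, item2)  # Returns match details

# Find best matches in a list
target_item = item1
list_of_items = [item2, "SALMON FIL ATL SKON DTRM 6OZ IVP"]
fishlib.find_matches(target_item, list_of_items, threshold=0.8)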
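For intuition about what a threshold does, the sketch below scores candidates by the fraction of attributes that agree and keeps those at or above the cutoff. This scoring rule is an assumption made for illustration; it is not fishlib's confidence calculation and will not reproduce its scores. The names `similarity`, `find_matches_sketch`, and `COMPARE_FIELDS` are hypothetical.

```python
# Hypothetical choice of attributes to compare.
COMPARE_FIELDS = ("species", "form", "skin", "bone", "size_bucket")


def similarity(a, b):
    """Fraction of COMPARE_FIELDS present in both items that agree."""
    shared = [field for field in COMPARE_FIELDS if field in a and field in b]
    if not shared:
        return 0.0
    return sum(a[field] == b[field] for field in shared) / len(shared)


def find_matches_sketch(target, candidates, threshold=0.8):
    """Return (score, candidate) pairs at or above threshold, best first."""
    scored = [(similarity(target, c), c) for c in candidates]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


target = {"species": "Atlantic Salmon", "form": "FIL", "skin": "SKLS",
          "bone": "BNLS", "size_bucket": "6-8OZ"}
candidates = [
    {**target, "form": "PRTN"},                  # one difference
    dict(target),                                # identical
    {**target, "form": "PRTN", "skin": "SKON"},  # two differences
]
for score, candidate in find_matches_sketch(target, candidates):
    print(score, candidate["form"], candidate["skin"])
# 1.0 FIL SKLS
# 0.8 PRTN SKLS
```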

Trim Guide (Salmon)

| Trim | Description | Skin |
|---|---|---|
| A | Backbone off, bellybone off | ON |
| B | + Backfin off, collarbone off, belly fat/fins off | ON |
| C | + Pin bone out | ON |
| D | + Back trimmed, tailpiece off, belly membrane off, nape trimmed | ON |
| E | Everything in D + skin removed | OFF |

Key insight: Trim A-D are all skin ON. Only Trim E is skin OFF. Foodservice standard: Trim D (skin on) and Trim E (skin off).
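Since each trim level adds steps on top of the previous one, the guide can be modeled as a cumulative lookup. The sketch below encodes the table's steps directly; the names `TRIM_STEPS`, `cumulative_steps`, and `skin_status` are hypothetical and are not the `fishlib.reference` API.

```python
LEVELS = ["A", "B", "C", "D", "E"]

# Steps each level adds on top of the previous one, per the Trim Guide.
TRIM_STEPS = {
    "A": ["backbone off", "bellybone off"],
    "B": ["backfin off", "collarbone off", "belly fat/fins off"],
    "C": ["pin bone out"],
    "D": ["back trimmed", "tailpiece off", "belly membrane off", "nape trimmed"],
    "E": ["skin removed"],
}


def cumulative_steps(trim):
    """All steps applied at a trim level, including earlier levels."""
    index = LEVELS.index(trim.upper())  # raises ValueError for unknown levels
    steps = []
    for level in LEVELS[: index + 1]:
        steps.extend(TRIM_STEPS[level])
    return steps


def skin_status(trim):
    """'OFF' only when the skin has been removed, which happens at Trim E."""
    return "OFF" if "skin removed" in cumulative_steps(trim) else "ON"


print(cumulative_steps("C"))
# ['backbone off', 'bellybone off', 'backfin off', 'collarbone off', 'belly fat/fins off', 'pin bone out']
print(skin_status("D"), skin_status("E"))
# ON OFF
```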

Cut Styles (Portions)

| Style | Description | Value |
|---|---|---|
| Center Cut | From center of fish only, no tails/nape | Premium |
| Bias | Cut at angle for better presentation | Premium |
| Block | Straight cuts end-to-end, includes tails | Mid |
| Random | Mixed pieces, various shapes | Value |

Why This Exists

In food distribution, comparing prices requires knowing if products are truly comparable. A "6oz salmon fillet" from two different sources might be:

  • Center-cut bias portion (premium)
  • Block-cut with tail pieces (commodity)

Without the right attributes, price comparisons are meaningless. fishlib encodes the domain knowledge needed to make accurate comparisons—so you don't need 20 years of fish experience to work with seafood data.

Changelog

See CHANGELOG.md for full version history.

Latest: v0.4.3

  • Rockfish: Own category (Pacific Rockfish / Sebastes) — no longer misclassified as Striped Bass
  • Striped Bass: Reversed word order ("BASS STRIPED") now parses correctly
  • Catfish: Split into Domestic (US farm-raised), Channel (imported), and Blue subspecies
  • Scallop: Fixed false-match on standalone "SEA" alias

v0.4.2

  • Size buckets: 2OZ and 2-3OZ now match for competitive comparisons

v0.4.0

  • Origin split: Separate harvest vs processing country tracking
  • Freeze cycle: Automatic single-frozen vs twice-frozen inference

v0.3.0

  • 14 new species: Anchovy, Whiting, Perch, Sardine, Herring, Mackerel, Hake, Orange Roughy, Corvina, Cobia, Langostino, Conch, Hamachi, Pike

v0.2.0

  • New attributes: meat_grade, preparation, value_added
  • 19 new species: Snapper, Grouper, Branzino, Sea Bass, Trout, Barramundi, Wahoo, Monkfish, Crawfish

v0.1.0

  • Initial release

Contributing

Contributions welcome! Areas of interest:

  • Additional species and regional variants
  • International market terminology
  • Packaging and processing codes

Author

Karen Morton — Seafood industry professional with 20+ years of experience in category management and procurement.

Built from years of experience managing seafood categories and the realization that this knowledge should be accessible to everyone, not trapped in experts' heads.

License

MIT License — Use it, modify it, share it. Just make seafood data better for everyone.

