A Python package to parse structured information from recipe ingredient sentences

Ingredient Parser

Ingredient Parser is a Python package for parsing structured information out of recipe ingredient sentences. For example,

1 large onion, finely chopped

becomes

{
    "quantity": 1,
    "unit": "large",
    "name": "onion",
    "comment": "finely chopped"
}

Documentation

Documentation on using the package and training the model can be found at https://ingredient-parser.readthedocs.io/en/latest/.

Quick Start

Install the package using pip:

python -m pip install ingredient-parser-nlp

Import the `parse_ingredient` function and pass it an ingredient sentence.

>>> from ingredient_parser import parse_ingredient

>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks")
{'sentence': '3 pounds pork shoulder, cut into 2-inch chunks',
 'quantity': '3',
 'unit': 'pound',
 'name': 'pork shoulder',
 'comment': ', cut into 2-inch chunks',
 'other': ''}

# Output confidence for each label
>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks", confidence=True)
{'sentence': '3 pounds pork shoulder, cut into 2-inch chunks',
 'quantity': '3',
 'unit': 'pound',
 'name': 'pork shoulder',
 'comment': ', cut into 2-inch chunks',
 'other': '',
 'confidence': {'quantity': 0.9986,
  'unit': 0.9967,
  'name': 0.9535,
  'comment': 0.9967,
  'other': 0}}
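
The confidence scores can be used to flag labels that may need a manual check. The following is a minimal sketch that works on a dictionary shaped like the output above, so it runs without the package installed. The helper name `low_confidence_labels` and the 0.96 threshold are illustrative assumptions, not part of the package.

```python
# Dictionary copied from the example output above.
parsed = {
    "sentence": "3 pounds pork shoulder, cut into 2-inch chunks",
    "quantity": "3",
    "unit": "pound",
    "name": "pork shoulder",
    "comment": ", cut into 2-inch chunks",
    "other": "",
    "confidence": {
        "quantity": 0.9986,
        "unit": 0.9967,
        "name": 0.9535,
        "comment": 0.9967,
        "other": 0,
    },
}


def low_confidence_labels(result, threshold=0.96):
    """Return the non-empty labels whose confidence is below the threshold.

    Hypothetical helper, not part of ingredient_parser.
    """
    return [
        label
        for label, score in result["confidence"].items()
        # Skip labels with no text, e.g. 'other' here, which is empty.
        if result[label] and score < threshold
    ]


print(low_confidence_labels(parsed))  # ['name']
```

Empty labels are skipped because their confidence of 0 only means that nothing was assigned to that label.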

The returned dictionary has the following format:

{
    "sentence": str,
    "quantity": str,
    "unit": str,
    "name": str,
    "comment": Union[List[str], str],
    "other": Union[List[str], str]
}
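
Because `comment` and `other` may be either a string or a list of strings, calling code may find it convenient to normalise them to one type. The sketch below shows one way to do that. The `as_list` helper is a hypothetical name introduced for illustration and is not provided by the package.

```python
from typing import List, Union


def as_list(value: Union[List[str], str]) -> List[str]:
    """Normalise a str-or-list field to a list of non-empty strings.

    Hypothetical helper, not part of ingredient_parser.
    """
    items = [value] if isinstance(value, str) else list(value)
    # Drop empty strings so that an empty field becomes an empty list.
    return [item for item in items if item]


print(as_list(", cut into 2-inch chunks"))  # [', cut into 2-inch chunks']
print(as_list(""))                          # []
print(as_list(["peeled", "diced"]))         # ['peeled', 'diced']
```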

Model accuracy

The model provided in the ingredient-parser/ directory has the following accuracy on a test set made up of 25% of the data:

Sentence-level results:
	Total: 9277
	Correct: 7689
	-> 82.88%

Word-level results:
	Total: 52931
	Correct: 50051
	-> 94.56%
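
The percentages above follow directly from the counts. As a quick check, the arithmetic can be reproduced as follows. The function name `accuracy_percent` is an illustrative assumption.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return correct/total as a percentage, rounded to two decimal places."""
    return round(100 * correct / total, 2)


print(accuracy_percent(7689, 9277))    # 82.88 (sentence level)
print(accuracy_percent(50051, 52931))  # 94.56 (word level)
```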

Development

The development dependencies are in the requirements-dev.txt file.

Note that development includes training the model.

Black is used for code formatting.
isort is used for import sorting.
flake8 is used for linting. Note the line length standard (E501) is ignored.
pyright is used for static type analysis.

The documentation dependencies are in the requirements-doc.txt file.

Release history

2.1.1 (May 18, 2025)
2.1.0 (Apr 21, 2025)
2.0.0 (Feb 21, 2025)
1.3.2 (Dec 6, 2024)
1.3.1 (Nov 29, 2024)
1.3.0 (Nov 6, 2024)
1.2.0 (Sep 29, 2024)
1.1.2 (Aug 23, 2024)
1.1.1 (Aug 16, 2024)
1.1.0 (Aug 15, 2024)
1.0.1 (Aug 10, 2024)
1.0.0 (Jun 17, 2024)
0.1.0b11 (May 27, 2024, pre-release)
0.1.0b10 (Apr 12, 2024, pre-release)
0.1.0b9 (Apr 6, 2024, pre-release)
0.1.0b8 (Jan 27, 2024, pre-release)
0.1.0b7 (Nov 21, 2023, pre-release)
0.1.0b6 (Oct 24, 2023, pre-release, yanked)