A Python package to parse structured information from recipe ingredient sentences

Ingredient Parser

Ingredient Parser is a Python package for parsing structured information out of recipe ingredient sentences.

Documentation

Documentation on using the package and training the model can be found at https://ingredient-parser.readthedocs.io/.

Quick Start

Install the package using pip:

$ python -m pip install ingredient-parser-nlp

Import the parse_ingredient function and pass it an ingredient sentence.

>>> from ingredient_parser import parse_ingredient
>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks")
ParsedIngredient(
    name=[IngredientText(text='pork shoulder', confidence=0.996867, starting_index=2)],
    size=None,
    amount=[IngredientAmount(quantity=Fraction(3, 1),
                             quantity_max=Fraction(3, 1),
                             unit=<Unit('pound')>,
                             text='3 pounds',
                             confidence=0.999982,
                             starting_index=0,
                             unit_system=<UnitSystem.US_CUSTOMARY: 'us_customary'>,
                             APPROXIMATE=False,
                             SINGULAR=False,
                             RANGE=False,
                             MULTIPLIER=False,
                             PREPARED_INGREDIENT=False)],
    preparation=IngredientText(text='cut into 2 inch chunks',
                               confidence=0.999946,
                               starting_index=5),
    comment=None,
    purpose=None,
    foundation_foods=[],
    sentence='3 pounds pork shoulder, cut into 2-inch chunks'
)

Refer to the documentation for the optional parameters that can be used with parse_ingredient.
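Once a sentence has been parsed, the fields shown in the output above can be read as ordinary attributes. The following is a minimal sketch of flattening the result into a plain dictionary. The summarise helper is hypothetical and not part of the library, and a SimpleNamespace stand-in with values copied from the output above is used in place of a real ParsedIngredient so the snippet runs without the package installed.

```python
from fractions import Fraction
from types import SimpleNamespace


def summarise(parsed):
    """Flatten a ParsedIngredient-like object into a plain dict.

    Hypothetical helper: it only relies on the attribute names visible in
    the Quick Start output (name, amount, preparation, text, unit, ...).
    """
    return {
        "names": [n.text for n in (parsed.name or [])],
        "amounts": [
            {"quantity": a.quantity, "unit": str(a.unit), "confidence": a.confidence}
            for a in (parsed.amount or [])
        ],
        "preparation": parsed.preparation.text if parsed.preparation else None,
    }


# With the package installed, the input would instead come from:
#   from ingredient_parser import parse_ingredient
#   parsed = parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks")
# Stand-in object mirroring the fields printed in the Quick Start output.
parsed = SimpleNamespace(
    name=[SimpleNamespace(text="pork shoulder", confidence=0.996867, starting_index=2)],
    amount=[
        SimpleNamespace(
            quantity=Fraction(3, 1),
            quantity_max=Fraction(3, 1),
            unit="pound",  # the real object holds a Unit, shown as <Unit('pound')>
            text="3 pounds",
            confidence=0.999982,
            starting_index=0,
        )
    ],
    preparation=SimpleNamespace(
        text="cut into 2 inch chunks", confidence=0.999946, starting_index=5
    ),
    comment=None,
)

summary = summarise(parsed)
print(summary)
```

Fields that the parser leaves as None (such as size or comment in the example above) should be checked before use, which is why the helper guards each attribute access.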

Model

The core of the library is a sequence labelling model that is used to label each token in the sentence with the part of the sentence it belongs to. A data set of 81,000 example sentences is used to train and evaluate the model. See the Explanation section of the documentation for more details.
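To make the idea of sequence labelling concrete, the sketch below pairs each token of the example sentence with a label and then merges consecutive tokens that share a label into spans. The tokenization, the label names (QTY, UNIT, NAME, PUNC, PREP) and the group_spans helper are all illustrative assumptions, and the labels are assigned by hand rather than predicted by the model.

```python
from itertools import groupby

# Hand-written tokens and labels for illustration only; the library's own
# tokenizer and label set may differ.
tokens = ["3", "pounds", "pork", "shoulder", ",", "cut", "into", "2-inch", "chunks"]
labels = ["QTY", "UNIT", "NAME", "NAME", "PUNC", "PREP", "PREP", "PREP", "PREP"]


def group_spans(tokens, labels):
    """Merge runs of identically labelled tokens into (label, start, text) spans."""
    spans = []
    indexed = enumerate(zip(tokens, labels))
    # groupby only merges adjacent items, which is what a span requires.
    for label, run in groupby(indexed, key=lambda item: item[1][1]):
        run = list(run)
        start = run[0][0]  # token index where the span begins
        text = " ".join(tok for _idx, (tok, _lab) in run)
        spans.append((label, start, text))
    return spans


spans = group_spans(tokens, labels)
for span in spans:
    print(span)
```

Under this assumed tokenization, the spans begin at token positions 0, 2 and 5, which appears to line up with the starting_index values reported for the amount, name and preparation in the Quick Start output.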

Evaluated on a test set made up of 20% of the total data, the model achieves the following accuracy:

╒══════════════════════════╤══════════════════════════╕
│ Sentence-level results   │ Word-level results       │
╞══════════════════════════╪══════════════════════════╡
│ Accuracy: 95.25%         │ Accuracy: 98.09%         │
│                          │ Precision (micro) 98.07% │
│                          │ Recall (micro) 98.09%    │
│                          │ F1 score (micro) 98.08%  │
╘══════════════════════════╧══════════════════════════╛
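The two columns measure different things: word-level accuracy counts individual tokens labelled correctly, whereas sentence-level accuracy only counts a sentence when every one of its tokens is correct, so it is the stricter figure. The sketch below illustrates the distinction on toy data. It is not the project's evaluation code, and the label names and function names are made up for the example.

```python
def word_accuracy(true_seqs, pred_seqs):
    """Fraction of individual tokens whose predicted label matches the truth."""
    pairs = [
        (t, p)
        for true_labels, pred_labels in zip(true_seqs, pred_seqs)
        for t, p in zip(true_labels, pred_labels)
    ]
    return sum(t == p for t, p in pairs) / len(pairs)


def sentence_accuracy(true_seqs, pred_seqs):
    """Fraction of sentences in which every token label is correct."""
    matches = sum(t == p for t, p in zip(true_seqs, pred_seqs))
    return matches / len(true_seqs)


# Toy data: two sentences of four tokens each, with hypothetical labels.
true_seqs = [
    ["QTY", "UNIT", "NAME", "NAME"],
    ["QTY", "NAME", "PUNC", "PREP"],
]
pred_seqs = [
    ["QTY", "UNIT", "NAME", "NAME"],     # fully correct
    ["QTY", "NAME", "PUNC", "COMMENT"],  # one wrong token
]

word_acc = word_accuracy(true_seqs, pred_seqs)
sent_acc = sentence_accuracy(true_seqs, pred_seqs)
print(f"word-level: {word_acc:.3f}, sentence-level: {sent_acc:.3f}")
```

A single wrong token costs the second sentence its sentence-level credit while barely moving the word-level figure, which is why the sentence-level accuracy in the table is lower than the word-level accuracy.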

Development

Basic

You can train or fine-tune the model on new ingredient datasets to extend it beyond the pre-trained model provided with the library. The development dependencies are in the requirements-dev.txt file. Details on the training process can be found in the Explanation documentation.

Web App

The ingredient parser library provides a convenient web interface that you can run locally to access most of the library's functionality, including using the parser, browsing the database, labelling entries, and training the model(s). View the specific README in webtools for a detailed overview.

[Screenshots: web parser, web labeller, web trainer]

Documentation

The dependencies for building the documentation are in the requirements-doc.txt file.

Contribution

Please target the develop branch for pull requests. The main branch is used for stable releases and hotfixes only.

Before committing anything, install pre-commit and run the following to install the hooks:

$ pre-commit install

Pre-commit hooks cover both the main Python library code and the web app (webtools) code.


