A Python package to parse structured information from recipe ingredient sentences
Ingredient Parser
Ingredient Parser is a Python package that parses structured information out of recipe ingredient sentences.
Documentation
Documentation on using the package and training the model can be found at https://ingredient-parser.readthedocs.io/.
Quick Start
Install the package using pip:
$ python -m pip install ingredient-parser-nlp
Import the parse_ingredient function and pass it an ingredient sentence.
>>> from ingredient_parser import parse_ingredient
>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks")
ParsedIngredient(
    name=[IngredientText(text='pork shoulder', confidence=0.996867, starting_index=2)],
    size=None,
    amount=[IngredientAmount(quantity=Fraction(3, 1),
                             quantity_max=Fraction(3, 1),
                             unit=<Unit('pound')>,
                             text='3 pounds',
                             confidence=0.999982,
                             starting_index=0,
                             unit_system=<UnitSystem.US_CUSTOMARY: 'us_customary'>,
                             APPROXIMATE=False,
                             SINGULAR=False,
                             RANGE=False,
                             MULTIPLIER=False,
                             PREPARED_INGREDIENT=False)],
    preparation=IngredientText(text='cut into 2 inch chunks',
                               confidence=0.999946,
                               starting_index=5),
    comment=None,
    purpose=None,
    foundation_foods=[],
    sentence='3 pounds pork shoulder, cut into 2-inch chunks'
)
Refer to the documentation for the optional parameters that can be used with parse_ingredient.
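The fields printed in the example output can be read as ordinary attributes. The sketch below flattens a result into a plain dict and halves the quantity. So that it runs without the package installed, it uses `SimpleNamespace` stand-ins that mirror only the fields shown in the output above. The `summarize` helper is hypothetical and not part of the library, and the assumption that `str()` of the real unit object gives a readable unit name has not been checked against the library.

```python
from fractions import Fraction
from types import SimpleNamespace


def summarize(parsed):
    """Flatten a parsed-ingredient-like object into a plain dict.

    Hypothetical helper: relies only on the attribute names visible
    in the example output (name, amount, preparation).
    """
    amount = parsed.amount[0] if parsed.amount else None
    return {
        "name": [item.text for item in parsed.name],
        "quantity": amount.quantity if amount else None,
        # Assumption: str() of the unit object yields a readable name.
        "unit": str(amount.unit) if amount else None,
        "preparation": parsed.preparation.text if parsed.preparation else None,
    }


# Stand-in that mirrors the printed output. With the package installed,
# pass the return value of parse_ingredient(...) instead.
parsed = SimpleNamespace(
    name=[SimpleNamespace(text="pork shoulder", confidence=0.996867, starting_index=2)],
    amount=[SimpleNamespace(quantity=Fraction(3, 1), unit="pound", text="3 pounds")],
    preparation=SimpleNamespace(text="cut into 2 inch chunks"),
)

summary = summarize(parsed)

# Quantities are Fraction objects, so scaling a recipe stays exact.
half_batch = summary["quantity"] / 2
```

Because the quantity is a `Fraction`, scaling avoids floating-point rounding; convert with `float()` only when displaying the value.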
Model
The core of the library is a sequence labelling model that is used to label each token in the sentence with the part of the sentence it belongs to. A data set of 81,000 example sentences is used to train and evaluate the model. See the Explanation section of the documentation for more details.
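To illustrate what sequence labelling means here, the sketch below pairs each token of the Quick Start sentence with a label and merges consecutive tokens that share a label into spans. The tokens and labels are assigned by hand for illustration; the label names (`QTY`, `UNIT`, `NAME`, `PUNC`, `PREP`) and the `group_labelled_tokens` helper are assumptions, not the library's actual label set or post-processing.

```python
from itertools import groupby

# Hand-written tokens and labels for the Quick Start sentence.
# In the library, a trained model predicts the label for each token.
tokens = ["3", "pounds", "pork", "shoulder", ",", "cut", "into", "2", "inch", "chunks"]
labels = ["QTY", "UNIT", "NAME", "NAME", "PUNC", "PREP", "PREP", "PREP", "PREP", "PREP"]


def group_labelled_tokens(tokens, labels):
    """Merge runs of identically labelled tokens into (label, text, start) spans."""
    spans = []
    index = 0
    for label, group in groupby(zip(tokens, labels), key=lambda pair: pair[1]):
        words = [token for token, _ in group]
        spans.append((label, " ".join(words), index))
        index += len(words)
    return spans


spans = group_labelled_tokens(tokens, labels)
for label, text, start in spans:
    print(f"{label:5} start={start} text={text!r}")
```

With these hand-assigned labels, the `NAME` span starts at token 2 and the `PREP` span at token 5, which matches the `starting_index` values in the Quick Start output.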
The model achieves the following performance on a held-out test set comprising 20% of the total data:
╒══════════════════════════╤═══════════════════════════╕
│ Sentence-level results   │ Word-level results        │
╞══════════════════════════╪═══════════════════════════╡
│ Accuracy: 95.25%         │ Accuracy: 98.09%          │
│                          │ Precision (micro): 98.07% │
│                          │ Recall (micro): 98.09%    │
│                          │ F1 score (micro): 98.08%  │
╘══════════════════════════╧═══════════════════════════╛
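The difference between the two accuracy figures is easiest to see in code. In the sketch below, word-level accuracy counts every correctly labelled token, while sentence-level accuracy counts a sentence only if all of its tokens are correct. The function names, label names, and toy data are assumptions for illustration; this is not the project's evaluation script.

```python
def word_accuracy(true_labels, predicted_labels):
    """Fraction of individual tokens whose predicted label is correct."""
    total = sum(len(sentence) for sentence in true_labels)
    correct = sum(
        t == p
        for true_sent, pred_sent in zip(true_labels, predicted_labels)
        for t, p in zip(true_sent, pred_sent)
    )
    return correct / total


def sentence_accuracy(true_labels, predicted_labels):
    """Fraction of sentences in which every token is labelled correctly."""
    correct = sum(
        true_sent == pred_sent
        for true_sent, pred_sent in zip(true_labels, predicted_labels)
    )
    return correct / len(true_labels)


# Toy data: two sentences, one mislabelled token in the second.
true_labels = [
    ["QTY", "UNIT", "NAME"],
    ["QTY", "UNIT", "NAME", "NAME", "PREP"],
]
predicted_labels = [
    ["QTY", "UNIT", "NAME"],
    ["QTY", "UNIT", "NAME", "PREP", "PREP"],
]

word_acc = word_accuracy(true_labels, predicted_labels)      # 7 of 8 tokens
sent_acc = sentence_accuracy(true_labels, predicted_labels)  # 1 of 2 sentences
```

A single wrong token makes the whole sentence count as incorrect, which is why the sentence-level figure is lower than the word-level one.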
Development
Basic
You can train or fine-tune the model on new ingredient datasets to extend it beyond the pre-trained model provided with the library. The development dependencies are listed in the requirements-dev.txt file. Details on the training process can be found in the Explanation section of the documentation.
Web App
The ingredient parser library provides a convenient web interface that you can run locally to access most of the library's functionality, including using the parser, browsing the database, labelling entries, and training the model(s). View the specific README in webtools for a detailed overview.
Documentation
The dependencies for building the documentation are in the requirements-doc.txt file.
Contribution
Please target the develop branch for pull requests. The main branch is used for stable releases and hotfixes only.
Before committing anything, install pre-commit and run the following to install the hooks:
$ pre-commit install
The pre-commit hooks cover both the main Python library code and the web app (webtools) code.
File details
Details for the file ingredient_parser_nlp-2.5.0.tar.gz.
File metadata
- Download URL: ingredient_parser_nlp-2.5.0.tar.gz
- Upload date:
- Size: 4.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c4c40a51f617a5ffa2012071d5946e0b447f00efe572a10d9adc20442e51fca5 |
| MD5 | 9df280bdbf09b4ffbcd028ee45b12350 |
| BLAKE2b-256 | 006089dbeb798397855d4ceb800e950e466c9a98f5bf2570a712fc9ac7045163 |
File details
Details for the file ingredient_parser_nlp-2.5.0-py3-none-any.whl.
File metadata
- Download URL: ingredient_parser_nlp-2.5.0-py3-none-any.whl
- Upload date:
- Size: 4.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d9c7589f6d98570ffcd7f01484db573d15b3ccaab5579566419317a85cbf340 |
| MD5 | ec67bc6bdc3ac37af878ea7dd7c3f656 |
| BLAKE2b-256 | 90340ed35a6e695ed6bb1e20888fbbc58415042ed166981cda79ffa1289d911a |