A Python package to parse structured information from recipe ingredient sentences

Reason this release was yanked:

No upper limit on Python version set.

Project description

Ingredient Parser

Ingredient Parser is a Python package for parsing structured information out of recipe ingredient sentences. For example, the sentence

1 large onion, finely chopped

becomes

{
    "quantity": 1,
    "unit": "large",
    "name": "onion",
    "comment": "finely chopped"
}

Documentation

Documentation on using the package and training the model can be found at https://ingredient-parser.readthedocs.io/en/latest/.

Quick Start

Install the package using pip:

python -m pip install ingredient-parser-nlp

Import the ``parse_ingredient`` function and pass it an ingredient sentence.

>>> from ingredient_parser import parse_ingredient

>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks")
{'sentence': '3 pounds pork shoulder, cut into 2-inch chunks',
 'quantity': '3',
 'unit': 'pound',
 'name': 'pork shoulder',
 'comment': ', cut into 2-inch chunks',
 'other': ''}

# Output confidence for each label
>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks", confidence=True)
{'sentence': '3 pounds pork shoulder, cut into 2-inch chunks',
 'quantity': '3',
 'unit': 'pound',
 'name': 'pork shoulder',
 'comment': ', cut into 2-inch chunks',
 'other': '',
 'confidence': {'quantity': 0.9986,
  'unit': 0.9967,
  'name': 0.9535,
  'comment': 0.9967,
  'other': 0}}
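One possible use of the confidence scores is to flag labels that fall below a threshold for manual review. The sketch below works on a dictionary shaped like the output above. The helper name `low_confidence_labels` and the 0.96 threshold are hypothetical and not part of the package. Empty fields are skipped, on the assumption that a score of 0 for an empty label only means nothing was assigned to it.

```python
def low_confidence_labels(parsed, threshold=0.96):
    """Return non-empty labels whose confidence is below the threshold.

    Hypothetical helper, not part of ingredient-parser-nlp.
    """
    scores = parsed.get("confidence", {})
    return sorted(
        label
        for label, score in scores.items()
        # Skip labels with no text (e.g. 'other': '') before comparing scores.
        if parsed.get(label) and score < threshold
    )


# Dictionary copied from the confidence=True example above.
parsed = {
    "sentence": "3 pounds pork shoulder, cut into 2-inch chunks",
    "quantity": "3",
    "unit": "pound",
    "name": "pork shoulder",
    "comment": ", cut into 2-inch chunks",
    "other": "",
    "confidence": {
        "quantity": 0.9986,
        "unit": 0.9967,
        "name": 0.9535,
        "comment": 0.9967,
        "other": 0,
    },
}

print(low_confidence_labels(parsed))  # ['name']
```

A sensible threshold depends on the data being parsed, so the value used here is only a placeholder.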

The returned dictionary has the format

{
    "sentence": str,
    "quantity": str,
    "unit": str,
    "name": str,
    "comment": Union[List[str], str],
    "other": Union[List[str], str]
}
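Because "comment" and "other" may be either a string or a list of strings, calling code may want to normalise them before display. The following is a minimal sketch under that assumption. `as_text` is a hypothetical helper, not a function provided by the package, and stripping the leading comma is a presentation choice rather than documented behaviour.

```python
from typing import List, Union


def as_text(field: Union[List[str], str]) -> str:
    """Collapse a str-or-list field into a single string.

    Hypothetical helper, not part of ingredient-parser-nlp.
    """
    parts = [field] if isinstance(field, str) else list(field)
    # Strip stray leading/trailing commas and spaces, then drop empty pieces.
    cleaned = [part.strip(" ,") for part in parts]
    return ", ".join(part for part in cleaned if part)


print(as_text(", cut into 2-inch chunks"))    # cut into 2-inch chunks
print(as_text(["peeled", "finely chopped"]))  # peeled, finely chopped
print(repr(as_text("")))                      # ''
```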

Model accuracy

The model provided in the ingredient-parser/ directory has the following accuracy on a test set made up of 25% of the data:

Sentence-level results:
	Total: 9277
	Correct: 7689
	-> 82.88%

Word-level results:
	Total: 52931
	Correct: 50051
	-> 94.56%
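The percentages above follow directly from the correct and total counts. As a quick check, the arithmetic can be reproduced as below. The function name `accuracy_percent` is only illustrative and is not part of the package.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return correct/total as a percentage rounded to two decimal places."""
    return round(100 * correct / total, 2)


# Counts taken from the sentence-level and word-level results above.
print(accuracy_percent(7689, 9277))    # 82.88
print(accuracy_percent(50051, 52931))  # 94.56
```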

Development

The development dependencies are in the requirements-dev.txt file.

Note that development includes training the model.

  • Black is used for code formatting.

  • isort is used for import sorting.

  • flake8 is used for linting. Note the line length standard (E501) is ignored.

  • pyright is used for static type analysis.

The documentation dependencies are in the requirement-doc.txt file.

Download files

Source Distribution

ingredient_parser_nlp-0.1.0a1.tar.gz (790.6 kB view details)

Uploaded Source

Built Distribution

ingredient_parser_nlp-0.1.0a1-py3-none-any.whl (787.7 kB view details)

Uploaded Python 3

File details

Details for the file ingredient_parser_nlp-0.1.0a1.tar.gz.

File metadata

  • Download URL: ingredient_parser_nlp-0.1.0a1.tar.gz
  • Size: 790.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.4

File hashes

Hashes for ingredient_parser_nlp-0.1.0a1.tar.gz
Algorithm Hash digest
SHA256 b4397d79869a16ba1d626e3fc07fc8e5d8d6f59871789c80b19be8e3dc7358ca
MD5 42c44c1cf9e907cc5218d3fc74d0ef57
BLAKE2b-256 efc98c752e891a8b268a5c4e4c3bdbf10f942c227eb0a2eee693494532c5025d

File details

Details for the file ingredient_parser_nlp-0.1.0a1-py3-none-any.whl.

File hashes

Hashes for ingredient_parser_nlp-0.1.0a1-py3-none-any.whl
Algorithm Hash digest
SHA256 cd7f0965e6e0da3400b19ecd9fc0277bdba20a439266eccb7be02d4079f264e5
MD5 8db6360706ec1837023d3087fd124586
BLAKE2b-256 c04aa232f3c4c5ad37f19fa04d7d73996a33ab9b963600f4354aa30fffab9c27
