A Python toolkit for parsing sentences (natural language) into scene graphs (symbolic representations).

SceneGraphParser

SceneGraphParser (sng_parser) is a Python toolkit for parsing natural-language sentences into scene graphs (symbolic representations) based on dependency parsing. This project is inspired by the Stanford Scene Graph Parser.

Unlike the Stanford version, this parser is written purely in Python. It has an easy-to-use interface and an easy-to-configure design. It parses sentences into graphs whose nodes are nouns (with modifiers such as determiners or adjectives) and whose edges are relations between nouns. Please see the example section for details.

Highlight: This project is still being developed. ALL APIs are subject to ANY change.

Note: As you may notice, the parsing is done by a set of human-written rules applied to the parse tree. We therefore need everyone's help collecting failure and corner cases of the current program. Reports and help of any kind are more than welcome.

This repo was developed for:
Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations
Hao Wu, Jiayuan Mao, Yufeng Zhang, Yuning Jiang, Lei Li, Weiwei Sun, and Wei-Ying Ma
In Conference on Computer Vision and Pattern Recognition (CVPR) 2019 (Oral Presentation)

Please consider citing our paper if you feel comfortable :). The differences between this repo and the original Stanford Scene Graph Parser are described in #7.

Installation

The package can be installed using pip. As it currently only supports spaCy as the backend, the English model needs to be downloaded after installation.

pip install SceneGraphParser
python -m spacy download en  # to use the parser for English

Example

The easiest way to use this tool is by calling the parse function. By design, sng_parser supports different backends; currently, only the spaCy backend is available.

pip install spacy

>>> import sng_parser
>>> graph = sng_parser.parse('A woman is playing the piano in the room.')
>>> from pprint import pprint
>>> pprint(graph)
{'entities': [{'head': 'woman',
               'lemma_head': 'woman',
               'lemma_span': 'a woman',
               'modifiers': [{'dep': 'det', 'lemma_span': 'a', 'span': 'A'}],
               'span': 'A woman'},
              {'head': 'piano',
               'lemma_head': 'piano',
               'lemma_span': 'the piano',
               'modifiers': [{'dep': 'det',
                              'lemma_span': 'the',
                              'span': 'the'}],
               'span': 'the piano'},
              {'head': 'room',
               'lemma_head': 'room',
               'lemma_span': 'the room',
               'modifiers': [{'dep': 'det',
                              'lemma_span': 'the',
                              'span': 'the'}],
               'span': 'the room'}],
 'relations': [{'object': 1, 'relation': 'playing', 'subject': 0},
               {'object': 2, 'relation': 'in', 'subject': 0}]}
>>> sng_parser.tprint(graph)  # we provide a tabular visualization of the graph.
Entities:
+--------+-----------+-------------+
| Head   | Span      | Modifiers   |
|--------+-----------+-------------|
| woman  | a woman   | a           |
| piano  | the piano | the         |
| room   | the room  | the         |
+--------+-----------+-------------+
Relations:
+-----------+------------+----------+
| Subject   | Relation   | Object   |
|-----------+------------+----------|
| woman     | playing    | piano    |
| woman     | in         | room     |
+-----------+------------+----------+
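
The subject and object values in 'relations' are indices into the 'entities' list, as the output above shows. A minimal sketch of resolving them into readable triples is given below. The helper name relation_triples is hypothetical and not part of sng_parser, and the graph literal is a trimmed copy of the output above so that the snippet runs without the parser installed.

```python
# Trimmed copy of the parse output shown above (only the fields used here).
graph = {
    'entities': [
        {'head': 'woman', 'span': 'A woman'},
        {'head': 'piano', 'span': 'the piano'},
        {'head': 'room', 'span': 'the room'},
    ],
    'relations': [
        {'subject': 0, 'relation': 'playing', 'object': 1},
        {'subject': 0, 'relation': 'in', 'object': 2},
    ],
}


def relation_triples(graph):
    """Hypothetical helper: map index-based relations to (head, relation, head) tuples."""
    entities = graph['entities']
    return [
        (entities[r['subject']]['head'], r['relation'], entities[r['object']]['head'])
        for r in graph['relations']
    ]


for subj, rel, obj in relation_triples(graph):
    print(subj, rel, obj)
```

This yields the same rows as the Relations table printed by tprint, but as plain tuples that are easy to feed into other code.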

Alternatively, you can configure your own parser:

>>> import sng_parser
>>> parser = sng_parser.Parser('spacy', model='en')  # the positional argument specifies the backend, and the keyword arguments are for the backend initialization.
>>> graph = parser.parse('A woman is playing the piano in the room.')

Specification of the graph

We use plain Python dicts and lists to represent a graph. Although this flexibility may cause some unwanted issues, we prefer this representation because:

  1. The tool is still being developed, and these APIs are subject to change.
  2. It makes the tool easy to integrate into any Python-based project. You don't need to worry about pickling/unpickling the results. Use it anywhere in your code!
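
To illustrate the second point, here is a small sketch of serializing a graph with the standard library. It assumes the graph contains only dicts, lists, strings, and integers, as in the example output; the graph literal is a trimmed, hand-written copy of that output rather than a live parser result.

```python
import json
import pickle

# Hand-written graph in the same shape as the example output (trimmed for brevity).
graph = {
    'entities': [
        {'head': 'woman', 'lemma_head': 'woman', 'span': 'A woman',
         'lemma_span': 'a woman',
         'modifiers': [{'dep': 'det', 'span': 'A', 'lemma_span': 'a'}]},
        {'head': 'piano', 'lemma_head': 'piano', 'span': 'the piano',
         'lemma_span': 'the piano',
         'modifiers': [{'dep': 'det', 'span': 'the', 'lemma_span': 'the'}]},
    ],
    'relations': [{'subject': 0, 'relation': 'playing', 'object': 1}],
}

# Plain containers round-trip through both formats without custom encoders.
restored_json = json.loads(json.dumps(graph))
restored_pickle = pickle.loads(pickle.dumps(graph))

print(restored_json == graph, restored_pickle == graph)
```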

The generated scene graphs match the following spec:

{
  'entities': [  # a list of entities
    {
      'span': "the full span of a noun phrase",
      'lemma_span': "the lemmatized version of the span",
      'head': "the head noun",
      'lemma_head': "the lemmatized version of the head noun",
      'modifiers': [
        {
          'dep': "the dependency type",
          'span': "the span of the modifier",
          'lemma_span': "the lemmatized version of the span"
        },
        # other modifiers...
      ]
    },
    # other entities...
  ],

  'relations': [  # a list of relations
    # the subject and object fields are sometimes called "head" and "tail" in relation extraction papers.
    {
      'subject': "the entity id of the subject",
      'object': "the entity id of the object",
      'relation': "the relation"
    }
    # other relations...
  ]
}
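
Because the graph is an untyped dict, a light structural check can catch mistakes early. The sketch below is one possible way to do it. check_graph is a hypothetical helper, not part of sng_parser, and it assumes that an entity id is an index into the 'entities' list, as in the example output.

```python
ENTITY_KEYS = {'span', 'lemma_span', 'head', 'lemma_head', 'modifiers'}
MODIFIER_KEYS = {'dep', 'span', 'lemma_span'}
RELATION_KEYS = {'subject', 'object', 'relation'}


def check_graph(graph):
    """Hypothetical helper: return a list of problems (empty if the graph matches the spec)."""
    problems = []
    entities = graph.get('entities', [])
    for i, entity in enumerate(entities):
        missing = ENTITY_KEYS - set(entity)
        if missing:
            problems.append(f'entity {i} is missing {sorted(missing)}')
        for j, modifier in enumerate(entity.get('modifiers', [])):
            missing = MODIFIER_KEYS - set(modifier)
            if missing:
                problems.append(f'entity {i}, modifier {j} is missing {sorted(missing)}')
    for i, relation in enumerate(graph.get('relations', [])):
        missing = RELATION_KEYS - set(relation)
        if missing:
            problems.append(f'relation {i} is missing {sorted(missing)}')
            continue
        # Assumption: subject/object ids are positions in the entities list.
        for role in ('subject', 'object'):
            idx = relation[role]
            if not (isinstance(idx, int) and 0 <= idx < len(entities)):
                problems.append(f'relation {i} has an out-of-range {role}: {idx!r}')
    return problems


good = {
    'entities': [
        {'span': 'A woman', 'lemma_span': 'a woman', 'head': 'woman',
         'lemma_head': 'woman',
         'modifiers': [{'dep': 'det', 'span': 'A', 'lemma_span': 'a'}]},
        {'span': 'the piano', 'lemma_span': 'the piano', 'head': 'piano',
         'lemma_head': 'piano',
         'modifiers': [{'dep': 'det', 'span': 'the', 'lemma_span': 'the'}]},
    ],
    'relations': [{'subject': 0, 'relation': 'playing', 'object': 1}],
}
bad = {'entities': good['entities'],
       'relations': [{'subject': 0, 'relation': 'playing', 'object': 5}]}

print(check_graph(good))
print(check_graph(bad))
```

Since the APIs are subject to change, a check like this is best treated as a local safeguard tied to the spec as written here.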
