Skip to main content

Visualization package for Spark NLP

Project description

spark-nlp-display

A library for the simple visualization of different types of Spark NLP annotations.

Supported Visualizations:

  • Dependency Parser
  • Named Entity Recognition
  • Entity Resolution
  • Relation Extraction
  • Assertion Status

Complete Tutorial

Open In Colab

https://github.com/JohnSnowLabs/spark-nlp-display/blob/main/tutorials/Spark_NLP_Display.ipynb

Requirements

  • spark-nlp
  • ipython
  • svgwrite
  • pandas
  • numpy

Installation

pip install spark-nlp-display

How to use

Databricks

For all modules, pass in the additional parameter "return_html=True" in the display function and use Databrick's function displayHTML() to render visualization as explained below:

from sparknlp_display import NerVisualizer

ner_vis = NerVisualizer()

## To set custom label colors:
ner_vis.set_label_colors({'LOC':'#800080', 'PER':'#77b5fe'}) #set label colors by specifying hex codes

pipeline_result = ner_light_pipeline.fullAnnotate(text) ##light pipeline
#pipeline_result = ner_full_pipeline.transform(df).collect()##full pipeline

vis_html = ner_vis.display(pipeline_result[0], #should be the results of a single example, not the complete dataframe
                    label_col='entities', #specify the entity column
                    document_col='document', #specify the document column (default: 'document')
                    labels=['PER'], #only allow these labels to be displayed. (default: [] - all labels will be displayed)
                    return_html=True)

displayHTML(vis_html)

title

Jupyter

To save the visualization as html, provide the export file path: save_path='./export.html' for each visualizer.

Dependency Parser

from sparknlp_display import DependencyParserVisualizer

dependency_vis = DependencyParserVisualizer()

pipeline_result = dp_pipeline.fullAnnotate(text)
#pipeline_result = dp_full_pipeline.transform(df).collect()##full pipeline

dependency_vis.display(pipeline_result[0], #should be the results of a single example, not the complete dataframe.
                       pos_col = 'pos', #specify the pos column
                       dependency_col = 'dependency', #specify the dependency column
                       dependency_type_col = 'dependency_type', #specify the dependency type column
                       save_path='./export.html' # optional - to save viz as html. (default: None)
                       )

title

Named Entity Recognition

from sparknlp_display import NerVisualizer

ner_vis = NerVisualizer()

pipeline_result = ner_light_pipeline.fullAnnotate(text)
#pipeline_result = ner_full_pipeline.transform(df).collect()##full pipeline

ner_vis.display(pipeline_result[0], #should be the results of a single example, not the complete dataframe
                    label_col='entities', #specify the entity column
                    document_col='document', #specify the document column (default: 'document')
                    labels=['PER'], #only allow these labels to be displayed. (default: [] - all labels will be displayed)
                    save_path='./export.html' # optional - to save viz as html. (default: None)
                    )

## To set custom label colors:
ner_vis.set_label_colors({'LOC':'#800080', 'PER':'#77b5fe'}) #set label colors by specifying hex codes

title

Entity Resolution

from sparknlp_display import EntityResolverVisualizer

er_vis = EntityResolverVisualizer()

pipeline_result = er_light_pipeline.fullAnnotate(text)

er_vis.display(pipeline_result[0], #should be the results of a single example, not the complete dataframe
               label_col='entities', #specify the ner result column
               resolution_col = 'resolution',
               document_col='document', #specify the document column (default: 'document')
               save_path='./export.html' # optional - to save viz as html. (default: None)
               )

## To set custom label colors:
er_vis.set_label_colors({'TREATMENT':'#800080', 'PROBLEM':'#77b5fe'}) #set label colors by specifying hex codes

title

Relation Extraction

from sparknlp_display import RelationExtractionVisualizer

re_vis = RelationExtractionVisualizer()

pipeline_result = re_light_pipeline.fullAnnotate(text)

re_vis.display(pipeline_result[0], #should be the results of a single example, not the complete dataframe
               relation_col = 'relations', #specify relations column
               document_col = 'document', #specify document column
               show_relations=True, #display relation names on arrows (default: True)
               save_path='./export.html' # optional - to save viz as html. (default: None)
               )

title

Assertion Status

from sparknlp_display import AssertionVisualizer

assertion_vis = AssertionVisualizer()

pipeline_result = ner_assertion_light_pipeline.fullAnnotate(text)

assertion_vis.display(pipeline_result[0], 
                      label_col = 'entities', #specify the ner result column
                      assertion_col = 'assertion', #specify assertion column
                      document_col = 'document', #specify the document column (default: 'document')
                      save_path='./export.html' # optional - to save viz as html. (default: None)
                      )
                      
## To set custom label colors:
assertion_vis.set_label_colors({'TREATMENT':'#008080', 'problem':'#800080'}) #set label colors by specifying hex codes

title

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spark-nlp-display-1.9.tar.gz (85.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

spark_nlp_display-1.9-py3-none-any.whl (95.4 kB view details)

Uploaded Python 3

File details

Details for the file spark-nlp-display-1.9.tar.gz.

File metadata

  • Download URL: spark-nlp-display-1.9.tar.gz
  • Upload date:
  • Size: 85.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for spark-nlp-display-1.9.tar.gz
Algorithm Hash digest
SHA256 0f4c7e9c45ac94b029d17e99da499b8b676e68b46cb31377543d9ae883bcaee2
MD5 00e7c3f6009094166e3e94ec809bef43
BLAKE2b-256 4cc23771144b3fa9a0c660d175f01d5def42040ff5c14ec4fe41e99ce0dfd4d1

See more details on using hashes here.

File details

Details for the file spark_nlp_display-1.9-py3-none-any.whl.

File metadata

  • Download URL: spark_nlp_display-1.9-py3-none-any.whl
  • Upload date:
  • Size: 95.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for spark_nlp_display-1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 7b545475723a5ecf2c2525ed987d89c3a92abe6ea4805cc9f420eec5a8b5dec7
MD5 2ac0769f732b11ef47d46b348396317f
BLAKE2b-256 d00f6bd0a21d608c90f33087b98fbd7adcdaaa587703719f46692960268ee1fc

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page