loads spark

Project description

Load a Spark session before running the example below.
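The example on this page expects a live Spark session bound to the name `sess`. How this package creates that session is not shown here, so the following is only a sketch that assumes Spark NLP's own `sparknlp.start()` helper instead; `start_session` is a hypothetical wrapper name.

```python
def start_session():
    """Return a SparkSession configured with the Spark NLP package.

    Assumption: this uses sparknlp.start() rather than this package's
    own loader, whose API is not shown on this page.
    """
    # Imported lazily so that defining the helper does not require Spark.
    import sparknlp

    return sparknlp.start()
```

Calling `sess = start_session()` once, before the pipeline code, would provide the `sess` object that `sess.read.text(...)` relies on.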

from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    NerDLModel,
    GraphExtraction, UniversalSentenceEncoder,
    Tokenizer, WordEmbeddingsModel
)


# `sess` (used at the end) must be an existing SparkSession; create it before running this code

# Sentence embeddings; note that this stage is not part of graph_pipeline below
use = UniversalSentenceEncoder \
    .pretrained() \
    .setInputCols(["document"]) \
    .setOutputCol("use_embeddings")

document_assembler = DocumentAssembler() \
    .setInputCol("value") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

word_embeddings = WordEmbeddingsModel \
    .pretrained() \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")


ner_tagger = NerDLModel \
    .pretrained() \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")

graph_extraction = GraphExtraction() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("graph") \
    .setRelationshipTypes(["lad-PER", "lad-LOC"]) \
    .setMergeEntities(True)

graph_pipeline = Pipeline() \
    .setStages([
        document_assembler, tokenizer,
        word_embeddings, ner_tagger,
        graph_extraction
    ])

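# sess.read.text yields one row per line in a string column named "value",
# which is the column document_assembler reads from.
df = sess.read.text('./data/train.dat')
result = graph_pipeline.fit(df).transform(df)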
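
`transform` returns a DataFrame, and `GraphExtraction` writes its annotations to the column named by `setOutputCol("graph")`. A small sketch for looking at that column follows; `show_graph_column` is a hypothetical helper, not part of this package or of Spark NLP.

```python
def show_graph_column(result_df, column="graph", n=5):
    """Print the first n rows of one output column without truncating cells.

    Assumption: result_df is the DataFrame returned by
    graph_pipeline.fit(df).transform(df).
    """
    # select() narrows the DataFrame to the annotation column;
    # truncate=False keeps long annotation structs fully visible.
    result_df.select(column).show(n, truncate=False)
```

Passing the transformed DataFrame to `show_graph_column` prints the extracted paths; other output columns such as `"ner"` can be inspected the same way by changing `column`.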

Download files

Source Distribution

spark_loader-0.0.3.tar.gz (4.4 kB)

Uploaded Source

Built Distribution

spark_loader-0.0.3-py3-none-any.whl (3.8 kB)

Uploaded Python 3

File details

Details for the file spark_loader-0.0.3.tar.gz.

File metadata

  • Download URL: spark_loader-0.0.3.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic

File hashes

Hashes for spark_loader-0.0.3.tar.gz
Algorithm Hash digest
SHA256 5c2d9a538e49e4970e0357e26a56e8c5d4f8b2147c7134bdd7b2313b38720312
MD5 e05400e404b51cb446e093da915592e9
BLAKE2b-256 2de4a0ce77d88410f8992917685d907321704e4d4d387641c22d14b8fccaea62

File details

Details for the file spark_loader-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: spark_loader-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 3.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic

File hashes

Hashes for spark_loader-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 d45b8357194fcc68c62e0b0f6c1f710846c4f0ed24123f9c8e7ce11b7683db35
MD5 1992f5c19d7921868840edd47b1c6eac
BLAKE2b-256 c4d07a9bd54547d4a0698e34a47e940687156c3a0d6cbe33fe8a88d76011409f
