loads spark

Project description

Create a Spark session and bind it to the name sess before running the pipeline below.

# Pipeline is the standard Spark ML class; the annotators come from Spark NLP.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    NerDLModel,
    GraphExtraction, UniversalSentenceEncoder,
    Tokenizer, WordEmbeddingsModel
)


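# A SparkSession bound to `sess` must exist before this point.

# Sentence encoder; defined here but not added to graph_pipeline below.
use = UniversalSentenceEncoder \
    .pretrained() \
    .setInputCols(["document"]) \
    .setOutputCol("use_embeddings")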

document_assembler = DocumentAssembler() \
    .setInputCol("value") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

word_embeddings = WordEmbeddingsModel \
    .pretrained() \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")


ner_tagger = NerDLModel \
    .pretrained() \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")

graph_extraction = GraphExtraction() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("graph") \
    .setRelationshipTypes(["lad-PER", "lad-LOC"]) \
    .setMergeEntities(True)

graph_pipeline = Pipeline() \
    .setStages([
        document_assembler, tokenizer,
        word_embeddings, ner_tagger,
        graph_extraction
    ])

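# read.text yields one row per line in a single string column named "value",
# which is the column the DocumentAssembler reads.
df = sess.read.text('./data/train.dat')
# Keep the transformed DataFrame; the extracted paths are in the "graph" column.
result = graph_pipeline.fit(df).transform(df)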
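
The example above uses a session object named sess without showing how it is created. One possible way to obtain it, sketched here under the assumption that the spark-nlp and pyspark packages are installed, is Spark NLP's own sparknlp.start() helper. The function name start_session is hypothetical and is not part of spark_loader, whose own loading API is not shown on this page.

```python
def start_session():
    """Return a SparkSession with the Spark NLP jars attached.

    Assumption: the `spark-nlp` and `pyspark` packages are installed.
    `start_session` is a hypothetical helper name used for illustration.
    """
    # Imported inside the function so that defining the helper does not
    # itself require Spark to be installed.
    import sparknlp

    # sparknlp.start() builds (or reuses) a SparkSession configured for Spark NLP.
    return sparknlp.start()


# Usage, run before the pipeline code:
# sess = start_session()
```

The first call to pretrained() on each annotator downloads its default model, so the session needs network access the first time the pipeline is built.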

Download files

Source Distribution

spark_loader-0.0.4.tar.gz (4.3 kB)

Uploaded Source

Built Distribution

spark_loader-0.0.4-py3-none-any.whl (3.8 kB)

Uploaded Python 3

File details

Details for the file spark_loader-0.0.4.tar.gz.

File metadata

  • Download URL: spark_loader-0.0.4.tar.gz
  • Upload date:
  • Size: 4.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic

File hashes

Hashes for spark_loader-0.0.4.tar.gz

Algorithm    Hash digest
SHA256       537a206213a421e9ca9145909110bc12c3c6068ba5b3ae2e7a11235549dcfc5e
MD5          7d707de4a034110fb7c5d63ed8598997
BLAKE2b-256  8a5aabf3602f26e3301353af81ac31456adc7e160cdbd0d8b958776b7e853720
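
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names sha256_of_file and matches_expected are illustrative, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed on this page for spark_loader-0.0.4.tar.gz.
EXPECTED_SHA256 = "537a206213a421e9ca9145909110bc12c3c6068ba5b3ae2e7a11235549dcfc5e"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()


# Usage (assumes the archive sits in the current directory):
# matches_expected("spark_loader-0.0.4.tar.gz")
```

Reading in chunks keeps memory use constant regardless of file size. The same function works for the wheel if its own SHA256 digest is passed as the expected value.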

File details

Details for the file spark_loader-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: spark_loader-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 3.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic

File hashes

Hashes for spark_loader-0.0.4-py3-none-any.whl

Algorithm    Hash digest
SHA256       50b746e8359aeaf259c41ee106867a7555898e1f1f626de7a3c054be707162da
MD5          ca9ef4d52b98694dd9a31dacec9f09c4
BLAKE2b-256  2fed2bf7df7e580342ca11a85ed1fa42f53c09486680c2df4fcde95f15ed991d
