Loads Spark.
Project description
Load a Spark session before running the example; the code refers to it as `sess`.
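The description does not show how the session is created. As a minimal sketch, assuming Spark NLP's own `sparknlp.start()` helper is used (this is not necessarily how this package loads Spark), a session could be obtained as follows. The function name `load_session` is hypothetical.

```python
def load_session():
    """Return a SparkSession with Spark NLP on the classpath.

    Assumption: the `spark-nlp` and `pyspark` packages are installed.
    The import is kept inside the function so that merely defining it
    does not require Spark to be present.
    """
    import sparknlp

    # sparknlp.start() builds (or reuses) a SparkSession configured for Spark NLP.
    return sparknlp.start()


# Usage (starts a local Spark JVM, so it is left commented out here):
# sess = load_session()
```

Binding the returned session to `sess` matches the name the example expects.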
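# Pipeline is a pyspark.ml class; the Spark NLP stages come from sparknlp.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    GraphExtraction, NerDLModel,
    Tokenizer, UniversalSentenceEncoder,
    WordEmbeddingsModel
)

# `sess` must be an active SparkSession with Spark NLP loaded;
# create it before this point.

# Sentence-level embeddings. Defined here but not one of the pipeline stages below.
use = UniversalSentenceEncoder \
    .pretrained() \
    .setInputCols(["document"]) \
    .setOutputCol("use_embeddings")

# Wrap the raw text column ("value", as produced by read.text) into documents.
document_assembler = DocumentAssembler() \
    .setInputCol("value") \
    .setOutputCol("document")

# Split each document into tokens.
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Default pretrained word embeddings, required by the NER model.
word_embeddings = WordEmbeddingsModel \
    .pretrained() \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

# Default pretrained named-entity tagger.
ner_tagger = NerDLModel \
    .pretrained() \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")

# Extract graph paths between tokens and the requested entity types.
graph_extraction = GraphExtraction() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("graph") \
    .setRelationshipTypes(["lad-PER", "lad-LOC"]) \
    .setMergeEntities(True)

graph_pipeline = Pipeline() \
    .setStages([
        document_assembler, tokenizer,
        word_embeddings, ner_tagger,
        graph_extraction
    ])

# Read one document per line, then fit and apply the pipeline.
df = sess.read.text('./data/train.dat')
result = graph_pipeline.fit(df).transform(df)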
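The example ends at `transform`, which returns a DataFrame without displaying anything. A small sketch of how the `graph` output column might be inspected is given here; the helper name `show_graph` and its `n` parameter are hypothetical, and it assumes only the standard PySpark `DataFrame.select` and `DataFrame.show` methods.

```python
def show_graph(result, n=5):
    """Print the first `n` rows of the `graph` column of a transformed DataFrame.

    `result` is assumed to be the DataFrame returned by
    graph_pipeline.fit(df).transform(df).
    """
    # truncate=False keeps the full annotation text visible instead of cutting it off.
    result.select("graph").show(n, truncate=False)


# Usage (requires the pipeline and DataFrame from the example):
# show_graph(graph_pipeline.fit(df).transform(df))
```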
Download files
Source Distribution
spark_loader-0.0.2.tar.gz (4.4 kB)
Built Distribution
spark_loader-0.0.2-py3-none-any.whl (3.8 kB)
File details
Details for the file spark_loader-0.0.2.tar.gz.
File metadata
- Download URL: spark_loader-0.0.2.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | c64b709a6b80730a87d385e4870fef988cd2a30dd4dc48d013be68617b7b17db
MD5 | 017ea647e5c3b6bcea2fefa09d2e5a4d
BLAKE2b-256 | 0de4b8468cc7e4c2a69ae7d574bac338d09d47630457f935f0999fd49b10d178
File details
Details for the file spark_loader-0.0.2-py3-none-any.whl.
File metadata
- Download URL: spark_loader-0.0.2-py3-none-any.whl
- Upload date:
- Size: 3.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 390b66ec9bb9788512478f88142524a46eebf1acda6e16255c62b5f8c07aad26
MD5 | 880068e24788406454bbc58c5533220c
BLAKE2b-256 | d9ea774ae2a5485dc6ff1bb798088c3b98b8c11e5fcda6a24764c4e3cd605923
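
The SHA256 digests listed for the two files can be checked against a downloaded copy. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify` are hypothetical, and the file is assumed to have been downloaded under its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings for version 0.0.2.
EXPECTED_SHA256 = {
    "spark_loader-0.0.2.tar.gz":
        "c64b709a6b80730a87d385e4870fef988cd2a30dd4dc48d013be68617b7b17db",
    "spark_loader-0.0.2-py3-none-any.whl":
        "390b66ec9bb9788512478f88142524a46eebf1acda6e16255c62b5f8c07aad26",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the listed one for its file name."""
    expected = EXPECTED_SHA256[Path(path).name]  # KeyError for unknown file names
    return sha256_of(path) == expected


# Usage (assumes the file is in the current directory):
# verify("spark_loader-0.0.2.tar.gz")
```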