Loads Spark
Load a Spark session before running the example; the code expects it to be available as sess.
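How the session gets created is not shown in this description. As a rough sketch only, and assuming Spark NLP's own `sparknlp.start()` helper is used in place of this package's loader (whose API is not documented here), a session could be obtained as follows. The function name `start_session` is hypothetical.

```python
def start_session():
    """Return a SparkSession configured for Spark NLP.

    Assumption: this uses sparknlp.start(), not this package's own loader.
    """
    # Imported lazily so that defining the helper does not require Spark.
    import sparknlp

    return sparknlp.start()


# Usage (requires pyspark and spark-nlp to be installed):
# sess = start_session()
```

Any other way of building a `SparkSession` with the Spark NLP jars on the classpath should work equally well, as long as the result is bound to the name `sess`.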
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    NerDLModel, NerDLApproach,
    GraphExtraction, UniversalSentenceEncoder,
    Tokenizer, WordEmbeddingsModel
)

# load spark session before this; it must be bound to the name `sess`

# Sentence-level embeddings (defined here but not added to the pipeline below)
use = UniversalSentenceEncoder \
    .pretrained() \
    .setInputCols(["document"]) \
    .setOutputCol("use_embeddings")

# Wrap the raw text column ("value") into Spark NLP documents
document_assembler = DocumentAssembler() \
    .setInputCol("value") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

word_embeddings = WordEmbeddingsModel \
    .pretrained() \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

ner_tagger = NerDLModel \
    .pretrained() \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")

# Find paths between the token "lad" and PER / LOC entities
graph_extraction = GraphExtraction() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("graph") \
    .setRelationshipTypes(["lad-PER", "lad-LOC"]) \
    .setMergeEntities(True)

graph_pipeline = Pipeline() \
    .setStages([
        document_assembler, tokenizer,
        word_embeddings, ner_tagger,
        graph_extraction
    ])

# Each line of the file becomes one row in a string column named "value"
df = sess.read.text('./data/train.dat')

# Fit the pipeline and annotate the same DataFrame; results land in "graph"
graph_pipeline.fit(df).transform(df)
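The example reads `./data/train.dat` with `sess.read.text`, which loads each line of the file as one row in a string column named `value`; that is why the `DocumentAssembler` takes `value` as its input column. The file's contents are not described here, so the following is only a minimal sketch of preparing such a file with one document per line. The helper name `write_training_file` and the sample sentences are hypothetical.

```python
from pathlib import Path

# Hypothetical sample documents; replace with your own corpus.
SAMPLE_LINES = [
    "The lad from Dublin met Peter Parker in New York.",
    "A lad named John Smith moved to Canada last year.",
]


def write_training_file(path, lines):
    """Write one document per line, creating parent directories as needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# Usage:
# write_training_file("./data/train.dat", SAMPLE_LINES)
```

Keeping one document per line means no extra parsing step is needed between reading the file and running the pipeline.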