loads spark
Project description
Load a Spark session before running the example below.
from sparknlp.base import DocumentAssembler, Pipeline
from sparknlp.annotator import (
    NerDLModel, NerDLApproach,
    GraphExtraction, UniversalSentenceEncoder,
    Tokenizer, WordEmbeddingsModel
)

# Load the Spark session before this point; `sess` below is assumed to be
# that active session.

# Turn the raw text column ("value") into document annotations.
document_assembler = DocumentAssembler() \
    .setInputCol("value") \
    .setOutputCol("document")

# Sentence-level embeddings. This stage is defined here but is not part of
# the pipeline stages listed below.
use = UniversalSentenceEncoder \
    .pretrained() \
    .setInputCols(["document"]) \
    .setOutputCol("use_embeddings")

# Split each document into tokens.
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Pretrained word embeddings, required by the NER model.
word_embeddings = WordEmbeddingsModel \
    .pretrained() \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

# Pretrained named-entity tagger.
ner_tagger = NerDLModel \
    .pretrained() \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")

# Build a graph from the tokens and the entities found by the tagger.
graph_extraction = GraphExtraction() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("graph") \
    .setRelationshipTypes(["lad-PER", "lad-LOC"]) \
    .setMergeEntities(True)

graph_pipeline = Pipeline() \
    .setStages([
        document_assembler, tokenizer,
        word_embeddings, ner_tagger,
        graph_extraction
    ])

# read.text() yields one row per line in a column named "value".
df = sess.read.text('./data/train.dat')
result = graph_pipeline.fit(df).transform(df)
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
spark_loader-0.0.3.tar.gz (4.4 kB)
Built Distribution
spark_loader-0.0.3-py3-none-any.whl (3.8 kB)
File details
Details for the file spark_loader-0.0.3.tar.gz.
File metadata
- Download URL: spark_loader-0.0.3.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5c2d9a538e49e4970e0357e26a56e8c5d4f8b2147c7134bdd7b2313b38720312
MD5 | e05400e404b51cb446e093da915592e9
BLAKE2b-256 | 2de4a0ce77d88410f8992917685d907321704e4d4d387641c22d14b8fccaea62
File details
Details for the file spark_loader-0.0.3-py3-none-any.whl.
File metadata
- Download URL: spark_loader-0.0.3-py3-none-any.whl
- Upload date:
- Size: 3.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | d45b8357194fcc68c62e0b0f6c1f710846c4f0ed24123f9c8e7ce11b7683db35
MD5 | 1992f5c19d7921868840edd47b1c6eac
BLAKE2b-256 | c4d07a9bd54547d4a0698e34a47e940687156c3a0d6cbe33fe8a88d76011409f
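
The SHA256 digests listed on this page can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `verify_download` are hypothetical and are not part of spark_loader, and the sketch assumes the downloaded file keeps its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables on this page.
EXPECTED_SHA256 = {
    "spark_loader-0.0.3.tar.gz":
        "5c2d9a538e49e4970e0357e26a56e8c5d4f8b2147c7134bdd7b2313b38720312",
    "spark_loader-0.0.3-py3-none-any.whl":
        "d45b8357194fcc68c62e0b0f6c1f710846c4f0ed24123f9c8e7ce11b7683db35",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's SHA256 matches the digest listed for its name.

    Hypothetical helper: raises KeyError when the file name is not one of
    the two distributions listed above.
    """
    path = Path(path)
    expected = EXPECTED_SHA256[path.name]
    return sha256_of(path) == expected
```

Calling `verify_download("spark_loader-0.0.3.tar.gz")` on a file in the current directory returns `True` only when its contents hash to the digest listed above. A `False` result suggests the download is corrupted or is not the file published here.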