loads spark
Project description
Load a Spark session before running the example; the code expects it in a variable named sess.
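The description does not show how the session itself is created. As a stand-in, here is a minimal sketch that uses Spark NLP's own `sparknlp.start()` helper. This is an assumption, not this package's API; `start_session` is a hypothetical name introduced only for illustration.

```python
def start_session():
    """Return a SparkSession configured for Spark NLP.

    Assumption: Spark NLP's sparknlp.start() is used here as a stand-in,
    because this package's own loading call is not shown in the description.
    """
    # Imported inside the function so the heavy import happens only
    # when a session is actually requested.
    import sparknlp

    return sparknlp.start()
```

Calling `sess = start_session()` would then provide the `sess` variable that the example reads its data with.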
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    NerDLModel, NerDLApproach,
    GraphExtraction, UniversalSentenceEncoder,
    Tokenizer, WordEmbeddingsModel
)

# load spark session before this (expected in the variable `sess`)

# Sentence-level embeddings; defined here but not added to the pipeline stages below.
use = UniversalSentenceEncoder \
    .pretrained() \
    .setInputCols("document") \
    .setOutputCol("use_embeddings")

# Wraps the raw "value" text column as Spark NLP documents.
document_assembler = DocumentAssembler() \
    .setInputCol("value") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Default pretrained word embeddings, consumed by the NER tagger.
word_embeddings = WordEmbeddingsModel \
    .pretrained() \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

ner_tagger = NerDLModel \
    .pretrained() \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")

# Extracts graph paths between the listed token-entity relationship types.
graph_extraction = GraphExtraction() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("graph") \
    .setRelationshipTypes(["lad-PER", "lad-LOC"]) \
    .setMergeEntities(True)

graph_pipeline = Pipeline() \
    .setStages([
        document_assembler, tokenizer,
        word_embeddings, ner_tagger,
        graph_extraction
    ])

# Each line of the text file becomes one row in the "value" column.
df = sess.read.text('./data/train.dat')
result = graph_pipeline.fit(df).transform(df)
Download files
Source Distribution
spark_loader-0.0.3.tar.gz (4.4 kB)
Built Distribution
spark_loader-0.0.3-py3-none-any.whl (3.8 kB)
File details
Details for the file spark_loader-0.0.3.tar.gz.
File metadata
- Download URL: spark_loader-0.0.3.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5c2d9a538e49e4970e0357e26a56e8c5d4f8b2147c7134bdd7b2313b38720312 |
| MD5 | e05400e404b51cb446e093da915592e9 |
| BLAKE2b-256 | 2de4a0ce77d88410f8992917685d907321704e4d4d387641c22d14b8fccaea62 |
File details
Details for the file spark_loader-0.0.3-py3-none-any.whl.
File metadata
- Download URL: spark_loader-0.0.3-py3-none-any.whl
- Upload date:
- Size: 3.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.9.13 Linux/6.6.6-76060606-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d45b8357194fcc68c62e0b0f6c1f710846c4f0ed24123f9c8e7ce11b7683db35 |
| MD5 | 1992f5c19d7921868840edd47b1c6eac |
| BLAKE2b-256 | c4d07a9bd54547d4a0698e34a47e940687156c3a0d6cbe33fe8a88d76011409f |
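
The digests listed for each file can be checked against a local download. The following is a minimal sketch using only Python's standard `hashlib`. The helper names `file_digests` and `matches` are hypothetical, and it assumes that the BLAKE2b-256 value corresponds to BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path

# Digests copied from the "File hashes" table for spark_loader-0.0.3.tar.gz.
EXPECTED_SDIST = {
    "sha256": "5c2d9a538e49e4970e0357e26a56e8c5d4f8b2147c7134bdd7b2313b38720312",
    "md5": "e05400e404b51cb446e093da915592e9",
    "blake2b_256": "2de4a0ce77d88410f8992917685d907321704e4d4d387641c22d14b8fccaea62",
}


def file_digests(path):
    """Compute hex digests of a file with the three listed algorithms."""
    data = Path(path).read_bytes()
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        # Assumption: "BLAKE2b-256" means BLAKE2b with a 32-byte digest.
        "blake2b_256": hashlib.blake2b(data, digest_size=32).hexdigest(),
    }


def matches(path, expected):
    """Return True only if every expected digest equals the computed one."""
    actual = file_digests(path)
    return all(actual[name] == digest for name, digest in expected.items())
```

With the source distribution saved in the working directory, `matches("spark_loader-0.0.3.tar.gz", EXPECTED_SDIST)` should return `True` for an intact download. The same check applies to the wheel once its three digests are substituted.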