NLP Text processing library built on top of Apache Spark

Project description


John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment.


Spark-NLP is built on top of Apache Spark 2.4.0 and works with any user-provided Spark 2.x.x. Basic knowledge of the framework and a working Spark environment are advised before using Spark-NLP.
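The version requirement above can be checked before building a pipeline. The following is a minimal sketch, assuming a hypothetical helper named `is_supported_spark_version` (it is not part of Spark-NLP) that receives a version string such as the one PySpark exposes as `pyspark.__version__`:

```python
def is_supported_spark_version(version):
    """Return True if `version` looks like a Spark 2.x.x release string.

    Hypothetical helper for illustration only; not part of Spark-NLP.
    """
    major = version.strip().split(".")[0]
    return major == "2"


if __name__ == "__main__":
    try:
        import pyspark  # present only when a Spark environment is set up
    except ImportError:
        print("pyspark is not installed")
    else:
        # Report the installed Spark version and whether it is in the 2.x.x line.
        print(pyspark.__version__, is_supported_spark_version(pyspark.__version__))
```

The check only inspects the major version, which matches the "any Spark 2.x.x" wording; a stricter policy would need its own rule.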

Spark-NLP for Python

Requires python3-devel and the wheel Python module

Build the Python package with python3 setup.py sdist bdist_wheel

Install with python3 -m pip install --force-reinstall --user dist/spark_nlp-1.8.3-py2.py3-none-any.whl
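The wheel name in the install command encodes which interpreters the package targets. As a rough illustration, the sketch below splits a wheel filename into its standard components (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The function `parse_wheel_filename` and the `WheelTags` tuple are hypothetical names introduced here, not part of Spark-NLP or pip:

```python
import os
from collections import namedtuple

# Hypothetical container for the parts of a wheel filename.
WheelTags = namedtuple(
    "WheelTags", "distribution version python_tags abi_tag platform_tag"
)


def parse_wheel_filename(path):
    """Split a wheel filename into its name, version, and compatibility tags."""
    filename = os.path.basename(path)
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    elif len(parts) != 5:
        raise ValueError("unexpected wheel filename: %s" % filename)
    distribution, version, python_tag, abi_tag, platform_tag = parts
    # A compound tag such as "py2.py3" lists several supported interpreters.
    return WheelTags(
        distribution, version, python_tag.split("."), abi_tag, platform_tag
    )


tags = parse_wheel_filename("dist/spark_nlp-1.8.3-py2.py3-none-any.whl")
print(tags)
```

For the file above, the `py2.py3`, `none`, and `any` tags mark a pure-Python wheel that is not tied to a specific interpreter ABI or platform.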

Download files

Files for spark-nlp, version 1.8.3

Filename                               Size      File type  Python version
spark_nlp-1.8.3-py2.py3-none-any.whl   101.7 MB  Wheel      py2.py3
spark-nlp-1.8.3.tar.gz                 101.6 MB  Source     None
