Industrial-strength Natural Language Processing (NLP) in Python

Project description

spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products.

spaCy comes with pretrained pipelines and currently supports tokenization and training for 60+ languages. It features state-of-the-art speed and neural network models for tagging, parsing, named entity recognition, text classification and more, multi-task learning with pretrained transformers like BERT, as well as a production-ready training system and easy model packaging, deployment and workflow management. spaCy is commercial open-source software, released under the MIT license.

💫 Version 3.0 out now! Check out the release notes here.

📖 Documentation

Documentation
⭐️ spaCy 101 New to spaCy? Here's everything you need to know!
📚 Usage Guides How to use spaCy and its features.
🚀 New in v3.0 New features, backwards incompatibilities and migration guide.
🪐 Project Templates End-to-end workflows you can clone, modify and run.
🎛 API Reference The detailed reference for spaCy's API.
📦 Models Download trained pipelines for spaCy.
🌌 Universe Plugins, extensions, demos and books from the spaCy ecosystem.
👩‍🏫 Online Course Learn spaCy in this free and interactive online course.
📺 Videos Our YouTube channel with video tutorials, talks and more.
🛠 Changelog Changes and version history.
💝 Contribute How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal, @ines, @svlandeg, @adrianeboyd and @polm. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

Type Platforms
🚨 Bug Reports GitHub Issue Tracker
🎁 Feature Requests & Ideas GitHub Discussions
👩‍💻 Usage Questions GitHub Discussions · Stack Overflow
🗯 General Discussion GitHub Discussions

Features

  • Support for 60+ languages
  • Trained pipelines for different languages and tasks
  • Multi-task learning with pretrained transformers like BERT
  • Support for pretrained word vectors and embeddings
  • State-of-the-art speed
  • Production-ready training system
  • Linguistically-motivated tokenization
  • Components for named entity recognition, part-of-speech-tagging, dependency parsing, sentence segmentation, text classification, lemmatization, morphological analysis, entity linking and more
  • Easily extensible with custom components and attributes
  • Support for custom models in PyTorch, TensorFlow and other frameworks
  • Built-in visualizers for syntax and NER
  • Easy model packaging, deployment and workflow management
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.

⏳ Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 3.6+ (only 64 bit)
  • Package managers: pip · conda (via conda-forge)
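The interpreter requirements listed above can be checked from Python before installing. The following is a minimal sketch using only the standard library; the function name `meets_requirements` is hypothetical and not part of spaCy.

```python
import struct
import sys


def meets_requirements(version_info=sys.version_info, pointer_bits=None):
    """Return True for a 64-bit interpreter running Python 3.6 or newer."""
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit build, 32 on a 32-bit one.
        pointer_bits = struct.calcsize("P") * 8
    return tuple(version_info[:2]) >= (3, 6) and pointer_bits == 64


if __name__ == "__main__":
    print("OK" if meets_requirements() else "Unsupported interpreter")
```

The two optional parameters exist only so the check can be exercised with values other than those of the running interpreter.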

pip

Using pip, spaCy releases are available as source packages and binary wheels. Before you install spaCy and its dependencies, make sure that your pip, setuptools and wheel are up to date.

pip install -U pip setuptools wheel
pip install spacy

To install additional data tables for lemmatization and normalization you can run pip install spacy[lookups] or install spacy-lookups-data separately. The lookups package is needed to create blank models with lemmatization data, and to lemmatize in languages that don't yet come with pretrained models and aren't powered by third-party libraries.

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install -U pip setuptools wheel
pip install spacy

conda

You can also install spaCy from conda via the conda-forge channel. For the feedstock including the build recipe and configuration, check out this repository.

conda install -c conda-forge spacy

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check whether your installed models are compatible and, if they are not, to print details on how to update them:

pip install -U spacy
python -m spacy validate
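For a quick look at what is installed before running the validate command, the versions of spaCy and a pipeline package can be read from package metadata. This is only a rough sketch and not how `validate` itself works: the helper names are hypothetical, `importlib.metadata` needs Python 3.8 or newer, and the comparison assumes that a pipeline package shares its major and minor version with the spaCy release it targets (as en_core_web_sm-3.0.0 does for spaCy 3.0).

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Reduce a version string such as '3.1.2' to the tuple (3, 1)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def same_release_line(spacy_version, pipeline_version):
    """Assumption: compatible packages share the same major.minor version."""
    return major_minor(spacy_version) == major_minor(pipeline_version)


if __name__ == "__main__":
    for name in ("spacy", "en_core_web_sm"):
        print(name, installed_version(name))
```

The validate command remains the authoritative check; this sketch only reports what is present in the current environment.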

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.

📖 For details on upgrading from spaCy 2.x to spaCy 3.x, see the migration guide.

📦 Download model packages

Trained pipelines for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

Documentation
Available Pipelines Detailed pipeline descriptions, accuracy figures and benchmarks.
Models Documentation Detailed usage and installation instructions.
Training How to train your own pipelines on your data.

# Download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# pip install .tar.gz archive or .whl from path or URL
pip install /Users/you/en_core_web_sm-3.0.0.tar.gz
pip install /Users/you/en_core_web_sm-3.0.0-py3-none-any.whl
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz
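The release URL in the last command follows a visible pattern: the package name and version form the release tag and are repeated in the archive name. The sketch below builds such a URL; the pattern is inferred from that single example, and the function name `model_release_url` is hypothetical, so check the resulting URL against the actual release page before relying on it.

```python
def model_release_url(name, version):
    """Build a .tar.gz download URL following the pattern of the example above."""
    base = "https://github.com/explosion/spacy-models/releases/download"
    tag = f"{name}-{version}"  # for example "en_core_web_sm-3.0.0"
    return f"{base}/{tag}/{tag}.tar.gz"


if __name__ == "__main__":
    # The printed URL can be passed to "pip install".
    print(model_release_url("en_core_web_sm", "3.0.0"))
```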

Loading and using models

To load a model, use spacy.load() with the model name or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp("This is a sentence.")
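Because a model is an ordinary importable package, its presence can be checked before loading. The following is a minimal sketch with hypothetical helper names (`is_installed`, `load_pipeline`); it only handles package names, not paths to a model data directory.

```python
import importlib.util


def is_installed(package):
    """Return True if a top-level package can be imported in this environment."""
    return importlib.util.find_spec(package) is not None


def load_pipeline(name="en_core_web_sm"):
    """Load a pipeline if its package is installed, else raise a clear error."""
    if not is_installed("spacy"):
        raise RuntimeError("spaCy is not installed: pip install spacy")
    if not is_installed(name):
        raise RuntimeError(f"Missing pipeline: python -m spacy download {name}")
    import spacy  # imported lazily so the checks above run first

    return spacy.load(name)
```

With spaCy and the model installed, `nlp = load_pipeline()` behaves like the `spacy.load("en_core_web_sm")` call shown earlier; otherwise the error message names the command to run.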

📖 For more info and examples, check out the models documentation.

⚒ Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual approach if you want to make changes to the code base. You'll need a development environment consisting of a Python distribution including header files, a compiler, pip, virtualenv and git. The compiler is the trickiest part, and how to set it up depends on your system.

Platform
Ubuntu Install system-level dependencies via apt-get: sudo apt-get install build-essential python-dev git
Mac Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.
Windows Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter.

For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.

git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate

# make sure you are using the latest pip
python -m pip install -U pip setuptools wheel

pip install -r requirements.txt
pip install --no-build-isolation --editable .

To install with extras:

pip install --no-build-isolation --editable .[lookups,cuda102]

🚦 Run tests

spaCy comes with an extensive test suite. In order to run the tests, you'll usually want to clone the repository and build spaCy from source. This will also install the required development dependencies and test utilities defined in the requirements.txt.

Alternatively, you can run pytest on the tests from within the installed spacy package. Don't forget to also install the test utilities via spaCy's requirements.txt:

pip install -r requirements.txt
python -m pytest --pyargs spacy

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spacy-3.1.2.tar.gz (1.0 MB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • spacy-3.1.2-cp39-cp39-win_amd64.whl (11.6 MB): CPython 3.9, Windows x86-64
  • spacy-3.1.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.9 MB): CPython 3.9, manylinux: glibc 2.17+ x86-64
  • spacy-3.1.2-cp39-cp39-macosx_10_9_x86_64.whl (6.2 MB): CPython 3.9, macOS 10.9+ x86-64
  • spacy-3.1.2-cp38-cp38-win_amd64.whl (12.0 MB): CPython 3.8, Windows x86-64
  • spacy-3.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.8, manylinux: glibc 2.17+ x86-64
  • spacy-3.1.2-cp38-cp38-macosx_10_9_x86_64.whl (6.0 MB): CPython 3.8, macOS 10.9+ x86-64
  • spacy-3.1.2-cp37-cp37m-win_amd64.whl (11.8 MB): CPython 3.7m, Windows x86-64
  • spacy-3.1.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.8 MB): CPython 3.7m, manylinux: glibc 2.17+ x86-64
  • spacy-3.1.2-cp37-cp37m-macosx_10_9_x86_64.whl (6.0 MB): CPython 3.7m, macOS 10.9+ x86-64
  • spacy-3.1.2-cp36-cp36m-win_amd64.whl (11.8 MB): CPython 3.6m, Windows x86-64
  • spacy-3.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.9 MB): CPython 3.6m, manylinux: glibc 2.17+ x86-64
  • spacy-3.1.2-cp36-cp36m-macosx_10_9_x86_64.whl (6.0 MB): CPython 3.6m, macOS 10.9+ x86-64
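The wheel file names listed above follow the standard wheel naming scheme: distribution, version, Python tag, ABI tag and platform tag, separated by hyphens. A small sketch of splitting such a name into its fields follows; `parse_wheel_name` and `WheelTags` are hypothetical names, and the optional build tag that the scheme allows is deliberately not handled because none of the files here use one.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_name(filename):
    """Split a wheel file name without a build tag into its five fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected number of fields: {filename}")
    return WheelTags(*parts)


if __name__ == "__main__":
    tags = parse_wheel_name("spacy-3.1.2-cp39-cp39-win_amd64.whl")
    print(tags.python, tags.abi, tags.platform)
```

Reading the tags this way makes it easier to pick the file that matches your interpreter and operating system; pip normally makes this choice for you.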

File details

Details for the file spacy-3.1.2.tar.gz.

File metadata

  • Download URL: spacy-3.1.2.tar.gz
  • Upload date:
  • Size: 1.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2.tar.gz
Algorithm Hash digest
SHA256 4bc9faddc9ed4f042ad424af4ff5cb67d193ce4e22ad19d59ff1a14715f113b5
MD5 9d494694655c1c967520a995d9f8a103
BLAKE2b-256 69f6e15dcebee0e7b510624c68b895241b8c96656152bfd502e983ae5c0bf38d

See more details on using hashes here.
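The published digests can be compared against a downloaded file with Python's hashlib. The following is a minimal sketch; the helper names `file_digest` and `verify` are hypothetical, and the label "blake2b-256" is handled here as BLAKE2b with a 32-byte digest.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    if algorithm == "blake2b-256":
        hasher = hashlib.blake2b(digest_size=32)  # 32 bytes = 256 bits
    else:
        hasher = hashlib.new(algorithm)  # for example "sha256" or "md5"
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path, expected, algorithm="sha256"):
    """Return True if the file's digest equals the expected hex string."""
    return file_digest(path, algorithm) == expected.strip().lower()
```

For example, after downloading the source distribution, `verify("spacy-3.1.2.tar.gz", "4bc9faddc9ed4f042ad424af4ff5cb67d193ce4e22ad19d59ff1a14715f113b5")` should return True if the file is intact. Prefer SHA256 or BLAKE2b-256 over MD5 for this purpose.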

File details

Details for the file spacy-3.1.2-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 11.6 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 e33b1a843a042a16d7a8411728917758fc532175b4f36b579b907f71afc8064e
MD5 f4371523cdc1307902c3b8a14ef84988
BLAKE2b-256 13d63419fd05884df6b31a84f1178fff180b6078bc96a7d3ca2b38c98404cf2b

File details

Details for the file spacy-3.1.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 5.9 MB
  • Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8cf175a70ef726641dd53391c109fdb6d424c9ea870e879cde92788781fe6efd
MD5 24e13b04dfb74b4a1102dfd5b411e8a3
BLAKE2b-256 22752c4d8a5e487b9be08f77cf974a6fbabc44e86ab9c3cd9b74c3cde3cda4d7

File details

Details for the file spacy-3.1.2-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 6.2 MB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 a7ad178afb7938b1ae83a7ab18895d596d17cb140d4e9c8b65db0e19df139855
MD5 3c9aba64b642eeed6bb5d8c019ac40d1
BLAKE2b-256 db794742c2495ab783619d7b1a77bd7c566bb19171311e836d6c4cb325e265b3

File details

Details for the file spacy-3.1.2-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 12.0 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 e01d05497d08755c6e4e72125602e9bcc4285b8846fbc3ab7d14e821d41ed84c
MD5 becece2b46a7311403bffafb95beef05
BLAKE2b-256 f8ca702f8406cdf67aeb1a00c37b0c9370fd0cd35c9401e850eabf4fead81ba7

File details

Details for the file spacy-3.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 6.1 MB
  • Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8b9b22e4486573233e5aad19b94838d559547c31b8db17d550538a7d96545c62
MD5 d8bf4b787d52672319b0e6a89a552b79
BLAKE2b-256 4a2502f59fa1784164e87b7ef960bd8f09674db6c656738183e75d28b140d1d8

File details

Details for the file spacy-3.1.2-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 6.0 MB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 30d65483eda3c8e27ddf17a68f29a91cc1e572a60888d8759a0baa7145edd62a
MD5 fd70d2c65aa12c6cbb2404e10aeaae02
BLAKE2b-256 b402a6b5b1b12bf8339800b89a40d482dbe4e48dbd618a6f9c17780b5c425657

File details

Details for the file spacy-3.1.2-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 11.8 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 d1937bddcb41f14c35821d83b0249c28ef7e557fff0624999fcda65c404c91b9
MD5 a8533be9ef9bcaebff6c1a225c7fe29f
BLAKE2b-256 2bfd5a5c97fac8bf8e63c99cd34a368d3df3acec22e108e272c8d712cc7722a6

File details

Details for the file spacy-3.1.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for spacy-3.1.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6d88e0474d4cdb4a6a440ea145f522e642ddb1ba6eb119e45f8d6ba340ba5420
MD5 28963288c2631c1ef6112be777164309
BLAKE2b-256 a5fb90c8becc9c303eb14458b9d2f17c779d016e1adf9960d0c59ca8bca72ba6

File details

Details for the file spacy-3.1.2-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 6.0 MB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 85d4638ace39d7a39a47a0bbca93f6d163694be74bc319189a35e598332df3d0
MD5 9b33074fedc03c313db79c28912322c4
BLAKE2b-256 201b284010fe674c06ef11405e61dd71075195819c921f2038e8396171bded6e

File details

Details for the file spacy-3.1.2-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 11.8 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 090ea698d5f4ed468386a038190f836093ca006a2c8a244ed0cb9952d83b39db
MD5 97cb8ffd37a2a1886cdf95f4c3378af3
BLAKE2b-256 4bad5d6f1b80bb2d17c49a1520429a9718d892e21ca0f8c6db7182e163a0d799

File details

Details for the file spacy-3.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for spacy-3.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e164190bf7dc55d7bc5edfda3f787b391dff1d55db1676a013402f3441416da5
MD5 0dcae3e9d89c85d12ca0d3b0564c0eef
BLAKE2b-256 ee705a9e643e78de0b3f8598119ccac7e841ed84398f553c0b84500e3db2974f

File details

Details for the file spacy-3.1.2-cp36-cp36m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-3.1.2-cp36-cp36m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 6.0 MB
  • Tags: CPython 3.6m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.7.9

File hashes

Hashes for spacy-3.1.2-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 87927de0cc36d24525aa854c37020eb6bb28ffbae2d589638f408abed6b83f9a
MD5 ded04dbe96ea9acc214032305651bf53
BLAKE2b-256 f0545228f554ae2c7885491a57fb7853514f32ffb366624f21ceb71b4e99d41a
