
Industrial-strength Natural Language Processing (NLP) in Python

Project description

spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products.

spaCy comes with pretrained pipelines and vectors, and currently supports tokenization for 60+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing, named entity recognition, text classification and more, multi-task learning with pretrained transformers like BERT, as well as a production-ready training system and easy model packaging, deployment and workflow management. spaCy is commercial open-source software, released under the MIT license.

💫 Version 3.0 (nightly) out now! Check out the release notes here.


📖 Documentation

  • spaCy 101: New to spaCy? Here's everything you need to know!
  • Usage Guides: How to use spaCy and its features.
  • New in v3.0: New features, backwards incompatibilities and migration guide.
  • Project Templates: End-to-end workflows you can clone, modify and run.
  • API Reference: The detailed reference for spaCy's API.
  • Models: Download statistical language models for spaCy.
  • Universe: Libraries, extensions, demos, books and courses.
  • Changelog: Changes and version history.
  • Contribute: How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal, @ines, @svlandeg and @adrianeboyd. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

  • 🚨 Bug Reports: GitHub Issue Tracker
  • 🎁 Feature Requests & Ideas: GitHub Discussions
  • 👩‍💻 Usage Questions: GitHub Discussions · Stack Overflow
  • 🗯 General Discussion: GitHub Discussions

Features

  • Support for 60+ languages
  • Trained pipelines
  • Multi-task learning with pretrained transformers like BERT
  • Pretrained word vectors
  • State-of-the-art speed
  • Production-ready training system
  • Linguistically-motivated tokenization
  • Components for named entity recognition, part-of-speech tagging, dependency parsing, sentence segmentation, text classification, lemmatization, morphological analysis, entity linking and more
  • Easily extensible with custom components and attributes
  • Support for custom models in PyTorch, TensorFlow and other frameworks
  • Built-in visualizers for syntax and NER
  • Easy model packaging, deployment and workflow management
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.

Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 3.6+ (64-bit only)
  • Package managers: pip · conda (via conda-forge)
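
The Python requirement above can be checked from inside the interpreter. The following is a minimal sketch using only the standard library; the helper name `meets_requirements` is hypothetical and not part of spaCy.

```python
import struct
import sys


def meets_requirements(version_info=None, pointer_bits=None):
    """Return True for a 64-bit interpreter running Python 3.6 or newer."""
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit build, 32 on a 32-bit one.
        pointer_bits = struct.calcsize("P") * 8
    return tuple(version_info[:2]) >= (3, 6) and pointer_bits == 64


print("Interpreter OK for spaCy:", meets_requirements())
```

The arguments exist only so the check can be exercised with values other than those of the running interpreter.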

pip

Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13). Before you install spaCy and its dependencies, make sure that your pip, setuptools and wheel are up to date.

pip install -U pip setuptools wheel
pip install spacy

To install additional data tables for lemmatization and normalization in spaCy v2.2+, you can run pip install spacy[lookups] or install spacy-lookups-data separately. The lookups package is needed to create blank models with lemmatization data (v2.2+) and normalization data (v2.3+), and to lemmatize in languages that don't yet come with pretrained models and aren't powered by third-party libraries.

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install -U pip setuptools wheel
pip install spacy
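
If you would rather create the environment from a script, the standard library's venv module offers the same functionality as python -m venv. This is a rough sketch, not part of spaCy: the helper name `create_env` is hypothetical, and the environment is created in a temporary directory purely for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment at env_dir and return its config file path."""
    # with_pip=False keeps the sketch fast and offline; the shell commands
    # above get pip because `python -m venv` bootstraps it by default.
    builder = venv.EnvBuilder(with_pip=False)
    builder.create(str(env_dir))
    return Path(env_dir) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / ".env")
    print(cfg.name, "exists:", cfg.exists())
```

For a real project you would pass a persistent path such as .env and set with_pip=True so that pip is available inside the environment.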

conda

Thanks to our great community, we've finally re-added conda support. You can now install spaCy via conda-forge:

conda install -c conda-forge spacy

For the feedstock including the build recipe and configuration, check out this repository. Improvements and pull requests to the recipe and setup are always appreciated.

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check whether your installed models are compatible and, if not, print details on how to update them:

pip install -U spacy
python -m spacy validate

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.
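
To see which versions are installed before and after an update, you can query the package metadata directly. The sketch below is not a replacement for the validate command: it only reports version strings, it needs Python 3.8+ for importlib.metadata, and the helpers `installed_version` and `same_minor` are hypothetical names. Comparing only the major.minor part is an assumption made for illustration, not spaCy's documented compatibility rule.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def same_minor(version_a, version_b):
    """Compare only the major.minor parts of two version strings."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


for name in ("spacy", "en_core_web_sm"):
    print(name, installed_version(name))
```

A value of None means the distribution is not installed under that name, which is also what you will see for "spacy" if only a nightly build is present.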

📖 For details on upgrading from spaCy 2.x to spaCy 3.x, see the migration guide.

Download models

Trained pipelines for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

  • Available Pipelines: Detailed pipeline descriptions, accuracy figures and benchmarks.
  • Models Documentation: Detailed usage instructions.

# Download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
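
The release URL in the last command follows a regular pattern of model name and version. As a small sketch, assuming other releases follow the same layout as this one example, a helper (the name `model_archive_url` is hypothetical) can assemble the URL to pass to pip:

```python
def model_archive_url(name, version):
    """Build a release URL following the pattern of the example above."""
    base = "https://github.com/explosion/spacy-models/releases/download"
    tag = f"{name}-{version}"
    return f"{base}/{tag}/{tag}.tar.gz"


print(model_archive_url("en_core_web_sm", "2.2.0"))
```

Whether a given name and version actually exists as a release is not checked here; the download command remains the simplest way to get a compatible version.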

Loading and using models

To load a model, use spacy.load() with the model name or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp("This is a sentence.")
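
Once a doc has been created, its tokens can be inspected one by one. The sketch below collects each token's text, part-of-speech tag and dependency label via the token.text, token.pos_ and token.dep_ attributes. The helper names `token_table` and `main` are hypothetical, and the example assumes that spaCy and the en_core_web_sm package are installed.

```python
def token_table(doc):
    """Return (text, part-of-speech, dependency label) for each token."""
    return [(token.text, token.pos_, token.dep_) for token in doc]


def main():
    # Requires spaCy and the en_core_web_sm package to be installed.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("This is a sentence.")
    for text, pos, dep in token_table(doc):
        print(f"{text:<10} {pos:<6} {dep}")


# Call main() once spaCy and en_core_web_sm are installed.
```

Keeping the import inside main() means the helper can be imported and reused even on a machine where the model has not been downloaded yet.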

📖 For more info and examples, check out the models documentation.

Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual route if you want to make changes to the code base. You'll need a development environment consisting of a Python distribution including header files, a compiler, pip, virtualenv and git. The compiler is the trickiest part, and how to set it up depends on your system. See the notes on Ubuntu, macOS / OS X and Windows for details.

git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate

# make sure you are using the latest pip
python -m pip install -U pip setuptools wheel

pip install .

To install with extras:

# quote the argument so shells such as zsh don't expand the brackets
pip install '.[lookups,cuda102]'

To install all dependencies required for development:

pip install -r requirements.txt

Compared to regular install via pip, requirements.txt additionally installs developer dependencies such as Cython. For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.

Ubuntu

Install system-level dependencies via apt-get:

sudo apt-get install build-essential python3-dev git

macOS / OS X

Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.

Windows

Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter.

Run tests

spaCy comes with an extensive test suite. To run the tests, you'll usually want to clone the repository and build spaCy from source. This will also install the required development dependencies and test utilities defined in requirements.txt.

Alternatively, you can run pytest on the tests from within the installed spacy package. Don't forget to also install the test utilities via spaCy's requirements.txt:

pip install -r requirements.txt
python -m pytest --pyargs spacy

See the documentation for more details and examples.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • spacy-nightly-3.0.0rc5.tar.gz (7.0 MB): Source

Built Distributions

  • spacy_nightly-3.0.0rc5-cp39-cp39-win_amd64.whl (11.4 MB): CPython 3.9, Windows x86-64
  • spacy_nightly-3.0.0rc5-cp39-cp39-macosx_10_9_x86_64.whl (12.1 MB): CPython 3.9, macOS 10.9+ x86-64
  • spacy_nightly-3.0.0rc5-cp38-cp38-win_amd64.whl (11.7 MB): CPython 3.8, Windows x86-64
  • spacy_nightly-3.0.0rc5-cp38-cp38-macosx_10_9_x86_64.whl (12.3 MB): CPython 3.8, macOS 10.9+ x86-64
  • spacy_nightly-3.0.0rc5-cp37-cp37m-win_amd64.whl (11.6 MB): CPython 3.7m, Windows x86-64
  • spacy_nightly-3.0.0rc5-cp37-cp37m-macosx_10_9_x86_64.whl (12.2 MB): CPython 3.7m, macOS 10.9+ x86-64
  • spacy_nightly-3.0.0rc5-cp36-cp36m-win_amd64.whl (11.6 MB): CPython 3.6m, Windows x86-64
  • spacy_nightly-3.0.0rc5-cp36-cp36m-macosx_10_9_x86_64.whl (12.4 MB): CPython 3.6m, macOS 10.9+ x86-64

