Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pretrained statistical models and word vectors, and currently supports tokenization for 50+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing and named entity recognition, and easy deep learning integration. It's commercial open-source software, released under the MIT license.

💫 Version 2.2 out now! Check out the release notes here.

📖 Documentation

  • spaCy 101: New to spaCy? Here's everything you need to know!
  • Usage Guides: How to use spaCy and its features.
  • New in v2.2: New features, backwards incompatibilities and migration guide.
  • API Reference: The detailed reference for spaCy's API.
  • Models: Download statistical language models for spaCy.
  • Universe: Libraries, extensions, demos, books and courses.
  • Changelog: Changes and version history.
  • Contribute: How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal and @ines, along with core contributors @svlandeg and @adrianeboyd. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

  • 🚨 Bug Reports: GitHub Issue Tracker
  • 🎁 Feature Requests: GitHub Issue Tracker
  • 👩‍💻 Usage Questions: Stack Overflow · Gitter Chat · Reddit User Group
  • 🗯 General Discussion: Gitter Chat · Reddit User Group

Features

  • Non-destructive tokenization
  • Named entity recognition
  • Support for 50+ languages
  • Pretrained statistical models and word vectors
  • State-of-the-art speed
  • Easy deep learning integration
  • Part-of-speech tagging
  • Labelled dependency parsing
  • Syntax-driven sentence segmentation
  • Built-in visualizers for syntax and NER
  • Convenient string-to-hash mapping
  • Export to numpy data arrays
  • Efficient binary serialization
  • Easy model packaging and deployment
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.

Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 2.7, 3.5+ (only 64 bit)
  • Package managers: pip · conda (via conda-forge)
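The interpreter requirements listed above can be checked before installing. The following is a minimal sketch using only the standard library; the function name `meets_requirements` is hypothetical, and the version bounds are copied from the list above rather than read from spaCy itself.

```python
import sys


def meets_requirements(version_info=None, maxsize=None):
    """Return True for a 64-bit interpreter running Python 2.7 or 3.5+."""
    # Fall back to the running interpreter when no values are passed in.
    version = tuple((version_info or sys.version_info)[:2])
    maxsize = sys.maxsize if maxsize is None else maxsize
    # sys.maxsize exceeds 2**32 only on 64-bit builds.
    is_64bit = maxsize > 2 ** 32
    supported = version == (2, 7) or version >= (3, 5)
    return is_64bit and supported


if __name__ == "__main__":
    print("Supported interpreter:", meets_requirements())
```

The optional arguments exist only so the check can be exercised with other version tuples; called without arguments it inspects the running interpreter.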

pip

Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13).

pip install spacy

To install additional data tables for lemmatization in spaCy v2.2+ you can run pip install spacy[lookups] or install spacy-lookups-data separately. The lookups package is needed to create blank models with lemmatization data, and to lemmatize in languages that don't yet come with pretrained models and aren't powered by third-party libraries.

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install spacy
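The same setup can be scripted from Python 3 with the standard-library `venv` module. This is a sketch under stated assumptions, not part of spaCy: the helper names `create_env` and `install_spacy` are hypothetical, and the directory name `.env` simply follows the commands above.

```python
import os
import subprocess
import venv


def create_env(env_dir=".env", with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter lives in Scripts\ on Windows and in bin/ elsewhere.
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


def install_spacy(env_python):
    """Run pip inside the environment; returns pip's exit code."""
    return subprocess.call([env_python, "-m", "pip", "install", "spacy"])
```

Calling `install_spacy(create_env())` mirrors the three commands above without activating the environment, because the environment's own interpreter is invoked directly.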

conda

Thanks to our great community, we've finally re-added conda support. You can now install spaCy via conda-forge:

conda install -c conda-forge spacy

For the feedstock including the build recipe and configuration, check out this repository. Improvements and pull requests to the recipe and setup are always appreciated.

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check if your installed models are compatible and if not, print details on how to update them:

pip install -U spacy
python -m spacy validate
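If the check should run as part of a script, for example after an automated upgrade, the command can be wrapped in Python 3. This is a minimal sketch: `validate_command` and `run_validate` are hypothetical helper names, and the only spaCy interface used is the `validate` command shown above.

```python
import importlib.util
import subprocess
import sys


def validate_command(python=sys.executable):
    """Build the argument list for `python -m spacy validate`."""
    return [python, "-m", "spacy", "validate"]


def run_validate():
    """Run the validate command, or return None if spaCy is not installed."""
    if importlib.util.find_spec("spacy") is None:
        return None
    # The return value is the exit code of the spaCy command.
    return subprocess.call(validate_command())
```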

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.

📖 For details on upgrading from spaCy 1.x to spaCy 2.x, see the migration guide.

Download models

As of v1.7.0, models for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

  • Available Models: Detailed model descriptions, accuracy figures and benchmarks.
  • Models Documentation: Detailed usage instructions.

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
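The archive URL above follows a visible pattern: the release tag and the file name are both the model name and version joined by a hyphen. The sketch below rebuilds that URL; the function name `model_archive_url` is hypothetical, and it is an assumption that other models and versions follow the same layout as this one example.

```python
RELEASES_URL = "https://github.com/explosion/spacy-models/releases/download"


def model_archive_url(name, version):
    """Build a release archive URL from a model name and version."""
    # Tag and file name share the "<name>-<version>" form seen above.
    package = "{}-{}".format(name, version)
    return "{}/{}/{}.tar.gz".format(RELEASES_URL, package, package)


if __name__ == "__main__":
    print(model_archive_url("en_core_web_sm", "2.2.0"))
```

The resulting string can be passed to pip install in the same way as the URL above.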

Loading and using models

To load a model, use spacy.load() with the model name, a shortcut link or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp("This is a sentence.")
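Once a model is loaded, the resulting Doc can be iterated token by token. The sketch below, written for Python 3, wraps spacy.load() so that a missing installation is reported instead of raising; the helper names `load_model` and `describe` are hypothetical, while `token.text`, `token.pos_` and `token.dep_` are spaCy token attributes.

```python
def load_model(name="en_core_web_sm"):
    """Return a loaded pipeline, or None if spaCy or the model is missing."""
    try:
        import spacy

        return spacy.load(name)
    except (ImportError, OSError):
        # ImportError: spaCy itself is absent. OSError: model not found.
        return None


def describe(doc):
    """Return (text, part-of-speech tag, dependency label) for each token."""
    return [(token.text, token.pos_, token.dep_) for token in doc]


if __name__ == "__main__":
    nlp = load_model()
    if nlp is None:
        print("Model not available; run: python -m spacy download en_core_web_sm")
    else:
        for row in describe(nlp("This is a sentence.")):
            print(row)
```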

📖 For more info and examples, check out the models documentation.

Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual approach if you want to make changes to the code base. You'll need a development environment consisting of a Python distribution including header files, a compiler, pip, virtualenv and git. The compiler is the trickiest part, and how to set it up depends on your system. See the notes on Ubuntu, OS X and Windows for details.

# make sure you are using the latest pip
python -m pip install -U pip
git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace

Compared to regular install via pip, requirements.txt additionally installs developer dependencies such as Cython. For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.
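Before building, it can help to confirm that the prerequisites named above are present. The following Python 3 sketch uses only the standard library; the function name `check_build_prerequisites` is hypothetical, and the list of compiler executables is an assumption that will not cover every toolchain.

```python
import importlib.util
import os
import shutil
import sysconfig


def check_build_prerequisites():
    """Map each prerequisite to True if it was found on this system."""
    # Directory where this interpreter expects its C header files.
    include_dir = sysconfig.get_paths()["include"]
    # Assumed compiler names; adjust for your platform.
    compilers = ("cc", "gcc", "clang", "cl")
    return {
        "git": shutil.which("git") is not None,
        "pip": importlib.util.find_spec("pip") is not None,
        "python_headers": os.path.isfile(os.path.join(include_dir, "Python.h")),
        "compiler": any(shutil.which(name) for name in compilers),
    }


if __name__ == "__main__":
    for item, found in sorted(check_build_prerequisites().items()):
        print("{}: {}".format(item, "found" if found else "missing"))
```

A missing entry only means the tool was not found on the current PATH or in the default include directory, so treat the result as a hint rather than a verdict.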

Ubuntu

Install system-level dependencies via apt-get:

sudo apt-get install build-essential python-dev git

macOS / OS X

Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.

Windows

Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter. For official distributions these are VS 2008 (Python 2.7), VS 2010 (Python 3.4) and VS 2015 (Python 3.5).
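The pairing above can be expressed as a small lookup. This is a sketch only: the name `required_visual_studio` is hypothetical, and the table contains just the three versions listed above, so any other Python version returns None rather than a guess.

```python
import sys

# Compiler versions as listed above for official Python distributions.
VISUAL_STUDIO_BY_PYTHON = {
    (2, 7): "VS 2008",
    (3, 4): "VS 2010",
    (3, 5): "VS 2015",
}


def required_visual_studio(version_info=sys.version_info):
    """Return the listed Visual Studio version, or None if not listed."""
    return VISUAL_STUDIO_BY_PYTHON.get(tuple(version_info[:2]))


if __name__ == "__main__":
    print(required_visual_studio())
```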

Run tests

spaCy comes with an extensive test suite. To run the tests, you'll usually want to clone the repository and build spaCy from source. This will also install the required development dependencies and test utilities defined in requirements.txt.

Alternatively, you can find out where spaCy is installed and run pytest on that directory. Don't forget to also install the test utilities via spaCy's requirements.txt:

python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest <spacy-directory>
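The first and last commands above can be combined so that the directory does not have to be copied by hand. The Python 3 sketch below locates an installed package without importing it and builds the matching pytest command; `package_dir` and `pytest_command` are hypothetical helper names.

```python
import importlib.util
import os
import sys


def package_dir(name="spacy"):
    """Return the directory a package is installed in, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # origin points at the package's __init__ file.
    return os.path.dirname(spec.origin)


def pytest_command(directory, python=sys.executable):
    """Build the argument list for `python -m pytest <directory>`."""
    return [python, "-m", "pytest", directory]


if __name__ == "__main__":
    directory = package_dir()
    if directory is None:
        print("spaCy is not installed in this environment")
    else:
        print(" ".join(pytest_command(directory)))
```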

See the documentation for more details and examples.

