Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pretrained statistical models and word vectors, and currently supports tokenization for 50+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.

💫 Version 2.1 out now! Check out the release notes here.

📖 Documentation

  • spaCy 101: New to spaCy? Here's everything you need to know!
  • Usage Guides: How to use spaCy and its features.
  • New in v2.1: New features, backwards incompatibilities and migration guide.
  • API Reference: The detailed reference for spaCy's API.
  • Models: Download statistical language models for spaCy.
  • Universe: Libraries, extensions, demos, books and courses.
  • Changelog: Changes and version history.
  • Contribute: How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal and @ines, along with core contributors @svlandeg and @adrianeboyd. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

  • 🚨 Bug Reports: GitHub Issue Tracker
  • 🎁 Feature Requests: GitHub Issue Tracker
  • 👩‍💻 Usage Questions: Stack Overflow · Gitter Chat · Reddit User Group
  • 🗯 General Discussion: Gitter Chat · Reddit User Group

Features

  • Non-destructive tokenization
  • Named entity recognition
  • Support for 50+ languages
  • Pretrained statistical models and word vectors
  • State-of-the-art speed
  • Easy deep learning integration
  • Part-of-speech tagging
  • Labelled dependency parsing
  • Syntax-driven sentence segmentation
  • Built-in visualizers for syntax and NER
  • Convenient string-to-hash mapping
  • Export to numpy data arrays
  • Efficient binary serialization
  • Easy model packaging and deployment
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.

Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 2.7, 3.5+ (only 64 bit)
  • Package managers: pip · conda (via conda-forge)
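
The interpreter requirements in the list above can be checked from Python before installing. The following is a minimal sketch using only the standard library; `meets_requirements` is a hypothetical helper name, not part of spaCy:

```python
import struct
import sys


def meets_requirements(version_info=None, pointer_bits=None):
    """Return True if the interpreter is 64-bit Python 2.7 or 3.5+."""
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit interpreter.
        pointer_bits = struct.calcsize("P") * 8
    major, minor = version_info[0], version_info[1]
    supported_version = (major, minor) == (2, 7) or (major == 3 and minor >= 5)
    return supported_version and pointer_bits == 64


if __name__ == "__main__":
    print("Interpreter meets the listed requirements:", meets_requirements())
```

Passing explicit arguments, for example `meets_requirements((3, 4, 0), 64)`, makes it possible to check a target environment other than the one running the script.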

pip

Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13).

pip install spacy

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install spacy

conda

Thanks to our great community, we've finally re-added conda support. You can now install spaCy via conda-forge:

conda config --add channels conda-forge
conda install spacy

For the feedstock including the build recipe and configuration, check out this repository. Improvements and pull requests to the recipe and setup are always appreciated.

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check whether your installed models are compatible and, if not, print details on how to update them:

pip install -U spacy
python -m spacy validate

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.
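
As a rough local pre-check before running validate, the version strings of spaCy and a model package can be compared. This sketch assumes that model versions follow an a.b.c scheme in which a.b mirrors the spaCy major.minor version; that scheme is an assumption here, `same_minor_series` is a hypothetical helper, and the validate command remains the authoritative check:

```python
def same_minor_series(spacy_version, model_version):
    """Compare the major.minor parts of two version strings."""
    # Assumption: a model is compatible when major.minor matches spaCy's.
    return spacy_version.split(".")[:2] == model_version.split(".")[:2]


print(same_minor_series("2.2.0", "2.2.0"))  # True
print(same_minor_series("2.2.0", "2.1.0"))  # False
```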

📖 For details on upgrading from spaCy 1.x to spaCy 2.x, see the migration guide.

Download models

As of v1.7.0, models for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

  • Available Models: Detailed model descriptions, accuracy figures and benchmarks.
  • Models Documentation: Detailed usage instructions.

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# out-of-the-box: download best-matching default model
python -m spacy download en

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
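
The archive URL above follows a regular pattern, with the package name and version appearing twice. A small sketch that rebuilds it, assuming other models and versions use the same layout; `model_archive_url` is a hypothetical helper, not a spaCy function:

```python
RELEASES_URL = "https://github.com/explosion/spacy-models/releases/download"


def model_archive_url(name, version):
    """Build the download URL for a model archive, following the pattern above."""
    package = "{}-{}".format(name, version)
    return "{}/{}/{}.tar.gz".format(RELEASES_URL, package, package)


# Reproduces the URL shown in the pip command above.
print(model_archive_url("en_core_web_sm", "2.2.0"))
```

The resulting string can be passed to pip install in the same way as the literal URL.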

Loading and using models

To load a model, use spacy.load() with the model name, a shortcut link or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp(u"This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp(u"This is a sentence.")

📖 For more info and examples, check out the models documentation.
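
Once a model is loaded, the resulting Doc can be iterated token by token. The sketch below assumes Python 3 and wraps spacy.load() in a hypothetical helper, `load_or_none`, so that the script still runs when spaCy or the model has not been installed yet:

```python
def load_or_none(name):
    """Return a loaded pipeline, or None if spaCy or the model is missing."""
    try:
        import spacy
        return spacy.load(name)
    except (ImportError, IOError, OSError):
        # ImportError: spaCy is not installed.
        # IOError/OSError: spaCy could not find a model with this name.
        return None


nlp = load_or_none("en_core_web_sm")
if nlp is None:
    print("Install the model first: python -m spacy download en_core_web_sm")
else:
    doc = nlp(u"This is a sentence.")
    for token in doc:
        # Text, part-of-speech tag and dependency label of each token.
        print(token.text, token.pos_, token.dep_)
```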

Support for older versions

If you're using an older version (v1.6.0 or below), you can still download and install the old models from within spaCy using python -m spacy.en.download all or python -m spacy.de.download all. The .tar.gz archives are also attached to the v1.6.0 release. To download and install the models manually, unpack the archive, drop the contained directory into spacy/data and load the model via spacy.load('en') or spacy.load('de').

Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual approach if you want to make changes to the code base. You'll need a development environment with a Python distribution including header files, a compiler, pip, virtualenv and git installed. The compiler is the trickiest part, and how to set it up depends on your system. See the notes on Ubuntu, OS X and Windows for details.

# make sure you are using the latest pip
python -m pip install -U pip
git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace

Compared to regular install via pip, requirements.txt additionally installs developer dependencies such as Cython. For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.
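
Before building, it can help to confirm that the prerequisites listed above are present. The following is a rough sketch, assuming Python 3.3+; the compiler executable names it searches for are an assumption, and `find_first` and `check_build_prerequisites` are hypothetical helpers:

```python
import os
import shutil
import sysconfig


def find_first(candidates):
    """Return the first executable from candidates found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def check_build_prerequisites():
    """Report which build prerequisites can be located on this machine."""
    include_dir = sysconfig.get_paths()["include"]
    return {
        "git": find_first(["git"]) is not None,
        # Common compiler executable names; this list is an assumption.
        "compiler": find_first(["cc", "gcc", "clang", "cl"]) is not None,
        # Python.h is present when the Python header files are installed.
        "python_headers": os.path.isfile(os.path.join(include_dir, "Python.h")),
    }


for name, found in sorted(check_build_prerequisites().items()):
    print("{:<15} {}".format(name, "found" if found else "missing"))
```

A "missing" compiler only means that none of the listed names is on PATH. On Windows, for instance, the Visual C++ compiler is typically available only inside a developer command prompt.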

Ubuntu

Install system-level dependencies via apt-get:

sudo apt-get install build-essential python-dev git

macOS / OS X

Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.

Windows

Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter. For official distributions these are VS 2008 (Python 2.7), VS 2010 (Python 3.4) and VS 2015 (Python 3.5).

Run tests

spaCy comes with an extensive test suite. To run the tests, you'll usually want to clone the repository and build spaCy from source. This will also install the required development dependencies and test utilities defined in requirements.txt.

Alternatively, you can find out where spaCy is installed and run pytest on that directory. Don't forget to also install the test utilities via spaCy's requirements.txt:

python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest <spacy-directory>

See the documentation for more details and examples.
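
The first command above imports spaCy to find where it is installed. An alternative sketch, assuming Python 3.4+, locates the package directory without importing it and prints the matching pytest command; `package_dir` is a hypothetical helper name:

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory of an installed top-level package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


spacy_dir = package_dir("spacy")
if spacy_dir is None:
    print("spaCy is not installed in this environment")
else:
    print("python -m pytest " + spacy_dir)
```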

This version

2.2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • spacy-2.2.0.tar.gz (5.8 MB): Source

Built Distributions

  • spacy-2.2.0-cp37-cp37m-win_amd64.whl (9.3 MB): CPython 3.7m, Windows x86-64
  • spacy-2.2.0-cp37-cp37m-manylinux1_x86_64.whl (10.2 MB): CPython 3.7m
  • spacy-2.2.0-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.0 MB): CPython 3.7m, macOS 10.10+ Intel (x86-64, i386), macOS 10.10+ x86-64, macOS 10.6+ Intel (x86-64, i386), macOS 10.9+ Intel (x86-64, i386), macOS 10.9+ x86-64
  • spacy-2.2.0-cp36-cp36m-win_amd64.whl (9.3 MB): CPython 3.6m, Windows x86-64
  • spacy-2.2.0-cp36-cp36m-manylinux1_x86_64.whl (10.2 MB): CPython 3.6m
  • spacy-2.2.0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.3 MB): CPython 3.6m, macOS 10.10+ Intel (x86-64, i386), macOS 10.10+ x86-64, macOS 10.6+ Intel (x86-64, i386), macOS 10.9+ Intel (x86-64, i386), macOS 10.9+ x86-64
  • spacy-2.2.0-cp35-cp35m-win_amd64.whl (9.2 MB): CPython 3.5m, Windows x86-64
  • spacy-2.2.0-cp35-cp35m-manylinux1_x86_64.whl (10.1 MB): CPython 3.5m
  • spacy-2.2.0-cp27-cp27mu-manylinux1_x86_64.whl (10.2 MB): CPython 2.7mu

File details

Details for the file spacy-2.2.0.tar.gz.

File metadata

  • Download URL: spacy-2.2.0.tar.gz
  • Upload date:
  • Size: 5.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.tar.gz
Algorithm Hash digest
SHA256 2921352c60d730978c7b2ebfac54045284dfa2022245a061f9d1336ef1b8cfb1
MD5 04fcfac4b49f70b59c322794b3ad4455
BLAKE2b-256 79fd13cdec8c1a3a2d57f5161e183639367d9d002965dfc893f222293fb1cc04

See more details on using hashes here.
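
A downloaded file can be checked against the SHA256 digest listed for it. The following is a minimal sketch using the standard library's hashlib; `sha256_of` and `matches_published_hash` are hypothetical helper names:

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For an intact download of spacy-2.2.0.tar.gz, calling `matches_published_hash` with the local file path and the SHA256 value from the table above should return True.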

File details

Details for the file spacy-2.2.0-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 9.3 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 bc878a190e82ae41fbfee4151c3312cd6ca52405649be0bad7ed3dd766d11d99
MD5 573ac275f7b53833a960ec965da3fa85
BLAKE2b-256 df0547bef6b21a7e6e2b660b464d9017eb15d485b9e5b33188d3002dfc8e5bb1

File details

Details for the file spacy-2.2.0-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4244c2a1a5b70b2c0785493fc8525f3ab1363130037193a66d4126420d13caf5
MD5 11262f00f7648e51094d5580c8f27335
BLAKE2b-256 4aed6f1281a10c4fe64ae8b1a5dc9ab60e7f1a09480367e1d9e2c874d5850356

File details

Details for the file spacy-2.2.0-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl.

File metadata

File hashes

Hashes for spacy-2.2.0-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Algorithm Hash digest
SHA256 463318dbfa9a5d0038cc2a5ad0dc631ed78e403b4d30a41b3f28a242c450c065
MD5 e43ab248007cbbea9e5c7cbfed9446c9
BLAKE2b-256 e451c197c7b3888b9e6b88ef1a4c706ad6dd48850560fda17d3e6ffe4974952d

File details

Details for the file spacy-2.2.0-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 9.3 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 bec973cebbbc9bf2a8370d8021430aeaea4eef008deb8e3cbe6cb2e8f939a849
MD5 73eaa483a30420f23fd1ede11cea7b8f
BLAKE2b-256 1e2f38654a83035ba7f1cc793ce5009aa95c69d3425b59bd513d08c5fa5938b3

File details

Details for the file spacy-2.2.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c960e34c0e9e22d34dc645043dd505844503d7917dda7e3f2b94e289bd25eb44
MD5 f207ba7eadb64eb1e293a141772f78ee
BLAKE2b-256 1c9610881bc82d05664c40a38cf7c645e6ab1877e5808ef4a3684a99cd1445a4

File details

Details for the file spacy-2.2.0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl.

File metadata

File hashes

Hashes for spacy-2.2.0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Algorithm Hash digest
SHA256 860a637478449dcbe9f45eae1bc0c03e5bde9757707f6de2ebe0d80022528d5e
MD5 35813c814c71a709912039426571996b
BLAKE2b-256 4546598c3aaa1a888e8428271050eab6e92f45f7ef83d9457270d8e8ab6a2c4c

File details

Details for the file spacy-2.2.0-cp35-cp35m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp35-cp35m-win_amd64.whl
  • Upload date:
  • Size: 9.2 MB
  • Tags: CPython 3.5m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 a7b0b9bbde9740979ee38d2dc124d2338ddea6db33a27243df935f71429c8876
MD5 81abb3d97842aaa764ee89baac3bda67
BLAKE2b-256 e9ad8d727e3cf9278c755eea174d3ca5e527a2aa02b1b0093ed3b9a1fc7401ba

File details

Details for the file spacy-2.2.0-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.1 MB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1eba7a94eda50ba24d1d6fc87d2f5fc6f53b36fb4d1b31ed42c98985f8095185
MD5 38f29e5c5410b3f49e5abd07be082750
BLAKE2b-256 0ccbddbbb86ac8425e52fea9953386915605836fe71982e10d846769b0868ebd

File details

Details for the file spacy-2.2.0-cp27-cp27mu-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0-cp27-cp27mu-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 2.7mu
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 d299a91d5a32ff60c482d650232df481b3c5d86596e3214708a1b22dc84882dd
MD5 b4a2f89e047fff9beff13712399f7488
BLAKE2b-256 0d3eddcef3219171c175d434d74a5edfd9c75a9bd404e4259310696aa779ee91
