spaCy pipelines for pre-trained BERT and other transformers
spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
This package provides spaCy components and architectures to use transformer models via Hugging Face's transformers in spaCy. The result is convenient access to state-of-the-art transformer architectures, such as BERT, GPT-2 and XLNet.
🌙 This is a pre-release and requires spaCy v3 (nightly). For the previous version of this library, see the v0.6.x branch.
Features
- Use pretrained transformer models like BERT, RoBERTa and XLNet to power your spaCy pipeline.
- Easy multi-task learning: backprop to one transformer model from several pipeline components.
- Train using spaCy v3's powerful and extensible config system.
- Automatic alignment of transformer output to spaCy's tokenization.
- Easily customize what transformer data is saved in the Doc object (see the example after this list).
- Easily customize how long documents are processed.
- Out-of-the-box serialization and model packaging.
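As an example of the last few points: once a transformer-powered pipeline is loaded, the transformer output is available on the doc._.trf_data extension attribute. A minimal sketch, assuming the en_core_web_trf pipeline has already been downloaded (attribute names follow this package's v1.x API):

```python
import spacy

# Assumes en_core_web_trf has been downloaded, e.g. via:
#   python -m spacy download en_core_web_trf
nlp = spacy.load("en_core_web_trf")
doc = nlp("Apple shares rose after the announcement.")

# The transformer component stores its output on a custom extension.
trf_data = doc._.trf_data
print(trf_data.tensors[0].shape)  # wordpiece-level hidden states
print(trf_data.align)             # alignment between spaCy tokens and wordpieces
```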
🚀 Installation
Installing the package from pip will automatically install all dependencies, including PyTorch and spaCy. Make sure you install this package before you install the models. Also note that this package requires Python 3.6+, PyTorch v1.5+ and spaCy v3.0+.
pip install spacy-nightly[transformers] --pre
For GPU installation, find your CUDA version using nvcc --version and add the version in brackets, e.g. spacy-nightly[transformers,cuda92] for CUDA 9.2 or spacy-nightly[transformers,cuda100] for CUDA 10.0.
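For example, to install with CUDA 10.0 support:
pip install spacy-nightly[transformers,cuda100] --pre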
If you are having trouble installing PyTorch, follow the instructions on the official website for your specific operating system and requirements, or try the following:
pip install spacy-transformers --pre -f https://download.pytorch.org/whl/torch_stable.html
📖 Documentation
⚠️ Important note: This package has been extensively refactored to take advantage of spaCy v3.0. Previous versions that were built for spaCy v2.x worked considerably differently. Please see previous tagged versions of this README for documentation on prior versions.
- 📘 Embeddings, Transformers and Transfer Learning: How to use transformers in spaCy
- 📘 Training Pipelines and Models: Train and update components on your own data and integrate custom models
- 📘 Layers and Model Architectures: Power spaCy components with custom neural networks
- 📗 Transformer: Pipeline component API reference
- 📗 Transformer architectures: Architectures and registered functions
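To give a flavour of the Transformer component API, here is a minimal sketch of adding the component to a blank pipeline. The model name bert-base-uncased is only an example, and the config values mirror the component's documented defaults:

```python
import spacy

nlp = spacy.blank("en")

# Configure the Transformer component via registered functions. The
# Hugging Face model name here is an example; any compatible model works.
config = {
    "model": {
        "@architectures": "spacy-transformers.TransformerModel.v1",
        "name": "bert-base-uncased",
        "tokenizer_config": {"use_fast": True},
        "get_spans": {
            # Process documents in overlapping strided windows of wordpieces.
            "@span_getters": "spacy-transformers.strided_spans.v1",
            "window": 128,
            "stride": 96,
        },
    }
}
nlp.add_pipe("transformer", config=config)
nlp.initialize()  # loads the transformer weights

doc = nlp("This is a sentence.")
print(doc._.trf_data.tensors[0].shape)
```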
Hashes for spacy-transformers-1.0.0rc3.dev3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | f3d62ad8be5299563fde5e261e28538789501e3891495d2d09f0c719aba95d8b
MD5 | e94036dc2d7d6768de9ee736025adf68
BLAKE2b-256 | b32646b7da6626d4547ef721f65ef24dce7e5917bb836ec4de1ea77994ca127b
Hashes for spacy_transformers-1.0.0rc3.dev3-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0e3f828370efc879038fc3f6ed90da2a7c268a73bfdd33a45bf5b6b2c9b04119
MD5 | 67a4880ba14a1d272f1fce6a5696afb5
BLAKE2b-256 | bf20e27beb995f3068c6a25fd53b988577b525a274f2ead2e36285e6e3e74972