
Text utilities, models, transforms, and datasets for PyTorch.


torchtext

This repository consists of:

Installation

We recommend Anaconda as a Python package management system. Please refer to pytorch.org for the details of PyTorch installation. The following are the corresponding torchtext versions and supported Python versions.

Version Compatibility

PyTorch version    torchtext version    Supported Python version
nightly build      main                 >=3.8, <=3.11
1.14.0             0.15.0               >=3.8, <=3.11
1.13.0             0.14.0               >=3.7, <=3.10
1.12.0             0.13.0               >=3.7, <=3.10
1.11.0             0.12.0               >=3.6, <=3.9
1.10.0             0.11.0               >=3.6, <=3.9
1.9.1              0.10.1               >=3.6, <=3.9
1.9                0.10                 >=3.6, <=3.9
1.8.1              0.9.1                >=3.6, <=3.9
1.8                0.9                  >=3.6, <=3.9
1.7.1              0.8.1                >=3.6, <=3.9
1.7                0.8                  >=3.6, <=3.8
1.6                0.7                  >=3.6, <=3.8
1.5                0.6                  >=3.5, <=3.8
1.4                0.5                  2.7, >=3.5, <=3.8
0.4 and below      0.2.3                2.7, >=3.5, <=3.8
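
The pairing can also be looked up in code. The sketch below copies a few rows of the compatibility table into a dictionary. The names COMPAT and torchtext_for are hypothetical helpers for illustration and are not part of torchtext.

```python
# Rows copied from the compatibility table:
# PyTorch version -> (torchtext version, supported Python range).
COMPAT = {
    "1.14.0": ("0.15.0", ">=3.8, <=3.11"),
    "1.13.0": ("0.14.0", ">=3.7, <=3.10"),
    "1.12.0": ("0.13.0", ">=3.7, <=3.10"),
    "1.11.0": ("0.12.0", ">=3.6, <=3.9"),
    "1.10.0": ("0.11.0", ">=3.6, <=3.9"),
}


def torchtext_for(pytorch_version):
    """Return the torchtext version paired with a PyTorch version, or None."""
    # Drop a local build suffix such as "+cu116" before the lookup.
    base = pytorch_version.split("+")[0]
    row = COMPAT.get(base)
    return row[0] if row else None


print(torchtext_for("1.13.0"))        # 0.14.0
print(torchtext_for("1.12.0+cu116"))  # 0.13.0
```

Versions that are not in the dictionary return None, so a caller can fall back to reading the full table.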

Using conda:

conda install -c pytorch torchtext

Using pip:

pip install torchtext
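
After installing, it can help to confirm which versions ended up in the environment. This is a minimal sketch using only the standard library (Python 3.8 or later); installed_version is a hypothetical helper name, not a torchtext function.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions to compare against the compatibility table.
for name in ("torch", "torchtext"):
    print(name, installed_version(name))
```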

Optional requirements

If you want to use the English tokenizer from spaCy, you need to install spaCy and download its English model:

pip install spacy
python -m spacy download en_core_web_sm
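
Once the model is downloaded, the tokenizer can be requested through torchtext's get_tokenizer helper. The sketch below assumes the en_core_web_sm model named above. The whitespace fallback and the load_tokenizer name are illustrative additions so the snippet still runs when spaCy or the model is missing.

```python
def load_tokenizer():
    """Return the spaCy-backed tokenizer if available, else whitespace split."""
    try:
        from torchtext.data.utils import get_tokenizer
        return get_tokenizer("spacy", language="en_core_web_sm")
    except (ImportError, OSError, AttributeError):
        # torchtext, spaCy, or the en_core_web_sm model is missing.
        return str.split


tokenizer = load_tokenizer()
print(tokenizer("torchtext provides text utilities"))
```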

Alternatively, you might want to use the Moses tokenizer port in SacreMoses (split from NLTK). You have to install SacreMoses:

pip install sacremoses
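
As a rough illustration, SacreMoses can also be called directly. The sketch below uses its MosesTokenizer class and falls back to whitespace splitting when the package is not installed. The moses_tokenize wrapper is a hypothetical name used only for this example.

```python
def moses_tokenize(text):
    """Tokenize with SacreMoses when installed, else split on whitespace."""
    try:
        from sacremoses import MosesTokenizer
    except ImportError:
        return text.split()
    return MosesTokenizer(lang="en").tokenize(text)


print(moses_tokenize("Hello world"))
```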

For torchtext 0.5 and below, install sentencepiece:

conda install -c powerai sentencepiece

Building from source

To build torchtext from source, you need git, CMake, and a C++11 compiler such as g++:

git clone https://github.com/pytorch/text torchtext
cd torchtext
git submodule update --init --recursive

# Linux
python setup.py clean install

# OSX
CC=clang CXX=clang++ python setup.py clean install

# or "python setup.py develop" if you are making modifications.

Note

When building from source, make sure that you use the same C++ compiler as the one used to build PyTorch. A simple way is to build PyTorch from source and use the same environment to build torchtext. If you are using the nightly build of PyTorch, check out the environment it was built with: conda (here) and pip (here).

Additionally, datasets in torchtext are implemented using the torchdata library. Please take a look at the installation instructions to download the latest nightlies or install from source.

Documentation

Find the documentation here.

Datasets

The datasets module currently contains:

  • Language modeling: WikiText2, WikiText103, PennTreebank, EnWik9

  • Machine translation: IWSLT2016, IWSLT2017, Multi30k

  • Sequence tagging (e.g. POS/NER): UDPOS, CoNLL2000Chunking

  • Question answering: SQuAD1, SQuAD2

  • Text classification: SST2, AG_NEWS, SogouNews, DBpedia, YelpReviewPolarity, YelpReviewFull, YahooAnswers, AmazonReviewPolarity, AmazonReviewFull, IMDB

  • Model pre-training: CC-100
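
The datasets are iterable, so a few examples can be inspected without reading a whole split. The sketch below is an assumption-laden illustration: preview and load_ag_news_train are hypothetical helper names, the AG_NEWS call needs torchtext and torchdata installed and downloads data on first use, and the stand-in list only mimics the (label, text) pairs that AG_NEWS yields so the snippet runs offline.

```python
from itertools import islice


def preview(examples, n=3):
    """Return the first n examples from any iterable dataset."""
    return list(islice(examples, n))


def load_ag_news_train():
    """Build the AG_NEWS training split.

    Requires torchtext and torchdata; the data is downloaded on first use.
    """
    from torchtext.datasets import AG_NEWS
    return AG_NEWS(split="train")


# Stand-in data with the same (label, text) shape, so the sketch runs offline.
fake = [
    (3, "Stocks rally"),
    (4, "New chip announced"),
    (1, "Talks resume"),
    (2, "Cup final"),
]
print(preview(fake, n=2))
# With the libraries installed: preview(load_ag_news_train())
```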

Models

The library currently consists of the following pre-trained models:

Tokenizers

The transforms module currently supports the following scriptable tokenizers:

Tutorials

To get started with torchtext, users may refer to the following tutorial available on the PyTorch website.

Disclaimer on Datasets

This is a utility library that downloads and prepares public datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have license to use the dataset. It is your responsibility to determine whether you have permission to use the dataset under the dataset’s license.

If you’re a dataset owner and wish to update any part of it (description, citation, etc.), or do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thanks for your contribution to the ML community!

