Text-related datasets

This datamaestro plugin covers text-related datasets:

  • Information Retrieval
  • Natural Language Processing tasks

The list of available datasets and usage instructions can be found in the documentation.

List of available datasets

The available datasets are listed below with their ids. Some datasets have several versions; in that case, the dataset id carries a suffix identifying the version.

Documents

  • Aquaint edu.upenn.ldc.aquaint
  • TIPSTER gov.nist.trec.tipster
  • WikiText-2 and WikiText-103 io.metamind.research.wikitext

Word embeddings

  • GloVe edu.stanford.glove

Sentiment analysis

  • IMDB edu.stanford.aclimdb

Information Retrieval

TREC
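
The dataset ids listed above follow a reverse-domain naming style (for example, edu.stanford.glove). As a minimal sketch of how that structure can be used, the snippet below groups the listed ids in a plain dictionary and filters them by prefix. The names CATALOG and ids_by_publisher are hypothetical and introduced only for illustration; this is not datamaestro's API, and the catalog contains only the ids given in this description.

```python
# Illustrative catalog built only from the ids listed in this description.
# CATALOG and ids_by_publisher are hypothetical names, not part of datamaestro.
CATALOG = {
    "Documents": {
        "Aquaint": "edu.upenn.ldc.aquaint",
        "TIPSTER": "gov.nist.trec.tipster",
        "WikiText-2 and WikiText-103": "io.metamind.research.wikitext",
    },
    "Word embeddings": {"GloVe": "edu.stanford.glove"},
    "Sentiment analysis": {"IMDB": "edu.stanford.aclimdb"},
}


def ids_by_publisher(catalog, prefix):
    """Return the sorted ids whose dotted components start with `prefix`."""
    return sorted(
        dataset_id
        for datasets in catalog.values()
        for dataset_id in datasets.values()
        # Match on whole dotted components so "edu.stan" does not match "edu.stanford".
        if dataset_id == prefix or dataset_id.startswith(prefix + ".")
    )


if __name__ == "__main__":
    print(ids_by_publisher(CATALOG, "edu.stanford"))
```

Matching on whole dotted components, rather than on a raw string prefix, keeps a partial component such as edu.stan from matching edu.stanford by accident.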

Source distribution: datamaestro_text-2019.12.5.tar.gz (13.1 kB)