Datalabs

DataLab API CN

Installation

Install the latest release from PyPI:

```shell
pip install --upgrade pip
pip install datalabs
```  

Or install from source:

```shell
pip install --upgrade pip
git clone https://github.com/ExpressAI/Datalab.git
cd Datalab
pip install .
```
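
To confirm that either install route worked, a quick check from Python can help. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, and it assumes the distribution and the importable module are both called `datalabs`, as the commands above and the import in the next section suggest.

```python
from importlib import metadata, util
from typing import Optional


def installed_version(package: str, module: str) -> Optional[str]:
    """Return the installed version of `package`, or None if `module` cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    if util.find_spec(module) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The module is importable but has no distribution metadata
        # (for example, it was put on the path by hand).
        return None


# Prints a version string if datalabs is installed, otherwise None.
print(installed_version("datalabs", "datalabs"))
```

A `None` result means the package is not visible to the interpreter that ran the check, which is often a sign that `pip` and `python` point at different environments.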

Dataset Operation

```python
# pip install datalabs
from datalabs import operations, load_dataset
from featurize import *

dataset = load_dataset("ag_news")

# print the task schema
print(dataset["test"]._info.task_templates)

# data operators: compute the text length of each sample
res = dataset["test"].apply(get_text_length)
print(next(res))

# get entities (spaCy)
res = dataset["test"].apply(get_entity_spacy)
print(next(res))

# get part-of-speech tags (spaCy)
res = dataset["test"].apply(get_postag_spacy)
print(next(res))

from edit import *

# add typos
res = dataset["test"].apply(add_typo)
print(next(res))

# change person names
res = dataset["test"].apply(change_person_name)
print(next(res))
```
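
The calls above read results with `next(res)`, which suggests that `apply` returns a lazy iterator over per-sample results. The following is a minimal pure-Python sketch of that pattern, not the datalabs implementation: `text_length`, `apply_lazily`, the sample records, and the shape of the returned dictionaries are all hypothetical stand-ins, chosen so the snippet runs without downloading a dataset.

```python
from typing import Callable, Dict, Iterable, Iterator

Sample = Dict[str, object]


def text_length(sample: Sample) -> Dict[str, int]:
    """Hypothetical operator: count whitespace-separated tokens in the 'text' field."""
    return {"text_length": len(str(sample["text"]).split())}


def apply_lazily(
    samples: Iterable[Sample], operator: Callable[[Sample], Dict]
) -> Iterator[Dict]:
    """Yield operator(sample) one at a time; nothing is computed until iteration."""
    for sample in samples:
        yield operator(sample)


# Made-up records in the text-classification layout (text plus label).
samples = [
    {"text": "Wall St. bears claw back into the black", "label": 2},
    {"text": "Oil prices soar", "label": 2},
]

res = apply_lazily(samples, text_length)
print(next(res))  # result for the first sample only
print(list(res))  # results for the remaining samples
```

Because the iterator is consumed as it is read, call the apply step again if the results are needed a second time.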

Task Schema

  • text-classification

    • text:str
    • label:ClassLabel
  • text-matching

    • text1:str
    • text2:str
    • label:ClassLabel
  • summarization

    • text:str
    • summary:str
  • sequence-labeling

    • tokens:List[str]
    • tags:List[ClassLabel]
  • question-answering-extractive

    • context:str
    • question:str
    • answers:List[{"text":"","answer_start":""}]

Use `dataset[SPLIT]._info.task_templates` to get further task-dependent information, where `SPLIT` is `train`, `validation`, or `test`.
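
The schemas listed above can be read as a set of required fields per task. The sketch below is one hypothetical way to check a sample against them in plain Python; `REQUIRED_FIELDS`, `missing_fields`, and the example records are illustrative and are not part of datalabs. It assumes that `ClassLabel` values appear as integer class ids and that `answer_start` is a character offset into `context`, neither of which the list above states.

```python
from typing import Dict, List, Set

# Required top-level fields per task, copied from the schema list above.
REQUIRED_FIELDS: Dict[str, Set[str]] = {
    "text-classification": {"text", "label"},
    "text-matching": {"text1", "text2", "label"},
    "summarization": {"text", "summary"},
    "sequence-labeling": {"tokens", "tags"},
    "question-answering-extractive": {"context", "question", "answers"},
}


def missing_fields(task: str, sample: Dict[str, object]) -> List[str]:
    """Return the required fields for `task` that `sample` lacks, sorted by name."""
    return sorted(REQUIRED_FIELDS[task] - set(sample))


# Made-up sample in the question-answering-extractive layout.
qa_sample = {
    "context": "Datalab is a platform for data analysis.",
    "question": "What is Datalab?",
    # Assumption: answer_start is an integer character offset into context.
    "answers": [{"text": "a platform", "answer_start": 11}],
}

print(missing_fields("question-answering-extractive", qa_sample))
print(missing_fields("summarization", {"text": "A long article."}))
```

A check like this only covers field names; it does not validate value types or label ranges.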

Supported Datasets
