OpenDataLab Python SDK

IMPORTANT: The OpenDataLab SDK is a work in progress, and compatibility between the OpenAPI and the SDK is not guaranteed. Please always use the latest version of the SDK.


The OpenDataLab Python SDK is a Python library for accessing OpenDataLab and using its open datasets.
It provides:

  • A Pythonic way to access OpenDataLab resources.
  • A convenient CLI tool, odl, to access open datasets.

Installation

$ pip3 install opendatalab
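
Since the note above recommends always running the latest release, it can be useful to check which version is currently installed. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is illustrative and is not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "opendatalab") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("opendatalab is not installed; run: pip3 install opendatalab")
    else:
        print(f"opendatalab {found} is installed")
```

If the reported version is behind the latest release, `pip3 install --upgrade opendatalab` upgrades it.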

Usage:

An account is needed to access the OpenDataLab platform. Please visit the official website to obtain an account username and password first.

Help

Show the command help:

$ odl -h
$ odl --help

Usage: odl [OPTIONS] COMMAND [ARGS]...

  You can use `odl <command>` to access open datasets.

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  get      Get(Download) dataset files into local path.
  info     Print dataset info.
  login    Login opendatalab with account.
  logout   Logout opendatalab account.
  ls       List files of the dataset.
  search   Search dataset info.
  version  Show opendatalab version.
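
The commands listed above can also be driven from a script. The sketch below assumes that the `odl` executable is on PATH after installation; the helper names `build_odl_command` and `run_odl` are hypothetical and are not part of the SDK. Interactive commands such as `odl login` prompt for input and are not a good fit for this wrapper.

```python
import shutil
import subprocess
from typing import List, Optional


def build_odl_command(*args: str) -> List[str]:
    """Build the argument list for an `odl [OPTIONS] COMMAND [ARGS]...` call."""
    return ["odl", *args]


def run_odl(*args: str) -> Optional[str]:
    """Run odl with the given arguments and return its standard output.

    Returns None when the `odl` executable is not found on PATH.
    Raises subprocess.CalledProcessError if odl exits with a non-zero status.
    """
    if shutil.which("odl") is None:
        return None
    completed = subprocess.run(
        build_odl_command(*args),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    output = run_odl("--help")
    print(output if output is not None else "odl was not found on PATH")
```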

Version

$ odl version
odl version, current: 0.0.2, svc: 1.8
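
A script that needs the version numbers can extract them from this line. The sketch below assumes the output keeps the format of the sample above; the helper `parse_version_line` is illustrative, and reading `svc` as the service-side version is an assumption.

```python
import re
from typing import Dict

# Pattern derived from the single sample line shown above (an assumption).
VERSION_LINE = re.compile(r"current:\s*(?P<current>[^,\s]+),\s*svc:\s*(?P<svc>\S+)")


def parse_version_line(line: str) -> Dict[str, str]:
    """Return the 'current' and 'svc' fields of an `odl version` output line."""
    match = VERSION_LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognized version line: {line!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_version_line("odl version, current: 0.0.2, svc: 1.8"))
```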

Login

Log in with your OpenDataLab username and password. If you do not have an OpenDataLab account, please register at https://opendatalab.com/

$ odl login
Username []: wangrui@pjlab.org.cn
Password []: 
Login successfully as wangrui@pjlab.org.cn
or
$ odl login -u wangrui@pjlab.org.cn
Password[]:

Logout

Log out of the current OpenDataLab account.

$ odl logout
Do you want to logout? [y/N]: y
wangrui@pjlab.org.cn.com logout

List Dataset Files

List the files of a dataset. A subdirectory prefix is supported.

# list all dataset files
$ odl ls MNIST
total: 4, size: 11.1M
+----------------------------+--------------+
| File Name                  | Size         |
+----------------------------+--------------+
| train-labels-idx1-ubyte.gz | 28.2K        |
+----------------------------+--------------+
| train-images-idx3-ubyte.gz | 9.5M         |
+----------------------------+--------------+
| t10k-labels-idx1-ubyte.gz  | 4.4K         |
+----------------------------+--------------+
| t10k-images-idx3-ubyte.gz  | 1.6M         |
+----------------------------+--------------+

# list sub directory files
$ odl ls MNIST/t10k
total: 2, size: 1.6M
+---------------------------+--------------+
| File Name                 | Size         |
+---------------------------+--------------+
| t10k-labels-idx1-ubyte.gz | 4.4K         |
+---------------------------+--------------+
| t10k-images-idx3-ubyte.gz | 1.6M         |
+---------------------------+--------------+

# download dataset files to a local path
# get all files of the dataset
$ odl get MNIST

# get partial files of the dataset
$ odl get MNIST/t10k
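
The `odl ls` tables shown above can be turned into Python data when a script needs the file names. The sketch below parses the sample MNIST listing and then filters it by a name prefix, which mirrors what `odl ls MNIST/t10k` appears to return in the sample. The helpers `parse_ls_table` and `filter_by_prefix` are hypothetical, and the table layout is assumed to stay as shown.

```python
from typing import List, Tuple

# Sample listing copied from the `odl ls MNIST` output above.
SAMPLE_OUTPUT = """\
total: 4, size: 11.1M
+----------------------------+--------------+
| File Name                  | Size         |
+----------------------------+--------------+
| train-labels-idx1-ubyte.gz | 28.2K        |
+----------------------------+--------------+
| train-images-idx3-ubyte.gz | 9.5M         |
+----------------------------+--------------+
| t10k-labels-idx1-ubyte.gz  | 4.4K         |
+----------------------------+--------------+
| t10k-images-idx3-ubyte.gz  | 1.6M         |
+----------------------------+--------------+
"""


def parse_ls_table(text: str) -> List[Tuple[str, str]]:
    """Extract (file name, size) pairs from an `odl ls` table."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the summary line and the +---+ borders
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "File Name":
            continue  # skip the header row and anything unexpected
        rows.append((cells[0], cells[1]))
    return rows


def filter_by_prefix(rows: List[Tuple[str, str]], prefix: str) -> List[Tuple[str, str]]:
    """Keep only the rows whose file name starts with `prefix`."""
    return [(name, size) for name, size in rows if name.startswith(prefix)]


if __name__ == "__main__":
    for name, size in filter_by_prefix(parse_ls_table(SAMPLE_OUTPUT), "t10k"):
        print(name, size)
```

The parser splits on `|`, so it would need adjusting for file names that contain that character.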

Python Development Sample

import json

from opendatalab.__version__ import __url__
from opendatalab.cli.get import implement_get
from opendatalab.cli.info import implement_info
from opendatalab.cli.login import implement_login
from opendatalab.cli.ls import implement_ls
from opendatalab.cli.search import implement_search
from opendatalab.cli.utility import ContextInfo

if __name__ == '__main__':
    # ContextInfo with default settings.
    # Please log in from the shell first, with: opendatalab login
    ctx = ContextInfo(__url__, "")
    client = ctx.get_client()
    odl_api = client.get_api()

    # 0. log in with an account
    # account = "xxxxx"  # your username
    # pw = "xxxxx"  # your password
    # print('*****' * 8)
    # implement_login(ctx, account, pw)

    # 1. search demo
    res_list = odl_api.search_dataset("coco")
    for index, res in enumerate(res_list):
        print(f"index: {index}, result: {res['name']}")

    # implement_search("coco")
    print('*****' * 8)

    # 2. list demo
    implement_ls(ctx, 'TAO')
    print('*****' * 8)

    # 3. read a file online demo
    dataset = client.get_dataset('FB15k')
    with dataset.get('meta/info.json', False) as fd:
        content = json.load(fd)
        print(content)
    print('*****' * 8)

    # 4. get dataset info
    implement_info(ctx, 'FB15k')

    # 5. download
    # get all files of the dataset
    # implement_get(ctx, "MNIST", 4, 0)

    # get partial files of the dataset
    implement_get(ctx, "GOT-10k/data/test_data.zip", 4, 0)  # 139, zip 1.16G GOT-10k
    print('*****' * 5)

Documentation

More information can be found on the documentation site
