
Datahub Python SDK

Project description


Elegant way to access the Datahub Python SDK API. See the documentation.

Installation

The quick way:

$ sudo pip install pydatahub

The dependencies will be installed automatically.

If the network is not available, the requirements are in the dependency folder:

$ cd dependency
$ pip install -r dependency.txt

Or from source code:

$ virtualenv pydatahub_env
$ source pydatahub_env/bin/activate
$ git clone <git clone URL> pydatahub
$ cd pydatahub
$ python setup.py install

Python Version

Tested on Python 2.7, 3.3, 3.4, 3.5, 3.6 and PyPy. Python 3.6 is recommended.

Dependencies

  • setuptools (>=3.0)

  • requests (>=2.4.0)

  • simplejson (>=3.3.0)

  • six (>=1.1.0)

  • enum34 (>=1.1.5 for python_version < '3.4')
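
The enum34 condition above depends on the interpreter version. As a minimal sketch (the helper name needs_enum34 is hypothetical and not part of the SDK), the same check can be written in Python:

```python
import sys


def needs_enum34(version_info=None):
    """Return True when the interpreter is older than Python 3.4.

    Hypothetical helper for illustration only: it mirrors the
    "python_version < '3.4'" condition from the dependency list.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor), as the condition does.
    return tuple(version_info[:2]) < (3, 4)
```

Calling needs_enum34() with no argument checks the running interpreter; passing a tuple such as (2, 7) checks another version.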

Run Tests

  • install tox:

$ pip install -U tox

  • fill datahub/tests/datahub.ini with your configuration

  • run tox:

$ tox
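
The layout of datahub/tests/datahub.ini is not described here. The sketch below only illustrates how an INI file of this kind could be read with the standard library on Python 3. The section name "datahub" and the helper load_datahub_config are assumptions, so check the file shipped with the source for the real section and key names.

```python
import configparser


def load_datahub_config(path, section="datahub"):
    """Read one section of an INI file into a plain dict.

    Assumption: the section name "datahub" is hypothetical; the real
    datahub.ini may use a different section and different keys.
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read returns the list of files it could parse.
    if not parser.read(path):
        raise IOError("cannot read config file: %s" % path)
    # Raises KeyError if the section is missing.
    return dict(parser[section])
```

The returned dict maps option names to string values, which can then be passed to the DataHub constructor shown in the Usage section.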

Usage

from datahub import DataHub
dh = DataHub('**your-access-id**', '**your-secret-access-key**', endpoint='**your-end-point**')

# ============================= create project =============================

project_name = 'my_project_name'
comment = 'my project'
dh.create_project(project_name, comment)

# ============================= get project =============================

project_result = dh.get_project(project_name)
print(project_result)

# ============================= create tuple topic =============================

from datahub.models import RecordSchema, FieldType
topic_name='tuple_topic_test'
shard_count = 3
life_cycle = 7
comment = 'tuple topic'
record_schema = RecordSchema.from_lists(['bigint_field', 'string_field', 'double_field', 'bool_field', 'time_field'],
                                        [FieldType.BIGINT, FieldType.STRING, FieldType.DOUBLE, FieldType.BOOLEAN, FieldType.TIMESTAMP])
dh.create_tuple_topic(project_name, topic_name, shard_count, life_cycle, record_schema, comment)

# ============================= create blob topic =============================

topic_name='blob_topic_test'
shard_count = 3
life_cycle = 7
comment = 'blob topic'
dh.create_blob_topic(project_name, topic_name, shard_count, life_cycle, comment)

# ============================= get topic =============================

topic_result = dh.get_topic(project_name, topic_name)
print(topic_result)
print(topic_result.record_schema)

# ============================= list shard =============================

shards_result = dh.list_shards(project_name, topic_name)
print(shards_result)

# ============================= put tuple records =============================

from datahub.models import TupleRecord
records0 = []
record0 = TupleRecord(schema=record_schema, values=[1, 'yc1', 10.01, True, 1455869335000000])
record0.shard_id = '0'
record0.put_attribute('AK', '47')
records0.append(record0)
put_result = dh.put_records(project_name, 'tuple_topic_test', records0)
print(put_result)

# ============================= put blob records =============================

import os
from datahub.models import BlobRecord
# root_path must point to a local checkout of this repository
data = None
with open(os.path.join(root_path, 'tests/resources/datahub.png'), 'rb') as f:
    data = f.read()
records1 = []
record1 = BlobRecord(blob_data=data)
record1.shard_id = '0'
record1.put_attribute('a', 'b')
records1.append(record1)
put_result = dh.put_records(project_name, 'blob_topic_test', records1)
print(put_result)

# ============================= get cursor =============================

from datahub.models import CursorType
cursor_result = dh.get_cursor(project_name, topic_name, '0', CursorType.OLDEST)
print(cursor_result)
# keep the cursor value for the get-records calls
cursor = cursor_result.cursor

# ============================= get blob records =============================

get_result = dh.get_blob_records(project_name, topic_name, '0', cursor, 10)
print(get_result)
print(get_result.records)
print(get_result.records[0])

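# ============================= get tuple records =============================

# tuple records live in the tuple topic, so fetch a cursor for that topic first
tuple_cursor_result = dh.get_cursor(project_name, 'tuple_topic_test', '0', CursorType.OLDEST)
get_result = dh.get_tuple_records(project_name, 'tuple_topic_test', '0', record_schema, tuple_cursor_result.cursor, 10)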
print(get_result)
print(get_result.records)
print(get_result.records[0].values)
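
The get-records calls above return a single batch. A minimal sketch of reading a shard batch by batch follows. The helper drain_shard is hypothetical, and it assumes each result exposes records and next_cursor attributes; verify this against the SDK documentation before relying on it.

```python
def drain_shard(fetch, cursor, limit=10, max_batches=100):
    """Collect records batch by batch until an empty batch comes back.

    fetch is any callable taking (cursor, limit), for example:
        lambda c, n: dh.get_blob_records(project_name, topic_name, '0', c, n)
    Assumption: the returned object has .records and .next_cursor.
    """
    collected = []
    # max_batches bounds the loop so a busy shard cannot spin forever.
    for _ in range(max_batches):
        result = fetch(cursor, limit)
        if not result.records:
            break
        collected.extend(result.records)
        cursor = result.next_cursor
    return collected
```

Passing the fetch step as a callable keeps the loop independent of whether the topic holds tuple or blob records.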

Examples

See more examples in the examples directory.

Release

Update changelog, then use bumpversion to update version:

  1. bugfix: bumpversion patch

  2. small feature: bumpversion minor

  3. breaking change: bumpversion major
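
To illustrate what each of the three commands does to a version number, here is a minimal sketch. The function bump is hypothetical and only models a plain major.minor.patch scheme; bumpversion itself follows whatever is set in its own configuration.

```python
def bump(version, part):
    """Return version with the given part incremented.

    Illustrative only: assumes a plain "major.minor.patch" string.
    Lower-order parts are reset to zero when a higher part is bumped.
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return "%d.0.0" % (major + 1)
    if part == "minor":
        return "%d.%d.0" % (major, minor + 1)
    if part == "patch":
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("part must be 'major', 'minor' or 'patch'")
```

For example, bump("0.0.6", "patch") gives "0.0.7", while bump("0.0.6", "minor") gives "0.1.0".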

Contributing

For a development install, clone the repository and then install from source:

git clone https://github.com/aliyun/aliyun-datahub-sdk-python.git

License

Licensed under the Apache License 2.0

