ODPS Python SDK and data analysis framework
An elegant way to access the ODPS API.
Installation
The quick way:
pip install pyodps[full]
If you do not need Jupyter support, install the base package instead:
pip install pyodps
The dependencies will be installed automatically.
Or from source code:
$ virtualenv pyodps_env
$ source pyodps_env/bin/activate
$ git clone <git clone URL> pyodps
$ cd pyodps
$ python setup.py install
Dependencies
Python (>=2.6), including Python 3+ and PyPy; Python 2.7 is recommended
setuptools (>=3.0)
requests (>=2.4.0)
Run Unit Tests
Copy conf/test.conf.template to odps/tests/test.conf and fill in your account information.
Run python -m unittest discover
Usage
>>> from odps import ODPS
>>> o = ODPS('**your-access-id**', '**your-secret-access-key**',
... project='**your-project**', endpoint='**your-end-point**')
>>> dual = o.get_table('dual')
>>> dual.name
'dual'
>>> dual.schema
odps.Schema {
  c_int_a           bigint
  c_int_b           bigint
  c_double_a        double
  c_double_b        double
  c_string_a        string
  c_string_b        string
  c_bool_a          boolean
  c_bool_b          boolean
  c_datetime_a      datetime
  c_datetime_b      datetime
}
>>> dual.creation_time
datetime.datetime(2014, 6, 6, 13, 28, 24)
>>> dual.is_virtual_view
False
>>> dual.size
448
>>> dual.schema.columns
[<column c_int_a, type bigint>,
<column c_int_b, type bigint>,
<column c_double_a, type double>,
<column c_double_b, type double>,
<column c_string_a, type string>,
<column c_string_b, type string>,
<column c_bool_a, type boolean>,
<column c_bool_b, type boolean>,
<column c_datetime_a, type datetime>,
<column c_datetime_b, type datetime>]
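The session above hard-codes the access ID and secret key. A minimal sketch of one alternative is to read them from environment variables before creating the entry object. The variable names (ODPS_ACCESS_ID and so on) and both helper functions are hypothetical, chosen only for illustration; the ODPS(...) call itself mirrors the one shown above.

```python
import os

# Hypothetical environment variable names mapped to the setting they carry.
ENV_TO_SETTING = {
    "ODPS_ACCESS_ID": "access_id",
    "ODPS_SECRET_ACCESS_KEY": "secret_access_key",
    "ODPS_PROJECT": "project",
    "ODPS_ENDPOINT": "endpoint",
}


def odps_settings_from_env(environ=None):
    """Collect connection settings from a mapping (os.environ by default)."""
    environ = os.environ if environ is None else environ
    missing = sorted(name for name in ENV_TO_SETTING if not environ.get(name))
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {setting: environ[name] for name, setting in ENV_TO_SETTING.items()}


def connect(environ=None):
    """Create the ODPS entry object; pyodps is imported only when called."""
    from odps import ODPS

    cfg = odps_settings_from_env(environ)
    # Same argument layout as the usage example: two positional credentials,
    # then project and endpoint as keywords.
    return ODPS(cfg["access_id"], cfg["secret_access_key"],
                project=cfg["project"], endpoint=cfg["endpoint"])
```

With the four variables exported in the shell, o = connect() would then stand in for the explicit ODPS(...) call, and the credentials stay out of scripts and notebooks.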
DataFrame API
>>> from odps.df import DataFrame
>>> df = DataFrame(o.get_table('pyodps_iris'))
>>> df.dtypes
odps.Schema {
  sepallength           float64
  sepalwidth            float64
  petallength           float64
  petalwidth            float64
  name                  string
}
>>> df.head(5)
|==========================================| 1 / 1 (100.00%) 0s
   sepallength  sepalwidth  petallength  petalwidth         name
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa
>>> df[df.sepalwidth > 3]['name', 'sepalwidth'].head(5)
|==========================================| 1 / 1 (100.00%) 12s
          name  sepalwidth
0  Iris-setosa         3.5
1  Iris-setosa         3.2
2  Iris-setosa         3.1
3  Iris-setosa         3.6
4  Iris-setosa         3.9
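To make the semantics of df[df.sepalwidth > 3]['name', 'sepalwidth'] concrete, here is a small plain-Python sketch of the same two steps, a row filter followed by a column projection, applied to the five rows printed by df.head(5) above. It does not use PyODPS and says nothing about how the DataFrame API executes the expression on the service; the helper filter_and_project is hypothetical.

```python
# The five rows shown by df.head(5), copied from the session output.
COLUMNS = ["sepallength", "sepalwidth", "petallength", "petalwidth", "name"]
VALUES = [
    (5.1, 3.5, 1.4, 0.2, "Iris-setosa"),
    (4.9, 3.0, 1.4, 0.2, "Iris-setosa"),
    (4.7, 3.2, 1.3, 0.2, "Iris-setosa"),
    (4.6, 3.1, 1.5, 0.2, "Iris-setosa"),
    (5.0, 3.6, 1.4, 0.2, "Iris-setosa"),
]
rows = [dict(zip(COLUMNS, values)) for values in VALUES]


def filter_and_project(records, predicate, columns):
    """Keep records matching predicate, then keep only the named columns."""
    return [
        {column: record[column] for column in columns}
        for record in records
        if predicate(record)
    ]


selected = filter_and_project(
    rows,
    lambda record: record["sepalwidth"] > 3,  # df.sepalwidth > 3
    ["name", "sepalwidth"],                   # ['name', 'sepalwidth']
)

for record in selected:
    print(record["name"], record["sepalwidth"])
```

The row with sepalwidth 3.0 is dropped because the comparison is strict, which is why the filtered output in the session starts 3.5, 3.2, 3.1, 3.6.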
Command-line and IPython enhancement
In [1]: %load_ext odps

In [2]: %enter
Out[2]: <odps.inter.Room at 0x10fe0e450>

In [3]: %sql select * from pyodps_iris limit 5
|==========================================| 1 / 1 (100.00%) 2s
Out[3]:
   sepallength  sepalwidth  petallength  petalwidth         name
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa
Python UDF Debugging Tool
# file: plus.py
from odps.udf import annotate


@annotate('bigint,bigint->bigint')
class Plus(object):
    def evaluate(self, a, b):
        return a + b
$ cat plus.input
1,1
3,2
$ pyou plus.Plus < plus.input
2
5
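The pyou run above feeds each comma-separated input line to evaluate and prints the result. As a rough sketch of that idea only (this is not how pyou is implemented), the same check can be done in plain Python. The helper run_udf is hypothetical, it assumes every field is a bigint, and the fallback annotate is a no-op stand-in used only when pyodps is not installed.

```python
try:
    from odps.udf import annotate
except ImportError:
    # Stand-in so the sketch runs without pyodps: a decorator that
    # accepts the signature string and returns the class unchanged.
    def annotate(signature):
        def decorator(cls):
            return cls
        return decorator


@annotate('bigint,bigint->bigint')
class Plus(object):
    def evaluate(self, a, b):
        return a + b


def run_udf(udf_class, lines):
    """Call evaluate once per non-empty line of comma-separated integers."""
    udf = udf_class()
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        args = [int(field) for field in line.split(',')]
        results.append(udf.evaluate(*args))
    return results


# Same input as plus.input in the example above.
for value in run_udf(Plus, ["1,1", "3,2"]):
    print(value)
```

This prints 2 and 5, matching the pyou output shown above. Keeping evaluate free of side effects is what makes this kind of local check possible before the UDF is deployed.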
Contributing
For a development install, clone the repository and then install from source:
git clone https://github.com/aliyun/aliyun-odps-python-sdk
cd aliyun-odps-python-sdk
pip install -r requirements.txt -e .
To modify the frontend code, install nodejs/npm first. Then build and install the frontend code with:
python setup.py build_js
python setup.py install_js
License
Licensed under the Apache License 2.0