Octoparse
Unofficial Octoparse API client in Python.
Supports the Advanced API and use from China.
Installation:
Use pip to install:
pip install octoparse
Credentials:
Three methods are supported:
1) Support for environment variables
Set the following environment variables:
export OCTOPARSE_USERNAME=octoparse_user
export OCTOPARSE_PASSWORD=octoparse_passwd
2) Support for .env file
Include the following in a .env file in the script directory:
OCTOPARSE_USERNAME=octoparse_user
OCTOPARSE_PASSWORD=octoparse_passwd
3) Manual input of username & password
Enter the username & password once at the prompt:
Enter Octoparse Username: octoparse_user
Password:
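The three credential sources above can be pictured as a simple fallback chain. The sketch below is illustrative only and is not the library's actual code: the function names `parse_dotenv` and `resolve_credentials`, and the order of precedence (environment variables, then .env, then prompt), are assumptions. Only the variable names OCTOPARSE_USERNAME and OCTOPARSE_PASSWORD and the prompt text come from this README.

```python
import os
from getpass import getpass
from pathlib import Path


def parse_dotenv(path):
    """Parse simple KEY=VALUE lines from a .env file; return {} if it is missing."""
    values = {}
    p = Path(path)
    if not p.is_file():
        return values
    for line in p.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_credentials(environ=None, dotenv_path=".env", prompt=True):
    """Return (username, password), trying env vars, then .env, then a prompt.

    The precedence order is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    for source in (environ, parse_dotenv(dotenv_path)):
        user = source.get("OCTOPARSE_USERNAME")
        password = source.get("OCTOPARSE_PASSWORD")
        if user and password:
            return user, password
    if not prompt:
        return None
    # Hypothetical fallback mirroring the interactive prompt shown above.
    return input("Enter Octoparse Username: "), getpass("Password: ")
```

Passing `prompt=False` makes the function return None instead of blocking on input, which is convenient in scripts that run unattended.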
Example usage:
from octoparse import Octoparse
# Initialize the API client.
# It will try to log in and ask for credentials if required.
octo = Octoparse()
# if using advanced API:
octo = Octoparse(advanced_api=True)
# if using from China:
octo = Octoparse(china=True)
# List all task groups
groups = octo.list_all_task_groups()
# List all tasks in a group
tasks = octo.list_all_tasks_in_group(group_id='xxxx-ssdsd-1212')
# Check if a task is currently running. This isn't provided in Standard API.
status = octo.is_task_running(task_id='abcd-1234-djfsd-dfdf')
# Export data that has not been exported yet
data = octo.get_not_exported_data(task_id='abcd-1234-djfsd-dfdf', size=100)
# Update data status (marks non-exported data as exported)
resp = octo.update_data_status(task_id='abcd-1234-djfsd-dfdf')
# get all the data for a task with task id: 'abcd-1234-djfsd-dfdf'
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf')
# get all the task data as a pandas.DataFrame for a task with task id: 'abcd-1234-djfsd-dfdf'
df = octo.get_task_data_df(task_id='abcd-1234-djfsd-dfdf')
# get an offset of data for a task with task id: 'abcd-1234-djfsd-dfdf'
# e.g. get 100 rows starting from offset 200
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf', offset=200, size=100)
# fetch task data in a loop using the generator function:
for data in octo.get_task_data_generator(task_id='abcd-1234-djfsd-dfdf', offset=200, size=100):
    print(data)
    do_something_with_data()
# clear data for a task with task id: 'abcd-1234-djfsd-dfdf'
octo.clear_task_data(task_id='abcd-1234-djfsd-dfdf')
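For readers curious how offset-based fetching in a loop works in general, here is a minimal, self-contained sketch of the pattern. It does not call Octoparse and is not how `get_task_data_generator` is implemented. `iter_pages`, `fake_fetch`, and `ROWS` are hypothetical names, and the stopping rule (stop at the first empty page) is an assumption.

```python
def iter_pages(fetch_page, offset=0, size=100):
    """Yield non-empty pages from fetch_page(offset, size) until one is empty."""
    while True:
        page = fetch_page(offset, size)
        if not page:
            return
        yield page
        # Advance by the number of rows actually received.
        offset += len(page)


# Stand-in for a remote data source: 250 fake rows held in memory.
ROWS = [{"row": i} for i in range(250)]


def fake_fetch(offset, size):
    """Return up to `size` rows starting at `offset`."""
    return ROWS[offset:offset + size]


# Starting at offset 200 with 250 rows available yields one page of 50 rows.
for page in iter_pages(fake_fetch, offset=200, size=100):
    print(len(page))  # prints 50
```

A generator keeps only one page in memory at a time, which helps when a task has collected more rows than is comfortable to load at once.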
The following are supported with the Advanced API:
# Get Tasks' status
task_list = ['abcd-1234-djfsd-dfdf', 'ab23-5677-djfsd-dfdf']
resp = octo.get_task_status(task_list)
# Get Task's parameter
resp = octo.get_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url')
# Update Task's parameter
resp = octo.update_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')
# Add new URLs/text to an existing loop
resp = octo.add_url_text_to_loop(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')
# Start running task
resp = octo.start_task(task_id='abcd-1234-djfsd-dfdf')
# Stop running task
resp = octo.stop_task(task_id='abcd-1234-djfsd-dfdf')
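The calls above can be combined into a start, wait, and fetch workflow. The sketch below is one possible way to do that, under stated assumptions: `run_and_collect` is a hypothetical helper, not part of the package; it assumes `is_task_running` returns a truthy value while the task is running; and the polling interval and limit are arbitrary defaults. It uses only the method names shown in this README (`start_task`, `is_task_running`, `get_task_data`).

```python
import time


def run_and_collect(client, task_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Start a task, wait until it stops running, then return its data."""
    client.start_task(task_id=task_id)
    for _ in range(max_polls):
        # Assumption: is_task_running returns a truthy value while the task runs.
        if not client.is_task_running(task_id=task_id):
            return client.get_task_data(task_id=task_id)
        sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} still running after {max_polls} polls")
```

With a real client this might be called as `run_and_collect(Octoparse(advanced_api=True), 'abcd-1234-djfsd-dfdf')`. The `sleep` parameter exists so the wait can be replaced with a no-op in tests.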