
Unofficial Octoparse API client.

Octoparse

Python versions: 3.6, 3.7, 3.8

Unofficial Octoparse API client in Python, with support for the Advanced API and for use from China.

Installation:

Use pip to install:

pip install octoparse

Credentials:

Three methods are supported:

1) Support for ENV variables

Include the following as environment variables:

export OCTOPARSE_USERNAME=octoparse_user
export OCTOPARSE_PASSWORD=octoparse_passwd

2) Support for .env file

Include the following in a .env file in the script directory:

OCTOPARSE_USERNAME=octoparse_user
OCTOPARSE_PASSWORD=octoparse_passwd

3) Manual input of username & password

Enter the username & password manually, once, when prompted:

Enter Octoparse Username: octoparse_user
Password: 
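
The three methods above suggest a lookup order, which can be sketched as follows. This is only an illustration, not the library's actual implementation: the names `read_dotenv` and `resolve_credentials`, the `prompt` flag, and the precedence (environment variables first, then the .env file, then the prompt) are all assumptions.

```python
import getpass
import os
from pathlib import Path


def read_dotenv(path):
    """Parse simple KEY=VALUE lines from a .env file (hypothetical helper)."""
    values = {}
    file = Path(path)
    if not file.is_file():
        return values
    for line in file.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_credentials(environ=None, dotenv_path=".env", prompt=True):
    """Return (username, password) from env vars, then .env, then a prompt.

    The precedence shown here is an assumption made for illustration.
    """
    environ = os.environ if environ is None else environ
    for source in (environ, read_dotenv(dotenv_path)):
        username = source.get("OCTOPARSE_USERNAME")
        password = source.get("OCTOPARSE_PASSWORD")
        if username and password:
            return username, password
    if not prompt:
        raise RuntimeError("No Octoparse credentials found")
    # Fall back to interactive input; getpass hides the typed password.
    username = input("Enter Octoparse Username: ")
    password = getpass.getpass("Password: ")
    return username, password
```

Passing `prompt=False` makes the sketch usable in non-interactive jobs, where a missing credential should fail fast rather than block on input.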

Example usage:

from octoparse import Octoparse

# initialize api client
# it will try to log in & ask for credentials if required
octo = Octoparse()

# if using advanced API:
octo = Octoparse(advanced_api=True)

# if using from China:
octo = Octoparse(china=True)

# List all task groups
groups = octo.list_all_task_groups()

# List all tasks in a group
tasks = octo.list_all_tasks_in_group(group_id='xxxx-ssdsd-1212')

# Check if a task is currently running. This isn't provided in the Standard API.
status = octo.is_task_running(task_id='abcd-1234-djfsd-dfdf')

# Export data that has not been exported yet
data = octo.get_not_exported_data(task_id='abcd-1234-djfsd-dfdf', size=100)

# Update data status (marks the data as exported)
resp = octo.update_data_status(task_id='abcd-1234-djfsd-dfdf')

# get all the data for a task with task id: 'abcd-1234-djfsd-dfdf'
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf')

# get all the task data as a pandas.DataFrame for a task with task id: 'abcd-1234-djfsd-dfdf'
df = octo.get_task_data_df(task_id='abcd-1234-djfsd-dfdf')

# get a slice of data for a task with task id: 'abcd-1234-djfsd-dfdf'
# e.g. get 100 rows starting from offset 200
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf', offset=200, size=100)

# fetch task data in a loop using the generator function:
for data in octo.get_task_data_generator(task_id='abcd-1234-djfsd-dfdf', offset=200, size=100):
    print(data)
    # do_something_with_data is a placeholder for your own processing function
    do_something_with_data(data)

# clear data for a task with task id: 'abcd-1234-djfsd-dfdf'
octo.clear_task_data(task_id='abcd-1234-djfsd-dfdf')
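
The offset/size paging used by the calls above can be sketched as a small generator. This is not how `get_task_data_generator` is necessarily implemented; the function `iter_pages`, the callable `fetch_page`, and the stop-on-empty-page rule are assumptions, and `fake_fetch` is a stand-in data source so the sketch runs without network access.

```python
def iter_pages(fetch_page, offset=0, size=100):
    """Yield successive pages from fetch_page until an empty page comes back.

    fetch_page is any callable taking (offset, size) and returning a list of
    rows; it stands in for a call such as octo.get_task_data(...).
    """
    while True:
        rows = fetch_page(offset, size)
        if not rows:
            break
        yield rows
        # Advance by the number of rows actually received.
        offset += len(rows)


# Stand-in data source (hypothetical): 450 rows held in memory.
FAKE_ROWS = [{"row": i} for i in range(450)]


def fake_fetch(offset, size):
    return FAKE_ROWS[offset:offset + size]


page_sizes = [len(page) for page in iter_pages(fake_fetch, offset=200, size=100)]
print(page_sizes)  # [100, 100, 50]
```

With a real client, `fetch_page` could wrap `octo.get_task_data(task_id=..., offset=offset, size=size)`, assuming that call returns a list of rows.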

The following are supported for the Advanced API:

# Get Tasks' status
task_list = ['abcd-1234-djfsd-dfdf', 'ab23-5677-djfsd-dfdf']
resp = octo.get_task_status(task_list)

# Get Task's parameter
resp = octo.get_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url')

# Update Task's parameter
resp = octo.update_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')

# Add new URLs/text to an existing loop
resp = octo.add_url_text_to_loop(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')

# Start running task
resp = octo.start_task(task_id='abcd-1234-djfsd-dfdf')

# Stop running task
resp = octo.stop_task(task_id='abcd-1234-djfsd-dfdf')
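
The calls above can be combined into a start, wait, and fetch workflow. The sketch below is one possible arrangement, not something the library provides: `run_and_collect`, its `poll_seconds` and `max_polls` parameters, and `StubClient` are hypothetical, and it assumes `is_task_running` returns a truthy value while the task is running.

```python
import time


def run_and_collect(client, task_id, poll_seconds=30, max_polls=120):
    """Start a task, poll until it stops running, then return its data.

    Assumes is_task_running returns a truthy value while the task runs.
    """
    client.start_task(task_id=task_id)
    for _ in range(max_polls):
        if not client.is_task_running(task_id=task_id):
            return client.get_task_data(task_id=task_id)
        time.sleep(poll_seconds)
    raise TimeoutError(f"Task {task_id} still running after {max_polls} polls")


class StubClient:
    """Hypothetical stand-in for Octoparse(advanced_api=True), for dry runs."""

    def __init__(self, running_polls):
        # Number of polls for which the fake task reports "running".
        self.running_polls = running_polls
        self.started = False

    def start_task(self, task_id):
        self.started = True

    def is_task_running(self, task_id):
        self.running_polls -= 1
        return self.running_polls >= 0

    def get_task_data(self, task_id):
        return [{"task": task_id, "value": 1}]


rows = run_and_collect(StubClient(running_polls=2), "abcd-1234-djfsd-dfdf", poll_seconds=0)
print(rows)
```

Passing the client in as an argument keeps the polling logic testable without credentials; in real use, `client` would be the `octo` object created earlier.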

