
Unofficial Octoparse API client.


Octoparse

Supported Python versions: 3.6, 3.7, 3.8


Unofficial Octoparse API client in Python, with support for the Advanced API and for use from China.

Installation:

Use pip to install:

pip install octoparse

Credentials:

Three methods are supported:

1) Support for environment variables

Set the following environment variables:

export OCTOPARSE_USERNAME=octoparse_user
export OCTOPARSE_PASSWORD=octoparse_passwd

2) Support for .env file

Include the following in a .env file in the script directory:

OCTOPARSE_USERNAME=octoparse_user
OCTOPARSE_PASSWORD=octoparse_passwd

3) Manual input of username & password

Enter the username and password manually, once, at the prompt:

Enter Octoparse Username: octoparse_user
Password: 
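
The three credential sources above can be pictured as a simple fallback chain. The sketch below is illustrative only and is not the library's actual implementation: the helper names `read_dotenv` and `resolve_credentials` are hypothetical, and the lookup order (environment variables first, then the .env file, then the prompt) is an assumption.

```python
import os
from getpass import getpass
from pathlib import Path


def read_dotenv(path):
    """Parse simple KEY=VALUE lines from a .env file; return {} if it is missing."""
    values = {}
    env_file = Path(path)
    if not env_file.is_file():
        return values
    for line in env_file.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_credentials(dotenv_path=".env", environ=None):
    """Return (username, password) from env vars, then .env, then a prompt.

    Hypothetical helper: the precedence shown here is assumed, not documented.
    """
    environ = os.environ if environ is None else environ
    file_values = read_dotenv(dotenv_path)
    username = environ.get("OCTOPARSE_USERNAME") or file_values.get("OCTOPARSE_USERNAME")
    password = environ.get("OCTOPARSE_PASSWORD") or file_values.get("OCTOPARSE_PASSWORD")
    if not username:
        username = input("Enter Octoparse Username: ")
    if not password:
        # getpass hides the typed password, matching the empty "Password:" prompt.
        password = getpass("Password: ")
    return username, password
```

In practice `Octoparse()` handles this itself; the sketch only shows why setting the two variables avoids the interactive prompt.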

Example usage:

from octoparse import Octoparse

# Initialize the API client.
# It tries to log in and asks for credentials if required.
octo = Octoparse()

# If using the Advanced API:
octo = Octoparse(advanced_api=True)

# If using from China:
octo = Octoparse(china=True)

# List all task groups
groups = octo.list_all_task_groups()

# List all tasks in a group
tasks = octo.list_all_tasks_in_group(group_id='xxxx-ssdsd-1212')

# Check if a task is currently running. This isn't provided in Standard API.
status = octo.is_task_running(task_id='abcd-1234-djfsd-dfdf')

# Export data that has not been exported yet
data = octo.get_not_exported_data(task_id='abcd-1234-djfsd-dfdf', size=100)

# Update data status
resp = octo.update_data_status(task_id='abcd-1234-djfsd-dfdf')

# Get all the data for a task with task id: 'abcd-1234-djfsd-dfdf'
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf')

# Get all the task data as a pandas.DataFrame for a task with task id: 'abcd-1234-djfsd-dfdf'
df = octo.get_task_data_df(task_id='abcd-1234-djfsd-dfdf')

# Get a slice of the data for a task with task id: 'abcd-1234-djfsd-dfdf'
# e.g. get 100 rows starting from offset 200
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf', offset=200, size=100)

# Clear data for a task with task id: 'abcd-1234-djfsd-dfdf'
octo.clear_task_data(task_id='abcd-1234-djfsd-dfdf')
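
Since `is_task_running` reports whether a task is currently running, a script may want to wait for a run to finish before fetching data. A minimal sketch follows, assuming the call returns a truthy value while the task runs. The helper `wait_until_finished` and its parameters are hypothetical and not part of the package.

```python
import time


def wait_until_finished(is_running, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll is_running() until it returns a falsy value or the timeout elapses.

    Returns True if the task stopped in time, False on timeout.
    Hypothetical helper: is_running is any zero-argument callable.
    """
    waited = 0.0
    while is_running():
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

With the client above, it could be called as `wait_until_finished(lambda: octo.is_task_running(task_id='abcd-1234-djfsd-dfdf'))`, followed by `octo.get_task_data(...)` once it returns True. Passing the check as a callable keeps the helper independent of the client and easy to test.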

The following are supported for the Advanced API:

# Get the status of several tasks
task_list = ['abcd-1234-djfsd-dfdf', 'ab23-5677-djfsd-dfdf']
resp = octo.get_task_status(task_list)

# Get a task's parameter
resp = octo.get_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url')

# Update a task's parameter
resp = octo.update_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')

# Add new URLs/text to an existing loop
resp = octo.add_url_text_to_loop(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')

# Start running task
resp = octo.start_task(task_id='abcd-1234-djfsd-dfdf')

# Stop running task
resp = octo.stop_task(task_id='abcd-1234-djfsd-dfdf')
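
The Advanced API calls above can be chained, for example to point a task at a new URL and then start it. The sketch below uses only the `update_task_param` and `start_task` calls shown above, with the keyword arguments as they appear there. The wrapper `run_task_with_url` is hypothetical, and the shape of the returned responses is not assumed.

```python
def run_task_with_url(client, task_id, param_name, url):
    """Set a task parameter to a new URL, then start the task.

    Hypothetical wrapper: `client` is expected to be an Octoparse instance
    created with advanced_api=True. Both raw responses are returned unchanged.
    """
    update_resp = client.update_task_param(task_id=task_id, name=param_name, value=url)
    start_resp = client.start_task(task_id=task_id)
    return update_resp, start_resp
```

For example: `run_task_with_url(octo, 'abcd-1234-djfsd-dfdf', 'loopAction1.Url', 'http://xyz.abc')`. Inspect both responses before relying on the run, since this sketch does not check them for errors.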

