Unofficial Octoparse API client.
Octoparse
Unofficial Octoparse API client in Python, with support for the Advanced API and for use from China.
Installation:
Use pip to install:
pip install octoparse
Credentials:
Three methods are supported:
1) Environment variables
Set the following environment variables:
export OCTOPARSE_USERNAME=octoparse_user
export OCTOPARSE_PASSWORD=octoparse_passwd
2) .env file
Include the following in a .env file in the script directory:
OCTOPARSE_USERNAME=octoparse_user
OCTOPARSE_PASSWORD=octoparse_passwd
3) Manual input of username and password
Enter the username and password once at the prompt:
Enter Octoparse Username: octoparse_user
Password:
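The three sources above could be combined in a lookup order like the one sketched below. This is illustrative only and is not the package's actual implementation: the helpers `parse_dotenv` and `resolve_credentials` are hypothetical, and the precedence (environment first, then .env, then prompt) is an assumption. Only the variable names OCTOPARSE_USERNAME and OCTOPARSE_PASSWORD and the prompt texts come from the description above.

```python
import getpass
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_credentials(environ=None, dotenv_path=".env", prompt=True):
    """Hypothetical helper: return (username, password) from the first
    source that provides both values."""
    environ = os.environ if environ is None else environ
    sources = [dict(environ)]
    path = Path(dotenv_path)
    if path.is_file():
        sources.append(parse_dotenv(path.read_text()))
    for source in sources:
        user = source.get("OCTOPARSE_USERNAME")
        password = source.get("OCTOPARSE_PASSWORD")
        if user and password:
            return user, password
    if prompt:
        # Fall back to interactive input; getpass hides the typed password.
        user = input("Enter Octoparse Username: ")
        password = getpass.getpass("Password: ")
        return user, password
    raise LookupError("no Octoparse credentials found")
```

Passing `prompt=False` makes the helper raise instead of blocking on input, which may suit scheduled or unattended runs.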
Example usage:
from octoparse import Octoparse
# Initialize the API client.
# It will try to log in and ask for credentials if required.
octo = Octoparse()
# If using the Advanced API:
octo = Octoparse(advanced_api=True)
# If using from China:
octo = Octoparse(china=True)
# List all task groups
groups = octo.list_all_task_groups()
# List all tasks in a group
tasks = octo.list_all_tasks_in_group(group_id='xxxx-ssdsd-1212')
# Check if a task is currently running. This isn't provided in Standard API.
status = octo.is_task_running(task_id='abcd-1234-djfsd-dfdf')
# Export data that has not been exported yet
data = octo.get_not_exported_data(task_id='abcd-1234-djfsd-dfdf', size=100)
# Update data status (marks the data as exported)
resp = octo.update_data_status(task_id='abcd-1234-djfsd-dfdf')
# get all the data for a task with task id: 'abcd-1234-djfsd-dfdf'
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf')
# get all the task data as a pandas.DataFrame for a task with task id: 'abcd-1234-djfsd-dfdf'
df = octo.get_task_data_df(task_id='abcd-1234-djfsd-dfdf')
# get an offset of data for a task with task id: 'abcd-1234-djfsd-dfdf'
# e.g., get 100 rows starting from offset 200
data = octo.get_task_data(task_id='abcd-1234-djfsd-dfdf', offset=200, size=100)
# clear data for a task with task id: 'abcd-1234-djfsd-dfdf'
octo.clear_task_data(task_id='abcd-1234-djfsd-dfdf')
The following are supported with the Advanced API:
# Get the status of several tasks
task_list = ['abcd-1234-djfsd-dfdf', 'ab23-5677-djfsd-dfdf']
resp = octo.get_task_status(task_list)
# Get Task's parameter
resp = octo.get_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url')
# Update Task's parameter
resp = octo.update_task_param(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')
# Add new URLs/text to an existing loop
resp = octo.add_url_text_to_loop(task_id='abcd-1234-djfsd-dfdf', name='loopAction1.Url', value='http://xyz.abc')
# Start running task
resp = octo.start_task(task_id='abcd-1234-djfsd-dfdf')
# Stop running task
resp = octo.stop_task(task_id='abcd-1234-djfsd-dfdf')
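A common way to combine start_task with is_task_running is to start a task and then poll until it is no longer running. The following is a sketch under assumptions: `wait_until_stopped` is a hypothetical helper, not part of the package, and it assumes the status check returns a truthy value while the task is running.

```python
import time


def wait_until_stopped(is_running, poll_seconds=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll `is_running()` until it returns a falsy value or `timeout` seconds pass.

    Returns True if the task stopped in time, False on timeout.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    waited = 0.0
    while is_running():
        if waited >= timeout:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

With the real client this might be used as `octo.start_task(task_id=task_id)` followed by `wait_until_stopped(lambda: octo.is_task_running(task_id=task_id))`. A generous poll interval keeps the number of API calls low.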