A CLI tool for Grepsr Developers

Installation

$ pip install grepsr-cli

Usage

Passing parameters to the amazon_com service:

gcli crawler test -s amazon_com -p '{"urls":["https://amazon.com/VVUH4HJ","https://amazon.com/FV4434"]}'

If the JSON is complex, use a file instead:

# contents of /tmp/amazon_params.json
{"urls": ["https://amazon.com/VV%20UH4HJ"], "strip": ["'", "\"", "\\"]}

gcli crawler test -s amazon_com --params-file '/tmp/amazon_params.json'
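
When the parameters contain quotes or backslashes, generating the file from a script avoids hand-escaping mistakes. The following is a minimal sketch, not part of gcli: the helper names `write_params_file` and `build_test_command` are hypothetical, and the command is only printed, never executed. The `-s` and `--params-file` flags are the ones shown above.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_params_file(params: dict, directory: str) -> Path:
    """Serialize crawler parameters to a JSON file and return its path."""
    path = Path(directory) / "amazon_params.json"
    path.write_text(json.dumps(params), encoding="utf-8")
    return path


def build_test_command(service: str, params_file: Path) -> list[str]:
    """Build the argument list for `gcli crawler test` (not executed here)."""
    return ["gcli", "crawler", "test", "-s", service,
            "--params-file", str(params_file)]


# Same parameters as the example file above; json.dumps handles the escaping.
params = {"urls": ["https://amazon.com/VV%20UH4HJ"], "strip": ["'", "\"", "\\"]}
params_file = write_params_file(params, tempfile.mkdtemp())
command = build_test_command("amazon_com", params_file)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the argument list to `subprocess.run` instead of printing it would avoid shell quoting altogether.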

Hacks Used

If the JSON parameter contains a space, it might break parameter parsing. If the JSON parameter contains a dash (-) and any character after it is a space, it will break parameter parsing.

Cause: $@ is not double-quoted in run_service.php:5:49, so the shell splits the argument on whitespace.

This is fixed hackily by replacing each space with its Unicode escape sequence \u0020. This works because $@ does not split on \u0020, and the JSON parser decodes the sequence back into a space.
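
The idea behind the workaround can be sketched in a few lines. This is an illustration under assumptions, not gcli's actual code: `escape_spaces` is a hypothetical helper, and compact separators are used so that the only spaces left in the serialized JSON are the ones inside strings.

```python
import json


def escape_spaces(params: dict) -> str:
    """Serialize params so the result contains no literal space characters."""
    # Compact separators: no spaces after ',' or ':' between tokens, so any
    # remaining space must be inside a JSON string.
    compact = json.dumps(params, separators=(",", ":"))
    # Inside a JSON string, \u0020 is a valid escape that decodes to a space.
    return compact.replace(" ", "\\u0020")


params = {"title": "wireless mouse - black edition"}
escaped = escape_spaces(params)

print(escaped)
# A whitespace split (roughly what an unquoted $@ does) keeps it in one piece.
print(len(escaped.split()))
# Decoding restores the original spaces.
print(json.loads(escaped) == params)
```

The escaped string survives whitespace splitting as a single word, and parsing it yields the original parameters.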

Inject a Custom Command

Say, for example, you want to inject a PHP function so that it can be called from inside your service_code when testing locally. Note: all of these files must be created inside ~/.grepsr/tmp. Creating them elsewhere will not work.

  1. Create a file called inject.php inside ~/.grepsr/tmp/
  2. Implement your function inside ~/.grepsr/tmp/inject.php
function addRowLocal($arr) {
    ...
    ...
}
  3. Create a file called inject.sh inside ~/.grepsr/tmp/
  4. Inside inject.sh, add:
alias php='php -d auto_prepend_file=/tmp/inject.php'

Note: the file location is /tmp/inject.php instead of ~/.grepsr/tmp/inject.php. This is because the local path ~/.grepsr/tmp is mapped to /tmp in the Docker container, and inject.sh runs inside Docker rather than on the local filesystem.

  5. Add an entry in ~/.grepsr/config.yml like so:

    php:
        ...
        sdk_image: ...
        pre_entry_run_file: inject.sh      # relative and limited to the tmp/ dir
  6. Now you can use addRowLocal() in any of your files.
public function main($params) {
    ...
    $arr = $this->dataSet->getEmptyRow();
    addRowLocal($arr); // won't throw error
    ...
}
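
The file-creation steps and the path mapping described in the note can be scripted. The sketch below is only an illustration: `container_path` and `write_inject_files` are hypothetical helpers, the PHP function body is a placeholder, and a temporary directory stands in for ~/.grepsr/tmp so that running it does not touch a real configuration.

```python
import tempfile
from pathlib import Path, PurePosixPath

# Per the note above: the local ~/.grepsr/tmp directory appears as /tmp in Docker.
CONTAINER_TMP = PurePosixPath("/tmp")


def container_path(local_file: Path, local_tmp: Path) -> PurePosixPath:
    """Translate a file under the local tmp dir to its path inside the container.

    Raises ValueError for files outside local_tmp, which are not mapped.
    """
    return CONTAINER_TMP / local_file.relative_to(local_tmp).as_posix()


def write_inject_files(local_tmp: Path) -> tuple[Path, Path]:
    """Create inject.php and inject.sh inside local_tmp and return both paths."""
    php_file = local_tmp / "inject.php"
    # A prepended PHP file needs the opening tag; the body is a placeholder.
    php_file.write_text(
        "<?php\nfunction addRowLocal($arr) {\n    // your implementation here\n}\n",
        encoding="utf-8",
    )
    sh_file = local_tmp / "inject.sh"
    # The alias must reference the container path, not the local one.
    sh_file.write_text(
        f"alias php='php -d auto_prepend_file={container_path(php_file, local_tmp)}'\n",
        encoding="utf-8",
    )
    return php_file, sh_file


# Stand-in for Path.home() / ".grepsr" / "tmp".
local_tmp = Path(tempfile.mkdtemp())
php_file, sh_file = write_inject_files(local_tmp)
print(sh_file.read_text(encoding="utf-8"), end="")
```

The `pre_entry_run_file` entry in config.yml still has to be added by hand, as described in step 5.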

Development

Be sure to uninstall gcli first with pip uninstall grepsr-cli, then install from source:

git clone git@bitbucket.org:zznixt07/gcli.git grepsrcli
cd grepsrcli
pip install -e .

Features Added

  • Stash untracked files to avoid failures when applying the stash.
  • Drop the stash after a successful push. Previously, all stashes were always kept.
  • Run a custom shell file before running your crawler. This makes it possible, for example, to always inject a PHP function into all your crawlers.
  • Automatically add Dependencies: ... that your crawler class extends (support for dependencies that are used elsewhere but not extended by crawler classes is upcoming).
