Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js

Gerapy

Documentation

Documentation is available online at https://docs.gerapy.com/ and https://github.com/Gerapy/Docs.

Support

Gerapy is developed on Python 3.x. Python 2.x may be supported later.

Usage

Install Gerapy with pip:

pip3 install gerapy

After installation, complete the steps below to run the Gerapy server.

If Gerapy was installed successfully, the gerapy command is available. If it is not, check the installation.
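One quick way to confirm that the gerapy command is available is to look it up on PATH. The following is a minimal sketch; the helper name `command_available` is made up for illustration and is not part of Gerapy.

```python
import shutil


def command_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


# Hypothetical check before continuing with the setup steps.
if not command_available("gerapy"):
    print("gerapy not found on PATH; check the pip installation")
```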

First, initialize the workspace:

gerapy init

This creates a folder named gerapy. You can also specify the name of your workspace:

gerapy init <workspace>

Then cd into this folder and initialize the database:

cd gerapy
gerapy migrate

Next, create a superuser:

gerapy createsuperuser

Then start the server:

gerapy runserver

Then visit http://localhost:8000 to use Gerapy. The admin management backend is available at http://localhost:8000/admin.
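The non-interactive setup steps above (init and migrate) could be scripted. Here is a minimal sketch, assuming the gerapy command is on PATH; the function names `setup_steps` and `run_setup` are hypothetical. The createsuperuser step is left out because it prompts for input.

```python
import subprocess
from typing import List, Optional, Tuple

Step = Tuple[List[str], Optional[str]]  # (command, working directory)


def setup_steps(workspace: str = "gerapy") -> List[Step]:
    """Return the init and migrate commands with the directory to run them in."""
    return [
        # Create the workspace folder in the current directory.
        (["gerapy", "init", workspace], None),
        # migrate has to run from inside the workspace folder.
        (["gerapy", "migrate"], workspace),
    ]


def run_setup(workspace: str = "gerapy") -> None:
    """Run each setup step in order, stopping at the first failure."""
    for command, cwd in setup_steps(workspace):
        subprocess.run(command, cwd=cwd, check=True)
```

Calling `run_setup("crawlers")` would then correspond to `gerapy init crawlers` followed by `gerapy migrate` inside the crawlers folder.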

To make Gerapy publicly accessible, run it like this:

gerapy runserver 0.0.0.0:8000

The server then listens on all network interfaces on port 8000.
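To check that the server is actually listening, a plain TCP connection attempt is usually enough. This is a minimal sketch; the helper name `port_open` is hypothetical, and the host and port in the usage line are simply the defaults shown above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


print("Gerapy reachable:", port_open("localhost", 8000))
```

This only confirms that something accepts connections on the port; it does not verify that the process is Gerapy.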

In Gerapy, you can create a configurable project, then configure it and generate Scrapy code automatically. This module is still unstable, and we are working to refine it.

You can also drag your Scrapy project into the projects folder. After you refresh the web page, it appears in the Project Index Page marked as un-configurable, but you can still edit the project through the web page.
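Copying a project into the projects folder can also be done from a script. The sketch below assumes the projects folder sits directly inside the workspace and that a Scrapy project is recognized by its scrapy.cfg file; the function name `add_project` is hypothetical.

```python
import shutil
from pathlib import Path


def add_project(project_dir: str, workspace: str = "gerapy") -> Path:
    """Copy a Scrapy project folder into <workspace>/projects."""
    source = Path(project_dir).resolve()
    # scrapy.cfg marks the root of a standard Scrapy project.
    if not (source / "scrapy.cfg").is_file():
        raise ValueError(f"{source} does not look like a Scrapy project")
    target = Path(workspace) / "projects" / source.name
    # copytree creates missing parent folders and fails if target exists.
    shutil.copytree(source, target)
    return target
```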

For deployment, go to the Deploy Page. First build your project and add a client in the Client Index Page; then you can deploy the project with the click of a button.

After deployment, you can manage the jobs in the Monitor Page.
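Since the clients are Scrapyd servers, job state can also be inspected directly through Scrapyd's listjobs.json endpoint. The sketch below counts jobs per state. Whether the Monitor Page uses exactly this call is an assumption, and the function names and the Scrapyd address in the usage note are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict

JOB_STATES = ("pending", "running", "finished")


def count_jobs(listjobs_response: dict) -> Dict[str, int]:
    """Count the jobs per state in a Scrapyd listjobs.json response."""
    return {state: len(listjobs_response.get(state, [])) for state in JOB_STATES}


def fetch_job_counts(scrapyd_url: str, project: str) -> Dict[str, int]:
    """Query a Scrapyd server and summarize the jobs of one project."""
    query = urllib.parse.urlencode({"project": project})
    url = f"{scrapyd_url.rstrip('/')}/listjobs.json?{query}"
    with urllib.request.urlopen(url, timeout=5) as response:
        return count_jobs(json.load(response))
```

For a Scrapyd server on its default port, `fetch_job_counts("http://localhost:6800", "myproject")` would return a dictionary with the number of pending, running and finished jobs.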

Docker

Just run this command:

docker run -d -v ~/gerapy:/app/gerapy -p 8000:8000 germey/gerapy

Gerapy then runs on port 8000. You can log in with the temporary admin account (username: admin, password: admin). Please change the password afterwards for safety.

Command Usage:

docker run -d -v <workspace>:/app/gerapy -p <public_port>:<container_port> germey/gerapy

Specify the workspace to mount with -v <workspace>:/app/gerapy and the server port with -p <public_port>:<container_port>.
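The command template above can be assembled programmatically, which helps when the workspace path or ports vary between machines. This is a minimal sketch; `docker_run_args` is a hypothetical helper, and it only builds the argument list without starting a container.

```python
import os
from typing import List


def docker_run_args(
    workspace: str,
    public_port: int = 8000,
    container_port: int = 8000,
    image: str = "germey/gerapy",
) -> List[str]:
    """Build the `docker run` argument list for a Gerapy container."""
    # Expand "~" here because no shell does it when passing a list to subprocess.
    host_path = os.path.expanduser(workspace)
    return [
        "docker", "run", "-d",
        "-v", f"{host_path}:/app/gerapy",
        "-p", f"{public_port}:{container_port}",
        image,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` to start the container.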

If you run Gerapy with Docker, you can visit the Gerapy website right away, for example at http://localhost:8000; no further initialization is needed.

TodoList

  • Add Visual Configuration of Spider with Previewing Website
  • Add Scrapyd Auth Management
  • Add Gerapy Auth Management
  • Add Timed Task Scheduler
  • Add Visual Configuration of Scrapy
  • Add Intelligent Analysis of Web Page

Communication

If you have any questions or ideas, you can open Issues or Pull Requests. Your suggestions are really important to us. Thanks for your contribution.

