
Project description

Portia

Portia is a tool that allows you to visually scrape websites without any programming knowledge required. With Portia you can annotate a web page to identify the data you wish to extract, and Portia will understand based on these annotations how to scrape data from similar pages.

Running Portia

The easiest way to run Portia is with Docker, using the official Portia image:

docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia
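In the command above, -v mounts the host directory ~/portia_projects at /app/data/projects inside the container (read-write), and -p publishes container port 9001 on host port 9001. The following is a minimal sketch of a wrapper that makes those two values configurable. The function name run_portia and the DRY_RUN variable are hypothetical helpers introduced here for illustration; they are not part of Portia.

```shell
# Hypothetical helper (not part of Portia): wraps the docker run command above.
# Usage: run_portia [projects_dir] [host_port]
run_portia() {
  # Host directory for project data; defaults to the path used in the README.
  projects_dir="${1:-$HOME/portia_projects}"
  # Host port to publish; the container side stays at 9001 as in the README.
  host_port="${2:-9001}"

  # Build the argument list once so the dry run and the real run cannot drift apart.
  set -- docker run \
    -v "$projects_dir:/app/data/projects:rw" \
    -p "$host_port:9001" \
    scrapinghub/portia

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of executing it.
    echo "$*"
  else
    "$@"
  fi
}
```

Calling DRY_RUN=1 run_portia /tmp/projects 8080 prints the command that would be run, which is a quick way to check the mount path and port mapping before starting the container.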

You can also set up a local instance with Docker Compose by cloning this repository and running the following from its root directory:

docker-compose up

For more detailed instructions, and alternatives to using Docker, see the Installation docs.

Documentation

Documentation is available on Read the Docs. The source files are in the docs directory.

Download files

Source Distribution

portia_pro-1.2.5.tar.gz (3.6 kB)

Uploaded Source

Built Distribution

portia_pro-1.2.5-py3-none-any.whl (2.9 kB)

Uploaded Python 3

File details

Details for the file portia_pro-1.2.5.tar.gz.

File metadata

  • Download URL: portia_pro-1.2.5.tar.gz
  • Upload date:
  • Size: 3.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.2

File hashes

Hashes for portia_pro-1.2.5.tar.gz
Algorithm Hash digest
SHA256 1500782358d91d1ec568a26e383708d1ce6e86359beaea7f1d8360f209e06482
MD5 b5c5fcf01286fddf517a81decc288cf6
BLAKE2b-256 c69878fa411b2b541587b15abb909236f9f6482d7abed370c99b342135f7b621
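
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch, assuming the sha256sum utility from GNU coreutils is available (on macOS, shasum -a 256 produces output in the same format). The function name verify_sha256 is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: succeeds only if the file's SHA256 matches the expected digest.
# Usage: verify_sha256 <file> <expected_hex_digest>
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <filename>"; keep only the digest field.
  actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  [ "$actual" = "$expected" ]
}
```

For example, verify_sha256 portia_pro-1.2.5.tar.gz 1500782358d91d1ec568a26e383708d1ce6e86359beaea7f1d8360f209e06482 && echo "digest matches" prints the message only when the file is intact.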

File details

Details for the file portia_pro-1.2.5-py3-none-any.whl.

File metadata

  • Download URL: portia_pro-1.2.5-py3-none-any.whl
  • Upload date:
  • Size: 2.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.2

File hashes

Hashes for portia_pro-1.2.5-py3-none-any.whl
Algorithm Hash digest
SHA256 fdc10f5bc19e84c4c8d91757d04dfa9de1328dfb530ec022bc7fb59dd7cfd584
MD5 f7a37552328ec6212cc6efcf799a1de9
BLAKE2b-256 93c78ca8ad04607dd057cc3047387f99abc4a657b3b45eda4631c7e1e0926150
