Scraping code for use with sciop-coordinated scrapes.

Project description

sciop-scraping

A (yet-to-be-named) tool that lets the scraping of very large datasets be split across multiple volunteers, with the results then reassembled as dataset parts on sciop.

NB: This is currently a work in progress, and it depends on planned features in sciop that are not yet stabilised. If you're interested in contributing, experience of or interest in web scraping, Python CLI tools and/or REST APIs would be very helpful.

We absolutely want this to be as easy to use as possible, so as soon as we can we'll be adding detailed documentation and putting out a call for wider testing. Watch this space!

In the meantime, please subscribe to the Safeguarding Research & Data forum for pointers to datasets that need saving, along with help and advice on collecting them, preparing them for upload and creating torrents.

Download files

Source Distribution

sciop_scraping-0.1.5.tar.gz (17.3 kB)

Uploaded Source

Built Distribution

sciop_scraping-0.1.5-py3-none-any.whl (20.9 kB)

Uploaded Python 3

File details

Details for the file sciop_scraping-0.1.5.tar.gz.

File metadata

  • Download URL: sciop_scraping-0.1.5.tar.gz
  • Size: 17.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.25.2 CPython/3.13.5 Linux/6.12.12+bpo-amd64

File hashes

Hashes for sciop_scraping-0.1.5.tar.gz
Algorithm    Hash digest
SHA256       15d4ef1773c0f23d0fecf96c5c6d47f259b05a457d893a8cf59d396618a92977
MD5          f16255ed61892a4d03e309103076c8a9
BLAKE2b-256  76c0b005830f6da1faa3e2686dae441b58faf49872011de936aeeaa9800a42c5

File details

Details for the file sciop_scraping-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: sciop_scraping-0.1.5-py3-none-any.whl
  • Size: 20.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.25.2 CPython/3.13.5 Linux/6.12.12+bpo-amd64

File hashes

Hashes for sciop_scraping-0.1.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       3a1e4b6e127310239b4c3812ae90459e195dd9d31a9995a5fb7f3ced9403ef0a
MD5          421571443dd0a648c04ea20509d7ce9d
BLAKE2b-256  12837fe508fbb0a33cafdf7cde60459a3f48aea795d053c2b1932ebd4d5df6d6
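
The SHA256 digests listed for the two release files can be checked locally after download. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify` are illustrative and are not part of sciop-scraping, and it assumes the downloaded files keep their original names.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings on this page.
EXPECTED_SHA256 = {
    "sciop_scraping-0.1.5.tar.gz":
        "15d4ef1773c0f23d0fecf96c5c6d47f259b05a457d893a8cf59d396618a92977",
    "sciop_scraping-0.1.5-py3-none-any.whl":
        "3a1e4b6e127310239b4c3812ae90459e195dd9d31a9995a5fb7f3ced9403ef0a",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the one listed for its name.

    Hypothetical helper: raises KeyError for file names not listed above.
    """
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no known digest for {path.name}")
    return sha256_of(path) == expected
```

Calling `verify("sciop_scraping-0.1.5.tar.gz")` in the download directory would return `True` only if the file's contents match the listed digest. SHA256 is used here rather than MD5 because MD5 is no longer considered collision-resistant.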
