A small crawler to scrape data from swranking.com and store it in the local db of the VM.
Project description
#######################################################
SWAgent Crawler
#######################################################
This is a simple web crawler designed specifically to scrape data from swranking.com continuously, in order to build a database of data useful enough to train an ML model to make RTA draft predictions in real time.
The package contains two helper classes:
- USERAGENT: creates randomized user agents to send with the REST requests used to obtain data from the website's API.
- SEEKER: the actual crawler; it finds the information and returns it as a JSON object.
The package's main routine follows a basic ETL scheme. After obtaining the data from the SEEKER object, it transforms the data into the format the database expects. It then sends the data to the VM's local db, where it is stored for further processing by other jobs.
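A minimal sketch of such an extract, transform, load flow is shown below. It rests on several assumptions that are not confirmed by this document: the record fields `player` and `rank`, the `rankings` table, and the use of SQLite as the local database are all hypothetical stand-ins.

```python
import json
import sqlite3


def extract(raw_json):
    """Parse the JSON payload (here a string) into a list of dicts."""
    return json.loads(raw_json)


def transform(records):
    """Reshape records into row tuples matching the table columns.

    The field names "player" and "rank" are hypothetical.
    """
    return [(r["player"], int(r["rank"])) for r in records]


def load(conn, rows):
    """Create the (hypothetical) table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rankings (player TEXT, rank INTEGER)"
    )
    conn.executemany(
        "INSERT INTO rankings (player, rank) VALUES (?, ?)", rows
    )
    conn.commit()


def run_etl(raw_json, conn):
    """Run the three steps in order and return the number of rows stored."""
    rows = transform(extract(raw_json))
    load(conn, rows)
    return len(rows)
```

For a quick trial, `run_etl` can be called with a connection from `sqlite3.connect(":memory:")` and a small JSON string, so nothing is written to disk.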
File details
Details for the file swagenttools-0.3.16.tar.gz.
File metadata
- Download URL: swagenttools-0.3.16.tar.gz
- Upload date:
- Size: 43.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cb0ce9ee68d5b12c570f27cb48fb5d39dafbec8bc3e938a6a30f0bb018a8ed13 |
| MD5 | 296ebd1b479f9612737a3348d1f982bb |
| BLAKE2b-256 | 0c0bf13f5d17909bcac527edfebcd82a2ddf2acee267020ec2cce281274a6b63 |
File details
Details for the file swagenttools-0.3.16-py3-none-any.whl.
File metadata
- Download URL: swagenttools-0.3.16-py3-none-any.whl
- Upload date:
- Size: 43.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a4f80495e9b1a06cd7e854f3cce39682a17222b82a85626e30388554c5487d6f |
| MD5 | 60886d61303997e0390765a6f140b163 |
| BLAKE2b-256 | 7ecede2c67b36c726f10c9aa106df1a78eeeade5adae0a290f99bb12f17da47a |