as-scraper
Python library for scraping inside Airflow.
Installation
The as-scraper library scrapes with Selenium and Geckodriver (Firefox). To use it, you need an Airflow image that includes the Geckodriver dependency.
We provide the as-airflow Docker image, which has Airflow ready with the Geckodriver dependency.
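Because everything depends on Geckodriver being present in the image, a quick check from inside the container can save a failed run. The following is a minimal sketch, not part of as-scraper: the helper name `find_geckodriver` is hypothetical, and it assumes the executable is installed on the `PATH` under its usual name, `geckodriver`.

```python
import shutil
from typing import Optional


def find_geckodriver(binary_name: str = "geckodriver") -> Optional[str]:
    """Return the full path of the Geckodriver executable, or None if it is not on PATH.

    Hypothetical helper for illustration; it only looks the binary up,
    it does not start Firefox or Selenium.
    """
    return shutil.which(binary_name)
```

Calling `find_geckodriver()` inside the Airflow container should return a path when the image ships the driver, and `None` when it does not.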
To use this library, follow these steps:
1. Download the docker-compose.yaml file from the Airflow docs.
Airflow provides the docker-compose.yaml file you need for this library.
You can copy the file directly from the Airflow docs or run the following command to download it:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.3.4/docker-compose.yaml'
2. Modify the docker-compose.yaml file.
After that, open the docker-compose.yaml file and change the Airflow image it uses:
...
version: '3'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-almiavicas/as-airflow:2.2.3}
...
And that's it! You can now start using the as-scraper library.
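The two steps above can also be scripted. This is a rough sketch, not something shipped with as-scraper: the names `use_image` and `download_and_patch` are hypothetical, and it assumes the downloaded file sets its image with a `${AIRFLOW_IMAGE_NAME:-...}` default, as in the snippet shown in step 2.

```python
import re
import urllib.request
from pathlib import Path

# URL and image tag are taken from the installation steps above.
COMPOSE_URL = "https://airflow.apache.org/docs/apache-airflow/2.3.4/docker-compose.yaml"
AS_AIRFLOW_IMAGE = "almiavicas/as-airflow:2.2.3"

# Matches the default value inside `${AIRFLOW_IMAGE_NAME:-<default>}`.
IMAGE_DEFAULT = re.compile(r"(\$\{AIRFLOW_IMAGE_NAME:-)[^}]+(\})")


def use_image(compose_text: str, image: str = AS_AIRFLOW_IMAGE) -> str:
    """Return compose_text with the default Airflow image replaced by `image`."""
    # A function replacement avoids backslash/group escapes being read from `image`.
    return IMAGE_DEFAULT.sub(lambda m: m.group(1) + image + m.group(2), compose_text)


def download_and_patch(target: Path = Path("docker-compose.yaml")) -> None:
    """Steps 1 and 2 together: fetch the compose file, then swap the image."""
    with urllib.request.urlopen(COMPOSE_URL) as response:
        text = response.read().decode("utf-8")
    target.write_text(use_image(text), encoding="utf-8")
```

Running `download_and_patch()` in an empty project directory would write a `docker-compose.yaml` that already points at the as-airflow image. Text without an `AIRFLOW_IMAGE_NAME` default is returned unchanged by `use_image`, so it is worth confirming the image line afterwards.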