scrapeTiger
scrapeTiger is a Python package for web scraping.
requires-python = ">=3.7"
Example of scraping multiple pages
from scrapeTiger import multi_page, details_page
multi_page(url="https://quotes.toscrape.com/page/", start_page=1, end_page=3, css_selector=".quote span a")
details_page(field_one_css='.author-title')
Here is how these two functions work. First, the multi_page function scrapes the link of each item from page 1 to page 2. Then the details_page function scrapes the author name from the details page of each item.
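The listing step can be pictured as building one URL per page number. The sketch below is not scrapeTiger's own code: build_page_urls is a hypothetical helper, and it assumes the page number is simply appended to url. Whether end_page itself is visited is also an assumption, so the helper exposes that choice as a flag.

```python
def build_page_urls(url, start_page, end_page, inclusive=True):
    """Return one listing URL per page number (hypothetical helper).

    Assumption: the page number is appended directly to `url`,
    for example "https://quotes.toscrape.com/page/" + "1".
    """
    # When inclusive is False, end_page itself is not visited.
    last_page = end_page if inclusive else end_page - 1
    return [f"{url}{page}" for page in range(start_page, last_page + 1)]


if __name__ == "__main__":
    for page_url in build_page_urls("https://quotes.toscrape.com/page/", 1, 3):
        print(page_url)
```

Printing the URLs before a long run is a cheap way to check that the url argument ends where the page number should begin.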
Example of scraping a single page
from scrapeTiger import single_page, details_page
single_page(url="https://quotes.toscrape.com/", css_selector=".quote span a")
details_page(field_one_css='.author-title')
How to install
pip install scrapeTiger
Dependency packages
Install selenium and webdriver-manager so that this package runs properly.
pip install selenium
pip install webdriver-manager
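Before running a scrape, it may help to confirm that both dependencies can be imported. The following is a minimal sketch using only the standard library; missing_packages is a hypothetical helper, not part of scrapeTiger. Note that the webdriver-manager distribution is imported under the module name webdriver_manager.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # Import names of the two dependencies listed above.
    missing = missing_packages(["selenium", "webdriver_manager"])
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```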
Output Result
A CSV file is generated after scraping.
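The generated file can be read back with Python's csv module. This is only a sketch: the source does not state the file name, the columns, or the encoding, so the path is a parameter, UTF-8 is assumed, and split_header and load_rows are hypothetical helpers.

```python
import csv


def split_header(lines, has_header=True):
    """Parse CSV lines and return (header, rows); header is None if absent."""
    rows = list(csv.reader(lines))
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


def load_rows(csv_path, has_header=True):
    """Read a CSV file from disk (assumes UTF-8 encoding)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return split_header(handle, has_header)
```

Calling load_rows with the path of the file that scrapeTiger wrote returns the header row and the data rows separately. Pass has_header=False if the file was written without a header.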
multi_page
multi_page(url, start_page, end_page, css_selector, wait_second=2)
- url (string, required)
- start_page (integer, required)
- end_page (integer, required)
- css_selector (string, required)
- wait_second (integer, optional): default is 2 seconds.
single_page
single_page(url, css_selector, wait_second=2)
- url (string, required)
- css_selector (string, required)
- wait_second (integer, optional): default is 2 seconds.
details_page
details_page(wait_second=2, field_one_css=None, field_one_html=False, main_image_css=None, img_attribute='src', gallary_image_css=None, gallary_image_attribute='src', header_added=False)
- wait_second (integer, optional): default is 2 seconds.
- field_one_css (string, optional)
- field_one_html (boolean, optional): default is False. Set it to True to get the HTML; otherwise you get the text value.
- You can also use field_two_css, field_three_css, and field_four_css in the same way as field_one_css.
- You can also use field_two_html, field_three_html, and field_four_html in the same way as field_one_html.
- A maximum of four fields is supported.
- main_image_css (string, optional)
- img_attribute (string, optional): the attribute read for main_image_css. Default is src.
- gallary_image_css (string, optional)
- gallary_image_attribute (string, optional): the attribute read for gallary_image_css. Default is src.
- header_added (boolean, optional): default is False. Set it to True if you do not want the CSV header to be written.
The details_page() function supports a maximum of four fields. Example:
details_page(field_one_css=None, field_one_html=False, field_two_css=None, field_two_html=False, field_three_css=None, field_three_html=False, field_four_css=None, field_four_html=False)
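When the selectors are held in a list, the field_*_css and field_*_html keyword arguments can be built programmatically. The sketch below uses only the parameter names documented above; build_field_kwargs is a hypothetical helper, and the second selector in the usage lines is purely illustrative.

```python
FIELD_NAMES = ("one", "two", "three", "four")


def build_field_kwargs(selectors, as_html=False):
    """Map up to four CSS selectors onto details_page() keyword arguments."""
    if len(selectors) > len(FIELD_NAMES):
        raise ValueError("details_page() supports a maximum of four fields")
    kwargs = {}
    for name, selector in zip(FIELD_NAMES, selectors):
        kwargs[f"field_{name}_css"] = selector
        # The same text/HTML choice is applied to every field in this sketch.
        kwargs[f"field_{name}_html"] = as_html
    return kwargs


if __name__ == "__main__":
    # ".author-born-date" is an illustrative selector, not taken from the docs.
    kwargs = build_field_kwargs([".author-title", ".author-born-date"])
    print(kwargs)
    # With scrapeTiger installed, the result could be passed on as:
    # details_page(**kwargs)
```

Raising an error on a fifth selector makes the four-field limit visible before any page is visited.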
File details
Details for the file scrapeTiger-1.0.4.tar.gz.
File metadata
- Download URL: scrapeTiger-1.0.4.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 07052678a9bcb92545d192974930f898e92030e6e4c819fb7065e1f4bdbff9de |
| MD5 | 1df6f089232eec62ef38185f09198343 |
| BLAKE2b-256 | d7366a2ac2a7d3af4cf0e2236204d7eb68d42e8fd33d21a92b169355b83cf96a |
File details
Details for the file scrapeTiger-1.0.4-py3-none-any.whl.
File metadata
- Download URL: scrapeTiger-1.0.4-py3-none-any.whl
- Upload date:
- Size: 4.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e8bb04bf0e697d6bf7dc9c858c64e7fc1b767fce29e8f5f2c91b304afd68313b |
| MD5 | 0fab500c4bd18778e67272bba8c7ef77 |
| BLAKE2b-256 | 7eee2da78e96f8d5f6c47b5eee4ce991a304c7446f28343c1f9035631ca0501c |