A simple scraper

Project description

Peviitor Scraper

Description

peviitor_pyscraper is a Python scraping library built on Beautiful Soup and Requests. It extracts data from web pages and lets you save it in an easily usable format such as CSV or JSON. You can select specific HTML elements on a page and extract the information you need, such as text, links, and images.

Features of peviitor_pyscraper:

Uses the popular Python libraries Beautiful Soup and Requests to facilitate web scraping.
Extracts the required data from a web page by selecting specific HTML elements.
Provides a variety of storage options for the scraped data, including JSON.
Is easy to use and to integrate into existing Python projects.
Renders pages with dynamically generated elements (requires Node.js).

peviitor_pyscraper is an excellent choice for Python developers seeking a powerful and flexible web scraping library. With peviitor_pyscraper, you can automate the process of extracting data from web pages, saving time and effort.
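
The package's own storage helpers are not shown in this description, so the following is only a minimal sketch of saving scraped records to JSON and CSV with Python's standard library. The `records` list, its field names, and the `save_json` and `save_csv` helpers are hypothetical and are not part of peviitor_pyscraper.

```python
import csv
import json
from pathlib import Path


def save_json(records, path):
    """Write a list of dicts to a UTF-8 JSON file."""
    text = json.dumps(records, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def save_csv(records, path):
    """Write a list of dicts to a CSV file; the first record's keys form the header.

    Nothing is written when the list is empty.
    """
    if not records:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


# Hypothetical records, e.g. built from elements found on a scraped page.
records = [
    {"title": "Python Developer", "link": "https://www.example.ro/jobs/1"},
    {"title": "QA Engineer", "link": "https://www.example.ro/jobs/2"},
]
```

Calling `save_json(records, "jobs.json")` or `save_csv(records, "jobs.csv")` would then write the records to disk. Keeping the records as plain dicts means the same list can be passed to either helper.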

Installation

Python 3.6 or higher is required. Install the package with pip:

    pip install peviitor-pyscraper

Node.js is required for rendering pages with dynamically generated elements. Install the companion package with npm:

    npm i peviitor_jsscraper

Usage Examples

Downloading the content from a specific URL:

from scraper.Scraper import Scraper
scraper = Scraper()
html = scraper.get_from_url('https://www.example.ro')
print(html.prettify())

This code creates a Scraper object and downloads the HTML from https://www.example.ro with the get_from_url() method. The method returns a BeautifulSoup object, which can then be searched for specific elements within the page.

To extract all "a" tags that contain an "href" attribute starting with "https://" from the downloaded HTML code, you can use the following code:

import re  # needed for re.compile() below

from scraper.Scraper import Scraper
scraper = Scraper()
html = scraper.get_from_url('https://www.example.ro')
# Keep only <a> tags whose href starts with "https://"
links = html.find_all('a', href=re.compile('^https://'))
for link in links:
    print(link.get('href'))
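
To see what the `^https://` pattern selects, the sketch below applies the same regular expression to a plain list of hypothetical href values, with no scraper or network access involved. As far as the Beautiful Soup documentation describes it, a compiled pattern passed as an attribute filter is checked with `search()`, so it is the `^` anchor that restricts matches to the start of the value.

```python
import re

https_only = re.compile(r"^https://")

# Hypothetical href values such as might appear on a page.
hrefs = [
    "https://www.example.ro/about",
    "http://www.example.ro/insecure",
    "/relative/path",
    "mailto:contact@example.ro",
    "https://api.example.ro/v1",
]

# search() scans the whole string, but "^" only matches at position 0.
matching = [href for href in hrefs if https_only.search(href)]
print(matching)
```

Only the two values beginning with `https://` are kept; plain `http://` links, relative paths, and `mailto:` links are skipped.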

To extract the first "h1" tag from the page:

from scraper.Scraper import Scraper
scraper = Scraper()
html = scraper.get_from_url('https://www.example.ro')
h1 = html.find('h1')
# find() returns None when the page has no matching tag
if h1 is not None:
    print(h1.text)

Downloading JSON content from a specific URL:

from scraper.Scraper import Scraper
scraper = Scraper()
# Named "data" rather than "json" to avoid shadowing the standard json module
data = scraper.get_from_url('https://api.example.ro', type='JSON')
print(data)

This code creates a Scraper object and downloads the JSON content from https://api.example.ro with the get_from_url() method. With type='JSON', the method returns a JSON object instead of a BeautifulSoup object, so you can read values from it directly.
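
Assuming the returned JSON object is an ordinary Python structure of dicts and lists (as `json.loads` would produce), it can be navigated with normal indexing. The payload and its field names below are invented for illustration; a real API will have its own structure.

```python
import json

# Hypothetical API payload standing in for the downloaded content.
raw = (
    '{"total": 2, "jobs": ['
    '{"title": "Python Developer", "city": "Cluj"}, '
    '{"title": "QA Engineer"}]}'
)
data = json.loads(raw)

# .get() with a default avoids a KeyError when a key is missing.
jobs = data.get("jobs", [])
titles = [job["title"] for job in jobs]
cities = [job.get("city", "unknown") for job in jobs]

print(titles)
print(cities)
```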

To make a POST request to a specific URL:

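from scraper.Scraper import Scraper
scraper = Scraper()
# Payload sent in the body of the POST request
data = {'key1': 'value1', 'key2': 'value2'}
response = scraper.post('https://api.example.ro', data=data)
# Named "result" rather than "json" to avoid shadowing the standard json module
result = response.json()
print(result)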
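
It can help to know what a dict payload looks like on the wire. Assuming post() forwards the data argument to Requests unchanged (this description does not confirm that), Requests would send a dict passed as data= as a form-encoded body. The sketch below uses only the standard library to show that encoding next to the JSON encoding of the same dict.

```python
import json
from urllib.parse import urlencode

data = {'key1': 'value1', 'key2': 'value2'}

# Form encoding (application/x-www-form-urlencoded)
form_body = urlencode(data)
# JSON encoding, for APIs that expect a JSON body instead
json_body = json.dumps(data)

print(form_body)  # key1=value1&key2=value2
print(json_body)  # {"key1": "value1", "key2": "value2"}
```

Which of the two an endpoint expects depends on the API being called.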

peviitor_pyscraper can also render pages with dynamically generated elements. This requires Node.js and the peviitor_jsscraper package, which you can install by running npm i peviitor_jsscraper. Once both are installed, use the render_page() method:
```
from scraper.Scraper import Scraper
scraper = Scraper()
html = scraper.render_page('https://www.example.ro')
print(html.prettify())
```
The object returned by render_page() provides all BeautifulSoup methods and attributes.
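
Because rendering depends on Node.js, it may be worth checking for it before calling render_page(). The helper below is a hypothetical convenience, not part of peviitor_pyscraper, and it assumes the Node.js executable is named `node` and is expected on PATH.

```python
import shutil


def node_available(executable="node"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if node_available():
    print("Node.js found; rendering dynamic pages should be possible.")
else:
    print("Node.js not found; install it before calling render_page().")
```

This only confirms that the executable exists; it does not check whether the peviitor_jsscraper package itself is installed.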

Contributing

If you want to contribute to the development of the scraper, there are several ways you can do so. First, you can help by contributing to the source code by adding new features or fixing existing issues. Second, you can contribute to improving the documentation or translating it into other languages. Additionally, if you want to help but are unsure how to get started, you can check our list of open issues and ask us how you can assist. For more information, please refer to the "Contribute" section in our documentation.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

If you have any questions or suggestions, please contact us:

Email: contact@laurentiumarian.ro
Website: https://laurentiumarian.ro
GitHub: https://github.com/lalalaurentiu

Acknowledgements

Release history

0.0.7: May 7, 2026 (this version)
0.0.6: Feb 4, 2026
0.0.5: Jul 24, 2023
0.0.4: Jul 23, 2023
0.0.3: Jul 23, 2023

Download files

Source Distribution

peviitor_pyscraper-0.0.7.tar.gz (5.1 kB)

Uploaded May 7, 2026 Source

Built Distribution

peviitor_pyscraper-0.0.7-py3-none-any.whl (5.5 kB)

Uploaded May 7, 2026 Python 3

File details

Details for the file peviitor_pyscraper-0.0.7.tar.gz.

File metadata

Download URL: peviitor_pyscraper-0.0.7.tar.gz
Upload date: May 7, 2026
Size: 5.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.7

File hashes

Hashes for peviitor_pyscraper-0.0.7.tar.gz
Algorithm	Hash digest
SHA256	`e1e54c1a2844570dc491d56fba0422705c39d21326ec7c57eea654267830e007`
MD5	`8b2c64a6b304a28683837afff4e04604`
BLAKE2b-256	`c9f5a899a9a7ba78ae816e2f1b8b72f8ff503ed8a37d28140e5eee6c041cceef`

File details

Details for the file peviitor_pyscraper-0.0.7-py3-none-any.whl.

File metadata

Download URL: peviitor_pyscraper-0.0.7-py3-none-any.whl
Upload date: May 7, 2026
Size: 5.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.7

File hashes

Hashes for peviitor_pyscraper-0.0.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`77b7d1d5b2cca839aeb74e2e774f2f682715b169540a584b03eb99cd84a8a0c5`
MD5	`1f3ac9b23fb3981e3c4f83c03ca03260`
BLAKE2b-256	`50797b6b3a63772796b995b91f84f728f9542d1715ef5d5c63eff1a499b18cb8`
