
A powerful pcpartpicker.com web scraper that extracts data from part URLs and search queries.


PCPartScraper

Author: Jeet Chugh

PCPartScraper is a simple yet powerful pcpartpicker.com web scraper that extracts data from part URLs and search queries.

Features:

  • Search for Parts from String Queries
  • View: name, type, sale_link, price, specifications, url, rating, reviews, queries
  • Install easily with pip
  • Lightweight: depends only on Requests and BeautifulSoup (bs4)

GitHub Link | PyPI Link | Example Code Link

Quick and Easy Installation via PIP: pip install pcpartscraper

Import Statement: from pcpartscraper.scraper import Part,Query

Dependencies: bs4, requests, Python 3

Code License: MIT

Documentation

Documentation is split into 2 sections. First is the 'Part' Class and second is the 'Query' Function.

'Part' Class:

Part takes a URL path as a string and provides methods that each return a specific piece of data about the part.

Example1 Part: '/product/jxJwrH/intel-cpu-bx80623i52400' is a WD Blue 1 TB hard drive

Example2 Part: '/product/jxJwrH/intel-cpu-bx80623i52400' is an Intel i5-2400 processor

Import:

from pcpartscraper.scraper import Part

Instantiation:

part1 = Part('/product/jxJwrH/intel-cpu-bx80623i52400') # Takes the URL path as a string (without the pcpartpicker.com domain)

part2 = Part('/product/jxJwrH/intel-cpu-bx80623i52400') # Keep different parts in separate variables

'Part' Methods:

Methods return None if they encounter an error.

Part('url').name()

returns a string containing the name of the part.

(Western Digital 1 TB 3.5" Hard Drive, etc.)


Part('url').type()

returns a string containing the type of the part

(Storage, Memory, Video Card, CPU Cooler, etc.)


Part('url').amazon_link()

returns a string containing the URL of the Amazon listing for the product, or None if no listing is available.

('https://www.amazon.com/dp/B004EBUXIA?tag=pcpapi-20&linkCode=ogi&th=1&psc=1', etc)


Part('url').price()

returns a float value for the cheapest price available for the part.

(34.99, 93.01, 45.62, etc.)


Part('url').advanced_specs()

returns a dictionary containing key/value pairs that correspond to the "specifications" sidebar for the part

Example Dictionary: {'model': 'Intel', 'Core Clock': '3.2Ghz', 'TDP': '95W', 'Socket': 'LGA1155'}
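
The values in this dictionary are strings, so comparing numbers (for example TDP across parts) needs a small conversion step. The following is a minimal sketch that assumes values shaped like those in the example dictionary; `leading_number` is a hypothetical helper and is not part of pcpartscraper.

```python
import re

def leading_number(value):
    """Return the first number found in a spec string as a float, or None."""
    match = re.search(r"\d+(?:\.\d+)?", value)
    return float(match.group()) if match else None

# Sample data copied from the example dictionary (not a live result).
specs = {'model': 'Intel', 'Core Clock': '3.2Ghz', 'TDP': '95W', 'Socket': 'LGA1155'}

tdp_watts = leading_number(specs['TDP'])          # 95.0
core_clock = leading_number(specs['Core Clock'])  # 3.2
model_number = leading_number(specs['model'])     # None, the string has no digits
```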


Part('url').url()

returns a string containing the full, clickable URL for the part.

(https://pcpartpicker.com/product/jxJwrH/intel-cpu-bx80623i52400, etc.)
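
The relationship between the path passed to Part and the full link can be sketched with the standard library. This only illustrates the expected result and is not the library's own code; `BASE_URL` and `full_url` are assumed names.

```python
from urllib.parse import urljoin

BASE_URL = 'https://pcpartpicker.com'  # assumed base, taken from the example link

def full_url(path):
    """Join a '/product/...' path onto the site base URL."""
    return urljoin(BASE_URL, path)

print(full_url('/product/jxJwrH/intel-cpu-bx80623i52400'))
# https://pcpartpicker.com/product/jxJwrH/intel-cpu-bx80623i52400
```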


Part('url').rating()

returns a float value containing the review rating score, out of 5, for the part.

(3.6, 4.7, 1.3, 5.0, etc.)


Part('url').reviews(results=1)

input: results, the number of reviews that you want to pull from the part page.

returns a list containing that many text reviews for the part. Reviews are taken from the part page and are unfiltered.

(['really fast and good looking!','runs a little hot, but runs games extremely well!','Not good, waste of money.'], etc.)
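
Because every method can return None on error, calling code usually needs to check each value before formatting it. A minimal sketch of that pattern follows; `summarize` is a hypothetical helper that works with any object exposing name(), price() and rating() as described above, and the output format is only an illustration.

```python
def summarize(part):
    """Build a one-line summary, tolerating None from any of the methods."""
    name = part.name() or 'Unknown part'
    price = part.price()
    rating = part.rating()
    price_text = f'${price:.2f}' if price is not None else 'price unavailable'
    rating_text = f'{rating:.1f}/5' if rating is not None else 'no rating'
    return f'{name}: {price_text}, {rating_text}'

# Usage with the real class (needs network access), left commented out:
# from pcpartscraper.scraper import Part
# print(summarize(Part('/product/jxJwrH/intel-cpu-bx80623i52400')))
```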


'Query' Function:

Query takes in (search_term as a string), (results as an int), (exclude_laptops as a bool)

Import:

from pcpartscraper.scraper import Query

Instantiation:

result_list = Query(search_term='ryzen 5',results=1,exclude_laptops=True)

'Query' Inputs:

Query returns a list of 'Part' objects matching the search.

Query(search_term='')

search_term holds the keywords used to find a part. This is the main search input.

(Western Digital, G.Skill, Cooler Master Hyper, 8gb RAM, etc.)


Query(results=3)

results is the number of results that you want in the returned list

The default value for results is 3, and the maximum is 20. More results take more time.

(6, 11, 3, 5, 1, 20, 13, etc.)
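
The documented default is 3 and the documented maximum is 20, but what happens outside that range is not stated. A caller who wants to stay within the documented limits could clamp the value before passing it to Query. In this sketch `clamp_results` is a hypothetical helper and the lower bound of 1 is an assumption.

```python
DEFAULT_RESULTS = 3   # documented default
MAX_RESULTS = 20      # documented maximum

def clamp_results(requested=DEFAULT_RESULTS):
    """Clamp a requested result count to 1..MAX_RESULTS (lower bound assumed)."""
    return max(1, min(int(requested), MAX_RESULTS))

# clamp_results(50) gives 20, clamp_results(0) gives 1, clamp_results() gives 3
```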


Query(exclude_laptops=True)

Since pcpartpicker.com added laptops to the site, searching for parts often returns only laptops

exclude_laptops ensures that the returned list contains no laptops.

The default value for exclude_laptops is True

(True, False)


Query('ryzen 5',3,True)

This example would return a list containing 3 'Part' objects for the top 3 search results for 'ryzen 5', excluding laptops.

A return would look like this

print(Query('ryzen',3,True)) --> [Part Object at x,Part Object at y,Part Object at z]
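
Since the return value is a plain list of 'Part' objects, it can be processed like any other list. The sketch below picks the cheapest result and skips parts whose price() returned None; `cheapest` is a hypothetical helper, not part of pcpartscraper.

```python
def cheapest(parts):
    """Return the part with the lowest known price, or None if no price is known."""
    priced = [(part.price(), part) for part in parts]  # call price() once per part
    priced = [(price, part) for price, part in priced if price is not None]
    if not priced:
        return None
    return min(priced, key=lambda pair: pair[0])[1]

# Usage with the real function (needs network access), left commented out:
# from pcpartscraper.scraper import Query
# best = cheapest(Query('ryzen 5', 3, True))
# if best is not None:
#     print(best.name(), best.price())
```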


Thank you for reading the documentation. If you need an example using all these methods, go to [link]

If you have issues, report them on the GitHub project page.

CHANGELOG

0.0.1 (7/8/20):

  • Release on PyPi
  • Fixed Text errors
  • Added example.py file

1.0.0 (7/8/20):

  • Updated Release
  • Attempted to fix Import Statement
  • Created Module Folder Architecture
  • Fixed ReadME.md

1.0.1 (7/8/20):

  • Fixed README
  • Updated Example.py

1.0.2 (7/8/20):

  • Accidentally Corrupted Changes
  • Final ReadME
  • GitHub commits

1.0.3 (7/8/20):

  • setup.py Final Changes
  • ReadMe.md Final
  • Added PyPi tags

1.0.4 (7/8/20):

  • Stable Release

1.0.5 (7/9/20):

  • Fixed Links in README.md

