A package to scrape patents from 'https://patents.google.com/'
Patent Scraper
A Python package to scrape patents from 'https://patents.google.com/'. The package is made up of a single Python class, google_scraper(). This scraper can be used to retrieve the parsed HTML of either a single patent page or a list of patents.
Main Use Cases
There are two primary ways to use this package:
- Scrape a single patent

    # ~ Import packages ~ #
    from patent_scraper import google_scraper
    import json

    # ~ Initialize scraper class ~ #
    scraper = google_scraper()

    # ~ Scrape patents individually ~ #
    # request_single_patent returns whether the scrape was
    # successful and the parsed html (using bs4)
    err_1, soup_1 = scraper.request_single_patent('US2668287A')
    err_2, soup_2 = scraper.request_single_patent('US266827A')

    # ~ Parse results of scrape ~ #
    patent_1_parsed = scraper.process_patent_html(soup_1)
    patent_2_parsed = scraper.process_patent_html(soup_2)

- Scrape a list of patents

    # ~ Import packages ~ #
    from patent_scraper import google_scraper
    import json

    # ~ Initialize scraper class ~ #
    scraper = google_scraper()  # <- Initialize class

    # ~ Add patents to list ~ #
    scraper.add_patents('2668287A')
    scraper.add_patents('266827A')

    # ~ Scrape all patents ~ #
    scraper.scrape_all_patents()

    # ~ Get results of scrape ~ #
    patent_1_parsed = scraper.parsed_patents['US2668287A']
    patent_2_parsed = scraper.parsed_patents['US266827A']

    # ~ Print inventors of patent US2668287A ~ #
    for inventor in json.loads(patent_1_parsed['inventor_name']):
        print('Patent inventor : {0}'.format(inventor['inventor_name']))

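The final loop in the list example suggests that each parsed patent is a dictionary whose `inventor_name` field holds a JSON-encoded list of objects, each carrying its own `inventor_name` key. Below is a minimal sketch of a helper that decodes that field. The helper name `inventor_names` and the sample record are hypothetical, and the field layout is assumed from the example above rather than confirmed against the package.

```python
import json


def inventor_names(parsed_patent):
    """Return the inventor names stored in one parsed-patent dictionary.

    Assumption: 'inventor_name' holds a JSON string encoding a list of
    objects that each carry an 'inventor_name' key, as in the example loop.
    """
    raw = parsed_patent.get('inventor_name')
    if not raw:
        # Field missing or empty: report no inventors instead of failing
        return []
    return [entry['inventor_name'] for entry in json.loads(raw)]


# Hypothetical record standing in for an entry of scraper.parsed_patents
sample = {
    'inventor_name': json.dumps([
        {'inventor_name': 'Jane Doe'},
        {'inventor_name': 'John Roe'},
    ])
}

for name in inventor_names(sample):
    print('Patent inventor : {0}'.format(name))
```

Returning an empty list for a missing field keeps a loop over many patents from stopping on one incomplete record.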
Example Files
I have provided two separate example scripts showing how to use this package:
- Scrape a patent
- Scrape many patents using multiprocessing module
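The example scripts themselves are not reproduced here. As a rough illustration of the second idea, the sketch below fans a list of patent numbers out over a worker pool. It uses `multiprocessing.pool.ThreadPool`, the thread-backed pool from the same standard-library module, because page requests are network-bound and threads avoid pickling constraints; this is a substitution and not necessarily what the example script does. `scrape_many` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a function that would call `request_single_patent` and `process_patent_html`.

```python
from multiprocessing.pool import ThreadPool


def scrape_many(patent_numbers, fetch_one, workers=4):
    """Run fetch_one over patent_numbers in a pool and collect results.

    fetch_one is any callable that takes a patent number and returns its
    parsed result. Failures are recorded per patent instead of aborting.
    """
    def safe_fetch(number):
        try:
            return number, fetch_one(number), None
        except Exception as exc:  # keep going if one patent fails
            return number, None, exc

    results, errors = {}, {}
    with ThreadPool(processes=workers) as pool:
        # imap_unordered yields each result as soon as its worker finishes
        for number, parsed, exc in pool.imap_unordered(safe_fetch, patent_numbers):
            if exc is None:
                results[number] = parsed
            else:
                errors[number] = exc
    return results, errors


def fake_fetch(number):
    """Hypothetical placeholder for a real scrape; returns a dummy record."""
    if not number.startswith('US'):
        raise ValueError('unexpected patent number: ' + number)
    return {'patent': number}


results, errors = scrape_many(['US2668287A', 'US266827A', '123'], fake_fetch)
print(sorted(results))
print(sorted(errors))
```

Collecting errors per patent means one failed request does not discard the results already gathered. Whether a single scraper instance can be shared safely between workers is not stated in the package description, so creating one instance per call inside the fetch function is the cautious choice.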
Download files
Filename | Size | File type | Python version |
---|---|---|---|
google_patent_scraper-1.0.5-py3-none-any.whl | 5.6 kB | Wheel | py3 |
google_patent_scraper-1.0.5.tar.gz | 4.7 kB | Source | None |

Hashes for google_patent_scraper-1.0.5-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 5f6189674c9416add7154438c4535b21b9fba2d03b98558fc3cf7a852f5f55a5 |
MD5 | ff3c511374840a29e7fbd0ae3650ece5 |
BLAKE2-256 | 5242c831a84680f08964f56d35e9f6ff214791587d5b8fe0337d3fb655526b11 |

Hashes for google_patent_scraper-1.0.5.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | ddc8dabe04d05cd6e144e86d242857648e2c023607d0b029bae329ea110abffd |
MD5 | 1fdf57bff03f2671ebf0b333962b0499 |
BLAKE2-256 | b743f0af085129718d607eb73b2ac9fb54498d924493b80b4a985c37d72efe7c |