Bibliographic capture system for non-scraping data sources
Google Scholar Report
Description
Google Scholar Report is a tool for collecting data from Google Scholar profiles and storing it with metadata for each scientific paper. The tool has three modes of use (generic, authenticated, and admin), which differ in the amount and quality of the metadata collected. The default output format is xlsx.
Usage from the Python interpreter
Package installation
$ pip install GoogleScholarReport
For the first option of use (generic), use:
>>> from GoogleScholarReport import collector
>>> collector.gsr('url_to_google_scholar_profile', ouput='json')
Example
>>> from GoogleScholarReport import collector
>>> collector.gsr('https://scholar.google.com/citations?user=1sKULCoAAAAJ&hl=en', ouput='json')
For the second option of use (authenticated user):
>>> from GoogleScholarReport import collector
>>> collector.gsr('url_to_google_scholar_profile', email='user_email_google_scholar', password='pass_user_gs', ouput='json')
Finally, for admin mode, use:
>>> from GoogleScholarReport import collector
>>> collector.gsr('url_to_google_scholar_profile', email='user_email_google_scholar', password='pass_user_gs', ouput='csv', admin=True)  # ouput: e.g. 'csv' or 'json'
Usage from command-line
From the command line, the tool offers the same three modes of use (generic, authenticated, and admin), which differ in the amount and quality of the metadata collected.
For the first option of use (generic), use:
collector "url_for_the_google_scholar_profile"
Example:
collector "https://scholar.google.com/citations?user=1sKULCoAAAAJ&hl=en"
The above command returns one xlsx report file in the current working directory with the following metadata:
'title', 'author', 'journal', 'volume', 'number', 'pages', 'year', 'cite_id', 'cites', 'TitleU'.
If you want the output in csv or json format, add the --output flag followed by the desired format, for instance:
collector "url_for_the_google_scholar_profile" --output csv
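Once a csv report exists, it can be processed with the standard library. The following is a minimal sketch, assuming the csv file has a header row that uses the field names listed above; the sample rows are invented placeholders, not real Google Scholar data, and `cites_per_year` is a hypothetical helper, not part of GoogleScholarReport.

```python
import csv
import io

# Invented placeholder rows standing in for a report produced with `--output csv`.
# Assumption: the file has a header row using the field names listed above.
SAMPLE_REPORT = """title,author,journal,volume,number,pages,year,cite_id,cites,TitleU
Example paper A,"Doe, J.",Journal X,1,2,10-20,2019,111,5,placeholder
Example paper B,"Doe, J.",Journal Y,3,4,30-40,2021,222,7,placeholder
"""


def cites_per_year(csv_file):
    """Sum the 'cites' column for each value of the 'year' column."""
    totals = {}
    for row in csv.DictReader(csv_file):
        year = row["year"]
        # Treat an empty citation count as zero.
        totals[year] = totals.get(year, 0) + int(row["cites"] or 0)
    return totals


totals = cites_per_year(io.StringIO(SAMPLE_REPORT))
print(totals)
```

To read a real report, pass an open file instead of the in-memory sample, for example `open("report.csv", newline="", encoding="utf-8")`, where the file name is whatever the tool wrote to the working directory.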
For the second option of use (authenticated user):
collector "url_for_the_google_scholar_profile" --email <email> --password <password>
This returns one xlsx report file in the current working directory with the following metadata:
'cite_id', 'cites', 'publisher', 'year', 'pages', 'number', 'volume', 'journal', 'author', 'title', 'ENTRYTYPE', 'ID', 'school', 'booktitle', 'organization', 'note', 'month', 'institution'.
Finally, for admin mode, issue:
collector "url_for_the_google_scholar_profile" --email <email> --password <password> --admin
By default, this returns an xlsx file with the same metadata as the second option plus one additional field, 'bibtex'.
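The 'bibtex' field makes it possible to assemble a .bib file from an admin-mode report. The sketch below rests on an assumption the text does not confirm: that a json report is a list of records, each carrying its entry under the key 'bibtex'. The sample records are invented, and `bibtex_entries` is a hypothetical helper.

```python
import json

# Invented sample; the real layout of the json report may differ.
SAMPLE_JSON = '''[
  {"title": "Example paper A", "bibtex": "@article{a2019, title={Example paper A}, year={2019}}"},
  {"title": "Example paper B", "bibtex": "@article{b2021, title={Example paper B}, year={2021}}"},
  {"title": "Example paper C", "bibtex": ""}
]'''


def bibtex_entries(records):
    """Join the non-empty 'bibtex' fields, separated by blank lines."""
    return "\n\n".join(
        record["bibtex"].strip() for record in records if record.get("bibtex")
    )


bib_text = bibtex_entries(json.loads(SAMPLE_JSON))
print(bib_text)
```

Writing `bib_text` to a file with a `.bib` extension would then give a bibliography that reference managers can import, provided the assumed record layout holds.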
In general, this command-line tool has the following form:
collector "url_for_the_google_scholar_profile" --email <user_email> --password <password> --output <format> --admin
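When the tool is driven from a script, the general form above can be assembled as an argument list. This is only a sketch: `build_collector_command` is a hypothetical helper, the flags are the ones shown above, and the rule that --email and --password must be given together is an assumption drawn from the examples rather than documented behavior.

```python
import shlex


def build_collector_command(url, email=None, password=None, output=None, admin=False):
    """Return the argument list for one `collector` invocation."""
    # Assumption: credentials only make sense as a pair.
    if (email is None) != (password is None):
        raise ValueError("email and password must be given together")
    cmd = ["collector", url]
    if email is not None:
        cmd += ["--email", email, "--password", password]
    if output is not None:
        cmd += ["--output", output]
    if admin:
        cmd.append("--admin")
    return cmd


cmd = build_collector_command(
    "https://scholar.google.com/citations?user=1sKULCoAAAAJ&hl=en",
    output="csv",
)
# shlex.join quotes the URL so the printed line is safe to paste into a shell.
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with the `&` in the profile URL.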
Download files
Hashes for GoogleScholarReport-0.1.5.tar.gz
Algorithm | Hash digest
---|---
SHA256 | f80e466d3f4a4a6191d3f336de44b7a8d2e1c1ace291055a4cd3e466de1cc7ff
MD5 | faa3a0c412cb4bcee5d8b6664adf32a7
BLAKE2b-256 | 21b6c407cd93a18e1ba90f0280f23776e20dcd0830e11b47cc3b26827ff868ab
Hashes for GoogleScholarReport-0.1.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 49903536bc9d7239c9c4225dff8ec180aea639dd91832e5446e7b9e3de601258
MD5 | 3197608f1c7ad2b546635a2468c38cbe
BLAKE2b-256 | 8f99e36a06df5b9a518e8e7c8153b3229b5bb77823f5a91cf3a052cfd8bef406
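A downloaded archive can be checked against the digests listed above. The following is a minimal sketch using Python's hashlib; the expected value is the SHA256 digest listed for GoogleScholarReport-0.1.5.tar.gz, while the helper names and the local file path are assumptions made for illustration.

```python
import hashlib

# SHA256 digest listed above for GoogleScholarReport-0.1.5.tar.gz.
EXPECTED_SHA256 = "f80e466d3f4a4a6191d3f336de44b7a8d2e1c1ace291055a4cd3e466de1cc7ff"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_published_hash("GoogleScholarReport-0.1.5.tar.gz")` should return True for an intact copy of the source distribution, assuming the file sits in the current directory under that name.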