Python-based tool to scrape and download data from the Quora website: questions related to certain topics, answers to certain questions, and user profile data

Quora-scraper

Quora-scraper is a command-line application written in Python that scrapes Quora. It simulates a browser environment to let you scrape Quora's rich textual data. You can use one of its three scraping modules to find questions that discuss certain topics (such as Finance, Politics, Tesla or Donald-Trump), scrape the answers to certain questions, or scrape user profiles. Please use it responsibly!

Install

To use our scraper, please follow the steps below:

Install Python 3.6 or later.
Install the latest version of Google Chrome.
Download chromedriver and add it to your system PATH: https://sites.google.com/a/chromium.org/chromedriver/home
Install quora-scraper:

$ pip install quora-scraper

To update quora-scraper:

$ pip install quora-scraper --upgrade

Alternatively, you can clone the project and install it from source. Make sure you cd into the quora-scraper folder before running the command below.

$ python setup.py install
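
Before running the scraper, it can help to confirm that the prerequisites listed above are in place. The sketch below is not part of quora-scraper; `check_prerequisites` is a hypothetical helper, and it assumes the driver executable is named `chromedriver`. It only checks the Python version and whether the driver can be found on the PATH, not whether Chrome itself is installed.

```python
import shutil
import sys


def check_prerequisites(min_version=(3, 6), driver_name="chromedriver"):
    """Return a dict saying whether each prerequisite appears to be met.

    Hypothetical helper for illustration; not provided by quora-scraper.
    """
    return {
        # Compare the running interpreter's (major, minor) to the minimum.
        "python_ok": sys.version_info[:2] >= min_version,
        # shutil.which returns the executable's path, or None if not on PATH.
        "chromedriver_on_path": shutil.which(driver_name) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```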

Usage

quora-scraper has three scraping modules: questions, answers, users.

1) Scraping question URLs:

You can scrape questions related to certain topics using the questions command. This module takes a list of topic keywords as input. The output is a questions_URL file containing the links to each topic's questions.

A topic's questions can be scraped as follows:

a) Use the -l parameter followed by a list of topic keywords.

$ quora-scraper questions -l [finance,politics,Donald-Trump]

b) Use the -f parameter followed by the location of a topic keywords file (keywords must be line-separated inside the file):
```
$ quora-scraper questions -f topics_file.txt
```
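
If the keywords come from a script rather than being typed by hand, the line-separated input file can be generated programmatically. The following is a minimal sketch, not part of quora-scraper: `write_keyword_file` is a hypothetical helper, and it assumes the scraper accepts a UTF-8 file with one keyword per line and a trailing newline.

```python
from pathlib import Path


def write_keyword_file(keywords, path="topics_file.txt"):
    """Write one keyword per line, skipping blanks and duplicates.

    Hypothetical helper for illustration; returns the keywords written.
    """
    seen = set()
    cleaned = []
    for keyword in keywords:
        keyword = keyword.strip()
        # Keep the first occurrence of each non-empty keyword, in order.
        if keyword and keyword not in seen:
            seen.add(keyword)
            cleaned.append(keyword)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

Calling `write_keyword_file(["finance", "politics", "Donald-Trump"])` produces a file that can then be passed to the -f parameter.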

2) Scraping answers:

Quora answers are scraped using the answers command. This module takes a list of question URLs as input. The output is a file of scraped answers (answers.txt). Each answer consists of:

Quest-ID | AnswerDate | AnswerAuthor-ID | Quest-tags | Answer-Text

To scrape answers, use one of the following methods:

a) Use the -l parameter followed by a list of question URLs.

$ quora-scraper answers -l [https://www.quora.com/Is-milk-good,https://www.quora.com/Was-Einstein-a-fake-and-a-plagiarist]

b) Use the -f parameter followed by the location of a question URLs file:

$ quora-scraper answers -f questions_url.txt
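
Once answers.txt has been produced, it can be loaded for analysis. The sketch below is an illustration rather than part of quora-scraper. It assumes one answer per line, the five fields listed above in that order, tab-separated as stated in the Notes, and no tabs or line breaks inside the answer text; lines that do not fit this shape are skipped. The `Answer` tuple and `read_answers` function are hypothetical names.

```python
import csv
from collections import namedtuple

# Field names mirror the layout described above (names chosen for illustration).
Answer = namedtuple(
    "Answer",
    ["quest_id", "answer_date", "author_id", "quest_tags", "answer_text"],
)


def read_answers(path):
    """Read a tab-separated answers file into a list of Answer tuples."""
    answers = []
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE treats quote characters in answer text as ordinary text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != len(Answer._fields):
                continue  # skip lines that do not have exactly five fields
            answers.append(Answer(*row))
    return answers
```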

3) Scraping Quora user profiles:

You can scrape Quora user profiles using the users command. The users module takes a list of Quora user IDs as input. The output is a UserProfile file containing:

Remaining lines (User's answers): AnswerDate | QuestionID | AnswerText

User profiles can be scraped as follows:

a) Use the -l parameter followed by a list of user IDs.

$ quora-scraper users -l [Albert-Einstein-195,Jackie-Chan-8]

b) Use the -f parameter followed by the location of a user IDs file.

$ quora-scraper users -f quora_username_file.txt

Notes

a) Input files must be line-separated.

b) Fields in output files are tab-separated.

c) You can add a list/line index parameter to start scraping from that index. The command below starts scraping from the "physics" keyword (index 3, counting from 0):

$ quora-scraper questions -l [finance,politics,tech,physics,life,sports] -i 3
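
To make the index semantics concrete, the sketch below shows which items remain when starting from a given index. It assumes the index is a zero-based offset into the list or file lines, which is consistent with the example above but is not confirmed against the scraper's source; `remaining_items` is a hypothetical name.

```python
def remaining_items(items, start_index=0):
    """Return the items that would still be processed from start_index on.

    Assumes a zero-based offset, matching the "physics" example.
    """
    if not 0 <= start_index <= len(items):
        raise ValueError("start_index is outside the list")
    return items[start_index:]


topics = ["finance", "politics", "tech", "physics", "life", "sports"]
print(remaining_items(topics, 3))  # ['physics', 'life', 'sports']
```

This is useful when resuming an interrupted run: pass the index of the first keyword or line that has not yet been scraped.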

d) Quora limits the number of questions accessible on a topic page. Thus, even if a topic has a large number of questions (e.g. 100k), the number of scraped question links will not exceed 2k or 3k.

e) For more help, use:

$ quora-scraper --help

f) Quora-scraper uses XPath and bs4 methods to scrape Quora webpage elements. Since Quora's HTML structure changes constantly, the code may need modification from time to time. Please feel free to update and contribute to the source code to keep the scraper up to date.

License

This project uses the following license: MIT

Release history

1.1.3 (this version): Jul 29, 2020
1.1.2: Jul 27, 2020
1.1.0: Jul 10, 2020
1.0.8: Jun 20, 2020
1.0.6: Jun 18, 2020
1.0.5: Jun 18, 2020
1.0.4: Jun 18, 2020
1.0.3: Jun 18, 2020
1.0.2: Jun 18, 2020

Download files

Source Distribution

quora-scraper-1.1.3.tar.gz (5.2 MB)

Uploaded Jul 29, 2020 Source

Built Distribution

quora_scraper-1.1.3-py3-none-any.whl (5.2 MB)

Uploaded Jul 29, 2020 Python 3

File details

Details for the file quora-scraper-1.1.3.tar.gz.

File metadata

Download URL: quora-scraper-1.1.3.tar.gz
Upload date: Jul 29, 2020
Size: 5.2 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.6.5

File hashes

Hashes for quora-scraper-1.1.3.tar.gz
Algorithm	Hash digest
SHA256	`aa0bafed1604cbbc70b40b0ac0d9f068b3303a5f1beca5e0985e7252f36e10a0`
MD5	`8030ef1b1acd37ad654fbd6d8d9a9cbe`
BLAKE2b-256	`2fad10819ca9ca8e4b9676fa7ac15ff5ae03d1493f5983793ec6f780b05cc99a`

File details

Details for the file quora_scraper-1.1.3-py3-none-any.whl.

File metadata

Download URL: quora_scraper-1.1.3-py3-none-any.whl
Upload date: Jul 29, 2020
Size: 5.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.6.5

File hashes

Hashes for quora_scraper-1.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`539a7b20b1819b09d2a299bd965491f855878081574b1b55332de49edbd2b583`
MD5	`96ab0b678763bbb25b165bc330a2215a`
BLAKE2b-256	`8f7184aecb9e3a3cc73772f5c1a1b12085b96b6742a8ff0abfbeb6e21ca5c95d`
