
scrapes search engine pages for query titles, descriptions and links


Search Engine Parser

Package to query popular search engines and scrape for result titles, links and descriptions

Python 3.6

License: MIT

Installation

    pip install search-engine-parser

Development

Clone the repository

    git clone git@github.com:bisoncorps/search-engine-parser.git

Create a virtual environment and install the requirements (mkvirtualenv is provided by virtualenvwrapper)

    mkvirtualenv search_engine_parser
    pip install -r requirements-dev.txt

Code Documentation

The code documentation is hosted on GitHub Pages.

Running the tests

    cd search_engine_parser/
    python tests/__init__.py

Usage

Code

Query results can be scraped from popular search engines, as shown in the example snippet below:

    from search_engine_parser import YahooSearch, GoogleSearch, BingSearch
    import pprint

    search_args = ('preaching to the choir', 1)
    gsearch = GoogleSearch()
    ysearch = YahooSearch()
    bsearch = BingSearch()
    gresults = gsearch.search(*search_args)
    yresults = ysearch.search(*search_args)
    bresults = bsearch.search(*search_args)
    a = {
        "Google": gresults,
        "Yahoo": yresults,
        "Bing": bresults}
    # pretty print the result from each engine
    for k, v in a.items():
        print(f"-------------{k}------------")
        pprint.pprint(v)

    # print first title from google search
    print(gresults["titles"][0])
    # print 10th link from yahoo search
    print(yresults["links"][9])
    # print 6th description from bing search
    print(bresults["descriptions"][5])
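
The snippet above indexes straight into the result lists, which fails with an IndexError when a page returns fewer results than expected. The sketch below is one way to guard against that and to pair each title with its link and description. It assumes each result behaves like a plain dict with "titles", "links" and "descriptions" lists, as the example suggests. The helpers `iter_records` and `nth` and the sample data are hypothetical and not part of search_engine_parser.

```python
from typing import Dict, Iterator, List, Optional

# Assumed shape of a search result: three parallel lists keyed by name.
Results = Dict[str, List[str]]


def iter_records(results: Results) -> Iterator[Dict[str, str]]:
    """Yield one dict per search hit, pairing up the three parallel lists."""
    # zip() stops at the shortest list, so an uneven page never raises.
    for title, link, description in zip(
        results.get("titles", []),
        results.get("links", []),
        results.get("descriptions", []),
    ):
        yield {"title": title, "link": link, "description": description}


def nth(results: Results, kind: str, rank: int) -> Optional[str]:
    """Return the item at position rank, or None if the page is too short."""
    items = results.get(kind, [])
    return items[rank] if 0 <= rank < len(items) else None


# Hand-written sample data standing in for a real search result.
sample = {
    "titles": ["First hit", "Second hit"],
    "links": ["https://example.com/1", "https://example.com/2"],
    "descriptions": ["About the first hit", "About the second hit"],
}

for record in iter_records(sample):
    print(record["title"], "->", record["link"])

# Asking for the 10th link of a two-item page gives None instead of an error.
print(nth(sample, "links", 9))
```

In real use, `sample` would be replaced by a value such as gresults from the snippet above.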

Command line

Use Python's module runner (python -m) to run the parser from the command line, e.g.

    python -m search_engine_parser.core.cli --query "Preaching to the choir" --engine bing --type descriptions

Result

'Preaching to the choir' originated in the USA in the 1970s. It is a variant of the earlier 'preaching to the converted', which dates from England in the late 1800s and has the same meaning. Origin - the full story 'Preaching to the choir' (also sometimes spelled quire) is of US origin.

The full list of arguments is shown below:

    usage: cli.py [-h] [-e ENGINE] -q QUERY [-p PAGE] [-t TYPE] [-r RANK]

    SearchEngineParser

    optional arguments:
    -h, --help            show this help message and exit
    -e ENGINE, --engine ENGINE
                            Engine to use for parsing the query e.g yahoo
                            (default: google)
    -q QUERY, --query QUERY
                            Query string to search engine for
    -p PAGE, --page PAGE  Page of the result to return details for (default: 1)
    -t TYPE, --type TYPE  Type of detail to return i.e links, descriptions or
                            titles
    -r RANK, --rank RANK  Rank of detail in list to return e.g 5 (default: 0)
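
To make the defaults in the help text concrete, here is a minimal argparse sketch that accepts the same flags. It is reconstructed from the usage output only and is not the project's actual cli.py. The function name `build_parser` and the integer types for page and rank are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags listed in the help output."""
    parser = argparse.ArgumentParser(prog="cli.py", description="SearchEngineParser")
    parser.add_argument(
        "-e", "--engine", default="google",
        help="Engine to use for parsing the query e.g yahoo (default: google)",
    )
    # The query is the only flag the usage line shows without brackets.
    parser.add_argument(
        "-q", "--query", required=True,
        help="Query string to search engine for",
    )
    parser.add_argument(
        "-p", "--page", type=int, default=1,  # int is an assumption
        help="Page of the result to return details for (default: 1)",
    )
    parser.add_argument(
        "-t", "--type",
        help="Type of detail to return i.e links, descriptions or titles",
    )
    parser.add_argument(
        "-r", "--rank", type=int, default=0,  # int is an assumption
        help="Rank of detail in list to return e.g 5 (default: 0)",
    )
    return parser


# Parse the same arguments as the command-line example in this section.
args = build_parser().parse_args(
    ["--query", "Preaching to the choir", "--engine", "bing", "--type", "descriptions"]
)
print(args.engine, args.page, args.type, args.rank)  # bing 1 descriptions 0
```

With no --rank given, the default of 0 selects the first item of the requested type, which is consistent with the single description shown under Result.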

Contribution

You are very welcome to modify this code and use it in your own projects.

Please keep a link to the original repository. If you have made a fork with substantial modifications that you feel may be useful, then please open a new issue on GitHub with a link and short description and then make a pull request.

License (MIT)

This project is released under the MIT License, which allows very broad use for both academic and commercial purposes.

