A web page crawler which returns (title, og:image, og:description).

NOIZZE Crawler

A web page crawler, published on PyPI, that returns a page's title (from Open Graph or the document head), image (from Open Graph or meta tags), and description (from Open Graph or meta tags).
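The fallback order described above (Open Graph first, then plain head/meta values) can be illustrated with a minimal sketch. This is not the package's actual implementation: it uses only the standard library's `html.parser` so it runs without extra installs, and the names `MetaPreviewParser` and `extract_preview`, as well as the `name="image"` meta fallback, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class MetaPreviewParser(HTMLParser):
    """Collects Open Graph tags, named meta tags, and the <title> text."""

    def __init__(self):
        super().__init__()
        self.og = {}          # og:* properties, keyed without the "og:" prefix
        self.meta = {}        # <meta name="..."> values, keyed by lowercased name
        self.head_title = ""  # text inside <title>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            content = attrs.get("content")
            if not content:
                return
            prop = attrs.get("property") or ""
            name = attrs.get("name") or ""
            if prop.startswith("og:"):
                # Keep the first occurrence of each Open Graph property.
                self.og.setdefault(prop[3:], content)
            elif name:
                self.meta.setdefault(name.lower(), content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.head_title += data


def extract_preview(html):
    """Return (title, description, image_url), preferring Open Graph values.

    Hypothetical helper: the meta name used for the image fallback is an
    assumption, not something the package documents.
    """
    parser = MetaPreviewParser()
    parser.feed(html)
    title = parser.og.get("title") or parser.head_title.strip() or None
    desc = parser.og.get("description") or parser.meta.get("description")
    image = parser.og.get("image") or parser.meta.get("image")
    return title, desc, image
```

Any field that has neither an Open Graph value nor a fallback comes back as `None`, so callers can tell "missing" apart from an empty string.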

Dependency

  • BeautifulSoup4

Installation

Run the following to install:

pip install noizze-crawler

Usage

import noizze_crawler as nc
import sys


if __name__ == '__main__':
    url = 'https://dvdprime.com/g2/bbs/board.php?bo_table=comm&wr_id=20525678'

    try:
        # crawler() returns the page metadata together with the raw HTML
        title, desc, image_url, html = nc.crawler(url)
    except nc.HostNotFound:
        # The host name could not be resolved
        print("Host Not Found")
        sys.exit(1)
    except nc.HTTPError as e:
        # The server answered with an HTTP error
        print("HTTP {}".format(e))
        sys.exit(1)

    print(title, desc, image_url)  # html is also available

ChangeLog

  • v11: Fixed bugs #3 #8
  • v10: Fixed bugs
  • v9: YouTube crawler with Google API #4
  • v8: Changed PyPI dependency - bs4
  • v7: Code passes PEP8 checks
  • v6: HostNotFound, HTTPError exceptions
