Parselab helper module

Project description

parselab

This package contains classes that help to write parsers in Python.

Usage

To use parselab, create a class derived from BasicParser:

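import os

from parselab.cache import FileCache
from parselab.network import NetworkManager
from parselab.parsing import BasicParser

class MyParser(BasicParser):

    def __init__(self):
        # Cache downloaded pages on disk under the directory given by $CACHE_PATH.
        self.cache = FileCache(namespace='my-parser', path=os.environ.get('CACHE_PATH'))
        # Network manager that get_page() uses for downloads.
        self.net = NetworkManager()
        # `db` is not imported in this snippet; its import path is not shown here.
        db.connect(os.environ['PARSINGDB'])
        db.setup_project('my-project')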
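
The constructor above reads two environment variables in different ways: CACHE_PATH through os.environ.get(), which yields None when the variable is unset, and PARSINGDB through indexing, which raises KeyError when it is unset. The following is a small sketch of that difference using only the standard library. The helper name read_parser_settings and the placeholder connection string are hypothetical, and how FileCache treats a path of None is not stated here.

```python
import os


def read_parser_settings(environ=os.environ):
    """Read the two variables the same way the __init__ above does.

    Hypothetical helper for illustration; not part of parselab.
    """
    # .get() returns None when CACHE_PATH is unset, so this value is optional.
    cache_path = environ.get('CACHE_PATH')
    # Indexing raises KeyError when PARSINGDB is unset, so this value is required.
    db_target = environ['PARSINGDB']
    return cache_path, db_target


# Pass an explicit mapping instead of the real environment to try it out.
# 'example-connection-string' is a placeholder, not a documented format.
settings = read_parser_settings({'PARSINGDB': 'example-connection-string'})
```

In practice this means PARSINGDB has to be exported before MyParser() is constructed, while CACHE_PATH can be left unset without the lookup itself failing.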

After that, you can download pages using the BasicParser.get_page() method:

class MyParser(BasicParser):
    ...

    def run(self):
        page = self.get_page('https://google.com')

BasicParser uses the network manager specified in the __init__ method and saves all downloaded pages into the directory specified by your $CACHE_PATH environment variable. The next time you invoke get_page(), it returns the requested page from the cache if available.
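
The cache-first behaviour described above can be sketched roughly as follows. This is not parselab's implementation: SketchFileCache, the standalone get_page function, and the injected fetch callable are all hypothetical names, and keying cache files by a SHA-256 hash of the URL is an assumption made for the sketch.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Optional


class SketchFileCache:
    """Hypothetical on-disk cache; not parselab's FileCache."""

    def __init__(self, namespace: str, path: str):
        # One sub-directory per namespace under the cache path.
        self.root = Path(path) / namespace
        self.root.mkdir(parents=True, exist_ok=True)

    def _file_for(self, url: str) -> Path:
        # Hash the URL so that any URL maps to a safe file name.
        return self.root / hashlib.sha256(url.encode('utf-8')).hexdigest()

    def get(self, url: str) -> Optional[str]:
        target = self._file_for(url)
        return target.read_text(encoding='utf-8') if target.exists() else None

    def put(self, url: str, body: str) -> None:
        self._file_for(url).write_text(body, encoding='utf-8')


def get_page(url: str, cache: SketchFileCache, fetch: Callable[[str], str]) -> str:
    """Return the cached page if present; otherwise fetch it and store it."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    body = fetch(url)
    cache.put(url, body)
    return body


# Stand-in fetcher so the sketch runs without network access.
calls = []


def fake_fetch(url: str) -> str:
    calls.append(url)
    return '<html>stub</html>'


cache = SketchFileCache('my-parser', tempfile.mkdtemp())
first = get_page('https://google.com', cache, fake_fetch)
second = get_page('https://google.com', cache, fake_fetch)  # served from disk
```

After both calls, the stand-in fetcher has been invoked only once, because the second request is answered from the file written by the first.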

Download files

Source Distribution

parselab-0.1.8.tar.gz (8.2 kB)

Uploaded Source

File details

Details for the file parselab-0.1.8.tar.gz.

File metadata

  • Download URL: parselab-0.1.8.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.7.4

File hashes

Hashes for parselab-0.1.8.tar.gz
Algorithm     Hash digest
SHA256        8bd73adca48d99bc7ff898b06197dcb1c1658c198680c79c1bd69812602a5a57
MD5           6904709ac795beed462d2f05f5ef592b
BLAKE2b-256   1ff99f75464511315da69dac064458353e5a13f588f153b5e30b756eec79cdc9
