Crawler and search tools used by Sirji.
Project description
sirji-tools
sirji-tools is a Python package that provides the crawl, search, and logging tools used by Sirji.
Installation
Install sirji-tools with pip:
pip install sirji-tools
Usage
Crawl URLs
The Crawl URLs tool crawls the given web pages, extracts their information, and stores it for further processing by the researcher.
from sirji_tools import crawl_urls
urls = ['https://www.google.com', 'https://www.yahoo.com']
crawl_urls(urls, 'workspace/researcher')
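Since the crawl tool takes a plain list of URLs, it can help to tidy that list first. The following is a minimal sketch using only the standard library; `clean_urls` is a hypothetical helper written for illustration and is not part of sirji-tools.

```python
from urllib.parse import urlparse


def clean_urls(urls):
    """Return http(s) URLs that have a host, in original order, without duplicates.

    Hypothetical helper for illustration; not part of sirji-tools.
    """
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        parts = urlparse(url)
        # Keep only absolute web URLs and skip anything already collected.
        if parts.scheme in ("http", "https") and parts.netloc and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


urls = clean_urls([
    "https://www.google.com",
    "https://www.google.com",  # duplicate, dropped
    "ftp://example.com/file",  # not a web page, dropped
    "https://www.yahoo.com",
])
# With sirji-tools installed, the cleaned list can then be passed on:
# crawl_urls(urls, 'workspace/researcher')
```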
Search
The Search tool searches the web for the given search term and returns a list of URLs related to it.
from sirji_tools import search_for
search_term = 'python programming'
urls = search_for(search_term)
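Because the search tool returns a list of URLs and the crawl tool accepts one, the two can be chained. The sketch below shows one possible way to do that. The `research` function, its `max_urls` parameter, and the `fake_search` and `fake_crawl` stand-ins are all hypothetical and exist only so the example runs offline. It assumes the real tools behave as shown in the usage snippets above.

```python
def research(search_term, search, crawl, output_dir, max_urls=5):
    """Search for a term, then crawl the first few results into output_dir.

    `search` and `crawl` are passed in as callables so the real tools
    (or offline stand-ins) can be supplied by the caller.
    """
    urls = list(search(search_term))[:max_urls]
    if urls:
        crawl(urls, output_dir)
    return urls


# Offline stand-ins (hypothetical) that mimic the documented call shapes.
def fake_search(term):
    slug = term.replace(" ", "-")
    return [f"https://example.com/{slug}/{i}" for i in range(10)]


crawled = []


def fake_crawl(urls, output_dir):
    # Record the call instead of fetching anything.
    crawled.append((list(urls), output_dir))


found = research("python programming", fake_search, fake_crawl,
                 "workspace/researcher", max_urls=3)
```

With sirji-tools installed, the same function could presumably be called as `research('python programming', search_for, crawl_urls, 'workspace/researcher')`.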
Logger
The Logger tool writes messages to a log file to show the progress of the execution.
from sirji_tools.logger import p_logger
p_logger.info("Log line here")
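How `p_logger` is configured internally is not described here. As a rough illustration of the general idea, the sketch below builds a comparable progress logger with Python's standard `logging` module. The `make_file_logger` helper, the logger name, and the log file location are assumptions made for this example, not sirji-tools behavior.

```python
import logging
import os
import tempfile


def make_file_logger(name, log_path):
    """Build a logger that appends timestamped lines to log_path.

    Hypothetical helper for illustration; not the sirji-tools implementation.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these lines out of the root logger's handlers
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Assumed location: a temporary directory, so the sketch leaves no files behind
# in the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "progress.log")
progress = make_file_logger("progress_sketch", log_path)
progress.info("Crawling 2 URLs")
progress.info("Crawl finished")

# Read the file back to confirm both progress lines were written.
for handler in progress.handlers:
    handler.flush()
with open(log_path, encoding="utf-8") as fh:
    log_lines = fh.read().splitlines()
```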
License
sirji-tools is made available under the MIT License. See the included LICENSE file for more details.
Hashes for sirji_tools-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 875aa061ad837d8114a799ebe2d87d4a777e71843e83509ccbddc439520627f0
MD5 | 75423df6db79462ae38432cbe1f52963
BLAKE2b-256 | 8cf6857d7e4c4f59a45545f563b3c8e57113d1b04102684a8a092757ad7147d9