Crawler and search tools used by Sirji.
Sirji is an agentic AI framework for software development.
Built with ❤️ by True Sparrow
Sirji Tools
sirji-tools is a PyPI package that provides tools for:
- Crawling (downloading web pages to markdown files)
- Searching on Google
- Custom Logging
Installation
Setup Virtual Environment
We recommend setting up a virtual environment to isolate Python dependencies, so that project-specific packages do not conflict with system-wide installations.
python3 -m venv venv
source venv/bin/activate
Install Package
Install the package from PyPI:
pip install sirji-tools
Run the following command to install the browsers that Playwright needs:
playwright install
Usage
Environment Variables
Ensure that the following environment variables are set:
export SIRJI_PROJECT="Absolute folder path for Sirji to use as its project folder."
export SIRJI_RUN_PATH='Folder having run specific logs, etc.'
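Before calling any of the tools, it can help to confirm that both variables are actually set. The helper below is a minimal sketch and not part of sirji-tools; the name `missing_env_vars` is hypothetical, and it only checks that each variable is non-empty, not that the path exists.

```python
import os

# The two variables named in this README.
REQUIRED_VARS = ("SIRJI_PROJECT", "SIRJI_RUN_PATH")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    # Fall back to the real process environment when no mapping is passed.
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.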
Crawl URLs
The Crawl URLs tool crawls web pages, extracts their content, and stores it as markdown files for further processing by the researcher.
from sirji_tools import crawl_urls
urls = ['https://www.google.com', 'https://www.yahoo.com']
crawl_urls(urls, 'project/researcher')
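Since `crawl_urls` takes a plain list of URLs, it may be worth cleaning that list first. The sketch below is an illustration only: `clean_urls` is a hypothetical helper, not something sirji-tools provides, and whether `crawl_urls` already performs similar filtering is not stated here.

```python
from urllib.parse import urlparse


def clean_urls(urls):
    """Drop non-HTTP(S) entries and duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        parsed = urlparse(url)
        # Keep only absolute http/https URLs that have a host part.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if url in seen:
            continue
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Possible usage with the call shown above (requires sirji-tools):
# from sirji_tools import crawl_urls
# crawl_urls(clean_urls(urls), 'project/researcher')
```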
Search
The Search tool searches Google for the given search terms and returns a list of related URLs.
from sirji_tools import search_for
search_term = 'python programming'
urls = search_for(search_term)
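Because the search tool returns URLs and the crawl tool accepts them, the two can be chained. The following is a sketch under assumptions: `search_and_crawl` and its `limit` parameter are hypothetical, the search and crawl functions are passed in as arguments so the helper can be tried without network access, and the second argument of `crawl_urls` is assumed to be a destination folder, as the example above suggests.

```python
def search_and_crawl(search_term, destination, search_fn, crawl_fn, limit=5):
    """Search for `search_term`, then crawl at most `limit` of the returned URLs."""
    # Assumes search_fn returns an iterable of URL strings.
    urls = list(search_fn(search_term))[:limit]
    if urls:
        # Assumes crawl_fn takes (urls, destination), like crawl_urls above.
        crawl_fn(urls, destination)
    return urls


# Possible usage with the real tools (requires sirji-tools and Playwright):
# from sirji_tools import crawl_urls, search_for
# search_and_crawl('python programming', 'project/researcher', search_for, crawl_urls)
```

Capping the number of URLs keeps a broad search term from triggering a long crawl.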
Logger
The Logger tool writes messages to the log file to show the progress of the execution.
from sirji_tools.logger import p_logger
p_logger.info("Log line here")
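For readers unfamiliar with file-backed loggers, the sketch below shows the general idea using only Python's standard `logging` module. It is not the implementation of `p_logger`; the function name `build_file_logger`, the log format, and the temporary log location are all assumptions made for illustration.

```python
import logging
import os
import tempfile


def build_file_logger(name, log_path):
    """Return a logger that appends timestamped lines to `log_path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Write one progress line to a throwaway log file.
log_file = os.path.join(tempfile.mkdtemp(), "progress.log")
demo_logger = build_file_logger("progress_demo", log_file)
demo_logger.info("Log line here")
```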
For Contributors
- Fork and clone the repository.
- Create and activate the virtual environment as described above.
- Set the environment variables as described above.
- Install the package in editable mode by running the following command from the repository root:
pip install -e .
- Run the following command to install the browsers that Playwright needs:
playwright install
Running Tests and Coverage Analysis
Complete the steps in "For Contributors" above before running the test cases.
# Install testing dependencies
pip install pytest coverage
# Execute tests
pytest
# Measure coverage, excluding test files
coverage run --omit="tests/*" -m pytest
coverage report
License
Distributed under the MIT License. See LICENSE for more information.
File details
Details for the file sirji_tools-0.0.15.tar.gz.
File metadata
- Download URL: sirji_tools-0.0.15.tar.gz
- Upload date:
- Size: 10.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 54b3e038793f103923d5bace10e17c86195df2ca497656e0103dac3e568f9980
MD5 | e18f213f7f49dd3c2f02de17a2386530
BLAKE2b-256 | 2054aaf58464fafdfa712d97f0da650425f93ae5d6c88dce8ed08359c910eb58
File details
Details for the file sirji_tools-0.0.15-py3-none-any.whl.
File metadata
- Download URL: sirji_tools-0.0.15-py3-none-any.whl
- Upload date:
- Size: 10.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | e61b957ba1ccfbdf7b6e4a7257d915204975f10bdb49458dfee70437961c2141
MD5 | 147ff417befdac610d93ad1e2b342aef
BLAKE2b-256 | 558fa74824c8b5dfc410cbc2c1837b9a4ff04c493d1048cbe35a66a158244bb2