
SearchURL lets you perform keyword, fuzzy, and semantic search through the text of websites using their URLs.

Project description

SearchURL




Installation

Install SearchURL with pip:

  pip install SearchURL

Documentation

1. Getting all the text from a webpage by passing no keywords:

from SearchURL.main import SearchURL

search = SearchURL(cache=True)

data = search.searchUrl(
    url="https://en.wikipedia.org/wiki/Web_scraping"
)

print(data)

output: {'success': True, 'data': 'Web scraping - Wikipedia ...'}

2. Searching with keywords:

from SearchURL.main import SearchURL

search = SearchURL(cache=True)

data = search.searchUrl(
    url="https://en.wikipedia.org/wiki/Web_scraping",
    keywords=['legal'])

print(data)

output: {'success': True, 'data': 'Legal issues Toggle Legal issues subsection Legal issues [ edit ] The legality of web scraping varies across the world ...'}

3. Fuzzy Searching:

from SearchURL.main import SearchURL

search = SearchURL(cache=True)

data = search.searchUrlFuzz(
    url="https://en.wikipedia.org/wiki/Web_scraping",
    keywords=['legal'])


print(data)

output: {'success': True, 'data': 'Legal issues [ edit ] | In the United States, website owners can use three major legal claims to prevent undesired web scraping: (1) copyright ...'}
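Fuzzy search matches keywords approximately, so "legal" also surfaces passages about "legality". SearchURL's internal matcher isn't documented here; as an illustration of the idea only, approximate string matching can be sketched with the standard library's difflib:

```python
from difflib import SequenceMatcher

def fuzzy_score(a: str, b: str) -> float:
    """Similarity ratio between two strings, from 0.0 (no match) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# An exact keyword scores 1.0; near-misses still score high:
print(fuzzy_score("legal", "legal"))     # 1.0
print(fuzzy_score("legal", "Legality"))  # roughly 0.77
```

This is why `searchUrlFuzz` can return the "legality of web scraping" passage for the keyword `'legal'` even without an exact match.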

4. Semantic Search: Yes, this package supports Semantic Search!

from SearchURL.main import SearchURL

search = SearchURL(createVector=True) # creates an in-memory vector database using chromadb

data = search.createEmbededData("https://en.wikipedia.org/wiki/Web_scraping") # loads and embeds all the data from the webpage.

if data.get('success'): # data = {'success': True, 'db': db}
    db = data.get('db') 
    results = db.query(keywords=['benefits', 'what benefits can we get from web scraping'], limit=10)
    print(results)

else:
    print(data.get('detail')) # data = {'success': False, 'detail': 'ERROR'}

Errors

If this package encounters an error while fetching or searching, it returns an object like this: {'success': False, 'detail': 'The error that occurred'}
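Because every call returns the same `{'success': ...}` shape, the check can be centralized in one place. The helper below is hypothetical (not part of the package); it only relies on the result shapes documented above:

```python
def unwrap(result: dict) -> str:
    """Return the payload of a SearchURL result dict, or raise on failure.

    Handles the documented shapes:
      {'success': True, 'data': '...'}     from searchUrl / searchUrlFuzz
      {'success': False, 'detail': '...'}  on any fetch/search error
    """
    if result.get('success'):
        return result.get('data')
    raise RuntimeError(result.get('detail', 'unknown error'))

# Usage with stubbed results (no network needed):
text = unwrap({'success': True, 'data': 'Web scraping - Wikipedia ...'})
print(text)  # Web scraping - Wikipedia ...
```

A real call would be `text = unwrap(search.searchUrl(url=...))`, with the `RuntimeError` caught wherever the failure should be handled.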



The URL used in this README points to the Wikipedia article on web scraping.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

SearchURL-1.1.4.tar.gz (5.4 kB)

Uploaded Source

Built Distribution

SearchURL-1.1.4-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file SearchURL-1.1.4.tar.gz.

File metadata

  • Download URL: SearchURL-1.1.4.tar.gz
  • Upload date:
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for SearchURL-1.1.4.tar.gz
Algorithm Hash digest
SHA256 3e909e16dd75a734e00b6e358a2a663608c24a3ae288e289c55d18a065e48b4f
MD5 af416fb8d90552fa7f6546b49d128971
BLAKE2b-256 53f594f0b2e9557d3b1a14242d4fbc39a3f6155a404c02e062b92cf72dfa863a

See more details on using hashes here.

File details

Details for the file SearchURL-1.1.4-py3-none-any.whl.

File metadata

  • Download URL: SearchURL-1.1.4-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for SearchURL-1.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 c6b8d848459dac364978a40d1af6a8476e6643a049bb9818e79a24e59ffa3f4a
MD5 7b4f3843cc5f92bfd717d5eba4cef250
BLAKE2b-256 38185f5e4c07407a241fc6fcef37ca0b6f8c35949055ed094077f75c40b61580

See more details on using hashes here.
