
SearchURL lets you perform keyword, fuzzy, and semantic search through the text of websites using their URLs.

Project description

SearchURL




Installation

Install SearchURL with pip:

  pip install SearchURL

Documentation

1. Getting all the text from a webpage by passing no keywords:

from SearchURL.main import SearchURL

search = SearchURL(cache=True)

data = search.searchUrl(
    url="https://en.wikipedia.org/wiki/Web_scraping"
)

print(data)

output: {'success': True, 'data': 'Web scraping - Wikipedia ...'}

2. Searching with keywords:

from SearchURL.main import SearchURL

search = SearchURL(cache=True)

data = search.searchUrl(
    url="https://en.wikipedia.org/wiki/Web_scraping",
    keywords=['legal'])

print(data)

output: {'success': True, 'data': 'Legal issues Toggle Legal issues subsection Legal issues [ edit ] The legality of web scraping varies across the world ...'}

3. Fuzzy Searching:

from SearchURL.main import SearchURL

search = SearchURL(cache=True)

data = search.searchUrlFuzz(
    url="https://en.wikipedia.org/wiki/Web_scraping",
    keywords=['legal'])


print(data)

output: {'success': True, 'data': 'Legal issues [ edit ] | In the United States, website owners can use three major legal claims to prevent undesired web scraping: (1) copyright ...'}
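Fuzzy search matches keywords approximately, so "legal" also surfaces passages about "legality". SearchURL's internal matcher isn't documented here; as an illustration of the idea only, approximate string matching can be sketched with the standard library's difflib:

```python
from difflib import SequenceMatcher

def fuzzy_score(a: str, b: str) -> float:
    """Similarity ratio between two strings, from 0.0 (no match) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# An exact keyword scores 1.0; near-misses still score high:
print(fuzzy_score("legal", "legal"))     # 1.0
print(fuzzy_score("legal", "Legality"))  # roughly 0.77
```

This is why `searchUrlFuzz` can return the "legality of web scraping" passage for the keyword `'legal'` even without an exact match.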

4. Semantic Search: Yes, this package supports Semantic Search!

from SearchURL.main import SearchURL

search = SearchURL(createVector=True) # creates an in-memory vector database using chromadb

data = search.createEmbededData("https://en.wikipedia.org/wiki/Web_scraping") # loads and embeds all the data from the webpage.

if data.get('success'): # data = {'success': True, 'db': db}
    db = data.get('db') 
    results = db.query(keywords=['benefits', 'what benefits can we get from web scraping'], limit=10)
    print(results)

else:
    print(data.get('detail')) # data = {'success': False, 'detail': 'ERROR'}

Errors

If this package encounters an error while fetching or searching, it returns an object like this: {'success': False, 'detail': 'The error that occurred'}
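Because every call returns the same `{'success': ...}` shape, the check can be centralized in one place. The helper below is hypothetical (not part of the package); it only relies on the result shapes documented above:

```python
def unwrap(result: dict) -> str:
    """Return the payload of a SearchURL result dict, or raise on failure.

    Handles the documented shapes:
      {'success': True, 'data': '...'}     from searchUrl / searchUrlFuzz
      {'success': False, 'detail': '...'}  on any fetch/search error
    """
    if result.get('success'):
        return result.get('data')
    raise RuntimeError(result.get('detail', 'unknown error'))

# Usage with stubbed results (no network needed):
text = unwrap({'success': True, 'data': 'Web scraping - Wikipedia ...'})
print(text)  # Web scraping - Wikipedia ...
```

A real call would be `text = unwrap(search.searchUrl(url=...))`, with the `RuntimeError` caught wherever the failure should be handled.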



The URL used in this README points to the Wikipedia article on web scraping.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

SearchURL-1.1.4.tar.gz (5.4 kB)

Uploaded Source

Built Distribution

SearchURL-1.1.4-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file SearchURL-1.1.4.tar.gz.

File metadata

  • Download URL: SearchURL-1.1.4.tar.gz
  • Upload date:
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for SearchURL-1.1.4.tar.gz
Algorithm Hash digest
SHA256 3e909e16dd75a734e00b6e358a2a663608c24a3ae288e289c55d18a065e48b4f
MD5 af416fb8d90552fa7f6546b49d128971
BLAKE2b-256 53f594f0b2e9557d3b1a14242d4fbc39a3f6155a404c02e062b92cf72dfa863a

See more details on using hashes here.

File details

Details for the file SearchURL-1.1.4-py3-none-any.whl.

File metadata

  • Download URL: SearchURL-1.1.4-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for SearchURL-1.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 c6b8d848459dac364978a40d1af6a8476e6643a049bb9818e79a24e59ffa3f4a
MD5 7b4f3843cc5f92bfd717d5eba4cef250
BLAKE2b-256 38185f5e4c07407a241fc6fcef37ca0b6f8c35949055ed094077f75c40b61580

See more details on using hashes here.
