with the selected classifier
A command-line tool that creates full-text search indexes of your favourite websites on your machine and lets you search them locally
This package provides 16 stemmer algorithms (15 + Porter English stemmer) generated from Snowball algorithms.
A classifier for detecting soft 404 pages
Python implementation of the main operations in the Solr REST API
Python library for interacting with SolrCloud
Python Solr query utility
Generate fake Intersphinx inventory
Paver tasks for Sphinx Search server
FTP and Web spiders and mirroring utilities
Spidy is a simple, easy-to-use command-line web crawler.
Spyda - Python Spider Tool and Library
SPyDI: Simple Python Distributed Indexing
Python interface to Solr
Minimalistic interface to Solr.
A DSL for extracting data from a web page.
Morphological/Inflection/Lemmatization Engine for the Croatian language, POS tagger, stopwords
Web mining module for Python.
text-sentence is a text tokenizer and sentence splitter
Unofficial Python API for ThePirateBay.