Dogsheep search index

dogsheep-beta

Installation

Install this tool like so:

$ pip install dogsheep-beta

Usage

Run the indexer using the dogsheep-beta command-line tool:

$ dogsheep-beta index dogsheep.db config.yml

The config.yml file contains details of the databases and tables that should be indexed:

twitter.db:
    tweets:
        sql: |-
            select
                tweets.id as key,
                'Tweet by @' || users.screen_name as title,
                tweets.created_at as timestamp,
                tweets.full_text as search_1
            from tweets join users on tweets.user = users.id
    users:
        sql: |-
            select
                id as key,
                name || ' @' || screen_name as title,
                created_at as timestamp,
                description as search_1
            from users

This will create a search_index table in the dogsheep.db database populated by data from those SQL queries.
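
To see the shape of the rows such a query produces, it can be run directly against a small stand-in database. The following is a minimal sketch using Python's built-in sqlite3 module; the in-memory tables and sample values are hypothetical and only include the columns the tweets query touches. It does not invoke dogsheep-beta itself.

```python
import sqlite3

# Hypothetical miniature of twitter.db: only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (id integer primary key, screen_name text, name text);
    create table tweets (
        id integer primary key,
        user integer references users(id),
        created_at text,
        full_text text
    );
    insert into users values (1, 'example', 'Example User');
    insert into tweets values (10, 1, '2020-09-02T21:00:21', 'Hello from Dogsheep');
""")

# The tweets query from config.yml, unchanged.
TWEETS_SQL = """
select
    tweets.id as key,
    'Tweet by @' || users.screen_name as title,
    tweets.created_at as timestamp,
    tweets.full_text as search_1
from tweets join users on tweets.user = users.id
"""

cursor = conn.execute(TWEETS_SQL)
columns = [d[0] for d in cursor.description]
rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
print(rows)
```

Each row carries the aliased column names (key, title, timestamp, search_1), which is what the indexer reads from the query.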

By default the search index that this tool creates will be configured for Porter stemming. This means that searches for words like run will match documents containing runs or running.

If you don't want to use Porter stemming, use the --tokenize none option:

$ dogsheep-beta index dogsheep.db config.yml --tokenize none

You can pass other SQLite tokenize arguments here; see the SQLite FTS tokenizers documentation.
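
The effect of Porter stemming can be illustrated with plain SQLite, independent of this tool. The sketch below assumes a Python build whose sqlite3 module includes FTS5; which FTS version dogsheep-beta itself creates is not stated here, so the table definitions are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two full-text tables over the same text: one stemmed, one not.
conn.execute("create virtual table stemmed using fts5(body, tokenize='porter')")
conn.execute("create virtual table plain using fts5(body)")

docs = ["She runs every morning", "Running shoes for sale", "A quiet walk"]
for table in ("stemmed", "plain"):
    conn.executemany(
        f"insert into {table} (body) values (?)", [(d,) for d in docs]
    )


def search(table, term):
    """Return matching document bodies in insertion order."""
    sql = f"select body from {table} where {table} match ? order by rowid"
    return [row[0] for row in conn.execute(sql, (term,))]


stemmed_hits = search("stemmed", "run")
plain_hits = search("plain", "run")
print(stemmed_hits)
print(plain_hits)
```

With stemming, the term run matches the documents containing runs and Running; without it, only the exact token run would match, so the second search finds nothing.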

Columns

Each query can return the following columns:

  • key - a unique (within that table) primary key
  • title - the title for the item
  • timestamp - an ISO8601 timestamp, e.g. 2020-09-02T21:00:21
  • search_1 - a larger chunk of text to be included in the search index
  • category - an integer category ID, see below
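
A typo in a column alias is easy to miss, so it can help to check a query's output columns before indexing. The helper below is a hypothetical sketch, not part of dogsheep-beta: it compares a query's column names against the list above. The allowed set mirrors only the columns documented in this section.

```python
import sqlite3

# Column names taken from the list above (an assumption that this list is complete).
ALLOWED_COLUMNS = {"key", "title", "timestamp", "search_1", "category"}


def unexpected_columns(conn, sql):
    """Return column names produced by sql that are not in ALLOWED_COLUMNS."""
    # Wrapping the query with "limit 0" exposes column names without fetching rows.
    cursor = conn.execute(f"select * from ({sql}) limit 0")
    return [d[0] for d in cursor.description if d[0] not in ALLOWED_COLUMNS]


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table users (id integer, name text, created_at text, description text)"
)

good_sql = (
    "select id as key, name as title, created_at as timestamp, "
    "description as search_1 from users"
)
bad_sql = "select id as key, name as titel from users"

good_result = unexpected_columns(conn, good_sql)
bad_result = unexpected_columns(conn, bad_sql)
print(good_result, bad_result)
```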

Categories

Indexed items can be assigned a category. Categories are integers that correspond to records in the categories table, which defaults to containing the following:

id  name
1   created
2   saved

created is intended for items that have been created by the Dogsheep instance owner. saved is intended for items that they have saved, liked or favourited.
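
One way to assign a category is to have each query return a constant integer in its category column. The sketch below illustrates the idea with plain sqlite3; the my_tweets and favorited_tweets tables are hypothetical, and the join is only there to show how the integer maps back to a category name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Default categories as listed above.
    create table categories (id integer primary key, name text);
    insert into categories values (1, 'created'), (2, 'saved');

    -- Hypothetical source tables, for illustration only.
    create table my_tweets (id integer primary key, full_text text);
    create table favorited_tweets (id integer primary key, full_text text);
    insert into my_tweets values (1, 'A tweet I wrote');
    insert into favorited_tweets values (2, 'A tweet I liked');
""")

# Each inner query tags its rows with a constant integer category.
sql = """
select items.key, items.search_1, categories.name as category_name
from (
    select id as key, full_text as search_1, 1 as category from my_tweets
    union all
    select id as key, full_text as search_1, 2 as category from favorited_tweets
) as items
join categories on categories.id = items.category
order by items.key
"""
rows = conn.execute(sql).fetchall()
print(rows)
```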

Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

cd dogsheep-beta
python3 -m venv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
