Build Status License: MIT

tantivy-py

Python bindings for Tantivy, the full-text search engine library written in Rust.

Installation

The bindings can be installed from PyPI using pip:

pip install tantivy

If no binary wheel is available for your operating system, the bindings will be built from source. In that case Rust needs to be installed before the build can succeed.

Note that the bindings use PyO3, which only supports Python 3.
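To confirm that the installation worked, a quick check with the standard library may help. This is a minimal sketch: the helper name tantivy_status is hypothetical, and it assumes the installed distribution is named "tantivy", as in the pip command above.

```python
from importlib import metadata
from importlib.util import find_spec


def tantivy_status():
    """Return the installed tantivy version string, or None if it is not importable."""
    if find_spec("tantivy") is None:
        return None
    try:
        return metadata.version("tantivy")
    except metadata.PackageNotFoundError:
        # Importable, but not installed as a distribution named "tantivy"
        # (for example a local development build).
        return "unknown"


status = tantivy_status()
print("tantivy is not installed" if status is None else f"tantivy {status}")
```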

Development

A development environment can be set up either in a virtual environment using nox, or with local packages using the provided Makefile.

For the nox setup install the virtual environment and build the bindings using:

python3 -m pip install nox
nox

For the Makefile based setup run:

make

Running the tests is done using:

make test

Usage

The Python bindings have a similar API to Tantivy. To create an index, a schema needs to be built first. After that, documents can be added to the index and a searcher can be created to search it.

Building an index and populating it

import tantivy

# Declaring our schema.
schema_builder = tantivy.SchemaBuilder()
schema_builder.add_text_field("title", stored=True)
schema_builder.add_text_field("body", stored=True)
schema_builder.add_integer_field("doc_id", stored=True, indexed=True)
schema = schema_builder.build()

# Creating our index (in memory)
index = tantivy.Index(schema)

To have a persistent index, use the path parameter to store the index on disk, e.g.:

import os

index = tantivy.Index(schema, path=os.path.join(os.getcwd(), 'index'))
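When using a persistent index, it can be convenient to make sure the target directory exists before handing it to tantivy.Index. The sketch below assumes that the directory has to be present beforehand (worth verifying for your version). The helper names prepare_index_dir and open_persistent_index are hypothetical and not part of tantivy-py.

```python
import os
import tempfile


def prepare_index_dir(base_dir, name="index"):
    """Create base_dir/name if needed and return its path."""
    index_dir = os.path.join(base_dir, name)
    os.makedirs(index_dir, exist_ok=True)  # no error if it already exists
    return index_dir


def open_persistent_index(schema, base_dir):
    """Open (or create) an on-disk index below base_dir."""
    import tantivy  # imported lazily so the helper above works without it

    return tantivy.Index(schema, path=prepare_index_dir(base_dir))


# Demonstrate the directory handling with a throwaway location.
with tempfile.TemporaryDirectory() as base:
    print(os.path.isdir(prepare_index_dir(base)))
```

With the schema built above, open_persistent_index(schema, os.getcwd()) would then correspond to the path example shown earlier.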

By default, tantivy offers the following tokenizers which can be used in tantivy-py:

  • default: the tokenizer used when you do not assign a specific tokenizer to a text field. It splits text on punctuation and whitespace, removes tokens longer than 40 characters, and lowercases the text.

  • raw: does not tokenize the text at all and keeps it entirely unprocessed. It can be useful for indexing UUIDs or URLs, for instance.

  • en_stem: in addition to what default does, the en_stem tokenizer also applies stemming to your tokens. Stemming consists of trimming words to remove their inflection. This tokenizer is slower than the default one, but is recommended to improve recall.

To use one of the above tokenizers, pass its name as the tokenizer_name parameter to add_text_field, e.g.:

schema_builder.add_text_field("body", stored=True, tokenizer_name='en_stem')
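To get a feeling for what the default and raw tokenizers do to a piece of text, the following pure-Python approximation may be useful. It is only an illustration of the behaviour described above, not tantivy's actual implementation. The function names are hypothetical, and the exact handling of edge cases (such as tokens of exactly 40 characters) may differ in tantivy.

```python
import re

MAX_TOKEN_LEN = 40  # limit taken from the description of the default tokenizer


def approx_default_tokenize(text):
    """Roughly mimic 'default': split, drop long tokens, lowercase."""
    # Runs of letters and digits; punctuation, underscores and whitespace split tokens.
    tokens = re.findall(r"[^\W_]+", text)
    return [t.lower() for t in tokens if len(t) <= MAX_TOKEN_LEN]


def approx_raw_tokenize(text):
    """Roughly mimic 'raw': the whole input is kept as a single token."""
    return [text]


print(approx_default_tokenize("He was an Old-Man, fishing!"))
print(approx_raw_tokenize("https://example.com/A"))
```

This also suggests why raw suits identifiers: a URL stays one token instead of being broken up at every punctuation mark.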

Adding one document:

writer = index.writer()
writer.add_document(tantivy.Document(
    doc_id=1,
    title=["The Old Man and the Sea"],
    body=["""He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish."""],
))
# ... and committing
writer.commit()
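When several documents need to be added, one possible pattern is to add them all and commit once at the end. The sketch below is written against the two writer methods used above (add_document and commit). The helper add_documents is hypothetical, and make_document stands for a document constructor such as tantivy.Document.

```python
def add_documents(writer, make_document, records):
    """Add one document per record, then commit once. Returns the number added."""
    count = 0
    for record in records:
        # Each record is a dict of field name -> value, passed as keyword arguments.
        writer.add_document(make_document(**record))
        count += 1
    writer.commit()  # a single commit for the whole batch
    return count
```

With the index from above this could be called as add_documents(index.writer(), tantivy.Document, records), where records is a list such as [{"doc_id": 2, "title": ["Another title"], "body": ["Some text"]}].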

Building and Executing Queries

First, you need to get a searcher for the index:

# Reload the index to ensure it points to the last commit.
index.reload()
searcher = index.searcher()

Then parse your query on the index to get a valid query object, and run it with the searcher:

query = index.parse_query("fish days", ["title", "body"])
(best_score, best_doc_address) = searcher.search(query, 3).hits[0]
best_doc = searcher.doc(best_doc_address)
assert best_doc["title"] == ["The Old Man and the Sea"]
print(best_doc)
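The example above takes hits[0] directly, which fails if the query matches nothing. A slightly more defensive sketch iterates over all returned hits instead. It relies only on the calls shown above (searcher.search(query, limit).hits and searcher.doc(address)); the helper name top_hits is hypothetical.

```python
def top_hits(searcher, query, limit=3):
    """Return a list of (score, document) pairs, empty if nothing matched."""
    results = []
    for score, address in searcher.search(query, limit).hits:
        # Resolve each document address to the stored document.
        results.append((score, searcher.doc(address)))
    return results
```

A caller could then write: for score, doc in top_hits(searcher, query): print(score, doc["title"]).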

Valid Query Formats

tantivy-py supports the query language used in tantivy. Some basic query formats:

  • AND and OR conjunctions.
query = index.parse_query('(Old AND Man) OR Stream', ["title", "body"])
(best_score, best_doc_address) = searcher.search(query, 3).hits[0]
best_doc = searcher.doc(best_doc_address)
  • + (includes) and - (excludes) operators.
query = index.parse_query('+Old +Man chef -fished', ["title", "body"])
(best_score, best_doc_address) = searcher.search(query, 3).hits[0]
best_doc = searcher.doc(best_doc_address)

Note: in a query like the one above, a word with no + or - acts like an OR.

  • Phrase search.
query = index.parse_query('"eighty-four days"', ["title", "body"])
(best_score, best_doc_address) = searcher.search(query, 3).hits[0]
best_doc = searcher.doc(best_doc_address)
  • Integer search.
query = index.parse_query('1', ["doc_id"])
(best_score, best_doc_address) = searcher.search(query, 3).hits[0]
best_doc = searcher.doc(best_doc_address)

Note: for integer search, the integer field must be indexed.

For more possible query formats and possible query options, see Tantivy Query Parser Docs.
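Query strings using the + and - operators and phrase quoting can also be assembled programmatically. The helper below is a hypothetical sketch that only concatenates strings in the formats listed above. It does not escape special characters, so it assumes plain words and phrases as input.

```python
def build_query(must=(), should=(), must_not=()):
    """Build a query string from required, optional and excluded terms."""

    def quote(term):
        # Multi-word terms are wrapped in double quotes so they form a phrase.
        return f'"{term}"' if " " in term else term

    parts = [f"+{quote(t)}" for t in must]       # terms that must be present
    parts += [quote(t) for t in should]          # optional terms (act like OR)
    parts += [f"-{quote(t)}" for t in must_not]  # terms that must be absent
    return " ".join(parts)


print(build_query(must=["Old", "Man"], should=["chef"], must_not=["fished"]))
# prints: +Old +Man chef -fished
```

The resulting string can be passed to index.parse_query together with the list of default fields, as in the examples above.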
