Client for the Solr search service

solrpy

solrpy is a Python client for Solr, an enterprise search server built on top of Lucene. solrpy allows you to add documents to a Solr instance, and then to perform queries and gather search results from Solr using Python.

  • Supports Solr 1.2 through 10.x
  • Automatic Solr version detection with runtime feature gating
  • Python 3.10+ required

Installation

pip install solrpy

Or with Poetry:

poetry add solrpy

Overview

import solr

# create a connection to a solr server
s = solr.Solr('http://localhost:8983/solr/mycore')

# the server version is auto-detected
print(s.server_version)  # e.g. (9, 4, 1)

# check if the server is reachable
print(s.ping())  # True

# add a document to the index
doc = {
    "id": 1,
    "title": "Lucene in Action",
    "author": ["Erik Hatcher", "Otis Gospodnetić"],
}
s.add(doc, commit=True)

# do a search
response = s.select('title:lucene')
for hit in response.results:
    print(hit['title'])

Response format

Since v1.0.4, solrpy uses JSON (wt=json) by default, matching Solr 7.0+ behavior.

For legacy XML mode:

s = solr.Solr('http://localhost:8983/solr/mycore', response_format='xml')

The Response object API is identical regardless of format.

More powerful queries

Optional parameters for querying, faceting, highlighting, and MoreLikeThis can be passed as keyword arguments to the select method. Convert dot notation (e.g. facet.field) to underscore notation (e.g. facet_field) so that the names are valid Python keyword arguments.

response = s.select('title:lucene', facet='true', facet_field='subject')

If the parameter takes multiple values, pass them in as a list:

response = s.select('title:lucene', facet='true', facet_field=['subject', 'publisher'])
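
To make the naming rule concrete, here is a minimal sketch of the underscore-to-dot conversion. The helper to_solr_params is hypothetical and is not part of solrpy. It naively replaces every underscore, so it only illustrates the idea: keyword arguments become dotted Solr parameters, and a list value is repeated once per item.

```python
from urllib.parse import urlencode


def to_solr_params(q, **kwargs):
    """Hypothetical helper: map Python keyword arguments to Solr parameters."""
    params = [("q", q)]
    for name, value in kwargs.items():
        solr_name = name.replace("_", ".")  # facet_field -> facet.field
        values = value if isinstance(value, (list, tuple)) else [value]
        # A multi-valued parameter is sent once per value.
        params.extend((solr_name, v) for v in values)
    return params


params = to_solr_params("title:lucene", facet="true",
                        facet_field=["subject", "publisher"])
query_string = urlencode(params)
print(query_string)
```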

Version detection

solrpy automatically detects the connected Solr version and gates features accordingly. If a feature requires a newer Solr version than what is connected, a SolrVersionError is raised with a clear message.

import solr

s = solr.Solr('http://localhost:8983/solr/mycore')
print(s.server_version)  # (6, 6, 6)
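
The gating idea can be sketched without a running server. The code below is an illustration only: the SolrVersionError class and the requires_version decorator defined here are stand-ins that borrow names from the changelog, and FakeConnection is hypothetical. solrpy's real implementation may differ.

```python
import functools


class SolrVersionError(Exception):
    """Stand-in for the exception named above; solrpy's own class may differ."""


def requires_version(*minimum):
    """Illustrative decorator: reject calls when the server is too old."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Version tuples compare lexicographically: (3, 6, 2) < (4, 0).
            if self.server_version < minimum:
                need = ".".join(map(str, minimum))
                have = ".".join(map(str, self.server_version))
                raise SolrVersionError(
                    f"{method.__name__} requires Solr {need}+, connected to {have}"
                )
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class FakeConnection:
    """Hypothetical connection object that only carries a version tuple."""

    def __init__(self, server_version):
        self.server_version = server_version

    @requires_version(4, 0)
    def soft_commit(self):
        return "soft commit sent"


print(FakeConnection((9, 4, 1)).soft_commit())
try:
    FakeConnection((3, 6, 2)).soft_commit()
except SolrVersionError as exc:
    print(exc)
```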

Tests

Tests require a running Solr instance. Using Docker:

docker run -d --name solr-dev -p 8983:8983 solr:6.6 solr-precreate core0
poetry run pytest tests/

Changelog

1.10.1

  • Field builder: Field('price', alias='p'), Field.func('sum', 'price', 'tax'), Field.transformer('explain')
  • Sort builder: Sort('price', 'desc'), Sort.func('geodist()', 'asc')
  • Facet builder: Facet.field('category'), Facet.range('price', 0, 100, 10), Facet.query(), Facet.pivot()
  • Fully backward compatible — raw strings still work

1.10.0

  • SolrCloud: SolrCloud(zk, collection) via ZooKeeper, or SolrCloud.from_urls(urls, collection) for HTTP-only setups
  • Leader-aware writes, automatic failover, collection aliases
  • SolrZooKeeper class for ZooKeeper node discovery
  • kazoo optional dependency (pip install solrpy[cloud])
  • Docker Compose for local SolrCloud testing

1.9.2

  • Full compatibility with Solr 6 through 10: wt=xml on Solr 7+ (wt=standard changed in 7.0)
  • Tested against Solr 6.6, 7.7, 8.11, 9.7, and 10.0, with 0 failures on each
  • GitHub Actions CI matrix for 5 Solr versions
  • KNN live tests version-gated (skip on < 9.0, efSearchScaleFactor skip on < 10.0)
  • Test isolation: Paginator no longer deletes all documents

1.9.1

  • KNN API overhaul: search(), similarity(), hybrid(), rerank() methods
  • Full {!knn} parameters: early_termination, seed_query, pre_filter, etc.
  • {!vectorSimilarity} threshold search (Solr 9.6+)
  • Hybrid (lexical OR vector) and re-ranking patterns

1.9.0

  • KNN / Dense Vector Search: KNN(conn) for {!knn} queries (Solr 9.0+)
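
For reference, a {!knn} query is a local-params prefix followed by the query vector. The sketch below only builds that query string following Solr's documented syntax. The knn_query function and the field name are hypothetical and do not show how the KNN class works internally.

```python
def knn_query(field, vector, top_k=10):
    """Hypothetical helper: format a Solr {!knn} query string."""
    vec = ", ".join(repr(float(x)) for x in vector)
    return f"{{!knn f={field} topK={top_k}}}[{vec}]"


q = knn_query("vector", [1, 2.5, 3], top_k=10)
print(q)
```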

1.8.1

  • HTTP transport abstraction: SolrTransport decouples companion classes from internal _get/_post
  • SchemaAPI, Suggest, Extract now use SolrTransport — prepares for httpx in 2.0.0

1.8.0

  • Bearer token auth: Solr(url, auth_token='...')
  • Custom auth callable: Solr(url, auth=my_fn) for OAuth2 dynamic refresh
  • Priority: auth callable > auth_token > http_user/http_pass
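
The priority order can be pictured as a simple fall-through. This is a sketch under assumptions: resolve_auth_header is a hypothetical function, and it assumes the auth callable returns a dict of headers, which the changelog does not specify.

```python
import base64


def resolve_auth_header(auth=None, auth_token=None, http_user=None, http_pass=None):
    """Hypothetical resolver following the priority listed above."""
    if auth is not None:
        # Assumption: the callable returns a ready-to-use header dict.
        return dict(auth())
    if auth_token is not None:
        return {"Authorization": f"Bearer {auth_token}"}
    if http_user is not None and http_pass is not None:
        raw = f"{http_user}:{http_pass}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    return {}


# The callable wins even when a static token is also supplied.
headers = resolve_auth_header(auth=lambda: {"Authorization": "Bearer fresh"},
                              auth_token="stale")
print(headers)
```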

1.7.0

  • Grouping / Field Collapsing: resp.grouped['field'].groups for grouped results (Solr 3.3+)
  • GroupedResult, GroupField, Group classes with groupValue, doclist, matches, ngroups
  • Works in both JSON and XML modes
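
The attribute names above mirror the keys of Solr's default grouped JSON response. The sketch below walks a raw response of that shape using plain dicts. The sample data is invented for illustration, and the code does not use solrpy's GroupedResult classes.

```python
# Invented sample in the shape Solr returns for group=true&group.field=author
raw = {
    "grouped": {
        "author": {
            "matches": 3,
            "ngroups": 2,
            "groups": [
                {"groupValue": "Erik Hatcher",
                 "doclist": {"numFound": 2, "start": 0,
                             "docs": [{"id": "1"}, {"id": "2"}]}},
                {"groupValue": "Otis Gospodnetić",
                 "doclist": {"numFound": 1, "start": 0,
                             "docs": [{"id": "3"}]}},
            ],
        }
    }
}

# Collect the document ids found under each group value.
summary = {
    group["groupValue"]: [doc["id"] for doc in group["doclist"]["docs"]]
    for group in raw["grouped"]["author"]["groups"]
}
print(summary)
```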

1.6.0

  • Extract: Extract(conn) wrapper class for Solr Cell (Apache Tika) via /update/extract (Solr 1.4+). Index rich documents (PDF, Word, HTML, …) with optional literal field values. extract_only() extracts text and metadata without indexing. from_path() / extract_from_path() open files by filesystem path, with the MIME type guessed automatically.

from solr import Solr, Extract

conn = Solr('http://localhost:8983/solr/mycore')
extract = Extract(conn)

# Index a PDF with metadata
with open('report.pdf', 'rb') as f:
    extract(f, content_type='application/pdf',
            literal_id='report1', literal_title='Annual Report',
            commit=True)

# Extract text only (no indexing)
text, metadata = extract.extract_from_path('report.pdf')
print(text[:200])

# Index from path (MIME type auto-detected)
extract.from_path('document.docx', literal_id='doc1', commit=True)

1.5.0

  • Suggest: Suggest(conn) wrapper class for Solr's SuggestComponent (Solr 4.7+). Returns a flat list of suggestion dicts from the /suggest handler.
  • Spellcheck: Response.spellcheck property returns a SpellcheckResult object with .collation and .suggestions accessors. Works in both JSON and XML modes (Solr 1.4+).

from solr import Solr, Suggest

conn = Solr('http://localhost:8983/solr/mycore')

# Suggest
suggest = Suggest(conn)
results = suggest('que', dictionary='mySuggester', count=5)
for s in results:
    print(s['term'], s['weight'])

# Spellcheck
resp = conn.select('misspeled query', spellcheck='true', spellcheck_collate='true')
if resp.spellcheck and not resp.spellcheck.correctly_spelled:
    print('Did you mean:', resp.spellcheck.collation)

1.4.2

  • New MoreLikeThis(conn) wrapper class — no need to know /mlt path

1.4.1

  • Breaking: conn.schema and conn.mlt removed from auto-initialization
  • Use SchemaAPI(conn) and SearchHandler(conn, '/mlt') explicitly
  • Keeps Solr class lightweight; optional features created on demand

1.4.0

  • Schema API: conn.schema.fields(), add_field(), replace_field(), delete_field(), copy fields, dynamic fields, field types (Solr 4.2+)

1.3.0

  • JSON Facet API: json_facet parameter for advanced faceting (Solr 5.0+)
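
A JSON Facet request is a nested structure sent to Solr as the json.facet parameter. The sketch below builds that raw parameter by hand. The field names are invented, and whether solrpy's json_facet argument expects a dict or a pre-serialized string is not stated here, so treat this as the wire format rather than the client call.

```python
import json

# Facet definitions in Solr's JSON Facet syntax (field names are examples).
facet_request = {
    "categories": {"type": "terms", "field": "category", "limit": 5},
    "avg_price": "avg(price)",
}

# Solr receives the whole structure as one JSON-encoded parameter.
params = {"q": "*:*", "json.facet": json.dumps(facet_request)}
print(params["json.facet"])
```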

1.2.0

  • Cursor pagination: resp.cursor_next() and conn.iter_cursor() for deep pagination (Solr 4.7+)
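
Cursor pagination in Solr starts with cursorMark=* and stops once the returned nextCursorMark equals the one that was sent. The loop below sketches that protocol against a fake in-memory page source. Both walk_cursor and fake_fetch are hypothetical and are not solrpy functions.

```python
DOCS = [{"id": str(i)} for i in range(5)]


def fake_fetch(cursor, rows=2):
    """Stand-in for a Solr request; the cursor here is just an offset."""
    start = 0 if cursor == "*" else int(cursor)
    docs = DOCS[start:start + rows]
    # Like Solr, return the same mark once there is nothing left to read.
    next_cursor = str(start + len(docs)) if docs else cursor
    return {"docs": docs, "nextCursorMark": next_cursor}


def walk_cursor(fetch_page):
    """Yield documents page by page until the cursor stops advancing."""
    cursor = "*"  # Solr's documented starting cursorMark
    while True:
        page = fetch_page(cursor)
        yield from page["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:
            break
        cursor = next_cursor


ids = [doc["id"] for doc in walk_cursor(fake_fetch)]
print(ids)
```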

1.1.0

  • Soft Commit: conn.commit(soft_commit=True) (Solr 4.0+)
  • Atomic Update: conn.atomic_update(doc) with set/add/remove/inc modifiers (Solr 4.0+)
  • Real-time Get: conn.get(id='doc1') via /get handler (Solr 4.0+)
  • MoreLikeThis: conn.mlt handler for similar document search (Solr 4.0+)
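
In Solr's JSON update format, an atomic update replaces a field value with a small dict naming the modifier. The sketch below shows such a document using the four modifiers listed above. The field names are invented, and it is an assumption that conn.atomic_update(doc) accepts a dict of this shape.

```python
import json

# One modifier per field; "id" identifies the document to patch.
update = {
    "id": "doc1",
    "price": {"set": 19.99},          # overwrite the value
    "tags": {"add": ["search"]},      # append to a multi-valued field
    "old_tags": {"remove": ["draft"]},  # drop matching values
    "views": {"inc": 1},              # increment a numeric field
}

# Solr's /update handler takes a JSON array of such documents.
payload = json.dumps([update])
print(payload)
```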

1.0.9

  • Per-request timeout override: conn.select('*:*', timeout=5)

1.0.8

  • Exponential backoff on connection retries with configurable retry_delay
  • Each retry logged at WARNING level
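
As an illustration of the retry behavior, here is a minimal sketch of exponential backoff with a WARNING log per retry. The function call_with_backoff, its parameters, and the doubling factor are assumptions made for this example; only the retry_delay name comes from the changelog.

```python
import logging
import time

log = logging.getLogger("backoff_sketch")


def call_with_backoff(func, max_retries=3, retry_delay=0.1, sleep=time.sleep):
    """Retry func on ConnectionError, doubling the wait after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ConnectionError as exc:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = retry_delay * 2 ** attempt
            log.warning("retry %d/%d in %.2fs: %s",
                        attempt + 1, max_retries, delay, exc)
            sleep(delay)


attempts = []


def flaky():
    """Fail twice, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("connection refused")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky, retry_delay=0.5, sleep=waits.append)
print(result, waits)
```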

1.0.7

  • Breaking: EmptyPage now inherits ValueError (was SolrException)
  • New PageNotAnInteger exception (inherits TypeError)
  • Paginator module no longer depends on SolrException

1.0.6

  • URL validation: warns if URL path doesn't contain /solr (Solr 10.0+ preparation)
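
A check of this kind could look like the sketch below. The function check_solr_url and the warning text are hypothetical; solrpy's actual check and message may differ.

```python
import warnings
from urllib.parse import urlparse


def check_solr_url(url):
    """Hypothetical check: warn when the URL path lacks '/solr'."""
    path = urlparse(url).path
    if "/solr" not in path:
        warnings.warn(
            f"URL path {path!r} does not contain '/solr'",
            UserWarning,
            stacklevel=2,
        )
    return url


check_solr_url("http://localhost:8983/solr/mycore")  # no warning
check_solr_url("http://localhost:8983/mycore")       # emits a UserWarning
```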

1.0.5

  • Breaking: Removed SolrConnection class. Use Solr instead
  • Migration: add(**fields) → add(dict), query() → select(), raw_query() → select.raw()

1.0.4

  • Breaking: Default response_format changed from 'xml' to 'json'
  • Pass response_format='xml' explicitly for legacy XML behavior

1.0.3

  • Added response_format constructor option ('xml' or 'json')
  • Split solr/core.py into exceptions.py, utils.py, response.py, parsers.py
  • All existing imports continue to work (re-exported via __init__.py)

1.0.2

  • mypy --strict passes with zero errors on solr/ package
  • Added type hints to all internal classes (ResponseContentHandler, Node, Results, UTC)
  • Fixed endElement variable shadowing for type safety

1.0.1

  • Added type hints to all public methods in solr/core.py and solr/paginator.py
  • Added solr/py.typed marker file for PEP 561 compatibility
  • Added mypy to dev dependencies
  • mypy passes with zero errors on solr/ package

0.9.11

  • Added JSON response parser (parse_json_response)
  • Added Solr.ping() convenience method
  • Added always_commit constructor option for auto-commit behavior
  • Added gzip response support (Accept-Encoding: gzip)

0.9.10

  • Added pyproject.toml metadata (authors, maintainers, classifiers, keywords)
  • Added Sphinx documentation (quickstart, API reference, version detection, changelog)
  • Rewrote README.md with current API examples and Docker test instructions
  • Updated CLAUDE.md development guidelines

0.9.9

  • Removed deprecated encoder/decoder attributes and codecs import
  • Fixed commit(_optimize=True) to correctly issue <optimize/> command
  • Added test coverage for <double> XML type parsing
  • Added test coverage for named <result> tag handling
  • Added Solr version auto-detection (server_version)
  • Added SolrVersionError exception and requires_version decorator
  • Removed all Python 2 compatibility code (Python 3.10+ only)
  • Migrated from setuptools to Poetry
  • Bumped version to 0.9.9

License

Apache License 2.0
