Client for the Solr search service

solrpy

solrpy is a Python client for Solr, an enterprise search server built on top of Lucene. solrpy allows you to add documents to a Solr instance, and then to perform queries and gather search results from Solr using Python.

  • Supports Solr 1.2 through 10.x
  • Automatic Solr version detection with runtime feature gating
  • Python 3.10+ required

Installation

pip install solrpy

Or with Poetry:

poetry add solrpy

Overview

import solr

# create a connection to a solr server
s = solr.Solr('http://localhost:8983/solr/mycore')

# the server version is auto-detected
print(s.server_version)  # e.g. (9, 4, 1)

# check if the server is reachable
print(s.ping())  # True

# add a document to the index
doc = {
    "id": 1,
    "title": "Lucene in Action",
    "author": ["Erik Hatcher", "Otis Gospodnetić"],
}
s.add(doc, commit=True)

# do a search
response = s.select('title:lucene')
for hit in response.results:
    print(hit['title'])

Response format

Since v1.0.4, solrpy uses JSON (wt=json) by default, matching Solr 7.0+ behavior.

For legacy XML mode:

s = solr.Solr('http://localhost:8983/solr/mycore', response_format='xml')

The Response object API is identical regardless of format.

More powerful queries

Optional parameters for querying, faceting, highlighting, and MoreLikeThis can be passed as Python keyword arguments to the select method. Convert the dot notation (e.g. facet.field) to underscore notation (e.g. facet_field) so that the names are valid Python parameter names.

response = s.select('title:lucene', facet='true', facet_field='subject')

If the parameter takes multiple values, pass them in as a list:

response = s.select('title:lucene', facet='true', facet_field=['subject', 'publisher'])
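To make the naming convention concrete, here is a minimal sketch of how underscore-style keyword arguments could be mapped to Solr's dotted parameter names. The helper to_solr_params is hypothetical and is not part of solrpy; it only illustrates the mapping and the way a list becomes a repeated parameter.

```python
from urllib.parse import urlencode


def to_solr_params(q, **kwargs):
    """Hypothetical helper: map underscore kwargs to dotted Solr params."""
    params = [("q", q)]
    for name, value in kwargs.items():
        dotted = name.replace("_", ".")
        # A list or tuple means the parameter is repeated once per value.
        values = value if isinstance(value, (list, tuple)) else [value]
        params.extend((dotted, str(v)) for v in values)
    return params


params = to_solr_params("title:lucene", facet="true",
                        facet_field=["subject", "publisher"])
query_string = urlencode(params)
print(query_string)
```

A plain string replacement like this would also rewrite underscores that belong to a field name, so a real client probably needs more careful handling than this sketch shows.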

Version detection

solrpy automatically detects the connected Solr version and gates features accordingly. If a feature requires a newer Solr version than what is connected, a SolrVersionError is raised with a clear message.

import solr

s = solr.Solr('http://localhost:8983/solr/mycore')
print(s.server_version)  # e.g. (6, 6, 6)
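The gating idea can be sketched with a decorator that compares version tuples. The names SolrVersionError and requires_version appear in the changelog, but the definitions below are stand-ins written for illustration; the signatures, the FakeConnection class, and the error message are assumptions, not solrpy's actual code.

```python
import functools


class SolrVersionError(Exception):
    """Stand-in for the exception named above (illustrative only)."""


def requires_version(*minimum):
    """Hypothetical decorator: refuse to run on servers older than `minimum`."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Tuples compare element by element, so (6, 6, 6) < (9, 0).
            if self.server_version < minimum:
                needed = ".".join(map(str, minimum))
                found = ".".join(map(str, self.server_version))
                raise SolrVersionError(
                    f"{method.__name__} requires Solr {needed}+, "
                    f"but the server is {found}"
                )
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class FakeConnection:
    """Toy connection that only carries a version tuple."""

    def __init__(self, server_version):
        self.server_version = server_version

    @requires_version(9, 0)
    def knn_search(self):
        return "ok"


print(FakeConnection((9, 4, 1)).knn_search())
try:
    FakeConnection((6, 6, 6)).knn_search()
except SolrVersionError as exc:
    print(exc)
```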

Tests

The tests require a running Solr instance. To start one with Docker:

docker run -d --name solr-dev -p 8983:8983 solr:6.6 solr-precreate core0
poetry run pytest tests/

Changelog

1.12.0

  • Streaming Expressions: Python builder with pipe (|) operator — no other non-Java client has this
  • search, merge, rollup, top, unique, innerJoin, etc.
  • Aggregate: count, sum, avg, min, max
  • conn.stream(expr) → iterator of result dicts
  • Pydantic model support via model= parameter
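
As a rough illustration of how a pipe operator can compose streaming expressions, the sketch below nests each left-hand stream inside the function on its right. The Expr class and the search and unique helpers are hypothetical and written for this example; solrpy's real builder may be named and structured differently.

```python
class Expr:
    """Hypothetical streaming-expression node (not solrpy's class)."""

    def __init__(self, name, *args, **params):
        self.name = name
        self.args = list(args)
        self.params = params

    def __or__(self, other):
        # `a | b` feeds stream `a` into `b` as its first argument.
        return Expr(other.name, self, *other.args, **other.params)

    def __str__(self):
        parts = [str(a) for a in self.args]
        parts += [f'{key}="{value}"' for key, value in self.params.items()]
        return f"{self.name}({', '.join(parts)})"


def search(collection, **params):
    return Expr("search", collection, **params)


def unique(**params):
    return Expr("unique", **params)


expr = search("books", q="*:*", fl="id,author", sort="author asc") | unique(over="author")
print(expr)
# unique(search(books, q="*:*", fl="id,author", sort="author asc"), over="author")
```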

1.11.0

  • Pydantic response models: conn.select('*:*', model=MyDoc) converts results to Pydantic models
  • Response.as_models(MyDoc) for post-hoc conversion
  • conn.get(id='1', model=MyDoc) returns MyDoc | None
  • pip install solrpy[pydantic]

1.10.1

  • Field builder: Field('price', alias='p'), Field.func('sum', 'price', 'tax'), Field.transformer('explain')
  • Sort builder: Sort('price', 'desc'), Sort.func('geodist()', 'asc')
  • Facet builder: Facet.field('category'), Facet.range('price', 0, 100, 10), Facet.query(), Facet.pivot()
  • Fully backward compatible — raw strings still work

1.10.0

  • SolrCloud: SolrCloud(zk, collection) with ZooKeeper or SolrCloud.from_urls(urls, collection) HTTP-only
  • Leader-aware writes, automatic failover, collection aliases
  • SolrZooKeeper class for ZooKeeper node discovery
  • kazoo optional dependency (pip install solrpy[cloud])
  • Docker Compose for local SolrCloud testing

1.9.2

  • Full compatibility with Solr 6 through 10: wt=xml on Solr 7+ (wt=standard changed in 7.0)
  • Tested against Solr 6.6, 7.7, 8.11, 9.7, 10.0 — all 0 failures
  • GitHub Actions CI matrix for 5 Solr versions
  • KNN live tests version-gated (skip on < 9.0, efSearchScaleFactor skip on < 10.0)
  • Test isolation: Paginator no longer deletes all documents

1.9.1

  • KNN API overhaul: search(), similarity(), hybrid(), rerank() methods
  • Full {!knn} parameters: early_termination, seed_query, pre_filter, etc.
  • {!vectorSimilarity} threshold search (Solr 9.6+)
  • Hybrid (lexical OR vector) and re-ranking patterns

1.9.0

  • KNN / Dense Vector Search: KNN(conn) for {!knn} queries (Solr 9.0+)
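
For orientation, a {!knn} query is a local-params prefix followed by the query vector. The sketch below builds such a string by hand; knn_query is a hypothetical helper and the field name vector is an assumption, so this shows the query syntax Solr expects rather than the KNN class's API.

```python
def knn_query(field, vector, top_k=10):
    """Hypothetical helper: format a Solr {!knn} query string."""
    values = ", ".join(str(float(v)) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{values}]"


q = knn_query("vector", [1, 2.5, 3], top_k=5)
print(q)  # {!knn f=vector topK=5}[1.0, 2.5, 3.0]
```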

1.8.1

  • HTTP transport abstraction: SolrTransport decouples companion classes from internal _get/_post
  • SchemaAPI, Suggest, Extract now use SolrTransport — prepares for httpx in 2.0.0

1.8.0

  • Bearer token auth: Solr(url, auth_token='...')
  • Custom auth callable: Solr(url, auth=my_fn) for OAuth2 dynamic refresh
  • Priority: auth callable > auth_token > http_user/http_pass
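
The priority order can be sketched as a small resolver. build_auth_header is a hypothetical function, and the assumption that the auth callable returns a dict of headers is mine; the source does not state what the callable returns.

```python
import base64


def build_auth_header(auth=None, auth_token=None, http_user=None, http_pass=None):
    """Hypothetical resolver following the priority order listed above."""
    if auth is not None:
        # Assumption: the callable returns a dict of headers.
        return dict(auth())
    if auth_token is not None:
        return {"Authorization": f"Bearer {auth_token}"}
    if http_user is not None and http_pass is not None:
        raw = f"{http_user}:{http_pass}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    return {}


# The callable wins even when a static token is also supplied.
headers = build_auth_header(
    auth=lambda: {"Authorization": "Bearer fresh"},
    auth_token="stale",
)
print(headers)
```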

1.7.0

  • Grouping / Field Collapsing: resp.grouped['field'].groups for grouped results (Solr 3.3+)
  • GroupedResult, GroupField, Group classes with groupValue, doclist, matches, ngroups
  • Works in both JSON and XML modes

1.6.0

  • Extract: Extract(conn) wrapper class for Solr Cell (Apache Tika) via /update/extract (Solr 1.4+). Index rich documents (PDF, Word, HTML, …) with optional literal field values. extract_only() extracts text and metadata without indexing. from_path() and extract_from_path() open files by filesystem path, with the MIME type guessed automatically.

from solr import Solr, Extract

conn = Solr('http://localhost:8983/solr/mycore')
extract = Extract(conn)

# Index a PDF with metadata
with open('report.pdf', 'rb') as f:
    extract(f, content_type='application/pdf',
            literal_id='report1', literal_title='Annual Report',
            commit=True)

# Extract text only (no indexing)
text, metadata = extract.extract_from_path('report.pdf')
print(text[:200])

# Index from path (MIME type auto-detected)
extract.from_path('document.docx', literal_id='doc1', commit=True)

1.5.0

  • Suggest: Suggest(conn) wrapper class for Solr's SuggestComponent (Solr 4.7+). Returns a flat list of suggestion dicts from the /suggest handler.
  • Spellcheck: Response.spellcheck property returns a SpellcheckResult object with .collation and .suggestions accessors. Works in both JSON and XML modes (Solr 1.4+).

from solr import Solr, Suggest

conn = Solr('http://localhost:8983/solr/mycore')

# Suggest
suggest = Suggest(conn)
results = suggest('que', dictionary='mySuggester', count=5)
for s in results:
    print(s['term'], s['weight'])

# Spellcheck
resp = conn.select('misspeled query', spellcheck='true', spellcheck_collate='true')
if resp.spellcheck and not resp.spellcheck.correctly_spelled:
    print('Did you mean:', resp.spellcheck.collation)

1.4.2

  • New MoreLikeThis(conn) wrapper class — no need to know /mlt path

1.4.1

  • Breaking: conn.schema and conn.mlt removed from auto-initialization
  • Use SchemaAPI(conn) and SearchHandler(conn, '/mlt') explicitly
  • Keeps Solr class lightweight; optional features created on demand

1.4.0

  • Schema API: conn.schema.fields(), add_field(), replace_field(), delete_field(), copy fields, dynamic fields, field types (Solr 4.2+)

1.3.0

  • JSON Facet API: json_facet parameter for advanced faceting (Solr 5.0+)

1.2.0

  • Cursor pagination: resp.cursor_next() and conn.iter_cursor() for deep pagination (Solr 4.7+)
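
Solr's cursor protocol starts with cursorMark=* and ends when the server returns the same cursor that was sent. The loop below sketches that protocol against an in-memory stand-in; iter_cursor here is a hypothetical free function, fake_fetch replaces the HTTP call, and the numeric cursors are a simplification (real cursor marks are opaque strings).

```python
def iter_cursor(fetch_page, rows=2):
    """Hypothetical generator following Solr's cursorMark protocol."""
    cursor = "*"  # Solr's documented starting cursor
    while True:
        docs, next_cursor = fetch_page(cursor, rows)
        yield from docs
        if next_cursor == cursor:
            # The server echoes the cursor back when nothing is left.
            break
        cursor = next_cursor


# In-memory stand-in for a core sorted by its unique key.
DOCS = [{"id": str(i)} for i in range(5)]


def fake_fetch(cursor, rows):
    start = 0 if cursor == "*" else int(cursor)
    page = DOCS[start:start + rows]
    return page, str(start + len(page))


ids = [doc["id"] for doc in iter_cursor(fake_fetch)]
print(ids)
```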

1.1.0

  • Soft Commit: conn.commit(soft_commit=True) (Solr 4.0+)
  • Atomic Update: conn.atomic_update(doc) with set/add/remove/inc modifiers (Solr 4.0+)
  • Real-time Get: conn.get(id='doc1') via /get handler (Solr 4.0+)
  • MoreLikeThis: conn.mlt handler for similar document search (Solr 4.0+)
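
For reference, an atomic update in Solr's JSON format wraps each changed field in a single-key dict naming the modifier. The builder below sketches that wire format; atomic_update_doc is a hypothetical helper, and the shape conn.atomic_update(doc) actually accepts may differ.

```python
import json


def atomic_update_doc(doc_id, **modifiers):
    """Hypothetical builder for a Solr atomic-update document.

    Each keyword has the form <field>=(<modifier>, <value>).
    """
    allowed = {"set", "add", "remove", "inc"}
    doc = {"id": doc_id}
    for field, (modifier, value) in modifiers.items():
        if modifier not in allowed:
            raise ValueError(f"unsupported modifier: {modifier}")
        doc[field] = {modifier: value}
    return doc


doc = atomic_update_doc("doc1", price=("set", 9.99), tags=("add", "sale"),
                        views=("inc", 1))
payload = json.dumps([doc])
print(payload)
```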

1.0.9

  • Per-request timeout override: conn.select('*:*', timeout=5)

1.0.8

  • Exponential backoff on connection retries with configurable retry_delay
  • Each retry logged at WARNING level
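
A retry loop with exponential backoff might look like the sketch below. call_with_backoff is hypothetical; the doubling factor, the max_retries parameter, and the injectable sleep function are assumptions, and only the retry_delay name and the WARNING log level come from the notes above.

```python
import logging
import time

log = logging.getLogger("retry_sketch")


def call_with_backoff(func, max_retries=3, retry_delay=0.1, sleep=time.sleep):
    """Hypothetical retry loop: wait retry_delay * 2**attempt between tries."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ConnectionError as exc:
            if attempt == max_retries:
                raise  # out of retries, surface the last error
            delay = retry_delay * (2 ** attempt)
            log.warning("attempt %d failed (%s); retrying in %.2fs",
                        attempt + 1, exc, delay)
            sleep(delay)


# Demo: fail twice, then succeed; record the delays instead of sleeping.
calls = {"n": 0}
delays = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection refused")
    return "ok"


result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```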

1.0.7

  • Breaking: EmptyPage now inherits ValueError (was SolrException)
  • New PageNotAnInteger exception (inherits TypeError)
  • Paginator module no longer depends on SolrException
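
The practical effect of the new base classes is that callers can catch the built-in exceptions. The classes below are stand-ins that mirror the inheritance described above, and validate_page_number is a hypothetical function, not the paginator's real method.

```python
class EmptyPage(ValueError):
    """Stand-in with the base class described above."""


class PageNotAnInteger(TypeError):
    """Stand-in with the base class described above."""


def validate_page_number(number, num_pages):
    """Hypothetical validator in the style of a paginator."""
    if isinstance(number, bool) or not isinstance(number, int):
        raise PageNotAnInteger("page number must be an integer")
    if number < 1 or number > num_pages:
        raise EmptyPage(f"page {number} is out of range 1..{num_pages}")
    return number


# Callers that only know the built-in exception types still work.
try:
    validate_page_number(99, num_pages=3)
except ValueError as exc:
    print("out of range:", exc)

try:
    validate_page_number("2", num_pages=3)
except TypeError as exc:
    print("bad type:", exc)
```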

1.0.6

  • URL validation: warns if URL path doesn't contain /solr (Solr 10.0+ preparation)
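
A check of this kind can be sketched with the standard library. check_solr_url is a hypothetical function, and the warning category and message are assumptions rather than what solrpy emits.

```python
import warnings
from urllib.parse import urlparse


def check_solr_url(url):
    """Hypothetical check: warn when the URL path lacks a /solr segment."""
    path = urlparse(url).path
    if "/solr" not in path:
        warnings.warn(
            f"URL path {path!r} does not contain '/solr'",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True


print(check_solr_url("http://localhost:8983/solr/mycore"))  # True
```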

1.0.5

  • Breaking: Removed SolrConnection class. Use Solr instead
  • Migration: add(**fields) → add(dict), query() → select(), raw_query() → select.raw()

1.0.4

  • Breaking: Default response_format changed from 'xml' to 'json'
  • Pass response_format='xml' explicitly for legacy XML behavior

1.0.3

  • Added response_format constructor option ('xml' or 'json')
  • Split solr/core.py into exceptions.py, utils.py, response.py, parsers.py
  • All existing imports continue to work (re-exported via __init__.py)

1.0.2

  • mypy --strict passes with zero errors on solr/ package
  • Added type hints to all internal classes (ResponseContentHandler, Node, Results, UTC)
  • Fixed endElement variable shadowing for type safety

1.0.1

  • Added type hints to all public methods in solr/core.py and solr/paginator.py
  • Added solr/py.typed marker file for PEP 561 compatibility
  • Added mypy to dev dependencies
  • mypy passes with zero errors on solr/ package

0.9.11

  • Added JSON response parser (parse_json_response)
  • Added Solr.ping() convenience method
  • Added always_commit constructor option for auto-commit behavior
  • Added gzip response support (Accept-Encoding: gzip)

0.9.10

  • Added pyproject.toml metadata (authors, maintainers, classifiers, keywords)
  • Added Sphinx documentation (quickstart, API reference, version detection, changelog)
  • Rewrote README.md with current API examples and Docker test instructions
  • Updated CLAUDE.md development guidelines

0.9.9

  • Removed deprecated encoder/decoder attributes and codecs import
  • Fixed commit(_optimize=True) to correctly issue <optimize/> command
  • Added test coverage for <double> XML type parsing
  • Added test coverage for named <result> tag handling
  • Added Solr version auto-detection (server_version)
  • Added SolrVersionError exception and requires_version decorator
  • Removed all Python 2 compatibility code (Python 3.10+ only)
  • Migrated from setuptools to Poetry
  • Bumped version to 0.9.9

License

Apache License 2.0
