Skip to main content

Client for the Solr search service

Project description

solrpy

solrpy is a Python client for Solr, an enterprise search server built on top of Lucene. solrpy allows you to add documents to a Solr instance, and then to perform queries and gather search results from Solr using Python.

  • Supports Solr 1.2 through 10.x
  • Automatic Solr version detection with runtime feature gating
  • Python 3.10+ required

Installation

pip install solrpy

Or with Poetry:

poetry add solrpy

Overview

import solr

# create a connection to a solr server
s = solr.Solr('http://localhost:8983/solr/mycore')

# the server version is auto-detected
print(s.server_version)  # e.g. (9, 4, 1)

# check if the server is reachable
print(s.ping())  # True

# add a document to the index
doc = {
    "id": 1,
    "title": "Lucene in Action",
    "author": ["Erik Hatcher", "Otis Gospodnetić"],
}
s.add(doc, commit=True)

# do a search
response = s.select('title:lucene')
for hit in response.results:
    print(hit['title'])

Response format

Since v1.0.4, solrpy uses JSON (wt=json) by default, matching Solr 7.0+ behavior.

For legacy XML mode:

s = solr.Solr('http://localhost:8983/solr/mycore', response_format='xml')

The Response object API is identical regardless of format.

More powerful queries

Optional parameters for query, faceting, highlighting, and more like this can be passed in as Python parameters to the query method. Convert the dot notation (e.g. facet.field) to underscore notation (e.g. facet_field) so that they can be used as parameter names.

response = s.select('title:lucene', facet='true', facet_field='subject')

If the parameter takes multiple values, pass them in as a list:

response = s.select('title:lucene', facet='true', facet_field=['subject', 'publisher'])

Version detection

solrpy automatically detects the connected Solr version and gates features accordingly. If a feature requires a newer Solr version than what is connected, a SolrVersionError is raised with a clear message.

import solr

s = solr.Solr('http://localhost:8983/solr/mycore')
print(s.server_version)  # (6, 6, 6)

Tests

Tests require a running Solr instance. Using Docker:

docker run -d --name solr-dev -p 8983:8983 solr:6.6 solr-precreate core0
poetry run pytest tests/

Changelog

2.0.3

  • Async companion classes: AsyncSchemaAPI, AsyncKNN, AsyncMoreLikeThis, AsyncSuggest, AsyncExtract
  • Full async support for all companion features

2.0.2

  • AsyncSolr: async with AsyncSolr(url) as conn: await conn.select('*:*')
  • AsyncTransport for async companion classes
  • Full async: select, add, add_many, delete, commit, get

2.0.1

  • Breaking: http.client replaced with httpx.Client
  • Automatic connection pooling and keep-alive
  • httpx is now a required dependency
  • All public API unchanged — drop-in replacement for 1.x

1.12.0

  • Streaming Expressions: Python builder with pipe (|) operator — no other non-Java client has this
  • search, merge, rollup, top, unique, innerJoin, etc.
  • Aggregate: count, sum, avg, min, max
  • conn.stream(expr) → iterator of result dicts
  • Pydantic model support via model= parameter

1.11.0

  • Pydantic response models: conn.select('*:*', model=MyDoc) converts results to Pydantic models
  • Response.as_models(MyDoc) for post-hoc conversion
  • conn.get(id='1', model=MyDoc) returns MyDoc | None
  • pip install solrpy[pydantic]

1.10.1

  • Field builder: Field('price', alias='p'), Field.func('sum', 'price', 'tax'), Field.transformer('explain')
  • Sort builder: Sort('price', 'desc'), Sort.func('geodist()', 'asc')
  • Facet builder: Facet.field('category'), Facet.range('price', 0, 100, 10), Facet.query(), Facet.pivot()
  • Fully backward compatible — raw strings still work

1.10.0

  • SolrCloud: SolrCloud(zk, collection) with ZooKeeper or SolrCloud.from_urls(urls, collection) HTTP-only
  • Leader-aware writes, automatic failover, collection aliases
  • SolrZooKeeper class for ZooKeeper node discovery
  • kazoo optional dependency (pip install solrpy[cloud])
  • Docker Compose for local SolrCloud testing

1.9.2

  • Solr 6~10 full compatibility: wt=xml on Solr 7+ (wt=standard changed in 7.0)
  • Tested against Solr 6.6, 7.7, 8.11, 9.7, 10.0 — all 0 failures
  • GitHub Actions CI matrix for 5 Solr versions
  • KNN live tests version-gated (skip on < 9.0, efSearchScaleFactor skip on < 10.0)
  • Test isolation: Paginator no longer deletes all documents

1.9.1

  • KNN API overhaul: search(), similarity(), hybrid(), rerank() methods
  • Full {!knn} parameters: early_termination, seed_query, pre_filter, etc.
  • {!vectorSimilarity} threshold search (Solr 9.6+)
  • Hybrid (lexical OR vector) and re-ranking patterns

1.9.0

  • KNN / Dense Vector Search: KNN(conn) for {!knn} queries (Solr 9.0+)

1.8.1

  • HTTP transport abstraction: SolrTransport decouples companion classes from internal _get/_post
  • SchemaAPI, Suggest, Extract now use SolrTransport — prepares for httpx in 2.0.0

1.8.0

  • Bearer token auth: Solr(url, auth_token='...')
  • Custom auth callable: Solr(url, auth=my_fn) for OAuth2 dynamic refresh
  • Priority: auth callable > auth_token > http_user/http_pass

1.7.0

  • Grouping / Field Collapsing: resp.grouped['field'].groups for grouped results (Solr 3.3+)
  • GroupedResult, GroupField, Group classes with groupValue, doclist, matches, ngroups
  • Works in both JSON and XML modes

1.6.0

  • Extract: Extract(conn) wrapper class for Solr Cell (Apache Tika) via /update/extract (Solr 1.4+). Index rich documents (PDF, Word, HTML, …) with optional literal field values. extract_only() extracts text and metadata without indexing. from_path() / extract_from_path() open files by filesystem path, MIME type guessed automatically.
from solr import Solr, Extract

conn = Solr('http://localhost:8983/solr/mycore')
extract = Extract(conn)

# Index a PDF with metadata
with open('report.pdf', 'rb') as f:
    extract(f, content_type='application/pdf',
            literal_id='report1', literal_title='Annual Report',
            commit=True)

# Extract text only (no indexing)
text, metadata = extract.extract_from_path('report.pdf')
print(text[:200])

# Index from path (MIME type auto-detected)
extract.from_path('document.docx', literal_id='doc1', commit=True)

1.5.0

  • Suggest: Suggest(conn) wrapper class for Solr's SuggestComponent (Solr 4.7+). Returns a flat list of suggestion dicts from the /suggest handler.
  • Spellcheck: Response.spellcheck property returns a SpellcheckResult object with .collation and .suggestions accessors. Works in both JSON and XML modes (Solr 1.4+).
from solr import Solr, Suggest

conn = Solr('http://localhost:8983/solr/mycore')

# Suggest
suggest = Suggest(conn)
results = suggest('que', dictionary='mySuggester', count=5)
for s in results:
    print(s['term'], s['weight'])

# Spellcheck
resp = conn.select('misspeled query', spellcheck='true', spellcheck_collate='true')
if resp.spellcheck and not resp.spellcheck.correctly_spelled:
    print('Did you mean:', resp.spellcheck.collation)

1.4.2

  • New MoreLikeThis(conn) wrapper class — no need to know /mlt path

1.4.1

  • Breaking: conn.schema and conn.mlt removed from auto-initialization
  • Use SchemaAPI(conn) and SearchHandler(conn, '/mlt') explicitly
  • Keeps Solr class lightweight; optional features created on demand

1.4.0

  • Schema API: conn.schema.fields(), add_field(), replace_field(), delete_field(), copy fields, dynamic fields, field types (Solr 4.2+)

1.3.0

  • JSON Facet API: json_facet parameter for advanced faceting (Solr 5.0+)

1.2.0

  • Cursor pagination: resp.cursor_next() and conn.iter_cursor() for deep pagination (Solr 4.7+)

1.1.0

  • Soft Commit: conn.commit(soft_commit=True) (Solr 4.0+)
  • Atomic Update: conn.atomic_update(doc) with set/add/remove/inc modifiers (Solr 4.0+)
  • Real-time Get: conn.get(id='doc1') via /get handler (Solr 4.0+)
  • MoreLikeThis: conn.mlt handler for similar document search (Solr 4.0+)

1.0.9

  • Per-request timeout override: conn.select('*:*', timeout=5)

1.0.8

  • Exponential backoff on connection retries with configurable retry_delay
  • Each retry logged at WARNING level

1.0.7

  • Breaking: EmptyPage now inherits ValueError (was SolrException)
  • New PageNotAnInteger exception (inherits TypeError)
  • Paginator module no longer depends on SolrException

1.0.6

  • URL validation: warns if URL path doesn't contain /solr (Solr 10.0+ preparation)

1.0.5

  • Breaking: Removed SolrConnection class. Use Solr instead
  • Migration: add(**fields)add(dict), query()select(), raw_query()select.raw()

1.0.4

  • Breaking: Default response_format changed from 'xml' to 'json'
  • Pass response_format='xml' explicitly for legacy XML behavior

1.0.3

  • Added response_format constructor option ('xml' or 'json')
  • Split solr/core.py into exceptions.py, utils.py, response.py, parsers.py
  • All existing imports continue to work (re-exported via __init__.py)

1.0.2

  • mypy --strict passes with zero errors on solr/ package
  • Added type hints to all internal classes (ResponseContentHandler, Node, Results, UTC)
  • Fixed endElement variable shadowing for type safety

1.0.1

  • Added type hints to all public methods in solr/core.py and solr/paginator.py
  • Added solr/py.typed marker file for PEP 561 compatibility
  • Added mypy to dev dependencies
  • mypy passes with zero errors on solr/ package

0.9.11

  • Added JSON response parser (parse_json_response)
  • Added Solr.ping() convenience method
  • Added always_commit constructor option for auto-commit behavior
  • Added gzip response support (Accept-Encoding: gzip)

0.9.10

  • Added pyproject.toml metadata (authors, maintainers, classifiers, keywords)
  • Added Sphinx documentation (quickstart, API reference, version detection, changelog)
  • Rewrote README.md with current API examples and Docker test instructions
  • Updated CLAUDE.md development guidelines

0.9.9

  • Removed deprecated encoder/decoder attributes and codecs import
  • Fixed commit(_optimize=True) to correctly issue <optimize/> command
  • Added test coverage for <double> XML type parsing
  • Added test coverage for named <result> tag handling
  • Added Solr version auto-detection (server_version)
  • Added SolrVersionError exception and requires_version decorator
  • Removed all Python 2 compatibility code (Python 3.10+ only)
  • Migrated from setuptools to Poetry
  • Bumped version to 0.9.9

License

Apache License 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

solrpy-2.0.3.tar.gz (40.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

solrpy-2.0.3-py3-none-any.whl (47.3 kB view details)

Uploaded Python 3

File details

Details for the file solrpy-2.0.3.tar.gz.

File metadata

  • Download URL: solrpy-2.0.3.tar.gz
  • Upload date:
  • Size: 40.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.12.3 Linux/6.17.0-14-generic

File hashes

Hashes for solrpy-2.0.3.tar.gz
Algorithm Hash digest
SHA256 93232b7f1a677e2e29bf5ef3d308e305bfa512dbbae301001b893783a88dfc1c
MD5 fc9ca1108a536b8ced13d3654052a279
BLAKE2b-256 c078e959f8523eba92833d3ef6d46555a7e84dcd642771fe333710e8ee70be0b

See more details on using hashes here.

File details

Details for the file solrpy-2.0.3-py3-none-any.whl.

File metadata

  • Download URL: solrpy-2.0.3-py3-none-any.whl
  • Upload date:
  • Size: 47.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.12.3 Linux/6.17.0-14-generic

File hashes

Hashes for solrpy-2.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 eb4ac1b58ce6b642fc41e01264dea2ae2a37d20ef136b30aa2fdf749d878cdfd
MD5 c538800e54ad935ef84d8ff4cfb685f4
BLAKE2b-256 845d2db2ac9b6be025c98c94194b36f4f6919a1f492fbed59e93f3047ced942e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page