OpenAlex Local

Local OpenAlex database with 284M+ scholarly works, abstracts, and semantic search.

Why OpenAlex Local?

Built for the LLM era - features that matter for AI research assistants:

| Feature | Benefit |
|---|---|
| 📚 284M Works | More coverage than CrossRef |
| 📝 Abstracts | ~45-60% availability for semantic search |
| 🏷️ Concepts & Topics | Built-in classification |
| 👤 Author Disambiguation | Linked to institutions |
| 🔓 Open Access Info | OA status and URLs |

Perfect for: RAG systems, research assistants, literature review automation.
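
Because only about half of the works carry an abstract, a RAG pipeline typically has to skip records without one. The following is a minimal sketch of that step, not part of the package: `Work` is a stand-in dataclass whose fields (`title`, `year`, `abstract`) are assumed to mirror the attributes used in the Python API examples, and `build_context` and `max_chars` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    """Stand-in for a search result; the field names are assumptions."""
    title: str
    year: Optional[int]
    abstract: Optional[str]


def build_context(works, max_chars=1500):
    """Join title/abstract snippets into one prompt context string.

    Works without an abstract are skipped, and accumulation stops once
    adding the next entry would exceed max_chars.
    """
    parts = []
    used = 0
    for i, w in enumerate(works, start=1):
        if not w.abstract:
            continue  # no abstract available for this work
        entry = f"[{i}] {w.title} ({w.year}): {w.abstract.strip()}"
        if used + len(entry) > max_chars:
            break
        parts.append(entry)
        used += len(entry)
    return "\n".join(parts)
```

The bracketed index is the position in the original result list, so a citation in the model's answer can be mapped back to the corresponding work.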

Installation
pip install openalex-local

From source:

git clone https://github.com/ywatanabe1989/openalex-local
cd openalex-local && make install

Database setup (~300 GB, ~1-2 days to build):

# Check system status
make status

# 1. Download OpenAlex Works snapshot (~300 GB)
make download-screen  # runs in background

# 2. Build SQLite database
make build-db

# 3. Build FTS5 index
make build-fts
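
To illustrate what an FTS5 index over title and abstract provides, here is a small self-contained sketch using Python's built-in sqlite3 module. The table name `works_fts`, its columns, and the helpers `fts_search` and `fts_count` are assumptions for illustration; the schema that `make build-fts` actually creates may differ. It also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

# In-memory database with a hypothetical FTS5 table; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE works_fts USING fts5(id UNINDEXED, title, abstract)")

rows = [
    ("W1", "Deep learning for protein folding", "Neural networks predict structure."),
    ("W2", "CRISPR genome editing in plants", None),  # abstract missing
    ("W3", "A survey of graph algorithms", "Shortest paths and spanning trees."),
]
conn.executemany("INSERT INTO works_fts (id, title, abstract) VALUES (?, ?, ?)", rows)


def fts_search(conn, query, limit=10):
    """Return (id, title) pairs matching the query, best match first."""
    sql = (
        "SELECT id, title FROM works_fts "
        "WHERE works_fts MATCH ? ORDER BY rank LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()


def fts_count(conn, query):
    """Return the number of rows matching the query."""
    sql = "SELECT count(*) FROM works_fts WHERE works_fts MATCH ?"
    return conn.execute(sql, (query,)).fetchone()[0]
```

With FTS5's default query syntax, several bare terms are combined with an implicit AND, and a term can match in either the title or the abstract column.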

Python API
from openalex_local import search, get, count

# Full-text search (title + abstract)
results = search("machine learning neural networks")
for work in results:
    print(f"{work.title} ({work.year})")
    abstract = work.abstract or ""  # guard: many works have no abstract
    print(f"  Abstract: {abstract[:200]}...")
    print(f"  Concepts: {[c['name'] for c in work.concepts]}")

# Get by OpenAlex ID or DOI
work = get("W2741809807")
work = get("10.1038/nature12373")

# Count matches
n = count("CRISPR")
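
Since `get` accepts either an OpenAlex ID or a DOI, callers that receive identifiers from mixed sources may want to normalize them first. The sketch below shows one way to do that; `classify_identifier` is a hypothetical helper and is not claimed to be how the package itself dispatches. It assumes OpenAlex work IDs have the form `W` followed by digits and that DOIs begin with `10.`.

```python
import re

# Patterns are assumptions about the identifier formats, not the package's own rules.
_OPENALEX_WORK = re.compile(r"^W\d+$", re.IGNORECASE)
_DOI = re.compile(r"^10\.\d{4,9}/\S+$")
_PREFIXES = ("https://openalex.org/", "https://doi.org/", "doi:")


def classify_identifier(raw: str) -> tuple[str, str]:
    """Return (kind, normalized_value), where kind is 'openalex' or 'doi'."""
    value = raw.strip()
    for prefix in _PREFIXES:
        if value.lower().startswith(prefix):
            value = value[len(prefix):]  # drop the URL or scheme prefix
            break
    if _OPENALEX_WORK.match(value):
        return "openalex", value.upper()
    if _DOI.match(value):
        return "doi", value.lower()  # DOIs are case-insensitive
    raise ValueError(f"Unrecognized identifier: {raw!r}")
```

The normalized value can then be passed to `get`, for example `get(classify_identifier(user_input)[1])`.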

CLI
openalex-local search "CRISPR genome editing" -n 5
openalex-local get W2741809807
openalex-local get 10.1038/nature12373
openalex-local count "machine learning"

Related Projects

crossref-local - Sister project with CrossRef data:

| Feature | crossref-local | openalex-local |
|---|---|---|
| Works | 167M | 284M |
| Abstracts | ~21% | ~45-60% |
| Update frequency | Real-time | Monthly |
| DOI authority | ✓ (source) | Uses CrossRef |
| Citations | Raw references | Linked works |
| Concepts/Topics | | |
| Author IDs | | |
| Best for | DOI lookup, raw refs | Semantic search |

When to use CrossRef: Real-time DOI updates, raw reference parsing, authoritative metadata.

When to use OpenAlex: Semantic search, citation analysis, topic discovery.

Data Source

Data comes from OpenAlex, an open catalog of scholarly works. The local database is updated monthly from the OpenAlex snapshot.


SciTeX
AGPL-3.0 · ywatanabe@scitex.ai
