Local OpenAlex database with 284M+ works, abstracts, and semantic search
OpenAlex Local
Local OpenAlex database with 284M+ scholarly works, abstracts, and semantic search.
Why OpenAlex Local?
Built for the LLM era, with the features that matter for AI research assistants:
| Feature | Benefit |
|---|---|
| 📚 284M Works | More coverage than CrossRef |
| 📝 Abstracts | ~45-60% availability for semantic search |
| 🏷️ Concepts & Topics | Built-in classification |
| 👤 Author Disambiguation | Linked to institutions |
| 🔓 Open Access Info | OA status and URLs |
Perfect for: RAG systems, research assistants, literature review automation.
Installation
pip install openalex-local
From source:
git clone https://github.com/ywatanabe1989/openalex-local
cd openalex-local && make install
Database setup (~300 GB, ~1-2 days to build):
# Check system status
make status
# 1. Download OpenAlex Works snapshot (~300 GB)
make download-screen # runs in background
# 2. Build SQLite database
make build-db
# 3. Build FTS5 index
make build-fts
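To give a feel for what step 3 produces, here is a minimal sketch of an FTS5 index over titles and abstracts, using Python's built-in `sqlite3` module. The table names, columns, and the `fts_search` helper are hypothetical and chosen for illustration only; the schema that `make build-fts` actually creates may differ. It also assumes your SQLite build has FTS5 enabled.

```python
import sqlite3

# Hypothetical schema for illustration; the real database layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works (id TEXT PRIMARY KEY, title TEXT, abstract TEXT)")
conn.executemany(
    "INSERT INTO works VALUES (?, ?, ?)",
    [
        ("W1", "Deep learning for protein folding", "Neural networks predict structure."),
        ("W2", "CRISPR genome editing in plants", None),  # no abstract available
        ("W3", "A survey of graph neural networks", "We review neural message passing."),
    ],
)

# Full-text index over title and abstract; the id column is stored but not indexed.
conn.execute("CREATE VIRTUAL TABLE works_fts USING fts5(id UNINDEXED, title, abstract)")
conn.execute("INSERT INTO works_fts (id, title, abstract) SELECT id, title, abstract FROM works")


def fts_search(conn, query, limit=10):
    """Return (id, title) pairs matching the query, best match first."""
    rows = conn.execute(
        "SELECT id, title FROM works_fts WHERE works_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


for work_id, title in fts_search(conn, "neural networks"):
    print(work_id, title)
```

By default FTS5 requires every query term to appear somewhere in the row, so a work with a missing abstract can still match on its title alone.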
Python API
from openalex_local import search, get, count
# Full-text search (title + abstract)
results = search("machine learning neural networks")
for work in results:
    print(f"{work.title} ({work.year})")
    # Abstracts are not available for every work, so guard against a missing one
    print(f"  Abstract: {(work.abstract or '')[:200]}...")
    print(f"  Concepts: {[c['name'] for c in work.concepts]}")
# Get by OpenAlex ID or DOI
work = get("W2741809807")
work = get("10.1038/nature12373")
# Count matches
n = count("CRISPR")
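For RAG-style use, search results usually need to be turned into a compact context block. The sketch below shows one way to do that, skipping works that have no abstract. `WorkStub` and `build_context` are hypothetical names introduced here so the example runs without the database. They are not part of `openalex_local`, and the sketch only assumes that results expose `title`, `year`, and `abstract` as in the example above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class WorkStub:
    """Hypothetical stand-in with the attributes used in the examples above."""
    title: str
    year: Optional[int]
    abstract: Optional[str]


def build_context(works: Iterable, max_chars: int = 500) -> str:
    """Format works that have an abstract into a numbered context block."""
    entries = []
    for work in works:
        if not work.abstract:  # skip works without an abstract
            continue
        snippet = work.abstract[:max_chars]
        entries.append(f"[{len(entries) + 1}] {work.title} ({work.year})\n{snippet}")
    return "\n\n".join(entries)


works = [
    WorkStub("Paper A", 2020, "Alpha text about genome editing."),
    WorkStub("Paper B", 2021, None),
    WorkStub("Paper C", 2022, "Gamma text about neural networks."),
]
print(build_context(works, max_chars=80))
```

With a built database, the list of stubs would presumably be replaced by the results of `search(...)`.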
CLI
openalex-local search "CRISPR genome editing" -n 5
openalex-local get W2741809807
openalex-local get 10.1038/nature12373
openalex-local count "machine learning"
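Both `get` forms above take a single string that is either an OpenAlex work ID or a DOI. If you need to tell the two apart in your own code, for example to validate input before a lookup, a rough sketch could look like this. The `classify_identifier` helper and its patterns are assumptions made for illustration; how `openalex-local` itself distinguishes identifiers is not documented here.

```python
import re

# Assumed patterns: OpenAlex work IDs look like "W" plus digits,
# DOIs start with "10.", a registrant code, and a slash.
OPENALEX_WORK_ID = re.compile(r"^W\d+$", re.IGNORECASE)
DOI = re.compile(r"^10\.\d{4,9}/\S+$")
URL_PREFIXES = ("https://openalex.org/", "https://doi.org/")


def classify_identifier(value: str) -> str:
    """Return 'openalex', 'doi', or 'unknown' for a lookup string."""
    value = value.strip()
    for prefix in URL_PREFIXES:
        if value.lower().startswith(prefix):
            value = value[len(prefix):]  # drop the resolver URL, keep the bare ID
    if OPENALEX_WORK_ID.match(value):
        return "openalex"
    if DOI.match(value):
        return "doi"
    return "unknown"


for candidate in ["W2741809807", "10.1038/nature12373", "CRISPR"]:
    print(candidate, "->", classify_identifier(candidate))
```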
Related Projects
crossref-local - Sister project with CrossRef data:
| Feature | crossref-local | openalex-local |
|---|---|---|
| Works | 167M | 284M |
| Abstracts | ~21% | ~45-60% |
| Update frequency | Real-time | Monthly |
| DOI authority | ✓ (source) | Uses CrossRef |
| Citations | Raw references | Linked works |
| Concepts/Topics | ❌ | ✓ |
| Author IDs | ❌ | ✓ |
| Best for | DOI lookup, raw refs | Semantic search |
When to use CrossRef: Real-time DOI updates, raw reference parsing, authoritative metadata.
When to use OpenAlex: Semantic search, citation analysis, topic discovery.
Data Source
Data comes from OpenAlex, an open catalog of scholarly works, and is updated monthly from the OpenAlex snapshot.
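One detail of the upstream data worth knowing: OpenAlex ships abstracts as an inverted index (the `abstract_inverted_index` field, mapping each word to its positions) rather than as plain text. The sketch below shows how plain text can be rebuilt from that structure. The `reconstruct_abstract` helper is a hypothetical name, and whether this project's build step works exactly this way is an assumption.

```python
from typing import Dict, List, Optional


def reconstruct_abstract(inverted_index: Optional[Dict[str, List[int]]]) -> Optional[str]:
    """Rebuild abstract text from a word -> positions mapping."""
    if not inverted_index:
        return None  # many works have no abstract at all
    positions = []
    for word, indexes in inverted_index.items():
        for index in indexes:
            positions.append((index, word))
    positions.sort()  # order words by their position in the text
    return " ".join(word for _, word in positions)


example = {"Deep": [0], "learning": [1, 4], "is": [2], "representation": [3]}
print(reconstruct_abstract(example))
```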