Pipeline for querying NASA ADS publication metadata and turning it into curated, analysis-ready datasets, topic maps, and citation networks.

ads-bib

Python 3.12 | License: MIT | Docs | Open in Colab

ads-bib takes a NASA ADS search query and produces a normalized, curated dataset with disambiguated author names (author name disambiguation, AND, via ads-and), topic models (via BERTopic or Toponymy), and citation networks ready for tools such as Gephi, CiteSpace, or VOSviewer. The pipeline can run on local hardware or through hosted APIs.

Installation

Use uv and Python 3.12.

uv pip install ads-bib
# or: pip install ads-bib

Quick Start

Create a .env file in your project root with the relevant API keys.

ADS_TOKEN=your-ads-token           # required
OPENROUTER_API_KEY=your-key        # only for the openrouter road
HF_TOKEN=your-key                  # only for the huggingface road
MODAL_TOKEN_ID=your-modal-id       # only for AND with backend=modal
MODAL_TOKEN_SECRET=your-modal-secret
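
Before a long run it can help to confirm that the keys you need are actually present. The following is a minimal pre-flight sketch, not part of ads-bib: the key names are taken from the example above, while `parse_env`, `missing_keys`, and the grouping names in `OPTIONAL_KEYS` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Key names come from the example .env above; the grouping by road/backend
# follows the comments there. This check is an illustration, not ads-bib code.
REQUIRED_KEYS = {"ADS_TOKEN"}
OPTIONAL_KEYS = {
    "openrouter": {"OPENROUTER_API_KEY"},
    "huggingface": {"HF_TOKEN"},
    "modal": {"MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"},
}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value   # required".
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def missing_keys(env: dict[str, str], extras: tuple[str, ...] = ()) -> list[str]:
    """Return the keys that are absent or empty for the chosen extras."""
    wanted = set(REQUIRED_KEYS)
    for name in extras:
        wanted |= OPTIONAL_KEYS[name]
    return sorted(k for k in wanted if not env.get(k))


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    print("missing:", missing_keys(env, extras=("openrouter",)))
```

An empty list means every key needed for the chosen extras has a non-empty value.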

Where to get the keys: ADS user token settings | OpenRouter Keys | Hugging Face Access Tokens | Modal.

Then run in your terminal:

ads-bib run --preset openrouter --set search.query='author:"Hawking, S*"'

Author name disambiguation is off by default. Enable the local CPU/GPU path with --set author_disambiguation.enabled=true; use --set author_disambiguation.backend=modal only when your Modal credentials are configured.
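
When several overrides are needed, shell quoting of the query string becomes error-prone. The sketch below assembles the command as an argument list in Python, assuming that `--set` may be repeated in a single invocation (the examples above show each override on its own). `build_run_command` is a hypothetical helper, not part of ads-bib.

```python
import shlex


def build_run_command(preset: str, overrides: dict[str, str]) -> list[str]:
    """Build an argv list for `ads-bib run` from a preset and --set overrides."""
    argv = ["ads-bib", "run", "--preset", preset]
    for key, value in overrides.items():
        # Each override becomes one `--set key=value` pair, as in the examples above.
        argv += ["--set", f"{key}={value}"]
    return argv


cmd = build_run_command(
    "openrouter",
    {
        "search.query": 'author:"Hawking, S*"',
        "author_disambiguation.enabled": "true",
    },
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
# To execute it instead of printing: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting altogether, since each argument is handed to the program unchanged.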

Full setup details: Get Started | Runtime Roads

Python API

import ads_bib

ads_bib.run(
    preset="openrouter",
    query='author:"Hawking, S*"',
)

More examples and the NotebookSession interface: Python API docs

Pick a Runtime Road

Road         Hardware        Network                 Cost
openrouter   any             API calls               pay-per-token
hf_api       any             API calls               HF-plan-dependent
local_cpu    CPU only        model downloads only    free after setup
local_gpu    NVIDIA + CUDA   model downloads only    free after setup
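
One way to read the table is as a simple decision rule. The sketch below encodes that reading. It is a hypothetical heuristic, not something ads-bib does, and it assumes that `OPENROUTER_API_KEY` maps to the openrouter road and `HF_TOKEN` to the hf_api road.

```python
import os
import shutil


def suggest_road(env: dict[str, str], has_nvidia_gpu: bool, prefer_local: bool = False) -> str:
    """Suggest a road name from the table above (hypothetical heuristic)."""
    if prefer_local or not (env.get("OPENROUTER_API_KEY") or env.get("HF_TOKEN")):
        # No API key, or local preferred: only model downloads touch the network.
        return "local_gpu" if has_nvidia_gpu else "local_cpu"
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"  # pay-per-token API calls
    return "hf_api"  # cost depends on the Hugging Face plan


if __name__ == "__main__":
    # Treat nvidia-smi on PATH as a rough proxy for NVIDIA + CUDA being available.
    gpu = shutil.which("nvidia-smi") is not None
    print(suggest_road(dict(os.environ), has_nvidia_gpu=gpu))
```

The suggested name can then be passed as the `--preset` value, on the assumption that preset names match the road names in the table.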

Full provider matrix and first-run behavior: Runtime Roads

Output

Each run produces a self-contained output directory:

  • publications.parquet — cleaned, translated, topic-labeled publications, with disambiguated authors when AND is enabled
  • references.parquet — normalized cited-reference metadata, with disambiguated authors when AND is enabled
  • topic_info.parquet — one row per topic with labels, counts, and representation fields
  • topic_map.html — interactive topic visualization (open in any browser), using datamapplot
  • .gexf citation networks — direct citation, co-citation, bibliographic coupling, author co-citation
  • download_wos_export.txt — Web of Science format for e.g. CiteSpace / VOSviewer
  • run_summary.yaml — full run metadata and stage timings
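
A quick sanity check after a run is to confirm that the listed files exist and that the citation networks are not empty. The following sketch uses only the standard library and assumes all files sit at the top level of the output directory. The `.gexf` file names are not given above, so they are discovered with a glob. `summarize_run` and its helpers are hypothetical, not part of ads-bib.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# File names come from the list above; the .gexf names are not given there,
# so they are discovered with a glob instead.
EXPECTED_FILES = [
    "publications.parquet",
    "references.parquet",
    "topic_info.parquet",
    "topic_map.html",
    "download_wos_export.txt",
    "run_summary.yaml",
]


def local_name(tag: str) -> str:
    """Strip an XML namespace such as '{http://gexf.net/1.3}node' -> 'node'."""
    return tag.rsplit("}", 1)[-1]


def gexf_size(path: Path) -> tuple[int, int]:
    """Count <node> and <edge> elements in a GEXF file."""
    nodes = edges = 0
    for element in ET.parse(path).iter():
        name = local_name(element.tag)
        if name == "node":
            nodes += 1
        elif name == "edge":
            edges += 1
    return nodes, edges


def summarize_run(run_dir: Path) -> dict:
    """Report missing expected files and the size of each citation network."""
    return {
        "missing": [f for f in EXPECTED_FILES if not (run_dir / f).exists()],
        "networks": {p.name: gexf_size(p) for p in sorted(run_dir.glob("*.gexf"))},
    }
```

Calling `summarize_run(Path("path/to/run"))` returns the missing file names and a `(nodes, edges)` pair per network. The Parquet files can be opened separately with a Parquet-capable library such as pandas.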

Figure: Interactive topic map from author:"Hawking, S*", rendered with datamapplot.

Figure: Author co-citation network from author:"Hawking, S*", shown in Gephi Lite.

Download files

Source Distribution

ads_bib-0.1.0.tar.gz (409.9 kB)

Built Distribution

ads_bib-0.1.0-py3-none-any.whl (160.4 kB)

File details

Details for the file ads_bib-0.1.0.tar.gz.

File metadata

  • Download URL: ads_bib-0.1.0.tar.gz
  • Size: 409.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ads_bib-0.1.0.tar.gz
Algorithm Hash digest
SHA256 ad596890e910122bb2a739f2867d449aebadd067000d6fa944731ceee7b07ce5
MD5 125836ec053e22dc6e10dd72f77f030f
BLAKE2b-256 c86e029077d4cb6efac4d5d2f51589bb18906cfd2e8ab356934a2c9b36a9fded

Provenance

The following attestation bundles were made for ads_bib-0.1.0.tar.gz:

Publisher: release.yml on raphschlatt/ads-bib

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ads_bib-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ads_bib-0.1.0-py3-none-any.whl
  • Size: 160.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ads_bib-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4421febec4d43b0f08670a5b565c6f953774adf4cc78f12ff9a565e0899562c0
MD5 3d7b7eb8ab9e453a08a35c8d533422df
BLAKE2b-256 e3faee36ad47f23425ec7f19cac8657555d81a8d3f72a446db88765f15353351

Provenance

The following attestation bundles were made for ads_bib-0.1.0-py3-none-any.whl:

Publisher: release.yml on raphschlatt/ads-bib

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
