
Pipeline for querying NASA ADS and turning its publication metadata into curated, analysis-ready datasets, topic maps, and citation networks.


ads-bib

Python 3.12 | License: MIT | Docs | Open in Colab

ads-bib takes a NASA ADS search query and produces a normalized, curated dataset with optional author name disambiguation (AND, via ads-and), topic models (via BERTopic or Toponymy), and citation networks ready for tools such as Gephi, CiteSpace, or VOSviewer. The pipeline runs locally or via API.

Installation

Use uv and Python 3.12.

uv pip install ads-bib
# or: pip install ads-bib

Quick Start

Create a .env file in your project root with the relevant API keys.

ADS_TOKEN=your-ads-token           # required
OPENROUTER_API_KEY=your-key        # only for the openrouter road
HF_TOKEN=your-key                  # only for the huggingface road
MODAL_TOKEN_ID=your-modal-id       # only for AND with backend=modal
MODAL_TOKEN_SECRET=your-modal-secret

Where to get the keys: ADS user token settings | OpenRouter Keys | Hugging Face Access Tokens | Modal.
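Before starting a run, it can help to confirm that the .env file actually contains the required token. The following is a minimal sketch using only the standard library; `parse_env_text` and `missing_keys` are hypothetical helpers written for this illustration, not part of ads-bib, and the parsing rules are a simplification that may differ from how ads-bib itself loads the file.

```python
from pathlib import Path

# ADS_TOKEN is the only key marked "required" in the .env example above.
REQUIRED_KEYS = ("ADS_TOKEN",)


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines; skip blank lines and full-line comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing comment that follows the value after a space.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env_text(env_path.read_text()) if env_path.is_file() else {}
    print("missing keys:", missing_keys(env))
```

An empty list means the required token is present; the optional keys only matter for the road or backend named in their comments.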

Then run in your terminal:

ads-bib run --preset openrouter --set search.query='author:"Hawking, S*"'

Author name disambiguation is off by default. Enable the local CPU/GPU path with --set author_disambiguation.enabled=true; use --set author_disambiguation.backend=modal only when your Modal credentials are configured.
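When several --set overrides are combined, building the argument list in Python avoids shell-quoting mistakes around the ADS query. This sketch uses only the `run` subcommand and the `--preset` and `--set` flags shown above; `build_run_command` is a hypothetical helper for illustration, not an ads-bib function.

```python
import shlex


def build_run_command(preset: str, overrides: dict[str, str]) -> list[str]:
    """Build an argument list for `ads-bib run` with one --set per override."""
    command = ["ads-bib", "run", "--preset", preset]
    for key, value in overrides.items():
        command += ["--set", f"{key}={value}"]
    return command


command = build_run_command(
    "openrouter",
    {
        "search.query": 'author:"Hawking, S*"',
        "author_disambiguation.enabled": "true",
    },
)

# shlex.join produces a copy-pasteable, correctly quoted shell line.
print(shlex.join(command))
```

The list can be passed directly to `subprocess.run(command, check=True)`, in which case no shell quoting is involved at all.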

Full setup details: Get Started | Runtime Roads

Python API

import ads_bib

ads_bib.run(
    preset="openrouter",
    query='author:"Hawking, S*"',
)

More examples and the NotebookSession interface: Python API docs

Pick a Runtime Road

Road        Hardware       Network               Cost
openrouter  any            API calls             pay-per-token
hf_api      any            API calls             HF-plan-dependent
local_cpu   CPU only       model downloads only  free after setup
local_gpu   NVIDIA + CUDA  model downloads only  free after setup

Full provider matrix and first-run behavior: Runtime Roads
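The table can be read as a set of prerequisites per road. The sketch below lists which roads are usable given the keys in the environment and whether an NVIDIA GPU with CUDA is present. `available_roads` is a hypothetical helper, and the mapping of HF_TOKEN to the hf_api road is an assumption based on the .env comments in Quick Start. ADS_TOKEN is needed for every road and is not modeled here.

```python
# Road name -> environment variable it needs (None means no API key).
ROAD_REQUIREMENTS = {
    "openrouter": "OPENROUTER_API_KEY",
    "hf_api": "HF_TOKEN",  # assumption: the "huggingface road" is hf_api
    "local_cpu": None,
    "local_gpu": None,
}


def available_roads(env: dict[str, str], has_cuda_gpu: bool) -> list[str]:
    """Return the roads whose prerequisites are met, in table order."""
    roads = []
    for road, required_key in ROAD_REQUIREMENTS.items():
        if required_key is not None and not env.get(required_key):
            continue  # API road without its key
        if road == "local_gpu" and not has_cuda_gpu:
            continue  # GPU road without NVIDIA + CUDA
        roads.append(road)
    return roads
```

For example, `available_roads(dict(os.environ), has_cuda_gpu=False)` (after `import os`) would list the roads usable on a CPU-only machine with the current environment.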

Output

Each run produces a self-contained output directory:

  • publications.parquet — cleaned, translated, topic-labeled publications, with disambiguated authors when AND is enabled
  • references.parquet — normalized cited-reference metadata, with disambiguated authors when AND is enabled
  • topic_info.parquet — one row per topic with labels, counts, and representation fields
  • topic_map.html — interactive topic visualization (open in any browser), using datamapplot
  • .gexf citation networks — direct citation, co-citation, bibliographic coupling, author co-citation
  • download_wos_export.txt — Web of Science format for e.g. CiteSpace / VOSviewer
  • run_summary.yaml — full run metadata and stage timings
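A quick way to verify a finished run is to compare the files in the output directory against the list above. This is a minimal sketch with the standard library; `classify_outputs` and `inventory` are hypothetical helpers, and the search is recursive because the exact directory layout is not specified here.

```python
from pathlib import Path

# File names taken from the output list above; .gexf files are matched by suffix.
EXPECTED_FILES = (
    "publications.parquet",
    "references.parquet",
    "topic_info.parquet",
    "topic_map.html",
    "download_wos_export.txt",
    "run_summary.yaml",
)


def classify_outputs(file_names) -> dict[str, list[str]]:
    """Split file names into expected-present, expected-missing, and networks."""
    names = set(file_names)
    return {
        "present": [name for name in EXPECTED_FILES if name in names],
        "missing": [name for name in EXPECTED_FILES if name not in names],
        "networks": sorted(name for name in names if name.endswith(".gexf")),
    }


def inventory(run_dir) -> dict[str, list[str]]:
    """Classify every file found anywhere below run_dir."""
    files = (path.name for path in Path(run_dir).rglob("*") if path.is_file())
    return classify_outputs(files)
```

Calling `inventory("path/to/run")` on a real output directory returns the same three lists. The Parquet files themselves can then be opened with any Parquet reader, for example `pandas.read_parquet`.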

Figure: Interactive topic map output from author:"Hawking, S*" in datamapplot.

Figure: Author co-citation network output from author:"Hawking, S*" in Gephi Lite.
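GEXF is an XML format, so a network's size can be checked without opening Gephi. The sketch below counts node and edge elements with the standard library; `summarize_gexf` and `summarize_gexf_file` are hypothetical helpers, and the code matches tags by local name so that it does not depend on which GEXF namespace version a file declares.

```python
import xml.etree.ElementTree as ET


def _local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def _count(root: ET.Element) -> dict[str, int]:
    """Count <node> and <edge> elements anywhere below root."""
    counts = {"nodes": 0, "edges": 0}
    for element in root.iter():
        name = _local_name(element.tag)
        if name == "node":
            counts["nodes"] += 1
        elif name == "edge":
            counts["edges"] += 1
    return counts


def summarize_gexf(xml_text: str) -> dict[str, int]:
    """Summarize a GEXF document given as a string."""
    return _count(ET.fromstring(xml_text))


def summarize_gexf_file(path) -> dict[str, int]:
    """Summarize a GEXF file on disk."""
    return _count(ET.parse(path).getroot())
```

For analysis beyond simple counts, a graph library that reads GEXF (for example `networkx.read_gexf`) is likely a better fit.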

File details

Details for the file ads_bib-0.1.1.tar.gz.

File metadata

  • Download URL: ads_bib-0.1.1.tar.gz
  • Upload date:
  • Size: 410.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ads_bib-0.1.1.tar.gz
Algorithm    Hash digest
SHA256       7dbbb0daf584f19ba96dc312c515618c7659d6cd647bf9e7c0b6803b08c5b301
MD5          48496a8fa3fa2bd08e4d853200e7c787
BLAKE2b-256  5b32af20d9fa54ab0548346170ed2730e87b9c8188a42c331a310eff019357bd


Provenance

The following attestation bundles were made for ads_bib-0.1.1.tar.gz:

Publisher: release.yml on raphschlatt/ads-bib

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ads_bib-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: ads_bib-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 160.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ads_bib-0.1.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       4bd91f7b8aee20a262d5d9be52c961a95f8d407ddfb2cce353378cd4a51c7b17
MD5          930ff72fe7d863bc0e5c28e9137bd722
BLAKE2b-256  a9b9774b281099f9fccf5646d6d190224f32186ae4e8fbee5bf58cd5b44b62ad


Provenance

The following attestation bundles were made for ads_bib-0.1.1-py3-none-any.whl:

Publisher: release.yml on raphschlatt/ads-bib

