A Graph Database Toolkit for NASA's GCN
AI4GCNpy: A Graph Database Toolkit for NASA's GCN
AI4GCNpy is a Python toolkit for building and querying a knowledge graph of astrophysical transient events from NASA's General Coordinates Network (GCN, formerly the Gamma-ray Coordinates Network). Built on LangGraph and Neo4j, it supports natural-language queries over those events.
Key capabilities:
- Automatic Information Extraction: Extract structured information from GCN circulars using LLMs.
- Knowledge Graph Construction: Convert unstructured GCN circulars into a structured Neo4j graph database.
- Intelligent Q&A System: Convert natural language questions into Cypher via an LLM, execute the graph queries in Neo4j, and generate final answers by combining structured results with relevant passages from the original GCN circulars.
- Beautiful Output: Colorful terminal output, syntax highlighting, and progress bars using the Rich library.
Quick Start
Install
Install the package from PyPI:
pip install ai4gcnpy
Or install from source for development:
git clone https://github.com/GZU-MuTian/AI4GCNpy.git
cd AI4GCNpy
pip install -e .
Set Up Neo4j (Required)
AI4GCNpy requires a locally running Neo4j instance as its graph database backend. Note that the APOC (Awesome Procedures On Cypher) plugin is required for advanced graph operations.
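Before running any graph operation, it can help to confirm that the Neo4j Bolt port is reachable. The following is a minimal sketch using only the Python standard library. The helper names bolt_endpoint and is_reachable are hypothetical and not part of ai4gcnpy, and the default port 7687 is an assumption based on Neo4j's usual Bolt port. It checks TCP connectivity only; it does not verify credentials or the APOC plugin.

```python
import socket
from urllib.parse import urlparse


def bolt_endpoint(url: str, default_port: int = 7687) -> tuple[str, int]:
    """Split a Bolt-style URL such as bolt://localhost:7687 into (host, port)."""
    parsed = urlparse(url)
    if not parsed.hostname:
        raise ValueError(f"Cannot find a host name in {url!r}")
    # Fall back to the assumed default port when the URL omits one.
    return parsed.hostname, parsed.port or default_port


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = bolt_endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, is_reachable("bolt://localhost:7687") should return True once the local instance is up. To check APOC itself, running RETURN apoc.version() in Neo4j Browser or cypher-shell is one option, assuming the plugin is installed.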
Environment Setup (Recommended)
To streamline usage and avoid repetitive CLI flags, we recommend configuring environment variables. This approach simplifies command execution and enhances security by avoiding credentials in command history.
# LLM Configuration (required)
DEEPSEEK_API_KEY="your-deepseek-api-key-here"
# Neo4j Configuration (required for graph operations)
NEO4J_URL="bolt://localhost:7687"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="your-neo4j-password"
# Optional: Set default output directory
GCN_OUTPUT_PATH="./gcn_data"
You may also pass these values directly via CLI flags (e.g., --url, --username, --password).
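If you call the Python API from your own scripts, a fail-early check on these variables can save a confusing connection error later. This is a minimal sketch, assuming the variables are already exported in the shell; Neo4jSettings and load_neo4j_settings are hypothetical names for illustration and are not part of ai4gcnpy.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Neo4jSettings:
    url: str
    username: str
    password: str


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> Neo4jSettings:
    """Read the NEO4J_* variables and fail early if any is missing or empty."""
    env = os.environ if env is None else env
    names = ("NEO4J_URL", "NEO4J_USERNAME", "NEO4J_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Neo4jSettings(env["NEO4J_URL"], env["NEO4J_USERNAME"], env["NEO4J_PASSWORD"])
```

Accepting an optional mapping instead of always reading os.environ keeps the function easy to test with a plain dictionary.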
Usage Guide
Core Functions
The ai4gcnpy package provides three core functions for processing NASA GCN data:
from ai4gcnpy import gcn_extractor, gcn_builder, gcn_graphrag
These functions form a complete pipeline: Extract → Build → Query, enabling structured knowledge extraction, graph population, and natural-language question answering.
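The three stages can be chained in one helper. The sketch below is an illustration, not code from the package: run_pipeline is a hypothetical name, the three functions are passed in as arguments so the flow can be tried with stand-ins, and it assumes the extractor accepts just input_file, model, and model_provider, with its other parameters left at their defaults.

```python
import json
from typing import Any, Callable, Optional


def run_pipeline(
    circular_path: str,
    json_path: str,
    question: str,
    extractor: Callable[..., Any],
    builder: Callable[..., Any],
    graphrag: Callable[..., Any],
    model: str = "deepseek-chat",
    model_provider: str = "deepseek",
    database: str = "neo4j",
) -> Optional[Any]:
    """Extract one circular, load it into the graph, then ask one question."""
    # Step 1: extract structured data from the circular text.
    result = extractor(input_file=circular_path, model=model, model_provider=model_provider)
    if not result:
        return None  # Nothing was extracted, so skip the later steps.

    # Step 2: save the result as JSON, then hand that file to the builder.
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=2, ensure_ascii=False)
    builder(json_file=json_path, database=database)

    # Step 3: query the populated graph in natural language.
    return graphrag(
        query_text=question, model=model, model_provider=model_provider, database=database
    )
```

With the real package, the call would pass extractor=gcn_extractor, builder=gcn_builder, and graphrag=gcn_graphrag.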
- Extract structured data from GCN circulars:
from ai4gcnpy import gcn_extractor
import json

# Extract information from a single GCN circular
result = gcn_extractor(
    input_file="path/to/gcn_circular.txt",
    model="deepseek-chat",
    model_provider="deepseek",
    temperature=0.7,
    max_tokens=4000,
    reasoning=True
)

# Save the extracted data
if result:
    output_file = "path/to/extracted_data.json"
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=2, ensure_ascii=False)
The model and model_provider parameters are passed to LangChain's unified chat model initializer, langchain.chat_models.init_chat_model. This allows ai4gcnpy to support multiple LLM providers (e.g., DeepSeek, OpenAI, Anthropic) through a consistent interface, while abstracting provider-specific setup details.
- Populate Neo4j with extracted data:
from ai4gcnpy import gcn_builder
gcn_builder(
    json_file="path/to/extracted_data.json",
    database="neo4j"  # Optional: specify database (defaults to 'neo4j')
)
- Ask questions using natural language:
from ai4gcnpy import gcn_graphrag
response = gcn_graphrag(
    query_text="Your question here",
    model="deepseek-chat",
    model_provider="deepseek",
    database="neo4j"
)
print(f"Query: {response.get('query')}")
print(f"Cypher: {response.get('cypher_statement')}")
print(f"Data Sources: {response.get('retrieved_chunks')}")
print(f"Final Answer: {response.get('answer')}")
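When the response is logged or written to a report, a small formatter that tolerates missing keys can be handy. This is a sketch only: format_response is a hypothetical helper, and it assumes the response behaves like a dictionary with the four keys used in the print statements above.

```python
from typing import Any, Mapping


def format_response(response: Mapping[str, Any]) -> str:
    """Render the four response fields as a labelled, multi-line string."""
    fields = [
        ("Query", "query"),
        ("Cypher", "cypher_statement"),
        ("Data Sources", "retrieved_chunks"),
        ("Final Answer", "answer"),
    ]
    lines = []
    for label, key in fields:
        value = response.get(key)
        # Show a placeholder instead of "None" when a field is absent.
        lines.append(f"{label}: {value if value is not None else '(not returned)'}")
    return "\n".join(lines)
```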
Command-Line Interface
For rapid prototyping or batch workflows, ai4gcnpy includes a CLI named gcn-cli. It uses the same core functions as the Python API, so behavior is consistent across both interfaces.
🔧 Tip: Run gcn-cli --help for an overview, or gcn-cli <command> --help for command-specific options.
Basic Commands:
# Extract information from GCN circulars
gcn-cli extractor path/to/gcn_circular.txt
# Batch extract from multiple files
gcn-cli batch_extractor --input path/to/circulars_directory/ --output path/to/extracted_data_directory/
# Build graph
gcn-cli builder path/to/extracted_data_directory/
# Ask a question
gcn-cli query "Your question here"
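The batch step can also be approximated from Python by looping over a directory. The sketch below is not the CLI's actual implementation: batch_extract is a hypothetical name, the *.txt pattern and the one-JSON-file-per-circular naming are assumptions, and the extractor is passed in as an argument so any callable that accepts input_file will work.

```python
import json
from pathlib import Path
from typing import Any, Callable


def batch_extract(
    input_dir: str,
    output_dir: str,
    extractor: Callable[..., Any],
    pattern: str = "*.txt",
) -> list[Path]:
    """Run the extractor on each matching file and write one JSON file per result."""
    out_root = Path(output_dir)
    out_root.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(input_dir).glob(pattern)):
        result = extractor(input_file=str(src))
        if not result:
            continue  # Skip circulars that yielded no structured data.
        dst = out_root / f"{src.stem}.json"
        with open(dst, "w", encoding="utf-8") as f:
            json.dump(result, f, indent=2, ensure_ascii=False)
        written.append(dst)
    return written
```

To fix the model settings once, the extractor can be wrapped first, for example with functools.partial(gcn_extractor, model="deepseek-chat", model_provider="deepseek").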
Adjust verbosity for debugging or quiet runs:
# Production - errors only
gcn-cli --log-level ERROR query "Your question here"
# Short form for debugging
gcn-cli -v DEBUG query "Your question here"
Project Structure
ai4gcnpy/
├── agents.py # LangGraph agents for complex workflows
├── chains.py # LangChain chains for LLM interactions
├── cli.py # Command-line interface built with Typer
├── core.py # Core functions
├── db_client.py # Neo4j database connector
├── llm_client.py # Unified LLM provider interface
└── utils.py # Utility functions (e.g., download_gcn_archive)
Related Resources
- NASA GCN Archive: https://gcn.nasa.gov/circulars/
- Neo4j Documentation: https://neo4j.com/docs/
- Neo4j Linux installation: https://neo4j.com/docs/operations-manual/current/installation/linux/debian/
- LangGraph Guide: https://docs.langchain.com/
Contact
For questions and support:
- Author: Yu Liu
- Email: yuliu@gzu.edu.cn
- Repository: https://github.com/GZU-MuTian/AI4GCNpy
Astronomical Discovery, Powered by AI! 🔭✨