Library that facilitates the creation of semantic layer metadata graph in Neo4j.

These details have not been verified by PyPI

Project description

Neocarta

An end-to-end library for generating metadata knowledge graphs in Neo4j for query generation and routing workflows.

Note: This library is not a Neo4j product. It is a Neo4j Labs project supported by the Neo4j field team.

Installation

pip install neocarta

Requires Python 3.10 or higher and a running Neo4j instance. Options:

Neo4j AuraDB — managed cloud (free tier available)
Neo4j Desktop — local GUI-based instance
Docker image — lightweight local instance

Metadata Graph

The metadata graph has the following schema. All connectors must convert their schema information to this graph schema to be compatible with the provided MCP server and ingestion tooling.

---
config:
    layout: elk
---
graph LR
%% Nodes
Database("Database<br/>id: STRING | KEY<br/>name: STRING<br/>description: STRING<br/>embedding: VECTOR")
Schema("Schema<br/>id: STRING | KEY<br/>name: STRING<br/>description: STRING<br/>embedding: VECTOR")
Table("Table<br/>id: STRING | KEY<br/>name: STRING<br/>description: STRING<br/>embedding: VECTOR")
Column("Column<br/>id: STRING | KEY<br/>name: STRING<br/>description: STRING<br/>embedding: VECTOR<br/>type: STRING<br/>nullable: BOOLEAN<br/>isPrimaryKey: BOOLEAN<br/>isForeignKey: BOOLEAN")
Value("Value<br/>id: STRING | KEY<br/>value: STRING")

%% Relationships
Database -->|HAS_SCHEMA| Schema
Schema -->|HAS_TABLE| Table
Table -->|HAS_COLUMN| Column
Column -->|REFERENCES| Column
Column -->|HAS_VALUE| Value


%% Styling
classDef node_0_color fill:#e3f2fd,stroke:#1976d2,stroke-width:3px,color:#000,font-size:12px
class Database node_0_color

classDef node_1_color fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000,font-size:12px
class Schema node_1_color

classDef node_2_color fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000,font-size:12px
class Table node_2_color

classDef node_3_color fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000,font-size:12px
class Column node_3_color

classDef node_4_color fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#000,font-size:12px
class Value node_4_color

Nodes

Database
Schema
Table
Column
Value

Relationships

(:Database)-[:HAS_SCHEMA]->(:Schema)
(:Schema)-[:HAS_TABLE]->(:Table)
(:Table)-[:HAS_COLUMN]->(:Column)
(:Column)-[:HAS_VALUE]->(:Value)
(:Column)-[:REFERENCES]->(:Column)

Graph Generation

This project provides connector classes that organize the ETL process into reusable components:

Extractors - Connect to source data and read metadata tables
Transformers - Transform metadata into defined Neo4j schema
Loaders - Ingest transformed data into Neo4j
Connectors - Orchestrate the extract, transform, and load process

Each connector is implemented as a class that encapsulates its extractor, transformer, and loader components, providing a clean interface for metadata ingestion.

Connector Class Architecture

All connectors follow a consistent class-based architecture:

class Connector:
    def __init__(self, clients, config):
        # Initialize with required clients and configuration
        self.extractor = Extractor(...)
        self.transformer = Transformer(...)
        self.loader = Loader(...)

    def extract_metadata(self):
        # Extract data from source and cache
        pass

    def transform_metadata(self):
        # Transform extracted data to graph schema and cache
        pass

    def load_metadata(self):
        # Load transformed data into Neo4j
        pass

    def run(self):
        # Orchestrate the full ETL pipeline
        self.extract_metadata()
        self.transform_metadata()
        self.load_metadata()

Connectors

BigQuery Schema Connector

Connector for reading BigQuery Information Schema tables and ingesting schema metadata into Neo4j. Primary and foreign keys must be defined in the Information Schema tables in order for column level relationships to be created in the Neo4j graph.

What it extracts:

Database (GCP project)
Schemas (datasets)
Tables with descriptions
Columns with types, constraints, descriptions
Column references (foreign keys)
Column unique values (sample data)

This connector requires the following variables to be set in the .env file:

NEO4J_USERNAME=neo4j-username
NEO4J_PASSWORD=neo4j-password
NEO4J_URI=neo4j-uri
NEO4J_DATABASE=neo4j-database
GCP_PROJECT_ID=project-id
BIGQUERY_DATASET_ID=dataset-id

BigQuery Logs Connector

Connector for extracting query logs from BigQuery INFORMATION_SCHEMA.JOBS_BY_PROJECT, parsing SQL queries to understand table and column usage, and loading query patterns into Neo4j.

What it extracts:

SQL Queries
Tables and columns referenced in queries (discovered via SQL parsing)
Join relationships between tables (from SQL JOINs)
Query-to-table and query-to-column usage relationships

Graph schema additions:

Query nodes with properties:
- content - The query text
- query_id - Hash of query text
(:Query)-[:USES_TABLE]->(:Table) relationships
(:Query)-[:USES_COLUMN]->(:Column) relationships

This connector requires the following variables to be set in the .env file:

NEO4J_USERNAME=neo4j-username
NEO4J_PASSWORD=neo4j-password
NEO4J_URI=neo4j-uri
NEO4J_DATABASE=neo4j-database
GCP_PROJECT_ID=project-id
BIGQUERY_DATASET_ID=dataset-id
BIGQUERY_REGION=region-us (optional, defaults to region-us)

Connector Architecture

---
config:
    layout: elk
---
graph LR
    subgraph Schema["Graph Schema"]
        GS(Data Model Definition)
    end

    subgraph Source["Source Repository"]
        BQ(BigQuery Database)        
    end

    subgraph ETL["ETL Processes"]
        QE(Read RDBMS Schema)   
        PM(Validate + Transform<br>with Pydantic) 
    
        QE -->|Raw Data<br/>JSON| PM
    end
    
    subgraph Graph["Database"]
        NEO[(Neo4j Graph)]
    end

    BQ -->|Information Schema| QE
    GS -->|Schema Definition| PM
    PM -->|Ingest Data| NEO

Code Example - Schema Connector

import os
from neo4j import GraphDatabase
from google.cloud import bigquery
from neocarta.connectors.bigquery import BigQuerySchemaConnector

# Initialize clients
neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)
neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")
bigquery_client = bigquery.Client(project=os.getenv("GCP_PROJECT_ID"))

# Create connector instance
connector = BigQuerySchemaConnector(
    client=bigquery_client,
    project_id=os.getenv("GCP_PROJECT_ID"),
    dataset_id=os.getenv("BIGQUERY_DATASET_ID"),
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
)

# Run the connector to extract, transform, and load BigQuery schema metadata into Neo4j
connector.run()

Code Example - Logs Connector

import os
from neo4j import GraphDatabase
from google.cloud import bigquery
from neocarta.connectors.bigquery import BigQueryLogsConnector

# Initialize clients
neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)
neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")
bigquery_client = bigquery.Client(project=os.getenv("GCP_PROJECT_ID"))

# Create connector instance
connector = BigQueryLogsConnector(
    client=bigquery_client,
    project_id=os.getenv("GCP_PROJECT_ID"),
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
)

# Run the connector to extract query logs, parse SQL, and load into Neo4j
connector.run(
    dataset_id=os.getenv("BIGQUERY_DATASET_ID"),
    region="region-us",
    start_timestamp="2024-01-01 00:00:00",  # Optional
    end_timestamp="2024-01-31 23:59:59",    # Optional
    limit=100,                               # Optional, default 100
    drop_failed_queries=True,                # Optional, default True
)

Combined Usage

For the most complete picture, run both connectors:

# 1. Extract schema metadata
schema_connector = BigQuerySchemaConnector(...)
schema_connector.run()

# 2. Extract query logs
logs_connector = BigQueryLogsConnector(...)
logs_connector.run(dataset_id=os.getenv("BIGQUERY_DATASET_ID"))

This allows you to compare declared schema vs. actual usage patterns.

GCP Dataplex Universal Catalog

Connector for reading BigQuery metadata and Glossary information from Dataplex and ingesting into Neo4j. Please see the Dataplex README for more information and caveats of using this connector.

Warning: Importing this module raises a UserWarning: "The Dataplex connector is an incomplete feature. Current limitations of the Dataplex API prevent relationships between business terms and their tagged entities."

Connector Architecture

---
config:
    layout: elk
---
graph LR
    subgraph Schema["Graph Schema"]
        GS(Data Model Definition)
    end
    subgraph Big["Google Cloud Platform"]
        subgraph SR["Source Repository"]
            BQ(BigQuery Database)  
        end      
    

        subgraph Source["GCP Dataplex Universal Catalog"]
            G(Glossaries)
            MT(Metadata Types)        
        end
    end

    subgraph ETL["ETL Processes"]
        QE(Read Data Catalog)   
        PM(Validate + Transform<br>with Pydantic) 
    
        QE -->|Raw Data<br/>JSON| PM
    end
    
    subgraph Graph["Database"]
        NEO[(Neo4j Graph)]
    end

    BQ -->|BigQuery Metadata|MT
    G -->| Glossary Content| QE
    MT -->|BigQuery Metadata| QE
    GS -->|Schema Definition| PM
    PM -->|Ingest Data| NEO

Query Logs

Connector for parsing query log JSON files into Neo4j. Please see the Query Logs README for more information and caveats of using this connector.

Connector Architecture

---
config:
    layout: elk
---
graph LR
    subgraph Schema["Graph Schema"]
        GS(Data Model Definition)
    end
    subgraph Big["Google Cloud Platform"]
        subgraph Source["Source Repository"]
            BQ(BigQuery Database)
        end

        subgraph Logging["GCP Logging"]
            GL(Logs)
        end
    end

    subgraph ETL["ETL Processes"]
        QL(Read Query Logs)
        PM(Validate + Transform<br>with Pydantic)

        QL -->|Raw Logs<br/>JSON| PM
    end

    subgraph Graph["Database"]
        NEO[(Neo4j Graph)]
    end

    BQ -->|Query Logs| GL
    GL -->|Query Logs| QL
    GS -->|Schema Definition| PM
    PM -->|Ingest Data| NEO

CSV Files

Connector for loading metadata from structured CSV files into Neo4j. This connector is useful for importing metadata from systems that don't have direct API access, or for loading curated metadata that has been manually created or exported from other tools.

The CSV connector supports selective loading - you can choose which node types and relationships to load based on what metadata is available in your CSV files.

CSV File Structure:

The connector expects CSV files in a specified directory with the following default naming convention:

database_info.csv - Database nodes
schema_info.csv - Schema nodes
table_info.csv - Table nodes
column_info.csv - Column nodes
column_references_info.csv - Foreign key relationships
value_info.csv - Sample column values
query_info.csv - Query nodes
query_table_info.csv - Query-to-table relationships
query_column_info.csv - Query-to-column relationships
glossary_info.csv - Glossary nodes
category_info.csv - Category nodes
business_term_info.csv - Business term nodes

Custom file names can be specified using the csv_file_map parameter.

ID Strategy:

Entity IDs are automatically generated from name columns using a dot-separated hierarchy (e.g., database_name.schema_name.table_name). To use custom IDs instead, supply explicit *_id columns (database_id, schema_id, table_id, column_id, value_id) in every CSV file in the hierarchy. Do not mix strategies across files. See the CSV Connector README for full details.

This connector requires the following variables to be set in the .env file:

NEO4J_USERNAME=neo4j-username
NEO4J_PASSWORD=neo4j-password
NEO4J_URI=neo4j-uri
NEO4J_DATABASE=neo4j-database

Connector Architecture

---
config:
    layout: elk
---
graph LR
    subgraph Schema["Graph Schema"]
        GS(Data Model Definition)
    end

    subgraph Source["CSV Files"]
        CSV[(CSV Directory)]
    end

    subgraph ETL["ETL Processes"]
        QE(Read CSV Files)
        PM(Validate + Transform<br>with Pydantic)

        QE -->|Raw Data<br/>DataFrame| PM
    end

    subgraph Graph["Database"]
        NEO[(Neo4j Graph)]
    end

    CSV -->|CSV Files| QE
    GS -->|Schema Definition| PM
    PM -->|Ingest Data| NEO

Code Example

import os
from neo4j import GraphDatabase
from neocarta import NodeLabel as nl, RelationshipType as rt
from neocarta.connectors.csv import CSVConnector

# Initialize clients
neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)
neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")

# Create connector instance
connector = CSVConnector(
    csv_directory="datasets/csv",
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
)

# Run the connector to load all CSV files into Neo4j
connector.run()

# Alternatively, load specific nodes and relationships
# Enum members are recommended, but exact string values (e.g. "Database", "HAS_SCHEMA") also work.
connector.run(
    include_nodes=[nl.DATABASE, nl.SCHEMA, nl.TABLE, nl.COLUMN, nl.VALUE],
    include_relationships=[rt.HAS_SCHEMA, rt.HAS_TABLE, rt.HAS_COLUMN, rt.HAS_VALUE, rt.REFERENCES]
)

# Or use a custom file mapping (configured at construction time)
custom_file_map = {
    NodeLabel.DATABASE: "my_database.csv",
    NodeLabel.SCHEMA: "my_schema.csv",
    # ... other custom filenames
}
connector = CSVConnector(
    csv_directory="datasets/csv",
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
    csv_file_map=custom_file_map,
)
connector.run()

Sample Dataset

A sample e-commerce dataset is provided in datasets/csv/ that demonstrates the expected CSV file structure and can be used for testing the CSV connector.

Embeddings

Embeddings may be generated for the description fields of the following nodes:

Database
Schema
Table
Column
BusinessTerm

This project currently supports the following embeddings Providers:

OpenAI

This connector requires the following variables to be set in the .env file:

OPENAI_API_KEY=sk-...

Connector Architecture

---
config:
    layout: elk
---
graph LR

    subgraph D["Database Preparation"]
        VI(Create Vector Index)
    end

    subgraph ES["Embedding Service"]
        E(OpenAI Embeddings)
    end

    subgraph Graph["Database"]
        NEO[(Neo4j Graph)]
    end

    subgraph EP["Embedding Process"]
        C(Create Embeddings) 
    end
    
    VI-->NEO

    C<-->E

    NEO-->|Unprocessed Node Descriptions|C
    C-->|Embeddings|NEO

Code Example

import asyncio
import os
from neo4j import GraphDatabase
from openai import AsyncOpenAI
from neocarta import NodeLabel as nl
from neocarta.enrichment.embeddings import OpenAIEmbeddingsConnector

# Initialize clients
neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)
neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")
embedding_client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Create connector instance
connector = OpenAIEmbeddingsConnector(
    async_embedding_client=embedding_client,
    embedding_model="text-embedding-3-small",
    dimensions=768,
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
)

# The node labels to generate embeddings for
# Enum members are recommended, but exact string values (e.g. "Database", "Table") also work.
node_labels = [nl.DATABASE, nl.TABLE, nl.COLUMN]

# Run the connector to create embeddings for the nodes
await connector.arun(node_labels=node_labels)

Full Pipeline

The full graph generation pipeline will run the BigQuery connector followed by the embedding generation connector.

It requires the following variables to be set in the .env file:

NEO4J_USERNAME=neo4j-username
NEO4J_PASSWORD=neo4j-password
NEO4J_URI=neo4j-uri
NEO4J_DATABASE=neo4j-database
GCP_PROJECT_ID=project-id
BIGQUERY_DATASET_ID=dataset-id
OPENAI_API_KEY=sk-...

Architecture

The combined BigQuery + Embeddings connector pipeline is seen below.

flowchart LR
    subgraph Schema["Graph Schema"]
        GS(Data Model Definition)
    end

    subgraph Source["Source Repository"]
        BQ(BigQuery Database)        
    end

    subgraph ETL["ETL Processes"]
        QE(Read RDBMS Schema)   
        PM(Validate + Transform<br>with Pydantic) 
    end
    
    subgraph Graph["Database"]
        NEO[(Neo4j Graph)]
    end

    subgraph D["Database Preparation"]
        VI(Create Vector Index)
    end

    subgraph ES["Embedding Service"]
        E(OpenAI Embeddings)
    end

    subgraph EP["Embedding Workflow"]
        C(Create Embeddings) 
    end

    %% BigQuery Flow
    BQ -->|Information Schema| QE
    QE -->|Raw Data JSON| PM
    GS -->|Schema Definition| PM
    PM -->|Ingest Data| NEO

    %% Embeddings Flow
    NEO -->|Database Ready| VI
    VI -->|Vector Index Created| NEO
    NEO -->|Unprocessed Node Descriptions| C
    C <--> E
    C -->|Embeddings| NEO

Running The Full Connector Pipeline

To run the full connector pipeline, use the following Make command:

make create-graph

Contributing

Dev Setup

This project uses uv for dependency management and requires Python 3.10 or higher.

uv — Python dependency manager:

curl -LsSf https://astral.sh/uv/install.sh | sh

Neo4j — a running Neo4j instance is required. Options:

Neo4j AuraDB — managed cloud (free tier available)
Neo4j Desktop — local GUI-based instance
Docker image — lightweight instance

Install All Dependencies (Recommended)

For most users, install all dependencies to run the complete workflow:

make install

This installs all dependency groups and allows you to:

Create the metadata graph from BigQuery
Run the MCP server
Run the Text2SQL agent

Install Specific Components

If you only need specific components, you can install individual dependency groups:

Metadata Graph Only (BigQuery ETL + embeddings)

make install-metadata-graph

MCP Server Only (SQL metadata retrieval server)

make install-mcp-server

Agent Only (Text2SQL agent with MCP servers)

make install-agent

Note: The agent group automatically includes mcp-server dependencies

Dependency Groups

The project is organized into the following dependency groups:

metadata-graph: BigQuery metadata extraction, Neo4j loading, and embedding generation
mcp-server: Local MCP server for SQL metadata retrieval from Neo4j
agent: Text2SQL agent with LangChain (includes mcp-server dependencies)
dev: Development tools (Jupyter notebooks)

Sample Dataset

This repository contains a sample dataset of ecommerce data.

Ensure that the following environment variable is set before running and that you are credentialed via the gcloud cli.

GCP_PROJECT_ID=project-id

To create the dataset in your BigQuery instance, you may run the following Make command.

make load-ecommerce-dataset

MCP

This project uses two MCP servers for the Text2SQL agent.

Local SQL Metadata MCP Server

This is a custom SQL metadata retrieval MCP server that provides tools to query the Neo4j database for relevant RDBMS schema information using semantic similarity search.

Tools

list_schemas - Lists all schemas and their associated databases.
list_tables_by_schema - Lists all tables for a given schema name.
get_metadata_schema_by_column_semantic_similarity - Retrieves table metadata using vector similarity search on column embeddings and graph traversal. Returns the most relevant tables and column subset with their references and example values.
get_metadata_schema_by_schema_and_table_semantic_similarity - Retrieves table metadata using embedding similarity at the schema and table level. Filters tables whose similarity score is close to their parent schema score and returns results ordered by relevance. Accepts an optional max_tables parameter (default: 5).
get_full_metadata_schema - Retrieves complete metadata schema for all tables in the database. Returns tables and columns with their references and example values. Warning: expensive query — use only for debugging.

Environment Variables

NEO4J_URI - Neo4j database connection URI (e.g., bolt://localhost:7687)
NEO4J_USERNAME - Neo4j username (default: neo4j)
NEO4J_PASSWORD - Neo4j password
NEO4J_DATABASE - Neo4j database name (default: neo4j)
OPENAI_API_KEY - OpenAI API key for generating query embeddings
EMBEDDING_MODEL - OpenAI embedding model to use (default: text-embedding-3-small)
EMBEDDING_DIMENSIONS - Embedding vector dimensions (default: 768)

Remote BigQuery MCP Server

This is the official BigQuery remote MCP server and will be used to execute SQL queries against our database.

Since this is a remote server, we don't need to worry about hosting it locally. We can just connect to the MCP endpoint in our GCP environment.

Tools (Filtered subset of total tools the server provides)

execute_sql - Execute a SQL query against BigQuery. Returns the raw results.

Unused BigQuery MCP Tools

list_dataset_ids
get_dataset_info
list_table_ids
get_table_info

Set Up

Enable use of the Bigquery MCP server in your project.

Additional information may be found here.

gcloud beta services mcp enable bigquery.googleapis.com --project=PROJECT_ID

To disable again run:

gcloud beta services mcp disable bigquery.googleapis.com --project=PROJECT_ID

You can test the BigQuery server connection with the following curl command

curl -k \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -d '{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "execute_sql",
    "arguments": {
      "projectId": "<PROJECT_ID>",
      "query": "SELECT table_name FROM `<PROJECT_ID>.<DATASET_ID>.INFORMATION_SCHEMA.TABLES`"
    }
  }
}' \
  https://bigquery.googleapis.com/mcp

Agent

This is the Text2SQL agent that converts natural language questions into SQL queries for BigQuery. The agent uses two MCP servers to:

Retrieve relevant database metadata from Neo4j using semantic similarity
Execute generated SQL queries against BigQuery

The agent architecture can be seen below.

---
config:
    layout: dagre
---

graph LR
    
    subgraph GCP["GCP Environment"]
        BQMCP(BigQuery<br>MCP)

        subgraph DataWarehouse["Data Warehouse"]
            BQData[(BigQuery)]
        end
        
        BQMCP <--> BQData
    end

    subgraph Local["Local Environment"]
        Agent("Text2SQL Agent")
        MetadataMCP("SQL Metadata<br/>MCP")
        
        subgraph Graph["Database"]
            NEO[(Neo4j Graph)]
        end
        
        Agent <--> MetadataMCP
        MetadataMCP <--> NEO
    end
    
    User("User")
    
    subgraph LLM["LLM Service"]
        Model("LLM")
    end
    
    User <--> Agent
    Agent <--> Model
    Agent <--> BQMCP

How it works

User asks a natural language question about the data
Agent calls the SQL Metadata MCP server to retrieve relevant table schemas
Agent generates a SQL query via an LLM call based on the retrieved metadata context
Agent calls execute_sql from the BigQuery MCP server to run the query against BigQuery
Agent returns formatted results to the user

Running the Agent

Use this command to run the agent locally:

make agent

The agent will start an interactive chat session in the terminal where you can ask questions about your data.

BigQuery MCP Authentication

The agent uses Google Cloud Application Default Credentials for BigQuery authentication:

class GoogleAuth(httpx.Auth):
    def __init__(self):
        self.credentials, _ = default()

    def auth_flow(self, request):
        self.credentials.refresh(Request())
        request.headers["Authorization"] = f"Bearer {self.credentials.token}"
        yield request

Make sure you're authenticated with:

gcloud auth application-default login

Environment Variables

Required environment variables (add to .env file):

Neo4j Connection

NEO4J_URI - Neo4j database URI (e.g., bolt://localhost:7687)
NEO4J_USERNAME - Neo4j username (default: neo4j)
NEO4J_PASSWORD - Neo4j password
NEO4J_DATABASE - Neo4j database name (default: neo4j)

LLM - OpenAI

OPENAI_API_KEY - OpenAI API key for embeddings and LLM

Example Usage

> What are the total sales by product category?

Agent: [Calls get_metadata_schema_by_semantic_similarity with query about sales and categories]
Agent: [Generates SQL query using retrieved schema]
Agent: [Calls execute_sql with generated query]
Agent: Here are the total sales by product category:
- Electronics: $15,234.50
- Clothing: $8,912.30
...

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.2.1

Apr 17, 2026

This version

0.2.0

Apr 17, 2026

0.1.0

Apr 15, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neocarta-0.2.0.tar.gz (83.2 kB view details)

Uploaded Apr 17, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

neocarta-0.2.0-py3-none-any.whl (99.0 kB view details)

Uploaded Apr 17, 2026 Python 3

File details

Details for the file neocarta-0.2.0.tar.gz.

File metadata

Download URL: neocarta-0.2.0.tar.gz
Upload date: Apr 17, 2026
Size: 83.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for neocarta-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`2d2df55bd62be8e9ce4efff21821509629e86aeee526a3759395a4237e77f0d9`
MD5	`9310e31a615599b9e5e1ce1997e1347c`
BLAKE2b-256	`e9c4614e6d0883faedf96686b55c5dfc42df1da5904df0eb5312109ff2834dc0`

See more details on using hashes here.

File details

Details for the file neocarta-0.2.0-py3-none-any.whl.

File metadata

Download URL: neocarta-0.2.0-py3-none-any.whl
Upload date: Apr 17, 2026
Size: 99.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for neocarta-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5c3d534630640d4c260cf7c4d591c07dd6b1ab47837dd2131f8313da24b4df0e`
MD5	`fe45071b2ccc1f75fc37b1176933b61f`
BLAKE2b-256	`cccedba9f1a157e26f6dcb29bedc5a47b2772ca70a437556857a07af96541624`

See more details on using hashes here.

neocarta 0.2.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

Neocarta

Installation

Metadata Graph

Graph Generation

Connector Class Architecture

Connectors

BigQuery Schema Connector

BigQuery Logs Connector

Connector Architecture

Code Example - Schema Connector

Code Example - Logs Connector

Combined Usage

GCP Dataplex Universal Catalog

Connector Architecture

Query Logs

Connector Architecture

CSV Files

Connector Architecture

Code Example

Sample Dataset

Embeddings

Connector Architecture

Code Example

Full Pipeline

Architecture

Contributing

Dev Setup

Install All Dependencies (Recommended)

Install Specific Components

Dependency Groups

Sample Dataset

MCP

Local SQL Metadata MCP Server

Remote BigQuery MCP Server

Set Up

Agent

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes