
A Streamlit component for rendering data lineage from Cognite Data Fusion


Stlin - Streamlit Data Lineage Component

A Streamlit component for rendering interactive data lineage graphs from Cognite Data Fusion. Built with React, TypeScript, and React Flow.

Installation

pip install stlin

Quick Start

import streamlit as st
from stlin import render_lineage

# Your lineage data (list of data processes with sources and destinations)
lineage_data = [
    {
        "externalId": "transformation_1",
        "name": "Process Raw Assets",
        "query": "SELECT * FROM raw_assets WHERE status = 'active'",
        "destination": {"type": "raw", "database": "processed", "table": "assets"},
        "sources": ["raw_assets", "_cdf.assets"],
        "destinations": ["processed.assets"],
        "lastFinishedJob": {
            "status": "success",
            "startedTime": 1692345600000,
            "finishedTime": 1692345660000
        }
    }
    # ... more transformations
]

# Render the component
selected_data = render_lineage(
    data=lineage_data,
    focus_mode=True,
    side_bar_width=300,
    height=800
)

# Handle selection
if selected_data:
    st.write("Selected:", selected_data)
    # Example of returned data structure:
    # For transformation: [{"type": "Data Process", "subType": "Transformation", "address": "transformation_1", "sources": [...], "destinations": [...], "query": "SELECT ..."}]
    # For data object: [{"type": "Data Object", "subType": "Staging", "address": "raw_assets", "producedBy": [...], "consumedBy": [...]}]

API Reference

render_lineage

The main component function for rendering data lineage.

Parameters:

  • data (list): List of transformation dictionaries containing lineage information
  • focus_mode (bool, default=True): If True, show only the direct lineage path; if False, show the full graph
  • side_bar_width (int, default=300): Initial width of navigation sidebar in pixels
  • height (int, default=800): Height of the component in pixels
  • key (str, optional): Unique component key for Streamlit

Returns:

  • For data process nodes: a list containing a structured record with:
    • type: "Data Process"
    • subType: "Transformation"
    • address: transformation external ID
    • sources: list of source identifiers
    • destinations: list of destination identifiers
    • query: SQL query or transformation logic
  • For data object nodes: a list containing a structured record with:
    • type: "Data Object"
    • subType: specific data object type (e.g., "Staging", "Assets", "Data Model View")
    • address: data object identifier
    • producedBy: list of transformation IDs that produce this data object
    • consumedBy: list of transformation IDs that consume this data object
  • An empty list if nothing is selected
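
As a rough sketch of how the return value might be consumed, the helper below turns the records described above into one-line summaries. The name summarize_selection is hypothetical and not part of stlin; the field names are the ones listed above, and the list-shaped return value is assumed from the comments in the Quick Start.

```python
def summarize_selection(selected):
    """Return one summary line per selected record.

    Hypothetical helper, not part of stlin. Assumes `selected` is a list of
    dicts shaped like the records documented under "Returns".
    """
    summaries = []
    for record in selected or []:  # tolerate None as well as an empty list
        address = record.get("address", "?")
        if record.get("type") == "Data Process":
            n_src = len(record.get("sources", []))
            n_dst = len(record.get("destinations", []))
            summaries.append(f"{address}: {n_src} source(s) -> {n_dst} destination(s)")
        elif record.get("type") == "Data Object":
            n_prod = len(record.get("producedBy", []))
            n_cons = len(record.get("consumedBy", []))
            summaries.append(f"{address}: produced by {n_prod}, consumed by {n_cons}")
    return summaries
```

In a Streamlit app, the lines returned by summarize_selection(selected_data) could be passed to st.write one at a time.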

Data Format

The component expects transformation data in the following format:

{
    "externalId": "unique_transformation_id",
    "name": "Human Readable Name",
    "query": "SELECT * FROM source_table",
    "destination": {
        "type": "raw",
        "database": "target_db",
        "table": "target_table"
    },
    "sources": ["source1", "source2"],           # List of source identifiers
    "destinations": ["dest1", "dest2"],          # List of destination identifiers
    "lastFinishedJob": {
        "status": "success",
        "startedTime": 1692345600000,
        "finishedTime": 1692345660000
    }
}
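
Before passing data to the component, it may help to check each dictionary against the format above. The sketch below is only illustrative: find_missing_keys, job_duration_seconds, and ASSUMED_REQUIRED_KEYS are hypothetical names, the choice of which keys count as required is an assumption (the format above does not say which fields are optional), and the timestamps are assumed to be epoch milliseconds.

```python
# Keys assumed to be required; the documented format does not say which are optional.
ASSUMED_REQUIRED_KEYS = ("externalId", "name", "query", "sources", "destinations")


def find_missing_keys(transformation):
    """Return the assumed-required keys that are absent from one transformation dict."""
    return [key for key in ASSUMED_REQUIRED_KEYS if key not in transformation]


def job_duration_seconds(transformation):
    """Return the last job's duration in seconds, or None if no job is recorded.

    Assumes startedTime and finishedTime are epoch timestamps in milliseconds.
    """
    job = transformation.get("lastFinishedJob")
    if not job:
        return None
    return (job["finishedTime"] - job["startedTime"]) / 1000


example = {
    "externalId": "unique_transformation_id",
    "name": "Human Readable Name",
    "query": "SELECT * FROM source_table",
    "destination": {"type": "raw", "database": "target_db", "table": "target_table"},
    "sources": ["source1", "source2"],
    "destinations": ["dest1", "dest2"],
    "lastFinishedJob": {
        "status": "success",
        "startedTime": 1692345600000,
        "finishedTime": 1692345660000,
    },
}

print(find_missing_keys(example))     # no keys missing for the documented example
print(job_duration_seconds(example))  # duration of the last job in seconds
```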

Supported Data Object Types

The component automatically categorizes data objects based on their identifiers:

  • Legacy CDF Resources: _cdf.assets, _cdf.events, _cdf.timeseries, etc.
  • Data Model Instances: cdf_data_models(), cdf_nodes(), cdf_edges()
  • Raw/Staging Tables: database.table format
  • Unknown: Any unrecognized format
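
The four rules above can be read as a simple classifier. The following is a minimal sketch that applies them literally, not the component's actual logic, which lives in the frontend and may differ (for example, the Quick Start reports the undotted identifier "raw_assets" as "Staging"). The function name classify_identifier and the returned labels are hypothetical.

```python
import re

# "database.table": exactly two dot-separated parts, no spaces or parentheses.
_RAW_TABLE = re.compile(r"^[^.\s()]+\.[^.\s()]+$")
_DATA_MODEL_PREFIXES = ("cdf_data_models(", "cdf_nodes(", "cdf_edges(")


def classify_identifier(identifier: str) -> str:
    """Categorize a source or destination identifier (illustrative only)."""
    # Check the _cdf. prefix first, since "_cdf.assets" also looks like database.table.
    if identifier.startswith("_cdf."):
        return "Legacy CDF Resource"
    if identifier.startswith(_DATA_MODEL_PREFIXES):
        return "Data Model Instance"
    if _RAW_TABLE.match(identifier):
        return "Raw/Staging Table"
    return "Unknown"
```

The order of the checks matters here: an identifier such as _cdf.events would otherwise match the database.table pattern.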

Development

Building from Source

git clone https://github.com/evertoncolling/stlin.git
cd stlin

# Install dependencies
uv sync

# Build frontend
cd stlin/frontend
npm install
npm run build

# Build Python package
cd ../..
python -m build

Running the Example

# Install in development mode
uv pip install -e .

# Run the example app
streamlit run example_app.py

License

MIT License - see LICENSE file for details.
