
A set of AI tools for working with Cognite Data Fusion in Python.


cognite-ai

A set of AI tools for working with CDF (Cognite Data Fusion) in Python, including vector stores and intelligent data manipulation features leveraging large language models (LLMs).

Installation

This package is intended for use in Cognite's Jupyter notebooks and Streamlit apps. To get started, install the package using:

%pip install cognite-ai

Smart Data Tools

With cognite-ai, you can enhance your data workflows by integrating LLMs for intuitive querying and manipulation of data frames. The module is built on top of PandasAI and adds Cognite-specific features.

The Smart Data Tools come in three components:

  • Pandas Smart DataFrame
  • Pandas Smart DataLake
  • Pandas AI Agent

1. Pandas Smart DataFrame

SmartDataframe enables you to chat with individual data frames, using LLMs to query, summarize, and analyze your data conversationally.

Example

from cognite.ai import load_pandasai
from cognite.client import CogniteClient
import pandas as pd

# Load the necessary classes
client = CogniteClient()
SmartDataframe, SmartDatalake, Agent = await load_pandasai()

# Create demo data
workorders_df = pd.DataFrame({
    "workorder_id": ["WO001", "WO002", "WO003", "WO004", "WO005"],
    "description": [
        "Replace filter in compressor unit 3A",
        "Inspect and lubricate pump 5B",
        "Check pressure valve in unit 7C",
        "Repair leak in pipeline 4D",
        "Test emergency shutdown system"
    ],
    "priority": ["High", "Medium", "High", "Low", "Medium"]
})

# Create a SmartDataframe object
s_workorders_df = SmartDataframe(workorders_df, cognite_client=client)

# Chat with the dataframe
s_workorders_df.chat('Which 5 work orders are the most critical based on priority?')
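Because LLM answers can vary between runs, it can help to compute the expected answer directly in pandas and compare it with the chat response. The following is a minimal sketch, not part of cognite-ai; the `priority_rank` mapping (High before Medium before Low) and the `rank_value` column name are assumptions made for illustration.

```python
import pandas as pd

# Same demo data as above, reduced to the columns needed for ranking
workorders_df = pd.DataFrame({
    "workorder_id": ["WO001", "WO002", "WO003", "WO004", "WO005"],
    "priority": ["High", "Medium", "High", "Low", "Medium"],
})

# Assumed ordering: a lower number means a more critical work order
priority_rank = {"High": 0, "Medium": 1, "Low": 2}

ranked = (
    workorders_df
    .assign(rank_value=workorders_df["priority"].map(priority_rank))
    .sort_values(["rank_value", "workorder_id"])  # ties broken by work order ID
    .drop(columns="rank_value")
    .reset_index(drop=True)
)
print(ranked)
```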

Customizing LLM Parameters

You can configure the LLM parameters to control aspects like model selection and temperature.

params = {
    "model": "azure/gpt-4.1",
    "temperature": 0.5
}

s_workorders_df = SmartDataframe(workorders_df, cognite_client=client, params=params)
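If several smart objects share the same settings, the `params` dictionary can be built in one place. The sketch below uses a hypothetical helper, `build_llm_params`, which is not part of cognite-ai. It only knows about the two keys shown above, and the check that rejects negative temperatures is an assumption; the valid range depends on the model.

```python
# Hypothetical helper, not part of cognite-ai.
# Defaults mirror the values shown in the example above.
DEFAULT_PARAMS = {"model": "azure/gpt-4.1", "temperature": 0.5}


def build_llm_params(**overrides):
    """Merge overrides into the defaults and run basic sanity checks."""
    params = {**DEFAULT_PARAMS, **overrides}
    if not isinstance(params["model"], str) or not params["model"]:
        raise ValueError("model must be a non-empty string")
    temperature = params["temperature"]
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        raise TypeError("temperature must be a number")
    if temperature < 0:  # assumed lower bound; upper bound is model-specific
        raise ValueError("temperature must be non-negative")
    return params


# Lower temperature for more repeatable answers
deterministic_params = build_llm_params(temperature=0.0)
print(deterministic_params)
```

The resulting dictionary can then be passed as `params=deterministic_params` when creating a SmartDataframe.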

2. Pandas Smart DataLake

SmartDatalake allows you to combine and query multiple data frames simultaneously, treating them as a unified data lake.

Example

from cognite.ai import load_pandasai
from cognite.client import CogniteClient
import pandas as pd

# Load the necessary classes
client = CogniteClient()
SmartDataframe, SmartDatalake, Agent = await load_pandasai()

# Create demo data
workorders_df = pd.DataFrame({
    "workorder_id": ["WO001", "WO002", "WO003"],
    "asset_id": ["A1", "A2", "A3"],
    "description": ["Replace filter", "Inspect pump", "Check valve"]
})
workitems_df = pd.DataFrame({
    "workitem_id": ["WI001", "WI002", "WI003"],
    "workorder_id": ["WO001", "WO002", "WO003"],
    "task": ["Filter replacement", "Pump inspection", "Valve check"]
})
assets_df = pd.DataFrame({
    "asset_id": ["A1", "A2", "A3"],
    "name": ["Compressor 3A", "Pump 5B", "Valve 7C"]
})

# Combine them into a smart lake
smart_lake_df = SmartDatalake([workorders_df, workitems_df, assets_df], cognite_client=client)

# Chat with the unified data lake
smart_lake_df.chat("Which assets have the most work orders associated with them?")
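For comparison, the same question can be answered without an LLM by joining the frames on their shared `asset_id` key. This is a plain-pandas sketch for cross-checking the chat response, not a description of how SmartDatalake works internally. In the demo data each asset has exactly one work order, so the result is a three-way tie.

```python
import pandas as pd

# Same demo data as above, reduced to the columns needed for the join
workorders_df = pd.DataFrame({
    "workorder_id": ["WO001", "WO002", "WO003"],
    "asset_id": ["A1", "A2", "A3"],
})
assets_df = pd.DataFrame({
    "asset_id": ["A1", "A2", "A3"],
    "name": ["Compressor 3A", "Pump 5B", "Valve 7C"],
})

# Join work orders to assets, then count work orders per asset name
workorders_per_asset = (
    workorders_df
    .merge(assets_df, on="asset_id", how="left")
    .groupby("name")["workorder_id"]
    .count()
    .sort_values(ascending=False)
)
print(workorders_per_asset)
```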

3. Pandas AI Agent

The Agent provides conversational querying capabilities across a single data frame, allowing you to ask follow-up questions.

Example

from cognite.ai import load_pandasai
from cognite.client import CogniteClient
import pandas as pd

# Load the necessary classes
client = CogniteClient()
SmartDataframe, SmartDatalake, Agent = await load_pandasai()

# Create example data
sensor_readings_df = pd.DataFrame({
    "sensor_id": ["A1", "A2", "A3", "A4", "A5"],
    "temperature": [75, 80, 72, 78, 69],
    "pressure": [30, 35, 33, 31, 29],
    "status": ["Normal", "Warning", "Normal", "Warning", "Normal"]
})

# Create an Agent for the dataframe
agent = Agent(sensor_readings_df, cognite_client=client)

# Ask a question
print(agent.chat("Which sensors are showing a warning status?"))
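A follow-up question is simply another `chat` call on the same agent. The sketch below assumes the Agent keeps conversation context between calls, as the description above suggests. `ask_in_sequence` and `StubAgent` are hypothetical names introduced here; the stub only stands in for the real Agent so that the snippet runs without CDF access.

```python
def ask_in_sequence(agent, questions):
    """Send each question to the same agent and collect (question, answer) pairs."""
    return [(question, agent.chat(question)) for question in questions]


class StubAgent:
    """Hypothetical stand-in for Agent; records queries and returns canned text."""

    def __init__(self):
        self.history = []

    def chat(self, query):
        self.history.append(query)
        return f"answer {len(self.history)}"


stub = StubAgent()
transcript = ask_in_sequence(stub, [
    "Which sensors are showing a warning status?",
    "Which of those has the highest temperature?",  # follow-up question
])
for question, answer in transcript:
    print(question, "->", answer)
```

With the real Agent, pass the `agent` object created above instead of `stub`.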

Development

This project uses uv for dependency management and building. To get started with development:

# Install dependencies
uv sync

# Install the package in editable mode
uv pip install -e .

# Build the package
uv build

For more information on using uv, see the uv documentation.

Contributing

This package exists mainly to work around the installation problems users encounter in Pyodide when installing pandasai, which are caused by dependencies that are not distributed as pure Python 3 wheels.

The current development cycle is cumbersome: it consists of copying the package's source code into, for example, a Jupyter notebook in Fusion to verify that everything works there.
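When verifying changes this way, it can be useful to confirm that the code is actually running under Pyodide rather than in a local CPython interpreter. A minimal sketch follows; `running_in_pyodide` is a hypothetical helper name and not part of this package.

```python
import sys


def running_in_pyodide() -> bool:
    """Return True when executing inside a Pyodide (Emscripten) runtime."""
    return sys.platform == "emscripten" or "pyodide" in sys.modules


print("Pyodide runtime:", running_in_pyodide())
```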

