Engage with your data (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.) using Ollama, an open-source tool that runs locally. Datadashr turns data analysis into a conversational experience powered by Ollama LLMs and RAG.
Description
Converse with Your Data Through Open Source AI.
Unleash the power of your data with natural language questions.
Our open-source platform, built on Ollama, runs models locally, so you get insights without paying for hosted LLM APIs.
Integrate effortlessly with your existing infrastructure, connecting to various data sources including SQL, NoSQL, CSV, and XLS files.
Obtain in-depth analytics by aggregating data from multiple sources into a unified platform, providing a holistic view of your business.
Convert raw data into valuable insights, facilitating data-driven strategies and enhancing decision-making processes.
Design intuitive and interactive charts and visual representations to simplify the understanding and interpretation of your business metrics.
Installation
To install the package, run the following command:
pip install datadashr
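To confirm that the package landed in the active environment, a quick check with the standard library can help. This is a minimal sketch; `installed_version` is a hypothetical helper name, not part of datadashr.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if datadashr is installed, otherwise None
print(installed_version("datadashr"))
```

If this prints `None`, the install probably went into a different Python environment than the one running the check.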
Requirements
Our goal is a system that works completely locally. To achieve this, we use Ollama with Codestral as the model.
Download Ollama from the following link: https://ollama.com/download
Install the model by running the following command:
ollama pull codestral
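Before starting datadashr, it can be useful to check that the Ollama server is running and that the model was pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists local models. It assumes the default local address `http://localhost:11434`; `model_is_listed` and `fetch_tags` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_is_listed(tags_payload, model="codestral"):
    """Return True if `model` appears in an /api/tags response payload."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    # Ollama reports names with a tag suffix, e.g. "codestral:latest"
    return any(n == model or n.startswith(model + ":") for n in names)


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Fetch and decode the list of locally available models."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        payload = fetch_tags()
    except OSError as exc:  # covers connection errors and timeouts
        print(f"Ollama does not appear to be running: {exc}")
    else:
        print("codestral available:", model_is_listed(payload))
```

If the check reports that the model is missing, running `ollama pull codestral` again should fix it.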
Starting the Interface
To start the user interface, run the following command:
datadashr
Usage Example
import pandas as pd
from pprint import pprint
from datadashr import DataDashr
from datadashr.core.llm import OllamaLLM
# Create DataFrame containing employee details
employees_df = pd.DataFrame({
    'employeeid': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['HR', 'IT', 'Finance']
})
# Create DataFrame containing salary information for employees
salaries_df = pd.DataFrame({
    'employeeid': [1, 2, 3],
    'salary': [50000, 60000, 70000]
})
# Create DataFrame containing department information and their managers
departments_df = pd.DataFrame({
    'department': ['HR', 'IT', 'Finance'],
    'manager': ['Dave', 'Eva', 'Frank']
})
# Create DataFrame containing project details and employee assignments
projects_df = pd.DataFrame({
    'projectid': [101, 102, 103],
    'projectname': ['Project A', 'Project B', 'Project C'],
    'employeeid': [1, 2, 3]
})
# Structure to import and map the data sources
import_data = {
    'sources': [
        {"source_name": "employees_df", "data": employees_df, "source_type": "pandas",
         "description": "Contains employee details including their department."},
        {"source_name": "salaries_df", "data": salaries_df, "source_type": "pandas",
         "description": "Contains salary information for employees."},
        {"source_name": "departments_df", "data": departments_df, "source_type": "pandas",
         "description": "Contains information about departments and their managers."},
        {"source_name": "projects_df", "data": projects_df, "source_type": "pandas",
         "description": "Contains information about projects and the employees assigned to them."},
    ],
    'mapping': {
        "employeeid": ['employees_df', 'salaries_df', 'projects_df'],  # Mapping employeeid across three DataFrames
        "department": ['employees_df', 'departments_df']  # Mapping department across two DataFrames
    }
}
# Initialize the LLM (large language model) instance with specific parameters
llm = OllamaLLM(model='codestral', params={"temperature": 0.0}, verbose=False)
# Initialize the DataDashr object with imported data and LLM instance
df = DataDashr(data=import_data, llm_instance=llm, verbose=False, enable_cache=True, format_type='data')
# Query the combined data sources for the employee with the highest salary
result = df.chat('Show the employee with the highest salary and the salary')
# Print the result
pprint(result)
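To sanity-check the answer, the same question can be answered directly in pandas. The sketch below is not DataDashr's internal logic. It assumes that each `mapping` key acts as a join column shared by the listed sources, which is an interpretation of the example above, and `check_mapping` is a hypothetical helper.

```python
import pandas as pd

# Two of the sample sources from the usage example
employees_df = pd.DataFrame({
    'employeeid': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['HR', 'IT', 'Finance'],
})
salaries_df = pd.DataFrame({
    'employeeid': [1, 2, 3],
    'salary': [50000, 60000, 70000],
})

sources = {'employees_df': employees_df, 'salaries_df': salaries_df}
mapping = {'employeeid': ['employees_df', 'salaries_df']}


def check_mapping(sources, mapping):
    """Return (key, source_name) pairs where the mapped column is missing."""
    missing = []
    for key, names in mapping.items():
        for name in names:
            if key not in sources[name].columns:
                missing.append((key, name))
    return missing


# An empty list means every mapped key exists in every source that lists it
missing = check_mapping(sources, mapping)

# Join the two frames on the shared key, then pick the row with the top salary
combined = employees_df.merge(salaries_df, on='employeeid', how='inner')
top_row = combined.loc[combined['salary'].idxmax()]
top_name = top_row['name']
top_salary = int(top_row['salary'])

print(top_name, top_salary)
```

With the sample data, this selects Charlie with a salary of 70000, which is the answer the chat query should be consistent with.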
Download files
Built Distribution
Hashes for datadashr-0.2.1-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8d1a17078d472f534c31a80779721a93b1777371c5a4544b469b0422554c589e |
MD5 | 2e0c2c025b547ff6827c87dba7dfda0c |
BLAKE2b-256 | 7251baab4b433c316de4d61e1515f395b9588c0d137ca6995f1aa04505625aa6 |