AI-powered analysis framework for structured data files and databases - part of the unified analysis framework suite

Data Analysis Framework

Version 2.0.0 - Part of the unified analysis framework suite

📈 Purpose

Specialized framework for analyzing structured data (spreadsheets, databases, configuration files) with AI-powered pattern detection and safe agent query capabilities.

Important: This framework focuses on structured data access via natural language queries, not document chunking. For document processing, see the complementary frameworks below.

📦 Supported Formats

Spreadsheets & Tables

Excel: XLSX, XLS with multiple sheets
CSV/TSV: Delimiter detection and parsing
Apache Parquet: Columnar data analysis
JSON: Nested and flat structure analysis
JSONL: Line-delimited JSON streams
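
As an illustration of what delimiter detection for CSV/TSV input can look like, here is a minimal sketch built on the standard library's `csv.Sniffer`. The function name `sniff_and_parse` and the candidate delimiter set are assumptions made for this example; the framework's own detection logic is not documented here and may work differently.

```python
import csv
import io


def sniff_and_parse(text, candidates=",\t;|"):
    """Guess the delimiter of delimited text, then parse it into dict rows."""
    try:
        # Sniffer inspects a sample and raises csv.Error if it cannot decide.
        delimiter = csv.Sniffer().sniff(text[:4096], delimiters=candidates).delimiter
    except csv.Error:
        delimiter = ","  # assumed fallback for this sketch
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return delimiter, list(reader)


sample = "region\tunits\trevenue\nnorth\t12\t340.5\nsouth\t7\t120.0\n"
delimiter, rows = sniff_and_parse(sample)
print(repr(delimiter), rows[0])
```

All parsed values are strings at this stage; type inference is a separate step.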

Configuration Data

YAML: Configuration files and data serialization
TOML: Configuration file analysis
INI: Legacy configuration parsing
Environment Files: .env variable analysis
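
To make ".env variable analysis" concrete, the sketch below parses `KEY=VALUE` lines with the standard library only. `parse_env` is a hypothetical helper written for this example, not a function exported by the framework, and it handles only a simple subset of the .env conventions (comments, an optional `export ` prefix, one pair of surrounding quotes).

```python
def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text into a dict of strings."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        variables[key.strip()] = value
    return variables


env = parse_env('# database\nDB_HOST=localhost\nexport DB_PORT=5432\nDB_NAME="sales"\n\nDEBUG=\n')
print(env)
```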

Database Exports

SQL Dumps: Schema and data analysis
SQLite: Database file inspection
Database Connection: Live data analysis
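
For a sense of what SQLite file inspection involves, the following sketch lists user tables and their declared column types using Python's built-in `sqlite3` module. The helper `inspect_sqlite` and the demo tables are assumptions for illustration; they do not describe the framework's internal implementation.

```python
import sqlite3


def inspect_sqlite(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for user tables."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (name,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        quoted = '"' + name.replace('"', '""') + '"'
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        schema[name] = [(col[1], col[2]) for col in columns]
    return schema


# Demo against an in-memory database with two made-up tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, revenue REAL)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
schema = inspect_sqlite(conn)
print(schema)
```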

🤖 AI Integration Features

Schema Detection: Automatic column type inference
Pattern Analysis: Anomaly and trend detection
Data Quality Assessment: Missing values, duplicates, outliers
Relationship Discovery: Cross-table dependencies
Business Logic Extraction: Rules and constraints
Predictive Insights: Forecasting and recommendations
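
Two of the features above, column type inference and basic data quality checks, can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: `infer_type` and `quality_report` are hypothetical names, values arrive as strings (as they would from a CSV reader), and the framework's actual heuristics and scoring are not shown in this README.

```python
from collections import Counter


def infer_type(values):
    """Infer the narrowest type that fits every non-empty string value."""
    present = [v for v in values if v not in ("", None)]
    if not present:
        return "empty"
    for type_name, caster in (("integer", int), ("float", float)):
        try:
            for v in present:
                caster(v)
            return type_name
        except ValueError:
            continue  # try the next, wider type
    return "string"


def quality_report(rows):
    """Summarize column types, missing values, and duplicate rows."""
    columns = list(rows[0]) if rows else []
    missing = {c: sum(1 for r in rows if r.get(c) in ("", None)) for c in columns}
    types = {c: infer_type([r.get(c) for r in rows]) for c in columns}
    # A row counts as a duplicate each time it repeats an earlier row exactly.
    row_counts = Counter(tuple(r.get(c) for c in columns) for r in rows)
    duplicates = sum(n - 1 for n in row_counts.values())
    return {"types": types, "missing": missing, "duplicate_rows": duplicates}


rows = [
    {"id": "1", "price": "9.5", "city": "Oslo"},
    {"id": "2", "price": "", "city": "Bergen"},
    {"id": "2", "price": "", "city": "Bergen"},
]
report = quality_report(rows)
print(report)
```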

🚀 Quick Start

from data_analysis_framework import DataAnalyzer

analyzer = DataAnalyzer()
result = analyzer.analyze("sales_data.xlsx")

print(f"Data Type: {result.document_type.type_name}")
print(f"Schema: {result.analysis.schema_info}")
print(f"Quality Score: {result.analysis.quality_metrics['overall_score']}")
print(f"AI Insights: {result.analysis.ai_insights}")

🔄 Unified Interface Support

This framework now supports the unified interface standard, providing consistent access patterns across all analysis frameworks:

import data_analysis_framework as daf

# Use the unified interface
result = daf.analyze_unified("sales_data.csv")

# All access patterns work consistently
doc_type = result['document_type']        # Dict access ✓
doc_type = result.document_type           # Attribute access ✓
doc_type = result.get('document_type')    # get() method ✓
as_dict = result.to_dict()                # Full dict conversion ✓

# Works the same across all frameworks
print(f"Framework: {result.framework}")   # 'data-analysis-framework'
print(f"Type: {result.document_type}")    # 'CSV Data'
print(f"Confidence: {result.confidence}")  # Quality-based confidence
print(f"AI opportunities: {result.ai_opportunities}")

The unified interface ensures compatibility when switching between frameworks or using multiple frameworks together.
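
The access patterns listed above (dict access, attribute access, `get()`, and `to_dict()`) can all be offered by one small class. The sketch below shows one way to do that with a dataclass. `ResultSketch` is a hypothetical stand-in written for this example; it is not the framework's `UnifiedAnalysisResult`, whose real fields and behavior may differ.

```python
from dataclasses import asdict, dataclass, field, fields


@dataclass
class ResultSketch:
    """Hypothetical stand-in for a unified result object."""
    framework: str
    document_type: str
    confidence: float
    ai_opportunities: list = field(default_factory=list)

    def __getitem__(self, key):
        # Dict-style access is limited to declared fields, not methods.
        if key not in {f.name for f in fields(self)}:
            raise KeyError(key)
        return getattr(self, key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def to_dict(self):
        return asdict(self)


result = ResultSketch("data-analysis-framework", "CSV Data", 0.9)
print(result["document_type"], result.document_type, result.get("document_type"))
print(result.to_dict())
```

Restricting `__getitem__` to declared fields keeps `result['to_dict']` from returning a bound method, which is one plausible design choice among several.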

🏗️ Status

🚧 Active Development - Core functionality is implemented; v2.0.0 adopts the unified framework interfaces

🌐 Framework Suite

This framework is part of a unified suite of analysis frameworks, each optimized for different data types:

Document Processing Frameworks (Chunking-Based)

These frameworks chunk documents for RAG/LLM consumption:

xml-analysis-framework - XML document analysis with 29+ specialized handlers (SCAP, Maven, Spring, etc.)
docling-analysis-framework - Office documents, PDFs, and images using IBM Docling
document-analysis-framework - General document processing and analysis

Data Access Framework (Query-Based)

This framework provides safe AI agent access to structured data:

data-analysis-framework (this framework) - Structured data via natural language queries

Shared Foundation

analysis-framework-base - Common interfaces and models for all frameworks

Key Differences

| Framework Type | Use Case | AI Integration | Output |
|---|---|---|---|
| Document Frameworks | "Chunk this manual for search" | RAG, semantic search | Text chunks for embeddings |
| Data Framework | "Show customers with revenue > $10M" | Natural language queries | Query results and insights |

When to Use What

Processing documents? Use the xml, docling, or document frameworks to chunk content for vector search
Querying databases/spreadsheets? Use data-analysis-framework for safe AI agent access
Both? Combine them: the document frameworks supply knowledge retrieval, and the data framework handles operational queries

See CHUNKING_DECISION.md for a detailed explanation of this framework's query-based approach.

📝 What's New in v2.0.0

✅ Adopted analysis-framework-base for unified interfaces
✅ Inherits from BaseAnalyzer for consistent API across frameworks
✅ Implements UnifiedAnalysisResult for standard result format
✅ Added get_supported_formats() method for format discovery
✅ 100% backward compatible - all existing code works unchanged
ℹ️ Does not implement BaseChunker - uses query-based paradigm instead (see CHUNKING_DECISION.md)

Release history

2.0.0 - Oct 28, 2025 (this version)
1.1.0 - Jul 29, 2025
1.0.0 - Jul 28, 2025

