Automated Data Elements Linking - Lite
Adel-Lite: Automated Data Elements Linking - Lite
Adel-Lite is a Python library for automated schema generation, data profiling, and relationship discovery for Pandas DataFrames. It helps you understand your data structure and relationships with minimal effort.
Features
🔍 Schema Generation: Automatic structural schema detection
📊 Data Profiling: Comprehensive statistics and semantic type inference
🔗 Relationship Mapping: Primary/Foreign key detection using heuristics
⚡ Constraint Discovery: Intra-row constraint detection (GT, EQ)
📈 Visualization: Schema graphs with Graphviz
📤 Multi-format Export: JSON, YAML, SQL DDL, Avro
🛠️ CLI Support: Command-line interface for batch processing
Installation
pip install adel-lite
Development installation
git clone https://github.com/Parthnuwal7/adel-lite.git
cd adel-lite
pip install -e .
Quick Start
Basic Usage
import json

import pandas as pd
from adel_lite import schema, profile, map_relationships, build_meta

# Load your data
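customers = pd.DataFrame({
    'customer_id': [1, 2, 3],  # illustrative IDs
    'name': ['Alice', 'Bob', 'Charlie'],
    'email': ['alice@test.com', 'bob@test.com', 'charlie@test.com']
})
orders = pd.DataFrame({
    'order_id': [101, 102, 103],  # illustrative IDs
    'customer_id': [1, 2, 1],  # illustrative references to customers.customer_id
    'amount': [100.0, 150.0, 75.0]
})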
df_list = [customers, orders]
table_names = ['customers', 'orders']
# Generate comprehensive analysis
schema_result = schema(df_list, table_names)
profile_result = profile(df_list, table_names)
relationships_result = map_relationships(df_list, table_names)

# Build final meta structure
meta = build_meta(schema_result, profile_result, relationships_result)
print(json.dumps(meta, indent=2))
Command Line Usage
# Analyze CSV files
adel-lite --input data/*.csv --output schema.json

# Generate visualization
adel-lite --input *.csv --visualize --output schema.json

# Export as SQL DDL
adel-lite --input data/*.csv --format ddl --output schema.sql

# Skip constraint detection for faster processing
adel-lite --input *.csv --no-constraints --output schema.json
Core Functions
1. Schema Generation
from adel_lite import schema
# Generate structural schema
result = schema(df_list, table_names)
Returns:
- Table names and column information
- Data types (pandas + high-level)
- Nullable flags and positions
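To make the listed fields concrete, the sketch below derives a per-column record (name, position, high-level type, nullable flag) from plain Python lists. It is a dependency-free illustration of the idea, not adel-lite's implementation: `infer_high_level_type` and `describe_columns` are hypothetical helper names, and the type labels are assumptions.

```python
from typing import Any, Dict, List


def infer_high_level_type(values: List[Any]) -> str:
    """Map the non-null values of a column to a coarse type label (labels are illustrative)."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    if all(isinstance(v, bool) for v in present):
        return "boolean"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "integer"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "float"
    if all(isinstance(v, str) for v in present):
        return "string"
    return "mixed"


def describe_columns(table: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Build one record per column: name, position, type label, nullable flag."""
    return [
        {
            "name": name,
            "position": position,
            "dtype": infer_high_level_type(values),
            "nullable": any(v is None for v in values),
        }
        for position, (name, values) in enumerate(table.items())
    ]


# Hypothetical table held as column lists instead of a DataFrame.
customers = {
    "customer_id": [1, 2, 3],
    "name": ["Alice", "Bob", None],
}
columns = describe_columns(customers)
print(columns)
```

A column counts as nullable here as soon as one value is missing, which is the simplest possible rule. The library may apply a different one.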
2. Data Profiling
from adel_lite import profile
# Generate comprehensive profiles
result = profile(df_list, table_names)
Returns:
- Statistical summaries (min, max, mean, etc.)
- Uniqueness and null ratios
- Semantic type inference (id, datetime, categorical, etc.)
- Primary key candidates
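The uniqueness ratio, null ratio, and primary key candidate flag can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions rather than the library's code: `profile_column` is a hypothetical helper, and the rule "no nulls and no duplicates" for key candidates is an assumption.

```python
from typing import Any, Dict, List


def profile_column(values: List[Any]) -> Dict[str, Any]:
    """Compute null ratio, uniqueness ratio, and a primary-key-candidate flag."""
    total = len(values)
    present = [v for v in values if v is not None]
    null_ratio = (total - len(present)) / total if total else 0.0
    unique_ratio = len(set(present)) / len(present) if present else 0.0
    return {
        "null_ratio": null_ratio,
        "unique_ratio": unique_ratio,
        # Assumed rule: a key candidate has no nulls and no duplicates.
        "pk_candidate": total > 0 and null_ratio == 0.0 and unique_ratio == 1.0,
    }


order_id_profile = profile_column([101, 102, 103, 104])
customer_id_profile = profile_column([1, 2, 1, None])

print(order_id_profile)
print(customer_id_profile)
```

In this example `order_id` qualifies as a key candidate, while the `customer_id` column of an orders table does not, because it contains a repeated value and a null.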
3. Relationship Mapping
from adel_lite import map_relationships
# Detect relationships
result = map_relationships(df_list, table_names, fk_threshold=0.8)
Returns:
- Primary key detection
- Foreign key relationships with confidence scores
- Composite key candidates
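One common heuristic for scoring a foreign key candidate is value containment: the share of child values that also occur in the parent key column. The sketch below shows that heuristic with plain lists. Whether adel-lite computes its confidence this way is an assumption, and `fk_confidence` is a hypothetical helper name.

```python
from typing import Any, List


def fk_confidence(child_values: List[Any], parent_values: List[Any]) -> float:
    """Fraction of non-null child values that also appear in the parent column."""
    present = [v for v in child_values if v is not None]
    if not present:
        return 0.0
    parent_set = set(parent_values)
    return sum(1 for v in present if v in parent_set) / len(present)


customer_ids = [1, 2, 3]               # candidate primary key column
order_customer_ids = [1, 2, 1, 3, 9]   # 9 has no matching customer

confidence = fk_confidence(order_customer_ids, customer_ids)
is_foreign_key = confidence >= 0.8     # same value as fk_threshold=0.8 in the call shown earlier

print(confidence, is_foreign_key)
```

Four of the five order rows reference an existing customer, so the candidate sits exactly at the 0.8 cutoff. Raising the threshold makes detection stricter on dirty data.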
4. Constraint Detection
from adel_lite import detect_constraints
# Find intra-row constraints
result = detect_constraints(df_list, table_names, threshold=0.95)
Returns:
- GT constraints: A > B
- EQ constraints: A + B = C
- Confidence scores
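A confidence score for an intra-row constraint can be read as the fraction of rows that satisfy it. The sketch below illustrates this for the GT and EQ forms using plain dictionaries as rows. It is an assumption about how such scores could be computed, not the library's implementation: `gt_confidence` and `eq_sum_confidence` are hypothetical helpers, and the column names are made up.

```python
from typing import Dict, List

Row = Dict[str, float]


def gt_confidence(rows: List[Row], a: str, b: str) -> float:
    """Share of rows in which column a is strictly greater than column b."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row[a] > row[b]) / len(rows)


def eq_sum_confidence(rows: List[Row], a: str, b: str, c: str, tol: float = 1e-9) -> float:
    """Share of rows in which a + b equals c within a small tolerance."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if abs(row[a] + row[b] - row[c]) <= tol) / len(rows)


rows = [
    {"subtotal": 100.0, "tax": 8.0, "total": 108.0},
    {"subtotal": 50.0, "tax": 4.0, "total": 54.0},
    {"subtotal": 20.0, "tax": 1.5, "total": 21.5},
    {"subtotal": 10.0, "tax": 1.0, "total": 12.0},  # violates subtotal + tax = total
]

gt_score = gt_confidence(rows, "total", "subtotal")             # 4 of 4 rows hold
eq_score = eq_sum_confidence(rows, "subtotal", "tax", "total")  # 3 of 4 rows hold

threshold = 0.95
print("total > subtotal accepted:", gt_score >= threshold)
print("subtotal + tax = total accepted:", eq_score >= threshold)
```

With a threshold of 0.95, the GT constraint is accepted and the EQ constraint is rejected because one row in four breaks it.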
5. Visualization
from adel_lite import visualize
# Generate schema graph
path = visualize(schema_result, relationships_result, format='png')
6. Export
from adel_lite import export_schema
# Export to different formats
json_content = export_schema(meta, format='json')
yaml_content = export_schema(meta, format='yaml')
ddl_content = export_schema(meta, format='ddl')
Example Output
{
  "metadata": {
    "generated_at": "2025-09-10T12:42:00",
    "generator": "adel-lite",
    "version": "0.1.0"
  },
  "tables": [
    {
      "name": "customers",
      "primary_key": "customer_id",
      "fields": [
        {
          "name": "customer_id",
          "dtype": "integer",
          "semantic_type": "id",
          "subtype": "primary",
          "nullable": false
        }
      ]
    }
  ],
  "relationships": [
    {
      "type": "foreign_key",
      "foreign_table": "orders",
      "foreign_column": "customer_id",
      "referenced_table": "customers",
      "referenced_column": "customer_id",
      "confidence": 0.92
    }
  ]
}
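Because the output is a plain nested structure, it can be consumed with ordinary dictionary access. The sketch below walks a trimmed copy of the example output and lists foreign keys above a chosen confidence. `describe_foreign_keys` is a hypothetical helper, and the key names are taken from the example, so they are assumed to be stable across versions.

```python
from typing import Any, Dict, List

# Trimmed copy of the example output.
meta: Dict[str, Any] = {
    "tables": [
        {"name": "customers", "primary_key": "customer_id", "fields": []},
    ],
    "relationships": [
        {
            "type": "foreign_key",
            "foreign_table": "orders",
            "foreign_column": "customer_id",
            "referenced_table": "customers",
            "referenced_column": "customer_id",
            "confidence": 0.92,
        }
    ],
}


def describe_foreign_keys(meta: Dict[str, Any], min_confidence: float = 0.0) -> List[str]:
    """Render each sufficiently confident foreign key as 'child.col -> parent.col'."""
    lines = []
    for rel in meta.get("relationships", []):
        if rel.get("type") != "foreign_key":
            continue
        if rel["confidence"] < min_confidence:
            continue
        lines.append(
            f"{rel['foreign_table']}.{rel['foreign_column']} -> "
            f"{rel['referenced_table']}.{rel['referenced_column']} "
            f"(confidence {rel['confidence']:.2f})"
        )
    return lines


fk_lines = describe_foreign_keys(meta, min_confidence=0.9)
for line in fk_lines:
    print(line)
```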
Advanced Usage
Custom Thresholds
# Adjust detection thresholds
relationships = map_relationships(
    df_list, table_names,
    fk_threshold=0.9,  # Stricter FK detection
    name_similarity_threshold=0.8
)
constraints = detect_constraints(
    df_list, table_names,
    threshold=0.98  # Very strict constraints
)
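To get a feel for what a name similarity threshold filters out, the sketch below scores column name pairs on a 0 to 1 scale. It uses `difflib.SequenceMatcher` from the standard library as a stand-in for the fuzzywuzzy dependency listed under Requirements. How adel-lite actually scores names is an assumption, and `name_similarity` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def name_similarity(left: str, right: str) -> float:
    """Case-insensitive similarity ratio between two column names, from 0.0 to 1.0."""
    return SequenceMatcher(None, left.lower(), right.lower()).ratio()


exact = name_similarity("customer_id", "Customer_ID")
close = name_similarity("customer_id", "cust_id")
unrelated = name_similarity("customer_id", "amount")

threshold = 0.8
for label, score in [("exact", exact), ("close", close), ("unrelated", unrelated)]:
    print(f"{label}: {score:.2f} passes={score >= threshold}")
```

Abbreviated names can fall below a strict cutoff even when they refer to the same key, so it is worth checking a few real column pairs before raising the threshold.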
Sampling and Inspection
from adel_lite import sample
# Get sample data for inspection
samples = sample(df_list, table_names, n=10, method='random')

# Conditional sampling
samples = sample_by_condition(
    df_list,
    ['age > 25', 'amount > 100'],
    table_names
)
Configuration
CLI Configuration
# Full configuration example
adel-lite \
  --input data/*.csv \
  --output analysis.json \
  --format json \
  --visualize \
  --viz-format svg \
  --sample 5 \
  --constraint-threshold 0.9 \
  --fk-threshold 0.8 \
  --verbose
Logging
import logging
# Enable debug logging
logging.getLogger('adel_lite').setLevel(logging.DEBUG)
Performance Tips
- Skip constraints for large datasets: --no-constraints
- Limit sampling for inspection: --sample 100
- Use appropriate thresholds based on data quality
- Process in batches for very large datasets
Requirements
- Python 3.8+
- pandas >= 1.3.0
- numpy >= 1.21.0
- pyyaml >= 6.0
- networkx >= 2.6
- matplotlib >= 3.5.0
- graphviz >= 0.20.0
- fuzzywuzzy >= 0.18.0
Contributing
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make changes and add tests
- Run tests: pytest
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Roadmap
- Support for more data sources (databases, APIs)
- Advanced constraint types (LIKE patterns, regex)
- Machine learning-based relationship detection
- Interactive web interface
- Integration with data catalogs
Support
Made with ❤️ for the data community by Parth Nuwal