MaxCompute SQL Query Runner - Execute queries on Alibaba Cloud
MaxQuery - MaxCompute SQL Query Runner
MaxQuery is a powerful, user-friendly command-line tool for executing SQL queries on Alibaba Cloud's MaxCompute (ODPS) platform. No complex configuration or scripting: just run queries from your terminal!
Table of Contents
- Features
- Installation
- Quick Start
- Configuration
- Commands Reference
- Usage Examples
- Output Formats
- Project Structure
- Troubleshooting
- Contributing
- License
- Author
Features
- Easy Setup - One-time credential configuration
- Multiple Output Formats - CSV (default) or Parquet
- Flexible Query Execution - Single or batch queries
- Custom Output Paths - Save results anywhere
- Interactive CLI - User-friendly command-line interface
- Credential Management - Secure local credential storage
- Batch Processing - Run multiple queries at once
- Cross-Platform - Works on Linux, macOS, and Windows
Installation
Option 1: From PyPI (Recommended)
pip install maxquery
Option 2: From Source (Development)
git clone https://github.com/chethanpatel/maxquery.git
cd maxquery
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e .
Requirements
- Python 3.8 or higher
- pip (Python package manager)
Quick Start
1. First-Time Setup (Interactive)
maxquery config --setup
You'll be prompted to enter:
- Access ID - Your Alibaba Cloud MaxCompute access ID
- Access Key - Your Alibaba Cloud MaxCompute access key
- Project Name - Your MaxCompute project name
- Endpoint - API endpoint (default provided)
- Region - Your region (default: ap-southeast-5)
Your credentials are saved securely in ~/.maxquery/credentials.json
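If you ever want to check the stored file programmatically, here is a minimal sketch; note the JSON key names used below are assumptions for illustration, not maxquery's documented schema:

```python
# Sketch: inspect the saved credentials file without printing secret values.
# The JSON key names are assumptions, not maxquery's documented schema.
import json
from pathlib import Path

cred_path = Path.home() / ".maxquery" / "credentials.json"

if cred_path.exists():
    creds = json.loads(cred_path.read_text())
    for key, value in creds.items():
        print(f"{key}: {'<set>' if value else '<empty>'}")
else:
    print("No credentials found - run: maxquery config --setup")
```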
2. Run Your First Query
maxquery run queries/my_query.sql
Results are saved to outputs/ by default as CSV format.
3. Check Configuration
maxquery info
Shows your current credentials and configuration.
Configuration
Managing Credentials
Show current credentials:
maxquery config --show
Update credentials (interactive setup):
maxquery config --setup
Delete saved credentials:
maxquery config --delete
Credentials Location
Credentials are stored in: ~/.maxquery/credentials.json
Security Note: Never commit credentials to version control. The ~/.maxquery/ directory stays local to your machine.
Commands Reference
maxquery run - Execute Queries
Basic syntax:
maxquery run <sql_file> [sql_file2 ...] [OPTIONS]
Options:
| Option | Short | Default | Description |
|---|---|---|---|
| --format | - | 1 | Output format: 1=CSV, 2=Parquet |
| --output | -o | outputs | Output directory for results |
| --no-download | - | - | Run query but don't save files |
| --help | - | - | Show command help |
maxquery config - Manage Credentials
Basic syntax:
maxquery config [OPTIONS]
Options:
| Option | Description |
|---|---|
| --setup | Interactive credential setup |
| --show | Display current credentials |
| --delete | Delete saved credentials |
| --help | Show command help |
maxquery info - Show Configuration
Display current setup:
maxquery info
Shows:
- Saved credentials location
- Current project and region
- Environment variables (if set)
maxquery --version - Check Version
maxquery --version
Usage Examples
Example 1: Simple Query Execution
# Run a single query, save as CSV in outputs/
maxquery run queries/user_analysis.sql
Output:
Running 1 query(ies)
   Format: CSV
============================================================
Connected to ODPS Project: my_project

user_analysis...
   10500 records → outputs/user_analysis.csv
============================================================
Completed: 1/1 queries
Total records: 10500
Results saved to: outputs/
Example 2: Run Multiple Queries at Once
# Run all SQL files in a directory
maxquery run queries/production/*.sql
# Run specific queries
maxquery run queries/sales.sql queries/inventory.sql queries/customers.sql
Example 3: Save as Parquet Format
# Parquet format (better for large datasets)
maxquery run queries/large_dataset.sql --format 2
Output:
large_dataset...
   5000000 records → outputs/large_dataset.parquet
Example 4: Custom Output Directory
# Save results to a specific folder
maxquery run queries/monthly_report.sql -o reports/2026/
# Absolute path
maxquery run queries/analysis.sql -o /home/user/data/exports/
# Current directory
maxquery run queries/test.sql -o .
Example 5: Run Without Saving Files
# Execute query but keep results in memory only
# Useful for testing or piping to other tools
maxquery run queries/validation.sql --no-download
Example 6: Batch Processing with Different Formats
# Run local test queries as CSV
maxquery run queries/local/*.sql --format 1 -o results/local/
# Run production queries as Parquet
maxquery run queries/production/*.sql --format 2 -o results/production/
Example 7: Complex Workflow
# 1. Setup credentials (first time only)
maxquery config --setup
# 2. Check configuration
maxquery info
# 3. Run test query
maxquery run queries/test_connection.sql
# 4. Run monthly reports
maxquery run queries/reports/monthly/*.sql --format 2 -o reports/2026-01/
# 5. Run analytics queries
maxquery run queries/analytics/user_metrics.sql -o analytics/ --format 2
Output Formats
CSV Format (Default - Format 1)
maxquery run queries/data.sql --format 1
Pros:
- Human-readable
- Works in spreadsheet applications (Excel, Google Sheets)
- Good for small to medium datasets
Cons:
- Larger file size for big data
- Slower to read/write
Parquet Format (Format 2)
maxquery run queries/data.sql --format 2
Pros:
- Highly compressed (smaller file size)
- Faster read/write performance
- Better for big data processing
- Preserves data types
Cons:
- Requires specialized tools to read
- Not directly readable in Excel
Reading Parquet files in Python:
import pandas as pd
df = pd.read_parquet('outputs/data.parquet')
print(df.head())
Project Structure
maxquery/
├── maxquery/              # Main package
│   ├── __init__.py        # Package initialization
│   ├── cli.py             # Command-line interface
│   ├── core.py            # Query execution logic
│   └── credentials.py     # Credential management
├── queries/               # SQL query files
│   ├── local/             # Test/development queries
│   │   └── test_connection.sql
│   ├── production/        # Production queries
│   │   └── analytics.sql
│   └── schema/            # Schema definitions
├── outputs/               # Query results (auto-created)
├── setup.py               # Package setup configuration
├── pyproject.toml         # Project metadata
├── requirements.txt       # Python dependencies
├── README.md              # This file
└── LICENSE                # MIT License
Query Organization Best Practices
Recommended Folder Structure
queries/
├── local/                 # For testing/development
│   ├── test_connection.sql
│   └── data_validation.sql
├── production/            # For live queries
│   ├── daily/
│   │   ├── user_metrics.sql
│   │   └── sales_summary.sql
│   ├── weekly/
│   │   └── trend_analysis.sql
│   └── monthly/
│       └── business_report.sql
└── schema/                # Table definitions & documentation
    ├── users_table.sql
    ├── orders_table.sql
    └── products_table.sql
Query File Naming Conventions
- Use snake_case for file names: user_analysis.sql
- Be descriptive: daily_sales_report.sql instead of report.sql
- Group by frequency/type: daily_, weekly_, etc.
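With consistent prefixes and folders, query batches can also be selected programmatically before invoking maxquery; a sketch assuming the layout above (paths are illustrative):

```python
# Sketch: pick query files by naming convention, then hand them to maxquery.
# Directory paths follow the recommended layout above; adjust to your repo.
import glob
import subprocess

daily_queries = sorted(glob.glob("queries/production/daily/*.sql"))
print(f"Found {len(daily_queries)} daily queries")

# Equivalent to: maxquery run <files...> --format 2 -o reports/daily/
if daily_queries:
    subprocess.run(
        ["maxquery", "run", *daily_queries, "--format", "2", "-o", "reports/daily/"],
        check=True,
    )
```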
Advanced Usage
Running Queries from a Cron Job (Scheduled Execution)
Create a script run_daily_reports.sh:
#!/bin/bash
source ~/.maxquery_env/bin/activate
maxquery run /queries/production/daily/*.sql -o /data/reports/daily/
Add to crontab (runs daily at 2 AM):
0 2 * * * /home/user/scripts/run_daily_reports.sh
Processing Large Datasets
For very large results:
# Use Parquet format for better performance
maxquery run queries/huge_dataset.sql --format 2 -o big_data/
# Process with Python
import pandas as pd
import pyarrow.parquet as pq

df = pd.read_parquet('big_data/huge_dataset.parquet')

# pandas' read_parquet has no chunksize option; stream row batches
# with pyarrow for memory efficiency
pf = pq.ParquetFile('big_data/huge_dataset.parquet')
for batch in pf.iter_batches(batch_size=10000):
    process(batch.to_pandas())
Piping Output to Other Tools
# Convert results to JSON
maxquery run queries/data.sql --no-download | jq .
# Process with awk
maxquery run queries/data.sql -o - | awk '{print $1}'
Troubleshooting
Issue 1: "No credentials configured"
Error: No credentials configured
Run: maxquery config --setup
Solution:
maxquery config --setup
Issue 2: "Invalid URL" or Connection Errors
Error: Invalid URL 'hello/tenants': No scheme supplied
Causes & Solutions:
- Endpoint format is wrong → use the full URL: https://service.ap-southeast-5.maxcompute.aliyun.com/api
- Credentials are incorrect → verify with maxquery config --show
- Network issue → check your internet connection
Fix:
maxquery config --setup
# Re-enter correct credentials
Issue 3: "SQL file not found"
Error: SQL file not found
Solution:
# Verify file exists
ls queries/my_query.sql
# Use correct path
maxquery run queries/my_query.sql   # Correct
maxquery run my_query.sql           # Wrong (file not in current dir)
Issue 4: "Permission denied" when saving results
Error: Permission denied when writing to outputs/
Solution:
# Check directory permissions
ls -la outputs/
# Create directory if needed
mkdir -p outputs/
# Use writable directory
maxquery run queries/data.sql -o ~/Downloads/results/
Issue 5: Out of Memory with Large Results
Solution 1: Use no-download mode
maxquery run queries/huge_query.sql --no-download
Solution 2: Use Parquet format (more efficient)
maxquery run queries/huge_query.sql --format 2
Solution 3: Process in chunks (Python)
import pyarrow.parquet as pq

# Read in batches (pandas' read_parquet has no chunksize option)
pf = pq.ParquetFile('outputs/huge_data.parquet')
for batch in pf.iter_batches(batch_size=50000):
    process_chunk(batch.to_pandas())
Getting Help
View all commands:
maxquery --help
Get help for specific command:
maxquery run --help
maxquery config --help
maxquery info --help
Check version:
maxquery --version
Contributing
We welcome contributions! Here's how to contribute:

1. Fork the repository
   git clone https://github.com/chethanpatel/maxquery.git
   cd maxquery
2. Create a branch
   git checkout -b feature/your-feature-name
3. Make changes and test
   python -m pytest tests/
4. Commit and push
   git add .
   git commit -m "Add your feature description"
   git push origin feature/your-feature-name
5. Open a Pull Request on GitHub
Publishing to PyPI
To publish a new version:

1. Update the version in setup.py and pyproject.toml
2. Build the package:
   pip install build twine
   python -m build
3. Upload to PyPI:
   python -m twine upload dist/*
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Chethan Patel
- Email: chethanpatel100@gmail.com
- GitHub: @chethanpatel
- LinkedIn: Chethan Patel
Acknowledgments
- Built for Alibaba Cloud MaxCompute users
- Inspired by the need for simple, efficient data query tools
- Thanks to the Python community for excellent libraries
Support
For issues, questions, or suggestions:
- Check existing issues on GitHub Issues
- Create a new issue with:
- Detailed description of the problem
- Steps to reproduce
- Error messages
- Your environment info (OS, Python version)
- Email: chethanpatel100@gmail.com
Usage Statistics
Track your query usage:
# Count result files (ls -la would also count ., .., and the total line)
ls outputs/ | wc -l
# Check latest results
ls -lt outputs/ | head -10
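For a record-level tally rather than a file count, here is a sketch that sums rows across CSV results; it assumes every file in outputs/ has exactly one header row:

```python
# Sketch: sum data rows across all CSV results in outputs/.
# Assumes each file has one header row; adjust if yours differ.
import csv
from pathlib import Path

total = 0
for path in sorted(Path("outputs").glob("*.csv")):
    with path.open(newline="") as f:
        rows = sum(1 for _ in csv.reader(f)) - 1  # drop the header row
    print(f"{path.name}: {rows} records")
    total += max(rows, 0)

print(f"Total records: {total}")
```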
Learning Resources
MaxCompute/ODPS Documentation
Python & SQL Learning
Happy Querying!
For the latest updates, visit: https://github.com/chethanpatel/maxquery
Download files
Source Distribution
Built Distribution
File details
Details for the file maxquery-1.0.0.tar.gz.
File metadata
- Download URL: maxquery-1.0.0.tar.gz
- Upload date:
- Size: 16.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d34c6c340aefeac2d94526e00749dfde69e237290f96ca6c8bac4879c7b80f86 |
| MD5 | c988d85761ffeaadd2ec1ac9793f4095 |
| BLAKE2b-256 | 12142b114d09bba1f932ed916ed6d2304e3d586794ee81d97ca87939d3692009 |
File details
Details for the file maxquery-1.0.0-py3-none-any.whl.
File metadata
- Download URL: maxquery-1.0.0-py3-none-any.whl
- Upload date:
- Size: 12.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 147b2c0ee498b1aafe508d549b163228faefb1af1eb5eff0d7f676d3b8af40dc |
| MD5 | 58f4b4e2dce67cede050d6331938dc63 |
| BLAKE2b-256 | 4fed7f95a4a39931c8cc9e3ba25543ae9c07cbdbb64a653281db8d28fe26098f |