AI-powered code quality analysis tool with GitHub integration and interactive chatbot

These details have not been verified by PyPI

Code Quality Intelligence Agent

A comprehensive AI-powered code quality analysis tool that provides intelligent insights into codebases using advanced language models, static analysis techniques, and semantic search capabilities.

Overview

The Code Quality Intelligence Agent is an enterprise-grade tool that combines traditional static analysis with modern AI capabilities to assess code quality. It integrates multiple analysis engines, semantic search, and conversational AI to deliver actionable insights for software development teams.

Checklist

Core functionality tested and working
All CLI commands functional (17/17 tests passed)
Report generation working (JSON, Markdown, Console)
RAG system implemented and functional
GitHub and PyPI integration working with cloning
AI chatbot responding with context
Error handling graceful and informative
Documentation updated and complete
Cross-platform compatibility verified
Performance acceptable for production

Future Work

Support for multiple LLM inference providers
Support for multiple embedding backends
Dashboard and visualization

Core Architecture

Analysis Engine

Multi-language Support: Python, JavaScript, TypeScript, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala
Static Analysis Integration: Bandit for security, Radon for complexity, Semgrep for patterns
AST Parsing: Deep structural analysis of code syntax and semantics
Pattern Recognition: Security vulnerabilities, performance bottlenecks, maintainability issues

AI Integration

Language Model: Groq API integration with LangChain framework
Model Used: deepseek-r1-distill-llama-70b for code analysis and conversation
Conversation Management: Context-aware dialogue with memory persistence
Fallback System: Heuristic responses when API unavailable

RAG System Implementation

Vector Database: FAISS for efficient similarity search
Embedding Strategy: Custom feature extraction optimized for code analysis
Feature Dimensions: 512-dimensional vectors capturing code patterns and semantics
Code-Specific Features:
- Function and class detection patterns
- Import statement analysis
- Security pattern recognition (eval, pickle, SQL injection)
- Complexity indicators (loops, conditionals, nesting)
- Vocabulary-based semantic features
Text Chunking: Recursive character splitting optimized for code structure
Persistence: Automatic index saving and loading for performance
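
The code-specific features listed above can be pictured with a small sketch. This is not the package's implementation: the pattern table, the slot layout, and the hashing of vocabulary terms into the remaining slots are all assumptions made for illustration. Only the 512-dimensional size and the feature categories come from the description above.

```python
import hashlib
import math
import re
from typing import List

DIM = 512  # matches the 512-dimensional vectors described above

# Hypothetical pattern set; the package's real feature list may differ.
PATTERNS = {
    "function_def": re.compile(r"^\s*def\s+\w+", re.MULTILINE),
    "class_def": re.compile(r"^\s*class\s+\w+", re.MULTILINE),
    "import_stmt": re.compile(r"^\s*(?:import|from)\s+\w+", re.MULTILINE),
    "eval_call": re.compile(r"\beval\s*\("),
    "pickle_use": re.compile(r"\bpickle\.loads?\s*\("),
    "loop": re.compile(r"^\s*(?:for|while)\b", re.MULTILINE),
    "conditional": re.compile(r"^\s*(?:if|elif)\b", re.MULTILINE),
}
RESERVED = len(PATTERNS)  # the leading slots hold the pattern counts


def _bucket(token: str) -> int:
    """Map a vocabulary token to a stable slot after the reserved ones."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return RESERVED + int.from_bytes(digest[:4], "big") % (DIM - RESERVED)


def embed(code: str) -> List[float]:
    """Return an L2-normalized 512-dimensional feature vector for a chunk."""
    vec = [0.0] * DIM
    # Structural and security features: one count per pattern.
    for slot, pattern in enumerate(PATTERNS.values()):
        vec[slot] = float(len(pattern.findall(code)))
    # Vocabulary features: identifier frequencies hashed into fixed slots.
    for token in re.findall(r"[A-Za-z_]\w*", code):
        vec[_bucket(token.lower())] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec
```

Normalizing the vector at the end means a later inner-product comparison between two chunks behaves like cosine similarity.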

Report Generation

Console Output: Rich formatted terminal reports with color coding and tables
JSON Export: Structured data format for integration with other tools
Markdown Export: Human-readable reports for documentation
Interactive Mode: Real-time Q&A interface with code context

Installation

Prerequisites

Python 3.8 or higher
Git (for GitHub repository analysis)
Groq API key (obtain from console.groq.com)

Core Installation

# Install core dependencies (recommended)
pip install langchain==0.2.16 langchain-groq==0.1.9 langchain-community==0.2.16 click==8.1.7 rich==13.7.1 gitpython==3.1.43 requests==2.31.0 python-dotenv==1.0.1 bandit==1.7.5 radon==6.0.1 safety==3.2.7 streamlit==1.31.0 pandas==2.2.0 faiss-cpu==1.8.0

# Install from requirements file (may require C++ build tools on Windows)
pip install -r requirements.txt

Package Installation

# Install as package (after building)
pip install dist/code_quality_intelligence-1.6.0-py3-none-any.whl

# Global commands available after installation
cqi analyze /path/to/code
code-quality analyze /path/to/code --interactive

# Setup web interface for easy access
python setup-web.py
# Then use: python cqi-web.py

Web Interface Setup and Launchers

# Setup for global access (run once)
python setup-web.py

# Available launcher methods:
python cqi-web.py              # Main launcher (works from anywhere)
cqi-web.bat                    # Windows batch file
cd Webpage && python launch.py # Local launcher with dependency check
cd Webpage && streamlit run app.py # Direct Streamlit launch

# Test web interface functionality
cd Webpage && python test_app.py

Configuration

API Key Setup

# Method 1: Environment variable
export GROQ_API_KEY=your_api_key_here

# Method 2: .env file
echo "GROQ_API_KEY=your_api_key_here" > .env

# Method 3: CLI parameter
python cqi.py analyze /path/to/code --groq-key your_api_key_here

# Method 4: Setup wizard
python cqi.py setup
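
When more than one of these methods is in play, some precedence has to apply. The source does not state the order, so the sketch below assumes a common one (CLI parameter, then environment variable, then `.env` file). The function names are hypothetical, and the `.env` parser is a simplified stand-in for python-dotenv.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def read_dotenv(path: Path) -> Dict[str, str]:
    """Parse simple KEY=value lines; comments and blank lines are ignored."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_groq_key(
    cli_key: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
    dotenv_path: Path = Path(".env"),
) -> Optional[str]:
    """Assumed precedence: CLI parameter, environment variable, .env file."""
    environ = os.environ if environ is None else environ
    return (
        cli_key
        or environ.get("GROQ_API_KEY")
        or read_dotenv(dotenv_path).get("GROQ_API_KEY")
    )
```

Returning `None` when nothing is configured lets the caller decide whether to prompt the user or fall back to analysis without the LLM.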

System Validation

# Validate complete system functionality
python validation.py

# Test RAG system specifically
python -c "from code_quality_agent.rag_system import CodeRAGSystem; rag = CodeRAGSystem(); print('RAG Available:', rag.is_available()); print('System:', rag.get_collection_stats().get('system'))"

Usage

Basic Analysis Commands

Local File Analysis

# Analyze single file
python cqi.py analyze sample_code/module_a.py --format console

# Analyze directory
python cqi.py analyze sample_code --format console

# Get codebase statistics
python cqi.py info sample_code

GitHub Repository Analysis

# Analyze public repository
python cqi.py analyze https://github.com/pallets/flask --format console

# Analyze specific branch
python cqi.py analyze https://github.com/user/repo --branch develop

# Analyze with interactive chat
python cqi.py analyze https://github.com/user/repo --interactive

Report Generation

Export Formats

# Generate JSON report
python cqi.py analyze sample_code --format json --output analysis_report.json

# Generate Markdown report
python cqi.py analyze sample_code --format markdown --output analysis_report.md

# Console output (default)
python cqi.py analyze sample_code --format console

Interactive Features

AI-Powered Chat

# Interactive analysis with RAG-enhanced responses
python cqi.py analyze sample_code --interactive

# Dedicated chat session
python cqi.py chat

# Chat with GitHub repository context
python cqi.py analyze https://github.com/user/repo --interactive

Advanced Options

Customization

# Override API key
python cqi.py analyze code --groq-key custom_api_key

# Branch-specific analysis
python cqi.py analyze https://github.com/user/repo --branch feature-branch

# Combined options
python cqi.py analyze https://github.com/user/repo --format json --output github_analysis.json --interactive --branch main

Package Execution Methods

As Python Module

# Version information
python -m code_quality_agent --version

# Help documentation
python -m code_quality_agent --help

# Analysis execution
python -m code_quality_agent analyze sample_code --format console

# Interactive mode
python -m code_quality_agent analyze sample_code --interactive

As Entry Point Script

# Version information
python cqi.py --version

# Help documentation
python cqi.py --help

# Analysis execution
python cqi.py analyze sample_code --format console

# Interactive mode
python cqi.py analyze sample_code --interactive

Web Interfaces

Dashboard Access

# Simple HTML dashboard (no dependencies)
python simple_dashboard.py
# Access at: http://localhost:8080

# Advanced Streamlit dashboard
streamlit run streamlit_app.py
# Access at: http://localhost:8501

# Launch via CLI
python cqi.py dashboard

Professional Web Interface

# Quick launch - single command from anywhere
python cqi-web.py

# Windows batch file
cqi-web.bat

# Direct Streamlit launch
cd Webpage && streamlit run app.py
# Access at: http://localhost:8501

Web Interface Architecture:

Main Application: Webpage/app.py - Complete Streamlit web application
Configuration: Webpage/config.py - Centralized settings and constants
Launcher Scripts: Multiple launch options for different platforms
Testing Suite: Webpage/test_app.py - Comprehensive functionality tests
Dependencies: Webpage/requirements.txt - All required packages

Core Features:

Professional UI: Clean, emoji-free interface suitable for enterprise use
Complete CLI Parity: All command-line functionality accessible through web interface
Real-time Analysis: Live progress updates with visual feedback
Interactive Visualizations: Charts, graphs, and data tables using Plotly
Multi-page Navigation: 5 distinct pages (Home, Setup, Info, Analyze, Chat)
RAG System Integration: AI-powered code discussions with semantic search
GitHub Integration: Direct repository analysis and cloning
Local File Support: Comprehensive local codebase analysis

Page-Specific Functionality:

Home Page:

Welcome screen with feature overview
Getting started guide and supported languages
CLI commands reference and feature comparison

Setup Page:

Groq API key configuration and validation
System dependency checking and status
Environment setup and troubleshooting

Info Page:

Quick codebase statistics without full analysis
Language distribution analysis and file counts
File size and structure information

Analyze Page:

Full code quality analysis with multiple output formats
Interactive mode with real-time progress
Visual reports with severity and category charts
File analysis table with highlighting
Support for both local paths and GitHub repositories

Chat Page:

Enhanced conversational AI with RAG capabilities
Context-aware responses based on codebase analysis
Interactive chat interface with conversation history
Code-specific question answering

Technical Implementation:

Session Management: Persistent state across page navigation
Caching System: Optimized performance with result caching
Error Handling: Graceful error management with user feedback
Responsive Design: Wide layout optimized for data visualization
Modular Architecture: Separate functions for each component
Configuration Management: Centralized settings and constants

Supported Analysis Types:

Local file and directory analysis
GitHub repository analysis with branch support
Multiple output formats (Console, JSON, Markdown)
Interactive chat sessions
RAG-enhanced code discussions

Dependencies:

Core: code-quality-intelligence>=1.0.1
Web Framework: streamlit>=1.31.0, streamlit-chat>=0.1.1
Data Processing: pandas>=2.2.0, plotly>=5.19.0, numpy>=1.24.0
AI/ML: langchain>=0.2.16, sentence-transformers>=2.2.2, chromadb>=0.4.22
Visualization: matplotlib>=3.8.4, seaborn>=0.13.2, altair>=5.2.0
Code Analysis: bandit>=1.7.5, radon>=6.0.1, semgrep>=1.45.0

Web Interface File Structure:

Webpage/
├── app.py                 # Main Streamlit application (689 lines)
├── config.py              # Configuration and constants
├── requirements.txt        # Web interface dependencies
├── launch.py              # Local launcher with dependency checking
├── test_app.py            # Comprehensive functionality tests
├── README.md              # Web interface documentation
├── QUICK_START.md         # Quick start guide
├── cqi-web                # Unix launcher script
├── cqi-web.bat            # Windows batch launcher
├── launch.bat             # Windows batch file
├── launch.sh              # Unix shell script
├── components/            # Reusable UI components (empty)
├── pages/                 # Additional pages (empty)
├── static/                # Static assets (empty)
└── simple_embeddings_db/  # RAG system database

Key Functions in app.py:

initialize_session_state() - Session management
run_analysis() - Core analysis execution with caching
run_codebase_info() - Quick statistics without full analysis
create_overview_metrics() - Metrics display with professional styling
create_severity_chart() - Issue severity visualization
create_category_chart() - Issue category distribution
create_file_analysis_table() - Detailed file analysis with highlighting
create_chatbot_interface() - AI chat functionality
create_rag_stats() - RAG system statistics and management
setup_page() - Configuration and dependency checking
info_page() - Quick codebase information
analyze_page() - Full analysis interface
chat_page() - Enhanced chat interface

Configuration Options (config.py):

Application Settings: Title, icon, layout configuration
Server Settings: Host, port, and network configuration
Supported Languages: 13 programming languages with detection
Output Formats: Console, JSON, Markdown support
Analysis Types: Local Path and GitHub Repository
Chart Colors: Professional color scheme for severity levels
Page Configuration: Navigation structure and page definitions
Default Settings: Analysis type, output format, interactive mode
API Configuration: Groq API key environment variable and timeouts
File Limits: Maximum file size and total size constraints
Cache Settings: TTL and caching enablement
Logging: Log level and format configuration

Professional Design Features:

Emoji-Free Interface: Clean, enterprise-suitable design
Text-Based Severity Indicators: [CRITICAL], [HIGH], [MEDIUM], [LOW], [INFO]
Professional Color Scheme: Consistent color coding for severity levels
Responsive Layout: Wide layout optimized for data visualization
Custom CSS Styling: Professional header, metric cards, and chat messages
Clean Navigation: Simple page names without emoji clutter

Web Interface Troubleshooting:

Common Issues and Solutions:

Import Errors: Ensure main package is installed
```
pip install code-quality-intelligence
```
Missing Dependencies: Install web interface requirements
```
cd Webpage && pip install -r requirements.txt
```
Port Conflicts: Change port if 8501 is occupied
```
streamlit run app.py --server.port 8502
```
Analysis Failures: Check file paths and permissions
- Verify target directory contains supported code files
- Ensure proper read permissions for local files
- Check GitHub repository URL format
API Key Issues: Configure through Setup page
- Go to Setup page in web interface
- Enter valid Groq API key
- Check environment variable configuration
RAG System Problems: Check database and dependencies
- Verify ChromaDB installation
- Check sentence-transformers availability
- Review RAG statistics in Chat page

Testing and Validation:

# Test web interface functionality
cd Webpage && python test_app.py

# Check specific components
cd Webpage && python -c "from app import *; print('All imports successful')"

# Validate configuration
cd Webpage && python -c "from config import *; print('Configuration loaded')"

Performance Optimization:

Caching: Results are cached for 1 hour by default
File Limits: Maximum 10MB per file, 100MB total
Session Management: State persists across page navigation
Lazy Loading: Components load only when needed

Technical Implementation

RAG System Details

The Retrieval-Augmented Generation system implements semantic search capabilities for enhanced code analysis and conversational AI responses.

Embedding Implementation

Vector Database: FAISS IndexFlatIP, an exact inner-product index; its scores equal cosine similarity when the vectors are L2-normalized
Feature Extraction: Custom 512-dimensional feature vectors
Code Patterns: Function definitions, class structures, import statements
Security Patterns: eval usage, pickle operations, SQL injection indicators
Vocabulary Building: Dynamic term frequency analysis
Chunking Strategy: Recursive character splitting with code-aware separators
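
An inner-product index returns cosine similarity only if every stored vector and every query is L2-normalized first. The pure-Python sketch below shows that relationship with an exhaustive search, which is the scoring a flat inner-product index applies. It is a stand-in for illustration and does not call FAISS. With FAISS itself, `faiss.normalize_L2` is the usual way to normalize before adding vectors to the index.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def search(
    index: Sequence[Sequence[float]], query: Sequence[float], k: int = 3
) -> List[Tuple[int, float]]:
    """Exhaustive inner-product search over already-normalized rows."""
    q = l2_normalize(query)
    scored = [(i, inner_product(q, row)) for i, row in enumerate(index)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Rows are normalized once, at indexing time.
index = [l2_normalize(v) for v in ([1.0, 0.0], [1.0, 1.0], [0.0, 2.0])]
top = search(index, [3.0, 0.0], k=2)
```

Because both sides are unit length, the query `[3.0, 0.0]` scores 1.0 against the first row regardless of its magnitude, which is the behavior expected of cosine similarity.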

Fallback Hierarchy

ChromaDB: Full vector database with sentence transformers (requires C++ build tools)
Simple Embedding RAG: FAISS-based semantic search with custom features
Simple RAG: Keyword-based matching and text search
No RAG: Basic analysis without enhanced context
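
One way to realize a hierarchy like this is to probe for each tier's optional dependencies and take the first tier whose requirements are met. The sketch below is an assumption about how such a selection could work, not the package's code. The tier labels and the module names checked (`chromadb`, `sentence_transformers`, `faiss`) are inferred from the packages named in this document.

```python
import importlib.util
from typing import Callable, Sequence, Tuple

# (tier label, modules that must be importable), in order of preference.
TIERS: Sequence[Tuple[str, Tuple[str, ...]]] = (
    ("chromadb", ("chromadb", "sentence_transformers")),
    ("simple_embedding", ("faiss",)),
    ("simple", ()),  # keyword matching needs no optional packages
)


def module_available(name: str) -> bool:
    """True when a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def select_rag_backend(
    enabled: bool = True,
    available: Callable[[str], bool] = module_available,
) -> str:
    """Return the first tier whose dependencies are all present."""
    if not enabled:
        return "none"
    for tier, required in TIERS:
        if all(available(module) for module in required):
            return tier
    return "none"
```

Passing the availability check in as a parameter keeps the selection logic testable on machines where none of the optional packages are installed.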

Analysis Pipeline

File Processing

Input Validation: Path verification and file type detection
Language Detection: Automatic programming language identification
Content Extraction: UTF-8 encoding with error handling
Chunking: Code-aware text splitting for optimal analysis
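
The first three steps can be sketched as follows. The extension table is a partial, illustrative mapping and the function names are hypothetical. The package keeps its own list in `Config.SUPPORTED_EXTENSIONS`, whose contents are not shown here.

```python
from pathlib import Path
from typing import Optional, Tuple

# Partial, illustrative mapping from file extension to language.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".js": "JavaScript",
    ".ts": "TypeScript",
    ".java": "Java",
    ".go": "Go",
    ".rs": "Rust",
    ".rb": "Ruby",
    ".php": "PHP",
}


def detect_language(path: Path) -> Optional[str]:
    """Identify the language from the file extension, case-insensitively."""
    return EXTENSION_LANGUAGES.get(path.suffix.lower())


def load_source(path: Path) -> Optional[Tuple[str, str]]:
    """Return (language, text), or None when the path is not a supported file."""
    if not path.is_file():
        return None
    language = detect_language(path)
    if language is None:
        return None
    # errors="replace" substitutes U+FFFD for invalid bytes instead of raising.
    text = path.read_text(encoding="utf-8", errors="replace")
    return language, text
```

Decoding with `errors="replace"` is one reasonable reading of "UTF-8 encoding with error handling": a file with a few stray bytes is still analyzed rather than aborting the run.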

Quality Assessment

Security Analysis: Bandit integration for vulnerability detection
Complexity Metrics: Radon integration for cyclomatic complexity
Pattern Matching: Semgrep rules for best practice violations
Duplication Detection: Cross-file similarity analysis
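
To make the complexity metric concrete, here is a rough approximation of cyclomatic complexity built on Python's standard `ast` module. It is not Radon's implementation and its counts may differ from Radon's in edge cases. It only illustrates the idea of counting decision points per function.

```python
import ast
from typing import Dict

# Node types treated as one decision point each.
_BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.Assert, ast.comprehension,
)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate McCabe complexity: 1 plus the number of decision points."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths.
            score += len(node.values) - 1
    return score


def complexity_by_function(source: str) -> Dict[str, int]:
    """Map each function name in the source to its approximate complexity."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

Note that this sketch folds the branches of a nested function into its enclosing function as well, and that functions sharing a name overwrite one another in the result.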

AI Enhancement

Context Preparation: RAG system provides relevant code chunks
LLM Processing: Groq API analysis with conversation memory
Response Generation: Contextual insights with actionable recommendations
Follow-up Support: Conversation continuation with maintained context
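
The interplay between retrieved chunks and conversation memory can be sketched as a prompt builder with a bounded history. The class name, the prompt layout, and the five-turn default are assumptions for illustration. The package's actual prompts and its LangChain memory configuration are not shown in this document.

```python
from collections import deque
from typing import Deque, List, Tuple


class ConversationContext:
    """Combine retrieved code chunks with a bounded conversation history."""

    def __init__(self, max_turns: int = 5) -> None:
        # Oldest turns are dropped automatically once max_turns is reached.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def build_prompt(self, question: str, chunks: List[str]) -> str:
        parts = ["Relevant code:"]
        parts += [f"[chunk {i}]\n{chunk}" for i, chunk in enumerate(chunks, 1)]
        if self._turns:
            parts.append("Conversation so far:")
            parts += [f"User: {q}\nAssistant: {a}" for q, a in self._turns]
        parts.append(f"Question: {question}")
        return "\n\n".join(parts)

    def record(self, question: str, answer: str) -> None:
        """Store a completed exchange so follow-up questions keep context."""
        self._turns.append((question, answer))
```

Bounding the history keeps the prompt size predictable, at the cost of forgetting the earliest exchanges in a long session.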

Supported Analysis Categories

Security Assessment

SQL injection vulnerabilities
Cross-site scripting risks
Unsafe deserialization patterns
Hardcoded credentials detection
Command injection vulnerabilities
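
For Python sources, checks of this kind can be expressed as a walk over the syntax tree. The sketch below flags a few well-known risky calls. The rule table is hypothetical and far smaller than what Bandit or Semgrep cover, and it matches call names as written, so aliased imports are not resolved.

```python
import ast
from typing import List, Tuple

# Hypothetical rule table: dotted call name -> reason it is flagged.
RISKY_CALLS = {
    "eval": "eval() executes arbitrary code",
    "exec": "exec() executes arbitrary code",
    "pickle.load": "unpickling untrusted data can execute code",
    "pickle.loads": "unpickling untrusted data can execute code",
    "os.system": "shell command built from input risks command injection",
}


def dotted_name(node: ast.AST) -> str:
    """Render `a.b.c` for Name/Attribute chains; empty string otherwise."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = dotted_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else ""
    return ""


def find_risky_calls(source: str) -> List[Tuple[int, str, str]]:
    """Return (line, call name, reason) for each flagged call, by line."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name, RISKY_CALLS[name]))
    return sorted(findings)
```

Working on the syntax tree rather than on raw text avoids false positives from comments and string literals that merely mention `eval`.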

Performance Analysis

Algorithmic complexity assessment
Memory usage patterns
Database query optimization
Loop efficiency analysis
Resource leak detection

Code Quality Metrics

Cyclomatic complexity measurement
Maintainability index calculation
Code duplication identification
Documentation coverage assessment
Naming convention compliance

Best Practice Validation

Error handling patterns
Design pattern implementation
Testing coverage gaps
API design principles
Security best practices

Integration Examples

CI/CD Pipeline Integration

name: Code Quality Assessment
on: [push, pull_request]

jobs:
  quality_check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install Code Quality Agent
        run: pip install code-quality-intelligence
      - name: Run Analysis
        run: cqi analyze . --format json --output quality_report.json
        env:
          GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
      - name: Upload Report
        uses: actions/upload-artifact@v4
        with:
          name: quality-report
          path: quality_report.json

Pre-commit Hook

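#!/bin/bash
# .git/hooks/pre-commit: analyze each staged (added, copied, or modified) file
git diff --cached --name-only --diff-filter=ACM -z |
while IFS= read -r -d '' file; do
    python cqi.py analyze "$file"
done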

Performance Characteristics

Execution Metrics

Local Analysis: Instant results for codebases under 100 files
GitHub Analysis: 2-3 minutes for repositories with 100+ files
RAG Indexing: 0.5-2 seconds per code chunk
Embedding Generation: 50-100ms per code chunk
AI Response Time: 2-5 seconds (depending on API latency)

Scalability

File Limit: Automatically handles up to 100 files per analysis
Size Limit: Files larger than 1MB are skipped for performance
Memory Usage: Optimized for systems with 4GB+ RAM
Concurrent Processing: Parallel analysis of multiple files
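
The limits above combine naturally into a select-then-fan-out step. The sketch below assumes the file cap is applied by truncating the sorted candidate list and that parallelism uses a thread pool. Both are guesses at one reasonable design, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, List

MAX_FILES = 100              # per-analysis file cap stated above
MAX_FILE_SIZE = 1024 * 1024  # files larger than 1MB are skipped


def select_files(
    paths: Iterable[Path],
    max_files: int = MAX_FILES,
    max_size: int = MAX_FILE_SIZE,
) -> List[Path]:
    """Keep regular files within the size limit, up to the file cap."""
    kept: List[Path] = []
    for path in sorted(paths):
        if path.is_file() and path.stat().st_size <= max_size:
            kept.append(path)
        if len(kept) == max_files:
            break
    return kept


def analyze_all(
    paths: Iterable[Path],
    analyze_one: Callable[[Path], Any],
    workers: int = 4,
) -> Dict[Path, Any]:
    """Run analyze_one over the selected files in parallel."""
    files = select_files(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(analyze_one, files))
    return dict(zip(files, results))
```

Threads suit this workload when the per-file work is dominated by I/O or by external tools. CPU-bound analysis in pure Python would benefit more from a process pool.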

Error Handling

Graceful Degradation

API Unavailable: Falls back to heuristic responses
RAG Disabled: Continues with basic analysis
Network Issues: Local analysis remains functional
Invalid Paths: Clear error messages with suggestions
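
The API fallback amounts to wrapping the LLM call and substituting a canned response on failure. The sketch below shows the pattern with a hypothetical `ask_llm` callable and made-up heuristic texts. The package's real fallback messages are not documented here.

```python
from typing import Callable


def heuristic_answer(question: str) -> str:
    """Keyword-based reply used when the LLM cannot be reached."""
    q = question.lower()
    if "security" in q:
        return "LLM unavailable. Review the security findings in the report."
    if "complex" in q:
        return "LLM unavailable. Check the complexity metrics in the report."
    return "LLM unavailable. See the generated report for details."


def answer(question: str, ask_llm: Callable[[str], str]) -> str:
    """Prefer the LLM, but degrade to a heuristic reply on any failure."""
    try:
        return ask_llm(question)
    except Exception:
        # Network errors, a missing key, and rate limits all land here.
        return heuristic_answer(question)
```

Catching a broad exception is deliberate in this sketch: the chat should stay usable whatever the cause of the failure. A production version would likely log the error as well.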

Troubleshooting

Common Issues

"GROQ_API_KEY not found"

Solution: Set environment variable or use --groq-key parameter
Command: python cqi.py setup

"No supported files found"

Solution: Verify file extensions and path correctness
Command: python cqi.py info /path/to/code

"RAG system not available"

Solution: Install optional dependencies or use without RAG
Command: pip install faiss-cpu sentence-transformers

"GitHub cloning failed"

Solution: Verify Git installation and repository access
Command: git --version (confirms Git is on the system PATH)

Development and Customization

Extending Analysis Rules

The system supports custom analysis rules through the LangChain integration:

# Custom analyzer implementation
from typing import Dict, List

from code_quality_agent.analyzers import CodeAnalyzer

class CustomAnalyzer(CodeAnalyzer):
    def analyze_custom_patterns(self, content: str) -> List[Dict]:
        # Implement custom analysis logic; return one dict per issue found
        return []

Adding Language Support

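# Language-specific analyzer (define as a method on a CodeAnalyzer subclass)
from pathlib import Path
from typing import Any, Dict

def analyze_rust_file(self, file_path: Path) -> Dict[str, Any]:
    # Implement Rust-specific analysis
    return {
        'language': 'rust',
        'issues': [],
        'complexity': {}
    }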

API Reference

Core Classes

CodeQualityAgent: Main analysis orchestrator
CodeAnalyzer: Multi-language static analysis engine
CodeRAGSystem: Semantic search and context retrieval
CodeQualityChatbot: Conversational AI interface
ReportGenerator: Multi-format report generation

Configuration

Config.DEFAULT_MODEL: LLM model selection
Config.MAX_FILE_SIZE: File size limit (1MB default)
Config.SUPPORTED_EXTENSIONS: Supported file types
Config.QUALITY_CATEGORIES: Analysis categories

Testing and Validation

Comprehensive Test Suite

# Complete system validation
python ship_validation.py

# Feature-specific testing
python test_all_features.py

# Package functionality testing
python -m code_quality_agent analyze sample_code
python cqi.py analyze sample_code --interactive

Expected Results

Local Analysis: 11 issues detected in sample code (3 high-severity security issues)
GitHub Analysis: 136+ issues detected in test repositories
RAG System: 200+ code chunks indexed with semantic search
Performance: Complete validation in under 3 minutes

License and Attribution

License

MIT License - see LICENSE file for details

Dependencies

LangChain: Agent framework and LLM integration
Groq: High-performance LLM inference
FAISS: Efficient similarity search and clustering
Rich: Terminal formatting and user interface
Click: Command-line interface framework
GitPython: Git repository operations
Bandit: Python security analysis
Radon: Code complexity metrics

Support and Contributing

Bug Reports

Report issues through the GitHub issue tracker with detailed reproduction steps and environment information.

Feature Requests

Submit enhancement proposals with clear use cases and implementation suggestions.

Contributing Guidelines

Fork the repository
Create feature branch with descriptive name
Implement changes with appropriate test coverage
Submit pull request with detailed description
Ensure all validation tests pass

Release history

1.6.2: Sep 11, 2025
1.6.1: Sep 10, 2025
1.6.0: Sep 10, 2025 (this version)
1.5.0: Sep 10, 2025

Download files

Source Distribution

code_quality_intelligence-1.6.0.tar.gz (57.4 kB), uploaded Sep 10, 2025, Source

Built Distribution

code_quality_intelligence-1.6.0-py3-none-any.whl (54.5 kB), uploaded Sep 10, 2025, Python 3

File details

Details for the file code_quality_intelligence-1.6.0.tar.gz.

File metadata

Download URL: code_quality_intelligence-1.6.0.tar.gz
Upload date: Sep 10, 2025
Size: 57.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for code_quality_intelligence-1.6.0.tar.gz
Algorithm	Hash digest
SHA256	`a44d1060263eac030575fe665cb5e8dcfb78999c5dc8a1ea74952fc90c5a86e1`
MD5	`1f4b82b3d64c8b62072af2a6ac6ddf7f`
BLAKE2b-256	`708d8fb2c3e2aae9e5157fac79d76d7a0074eb663d963b362e18ed4113fd54c5`

File details

Details for the file code_quality_intelligence-1.6.0-py3-none-any.whl.

File metadata

Download URL: code_quality_intelligence-1.6.0-py3-none-any.whl
Upload date: Sep 10, 2025
Size: 54.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for code_quality_intelligence-1.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6c469eb9a442ddac9a918ddf8861ecfa9b6fc82b8746f02f4ca0f83e36839263`
MD5	`c5ede087c1e0cc3633b15a9162385c37`
BLAKE2b-256	`3d7242310548b1d4e64ac630d795d2d2b10c8523f6417212ade1c0506afe888d`
