A library for codebase analysis

Codebase Analysis Utils

Overview

eic-codebase-analysis is a Python library for analyzing code repositories and large codebases. It extracts a repository's directory structure and file contents as Markdown, and uses AI (Gemini) to generate descriptive metadata at the file, folder, and project level.

Installation

pip install eic-codebase-analysis

Tools

This library provides a set of modular tools for different analysis tasks:

1. Repository Structure Extractor

  • Purpose: Extracts the directory and file structure of repositories without including file contents.
  • Output: Markdown tree structure.
  • Usage:
    python -m eic_codebase_analysis.repository_structure_extractor.main --root ./path/to/repo
    
  • Documentation: Read more
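
To make the idea of a contents-free Markdown tree concrete, here is a minimal sketch using only the standard library. It is not the library's implementation: the function name `markdown_tree` and the bullet-based layout are assumptions made for illustration, and the real tool's output format may differ.

```python
from pathlib import Path


def markdown_tree(root: Path, indent: int = 0) -> list[str]:
    """Return Markdown bullet lines for everything under root (names only)."""
    lines = []
    # Directories first, then files; each group sorted by name for stable output.
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name))
    for entry in entries:
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{'  ' * indent}- {entry.name}{suffix}")
        if entry.is_dir():
            # Recurse one level deeper; file contents are never read.
            lines.extend(markdown_tree(entry, indent + 1))
    return lines
```

Calling `"\n".join(markdown_tree(Path("./path/to/repo")))` would give a nested bullet list of names, which is roughly the kind of output the tool describes.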

2. Detailed Code Content Extractor

  • Purpose: Generates a single Markdown document containing both the directory structure and the full contents of files (in code blocks). Suited to use as context for Retrieval Augmented Generation (RAG).
  • Output: Markdown with file contents.
  • Usage:
    python -m eic_codebase_analysis.detailed_code_content_extractor.main --root ./path/to/repo
    
  • Documentation: Read more
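
The sketch below illustrates how file contents might be gathered into one Markdown string with a code block per file. It is an assumption-laden simplification, not the library's code: `markdown_with_contents` is a hypothetical name, the per-file heading style is invented, binary files are simply skipped, and the directory tree that the real tool also emits is left out.

```python
from pathlib import Path

FENCE = "`" * 3  # Markdown code fence, built here so it does not clash with this block


def markdown_with_contents(root: Path) -> str:
    """Concatenate every UTF-8 text file under root into one Markdown string."""
    sections = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # assumption: non-text files are skipped in this sketch
        rel = path.relative_to(root).as_posix()
        lang = path.suffix.lstrip(".")  # use the file extension as the fence label
        sections.append(f"## {rel}\n\n{FENCE}{lang}\n{text.rstrip()}\n{FENCE}\n")
    return "\n".join(sections)
```

A single string like this is convenient to pass to a chunker or an embedding step, which is presumably why the tool is aimed at RAG use.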

3. Repository File Metadata Generator

  • Purpose: Uses AI (Gemini) to generate descriptive metadata for each file. The metadata can be written as sidecar files, as a single aggregate file, or as per-folder summaries.
  • Output: AI-generated summaries and documentation for files.
  • Usage:
    python -m eic_codebase_analysis.repository_file_metadata_generator.main --root ./path/to/repo --model gemini-1.5-pro
    
  • Documentation: Read more
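
The sidecar output mode can be pictured with the following sketch. Everything here is hypothetical: `write_sidecars`, the `summarize` callback, and the `.ai-meta.md` suffix (borrowed from the hierarchical tool's file names) are assumptions, and the placeholder summarizer stands in for a real model call so the example runs without an API key.

```python
from pathlib import Path
from typing import Callable


def write_sidecars(root: Path, summarize: Callable[[str], str],
                   suffix: str = ".ai-meta.md") -> list[Path]:
    """Write one sidecar next to each text file and return the sidecar paths."""
    written = []
    # Collect sources up front so newly written sidecars are not picked up.
    sources = [p for p in sorted(root.rglob("*"))
               if p.is_file() and not p.name.endswith(suffix)]
    for path in sources:
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # assumption: binary files get no metadata
        sidecar = path.with_name(path.name + suffix)
        sidecar.write_text(summarize(text), encoding="utf-8")
        written.append(sidecar)
    return written


def first_line_summary(text: str) -> str:
    """Placeholder summarizer; a real one would call a model such as Gemini."""
    first = text.strip().splitlines()[0] if text.strip() else "(empty file)"
    return f"Summary: {first}\n"
```

Passing the summarizer in as a callable keeps the file handling separate from the model call, so the same loop could feed an aggregate file instead of sidecars.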

4. Hierarchical Project Metadata Generator

  • Purpose: Generates AI metadata at three levels: File (sidecar), Folder (summary of contents), and Project (high-level overview).
  • Output: Hierarchical Markdown documentation (.ai-meta.md, .folder-ai-meta.md, project.ai-meta.md).
  • Usage:
    python -m eic_codebase_analysis.hierarchical_project_metadata_generator.main --root ./path/to/repo --model gemini-1.5-pro
    
  • Documentation: Read more
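
The three levels can be thought of as a roll-up: file summaries feed folder summaries, which feed one project summary. The sketch below shows that flow in memory only. The name `roll_up` and the `combine` callback are assumptions, not the library's API, and as a simplification each folder summarizes only the files directly inside it.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Callable


def roll_up(file_summaries: dict[str, str],
            combine: Callable[[list[str]], str]) -> tuple[dict[str, str], str]:
    """Build folder summaries from file summaries, then one project summary."""
    by_folder = defaultdict(list)
    for rel_path, summary in sorted(file_summaries.items()):
        # Group each file summary under its parent folder ("." for the root).
        folder = PurePosixPath(rel_path).parent.as_posix()
        by_folder[folder].append(summary)
    folder_summaries = {folder: combine(items) for folder, items in by_folder.items()}
    # The project level combines the folder summaries in sorted folder order.
    project_summary = combine([folder_summaries[f] for f in sorted(folder_summaries)])
    return folder_summaries, project_summary
```

In a real pipeline `combine` would presumably be a model call that condenses the lower-level summaries; here any function from a list of strings to a string will do.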

Integration

These tools are designed to work alongside other AI-driven development tools. Their Markdown output can be fed into existing libraries for Retrieval Augmented Generation (RAG) and dataset preparation.
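
One simple way to integrate the tools into a Python pipeline is to invoke the documented command lines through `subprocess`. The sketch below uses only the module paths and the `--root` flag shown in the usage lines above. The helper names and the short keys in `TOOLS` are hypothetical, and it is an assumption that the tools print their result to standard output rather than writing files.

```python
import subprocess
import sys

# Module paths copied from the usage lines in this README.
TOOLS = {
    "structure": "eic_codebase_analysis.repository_structure_extractor.main",
    "contents": "eic_codebase_analysis.detailed_code_content_extractor.main",
}


def build_command(tool: str, root: str) -> list[str]:
    """Return the argv list for one tool, using only the documented --root flag."""
    return [sys.executable, "-m", TOOLS[tool], "--root", root]


def run_tool(tool: str, root: str) -> str:
    """Run a tool and return its stdout (assumes the tool prints its output)."""
    result = subprocess.run(build_command(tool, root),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Using `sys.executable` keeps the call inside the interpreter where the package was installed, and `check=True` raises an error if the tool exits with a non-zero status.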

Requirements

  • Python 3.x
  • google-generativeai (for AI-powered tools)
