
A library for codebase analysis


Codebase Analysis Utils

Overview

eic-codebase-analysis is a Python library for analyzing code repositories and large codebases. It extracts a repository's directory structure and file contents as Markdown, and uses AI (Gemini) to generate descriptive metadata at the file, folder, and project level.

Installation

pip install eic-codebase-analysis

Tools

This library provides a set of modular tools for different analysis tasks:

1. Repository Structure Extractor

  • Purpose: Extracts the directory and file structure of repositories without including file contents.
  • Output: Markdown tree structure.
  • Usage:
    python -m eic_codebase_analysis.repository_structure_extractor.main --root ./path/to/repo
    
  • Documentation: Read more
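
To make the idea of a contents-free Markdown tree concrete, here is a minimal sketch using only the standard library. It is not the library's implementation: the function name `markdown_tree`, the `ignore` parameter, and the nested-list layout are all assumptions made for illustration.

```python
from pathlib import Path


def markdown_tree(root, ignore=(".git", "__pycache__")):
    """Return the directory tree under `root` as a nested Markdown list.

    Hypothetical helper: only names are listed, file contents are never read.
    """
    root = Path(root).resolve()
    lines = [f"- {root.name}/"]

    def walk(directory, depth):
        # Directories first, then files; each group sorted by name.
        entries = sorted(directory.iterdir(), key=lambda p: (p.is_file(), p.name))
        for entry in entries:
            if entry.name in ignore:
                continue
            indent = "  " * depth
            if entry.is_dir():
                lines.append(f"{indent}- {entry.name}/")
                walk(entry, depth + 1)
            else:
                lines.append(f"{indent}- {entry.name}")

    walk(root, 1)
    return "\n".join(lines)
```

Calling `print(markdown_tree("./path/to/repo"))` would print one list item per directory and file, indented by depth.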

2. Detailed Code Content Extractor

  • Purpose: Generates a single Markdown document containing both the directory structure and the full contents of files (in code blocks). Ideal for RAG contexts.
  • Output: Markdown with file contents.
  • Usage:
    python -m eic_codebase_analysis.detailed_code_content_extractor.main --root ./path/to/repo
    
  • Documentation: Read more
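
The single-document output described above can be pictured with the following sketch. It is an illustration under assumptions, not the tool's actual code: `markdown_with_contents`, the `suffixes` filter, and the heading layout are hypothetical, and the sketch does not handle files that themselves contain code fences.

```python
from pathlib import Path

# Built from single backticks so this example can itself sit inside a code fence.
FENCE = "`" * 3


def markdown_with_contents(root, suffixes=(".py", ".md", ".txt")):
    """Return one Markdown document: a file listing, then each file's contents.

    Hypothetical helper: only files whose suffix is in `suffixes` are included.
    """
    root = Path(root).resolve()
    files = sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix in suffixes
        and ".git" not in p.relative_to(root).parts
    )
    parts = [f"# {root.name}", "", "## Files", ""]
    parts += [f"- {p.relative_to(root).as_posix()}" for p in files]
    for p in files:
        rel = p.relative_to(root).as_posix()
        text = p.read_text(encoding="utf-8", errors="replace")
        # The file suffix doubles as the fence's language tag (a simplification).
        parts += ["", f"## {rel}", "", FENCE + p.suffix.lstrip("."),
                  text.rstrip("\n"), FENCE]
    return "\n".join(parts) + "\n"
```

Keeping the listing and the contents in one string makes the result easy to pass to a RAG pipeline as a single context document.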

3. Repository File Metadata Generator

  • Purpose: Uses AI (Gemini) to generate descriptive metadata for each file. The metadata can be written as sidecar files, as a single aggregate file, or as per-folder summaries.
  • Output: AI-generated summaries and documentation for files.
  • Usage:
    python -m eic_codebase_analysis.repository_file_metadata_generator.main --root ./path/to/repo --model gemini-1.5-pro
    
  • Documentation: Read more
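
The sidecar mode can be sketched as follows. This is a hedged illustration rather than the tool's code: `write_sidecars` and its parameters are hypothetical, the `.ai-meta.md` suffix is borrowed from the hierarchical tool's output names and may not match this tool's naming, and the model call is replaced by a caller-supplied `summarize` function so the sketch runs offline.

```python
from pathlib import Path


def write_sidecars(root, summarize, suffixes=(".py",), sidecar_suffix=".ai-meta.md"):
    """Write `<file><sidecar_suffix>` next to every matching file under `root`.

    `summarize(path, source_text)` returns the metadata text. It is a stand-in
    for the AI call; a real run would wrap a request to a Gemini model here.
    """
    written = []
    # Materialize the listing first so newly written sidecars are not revisited.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        source = path.read_text(encoding="utf-8", errors="replace")
        sidecar = path.with_name(path.name + sidecar_suffix)
        sidecar.write_text(summarize(path, source), encoding="utf-8")
        written.append(sidecar)
    return written
```

Passing the summarizer in as a function keeps file traversal separate from the model call, so the traversal can be exercised with a stub such as `lambda path, src: f"{path.name}: {len(src)} chars"`.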

4. Hierarchical Project Metadata Generator

  • Purpose: Generates AI metadata at three levels: File (sidecar), Folder (summary of contents), and Project (high-level overview).
  • Output: Hierarchical Markdown documentation (.ai-meta.md, .folder-ai-meta.md, project.ai-meta.md).
  • Usage:
    python -m eic_codebase_analysis.hierarchical_project_metadata_generator.main --root ./path/to/repo --model gemini-1.5-pro
    
  • Documentation: Read more
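
The three levels can be read as a bottom-up roll-up: file summaries feed folder summaries, which feed the project overview. The sketch below illustrates that flow under assumptions. `build_hierarchy` and the `summarize(kind, name, inputs)` callback are hypothetical, and the exact placement of the `.ai-meta.md`, `.folder-ai-meta.md`, and `project.ai-meta.md` files is a guess based on the names listed above.

```python
from pathlib import Path


def build_hierarchy(root, summarize):
    """Return {output_path: text} for file, folder, and project metadata.

    `summarize(kind, name, inputs)` is a stand-in for the model call. `inputs`
    is a list of texts: file source at the file level, child summaries above it.
    """
    root = Path(root).resolve()
    outputs = {}

    def visit(folder):
        child_summaries = []
        for entry in sorted(folder.iterdir(), key=lambda p: (p.is_file(), p.name)):
            # Skip hidden entries and metadata files from earlier runs.
            if entry.name.startswith(".") or entry.name.endswith(".ai-meta.md"):
                continue
            if entry.is_dir():
                child_summaries.append(visit(entry))
            else:
                text = entry.read_text(encoding="utf-8", errors="replace")
                summary = summarize("file", entry.name, [text])
                outputs[entry.with_name(entry.name + ".ai-meta.md")] = summary
                child_summaries.append(summary)
        folder_summary = summarize("folder", folder.name, child_summaries)
        outputs[folder / ".folder-ai-meta.md"] = folder_summary
        return folder_summary

    top = visit(root)
    outputs[root / "project.ai-meta.md"] = summarize("project", root.name, [top])
    return outputs
```

The function returns a mapping instead of writing files, which leaves the decision of where and whether to persist each level to the caller.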

Integration

These tools are designed to work as part of a broader ecosystem of AI-driven development tools. Their outputs can be combined with existing libraries for Retrieval Augmented Generation (RAG) and dataset preparation.

Requirements

  • Python 3.x
  • google-generativeai (for AI-powered tools)
