
A package for combining source code files into one

Lisa - Code Analyzer for LLMs

Lisa (inspired by Lisa Simpson) is a tool designed to simplify source code analysis through Large Language Models (LLMs). Intelligent and analytical like the character it's named after, Lisa helps study and interpret code with logic and method.

Description

Lisa is aimed at those who want to analyze their own code or study open source projects through Large Language Models. Its main objective is to generate a single text file that preserves the references and structure of the original code, making it easy for an LLM to interpret.

This approach solves one of the most common problems in code analysis with LLMs: file fragmentation and loss of references between different project components.

Configuration

The project uses a combine_config.yaml configuration file that allows you to customize which files to include or exclude from the analysis. The default configuration is:

# Inclusion patterns (extensions or directories to include)
includes:
  - "*.py"  
  # You can add other extensions or directories

# Exclusion patterns (directories or files to exclude)
excludes:
  - ".git"
  - "__pycache__"
  - "*.egg-info"
  - "venv*"
  - ".vscode"
  - "agents*"
  - "log"

Inclusion/Exclusion Patterns

  • Patterns in includes determine which files will be processed (e.g., "*.py" includes all Python files)
  • Patterns in excludes specify which files or directories to ignore
  • You can use the * character as a wildcard
  • Patterns are applied to both file names and directory paths
  • Important: Exclusion rules always take priority over inclusion rules

Rule Priority

When inclusion and exclusion rules conflict, exclusion rules always take precedence. Here are some examples:

Example 1:
/project_root
    /src_code
        /utils
            /logs
                file1.py
                file2.py
            helpers.py

If we have these rules:

  • includes: ["*.py"]
  • excludes: ["*logs"]

In this case, file1.py and file2.py will NOT be included despite having the .py extension because they are in a directory that matches the "*logs" exclusion pattern. The helpers.py file will be included.

Example 2:
/project_root
    /includes_dir
        /excluded_subdir
            important.py

If we have these rules:

  • includes: ["includes_dir"]
  • excludes: ["excluded"]

In this case, important.py will NOT be included because it's in a directory that matches an exclusion pattern, even though its parent directory matches an inclusion pattern.
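
The priority rule in both examples can be illustrated with a minimal sketch. This is not Lisa's actual implementation: the function names `matches` and `is_included` are hypothetical, and the matching semantics are an assumption inferred from the examples above (patterns containing `*` are treated as globs on each path component, while plain patterns match as substrings of a component, which is what lets "excluded" match "excluded_subdir").

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches(path: str, patterns: list) -> bool:
    """Return True if any component of `path` matches any pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        for part in parts:
            if "*" in pattern:
                # Assumption: wildcard patterns are glob-matched per component.
                if fnmatchcase(part, pattern):
                    return True
            elif pattern in part:
                # Assumption: plain patterns match as substrings of a component.
                return True
    return False


def is_included(path: str, includes: list, excludes: list) -> bool:
    """Check exclusions first, so they always win over inclusions."""
    if matches(path, excludes):
        return False
    return matches(path, includes)


# Example 1: files under /logs are excluded, helpers.py is kept.
print(is_included("src_code/utils/logs/file1.py", ["*.py"], ["*logs"]))
print(is_included("src_code/utils/helpers.py", ["*.py"], ["*logs"]))
# Example 2: the excluded subdirectory wins over the included parent.
print(is_included("includes_dir/excluded_subdir/important.py",
                  ["includes_dir"], ["excluded"]))
```

Checking exclusions before inclusions is what guarantees the priority: a path that matches any exclusion pattern is rejected before the inclusion patterns are consulted.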

Usage

The script is run from the command line with:

cmb [options]

Note: The leading underscore in the script's filename (_combine_code.py) is intentional and allows for shell tab completion.

Default Structure and Name

To understand which filename will be used by default, consider this structure:

/home/user/projects
    /my_test_project     <- This is the root directory
        /scripts
            _combine_code.py
            combine_config.yaml
        /src
            main.py
        /tests
            test_main.py

In this case, the default name will be "MY_TEST_PROJECT" (the root directory name in uppercase).

Available parameters:

  • --clean: Removes previously generated text files
  • --output NAME: Specifies the output file name prefix
    # Example with default name (from structure above), run from the project root
    python scripts/_combine_code.py
    # Output: MY_TEST_PROJECT_20240327_1423.txt
    
    # Example with custom name
    python scripts/_combine_code.py --output PROJECT_ANALYSIS
    # Output: PROJECT_ANALYSIS_20240327_1423.txt
    

Output

The script generates a text file with the format: NAME_YYYYMMDD_HHMM.txt

where:

  • NAME is the prefix specified with --output or the default one
  • YYYYMMDD_HHMM is the generation timestamp
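
As a rough illustration of the naming scheme, the following sketch builds a name in the NAME_YYYYMMDD_HHMM.txt format. The helper `output_filename` and its parameters are hypothetical and not taken from Lisa's source; it only assumes what is stated above, namely that the default prefix is the root directory name in uppercase.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_filename(root: Path,
                    prefix: Optional[str] = None,
                    now: Optional[datetime] = None) -> str:
    """Build NAME_YYYYMMDD_HHMM.txt for the given project root."""
    # Default prefix: the root directory name in uppercase.
    name = prefix if prefix else root.name.upper()
    # The timestamp is injectable so the function is easy to test.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M")
    return f"{name}_{stamp}.txt"


when = datetime(2024, 3, 27, 14, 23)
root = Path("/home/user/projects/my_test_project")
print(output_filename(root, now=when))
print(output_filename(root, prefix="PROJECT_ANALYSIS", now=when))
```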

Usage with GitHub Projects

To use Lisa with a GitHub project, follow these steps:

  1. Environment preparation:

    # Create and access a directory for your projects
    mkdir ~/projects
    cd ~/projects
    
  2. Clone the project to analyze:

    # Example with a hypothetical "moon_project"
    git clone moon_project.git
    
  3. Integrate Lisa into the project:

    # Clone Lisa's repository
    git clone https://github.com/yourname/lisa.git
    
    # Copy Lisa's scripts folder into moon_project
    cp -r lisa/scripts moon_project/
    cp lisa/scripts/combine_config.yaml moon_project/scripts/
    
  4. Run the analysis:

    cd moon_project
    python scripts/_combine_code.py
    

Best Practices for Analysis

  • Before running Lisa, make sure you're in the root directory of the project to analyze
  • Check and customize the combine_config.yaml file according to the specific needs of the project
  • Use the --clean option to keep the directory tidy when generating multiple versions
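
To make the effect of --clean more concrete, here is a hedged sketch of what such a cleanup could look like. It is an assumption, not Lisa's code: the function `clean_generated` is hypothetical, and it identifies generated files purely by the NAME_YYYYMMDD_HHMM.txt naming format, so any unrelated file that happens to follow the same format would also be removed.

```python
import re
from pathlib import Path

# Assumption: generated files are recognized only by their name format.
GENERATED = re.compile(r"^.+_\d{8}_\d{4}\.txt$")


def clean_generated(directory: Path) -> list:
    """Delete files named like NAME_YYYYMMDD_HHMM.txt; return their names."""
    removed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and GENERATED.match(path.name):
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `clean_generated(Path("."))` from the project root would remove earlier outputs while leaving other text files, such as notes.txt, untouched.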

Additional Notes

  • Lisa maintains the hierarchical structure of files in the generated document
  • Each file is clearly delimited by separators indicating its relative path
  • Files are ordered by directory depth
  • Generated files can be easily shared with LLMs for analysis
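
The points above can be sketched as follows. This is only an illustration under stated assumptions: the function `combine`, the separator layout, and the exact ordering (shallower files first, then alphabetical by relative path) are hypothetical choices and may differ from what Lisa actually produces.

```python
from pathlib import Path


def combine(root: Path, suffix: str = ".py") -> str:
    """Concatenate matching files under `root` into one delimited text."""
    files = [p for p in root.rglob(f"*{suffix}") if p.is_file()]
    # Assumed ordering: by directory depth, then by relative path.
    files.sort(key=lambda p: (len(p.relative_to(root).parts),
                              p.relative_to(root).as_posix()))
    chunks = []
    for path in files:
        rel = path.relative_to(root).as_posix()
        # Hypothetical separator carrying the file's relative path.
        chunks.append(f"{'=' * 60}\n# FILE: {rel}\n{'=' * 60}\n")
        chunks.append(path.read_text(encoding="utf-8"))
        chunks.append("\n")
    return "".join(chunks)
```

Keeping the relative path in each separator is what lets an LLM resolve imports and cross-file references back to the original project layout.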

Contributing

If you want to contribute to the project, you can:

  • Open issues to report bugs or propose improvements
  • Submit pull requests with new features
  • Improve documentation
  • Share your use cases and suggestions

License

MIT License

Copyright (c) 2024

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
