Project Knowledge Extractor

pk-extractor

pk-extractor (Project Knowledge Extractor) is a tool that generates a comprehensive knowledge base from a given repository, including the project structure and file contents. It respects .gitignore rules and allows for additional exclusion patterns.

Features

  • Generates a markdown file containing the project structure and file contents
  • Respects .gitignore rules
  • Allows for additional file/directory exclusion via command-line arguments
  • Provides progress information during processing
  • Handles binary files and errors gracefully
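As a rough illustration of the exclusion and binary-file handling listed above, here is a minimal sketch. It is not pk-extractor's actual implementation: the helper names (`is_excluded`, `looks_binary`, `iter_text_files`) are hypothetical, glob matching via `fnmatch` and the NUL-byte heuristic are assumptions, and .gitignore parsing is left out entirely.

```python
import fnmatch
import os
from pathlib import Path


def is_excluded(rel_path: str, patterns: list[str]) -> bool:
    """Return True if the relative path matches any glob pattern (hypothetical helper)."""
    return any(fnmatch.fnmatch(rel_path, pattern) for pattern in patterns)


def looks_binary(path: Path, sample_size: int = 1024) -> bool:
    """Heuristic (an assumption): a NUL byte in the first bytes marks a binary file."""
    with path.open("rb") as handle:
        return b"\0" in handle.read(sample_size)


def iter_text_files(root: Path, patterns: list[str]):
    """Yield relative POSIX paths of files that are neither excluded nor binary."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = Path(dirpath) / name
            rel = full.relative_to(root).as_posix()
            if is_excluded(rel, patterns):
                continue
            try:
                if looks_binary(full):
                    continue
            except OSError:
                # Unreadable file: skip it instead of aborting the whole run.
                continue
            yield rel
```

Note that `fnmatch` lets `*` match path separators too, so a pattern such as `venv/*` also covers files in nested subdirectories.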

Installation

You can install pk-extractor using pip or Poetry:

pip install pk-extractor
poetry add pk-extractor

Usage

After installation, you can run pk-extractor from the command line:

pk-extractor <root_dir> [--output_file OUTPUT_FILE] [--exclude [EXCLUDE [EXCLUDE ...]]]

or

pipx run pk-extractor <root_dir> [--output_file OUTPUT_FILE] [--exclude [EXCLUDE [EXCLUDE ...]]]

Arguments:

  • root_dir: Path to the repository you want to analyze (required)
  • --output_file: Path to the output file (default: "knowledge.md")
  • --exclude: Patterns to exclude (e.g., "*.pyc" "venv/*")
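The usage line above has the shape that Python's `argparse` prints for one positional argument, one option with a default, and one option taking zero or more values. The following sketch reconstructs that interface from the documented arguments only; the real parser inside pk-extractor may differ, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments (a reconstruction, not the tool's code)."""
    parser = argparse.ArgumentParser(prog="pk-extractor")
    parser.add_argument("root_dir", help="Path to the repository you want to analyze")
    parser.add_argument("--output_file", default="knowledge.md", help="Path to the output file")
    # nargs="*" accepts zero or more patterns after --exclude.
    parser.add_argument("--exclude", nargs="*", default=[], help="Patterns to exclude")
    return parser


args = build_parser().parse_args(["/path/to/your/repo", "--exclude", "*.pyc", "venv/*"])
print(args.root_dir, args.output_file, args.exclude)
# prints: /path/to/your/repo knowledge.md ['*.pyc', 'venv/*']
```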

Examples:

  1. Generate knowledge for a repository:

    pk-extractor /path/to/your/repo
    
  2. Specify an output file:

    pk-extractor /path/to/your/repo --output_file my_knowledge.md
    
  3. Exclude specific patterns:

    pk-extractor /path/to/your/repo --exclude "*.pyc" "venv/*" "*.log"
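If you want to drive the command-line examples above from a script, one option is to call the installed `pk-extractor` executable through `subprocess`. This is only a sketch: `build_command` and `run_extractor` are hypothetical helpers, and it assumes `pk-extractor` is on your `PATH`. No Python import API is documented here, so the sketch sticks to the CLI.

```python
import subprocess


def build_command(root_dir, output_file=None, exclude=()):
    """Assemble the argument list for the documented CLI (hypothetical helper)."""
    cmd = ["pk-extractor", str(root_dir)]
    if output_file:
        cmd += ["--output_file", str(output_file)]
    if exclude:
        cmd += ["--exclude", *exclude]
    return cmd


def run_extractor(root_dir, output_file=None, exclude=()):
    """Run pk-extractor; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(root_dir, output_file, exclude), check=True)
```

Because the command is passed as a list, no shell is involved and patterns such as `*.pyc` reach the tool unexpanded. On the command line, the quotes in example 3 serve the same purpose.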
    

Output

The script generates a markdown file containing:

  1. Project structure
  2. File contents
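The exact layout of the generated file is not specified here. To make the two parts concrete, the sketch below renders a list of relative paths as an indented tree followed by one fenced section per file. The heading names, indentation, and the helpers `render_tree` and `render_knowledge` are all assumptions for illustration, not the tool's real output format.

```python
from pathlib import Path

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable


def render_tree(paths: list[str]) -> str:
    """Render relative POSIX paths as an indented tree, one entry per line."""
    lines = []
    seen_dirs = set()
    for rel in sorted(paths):
        parts = rel.split("/")
        # Emit each parent directory once, indented by its depth.
        for depth, part in enumerate(parts[:-1]):
            key = "/".join(parts[: depth + 1])
            if key not in seen_dirs:
                seen_dirs.add(key)
                lines.append("  " * depth + part + "/")
        lines.append("  " * (len(parts) - 1) + parts[-1])
    return "\n".join(lines)


def render_knowledge(root: Path, paths: list[str]) -> str:
    """Combine the tree and the file contents into one Markdown document."""
    sections = ["# Project structure", "", FENCE, render_tree(paths), FENCE, "", "# File contents"]
    for rel in sorted(paths):
        # errors="replace" keeps going if a file is not valid UTF-8.
        text = (root / rel).read_text(encoding="utf-8", errors="replace")
        sections += ["", f"## {rel}", "", FENCE, text.rstrip("\n"), FENCE]
    return "\n".join(sections) + "\n"
```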

Development

To set up the development environment:

  1. Clone the repository:

    git clone https://github.com/your-username/pk-extractor.git
    cd pk-extractor
    
  2. Install dependencies:

    poetry install
    

Now you can run the tool or tests within this environment.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

pk_extractor-0.2.0.tar.gz (4.0 kB, Source)

Built Distribution

pk_extractor-0.2.0-py3-none-any.whl (5.3 kB, Python 3)

File details

Details for the file pk_extractor-0.2.0.tar.gz.

File metadata

  • Download URL: pk_extractor-0.2.0.tar.gz
  • Size: 4.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for pk_extractor-0.2.0.tar.gz:

  • SHA256: b250d372fcf2449ebf770a5df28a61be17572c574d50dd93db812b044cf3fa56
  • MD5: 78382661a33218e4fe4c8acf2a56f557
  • BLAKE2b-256: 7e0bdf3f071ae6b22b64d94f6b114e591e8eacdb2ddb924875ff9954f55c31dc

File details

Details for the file pk_extractor-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: pk_extractor-0.2.0-py3-none-any.whl
  • Size: 5.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for pk_extractor-0.2.0-py3-none-any.whl:

  • SHA256: 314651fad67c307f099c1e3388a87efff6b60e81e2957d7c2cbb6c9ac66dff23
  • MD5: f0044e632d93d9131f71f57331e8bc2e
  • BLAKE2b-256: 43170f62ca3ba4d3705e938b87c31abbb525a118d1a17ba1dce1d6836b7a712b
