Code Blocks

Code Blocks is a simplified variant of Tree-sitter, built to make it easier to handle code read and written by an LLM. The two main use cases are merging incomplete code written by the LLM and splitting code for embedding and indexing in a vector store.

Merging

If you send a complete code file to an LLM such as GPT-3.5 or GPT-4, along with an instruction describing what should change, it often responds with only the parts of the code that were updated. Even if you explicitly instruct the LLM to return the entire file, it doesn't always do so. Rather than instructing the model on how to return the code, Code Blocks works with whatever code is returned, applying a number of not entirely scientific methods. The merging process compares the original and updated code blocks, identifies the differences, and merges them into a single, updated code block.
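To make the idea concrete, here is a minimal sketch of one possible merge heuristic, written with Python's standard `ast` module. It is not Code Blocks' implementation: the function names `merge_partial_update` and `_top_level_spans` are hypothetical, and the sketch assumes that both the original and the partial update are valid Python and that definitions are matched by top-level name only.

```python
import ast


def _top_level_spans(source):
    """Map each top-level function/class name to its 1-based (start, end) lines."""
    spans = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above node.lineno, so include them in the span.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            spans[node.name] = (start, node.end_lineno)
    return spans


def merge_partial_update(original, update):
    """Hypothetical helper: overlay the definitions in `update` onto `original`."""
    orig_lines = original.splitlines()
    upd_lines = update.splitlines()
    orig_spans = _top_level_spans(original)
    upd_spans = _top_level_spans(update)

    # Replace from the bottom up so earlier line numbers stay valid.
    by_start_desc = sorted(orig_spans.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (start, end) in by_start_desc:
        if name in upd_spans:
            u_start, u_end = upd_spans[name]
            orig_lines[start - 1:end] = upd_lines[u_start - 1:u_end]

    # Append definitions that exist only in the update.
    for name, (u_start, u_end) in upd_spans.items():
        if name not in orig_spans:
            orig_lines += ["", ""] + upd_lines[u_start - 1:u_end]

    return "\n".join(orig_lines) + "\n"


original = '''def add(a, b):
    return a + b


def sub(a, b):
    return a - b
'''

# The LLM returned only the function it changed.
update = '''def sub(a, b):
    # now returns the absolute difference
    return abs(a - b)
'''

merged = merge_partial_update(original, update)
print(merged)
```

The sketch keeps `add` untouched and swaps in the new body of `sub`. It does not handle placeholder comments such as "rest of the code unchanged", nested definitions, or updates that fail to parse, which are the kinds of cases a real merge strategy has to deal with.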

Examples and explanations of merge strategies:

Splitting

To index code in a vector store, it must first be split into blocks. Code Blocks attempts to do this based on the structure of the code: it identifies components such as functions, classes, and statements, and splits the code at these boundaries. The resulting code blocks can then be indexed individually in the vector store, allowing for more efficient storage and retrieval of code.
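As a rough illustration of structure-based splitting, the sketch below uses Python's standard `ast` module to cut a source file at top-level statement boundaries. It is an assumption-laden stand-in and not how Code Blocks itself splits code; the function name `split_top_level` is hypothetical, and the sketch only handles Python and only the outermost level of nesting.

```python
import ast


def split_top_level(source):
    """Hypothetical helper: return one chunk of text per top-level statement."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        # Decorators sit above node.lineno, so include them in the chunk.
        decorators = getattr(node, "decorator_list", [])
        start = min([node.lineno] + [d.lineno for d in decorators])
        chunks.append("\n".join(lines[start - 1:node.end_lineno]))
    return chunks


example = '''import os

def greet(name):
    return "Hello, " + name

class Counter:
    def __init__(self):
        self.n = 0
'''

chunks = split_top_level(example)
for chunk in chunks:
    print(chunk)
    print("-----")
```

Each chunk (the import, the function, and the class) could then be embedded and stored as its own entry. Comments and blank lines between statements are dropped in this sketch, and large classes are not split further, which a production splitter would likely need to address.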

Examples of how Code Blocks splits code into chunks:

Supported Languages

  • Python
  • Java
  • TypeScript
  • JavaScript
  • ???

Installation

To install Code Blocks, you can use pip:

pip install codeblocks-gpt

Usage

Here is a basic example of how to use Code Blocks to parse and split code:

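from codeblocks import create_parser, CodeSplitter

# Example input; replace with the source code you want to process
your_code = "def hello():\n    print('hello')\n"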

# Create a parser for the language of your code (e.g., Python)
parser = create_parser("python")

# Parse your code into a CodeBlock
code_block = parser.parse(your_code)

# Create a CodeSplitter to split your code into blocks
splitter = CodeSplitter("python")

# Split your code into blocks
blocks = splitter.split_text(your_code)
