Code Blocks
Code Blocks is a simplified variant of Tree-sitter, built to make it easier to work with code that an LLM reads and writes. It has two main use cases: merging incomplete code written by the LLM, and splitting code into blocks for embedding and indexing in a vector store.
Merging
If you send a complete code file to an LLM such as GPT-3.5 or GPT-4 along with an instruction on what to change, it often responds with only the parts of the code that were updated. Even if you explicitly instruct the LLM to return the entire file, it doesn't always do so. Rather than instructing the LLM on how to return the code, Code Blocks' strategy is to handle whatever code comes back. It does this by applying a number of not entirely scientific methods: it compares the original and updated code blocks, identifies the differences, and merges them into a single, updated code block.
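To make the idea concrete, here is a minimal sketch of merging a partial LLM response back into the original file. It is not Code Blocks' own implementation: it uses only Python's standard `ast` module, matches top-level functions and classes by name, and the helper names `top_level_blocks` and `merge_partial_update` are hypothetical. It assumes both inputs are syntactically valid Python and ignores decorators and nested definitions.

```python
import ast
from typing import Dict, Tuple


def top_level_blocks(source: str) -> Dict[str, Tuple[int, int]]:
    """Map each top-level function/class name to its (start, end) line span.

    Line numbers are 1-indexed and inclusive. Decorator lines are not
    included in the span (a simplification for this sketch).
    """
    spans = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            spans[node.name] = (node.lineno, node.end_lineno)
    return spans


def merge_partial_update(original: str, update: str) -> str:
    """Replace blocks in `original` with same-named blocks from `update`.

    Blocks that only exist in `update` are appended at the end.
    """
    merged = original.splitlines()
    update_lines = update.splitlines()
    original_spans = top_level_blocks(original)
    update_spans = top_level_blocks(update)

    # Replace matching blocks bottom-up so earlier line numbers stay valid.
    for name, (start, end) in sorted(
        original_spans.items(), key=lambda item: item[1][0], reverse=True
    ):
        if name in update_spans:
            u_start, u_end = update_spans[name]
            merged[start - 1:end] = update_lines[u_start - 1:u_end]

    # Append blocks that are new in the update.
    for name, (u_start, u_end) in update_spans.items():
        if name not in original_spans:
            merged += ["", ""] + update_lines[u_start - 1:u_end]

    return "\n".join(merged) + "\n"


ORIGINAL = '''def add(a, b):
    return a + b


def greet(name):
    print("Hello " + name)
'''

# The LLM returned only the function it changed.
UPDATE = '''def greet(name):
    print(f"Hello, {name}!")
'''

merged = merge_partial_update(ORIGINAL, UPDATE)
print(merged)
```

Matching by name is the simplest possible strategy; it cannot handle renamed functions or responses that contain placeholder comments in place of unchanged code, which is where heuristics like the ones described above come in.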
Examples and explanations of merge strategies:
Splitting
To index code in a vector store, it must first be split into blocks. Code Blocks attempts to do this based on the structure of the code: it identifies components such as functions, classes, and statements, and splits the code at these boundaries. The resulting blocks can then be indexed individually in the vector store, allowing for more efficient storage and retrieval of code.
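As a rough illustration of structure-based splitting, the sketch below cuts a Python file at top-level statement boundaries using the standard `ast` module. This is an assumption-laden stand-in, not how Code Blocks itself splits code: the function name `split_into_blocks` is hypothetical, it handles Python only, and it does not split large classes or functions any further.

```python
import ast
from typing import List


def split_into_blocks(source: str) -> List[str]:
    """Split Python source into one text block per top-level statement."""
    lines = source.splitlines()
    blocks = []
    for node in ast.parse(source).body:
        start = node.lineno
        # Include decorator lines, which precede the `def`/`class` line.
        decorators = getattr(node, "decorator_list", [])
        if decorators:
            start = min(d.lineno for d in decorators)
        blocks.append("\n".join(lines[start - 1:node.end_lineno]))
    return blocks


SOURCE = '''import os

def add(a, b):
    return a + b

class Greeter:
    def greet(self, name):
        return "Hello " + name
'''

blocks = split_into_blocks(SOURCE)
for block in blocks:
    print(block)
    print("-----")
```

Each resulting block is a complete syntactic unit, so it can be embedded on its own without cutting a function or class in half.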
Examples of how Code Blocks splits code into chunks:
Supported Languages
- Python
- Java
- TypeScript
- JavaScript
- ???
Installation
To install Code Blocks, you can use pip:
pip install codeblocks-gpt
Usage
Here is a basic example of how to use Code Blocks to parse and split code:
from codeblocks import create_parser, CodeSplitter

# Sample input; replace this with the source code you want to process
your_code = "def hello():\n    print('hello')\n"

# Create a parser for the language of your code (e.g., Python)
parser = create_parser("python")

# Parse your code into a CodeBlock
code_block = parser.parse(your_code)

# Create a CodeSplitter to split your code into blocks
splitter = CodeSplitter("python")

# Split your code into blocks
blocks = splitter.split_text(your_code)
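Once code has been split, the blocks can be embedded and searched. The sketch below shows the general shape of that step with a toy in-memory index. Everything in it is an assumption for illustration: the chunks are plain strings (the return type of `split_text` is not documented here), `embed` is a bag-of-words stand-in for a real embedding model, and a plain list stands in for a real vector store.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: counts of identifier-like tokens (not a real model)."""
    return Counter(re.findall(r"[A-Za-z_]\w*", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index: List[Tuple[str, Counter]], query: str, top_k: int = 1) -> List[str]:
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Hypothetical chunks, written by hand for this example.
chunks = [
    "def add(a, b):\n    return a + b",
    "def greet(name):\n    return 'Hello ' + name",
]

# "Index" each chunk by storing it next to its vector.
index = [(chunk, embed(chunk)) for chunk in chunks]

best = search(index, "greet name")
print(best[0])
```

Because each chunk is a self-contained function, a query that mentions `greet` retrieves the whole function rather than an arbitrary window of lines.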